# 📊 Bot Monitoring System - Easy Configuration Guide

The Yaltipia Telegram Bot includes a comprehensive monitoring system that automatically tracks performance and errors and sends alerts to administrators. This guide will help you set it up in 5 minutes.

## 🚀 Quick Setup (5 Minutes)

### Step 1: Get Your Chat ID

**Option A: Personal Chat (Recommended for testing)**
1. Start a chat with your bot
2. Send any message to your bot
3. Your Chat ID will be logged in the console (look for `User XXXXXXX:`)
4. Copy this number

**Option B: Group Chat (Recommended for teams)**
1. Create a Telegram group
2. Add your bot to the group
3. Make the bot an admin (required for sending messages)
4. Send any message in the group
5. Look in console logs for the group Chat ID (negative number like `-1001234567890`)

### Step 2: Configure Environment Variables

Add these lines to your `.env` file:

```bash
# REQUIRED: Admin Chat Configuration
ADMIN_CHAT_IDS=YOUR_CHAT_ID_HERE

# OPTIONAL: Advanced Configuration (use defaults if unsure)
MONITORING_TOPIC_ID=5 # For group topics (optional)
HEALTH_CHECK_INTERVAL_MINUTES=30 # How often to check system health
DAILY_REPORT_HOUR=9 # When to send daily reports (24h format)
ERROR_CLEANUP_INTERVAL_HOURS=1 # Memory cleanup interval
```

### Step 3: Test Your Setup

1. Restart your bot
2. You should receive a "🚀 Bot Started" message
3. If you don't receive it, check the troubleshooting section below

## 📋 Configuration Examples

### Example 1: Single Admin (Personal Chat)
```bash
ADMIN_CHAT_IDS=123456789
```

### Example 2: Multiple Admins
```bash
ADMIN_CHAT_IDS=123456789,987654321,555666777
```

### Example 3: Group with Topic (Advanced)
```bash
ADMIN_CHAT_IDS=-1001234567890
MONITORING_TOPIC_ID=5
```

### Example 4: Frequent Monitoring (Every 5 minutes)
```bash
ADMIN_CHAT_IDS=123456789
HEALTH_CHECK_INTERVAL_MINUTES=5
DAILY_REPORT_HOUR=8
```

## 🔧 Environment Variables Explained

| Variable | Required | Default | Description |
|----------|----------|---------|-------------|
| `ADMIN_CHAT_IDS` | ✅ Yes | None | Comma-separated list of Telegram Chat IDs |
| `MONITORING_TOPIC_ID` | ❌ No | None | Topic ID for group chats (advanced) |
| `HEALTH_CHECK_INTERVAL_MINUTES` | ❌ No | 30 | How often to check system health, in minutes |
| `DAILY_REPORT_HOUR` | ❌ No | 9 | Hour (0-23) to send daily reports |
| `ERROR_CLEANUP_INTERVAL_HOURS` | ❌ No | 1 | How often to clean up error logs, in hours |

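As an illustration of how these variables and defaults fit together, here is a minimal sketch of a config loader. The function name `parseMonitoringConfig` and the shape of the returned object are assumptions made for this example, not the bot's actual code; only the variable names and default values come from the table above.

```javascript
// Hypothetical helper: reads the documented variables from an env-like
// object and applies the documented defaults.
function parseMonitoringConfig(env) {
  const adminChatIds = (env.ADMIN_CHAT_IDS || "")
    .split(",")
    .map((id) => id.trim())
    .filter((id) => /^-?\d+$/.test(id)) // numeric IDs only; group IDs are negative
    .map(Number);

  if (adminChatIds.length === 0) {
    throw new Error("ADMIN_CHAT_IDS is required (comma-separated chat IDs)");
  }

  // Parse an optional integer, falling back when unset or invalid.
  const intOr = (value, fallback) => {
    const n = Number.parseInt(value, 10);
    return Number.isNaN(n) ? fallback : n;
  };

  const dailyReportHour = intOr(env.DAILY_REPORT_HOUR, 9);
  if (dailyReportHour < 0 || dailyReportHour > 23) {
    throw new Error("DAILY_REPORT_HOUR must be between 0 and 23");
  }

  return {
    adminChatIds,
    topicId: intOr(env.MONITORING_TOPIC_ID, null),
    healthCheckIntervalMinutes: intOr(env.HEALTH_CHECK_INTERVAL_MINUTES, 30),
    dailyReportHour,
    errorCleanupIntervalHours: intOr(env.ERROR_CLEANUP_INTERVAL_HOURS, 1),
  };
}
```

In a real process this would typically be called once at startup as `parseMonitoringConfig(process.env)`, so that a missing `ADMIN_CHAT_IDS` fails fast instead of silently disabling alerts.
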
## 📱 What You'll Receive

### 🚀 **Startup Notifications**
When your bot starts, you'll get:
```
🚀 Bot Started

✅ Yaltipia Home Client TG Bot is now running
🕐 Started at: 08/01/2026, 09:00:00
Platform: win32
Node.js: v20.19.2
```

### 🚨 **Error Alerts** (Instant)
When errors occur, you'll receive detailed alerts:
```
🚨 BOT ERROR ALERT

🚨 URGENCY: INVESTIGATE
📍 Context: text_message_handler
❌ Error: Cannot read property 'id' of undefined
🔢 Error Code: N/A
User ID: 123456789
🔄 Count (this type): 1
Total Errors Today: 3
🕐 Time: 08/01/2026, 14:30:25

🔧 RECOMMENDED ACTIONS:
• Check error logs for patterns
• Monitor if error repeats
• Test affected functionality

Next Steps:
1. Check logs for more details
2. Test the affected feature
3. Take corrective action if needed
4. Monitor for resolution
```

### 🏥 **Health Alerts** (When Issues Detected)
System health warnings when problems are detected:
```
🚨 SYSTEM HEALTH ALERT

STATUS: CRITICAL
⚡ URGENCY: IMMEDIATE ACTION REQUIRED

CURRENT SYSTEM STATUS:
• Memory Usage: 520MB
• Error Rate: 15%
• Uptime: 2h 15m
• Total Errors: 5
• Total Messages: 33

🔍 DETECTED ISSUES:
• High memory usage: 520MB
• High error rate: 15%

🔧 RECOMMENDED ACTIONS:
• Check server resources immediately
• Consider restarting the bot if issues persist
• Monitor user complaints
• Check recent error logs for patterns
• Verify API connectivity

🕐 Alert Time: 08/01/2026, 16:52:50

Next Steps:
1. Check server logs for details
2. Monitor system for next 15 minutes
3. Take action if issues persist
```

### 📊 **Daily Reports** (Every Morning)
Comprehensive daily reports sent at your configured time:
```
📊 Daily Bot Report

⏱️ Uptime: 24h 15m
👥 Total Users: 45
💬 Messages Processed: 1,234
🔔 Notifications Sent: 89
❌ Errors: 2
💾 Memory Usage: 245MB
🏥 Health: healthy
📅 Date: 08/01/2026
```

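One way to schedule a report for a fixed hour such as `DAILY_REPORT_HOUR=9` is to compute the delay until the next occurrence of that hour and arm a timer. The sketch below assumes the hour is interpreted in the server's local time zone, which this guide does not state, and `msUntilNextReport` is a hypothetical helper name.

```javascript
// Hypothetical helper: milliseconds from `now` until the next `hour`:00:00
// in the server's local time zone (an assumption for this sketch).
function msUntilNextReport(now, hour) {
  const next = new Date(now.getTime());
  next.setHours(hour, 0, 0, 0);
  if (next <= now) {
    // Today's slot has already passed (or is exactly now); use tomorrow.
    next.setDate(next.getDate() + 1);
  }
  return next.getTime() - now.getTime();
}
```

A scheduler could then call `setTimeout(sendReport, msUntilNextReport(new Date(), 9))` and re-arm itself after each report, where `sendReport` stands in for whatever function actually sends the message.
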
### 🛑 **Shutdown Notifications**
When the bot stops (including Ctrl+C):
```
🛑 Bot Shutdown

❌ Yaltipia Home Client TG Bot is shutting down
🕐 Shutdown at: 08/01/2026, 17:30:00
⏱️ Uptime: 8h 30m
📝 Reason: SIGINT (Ctrl+C)
💬 Messages processed: 1,234
🔔 Notifications sent: 89
❌ Total errors: 2
```

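A notification like this is usually produced by a signal handler that sends one last message before the process exits. The following is a rough sketch of that pattern, not the bot's actual implementation: `notifyAdmins` and `getStats` are assumed callbacks, and the message includes only a subset of the fields shown above.

```javascript
// Format a duration in seconds as "8h 30m", as in the sample above.
function formatUptime(seconds) {
  const totalMinutes = Math.floor(seconds / 60);
  return `${Math.floor(totalMinutes / 60)}h ${totalMinutes % 60}m`;
}

// Build the notification text. `stats` is a hypothetical counters object.
function buildShutdownMessage(reason, uptimeSeconds, stats) {
  return [
    "🛑 Bot Shutdown",
    "",
    `⏱️ Uptime: ${formatUptime(uptimeSeconds)}`,
    `📝 Reason: ${reason}`,
    `🔔 Notifications sent: ${stats.notifications}`,
    `❌ Total errors: ${stats.errors}`,
  ].join("\n");
}

// Register one-shot handlers so Ctrl+C (SIGINT) and SIGTERM both notify
// admins before exiting. `notifyAdmins` is an assumed async send function.
function registerShutdownHandlers(notifyAdmins, getStats) {
  for (const signal of ["SIGINT", "SIGTERM"]) {
    process.once(signal, async () => {
      try {
        await notifyAdmins(buildShutdownMessage(signal, process.uptime(), getStats()));
      } finally {
        process.exit(0); // exit even if the notification could not be sent
      }
    });
  }
}
```

Using `process.once` rather than `process.on` means a second Ctrl+C falls through to Node's default behavior and terminates the process if the notification hangs.
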
### 💚 **Regular Health Checks** (If Frequent Monitoring Enabled)
For intervals ≤ 2 minutes, you'll get simple status updates:
```
Health: HEALTHY | Memory: 245MB | Uptime: 120min | Messages: 456 | Notifications: 23 | Errors: 1 | Time: 14:30:25
```

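The compact format can be reproduced with a small formatter. This sketch assumes a hypothetical `snapshot` object holding the counters; only the 2-minute threshold and the line layout are taken from this guide.

```javascript
// Compact updates apply only when checks run every 2 minutes or less.
function usesCompactUpdates(intervalMinutes) {
  return intervalMinutes <= 2;
}

// Render the one-line status shown above from a hypothetical snapshot object.
function formatHealthLine(snapshot) {
  return [
    `Health: ${snapshot.health.toUpperCase()}`,
    `Memory: ${snapshot.memoryMb}MB`,
    `Uptime: ${Math.floor(snapshot.uptimeSeconds / 60)}min`,
    `Messages: ${snapshot.messages}`,
    `Notifications: ${snapshot.notifications}`,
    `Errors: ${snapshot.errors}`,
    `Time: ${snapshot.time}`,
  ].join(" | ");
}
```
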
## 🎯 Admin Commands

Once monitoring is configured, admins can use these commands in any chat with the bot:

| Command | Description | Example Response |
|---------|-------------|------------------|
| `/system_status` | Get current system status | Shows uptime, memory, errors, health |
| `/send_report` | Generate manual system report | Same as daily report, on-demand |
| `/notifications_status` | Check notification service status | Shows if notifications are working |
| `/shutdown_bot` | Gracefully shutdown bot | Stops bot with proper cleanup |

### Example System Status Response:
```
System Status

⏱️ Uptime: 5h 23m
👥 Active Users: 12
💬 Messages: 456
🔔 Notifications: 23
❌ Errors: 1
💾 Memory: 234MB
🏥 Health: healthy
📅 Started: 08/01/2026, 09:00:00
```

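Restricting these commands to admins comes down to checking the sender against the configured ID list before dispatching. The sketch below assumes that admins are identified by matching the sender's ID against `ADMIN_CHAT_IDS`; the guide does not spell out the exact rule, and `isCommandAllowed` is a hypothetical name.

```javascript
// Commands from the table above that should only run for admins.
const ADMIN_COMMANDS = new Set([
  "/system_status",
  "/send_report",
  "/notifications_status",
  "/shutdown_bot",
]);

// Non-admin commands always pass; admin commands pass only when the
// sender's ID appears in the configured list (assumed matching rule).
function isCommandAllowed(command, senderId, adminChatIds) {
  if (!ADMIN_COMMANDS.has(command)) return true;
  return adminChatIds.includes(Number(senderId));
}
```
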
## 🔍 What Gets Monitored

### ✅ **Tracked Events**
- User registrations and login attempts
- Message processing and responses
- Notification creation and delivery
- API calls and responses
- System errors and exceptions
- Memory usage and performance metrics
- Failed login attempts (with user details for admin help)

### 🚨 **Error Types Monitored**
- **Critical Errors**: Uncaught exceptions, unhandled promise rejections
- **User Errors**: Message handling failures, authentication issues
- **Network Errors**: API connection failures, timeout issues
- **System Errors**: Memory issues, performance problems

### 📈 **Performance Metrics**
- **Memory Usage**: RSS, Heap Used, External memory
- **Error Rates**: Percentage of messages resulting in errors
- **Response Times**: How quickly the bot responds
- **User Activity**: Message patterns and usage statistics

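For reference, the memory figures named above map directly onto fields of Node's `process.memoryUsage()`, and the error rate is a simple ratio. The helper names below are hypothetical; the sketch guards against dividing by zero so that a bot with no traffic yet reports 0% rather than `NaN`.

```javascript
// Convert the byte counts from process.memoryUsage() to whole megabytes.
function memorySnapshotMb(usage = process.memoryUsage()) {
  const toMb = (bytes) => Math.round(bytes / 1024 / 1024);
  return {
    rss: toMb(usage.rss),
    heapUsed: toMb(usage.heapUsed),
    external: toMb(usage.external),
  };
}

// Percentage of processed messages that resulted in an error.
function errorRatePercent(errors, messages) {
  return messages === 0 ? 0 : (errors / messages) * 100;
}
```

With the numbers from the health alert sample (5 errors out of 33 messages), `errorRatePercent(5, 33)` is about 15.15, which rounds to the 15% shown there.
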
## 🛠️ Health Monitoring Details

### Automatic Health Checks
The system automatically checks health at your configured interval:

- **Memory Usage**: Alerts if > 500MB
- **Error Rate**: Alerts if > 10% of messages result in errors
- **System Responsiveness**: Monitors for hanging processes

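The two numeric thresholds above can be expressed as a small check that returns the list of detected issues. This is a sketch under the assumption that the limits are fixed constants; `detectHealthIssues` is a hypothetical name, and the responsiveness check is omitted because the guide gives no criterion for it.

```javascript
const MEMORY_LIMIT_MB = 500; // alert above this, per the list above
const ERROR_RATE_LIMIT_PERCENT = 10; // alert above this, per the list above

// Return human-readable issues for a metrics sample; empty means no alert.
function detectHealthIssues({ memoryMb, errorRatePercent }) {
  const issues = [];
  if (memoryMb > MEMORY_LIMIT_MB) {
    issues.push(`High memory usage: ${memoryMb}MB`);
  }
  if (errorRatePercent > ERROR_RATE_LIMIT_PERCENT) {
    issues.push(`High error rate: ${Math.round(errorRatePercent)}%`);
  }
  return issues;
}
```
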
### Health Status Levels
- **🟢 Healthy**: All systems normal, no issues detected
- **🟡 Warning**: Minor issues detected, monitoring recommended
- **🔴 Critical**: Major issues requiring immediate attention

### Smart Error Alerting
- **Network Errors**: First alert immediately, then every 30 minutes if persisting
- **Critical Errors**: Always alert immediately
- **Regular Errors**: Alert with 10-minute cooldown to prevent spam
- **Similar Errors**: Grouped together to reduce noise

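These rules amount to a per-error cooldown. The sketch below shows one way to implement them; the `AlertThrottle` class and the idea of grouping similar errors by a caller-supplied `key` (for example, context plus message) are assumptions for illustration, while the 30-minute and 10-minute windows come from the list above.

```javascript
const COOLDOWN_MS = {
  network: 30 * 60 * 1000, // repeat network alerts every 30 minutes
  regular: 10 * 60 * 1000, // 10-minute cooldown for everything else
};

class AlertThrottle {
  constructor() {
    this.lastSent = new Map(); // key -> timestamp of the last alert sent
  }

  // `kind` is "critical", "network", or "regular"; `key` groups similar errors.
  shouldAlert(kind, key, nowMs = Date.now()) {
    if (kind === "critical") return true; // never throttled
    const cooldown = COOLDOWN_MS[kind] ?? COOLDOWN_MS.regular;
    const last = this.lastSent.get(key);
    if (last !== undefined && nowMs - last < cooldown) return false;
    this.lastSent.set(key, nowMs);
    return true;
  }
}
```

Because suppressed alerts do not update the stored timestamp, a persisting error is re-announced once per window rather than being silenced indefinitely.
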
## 🔒 Security & Privacy

- **No Sensitive Data**: Passwords and tokens are masked in logs
- **Admin-Only Access**: Only configured admins can use monitoring commands
- **Secure Error Reporting**: User data is protected in error reports
- **Rate Limiting**: Prevents spam of admin notifications

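Masking can be as simple as a pass over the text before it is logged or sent. The following sketch is illustrative only: `maskSecrets` is a hypothetical helper, and both patterns are assumptions (a token-like string of digits, a colon, and a long alphanumeric tail, plus `password`/`token`/`secret` key-value pairs), not the rules the bot actually applies.

```javascript
// Hypothetical masking pass applied to text before logging or alerting.
function maskSecrets(text) {
  return text
    // Assumed shape of a bot token: "<digits>:<30+ URL-safe characters>".
    .replace(/\b\d{6,}:[A-Za-z0-9_-]{30,}/g, "[BOT_TOKEN]")
    // Key/value pairs such as "password=hunter2" or "token: abc".
    .replace(/\b(password|token|secret)(\s*[=:]\s*)\S+/gi, "$1$2****");
}
```
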
## 🚨 Troubleshooting

### Problem: Not Receiving Startup Notification

**Possible Causes & Solutions:**

1. **Wrong Chat ID**
   ```bash
   # Check console logs for your actual Chat ID
   # Look for: [ACTIVITY] User 123456789: text_message
   ADMIN_CHAT_IDS=123456789 # Use the number from logs
   ```

2. **Bot Not Added to Group**
   - Add bot to your group
   - Make bot an admin
   - Use the group's Chat ID (negative number)

3. **Invalid Environment Variable**
   ```bash
   # Make sure no spaces around the equals sign
   ADMIN_CHAT_IDS=123456789 # ✅ Correct
   ADMIN_CHAT_IDS = 123456789 # ❌ Wrong
   ```

### Problem: Topic Messages Not Working

**Solutions:**
1. **Check Topic ID**
   ```bash
   # Make sure topic exists and ID is correct
   MONITORING_TOPIC_ID=5
   ```

2. **Bot Permissions**
   - Bot must be admin in the group
   - Bot must have permission to send messages in topics

3. **Remove Topic Configuration**
   ```bash
   # Comment out or remove this line to send to main chat
   # MONITORING_TOPIC_ID=5
   ```

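For context, the Telegram Bot API's `sendMessage` method targets a forum topic through the optional `message_thread_id` parameter; omitting it sends to the main chat. The sketch below shows how the topic setting could translate into request parameters. `buildSendMessagePayload` is a hypothetical helper, not a function from the bot's codebase.

```javascript
// Build sendMessage parameters; the topic is included only when configured.
function buildSendMessagePayload(chatId, text, topicId) {
  const payload = { chat_id: chatId, text };
  if (topicId !== undefined && topicId !== null) {
    payload.message_thread_id = topicId; // route into the forum topic
  }
  return payload;
}
```

With `MONITORING_TOPIC_ID` commented out, the payload carries no `message_thread_id`, which is why removing the variable falls back to the main chat.
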
### Problem: Too Many/Too Few Health Checks

**Solutions:**
```bash
# For less frequent monitoring (every 30 minutes)
HEALTH_CHECK_INTERVAL_MINUTES=30

# For more frequent monitoring (every 5 minutes)
HEALTH_CHECK_INTERVAL_MINUTES=5

# For testing (every 1 minute)
HEALTH_CHECK_INTERVAL_MINUTES=1
```

### Problem: Memory Usage Shows NaN

**This is automatically fixed in the current version**, which now calculates memory usage correctly. If you still see `NaN`:
1. Restart your bot
2. Check the next status message or report; it should show an actual MB value like "245MB"

### Problem: Bot Permissions Error

**Error Messages & Solutions:**

- `chat not found` → Bot removed from group, re-add it
- `bot was blocked` → User blocked the bot, use different admin
- `not enough rights` → Make bot admin in group
- `topic closed` → Open the topic or remove `MONITORING_TOPIC_ID`

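The same lookup can be automated so that a failed send logs the suggested fix next to the raw error. This is a minimal sketch: `hintForTelegramError` is a hypothetical helper, the matching is a plain substring test, and the exact wording of real API error descriptions may differ from the fragments listed above, so underscores are normalized to spaces as a precaution.

```javascript
// Error fragments and fixes taken from the list above.
const ERROR_HINTS = [
  ["chat not found", "Bot was removed from the group; re-add it"],
  ["bot was blocked", "User blocked the bot; use a different admin"],
  ["not enough rights", "Make the bot an admin in the group"],
  ["topic closed", "Open the topic or remove MONITORING_TOPIC_ID"],
];

// Map an error description to a suggested fix (case-insensitive match).
function hintForTelegramError(description) {
  const normalized = description.toLowerCase().replace(/_/g, " ");
  const match = ERROR_HINTS.find(([fragment]) => normalized.includes(fragment));
  return match ? match[1] : "Unrecognized error; check the console logs";
}
```
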
## 📞 Getting Help

### Quick Diagnostics
1. **Check Console Logs**: Look for `[MONITOR]` messages
2. **Test Configuration**: Restart bot and look for startup notification
3. **Verify Chat ID**: Send message to bot and check console for your ID
4. **Test Commands**: Try `/system_status` command

### Common Log Messages
- `✅ Test message sent successfully` → Configuration working
- `❌ Test message failed` → Check Chat ID and bot permissions
- `✅ Message sent successfully to -1001234567890 (topic 5)` → Topic working
- `⚠️ Topic 5 not found` → Check topic ID or remove it

## 🚀 Benefits

### For Administrators
- **Proactive Issue Detection**: Know about problems before users report them
- **Performance Insights**: Understand how your bot is performing
- **Error Tracking**: Identify and fix recurring issues quickly
- **Usage Analytics**: See how users interact with your bot

### For Users
- **Better Reliability**: Issues are detected and fixed faster
- **Improved Performance**: System optimization based on monitoring data
- **Faster Support**: Admins have detailed error information when helping

## 🎉 You're All Set!

Once configured:
1. **Restart your bot** to activate monitoring
2. **Look for the startup notification** in your configured chat
3. **Test with `/system_status`** to verify admin commands work
4. **Monitor the health alerts** to keep your bot running smoothly

The monitoring system will now help ensure your bot runs reliably and any issues are quickly identified and resolved!