# Production Deployment Checklist: Extended Logstash Configuration

**Important**: This checklist follows an **inspect-first, then-modify** approach. Each step first checks the current state before making changes.

---

## Phase 1: Pre-Deployment Inspection

### Step 1.1: Verify Logstash Status

```bash
ssh root@projectium.com
systemctl status logstash
curl -s 'http://localhost:9600/_node/stats/pipelines?pretty' | jq '.pipelines.main.events'
```

**Record current state:**

- Status: [active/inactive]
- Events processed: [number]
- Memory usage: [amount]

**Expected**: Logstash should be active and processing PostgreSQL logs from ADR-050.

---

### Step 1.2: Inspect Existing Configuration Files

```bash
# List all configuration files
ls -alF /etc/logstash/conf.d/

# Check existing backups (if any)
ls -lh /etc/logstash/conf.d/*.backup-* 2>/dev/null || echo "No backups found"

# View current configuration
cat /etc/logstash/conf.d/bugsink.conf
```

**Record current state:**

- Configuration files present: [list]
- Existing backups: [list or "none"]
- Current config size: [bytes]

**Questions to answer:**

- ✅ Is there an existing `bugsink.conf`?
- ✅ Are there any existing backups?
- ✅ What inputs/filters/outputs are currently configured?

---

### Step 1.3: Inspect Log Output Directory

```bash
# Check if directory exists
ls -ld /var/log/logstash 2>/dev/null || echo "Directory does not exist"

# If exists, check contents
ls -alF /var/log/logstash/

# Check ownership and permissions
ls -ld /var/log/logstash
```

**Record current state:**

- Directory exists: [yes/no]
- Current ownership: [user:group]
- Current permissions: [drwx------]
- Existing files: [list]

**Questions to answer:**

- ✅ Does `/var/log/logstash/` already exist?
- ✅ What files are currently in it?
- ✅ Are these Logstash's own logs or our operational logs?
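
The values recorded in Step 1.3 can also be captured in one read-only pass. The following is a minimal sketch, assuming GNU `stat` and `find` (as on most Linux servers); the function name `describe_dir` is hypothetical and not part of Logstash or this deployment. Calling `describe_dir /var/log/logstash` prints one line per recorded field.

```bash
#!/usr/bin/env bash
# describe_dir (hypothetical helper): print existence, ownership, permissions,
# and file count for a directory. Read-only; assumes GNU stat.
describe_dir() {
  local dir=$1
  if [ ! -d "$dir" ]; then
    echo "exists: no"
    return 0
  fi
  echo "exists: yes"
  echo "ownership: $(stat -c '%U:%G' "$dir")"
  echo "permissions: $(stat -c '%A' "$dir")"
  # Count regular files directly inside the directory (not recursive)
  echo "files: $(find "$dir" -maxdepth 1 -type f | wc -l | tr -d ' ')"
}
```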

---

### Step 1.4: Check Logrotate Configuration

```bash
# Check if logrotate config exists
cat /etc/logrotate.d/logstash 2>/dev/null || echo "No logrotate config found"

# List all logrotate configs
ls -lh /etc/logrotate.d/ | grep logstash
```

**Record current state:**

- Logrotate config exists: [yes/no]
- Current rotation policy: [daily/weekly/none]

---

### Step 1.5: Check Logstash User Groups

```bash
# Check current group membership
groups logstash

# Verify which groups have access to required logs
ls -l /home/gitea-runner/.pm2/logs/*.log | head -3
ls -l /var/log/redis/redis-server.log
ls -l /var/log/nginx/access.log
ls -l /var/log/nginx/error.log
```

**Record current state:**

- Logstash groups: [list]
- PM2 log file group: [group]
- Redis log file group: [group]
- NGINX log file group: [group]

**Questions to answer:**

- ✅ Is logstash already in the `adm` group?
- ✅ Is logstash already in the `postgres` group?
- ✅ Can logstash currently read PM2 logs?

---

### Step 1.6: Test Log File Access (Current State)

```bash
# Redirect stderr before the pipe so that errors from cat are captured too

# Test PM2 worker logs
sudo -u logstash cat /home/gitea-runner/.pm2/logs/flyer-crawler-worker-*.log 2>&1 | head -5

# Test PM2 analytics worker logs
sudo -u logstash cat /home/gitea-runner/.pm2/logs/flyer-crawler-analytics-worker-*.log 2>&1 | head -5

# Test Redis logs
sudo -u logstash cat /var/log/redis/redis-server.log 2>&1 | head -5

# Test NGINX access logs
sudo -u logstash cat /var/log/nginx/access.log 2>&1 | head -5

# Test NGINX error logs
sudo -u logstash cat /var/log/nginx/error.log 2>&1 | head -5
```

**Record current state:**

- PM2 worker logs accessible: [yes/no/error]
- PM2 analytics logs accessible: [yes/no/error]
- Redis logs accessible: [yes/no/error]
- NGINX access logs accessible: [yes/no/error]
- NGINX error logs accessible: [yes/no/error]

**If any fail**: Note the specific error message (permission denied, file not found, etc.)
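
The yes/no/error answers for Step 1.6 can be produced by a small loop instead of reading `cat` output by eye. This is a sketch only: `report_access` is a hypothetical helper name, and it classifies each path for whichever user runs it, so it has to be executed as `logstash` (one way is shown in the trailing comment) to reflect what the service can read.

```bash
#!/usr/bin/env bash
# report_access (hypothetical helper): classify each path for the *current* user.
report_access() {
  local path
  for path in "$@"; do
    if [ ! -e "$path" ]; then
      # Also reported when a parent directory cannot be traversed
      echo "$path: no (not found or parent not traversable)"
    elif [ -r "$path" ]; then
      echo "$path: yes"
    else
      echo "$path: no (permission denied)"
    fi
  done
}

# Run as the logstash user by passing the function definition to a child shell:
#   sudo -u logstash bash -c "$(declare -f report_access); report_access /var/log/redis/redis-server.log /var/log/nginx/access.log"
```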

---

### Step 1.7: Check PM2 Log File Locations

```bash
# List all PM2 log files
ls -lh /home/gitea-runner/.pm2/logs/

# Check for production and test worker logs
ls -lh /home/gitea-runner/.pm2/logs/ | grep -E "(flyer-crawler-worker|flyer-crawler-analytics-worker)"
```

**Record current state:**

- Production worker logs present: [yes/no]
- Test worker logs present: [yes/no]
- Analytics worker logs present: [yes/no]
- File naming pattern: [describe pattern]

**Questions to answer:**

- ✅ Do the log file paths match what's in the new Logstash config?
- ✅ Are there separate logs for production vs test environments?

---

### Step 1.8: Check Disk Space

```bash
# Check available disk space
df -h /var/log/

# Check current size of Logstash logs
du -sh /var/log/logstash/

# Check size of PM2 logs
du -sh /home/gitea-runner/.pm2/logs/
```

**Record current state:**

- Available space on `/var/log`: [amount]
- Current Logstash log size: [amount]
- Current PM2 log size: [amount]

**Risk assessment:**

- ✅ Is there sufficient space for 30 days of rotated logs?
- ✅ Estimate: ~100MB/day for new operational logs = ~3GB for 30 days

---

### Step 1.9: Review Bugsink Projects

```bash
# Check if Bugsink projects 5 and 6 exist
# (This requires accessing Bugsink UI or API)
echo "Manual check: Navigate to https://bugsink.projectium.com"
echo "Verify project IDs 5 and 6 exist and their names/DSNs"
```

**Record current state:**

- Project 5 exists: [yes/no]
- Project 5 name: [name]
- Project 6 exists: [yes/no]
- Project 6 name: [name]

**Questions to answer:**

- ✅ Do the project IDs in the new config match actual Bugsink projects?
- ✅ Are DSNs correct?

---

## Phase 2: Make Deployment Decisions

Based on Phase 1 inspection, answer these questions:

1. **Backup needed?**
   - Current config exists: [yes/no]
   - Decision: [create backup / no backup needed]
2. **Directory creation needed?**
   - `/var/log/logstash/` exists with correct permissions: [yes/no]
   - Decision: [create directory / fix permissions / no action needed]
3. **Logrotate config needed?**
   - Config exists: [yes/no]
   - Decision: [create config / update config / no action needed]
4. **Group membership needed?**
   - Logstash already in `adm` group: [yes/no]
   - Decision: [add to group / already member]
5. **Log file access issues?**
   - Any files inaccessible: [list files]
   - Decision: [fix permissions / fix group membership / no action needed]

---

## Phase 3: Execute Deployment

### Step 3.1: Create Configuration Backup

**Only if**: The configuration file exists and there is no recent backup.

```bash
# Create timestamped backup
sudo cp /etc/logstash/conf.d/bugsink.conf \
  /etc/logstash/conf.d/bugsink.conf.backup-$(date +%Y%m%d-%H%M%S)

# Verify backup
ls -lh /etc/logstash/conf.d/*.backup-*
```

**Confirmation**: ✅ Backup file created with timestamp.

---

### Step 3.2: Handle Log Output Directory

**If directory doesn't exist:**

```bash
sudo mkdir -p /var/log/logstash-operational
sudo chown logstash:logstash /var/log/logstash-operational
sudo chmod 755 /var/log/logstash-operational
```

**If directory exists but has wrong permissions:**

```bash
sudo chown logstash:logstash /var/log/logstash
sudo chmod 755 /var/log/logstash
```

**Note**: The existing `/var/log/logstash/` contains Logstash's own logs (logstash-plain.log, etc.).
You have two options:

**Option A**: Use a separate directory for our operational logs (recommended):

- Directory: `/var/log/logstash-operational/`
- Update config to use this path instead

**Option B**: Share the directory (requires careful logrotate config):

- Keep using `/var/log/logstash/`
- Ensure logrotate doesn't rotate our custom logs the same way as Logstash's own logs

**Decision**: [Choose Option A or B]

**Verification:**

```bash
ls -ld /var/log/logstash-operational # or /var/log/logstash
```

**Confirmation**: ✅ Directory exists with `drwxr-xr-x logstash logstash`.

---

### Step 3.3: Configure Logrotate

**Only if**: Logrotate config doesn't exist or needs updating.

**For Option A (separate directory):**

```bash
sudo tee /etc/logrotate.d/logstash-operational <<'EOF'
/var/log/logstash-operational/*.log {
    daily
    rotate 30
    compress
    delaycompress
    missingok
    notifempty
    create 0644 logstash logstash
    sharedscripts
    postrotate
        # No reload needed - Logstash handles rotation automatically
    endscript
}
EOF
```

**For Option B (shared directory):**

```bash
sudo tee /etc/logrotate.d/logstash-operational <<'EOF'
/var/log/logstash/pm2-workers-*.log
/var/log/logstash/redis-operational-*.log
/var/log/logstash/nginx-access-*.log {
    daily
    rotate 30
    compress
    delaycompress
    missingok
    notifempty
    create 0644 logstash logstash
    sharedscripts
    postrotate
        # No reload needed - Logstash handles rotation automatically
    endscript
}
EOF
```

**Verify configuration:**

```bash
sudo logrotate -d /etc/logrotate.d/logstash-operational
cat /etc/logrotate.d/logstash-operational
```

**Confirmation**: ✅ Logrotate config created, syntax check passes.

---

### Step 3.4: Grant Logstash Permissions

**Only if**: Logstash not already in `adm` group.

```bash
# Add logstash to adm group (for NGINX and system logs)
sudo usermod -a -G adm logstash

# Verify group membership
groups logstash
```

**Expected output**: `logstash : logstash adm postgres`

**Confirmation**: ✅ Logstash user is in required groups.
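
The "only if" condition in Step 3.4 can be made explicit so the step is safe to re-run. The sketch below assumes nothing beyond a POSIX `id` command; `has_group` is a hypothetical helper that does an exact word match against a space-separated group list, so `admin` is not mistaken for `adm`.

```bash
#!/usr/bin/env bash
# has_group (hypothetical helper): succeed if WANTED appears as a whole word
# in the space-separated GROUPS_LIST, e.g. the output of `id -nG logstash`.
has_group() {
  local groups_list=$1 wanted=$2 g
  # Unquoted on purpose: split the list on whitespace
  for g in $groups_list; do
    [ "$g" = "$wanted" ] && return 0
  done
  return 1
}

# Add the user to adm only when it is not already a member:
#   has_group "$(id -nG logstash)" adm || sudo usermod -a -G adm logstash
```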

---

### Step 3.5: Verify Log File Access (Post-Permission Changes)

**Only if**: Previous access tests failed.

```bash
# Re-test log file access
sudo -u logstash cat /home/gitea-runner/.pm2/logs/flyer-crawler-worker-*.log | head -5
sudo -u logstash cat /home/gitea-runner/.pm2/logs/flyer-crawler-analytics-worker-*.log | head -5
sudo -u logstash cat /var/log/redis/redis-server.log | head -5
sudo -u logstash cat /var/log/nginx/access.log | head -5
sudo -u logstash cat /var/log/nginx/error.log | head -5
```

**Confirmation**: ✅ All log files now readable without errors.

---

### Step 3.6: Update Logstash Configuration

**Important**: Before pasting, adjust the file output paths based on your directory decision.

```bash
# Open configuration file
sudo nano /etc/logstash/conf.d/bugsink.conf
```

**Paste the complete configuration from `docs/BARE-METAL-SETUP.md`.**

**If using Option A (separate directory)**, update these lines in the config:

```ruby
# Change this:
path => "/var/log/logstash/pm2-workers-%{+YYYY-MM-dd}.log"

# To this:
path => "/var/log/logstash-operational/pm2-workers-%{+YYYY-MM-dd}.log"

# (Repeat for redis-operational and nginx-access file outputs)
```

**Save and exit**: Ctrl+X, Y, Enter

---

### Step 3.7: Test Configuration Syntax

```bash
# Test for syntax errors
sudo /usr/share/logstash/bin/logstash --config.test_and_exit -f /etc/logstash/conf.d/bugsink.conf
```

**Expected output**: `Configuration OK`

**If errors:**

1. Review error message for line number
2. Check for missing braces, quotes, commas
3. Verify file paths match your directory decision
4. Compare against documentation

**Confirmation**: ✅ Configuration syntax is valid.
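
For Option A, the path change described in Step 3.6 can be previewed non-interactively before editing the file by hand. This is a sketch under one stated assumption: the only `path =>` settings that point under `/var/log/logstash/` are the three file outputs. `rewrite_output_paths` is a hypothetical helper, not a Logstash tool; it reads a config on stdin and writes the adjusted config to stdout, leaving the live file untouched.

```bash
#!/usr/bin/env bash
# rewrite_output_paths (hypothetical helper): point `path => "/var/log/logstash/..."`
# settings at /var/log/logstash-operational/. Already-rewritten paths are left alone
# because the pattern requires a slash directly after "logstash".
rewrite_output_paths() {
  sed -E 's#(path[[:space:]]*=>[[:space:]]*")/var/log/logstash/#\1/var/log/logstash-operational/#'
}

# Preview the change as a diff against the current config:
#   rewrite_output_paths < /etc/logstash/conf.d/bugsink.conf | diff /etc/logstash/conf.d/bugsink.conf -
```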

---

### Step 3.8: Restart Logstash Service

```bash
# Restart Logstash
sudo systemctl restart logstash

# Check service started successfully
sudo systemctl status logstash

# Wait for initialization
sleep 30

# Check for startup errors
sudo journalctl -u logstash -n 100 --no-pager | grep -i error
```

**Expected**:

- Status: `active (running)`
- No critical errors (warnings about missing files are OK initially)

**Confirmation**: ✅ Logstash restarted successfully.

---

## Phase 4: Post-Deployment Verification

### Step 4.1: Verify Pipeline Processing

```bash
# Check pipeline stats - events should be increasing
curl -s 'http://localhost:9600/_node/stats/pipelines?pretty' | jq '.pipelines.main.events'

# Check input plugins
curl -s 'http://localhost:9600/_node/stats/pipelines?pretty' | jq '.pipelines.main.plugins.inputs'

# Check for grok failures
curl -s 'http://localhost:9600/_node/stats/pipelines?pretty' | jq '.pipelines.main.plugins.filters[] | select(.name == "grok") | {name, events_in: .events.in, events_out: .events.out, failures}'
```

**Expected**:

- `events.in` and `events.out` are increasing
- Input plugins show files being read
- Grok failures < 1% of events

**Confirmation**: ✅ Pipeline processing events from multiple sources.

---

### Step 4.2: Verify File Outputs Created

```bash
# Wait a few minutes for log generation
sleep 120

# Check files were created
ls -lh /var/log/logstash-operational/ # or /var/log/logstash/

# View sample logs
tail -20 /var/log/logstash-operational/pm2-workers-$(date +%Y-%m-%d).log
tail -20 /var/log/logstash-operational/redis-operational-$(date +%Y-%m-%d).log
tail -20 /var/log/logstash-operational/nginx-access-$(date +%Y-%m-%d).log
```

**Expected**:

- Files exist with today's date
- Files contain JSON-formatted log entries
- Timestamps are recent

**Confirmation**: ✅ Operational logs being written successfully.
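
The "Grok failures < 1% of events" threshold from Step 4.1 is easier to check as a number than by comparing raw counters. The helper below is a sketch; `grok_failure_pct` is a hypothetical name, and its two arguments are the `failures` and `events_in` values printed by the grok query in that step.

```bash
#!/usr/bin/env bash
# grok_failure_pct (hypothetical helper): failures as a percentage of events in,
# printed with two decimals. Prints 0.00 when no events have been seen yet.
grok_failure_pct() {
  local failures=$1 events_in=$2
  awk -v f="$failures" -v n="$events_in" \
    'BEGIN { if (n > 0) printf "%.2f\n", f * 100 / n; else print "0.00" }'
}

# Example with 5 failures out of 1000 events (prints 0.50, under the 1% threshold):
#   grok_failure_pct 5 1000
```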

---

### Step 4.3: Test Error Forwarding to Bugsink

```bash
# Check HTTP output stats (Bugsink forwarding)
curl -s 'http://localhost:9600/_node/stats/pipelines?pretty' | jq '.pipelines.main.plugins.outputs[] | select(.name == "http") | {name, events_in: .events.in, events_out: .events.out}'
```

**Manual check**:

1. Navigate to: https://bugsink.projectium.com
2. Check Project 5 (production infrastructure) for recent events
3. Check Project 6 (test infrastructure) for recent events

**Confirmation**: ✅ Errors forwarded to correct Bugsink projects.

---

### Step 4.4: Monitor Logstash Performance

```bash
# Check memory usage
ps aux | grep logstash | grep -v grep

# Check disk usage
du -sh /var/log/logstash-operational/

# Monitor in real-time (Ctrl+C to exit)
sudo journalctl -u logstash -f
```

**Expected**:

- Memory usage < 1.5GB (with 1GB heap)
- Disk usage reasonable (< 100MB for first day)
- No repeated errors

**Confirmation**: ✅ Performance is stable.

---

### Step 4.5: Verify Environment Detection

```bash
# Check recent logs for environment tags
sudo journalctl -u logstash -n 500 | grep -E "(production|test)" | tail -20

# Check file outputs for correct tagging
grep -o '"environment":"[^"]*"' /var/log/logstash-operational/pm2-workers-$(date +%Y-%m-%d).log | sort | uniq -c
```

**Expected**:

- Production worker logs tagged as "production"
- Test worker logs tagged as "test"

**Confirmation**: ✅ Environment detection working correctly.

---

### Step 4.6: Document Deployment

```bash
# Record deployment
echo "Extended Logstash Configuration deployed on $(date)" | sudo tee -a /var/log/deployments.log

# Record configuration version
sudo ls -lh /etc/logstash/conf.d/bugsink.conf
```

**Confirmation**: ✅ Deployment documented.

---

## Phase 5: 24-Hour Monitoring Plan

Monitor these metrics over the next 24 hours:

**Every 4 hours:**

1. **Service health**: `systemctl status logstash`
2. **Disk usage**: `du -sh /var/log/logstash-operational/`
3. **Memory usage**: `ps aux | grep logstash | grep -v grep`

**Every 12 hours:**

1. **Error rates**: Check Bugsink projects 5 and 6
2. **Log file growth**: `ls -lh /var/log/logstash-operational/`
3. **Pipeline stats**: `curl -s 'http://localhost:9600/_node/stats/pipelines?pretty' | jq '.pipelines.main.events'`

---

## Rollback Procedure

**If issues occur:**

```bash
# Stop Logstash
sudo systemctl stop logstash

# Find latest backup
ls -lt /etc/logstash/conf.d/*.backup-* | head -1

# Restore backup (replace TIMESTAMP with actual timestamp)
sudo cp /etc/logstash/conf.d/bugsink.conf.backup-TIMESTAMP \
  /etc/logstash/conf.d/bugsink.conf

# Test restored config
sudo /usr/share/logstash/bin/logstash --config.test_and_exit -f /etc/logstash/conf.d/bugsink.conf

# Restart Logstash
sudo systemctl start logstash

# Verify status
systemctl status logstash
```

---

## Quick Health Check

Run this anytime to verify deployment health:

```bash
# One-line health check
systemctl is-active logstash && \
  echo "Service: OK" && \
  ls /var/log/logstash-operational/*.log &>/dev/null && \
  echo "Logs: OK" && \
  curl -s 'http://localhost:9600/_node/stats/pipelines?pretty' | jq -e '.pipelines.main.events.in > 0' &>/dev/null && \
  echo "Processing: OK"
```

Expected output:

```
active
Service: OK
Logs: OK
Processing: OK
```

---

## Summary Checklist

After completing all steps:

- ✅ Phase 1: Inspection complete, state recorded
- ✅ Phase 2: Deployment decisions made
- ✅ Phase 3: Configuration deployed
  - ✅ Backup created
  - ✅ Directory configured
  - ✅ Logrotate configured
  - ✅ Permissions granted
  - ✅ Config updated and tested
  - ✅ Service restarted
- ✅ Phase 4: Verification complete
  - ✅ Pipeline processing
  - ✅ File outputs working
  - ✅ Errors forwarded to Bugsink
  - ✅ Performance stable
  - ✅ Environment detection working
- ✅ Phase 5: Monitoring plan established

**Deployment Status**: [READY / IN PROGRESS / COMPLETE / ROLLED BACK]