# Bare-Metal Server Setup Guide This guide covers the manual installation of Flyer Crawler and its dependencies on a bare-metal Ubuntu server (e.g., a colocation server). This is the definitive reference for setting up a production environment without containers. **Last verified**: 2026-01-28 **Target Environment**: Ubuntu 22.04 LTS (or newer) **Related documentation**: - [ADR-014: Containerization and Deployment Strategy](../adr/0014-containerization-and-deployment-strategy.md) - [ADR-015: Error Tracking and Observability](../adr/0015-error-tracking-and-observability.md) - [ADR-050: PostgreSQL Function Observability](../adr/0050-postgresql-function-observability.md) - [Deployment Guide](DEPLOYMENT.md) - [Monitoring Guide](MONITORING.md) --- ## Quick Reference ### Installation Time Estimates | Component | Estimated Time | Notes | | ----------- | --------------- | ----------------------------- | | PostgreSQL | 10-15 minutes | Including PostGIS extensions | | Redis | 5 minutes | Quick install | | Node.js | 5 minutes | Via NodeSource repository | | Application | 15-20 minutes | Clone, install, build | | PM2 | 5 minutes | Global install + config | | NGINX | 10-15 minutes | Including SSL via Certbot | | Bugsink | 20-30 minutes | Python venv, systemd services | | Logstash | 15-20 minutes | Including pipeline config | | **Total** | **~90 minutes** | For complete fresh install | ### Post-Installation Verification After completing setup, verify all services: ```bash # Check all services are running systemctl status postgresql nginx redis-server gunicorn-bugsink snappea logstash # Verify application health curl -s https://flyer-crawler.projectium.com/api/health/ready | jq . # Check PM2 processes pm2 list # Verify Bugsink is accessible curl -s https://bugsink.projectium.com/accounts/login/ | head -5 ``` --- ## Server Access Model All commands in this guide are intended for the **system administrator** to execute directly on the server. 
Claude Code and AI tools have **READ-ONLY** access to production servers and cannot execute these commands directly. When Claude assists with server setup or troubleshooting: 1. Claude provides commands for the administrator to execute 2. Administrator runs commands and reports output 3. Claude analyzes results and provides next steps (1-3 commands at a time) 4. Administrator executes and reports results 5. Claude provides verification commands to confirm success --- ## Table of Contents 1. [System Prerequisites](#system-prerequisites) 2. [PostgreSQL Setup](#postgresql-setup) 3. [Redis Setup](#redis-setup) 4. [Node.js and Application Setup](#nodejs-and-application-setup) 5. [PM2 Process Manager](#pm2-process-manager) 6. [NGINX Reverse Proxy](#nginx-reverse-proxy) 7. [Bugsink Error Tracking](#bugsink-error-tracking) 8. [Logstash Log Aggregation](#logstash-log-aggregation) 9. [SSL/TLS with Let's Encrypt](#ssltls-with-lets-encrypt) 10. [Firewall Configuration](#firewall-configuration) 11. [Maintenance Commands](#maintenance-commands) --- ## System Prerequisites Update the system and install essential packages: ```bash sudo apt update && sudo apt upgrade -y sudo apt install -y curl git build-essential python3 python3-pip python3-venv ``` --- ## PostgreSQL Setup ### Install PostgreSQL 14+ with PostGIS ```bash # Add PostgreSQL APT repository sudo sh -c 'echo "deb http://apt.postgresql.org/pub/repos/apt $(lsb_release -cs)-pgdg main" > /etc/apt/sources.list.d/pgdg.list' wget --quiet -O - https://www.postgresql.org/media/keys/ACCC4CF8.asc | sudo apt-key add - sudo apt update # Install PostgreSQL and PostGIS sudo apt install -y postgresql-14 postgresql-14-postgis-3 ``` ### Create Application Database and User ```bash sudo -u postgres psql ``` ```sql -- Create application user and database CREATE USER flyer_crawler WITH PASSWORD 'YOUR_SECURE_PASSWORD'; CREATE DATABASE flyer_crawler OWNER flyer_crawler; -- Connect to the database and enable extensions \c flyer_crawler CREATE 
EXTENSION IF NOT EXISTS postgis;
CREATE EXTENSION IF NOT EXISTS pg_trgm;
CREATE EXTENSION IF NOT EXISTS pgcrypto;

-- Grant privileges
GRANT ALL PRIVILEGES ON DATABASE flyer_crawler TO flyer_crawler;
\q
```

### Configure PostgreSQL for Remote Access (if needed)

Edit `/etc/postgresql/14/main/postgresql.conf`:

```conf
listen_addresses = 'localhost' # Change to '*' for remote access
```

Edit `/etc/postgresql/14/main/pg_hba.conf` to add allowed hosts:

```conf
# Local connections
local   all   all                  peer
host    all   all   127.0.0.1/32   scram-sha-256
```

Restart PostgreSQL:

```bash
sudo systemctl restart postgresql
```

---

## Redis Setup

### Install Redis

```bash
sudo apt install -y redis-server
```

### Configure Redis Password

Edit `/etc/redis/redis.conf`:

```conf
requirepass YOUR_REDIS_PASSWORD
```

Restart Redis:

```bash
sudo systemctl restart redis-server
sudo systemctl enable redis-server
```

### Test Redis Connection

```bash
redis-cli -a YOUR_REDIS_PASSWORD ping
# Should output: PONG
```

---

## Node.js and Application Setup

### Install Node.js 20.x

```bash
curl -fsSL https://deb.nodesource.com/setup_20.x | sudo -E bash -
sudo apt install -y nodejs
```

Verify installation:

```bash
node --version # Should output v20.x.x
npm --version
```

### Install System Dependencies for PDF Processing

```bash
sudo apt install -y poppler-utils # For pdftocairo
```

### Clone and Install Application

```bash
# Create the application directory (this path is used throughout this guide)
sudo mkdir -p /var/www/flyer-crawler.projectium.com
sudo chown $USER:$USER /var/www/flyer-crawler.projectium.com

# Clone repository
cd /var/www/flyer-crawler.projectium.com
git clone https://gitea.projectium.com/flyer-crawler/flyer-crawler.projectium.com.git .

# Install dependencies
npm install

# Build for production
npm run build
```

### Configure Environment Variables

**Important:** The flyer-crawler application does **not** use local environment files in production. All secrets are managed through **Gitea CI/CD secrets** and injected during deployment.

#### How Secrets Work

1.
**Secrets are stored in Gitea** at Repository → Settings → Actions → Secrets 2. **Workflow files** (`.gitea/workflows/deploy-to-prod.yml`) reference secrets using `${{ secrets.SECRET_NAME }}` 3. **PM2** receives environment variables from the workflow's `env:` block 4. **ecosystem.config.cjs** passes variables to the application via `process.env` #### Required Gitea Secrets Before deployment, ensure these secrets are configured in Gitea: **Shared Secrets** (used by both production and test): | Secret Name | Description | | ---------------------- | --------------------------------------- | | `DB_HOST` | Database hostname (usually `localhost`) | | `DB_USER` | Database username | | `DB_PASSWORD` | Database password | | `JWT_SECRET` | JWT signing secret (min 32 characters) | | `GOOGLE_MAPS_API_KEY` | Google Maps API key | | `GOOGLE_CLIENT_ID` | Google OAuth client ID | | `GOOGLE_CLIENT_SECRET` | Google OAuth client secret | | `GH_CLIENT_ID` | GitHub OAuth client ID | | `GH_CLIENT_SECRET` | GitHub OAuth client secret | **Production-Specific Secrets**: | Secret Name | Description | | --------------------------- | -------------------------------------------------------------------- | | `DB_DATABASE_PROD` | Production database name (`flyer_crawler`) | | `REDIS_PASSWORD_PROD` | Redis password for production (uses database 0) | | `VITE_GOOGLE_GENAI_API_KEY` | Gemini API key for production | | `SENTRY_DSN` | Bugsink backend DSN (see [Bugsink section](#bugsink-error-tracking)) | | `VITE_SENTRY_DSN` | Bugsink frontend DSN | **Test-Specific Secrets**: | Secret Name | Description | | -------------------------------- | ----------------------------------------------------------------------------- | | `DB_DATABASE_TEST` | Test database name (`flyer-crawler-test`) | | `REDIS_PASSWORD_TEST` | Redis password for test (uses database 1 for isolation) | | `VITE_GOOGLE_GENAI_API_KEY_TEST` | Gemini API key for test environment | | `SENTRY_DSN_TEST` | Bugsink backend DSN for test (see 
[Bugsink section](#bugsink-error-tracking)) | | `VITE_SENTRY_DSN_TEST` | Bugsink frontend DSN for test | #### Test Environment Details The test environment (`flyer-crawler-test.projectium.com`) uses **both** Gitea CI/CD secrets and a local `.env.test` file: | Path | Purpose | | ------------------------------------------------------ | ---------------------------------------- | | `/var/www/flyer-crawler-test.projectium.com/` | Test application directory | | `/var/www/flyer-crawler-test.projectium.com/.env.test` | Local overrides for test-specific config | **Key differences from production:** - Uses Redis database **1** (production uses database **0**) to isolate job queues - PM2 processes are named with `-test` suffix (e.g., `flyer-crawler-api-test`) - Deployed automatically on every push to `main` branch - Has a `.env.test` file for additional local configuration overrides For detailed information on secrets management, see [CLAUDE.md](../CLAUDE.md). --- ## PM2 Process Manager ### Install PM2 Globally ```bash sudo npm install -g pm2 ``` ### PM2 Configuration Files The application uses **separate ecosystem config files** for production and test environments: | File | Purpose | Processes Started | | --------------------------- | --------------------- | -------------------------------------------------------------------------------------------- | | `ecosystem.config.cjs` | Production deployment | `flyer-crawler-api`, `flyer-crawler-worker`, `flyer-crawler-analytics-worker` | | `ecosystem-test.config.cjs` | Test deployment | `flyer-crawler-api-test`, `flyer-crawler-worker-test`, `flyer-crawler-analytics-worker-test` | **Key Points:** - Production and test processes run **simultaneously** with distinct names - Test processes use `NODE_ENV=test` which enables file logging - Test processes use Redis database 1 (isolated from production which uses database 0) - Both configs validate required environment variables but only warn (don't exit) if missing ### Start Production 
Application ```bash cd /var/www/flyer-crawler.projectium.com # Set required environment variables (usually done via CI/CD) export DB_HOST=localhost export JWT_SECRET=your-secret export GEMINI_API_KEY=your-api-key # ... other required variables pm2 startOrReload ecosystem.config.cjs --update-env && pm2 save ``` This starts three production processes: - `flyer-crawler-api` - Main API server (port 3001) - `flyer-crawler-worker` - Background job worker - `flyer-crawler-analytics-worker` - Analytics processing worker ### Start Test Application ```bash cd /var/www/flyer-crawler-test.projectium.com # Set required environment variables (usually done via CI/CD) export DB_HOST=localhost export DB_NAME=flyer-crawler-test export JWT_SECRET=your-secret export GEMINI_API_KEY=your-test-api-key export REDIS_URL=redis://localhost:6379/1 # Use database 1 for isolation # ... other required variables pm2 startOrReload ecosystem-test.config.cjs --update-env && pm2 save ``` This starts three test processes (running alongside production): - `flyer-crawler-api-test` - Test API server (port 3001 via different NGINX vhost) - `flyer-crawler-worker-test` - Test background job worker - `flyer-crawler-analytics-worker-test` - Test analytics worker ### Verify Running Processes After starting both environments, you should see 6 application processes: ```bash pm2 list ``` Expected output: ```text ┌────┬───────────────────────────────────┬──────────┬────────┬───────────┐ │ id │ name │ mode │ status │ cpu │ ├────┼───────────────────────────────────┼──────────┼────────┼───────────┤ │ 0 │ flyer-crawler-api │ cluster │ online │ 0% │ │ 1 │ flyer-crawler-worker │ fork │ online │ 0% │ │ 2 │ flyer-crawler-analytics-worker │ fork │ online │ 0% │ │ 3 │ flyer-crawler-api-test │ fork │ online │ 0% │ │ 4 │ flyer-crawler-worker-test │ fork │ online │ 0% │ │ 5 │ flyer-crawler-analytics-worker-test│ fork │ online │ 0% │ └────┴───────────────────────────────────┴──────────┴────────┴───────────┘ ``` ### Configure 
PM2 Startup ```bash pm2 startup systemd # Follow the command output to enable PM2 on boot pm2 save ``` ### PM2 Log Rotation ```bash pm2 install pm2-logrotate pm2 set pm2-logrotate:max_size 10M pm2 set pm2-logrotate:retain 14 pm2 set pm2-logrotate:compress true ``` ### Useful PM2 Commands ```bash # View logs for a specific process pm2 logs flyer-crawler-api-test --lines 50 # View environment variables for a process pm2 env # Restart only test processes pm2 restart flyer-crawler-api-test flyer-crawler-worker-test flyer-crawler-analytics-worker-test # Delete all test processes (without affecting production) pm2 delete flyer-crawler-api-test flyer-crawler-worker-test flyer-crawler-analytics-worker-test ``` --- ## NGINX Reverse Proxy ### Install NGINX ```bash sudo apt install -y nginx ``` ### Reference Configuration Files The repository contains reference copies of the actual production NGINX configurations at the project root: - `etc-nginx-sites-available-flyer-crawler.projectium.com` - Production config - `etc-nginx-sites-available-flyer-crawler-test-projectium-com.txt` - Test config These reference files document the exact configuration deployed on the server, including SSL settings managed by Certbot. Use them as a reference when setting up new servers or troubleshooting configuration issues. **Note:** The simplified example below shows the basic structure. For the complete production configuration with SSL, security headers, and all location blocks, refer to the reference files in the repository root. 
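Since the reference copies live in the repository while the live files sit under `/etc/nginx`, it is easy for the two to drift apart after manual hotfixes. As a minimal sketch, a hypothetical `check_drift` helper (not part of the repository) compares a deployed config against its reference copy:

```shell
# check_drift DEPLOYED REFERENCE
# Prints "match" when the deployed config equals the reference copy, or
# "drift" when they differ; the unified diff itself goes to stderr so the
# stdout result stays machine-checkable.
check_drift() {
  if diff -u "$1" "$2" >&2; then
    echo "match"
  else
    echo "drift"
  fi
}
```

For example, assuming the repository checkout is the application directory used elsewhere in this guide, `check_drift /etc/nginx/sites-available/flyer-crawler.projectium.com /var/www/flyer-crawler.projectium.com/etc-nginx-sites-available-flyer-crawler.projectium.com` prints `match` or `drift`.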
### Create Site Configuration Create `/etc/nginx/sites-available/flyer-crawler.projectium.com`: ```nginx server { listen 80; server_name flyer-crawler.projectium.com; # Redirect HTTP to HTTPS (uncomment after SSL setup) # return 301 https://$server_name$request_uri; location / { proxy_pass http://localhost:5173; proxy_http_version 1.1; proxy_set_header Upgrade $http_upgrade; proxy_set_header Connection 'upgrade'; proxy_set_header Host $host; proxy_set_header X-Real-IP $remote_addr; proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for; proxy_set_header X-Forwarded-Proto $scheme; proxy_cache_bypass $http_upgrade; } location /api { proxy_pass http://localhost:3001; proxy_http_version 1.1; proxy_set_header Upgrade $http_upgrade; proxy_set_header Connection 'upgrade'; proxy_set_header Host $host; proxy_set_header X-Real-IP $remote_addr; proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for; proxy_set_header X-Forwarded-Proto $scheme; proxy_cache_bypass $http_upgrade; # File upload size limit client_max_body_size 50M; } # Serve flyer images from static storage (7-day cache) location /flyer-images/ { alias /var/www/flyer-crawler.projectium.com/flyer-images/; expires 7d; add_header Cache-Control "public, immutable"; } # MIME type fix for .mjs files types { application/javascript js mjs; } } ``` ### Static Flyer Images Directory Create the directory for storing flyer images: ```bash # Production sudo mkdir -p /var/www/flyer-crawler.projectium.com/flyer-images sudo chown www-data:www-data /var/www/flyer-crawler.projectium.com/flyer-images # Test environment sudo mkdir -p /var/www/flyer-crawler-test.projectium.com/flyer-images sudo chown www-data:www-data /var/www/flyer-crawler-test.projectium.com/flyer-images ``` The `/flyer-images/` location serves static images with: - **7-day browser cache** (`expires 7d`) - **Immutable cache header** for optimal CDN/browser caching - Direct file serving (no proxy overhead) ### Enable the Site ```bash sudo ln -s 
/etc/nginx/sites-available/flyer-crawler.projectium.com /etc/nginx/sites-enabled/
sudo nginx -t
sudo systemctl reload nginx
sudo systemctl enable nginx
```

---

## Bugsink Error Tracking

Bugsink is a lightweight, self-hosted, Sentry-compatible error tracking system. This guide follows the [official Bugsink single-server production setup](https://www.bugsink.com/docs/single-server-production/). See [ADR-015](../adr/0015-error-tracking-and-observability.md) for architecture details.

### Step 1: Create Bugsink User

Create a dedicated non-root user for Bugsink:

```bash
sudo adduser bugsink --disabled-password --gecos ""
```

### Step 2: Set Up Virtual Environment and Install Bugsink

Switch to the bugsink user:

```bash
sudo su - bugsink
```

Create the virtual environment:

```bash
python3 -m venv venv
```

Activate the virtual environment:

```bash
source venv/bin/activate
```

You should see `(venv)` at the beginning of your prompt. Now install Bugsink:

```bash
pip install bugsink --upgrade
bugsink-show-version
```

You should see output like `bugsink 2.x.x`.

### Step 3: Create Configuration File

Generate the configuration file. Replace `bugsink.yourdomain.com` with your actual hostname:

```bash
bugsink-create-conf --template=singleserver --host=bugsink.yourdomain.com
```

This creates `bugsink_conf.py` in `/home/bugsink/`.
Edit it to customize settings: ```bash nano bugsink_conf.py ``` **Key settings to review:** | Setting | Description | | ------------------- | ------------------------------------------------------------------------------- | | `BASE_URL` | The URL where Bugsink will be accessed (e.g., `https://bugsink.yourdomain.com`) | | `SITE_TITLE` | Display name for your Bugsink instance | | `SECRET_KEY` | Auto-generated, but verify it exists | | `TIME_ZONE` | Your timezone (e.g., `America/New_York`) | | `USER_REGISTRATION` | Set to `"closed"` to disable public signup | | `SINGLE_USER` | Set to `True` if only one user will use this instance | ### Step 4: Initialize Database Bugsink uses SQLite by default, which is recommended for single-server setups. Run the database migrations: ```bash bugsink-manage migrate bugsink-manage migrate snappea --database=snappea ``` Verify the database files were created: ```bash ls *.sqlite3 ``` You should see `db.sqlite3` and `snappea.sqlite3`. ### Step 5: Create Admin User Create the superuser account. Using your email as the username is recommended: ```bash bugsink-manage createsuperuser ``` **Important:** Save these credentials - you'll need them to log into the Bugsink web UI. 
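Before moving on to the systemd services, it can be worth sanity-checking the databases created in Step 4. A sketch, assuming the layout from this guide (the `check_bugsink_dbs` helper is illustrative, not part of Bugsink; it relies only on the `python3` installed under System Prerequisites):

```shell
# check_bugsink_dbs DIR
# Verifies that the two SQLite databases created in Step 4 exist under DIR
# and pass SQLite's own integrity check. Uses python3's bundled sqlite3
# module, so no extra packages are required.
check_bugsink_dbs() {
  local dir="$1" db
  for db in db.sqlite3 snappea.sqlite3; do
    if [ ! -f "$dir/$db" ]; then
      echo "missing: $db" >&2
    else
      python3 -c '
import sqlite3, sys
row = sqlite3.connect(sys.argv[1]).execute("PRAGMA integrity_check;").fetchone()
print(sys.argv[1].rsplit("/", 1)[-1] + ": " + row[0])' "$dir/$db"
    fi
  done
}

# On the server (as any user with read access): check_bugsink_dbs /home/bugsink
```

Each healthy database prints `<name>: ok`.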
### Step 6: Verify Configuration Run Django's deployment checks: ```bash bugsink-manage check_migrations bugsink-manage check --deploy --fail-level WARNING ``` Exit back to root for the next steps: ```bash exit ``` ### Step 7: Create Gunicorn Service Create `/etc/systemd/system/gunicorn-bugsink.service`: ```bash sudo nano /etc/systemd/system/gunicorn-bugsink.service ``` Add the following content: ```ini [Unit] Description=Gunicorn daemon for Bugsink After=network.target [Service] Restart=always Type=notify User=bugsink Group=bugsink Environment="PYTHONUNBUFFERED=1" WorkingDirectory=/home/bugsink ExecStart=/home/bugsink/venv/bin/gunicorn \ --bind="127.0.0.1:8000" \ --workers=4 \ --timeout=6 \ --access-logfile - \ --max-requests=1000 \ --max-requests-jitter=100 \ bugsink.wsgi ExecReload=/bin/kill -s HUP $MAINPID KillMode=mixed TimeoutStopSec=5 [Install] WantedBy=multi-user.target ``` Enable and start the service: ```bash sudo systemctl daemon-reload sudo systemctl enable --now gunicorn-bugsink.service sudo systemctl status gunicorn-bugsink.service ``` Test that Gunicorn is responding (replace hostname): ```bash curl http://localhost:8000/accounts/login/ --header "Host: bugsink.yourdomain.com" ``` You should see HTML output containing a login form. ### Step 8: Create Snappea Background Worker Service Snappea is Bugsink's background task processor. 
Create `/etc/systemd/system/snappea.service`: ```bash sudo nano /etc/systemd/system/snappea.service ``` Add the following content: ```ini [Unit] Description=Snappea daemon for Bugsink background tasks After=network.target [Service] Restart=always User=bugsink Group=bugsink Environment="PYTHONUNBUFFERED=1" WorkingDirectory=/home/bugsink ExecStart=/home/bugsink/venv/bin/bugsink-runsnappea KillMode=mixed TimeoutStopSec=5 RuntimeMaxSec=1d [Install] WantedBy=multi-user.target ``` Enable and start the service: ```bash sudo systemctl daemon-reload sudo systemctl enable --now snappea.service sudo systemctl status snappea.service ``` Verify snappea is working: ```bash sudo su - bugsink source venv/bin/activate bugsink-manage checksnappea exit ``` ### Step 9: Configure NGINX for Bugsink Create `/etc/nginx/sites-available/bugsink`: ```bash sudo nano /etc/nginx/sites-available/bugsink ``` Add the following (replace `bugsink.yourdomain.com` with your hostname): ```nginx server { server_name bugsink.yourdomain.com; listen 80; client_max_body_size 20M; access_log /var/log/nginx/bugsink.access.log; error_log /var/log/nginx/bugsink.error.log; location / { proxy_pass http://127.0.0.1:8000; proxy_set_header Host $host; proxy_set_header X-Real-IP $remote_addr; proxy_set_header X-Forwarded-Proto $scheme; } } ``` Enable the site: ```bash sudo ln -s /etc/nginx/sites-available/bugsink /etc/nginx/sites-enabled/ sudo nginx -t sudo systemctl reload nginx ``` ### Step 10: Configure SSL with Certbot (Recommended) ```bash sudo certbot --nginx -d bugsink.yourdomain.com ``` After SSL is configured, update the NGINX config to add security headers. Edit `/etc/nginx/sites-available/bugsink` and add to the `location /` block: ```nginx add_header Strict-Transport-Security "max-age=31536000; preload" always; ``` Reload NGINX: ```bash sudo nginx -t sudo systemctl reload nginx ``` ### Step 11: Create Projects and Get DSNs 1. Access Bugsink UI at `https://bugsink.yourdomain.com` 2. 
Log in with the admin credentials you created 3. Create a new team (or use the default) 4. Create projects for each environment: **Production:** - **flyer-crawler-backend** (Platform: Node.js) - **flyer-crawler-frontend** (Platform: JavaScript/React) **Test:** - **flyer-crawler-backend-test** (Platform: Node.js) - **flyer-crawler-frontend-test** (Platform: JavaScript/React) 5. For each project, go to Settings → Client Keys (DSN) 6. Copy the DSN URLs - you'll have 4 DSNs total (2 for production, 2 for test) > **Note:** The dev container runs its own local Bugsink instance at `localhost:8000` - no remote DSNs needed for development. ### Step 12: Configure Application to Use Bugsink The flyer-crawler application receives its configuration via **Gitea CI/CD secrets**, not local environment files. Follow these steps to add the Bugsink DSNs: #### 1. Add Secrets in Gitea Navigate to your repository in Gitea: 1. Go to **Settings** → **Actions** → **Secrets** 2. Add the following secrets: **Production DSNs:** | Secret Name | Value | Description | | ----------------- | -------------------------------------- | ----------------------- | | `SENTRY_DSN` | `https://KEY@bugsink.yourdomain.com/1` | Production backend DSN | | `VITE_SENTRY_DSN` | `https://KEY@bugsink.yourdomain.com/2` | Production frontend DSN | **Test DSNs:** | Secret Name | Value | Description | | ---------------------- | -------------------------------------- | ----------------- | | `SENTRY_DSN_TEST` | `https://KEY@bugsink.yourdomain.com/3` | Test backend DSN | | `VITE_SENTRY_DSN_TEST` | `https://KEY@bugsink.yourdomain.com/4` | Test frontend DSN | > **Note:** The project numbers in the DSN URLs are assigned by Bugsink when you create each project. Use the actual DSN values from Step 11. #### 2. Update the Deployment Workflows **Production** (`deploy-to-prod.yml`): In the `Install Backend Dependencies and Restart Production Server` step, add to the `env:` block: ```yaml env: # ... existing secrets ... 
# Sentry/Bugsink Error Tracking SENTRY_DSN: ${{ secrets.SENTRY_DSN }} SENTRY_ENVIRONMENT: 'production' SENTRY_ENABLED: 'true' ``` In the build step, add frontend variables: ```yaml VITE_SENTRY_DSN=${{ secrets.VITE_SENTRY_DSN }} \ VITE_SENTRY_ENVIRONMENT=production \ VITE_SENTRY_ENABLED=true \ npm run build ``` **Test** (`deploy-to-test.yml`): In the `Install Backend Dependencies and Restart Test Server` step, add to the `env:` block: ```yaml env: # ... existing secrets ... # Sentry/Bugsink Error Tracking (Test) SENTRY_DSN: ${{ secrets.SENTRY_DSN_TEST }} SENTRY_ENVIRONMENT: 'test' SENTRY_ENABLED: 'true' ``` In the build step, add frontend variables: ```yaml VITE_SENTRY_DSN=${{ secrets.VITE_SENTRY_DSN_TEST }} \ VITE_SENTRY_ENVIRONMENT=test \ VITE_SENTRY_ENABLED=true \ npm run build ``` #### 3. Update ecosystem.config.cjs Add Sentry variables to the `sharedEnv` object in `ecosystem.config.cjs`: ```javascript const sharedEnv = { // ... existing variables ... SENTRY_DSN: process.env.SENTRY_DSN, SENTRY_ENVIRONMENT: process.env.SENTRY_ENVIRONMENT, SENTRY_ENABLED: process.env.SENTRY_ENABLED, }; ``` #### 4. Dev Container (No Configuration Needed) The dev container runs its own **local Bugsink instance** at `http://localhost:8000`. No remote DSNs or Gitea secrets are needed for development: - DSNs are pre-configured in `compose.dev.yml` - Admin UI: `http://localhost:8000` (login: `admin@localhost` / `admin`) - Errors stay local and isolated from production/test #### 5. Deploy to Apply Changes Trigger deployments via Gitea Actions: - **Test**: Automatically deploys on push to `main` - **Production**: Manual trigger via workflow dispatch **Note:** There is no `/etc/flyer-crawler/environment` file on the server. Production and test secrets are managed through Gitea CI/CD and injected at deployment time. Dev container uses local `.env` file. See [CLAUDE.md](../CLAUDE.md) for details. 
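When starting PM2 by hand rather than through CI/CD, it is easy to miss one of the variables wired up in this step. A minimal guard, as a sketch (the `check_sentry_env` helper is illustrative, not part of the repository; the variable names are the ones from the workflow `env:` block above):

```shell
# check_sentry_env
# Confirms the Sentry/Bugsink variables from this step are present in the
# current shell. Prints each missing name to stderr and returns nonzero if
# any are unset, so it can gate a PM2 reload.
check_sentry_env() {
  local var missing=0
  for var in SENTRY_DSN SENTRY_ENVIRONMENT SENTRY_ENABLED; do
    if [ -z "${!var}" ]; then
      echo "missing: $var" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Usage: check_sentry_env && pm2 startOrReload ecosystem.config.cjs --update-env
```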
### Step 13: Test Error Tracking

You can test that Bugsink is working before configuring the flyer-crawler application. Switch to the bugsink user and open a Python shell:

```bash
sudo su - bugsink
source venv/bin/activate
bugsink-manage shell
```

In the Python shell, send a test message using the **backend DSN** from Step 11:

```python
import sentry_sdk
sentry_sdk.init("https://YOUR_BACKEND_KEY@bugsink.yourdomain.com/1")
sentry_sdk.capture_message("Test message from Bugsink setup")
exit()
```

Exit back to root:

```bash
exit
```

Check the Bugsink UI - you should see the test message appear in the `flyer-crawler-backend` project.

### Step 14: Test from Flyer-Crawler Application (After App Setup)

Once the flyer-crawler application has been deployed with the Sentry secrets configured in Step 12:

```bash
cd /var/www/flyer-crawler.projectium.com
npx tsx scripts/test-bugsink.ts
```

Check the Bugsink UI - you should see a test event appear.

### Bugsink Maintenance Commands

| Task                    | Command                                                                                                                        |
| ----------------------- | ------------------------------------------------------------------------------------------------------------------------------ |
| View Gunicorn status    | `sudo systemctl status gunicorn-bugsink`                                                                                       |
| View Snappea status     | `sudo systemctl status snappea`                                                                                                |
| View Gunicorn logs      | `sudo journalctl -u gunicorn-bugsink -f`                                                                                       |
| View Snappea logs       | `sudo journalctl -u snappea -f`                                                                                                |
| Restart Bugsink         | `sudo systemctl restart gunicorn-bugsink snappea`                                                                              |
| Run management commands | `sudo su - bugsink`, then `source venv/bin/activate && bugsink-manage `                                                        |
| Upgrade Bugsink         | `sudo -u bugsink /home/bugsink/venv/bin/pip install bugsink --upgrade`, then `sudo systemctl restart gunicorn-bugsink snappea` |

---

## Logstash Log Aggregation

Logstash aggregates logs from the application and infrastructure, forwarding errors to Bugsink.

> **Note:** Logstash integration is **optional**.
The flyer-crawler application already sends errors directly to Bugsink via the Sentry SDK. Logstash is only needed if you want to aggregate logs from other sources (Redis, NGINX, etc.) into Bugsink. ### Step 1: Create Application Log Directory The flyer-crawler application automatically creates its log directory on startup, but you need to ensure proper permissions for Logstash to read the logs. Create the log directories and set appropriate permissions: ```bash # Create log directory for the production application sudo mkdir -p /var/www/flyer-crawler.projectium.com/logs # Set ownership to root (since PM2 runs as root) sudo chown -R root:root /var/www/flyer-crawler.projectium.com/logs # Make logs readable by logstash user sudo chmod 755 /var/www/flyer-crawler.projectium.com/logs ``` For the test environment: ```bash sudo mkdir -p /var/www/flyer-crawler-test.projectium.com/logs sudo chown -R root:root /var/www/flyer-crawler-test.projectium.com/logs sudo chmod 755 /var/www/flyer-crawler-test.projectium.com/logs ``` ### Step 2: Application File Logging (Already Configured) The flyer-crawler application uses Pino for logging and is configured to write logs to files in production/test environments: **Log File Locations:** | Environment | Log File Path | | ------------- | --------------------------------------------------------- | | Production | `/var/www/flyer-crawler.projectium.com/logs/app.log` | | Test | `/var/www/flyer-crawler-test.projectium.com/logs/app.log` | | Dev Container | `/app/logs/app.log` | **How It Works:** - In production/test: Pino writes JSON logs to both stdout (for PM2) AND `logs/app.log` (for Logstash) - In development: Pino uses pino-pretty for human-readable console output only - The log directory is created automatically if it doesn't exist - You can override the log directory with the `LOG_DIR` environment variable **Verify Logging After Deployment:** After deploying the application, verify that logs are being written: ```bash # Check 
production logs ls -la /var/www/flyer-crawler.projectium.com/logs/ tail -f /var/www/flyer-crawler.projectium.com/logs/app.log # Check test logs ls -la /var/www/flyer-crawler-test.projectium.com/logs/ tail -f /var/www/flyer-crawler-test.projectium.com/logs/app.log ``` You should see JSON-formatted log entries like: ```json { "level": 30, "time": 1704067200000, "msg": "Server started on port 3001", "module": "server" } ``` ### Step 3: Install Logstash ```bash # Add Elastic APT repository wget -qO - https://artifacts.elastic.co/GPG-KEY-elasticsearch | sudo gpg --dearmor -o /usr/share/keyrings/elastic-keyring.gpg echo "deb [signed-by=/usr/share/keyrings/elastic-keyring.gpg] https://artifacts.elastic.co/packages/8.x/apt stable main" | sudo tee /etc/apt/sources.list.d/elastic-8.x.list # Update and install sudo apt update sudo apt install -y logstash ``` Verify installation: ```bash /usr/share/logstash/bin/logstash --version ``` ### Step 4: Configure Logstash Pipeline Create the pipeline configuration file: ```bash sudo nano /etc/logstash/conf.d/bugsink.conf ``` Add the following content: ```conf input { # Production application logs (Pino JSON format) file { path => "/var/www/flyer-crawler.projectium.com/logs/app.log" codec => json_lines type => "pino" tags => ["app", "production"] start_position => "end" sincedb_path => "/var/lib/logstash/sincedb_pino_prod" } # Test environment logs file { path => "/var/www/flyer-crawler-test.projectium.com/logs/app.log" codec => json_lines type => "pino" tags => ["app", "test"] start_position => "end" sincedb_path => "/var/lib/logstash/sincedb_pino_test" } # Redis logs (shared by both environments) file { path => "/var/log/redis/redis-server.log" type => "redis" tags => ["infra", "redis", "production"] start_position => "end" sincedb_path => "/var/lib/logstash/sincedb_redis" } # NGINX error logs (production) file { path => "/var/log/nginx/error.log" type => "nginx" tags => ["infra", "nginx", "production"] start_position => "end" 
sincedb_path => "/var/lib/logstash/sincedb_nginx_error" } # NGINX access logs - for detecting 5xx errors (production) file { path => "/var/log/nginx/access.log" type => "nginx_access" tags => ["infra", "nginx", "production"] start_position => "end" sincedb_path => "/var/lib/logstash/sincedb_nginx_access" } # PM2 error logs - Production (plain text stack traces) file { path => "/home/gitea-runner/.pm2/logs/flyer-crawler-*-error.log" exclude => "*-test-error.log" type => "pm2" tags => ["infra", "pm2", "production"] start_position => "end" sincedb_path => "/var/lib/logstash/sincedb_pm2_prod" } # PM2 error logs - Test file { path => "/home/gitea-runner/.pm2/logs/flyer-crawler-*-test-error.log" type => "pm2" tags => ["infra", "pm2", "test"] start_position => "end" sincedb_path => "/var/lib/logstash/sincedb_pm2_test" } } filter { # Pino log level detection # Pino levels: 10=trace, 20=debug, 30=info, 40=warn, 50=error, 60=fatal if [type] == "pino" and [level] { if [level] >= 50 { mutate { add_tag => ["error"] } } else if [level] >= 40 { mutate { add_tag => ["warning"] } } } # Redis error detection if [type] == "redis" { grok { match => { "message" => "%{POSINT:pid}:%{WORD:role} %{MONTHDAY} %{MONTH} %{YEAR}? 
?%{TIME} %{WORD:loglevel} %{GREEDYDATA:redis_message}" } } if [loglevel] in ["WARNING", "ERROR"] { mutate { add_tag => ["error"] } } } # NGINX error log detection (all entries are errors) if [type] == "nginx" { mutate { add_tag => ["error"] } grok { match => { "message" => "%{TIMESTAMP_ISO8601:timestamp} \[%{WORD:severity}\] %{GREEDYDATA:nginx_message}" } } } # NGINX access log - detect 5xx errors if [type] == "nginx_access" { grok { match => { "message" => "%{COMBINEDAPACHELOG}" } } if [response] =~ /^5\d{2}$/ { mutate { add_tag => ["error"] } } } # PM2 error log detection - tag lines with actual error indicators if [type] == "pm2" { if [message] =~ /Error:|error:|ECONNREFUSED|ENOENT|TypeError|ReferenceError|SyntaxError/ { mutate { add_tag => ["error"] } } } } output { # Production app errors -> flyer-crawler-backend (project 1) if "error" in [tags] and "app" in [tags] and "production" in [tags] { http { url => "http://localhost:8000/api/1/store/" http_method => "post" format => "json" headers => { "X-Sentry-Auth" => "Sentry sentry_version=7, sentry_client=logstash/1.0, sentry_key=YOUR_PROD_BACKEND_DSN_KEY" } } } # Test app errors -> flyer-crawler-backend-test (project 3) if "error" in [tags] and "app" in [tags] and "test" in [tags] { http { url => "http://localhost:8000/api/3/store/" http_method => "post" format => "json" headers => { "X-Sentry-Auth" => "Sentry sentry_version=7, sentry_client=logstash/1.0, sentry_key=YOUR_TEST_BACKEND_DSN_KEY" } } } # Production infrastructure errors (Redis, NGINX, PM2) -> flyer-crawler-infrastructure (project 5) if "error" in [tags] and "infra" in [tags] and "production" in [tags] { http { url => "http://localhost:8000/api/5/store/" http_method => "post" format => "json" headers => { "X-Sentry-Auth" => "Sentry sentry_version=7, sentry_client=logstash/1.0, sentry_key=b083076f94fb461b889d5dffcbef43bf" } } } # Test infrastructure errors (PM2 test logs) -> flyer-crawler-test-infrastructure (project 6) if "error" in [tags] and 
"infra" in [tags] and "test" in [tags] { http { url => "http://localhost:8000/api/6/store/" http_method => "post" format => "json" headers => { "X-Sentry-Auth" => "Sentry sentry_version=7, sentry_client=logstash/1.0, sentry_key=25020dd6c2b74ad78463ec90e90fadab" } } } # Debug output (uncomment to troubleshoot) # stdout { codec => rubydebug } } ``` **Bugsink Project DSNs:** | Project | DSN Key | Project ID | | ----------------------------------- | ---------------------------------- | ---------- | | `flyer-crawler-backend` | `911aef02b9a548fa8fabb8a3c81abfe5` | 1 | | `flyer-crawler-frontend` | (used by app, not Logstash) | 2 | | `flyer-crawler-backend-test` | `cdb99c314589431e83d4cc38a809449b` | 3 | | `flyer-crawler-frontend-test` | (used by app, not Logstash) | 4 | | `flyer-crawler-infrastructure` | `b083076f94fb461b889d5dffcbef43bf` | 5 | | `flyer-crawler-test-infrastructure` | `25020dd6c2b74ad78463ec90e90fadab` | 6 | **Note:** The DSN key is the part before `@` in the full DSN URL (e.g., `https://KEY@bugsink.projectium.com/PROJECT_ID`). **Note on PM2 Logs:** PM2 error logs capture stack traces from stderr, which are valuable for debugging startup errors and uncaught exceptions. Production PM2 logs go to project 5 (infrastructure), test PM2 logs go to project 6 (test-infrastructure). 
### Step 5: Create Logstash State Directory and Fix Config Path Logstash needs a directory to track which log lines it has already processed, and a symlink so it can find its config files: ```bash # Create state directory for sincedb files sudo mkdir -p /var/lib/logstash sudo chown logstash:logstash /var/lib/logstash # Create symlink so Logstash finds its config (avoids "Could not find logstash.yml" warning) sudo ln -sf /etc/logstash /usr/share/logstash/config ``` ### Step 6: Grant Logstash Access to Application Logs Logstash runs as the `logstash` user and needs permission to read log files: ```bash # Add logstash user to adm group (for nginx and redis logs) sudo usermod -aG adm logstash # Make application log files readable (created automatically when app starts) sudo chmod 644 /var/www/flyer-crawler.projectium.com/logs/app.log 2>/dev/null || echo "Production log file not yet created" sudo chmod 644 /var/www/flyer-crawler-test.projectium.com/logs/app.log 2>/dev/null || echo "Test log file not yet created" # Make Redis logs and directory readable sudo chmod 755 /var/log/redis/ sudo chmod 644 /var/log/redis/redis-server.log # Make NGINX logs readable sudo chmod 644 /var/log/nginx/access.log /var/log/nginx/error.log # Make PM2 logs and directories accessible sudo chmod 755 /home/gitea-runner/ sudo chmod 755 /home/gitea-runner/.pm2/ sudo chmod 755 /home/gitea-runner/.pm2/logs/ sudo chmod 644 /home/gitea-runner/.pm2/logs/*.log # Verify logstash group membership groups logstash ``` **Note:** The application log files are created automatically when the application starts. Run the chmod commands after the first deployment. ### Step 7: Test Logstash Configuration Test the configuration before starting: ```bash sudo /usr/share/logstash/bin/logstash --config.test_and_exit -f /etc/logstash/conf.d/bugsink.conf ``` You should see `Configuration OK` if there are no errors. 
### Step 8: Start Logstash ```bash sudo systemctl enable logstash sudo systemctl start logstash sudo systemctl status logstash ``` View Logstash logs to verify it's working: ```bash sudo journalctl -u logstash -f ``` ### Troubleshooting Logstash | Issue | Solution | | -------------------------- | -------------------------------------------------------------------------------------------------------- | | "Permission denied" errors | Check file permissions on log files and sincedb directory | | No events being processed | Verify log file paths exist and contain data | | HTTP output errors | Check Bugsink is running and DSN key is correct | | Logstash not starting | Run config test: `sudo /usr/share/logstash/bin/logstash --config.test_and_exit -f /etc/logstash/conf.d/` | ### Alternative: Skip Logstash Since the flyer-crawler application already sends errors directly to Bugsink via the Sentry SDK (configured in Steps 11-12), you may not need Logstash at all. Logstash is primarily useful for: - Aggregating logs from services that don't have native Sentry support (Redis, NGINX) - Centralizing all logs in one place - Complex log transformations If you only need application error tracking, the Sentry SDK integration is sufficient. --- ## PostgreSQL Function Observability (ADR-050) PostgreSQL function observability provides structured logging and error tracking for database functions, preventing silent failures. This setup forwards database errors to Bugsink for centralized monitoring. See [ADR-050](adr/0050-postgresql-function-observability.md) for the full architecture decision. 
### Prerequisites - PostgreSQL 14+ installed and running - Logstash installed and configured (see [Logstash section](#logstash-log-aggregation) above) - Bugsink running at `https://bugsink.projectium.com` ### Step 1: Configure PostgreSQL Logging Create the observability configuration file: ```bash sudo nano /etc/postgresql/14/main/conf.d/observability.conf ``` Add the following content: ```ini # PostgreSQL Logging Configuration for Database Function Observability (ADR-050) # Enable logging to files for Logstash pickup logging_collector = on log_destination = 'stderr' log_directory = '/var/log/postgresql' log_filename = 'postgresql-%Y-%m-%d.log' log_rotation_age = 1d log_rotation_size = 100MB log_truncate_on_rotation = on # Log level - capture NOTICE and above (includes fn_log WARNING/ERROR) log_min_messages = notice client_min_messages = notice # Include useful context in log prefix log_line_prefix = '%t [%p] %u@%d ' # Capture slow queries from functions (1 second threshold) log_min_duration_statement = 1000 # Log statement types (off for production) log_statement = 'none' # Connection logging (off for production to reduce noise) log_connections = off log_disconnections = off ``` Set up the log directory: ```bash # Create log directory sudo mkdir -p /var/log/postgresql # Set ownership to postgres user sudo chown postgres:postgres /var/log/postgresql sudo chmod 750 /var/log/postgresql ``` Restart PostgreSQL: ```bash sudo systemctl restart postgresql ``` Verify logging is working: ```bash # Check that log files are being created ls -la /var/log/postgresql/ # Should see files like: postgresql-2026-01-20.log ``` ### Step 2: Configure Logstash for PostgreSQL Logs The Logstash configuration is located at `/etc/logstash/conf.d/bugsink.conf`. 
**Key features:** - Parses PostgreSQL log format with grok patterns - Extracts JSON from `fn_log()` function calls - Tags WARNING/ERROR level logs - Routes production database errors to Bugsink project 1 - Routes test database errors to Bugsink project 3 - Transforms events to Sentry-compatible format **Configuration file:** `/etc/logstash/conf.d/bugsink.conf` See the [Logstash Configuration Reference](#logstash-configuration-reference) below for the complete configuration. **Grant Logstash access to PostgreSQL logs:** ```bash # Add logstash user to postgres group sudo usermod -aG postgres logstash # Verify group membership groups logstash # Restart Logstash to apply changes sudo systemctl restart logstash ``` ### Step 3: Test the Pipeline Test structured logging from PostgreSQL: ```bash # Production database (routes to Bugsink project 1) sudo -u postgres psql -d flyer-crawler-prod -c "SELECT fn_log('WARNING', 'test_observability', 'Testing PostgreSQL observability pipeline', '{\"environment\": \"production\"}'::jsonb);" # Test database (routes to Bugsink project 3) sudo -u postgres psql -d flyer-crawler-test -c "SELECT fn_log('WARNING', 'test_observability', 'Testing PostgreSQL observability pipeline', '{\"environment\": \"test\"}'::jsonb);" ``` Check Bugsink UI: - Production errors: → Project 1 (flyer-crawler-backend) - Test errors: → Project 3 (flyer-crawler-backend-test) ### Step 4: Verify Database Functions The following critical functions use `fn_log()` for observability: | Function | What it logs | | -------------------------- | ---------------------------------------- | | `award_achievement()` | Missing achievements, duplicate awards | | `fork_recipe()` | Missing original recipes | | `handle_new_user()` | User creation events | | `approve_correction()` | Permission denied, corrections not found | | `complete_shopping_list()` | Permission checks, list not found | Test error logging with a database function: ```bash # Try to award a non-existent achievement 
(should fail and log to Bugsink) sudo -u postgres psql -d flyer-crawler-test -c "SELECT award_achievement('00000000-0000-0000-0000-000000000000'::uuid, 'NonexistentBadge');" # Check Bugsink project 3 - should see an ERROR with full context ``` ### Logstash Configuration Reference Complete configuration for PostgreSQL observability (`/etc/logstash/conf.d/bugsink.conf`): ```conf input { # PostgreSQL function logs (ADR-050) # Both production and test databases write to the same log files file { path => "/var/log/postgresql/*.log" type => "postgres" tags => ["postgres", "database"] start_position => "beginning" sincedb_path => "/var/lib/logstash/sincedb_postgres" } } filter { # PostgreSQL function log parsing (ADR-050) if [type] == "postgres" { # Extract timestamp, timezone, process ID, user, database, level, and message grok { match => { "message" => "%{TIMESTAMP_ISO8601:pg_timestamp} [+-]%{INT:pg_timezone} \[%{POSINT:pg_pid}\] %{DATA:pg_user}@%{DATA:pg_database} %{WORD:pg_level}: %{GREEDYDATA:pg_message}" } } # Try to parse pg_message as JSON (from fn_log()) if [pg_message] =~ /^\{/ { json { source => "pg_message" target => "fn_log" skip_on_invalid_json => true } # Mark as error if level is WARNING or ERROR if [fn_log][level] in ["WARNING", "ERROR"] { mutate { add_tag => ["error", "db_function"] } } } # Also catch native PostgreSQL errors if [pg_level] in ["ERROR", "FATAL"] { mutate { add_tag => ["error", "postgres_native"] } } # Detect environment from database name if [pg_database] == "flyer-crawler-prod" { mutate { add_tag => ["production"] } } else if [pg_database] == "flyer-crawler-test" { mutate { add_tag => ["test"] } } # Generate event_id for Sentry if "error" in [tags] { uuid { target => "[@metadata][event_id]" overwrite => true } } } } output { # Production database errors -> project 1 (flyer-crawler-backend) if "error" in [tags] and "production" in [tags] { http { url => "https://bugsink.projectium.com/api/1/store/" http_method => "post" format => "json" 
headers => { "X-Sentry-Auth" => "Sentry sentry_version=7, sentry_client=logstash/1.0, sentry_key=911aef02b9a548fa8fabb8a3c81abfe5" "Content-Type" => "application/json" } mapping => { "event_id" => "%{[@metadata][event_id]}" "timestamp" => "%{@timestamp}" "platform" => "other" "level" => "error" "logger" => "postgresql" "message" => "%{[fn_log][message]}" "environment" => "production" "extra" => { "pg_user" => "%{[pg_user]}" "pg_database" => "%{[pg_database]}" "pg_function" => "%{[fn_log][function]}" "pg_level" => "%{[pg_level]}" "context" => "%{[fn_log][context]}" } } } } # Test database errors -> project 3 (flyer-crawler-backend-test) if "error" in [tags] and "test" in [tags] { http { url => "https://bugsink.projectium.com/api/3/store/" http_method => "post" format => "json" headers => { "X-Sentry-Auth" => "Sentry sentry_version=7, sentry_client=logstash/1.0, sentry_key=cdb99c314589431e83d4cc38a809449b" "Content-Type" => "application/json" } mapping => { "event_id" => "%{[@metadata][event_id]}" "timestamp" => "%{@timestamp}" "platform" => "other" "level" => "error" "logger" => "postgresql" "message" => "%{[fn_log][message]}" "environment" => "test" "extra" => { "pg_user" => "%{[pg_user]}" "pg_database" => "%{[pg_database]}" "pg_function" => "%{[fn_log][function]}" "pg_level" => "%{[pg_level]}" "context" => "%{[fn_log][context]}" } } } } } ``` ### Extended Logstash Configuration (PM2, Redis, NGINX) The complete production Logstash configuration includes additional log sources beyond PostgreSQL: **Input Sources:** ```conf input { # PostgreSQL function logs (shown above) # PM2 Worker stdout logs (production) file { path => "/home/gitea-runner/.pm2/logs/flyer-crawler-worker-*.log" type => "pm2_stdout" tags => ["infra", "pm2", "worker", "production"] start_position => "end" sincedb_path => "/var/lib/logstash/sincedb_pm2_worker_prod" exclude => "*-test-*.log" } # PM2 Analytics Worker stdout (production) file { path => 
"/home/gitea-runner/.pm2/logs/flyer-crawler-analytics-worker-*.log" type => "pm2_stdout" tags => ["infra", "pm2", "analytics", "production"] start_position => "end" sincedb_path => "/var/lib/logstash/sincedb_pm2_analytics_prod" exclude => "*-test-*.log" } # PM2 Worker stdout (test environment) file { path => "/home/gitea-runner/.pm2/logs/flyer-crawler-worker-test-*.log" type => "pm2_stdout" tags => ["infra", "pm2", "worker", "test"] start_position => "end" sincedb_path => "/var/lib/logstash/sincedb_pm2_worker_test" } # PM2 Analytics Worker stdout (test environment) file { path => "/home/gitea-runner/.pm2/logs/flyer-crawler-analytics-worker-test-*.log" type => "pm2_stdout" tags => ["infra", "pm2", "analytics", "test"] start_position => "end" sincedb_path => "/var/lib/logstash/sincedb_pm2_analytics_test" } # Redis logs (already configured) file { path => "/var/log/redis/redis-server.log" type => "redis" tags => ["infra", "redis"] start_position => "end" sincedb_path => "/var/lib/logstash/sincedb_redis" } # NGINX access logs file { path => "/var/log/nginx/access.log" type => "nginx_access" tags => ["infra", "nginx", "access"] start_position => "end" sincedb_path => "/var/lib/logstash/sincedb_nginx_access" } # NGINX error logs file { path => "/var/log/nginx/error.log" type => "nginx_error" tags => ["infra", "nginx", "error"] start_position => "end" sincedb_path => "/var/lib/logstash/sincedb_nginx_error" } } ``` **Filter Rules:** ```conf filter { # PostgreSQL filters (shown above) # PM2 Worker log parsing if [type] == "pm2_stdout" { # Try to parse as JSON first (if worker uses Pino) json { source => "message" target => "pm2_json" skip_on_invalid_json => true } # If JSON parsing succeeded, extract level and tag errors if [pm2_json][level] { if [pm2_json][level] >= 50 { mutate { add_tag => ["error"] } } } # If not JSON, check for error keywords in plain text else if [message] =~ /(Error|ERROR|Exception|EXCEPTION|Fatal|FATAL|failed|FAILED)/ { mutate { add_tag => ["error"] 
}
  }

    # Generate event_id for errors
    if "error" in [tags] {
      uuid { target => "[@metadata][event_id]" overwrite => true }
    }
  }

  # Redis log parsing
  if [type] == "redis" {
    grok {
      match => { "message" => "%{POSINT:pid}:%{WORD:role} %{MONTHDAY} %{MONTH} %{TIME} %{WORD:loglevel} %{GREEDYDATA:redis_message}" }
    }

    # Tag errors (WARNING/ERROR) for Bugsink forwarding
    if [loglevel] in ["WARNING", "ERROR"] {
      mutate { add_tag => ["error"] }
      uuid { target => "[@metadata][event_id]" overwrite => true }
    }
    # Tag INFO-level operational events (startup, config, persistence)
    else if [loglevel] == "INFO" {
      mutate { add_tag => ["redis_operational"] }
    }
  }

  # NGINX access log parsing
  if [type] == "nginx_access" {
    grok { match => { "message" => "%{COMBINEDAPACHELOG}" } }

    # Parse response time if available (requires NGINX log format with request_time)
    if [message] =~ /request_time:(\d+\.\d+)/ {
      grok {
        match => { "message" => "request_time:(?<request_time_seconds>\d+\.\d+)" }
      }
      # Convert to a float so the > 1.0 comparison below works
      mutate { convert => { "request_time_seconds" => "float" } }
    }

    # Categorize by status code
    if [response] =~ /^5\d{2}$/ {
      mutate { add_tag => ["error", "http_5xx"] }
      uuid { target => "[@metadata][event_id]" overwrite => true }
    } else if [response] =~ /^4\d{2}$/ {
      mutate { add_tag => ["client_error", "http_4xx"] }
    } else if [response] =~ /^2\d{2}$/ {
      mutate { add_tag => ["success", "http_2xx"] }
    } else if [response] =~ /^3\d{2}$/ {
      mutate { add_tag => ["redirect", "http_3xx"] }
    }

    # Tag slow requests (>1 second response time)
    if [request_time_seconds] and [request_time_seconds] > 1.0 {
      mutate { add_tag => ["slow_request"] }
    }

    # Always tag for monitoring
    mutate { add_tag => ["access_log"] }
  }

  # NGINX error log parsing
  if [type] == "nginx_error" {
    mutate { add_tag => ["error"] }
    uuid { target => "[@metadata][event_id]" overwrite => true }
  }
}
```

**Output Rules:**

```conf
output {
  # Production errors -> Bugsink infrastructure project (5)
  # Includes: PM2 worker errors, Redis errors, NGINX 5xx, PostgreSQL errors
  if "error" in [tags] and "infra" in [tags] and "production" in [tags] {
    http {
      url =>
"https://bugsink.projectium.com/api/5/store/" http_method => "post" format => "json" headers => { "X-Sentry-Auth" => "Sentry sentry_version=7, sentry_client=logstash/1.0, sentry_key=b083076f94fb461b889d5dffcbef43bf" "Content-Type" => "application/json" } mapping => { "event_id" => "%{[@metadata][event_id]}" "timestamp" => "%{@timestamp}" "platform" => "other" "level" => "error" "logger" => "%{type}" "message" => "%{message}" "environment" => "production" } } } # Test errors -> Bugsink test infrastructure project (6) if "error" in [tags] and "infra" in [tags] and "test" in [tags] { http { url => "https://bugsink.projectium.com/api/6/store/" http_method => "post" format => "json" headers => { "X-Sentry-Auth" => "Sentry sentry_version=7, sentry_client=logstash/1.0, sentry_key=25020dd6c2b74ad78463ec90e90fadab" "Content-Type" => "application/json" } mapping => { "event_id" => "%{[@metadata][event_id]}" "timestamp" => "%{@timestamp}" "platform" => "other" "level" => "error" "logger" => "%{type}" "message" => "%{message}" "environment" => "test" } } } # PM2 worker operational logs (non-errors) -> file if [type] == "pm2_stdout" and "error" not in [tags] { file { path => "/var/log/logstash/pm2-workers-%{+YYYY-MM-dd}.log" codec => json_lines } } # Redis INFO logs (operational events) -> file if "redis_operational" in [tags] { file { path => "/var/log/logstash/redis-operational-%{+YYYY-MM-dd}.log" codec => json_lines } } # NGINX access logs (all requests) -> file if "access_log" in [tags] { file { path => "/var/log/logstash/nginx-access-%{+YYYY-MM-dd}.log" codec => json_lines } } } ``` **Setup Instructions:** 1. Create log output directory: ```bash sudo mkdir -p /var/log/logstash sudo chown logstash:logstash /var/log/logstash ``` 2. 
Configure logrotate for Logstash file outputs (a minimal policy; adjust retention to suit your disk budget):

```bash
sudo tee /etc/logrotate.d/logstash <<'EOF'
/var/log/logstash/*.log {
    daily
    rotate 14
    compress
    missingok
    notifempty
}
EOF
```

---

## Maintenance Commands

### Database Backups

```bash
# Backup application database
pg_dump -U flyer_crawler -h localhost flyer_crawler > backup_$(date +%Y%m%d).sql

# Backup Bugsink database
pg_dump -U bugsink -h localhost bugsink > bugsink_backup_$(date +%Y%m%d).sql
```

### Log Locations

| Log               | Location                         |
| ----------------- | -------------------------------- |
| Application (PM2) | `~/.pm2/logs/`                   |
| NGINX access      | `/var/log/nginx/access.log`      |
| NGINX error       | `/var/log/nginx/error.log`       |
| PostgreSQL        | `/var/log/postgresql/`           |
| Redis             | `/var/log/redis/`                |
| Bugsink           | `journalctl -u gunicorn-bugsink` |
| Logstash          | `/var/log/logstash/`             |

---

## Related Documentation

- [DEPLOYMENT.md](../DEPLOYMENT.md) - Container-based deployment
- [DATABASE.md](../DATABASE.md) - Database schema and extensions
- [AUTHENTICATION.md](../AUTHENTICATION.md) - OAuth provider setup
- [ADR-015](adr/0015-application-performance-monitoring-and-error-tracking.md) - Error tracking architecture