# Bare-Metal Server Setup Guide This guide covers the manual installation of Flyer Crawler and its dependencies on a bare-metal Ubuntu server (e.g., a colocation server). This is the definitive reference for setting up a production environment without containers. **Last verified**: 2026-01-28 **Target Environment**: Ubuntu 22.04 LTS (or newer) **Related documentation**: - [ADR-014: Containerization and Deployment Strategy](../adr/0014-containerization-and-deployment-strategy.md) - [ADR-015: Error Tracking and Observability](../adr/0015-error-tracking-and-observability.md) - [ADR-050: PostgreSQL Function Observability](../adr/0050-postgresql-function-observability.md) - [Deployment Guide](DEPLOYMENT.md) - [Monitoring Guide](MONITORING.md) --- ## Quick Reference ### Installation Time Estimates | Component | Estimated Time | Notes | | ----------- | --------------- | ----------------------------- | | PostgreSQL | 10-15 minutes | Including PostGIS extensions | | Redis | 5 minutes | Quick install | | Node.js | 5 minutes | Via NodeSource repository | | Application | 15-20 minutes | Clone, install, build | | PM2 | 5 minutes | Global install + config | | NGINX | 10-15 minutes | Including SSL via Certbot | | Bugsink | 20-30 minutes | Python venv, systemd services | | Logstash | 15-20 minutes | Including pipeline config | | **Total** | **~90 minutes** | For complete fresh install | ### Post-Installation Verification After completing setup, verify all services: ```bash # Check all services are running systemctl status postgresql nginx redis-server gunicorn-bugsink snappea logstash # Verify application health curl -s https://flyer-crawler.projectium.com/api/health/ready | jq . # Check PM2 processes pm2 list # Verify Bugsink is accessible curl -s https://bugsink.projectium.com/accounts/login/ | head -5 ``` --- ## Server Access Model All commands in this guide are intended for the **system administrator** to execute directly on the server. 
Claude Code and AI tools have **READ-ONLY** access to production servers and cannot execute these commands directly. When Claude assists with server setup or troubleshooting: 1. Claude provides commands for the administrator to execute 2. Administrator runs commands and reports output 3. Claude analyzes results and provides next steps (1-3 commands at a time) 4. Administrator executes and reports results 5. Claude provides verification commands to confirm success --- ## Table of Contents 1. [System Prerequisites](#system-prerequisites) 2. [PostgreSQL Setup](#postgresql-setup) 3. [Redis Setup](#redis-setup) 4. [Node.js and Application Setup](#nodejs-and-application-setup) 5. [PM2 Process Manager](#pm2-process-manager) 6. [NGINX Reverse Proxy](#nginx-reverse-proxy) 7. [Bugsink Error Tracking](#bugsink-error-tracking) 8. [Logstash Log Aggregation](#logstash-log-aggregation) 9. [SSL/TLS with Let's Encrypt](#ssltls-with-lets-encrypt) 10. [Firewall Configuration](#firewall-configuration) 11. [Maintenance Commands](#maintenance-commands) --- ## System Prerequisites Update the system and install essential packages: ```bash sudo apt update && sudo apt upgrade -y sudo apt install -y curl git build-essential python3 python3-pip python3-venv ``` --- ## PostgreSQL Setup ### Install PostgreSQL 14+ with PostGIS ```bash # Add PostgreSQL APT repository sudo sh -c 'echo "deb http://apt.postgresql.org/pub/repos/apt $(lsb_release -cs)-pgdg main" > /etc/apt/sources.list.d/pgdg.list' wget --quiet -O - https://www.postgresql.org/media/keys/ACCC4CF8.asc | sudo apt-key add - sudo apt update # Install PostgreSQL and PostGIS sudo apt install -y postgresql-14 postgresql-14-postgis-3 ``` ### Create Application Database and User ```bash sudo -u postgres psql ``` ```sql -- Create application user and database CREATE USER flyer_crawler WITH PASSWORD 'YOUR_SECURE_PASSWORD'; CREATE DATABASE flyer_crawler OWNER flyer_crawler; -- Connect to the database and enable extensions \c flyer_crawler CREATE 
EXTENSION IF NOT EXISTS postgis;
CREATE EXTENSION IF NOT EXISTS pg_trgm;
CREATE EXTENSION IF NOT EXISTS pgcrypto;

-- Grant privileges
GRANT ALL PRIVILEGES ON DATABASE flyer_crawler TO flyer_crawler;
\q
```

### Configure PostgreSQL for Remote Access (if needed)

Edit `/etc/postgresql/14/main/postgresql.conf`:

```conf
listen_addresses = 'localhost' # Change to '*' for remote access
```

Edit `/etc/postgresql/14/main/pg_hba.conf` to add allowed hosts:

```conf
# Local connections
local   all   all                  peer
host    all   all   127.0.0.1/32   scram-sha-256
```

Restart PostgreSQL:

```bash
sudo systemctl restart postgresql
```

---

## Redis Setup

### Install Redis

```bash
sudo apt install -y redis-server
```

### Configure Redis Password

Edit `/etc/redis/redis.conf`:

```conf
requirepass YOUR_REDIS_PASSWORD
```

Restart Redis:

```bash
sudo systemctl restart redis-server
sudo systemctl enable redis-server
```

### Test Redis Connection

```bash
redis-cli -a YOUR_REDIS_PASSWORD ping
# Should output: PONG
```

---

## Node.js and Application Setup

### Install Node.js 20.x

```bash
curl -fsSL https://deb.nodesource.com/setup_20.x | sudo -E bash -
sudo apt install -y nodejs
```

Verify installation:

```bash
node --version # Should output v20.x.x
npm --version
```

### Install System Dependencies for PDF Processing

```bash
sudo apt install -y poppler-utils # For pdftocairo
```

### Clone and Install Application

```bash
# Create the application directory (this path is used throughout this guide)
sudo mkdir -p /var/www/flyer-crawler.projectium.com
sudo chown $USER:$USER /var/www/flyer-crawler.projectium.com

# Clone repository
cd /var/www/flyer-crawler.projectium.com
git clone https://gitea.projectium.com/flyer-crawler/flyer-crawler.projectium.com.git .

# Install dependencies
npm install

# Build for production
npm run build
```

### Configure Environment Variables

**Important:** The flyer-crawler application does **not** use local environment files in production. All secrets are managed through **Gitea CI/CD secrets** and injected during deployment.

#### How Secrets Work

1.
**Secrets are stored in Gitea** at Repository → Settings → Actions → Secrets 2. **Workflow files** (`.gitea/workflows/deploy-to-prod.yml`) reference secrets using `${{ secrets.SECRET_NAME }}` 3. **PM2** receives environment variables from the workflow's `env:` block 4. **ecosystem.config.cjs** passes variables to the application via `process.env` #### Required Gitea Secrets Before deployment, ensure these secrets are configured in Gitea: **Shared Secrets** (used by both production and test): | Secret Name | Description | | ---------------------- | --------------------------------------- | | `DB_HOST` | Database hostname (usually `localhost`) | | `DB_USER` | Database username | | `DB_PASSWORD` | Database password | | `JWT_SECRET` | JWT signing secret (min 32 characters) | | `GOOGLE_MAPS_API_KEY` | Google Maps API key | | `GOOGLE_CLIENT_ID` | Google OAuth client ID | | `GOOGLE_CLIENT_SECRET` | Google OAuth client secret | | `GH_CLIENT_ID` | GitHub OAuth client ID | | `GH_CLIENT_SECRET` | GitHub OAuth client secret | **Production-Specific Secrets**: | Secret Name | Description | | --------------------------- | -------------------------------------------------------------------- | | `DB_DATABASE_PROD` | Production database name (`flyer_crawler`) | | `REDIS_PASSWORD_PROD` | Redis password for production (uses database 0) | | `VITE_GOOGLE_GENAI_API_KEY` | Gemini API key for production | | `SENTRY_DSN` | Bugsink backend DSN (see [Bugsink section](#bugsink-error-tracking)) | | `VITE_SENTRY_DSN` | Bugsink frontend DSN | **Test-Specific Secrets**: | Secret Name | Description | | -------------------------------- | ----------------------------------------------------------------------------- | | `DB_DATABASE_TEST` | Test database name (`flyer-crawler-test`) | | `REDIS_PASSWORD_TEST` | Redis password for test (uses database 1 for isolation) | | `VITE_GOOGLE_GENAI_API_KEY_TEST` | Gemini API key for test environment | | `SENTRY_DSN_TEST` | Bugsink backend DSN for test (see 
[Bugsink section](#bugsink-error-tracking)) | | `VITE_SENTRY_DSN_TEST` | Bugsink frontend DSN for test | #### Test Environment Details The test environment (`flyer-crawler-test.projectium.com`) uses **both** Gitea CI/CD secrets and a local `.env.test` file: | Path | Purpose | | ------------------------------------------------------ | ---------------------------------------- | | `/var/www/flyer-crawler-test.projectium.com/` | Test application directory | | `/var/www/flyer-crawler-test.projectium.com/.env.test` | Local overrides for test-specific config | **Key differences from production:** - Uses Redis database **1** (production uses database **0**) to isolate job queues - PM2 processes are named with `-test` suffix (e.g., `flyer-crawler-api-test`) - Deployed automatically on every push to `main` branch - Has a `.env.test` file for additional local configuration overrides For detailed information on secrets management, see [CLAUDE.md](../CLAUDE.md). --- ## PM2 Process Manager ### Install PM2 Globally ```bash sudo npm install -g pm2 ``` ### PM2 Configuration Files The application uses **separate ecosystem config files** for production and test environments: | File | Purpose | Processes Started | | --------------------------- | --------------------- | -------------------------------------------------------------------------------------------- | | `ecosystem.config.cjs` | Production deployment | `flyer-crawler-api`, `flyer-crawler-worker`, `flyer-crawler-analytics-worker` | | `ecosystem-test.config.cjs` | Test deployment | `flyer-crawler-api-test`, `flyer-crawler-worker-test`, `flyer-crawler-analytics-worker-test` | **Key Points:** - Production and test processes run **simultaneously** with distinct names - Test processes use `NODE_ENV=test` which enables file logging - Test processes use Redis database 1 (isolated from production which uses database 0) - Both configs validate required environment variables but only warn (don't exit) if missing ### Start Production 
Application ```bash cd /var/www/flyer-crawler.projectium.com # Set required environment variables (usually done via CI/CD) export DB_HOST=localhost export JWT_SECRET=your-secret export GEMINI_API_KEY=your-api-key # ... other required variables pm2 startOrReload ecosystem.config.cjs --update-env && pm2 save ``` This starts three production processes: - `flyer-crawler-api` - Main API server (port 3001) - `flyer-crawler-worker` - Background job worker - `flyer-crawler-analytics-worker` - Analytics processing worker ### Start Test Application ```bash cd /var/www/flyer-crawler-test.projectium.com # Set required environment variables (usually done via CI/CD) export DB_HOST=localhost export DB_NAME=flyer-crawler-test export JWT_SECRET=your-secret export GEMINI_API_KEY=your-test-api-key export REDIS_URL=redis://localhost:6379/1 # Use database 1 for isolation # ... other required variables pm2 startOrReload ecosystem-test.config.cjs --update-env && pm2 save ``` This starts three test processes (running alongside production): - `flyer-crawler-api-test` - Test API server (port 3001 via different NGINX vhost) - `flyer-crawler-worker-test` - Test background job worker - `flyer-crawler-analytics-worker-test` - Test analytics worker ### Verify Running Processes After starting both environments, you should see 6 application processes: ```bash pm2 list ``` Expected output: ```text ┌────┬───────────────────────────────────┬──────────┬────────┬───────────┐ │ id │ name │ mode │ status │ cpu │ ├────┼───────────────────────────────────┼──────────┼────────┼───────────┤ │ 0 │ flyer-crawler-api │ cluster │ online │ 0% │ │ 1 │ flyer-crawler-worker │ fork │ online │ 0% │ │ 2 │ flyer-crawler-analytics-worker │ fork │ online │ 0% │ │ 3 │ flyer-crawler-api-test │ fork │ online │ 0% │ │ 4 │ flyer-crawler-worker-test │ fork │ online │ 0% │ │ 5 │ flyer-crawler-analytics-worker-test│ fork │ online │ 0% │ └────┴───────────────────────────────────┴──────────┴────────┴───────────┘ ``` ### Configure 
PM2 Startup ```bash pm2 startup systemd # Follow the command output to enable PM2 on boot pm2 save ``` ### PM2 Log Rotation ```bash pm2 install pm2-logrotate pm2 set pm2-logrotate:max_size 10M pm2 set pm2-logrotate:retain 14 pm2 set pm2-logrotate:compress true ``` ### Useful PM2 Commands ```bash # View logs for a specific process pm2 logs flyer-crawler-api-test --lines 50 # View environment variables for a process pm2 env # Restart only test processes pm2 restart flyer-crawler-api-test flyer-crawler-worker-test flyer-crawler-analytics-worker-test # Delete all test processes (without affecting production) pm2 delete flyer-crawler-api-test flyer-crawler-worker-test flyer-crawler-analytics-worker-test ``` --- ## NGINX Reverse Proxy ### Install NGINX ```bash sudo apt install -y nginx ``` ### Reference Configuration Files The repository contains reference copies of the actual production NGINX configurations at the project root: - `etc-nginx-sites-available-flyer-crawler.projectium.com` - Production config - `etc-nginx-sites-available-flyer-crawler-test-projectium-com.txt` - Test config These reference files document the exact configuration deployed on the server, including SSL settings managed by Certbot. Use them as a reference when setting up new servers or troubleshooting configuration issues. **Note:** The simplified example below shows the basic structure. For the complete production configuration with SSL, security headers, and all location blocks, refer to the reference files in the repository root. 
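Since the reference copies live in the repository while the live files sit under `/etc/nginx`, it is easy for the two to drift apart after manual hotfixes. As a minimal sketch, a hypothetical `check_drift` helper (not part of the repository) compares a deployed config against its reference copy:

```shell
# check_drift DEPLOYED REFERENCE
# Prints "match" when the deployed config equals the reference copy, or
# "drift" when they differ; the unified diff itself goes to stderr so the
# stdout result stays machine-checkable.
check_drift() {
  if diff -u "$1" "$2" >&2; then
    echo "match"
  else
    echo "drift"
  fi
}
```

For example, assuming the repository checkout is the application directory used elsewhere in this guide, `check_drift /etc/nginx/sites-available/flyer-crawler.projectium.com /var/www/flyer-crawler.projectium.com/etc-nginx-sites-available-flyer-crawler.projectium.com` prints `match` or `drift`.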
### Create Site Configuration Create `/etc/nginx/sites-available/flyer-crawler.projectium.com`: ```nginx server { listen 80; server_name flyer-crawler.projectium.com; # Redirect HTTP to HTTPS (uncomment after SSL setup) # return 301 https://$server_name$request_uri; location / { proxy_pass http://localhost:5173; proxy_http_version 1.1; proxy_set_header Upgrade $http_upgrade; proxy_set_header Connection 'upgrade'; proxy_set_header Host $host; proxy_set_header X-Real-IP $remote_addr; proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for; proxy_set_header X-Forwarded-Proto $scheme; proxy_cache_bypass $http_upgrade; } location /api { proxy_pass http://localhost:3001; proxy_http_version 1.1; proxy_set_header Upgrade $http_upgrade; proxy_set_header Connection 'upgrade'; proxy_set_header Host $host; proxy_set_header X-Real-IP $remote_addr; proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for; proxy_set_header X-Forwarded-Proto $scheme; proxy_cache_bypass $http_upgrade; # File upload size limit client_max_body_size 50M; } # Serve flyer images from static storage (7-day cache) location /flyer-images/ { alias /var/www/flyer-crawler.projectium.com/flyer-images/; expires 7d; add_header Cache-Control "public, immutable"; } # MIME type fix for .mjs files types { application/javascript js mjs; } } ``` ### Static Flyer Images Directory Create the directory for storing flyer images: ```bash # Production sudo mkdir -p /var/www/flyer-crawler.projectium.com/flyer-images sudo chown www-data:www-data /var/www/flyer-crawler.projectium.com/flyer-images # Test environment sudo mkdir -p /var/www/flyer-crawler-test.projectium.com/flyer-images sudo chown www-data:www-data /var/www/flyer-crawler-test.projectium.com/flyer-images ``` The `/flyer-images/` location serves static images with: - **7-day browser cache** (`expires 7d`) - **Immutable cache header** for optimal CDN/browser caching - Direct file serving (no proxy overhead) ### Enable the Site ```bash sudo ln -s 
/etc/nginx/sites-available/flyer-crawler.projectium.com /etc/nginx/sites-enabled/
sudo nginx -t
sudo systemctl reload nginx
sudo systemctl enable nginx
```

---

## Bugsink Error Tracking

Bugsink is a lightweight, self-hosted, Sentry-compatible error tracking system. This guide follows the [official Bugsink single-server production setup](https://www.bugsink.com/docs/single-server-production/). See [ADR-015](../adr/0015-error-tracking-and-observability.md) for architecture details.

### Step 1: Create Bugsink User

Create a dedicated non-root user for Bugsink:

```bash
sudo adduser bugsink --disabled-password --gecos ""
```

### Step 2: Set Up Virtual Environment and Install Bugsink

Switch to the bugsink user:

```bash
sudo su - bugsink
```

Create the virtual environment:

```bash
python3 -m venv venv
```

Activate the virtual environment:

```bash
source venv/bin/activate
```

You should see `(venv)` at the beginning of your prompt. Now install Bugsink:

```bash
pip install bugsink --upgrade
bugsink-show-version
```

You should see output like `bugsink 2.x.x`.

### Step 3: Create Configuration File

Generate the configuration file. Replace `bugsink.yourdomain.com` with your actual hostname:

```bash
bugsink-create-conf --template=singleserver --host=bugsink.yourdomain.com
```

This creates `bugsink_conf.py` in `/home/bugsink/`.
Edit it to customize settings: ```bash nano bugsink_conf.py ``` **Key settings to review:** | Setting | Description | | ------------------- | ------------------------------------------------------------------------------- | | `BASE_URL` | The URL where Bugsink will be accessed (e.g., `https://bugsink.yourdomain.com`) | | `SITE_TITLE` | Display name for your Bugsink instance | | `SECRET_KEY` | Auto-generated, but verify it exists | | `TIME_ZONE` | Your timezone (e.g., `America/New_York`) | | `USER_REGISTRATION` | Set to `"closed"` to disable public signup | | `SINGLE_USER` | Set to `True` if only one user will use this instance | ### Step 4: Initialize Database Bugsink uses SQLite by default, which is recommended for single-server setups. Run the database migrations: ```bash bugsink-manage migrate bugsink-manage migrate snappea --database=snappea ``` Verify the database files were created: ```bash ls *.sqlite3 ``` You should see `db.sqlite3` and `snappea.sqlite3`. ### Step 5: Create Admin User Create the superuser account. Using your email as the username is recommended: ```bash bugsink-manage createsuperuser ``` **Important:** Save these credentials - you'll need them to log into the Bugsink web UI. 
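Before moving on to the systemd services, it can be worth sanity-checking the databases created in Step 4. A sketch, assuming the layout from this guide (the `check_bugsink_dbs` helper is illustrative, not part of Bugsink; it relies only on the `python3` installed under System Prerequisites):

```shell
# check_bugsink_dbs DIR
# Verifies that the two SQLite databases created in Step 4 exist under DIR
# and pass SQLite's own integrity check. Uses python3's bundled sqlite3
# module, so no extra packages are required.
check_bugsink_dbs() {
  local dir="$1" db
  for db in db.sqlite3 snappea.sqlite3; do
    if [ ! -f "$dir/$db" ]; then
      echo "missing: $db" >&2
    else
      python3 -c '
import sqlite3, sys
row = sqlite3.connect(sys.argv[1]).execute("PRAGMA integrity_check;").fetchone()
print(sys.argv[1].rsplit("/", 1)[-1] + ": " + row[0])' "$dir/$db"
    fi
  done
}

# On the server (as any user with read access): check_bugsink_dbs /home/bugsink
```

Each healthy database prints `<name>: ok`.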
### Step 6: Verify Configuration Run Django's deployment checks: ```bash bugsink-manage check_migrations bugsink-manage check --deploy --fail-level WARNING ``` Exit back to root for the next steps: ```bash exit ``` ### Step 7: Create Gunicorn Service Create `/etc/systemd/system/gunicorn-bugsink.service`: ```bash sudo nano /etc/systemd/system/gunicorn-bugsink.service ``` Add the following content: ```ini [Unit] Description=Gunicorn daemon for Bugsink After=network.target [Service] Restart=always Type=notify User=bugsink Group=bugsink Environment="PYTHONUNBUFFERED=1" WorkingDirectory=/home/bugsink ExecStart=/home/bugsink/venv/bin/gunicorn \ --bind="127.0.0.1:8000" \ --workers=4 \ --timeout=6 \ --access-logfile - \ --max-requests=1000 \ --max-requests-jitter=100 \ bugsink.wsgi ExecReload=/bin/kill -s HUP $MAINPID KillMode=mixed TimeoutStopSec=5 [Install] WantedBy=multi-user.target ``` Enable and start the service: ```bash sudo systemctl daemon-reload sudo systemctl enable --now gunicorn-bugsink.service sudo systemctl status gunicorn-bugsink.service ``` Test that Gunicorn is responding (replace hostname): ```bash curl http://localhost:8000/accounts/login/ --header "Host: bugsink.yourdomain.com" ``` You should see HTML output containing a login form. ### Step 8: Create Snappea Background Worker Service Snappea is Bugsink's background task processor. 
Create `/etc/systemd/system/snappea.service`: ```bash sudo nano /etc/systemd/system/snappea.service ``` Add the following content: ```ini [Unit] Description=Snappea daemon for Bugsink background tasks After=network.target [Service] Restart=always User=bugsink Group=bugsink Environment="PYTHONUNBUFFERED=1" WorkingDirectory=/home/bugsink ExecStart=/home/bugsink/venv/bin/bugsink-runsnappea KillMode=mixed TimeoutStopSec=5 RuntimeMaxSec=1d [Install] WantedBy=multi-user.target ``` Enable and start the service: ```bash sudo systemctl daemon-reload sudo systemctl enable --now snappea.service sudo systemctl status snappea.service ``` Verify snappea is working: ```bash sudo su - bugsink source venv/bin/activate bugsink-manage checksnappea exit ``` ### Step 9: Configure NGINX for Bugsink Create `/etc/nginx/sites-available/bugsink`: ```bash sudo nano /etc/nginx/sites-available/bugsink ``` Add the following (replace `bugsink.yourdomain.com` with your hostname): ```nginx server { server_name bugsink.yourdomain.com; listen 80; client_max_body_size 20M; access_log /var/log/nginx/bugsink.access.log; error_log /var/log/nginx/bugsink.error.log; location / { proxy_pass http://127.0.0.1:8000; proxy_set_header Host $host; proxy_set_header X-Real-IP $remote_addr; proxy_set_header X-Forwarded-Proto $scheme; } } ``` Enable the site: ```bash sudo ln -s /etc/nginx/sites-available/bugsink /etc/nginx/sites-enabled/ sudo nginx -t sudo systemctl reload nginx ``` ### Step 10: Configure SSL with Certbot (Recommended) ```bash sudo certbot --nginx -d bugsink.yourdomain.com ``` After SSL is configured, update the NGINX config to add security headers. Edit `/etc/nginx/sites-available/bugsink` and add to the `location /` block: ```nginx add_header Strict-Transport-Security "max-age=31536000; preload" always; ``` Reload NGINX: ```bash sudo nginx -t sudo systemctl reload nginx ``` ### Step 11: Create Projects and Get DSNs 1. Access Bugsink UI at `https://bugsink.yourdomain.com` 2. 
Log in with the admin credentials you created 3. Create a new team (or use the default) 4. Create projects for each environment: **Production:** - **flyer-crawler-backend** (Platform: Node.js) - **flyer-crawler-frontend** (Platform: JavaScript/React) **Test:** - **flyer-crawler-backend-test** (Platform: Node.js) - **flyer-crawler-frontend-test** (Platform: JavaScript/React) 5. For each project, go to Settings → Client Keys (DSN) 6. Copy the DSN URLs - you'll have 4 DSNs total (2 for production, 2 for test) > **Note:** The dev container runs its own local Bugsink instance at `localhost:8000` - no remote DSNs needed for development. ### Step 12: Configure Application to Use Bugsink The flyer-crawler application receives its configuration via **Gitea CI/CD secrets**, not local environment files. Follow these steps to add the Bugsink DSNs: #### 1. Add Secrets in Gitea Navigate to your repository in Gitea: 1. Go to **Settings** → **Actions** → **Secrets** 2. Add the following secrets: **Production DSNs:** | Secret Name | Value | Description | | ----------------- | -------------------------------------- | ----------------------- | | `SENTRY_DSN` | `https://KEY@bugsink.yourdomain.com/1` | Production backend DSN | | `VITE_SENTRY_DSN` | `https://KEY@bugsink.yourdomain.com/2` | Production frontend DSN | **Test DSNs:** | Secret Name | Value | Description | | ---------------------- | -------------------------------------- | ----------------- | | `SENTRY_DSN_TEST` | `https://KEY@bugsink.yourdomain.com/3` | Test backend DSN | | `VITE_SENTRY_DSN_TEST` | `https://KEY@bugsink.yourdomain.com/4` | Test frontend DSN | > **Note:** The project numbers in the DSN URLs are assigned by Bugsink when you create each project. Use the actual DSN values from Step 11. #### 2. Update the Deployment Workflows **Production** (`deploy-to-prod.yml`): In the `Install Backend Dependencies and Restart Production Server` step, add to the `env:` block: ```yaml env: # ... existing secrets ... 
# Sentry/Bugsink Error Tracking SENTRY_DSN: ${{ secrets.SENTRY_DSN }} SENTRY_ENVIRONMENT: 'production' SENTRY_ENABLED: 'true' ``` In the build step, add frontend variables: ```yaml VITE_SENTRY_DSN=${{ secrets.VITE_SENTRY_DSN }} \ VITE_SENTRY_ENVIRONMENT=production \ VITE_SENTRY_ENABLED=true \ npm run build ``` **Test** (`deploy-to-test.yml`): In the `Install Backend Dependencies and Restart Test Server` step, add to the `env:` block: ```yaml env: # ... existing secrets ... # Sentry/Bugsink Error Tracking (Test) SENTRY_DSN: ${{ secrets.SENTRY_DSN_TEST }} SENTRY_ENVIRONMENT: 'test' SENTRY_ENABLED: 'true' ``` In the build step, add frontend variables: ```yaml VITE_SENTRY_DSN=${{ secrets.VITE_SENTRY_DSN_TEST }} \ VITE_SENTRY_ENVIRONMENT=test \ VITE_SENTRY_ENABLED=true \ npm run build ``` #### 3. Update ecosystem.config.cjs Add Sentry variables to the `sharedEnv` object in `ecosystem.config.cjs`: ```javascript const sharedEnv = { // ... existing variables ... SENTRY_DSN: process.env.SENTRY_DSN, SENTRY_ENVIRONMENT: process.env.SENTRY_ENVIRONMENT, SENTRY_ENABLED: process.env.SENTRY_ENABLED, }; ``` #### 4. Dev Container (No Configuration Needed) The dev container runs its own **local Bugsink instance** at `http://localhost:8000`. No remote DSNs or Gitea secrets are needed for development: - DSNs are pre-configured in `compose.dev.yml` - Admin UI: `http://localhost:8000` (login: `admin@localhost` / `admin`) - Errors stay local and isolated from production/test #### 5. Deploy to Apply Changes Trigger deployments via Gitea Actions: - **Test**: Automatically deploys on push to `main` - **Production**: Manual trigger via workflow dispatch **Note:** There is no `/etc/flyer-crawler/environment` file on the server. Production and test secrets are managed through Gitea CI/CD and injected at deployment time. Dev container uses local `.env` file. See [CLAUDE.md](../CLAUDE.md) for details. 
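When starting PM2 by hand rather than through CI/CD, it is easy to miss one of the variables wired up in this step. A minimal guard, as a sketch (the `check_sentry_env` helper is illustrative, not part of the repository; the variable names are the ones from the workflow `env:` block above):

```shell
# check_sentry_env
# Confirms the Sentry/Bugsink variables from this step are present in the
# current shell. Prints each missing name to stderr and returns nonzero if
# any are unset, so it can gate a PM2 reload.
check_sentry_env() {
  local var missing=0
  for var in SENTRY_DSN SENTRY_ENVIRONMENT SENTRY_ENABLED; do
    if [ -z "${!var}" ]; then
      echo "missing: $var" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Usage: check_sentry_env && pm2 startOrReload ecosystem.config.cjs --update-env
```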
### Step 13: Test Error Tracking

You can test that Bugsink is working before configuring the flyer-crawler application. Switch to the bugsink user and open a Python shell:

```bash
sudo su - bugsink
source venv/bin/activate
bugsink-manage shell
```

In the Python shell, send a test message using the **backend DSN** from Step 11:

```python
import sentry_sdk
sentry_sdk.init("https://YOUR_BACKEND_KEY@bugsink.yourdomain.com/1")
sentry_sdk.capture_message("Test message from Bugsink setup")
exit()
```

Exit back to root:

```bash
exit
```

Check the Bugsink UI - you should see the test message appear in the `flyer-crawler-backend` project.

### Step 14: Test from Flyer-Crawler Application (After App Setup)

Once the flyer-crawler application has been deployed with the Sentry secrets configured in Step 12:

```bash
cd /var/www/flyer-crawler.projectium.com
npx tsx scripts/test-bugsink.ts
```

Check the Bugsink UI - you should see a test event appear.

### Bugsink Maintenance Commands

| Task                    | Command                                                                                                                        |
| ----------------------- | ------------------------------------------------------------------------------------------------------------------------------ |
| View Gunicorn status    | `sudo systemctl status gunicorn-bugsink`                                                                                       |
| View Snappea status     | `sudo systemctl status snappea`                                                                                                |
| View Gunicorn logs      | `sudo journalctl -u gunicorn-bugsink -f`                                                                                       |
| View Snappea logs       | `sudo journalctl -u snappea -f`                                                                                                |
| Restart Bugsink         | `sudo systemctl restart gunicorn-bugsink snappea`                                                                              |
| Run management commands | `sudo su - bugsink`, then `source venv/bin/activate && bugsink-manage `                                                        |
| Upgrade Bugsink         | `sudo -u bugsink /home/bugsink/venv/bin/pip install bugsink --upgrade`, then `sudo systemctl restart gunicorn-bugsink snappea` |

---

## Logstash Log Aggregation

Logstash aggregates logs from the application and infrastructure, forwarding errors to Bugsink.

> **Note:** Logstash integration is **optional**.
The flyer-crawler application already sends errors directly to Bugsink via the Sentry SDK. Logstash is only needed if you want to aggregate logs from other sources (Redis, NGINX, etc.) into Bugsink. ### Step 1: Create Application Log Directory The flyer-crawler application automatically creates its log directory on startup, but you need to ensure proper permissions for Logstash to read the logs. Create the log directories and set appropriate permissions: ```bash # Create log directory for the production application sudo mkdir -p /var/www/flyer-crawler.projectium.com/logs # Set ownership to root (since PM2 runs as root) sudo chown -R root:root /var/www/flyer-crawler.projectium.com/logs # Make logs readable by logstash user sudo chmod 755 /var/www/flyer-crawler.projectium.com/logs ``` For the test environment: ```bash sudo mkdir -p /var/www/flyer-crawler-test.projectium.com/logs sudo chown -R root:root /var/www/flyer-crawler-test.projectium.com/logs sudo chmod 755 /var/www/flyer-crawler-test.projectium.com/logs ``` ### Step 2: Application File Logging (Already Configured) The flyer-crawler application uses Pino for logging and is configured to write logs to files in production/test environments: **Log File Locations:** | Environment | Log File Path | | ------------- | --------------------------------------------------------- | | Production | `/var/www/flyer-crawler.projectium.com/logs/app.log` | | Test | `/var/www/flyer-crawler-test.projectium.com/logs/app.log` | | Dev Container | `/app/logs/app.log` | **How It Works:** - In production/test: Pino writes JSON logs to both stdout (for PM2) AND `logs/app.log` (for Logstash) - In development: Pino uses pino-pretty for human-readable console output only - The log directory is created automatically if it doesn't exist - You can override the log directory with the `LOG_DIR` environment variable **Verify Logging After Deployment:** After deploying the application, verify that logs are being written: ```bash # Check 
production logs ls -la /var/www/flyer-crawler.projectium.com/logs/ tail -f /var/www/flyer-crawler.projectium.com/logs/app.log # Check test logs ls -la /var/www/flyer-crawler-test.projectium.com/logs/ tail -f /var/www/flyer-crawler-test.projectium.com/logs/app.log ``` You should see JSON-formatted log entries like: ```json { "level": 30, "time": 1704067200000, "msg": "Server started on port 3001", "module": "server" } ``` ### Step 3: Install Logstash ```bash # Add Elastic APT repository wget -qO - https://artifacts.elastic.co/GPG-KEY-elasticsearch | sudo gpg --dearmor -o /usr/share/keyrings/elastic-keyring.gpg echo "deb [signed-by=/usr/share/keyrings/elastic-keyring.gpg] https://artifacts.elastic.co/packages/8.x/apt stable main" | sudo tee /etc/apt/sources.list.d/elastic-8.x.list # Update and install sudo apt update sudo apt install -y logstash ``` Verify installation: ```bash /usr/share/logstash/bin/logstash --version ``` ### Step 4: Configure Logstash Pipeline Create the pipeline configuration file: ```bash sudo nano /etc/logstash/conf.d/bugsink.conf ``` Add the following content: ```conf input { # Production application logs (Pino JSON format) file { path => "/var/www/flyer-crawler.projectium.com/logs/app.log" codec => json_lines type => "pino" tags => ["app", "production"] start_position => "end" sincedb_path => "/var/lib/logstash/sincedb_pino_prod" } # Test environment logs file { path => "/var/www/flyer-crawler-test.projectium.com/logs/app.log" codec => json_lines type => "pino" tags => ["app", "test"] start_position => "end" sincedb_path => "/var/lib/logstash/sincedb_pino_test" } # Redis logs (shared by both environments) file { path => "/var/log/redis/redis-server.log" type => "redis" tags => ["infra", "redis", "production"] start_position => "end" sincedb_path => "/var/lib/logstash/sincedb_redis" } # NGINX error logs (production) file { path => "/var/log/nginx/error.log" type => "nginx" tags => ["infra", "nginx", "production"] start_position => "end" 
sincedb_path => "/var/lib/logstash/sincedb_nginx_error" } # NGINX access logs - for detecting 5xx errors (production) file { path => "/var/log/nginx/access.log" type => "nginx_access" tags => ["infra", "nginx", "production"] start_position => "end" sincedb_path => "/var/lib/logstash/sincedb_nginx_access" } # PM2 error logs - Production (plain text stack traces) file { path => "/home/gitea-runner/.pm2/logs/flyer-crawler-*-error.log" exclude => "*-test-error.log" type => "pm2" tags => ["infra", "pm2", "production"] start_position => "end" sincedb_path => "/var/lib/logstash/sincedb_pm2_prod" } # PM2 error logs - Test file { path => "/home/gitea-runner/.pm2/logs/flyer-crawler-*-test-error.log" type => "pm2" tags => ["infra", "pm2", "test"] start_position => "end" sincedb_path => "/var/lib/logstash/sincedb_pm2_test" } } filter { # Pino log level detection # Pino levels: 10=trace, 20=debug, 30=info, 40=warn, 50=error, 60=fatal if [type] == "pino" and [level] { if [level] >= 50 { mutate { add_tag => ["error"] } } else if [level] >= 40 { mutate { add_tag => ["warning"] } } } # Redis error detection if [type] == "redis" { grok { match => { "message" => "%{POSINT:pid}:%{WORD:role} %{MONTHDAY} %{MONTH} %{YEAR}? 
?%{TIME} %{WORD:loglevel} %{GREEDYDATA:redis_message}" } } if [loglevel] in ["WARNING", "ERROR"] { mutate { add_tag => ["error"] } } } # NGINX error log detection (all entries are errors) if [type] == "nginx" { mutate { add_tag => ["error"] } grok { match => { "message" => "%{TIMESTAMP_ISO8601:timestamp} \[%{WORD:severity}\] %{GREEDYDATA:nginx_message}" } } } # NGINX access log - detect 5xx errors if [type] == "nginx_access" { grok { match => { "message" => "%{COMBINEDAPACHELOG}" } } if [response] =~ /^5\d{2}$/ { mutate { add_tag => ["error"] } } } # PM2 error log detection - tag lines with actual error indicators if [type] == "pm2" { if [message] =~ /Error:|error:|ECONNREFUSED|ENOENT|TypeError|ReferenceError|SyntaxError/ { mutate { add_tag => ["error"] } } } } output { # Production app errors -> flyer-crawler-backend (project 1) if "error" in [tags] and "app" in [tags] and "production" in [tags] { http { url => "http://localhost:8000/api/1/store/" http_method => "post" format => "json" headers => { "X-Sentry-Auth" => "Sentry sentry_version=7, sentry_client=logstash/1.0, sentry_key=YOUR_PROD_BACKEND_DSN_KEY" } } } # Test app errors -> flyer-crawler-backend-test (project 3) if "error" in [tags] and "app" in [tags] and "test" in [tags] { http { url => "http://localhost:8000/api/3/store/" http_method => "post" format => "json" headers => { "X-Sentry-Auth" => "Sentry sentry_version=7, sentry_client=logstash/1.0, sentry_key=YOUR_TEST_BACKEND_DSN_KEY" } } } # Production infrastructure errors (Redis, NGINX, PM2) -> flyer-crawler-infrastructure (project 5) if "error" in [tags] and "infra" in [tags] and "production" in [tags] { http { url => "http://localhost:8000/api/5/store/" http_method => "post" format => "json" headers => { "X-Sentry-Auth" => "Sentry sentry_version=7, sentry_client=logstash/1.0, sentry_key=b083076f94fb461b889d5dffcbef43bf" } } } # Test infrastructure errors (PM2 test logs) -> flyer-crawler-test-infrastructure (project 6) if "error" in [tags] and 
"infra" in [tags] and "test" in [tags] { http { url => "http://localhost:8000/api/6/store/" http_method => "post" format => "json" headers => { "X-Sentry-Auth" => "Sentry sentry_version=7, sentry_client=logstash/1.0, sentry_key=25020dd6c2b74ad78463ec90e90fadab" } } } # Debug output (uncomment to troubleshoot) # stdout { codec => rubydebug } } ``` **Bugsink Project DSNs:** | Project | DSN Key | Project ID | | ----------------------------------- | ---------------------------------- | ---------- | | `flyer-crawler-backend` | `911aef02b9a548fa8fabb8a3c81abfe5` | 1 | | `flyer-crawler-frontend` | (used by app, not Logstash) | 2 | | `flyer-crawler-backend-test` | `cdb99c314589431e83d4cc38a809449b` | 3 | | `flyer-crawler-frontend-test` | (used by app, not Logstash) | 4 | | `flyer-crawler-infrastructure` | `b083076f94fb461b889d5dffcbef43bf` | 5 | | `flyer-crawler-test-infrastructure` | `25020dd6c2b74ad78463ec90e90fadab` | 6 | **Note:** The DSN key is the part before `@` in the full DSN URL (e.g., `https://KEY@bugsink.projectium.com/PROJECT_ID`). **Note on PM2 Logs:** PM2 error logs capture stack traces from stderr, which are valuable for debugging startup errors and uncaught exceptions. Production PM2 logs go to project 5 (infrastructure), test PM2 logs go to project 6 (test-infrastructure). 
### Step 5: Create Logstash State Directory and Fix Config Path Logstash needs a directory to track which log lines it has already processed, and a symlink so it can find its config files: ```bash # Create state directory for sincedb files sudo mkdir -p /var/lib/logstash sudo chown logstash:logstash /var/lib/logstash # Create symlink so Logstash finds its config (avoids "Could not find logstash.yml" warning) sudo ln -sf /etc/logstash /usr/share/logstash/config ``` ### Step 6: Grant Logstash Access to Application Logs Logstash runs as the `logstash` user and needs permission to read log files: ```bash # Add logstash user to adm group (for nginx and redis logs) sudo usermod -aG adm logstash # Make application log files readable (created automatically when app starts) sudo chmod 644 /var/www/flyer-crawler.projectium.com/logs/app.log 2>/dev/null || echo "Production log file not yet created" sudo chmod 644 /var/www/flyer-crawler-test.projectium.com/logs/app.log 2>/dev/null || echo "Test log file not yet created" # Make Redis logs and directory readable sudo chmod 755 /var/log/redis/ sudo chmod 644 /var/log/redis/redis-server.log # Make NGINX logs readable sudo chmod 644 /var/log/nginx/access.log /var/log/nginx/error.log # Make PM2 logs and directories accessible sudo chmod 755 /home/gitea-runner/ sudo chmod 755 /home/gitea-runner/.pm2/ sudo chmod 755 /home/gitea-runner/.pm2/logs/ sudo chmod 644 /home/gitea-runner/.pm2/logs/*.log # Verify logstash group membership groups logstash ``` **Note:** The application log files are created automatically when the application starts. Run the chmod commands after the first deployment. ### Step 7: Test Logstash Configuration Test the configuration before starting: ```bash sudo /usr/share/logstash/bin/logstash --config.test_and_exit -f /etc/logstash/conf.d/bugsink.conf ``` You should see `Configuration OK` if there are no errors. 
### Step 8: Start Logstash ```bash sudo systemctl enable logstash sudo systemctl start logstash sudo systemctl status logstash ``` View Logstash logs to verify it's working: ```bash sudo journalctl -u logstash -f ``` ### Troubleshooting Logstash | Issue | Solution | | -------------------------- | -------------------------------------------------------------------------------------------------------- | | "Permission denied" errors | Check file permissions on log files and sincedb directory | | No events being processed | Verify log file paths exist and contain data | | HTTP output errors | Check Bugsink is running and DSN key is correct | | Logstash not starting | Run config test: `sudo /usr/share/logstash/bin/logstash --config.test_and_exit -f /etc/logstash/conf.d/` | ### Alternative: Skip Logstash Since the flyer-crawler application already sends errors directly to Bugsink via the Sentry SDK (configured in Steps 11-12), you may not need Logstash at all. Logstash is primarily useful for: - Aggregating logs from services that don't have native Sentry support (Redis, NGINX) - Centralizing all logs in one place - Complex log transformations If you only need application error tracking, the Sentry SDK integration is sufficient. --- ## PostgreSQL Function Observability (ADR-050) PostgreSQL function observability provides structured logging and error tracking for database functions, preventing silent failures. This setup forwards database errors to Bugsink for centralized monitoring. See [ADR-050](adr/0050-postgresql-function-observability.md) for the full architecture decision. 
### Prerequisites - PostgreSQL 14+ installed and running - Logstash installed and configured (see [Logstash section](#logstash-log-aggregation) above) - Bugsink running at `https://bugsink.projectium.com` ### Step 1: Configure PostgreSQL Logging Create the observability configuration file: ```bash sudo nano /etc/postgresql/14/main/conf.d/observability.conf ``` Add the following content: ```ini # PostgreSQL Logging Configuration for Database Function Observability (ADR-050) # Enable logging to files for Logstash pickup logging_collector = on log_destination = 'stderr' log_directory = '/var/log/postgresql' log_filename = 'postgresql-%Y-%m-%d.log' log_rotation_age = 1d log_rotation_size = 100MB log_truncate_on_rotation = on # Log level - capture NOTICE and above (includes fn_log WARNING/ERROR) log_min_messages = notice client_min_messages = notice # Include useful context in log prefix log_line_prefix = '%t [%p] %u@%d ' # Capture slow queries from functions (1 second threshold) log_min_duration_statement = 1000 # Log statement types (off for production) log_statement = 'none' # Connection logging (off for production to reduce noise) log_connections = off log_disconnections = off ``` Set up the log directory: ```bash # Create log directory sudo mkdir -p /var/log/postgresql # Set ownership to postgres user sudo chown postgres:postgres /var/log/postgresql sudo chmod 750 /var/log/postgresql ``` Restart PostgreSQL: ```bash sudo systemctl restart postgresql ``` Verify logging is working: ```bash # Check that log files are being created ls -la /var/log/postgresql/ # Should see files like: postgresql-2026-01-20.log ``` ### Step 2: Configure Logstash for PostgreSQL Logs The Logstash configuration is located at `/etc/logstash/conf.d/bugsink.conf`. 
**Key features:** - Parses PostgreSQL log format with grok patterns - Extracts JSON from `fn_log()` function calls - Tags WARNING/ERROR level logs - Routes production database errors to Bugsink project 1 - Routes test database errors to Bugsink project 3 - Transforms events to Sentry-compatible format **Configuration file:** `/etc/logstash/conf.d/bugsink.conf` See the [Logstash Configuration Reference](#logstash-configuration-reference) below for the complete configuration. **Grant Logstash access to PostgreSQL logs:** ```bash # Add logstash user to postgres group sudo usermod -aG postgres logstash # Verify group membership groups logstash # Restart Logstash to apply changes sudo systemctl restart logstash ``` ### Step 3: Test the Pipeline Test structured logging from PostgreSQL: ```bash # Production database (routes to Bugsink project 1) sudo -u postgres psql -d flyer-crawler-prod -c "SELECT fn_log('WARNING', 'test_observability', 'Testing PostgreSQL observability pipeline', '{\"environment\": \"production\"}'::jsonb);" # Test database (routes to Bugsink project 3) sudo -u postgres psql -d flyer-crawler-test -c "SELECT fn_log('WARNING', 'test_observability', 'Testing PostgreSQL observability pipeline', '{\"environment\": \"test\"}'::jsonb);" ``` Check Bugsink UI: - Production errors: → Project 1 (flyer-crawler-backend) - Test errors: → Project 3 (flyer-crawler-backend-test) ### Step 4: Verify Database Functions The following critical functions use `fn_log()` for observability: | Function | What it logs | | -------------------------- | ---------------------------------------- | | `award_achievement()` | Missing achievements, duplicate awards | | `fork_recipe()` | Missing original recipes | | `handle_new_user()` | User creation events | | `approve_correction()` | Permission denied, corrections not found | | `complete_shopping_list()` | Permission checks, list not found | Test error logging with a database function: ```bash # Try to award a non-existent achievement 
(should fail and log to Bugsink) sudo -u postgres psql -d flyer-crawler-test -c "SELECT award_achievement('00000000-0000-0000-0000-000000000000'::uuid, 'NonexistentBadge');" # Check Bugsink project 3 - should see an ERROR with full context ``` ### Logstash Configuration Reference Complete configuration for PostgreSQL observability (`/etc/logstash/conf.d/bugsink.conf`): ```conf input { # PostgreSQL function logs (ADR-050) # Both production and test databases write to the same log files file { path => "/var/log/postgresql/*.log" type => "postgres" tags => ["postgres", "database"] start_position => "beginning" sincedb_path => "/var/lib/logstash/sincedb_postgres" } } filter { # PostgreSQL function log parsing (ADR-050) if [type] == "postgres" { # Extract timestamp, timezone, process ID, user, database, level, and message grok { match => { "message" => "%{TIMESTAMP_ISO8601:pg_timestamp} [+-]%{INT:pg_timezone} \[%{POSINT:pg_pid}\] %{DATA:pg_user}@%{DATA:pg_database} %{WORD:pg_level}: %{GREEDYDATA:pg_message}" } } # Try to parse pg_message as JSON (from fn_log()) if [pg_message] =~ /^\{/ { json { source => "pg_message" target => "fn_log" skip_on_invalid_json => true } # Mark as error if level is WARNING or ERROR if [fn_log][level] in ["WARNING", "ERROR"] { mutate { add_tag => ["error", "db_function"] } } } # Also catch native PostgreSQL errors if [pg_level] in ["ERROR", "FATAL"] { mutate { add_tag => ["error", "postgres_native"] } } # Detect environment from database name if [pg_database] == "flyer-crawler-prod" { mutate { add_tag => ["production"] } } else if [pg_database] == "flyer-crawler-test" { mutate { add_tag => ["test"] } } # Generate event_id for Sentry if "error" in [tags] { uuid { target => "[@metadata][event_id]" overwrite => true } } } } output { # Production database errors -> project 1 (flyer-crawler-backend) if "error" in [tags] and "production" in [tags] { http { url => "https://bugsink.projectium.com/api/1/store/" http_method => "post" format => "json" 
headers => { "X-Sentry-Auth" => "Sentry sentry_version=7, sentry_client=logstash/1.0, sentry_key=911aef02b9a548fa8fabb8a3c81abfe5" "Content-Type" => "application/json" } mapping => { "event_id" => "%{[@metadata][event_id]}" "timestamp" => "%{@timestamp}" "platform" => "other" "level" => "error" "logger" => "postgresql" "message" => "%{[fn_log][message]}" "environment" => "production" "extra" => { "pg_user" => "%{[pg_user]}" "pg_database" => "%{[pg_database]}" "pg_function" => "%{[fn_log][function]}" "pg_level" => "%{[pg_level]}" "context" => "%{[fn_log][context]}" } } } } # Test database errors -> project 3 (flyer-crawler-backend-test) if "error" in [tags] and "test" in [tags] { http { url => "https://bugsink.projectium.com/api/3/store/" http_method => "post" format => "json" headers => { "X-Sentry-Auth" => "Sentry sentry_version=7, sentry_client=logstash/1.0, sentry_key=cdb99c314589431e83d4cc38a809449b" "Content-Type" => "application/json" } mapping => { "event_id" => "%{[@metadata][event_id]}" "timestamp" => "%{@timestamp}" "platform" => "other" "level" => "error" "logger" => "postgresql" "message" => "%{[fn_log][message]}" "environment" => "test" "extra" => { "pg_user" => "%{[pg_user]}" "pg_database" => "%{[pg_database]}" "pg_function" => "%{[fn_log][function]}" "pg_level" => "%{[pg_level]}" "context" => "%{[fn_log][context]}" } } } } } ``` ### Extended Logstash Configuration (PM2, Redis, NGINX) The complete production Logstash configuration includes additional log sources beyond PostgreSQL: **Input Sources:** ```conf input { # PostgreSQL function logs (shown above) # PM2 Worker stdout logs (production) file { path => "/home/gitea-runner/.pm2/logs/flyer-crawler-worker-*.log" type => "pm2_stdout" tags => ["infra", "pm2", "worker", "production"] start_position => "end" sincedb_path => "/var/lib/logstash/sincedb_pm2_worker_prod" exclude => "*-test-*.log" } # PM2 Analytics Worker stdout (production) file { path => 
"/home/gitea-runner/.pm2/logs/flyer-crawler-analytics-worker-*.log" type => "pm2_stdout" tags => ["infra", "pm2", "analytics", "production"] start_position => "end" sincedb_path => "/var/lib/logstash/sincedb_pm2_analytics_prod" exclude => "*-test-*.log" } # PM2 Worker stdout (test environment) file { path => "/home/gitea-runner/.pm2/logs/flyer-crawler-worker-test-*.log" type => "pm2_stdout" tags => ["infra", "pm2", "worker", "test"] start_position => "end" sincedb_path => "/var/lib/logstash/sincedb_pm2_worker_test" } # PM2 Analytics Worker stdout (test environment) file { path => "/home/gitea-runner/.pm2/logs/flyer-crawler-analytics-worker-test-*.log" type => "pm2_stdout" tags => ["infra", "pm2", "analytics", "test"] start_position => "end" sincedb_path => "/var/lib/logstash/sincedb_pm2_analytics_test" } # Redis logs (already configured) file { path => "/var/log/redis/redis-server.log" type => "redis" tags => ["infra", "redis"] start_position => "end" sincedb_path => "/var/lib/logstash/sincedb_redis" } # NGINX access logs file { path => "/var/log/nginx/access.log" type => "nginx_access" tags => ["infra", "nginx", "access"] start_position => "end" sincedb_path => "/var/lib/logstash/sincedb_nginx_access" } # NGINX error logs file { path => "/var/log/nginx/error.log" type => "nginx_error" tags => ["infra", "nginx", "error"] start_position => "end" sincedb_path => "/var/lib/logstash/sincedb_nginx_error" } } ``` **Filter Rules:** ```conf filter { # PostgreSQL filters (shown above) # PM2 Worker log parsing if [type] == "pm2_stdout" { # Try to parse as JSON first (if worker uses Pino) json { source => "message" target => "pm2_json" skip_on_invalid_json => true } # If JSON parsing succeeded, extract level and tag errors if [pm2_json][level] { if [pm2_json][level] >= 50 { mutate { add_tag => ["error"] } } } # If not JSON, check for error keywords in plain text else if [message] =~ /(Error|ERROR|Exception|EXCEPTION|Fatal|FATAL|failed|FAILED)/ { mutate { add_tag => ["error"] 
}
  }

    # Generate event_id for errors
    if "error" in [tags] {
      uuid { target => "[@metadata][event_id]" overwrite => true }
    }
  }

  # Redis log parsing
  if [type] == "redis" {
    grok {
      match => { "message" => "%{POSINT:pid}:%{WORD:role} %{MONTHDAY} %{MONTH} %{TIME} %{WORD:loglevel} %{GREEDYDATA:redis_message}" }
    }

    # Tag errors (WARNING/ERROR) for Bugsink forwarding
    if [loglevel] in ["WARNING", "ERROR"] {
      mutate { add_tag => ["error"] }
      uuid { target => "[@metadata][event_id]" overwrite => true }
    }
    # Tag INFO-level operational events (startup, config, persistence)
    else if [loglevel] == "INFO" {
      mutate { add_tag => ["redis_operational"] }
    }
  }

  # NGINX access log parsing
  if [type] == "nginx_access" {
    grok { match => { "message" => "%{COMBINEDAPACHELOG}" } }

    # Parse response time if available (requires NGINX log format with request_time)
    if [message] =~ /request_time:(\d+\.\d+)/ {
      grok {
        match => { "message" => "request_time:(?<request_time_seconds>\d+\.\d+)" }
      }
      # Convert to a float so the > 1.0 comparison below works
      mutate { convert => { "request_time_seconds" => "float" } }
    }

    # Categorize by status code
    if [response] =~ /^5\d{2}$/ {
      mutate { add_tag => ["error", "http_5xx"] }
      uuid { target => "[@metadata][event_id]" overwrite => true }
    } else if [response] =~ /^4\d{2}$/ {
      mutate { add_tag => ["client_error", "http_4xx"] }
    } else if [response] =~ /^2\d{2}$/ {
      mutate { add_tag => ["success", "http_2xx"] }
    } else if [response] =~ /^3\d{2}$/ {
      mutate { add_tag => ["redirect", "http_3xx"] }
    }

    # Tag slow requests (>1 second response time)
    if [request_time_seconds] and [request_time_seconds] > 1.0 {
      mutate { add_tag => ["slow_request"] }
    }

    # Always tag for monitoring
    mutate { add_tag => ["access_log"] }
  }

  # NGINX error log parsing
  if [type] == "nginx_error" {
    mutate { add_tag => ["error"] }
    uuid { target => "[@metadata][event_id]" overwrite => true }
  }
}
```

**Output Rules:**

```conf
output {
  # Production errors -> Bugsink infrastructure project (5)
  # Includes: PM2 worker errors, Redis errors, NGINX 5xx, PostgreSQL errors
  if "error" in [tags] and "infra" in [tags] and "production" in [tags] {
    http {
      url =>
"https://bugsink.projectium.com/api/5/store/" http_method => "post" format => "json" headers => { "X-Sentry-Auth" => "Sentry sentry_version=7, sentry_client=logstash/1.0, sentry_key=b083076f94fb461b889d5dffcbef43bf" "Content-Type" => "application/json" } mapping => { "event_id" => "%{[@metadata][event_id]}" "timestamp" => "%{@timestamp}" "platform" => "other" "level" => "error" "logger" => "%{type}" "message" => "%{message}" "environment" => "production" } } } # Test errors -> Bugsink test infrastructure project (6) if "error" in [tags] and "infra" in [tags] and "test" in [tags] { http { url => "https://bugsink.projectium.com/api/6/store/" http_method => "post" format => "json" headers => { "X-Sentry-Auth" => "Sentry sentry_version=7, sentry_client=logstash/1.0, sentry_key=25020dd6c2b74ad78463ec90e90fadab" "Content-Type" => "application/json" } mapping => { "event_id" => "%{[@metadata][event_id]}" "timestamp" => "%{@timestamp}" "platform" => "other" "level" => "error" "logger" => "%{type}" "message" => "%{message}" "environment" => "test" } } } # PM2 worker operational logs (non-errors) -> file if [type] == "pm2_stdout" and "error" not in [tags] { file { path => "/var/log/logstash/pm2-workers-%{+YYYY-MM-dd}.log" codec => json_lines } } # Redis INFO logs (operational events) -> file if "redis_operational" in [tags] { file { path => "/var/log/logstash/redis-operational-%{+YYYY-MM-dd}.log" codec => json_lines } } # NGINX access logs (all requests) -> file if "access_log" in [tags] { file { path => "/var/log/logstash/nginx-access-%{+YYYY-MM-dd}.log" codec => json_lines } } } ``` **Setup Instructions:** 1. Create log output directory: ```bash sudo mkdir -p /var/log/logstash sudo chown logstash:logstash /var/log/logstash ``` 2. 
Configure logrotate for Logstash file outputs (a minimal policy; adjust retention to suit your disk budget):

```bash
sudo tee /etc/logrotate.d/logstash <<'EOF'
/var/log/logstash/*.log {
    daily
    rotate 14
    compress
    missingok
    notifempty
}
EOF
```

---

## Maintenance Commands

### Database Backups

```bash
# Backup application database
pg_dump -U flyer_crawler -h localhost flyer_crawler > backup_$(date +%Y%m%d).sql

# Backup Bugsink database
pg_dump -U bugsink -h localhost bugsink > bugsink_backup_$(date +%Y%m%d).sql
```

### Log Locations

| Log               | Location                         |
| ----------------- | -------------------------------- |
| Application (PM2) | `~/.pm2/logs/`                   |
| NGINX access      | `/var/log/nginx/access.log`      |
| NGINX error       | `/var/log/nginx/error.log`       |
| PostgreSQL        | `/var/log/postgresql/`           |
| Redis             | `/var/log/redis/`                |
| Bugsink           | `journalctl -u gunicorn-bugsink` |
| Logstash          | `/var/log/logstash/`             |

---

## Related Documentation

- [DEPLOYMENT.md](../DEPLOYMENT.md) - Container-based deployment
- [DATABASE.md](../DATABASE.md) - Database schema and extensions
- [AUTHENTICATION.md](../AUTHENTICATION.md) - OAuth provider setup
- [ADR-015](adr/0015-application-performance-monitoring-and-error-tracking.md) - Error tracking architecture