# Bare-Metal Server Setup Guide
This guide covers the manual installation of Flyer Crawler and its dependencies on a bare-metal Ubuntu server (e.g., a colocation server). It is the definitive reference for setting up a production environment without containers.

**Target Environment:** Ubuntu 22.04 LTS (or newer)
## Table of Contents

- [System Prerequisites](#system-prerequisites)
- [PostgreSQL Setup](#postgresql-setup)
- [Redis Setup](#redis-setup)
- [Node.js and Application Setup](#nodejs-and-application-setup)
- [PM2 Process Manager](#pm2-process-manager)
- [NGINX Reverse Proxy](#nginx-reverse-proxy)
- [Bugsink Error Tracking](#bugsink-error-tracking)
- [Logstash Log Aggregation](#logstash-log-aggregation)
- [SSL/TLS with Let's Encrypt](#ssltls-with-lets-encrypt)
- [Firewall Configuration](#firewall-configuration)
- [Maintenance Commands](#maintenance-commands)
## System Prerequisites

Update the system and install essential packages:

```bash
sudo apt update && sudo apt upgrade -y
sudo apt install -y curl git build-essential python3 python3-pip python3-venv
```
## PostgreSQL Setup

### Install PostgreSQL 14+ with PostGIS

```bash
# Add the PostgreSQL APT repository
sudo sh -c 'echo "deb http://apt.postgresql.org/pub/repos/apt $(lsb_release -cs)-pgdg main" > /etc/apt/sources.list.d/pgdg.list'
wget --quiet -O - https://www.postgresql.org/media/keys/ACCC4CF8.asc | sudo apt-key add -
sudo apt update

# Install PostgreSQL and PostGIS
sudo apt install -y postgresql-14 postgresql-14-postgis-3
```
### Create Application Database and User

```bash
sudo -u postgres psql
```

```sql
-- Create application user and database
CREATE USER flyer_crawler WITH PASSWORD 'YOUR_SECURE_PASSWORD';
CREATE DATABASE flyer_crawler OWNER flyer_crawler;

-- Connect to the database and enable extensions
\c flyer_crawler
CREATE EXTENSION IF NOT EXISTS postgis;
CREATE EXTENSION IF NOT EXISTS pg_trgm;
CREATE EXTENSION IF NOT EXISTS pgcrypto;

-- Grant privileges
GRANT ALL PRIVILEGES ON DATABASE flyer_crawler TO flyer_crawler;
\q
```
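Optionally, confirm the new role and extensions from Node, since that is how the application will connect. A minimal sketch, assuming the `pg` package and the placeholder credentials above (adjust to your real values), runnable with `npx tsx check-db.ts`:

```typescript
// check-db.ts - quick sanity check for the flyer_crawler database.
// Assumes `npm install pg`; host and password are the placeholders from above.
import { Client } from "pg";

async function main() {
  const client = new Client({
    host: "localhost",
    database: "flyer_crawler",
    user: "flyer_crawler",
    password: "YOUR_SECURE_PASSWORD",
  });
  await client.connect();

  // PostGIS_Version() only resolves if the postgis extension is installed
  const { rows } = await client.query(
    "SELECT PostGIS_Version() AS postgis, current_database() AS db"
  );
  console.log(rows[0]); // e.g. { postgis: '3.x ...', db: 'flyer_crawler' }

  await client.end();
}

main().catch((err) => {
  console.error("Database check failed:", err.message);
  process.exit(1);
});
```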
### Configure PostgreSQL for Remote Access (if needed)

Edit `/etc/postgresql/14/main/postgresql.conf`:

```ini
listen_addresses = 'localhost'  # Change to '*' for remote access
```

Edit `/etc/postgresql/14/main/pg_hba.conf` to add allowed hosts:

```
# Local connections
local   all   all                  peer
host    all   all   127.0.0.1/32   scram-sha-256
```

Restart PostgreSQL:

```bash
sudo systemctl restart postgresql
```
## Redis Setup

### Install Redis

```bash
sudo apt install -y redis-server
```

### Configure Redis Password

Edit `/etc/redis/redis.conf`:

```
requirepass YOUR_REDIS_PASSWORD
```

Restart Redis:

```bash
sudo systemctl restart redis-server
sudo systemctl enable redis-server
```

### Test Redis Connection

```bash
redis-cli -a YOUR_REDIS_PASSWORD ping
# Should output: PONG
```
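The application itself talks to Redis through a Node client, so a check from Node can be more representative. A minimal sketch, assuming the `ioredis` package (the client library choice is an assumption) and the database-number convention described later in this guide:

```typescript
// check-redis.ts - verify the Redis password and database selection.
// Assumes `npm install ioredis`; values are illustrative.
import Redis from "ioredis";

const redis = new Redis({
  host: "localhost",
  port: 6379,
  password: "YOUR_REDIS_PASSWORD",
  db: 0, // production uses db 0; the test environment uses db 1
});

redis
  .ping()
  .then((reply) => {
    console.log("Redis replied:", reply); // "PONG"
    return redis.quit();
  })
  .catch((err) => {
    console.error("Redis check failed:", err.message);
    process.exit(1);
  });
```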
## Node.js and Application Setup

### Install Node.js 20.x

```bash
curl -fsSL https://deb.nodesource.com/setup_20.x | sudo -E bash -
sudo apt install -y nodejs
```

Verify installation:

```bash
node --version  # Should output v20.x.x
npm --version
```

### Install System Dependencies for PDF Processing

```bash
sudo apt install -y poppler-utils  # For pdftocairo
```
### Clone and Install Application

```bash
# Create application directory
sudo mkdir -p /opt/flyer-crawler
sudo chown $USER:$USER /opt/flyer-crawler

# Clone repository
cd /opt/flyer-crawler
git clone https://gitea.projectium.com/flyer-crawler/flyer-crawler.projectium.com.git .

# Install dependencies
npm install

# Build for production
npm run build
```
### Configure Environment Variables

**Important:** The flyer-crawler application does not use local environment files in production. All secrets are managed through Gitea CI/CD secrets and injected during deployment.

#### How Secrets Work

- Secrets are stored in Gitea at Repository → Settings → Actions → Secrets
- Workflow files (`.gitea/workflows/deploy-to-prod.yml`) reference secrets using `${{ secrets.SECRET_NAME }}`
- PM2 receives environment variables from the workflow's `env:` block
- `ecosystem.config.cjs` passes variables to the application via `process.env` (see the schematic excerpt below)
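Schematically, the hand-off looks like this (the step name, secret subset, and reload command here are assumptions; the real workflows live in `.gitea/workflows/`):

```yaml
# deploy-to-prod.yml (schematic excerpt, not the real file)
- name: Restart Production Server
  env:
    DB_HOST: ${{ secrets.DB_HOST }}
    DB_PASSWORD: ${{ secrets.DB_PASSWORD }}
    JWT_SECRET: ${{ secrets.JWT_SECRET }}
  run: |
    # PM2 inherits this step's environment; ecosystem.config.cjs then
    # forwards the values to the application via process.env
    pm2 reload ecosystem.config.cjs --update-env
```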
#### Required Gitea Secrets

Before deployment, ensure these secrets are configured in Gitea.

**Shared Secrets** (used by both production and test):

| Secret Name | Description |
|---|---|
| `DB_HOST` | Database hostname (usually `localhost`) |
| `DB_USER` | Database username |
| `DB_PASSWORD` | Database password |
| `JWT_SECRET` | JWT signing secret (min 32 characters) |
| `GOOGLE_MAPS_API_KEY` | Google Maps API key |
| `GOOGLE_CLIENT_ID` | Google OAuth client ID |
| `GOOGLE_CLIENT_SECRET` | Google OAuth client secret |
| `GH_CLIENT_ID` | GitHub OAuth client ID |
| `GH_CLIENT_SECRET` | GitHub OAuth client secret |

**Production-Specific Secrets:**

| Secret Name | Description |
|---|---|
| `DB_DATABASE_PROD` | Production database name (`flyer_crawler`) |
| `REDIS_PASSWORD_PROD` | Redis password for production (uses database 0) |
| `VITE_GOOGLE_GENAI_API_KEY` | Gemini API key for production |
| `SENTRY_DSN` | Bugsink backend DSN (see Bugsink section) |
| `VITE_SENTRY_DSN` | Bugsink frontend DSN |

**Test-Specific Secrets:**

| Secret Name | Description |
|---|---|
| `DB_DATABASE_TEST` | Test database name (`flyer-crawler-test`) |
| `REDIS_PASSWORD_TEST` | Redis password for test (uses database 1 for isolation) |
| `VITE_GOOGLE_GENAI_API_KEY_TEST` | Gemini API key for test environment |
| `SENTRY_DSN_TEST` | Bugsink backend DSN for test (see Bugsink section) |
| `VITE_SENTRY_DSN_TEST` | Bugsink frontend DSN for test |
#### Test Environment Details

The test environment (`flyer-crawler-test.projectium.com`) uses both Gitea CI/CD secrets and a local `.env.test` file:

| Path | Purpose |
|---|---|
| `/var/www/flyer-crawler-test.projectium.com/` | Test application directory |
| `/var/www/flyer-crawler-test.projectium.com/.env.test` | Local overrides for test-specific config |

Key differences from production:

- Uses Redis database 1 (production uses database 0) to isolate job queues
- PM2 processes are named with a `-test` suffix (e.g., `flyer-crawler-api-test`)
- Deployed automatically on every push to the `main` branch
- Has a `.env.test` file for additional local configuration overrides (a hypothetical example follows this list)
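The exact contents of `.env.test` are repo-specific; a purely hypothetical example of the kind of override it might hold (the variable name here is illustrative, not the repo's actual contract):

```
# .env.test - hypothetical example; the real file's variables may differ
LOG_LEVEL=debug
```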
For detailed information on secrets management, see CLAUDE.md.
## PM2 Process Manager

### Install PM2 Globally

```bash
sudo npm install -g pm2
```

### Start Application with PM2

```bash
cd /opt/flyer-crawler
npm run start:prod
```

This starts three processes:

- `flyer-crawler-api` - Main API server (port 3001)
- `flyer-crawler-worker` - Background job worker
- `flyer-crawler-analytics-worker` - Analytics processing worker
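Under the hood, `npm run start:prod` drives PM2 through `ecosystem.config.cjs`; the repo's file is the source of truth. A sketch of the general shape such a config takes (script paths here are assumptions):

```js
// ecosystem.config.cjs - illustrative shape only; see the repo's real file.
const sharedEnv = {
  NODE_ENV: "production",
  // DB/Redis/API secrets arrive via the CI workflow's env: block
};

module.exports = {
  apps: [
    { name: "flyer-crawler-api", script: "dist/server.js", env: sharedEnv },
    { name: "flyer-crawler-worker", script: "dist/worker.js", env: sharedEnv },
    {
      name: "flyer-crawler-analytics-worker",
      script: "dist/analytics-worker.js", // path is an assumption
      env: sharedEnv,
    },
  ],
};
```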
### Configure PM2 Startup

```bash
pm2 startup systemd
# Follow the command output to enable PM2 on boot
pm2 save
```

### PM2 Log Rotation

```bash
pm2 install pm2-logrotate
pm2 set pm2-logrotate:max_size 10M
pm2 set pm2-logrotate:retain 14
pm2 set pm2-logrotate:compress true
```
## NGINX Reverse Proxy

### Install NGINX

```bash
sudo apt install -y nginx
```

### Create Site Configuration

Create `/etc/nginx/sites-available/flyer-crawler.projectium.com`:

```nginx
server {
    listen 80;
    server_name flyer-crawler.projectium.com;

    # Redirect HTTP to HTTPS (uncomment after SSL setup)
    # return 301 https://$server_name$request_uri;

    location / {
        proxy_pass http://localhost:5173;
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection 'upgrade';
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
        proxy_cache_bypass $http_upgrade;
    }

    location /api {
        proxy_pass http://localhost:3001;
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection 'upgrade';
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
        proxy_cache_bypass $http_upgrade;

        # File upload size limit
        client_max_body_size 50M;
    }

    # MIME type fix for .mjs files
    types {
        application/javascript js mjs;
    }
}
```
### Enable the Site

```bash
sudo ln -s /etc/nginx/sites-available/flyer-crawler.projectium.com /etc/nginx/sites-enabled/
sudo nginx -t
sudo systemctl reload nginx
sudo systemctl enable nginx
```
## Bugsink Error Tracking

Bugsink is a lightweight, self-hosted, Sentry-compatible error tracking system. This guide follows the official Bugsink single-server production setup.

See ADR-015 for architecture details.

### Step 1: Create Bugsink User

Create a dedicated non-root user for Bugsink:

```bash
sudo adduser bugsink --disabled-password --gecos ""
```

### Step 2: Set Up Virtual Environment and Install Bugsink

Switch to the bugsink user:

```bash
sudo su - bugsink
```

Create the virtual environment:

```bash
python3 -m venv venv
```

Activate the virtual environment:

```bash
source venv/bin/activate
```

You should see `(venv)` at the beginning of your prompt. Now install Bugsink:

```bash
pip install bugsink --upgrade
bugsink-show-version
```

You should see output like `bugsink 2.x.x`.
### Step 3: Create Configuration File

Generate the configuration file. Replace `bugsink.yourdomain.com` with your actual hostname:

```bash
bugsink-create-conf --template=singleserver --host=bugsink.yourdomain.com
```

This creates `bugsink_conf.py` in `/home/bugsink/`. Edit it to customize settings:

```bash
nano bugsink_conf.py
```

Key settings to review:

| Setting | Description |
|---|---|
| `BASE_URL` | The URL where Bugsink will be accessed (e.g., `https://bugsink.yourdomain.com`) |
| `SITE_TITLE` | Display name for your Bugsink instance |
| `SECRET_KEY` | Auto-generated, but verify it exists |
| `TIME_ZONE` | Your timezone (e.g., `America/New_York`) |
| `USER_REGISTRATION` | Set to `"closed"` to disable public signup |
| `SINGLE_USER` | Set to `True` if only one user will use this instance |
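For orientation, an illustrative excerpt of `bugsink_conf.py` after editing (the setting names come from the table above; the values are examples, and the auto-generated `SECRET_KEY` should be left untouched):

```python
# bugsink_conf.py excerpt - example values only; the generated file has more.
BASE_URL = "https://bugsink.yourdomain.com"
SITE_TITLE = "Flyer Crawler Error Tracking"  # display name is your choice
TIME_ZONE = "America/New_York"
USER_REGISTRATION = "closed"  # disable public signup
SINGLE_USER = True            # single-operator instance
```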
### Step 4: Initialize Database

Bugsink uses SQLite by default, which is recommended for single-server setups. Run the database migrations:

```bash
bugsink-manage migrate
bugsink-manage migrate snappea --database=snappea
```

Verify the database files were created:

```bash
ls *.sqlite3
```

You should see `db.sqlite3` and `snappea.sqlite3`.

### Step 5: Create Admin User

Create the superuser account. Using your email as the username is recommended:

```bash
bugsink-manage createsuperuser
```

**Important:** Save these credentials - you'll need them to log into the Bugsink web UI.

### Step 6: Verify Configuration

Run Django's deployment checks:

```bash
bugsink-manage check_migrations
bugsink-manage check --deploy --fail-level WARNING
```

Exit back to root for the next steps:

```bash
exit
```
### Step 7: Create Gunicorn Service

Create `/etc/systemd/system/gunicorn-bugsink.service`:

```bash
sudo nano /etc/systemd/system/gunicorn-bugsink.service
```

Add the following content:

```ini
[Unit]
Description=Gunicorn daemon for Bugsink
After=network.target

[Service]
Restart=always
Type=notify
User=bugsink
Group=bugsink
Environment="PYTHONUNBUFFERED=1"
WorkingDirectory=/home/bugsink
ExecStart=/home/bugsink/venv/bin/gunicorn \
    --bind="127.0.0.1:8000" \
    --workers=4 \
    --timeout=6 \
    --access-logfile - \
    --max-requests=1000 \
    --max-requests-jitter=100 \
    bugsink.wsgi
ExecReload=/bin/kill -s HUP $MAINPID
KillMode=mixed
TimeoutStopSec=5

[Install]
WantedBy=multi-user.target
```

Enable and start the service:

```bash
sudo systemctl daemon-reload
sudo systemctl enable --now gunicorn-bugsink.service
sudo systemctl status gunicorn-bugsink.service
```

Test that Gunicorn is responding (replace hostname):

```bash
curl http://localhost:8000/accounts/login/ --header "Host: bugsink.yourdomain.com"
```

You should see HTML output containing a login form.
### Step 8: Create Snappea Background Worker Service

Snappea is Bugsink's background task processor. Create `/etc/systemd/system/snappea.service`:

```bash
sudo nano /etc/systemd/system/snappea.service
```

Add the following content:

```ini
[Unit]
Description=Snappea daemon for Bugsink background tasks
After=network.target

[Service]
Restart=always
User=bugsink
Group=bugsink
Environment="PYTHONUNBUFFERED=1"
WorkingDirectory=/home/bugsink
ExecStart=/home/bugsink/venv/bin/bugsink-runsnappea
KillMode=mixed
TimeoutStopSec=5
RuntimeMaxSec=1d

[Install]
WantedBy=multi-user.target
```

Enable and start the service:

```bash
sudo systemctl daemon-reload
sudo systemctl enable --now snappea.service
sudo systemctl status snappea.service
```

Verify Snappea is working:

```bash
sudo su - bugsink
source venv/bin/activate
bugsink-manage checksnappea
exit
```
### Step 9: Configure NGINX for Bugsink

Create `/etc/nginx/sites-available/bugsink`:

```bash
sudo nano /etc/nginx/sites-available/bugsink
```

Add the following (replace `bugsink.yourdomain.com` with your hostname):

```nginx
server {
    server_name bugsink.yourdomain.com;
    listen 80;

    client_max_body_size 20M;

    access_log /var/log/nginx/bugsink.access.log;
    error_log /var/log/nginx/bugsink.error.log;

    location / {
        proxy_pass http://127.0.0.1:8000;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-Proto $scheme;
    }
}
```

Enable the site:

```bash
sudo ln -s /etc/nginx/sites-available/bugsink /etc/nginx/sites-enabled/
sudo nginx -t
sudo systemctl reload nginx
```
### Step 10: Configure SSL with Certbot (Recommended)

Certbot must be installed first; see the [SSL/TLS with Let's Encrypt](#ssltls-with-lets-encrypt) section below.

```bash
sudo certbot --nginx -d bugsink.yourdomain.com
```

After SSL is configured, update the NGINX config to add security headers. Edit `/etc/nginx/sites-available/bugsink` and add to the `location /` block:

```nginx
add_header Strict-Transport-Security "max-age=31536000; preload" always;
```

Reload NGINX:

```bash
sudo nginx -t
sudo systemctl reload nginx
```
### Step 11: Create Projects and Get DSNs

1. Access the Bugsink UI at `https://bugsink.yourdomain.com`
2. Log in with the admin credentials you created
3. Create a new team (or use the default)
4. Create projects for each environment:

   **Production:**
   - `flyer-crawler-backend` (Platform: Node.js)
   - `flyer-crawler-frontend` (Platform: JavaScript/React)

   **Test:**
   - `flyer-crawler-backend-test` (Platform: Node.js)
   - `flyer-crawler-frontend-test` (Platform: JavaScript/React)

5. For each project, go to Settings → Client Keys (DSN)
6. Copy the DSN URLs - you'll have 4 DSNs total (2 for production, 2 for test)

**Note:** The dev container runs its own local Bugsink instance at `localhost:8000` - no remote DSNs are needed for development.
### Step 12: Configure Application to Use Bugsink

The flyer-crawler application receives its configuration via Gitea CI/CD secrets, not local environment files. Follow these steps to add the Bugsink DSNs:

#### 1. Add Secrets in Gitea

Navigate to your repository in Gitea:

- Go to Settings → Actions → Secrets
- Add the following secrets:

**Production DSNs:**

| Secret Name | Value | Description |
|---|---|---|
| `SENTRY_DSN` | `https://KEY@bugsink.yourdomain.com/1` | Production backend DSN |
| `VITE_SENTRY_DSN` | `https://KEY@bugsink.yourdomain.com/2` | Production frontend DSN |

**Test DSNs:**

| Secret Name | Value | Description |
|---|---|---|
| `SENTRY_DSN_TEST` | `https://KEY@bugsink.yourdomain.com/3` | Test backend DSN |
| `VITE_SENTRY_DSN_TEST` | `https://KEY@bugsink.yourdomain.com/4` | Test frontend DSN |

**Note:** The project numbers in the DSN URLs are assigned by Bugsink when you create each project. Use the actual DSN values from Step 11.
#### 2. Update the Deployment Workflows

**Production (`deploy-to-prod.yml`):**

In the Install Backend Dependencies and Restart Production Server step, add to the `env:` block:

```yaml
env:
  # ... existing secrets ...
  # Sentry/Bugsink Error Tracking
  SENTRY_DSN: ${{ secrets.SENTRY_DSN }}
  SENTRY_ENVIRONMENT: 'production'
  SENTRY_ENABLED: 'true'
```

In the build step, add frontend variables:

```bash
VITE_SENTRY_DSN=${{ secrets.VITE_SENTRY_DSN }} \
VITE_SENTRY_ENVIRONMENT=production \
VITE_SENTRY_ENABLED=true \
npm run build
```

**Test (`deploy-to-test.yml`):**

In the Install Backend Dependencies and Restart Test Server step, add to the `env:` block:

```yaml
env:
  # ... existing secrets ...
  # Sentry/Bugsink Error Tracking (Test)
  SENTRY_DSN: ${{ secrets.SENTRY_DSN_TEST }}
  SENTRY_ENVIRONMENT: 'test'
  SENTRY_ENABLED: 'true'
```

In the build step, add frontend variables:

```bash
VITE_SENTRY_DSN=${{ secrets.VITE_SENTRY_DSN_TEST }} \
VITE_SENTRY_ENVIRONMENT=test \
VITE_SENTRY_ENABLED=true \
npm run build
```
#### 3. Update ecosystem.config.cjs

Add the Sentry variables to the `sharedEnv` object in `ecosystem.config.cjs`:

```js
const sharedEnv = {
  // ... existing variables ...
  SENTRY_DSN: process.env.SENTRY_DSN,
  SENTRY_ENVIRONMENT: process.env.SENTRY_ENVIRONMENT,
  SENTRY_ENABLED: process.env.SENTRY_ENABLED,
};
```
#### 4. Dev Container (No Configuration Needed)

The dev container runs its own local Bugsink instance at `http://localhost:8000`. No remote DSNs or Gitea secrets are needed for development:

- DSNs are pre-configured in `compose.dev.yml`
- Admin UI: `http://localhost:8000` (login: `admin@localhost` / `admin`)
- Errors stay local and isolated from production/test

#### 5. Deploy to Apply Changes

Trigger deployments via Gitea Actions:

- **Test:** Automatically deploys on push to `main`
- **Production:** Manual trigger via workflow dispatch

**Note:** There is no `/etc/flyer-crawler/environment` file on the server. Production and test secrets are managed through Gitea CI/CD and injected at deployment time. The dev container uses a local `.env` file. See CLAUDE.md for details.
### Step 13: Test Error Tracking

You can verify that Bugsink is working before configuring the flyer-crawler application.

Switch to the bugsink user and open a Python shell:

```bash
sudo su - bugsink
source venv/bin/activate
bugsink-manage shell
```

In the Python shell, send a test message using the backend DSN from Step 11:

```python
import sentry_sdk
sentry_sdk.init("https://YOUR_BACKEND_KEY@bugsink.yourdomain.com/1")
sentry_sdk.capture_message("Test message from Bugsink setup")
exit()
```

Exit back to root:

```bash
exit
```

Check the Bugsink UI - you should see the test message appear in the flyer-crawler-backend project.
### Step 14: Test from Flyer-Crawler Application (After App Setup)

Once the flyer-crawler application has been deployed with the Sentry secrets configured in Step 12:

```bash
cd /var/www/flyer-crawler.projectium.com
npx tsx scripts/test-bugsink.ts
```

Check the Bugsink UI - you should see a test event appear.
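The repository's `scripts/test-bugsink.ts` is the authoritative test. For reference only, a minimal equivalent using `@sentry/node` might look like this (this sketch is an assumption, not the actual script):

```typescript
// Minimal stand-in for scripts/test-bugsink.ts (illustrative only).
// Assumes `npm install @sentry/node` and SENTRY_DSN in the environment.
import * as Sentry from "@sentry/node";

async function main() {
  Sentry.init({
    dsn: process.env.SENTRY_DSN, // Bugsink accepts standard Sentry DSNs
    environment: process.env.SENTRY_ENVIRONMENT ?? "production",
  });

  Sentry.captureMessage("Test event from flyer-crawler setup");

  // Flush before exit so the event isn't dropped when the process ends
  await Sentry.flush(5000);
  console.log("Test event sent - check the Bugsink UI.");
}

main();
```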
### Bugsink Maintenance Commands

| Task | Command |
|---|---|
| View Gunicorn status | `sudo systemctl status gunicorn-bugsink` |
| View Snappea status | `sudo systemctl status snappea` |
| View Gunicorn logs | `sudo journalctl -u gunicorn-bugsink -f` |
| View Snappea logs | `sudo journalctl -u snappea -f` |
| Restart Bugsink | `sudo systemctl restart gunicorn-bugsink snappea` |
| Run management commands | `sudo su - bugsink`, then `source venv/bin/activate && bugsink-manage <command>` |
| Upgrade Bugsink | `sudo su - bugsink`, then `source venv/bin/activate && pip install bugsink --upgrade`; then `exit` and `sudo systemctl restart gunicorn-bugsink snappea` |
## Logstash Log Aggregation

Logstash aggregates logs from the application and infrastructure, forwarding errors to Bugsink.

**Note:** Logstash integration is optional. The flyer-crawler application already sends errors directly to Bugsink via the Sentry SDK. Logstash is only needed if you want to aggregate logs from other sources (Redis, NGINX, etc.) into Bugsink.

### Step 1: Create Application Log Directory

Create the log directory and set appropriate permissions:

```bash
# Create log directory for the flyer-crawler application
sudo mkdir -p /var/www/flyer-crawler.projectium.com/logs

# Set ownership to the user running the application (typically the deploy user or www-data)
sudo chown -R $USER:$USER /var/www/flyer-crawler.projectium.com/logs

# Ensure the logstash user can read the logs
sudo chmod 755 /var/www/flyer-crawler.projectium.com/logs
```

For the test environment:

```bash
sudo mkdir -p /var/www/flyer-crawler-test.projectium.com/logs
sudo chown -R $USER:$USER /var/www/flyer-crawler-test.projectium.com/logs
sudo chmod 755 /var/www/flyer-crawler-test.projectium.com/logs
```
### Step 2: Configure Application to Write File Logs

The flyer-crawler application uses Pino for logging and currently outputs to stdout (captured by PM2). To enable file-based logging for Logstash, you would need to configure Pino to write to files.

**Current behavior:** Logs go to stdout → PM2 captures them → `~/.pm2/logs/`

**For Logstash integration**, you would need to either:

1. Configure Pino to write directly to files (requires code changes; a sketch follows this list)
2. Use PM2's log files instead (located at `~/.pm2/logs/flyer-crawler-*.log`)
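For option 1, a minimal sketch of what a Pino file transport could look like, assuming Pino v7+ and the log directory from Step 1 (the real application's logger wiring may differ):

```typescript
// logger.ts - illustrative Pino setup writing JSON lines to a file.
// Assumes `npm install pino`; not the app's actual logger configuration.
import pino from "pino";

const logger = pino({
  level: process.env.LOG_LEVEL ?? "info",
  transport: {
    target: "pino/file", // Pino's built-in file transport
    options: {
      destination: "/var/www/flyer-crawler.projectium.com/logs/app.log",
      mkdir: true, // create the logs directory if it is missing
    },
  },
});

logger.info("file logging enabled"); // emitted as a JSON line Logstash can ingest
```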
For now, we'll use PM2's log files, which already exist:

```bash
# Check PM2 log location
ls -la ~/.pm2/logs/
```
### Step 3: Install Logstash

```bash
# Add the Elastic APT repository
wget -qO - https://artifacts.elastic.co/GPG-KEY-elasticsearch | sudo gpg --dearmor -o /usr/share/keyrings/elastic-keyring.gpg
echo "deb [signed-by=/usr/share/keyrings/elastic-keyring.gpg] https://artifacts.elastic.co/packages/8.x/apt stable main" | sudo tee /etc/apt/sources.list.d/elastic-8.x.list

# Update and install
sudo apt update
sudo apt install -y logstash
```

Verify installation:

```bash
/usr/share/logstash/bin/logstash --version
```
### Step 4: Configure Logstash Pipeline

Create the pipeline configuration file:

```bash
sudo nano /etc/logstash/conf.d/bugsink.conf
```

Add the following content (adjust paths as needed):

```
input {
  # PM2 application logs (Pino JSON format)
  # PM2 stores logs in the home directory of the user running PM2
  file {
    path => "/root/.pm2/logs/flyer-crawler-api-out.log"
    codec => json_lines
    type => "pino"
    tags => ["app", "production"]
    start_position => "end"
    sincedb_path => "/var/lib/logstash/sincedb_pino_prod"
  }

  # PM2 error logs
  file {
    path => "/root/.pm2/logs/flyer-crawler-api-error.log"
    type => "pm2-error"
    tags => ["app", "production", "error"]
    start_position => "end"
    sincedb_path => "/var/lib/logstash/sincedb_pm2_error_prod"
  }

  # Test environment logs (if running on same server)
  file {
    path => "/root/.pm2/logs/flyer-crawler-api-test-out.log"
    codec => json_lines
    type => "pino"
    tags => ["app", "test"]
    start_position => "end"
    sincedb_path => "/var/lib/logstash/sincedb_pino_test"
  }

  # Redis logs
  file {
    path => "/var/log/redis/redis-server.log"
    type => "redis"
    tags => ["redis"]
    start_position => "end"
    sincedb_path => "/var/lib/logstash/sincedb_redis"
  }
}

filter {
  # Pino error detection (level 50 = error, 60 = fatal)
  if [type] == "pino" and [level] {
    if [level] >= 50 {
      mutate { add_tag => ["error"] }
    }
  }

  # Redis error detection
  if [type] == "redis" {
    grok {
      match => { "message" => "%{POSINT:pid}:%{WORD:role} %{MONTHDAY} %{MONTH} %{YEAR}? ?%{TIME} %{WORD:loglevel} %{GREEDYDATA:redis_message}" }
    }
    if [loglevel] in ["WARNING", "ERROR"] {
      mutate { add_tag => ["error"] }
    }
  }

  # PM2 error logs are always errors
  if [type] == "pm2-error" {
    mutate { add_tag => ["error"] }
  }
}

output {
  # Only send errors to Bugsink
  if "error" in [tags] {
    http {
      url => "http://localhost:8000/api/1/store/"
      http_method => "post"
      format => "json"
      headers => {
        "X-Sentry-Auth" => "Sentry sentry_version=7, sentry_client=logstash/1.0, sentry_key=YOUR_BACKEND_DSN_KEY"
      }
    }
  }

  # Debug output (remove in production after confirming it works)
  # stdout { codec => rubydebug }
}
```

**Important:** Replace `YOUR_BACKEND_DSN_KEY` with the key from your Bugsink backend DSN. The key is the part before the `@` symbol in the DSN URL.

For example, if your DSN is:

```
https://abc123def456@bugsink.yourdomain.com/1
```

Then `YOUR_BACKEND_DSN_KEY` is `abc123def456`.
### Step 5: Create Logstash State Directory

Logstash needs a directory to track which log lines it has already processed:

```bash
sudo mkdir -p /var/lib/logstash
sudo chown logstash:logstash /var/lib/logstash
```

### Step 6: Grant Logstash Access to PM2 Logs

Logstash runs as the `logstash` user and needs permission to read PM2 logs:

```bash
# Add the logstash user to the group that owns PM2 logs
# If PM2 runs as root:
sudo usermod -a -G root logstash

# Or, make PM2 logs world-readable (less secure but simpler)
sudo chmod 644 /root/.pm2/logs/*.log

# For Redis logs
sudo chmod 644 /var/log/redis/redis-server.log
```

**Note:** If PM2 runs as a different user, adjust the group accordingly.
### Step 7: Test Logstash Configuration

Test the configuration before starting:

```bash
sudo /usr/share/logstash/bin/logstash --config.test_and_exit -f /etc/logstash/conf.d/bugsink.conf
```

You should see `Configuration OK` if there are no errors.

### Step 8: Start Logstash

```bash
sudo systemctl enable logstash
sudo systemctl start logstash
sudo systemctl status logstash
```

View Logstash logs to verify it's working:

```bash
sudo journalctl -u logstash -f
```
### Troubleshooting Logstash

| Issue | Solution |
|---|---|
| "Permission denied" errors | Check file permissions on the log files and the sincedb directory |
| No events being processed | Verify the log file paths exist and contain data |
| HTTP output errors | Check that Bugsink is running and the DSN key is correct |
| Logstash not starting | Run the config test: `sudo /usr/share/logstash/bin/logstash --config.test_and_exit -f /etc/logstash/conf.d/` |
### Alternative: Skip Logstash
Since the flyer-crawler application already sends errors directly to Bugsink via the Sentry SDK (configured in Steps 11-12), you may not need Logstash at all. Logstash is primarily useful for:
- Aggregating logs from services that don't have native Sentry support (Redis, NGINX)
- Centralizing all logs in one place
- Complex log transformations
If you only need application error tracking, the Sentry SDK integration is sufficient.
## SSL/TLS with Let's Encrypt

### Install Certbot

```bash
sudo apt install -y certbot python3-certbot-nginx
```

### Obtain Certificate

```bash
sudo certbot --nginx -d flyer-crawler.projectium.com
```

Certbot will automatically configure NGINX for HTTPS.

### Auto-Renewal

Certbot installs a systemd timer for automatic renewal. Verify:

```bash
sudo systemctl status certbot.timer
```
## Firewall Configuration

### Configure UFW

```bash
sudo ufw default deny incoming
sudo ufw default allow outgoing

# Allow SSH
sudo ufw allow ssh

# Allow HTTP and HTTPS
sudo ufw allow 80/tcp
sudo ufw allow 443/tcp

# Enable firewall
sudo ufw enable
```

**Important:** Bugsink (port 8000) should NOT be exposed externally. It listens on localhost only.
## Maintenance Commands

### Application Management

| Task | Command |
|---|---|
| View PM2 status | `pm2 status` |
| View application logs | `pm2 logs` |
| Restart all processes | `pm2 restart all` |
| Restart specific app | `pm2 restart flyer-crawler-api` |
| Update application | `cd /opt/flyer-crawler && git pull && npm install && npm run build && pm2 restart all` |
### Service Management

| Service | Start | Stop | Status |
|---|---|---|---|
| PostgreSQL | `sudo systemctl start postgresql` | `sudo systemctl stop postgresql` | `sudo systemctl status postgresql` |
| Redis | `sudo systemctl start redis-server` | `sudo systemctl stop redis-server` | `sudo systemctl status redis-server` |
| NGINX | `sudo systemctl start nginx` | `sudo systemctl stop nginx` | `sudo systemctl status nginx` |
| Bugsink | `sudo systemctl start gunicorn-bugsink snappea` | `sudo systemctl stop gunicorn-bugsink snappea` | `sudo systemctl status gunicorn-bugsink snappea` |
| Logstash | `sudo systemctl start logstash` | `sudo systemctl stop logstash` | `sudo systemctl status logstash` |
### Database Backup

```bash
# Backup application database
pg_dump -U flyer_crawler -h localhost flyer_crawler > backup_$(date +%Y%m%d).sql

# Backup Bugsink databases (Bugsink uses SQLite, not PostgreSQL - see Step 4)
# Requires the sqlite3 CLI: sudo apt install -y sqlite3
sudo -u bugsink sqlite3 /home/bugsink/db.sqlite3 ".backup /home/bugsink/db_backup_$(date +%Y%m%d).sqlite3"
sudo -u bugsink sqlite3 /home/bugsink/snappea.sqlite3 ".backup /home/bugsink/snappea_backup_$(date +%Y%m%d).sqlite3"
```
### Log Locations

| Log | Location |
|---|---|
| Application (PM2) | `~/.pm2/logs/` |
| NGINX access | `/var/log/nginx/access.log` |
| NGINX error | `/var/log/nginx/error.log` |
| PostgreSQL | `/var/log/postgresql/` |
| Redis | `/var/log/redis/` |
| Bugsink | `journalctl -u gunicorn-bugsink` (and `journalctl -u snappea`) |
| Logstash | `/var/log/logstash/` |
## Related Documentation
- DEPLOYMENT.md - Container-based deployment
- DATABASE.md - Database schema and extensions
- AUTHENTICATION.md - OAuth provider setup
- ADR-015 - Error tracking architecture