Bare-Metal Server Setup Guide

This guide covers the manual installation of Flyer Crawler and its dependencies on a bare-metal Ubuntu server (e.g., a colocation server). This is the definitive reference for setting up a production environment without containers.

Target Environment: Ubuntu 22.04 LTS (or newer)


Table of Contents

  1. System Prerequisites
  2. PostgreSQL Setup
  3. Redis Setup
  4. Node.js and Application Setup
  5. PM2 Process Manager
  6. NGINX Reverse Proxy
  7. Bugsink Error Tracking
  8. Logstash Log Aggregation
  9. SSL/TLS with Let's Encrypt
  10. Firewall Configuration
  11. Maintenance Commands

System Prerequisites

Update the system and install essential packages:

sudo apt update && sudo apt upgrade -y
sudo apt install -y curl git build-essential python3 python3-pip python3-venv

PostgreSQL Setup

Install PostgreSQL 14+ with PostGIS

# Add PostgreSQL APT repository (apt-key is deprecated; use a dedicated keyring)
wget --quiet -O - https://www.postgresql.org/media/keys/ACCC4CF8.asc | sudo gpg --dearmor -o /usr/share/keyrings/postgresql-keyring.gpg
sudo sh -c 'echo "deb [signed-by=/usr/share/keyrings/postgresql-keyring.gpg] http://apt.postgresql.org/pub/repos/apt $(lsb_release -cs)-pgdg main" > /etc/apt/sources.list.d/pgdg.list'
sudo apt update

# Install PostgreSQL and PostGIS
sudo apt install -y postgresql-14 postgresql-14-postgis-3

Create Application Database and User

sudo -u postgres psql
-- Create application user and database
CREATE USER flyer_crawler WITH PASSWORD 'YOUR_SECURE_PASSWORD';
CREATE DATABASE flyer_crawler OWNER flyer_crawler;

-- Connect to the database and enable extensions
\c flyer_crawler

CREATE EXTENSION IF NOT EXISTS postgis;
CREATE EXTENSION IF NOT EXISTS pg_trgm;
CREATE EXTENSION IF NOT EXISTS pgcrypto;

-- Grant privileges
GRANT ALL PRIVILEGES ON DATABASE flyer_crawler TO flyer_crawler;

\q
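To verify the new role and extensions, you can connect as the application user (it will prompt for the password set above):

```shell
# List installed extensions in the application database
psql -U flyer_crawler -h localhost -d flyer_crawler -c "\dx"
```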

Create Bugsink Database (for error tracking)

sudo -u postgres psql
-- Create dedicated Bugsink user and database
CREATE USER bugsink WITH PASSWORD 'BUGSINK_SECURE_PASSWORD';
CREATE DATABASE bugsink OWNER bugsink;
GRANT ALL PRIVILEGES ON DATABASE bugsink TO bugsink;

\q

Configure PostgreSQL for Remote Access (if needed)

Edit /etc/postgresql/14/main/postgresql.conf:

listen_addresses = 'localhost'  # Change to '*' for remote access

Edit /etc/postgresql/14/main/pg_hba.conf to add allowed hosts:

# Local connections
local   all   all   peer
host    all   all   127.0.0.1/32   scram-sha-256

Restart PostgreSQL:

sudo systemctl restart postgresql
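After restarting, confirm the server is accepting connections:

```shell
# pg_isready exits 0 when the server is accepting connections
pg_isready -h localhost -p 5432
```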

Redis Setup

Install Redis

sudo apt install -y redis-server

Configure Redis Password

Edit /etc/redis/redis.conf:

requirepass YOUR_REDIS_PASSWORD

Restart Redis:

sudo systemctl restart redis-server
sudo systemctl enable redis-server

Test Redis Connection

redis-cli -a YOUR_REDIS_PASSWORD ping
# Should output: PONG

Node.js and Application Setup

Install Node.js 20.x

curl -fsSL https://deb.nodesource.com/setup_20.x | sudo -E bash -
sudo apt install -y nodejs

Verify installation:

node --version  # Should output v20.x.x
npm --version

Install System Dependencies for PDF Processing

sudo apt install -y poppler-utils  # For pdftocairo

Clone and Install Application

# Create application directory
sudo mkdir -p /opt/flyer-crawler
sudo chown $USER:$USER /opt/flyer-crawler

# Clone repository
cd /opt/flyer-crawler
git clone https://gitea.projectium.com/flyer-crawler/flyer-crawler.projectium.com.git .

# Install dependencies
npm install

# Build for production
npm run build

Configure Environment Variables

Create a systemd environment file at /etc/flyer-crawler/environment:

sudo mkdir -p /etc/flyer-crawler
sudo nano /etc/flyer-crawler/environment

Add the following (replace with actual values):

# Database
DB_HOST=localhost
DB_USER=flyer_crawler
DB_PASSWORD=YOUR_SECURE_PASSWORD
DB_DATABASE_PROD=flyer_crawler

# Redis
REDIS_HOST=localhost
REDIS_PORT=6379
REDIS_PASSWORD_PROD=YOUR_REDIS_PASSWORD

# Authentication
JWT_SECRET=YOUR_LONG_RANDOM_JWT_SECRET

# Google APIs
VITE_GOOGLE_GENAI_API_KEY=YOUR_GEMINI_API_KEY
GOOGLE_MAPS_API_KEY=YOUR_MAPS_API_KEY

# Sentry/Bugsink Error Tracking (ADR-015)
SENTRY_DSN=http://BACKEND_KEY@localhost:8000/1
VITE_SENTRY_DSN=http://FRONTEND_KEY@localhost:8000/2
SENTRY_ENVIRONMENT=production
VITE_SENTRY_ENVIRONMENT=production
SENTRY_ENABLED=true
VITE_SENTRY_ENABLED=true
SENTRY_DEBUG=false
VITE_SENTRY_DEBUG=false

# Application
NODE_ENV=production
PORT=3001

Secure the file:

sudo chmod 600 /etc/flyer-crawler/environment
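JWT_SECRET should be long and unpredictable; one way to generate a suitable value (any cryptographic random source works):

```shell
# 64 random bytes, base64-encoded (88 characters)
openssl rand -base64 64 | tr -d '\n'; echo
```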

PM2 Process Manager

Install PM2 Globally

sudo npm install -g pm2

Start Application with PM2

cd /opt/flyer-crawler
npm run start:prod

This starts three processes:

  • flyer-crawler-api - Main API server (port 3001)
  • flyer-crawler-worker - Background job worker
  • flyer-crawler-analytics-worker - Analytics processing worker

Configure PM2 Startup

pm2 startup systemd
# Follow the command output to enable PM2 on boot

pm2 save

PM2 Log Rotation

pm2 install pm2-logrotate
pm2 set pm2-logrotate:max_size 10M
pm2 set pm2-logrotate:retain 14
pm2 set pm2-logrotate:compress true

NGINX Reverse Proxy

Install NGINX

sudo apt install -y nginx

Create Site Configuration

Create /etc/nginx/sites-available/flyer-crawler.projectium.com:

server {
    listen 80;
    server_name flyer-crawler.projectium.com;

    # Redirect HTTP to HTTPS (uncomment after SSL setup)
    # return 301 https://$server_name$request_uri;

    location / {
        proxy_pass http://localhost:5173;
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection 'upgrade';
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
        proxy_cache_bypass $http_upgrade;
    }

    location /api {
        proxy_pass http://localhost:3001;
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection 'upgrade';
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
        proxy_cache_bypass $http_upgrade;

        # File upload size limit
        client_max_body_size 50M;
    }

    # MIME type fix for .mjs files
    types {
        application/javascript js mjs;
    }
}

Enable the Site

sudo ln -s /etc/nginx/sites-available/flyer-crawler.projectium.com /etc/nginx/sites-enabled/
sudo nginx -t
sudo systemctl reload nginx
sudo systemctl enable nginx
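With the PM2 processes running, a quick smoke test from the server itself (the Host header stands in for DNS if the domain is not yet pointed at this machine):

```shell
# Expect an HTTP status line such as "HTTP/1.1 200 OK"
curl -sI -H "Host: flyer-crawler.projectium.com" http://localhost/ | head -n 1
```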

Bugsink Error Tracking

Bugsink is a lightweight, self-hosted Sentry-compatible error tracking system. See ADR-015 for architecture details.

Install Bugsink

# Create directories for the virtual environment and wrapper scripts
sudo mkdir -p /opt/bugsink/bin
sudo python3 -m venv /opt/bugsink/venv

# Install Bugsink into the virtual environment
# (the venv is root-owned, so invoke its pip directly with sudo)
sudo /opt/bugsink/venv/bin/pip install bugsink

# Create wrapper scripts
sudo tee /opt/bugsink/bin/bugsink-manage << 'EOF'
#!/bin/bash
source /opt/bugsink/venv/bin/activate
exec python -m bugsink.manage "$@"
EOF

sudo tee /opt/bugsink/bin/bugsink-runserver << 'EOF'
#!/bin/bash
source /opt/bugsink/venv/bin/activate
exec python -m bugsink.runserver "$@"
EOF

sudo chmod +x /opt/bugsink/bin/bugsink-manage /opt/bugsink/bin/bugsink-runserver

Configure Bugsink

Create /etc/bugsink/environment:

sudo mkdir -p /etc/bugsink
sudo nano /etc/bugsink/environment

Add the following (replace with actual values):

SECRET_KEY=YOUR_RANDOM_50_CHAR_SECRET_KEY
DATABASE_URL=postgresql://bugsink:BUGSINK_SECURE_PASSWORD@localhost:5432/bugsink
BASE_URL=http://localhost:8000
PORT=8000

Secure the file:

sudo chmod 600 /etc/bugsink/environment
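One way to produce the 50-character secret key:

```shell
# 25 random bytes, hex-encoded -> exactly 50 characters
openssl rand -hex 25
```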

Initialize Bugsink Database

# Export the variables so the management commands can see them
set -a; source /etc/bugsink/environment; set +a

/opt/bugsink/bin/bugsink-manage migrate
/opt/bugsink/bin/bugsink-manage migrate --database=snappea

Create Bugsink Admin User

/opt/bugsink/bin/bugsink-manage createsuperuser

Create Systemd Service

Create /etc/systemd/system/bugsink.service:

[Unit]
Description=Bugsink Error Tracking
After=network.target postgresql.service

[Service]
Type=simple
User=www-data
Group=www-data
EnvironmentFile=/etc/bugsink/environment
ExecStart=/opt/bugsink/bin/bugsink-runserver 0.0.0.0:8000
Restart=always
RestartSec=5

[Install]
WantedBy=multi-user.target

Reload systemd and start the service:

sudo systemctl daemon-reload
sudo systemctl enable bugsink
sudo systemctl start bugsink
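Once started, confirm the service answers HTTP locally:

```shell
# Expect a 200 (or a 302 redirect to the login page)
curl -s -o /dev/null -w '%{http_code}\n' http://localhost:8000/
```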

Create Bugsink Projects and Get DSNs

  1. Access Bugsink UI at http://localhost:8000
  2. Log in with admin credentials
  3. Create projects:
    • flyer-crawler-backend (Platform: Node.js)
    • flyer-crawler-frontend (Platform: React)
  4. Copy the DSNs from each project's settings
  5. Update /etc/flyer-crawler/environment with the DSNs

Test Error Tracking

cd /opt/flyer-crawler
npx tsx scripts/test-bugsink.ts

Check Bugsink UI for test events.


Logstash Log Aggregation

Logstash aggregates logs from the application and infrastructure, forwarding errors to Bugsink.

Install Logstash

# Add Elastic APT repository
wget -qO - https://artifacts.elastic.co/GPG-KEY-elasticsearch | sudo gpg --dearmor -o /usr/share/keyrings/elastic-keyring.gpg
echo "deb [signed-by=/usr/share/keyrings/elastic-keyring.gpg] https://artifacts.elastic.co/packages/8.x/apt stable main" | sudo tee /etc/apt/sources.list.d/elastic-8.x.list

sudo apt update
sudo apt install -y logstash

Configure Logstash Pipeline

Create /etc/logstash/conf.d/bugsink.conf:

input {
  # Pino application logs
  file {
    path => "/opt/flyer-crawler/logs/*.log"
    codec => json
    type => "pino"
    tags => ["app"]
  }

  # Redis logs
  file {
    path => "/var/log/redis/*.log"
    type => "redis"
    tags => ["redis"]
  }
}

filter {
  # Pino error detection (level 50 = error, 60 = fatal)
  if [type] == "pino" and [level] >= 50 {
    mutate { add_tag => ["error"] }
  }

  # Redis error detection
  if [type] == "redis" {
    grok {
      match => { "message" => "%{POSINT:pid}:%{WORD:role} %{MONTHDAY} %{MONTH} %{TIME} %{WORD:loglevel} %{GREEDYDATA:redis_message}" }
    }
    if [loglevel] in ["WARNING", "ERROR"] {
      mutate { add_tag => ["error"] }
    }
  }
}

output {
  if "error" in [tags] {
    http {
      url => "http://localhost:8000/api/1/store/"
      http_method => "post"
      format => "json"
      headers => {
        "X-Sentry-Auth" => "Sentry sentry_version=7, sentry_client=logstash/1.0, sentry_key=YOUR_BACKEND_DSN_KEY"
      }
    }
  }
}

Replace YOUR_BACKEND_DSN_KEY with the key from your backend project DSN.

Start Logstash

sudo systemctl enable logstash
sudo systemctl start logstash
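If Logstash fails to start, the pipeline syntax can be validated without running it:

```shell
# Parse the configuration and exit; reports syntax errors without starting the pipeline
sudo /usr/share/logstash/bin/logstash --config.test_and_exit -f /etc/logstash/conf.d/bugsink.conf
```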

SSL/TLS with Let's Encrypt

Install Certbot

sudo apt install -y certbot python3-certbot-nginx

Obtain Certificate

sudo certbot --nginx -d flyer-crawler.projectium.com

Certbot will automatically configure NGINX for HTTPS.

Auto-Renewal

Certbot installs a systemd timer for automatic renewal. Verify:

sudo systemctl status certbot.timer
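A renewal can also be rehearsed end to end without replacing the live certificate:

```shell
# Performs a test renewal against the Let's Encrypt staging environment
sudo certbot renew --dry-run
```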

Firewall Configuration

Configure UFW

sudo ufw default deny incoming
sudo ufw default allow outgoing

# Allow SSH
sudo ufw allow ssh

# Allow HTTP and HTTPS
sudo ufw allow 80/tcp
sudo ufw allow 443/tcp

# Enable firewall
sudo ufw enable
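Review the resulting rule set:

```shell
# Show active rules, default policies, and logging state
sudo ufw status verbose
```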

Important: Bugsink (port 8000) should NOT be exposed externally. The UFW rules above leave port 8000 closed, so the service is reachable only from the server itself.


Maintenance Commands

Application Management

| Task | Command |
| --- | --- |
| View PM2 status | pm2 status |
| View application logs | pm2 logs |
| Restart all processes | pm2 restart all |
| Restart specific app | pm2 restart flyer-crawler-api |
| Update application | cd /opt/flyer-crawler && git pull && npm install && npm run build && pm2 restart all |

Service Management

| Service | Start | Stop | Status |
| --- | --- | --- | --- |
| PostgreSQL | sudo systemctl start postgresql | sudo systemctl stop postgresql | sudo systemctl status postgresql |
| Redis | sudo systemctl start redis-server | sudo systemctl stop redis-server | sudo systemctl status redis-server |
| NGINX | sudo systemctl start nginx | sudo systemctl stop nginx | sudo systemctl status nginx |
| Bugsink | sudo systemctl start bugsink | sudo systemctl stop bugsink | sudo systemctl status bugsink |
| Logstash | sudo systemctl start logstash | sudo systemctl stop logstash | sudo systemctl status logstash |

Database Backup

# Backup application database
pg_dump -U flyer_crawler -h localhost flyer_crawler > backup_$(date +%Y%m%d).sql

# Backup Bugsink database
pg_dump -U bugsink -h localhost bugsink > bugsink_backup_$(date +%Y%m%d).sql
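Restoring works in the opposite direction (the filename is illustrative; substitute the dump you want to restore):

```shell
# Restore the application database from a dated dump
psql -U flyer_crawler -h localhost flyer_crawler < backup_YYYYMMDD.sql
```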

Log Locations

| Log | Location |
| --- | --- |
| Application (PM2) | ~/.pm2/logs/ |
| NGINX access | /var/log/nginx/access.log |
| NGINX error | /var/log/nginx/error.log |
| PostgreSQL | /var/log/postgresql/ |
| Redis | /var/log/redis/ |
| Bugsink | journalctl -u bugsink |
| Logstash | /var/log/logstash/ |