Files
flyer-crawler.projectium.com/docs/operations/DEPLOYMENT.md
Torben Sorensen fe79522ea4
All checks were successful
Deploy to Test Environment / deploy-to-test (push) Successful in 27m44s
pm2 save fix + test work
2026-02-18 15:43:20 -08:00

499 lines
16 KiB
Markdown

# Deployment Guide
This guide covers deploying Flyer Crawler to a production server.
**Last verified**: 2026-01-28
**Related documentation**:
- [ADR-014: Containerization and Deployment Strategy](../adr/0014-containerization-and-deployment-strategy.md)
- [ADR-015: Error Tracking and Observability](../adr/0015-error-tracking-and-observability.md)
- [Bare-Metal Setup Guide](BARE-METAL-SETUP.md)
- [Monitoring Guide](MONITORING.md)
---
## Quick Reference
### Command Reference Table
| Task | Command |
| -------------------- | ----------------------------------------------------------------------------------------------- |
| Deploy to production | Gitea Actions workflow (manual trigger) |
| Deploy to test | Automatic on push to `main` |
| Check PM2 status | `pm2 list` |
| View logs | `pm2 logs flyer-crawler-api --lines 100` |
| Restart all | `pm2 restart flyer-crawler-api flyer-crawler-worker flyer-crawler-analytics-worker && pm2 save` |
| Check NGINX | `sudo nginx -t && sudo systemctl status nginx` |
| Check health | `curl -s https://flyer-crawler.projectium.com/api/health/ready \| jq .` |
### Deployment URLs
| Environment | URL | API Port |
| ------------- | ------------------------------------------- | -------- |
| Production | `https://flyer-crawler.projectium.com` | 3001 |
| Test | `https://flyer-crawler-test.projectium.com` | 3002 |
| Dev Container | `https://localhost` | 3001 |
---
## Server Access Model
**Important**: Claude Code (and AI tools) have **READ-ONLY** access to production/test servers. The deployment workflow is:
| Actor | Capability |
| ------------ | --------------------------------------------------------------- |
| Gitea CI/CD | Automated deployments via workflows (has write access) |
| User (human) | Manual server access for troubleshooting and emergency fixes |
| Claude Code | Provides commands for user to execute; cannot run them directly |
When troubleshooting deployment issues:
1. Claude provides **diagnostic commands** for the user to run
2. User executes commands and reports output
3. Claude analyzes results and provides **fix commands** (1-3 at a time)
4. User executes fixes and reports results
5. Claude provides **verification commands** to confirm success
---
## Prerequisites
| Component | Version | Purpose |
| ---------- | --------- | ------------------------------- |
| Ubuntu | 22.04 LTS | Operating system |
| PostgreSQL | 14+ | Database with PostGIS extension |
| Redis | 6+ | Caching and job queues |
| Node.js | 20.x LTS | Application runtime |
| NGINX | 1.18+ | Reverse proxy and static files |
| PM2 | Latest | Process manager |
**Verify prerequisites**:
```bash
node --version # Should be v20.x.x
psql --version # Should be 14+
redis-cli ping # Should return PONG
nginx -v # Should be 1.18+
pm2 --version # Any recent version
```
## Dev Container Parity (ADR-014)
The development container now matches production architecture:
| Component | Production | Dev Container |
| ------------ | ---------------- | -------------------------- |
| Process Mgmt | PM2 | PM2 |
| API Server | PM2 cluster mode | PM2 fork + tsx watch |
| Worker | PM2 process | PM2 process + tsx watch |
| Logs | PM2 -> Logstash | PM2 -> Logstash -> Bugsink |
| HTTPS | Let's Encrypt | Self-signed (mkcert) |
This ensures issues caught in dev will behave the same in production.
**Dev Container Guide**: See [docs/development/DEV-CONTAINER.md](../development/DEV-CONTAINER.md)
---
## Server Setup
### Install Node.js
```bash
curl -sL https://deb.nodesource.com/setup_20.x | sudo bash -
sudo apt-get install -y nodejs
```
### Install PM2
```bash
sudo npm install -g pm2
```
---
## Application Deployment
### Clone and Install
```bash
git clone <repository-url>
cd flyer-crawler.projectium.com
npm install
```
### Build for Production
```bash
npm run build
```
### Start with PM2
```bash
npm run start:prod
```
This starts three PM2 processes:
- `flyer-crawler-api` - Main API server
- `flyer-crawler-worker` - Background job worker
- `flyer-crawler-analytics-worker` - Analytics processing worker
---
## Environment Variables (Gitea Secrets)
For deployments using Gitea CI/CD workflows, configure these as **repository secrets**:
| Secret | Description |
| --------------------------- | ------------------------------------------- |
| `DB_HOST` | PostgreSQL server hostname |
| `DB_USER` | PostgreSQL username |
| `DB_PASSWORD` | PostgreSQL password |
| `DB_DATABASE_PROD` | Production database name |
| `REDIS_PASSWORD_PROD` | Production Redis password |
| `REDIS_PASSWORD_TEST` | Test Redis password |
| `JWT_SECRET` | Long, random string for signing auth tokens |
| `VITE_GOOGLE_GENAI_API_KEY` | Google Gemini API key |
| `GOOGLE_MAPS_API_KEY` | Google Maps Geocoding API key |
---
## NGINX Configuration
### Reference Configuration Files
The repository contains reference copies of the production NGINX configurations for documentation and version control purposes:
| File | Server Config |
| ----------------------------------------------------------------- | ------------------------- |
| `etc-nginx-sites-available-flyer-crawler.projectium.com` | Production NGINX config |
| `etc-nginx-sites-available-flyer-crawler-test-projectium-com.txt` | Test/staging NGINX config |
**Important Notes:**
- These are **reference copies only** - they are not used directly by NGINX
- The actual live configurations reside on the server at `/etc/nginx/sites-available/`
- When modifying server NGINX configs, update these reference files to keep them in sync
- Use the dev container config at `docker/nginx/dev.conf` for local development
### Reverse Proxy Setup
Create a site configuration at `/etc/nginx/sites-available/flyer-crawler.projectium.com`:
```nginx
server {
listen 80;
server_name flyer-crawler.projectium.com;
location / {
proxy_pass http://localhost:5173;
proxy_http_version 1.1;
proxy_set_header Upgrade $http_upgrade;
proxy_set_header Connection 'upgrade';
proxy_set_header Host $host;
proxy_cache_bypass $http_upgrade;
}
location /api {
proxy_pass http://localhost:3001;
proxy_http_version 1.1;
proxy_set_header Upgrade $http_upgrade;
proxy_set_header Connection 'upgrade';
proxy_set_header Host $host;
proxy_cache_bypass $http_upgrade;
}
# Serve flyer images from static storage (7-day cache)
location /flyer-images/ {
alias /var/www/flyer-crawler.projectium.com/flyer-images/;
expires 7d;
add_header Cache-Control "public, immutable";
}
}
```
### Static Flyer Images
Flyer images are served as static files from the `/flyer-images/` path with browser caching enabled:
| Environment | Directory | URL Pattern |
| ------------- | ---------------------------------------------------------- | --------------------------------------------------------- |
| Production | `/var/www/flyer-crawler.projectium.com/flyer-images/` | `https://flyer-crawler.projectium.com/flyer-images/` |
| Test | `/var/www/flyer-crawler-test.projectium.com/flyer-images/` | `https://flyer-crawler-test.projectium.com/flyer-images/` |
| Dev Container | `/app/public/flyer-images/` | `https://localhost/flyer-images/` |
**Cache Settings**: Files are served with `expires 7d` and `Cache-Control: public, immutable` headers for optimal browser caching.
Create the flyer images directory if it does not exist:
```bash
sudo mkdir -p /var/www/flyer-crawler.projectium.com/flyer-images
sudo chown www-data:www-data /var/www/flyer-crawler.projectium.com/flyer-images
```
Enable the site:
```bash
sudo ln -s /etc/nginx/sites-available/flyer-crawler.projectium.com /etc/nginx/sites-enabled/
sudo nginx -t
sudo systemctl reload nginx
```
### MIME Types Fix for .mjs Files
If JavaScript modules (`.mjs` files) aren't loading correctly, add the proper MIME type.
**Option 1**: Edit the site configuration file directly:
```nginx
# Add inside the server block
types {
application/javascript js mjs;
}
```
**Option 2**: Edit `/etc/nginx/mime.types` globally:
```text
# Change this line:
application/javascript js;
# To:
application/javascript js mjs;
```
After changes:
```bash
sudo nginx -t
sudo systemctl reload nginx
```
---
## PM2 Process Management
### Critical: Always Save After State Changes
**CRITICAL**: Every `pm2 start`, `pm2 restart`, `pm2 stop`, or `pm2 delete` command MUST be immediately followed by `pm2 save`.
Without `pm2 save`, processes become ephemeral and will disappear on:
- PM2 daemon restarts
- Server reboots
- Internal PM2 reconciliation events
**Correct Pattern:**
```bash
# ✅ CORRECT - Always save after state changes
pm2 restart flyer-crawler-api && pm2 save
pm2 stop flyer-crawler-worker && pm2 save
pm2 delete flyer-crawler-analytics-worker && pm2 save
pm2 startOrReload ecosystem.config.cjs --env production --update-env && pm2 save
# ❌ WRONG - Missing save (processes become ephemeral)
pm2 restart flyer-crawler-api
pm2 stop flyer-crawler-worker
# ✅ Read-only operations don't need save
pm2 list
pm2 logs flyer-crawler-api
pm2 monit
```
**Why This Matters**: PM2 maintains an in-memory process list. The `pm2 save` command writes this list to `~/.pm2/dump.pm2`, which PM2 uses to resurrect processes after daemon restarts. Without it, your carefully managed process state is lost.
### PM2 Log Management
Install and configure pm2-logrotate to manage log files:
```bash
pm2 install pm2-logrotate
pm2 set pm2-logrotate:max_size 10M
pm2 set pm2-logrotate:retain 14
pm2 set pm2-logrotate:compress false
pm2 set pm2-logrotate:dateFormat YYYY-MM-DD_HH-mm-ss
```
---
## Rate Limiting
The application respects the Gemini AI service's rate limits. You can adjust the `GEMINI_RPM` (requests per minute) environment variable in production as needed without changing the code.
---
## CI/CD Pipeline
The project includes Gitea workflows at `.gitea/workflows/deploy.yml` that:
1. Run tests against a test database
2. Build the application
3. Deploy to production on successful builds
The workflow automatically:
- Sets up the test database schema before tests
- Tears down test data after tests complete
- Deploys to the production server
---
## Monitoring
### Check PM2 Status
```bash
pm2 status
pm2 logs
pm2 logs flyer-crawler-api --lines 100
```
### Restart Services
```bash
pm2 restart all
pm2 restart flyer-crawler-api
```
---
## Error Tracking with Bugsink (ADR-015)
Bugsink is a self-hosted Sentry-compatible error tracking system. See [docs/adr/0015-application-performance-monitoring-and-error-tracking.md](docs/adr/0015-application-performance-monitoring-and-error-tracking.md) for the full architecture decision.
### Creating Bugsink Projects and DSNs
After Bugsink is installed and running, you need to create projects and obtain DSNs:
1. **Access Bugsink UI**: Navigate to `http://localhost:8000`
2. **Log in** with your admin credentials
3. **Create Backend Project**:
- Click "Create Project"
- Name: `flyer-crawler-backend`
- Platform: Node.js
- Copy the generated DSN (format: `http://<key>@localhost:8000/<project_id>`)
4. **Create Frontend Project**:
- Click "Create Project"
- Name: `flyer-crawler-frontend`
- Platform: React
- Copy the generated DSN
5. **Configure Environment Variables**:
```bash
# Backend (server-side)
export SENTRY_DSN=http://<backend-key>@localhost:8000/<backend-project-id>
# Frontend (client-side, exposed to browser)
export VITE_SENTRY_DSN=http://<frontend-key>@localhost:8000/<frontend-project-id>
# Shared settings
export SENTRY_ENVIRONMENT=production
export VITE_SENTRY_ENVIRONMENT=production
export SENTRY_ENABLED=true
export VITE_SENTRY_ENABLED=true
```
### Testing Error Tracking
Verify Bugsink is receiving events:
```bash
npx tsx scripts/test-bugsink.ts
```
This sends test error and info events. Check the Bugsink UI for:
- `BugsinkTestError` in the backend project
- Info message "Test info message from test-bugsink.ts"
### Sentry SDK v10+ HTTP DSN Limitation
The Sentry SDK v10+ enforces HTTPS-only DSNs by default. Since Bugsink runs locally over HTTP, our implementation uses the Sentry Store API directly instead of the SDK's built-in transport. This is handled transparently by the `sentry.server.ts` and `sentry.client.ts` modules.
---
## Deployment Troubleshooting
### Decision Tree: Deployment Issues
```text
Deployment failed?
|
+-- Build step failed?
| |
| +-- TypeScript errors --> Fix type issues, run `npm run type-check`
| +-- Missing dependencies --> Run `npm ci`
| +-- Out of memory --> Increase Node heap size
|
+-- Tests failed?
| |
| +-- Database connection --> Check DB_HOST, credentials
| +-- Redis connection --> Check REDIS_URL
| +-- Test isolation --> Check for race conditions
|
+-- SSH/Deploy failed?
|
+-- Permission denied --> Check SSH keys in Gitea secrets
+-- Host unreachable --> Check firewall, VPN
+-- PM2 error --> Check PM2 logs on server
```
### Common Deployment Issues
| Symptom | Diagnosis | Solution |
| ------------------------------------ | ----------------------- | ------------------------------------------------ |
| "Connection refused" on health check | API not started | Check `pm2 logs flyer-crawler-api` |
| 502 Bad Gateway | NGINX cannot reach API | Verify API port (3001), restart PM2 |
| CSS/JS not loading | Build artifacts missing | Re-run `npm run build`, check NGINX static paths |
| Database migrations failed | Schema mismatch | Run migrations manually, check DB connectivity |
| "ENOSPC" error | Disk full | Clear old logs: `pm2 flush`, clean npm cache |
| SSL certificate error | Cert expired/missing | Run `certbot renew`, check NGINX config |
### Post-Deployment Verification Checklist
After every deployment, verify:
- [ ] Health check passes: `curl -s https://flyer-crawler.projectium.com/api/health/ready`
- [ ] PM2 processes running: `pm2 list` shows `online` status
- [ ] No recent errors: Check Bugsink for new issues
- [ ] Frontend loads: Browser shows login page
- [ ] API responds: `curl https://flyer-crawler.projectium.com/api/health/ping`
### Rollback Procedure
If deployment causes issues:
```bash
# 1. Check current release
cd /var/www/flyer-crawler.projectium.com
git log --oneline -5
# 2. Revert to previous commit
git checkout HEAD~1
# 3. Rebuild and restart
npm ci && npm run build
pm2 restart all
# 4. Verify health
curl -s http://localhost:3001/api/health/ready | jq .
```
---
## Related Documentation
- [Database Setup](../architecture/DATABASE.md) - PostgreSQL and PostGIS configuration
- [Monitoring Guide](MONITORING.md) - Health checks and error tracking
- [Logstash Quick Reference](LOGSTASH-QUICK-REF.md) - Log aggregation
- [Bare-Metal Server Setup](BARE-METAL-SETUP.md) - Manual server installation guide