# ADR-014: Containerization and Deployment Strategy
**Date**: 2025-12-12

**Status**: Implemented

**Implemented**: 2026-01-09

## Context
The project is currently run using `pm2`, and the `README.md` contains manual setup instructions. While functional, this setup lacks the portability, scalability, and consistency of modern deployment practices. Local development environments also suffer from inconsistencies.
## Decision
We will standardize the deployment process using a hybrid approach:
1. **PM2 for Production**: Use PM2 cluster mode for process management, load balancing, and zero-downtime reloads.
2. **Docker/Podman for Development**: Provide a complete containerized development environment with automatic initialization.
3. **VS Code Dev Containers**: Enable one-click development environment setup.
4. **Gitea Actions for CI/CD**: Automated deployment pipelines handle builds and deployments.
## Consequences
- **Positive**: Ensures consistency between development and production environments. Simplifies the setup for new developers to a single "Reopen in Container" action. Improves portability and scalability of the application.
- **Negative**: Requires Docker/Podman installation. Container builds take time on first setup.
## Implementation Details
### Quick Start (Development)
```bash
# Prerequisites:
# - Docker Desktop or Podman installed
# - VS Code with "Dev Containers" extension
# Option 1: VS Code Dev Containers (Recommended)
# 1. Open project in VS Code
# 2. Click "Reopen in Container" when prompted
# 3. Wait for initialization to complete
# 4. Development server starts automatically
# Option 2: Manual Compose (Podman commands shown)
podman-compose -f compose.dev.yml up -d
podman exec -it flyer-crawler-dev bash
./scripts/docker-init.sh
npm run dev:container
```
### Container Services Architecture
```text
┌─────────────────────────────────────────────────────────────┐
│                   Development Environment                   │
├─────────────────────────────────────────────────────────────┤
│                                                             │
│  ┌─────────────┐    ┌─────────────┐    ┌─────────────┐      │
│  │     app     │    │  postgres   │    │    redis    │      │
│  │  (Node.js)  │───▶│  (PostGIS)  │    │   (Cache)   │      │
│  │             │───▶│             │    │             │      │
│  └─────────────┘    └─────────────┘    └─────────────┘      │
│    :3000/:3001           :5432              :6379           │
│                                                             │
└─────────────────────────────────────────────────────────────┘
```
### compose.dev.yml Services
| Service | Image | Purpose | Healthcheck |
| ---------- | ----------------------- | ---------------------- | ---------------- |
| `app` | Custom (Dockerfile.dev) | Node.js application | HTTP /api/health |
| `postgres` | postgis/postgis:15-3.4 | Database with PostGIS | pg_isready |
| `redis` | redis:alpine | Caching and job queues | redis-cli ping |
### Automatic Initialization
The container initialization script (`scripts/docker-init.sh`) performs:
1. **npm install** - Installs dependencies into isolated volume
2. **Wait for PostgreSQL** - Polls until database is ready
3. **Wait for Redis** - Polls until Redis is responding
4. **Schema Check** - Detects if database needs initialization
5. **Database Setup** - Runs `npm run db:reset:dev` if needed (schema + seed data)
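
The polling and conditional setup in steps 2 to 5 can be sketched as follows. This is an illustration of the pattern only, not the contents of `scripts/docker-init.sh` (a shell script); `waitFor`, `initialize`, and the injected check functions are hypothetical names introduced for this example.

```javascript
// Hypothetical helper: retry an async readiness check until it passes.
// Resolves with the attempt number that succeeded.
async function waitFor(name, check, { retries = 30, delayMs = 1000 } = {}) {
  for (let attempt = 1; attempt <= retries; attempt++) {
    try {
      if (await check()) return attempt;
    } catch (err) {
      // Treat errors (for example "connection refused") as "not ready yet".
    }
    if (attempt < retries) {
      await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
  }
  throw new Error(`${name} not ready after ${retries} attempts`);
}

// Hypothetical orchestration of steps 2-5. The four callbacks are assumptions:
// the caller supplies real probes (pg_isready, redis-cli ping, a schema query).
async function initialize(checks, options = {}) {
  await waitFor('PostgreSQL', checks.postgresReady, options);
  await waitFor('Redis', checks.redisReady, options);
  if (await checks.schemaExists()) {
    return 'skipped'; // database already initialized, nothing to do
  }
  await checks.resetDatabase(); // stands in for `npm run db:reset:dev`
  return 'initialized';
}
```

Injecting the checks keeps the retry logic independent of how each service is probed, which also makes the sequence easy to exercise without running containers.
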
### Development Dockerfile
Located in `Dockerfile.dev`:
```dockerfile
FROM ubuntu:22.04
ENV DEBIAN_FRONTEND=noninteractive
# Install build tools and database clients
RUN apt-get update && apt-get install -y \
    curl git build-essential python3 \
    postgresql-client redis-tools \
    && rm -rf /var/lib/apt/lists/*
# Install Node.js 20.x LTS from NodeSource
RUN curl -fsSL https://deb.nodesource.com/setup_20.x | bash - \
    && apt-get install -y nodejs
WORKDIR /app
ENV NODE_ENV=development
ENV NODE_OPTIONS='--max-old-space-size=8192'
CMD ["bash"]
```
### Environment Configuration
Copy `.env.example` to `.env` for local overrides (optional for containers):
```bash
# Container defaults (set in compose.dev.yml)
DB_HOST=postgres # Use Docker service name, not IP
DB_PORT=5432
DB_USER=postgres
DB_PASSWORD=postgres
DB_NAME=flyer_crawler_dev
REDIS_URL=redis://redis:6379
```
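
As a rough sketch of how application code might consume these variables, the snippet below merges the container defaults listed above with any overrides in the environment. `loadContainerConfig` and the shape of the returned object are hypothetical; the project's actual configuration loading may differ.

```javascript
// Defaults mirror the container values listed above.
const CONTAINER_DEFAULTS = {
  DB_HOST: 'postgres',
  DB_PORT: '5432',
  DB_USER: 'postgres',
  DB_PASSWORD: 'postgres',
  DB_NAME: 'flyer_crawler_dev',
  REDIS_URL: 'redis://redis:6379',
};

// Hypothetical helper, not part of the project.
function loadContainerConfig(env = process.env) {
  // Values present in the environment (for example from .env) win over defaults.
  const merged = { ...CONTAINER_DEFAULTS };
  for (const key of Object.keys(CONTAINER_DEFAULTS)) {
    if (env[key] !== undefined && env[key] !== '') merged[key] = env[key];
  }
  // Environment variables are strings, so validate the port explicitly.
  const port = Number(merged.DB_PORT);
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error(`Invalid DB_PORT: ${merged.DB_PORT}`);
  }
  return {
    db: {
      host: merged.DB_HOST,
      port,
      user: merged.DB_USER,
      password: merged.DB_PASSWORD,
      database: merged.DB_NAME,
    },
    redisUrl: merged.REDIS_URL,
  };
}
```
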
### VS Code Dev Container Configuration
Located in `.devcontainer/devcontainer.json`:
| Lifecycle Hook | Timing | Action |
| ------------------- | ----------------- | ------------------------------ |
| `initializeCommand` | Before container | Start Podman machine (Windows) |
| `postCreateCommand` | Container created | Run `docker-init.sh` |
| `postAttachCommand` | VS Code attached | Start dev server |
### Default Test Accounts
After initialization, these accounts are available:
| Role | Email | Password |
| ----- | ------------------- | --------- |
| Admin | `admin@example.com` | adminpass |
| User | `user@example.com` | userpass |
---
## Production Deployment (PM2)
### PM2 Ecosystem Configuration
Located in `ecosystem.config.cjs`:
```javascript
module.exports = {
  apps: [
    {
      // API Server - Cluster mode for load balancing
      name: 'flyer-crawler-api',
      script: './node_modules/.bin/tsx',
      args: 'server.ts',
      max_memory_restart: '500M',
      instances: 'max', // Use all CPU cores
      exec_mode: 'cluster', // Enable cluster mode
      kill_timeout: 5000, // Graceful shutdown timeout
      // Restart configuration
      max_restarts: 40,
      exp_backoff_restart_delay: 100,
      min_uptime: '10s',
      env_production: {
        NODE_ENV: 'production',
        cwd: '/var/www/flyer-crawler.projectium.com',
      },
      env_test: {
        NODE_ENV: 'test',
        cwd: '/var/www/flyer-crawler-test.projectium.com',
      },
    },
    {
      // Background Worker - Single instance
      name: 'flyer-crawler-worker',
      script: './node_modules/.bin/tsx',
      args: 'src/services/worker.ts',
      max_memory_restart: '1G',
      kill_timeout: 10000, // Workers need more time for jobs
      // ... similar config
    },
  ],
};
```
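
The `kill_timeout` values only help if the process reacts to the stop signal. By default PM2 sends `SIGINT` and follows up with `SIGKILL` once `kill_timeout` has elapsed. A minimal sketch of a handler that fits inside that window is shown below; `createShutdownHandler`, its parameters, and the 500 ms safety margin are hypothetical, and the project's actual shutdown logic is described in ADR-038.

```javascript
// Hypothetical factory: builds a signal handler for a process managed by PM2.
// `close(done)` should stop accepting work and call `done(err)` when finished.
function createShutdownHandler({ close, exit, killTimeoutMs = 5000, marginMs = 500, log = console.log }) {
  let shuttingDown = false;
  return function onSignal(signal) {
    if (shuttingDown) return; // ignore repeated signals
    shuttingDown = true;
    log(`${signal} received, shutting down`);
    // Give up shortly before kill_timeout elapses and PM2 sends SIGKILL.
    const timer = setTimeout(() => exit(1), Math.max(0, killTimeoutMs - marginMs));
    if (typeof timer.unref === 'function') timer.unref();
    close((err) => {
      clearTimeout(timer);
      exit(err ? 1 : 0);
    });
  };
}

// Usage sketch (assumes `server` is an http.Server):
// process.on('SIGINT', createShutdownHandler({
//   close: (done) => server.close(done),
//   exit: (code) => process.exit(code),
// }));
```

Passing `close` and `exit` in as functions is a design choice for this example: it lets the same handler serve the API server (5 s) and the worker (10 s) with different `killTimeoutMs` values.
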
### Deployment Directory Structure
```text
/var/www/
├── flyer-crawler.projectium.com/          # Production
│   ├── server.ts
│   ├── ecosystem.config.cjs
│   ├── package.json
│   ├── flyer-images/
│   │   ├── icons/
│   │   └── archive/
│   └── ...
└── flyer-crawler-test.projectium.com/     # Test environment
    └── ... (same structure)
```
### Environment-Specific Configuration
| Environment | Port | Redis DB | PM2 Process Suffix |
| ----------- | ---- | -------- | ------------------ |
| Production | 3000 | 0 | (none) |
| Test | 3001 | 1 | `-test` |
| Development | 3000 | 0 | `-dev` |
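
The table can also be expressed as a lookup, which may be handy in scripts that need the right port or process name. This is a sketch: `resolveEnvironment` is a hypothetical helper, and appending the suffix directly to the process names from `ecosystem.config.cjs` is an assumption about how the suffix is applied.

```javascript
// Values copied from the table above.
const ENVIRONMENTS = {
  production: { port: 3000, redisDb: 0, suffix: '' },
  test: { port: 3001, redisDb: 1, suffix: '-test' },
  development: { port: 3000, redisDb: 0, suffix: '-dev' },
};

// Hypothetical helper, not part of the project.
function resolveEnvironment(name) {
  if (!Object.prototype.hasOwnProperty.call(ENVIRONMENTS, name)) {
    throw new Error(`Unknown environment: ${name}`);
  }
  const settings = ENVIRONMENTS[name];
  return {
    port: settings.port,
    redisDb: settings.redisDb,
    // Assumption: the suffix is appended to the base PM2 process name.
    apiProcess: `flyer-crawler-api${settings.suffix}`,
    workerProcess: `flyer-crawler-worker${settings.suffix}`,
  };
}
```
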
### PM2 Commands Reference
```bash
# Start/reload with environment
pm2 startOrReload ecosystem.config.cjs --env production --update-env
# Save process list for startup
pm2 save
# View logs
pm2 logs flyer-crawler-api --lines 50
# Monitor processes
pm2 monit
# List all processes
pm2 list
# Describe process details
pm2 describe flyer-crawler-api
```
### Resource Limits
| Process | Memory Limit | Restart Delay | Kill Timeout |
| ---------------- | ------------ | ------------------------ | ------------ |
| API Server | 500MB | Exponential (100ms base) | 5s |
| Worker | 1GB | Exponential (100ms base) | 10s |
| Analytics Worker | 1GB | Exponential (100ms base) | 10s |
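
To give a feel for what "Exponential (100ms base)" means in practice, the sketch below computes a sequence of restart delays. The growth factor of 1.5 and the 15000 ms ceiling are assumptions based on PM2's documented backoff behaviour and are not stated in this ADR; verify them against the PM2 version in use. `restartDelays` is a hypothetical helper.

```javascript
// Assumptions (not from this ADR): each restart multiplies the delay by 1.5,
// and the delay never exceeds 15000 ms.
function restartDelays(baseMs, restarts, { factor = 1.5, capMs = 15000 } = {}) {
  const delays = [];
  let delay = baseMs;
  for (let i = 0; i < restarts; i++) {
    delays.push(delay);
    delay = Math.min(capMs, Math.floor(delay * factor));
  }
  return delays;
}

console.log(restartDelays(100, 5)); // [ 100, 150, 225, 337, 505 ]
```

Under these assumptions a crash-looping process backs off from 100 ms toward the ceiling instead of restarting in a tight loop.
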
---
## Troubleshooting
### Container Issues
```bash
# Reset everything and start fresh
podman-compose -f compose.dev.yml down -v
podman-compose -f compose.dev.yml up -d --build
# View container logs
podman-compose -f compose.dev.yml logs -f app
# Connect to database manually
podman exec -it flyer-crawler-postgres psql -U postgres -d flyer_crawler_dev
# Rebuild just the app container
podman-compose -f compose.dev.yml build app
```
### Common Issues
| Issue                    | Solution                                                                          |
| ------------------------ | --------------------------------------------------------------------------------- |
| "Database not ready"     | Wait for the `postgres` healthcheck, or run `./scripts/docker-init.sh` manually   |
| "node_modules not found" | Run `npm install` inside the container                                            |
| "Permission denied"      | Ensure scripts have execute permission: `chmod +x scripts/*.sh`                   |
| "Network unreachable"    | Use service names (`postgres`, `redis`), not IPs                                  |
## Key Files
- `compose.dev.yml` - Docker Compose configuration
- `Dockerfile.dev` - Development container definition
- `.devcontainer/devcontainer.json` - VS Code Dev Container config
- `scripts/docker-init.sh` - Container initialization script
- `.env.example` - Environment variable template
- `ecosystem.config.cjs` - PM2 production configuration
- `.gitea/workflows/deploy-to-prod.yml` - Production deployment pipeline
- `.gitea/workflows/deploy-to-test.yml` - Test deployment pipeline
## Related ADRs
- [ADR-017](./0017-ci-cd-and-branching-strategy.md) - CI/CD Strategy
- [ADR-038](./0038-graceful-shutdown-pattern.md) - Graceful Shutdown Pattern