flyer-crawler.projectium.com/docs/adr/0014-containerization-and-deployment-strategy.md
Torben Sorensen 2913c7aa09
2026-01-10 03:20:40 -08:00


# ADR-014: Containerization and Deployment Strategy
**Date**: 2025-12-12
**Status**: Implemented
**Implemented**: 2026-01-09
## Context
The project currently runs under `pm2`, and the `README.md` contains manual setup instructions. While functional, this setup lacks the portability, scalability, and consistency of modern deployment practices. Local development environments have also been inconsistent.
## Platform Requirement: Linux Only
**CRITICAL**: This application is designed and intended to run **exclusively on Linux**, either:
- **In a container** (Docker/Podman) - the recommended and primary development environment
- **On bare-metal Linux** - for production deployments
### Windows Compatibility
**Windows is NOT a supported platform.** Any apparent Windows compatibility is:
- Coincidental and not guaranteed
- Subject to break at any time without notice
- Not a priority to fix or maintain
Specific issues that arise on Windows include:
- **Path separators**: The codebase assumes POSIX-style paths (`/`), which work natively on Linux. On Windows, `path.join()` produces backslash-separated paths, which can break code that expects `/`
- **Shell scripts**: Bash scripts in `scripts/` directory are Linux-only
- **External dependencies**: Tools like `pdftocairo` assume Linux installation paths
- **File permissions**: Unix-style permissions are assumed throughout
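The path-separator issue can be reproduced on any host, because Node's `path` module exposes both behaviours through `path.posix` and `path.win32`. The sketch below is illustrative only: the `logo.png` file name is hypothetical, and the snippet is not taken from the codebase.

```javascript
const path = require('node:path');

// path.join() uses the host platform's separator. path.posix and path.win32
// expose each behaviour explicitly, so the difference is visible on any OS.
const segments = ['flyer-images', 'icons', 'logo.png']; // hypothetical file name

const posixPath = path.posix.join(...segments);
const windowsPath = path.win32.join(...segments);

console.log(posixPath);   // flyer-images/icons/logo.png
console.log(windowsPath); // flyer-images\icons\logo.png

// Code that later splits on '/' silently misbehaves with the Windows form.
console.log(posixPath.split('/').length);   // 3
console.log(windowsPath.split('/').length); // 1
```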
### Test Execution Requirement
**ALL tests MUST be executed on Linux.** This includes:
- Unit tests
- Integration tests
- End-to-end tests
- Any CI/CD pipeline tests
Tests that pass on Windows but fail on Linux are considered **broken tests**. Tests that fail on Windows but pass on Linux are considered **passing tests**.
**For Windows developers**: Always use the Dev Container (VS Code "Reopen in Container") to run tests. Never rely on test results from the Windows host machine.
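One possible way to enforce this rule is a platform guard that runs before the test suite. The following is a minimal sketch, not something the project is documented to contain: the `assertLinux` name is hypothetical, and how it is wired into the test runner's setup is left open.

```javascript
// Hypothetical guard for a test setup file; not part of the documented project.
// The platform is a parameter so the check itself can be exercised anywhere.
function assertLinux(platform = process.platform) {
  if (platform !== 'linux') {
    throw new Error(
      `Tests must run on Linux (detected "${platform}"). ` +
        'Use the Dev Container ("Reopen in Container") instead.'
    );
  }
}
```

Calling `assertLinux()` at the top of a global setup file would make a run on the Windows host fail immediately instead of producing misleading results.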
## Decision
We will standardize the deployment process using a hybrid approach:
1. **PM2 for Production**: Use PM2 cluster mode for process management, load balancing, and zero-downtime reloads.
2. **Docker/Podman for Development**: Provide a complete containerized development environment with automatic initialization.
3. **VS Code Dev Containers**: Enable one-click development environment setup.
4. **Gitea Actions for CI/CD**: Automated deployment pipelines handle builds and deployments.
## Consequences
- **Positive**: Ensures consistency between development and production environments. Simplifies the setup for new developers to a single "Reopen in Container" action. Improves portability and scalability of the application.
- **Negative**: Requires Docker/Podman installation. Container builds take time on first setup.
## Implementation Details
### Quick Start (Development)
```bash
# Prerequisites:
# - Docker Desktop or Podman installed
# - VS Code with "Dev Containers" extension
# Option 1: VS Code Dev Containers (Recommended)
# 1. Open project in VS Code
# 2. Click "Reopen in Container" when prompted
# 3. Wait for initialization to complete
# 4. Development server starts automatically
# Option 2: Manual Docker Compose (commands shown for Podman)
podman-compose -f compose.dev.yml up -d
podman exec -it flyer-crawler-dev bash
# The remaining commands run inside the container shell
./scripts/docker-init.sh
npm run dev:container
```
### Container Services Architecture
```text
┌─────────────────────────────────────────────────────────────┐
│                   Development Environment                   │
├─────────────────────────────────────────────────────────────┤
│                                                             │
│  ┌─────────────┐    ┌─────────────┐    ┌─────────────┐      │
│  │     app     │    │  postgres   │    │    redis    │      │
│  │  (Node.js)  │───▶│  (PostGIS)  │    │   (Cache)   │      │
│  │             │───▶│             │    │             │      │
│  └─────────────┘    └─────────────┘    └─────────────┘      │
│    :3000/:3001          :5432              :6379            │
│                                                             │
└─────────────────────────────────────────────────────────────┘
```
### compose.dev.yml Services
| Service | Image | Purpose | Healthcheck |
| ---------- | ----------------------- | ---------------------- | ---------------- |
| `app` | Custom (Dockerfile.dev) | Node.js application | HTTP /api/health |
| `postgres` | postgis/postgis:15-3.4 | Database with PostGIS | pg_isready |
| `redis` | redis:alpine | Caching and job queues | redis-cli ping |
### Automatic Initialization
The container initialization script (`scripts/docker-init.sh`) performs:
1. **npm install** - Installs dependencies into isolated volume
2. **Wait for PostgreSQL** - Polls until database is ready
3. **Wait for Redis** - Polls until Redis is responding
4. **Schema Check** - Detects if database needs initialization
5. **Database Setup** - Runs `npm run db:reset:dev` if needed (schema + seed data)
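The wait-then-initialize logic in these steps can be sketched as follows. The real `docker-init.sh` is a shell script whose contents are not shown here, so this JavaScript version only illustrates the control flow. `waitUntilReady`, `initialize`, and the injected check functions are hypothetical names, and the retry count and delay are assumed defaults.

```javascript
// Generic "poll until ready" helper. `check` is any async function that
// resolves to true once the dependency is reachable (a real caller might
// wrap pg_isready, redis-cli ping, or a TCP connect).
async function waitUntilReady(name, check, { retries = 30, delayMs = 1000 } = {}) {
  for (let attempt = 1; attempt <= retries; attempt += 1) {
    try {
      if (await check()) return attempt; // number of attempts it took
    } catch {
      // Connection errors just mean "not ready yet".
    }
    if (attempt < retries) {
      await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
  }
  throw new Error(`${name} not ready after ${retries} attempts`);
}

// Mirrors the documented order: PostgreSQL, then Redis, then the schema check.
async function initialize({ checkPostgres, checkRedis, schemaExists, resetDatabase }) {
  await waitUntilReady('PostgreSQL', checkPostgres);
  await waitUntilReady('Redis', checkRedis);
  if (!(await schemaExists())) {
    await resetDatabase(); // corresponds to `npm run db:reset:dev`
  }
}
```

Passing the checks in as functions keeps the sequencing testable without a running database or Redis instance.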
### Development Dockerfile
Located in `Dockerfile.dev`:
```dockerfile
FROM ubuntu:22.04
ENV DEBIAN_FRONTEND=noninteractive
# Install build tools and database clients
RUN apt-get update && apt-get install -y \
    curl git build-essential python3 \
    postgresql-client redis-tools \
    && rm -rf /var/lib/apt/lists/*
# Install Node.js 20.x LTS from NodeSource
RUN curl -fsSL https://deb.nodesource.com/setup_20.x | bash - \
    && apt-get install -y nodejs
WORKDIR /app
ENV NODE_ENV=development
ENV NODE_OPTIONS='--max-old-space-size=8192'
CMD ["bash"]
```
### Environment Configuration
Copy `.env.example` to `.env` for local overrides (optional for containers):
```bash
# Container defaults (set in compose.dev.yml)
DB_HOST=postgres # Use Docker service name, not IP
DB_PORT=5432
DB_USER=postgres
DB_PASSWORD=postgres
DB_NAME=flyer_crawler_dev
REDIS_URL=redis://redis:6379
```
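As an illustration of how these variables might be consumed, the sketch below assembles a PostgreSQL connection URL from them. Whether the application uses a URL or separate connection fields is not stated in this ADR, so `buildDatabaseUrl` is a hypothetical helper; its fallback values simply repeat the container defaults listed above.

```javascript
// Builds a PostgreSQL connection URL from the documented variables.
// Hypothetical helper; the fallbacks mirror the container defaults.
function buildDatabaseUrl(env = process.env) {
  const {
    DB_HOST = 'postgres',
    DB_PORT = '5432',
    DB_USER = 'postgres',
    DB_PASSWORD = 'postgres',
    DB_NAME = 'flyer_crawler_dev',
  } = env;
  // Encode credentials so characters such as '@' or ':' cannot break the URL.
  const user = encodeURIComponent(DB_USER);
  const password = encodeURIComponent(DB_PASSWORD);
  return `postgres://${user}:${password}@${DB_HOST}:${DB_PORT}/${DB_NAME}`;
}

console.log(buildDatabaseUrl({}));
// postgres://postgres:postgres@postgres:5432/flyer_crawler_dev
```

The host in the result is the Compose service name (`postgres`), which is why the defaults work inside the container network but not from the host machine.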
### VS Code Dev Container Configuration
Located in `.devcontainer/devcontainer.json`:
| Lifecycle Hook | Timing | Action |
| ------------------- | ----------------- | ------------------------------ |
| `initializeCommand` | Before container | Start Podman machine (Windows) |
| `postCreateCommand` | Container created | Run `docker-init.sh` |
| `postAttachCommand` | VS Code attached | Start dev server |
### Default Test Accounts
After initialization, these accounts are available:
| Role | Email | Password |
| ----- | ------------------- | --------- |
| Admin | `admin@example.com` | adminpass |
| User | `user@example.com` | userpass |
---
## Production Deployment (PM2)
### PM2 Ecosystem Configuration
Located in `ecosystem.config.cjs`:
```javascript
module.exports = {
  apps: [
    {
      // API Server - Cluster mode for load balancing
      name: 'flyer-crawler-api',
      script: './node_modules/.bin/tsx',
      args: 'server.ts',
      max_memory_restart: '500M',
      instances: 'max', // Use all CPU cores
      exec_mode: 'cluster', // Enable cluster mode
      kill_timeout: 5000, // Graceful shutdown timeout
      // Restart configuration
      max_restarts: 40,
      exp_backoff_restart_delay: 100,
      min_uptime: '10s',
      env_production: {
        NODE_ENV: 'production',
        cwd: '/var/www/flyer-crawler.projectium.com',
      },
      env_test: {
        NODE_ENV: 'test',
        cwd: '/var/www/flyer-crawler-test.projectium.com',
      },
    },
    {
      // Background Worker - Single instance
      name: 'flyer-crawler-worker',
      script: './node_modules/.bin/tsx',
      args: 'src/services/worker.ts',
      max_memory_restart: '1G',
      kill_timeout: 10000, // Workers need more time for jobs
      // ... similar config
    },
  ],
};
```
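The `kill_timeout` values only help if the process shuts itself down within that window. PM2 sends `SIGINT` by default and follows up with `SIGKILL` once `kill_timeout` elapses. The sketch below shows one way a process could cooperate. It is not the project's implementation (ADR-038 covers the actual pattern). `createShutdownHandler` and the closer functions are hypothetical, and the 4000 ms default is an assumed value chosen to stay under the API server's 5000 ms limit.

```javascript
// Hypothetical shutdown helper: closes resources in order, and forces an exit
// shortly before PM2's kill_timeout would send SIGKILL.
function createShutdownHandler(closers, { timeoutMs = 4000, exit = process.exit } = {}) {
  let shuttingDown = false;
  return async function shutdown(signal) {
    if (shuttingDown) return; // ignore repeated signals
    shuttingDown = true;
    const timer = setTimeout(() => exit(1), timeoutMs); // last-resort forced exit
    try {
      for (const close of closers) {
        await close();
      }
      clearTimeout(timer);
      exit(0);
    } catch (err) {
      clearTimeout(timer);
      console.error(`Shutdown after ${signal} failed:`, err);
      exit(1);
    }
  };
}

// Example wiring (closeHttpServer and closeDbPool are placeholders):
// const shutdown = createShutdownHandler([closeHttpServer, closeDbPool]);
// process.on('SIGINT', () => shutdown('SIGINT'));
// process.on('SIGTERM', () => shutdown('SIGTERM'));
```

A worker with `kill_timeout: 10000` would pass a correspondingly larger `timeoutMs` so that in-flight jobs have time to finish.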
### Deployment Directory Structure
```text
/var/www/
├── flyer-crawler.projectium.com/        # Production
│   ├── server.ts
│   ├── ecosystem.config.cjs
│   ├── package.json
│   ├── flyer-images/
│   │   ├── icons/
│   │   └── archive/
│   └── ...
└── flyer-crawler-test.projectium.com/   # Test environment
    └── ... (same structure)
```
### Environment-Specific Configuration
| Environment | Port | Redis DB | PM2 Process Suffix |
| ----------- | ---- | -------- | ------------------ |
| Production | 3000 | 0 | (none) |
| Test | 3001 | 1 | `-test` |
| Development | 3000 | 0 | `-dev` |
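The table above can be expressed as a small lookup, which makes the naming convention explicit. This is a sketch under assumptions: the values are copied from the table, but `ENVIRONMENTS`, `resolveEnvironment`, and `pm2ProcessName` are hypothetical names, and the project may derive these settings differently.

```javascript
// Values copied from the table above; object and helper names are hypothetical.
const ENVIRONMENTS = {
  production: { port: 3000, redisDb: 0, pm2Suffix: '' },
  test: { port: 3001, redisDb: 1, pm2Suffix: '-test' },
  development: { port: 3000, redisDb: 0, pm2Suffix: '-dev' },
};

function resolveEnvironment(nodeEnv) {
  // hasOwnProperty guards against inherited keys such as "constructor".
  if (!Object.prototype.hasOwnProperty.call(ENVIRONMENTS, nodeEnv)) {
    throw new Error(`Unknown environment: ${nodeEnv}`);
  }
  return ENVIRONMENTS[nodeEnv];
}

function pm2ProcessName(baseName, nodeEnv) {
  return `${baseName}${resolveEnvironment(nodeEnv).pm2Suffix}`;
}

console.log(pm2ProcessName('flyer-crawler-api', 'test'));       // flyer-crawler-api-test
console.log(pm2ProcessName('flyer-crawler-api', 'production')); // flyer-crawler-api
```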
### PM2 Commands Reference
```bash
# Start/reload with environment
pm2 startOrReload ecosystem.config.cjs --env production --update-env
# Save process list for startup
pm2 save
# View logs
pm2 logs flyer-crawler-api --lines 50
# Monitor processes
pm2 monit
# List all processes
pm2 list
# Describe process details
pm2 describe flyer-crawler-api
```
### Resource Limits
| Process | Memory Limit | Restart Delay | Kill Timeout |
| ---------------- | ------------ | ------------------------ | ------------ |
| API Server | 500MB | Exponential (100ms base) | 5s |
| Worker | 1GB | Exponential (100ms base) | 10s |
| Analytics Worker | 1GB | Exponential (100ms base) | 10s |
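To give a feel for what "Exponential (100ms base)" means in practice, the sketch below approximates the delay sequence. It assumes the behaviour described in PM2's documentation for `exp_backoff_restart_delay`: each consecutive crash multiplies the previous delay by 1.5, capped at 15000 ms. The factor, cap, and rounding are assumptions to verify against the PM2 version in use, and `backoffDelays` is a hypothetical helper.

```javascript
// Approximates the restart delays for exp_backoff_restart_delay: 100.
// Assumed from PM2's documentation: factor 1.5 per crash, capped at 15000 ms.
function backoffDelays(baseMs, restarts, { factor = 1.5, maxMs = 15000 } = {}) {
  const delays = [];
  let delay = baseMs;
  for (let i = 0; i < restarts; i += 1) {
    delays.push(delay);
    delay = Math.min(maxMs, Math.floor(delay * factor));
  }
  return delays;
}

console.log(backoffDelays(100, 5)); // [ 100, 150, 225, 337, 505 ]
```

Under these assumptions a crash-looping process reaches the cap after roughly a dozen restarts, well before `max_restarts: 40` is exhausted.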
---
## Troubleshooting
### Container Issues
```bash
# Reset everything and start fresh
podman-compose -f compose.dev.yml down -v
podman-compose -f compose.dev.yml up -d --build
# View container logs
podman-compose -f compose.dev.yml logs -f app
# Connect to database manually
podman exec -it flyer-crawler-postgres psql -U postgres -d flyer_crawler_dev
# Rebuild just the app container
podman-compose -f compose.dev.yml build app
```
### Common Issues
| Issue | Solution |
| ------------------------ | --------------------------------------------------------------- |
| "Database not ready" | Wait for postgres healthcheck, or run `docker-init.sh` manually |
| "node_modules not found" | Run `npm install` inside container |
| "Permission denied" | Ensure scripts have execute permission: `chmod +x scripts/*.sh` |
| "Network unreachable" | Use service names (postgres, redis) not IPs |
## Key Files
- `compose.dev.yml` - Docker Compose configuration
- `Dockerfile.dev` - Development container definition
- `.devcontainer/devcontainer.json` - VS Code Dev Container config
- `scripts/docker-init.sh` - Container initialization script
- `.env.example` - Environment variable template
- `ecosystem.config.cjs` - PM2 production configuration
- `.gitea/workflows/deploy-to-prod.yml` - Production deployment pipeline
- `.gitea/workflows/deploy-to-test.yml` - Test deployment pipeline
## Container Test Readiness Requirement
**CRITICAL**: The development container MUST be fully test-ready on startup. This means:
1. **Zero Manual Steps**: After running `podman-compose -f compose.dev.yml up -d` and entering the container, tests MUST run immediately with `npm test` without any additional setup steps.
2. **Complete Environment**: All environment variables, database connections, Redis connections, and seed data MUST be automatically initialized during container startup.
3. **Enforcement Checklist**:
   - [ ] `npm test` runs successfully immediately after container start
   - [ ] Database is seeded with test data (admin account, sample data)
   - [ ] Redis is connected and healthy
   - [ ] All environment variables are set via `compose.dev.yml` or `.env` files
   - [ ] No "database not ready" or "connection refused" errors on first test run
4. **Current Gaps (To Fix)**:
   - Integration tests require database seeding (`npm run db:reset:test`)
   - Environment variables from `.env.test` may not be loaded automatically
   - Some npm scripts use `NODE_ENV=` syntax which fails on Windows (use `cross-env`)
5. **Resolution Steps**:
   - The `docker-init.sh` script should seed the test database after seeding dev database
   - Add automatic `.env.test` loading or move all test env vars to `compose.dev.yml`
   - Update all npm scripts to use `cross-env` for cross-platform compatibility
**Rationale**: Developers and CI systems should never need to run manual setup commands to execute tests. If the container is running, tests should work. Any deviation from this principle indicates an incomplete container setup.
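The "all environment variables are set" checklist item could be backed by an automated check that runs before the test suite. The following is a minimal sketch, assuming the variable names from the Environment Configuration section are the required set. `REQUIRED_VARS` and `missingEnvVars` are hypothetical and not part of the documented project.

```javascript
// Hypothetical pre-test check: reports which required variables are unset or blank.
// The list is assumed from the Environment Configuration section of this ADR.
const REQUIRED_VARS = ['DB_HOST', 'DB_PORT', 'DB_USER', 'DB_PASSWORD', 'DB_NAME', 'REDIS_URL'];

function missingEnvVars(env = process.env, required = REQUIRED_VARS) {
  return required.filter((name) => {
    const value = env[name];
    return value === undefined || String(value).trim() === '';
  });
}

// Example: fail fast with a readable message instead of a connection error.
// const missing = missingEnvVars();
// if (missing.length > 0) throw new Error(`Missing env vars: ${missing.join(', ')}`);
```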
## Related ADRs
- [ADR-017](./0017-ci-cd-and-branching-strategy.md) - CI/CD Strategy
- [ADR-038](./0038-graceful-shutdown-pattern.md) - Graceful Shutdown Pattern
- [ADR-010](./0010-testing-strategy-and-standards.md) - Testing Strategy and Standards