
# Dev Container Guide
Comprehensive documentation for the Flyer Crawler development container.

**Last Updated**: 2026-01-22

---
## Table of Contents
1. [Overview](#overview)
2. [Architecture](#architecture)
3. [PM2 Process Management](#pm2-process-management)
4. [Log Aggregation](#log-aggregation)
5. [Quick Reference](#quick-reference)
6. [Troubleshooting](#troubleshooting)
---
## Overview
The dev container provides a production-like environment for local development. Key features:
- **PM2 Process Management** - Matches production architecture (ADR-014)
- **Logstash Integration** - Logs are aggregated by Logstash, and errors are forwarded to Bugsink (ADR-050)
- **HTTPS by Default** - Self-signed certificates via mkcert
- **Hot Reloading** - tsx watch mode for API and worker processes
### Container Services
| Service | Container Name | Purpose |
| ----------- | ------------------------ | ------------------------------ |
| Application | `flyer-crawler-dev` | Node.js app, NGINX, Bugsink, Logstash |
| PostgreSQL | `flyer-crawler-postgres` | Primary database with PostGIS |
| Redis | `flyer-crawler-redis` | Cache and job queue backing |
### Access Points
| Service | URL | Notes |
| ----------- | ------------------------ | ------------------------- |
| Frontend | `https://localhost` | NGINX proxies Vite (5173) |
| Backend API | `http://localhost:3001` | Express server |
| Bugsink | `https://localhost:8443` | Error tracking UI |
| PostgreSQL | `localhost:5432` | Direct database access |
| Redis | `localhost:6379` | Direct Redis access |
---
## Architecture
### Production vs Development Parity
The dev container was updated to match production architecture (ADR-014):
| Component | Production | Dev Container (OLD) | Dev Container (NEW) |
| ------------ | ---------------------- | ---------------------- | ----------------------- |
| API Server | PM2 cluster mode | `npm run dev` (inline) | PM2 fork + tsx watch |
| Worker | PM2 process | Inline with API | PM2 process + tsx watch |
| Frontend | Static files via NGINX | Vite standalone | PM2 + Vite dev server |
| Logs | PM2 logs -> Logstash | Console only | PM2 logs -> Logstash |
| Process Mgmt | PM2 | None | PM2 |
### Container Startup Flow
When the container starts (`scripts/dev-entrypoint.sh`):
1. **NGINX** starts (HTTPS proxy for Vite and Bugsink)
2. **Bugsink** starts (error tracking on port 8000)
3. **Logstash** starts (log aggregation)
4. **PM2** starts with `ecosystem.dev.config.cjs`:
   - `flyer-crawler-api-dev` - API server
   - `flyer-crawler-worker-dev` - Background worker
   - `flyer-crawler-vite-dev` - Vite dev server
5. Container tails PM2 logs to stay alive
### Key Configuration Files
| File | Purpose |
| ------------------------------ | ---------------------------------- |
| `ecosystem.dev.config.cjs` | PM2 development configuration |
| `ecosystem.config.cjs` | PM2 production configuration |
| `scripts/dev-entrypoint.sh` | Container startup script |
| `docker/logstash/bugsink.conf` | Logstash pipeline configuration |
| `docker/nginx/dev.conf` | NGINX development configuration |
| `compose.dev.yml` | Docker Compose service definitions |
| `Dockerfile.dev` | Container image definition |
---
## PM2 Process Management
### Process Overview
PM2 manages three processes in the dev container:
```text
+--------------------+ +------------------------+ +--------------------+
| flyer-crawler- | | flyer-crawler- | | flyer-crawler- |
| api-dev | | worker-dev | | vite-dev |
+--------------------+ +------------------------+ +--------------------+
| tsx watch | | tsx watch | | vite --host |
| server.ts | | src/services/worker.ts | | |
| Port: 3001 | | No port | | Port: 5173 |
+--------------------+ +------------------------+ +--------------------+
| | |
v v v
+------------------------------------------------------------------------+
| /var/log/pm2/*.log |
| (Logstash picks up for Bugsink) |
+------------------------------------------------------------------------+
```
### PM2 Commands
Run these commands from the host; `podman exec` executes each one inside the `flyer-crawler-dev` container:
```bash
# View process status
podman exec -it flyer-crawler-dev pm2 status
# View all logs (tail -f style)
podman exec -it flyer-crawler-dev pm2 logs
# View specific process logs
podman exec -it flyer-crawler-dev pm2 logs flyer-crawler-api-dev
# Restart all processes
podman exec -it flyer-crawler-dev pm2 restart all
# Restart specific process
podman exec -it flyer-crawler-dev pm2 restart flyer-crawler-api-dev
# Stop all processes and remove them from PM2's list (use `pm2 stop all` to keep them registered)
podman exec -it flyer-crawler-dev pm2 delete all
# Show detailed process info
podman exec -it flyer-crawler-dev pm2 show flyer-crawler-api-dev
# Monitor processes in real-time
podman exec -it flyer-crawler-dev pm2 monit
```
### PM2 Log Locations
| Process | stdout Log | stderr Log |
| -------------------------- | ----------------------------- | ------------------------------- |
| `flyer-crawler-api-dev` | `/var/log/pm2/api-out.log` | `/var/log/pm2/api-error.log` |
| `flyer-crawler-worker-dev` | `/var/log/pm2/worker-out.log` | `/var/log/pm2/worker-error.log` |
| `flyer-crawler-vite-dev` | `/var/log/pm2/vite-out.log` | `/var/log/pm2/vite-error.log` |
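
The naming pattern in the table can be captured in a small helper, which is handy when tailing one stream of one process. This is a minimal sketch: `pm2_log_path` is a hypothetical name, and it assumes the `flyer-crawler-<name>-dev` to `<name>-{out,error}.log` mapping shown above holds for all three processes.

```bash
#!/usr/bin/env bash
# Hypothetical helper: map a PM2 process name to its log file path.
# Assumes the naming pattern from the table above.
pm2_log_path() {
  local process="$1" stream="${2:-out}"
  local short="${process#flyer-crawler-}"   # "flyer-crawler-api-dev" -> "api-dev"
  short="${short%-dev}"                      # "api-dev" -> "api"
  case "$stream" in
    out|error) printf '/var/log/pm2/%s-%s.log\n' "$short" "$stream" ;;
    *) echo "unknown stream: $stream (expected out or error)" >&2; return 1 ;;
  esac
}

# Example (run from the host):
#   MSYS_NO_PATHCONV=1 podman exec flyer-crawler-dev \
#     tail -n 20 "$(pm2_log_path flyer-crawler-worker-dev error)"
```

The stream argument defaults to `out`, so `pm2_log_path flyer-crawler-vite-dev` resolves to the Vite stdout log.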
### NPM Scripts for PM2
The following npm scripts are available for PM2 management:
```bash
# Start PM2 with dev config (inside container)
npm run dev:pm2
# Restart all PM2 processes
npm run dev:pm2:restart
# Stop all PM2 processes
npm run dev:pm2:stop
# View PM2 status
npm run dev:pm2:status
# View PM2 logs
npm run dev:pm2:logs
```
---
## Log Aggregation
### Log Flow Architecture (ADR-050)
Backend and infrastructure logs flow through Logstash to Bugsink, while frontend errors are reported directly by the browser SDK. Bugsink groups them into three projects:
```text
+------------------+ +------------------+ +------------------+
| PM2 Logs | | PostgreSQL | | Redis/NGINX |
| /var/log/pm2/ | | /var/log/ | | /var/log/redis/ |
| (API + Worker) | | postgresql/ | | /var/log/nginx/ |
+--------+---------+ +--------+---------+ +--------+---------+
| | |
v v v
+------------------------------------------------------------------------+
| LOGSTASH |
| /etc/logstash/conf.d/bugsink.conf |
| (Routes by log type) |
+------------------------------------------------------------------------+
| | |
v v v
+------------------+ +------------------+ +------------------+
| Backend API | | Frontend (Dev) | | Infrastructure |
| (Project 1) | | (Project 2) | | (Project 4) |
| - Pino errors | | - Browser SDK | | - Redis warnings |
| - PostgreSQL | | (not Logstash) | | - NGINX errors |
+------------------+ +------------------+ | - Vite errors |
+------------------+
```
### Log Sources
| Source | Log Path | Format | Errors To Bugsink |
| ------------ | --------------------------------- | ---------- | ----------------- |
| API Server | `/var/log/pm2/api-*.log` | Pino JSON | Yes |
| Worker | `/var/log/pm2/worker-*.log` | Pino JSON | Yes |
| Vite | `/var/log/pm2/vite-*.log` | Plain text | Yes (if error) |
| PostgreSQL | `/var/log/postgresql/*.log` | PostgreSQL | Yes (ERROR/FATAL) |
| Redis | `/var/log/redis/redis-server.log` | Redis | Yes (warnings) |
| NGINX Access | `/var/log/nginx/access.log` | Combined | Yes (5xx only) |
| NGINX Error | `/var/log/nginx/error.log` | NGINX | Yes |
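
For a quick local check of which API or worker lines would count as errors, the Pino JSON can be filtered by level. The sketch below is an approximation for manual inspection, not the Logstash pipeline itself. It assumes Pino's default numeric levels (`error` = 50, `fatal` = 60) and one JSON object per line; `pino_errors` is a hypothetical name.

```bash
#!/usr/bin/env bash
# Hypothetical helper: keep only Pino log lines at error level or above.
# Assumes Pino's default numeric levels (error=50, fatal=60).
pino_errors() {
  local line level
  local level_re='"level":[[:space:]]*([0-9]+)'
  while IFS= read -r line || [[ -n $line ]]; do
    if [[ $line =~ $level_re ]]; then
      level="${BASH_REMATCH[1]}"
      if (( level >= 50 )); then
        printf '%s\n' "$line"
      fi
    fi
  done
  return 0
}

# Example (run from the host):
#   MSYS_NO_PATHCONV=1 podman exec flyer-crawler-dev cat /var/log/pm2/api-out.log | pino_errors
```

Lines that are not JSON, or that have no numeric `level` field, are dropped silently.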
### Viewing Logs
```bash
# View Logstash processed logs
# (MSYS_NO_PATHCONV=1 stops Git Bash on Windows from rewriting /var/... paths; it is harmless elsewhere)
MSYS_NO_PATHCONV=1 podman exec flyer-crawler-dev cat /var/log/logstash/pm2-api-$(date +%Y-%m-%d).log
MSYS_NO_PATHCONV=1 podman exec flyer-crawler-dev cat /var/log/logstash/redis-operational-$(date +%Y-%m-%d).log
# View raw Redis logs
MSYS_NO_PATHCONV=1 podman exec flyer-crawler-redis cat /var/log/redis/redis-server.log
# Check Logstash status (URL quoted so the shell does not treat `?` as a glob)
podman exec flyer-crawler-dev curl -s 'localhost:9600/_node/stats/pipelines?pretty'
```
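
The commands above compute today's date on the host. To read an earlier day's file, the date can be passed explicitly. A minimal sketch follows, assuming the `<stream>-YYYY-MM-DD.log` naming seen in the two examples; `logstash_log_path` is a hypothetical name.

```bash
#!/usr/bin/env bash
# Hypothetical helper: build the path of a daily Logstash output file.
# Assumes the "<stream>-YYYY-MM-DD.log" naming used in the commands above.
logstash_log_path() {
  local stream="$1" day="${2:-$(date +%Y-%m-%d)}"
  local date_re='^[0-9]{4}-[0-9]{2}-[0-9]{2}$'
  if [[ ! $day =~ $date_re ]]; then
    echo "expected YYYY-MM-DD, got: $day" >&2
    return 1
  fi
  printf '/var/log/logstash/%s-%s.log\n' "$stream" "$day"
}

# Example (run from the host):
#   MSYS_NO_PATHCONV=1 podman exec flyer-crawler-dev \
#     cat "$(logstash_log_path pm2-api 2026-01-22)"
```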
### Bugsink Access
- **URL**: `https://localhost:8443`
- **Login**: `admin@localhost` / `admin`
- **Projects**:
  - Project 1: Backend API (Dev) - Pino app errors, PostgreSQL errors
  - Project 2: Frontend (Dev) - Browser errors via Sentry SDK
  - Project 4: Infrastructure (Dev) - Redis warnings, NGINX errors, Vite build errors

**Note**: The frontend DSN uses the NGINX proxy (`/bugsink-api/`) because browsers cannot reach `localhost:8000` directly. See [BUGSINK-SETUP.md](../tools/BUGSINK-SETUP.md#frontend-nginx-proxy) for details.

---
## Quick Reference
### Starting the Dev Container
```bash
# Start all services
podman-compose -f compose.dev.yml up -d
# View container logs
podman-compose -f compose.dev.yml logs -f
# Stop all services
podman-compose -f compose.dev.yml down
```
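
Processes inside the container may still be starting when `up -d` returns. A small retry helper can wait until the access points respond. This is a sketch under assumptions: `wait_for` is a hypothetical name, and the attempt count and delay are illustrative.

```bash
#!/usr/bin/env bash
# Hypothetical helper: retry a command until it succeeds or attempts run out.
# Usage: wait_for <attempts> <delay_seconds> <command> [args...]
wait_for() {
  local attempts="$1" delay="$2" i
  shift 2
  for (( i = 1; i <= attempts; i++ )); do
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    # Sleep between attempts, but not after the last one.
    if (( i < attempts )); then
      sleep "$delay"
    fi
  done
  return 1
}

# Example (run from the host): wait up to about a minute for each endpoint.
#   wait_for 30 2 curl -s -o /dev/null http://localhost:3001
#   wait_for 30 2 curl -ks -o /dev/null https://localhost   # -k: accept the dev certificate
```

Without `-f`, `curl` exits 0 on any HTTP response, so these checks only confirm that something is answering on the port, not that the application is healthy.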
### Common Tasks
| Task | Command |
| -------------------- | ------------------------------------------------------------------------------ |
| Run tests | `podman exec -it flyer-crawler-dev npm test` |
| Run type check | `podman exec -it flyer-crawler-dev npm run type-check` |
| View PM2 status | `podman exec -it flyer-crawler-dev pm2 status` |
| View PM2 logs | `podman exec -it flyer-crawler-dev pm2 logs` |
| Restart API | `podman exec -it flyer-crawler-dev pm2 restart flyer-crawler-api-dev` |
| Access PostgreSQL | `podman exec -it flyer-crawler-postgres psql -U postgres -d flyer_crawler_dev` |
| Access Redis CLI | `podman exec -it flyer-crawler-redis redis-cli` |
| Shell into container | `podman exec -it flyer-crawler-dev bash` |
### Environment Variables
Key environment variables are set in `compose.dev.yml`:
| Variable | Value | Purpose |
| ----------------- | ----------------------------- | --------------------------- |
| `TZ` | `America/Los_Angeles` | Timezone (Pacific Time) for all logs |
| `NODE_ENV` | `development` | Environment mode |
| `DB_HOST` | `postgres` | PostgreSQL hostname |
| `REDIS_URL` | `redis://redis:6379` | Redis connection URL |
| `FRONTEND_URL` | `https://localhost` | CORS origin |
| `SENTRY_DSN` | `http://...@127.0.0.1:8000/1` | Backend Bugsink DSN |
| `VITE_SENTRY_DSN` | `http://...@127.0.0.1:8000/2` | Frontend Bugsink DSN |
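
To confirm the variables in the table actually reached a shell inside the container, a short check can list any that are unset or empty. A minimal sketch follows; `missing_vars` is a hypothetical name, and the check only sees the environment of the shell it runs in, which may differ from what PM2 passes to each process.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print the names of listed variables that are unset or empty.
# Returns 0 when all are set, 1 otherwise.
missing_vars() {
  local name missing=0
  for name in "$@"; do
    if [[ -z "${!name:-}" ]]; then
      printf '%s\n' "$name"
      missing=1
    fi
  done
  return "$missing"
}

# Example (run from the host): ship the function into the container and call it there.
#   podman exec flyer-crawler-dev bash -c \
#     "$(declare -f missing_vars); missing_vars TZ NODE_ENV DB_HOST REDIS_URL FRONTEND_URL SENTRY_DSN VITE_SENTRY_DSN"
```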
### Timezone Configuration
All dev container services use the Pacific Time zone (`America/Los_Angeles`, PST/PDT) for consistent log timestamps:
| Service | Configuration | Notes |
| ---------- | ------------------------------------------------ | ------------------------------ |
| App | `TZ=America/Los_Angeles` in compose.dev.yml | Also set via dev-entrypoint.sh |
| PostgreSQL | `timezone` and `log_timezone` in postgres config | Logs timestamps in PST |
| Redis | `TZ=America/Los_Angeles` in compose.dev.yml | Alpine uses TZ env var |
| PM2 | `TZ` in ecosystem.dev.config.cjs | Pino timestamps use local time |
**Verifying Timezone**:
```bash
# Check container timezone
podman exec flyer-crawler-dev date
# Check PostgreSQL timezone
podman exec flyer-crawler-postgres psql -U postgres -c "SHOW timezone;"
# Check Redis log timestamps
MSYS_NO_PATHCONV=1 podman exec flyer-crawler-redis cat /var/log/redis/redis-server.log | head -5
```
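**Note**: If you need UTC timestamps for production compatibility, set `TZ=UTC` in `compose.dev.yml` and restart the containers. The table above lists the other places where the timezone is set (the postgres config and `ecosystem.dev.config.cjs`); update those as well if they should match.

---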
## Troubleshooting
### PM2 Process Not Starting
**Symptom**: `pm2 status` shows process as "errored" or "stopped"

**Debug**:
```bash
# Check process logs
podman exec -it flyer-crawler-dev pm2 logs flyer-crawler-api-dev --lines 50
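# Check if port is in use
podman exec -it flyer-crawler-dev netstat -tlnp | grep 3001
# Try manual start to see errors (stop the PM2-managed copy first so port 3001 is free)
podman exec -it flyer-crawler-dev pm2 stop flyer-crawler-api-dev
podman exec -it flyer-crawler-dev tsx server.ts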
```
**Common Causes**:
- Port already in use
- Missing environment variables
- Syntax error in code
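
Two of these causes leave recognizable text in the error log: Node reports a busy port with the `EADDRINUSE` error code, and a parse failure with `SyntaxError`. The sketch below scans log text for those two signatures. `diagnose_log` is a hypothetical name, and a missing environment variable has no standard signature, so it is not detected here.

```bash
#!/usr/bin/env bash
# Hypothetical helper: scan log text on stdin for known failure signatures.
# Only covers EADDRINUSE and SyntaxError; other causes need manual reading.
diagnose_log() {
  local line found=0
  while IFS= read -r line || [[ -n $line ]]; do
    case "$line" in
      *EADDRINUSE*)  echo "port already in use: $line"; found=1 ;;
      *SyntaxError*) echo "syntax error in code: $line"; found=1 ;;
    esac
  done
  if (( ! found )); then
    echo "no known signature found"
  fi
}

# Example (run from the host):
#   MSYS_NO_PATHCONV=1 podman exec flyer-crawler-dev tail -n 50 /var/log/pm2/api-error.log | diagnose_log
```

A "no known signature found" result does not mean the log is clean; it only means neither pattern matched.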
### Logs Not Appearing in Bugsink
**Symptom**: Errors in application but nothing in Bugsink UI

**Debug**:
```bash
# Check Logstash is running
podman exec flyer-crawler-dev curl -s 'localhost:9600/_node/stats/pipelines?pretty'
# Check Logstash logs for errors
MSYS_NO_PATHCONV=1 podman exec flyer-crawler-dev cat /var/log/logstash/logstash.log
# Verify PM2 logs exist
podman exec flyer-crawler-dev ls -la /var/log/pm2/
```
**Common Causes**:
- Logstash not started
- Log file permissions
- Bugsink not running
### Redis Logs Not Captured
**Symptom**: Redis warnings not appearing in Bugsink

**Debug**:
```bash
# Verify Redis logs exist
MSYS_NO_PATHCONV=1 podman exec flyer-crawler-redis cat /var/log/redis/redis-server.log
# Verify shared volume is mounted
podman exec flyer-crawler-dev ls -la /var/log/redis/
```
**Common Causes**:
- `redis_logs` volume not mounted
- Redis not configured to write to file
### Hot Reload Not Working
**Symptom**: Code changes not reflected in running application

**Debug**:
```bash
# Check tsx watch is running
podman exec -it flyer-crawler-dev pm2 logs flyer-crawler-api-dev
# Manually restart process
podman exec -it flyer-crawler-dev pm2 restart flyer-crawler-api-dev
```
**Common Causes**:
- File watcher limit reached
- Volume mount issues on Windows
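
On a Linux host, the file watcher limit is usually the inotify `max_user_watches` setting. The sketch below compares it with a minimum and can be run inside the container, for example after `podman exec -it flyer-crawler-dev bash`. It assumes the watcher is inotify-backed; `watch_limit_ok` is a hypothetical name and the 65536 default is an illustrative threshold, not a project requirement.

```bash
#!/usr/bin/env bash
# Hypothetical helper: report the inotify watch limit and compare it with a minimum.
# Returns 0 if the limit is high enough, 1 if too low, 2 if it cannot be read.
watch_limit_ok() {
  local minimum="${1:-65536}"
  local limit_file="${2:-/proc/sys/fs/inotify/max_user_watches}"
  local current
  if [[ ! -r $limit_file ]]; then
    echo "cannot read $limit_file" >&2
    return 2
  fi
  current="$(<"$limit_file")"
  echo "max_user_watches=$current (minimum wanted: $minimum)"
  (( current >= minimum ))
}

# Example (inside the container):
#   watch_limit_ok || echo "watch limit may be too low for hot reload"
```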
---
## Related Documentation
- [QUICKSTART.md](../getting-started/QUICKSTART.md) - Getting started guide
- [DEBUGGING.md](DEBUGGING.md) - Debugging strategies
- [LOGSTASH-QUICK-REF.md](../operations/LOGSTASH-QUICK-REF.md) - Logstash quick reference
- [DEV-CONTAINER-BUGSINK.md](../DEV-CONTAINER-BUGSINK.md) - Bugsink setup in dev container
- [ADR-014](../adr/0014-containerization-and-deployment-strategy.md) - Containerization and deployment strategy
- [ADR-050](../adr/0050-postgresql-function-observability.md) - PostgreSQL function observability (includes log aggregation)