diff --git a/CLAUDE.md b/CLAUDE.md index a68e1779..9169848f 100644 --- a/CLAUDE.md +++ b/CLAUDE.md @@ -1,494 +1,409 @@ + + # Claude Code Project Instructions -## Session Startup Checklist +## Session Startup -**IMPORTANT**: At the start of every session, perform these steps: - -1. **Check Memory First** - Use `mcp__memory__read_graph` or `mcp__memory__search_nodes` to recall: - - Project-specific configurations and credentials - - Previous work context and decisions - - Infrastructure details (URLs, ports, access patterns) - - Known issues and their solutions - -2. **Review Recent Git History** - Check `git log --oneline -10` to understand recent changes - -3. **Check Container Status** - Use `mcp__podman__container_list` to see what's running +1. **Memory**: `mcp__memory__read_graph` or `mcp__memory__search_nodes` - recall project context, credentials, known issues +2. **Git**: `git log --oneline -10` - recent changes +3. **Containers**: `mcp__podman__container_list` - running state --- -## Project Instructions +## Application Overview -### Things to Remember +**Flyer Crawler** - Grocery deal extraction and analysis platform. -Before writing any code: +| Domain | Description | +| ------------- | ------------------------------------------------------------------- | +| Core Function | AI-powered extraction of deals from grocery store flyer images/PDFs | +| User Features | Watchlists, price history, shopping lists, deal alerts, recipes | +| Data Flow | Upload → AI extraction (Gemini) → PostgreSQL → React display | -1. State how you will verify this change works (test, bash command, browser check, etc.) +Key entities: Flyers, FlyerItems, Stores, StoreLocations, Watchlists, ShoppingLists, Recipes, Achievements -2. Write the test or verification step first +--- -3. Then implement the code +## Architecture Patterns -4. Run verification and iterate until it passes +**Layer Responsibilities** (ADR-035): -## Git Bash / MSYS Path Conversion Issue (Windows Host) - -**CRITICAL ISSUE**: Git Bash on Windows automatically converts Unix-style paths to Windows paths, which breaks Podman/Docker commands. - -### Problem Examples: - -```bash -# This FAILS in Git Bash: -podman exec container /usr/local/bin/script.sh -# Git Bash converts to: C:/Program Files/Git/usr/local/bin/script.sh - -# This FAILS in Git Bash: -podman exec container bash -c "cat /tmp/file.sql" -# Git Bash converts /tmp to C:/Users/user/AppData/Local/Temp +```text +Routes → Services → Repositories → Database + ↓ + External APIs (*.server.ts) ``` -### Solutions: +**Key Patterns:** -1. **Use `sh -c` instead of `bash -c`** for single-quoted commands: +| Pattern | ADR | When to Use | +| ------------------ | ------- | ------------------------------------------------------------------- | +| Error Handling | ADR-001 | All DB operations: `handleDbError()`, throw `NotFoundError` | +| Repository Methods | ADR-034 | `get*` throws NotFound, `find*` returns null, `list*` returns array | +| API Responses | ADR-028 | Use `sendSuccess()`, `sendPaginated()`, `sendError()` helpers | +| Transactions | ADR-002 | Multi-repo operations: `withTransaction(async (client) => {...})` | - ```bash - podman exec container sh -c '/usr/local/bin/script.sh' - ``` +**Full Details:** [docs/adr/index.md](docs/adr/index.md) -2. **Use double slashes** to escape path conversion: +--- - ```bash - podman exec container //usr//local//bin//script.sh - ``` +## Key Files -3. 
**Set MSYS_NO_PATHCONV** environment variable: +| Purpose | File | +| -------------------- | -------------------------------- | +| Express app setup | `server.ts` | +| Environment config | `src/config/env.ts` | +| All routes | `src/routes/*.routes.ts` | +| DB repositories | `src/services/db/*.db.ts` | +| Error types | `src/services/db/errors.db.ts` | +| API response helpers | `src/utils/apiResponse.ts` | +| Background workers | `src/services/workers.server.ts` | +| Queue definitions | `src/services/queues.server.ts` | +| Type definitions | `src/types.ts`, `src/types/*.ts` | - ```bash - MSYS_NO_PATHCONV=1 podman exec container /usr/local/bin/script.sh - ``` +## Directory Structure -4. **Use Windows paths with forward slashes** when referencing host files: - ```bash - podman cp "d:/path/to/file" container:/tmp/file - ``` - -**ALWAYS use one of these workarounds when running Bash commands on Windows that involve Unix paths inside containers.** - -## Communication Style: Ask Before Assuming - -**IMPORTANT**: When helping with tasks, **ask clarifying questions before making assumptions**. Do not assume: - -- What steps the user has or hasn't completed -- What the user already knows or has configured -- What external services (OAuth providers, APIs, etc.) are already set up -- What secrets or credentials have already been created - -Instead, ask the user to confirm the current state before providing instructions or making recommendations. This prevents wasted effort and respects the user's existing work. - -## Platform Requirement: Linux Only - -**CRITICAL**: This application is designed to run **exclusively on Linux**. See [ADR-014](docs/adr/0014-containerization-and-deployment-strategy.md) for full details. - -### Environment Terminology - -- **Dev Container** (or just "dev"): The containerized Linux development environment (`flyer-crawler-dev`). This is where all development and testing should occur. -- **Host**: The Windows machine running Podman/Docker and VS Code. - -When instructions say "run in dev" or "run in the dev container", they mean executing commands inside the `flyer-crawler-dev` container. - -### Test Execution Rules - -1. **ALL tests MUST be executed in the dev container** - the Linux container environment -2. **NEVER run tests directly on Windows host** - test results from Windows are unreliable -3. **Always use the dev container for testing** when developing on Windows -4. **TypeScript type-check MUST run in dev container** - `npm run type-check` on Windows does not reliably detect errors - -See [docs/TESTING.md](docs/TESTING.md) for comprehensive testing documentation. 
- -### How to Run Tests Correctly - -```bash -# If on Windows, first open VS Code and "Reopen in Container" -# Then run tests inside the dev container: -npm test # Run all unit tests -npm run test:unit # Run unit tests only -npm run test:integration # Run integration tests (requires DB/Redis) +```text +src/ +├── components/ # React UI components +├── config/ # Environment config (env.ts) +├── contexts/ # React contexts +├── features/ # Feature-specific UI (partial migration to ADR-047) +├── hooks/ # React hooks +├── layouts/ # Page layouts +├── middleware/ # Express middleware +├── pages/ # Route page components +├── routes/ # Express API routes +├── schemas/ # Zod validation schemas +├── services/ # Business logic + external APIs +│ └── db/ # Database repositories (*.db.ts) +├── tests/ # Test utilities, fixtures, integration/e2e tests +├── types/ # TypeScript type definitions +└── utils/ # Shared utilities ``` -### Running Tests via Podman (from Windows host) +**Note:** Structure is transitioning per ADR-047. Frontend/backend currently share `src/`. -**Note:** This project has 2900+ unit tests. For AI-assisted development, pipe output to a file for easier processing. +--- -The command to run unit tests in the dev container via podman: +## Common Workflows + +### Adding a New API Endpoint + +1. Add route in `src/routes/{domain}.routes.ts` +2. Use `validateRequest(schema)` middleware for input validation +3. Call service layer (never access DB directly from routes) +4. Return via `sendSuccess()` or `sendPaginated()` +5. Add tests in `*.routes.test.ts` + +### Adding a New Database Operation + +1. Add method to `src/services/db/{domain}.db.ts` +2. Follow naming: `get*` (throws), `find*` (returns null), `list*` (array) +3. Use `handleDbError()` for error handling +4. Accept optional `PoolClient` for transaction support +5. Add unit test + +### Adding a Background Job + +1. Define queue in `src/services/queues.server.ts` +2. Add worker in `src/services/workers.server.ts` +3. Call `queue.add()` from service layer + +--- + +## Custom Subagents + +Use specialized Task subagents for complex work. Launch with `Task` tool. 
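+
+A minimal, hypothetical sketch of a launch payload (the parameter names `subagent_type`, `description`, and `prompt` are assumptions; verify against your Claude Code version):
+
+```typescript
+// Hypothetical Task tool payload - field names are assumptions, not a
+// confirmed Claude Code API. Shown only to illustrate how a subagent
+// launch is parameterized.
+const reviewTask = {
+  subagent_type: 'code-reviewer',
+  description: 'Review flyer upload route',
+  prompt: 'Review src/routes/flyers.routes.ts for security issues and ADR-028 response compliance.',
+};
+```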
+ +### Core Development + +- **plan** - Design implementation plans, identify files, architectural trade-offs +- **coder** - Write/modify production Node.js/TypeScript code + +### Testing & Quality + +- **tester** - Adversarial testing: edge cases, race conditions, security vulnerabilities +- **testwriter** - Create comprehensive tests for features, fixes, refactoring +- **code-reviewer** - Review code quality, security, best practices, architecture + +### Database & Infrastructure + +- **db-dev** - Schemas, queries, migrations, optimization, N+1 problems +- **db-admin** - PostgreSQL/Redis admin, security, backups, infrastructure +- **devops** - Containers, services, CI/CD pipelines, deployments +- **infra-architect** - Resource optimization: RAM, CPU, disk, storage, capacity + +### Specialized Technical + +- **bg-worker** - Background jobs: PM2 workers, BullMQ queues, async tasks +- **ai-usage** - LLM APIs (Gemini, Claude), prompt engineering, AI features +- **security-engineer** - Security audits, vulnerability scanning, OWASP, pentesting +- **log-debug** - Production errors, observability, Bugsink/Sentry analysis +- **integrations-specialist** - Third-party services, webhooks, external APIs + +### Frontend & Design + +- **frontend-specialist** - UI components, Neo-Brutalism, Core Web Vitals, accessibility +- **uiux-designer** - UI/UX decisions, component design, Neo-Brutalism compliance + +### Documentation & Planning + +- **documenter** - User docs, API specs, feature documentation +- **describer-for-ai** - Technical docs for AI: ADRs, system overviews, context docs +- **planner** - Break down features, roadmaps, scope management +- **product-owner** - Feature requirements, user stories, validation, backlog prioritization + +### Support + +- **tools-integration-specialist** - Bugsink, Gitea, OAuth, operational tools + +--- + +## Critical Rules + +### Platform: Linux Only (ADR-014) + +| Term | Meaning | +| ------------- | ------------------------------------------------------- | +| Dev Container | `flyer-crawler-dev` Linux container - ALL dev/test here | +| Host | Windows machine running Podman/VS Code | + +**Test Execution**: + +- ALL tests MUST run in dev container +- NEVER test on Windows host - results unreliable +- `npm run type-check` MUST run in container + +**Test Interpretation**: + +- Pass Windows / Fail Linux = BROKEN (fix required) +- Fail Windows / Pass Linux = PASSING (acceptable) + +### Code Workflow + +Before any code change: + +1. State verification method (test, bash, browser) +2. Write test/verification first +3. Implement code +4. Run verification until passing +5. Run `npm run type-check` before commit + +### Git Bash Path Conversion (Windows) + +Git Bash auto-converts Unix paths, breaking container commands. + +| Solution | Example | +| ---------------------------- | -------------------------------------------------------- | +| `sh -c` with single quotes | `podman exec container sh -c '/usr/local/bin/script.sh'` | +| Double slashes | `podman exec container //usr//local//bin//script.sh` | +| MSYS_NO_PATHCONV=1 | `MSYS_NO_PATHCONV=1 podman exec ...` | +| Windows paths for host files | `podman cp "d:/path/file" container:/tmp/file` | + +### Communication Style + +Ask before assuming. 
Do not assume: + +- Steps completed / User knowledge / External services configured / Secrets created + +--- + +## Commands + +| Command | Description | +| -------------------------- | ------------------------------------- | +| `npm test` | All unit tests | +| `npm run test:unit` | Unit tests only | +| `npm run test:integration` | Integration tests (requires DB/Redis) | +| `npm run dev:container` | Dev server | +| `npm run build` | Production build | +| `npm run type-check` | TypeScript check | + +**Podman (from Windows)**: ```bash -# Basic (output to terminal) -podman exec -it flyer-crawler-dev npm run test:unit - -# Recommended for AI processing: pipe to file podman exec -it flyer-crawler-dev npm run test:unit 2>&1 | tee test-results.txt -``` - -The command to run integration tests in the dev container via podman: - -```bash podman exec -it flyer-crawler-dev npm run test:integration -``` - -For running specific test files: - -```bash podman exec -it flyer-crawler-dev npm test -- --run src/hooks/useAuth.test.tsx ``` -### Why Linux Only? +--- -- Path separators: Code uses POSIX-style paths (`/`) which may break on Windows -- Shell scripts in `scripts/` directory are Linux-only -- External dependencies like `pdftocairo` assume Linux installation paths -- Unix-style file permissions are assumed throughout +## Database Schema Sync -### Test Result Interpretation +**CRITICAL**: Schema files must stay synchronized. -- Tests that **pass on Windows but fail on Linux** = **BROKEN tests** (must be fixed) -- Tests that **fail on Windows but pass on Linux** = **PASSING tests** (acceptable) +| File | Purpose | +| ------------------------------ | --------------------------------- | +| `sql/master_schema_rollup.sql` | Test DB setup, complete reference | +| `sql/initial_schema.sql` | Fresh install schema | +| `sql/migrations/*.sql` | Production incremental changes | -## Development Workflow +**Rules**: -1. Open project in VS Code -2. Use "Reopen in Container" (Dev Containers extension required) to enter the dev environment -3. Wait for dev container initialization to complete -4. Run `npm test` to verify the dev environment is working -5. Make changes and run tests inside the dev container +- Keep `master_schema_rollup.sql` and `initial_schema.sql` in sync +- When adding columns via migration, add to both schema files +- Migrations = production ALTER TABLE; Schema files = complete CREATE TABLE +- Test DB uses `master_schema_rollup.sql` - out-of-sync = test failures -## Code Change Verification +--- -After making any code changes, **always run a type-check** to catch TypeScript errors before committing: +## Integration Test Issues -```bash -npm run type-check -``` +### 1. Vitest globalSetup Context Isolation -This prevents linting/type errors from being introduced into the codebase. +Vitest's `globalSetup` runs in separate Node.js context. Singletons, spies, mocks do NOT share instances with test files. -## Quick Reference +**Affected**: BullMQ worker service mocks (AI/DB failure tests) -| Command | Description | -| -------------------------- | ---------------------------- | -| `npm test` | Run all unit tests | -| `npm run test:unit` | Run unit tests only | -| `npm run test:integration` | Run integration tests | -| `npm run dev:container` | Start dev server (container) | -| `npm run build` | Build for production | -| `npm run type-check` | Run TypeScript type checking | - -## Database Schema Files - -**CRITICAL**: The database schema files must be kept in sync with each other. 
When making schema changes: - -| File | Purpose | -| ------------------------------ | ----------------------------------------------------------- | -| `sql/master_schema_rollup.sql` | Complete schema used by test database setup and reference | -| `sql/initial_schema.sql` | Base schema without seed data, used as standalone reference | -| `sql/migrations/*.sql` | Incremental migrations for production database updates | - -**Maintenance Rules:** - -1. **Keep `master_schema_rollup.sql` and `initial_schema.sql` in sync** - These files should contain the same table definitions -2. **When adding columns via migration**, also add them to both `master_schema_rollup.sql` and `initial_schema.sql` -3. **Migrations are for production deployments** - They use `ALTER TABLE` to add columns incrementally -4. **Schema files are for fresh installs** - They define the complete table structure -5. **Test database uses `master_schema_rollup.sql`** - If schema files are out of sync with migrations, tests will fail - -**Example:** When `002_expiry_tracking.sql` adds `purchase_date` to `pantry_items`, that column must also exist in the `CREATE TABLE` statements in both `master_schema_rollup.sql` and `initial_schema.sql`. - -## Known Integration Test Issues and Solutions - -This section documents common test issues encountered in integration tests, their root causes, and solutions. These patterns recur frequently. - -### 1. Vitest globalSetup Runs in Separate Node.js Context - -**Problem:** Vitest's `globalSetup` runs in a completely separate Node.js context from test files. This means: - -- Singletons created in globalSetup are NOT the same instances as those in test files -- `global`, `globalThis`, and `process` are all isolated between contexts -- `vi.spyOn()` on module exports doesn't work cross-context -- Dependency injection via setter methods fails across contexts - -**Affected Tests:** Any test trying to inject mocks into BullMQ worker services (e.g., AI failure tests, DB failure tests) - -**Solution Options:** - -1. Mark tests as `.todo()` until an API-based mock injection mechanism is implemented -2. Create test-only API endpoints that allow setting mock behaviors via HTTP -3. Use file-based or Redis-based mock flags that services check at runtime - -**Example of affected code pattern:** +**Solutions**: Mark `.todo()`, create test-only API endpoints, use Redis-based mock flags ```typescript -// This DOES NOT work - different module instances +// DOES NOT WORK - different instances const { flyerProcessingService } = await import('../../services/workers.server'); flyerProcessingService._getAiProcessor()._setExtractAndValidateData(mockFn); -// The worker uses a different flyerProcessingService instance! ``` -### 2. BullMQ Cleanup Queue Deleting Files Before Test Verification +### 2. Cleanup Queue Deletes Before Verification -**Problem:** The cleanup worker runs in the globalSetup context and processes cleanup jobs even when tests spy on `cleanupQueue.add()`. The spy intercepts calls in the test context, but jobs already queued run in the worker's context. +Cleanup worker processes jobs in globalSetup context, ignoring test spies. 
-**Affected Tests:** EXIF/PNG metadata stripping tests that need to verify file contents before deletion - -**Solution:** Drain and pause the cleanup queue before the test: +**Solution**: Drain and pause queue: ```typescript const { cleanupQueue } = await import('../../services/queues.server'); -await cleanupQueue.drain(); // Remove existing jobs -await cleanupQueue.pause(); // Prevent new jobs from processing -// ... run test ... -await cleanupQueue.resume(); // Restore normal operation +await cleanupQueue.drain(); +await cleanupQueue.pause(); +// ... test ... +await cleanupQueue.resume(); ``` -### 3. Cache Invalidation After Direct Database Inserts +### 3. Cache Stale After Direct SQL -**Problem:** Tests that insert data directly via SQL (bypassing the service layer) don't trigger cache invalidation. Subsequent API calls return stale cached data. +Direct `pool.query()` inserts bypass cache invalidation. -**Affected Tests:** Any test using `pool.query()` to insert flyers, stores, or other cached entities +**Solution**: `await cacheService.invalidateFlyers();` after inserts -**Solution:** Manually invalidate the cache after direct inserts: +### 4. Test Filename Collisions -```typescript -await pool.query('INSERT INTO flyers ...'); -await cacheService.invalidateFlyers(); // Clear stale cache -``` +Multer predictable filenames cause race conditions. -### 4. Unique Filenames Required for Test Isolation - -**Problem:** Multer generates predictable filenames in test environments, causing race conditions when multiple tests upload files concurrently or in sequence. - -**Affected Tests:** Flyer processing tests, file upload tests - -**Solution:** Always use unique filenames with timestamps: - -```typescript -// In multer.middleware.ts -const uniqueSuffix = `${Date.now()}-${Math.round(Math.random() * 1e9)}`; -cb(null, `${file.fieldname}-${uniqueSuffix}-${sanitizedOriginalName}`); -``` +**Solution**: Use unique suffix: `${Date.now()}-${Math.round(Math.random() * 1e9)}` ### 5. Response Format Mismatches -**Problem:** API response formats may change, causing tests to fail when expecting old formats. +API formats change: `data.jobId` vs `data.job.id`, nested vs flat, string vs number IDs. -**Common Issues:** - -- `response.body.data.jobId` vs `response.body.data.job.id` -- Nested objects vs flat response structures -- Type coercion (string vs number for IDs) - -**Solution:** Always log response bodies during debugging and update test assertions to match actual API contracts. +**Solution**: Log response bodies, update assertions ### 6. External Service Availability -**Problem:** Tests depending on external services (PM2, Redis health checks) fail when those services aren't available in the test environment. +PM2/Redis health checks fail when unavailable. -**Solution:** Use try/catch with graceful degradation or mock the external service checks. +**Solution**: try/catch with graceful degradation or mock -## Secrets and Environment Variables +--- -**CRITICAL**: This project uses **Gitea CI/CD secrets** for all sensitive configuration. There is NO `/etc/flyer-crawler/environment` file or similar local config file on the server. +## Secrets & Environment -### Server Directory Structure +**NO `/etc/flyer-crawler/environment` file** - Gitea CI/CD secrets only. 
-| Path | Environment | Notes | -| --------------------------------------------- | ----------- | ------------------------------------------------ | -| `/var/www/flyer-crawler.projectium.com/` | Production | NO `.env` file - secrets injected via CI/CD only | -| `/var/www/flyer-crawler-test.projectium.com/` | Test | Has `.env.test` file for test-specific config | +| Path | Env | Notes | +| --------------------------------------------- | ---- | ----------------------------- | +| `/var/www/flyer-crawler.projectium.com/` | Prod | No .env - CI/CD secrets only | +| `/var/www/flyer-crawler-test.projectium.com/` | Test | Has `.env.test` for overrides | -### How Secrets Work +### Config Files -1. **Gitea Secrets**: All secrets are stored in Gitea repository settings (Settings → Secrets) -2. **CI/CD Injection**: Secrets are injected during deployment via `.gitea/workflows/deploy-to-prod.yml` and `deploy-to-test.yml` -3. **PM2 Environment**: The CI/CD workflow passes secrets to PM2 via environment variables, which are then available to the application +| File | Purpose | +| ------------------------------------- | --------------------- | +| `src/config/env.ts` | Zod schema validation | +| `ecosystem.config.cjs` | PM2 config | +| `.gitea/workflows/deploy-to-prod.yml` | Prod deployment | +| `.gitea/workflows/deploy-to-test.yml` | Test deployment | +| `.env.example` | Variable template | -### Key Files for Configuration +### Adding Secrets -| File | Purpose | -| ------------------------------------- | ---------------------------------------------------- | -| `src/config/env.ts` | Centralized config with Zod schema validation | -| `ecosystem.config.cjs` | PM2 process config - reads from `process.env` | -| `.gitea/workflows/deploy-to-prod.yml` | Production deployment with secret injection | -| `.gitea/workflows/deploy-to-test.yml` | Test deployment with secret injection | -| `.env.example` | Template showing all available environment variables | -| `.env.test` | Test environment overrides (only on test server) | +1. Add to Gitea Settings > Secrets +2. Update workflow: `SENTRY_DSN: ${{ secrets.SENTRY_DSN }}` +3. Update `ecosystem.config.cjs` +4. Update `src/config/env.ts` schema +5. Update `.env.example` -### Adding New Secrets +### Gitea Secrets -To add a new secret (e.g., `SENTRY_DSN`): +**Shared**: `DB_HOST`, `JWT_SECRET`, `GOOGLE_MAPS_API_KEY`, `GOOGLE_CLIENT_ID/SECRET`, `GH_CLIENT_ID/SECRET`, `SENTRY_AUTH_TOKEN` -1. Add the secret to Gitea repository settings -2. Update the relevant workflow file (e.g., `deploy-to-prod.yml`) to inject it: +**Prod**: `DB_USER_PROD`, `DB_PASSWORD_PROD`, `DB_DATABASE_PROD`, `REDIS_PASSWORD_PROD`, `VITE_GOOGLE_GENAI_API_KEY`, `SENTRY_DSN`, `VITE_SENTRY_DSN` - ```yaml - SENTRY_DSN: ${{ secrets.SENTRY_DSN }} - ``` +**Test**: `DB_USER_TEST`, `DB_PASSWORD_TEST`, `DB_DATABASE_TEST`, `REDIS_PASSWORD_TEST`, `VITE_GOOGLE_GENAI_API_KEY_TEST`, `SENTRY_DSN_TEST`, `VITE_SENTRY_DSN_TEST` -3. Update `ecosystem.config.cjs` to read it from `process.env` -4. Update `src/config/env.ts` schema if validation is needed -5. 
Update `.env.example` to document the new variable - -### Current Gitea Secrets - -**Shared (used by both environments):** - -- `DB_HOST` - Database host (shared PostgreSQL server) -- `JWT_SECRET` - Authentication -- `GOOGLE_MAPS_API_KEY` - Google Maps -- `GOOGLE_CLIENT_ID`, `GOOGLE_CLIENT_SECRET` - Google OAuth -- `GH_CLIENT_ID`, `GH_CLIENT_SECRET` - GitHub OAuth -- `SENTRY_AUTH_TOKEN` - Bugsink API token for source map uploads (create at Settings > API Keys in Bugsink) - -**Production-specific:** - -- `DB_USER_PROD`, `DB_PASSWORD_PROD` - Production database credentials (`flyer_crawler_prod`) -- `DB_DATABASE_PROD` - Production database name (`flyer-crawler`) -- `REDIS_PASSWORD_PROD` - Redis password (uses database 0) -- `VITE_GOOGLE_GENAI_API_KEY` - Gemini API key for production -- `SENTRY_DSN`, `VITE_SENTRY_DSN` - Bugsink error tracking DSNs (production projects) - -**Test-specific:** - -- `DB_USER_TEST`, `DB_PASSWORD_TEST` - Test database credentials (`flyer_crawler_test`) -- `DB_DATABASE_TEST` - Test database name (`flyer-crawler-test`) -- `REDIS_PASSWORD_TEST` - Redis password (uses database 1 for isolation) -- `VITE_GOOGLE_GENAI_API_KEY_TEST` - Gemini API key for test -- `SENTRY_DSN_TEST`, `VITE_SENTRY_DSN_TEST` - Bugsink error tracking DSNs (test projects) - -### Test Environment - -The test environment (`flyer-crawler-test.projectium.com`) uses **both** Gitea CI/CD secrets and a local `.env.test` file: - -- **Gitea secrets**: Injected during deployment via `.gitea/workflows/deploy-to-test.yml` -- **`.env.test` file**: Located at `/var/www/flyer-crawler-test.projectium.com/.env.test` for local overrides -- **Redis database 1**: Isolates test job queues from production (which uses database 0) -- **PM2 process names**: Suffixed with `-test` (e.g., `flyer-crawler-api-test`) - -### Database User Setup (Test Environment) - -**CRITICAL**: The test database requires specific PostgreSQL permissions to be configured manually. Schema ownership alone is NOT sufficient - explicit privileges must be granted. - -**Database Users:** +### Database Users | User | Database | Purpose | | -------------------- | -------------------- | ---------- | | `flyer_crawler_prod` | `flyer-crawler-prod` | Production | | `flyer_crawler_test` | `flyer-crawler-test` | Testing | -**Required Setup Commands** (run as `postgres` superuser): +**Setup (as postgres superuser)**: -```bash -# Connect as postgres superuser -sudo -u postgres psql - -# Create the test database and user (if not exists) +```sql CREATE DATABASE "flyer-crawler-test"; -CREATE USER flyer_crawler_test WITH PASSWORD 'your-password-here'; - -# Grant ownership and privileges +CREATE USER flyer_crawler_test WITH PASSWORD 'password'; ALTER DATABASE "flyer-crawler-test" OWNER TO flyer_crawler_test; \c "flyer-crawler-test" ALTER SCHEMA public OWNER TO flyer_crawler_test; GRANT CREATE, USAGE ON SCHEMA public TO flyer_crawler_test; - -# Create required extension (must be done by superuser) CREATE EXTENSION IF NOT EXISTS "uuid-ossp"; ``` -**Why These Steps Are Necessary:** +**Verify**: `psql -d "flyer-crawler-test" -c "\dn+ public"` - expect 'UC' for user -1. **Schema ownership alone is insufficient** - PostgreSQL requires explicit `GRANT CREATE, USAGE` privileges even when the user owns the schema -2. **uuid-ossp extension** - Required by the application for UUID generation; must be created by a superuser before the app can use it -3. 
**Separate users for prod/test** - Prevents accidental cross-environment data access; each environment has its own credentials in Gitea secrets +--- -**Verification:** +## Dev Container Bugsink -```bash -# Check schema privileges (should show 'UC' for flyer_crawler_test) -psql -d "flyer-crawler-test" -c "\dn+ public" - -# Expected output: -# Name | Owner | Access privileges -# -------+--------------------+------------------------------------------ -# public | flyer_crawler_test | flyer_crawler_test=UC/flyer_crawler_test -``` - -### Dev Container Environment - -The dev container runs its own **local Bugsink instance** - it does NOT connect to the production Bugsink server: - -- **Local Bugsink UI**: Accessible at `https://localhost:8443` (proxied from `http://localhost:8000` by nginx) -- **Admin credentials**: `admin@localhost` / `admin` -- **Bugsink Projects**: Backend (Dev) - Project ID 1, Frontend (Dev) - Project ID 2 -- **Configuration Files**: - - `compose.dev.yml` - Sets default DSNs using `127.0.0.1:8000` protocol (for initial container setup) - - `.env.local` - **OVERRIDES** compose.dev.yml with `localhost:8000` protocol (this is what the app actually uses) - - **CRITICAL**: `.env.local` takes precedence over `compose.dev.yml` environment variables -- **DSN Configuration**: - - **Backend DSN** (Node.js/Express): Configured in `.env.local` as `SENTRY_DSN=http://@localhost:8000/1` - - **Frontend DSN** (React/Browser): Configured in `.env.local` as `VITE_SENTRY_DSN=http://@localhost:8000/2` - - **Why localhost instead of 127.0.0.1?** The `.env.local` file was created separately and uses `localhost` which works fine in practice -- **HTTPS Setup**: Self-signed certificates auto-generated with mkcert on container startup (for UI access only, not for Sentry SDK) -- **CSRF Protection**: Django configured with `SECURE_PROXY_SSL_HEADER` to trust `X-Forwarded-Proto` from nginx -- **Isolated**: Dev errors stay local, don't pollute production/test dashboards -- **No Gitea secrets needed**: Everything is self-contained in the container -- **Accessing Errors**: - - **Via Browser**: Open `https://localhost:8443` and login to view issues - - **Via MCP**: Configure a second Bugsink MCP server pointing to `http://localhost:8000` (see MCP Servers section below) +See [docs/DEV-CONTAINER-BUGSINK.md](docs/DEV-CONTAINER-BUGSINK.md) - Local Bugsink at `https://localhost:8443`, credentials `admin@localhost`/`admin` --- ## MCP Servers -The following MCP servers are configured for this project: +| Server | Purpose | +| ---------------- | ----------------------------------- | +| gitea-projectium | Gitea API gitea.projectium.com | +| gitea-torbonium | Gitea API gitea.torbonium.com | +| podman | Container management | +| filesystem | File system access | +| memory | Knowledge graph | +| postgres | DB queries localhost:5432 | +| redis | Cache localhost:6379 | +| bugsink | Prod Bugsink bugsink.projectium.com | +| bugsink-dev | Dev Bugsink localhost:8000 | -| Server | Purpose | -| ------------------- | ---------------------------------------------------------------------------- | -| gitea-projectium | Gitea API for gitea.projectium.com | -| gitea-torbonium | Gitea API for gitea.torbonium.com | -| podman | Container management | -| filesystem | File system access | -| fetch | Web fetching | -| markitdown | Convert documents to markdown | -| sequential-thinking | Step-by-step reasoning | -| memory | Knowledge graph persistence | -| postgres | Direct database queries (localhost:5432) | -| playwright | Browser 
automation and testing | -| redis | Redis cache inspection (localhost:6379) | -| bugsink | Error tracking - production Bugsink (bugsink.projectium.com) - **PROD/TEST** | -| bugsink-dev | Error tracking - dev container Bugsink (localhost:8000) - **DEV CONTAINER** | +**Bugsink MCP Setup**: Clone `https://github.com/j-shelfwood/bugsink-mcp.git`, `npm install && npm run build` -**Note:** MCP servers work in both **Claude CLI** and **Claude Code VS Code extension** (as of January 2026). +### Creating Bugsink API Tokens -**CRITICAL**: There are **TWO separate Bugsink MCP servers**: +**IMPORTANT**: Bugsink 2.0.11 does NOT have a "Settings > API Keys" menu in the UI. Tokens must be created via Django management command. -- **bugsink**: Connects to production Bugsink at `https://bugsink.projectium.com` for production and test server errors -- **bugsink-dev**: Connects to local dev container Bugsink at `http://localhost:8000` for local development errors - -### Bugsink MCP Server Setup (ADR-015) - -**IMPORTANT**: You need to configure **TWO separate MCP servers** - one for production/test, one for local dev. - -#### Installation (shared for both servers) +**For Dev Container** (local Bugsink at localhost:8000): ```bash -# Clone the bugsink-mcp repository (NOT sentry-selfhosted-mcp) -git clone https://github.com/j-shelfwood/bugsink-mcp.git -cd bugsink-mcp -npm install -npm run build +MSYS_NO_PATHCONV=1 podman exec -e DATABASE_URL=postgresql://bugsink:bugsink_dev_password@postgres:5432/bugsink -e SECRET_KEY=dev-bugsink-secret-key-minimum-50-characters-for-security flyer-crawler-dev sh -c 'cd /opt/bugsink/conf && DJANGO_SETTINGS_MODULE=bugsink_conf PYTHONPATH=/opt/bugsink/conf:/opt/bugsink/lib/python3.10/site-packages /opt/bugsink/bin/python -m django create_auth_token' ``` -#### Production/Test Bugsink MCP (bugsink) +**For Production** (bugsink.projectium.com via SSH): -Add to `.claude/mcp.json`: +```bash +ssh root@projectium.com "cd /opt/bugsink && bugsink-manage create_auth_token" +``` + +The command outputs a 40-character hex token (e.g., `a609c2886daa4e1e05f1517074d7779a5fb49056`). 
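+
+To sanity-check a token before wiring it into the MCP config, a minimal fetch sketch (the `/api/0/projects/` path and Bearer scheme are assumptions modeled on Sentry-compatible APIs, not confirmed Bugsink routes):
+
+```typescript
+// Hypothetical token check - endpoint path and auth scheme are assumptions.
+const token = 'a609c2886daa4e1e05f1517074d7779a5fb49056'; // example token from above
+
+fetch('http://localhost:8000/api/0/projects/', {
+  headers: { Authorization: `Bearer ${token}` },
+}).then((res) => console.log(res.status)); // expect 200 for a valid token
+```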
+ +**Config** (`~/.claude/settings.json` under `mcpServers`): ```json { @@ -497,164 +412,26 @@ Add to `.claude/mcp.json`: "args": ["d:\\gitea\\bugsink-mcp\\dist\\index.js"], "env": { "BUGSINK_URL": "https://bugsink.projectium.com", - "BUGSINK_API_TOKEN": "", + "BUGSINK_API_TOKEN": "", "BUGSINK_ORG_SLUG": "sentry" } - } -} -``` - -**Get the auth token**: - -- Navigate to https://bugsink.projectium.com -- Log in with production credentials -- Go to Settings > API Keys -- Create a new API key with read access - -#### Dev Container Bugsink MCP (bugsink-dev) - -Add to `.claude/mcp.json`: - -```json -{ + }, "bugsink-dev": { "command": "node", "args": ["d:\\gitea\\bugsink-mcp\\dist\\index.js"], "env": { "BUGSINK_URL": "http://localhost:8000", - "BUGSINK_API_TOKEN": "", + "BUGSINK_API_TOKEN": "", "BUGSINK_ORG_SLUG": "sentry" } } } ``` -**Get the auth token**: - -- Navigate to http://localhost:8000 (or https://localhost:8443) -- Log in with `admin@localhost` / `admin` -- Go to Settings > API Keys -- Create a new API key with read access - -#### MCP Tool Usage - -When using Bugsink MCP tools, remember: - -- `mcp__bugsink__*` tools connect to **production/test** Bugsink -- `mcp__bugsink-dev__*` tools connect to **dev container** Bugsink -- Available capabilities for both: - - List projects and issues - - View detailed error events and stacktraces - - Search by error message or stack trace - - Update issue status (resolve, ignore) - - Create releases - -### SSH Server Access - -Claude Code can execute commands on the production server via SSH: - -```bash -# Basic command execution -ssh root@projectium.com "command here" - -# Examples: -ssh root@projectium.com "systemctl status logstash" -ssh root@projectium.com "pm2 list" -ssh root@projectium.com "tail -50 /var/www/flyer-crawler.projectium.com/logs/app.log" -``` - -**Use cases:** - -- Managing Logstash, PM2, NGINX, Redis services -- Viewing server logs -- Deploying configuration changes -- Checking service status - -**Important:** SSH access requires the host machine to have SSH keys configured for `root@projectium.com`. +**SSH**: `ssh root@projectium.com "command"` - requires configured keys --- -## Logstash Configuration (ADR-050) +## Logstash (ADR-050) -The production server uses **Logstash** to aggregate logs from multiple sources and forward errors to Bugsink for centralized error tracking. - -**Log Sources:** - -- **PostgreSQL function logs** - Structured JSON logs from `fn_log()` helper function -- **PM2 worker logs** - Service logs from BullMQ job workers (stdout) -- **Redis logs** - Operational logs (INFO level) and errors -- **NGINX logs** - Access logs (all requests) and error logs - -### Configuration Location - -**Primary configuration file:** - -- `/etc/logstash/conf.d/bugsink.conf` - Complete Logstash pipeline configuration - -**Related files:** - -- `/etc/postgresql/14/main/conf.d/observability.conf` - PostgreSQL logging configuration -- `/var/log/postgresql/*.log` - PostgreSQL log files -- `/home/gitea-runner/.pm2/logs/*.log` - PM2 worker logs -- `/var/log/redis/redis-server.log` - Redis logs -- `/var/log/nginx/access.log` - NGINX access logs -- `/var/log/nginx/error.log` - NGINX error logs -- `/var/log/logstash/*.log` - Logstash file outputs (operational logs) -- `/var/lib/logstash/sincedb_*` - Logstash position tracking files - -### Key Features - -1. **Multi-source aggregation**: Collects logs from PostgreSQL, PM2 workers, Redis, and NGINX -2. 
**Environment-based routing**: Automatically detects production vs test environments and routes errors to the correct Bugsink project -3. **Structured JSON parsing**: Extracts `fn_log()` function output from PostgreSQL logs and Pino JSON from PM2 workers -4. **Sentry-compatible format**: Transforms events to Sentry format with `event_id`, `timestamp`, `level`, `message`, and `extra` context -5. **Error filtering**: Only forwards WARNING and ERROR level messages to Bugsink -6. **Operational log storage**: Stores non-error logs (Redis INFO, NGINX access, PM2 operational) to `/var/log/logstash/` for analysis -7. **Request monitoring**: Categorizes NGINX requests by status code (2xx, 3xx, 4xx, 5xx) and identifies slow requests - -### Common Maintenance Commands - -```bash -# Check Logstash status -systemctl status logstash - -# Restart Logstash after configuration changes -systemctl restart logstash - -# Test configuration syntax -/usr/share/logstash/bin/logstash --config.test_and_exit -f /etc/logstash/conf.d/bugsink.conf - -# View Logstash logs -journalctl -u logstash -f - -# Check Logstash stats (events processed, failures) -curl -XGET 'localhost:9600/_node/stats/pipelines?pretty' | jq '.pipelines.main.plugins.filters' - -# Monitor PostgreSQL logs being processed -tail -f /var/log/postgresql/postgresql-$(date +%Y-%m-%d).log - -# View operational log outputs -tail -f /var/log/logstash/pm2-workers-$(date +%Y-%m-%d).log -tail -f /var/log/logstash/redis-operational-$(date +%Y-%m-%d).log -tail -f /var/log/logstash/nginx-access-$(date +%Y-%m-%d).log - -# Check disk usage of log files -du -sh /var/log/logstash/ -``` - -### Troubleshooting - -| Issue | Check | Solution | -| ------------------------------- | ---------------------------- | ---------------------------------------------------------------------------------------------- | -| Errors not appearing in Bugsink | Check Logstash is running | `systemctl status logstash` | -| Configuration syntax errors | Test config file | `/usr/share/logstash/bin/logstash --config.test_and_exit -f /etc/logstash/conf.d/bugsink.conf` | -| Grok pattern failures | Check Logstash stats | `curl localhost:9600/_node/stats/pipelines?pretty \| jq '.pipelines.main.plugins.filters'` | -| Wrong Bugsink project | Verify environment detection | Check tags in logs match expected environment (production/test) | -| Permission denied reading logs | Check Logstash permissions | `groups logstash` should include `postgres`, `adm` groups | -| PM2 logs not captured | Check file paths exist | `ls /home/gitea-runner/.pm2/logs/flyer-crawler-worker-*.log` | -| NGINX access logs not showing | Check file output directory | `ls -lh /var/log/logstash/nginx-access-*.log` | -| High disk usage | Check log rotation | Verify `/etc/logrotate.d/logstash` is configured and running daily | - -**Full setup guide**: See [docs/BARE-METAL-SETUP.md](docs/BARE-METAL-SETUP.md) section "PostgreSQL Function Observability (ADR-050)" - -**Architecture details**: See [docs/adr/0050-postgresql-function-observability.md](docs/adr/0050-postgresql-function-observability.md) +See [docs/LOGSTASH-QUICK-REF.md](docs/LOGSTASH-QUICK-REF.md) - Log aggregation from PostgreSQL, PM2, Redis, NGINX to Bugsink. 
Config: `/etc/logstash/conf.d/bugsink.conf` diff --git a/CLAUDE.md.backup b/CLAUDE.md.backup new file mode 100644 index 00000000..a68e1779 --- /dev/null +++ b/CLAUDE.md.backup @@ -0,0 +1,660 @@ +# Claude Code Project Instructions + +## Session Startup Checklist + +**IMPORTANT**: At the start of every session, perform these steps: + +1. **Check Memory First** - Use `mcp__memory__read_graph` or `mcp__memory__search_nodes` to recall: + - Project-specific configurations and credentials + - Previous work context and decisions + - Infrastructure details (URLs, ports, access patterns) + - Known issues and their solutions + +2. **Review Recent Git History** - Check `git log --oneline -10` to understand recent changes + +3. **Check Container Status** - Use `mcp__podman__container_list` to see what's running + +--- + +## Project Instructions + +### Things to Remember + +Before writing any code: + +1. State how you will verify this change works (test, bash command, browser check, etc.) + +2. Write the test or verification step first + +3. Then implement the code + +4. Run verification and iterate until it passes + +## Git Bash / MSYS Path Conversion Issue (Windows Host) + +**CRITICAL ISSUE**: Git Bash on Windows automatically converts Unix-style paths to Windows paths, which breaks Podman/Docker commands. + +### Problem Examples: + +```bash +# This FAILS in Git Bash: +podman exec container /usr/local/bin/script.sh +# Git Bash converts to: C:/Program Files/Git/usr/local/bin/script.sh + +# This FAILS in Git Bash: +podman exec container bash -c "cat /tmp/file.sql" +# Git Bash converts /tmp to C:/Users/user/AppData/Local/Temp +``` + +### Solutions: + +1. **Use `sh -c` instead of `bash -c`** for single-quoted commands: + + ```bash + podman exec container sh -c '/usr/local/bin/script.sh' + ``` + +2. **Use double slashes** to escape path conversion: + + ```bash + podman exec container //usr//local//bin//script.sh + ``` + +3. **Set MSYS_NO_PATHCONV** environment variable: + + ```bash + MSYS_NO_PATHCONV=1 podman exec container /usr/local/bin/script.sh + ``` + +4. **Use Windows paths with forward slashes** when referencing host files: + ```bash + podman cp "d:/path/to/file" container:/tmp/file + ``` + +**ALWAYS use one of these workarounds when running Bash commands on Windows that involve Unix paths inside containers.** + +## Communication Style: Ask Before Assuming + +**IMPORTANT**: When helping with tasks, **ask clarifying questions before making assumptions**. Do not assume: + +- What steps the user has or hasn't completed +- What the user already knows or has configured +- What external services (OAuth providers, APIs, etc.) are already set up +- What secrets or credentials have already been created + +Instead, ask the user to confirm the current state before providing instructions or making recommendations. This prevents wasted effort and respects the user's existing work. + +## Platform Requirement: Linux Only + +**CRITICAL**: This application is designed to run **exclusively on Linux**. See [ADR-014](docs/adr/0014-containerization-and-deployment-strategy.md) for full details. + +### Environment Terminology + +- **Dev Container** (or just "dev"): The containerized Linux development environment (`flyer-crawler-dev`). This is where all development and testing should occur. +- **Host**: The Windows machine running Podman/Docker and VS Code. + +When instructions say "run in dev" or "run in the dev container", they mean executing commands inside the `flyer-crawler-dev` container. 
+ +### Test Execution Rules + +1. **ALL tests MUST be executed in the dev container** - the Linux container environment +2. **NEVER run tests directly on Windows host** - test results from Windows are unreliable +3. **Always use the dev container for testing** when developing on Windows +4. **TypeScript type-check MUST run in dev container** - `npm run type-check` on Windows does not reliably detect errors + +See [docs/TESTING.md](docs/TESTING.md) for comprehensive testing documentation. + +### How to Run Tests Correctly + +```bash +# If on Windows, first open VS Code and "Reopen in Container" +# Then run tests inside the dev container: +npm test # Run all unit tests +npm run test:unit # Run unit tests only +npm run test:integration # Run integration tests (requires DB/Redis) +``` + +### Running Tests via Podman (from Windows host) + +**Note:** This project has 2900+ unit tests. For AI-assisted development, pipe output to a file for easier processing. + +The command to run unit tests in the dev container via podman: + +```bash +# Basic (output to terminal) +podman exec -it flyer-crawler-dev npm run test:unit + +# Recommended for AI processing: pipe to file +podman exec -it flyer-crawler-dev npm run test:unit 2>&1 | tee test-results.txt +``` + +The command to run integration tests in the dev container via podman: + +```bash +podman exec -it flyer-crawler-dev npm run test:integration +``` + +For running specific test files: + +```bash +podman exec -it flyer-crawler-dev npm test -- --run src/hooks/useAuth.test.tsx +``` + +### Why Linux Only? + +- Path separators: Code uses POSIX-style paths (`/`) which may break on Windows +- Shell scripts in `scripts/` directory are Linux-only +- External dependencies like `pdftocairo` assume Linux installation paths +- Unix-style file permissions are assumed throughout + +### Test Result Interpretation + +- Tests that **pass on Windows but fail on Linux** = **BROKEN tests** (must be fixed) +- Tests that **fail on Windows but pass on Linux** = **PASSING tests** (acceptable) + +## Development Workflow + +1. Open project in VS Code +2. Use "Reopen in Container" (Dev Containers extension required) to enter the dev environment +3. Wait for dev container initialization to complete +4. Run `npm test` to verify the dev environment is working +5. Make changes and run tests inside the dev container + +## Code Change Verification + +After making any code changes, **always run a type-check** to catch TypeScript errors before committing: + +```bash +npm run type-check +``` + +This prevents linting/type errors from being introduced into the codebase. + +## Quick Reference + +| Command | Description | +| -------------------------- | ---------------------------- | +| `npm test` | Run all unit tests | +| `npm run test:unit` | Run unit tests only | +| `npm run test:integration` | Run integration tests | +| `npm run dev:container` | Start dev server (container) | +| `npm run build` | Build for production | +| `npm run type-check` | Run TypeScript type checking | + +## Database Schema Files + +**CRITICAL**: The database schema files must be kept in sync with each other. 
When making schema changes: + +| File | Purpose | +| ------------------------------ | ----------------------------------------------------------- | +| `sql/master_schema_rollup.sql` | Complete schema used by test database setup and reference | +| `sql/initial_schema.sql` | Base schema without seed data, used as standalone reference | +| `sql/migrations/*.sql` | Incremental migrations for production database updates | + +**Maintenance Rules:** + +1. **Keep `master_schema_rollup.sql` and `initial_schema.sql` in sync** - These files should contain the same table definitions +2. **When adding columns via migration**, also add them to both `master_schema_rollup.sql` and `initial_schema.sql` +3. **Migrations are for production deployments** - They use `ALTER TABLE` to add columns incrementally +4. **Schema files are for fresh installs** - They define the complete table structure +5. **Test database uses `master_schema_rollup.sql`** - If schema files are out of sync with migrations, tests will fail + +**Example:** When `002_expiry_tracking.sql` adds `purchase_date` to `pantry_items`, that column must also exist in the `CREATE TABLE` statements in both `master_schema_rollup.sql` and `initial_schema.sql`. + +## Known Integration Test Issues and Solutions + +This section documents common test issues encountered in integration tests, their root causes, and solutions. These patterns recur frequently. + +### 1. Vitest globalSetup Runs in Separate Node.js Context + +**Problem:** Vitest's `globalSetup` runs in a completely separate Node.js context from test files. This means: + +- Singletons created in globalSetup are NOT the same instances as those in test files +- `global`, `globalThis`, and `process` are all isolated between contexts +- `vi.spyOn()` on module exports doesn't work cross-context +- Dependency injection via setter methods fails across contexts + +**Affected Tests:** Any test trying to inject mocks into BullMQ worker services (e.g., AI failure tests, DB failure tests) + +**Solution Options:** + +1. Mark tests as `.todo()` until an API-based mock injection mechanism is implemented +2. Create test-only API endpoints that allow setting mock behaviors via HTTP +3. Use file-based or Redis-based mock flags that services check at runtime + +**Example of affected code pattern:** + +```typescript +// This DOES NOT work - different module instances +const { flyerProcessingService } = await import('../../services/workers.server'); +flyerProcessingService._getAiProcessor()._setExtractAndValidateData(mockFn); +// The worker uses a different flyerProcessingService instance! +``` + +### 2. BullMQ Cleanup Queue Deleting Files Before Test Verification + +**Problem:** The cleanup worker runs in the globalSetup context and processes cleanup jobs even when tests spy on `cleanupQueue.add()`. The spy intercepts calls in the test context, but jobs already queued run in the worker's context. + +**Affected Tests:** EXIF/PNG metadata stripping tests that need to verify file contents before deletion + +**Solution:** Drain and pause the cleanup queue before the test: + +```typescript +const { cleanupQueue } = await import('../../services/queues.server'); +await cleanupQueue.drain(); // Remove existing jobs +await cleanupQueue.pause(); // Prevent new jobs from processing +// ... run test ... +await cleanupQueue.resume(); // Restore normal operation +``` + +### 3. 
Cache Invalidation After Direct Database Inserts + +**Problem:** Tests that insert data directly via SQL (bypassing the service layer) don't trigger cache invalidation. Subsequent API calls return stale cached data. + +**Affected Tests:** Any test using `pool.query()` to insert flyers, stores, or other cached entities + +**Solution:** Manually invalidate the cache after direct inserts: + +```typescript +await pool.query('INSERT INTO flyers ...'); +await cacheService.invalidateFlyers(); // Clear stale cache +``` + +### 4. Unique Filenames Required for Test Isolation + +**Problem:** Multer generates predictable filenames in test environments, causing race conditions when multiple tests upload files concurrently or in sequence. + +**Affected Tests:** Flyer processing tests, file upload tests + +**Solution:** Always use unique filenames with timestamps: + +```typescript +// In multer.middleware.ts +const uniqueSuffix = `${Date.now()}-${Math.round(Math.random() * 1e9)}`; +cb(null, `${file.fieldname}-${uniqueSuffix}-${sanitizedOriginalName}`); +``` + +### 5. Response Format Mismatches + +**Problem:** API response formats may change, causing tests to fail when expecting old formats. + +**Common Issues:** + +- `response.body.data.jobId` vs `response.body.data.job.id` +- Nested objects vs flat response structures +- Type coercion (string vs number for IDs) + +**Solution:** Always log response bodies during debugging and update test assertions to match actual API contracts. + +### 6. External Service Availability + +**Problem:** Tests depending on external services (PM2, Redis health checks) fail when those services aren't available in the test environment. + +**Solution:** Use try/catch with graceful degradation or mock the external service checks. + +## Secrets and Environment Variables + +**CRITICAL**: This project uses **Gitea CI/CD secrets** for all sensitive configuration. There is NO `/etc/flyer-crawler/environment` file or similar local config file on the server. + +### Server Directory Structure + +| Path | Environment | Notes | +| --------------------------------------------- | ----------- | ------------------------------------------------ | +| `/var/www/flyer-crawler.projectium.com/` | Production | NO `.env` file - secrets injected via CI/CD only | +| `/var/www/flyer-crawler-test.projectium.com/` | Test | Has `.env.test` file for test-specific config | + +### How Secrets Work + +1. **Gitea Secrets**: All secrets are stored in Gitea repository settings (Settings → Secrets) +2. **CI/CD Injection**: Secrets are injected during deployment via `.gitea/workflows/deploy-to-prod.yml` and `deploy-to-test.yml` +3. **PM2 Environment**: The CI/CD workflow passes secrets to PM2 via environment variables, which are then available to the application + +### Key Files for Configuration + +| File | Purpose | +| ------------------------------------- | ---------------------------------------------------- | +| `src/config/env.ts` | Centralized config with Zod schema validation | +| `ecosystem.config.cjs` | PM2 process config - reads from `process.env` | +| `.gitea/workflows/deploy-to-prod.yml` | Production deployment with secret injection | +| `.gitea/workflows/deploy-to-test.yml` | Test deployment with secret injection | +| `.env.example` | Template showing all available environment variables | +| `.env.test` | Test environment overrides (only on test server) | + +### Adding New Secrets + +To add a new secret (e.g., `SENTRY_DSN`): + +1. Add the secret to Gitea repository settings +2. 
Update the relevant workflow file (e.g., `deploy-to-prod.yml`) to inject it: + + ```yaml + SENTRY_DSN: ${{ secrets.SENTRY_DSN }} + ``` + +3. Update `ecosystem.config.cjs` to read it from `process.env` +4. Update `src/config/env.ts` schema if validation is needed +5. Update `.env.example` to document the new variable + +### Current Gitea Secrets + +**Shared (used by both environments):** + +- `DB_HOST` - Database host (shared PostgreSQL server) +- `JWT_SECRET` - Authentication +- `GOOGLE_MAPS_API_KEY` - Google Maps +- `GOOGLE_CLIENT_ID`, `GOOGLE_CLIENT_SECRET` - Google OAuth +- `GH_CLIENT_ID`, `GH_CLIENT_SECRET` - GitHub OAuth +- `SENTRY_AUTH_TOKEN` - Bugsink API token for source map uploads (create at Settings > API Keys in Bugsink) + +**Production-specific:** + +- `DB_USER_PROD`, `DB_PASSWORD_PROD` - Production database credentials (`flyer_crawler_prod`) +- `DB_DATABASE_PROD` - Production database name (`flyer-crawler`) +- `REDIS_PASSWORD_PROD` - Redis password (uses database 0) +- `VITE_GOOGLE_GENAI_API_KEY` - Gemini API key for production +- `SENTRY_DSN`, `VITE_SENTRY_DSN` - Bugsink error tracking DSNs (production projects) + +**Test-specific:** + +- `DB_USER_TEST`, `DB_PASSWORD_TEST` - Test database credentials (`flyer_crawler_test`) +- `DB_DATABASE_TEST` - Test database name (`flyer-crawler-test`) +- `REDIS_PASSWORD_TEST` - Redis password (uses database 1 for isolation) +- `VITE_GOOGLE_GENAI_API_KEY_TEST` - Gemini API key for test +- `SENTRY_DSN_TEST`, `VITE_SENTRY_DSN_TEST` - Bugsink error tracking DSNs (test projects) + +### Test Environment + +The test environment (`flyer-crawler-test.projectium.com`) uses **both** Gitea CI/CD secrets and a local `.env.test` file: + +- **Gitea secrets**: Injected during deployment via `.gitea/workflows/deploy-to-test.yml` +- **`.env.test` file**: Located at `/var/www/flyer-crawler-test.projectium.com/.env.test` for local overrides +- **Redis database 1**: Isolates test job queues from production (which uses database 0) +- **PM2 process names**: Suffixed with `-test` (e.g., `flyer-crawler-api-test`) + +### Database User Setup (Test Environment) + +**CRITICAL**: The test database requires specific PostgreSQL permissions to be configured manually. Schema ownership alone is NOT sufficient - explicit privileges must be granted. + +**Database Users:** + +| User | Database | Purpose | +| -------------------- | -------------------- | ---------- | +| `flyer_crawler_prod` | `flyer-crawler-prod` | Production | +| `flyer_crawler_test` | `flyer-crawler-test` | Testing | + +**Required Setup Commands** (run as `postgres` superuser): + +```bash +# Connect as postgres superuser +sudo -u postgres psql + +# Create the test database and user (if not exists) +CREATE DATABASE "flyer-crawler-test"; +CREATE USER flyer_crawler_test WITH PASSWORD 'your-password-here'; + +# Grant ownership and privileges +ALTER DATABASE "flyer-crawler-test" OWNER TO flyer_crawler_test; +\c "flyer-crawler-test" +ALTER SCHEMA public OWNER TO flyer_crawler_test; +GRANT CREATE, USAGE ON SCHEMA public TO flyer_crawler_test; + +# Create required extension (must be done by superuser) +CREATE EXTENSION IF NOT EXISTS "uuid-ossp"; +``` + +**Why These Steps Are Necessary:** + +1. **Schema ownership alone is insufficient** - PostgreSQL requires explicit `GRANT CREATE, USAGE` privileges even when the user owns the schema +2. **uuid-ossp extension** - Required by the application for UUID generation; must be created by a superuser before the app can use it +3. 
**Separate users for prod/test** - Prevents accidental cross-environment data access; each environment has its own credentials in Gitea secrets
+
+**Verification:**
+
+```bash
+# Check schema privileges (should show 'UC' for flyer_crawler_test)
+psql -d "flyer-crawler-test" -c "\dn+ public"
+
+# Expected output:
+#  Name  |       Owner        |            Access privileges
+# -------+--------------------+------------------------------------------
+# public | flyer_crawler_test | flyer_crawler_test=UC/flyer_crawler_test
+```
+
+### Dev Container Environment
+
+The dev container runs its own **local Bugsink instance** - it does NOT connect to the production Bugsink server:
+
+- **Local Bugsink UI**: Accessible at `https://localhost:8443` (proxied from `http://localhost:8000` by nginx)
+- **Admin credentials**: `admin@localhost` / `admin`
+- **Bugsink Projects**: Backend (Dev) - Project ID 1, Frontend (Dev) - Project ID 2
+- **Configuration Files**:
+  - `compose.dev.yml` - Sets default DSNs using the `127.0.0.1:8000` host (for initial container setup)
+  - `.env.local` - **OVERRIDES** compose.dev.yml with the `localhost:8000` host (this is what the app actually uses)
+  - **CRITICAL**: `.env.local` takes precedence over `compose.dev.yml` environment variables
+- **DSN Configuration**:
+  - **Backend DSN** (Node.js/Express): Configured in `.env.local` as `SENTRY_DSN=http://@localhost:8000/1`
+  - **Frontend DSN** (React/Browser): Configured in `.env.local` as `VITE_SENTRY_DSN=http://@localhost:8000/2`
+  - **Why localhost instead of 127.0.0.1?** Both work in practice; `.env.local` was created separately and happens to use `localhost`
+- **HTTPS Setup**: Self-signed certificates auto-generated with mkcert on container startup (for UI access only, not for the Sentry SDK)
+- **CSRF Protection**: Django configured with `SECURE_PROXY_SSL_HEADER` to trust `X-Forwarded-Proto` from nginx
+- **Isolated**: Dev errors stay local and don't pollute the production/test dashboards
+- **No Gitea secrets needed**: Everything is self-contained in the container
+- **Accessing Errors**:
+  - **Via Browser**: Open `https://localhost:8443` and log in to view issues
+  - **Via MCP**: Configure a second Bugsink MCP server pointing to `http://localhost:8000` (see MCP Servers section below)
+
+---
+
+## MCP Servers
+
+The following MCP servers are configured for this project:
+
+| Server              | Purpose                                                                       |
+| ------------------- | ----------------------------------------------------------------------------- |
+| gitea-projectium    | Gitea API for gitea.projectium.com                                            |
+| gitea-torbonium     | Gitea API for gitea.torbonium.com                                             |
+| podman              | Container management                                                          |
+| filesystem          | File system access                                                            |
+| fetch               | Web fetching                                                                  |
+| markitdown          | Convert documents to markdown                                                 |
+| sequential-thinking | Step-by-step reasoning                                                        |
+| memory              | Knowledge graph persistence                                                   |
+| postgres            | Direct database queries (localhost:5432)                                      |
+| playwright          | Browser automation and testing                                                |
+| redis               | Redis cache inspection (localhost:6379)                                       |
+| bugsink             | Error tracking - production Bugsink (bugsink.projectium.com) - **PROD/TEST**  |
+| bugsink-dev         | Error tracking - dev container Bugsink (localhost:8000) - **DEV CONTAINER**   |
+
+**Note:** MCP servers work in both **Claude CLI** and **Claude Code VS Code extension** (as of January 2026).
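+
+The two Bugsink MCP servers below only *read* error events; the events themselves are reported by the application's Sentry SDK using the dev-container DSNs configured in `.env.local` (see Dev Container Environment above). For reference, a minimal backend init sketch - this assumes `@sentry/node` (referenced in ADR-015) and that initialization happens early in server startup; the exact location is illustrative:
+
+```typescript
+// Minimal sketch only - the real initialization may differ; see ADR-015.
+import * as Sentry from '@sentry/node';
+
+// In the dev container, SENTRY_DSN comes from .env.local (http://...@localhost:8000/1);
+// in production/test it is injected via Gitea CI/CD secrets.
+Sentry.init({
+  dsn: process.env.SENTRY_DSN,
+  environment: process.env.NODE_ENV ?? 'development',
+});
+```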
+
+**CRITICAL**: There are **TWO separate Bugsink MCP servers**:
+
+- **bugsink**: Connects to production Bugsink at `https://bugsink.projectium.com` for production and test server errors
+- **bugsink-dev**: Connects to local dev container Bugsink at `http://localhost:8000` for local development errors
+
+### Bugsink MCP Server Setup (ADR-015)
+
+**IMPORTANT**: You need to configure **TWO separate MCP servers** - one for production/test, one for local dev.
+
+#### Installation (shared for both servers)
+
+```bash
+# Clone the bugsink-mcp repository (NOT sentry-selfhosted-mcp)
+git clone https://github.com/j-shelfwood/bugsink-mcp.git
+cd bugsink-mcp
+npm install
+npm run build
+```
+
+#### Production/Test Bugsink MCP (bugsink)
+
+Add to `.claude/mcp.json`:
+
+```json
+{
+  "bugsink": {
+    "command": "node",
+    "args": ["d:\\gitea\\bugsink-mcp\\dist\\index.js"],
+    "env": {
+      "BUGSINK_URL": "https://bugsink.projectium.com",
+      "BUGSINK_API_TOKEN": "",
+      "BUGSINK_ORG_SLUG": "sentry"
+    }
+  }
+}
+```
+
+**Get the auth token**: Bugsink 2.0.11 does not have a "Settings > API Keys" UI. Create the token on the production server via the Django management command (run as the `bugsink` user) - see [docs/BUGSINK-SYNC.md](docs/BUGSINK-SYNC.md) for the exact commands. The output is a 40-character lowercase hex token.
+
+#### Dev Container Bugsink MCP (bugsink-dev)
+
+Add to `.claude/mcp.json`:
+
+```json
+{
+  "bugsink-dev": {
+    "command": "node",
+    "args": ["d:\\gitea\\bugsink-mcp\\dist\\index.js"],
+    "env": {
+      "BUGSINK_URL": "http://localhost:8000",
+      "BUGSINK_API_TOKEN": "",
+      "BUGSINK_ORG_SLUG": "sentry"
+    }
+  }
+}
+```
+
+**Get the auth token**: Create it inside the dev container with the `create_auth_token` Django management command - see [docs/DEV-CONTAINER-BUGSINK.md](docs/DEV-CONTAINER-BUGSINK.md) for the full `podman exec` invocation. The output is a 40-character lowercase hex token.
+
+#### MCP Tool Usage
+
+When using Bugsink MCP tools, remember:
+
+- `mcp__bugsink__*` tools connect to **production/test** Bugsink
+- `mcp__bugsink-dev__*` tools connect to **dev container** Bugsink
+- Available capabilities for both:
+  - List projects and issues
+  - View detailed error events and stacktraces
+  - Search by error message or stack trace
+  - Update issue status (resolve, ignore)
+  - Create releases
+
+### SSH Server Access
+
+Claude Code can execute commands on the production server via SSH:
+
+```bash
+# Basic command execution
+ssh root@projectium.com "command here"
+
+# Examples:
+ssh root@projectium.com "systemctl status logstash"
+ssh root@projectium.com "pm2 list"
+ssh root@projectium.com "tail -50 /var/www/flyer-crawler.projectium.com/logs/app.log"
+```
+
+**Use cases:**
+
+- Managing Logstash, PM2, NGINX, Redis services
+- Viewing server logs
+- Deploying configuration changes
+- Checking service status
+
+**Important:** SSH access requires the host machine to have SSH keys configured for `root@projectium.com`.
+
+---
+
+## Logstash Configuration (ADR-050)
+
+The production server uses **Logstash** to aggregate logs from multiple sources and forward errors to Bugsink for centralized error tracking.
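+
+Logstash can only apply structured parsing to worker output that is already JSON. For orientation, a minimal sketch of what a worker log line might look like - this assumes `pino` writing to stdout (the pipeline parses "Pino JSON from PM2 workers" per the features below); field names beyond pino's defaults are illustrative:
+
+```typescript
+// Sketch only - the real worker logging setup lives in src/services/workers.server.ts.
+import pino from 'pino';
+
+const logger = pino({ name: 'flyer-crawler-worker' });
+
+// pino writes one JSON object per line to stdout; PM2 appends it to
+// ~/.pm2/logs/*, which Logstash tails and parses.
+logger.info({ jobId: 123, queue: 'flyer-processing' }, 'job started');
+
+// Only WARNING/ERROR-level events are forwarded to Bugsink by the pipeline.
+logger.error({ jobId: 123, err: new Error('extraction failed') }, 'job failed');
+```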
+ +**Log Sources:** + +- **PostgreSQL function logs** - Structured JSON logs from `fn_log()` helper function +- **PM2 worker logs** - Service logs from BullMQ job workers (stdout) +- **Redis logs** - Operational logs (INFO level) and errors +- **NGINX logs** - Access logs (all requests) and error logs + +### Configuration Location + +**Primary configuration file:** + +- `/etc/logstash/conf.d/bugsink.conf` - Complete Logstash pipeline configuration + +**Related files:** + +- `/etc/postgresql/14/main/conf.d/observability.conf` - PostgreSQL logging configuration +- `/var/log/postgresql/*.log` - PostgreSQL log files +- `/home/gitea-runner/.pm2/logs/*.log` - PM2 worker logs +- `/var/log/redis/redis-server.log` - Redis logs +- `/var/log/nginx/access.log` - NGINX access logs +- `/var/log/nginx/error.log` - NGINX error logs +- `/var/log/logstash/*.log` - Logstash file outputs (operational logs) +- `/var/lib/logstash/sincedb_*` - Logstash position tracking files + +### Key Features + +1. **Multi-source aggregation**: Collects logs from PostgreSQL, PM2 workers, Redis, and NGINX +2. **Environment-based routing**: Automatically detects production vs test environments and routes errors to the correct Bugsink project +3. **Structured JSON parsing**: Extracts `fn_log()` function output from PostgreSQL logs and Pino JSON from PM2 workers +4. **Sentry-compatible format**: Transforms events to Sentry format with `event_id`, `timestamp`, `level`, `message`, and `extra` context +5. **Error filtering**: Only forwards WARNING and ERROR level messages to Bugsink +6. **Operational log storage**: Stores non-error logs (Redis INFO, NGINX access, PM2 operational) to `/var/log/logstash/` for analysis +7. **Request monitoring**: Categorizes NGINX requests by status code (2xx, 3xx, 4xx, 5xx) and identifies slow requests + +### Common Maintenance Commands + +```bash +# Check Logstash status +systemctl status logstash + +# Restart Logstash after configuration changes +systemctl restart logstash + +# Test configuration syntax +/usr/share/logstash/bin/logstash --config.test_and_exit -f /etc/logstash/conf.d/bugsink.conf + +# View Logstash logs +journalctl -u logstash -f + +# Check Logstash stats (events processed, failures) +curl -XGET 'localhost:9600/_node/stats/pipelines?pretty' | jq '.pipelines.main.plugins.filters' + +# Monitor PostgreSQL logs being processed +tail -f /var/log/postgresql/postgresql-$(date +%Y-%m-%d).log + +# View operational log outputs +tail -f /var/log/logstash/pm2-workers-$(date +%Y-%m-%d).log +tail -f /var/log/logstash/redis-operational-$(date +%Y-%m-%d).log +tail -f /var/log/logstash/nginx-access-$(date +%Y-%m-%d).log + +# Check disk usage of log files +du -sh /var/log/logstash/ +``` + +### Troubleshooting + +| Issue | Check | Solution | +| ------------------------------- | ---------------------------- | ---------------------------------------------------------------------------------------------- | +| Errors not appearing in Bugsink | Check Logstash is running | `systemctl status logstash` | +| Configuration syntax errors | Test config file | `/usr/share/logstash/bin/logstash --config.test_and_exit -f /etc/logstash/conf.d/bugsink.conf` | +| Grok pattern failures | Check Logstash stats | `curl localhost:9600/_node/stats/pipelines?pretty \| jq '.pipelines.main.plugins.filters'` | +| Wrong Bugsink project | Verify environment detection | Check tags in logs match expected environment (production/test) | +| Permission denied reading logs | Check Logstash permissions | `groups logstash` should include 
`postgres`, `adm` groups | +| PM2 logs not captured | Check file paths exist | `ls /home/gitea-runner/.pm2/logs/flyer-crawler-worker-*.log` | +| NGINX access logs not showing | Check file output directory | `ls -lh /var/log/logstash/nginx-access-*.log` | +| High disk usage | Check log rotation | Verify `/etc/logrotate.d/logstash` is configured and running daily | + +**Full setup guide**: See [docs/BARE-METAL-SETUP.md](docs/BARE-METAL-SETUP.md) section "PostgreSQL Function Observability (ADR-050)" + +**Architecture details**: See [docs/adr/0050-postgresql-function-observability.md](docs/adr/0050-postgresql-function-observability.md) diff --git a/docs/BUGSINK-SYNC.md b/docs/BUGSINK-SYNC.md index f5ba0a06..b1729569 100644 --- a/docs/BUGSINK-SYNC.md +++ b/docs/BUGSINK-SYNC.md @@ -63,7 +63,7 @@ Add these to **test environment only** (`deploy-to-test.yml`): ```bash # Bugsink API BUGSINK_URL=https://bugsink.projectium.com -BUGSINK_API_TOKEN= API Keys> +BUGSINK_API_TOKEN= # Gitea API GITEA_URL=https://gitea.projectium.com @@ -76,15 +76,38 @@ BUGSINK_SYNC_ENABLED=true # Only set true in test env BUGSINK_SYNC_INTERVAL=15 # Minutes between sync runs ``` +## Creating Bugsink API Token + +Bugsink 2.0.11 does not have a "Settings > API Keys" UI. Create API tokens via Django management command: + +**On Production Server:** + +```bash +sudo su - bugsink +source venv/bin/activate +cd ~ +bugsink-manage shell -c " +from django.contrib.auth import get_user_model +from rest_framework.authtoken.models import Token +User = get_user_model() +user = User.objects.get(email='admin@yourdomain.com') # Use your admin email +token, created = Token.objects.get_or_create(user=user) +print(f'Token: {token.key}') +" +exit +``` + +This will output a 40-character lowercase hex token. + ## Gitea Secrets to Add Add these secrets in Gitea repository settings (Settings > Secrets): -| Secret Name | Value | Environment | -| ---------------------- | ---------------------- | ----------- | -| `BUGSINK_API_TOKEN` | API token from Bugsink | Test only | -| `GITEA_SYNC_TOKEN` | Personal access token | Test only | -| `BUGSINK_SYNC_ENABLED` | `true` | Test only | +| Secret Name | Value | Environment | +| ---------------------- | ------------------------ | ----------- | +| `BUGSINK_API_TOKEN` | Token from command above | Test only | +| `GITEA_SYNC_TOKEN` | Personal access token | Test only | +| `BUGSINK_SYNC_ENABLED` | `true` | Test only | ## Redis Configuration diff --git a/docs/DEV-CONTAINER-BUGSINK.md b/docs/DEV-CONTAINER-BUGSINK.md new file mode 100644 index 00000000..2780ffb0 --- /dev/null +++ b/docs/DEV-CONTAINER-BUGSINK.md @@ -0,0 +1,81 @@ +# Dev Container Bugsink Setup + +Local Bugsink instance for development - NOT connected to production. 
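+
+A quick way to confirm the wiring end to end is to capture a test error from inside the dev container and check that it appears in the UI. A sketch, assuming `@sentry/node` has already been initialized with the backend DSN listed below:
+
+```typescript
+// Run inside the dev container after Sentry.init() - sketch only.
+import * as Sentry from '@sentry/node';
+
+// The event should appear under Backend (Dev) / Project ID 1
+// at https://localhost:8443 within a few seconds.
+Sentry.captureException(new Error('dev-container Bugsink smoke test'));
+
+// flush() ensures the event is sent before a short-lived script exits
+// (call from an async context).
+await Sentry.flush(2000);
+```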
+ +## Quick Reference + +| Item | Value | +| ------------ | ----------------------------------------------------------- | +| UI | `https://localhost:8443` (nginx proxy from 8000) | +| Credentials | `admin@localhost` / `admin` | +| Projects | Backend (Dev) = Project ID 1, Frontend (Dev) = Project ID 2 | +| Backend DSN | `SENTRY_DSN=http://@localhost:8000/1` | +| Frontend DSN | `VITE_SENTRY_DSN=http://@localhost:8000/2` | + +## Configuration Files + +| File | Purpose | +| ----------------- | ----------------------------------------------------------------- | +| `compose.dev.yml` | Initial DSNs using `127.0.0.1:8000` (container startup) | +| `.env.local` | **OVERRIDES** compose.dev.yml with `localhost:8000` (app runtime) | + +**CRITICAL**: `.env.local` takes precedence over `compose.dev.yml` environment variables. + +## Why localhost vs 127.0.0.1? + +The `.env.local` file uses `localhost` while `compose.dev.yml` uses `127.0.0.1`. Both work in practice - `localhost` was chosen when `.env.local` was created separately. + +## HTTPS Setup + +- Self-signed certificates auto-generated with mkcert on container startup +- CSRF Protection: Django configured with `SECURE_PROXY_SSL_HEADER` to trust `X-Forwarded-Proto` from nginx +- HTTPS is for UI access only - Sentry SDK uses HTTP directly + +## Isolation Benefits + +- Dev errors stay local, don't pollute production/test dashboards +- No Gitea secrets needed - everything self-contained +- Independent testing of error tracking without affecting metrics + +## Accessing Errors + +### Via Browser + +1. Open `https://localhost:8443` +2. Login with credentials above +3. Navigate to Issues to view captured errors + +### Via MCP (bugsink-dev) + +Configure in `.claude/mcp.json`: + +```json +{ + "bugsink-dev": { + "command": "node", + "args": ["d:\\gitea\\bugsink-mcp\\dist\\index.js"], + "env": { + "BUGSINK_URL": "http://localhost:8000", + "BUGSINK_API_TOKEN": "", + "BUGSINK_ORG_SLUG": "sentry" + } + } +} +``` + +**Get auth token**: + +API tokens must be created via Django management command (Bugsink 2.0.11 does not have a "Settings > API Keys" UI): + +```bash +podman exec flyer-crawler-dev sh -c 'cd /opt/bugsink/conf && \ + DATABASE_URL=postgresql://bugsink:bugsink_dev_password@postgres:5432/bugsink \ + SECRET_KEY=dev-bugsink-secret-key-minimum-50-characters-for-security \ + DJANGO_SETTINGS_MODULE=bugsink_conf \ + PYTHONPATH=/opt/bugsink/conf:/opt/bugsink/lib/python3.10/site-packages \ + /opt/bugsink/bin/python -m django create_auth_token' +``` + +This will output a 40-character lowercase hex token. Copy it to your MCP configuration. + +**MCP Tools**: Use `mcp__bugsink-dev__*` tools (not `mcp__bugsink__*` which connects to production). diff --git a/docs/LOGSTASH-QUICK-REF.md b/docs/LOGSTASH-QUICK-REF.md new file mode 100644 index 00000000..6296f325 --- /dev/null +++ b/docs/LOGSTASH-QUICK-REF.md @@ -0,0 +1,75 @@ +# Logstash Quick Reference (ADR-050) + +Aggregates logs from PostgreSQL, PM2, Redis, NGINX; forwards errors to Bugsink. 
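+
+Everything forwarded to Bugsink is normalized into a Sentry-compatible event. Roughly this shape - a TypeScript sketch for reference only; the authoritative field mapping lives in `bugsink.conf`:
+
+```typescript
+// Approximate shape of an event Logstash sends to Bugsink (see bugsink.conf).
+interface BugsinkEvent {
+  event_id: string; // UUID assigned by the pipeline
+  timestamp: string; // time of the source log line
+  level: 'warning' | 'error'; // only WARNING/ERROR are forwarded
+  message: string; // extracted log message
+  extra: Record<string, unknown>; // source context (fn_log fields, pino fields, ...)
+}
+```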
+ +## Configuration + +**Primary config**: `/etc/logstash/conf.d/bugsink.conf` + +### Related Files + +| Path | Purpose | +| --------------------------------------------------- | ------------------------- | +| `/etc/postgresql/14/main/conf.d/observability.conf` | PostgreSQL logging config | +| `/var/log/postgresql/*.log` | PostgreSQL logs | +| `/home/gitea-runner/.pm2/logs/*.log` | PM2 worker logs | +| `/var/log/redis/redis-server.log` | Redis logs | +| `/var/log/nginx/access.log` | NGINX access logs | +| `/var/log/nginx/error.log` | NGINX error logs | +| `/var/log/logstash/*.log` | Logstash file outputs | +| `/var/lib/logstash/sincedb_*` | Position tracking files | + +## Features + +- **Multi-source aggregation**: PostgreSQL, PM2 workers, Redis, NGINX +- **Environment routing**: Auto-detects prod/test, routes to correct Bugsink project +- **JSON parsing**: Extracts `fn_log()` from PostgreSQL, Pino JSON from PM2 +- **Sentry format**: Transforms to `event_id`, `timestamp`, `level`, `message`, `extra` +- **Error filtering**: Only forwards WARNING/ERROR to Bugsink +- **Operational storage**: Non-error logs saved to `/var/log/logstash/` +- **Request monitoring**: NGINX requests categorized by status, slow request detection + +## Commands + +```bash +# Status and control +systemctl status logstash +systemctl restart logstash + +# Test configuration +/usr/share/logstash/bin/logstash --config.test_and_exit -f /etc/logstash/conf.d/bugsink.conf + +# View logs +journalctl -u logstash -f + +# Check stats (events processed, failures) +curl -XGET 'localhost:9600/_node/stats/pipelines?pretty' | jq '.pipelines.main.plugins.filters' + +# Monitor sources +tail -f /var/log/postgresql/postgresql-$(date +%Y-%m-%d).log +tail -f /var/log/logstash/pm2-workers-$(date +%Y-%m-%d).log +tail -f /var/log/logstash/redis-operational-$(date +%Y-%m-%d).log +tail -f /var/log/logstash/nginx-access-$(date +%Y-%m-%d).log + +# Check disk usage +du -sh /var/log/logstash/ +``` + +## Troubleshooting + +| Issue | Check | Solution | +| --------------------- | ---------------- | ---------------------------------------------------------------------------------------------- | +| No Bugsink errors | Logstash running | `systemctl status logstash` | +| Config syntax error | Test config | `/usr/share/logstash/bin/logstash --config.test_and_exit -f /etc/logstash/conf.d/bugsink.conf` | +| Grok pattern failures | Stats endpoint | `curl localhost:9600/_node/stats/pipelines?pretty \| jq '.pipelines.main.plugins.filters'` | +| Wrong Bugsink project | Env detection | Check tags in logs match expected environment | +| Permission denied | Logstash groups | `groups logstash` should include `postgres`, `adm` | +| PM2 not captured | File paths | `ls /home/gitea-runner/.pm2/logs/flyer-crawler-worker-*.log` | +| NGINX logs missing | Output directory | `ls -lh /var/log/logstash/nginx-access-*.log` | +| High disk usage | Log rotation | Verify `/etc/logrotate.d/logstash` configured | + +## Related Documentation + +- **Full setup**: [BARE-METAL-SETUP.md](BARE-METAL-SETUP.md) - PostgreSQL Function Observability section +- **Architecture**: [adr/0050-postgresql-function-observability.md](adr/0050-postgresql-function-observability.md) +- **Troubleshooting details**: [LOGSTASH-TROUBLESHOOTING.md](LOGSTASH-TROUBLESHOOTING.md) diff --git a/docs/adr/0015-application-performance-monitoring-and-error-tracking.md b/docs/adr/0015-application-performance-monitoring-and-error-tracking.md index 2c6b19b8..4c6f9e48 100644 --- 
a/docs/adr/0015-application-performance-monitoring-and-error-tracking.md +++ b/docs/adr/0015-application-performance-monitoring-and-error-tracking.md @@ -76,16 +76,18 @@ This provides a secondary error capture path for: - Database function errors and slow queries - Historical error analysis from log files -### 5. MCP Server Integration: sentry-selfhosted-mcp +### 5. MCP Server Integration: bugsink-mcp -For AI tool integration (Claude Code, Cursor, etc.), we use the open-source [sentry-selfhosted-mcp](https://github.com/ddfourtwo/sentry-selfhosted-mcp) server: +For AI tool integration (Claude Code, Cursor, etc.), we use the open-source [bugsink-mcp](https://github.com/j-shelfwood/bugsink-mcp) server: - **No code changes required**: Configurable via environment variables -- **Capabilities**: List projects, get issues, view events, update status, add comments +- **Capabilities**: List projects, get issues, view events, get stacktraces, manage releases - **Configuration**: - - `SENTRY_URL`: Points to Bugsink instance - - `SENTRY_AUTH_TOKEN`: API token from Bugsink - - `SENTRY_ORG_SLUG`: Organization identifier + - `BUGSINK_URL`: Points to Bugsink instance (`http://localhost:8000` for dev, `https://bugsink.projectium.com` for prod) + - `BUGSINK_API_TOKEN`: API token from Bugsink (created via Django management command) + - `BUGSINK_ORG_SLUG`: Organization identifier (usually "sentry") + +**Note:** Despite the name `sentry-selfhosted-mcp` mentioned in earlier drafts of this ADR, the actual MCP server used is `bugsink-mcp` which is specifically designed for Bugsink's API structure. ## Architecture @@ -144,12 +146,12 @@ External (Developer Machine): ┌──────────────────────────────────────┐ │ Claude Code / Cursor / VS Code │ │ ┌────────────────────────────────┐ │ -│ │ sentry-selfhosted-mcp │ │ +│ │ bugsink-mcp │ │ │ │ (MCP Server) │ │ │ │ │ │ -│ │ SENTRY_URL=http://localhost:8000 -│ │ SENTRY_AUTH_TOKEN=... │ │ -│ │ SENTRY_ORG_SLUG=... │ │ +│ │ BUGSINK_URL=http://localhost:8000 +│ │ BUGSINK_API_TOKEN=... │ │ +│ │ BUGSINK_ORG_SLUG=... │ │ │ └────────────────────────────────┘ │ └──────────────────────────────────────┘ ``` @@ -279,7 +281,7 @@ output { - Configure Redis log monitoring (connection errors, slow commands) 7. **MCP server documentation**: - - Document `sentry-selfhosted-mcp` setup in CLAUDE.md + - Document `bugsink-mcp` setup in CLAUDE.md 8. **PostgreSQL function logging** (future): - Configure PostgreSQL to log function execution errors @@ -318,5 +320,5 @@ output { - [Bugsink Docker Install](https://www.bugsink.com/docs/docker-install/) - [@sentry/node Documentation](https://docs.sentry.io/platforms/javascript/guides/node/) - [@sentry/react Documentation](https://docs.sentry.io/platforms/javascript/guides/react/) -- [sentry-selfhosted-mcp](https://github.com/ddfourtwo/sentry-selfhosted-mcp) +- [bugsink-mcp](https://github.com/j-shelfwood/bugsink-mcp) - [Logstash Reference](https://www.elastic.co/guide/en/logstash/current/index.html) diff --git a/docs/adr/0054-bugsink-gitea-issue-sync.md b/docs/adr/0054-bugsink-gitea-issue-sync.md index 0c8a42e9..c0a6f6ec 100644 --- a/docs/adr/0054-bugsink-gitea-issue-sync.md +++ b/docs/adr/0054-bugsink-gitea-issue-sync.md @@ -149,7 +149,7 @@ All synced issues also receive the `source:bugsink` label. ```bash # Bugsink Configuration BUGSINK_URL=https://bugsink.projectium.com -BUGSINK_API_TOKEN=77deaa5e... # From Bugsink Settings > API Keys +BUGSINK_API_TOKEN=77deaa5e... 
# Created via Django management command (see BUGSINK-SYNC.md) # Gitea Configuration GITEA_URL=https://gitea.projectium.com diff --git a/src/layouts/MainLayout.test.tsx b/src/layouts/MainLayout.test.tsx index ceae6185..8b44fd16 100644 --- a/src/layouts/MainLayout.test.tsx +++ b/src/layouts/MainLayout.test.tsx @@ -195,7 +195,7 @@ describe('MainLayout Component', () => { onOpenProfile: mockOnOpenProfile, }; - it('renders all main sections and the Outlet content', () => { + it('renders all main sections and the Outlet content for unauthenticated users', () => { renderWithRouter(); expect(screen.getByTestId('flyer-list')).toBeInTheDocument(); @@ -203,9 +203,10 @@ describe('MainLayout Component', () => { expect(screen.getByTestId('shopping-list')).toBeInTheDocument(); expect(screen.getByTestId('watched-items-list')).toBeInTheDocument(); expect(screen.getByTestId('price-chart')).toBeInTheDocument(); - expect(screen.getByTestId('price-history-chart')).toBeInTheDocument(); - expect(screen.getByTestId('leaderboard')).toBeInTheDocument(); - expect(screen.getByTestId('activity-log')).toBeInTheDocument(); + // Auth-gated components should NOT be present for unauthenticated users + expect(screen.queryByTestId('price-history-chart')).not.toBeInTheDocument(); + expect(screen.queryByTestId('leaderboard')).not.toBeInTheDocument(); + expect(screen.queryByTestId('activity-log')).not.toBeInTheDocument(); expect(screen.getByText('Outlet Content')).toBeInTheDocument(); }); @@ -235,6 +236,44 @@ describe('MainLayout Component', () => { renderWithRouter(); expect(screen.queryByTestId('anonymous-banner')).not.toBeInTheDocument(); }); + + it('renders auth-gated components (PriceHistoryChart, Leaderboard, ActivityLog)', () => { + renderWithRouter(); + expect(screen.getByTestId('price-history-chart')).toBeInTheDocument(); + expect(screen.getByTestId('leaderboard')).toBeInTheDocument(); + expect(screen.getByTestId('activity-log')).toBeInTheDocument(); + }); + + it('calls setActiveListId when a list is shared via ActivityLog and the list exists', () => { + mockedUseShoppingLists.mockReturnValue({ + ...defaultUseShoppingListsReturn, + shoppingLists: [ + createMockShoppingList({ shopping_list_id: 1, name: 'My List', user_id: 'user-123' }), + ], + }); + + renderWithRouter(); + const activityLog = screen.getByTestId('activity-log'); + fireEvent.click(activityLog); + + expect(mockSetActiveListId).toHaveBeenCalledWith(1); + }); + + it('does not call setActiveListId for actions other than list_shared', () => { + renderWithRouter(); + const otherLogAction = screen.getByTestId('activity-log-other'); + fireEvent.click(otherLogAction); + + expect(mockSetActiveListId).not.toHaveBeenCalled(); + }); + + it('does not call setActiveListId if the shared list does not exist', () => { + renderWithRouter(); + const activityLog = screen.getByTestId('activity-log'); + fireEvent.click(activityLog); // Mock click simulates sharing list with id 1 + + expect(mockSetActiveListId).not.toHaveBeenCalled(); + }); }); describe('Error Handling', () => { @@ -285,37 +324,6 @@ describe('MainLayout Component', () => { }); describe('Event Handlers', () => { - it('calls setActiveListId when a list is shared via ActivityLog and the list exists', () => { - mockedUseShoppingLists.mockReturnValue({ - ...defaultUseShoppingListsReturn, - shoppingLists: [ - createMockShoppingList({ shopping_list_id: 1, name: 'My List', user_id: 'user-123' }), - ], - }); - - renderWithRouter(); - const activityLog = screen.getByTestId('activity-log'); - fireEvent.click(activityLog); 
- - expect(mockSetActiveListId).toHaveBeenCalledWith(1); - }); - - it('does not call setActiveListId for actions other than list_shared', () => { - renderWithRouter(); - const otherLogAction = screen.getByTestId('activity-log-other'); - fireEvent.click(otherLogAction); - - expect(mockSetActiveListId).not.toHaveBeenCalled(); - }); - - it('does not call setActiveListId if the shared list does not exist', () => { - renderWithRouter(); - const activityLog = screen.getByTestId('activity-log'); - fireEvent.click(activityLog); // Mock click simulates sharing list with id 1 - - expect(mockSetActiveListId).not.toHaveBeenCalled(); - }); - it('calls addItemToList when an item is added from ShoppingListComponent and a list is active', () => { const mockAddItemToList = vi.fn(); mockedUseShoppingLists.mockReturnValue({