Compare commits

..

79 Commits

Author SHA1 Message Date
Gitea Actions
a14816c8ee ci: Bump version to 0.11.0 for production release [skip ci] 2026-01-18 05:02:54 +05:00
Gitea Actions
08b220e29c ci: Bump version to 0.10.0 for production release [skip ci] 2026-01-18 04:50:17 +05:00
Gitea Actions
d41a3f1887 ci: Bump version to 0.9.115 [skip ci] 2026-01-18 04:10:18 +05:00
1f6cdc62d7 still fixin test
All checks were successful
Deploy to Test Environment / deploy-to-test (push) Successful in 16m20s
2026-01-17 15:09:17 -08:00
Gitea Actions
978c63bacd ci: Bump version to 0.9.114 [skip ci] 2026-01-18 04:00:21 +05:00
544eb7ae3c still fixin test
Some checks failed
Deploy to Test Environment / deploy-to-test (push) Failing after 2m1s
2026-01-17 14:59:01 -08:00
Gitea Actions
f6839f6e14 ci: Bump version to 0.9.113 [skip ci] 2026-01-18 03:35:25 +05:00
3fac29436a still fixin test
Some checks failed
Deploy to Test Environment / deploy-to-test (push) Failing after 2m6s
2026-01-17 14:34:18 -08:00
Gitea Actions
56f45c9301 ci: Bump version to 0.9.112 [skip ci] 2026-01-18 03:19:53 +05:00
83460abce4 md fixin
Some checks failed
Deploy to Test Environment / deploy-to-test (push) Failing after 1m57s
2026-01-17 14:18:55 -08:00
Gitea Actions
1b084b2ba4 ci: Bump version to 0.9.111 [skip ci] 2026-01-18 02:56:20 +05:00
0ea034bdc8 push
Some checks failed
Deploy to Test Environment / deploy-to-test (push) Failing after 1m54s
2026-01-17 13:55:22 -08:00
Gitea Actions
fc9e27078a ci: Bump version to 0.9.110 [skip ci] 2026-01-18 02:41:36 +05:00
fb8cbe8007 update mcp and created new test user and reset passes
Some checks failed
Deploy to Test Environment / deploy-to-test (push) Failing after 1m56s
2026-01-17 13:40:31 -08:00
f49f786c23 fix: Add .env file loading to ecosystem-test.config.cjs
Allows test environment PM2 processes to load environment variables
from /var/www/flyer-crawler-test.projectium.com/.env file, enabling
manual restarts without requiring CI/CD to inject variables.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-17 13:38:15 -08:00
Gitea Actions
dd31141d4e ci: Bump version to 0.9.109 [skip ci] 2026-01-13 23:09:47 +05:00
8073094760 testing/staging fixin
All checks were successful
Deploy to Test Environment / deploy-to-test (push) Successful in 16m15s
2026-01-13 10:08:28 -08:00
Gitea Actions
33a1e146ab ci: Bump version to 0.9.108 [skip ci] 2026-01-13 22:34:20 +05:00
4f8216db77 testing/staging fixin
All checks were successful
Deploy to Test Environment / deploy-to-test (push) Successful in 15m55s
2026-01-13 09:33:38 -08:00
Gitea Actions
42d605d19f ci: Bump version to 0.9.107 [skip ci] 2026-01-13 22:06:39 +05:00
749350df7f testing/staging fixin
All checks were successful
Deploy to Test Environment / deploy-to-test (push) Successful in 15m56s
2026-01-13 09:03:42 -08:00
Gitea Actions
ac085100fe ci: Bump version to 0.9.106 [skip ci] 2026-01-13 21:43:43 +05:00
ce4ecd1268 use port 3002 in test
All checks were successful
Deploy to Test Environment / deploy-to-test (push) Successful in 16m13s
2026-01-13 08:42:34 -08:00
Gitea Actions
a57cfc396b ci: Bump version to 0.9.105 [skip ci] 2026-01-13 21:00:45 +05:00
987badbf8d use port 3002 in test
All checks were successful
Deploy to Test Environment / deploy-to-test (push) Successful in 15m41s
2026-01-13 07:59:49 -08:00
Gitea Actions
d38fcd21c1 ci: Bump version to 0.9.104 [skip ci] 2026-01-13 08:11:38 +05:00
6e36cc3b07 logging + e2e test fixes
All checks were successful
Deploy to Test Environment / deploy-to-test (push) Successful in 16m34s
2026-01-12 19:10:29 -08:00
Gitea Actions
62a8a8bf4b ci: Bump version to 0.9.103 [skip ci] 2026-01-13 06:39:39 +05:00
96038cfcf4 logging work - almost there
All checks were successful
Deploy to Test Environment / deploy-to-test (push) Successful in 15m51s
2026-01-12 17:38:58 -08:00
Gitea Actions
981214fdd0 ci: Bump version to 0.9.102 [skip ci] 2026-01-13 06:27:55 +05:00
92b0138108 logging work - almost there
Some checks failed
Deploy to Test Environment / deploy-to-test (push) Failing after 1m2s
2026-01-12 17:26:59 -08:00
Gitea Actions
27f0255240 ci: Bump version to 0.9.101 [skip ci] 2026-01-13 05:57:55 +05:00
4e06dde9e1 logging work - almost there
All checks were successful
Deploy to Test Environment / deploy-to-test (push) Successful in 15m30s
2026-01-12 16:57:18 -08:00
Gitea Actions
b9a0e5b82c ci: Bump version to 0.9.100 [skip ci] 2026-01-13 05:35:11 +05:00
bb7fe8dc2c logging work - almost there
All checks were successful
Deploy to Test Environment / deploy-to-test (push) Successful in 15m28s
2026-01-12 16:34:18 -08:00
Gitea Actions
81f1f2250b ci: Bump version to 0.9.99 [skip ci] 2026-01-13 05:08:56 +05:00
c6c90bb615 more new feature fixes + sentry logging
All checks were successful
Deploy to Test Environment / deploy-to-test (push) Successful in 15m53s
2026-01-12 16:08:18 -08:00
Gitea Actions
60489a626b ci: Bump version to 0.9.98 [skip ci] 2026-01-13 05:05:59 +05:00
3c63e1ecbb more new feature fixes + sentry logging
Some checks failed
Deploy to Test Environment / deploy-to-test (push) Has been cancelled
2026-01-12 16:04:09 -08:00
Gitea Actions
acbcb39cbe ci: Bump version to 0.9.97 [skip ci] 2026-01-13 03:34:42 +05:00
a87a0b6af1 unit test repairs
All checks were successful
Deploy to Test Environment / deploy-to-test (push) Successful in 17m12s
2026-01-12 14:31:41 -08:00
Gitea Actions
abdc3cb6db ci: Bump version to 0.9.96 [skip ci] 2026-01-13 00:52:54 +05:00
7a1bd50119 unit test repairs
All checks were successful
Deploy to Test Environment / deploy-to-test (push) Successful in 17m42s
2026-01-12 11:51:48 -08:00
Gitea Actions
87d75d0571 ci: Bump version to 0.9.95 [skip ci] 2026-01-13 00:04:10 +05:00
faf2900c28 unit test repairs
All checks were successful
Deploy to Test Environment / deploy-to-test (push) Successful in 16m43s
2026-01-12 10:58:00 -08:00
Gitea Actions
5258efc179 ci: Bump version to 0.9.94 [skip ci] 2026-01-12 21:11:57 +05:00
2a5cc5bb51 unit test repairs
Some checks failed
Deploy to Test Environment / deploy-to-test (push) Failing after 1m17s
2026-01-12 08:10:37 -08:00
Gitea Actions
8eaee2844f ci: Bump version to 0.9.93 [skip ci] 2026-01-12 08:57:24 +05:00
440a19c3a7 whoa - so much - new features (UPC,etc) - Sentry for app logging! so much more !
All checks were successful
Deploy to Test Environment / deploy-to-test (push) Successful in 14m53s
2026-01-11 19:55:10 -08:00
4ae6d84240 sql fix
Some checks failed
Deploy to Test Environment / deploy-to-test (push) Has been cancelled
2026-01-11 19:49:13 -08:00
Gitea Actions
5870e5c614 ci: Bump version to 0.9.92 [skip ci] 2026-01-12 08:20:09 +05:00
2e7ebbd9ed whoa - so much - new features (UPC,etc) - Sentry for app logging! so much more !
All checks were successful
Deploy to Test Environment / deploy-to-test (push) Successful in 14m47s
2026-01-11 19:18:52 -08:00
Gitea Actions
dc3fa21359 ci: Bump version to 0.9.91 [skip ci] 2026-01-12 08:08:50 +05:00
11aeac5edd whoa - so much - new features (UPC,etc) - Sentry for app logging! so much more !
Some checks failed
Deploy to Test Environment / deploy-to-test (push) Failing after 1m10s
2026-01-11 19:07:02 -08:00
Gitea Actions
f6c0c082bc ci: Bump version to 0.9.90 [skip ci] 2026-01-11 15:05:48 +05:00
4e22213cd1 all the new shiny things
All checks were successful
Deploy to Test Environment / deploy-to-test (push) Successful in 15m54s
2026-01-11 02:04:52 -08:00
Gitea Actions
9815eb3686 ci: Bump version to 0.9.89 [skip ci] 2026-01-11 12:58:20 +05:00
2bf4a7c1e6 google + github oauth
All checks were successful
Deploy to Test Environment / deploy-to-test (push) Successful in 15m39s
2026-01-10 23:57:18 -08:00
Gitea Actions
5eed3f51f4 ci: Bump version to 0.9.88 [skip ci] 2026-01-11 12:01:25 +05:00
d250932c05 all tests fixed? can it be?
All checks were successful
Deploy to Test Environment / deploy-to-test (push) Successful in 15m28s
2026-01-10 22:58:38 -08:00
Gitea Actions
7d1f964574 ci: Bump version to 0.9.87 [skip ci] 2026-01-11 08:30:29 +05:00
3b69e58de3 remove useless windows testing files, fix testing?
All checks were successful
Deploy to Test Environment / deploy-to-test (push) Successful in 14m1s
2026-01-10 19:29:54 -08:00
Gitea Actions
5211aadd22 ci: Bump version to 0.9.86 [skip ci] 2026-01-11 08:05:21 +05:00
a997d1d0b0 ranstack query fixes
All checks were successful
Deploy to Test Environment / deploy-to-test (push) Successful in 13m21s
2026-01-10 19:03:40 -08:00
cf5f77c58e Adopt TanStack Query fixes 2026-01-10 19:02:42 -08:00
Gitea Actions
cf0f5bb820 ci: Bump version to 0.9.85 [skip ci] 2026-01-11 06:44:28 +05:00
503e7084da Adopt TanStack Query fixes
All checks were successful
Deploy to Test Environment / deploy-to-test (push) Successful in 14m41s
2026-01-10 17:42:45 -08:00
Gitea Actions
d8aa19ac40 ci: Bump version to 0.9.84 [skip ci] 2026-01-10 23:45:42 +05:00
dcd9452b8c Adopt TanStack Query
All checks were successful
Deploy to Test Environment / deploy-to-test (push) Successful in 13m46s
2026-01-10 10:45:10 -08:00
Gitea Actions
6d468544e2 ci: Bump version to 0.9.83 [skip ci] 2026-01-10 23:14:18 +05:00
2913c7aa09 tanstack
Some checks failed
Deploy to Test Environment / deploy-to-test (push) Failing after 1m1s
2026-01-10 03:20:40 -08:00
Gitea Actions
77f9cb6081 ci: Bump version to 0.9.82 [skip ci] 2026-01-10 12:17:24 +05:00
2f1d73ca12 fix(tests): access wrapped API response data correctly
Some checks failed
Deploy to Test Environment / deploy-to-test (push) Failing after 3h0m5s
Tests were accessing response.body directly instead of response.body.data,
causing failures since sendSuccess() wraps responses in { success, data }.
2026-01-09 23:16:30 -08:00
Gitea Actions
402e2617ca ci: Bump version to 0.9.81 [skip ci] 2026-01-10 11:40:07 +05:00
e14c19c112 linting docs + some fixes go claude and gemini
All checks were successful
Deploy to Test Environment / deploy-to-test (push) Successful in 16m0s
2026-01-09 22:38:57 -08:00
Gitea Actions
ea46f66c7a ci: Bump version to 0.9.80 [skip ci] 2026-01-10 11:00:30 +05:00
a42ee5a461 unit tests - wheeee! Claude is the mvp
All checks were successful
Deploy to Test Environment / deploy-to-test (push) Successful in 15m11s
2026-01-09 21:59:09 -08:00
Gitea Actions
71710c8316 ci: Bump version to 0.9.79 [skip ci] 2026-01-10 09:32:36 +05:00
1480a73ab0 more compliance
Some checks failed
Deploy to Test Environment / deploy-to-test (push) Failing after 58s
2026-01-09 20:30:52 -08:00
326 changed files with 48568 additions and 4134 deletions

16
.claude/hooks.json Normal file
View File

@@ -0,0 +1,16 @@
{
"$schema": "https://claude.ai/schemas/hooks.json",
"hooks": {
"PreToolUse": [
{
"matcher": "Bash",
"hooks": [
{
"type": "command",
"command": "node -e \"const cmd = process.argv[1] || ''; const isTest = /\\b(npm\\s+(run\\s+)?test|vitest|jest)\\b/i.test(cmd); const isWindows = process.platform === 'win32'; const inContainer = process.env.REMOTE_CONTAINERS === 'true' || process.env.DEVCONTAINER === 'true'; if (isTest && isWindows && !inContainer) { console.error('BLOCKED: Tests must run on Linux. Use Dev Container (Reopen in Container) or WSL.'); process.exit(1); }\" -- \"$CLAUDE_TOOL_INPUT\""
}
]
}
]
}
}
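For readability, here is the same guard expanded into a standalone script (a sketch only — the shipped hook is the inline `node -e` one-liner above, which receives the command via `-- "$CLAUDE_TOOL_INPUT"`; a standalone script would read it from `process.argv[2]`):
```typescript
// Expanded sketch of the PreToolUse guard above. Blocks test commands on a
// Windows host unless the session is running inside a dev container.
const cmd: string = process.argv[2] ?? '';

// Matches `npm test`, `npm run test`, `vitest`, or `jest` invocations.
const isTestCommand = /\b(npm\s+(run\s+)?test|vitest|jest)\b/i.test(cmd);
const isWindowsHost = process.platform === 'win32';
const inDevContainer =
  process.env.REMOTE_CONTAINERS === 'true' || process.env.DEVCONTAINER === 'true';

if (isTestCommand && isWindowsHost && !inDevContainer) {
  console.error('BLOCKED: Tests must run on Linux. Use Dev Container (Reopen in Container) or WSL.');
  process.exit(1); // a non-zero exit tells Claude Code to block the tool call
}
```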

View File

@@ -18,11 +18,9 @@
"Bash(PGPASSWORD=postgres psql:*)",
"Bash(npm search:*)",
"Bash(npx:*)",
"Bash(curl -s -H \"Authorization: token c72bc0f14f623fec233d3c94b3a16397fe3649ef\" https://gitea.projectium.com/api/v1/user)",
"Bash(curl:*)",
"Bash(powershell:*)",
"Bash(cmd.exe:*)",
"Bash(export NODE_ENV=test DB_HOST=localhost DB_USER=postgres DB_PASSWORD=postgres DB_NAME=flyer_crawler_dev REDIS_URL=redis://localhost:6379 FRONTEND_URL=http://localhost:5173 JWT_SECRET=test-jwt-secret:*)",
"Bash(npm run test:integration:*)",
"Bash(grep:*)",
"Bash(done)",
@@ -79,7 +77,28 @@
"Bash(npm run lint)",
"Bash(npm run typecheck:*)",
"Bash(npm run type-check:*)",
"Bash(npm run test:unit:*)"
"Bash(npm run test:unit:*)",
"mcp__filesystem__move_file",
"Bash(git checkout:*)",
"Bash(podman image inspect:*)",
"Bash(node -e:*)",
"Bash(xargs -I {} sh -c 'if ! grep -q \"\"vi.mock.*apiClient\"\" \"\"{}\"\"; then echo \"\"{}\"\"; fi')",
"Bash(MSYS_NO_PATHCONV=1 podman exec:*)",
"Bash(docker ps:*)",
"Bash(find:*)",
"Bash(\"/c/Users/games3/.local/bin/uvx.exe\" markitdown-mcp --help)",
"Bash(git stash:*)",
"Bash(ping:*)",
"Bash(tee:*)",
"Bash(timeout 1800 podman exec flyer-crawler-dev npm run test:unit:*)",
"mcp__filesystem__edit_file",
"Bash(timeout 300 tail:*)",
"mcp__filesystem__list_allowed_directories",
"mcp__memory__add_observations",
"Bash(ssh:*)",
"mcp__redis__list",
"Read(//d/gitea/bugsink-mcp/**)",
"Bash(d:/nodejs/npm.cmd install)"
]
}
}

View File

@@ -41,6 +41,14 @@ FRONTEND_URL=http://localhost:3000
# REQUIRED: Secret key for signing JWT tokens (generate a random 64+ character string)
JWT_SECRET=your-super-secret-jwt-key-change-this-in-production
# OAuth Providers (Optional - enable social login)
# Google OAuth - https://console.cloud.google.com/apis/credentials
GOOGLE_CLIENT_ID=
GOOGLE_CLIENT_SECRET=
# GitHub OAuth - https://github.com/settings/developers
GITHUB_CLIENT_ID=
GITHUB_CLIENT_SECRET=
# ===================
# AI/ML Services
# ===================
@@ -75,3 +83,22 @@ CLEANUP_WORKER_CONCURRENCY=10
# Worker lock duration in milliseconds (default: 2 minutes)
WORKER_LOCK_DURATION=120000
# ===================
# Error Tracking (ADR-015)
# ===================
# Sentry-compatible error tracking via Bugsink (self-hosted)
# DSNs are created in Bugsink UI at http://localhost:8000 (dev) or your production URL
# Backend DSN - for Express/Node.js errors
SENTRY_DSN=
# Frontend DSN - for React/browser errors (uses VITE_ prefix)
VITE_SENTRY_DSN=
# Environment name for error grouping (defaults to NODE_ENV)
SENTRY_ENVIRONMENT=development
VITE_SENTRY_ENVIRONMENT=development
# Enable/disable error tracking (default: true)
SENTRY_ENABLED=true
VITE_SENTRY_ENABLED=true
# Enable debug mode for SDK troubleshooting (default: false)
SENTRY_DEBUG=false
VITE_SENTRY_DEBUG=false

View File

@@ -1,66 +0,0 @@
{
"mcpServers": {
"gitea-projectium": {
"command": "d:\\gitea-mcp\\gitea-mcp.exe",
"args": ["run", "-t", "stdio"],
"env": {
"GITEA_HOST": "https://gitea.projectium.com",
"GITEA_ACCESS_TOKEN": "c72bc0f14f623fec233d3c94b3a16397fe3649ef"
}
},
"gitea-torbonium": {
"command": "d:\\gitea-mcp\\gitea-mcp.exe",
"args": ["run", "-t", "stdio"],
"env": {
"GITEA_HOST": "https://gitea.torbonium.com",
"GITEA_ACCESS_TOKEN": "391c9ddbe113378bc87bb8184800ba954648fcf8"
}
},
"gitea-lan": {
"command": "d:\\gitea-mcp\\gitea-mcp.exe",
"args": ["run", "-t", "stdio"],
"env": {
"GITEA_HOST": "https://gitea.torbolan.com",
"GITEA_ACCESS_TOKEN": "YOUR_LAN_TOKEN_HERE"
},
"disabled": true
},
"podman": {
"command": "D:\\nodejs\\npx.cmd",
"args": ["-y", "podman-mcp-server@latest"],
"env": {
"DOCKER_HOST": "npipe:////./pipe/podman-machine-default"
}
},
"filesystem": {
"command": "d:\\nodejs\\node.exe",
"args": [
"c:\\Users\\games3\\AppData\\Roaming\\npm\\node_modules\\@modelcontextprotocol\\server-filesystem\\dist\\index.js",
"d:\\gitea\\flyer-crawler.projectium.com\\flyer-crawler.projectium.com"
]
},
"fetch": {
"command": "D:\\nodejs\\npx.cmd",
"args": ["-y", "@modelcontextprotocol/server-fetch"]
},
"io.github.ChromeDevTools/chrome-devtools-mcp": {
"type": "stdio",
"command": "npx",
"args": ["chrome-devtools-mcp@0.12.1"],
"gallery": "https://api.mcp.github.com",
"version": "0.12.1"
},
"markitdown": {
"command": "C:\\Users\\games3\\.local\\bin\\uvx.exe",
"args": ["markitdown-mcp"]
},
"sequential-thinking": {
"command": "D:\\nodejs\\npx.cmd",
"args": ["-y", "@modelcontextprotocol/server-sequential-thinking"]
},
"memory": {
"command": "D:\\nodejs\\npx.cmd",
"args": ["-y", "@modelcontextprotocol/server-memory"]
}
}
}

View File

@@ -63,8 +63,8 @@ jobs:
- name: Check for Production Database Schema Changes
env:
DB_HOST: ${{ secrets.DB_HOST }}
DB_USER: ${{ secrets.DB_USER }}
DB_PASSWORD: ${{ secrets.DB_PASSWORD }}
DB_USER: ${{ secrets.DB_USER_PROD }}
DB_PASSWORD: ${{ secrets.DB_PASSWORD_PROD }}
DB_NAME: ${{ secrets.DB_DATABASE_PROD }}
run: |
if [ -z "$DB_HOST" ] || [ -z "$DB_USER" ] || [ -z "$DB_PASSWORD" ] || [ -z "$DB_NAME" ]; then
@@ -98,6 +98,9 @@ jobs:
VITE_APP_VERSION="$(date +'%Y%m%d-%H%M'):$(git rev-parse --short HEAD):$PACKAGE_VERSION" \
VITE_APP_COMMIT_URL="$GITEA_SERVER_URL/${{ gitea.repository }}/commit/${{ gitea.sha }}" \
VITE_APP_COMMIT_MESSAGE="$COMMIT_MESSAGE" \
VITE_SENTRY_DSN="${{ secrets.VITE_SENTRY_DSN }}" \
VITE_SENTRY_ENVIRONMENT="production" \
VITE_SENTRY_ENABLED="true" \
VITE_API_BASE_URL=/api VITE_API_KEY=${{ secrets.VITE_GOOGLE_GENAI_API_KEY }} npm run build
- name: Deploy Application to Production Server
@@ -114,8 +117,8 @@ jobs:
env:
# --- Production Secrets Injection ---
DB_HOST: ${{ secrets.DB_HOST }}
DB_USER: ${{ secrets.DB_USER }}
DB_PASSWORD: ${{ secrets.DB_PASSWORD }}
DB_USER: ${{ secrets.DB_USER_PROD }}
DB_PASSWORD: ${{ secrets.DB_PASSWORD_PROD }}
DB_NAME: ${{ secrets.DB_DATABASE_PROD }}
# Explicitly use database 0 for production (test uses database 1)
REDIS_URL: 'redis://localhost:6379/0'
@@ -130,6 +133,15 @@ jobs:
SMTP_USER: ''
SMTP_PASS: ''
SMTP_FROM_EMAIL: 'noreply@flyer-crawler.projectium.com'
# OAuth Providers
GOOGLE_CLIENT_ID: ${{ secrets.GOOGLE_CLIENT_ID }}
GOOGLE_CLIENT_SECRET: ${{ secrets.GOOGLE_CLIENT_SECRET }}
GITHUB_CLIENT_ID: ${{ secrets.GH_CLIENT_ID }}
GITHUB_CLIENT_SECRET: ${{ secrets.GH_CLIENT_SECRET }}
# Sentry/Bugsink Error Tracking (ADR-015)
SENTRY_DSN: ${{ secrets.SENTRY_DSN }}
SENTRY_ENVIRONMENT: 'production'
SENTRY_ENABLED: 'true'
run: |
if [ -z "$DB_HOST" ] || [ -z "$DB_USER" ] || [ -z "$DB_PASSWORD" ] || [ -z "$DB_NAME" ]; then
echo "ERROR: One or more production database secrets (DB_HOST, DB_USER, DB_PASSWORD, DB_DATABASE_PROD) are not set."
@@ -159,7 +171,7 @@ jobs:
else
echo "Version mismatch (Running: $RUNNING_VERSION -> Deployed: $NEW_VERSION) or app not running. Reloading PM2..."
fi
pm2 startOrReload ecosystem.config.cjs --env production --update-env && pm2 save
pm2 startOrReload ecosystem.config.cjs --update-env && pm2 save
echo "Production backend server reloaded successfully."
else
echo "Version $NEW_VERSION is already running. Skipping PM2 reload."

View File

@@ -121,10 +121,11 @@ jobs:
env:
# --- Database credentials for the test suite ---
# These are injected from Gitea secrets into the runner's environment.
# CRITICAL: Use TEST-specific credentials that have CREATE privileges on the public schema.
DB_HOST: ${{ secrets.DB_HOST }}
DB_USER: ${{ secrets.DB_USER }}
DB_PASSWORD: ${{ secrets.DB_PASSWORD }}
DB_NAME: 'flyer-crawler-test' # Explicitly set for tests
DB_USER: ${{ secrets.DB_USER_TEST }}
DB_PASSWORD: ${{ secrets.DB_PASSWORD_TEST }}
DB_NAME: ${{ secrets.DB_DATABASE_TEST }}
# --- Redis credentials for the test suite ---
# CRITICAL: Use Redis database 1 to isolate tests from production (which uses db 0).
@@ -198,8 +199,8 @@ jobs:
--reporter=verbose --includeTaskLocation --testTimeout=10000 --silent=passed-only || true
echo "--- Running E2E Tests ---"
# Run E2E tests using the dedicated E2E config which inherits from integration config.
# We still pass --coverage to enable it, but directory and timeout are now in the config.
# Run E2E tests using the dedicated E2E config.
# E2E uses port 3098, integration uses 3099 to avoid conflicts.
npx vitest run --config vitest.config.e2e.ts --coverage \
--coverage.exclude='**/*.test.ts' \
--coverage.exclude='**/tests/**' \
@@ -240,7 +241,19 @@ jobs:
# Run c8: read raw files from the temp dir, and output an Istanbul JSON report.
# We only generate the 'json' report here because it's all nyc needs for merging.
echo "Server coverage report about to be generated..."
npx c8 report --exclude='**/*.test.ts' --exclude='**/tests/**' --exclude='**/mocks/**' --reporter=json --temp-directory .coverage/tmp/integration-server --reports-dir .coverage/integration-server
npx c8 report \
--include='src/**' \
--exclude='**/*.test.ts' \
--exclude='**/*.test.tsx' \
--exclude='**/tests/**' \
--exclude='**/mocks/**' \
--exclude='hostexecutor/**' \
--exclude='scripts/**' \
--exclude='*.config.js' \
--exclude='*.config.ts' \
--reporter=json \
--temp-directory .coverage/tmp/integration-server \
--reports-dir .coverage/integration-server
echo "Server coverage report generated. Verifying existence:"
ls -l .coverage/integration-server/coverage-final.json
@@ -280,12 +293,18 @@ jobs:
--reporter=html \
--report-dir .coverage/ \
--temp-dir "$NYC_SOURCE_DIR" \
--include "src/**" \
--exclude "**/*.test.ts" \
--exclude "**/*.test.tsx" \
--exclude "**/tests/**" \
--exclude "**/mocks/**" \
--exclude "**/index.tsx" \
--exclude "**/vite-env.d.ts" \
--exclude "**/vitest.setup.ts"
--exclude "**/vitest.setup.ts" \
--exclude "hostexecutor/**" \
--exclude "scripts/**" \
--exclude "*.config.js" \
--exclude "*.config.ts"
# Re-enable secret masking for subsequent steps.
echo "::secret-masking::"
@@ -310,10 +329,11 @@ jobs:
- name: Check for Test Database Schema Changes
env:
# Use test database credentials for this check.
# CRITICAL: Use TEST-specific credentials that have CREATE privileges on the public schema.
DB_HOST: ${{ secrets.DB_HOST }}
DB_USER: ${{ secrets.DB_USER }}
DB_PASSWORD: ${{ secrets.DB_PASSWORD }} # This is used by psql
DB_NAME: ${{ secrets.DB_DATABASE_TEST }} # This is used by the application
DB_USER: ${{ secrets.DB_USER_TEST }}
DB_PASSWORD: ${{ secrets.DB_PASSWORD_TEST }}
DB_NAME: ${{ secrets.DB_DATABASE_TEST }}
run: |
# Fail-fast check to ensure secrets are configured in Gitea.
if [ -z "$DB_HOST" ] || [ -z "$DB_USER" ] || [ -z "$DB_PASSWORD" ] || [ -z "$DB_NAME" ]; then
@@ -368,6 +388,9 @@ jobs:
VITE_APP_VERSION="$(date +'%Y%m%d-%H%M'):$(git rev-parse --short HEAD):$PACKAGE_VERSION" \
VITE_APP_COMMIT_URL="$GITEA_SERVER_URL/${{ gitea.repository }}/commit/${{ gitea.sha }}" \
VITE_APP_COMMIT_MESSAGE="$COMMIT_MESSAGE" \
VITE_SENTRY_DSN="${{ secrets.VITE_SENTRY_DSN_TEST }}" \
VITE_SENTRY_ENVIRONMENT="test" \
VITE_SENTRY_ENABLED="true" \
VITE_API_BASE_URL="https://flyer-crawler-test.projectium.com/api" VITE_API_KEY=${{ secrets.VITE_GOOGLE_GENAI_API_KEY_TEST }} npm run build
- name: Deploy Application to Test Server
@@ -406,9 +429,10 @@ jobs:
# Your Node.js application will read these directly from `process.env`.
# Database Credentials
# CRITICAL: Use TEST-specific credentials that have CREATE privileges on the public schema.
DB_HOST: ${{ secrets.DB_HOST }}
DB_USER: ${{ secrets.DB_USER }}
DB_PASSWORD: ${{ secrets.DB_PASSWORD }}
DB_USER: ${{ secrets.DB_USER_TEST }}
DB_PASSWORD: ${{ secrets.DB_PASSWORD_TEST }}
DB_NAME: ${{ secrets.DB_DATABASE_TEST }}
# Redis Credentials (use database 1 to isolate from production)
@@ -428,6 +452,10 @@ jobs:
SMTP_USER: '' # Using MailHog, no auth needed
SMTP_PASS: '' # Using MailHog, no auth needed
SMTP_FROM_EMAIL: 'noreply@flyer-crawler-test.projectium.com'
# Sentry/Bugsink Error Tracking (ADR-015)
SENTRY_DSN: ${{ secrets.SENTRY_DSN_TEST }}
SENTRY_ENVIRONMENT: 'test'
SENTRY_ENABLED: 'true'
run: |
# Fail-fast check to ensure secrets are configured in Gitea.
@@ -451,10 +479,11 @@ jobs:
echo "Cleaning up errored or stopped PM2 processes..."
node -e "const exec = require('child_process').execSync; try { const list = JSON.parse(exec('pm2 jlist').toString()); list.forEach(p => { if (p.pm2_env.status === 'errored' || p.pm2_env.status === 'stopped') { console.log('Deleting ' + p.pm2_env.status + ' process: ' + p.name + ' (' + p.pm2_env.pm_id + ')'); try { exec('pm2 delete ' + p.pm2_env.pm_id); } catch(e) { console.error('Failed to delete ' + p.pm2_env.pm_id); } } }); } catch (e) { console.error('Error cleaning up processes:', e); }"
# Use `startOrReload` with the ecosystem file. This is the standard, idempotent way to deploy.
# It will START the process if it's not running, or RELOAD it if it is.
# Use `startOrReload` with the TEST ecosystem file. This starts test-specific processes
# (flyer-crawler-api-test, flyer-crawler-worker-test, flyer-crawler-analytics-worker-test)
# that run separately from production processes.
# We also add `&& pm2 save` to persist the process list across server reboots.
pm2 startOrReload ecosystem.config.cjs --env test --update-env && pm2 save
pm2 startOrReload ecosystem-test.config.cjs --update-env && pm2 save
echo "Test backend server reloaded successfully."
# After a successful deployment, update the schema hash in the database.

View File

@@ -20,9 +20,9 @@ jobs:
# Use production database credentials for this entire job.
DB_HOST: ${{ secrets.DB_HOST }}
DB_PORT: ${{ secrets.DB_PORT }}
DB_USER: ${{ secrets.DB_USER }}
DB_PASSWORD: ${{ secrets.DB_PASSWORD }}
DB_NAME: ${{ secrets.DB_NAME_PROD }}
DB_USER: ${{ secrets.DB_USER_PROD }}
DB_PASSWORD: ${{ secrets.DB_PASSWORD_PROD }}
DB_NAME: ${{ secrets.DB_DATABASE_PROD }}
steps:
- name: Validate Secrets

View File

@@ -23,9 +23,9 @@ jobs:
env:
# Use production database credentials for this entire job.
DB_HOST: ${{ secrets.DB_HOST }}
DB_USER: ${{ secrets.DB_USER }}
DB_PASSWORD: ${{ secrets.DB_PASSWORD }} # Used by psql
DB_NAME: ${{ secrets.DB_DATABASE_PROD }} # Used by the application
DB_USER: ${{ secrets.DB_USER_PROD }}
DB_PASSWORD: ${{ secrets.DB_PASSWORD_PROD }}
DB_NAME: ${{ secrets.DB_DATABASE_PROD }}
steps:
- name: Checkout Code

View File

@@ -23,9 +23,9 @@ jobs:
env:
# Use test database credentials for this entire job.
DB_HOST: ${{ secrets.DB_HOST }}
DB_USER: ${{ secrets.DB_USER }}
DB_PASSWORD: ${{ secrets.DB_PASSWORD }} # Used by psql
DB_NAME: ${{ secrets.DB_DATABASE_TEST }} # Used by the application
DB_USER: ${{ secrets.DB_USER_TEST }}
DB_PASSWORD: ${{ secrets.DB_PASSWORD_TEST }}
DB_NAME: ${{ secrets.DB_DATABASE_TEST }}
steps:
- name: Checkout Code

View File

@@ -22,8 +22,8 @@ jobs:
env:
# Use production database credentials for this entire job.
DB_HOST: ${{ secrets.DB_HOST }}
DB_USER: ${{ secrets.DB_USER }}
DB_PASSWORD: ${{ secrets.DB_PASSWORD }}
DB_USER: ${{ secrets.DB_USER_PROD }}
DB_PASSWORD: ${{ secrets.DB_PASSWORD_PROD }}
DB_NAME: ${{ secrets.DB_DATABASE_PROD }}
BACKUP_DIR: '/var/www/backups' # Define a dedicated directory for backups

View File

@@ -62,8 +62,8 @@ jobs:
- name: Check for Production Database Schema Changes
env:
DB_HOST: ${{ secrets.DB_HOST }}
DB_USER: ${{ secrets.DB_USER }}
DB_PASSWORD: ${{ secrets.DB_PASSWORD }}
DB_USER: ${{ secrets.DB_USER_PROD }}
DB_PASSWORD: ${{ secrets.DB_PASSWORD_PROD }}
DB_NAME: ${{ secrets.DB_DATABASE_PROD }}
run: |
if [ -z "$DB_HOST" ] || [ -z "$DB_USER" ] || [ -z "$DB_PASSWORD" ] || [ -z "$DB_NAME" ]; then
@@ -113,8 +113,8 @@ jobs:
env:
# --- Production Secrets Injection ---
DB_HOST: ${{ secrets.DB_HOST }}
DB_USER: ${{ secrets.DB_USER }}
DB_PASSWORD: ${{ secrets.DB_PASSWORD }}
DB_USER: ${{ secrets.DB_USER_PROD }}
DB_PASSWORD: ${{ secrets.DB_PASSWORD_PROD }}
DB_NAME: ${{ secrets.DB_DATABASE_PROD }}
# Explicitly use database 0 for production (test uses database 1)
REDIS_URL: 'redis://localhost:6379/0'

16
.gitignore vendored
View File

@@ -11,6 +11,18 @@ node_modules
dist
dist-ssr
*.local
.env
*.tsbuildinfo
# Test coverage
coverage
.nyc_output
.coverage
# Test artifacts - flyer-images/ is a runtime directory
# Test fixtures are stored in src/tests/assets/ instead
flyer-images/
test-output.txt
# Editor directories and files
.vscode/*
@@ -22,3 +34,7 @@ dist-ssr
*.njsproj
*.sln
*.sw?
Thumbs.db
.claude
nul
tmpclaude*

5
.nycrc.json Normal file
View File

@@ -0,0 +1,5 @@
{
"text": {
"maxCols": 200
}
}

110
AUTHENTICATION.md Normal file
View File

@@ -0,0 +1,110 @@
# Authentication Setup
Flyer Crawler supports OAuth authentication via Google and GitHub. This guide walks through configuring both providers.
---
## Google OAuth
### Step 1: Create OAuth Credentials
1. Go to the [Google Cloud Console](https://console.cloud.google.com/)
2. Create a new project (or select an existing one)
3. Navigate to **APIs & Services > Credentials**
4. Click **Create Credentials > OAuth client ID**
5. Select **Web application** as the application type
### Step 2: Configure Authorized Redirect URIs
Add the callback URL where Google will redirect users after authentication:
| Environment | Redirect URI |
| ----------- | -------------------------------------------------- |
| Development | `http://localhost:3001/api/auth/google/callback` |
| Production | `https://your-domain.com/api/auth/google/callback` |
### Step 3: Save Credentials
After clicking **Create**, you'll receive:
- **Client ID**
- **Client Secret**
Store these securely as environment variables:
- `GOOGLE_CLIENT_ID`
- `GOOGLE_CLIENT_SECRET`
---
## GitHub OAuth
### Step 1: Create OAuth App
1. Go to your [GitHub Developer Settings](https://github.com/settings/developers)
2. Navigate to **OAuth Apps**
3. Click **New OAuth App**
### Step 2: Fill in Application Details
| Field | Value |
| -------------------------- | ---------------------------------------------------- |
| Application name | Flyer Crawler (or your preferred name) |
| Homepage URL | `http://localhost:5173` (dev) or your production URL |
| Authorization callback URL | `http://localhost:3001/api/auth/github/callback` |
### Step 3: Save GitHub Credentials
After clicking **Register application**, you'll receive:
- **Client ID**
- **Client Secret**
Store these securely as environment variables:
- `GITHUB_CLIENT_ID`
- `GITHUB_CLIENT_SECRET`
---
## Environment Variables Summary
| Variable | Description |
| ---------------------- | ---------------------------------------- |
| `GOOGLE_CLIENT_ID` | Google OAuth client ID |
| `GOOGLE_CLIENT_SECRET` | Google OAuth client secret |
| `GITHUB_CLIENT_ID` | GitHub OAuth client ID |
| `GITHUB_CLIENT_SECRET` | GitHub OAuth client secret |
| `JWT_SECRET` | Secret for signing authentication tokens |
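How these variables are consumed depends on the application's auth layer. As a rough sketch, assuming a Passport.js setup (the strategy import and verify callback below are illustrative, not the repo's actual wiring):
```typescript
// Hypothetical sketch: registering the Google OAuth strategy with Passport.js.
// Assumes `passport` and `passport-google-oauth20` are installed.
import passport from 'passport';
import { Strategy as GoogleStrategy } from 'passport-google-oauth20';

passport.use(
  new GoogleStrategy(
    {
      clientID: process.env.GOOGLE_CLIENT_ID!,
      clientSecret: process.env.GOOGLE_CLIENT_SECRET!,
      // Must exactly match the redirect URI registered in the Google Cloud Console.
      callbackURL: '/api/auth/google/callback',
    },
    (_accessToken, _refreshToken, profile, done) => {
      // Look up or create the local user for this Google profile here.
      done(null, { id: profile.id, email: profile.emails?.[0]?.value });
    },
  ),
);
```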
---
## Production Considerations
When deploying to production:
1. **Update redirect URIs** in both Google Cloud Console and GitHub OAuth settings to use your production domain
2. **Use HTTPS** for all callback URLs in production
3. **Store secrets securely** using your CI/CD platform's secrets management (e.g., Gitea repository secrets)
---
## Troubleshooting
### "redirect_uri_mismatch" Error
The callback URL in your OAuth provider settings doesn't match what the application is sending. Verify:
- The URL is exactly correct (no trailing slashes, correct port)
- You're using the right environment (dev vs production URLs)
### "invalid_client" Error
The Client ID or Client Secret is incorrect. Double-check your environment variables.
---
## Related Documentation
- [Installation Guide](INSTALL.md) - Local development setup
- [Deployment Guide](DEPLOYMENT.md) - Production deployment

378
CLAUDE-MCP.md Normal file
View File

@@ -0,0 +1,378 @@
# Claude Code MCP Configuration Guide
This document explains how to configure MCP (Model Context Protocol) servers for Claude Code, covering both the CLI and VS Code extension.
## The Two Config Files
Claude Code uses **two separate configuration files** for MCP servers. They must be kept in sync manually.
| File | Used By | Notes |
| ------------------------- | ----------------------------- | ------------------------------------------- |
| `~/.claude.json` | Claude CLI (`claude` command) | Requires `"type": "stdio"` in each server |
| `~/.claude/settings.json` | VS Code Extension | Simpler format, supports `"disabled": true` |
**Important:** Changes to one file do NOT automatically sync to the other!
## File Locations (Windows)
```text
C:\Users\<username>\.claude.json # CLI config
C:\Users\<username>\.claude\settings.json # VS Code extension config
```
## Config Format Differences
### VS Code Extension Format (`~/.claude/settings.json`)
```json
{
"mcpServers": {
"server-name": {
"command": "path/to/executable",
"args": ["arg1", "arg2"],
"env": {
"ENV_VAR": "value"
},
"disabled": true // Optional - disable without removing
}
}
}
```
### CLI Format (`~/.claude.json`)
The CLI config is a larger file with many settings. The `mcpServers` section is nested within it:
```json
{
"numStartups": 14,
"installMethod": "global",
// ... other settings ...
"mcpServers": {
"server-name": {
"type": "stdio", // REQUIRED for CLI
"command": "path/to/executable",
"args": ["arg1", "arg2"],
"env": {
"ENV_VAR": "value"
}
}
}
// ... more settings ...
}
```
**Key difference:** CLI format requires `"type": "stdio"` in each server definition.
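Since the two files must be kept in sync by hand, a small script can do the copying. A minimal sketch, assuming both files live at the standard Windows locations shown above (this helper is not part of any repo — back up both files before trying it):
```typescript
// Hypothetical helper: mirror mcpServers from the VS Code config into the CLI
// config, adding the "type": "stdio" field the CLI requires.
import { readFileSync, writeFileSync } from 'node:fs';
import { join } from 'node:path';
import { homedir } from 'node:os';

const vsCodePath = join(homedir(), '.claude', 'settings.json');
const cliPath = join(homedir(), '.claude.json');

const vsCode = JSON.parse(readFileSync(vsCodePath, 'utf8'));
const cli = JSON.parse(readFileSync(cliPath, 'utf8'));

cli.mcpServers = cli.mcpServers ?? {};
for (const [name, server] of Object.entries(vsCode.mcpServers ?? {})) {
  const { disabled, ...rest } = server as Record<string, unknown>;
  if (disabled) continue; // the CLI config has no "disabled" flag, so skip these
  cli.mcpServers[name] = { type: 'stdio', ...rest }; // CLI requires "type": "stdio"
}

writeFileSync(cliPath, JSON.stringify(cli, null, 2));
console.log(`Synced ${Object.keys(cli.mcpServers).length} servers into ${cliPath}`);
```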
## Common MCP Server Examples
### Memory (Knowledge Graph)
```json
// VS Code format
"memory": {
"command": "D:\\nodejs\\npx.cmd",
"args": ["-y", "@modelcontextprotocol/server-memory"]
}
// CLI format
"memory": {
"type": "stdio",
"command": "D:\\nodejs\\npx.cmd",
"args": ["-y", "@modelcontextprotocol/server-memory"],
"env": {}
}
```
### Filesystem
```json
// VS Code format
"filesystem": {
"command": "d:\\nodejs\\node.exe",
"args": [
"c:\\Users\\<user>\\AppData\\Roaming\\npm\\node_modules\\@modelcontextprotocol\\server-filesystem\\dist\\index.js",
"d:\\path\\to\\project"
]
}
// CLI format
"filesystem": {
"type": "stdio",
"command": "d:\\nodejs\\node.exe",
"args": [
"c:\\Users\\<user>\\AppData\\Roaming\\npm\\node_modules\\@modelcontextprotocol\\server-filesystem\\dist\\index.js",
"d:\\path\\to\\project"
],
"env": {}
}
```
### Podman/Docker
```json
// VS Code format
"podman": {
"command": "D:\\nodejs\\npx.cmd",
"args": ["-y", "podman-mcp-server@latest"],
"env": {
"DOCKER_HOST": "npipe:////./pipe/podman-machine-default"
}
}
```
### Gitea
```json
// VS Code format
"gitea-myserver": {
"command": "d:\\gitea-mcp\\gitea-mcp.exe",
"args": ["run", "-t", "stdio"],
"env": {
"GITEA_HOST": "https://gitea.example.com",
"GITEA_ACCESS_TOKEN": "your-token-here"
}
}
```
### Redis
```json
// VS Code format
"redis": {
"command": "D:\\nodejs\\npx.cmd",
"args": ["-y", "@modelcontextprotocol/server-redis", "redis://localhost:6379"]
}
```
### Bugsink (Error Tracking)
**Important:** Bugsink has a different API than Sentry. Use `bugsink-mcp`, NOT `sentry-selfhosted-mcp`.
**Note:** The `bugsink-mcp` npm package is NOT published. You must clone and build from source:
```bash
# Clone and build bugsink-mcp
git clone https://github.com/j-shelfwood/bugsink-mcp.git d:\gitea\bugsink-mcp
cd d:\gitea\bugsink-mcp
npm install
npm run build
```
```json
// VS Code format (using locally built version)
"bugsink": {
"command": "d:\\nodejs\\node.exe",
"args": ["d:\\gitea\\bugsink-mcp\\dist\\index.js"],
"env": {
"BUGSINK_URL": "https://bugsink.example.com",
"BUGSINK_TOKEN": "your-api-token"
}
}
// CLI format
"bugsink": {
"type": "stdio",
"command": "d:\\nodejs\\node.exe",
"args": ["d:\\gitea\\bugsink-mcp\\dist\\index.js"],
"env": {
"BUGSINK_URL": "https://bugsink.example.com",
"BUGSINK_TOKEN": "your-api-token"
}
}
```
- GitHub: <https://github.com/j-shelfwood/bugsink-mcp>
- Get token from Bugsink UI: Settings > API Tokens
- **Do NOT use npx** - the package is not on npm
### Sentry (Cloud or Self-hosted)
For actual Sentry instances (not Bugsink), use:
```json
"sentry": {
"command": "D:\\nodejs\\npx.cmd",
"args": ["-y", "@sentry/mcp-server"],
"env": {
"SENTRY_AUTH_TOKEN": "your-sentry-token"
}
}
```
## Troubleshooting
### Server Not Loading
1. **Check both config files** - Make sure the server is defined in both `~/.claude.json` AND `~/.claude/settings.json`
2. **Verify server order** - Servers load sequentially. Broken/slow servers can block others. Put important servers first.
3. **Check for timeout** - Each server has 30 seconds to connect. Slow npx downloads can cause timeouts.
4. **Fully restart VS Code** - Window reload is not enough. Close all VS Code windows and reopen.
### Verifying Configuration
**For CLI:**
```bash
claude mcp list
```
**For VS Code:**
1. Open VS Code
2. View → Output
3. Select "Claude" from the dropdown
4. Look for MCP server connection logs
### Common Errors
| Error | Cause | Solution |
| ------------------------------------ | ----------------------------- | --------------------------------------------------------------------------- |
| `Connection timed out after 30000ms` | Server took too long to start | Move server earlier in config, or use pre-installed packages instead of npx |
| `npm error 404 Not Found` | Package doesn't exist | Check package name spelling |
| `The system cannot find the path` | Wrong executable path | Verify the command path exists |
| `Connection closed` | Server crashed on startup | Check server logs, verify environment variables |
### Disabling Problem Servers
In `~/.claude/settings.json`, add `"disabled": true`:
```json
"problem-server": {
"command": "...",
"args": ["..."],
"disabled": true
}
```
**Note:** The CLI config (`~/.claude.json`) does not support the `disabled` flag. You must remove the server entirely from that file.
## Adding a New MCP Server
1. **Install/clone the MCP server** (if not using npx)
2. **Add to VS Code config** (`~/.claude/settings.json`):
```json
"new-server": {
"command": "path/to/command",
"args": ["arg1", "arg2"],
"env": { "VAR": "value" }
}
```
3. **Add to CLI config** (`~/.claude.json`) - find the `mcpServers` section:
```json
"new-server": {
"type": "stdio",
"command": "path/to/command",
"args": ["arg1", "arg2"],
"env": { "VAR": "value" }
}
```
4. **Fully restart VS Code**
5. **Verify with `claude mcp list`**
## Quick Reference: Available MCP Servers
| Server | Package/Repo | Purpose |
| ------------------- | -------------------------------------------------- | --------------------------- |
| memory | `@modelcontextprotocol/server-memory` | Knowledge graph persistence |
| filesystem | `@modelcontextprotocol/server-filesystem` | File system access |
| redis | `@modelcontextprotocol/server-redis` | Redis cache inspection |
| postgres | `@modelcontextprotocol/server-postgres` | PostgreSQL queries |
| sequential-thinking | `@modelcontextprotocol/server-sequential-thinking` | Step-by-step reasoning |
| podman | `podman-mcp-server` | Container management |
| gitea | `gitea-mcp` (binary) | Gitea API access |
| bugsink | `j-shelfwood/bugsink-mcp` (build from source) | Error tracking for Bugsink |
| sentry | `@sentry/mcp-server` | Error tracking for Sentry |
| playwright | `@anthropics/mcp-server-playwright` | Browser automation |
## Best Practices
1. **Keep configs in sync** - When you change one file, update the other
2. **Order servers by importance** - Put essential servers (memory, filesystem) first
3. **Disable instead of delete** - Use `"disabled": true` in settings.json to troubleshoot
4. **Use node.exe directly** - For faster startup, install packages globally and use `node.exe` instead of `npx`
5. **Store sensitive data in memory** - Use the memory MCP to store API tokens and config for future sessions
---
## Future: MCP Launchpad
**Project:** <https://github.com/kenneth-liao/mcp-launchpad>
MCP Launchpad is a CLI tool that wraps multiple MCP servers into a single interface. Worth revisiting when:
- [ ] Windows support is stable (currently experimental)
- [ ] Available as an MCP server itself (currently Bash-based)
**Why it's interesting:**
| Benefit | Description |
| ---------------------- | -------------------------------------------------------------- |
| Single config file | No more syncing `~/.claude.json` and `~/.claude/settings.json` |
| Project-level configs | Drop `mcp.json` in any project for instant MCP setup |
| Context window savings | One MCP server in context instead of 10+, reducing token usage |
| Persistent daemon | Keeps server connections alive for faster repeated calls |
| Tool search | Find tools across all servers with `mcpl search` |
**Current limitations:**
- Experimental Windows support
- Requires Python 3.13+ and uv
- Claude calls tools via Bash instead of native MCP integration
- Different mental model (runtime discovery vs startup loading)
---
## Future: Graphiti (Advanced Knowledge Graph)
**Project:** <https://github.com/getzep/graphiti>
Graphiti provides temporal-aware knowledge graphs - it tracks not just facts, but _when_ they became true/outdated. Much more powerful than simple memory MCP, but requires significant infrastructure.
**Ideal setup:** Run on a Linux server, connect via HTTP from Windows:
```json
// Windows client config (settings.json)
"graphiti": {
"type": "sse",
"url": "http://linux-server:8000/mcp/"
}
```
**Linux server setup:**
```bash
git clone https://github.com/getzep/graphiti.git
cd graphiti/mcp_server
docker compose up -d # Starts FalkorDB + MCP server on port 8000
```
**Requirements:**
- Docker on Linux server
- OpenAI API key (for embeddings)
- Port 8000 open on LAN
**Benefits of remote deployment:**
- Heavy lifting (Neo4j/FalkorDB + embeddings) offloaded to Linux
- Always-on server, Windows connects/disconnects freely
- Multiple machines can share the same knowledge graph
- Avoids Windows Docker/WSL2 complexity
---
_Last updated: January 2026_

472
CLAUDE.md Normal file
View File

@@ -0,0 +1,472 @@
# Claude Code Project Instructions
## Session Startup Checklist
**IMPORTANT**: At the start of every session, perform these steps:
1. **Check Memory First** - Use `mcp__memory__read_graph` or `mcp__memory__search_nodes` to recall:
- Project-specific configurations and credentials
- Previous work context and decisions
- Infrastructure details (URLs, ports, access patterns)
- Known issues and their solutions
2. **Review Recent Git History** - Check `git log --oneline -10` to understand recent changes
3. **Check Container Status** - Use `mcp__podman__container_list` to see what's running
---
## Project Instructions
### Things to Remember
Before writing any code:
1. State how you will verify this change works (test, bash command, browser check, etc.)
2. Write the test or verification step first
3. Then implement the code
4. Run verification and iterate until it passes
## Communication Style: Ask Before Assuming
**IMPORTANT**: When helping with tasks, **ask clarifying questions before making assumptions**. Do not assume:
- What steps the user has or hasn't completed
- What the user already knows or has configured
- What external services (OAuth providers, APIs, etc.) are already set up
- What secrets or credentials have already been created
Instead, ask the user to confirm the current state before providing instructions or making recommendations. This prevents wasted effort and respects the user's existing work.
## Platform Requirement: Linux Only
**CRITICAL**: This application is designed to run **exclusively on Linux**. See [ADR-014](docs/adr/0014-containerization-and-deployment-strategy.md) for full details.
### Environment Terminology
- **Dev Container** (or just "dev"): The containerized Linux development environment (`flyer-crawler-dev`). This is where all development and testing should occur.
- **Host**: The Windows machine running Podman/Docker and VS Code.
When instructions say "run in dev" or "run in the dev container", they mean executing commands inside the `flyer-crawler-dev` container.
### Test Execution Rules
1. **ALL tests MUST be executed in the dev container** - the Linux container environment
2. **NEVER run tests directly on Windows host** - test results from Windows are unreliable
3. **Always use the dev container for testing** when developing on Windows
### How to Run Tests Correctly
```bash
# If on Windows, first open VS Code and "Reopen in Container"
# Then run tests inside the dev container:
npm test # Run all unit tests
npm run test:unit # Run unit tests only
npm run test:integration # Run integration tests (requires DB/Redis)
```
### Running Tests via Podman (from Windows host)
**Note:** This project has 2900+ unit tests. For AI-assisted development, pipe output to a file for easier processing.
The command to run unit tests in the dev container via podman:
```bash
# Basic (output to terminal)
podman exec -it flyer-crawler-dev npm run test:unit
# Recommended for AI processing: pipe to file
podman exec -it flyer-crawler-dev npm run test:unit 2>&1 | tee test-results.txt
```
The command to run integration tests in the dev container via podman:
```bash
podman exec -it flyer-crawler-dev npm run test:integration
```
For running specific test files:
```bash
podman exec -it flyer-crawler-dev npm test -- --run src/hooks/useAuth.test.tsx
```
### Why Linux Only?
- Path separators: Code uses POSIX-style paths (`/`) which may break on Windows
- Shell scripts in `scripts/` directory are Linux-only
- External dependencies like `pdftocairo` assume Linux installation paths
- Unix-style file permissions are assumed throughout
### Test Result Interpretation
- Tests that **pass on Windows but fail on Linux** = **BROKEN tests** (must be fixed)
- Tests that **fail on Windows but pass on Linux** = **PASSING tests** (acceptable)
## Development Workflow
1. Open project in VS Code
2. Use "Reopen in Container" (Dev Containers extension required) to enter the dev environment
3. Wait for dev container initialization to complete
4. Run `npm test` to verify the dev environment is working
5. Make changes and run tests inside the dev container
## Code Change Verification
After making any code changes, **always run a type-check** to catch TypeScript errors before committing:
```bash
npm run type-check
```
This prevents linting/type errors from being introduced into the codebase.
## Quick Reference
| Command | Description |
| -------------------------- | ---------------------------- |
| `npm test` | Run all unit tests |
| `npm run test:unit` | Run unit tests only |
| `npm run test:integration` | Run integration tests |
| `npm run dev:container` | Start dev server (container) |
| `npm run build` | Build for production |
| `npm run type-check` | Run TypeScript type checking |
## Database Schema Files
**CRITICAL**: The database schema files must be kept in sync with each other. When making schema changes:
| File | Purpose |
| ------------------------------ | ----------------------------------------------------------- |
| `sql/master_schema_rollup.sql` | Complete schema used by test database setup and reference |
| `sql/initial_schema.sql` | Base schema without seed data, used as standalone reference |
| `sql/migrations/*.sql` | Incremental migrations for production database updates |
**Maintenance Rules:**
1. **Keep `master_schema_rollup.sql` and `initial_schema.sql` in sync** - These files should contain the same table definitions
2. **When adding columns via migration**, also add them to both `master_schema_rollup.sql` and `initial_schema.sql`
3. **Migrations are for production deployments** - They use `ALTER TABLE` to add columns incrementally
4. **Schema files are for fresh installs** - They define the complete table structure
5. **Test database uses `master_schema_rollup.sql`** - If schema files are out of sync with migrations, tests will fail
**Example:** When `002_expiry_tracking.sql` adds `purchase_date` to `pantry_items`, that column must also exist in the `CREATE TABLE` statements in both `master_schema_rollup.sql` and `initial_schema.sql`.
## Known Integration Test Issues and Solutions
This section documents common test issues encountered in integration tests, their root causes, and solutions. These patterns recur frequently.
### 1. Vitest globalSetup Runs in Separate Node.js Context
**Problem:** Vitest's `globalSetup` runs in a completely separate Node.js context from test files. This means:
- Singletons created in globalSetup are NOT the same instances as those in test files
- `global`, `globalThis`, and `process` are all isolated between contexts
- `vi.spyOn()` on module exports doesn't work cross-context
- Dependency injection via setter methods fails across contexts
**Affected Tests:** Any test trying to inject mocks into BullMQ worker services (e.g., AI failure tests, DB failure tests)
**Solution Options:**
1. Mark tests as `.todo()` until an API-based mock injection mechanism is implemented
2. Create test-only API endpoints that allow setting mock behaviors via HTTP
3. Use file-based or Redis-based mock flags that services check at runtime (see the Redis sketch below)
**Example of affected code pattern:**
```typescript
// This DOES NOT work - different module instances
const { flyerProcessingService } = await import('../../services/workers.server');
flyerProcessingService._getAiProcessor()._setExtractAndValidateData(mockFn);
// The worker uses a different flyerProcessingService instance!
```
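A sketch of the Redis-flag approach from solution option 3 (the key name and helper functions are hypothetical; the worker service would need a matching runtime check):
```typescript
// Hypothetical sketch of solution option 3: a Redis-based mock flag. It works
// across the context split because both contexts talk to the same Redis.
import Redis from 'ioredis';

const redis = new Redis('redis://localhost:6379/1'); // database 1, per the test convention

// In the test: set the flag before enqueueing the job.
export async function enableAiFailureMode(): Promise<void> {
  await redis.set('test:mock:ai-failure', '1', 'EX', 60); // auto-expires after 60s
}

// In the worker service: check the flag at runtime instead of relying on
// injected spies, which do not survive the globalSetup context boundary.
export async function shouldSimulateAiFailure(): Promise<boolean> {
  return (await redis.get('test:mock:ai-failure')) === '1';
}
```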
### 2. BullMQ Cleanup Queue Deleting Files Before Test Verification
**Problem:** The cleanup worker runs in the globalSetup context and processes cleanup jobs even when tests spy on `cleanupQueue.add()`. The spy intercepts calls in the test context, but jobs already queued run in the worker's context.
**Affected Tests:** EXIF/PNG metadata stripping tests that need to verify file contents before deletion
**Solution:** Drain and pause the cleanup queue before the test:
```typescript
const { cleanupQueue } = await import('../../services/queues.server');
await cleanupQueue.drain(); // Remove existing jobs
await cleanupQueue.pause(); // Prevent new jobs from processing
// ... run test ...
await cleanupQueue.resume(); // Restore normal operation
```
### 3. Cache Invalidation After Direct Database Inserts
**Problem:** Tests that insert data directly via SQL (bypassing the service layer) don't trigger cache invalidation. Subsequent API calls return stale cached data.
**Affected Tests:** Any test using `pool.query()` to insert flyers, stores, or other cached entities
**Solution:** Manually invalidate the cache after direct inserts:
```typescript
await pool.query('INSERT INTO flyers ...');
await cacheService.invalidateFlyers(); // Clear stale cache
```
### 4. Unique Filenames Required for Test Isolation
**Problem:** Multer generates predictable filenames in test environments, causing race conditions when multiple tests upload files concurrently or in sequence.
**Affected Tests:** Flyer processing tests, file upload tests
**Solution:** Always use unique filenames with timestamps:
```typescript
// In multer.middleware.ts
const uniqueSuffix = `${Date.now()}-${Math.round(Math.random() * 1e9)}`;
cb(null, `${file.fieldname}-${uniqueSuffix}-${sanitizedOriginalName}`);
```
### 5. Response Format Mismatches
**Problem:** API response formats may change, causing tests to fail when expecting old formats.
**Common Issues:**
- `response.body.data.jobId` vs `response.body.data.job.id`
- Nested objects vs flat response structures
- Type coercion (string vs number for IDs)
**Solution:** Always log response bodies during debugging and update test assertions to match actual API contracts.
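For example, with a `sendSuccess()`-style envelope, an assertion has to reach one level deeper. A sketch using supertest (the route and import path are illustrative):
```typescript
// Sketch: asserting against a wrapped response shape ({ success, data })
// rather than expecting the payload at the top level of response.body.
import request from 'supertest';
import { describe, expect, it } from 'vitest';
import { app } from '../../server'; // hypothetical import path

describe('GET /api/flyers', () => {
  it('returns flyers inside the data envelope', async () => {
    const response = await request(app).get('/api/flyers').expect(200);

    expect(response.body.success).toBe(true);
    // Wrong: response.body[0] -- the payload is not at the top level.
    expect(Array.isArray(response.body.data)).toBe(true);
  });
});
```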
### 6. External Service Availability
**Problem:** Tests depending on external services (PM2, Redis health checks) fail when those services aren't available in the test environment.
**Solution:** Use try/catch with graceful degradation or mock the external service checks.
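A sketch of the try/catch route (the `pm2 jlist` health check here is an assumed example):
```typescript
// Sketch: graceful degradation when an external service (here PM2) is absent
// from the test environment.
import { execSync } from 'node:child_process';

export function getPm2ProcessCount(): number | null {
  try {
    const raw = execSync('pm2 jlist', { stdio: ['ignore', 'pipe', 'ignore'] }).toString();
    return JSON.parse(raw).length;
  } catch {
    // PM2 not installed or not running here: degrade instead of failing the suite.
    return null;
  }
}

// In a test, skip the assertion rather than fail when PM2 is absent:
//   const count = getPm2ProcessCount();
//   if (count === null) return; // effectively a skip
```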
## Secrets and Environment Variables
**CRITICAL**: This project uses **Gitea CI/CD secrets** for all sensitive configuration. There is NO `/etc/flyer-crawler/environment` file or similar local config file on the server.
### Server Directory Structure
| Path | Environment | Notes |
| --------------------------------------------- | ----------- | ------------------------------------------------ |
| `/var/www/flyer-crawler.projectium.com/` | Production | NO `.env` file - secrets injected via CI/CD only |
| `/var/www/flyer-crawler-test.projectium.com/` | Test | Has `.env.test` file for test-specific config |
### How Secrets Work
1. **Gitea Secrets**: All secrets are stored in Gitea repository settings (Settings → Secrets)
2. **CI/CD Injection**: Secrets are injected during deployment via `.gitea/workflows/deploy-to-prod.yml` and `deploy-to-test.yml`
3. **PM2 Environment**: The CI/CD workflow passes secrets to PM2 via environment variables, which are then available to the application
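The hand-off in step 3 works because the ecosystem file simply forwards whatever is already in the runner's environment. A sketch of that pattern (the real `ecosystem.config.cjs` is CommonJS and defines more processes and options):
```typescript
// Sketch of the PM2 ecosystem pattern. PM2 evaluates this file in the CI/CD
// shell, so process.env already contains the secrets the workflow injected.
module.exports = {
  apps: [
    {
      name: 'flyer-crawler-api',
      script: 'dist/server.js', // assumed entry point
      env: {
        NODE_ENV: 'production',
        DB_HOST: process.env.DB_HOST,
        DB_USER: process.env.DB_USER,
        DB_PASSWORD: process.env.DB_PASSWORD,
        SENTRY_DSN: process.env.SENTRY_DSN,
      },
    },
  ],
};
```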
### Key Files for Configuration
| File | Purpose |
| ------------------------------------- | ---------------------------------------------------- |
| `src/config/env.ts` | Centralized config with Zod schema validation |
| `ecosystem.config.cjs` | PM2 process config - reads from `process.env` |
| `.gitea/workflows/deploy-to-prod.yml` | Production deployment with secret injection |
| `.gitea/workflows/deploy-to-test.yml` | Test deployment with secret injection |
| `.env.example` | Template showing all available environment variables |
| `.env.test` | Test environment overrides (only on test server) |
### Adding New Secrets
To add a new secret (e.g., `SENTRY_DSN`):
1. Add the secret to Gitea repository settings
2. Update the relevant workflow file (e.g., `deploy-to-prod.yml`) to inject it:
```yaml
SENTRY_DSN: ${{ secrets.SENTRY_DSN }}
```
3. Update `ecosystem.config.cjs` to read it from `process.env`
4. Update `src/config/env.ts` schema if validation is needed (see the sketch below)
5. Update `.env.example` to document the new variable
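For step 4, the schema addition in `src/config/env.ts` might look like this (a sketch — the surrounding schema shape and defaults are assumptions):
```typescript
// Sketch of step 4: extending the Zod schema in src/config/env.ts for the new
// SENTRY_* variables. The rest of the schema is elided.
import { z } from 'zod';

const envSchema = z.object({
  // ...existing variables...
  SENTRY_DSN: z.string().optional(), // empty locally; CI/CD injects the real DSN
  SENTRY_ENVIRONMENT: z.string().default('development'),
  SENTRY_ENABLED: z
    .string()
    .default('true')
    .transform((v) => v === 'true'), // env vars arrive as strings
});

export const env = envSchema.parse(process.env);
```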
### Current Gitea Secrets
**Shared (used by both environments):**
- `DB_HOST` - Database host (shared PostgreSQL server)
- `JWT_SECRET` - Authentication
- `GOOGLE_MAPS_API_KEY` - Google Maps
- `GOOGLE_CLIENT_ID`, `GOOGLE_CLIENT_SECRET` - Google OAuth
- `GH_CLIENT_ID`, `GH_CLIENT_SECRET` - GitHub OAuth
**Production-specific:**
- `DB_USER_PROD`, `DB_PASSWORD_PROD` - Production database credentials (`flyer_crawler_prod`)
- `DB_DATABASE_PROD` - Production database name (`flyer-crawler`)
- `REDIS_PASSWORD_PROD` - Redis password (uses database 0)
- `VITE_GOOGLE_GENAI_API_KEY` - Gemini API key for production
- `SENTRY_DSN`, `VITE_SENTRY_DSN` - Bugsink error tracking DSNs (production projects)
**Test-specific:**
- `DB_USER_TEST`, `DB_PASSWORD_TEST` - Test database credentials (`flyer_crawler_test`)
- `DB_DATABASE_TEST` - Test database name (`flyer-crawler-test`)
- `REDIS_PASSWORD_TEST` - Redis password (uses database 1 for isolation)
- `VITE_GOOGLE_GENAI_API_KEY_TEST` - Gemini API key for test
- `SENTRY_DSN_TEST`, `VITE_SENTRY_DSN_TEST` - Bugsink error tracking DSNs (test projects)
### Test Environment
The test environment (`flyer-crawler-test.projectium.com`) uses **both** Gitea CI/CD secrets and a local `.env.test` file:
- **Gitea secrets**: Injected during deployment via `.gitea/workflows/deploy-to-test.yml`
- **`.env.test` file**: Located at `/var/www/flyer-crawler-test.projectium.com/.env.test` for local overrides
- **Redis database 1**: Isolates test job queues from production (which uses database 0)
- **PM2 process names**: Suffixed with `-test` (e.g., `flyer-crawler-api-test`)
### Database User Setup (Test Environment)
**CRITICAL**: The test database requires specific PostgreSQL permissions to be configured manually. Schema ownership alone is NOT sufficient - explicit privileges must be granted.
**Database Users:**
| User | Database | Purpose |
| -------------------- | -------------------- | ---------- |
| `flyer_crawler_prod` | `flyer-crawler` | Production |
| `flyer_crawler_test` | `flyer-crawler-test` | Testing |
**Required Setup Commands** (run as `postgres` superuser):
```bash
# Connect as postgres superuser
sudo -u postgres psql
# Create the test database and user (if not exists)
CREATE DATABASE "flyer-crawler-test";
CREATE USER flyer_crawler_test WITH PASSWORD 'your-password-here';
# Grant ownership and privileges
ALTER DATABASE "flyer-crawler-test" OWNER TO flyer_crawler_test;
\c "flyer-crawler-test"
ALTER SCHEMA public OWNER TO flyer_crawler_test;
GRANT CREATE, USAGE ON SCHEMA public TO flyer_crawler_test;
# Create required extension (must be done by superuser)
CREATE EXTENSION IF NOT EXISTS "uuid-ossp";
```
**Why These Steps Are Necessary:**
1. **Schema ownership alone is insufficient** - PostgreSQL requires explicit `GRANT CREATE, USAGE` privileges even when the user owns the schema
2. **uuid-ossp extension** - Required by the application for UUID generation; must be created by a superuser before the app can use it
3. **Separate users for prod/test** - Prevents accidental cross-environment data access; each environment has its own credentials in Gitea secrets
**Verification:**
```bash
# Check schema privileges (should show 'UC' for flyer_crawler_test)
psql -d "flyer-crawler-test" -c "\dn+ public"
# Expected output:
# Name | Owner | Access privileges
# -------+--------------------+------------------------------------------
# public | flyer_crawler_test | flyer_crawler_test=UC/flyer_crawler_test
```
### Dev Container Environment
The dev container runs its own **local Bugsink instance** - it does NOT connect to the production Bugsink server:
- **Local Bugsink**: Runs at `http://localhost:8000` inside the container
- **Pre-configured DSNs**: Set in `compose.dev.yml`, pointing to local instance
- **Admin credentials**: `admin@localhost` / `admin`
- **Isolated**: Dev errors stay local, don't pollute production/test dashboards
- **No Gitea secrets needed**: Everything is self-contained in the container
---
## MCP Servers
The following MCP servers are configured for this project:
| Server | Purpose |
| --------------------- | ------------------------------------------- |
| gitea-projectium | Gitea API for gitea.projectium.com |
| gitea-torbonium | Gitea API for gitea.torbonium.com |
| podman | Container management |
| filesystem | File system access |
| fetch | Web fetching |
| markitdown | Convert documents to markdown |
| sequential-thinking | Step-by-step reasoning |
| memory | Knowledge graph persistence |
| postgres | Direct database queries (localhost:5432) |
| playwright | Browser automation and testing |
| redis | Redis cache inspection (localhost:6379) |
| sentry-selfhosted-mcp | Error tracking via Bugsink (localhost:8000) |
**Note:** MCP servers work in both the **Claude CLI** and the **Claude Code VS Code extension** (as of January 2026).
### Sentry/Bugsink MCP Server Setup (ADR-015)
To enable Claude Code to query and analyze application errors from Bugsink:
1. **Install the MCP server**:
```bash
# Clone the sentry-selfhosted-mcp repository
git clone https://github.com/ddfourtwo/sentry-selfhosted-mcp.git
cd sentry-selfhosted-mcp
npm install
npm run build   # assumption: needed to produce dist/index.js if it is not pre-built
```
2. **Configure Claude Code** (add to `.claude/mcp.json`):
```json
{
"sentry-selfhosted-mcp": {
"command": "node",
"args": ["/path/to/sentry-selfhosted-mcp/dist/index.js"],
"env": {
"SENTRY_URL": "http://localhost:8000",
"SENTRY_AUTH_TOKEN": "<get-from-bugsink-ui>",
"SENTRY_ORG_SLUG": "flyer-crawler"
}
}
}
```
3. **Get the auth token**:
- Navigate to Bugsink UI at `http://localhost:8000`
- Log in with admin credentials
- Go to Settings > API Keys
- Create a new API key with read access
4. **Available capabilities**:
- List projects and issues
- View detailed error events
- Search by error message or stack trace
- Update issue status (resolve, ignore)
- Add comments to issues
### SSH Server Access
Claude Code can execute commands on the production server via SSH:
```bash
# Basic command execution
ssh root@projectium.com "command here"
# Examples:
ssh root@projectium.com "systemctl status logstash"
ssh root@projectium.com "pm2 list"
ssh root@projectium.com "tail -50 /var/www/flyer-crawler.projectium.com/logs/app.log"
```
**Use cases:**
- Managing Logstash, PM2, NGINX, Redis services
- Viewing server logs
- Deploying configuration changes
- Checking service status
**Important:** SSH access requires the host machine to have SSH keys configured for `root@projectium.com`.
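If the key is not yet installed, one way to set it up (a sketch assuming a standard OpenSSH client; adjust the key type and comment to taste):

```bash
# Generate a key pair if you do not have one, then install it on the server
ssh-keygen -t ed25519 -C "claude-code"
ssh-copy-id root@projectium.com
# Verify that non-interactive access works
ssh root@projectium.com "hostname"
```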

DATABASE.md (new file, 188 lines)
# Database Setup
Flyer Crawler uses PostgreSQL with several extensions for full-text search, geographic data, and UUID generation.
---
## Required Extensions
| Extension | Purpose |
| ----------- | ------------------------------------------- |
| `postgis` | Geographic/spatial data for store locations |
| `pg_trgm` | Trigram matching for fuzzy text search |
| `uuid-ossp` | UUID generation for primary keys |
---
## Production Database Setup
### Step 1: Install PostgreSQL
```bash
sudo apt update
sudo apt install postgresql postgresql-contrib
```
### Step 2: Create Database and User
Switch to the postgres system user:
```bash
sudo -u postgres psql
```
Run the following SQL commands (replace `'a_very_strong_password'` with a secure password):
```sql
-- Create a new role for your application
CREATE ROLE flyer_crawler_user WITH LOGIN PASSWORD 'a_very_strong_password';
-- Create the production database
CREATE DATABASE "flyer-crawler-prod" WITH OWNER = flyer_crawler_user;
-- Connect to the new database
\c "flyer-crawler-prod"
-- Install required extensions (must be done as superuser)
CREATE EXTENSION IF NOT EXISTS postgis;
CREATE EXTENSION IF NOT EXISTS pg_trgm;
CREATE EXTENSION IF NOT EXISTS "uuid-ossp";
-- Exit
\q
```
### Step 3: Apply the Schema
Navigate to your project directory and run:
```bash
psql -U flyer_crawler_user -d "flyer-crawler-prod" -f sql/master_schema_rollup.sql
```
This creates all tables, functions, triggers, and seeds essential data (categories, master items).
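To spot-check that the schema applied cleanly (optional; lists the first tables it finds):

```bash
psql -U flyer_crawler_user -d "flyer-crawler-prod" -c "\dt" | head -20
```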
### Step 4: Seed the Admin Account
Set the required environment variables and run the seed script:
```bash
export DB_USER=flyer_crawler_user
export DB_PASSWORD=your_password
export DB_NAME="flyer-crawler-prod"
export DB_HOST=localhost
npx tsx src/db/seed_admin_account.ts
```
---
## Test Database Setup
The test database is used by CI/CD pipelines and local test runs.
### Step 1: Create the Test Database
```bash
sudo -u postgres psql
```
```sql
-- Create the test database
CREATE DATABASE "flyer-crawler-test" WITH OWNER = flyer_crawler_user;
-- Connect to the test database
\c "flyer-crawler-test"
-- Install required extensions
CREATE EXTENSION IF NOT EXISTS postgis;
CREATE EXTENSION IF NOT EXISTS pg_trgm;
CREATE EXTENSION IF NOT EXISTS "uuid-ossp";
-- Grant schema ownership (required for test runner to reset schema)
ALTER SCHEMA public OWNER TO flyer_crawler_user;
-- Exit
\q
```
### Step 2: Configure CI/CD Secrets
Ensure these secrets are set in your Gitea repository settings:
| Secret | Description |
| ------------- | ------------------------------------------ |
| `DB_HOST` | Database hostname (e.g., `localhost`) |
| `DB_PORT` | Database port (e.g., `5432`) |
| `DB_USER` | Database user (e.g., `flyer_crawler_user`) |
| `DB_PASSWORD` | Database password |
---
## How the Test Pipeline Works
The CI pipeline uses a permanent test database that gets reset on each test run:
1. **Setup**: The vitest global setup script connects to `flyer-crawler-test`
2. **Schema Reset**: Executes `sql/drop_tables.sql` (`DROP SCHEMA public CASCADE`)
3. **Schema Application**: Runs `sql/master_schema_rollup.sql` to build a fresh schema
4. **Test Execution**: Tests run against the clean database
This approach is faster than creating/destroying databases and doesn't require sudo access.
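The same reset can be reproduced by hand when debugging (a sketch; the real reset runs inside the vitest global setup):

```bash
psql -U flyer_crawler_user -d "flyer-crawler-test" -f sql/drop_tables.sql
psql -U flyer_crawler_user -d "flyer-crawler-test" -f sql/master_schema_rollup.sql
```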
---
## Connecting to Production Database
```bash
psql -h localhost -U flyer_crawler_user -d "flyer-crawler-prod" -W
```
---
## Checking PostGIS Version
```sql
SELECT version();
SELECT PostGIS_Full_Version();
```
Example output:
```text
PostgreSQL 14.19 (Ubuntu 14.19-0ubuntu0.22.04.1)
POSTGIS="3.2.0 c3e3cc0" GEOS="3.10.2-CAPI-1.16.0" PROJ="8.2.1"
```
---
## Schema Files
| File | Purpose |
| ------------------------------ | --------------------------------------------------------- |
| `sql/master_schema_rollup.sql` | Complete schema with all tables, functions, and seed data |
| `sql/drop_tables.sql` | Drops entire schema (used by test runner) |
| `sql/schema.sql.txt` | Legacy schema file (reference only) |
---
## Backup and Restore
### Create a Backup
```bash
pg_dump -U flyer_crawler_user -d "flyer-crawler-prod" -F c -f backup.dump
```
### Restore from Backup
```bash
pg_restore -U flyer_crawler_user -d "flyer-crawler-prod" -c backup.dump
```
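To inspect a dump before restoring it (reads the archive's table of contents and makes no changes):

```bash
pg_restore -l backup.dump | head -20
```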
---
## Related Documentation
- [Installation Guide](INSTALL.md) - Local development setup
- [Deployment Guide](DEPLOYMENT.md) - Production deployment

DEPLOYMENT.md (new file, 271 lines)
# Deployment Guide
This guide covers deploying Flyer Crawler to a production server.
## Prerequisites
- Ubuntu server (22.04 LTS recommended)
- PostgreSQL 14+ with PostGIS extension
- Redis
- Node.js 20.x
- NGINX (reverse proxy)
- PM2 (process manager)
---
## Server Setup
### Install Node.js
```bash
curl -sL https://deb.nodesource.com/setup_20.x | sudo bash -
sudo apt-get install -y nodejs
```
### Install PM2
```bash
sudo npm install -g pm2
```
---
## Application Deployment
### Clone and Install
```bash
git clone <repository-url>
cd flyer-crawler.projectium.com
npm install
```
### Build for Production
```bash
npm run build
```
### Start with PM2
```bash
npm run start:prod
```
This starts three PM2 processes:
- `flyer-crawler-api` - Main API server
- `flyer-crawler-worker` - Background job worker
- `flyer-crawler-analytics-worker` - Analytics processing worker
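To have these processes survive a server reboot (a sketch, assuming a systemd host):

```bash
pm2 save             # persist the current process list
pm2 startup systemd  # prints a command that installs a boot-time init script
```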
---
## Environment Variables (Gitea Secrets)
For deployments using Gitea CI/CD workflows, configure these as **repository secrets**:
| Secret | Description |
| --------------------------- | ------------------------------------------- |
| `DB_HOST` | PostgreSQL server hostname |
| `DB_USER` | PostgreSQL username |
| `DB_PASSWORD` | PostgreSQL password |
| `DB_DATABASE_PROD` | Production database name |
| `REDIS_PASSWORD_PROD` | Production Redis password |
| `REDIS_PASSWORD_TEST` | Test Redis password |
| `JWT_SECRET` | Long, random string for signing auth tokens |
| `VITE_GOOGLE_GENAI_API_KEY` | Google Gemini API key |
| `GOOGLE_MAPS_API_KEY` | Google Maps Geocoding API key |
---
## NGINX Configuration
### Reverse Proxy Setup
Create a site configuration at `/etc/nginx/sites-available/flyer-crawler.projectium.com`:
```nginx
server {
listen 80;
server_name flyer-crawler.projectium.com;
location / {
proxy_pass http://localhost:5173;
proxy_http_version 1.1;
proxy_set_header Upgrade $http_upgrade;
proxy_set_header Connection 'upgrade';
proxy_set_header Host $host;
proxy_cache_bypass $http_upgrade;
}
location /api {
proxy_pass http://localhost:3001;
proxy_http_version 1.1;
proxy_set_header Upgrade $http_upgrade;
proxy_set_header Connection 'upgrade';
proxy_set_header Host $host;
proxy_cache_bypass $http_upgrade;
}
}
```
Enable the site:
```bash
sudo ln -s /etc/nginx/sites-available/flyer-crawler.projectium.com /etc/nginx/sites-enabled/
sudo nginx -t
sudo systemctl reload nginx
```
### MIME Types Fix for .mjs Files
If JavaScript modules (`.mjs` files) aren't loading correctly, add the proper MIME type.
**Option 1**: Edit the site configuration file directly:
```nginx
# Add inside the server block
types {
application/javascript js mjs;
}
```
**Option 2**: Edit `/etc/nginx/mime.types` globally:
```
# Change this line:
application/javascript js;
# To:
application/javascript js mjs;
```
After changes:
```bash
sudo nginx -t
sudo systemctl reload nginx
```
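To confirm the fix took effect, check the `Content-Type` header the server returns (the asset path below is a placeholder; substitute a real `.mjs` file from your build):

```bash
curl -sI https://flyer-crawler.projectium.com/assets/index.mjs | grep -i content-type
# expect: content-type: application/javascript
```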
---
## PM2 Log Management
Install and configure pm2-logrotate to manage log files:
```bash
pm2 install pm2-logrotate
pm2 set pm2-logrotate:max_size 10M
pm2 set pm2-logrotate:retain 14
pm2 set pm2-logrotate:compress false
pm2 set pm2-logrotate:dateFormat YYYY-MM-DD_HH-mm-ss
```
---
## Rate Limiting
The application respects the Gemini AI service's rate limits. You can adjust the `GEMINI_RPM` (requests per minute) environment variable in production as needed without changing the code.
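For example, to throttle Gemini calls to 10 requests per minute (a sketch; the right value depends on your API quota):

```bash
export GEMINI_RPM=10
pm2 restart flyer-crawler-worker --update-env   # reload the worker with the new value
```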
---
## CI/CD Pipeline
The project includes a Gitea workflow at `.gitea/workflows/deploy.yml` that will:
1. Run tests against a test database
2. Build the application
3. Deploy to production on successful builds
The workflow automatically:
- Sets up the test database schema before tests
- Tears down test data after tests complete
- Deploys to the production server
---
## Monitoring
### Check PM2 Status
```bash
pm2 status
pm2 logs
pm2 logs flyer-crawler-api --lines 100
```
### Restart Services
```bash
pm2 restart all
pm2 restart flyer-crawler-api
```
---
## Error Tracking with Bugsink (ADR-015)
Bugsink is a self-hosted Sentry-compatible error tracking system. See [docs/adr/0015-application-performance-monitoring-and-error-tracking.md](docs/adr/0015-application-performance-monitoring-and-error-tracking.md) for the full architecture decision.
### Creating Bugsink Projects and DSNs
After Bugsink is installed and running, you need to create projects and obtain DSNs:
1. **Access Bugsink UI**: Navigate to `http://localhost:8000`
2. **Log in** with your admin credentials
3. **Create Backend Project**:
- Click "Create Project"
- Name: `flyer-crawler-backend`
- Platform: Node.js
- Copy the generated DSN (format: `http://<key>@localhost:8000/<project_id>`)
4. **Create Frontend Project**:
- Click "Create Project"
- Name: `flyer-crawler-frontend`
- Platform: React
- Copy the generated DSN
5. **Configure Environment Variables**:
```bash
# Backend (server-side)
export SENTRY_DSN=http://<backend-key>@localhost:8000/<backend-project-id>
# Frontend (client-side, exposed to browser)
export VITE_SENTRY_DSN=http://<frontend-key>@localhost:8000/<frontend-project-id>
# Shared settings
export SENTRY_ENVIRONMENT=production
export VITE_SENTRY_ENVIRONMENT=production
export SENTRY_ENABLED=true
export VITE_SENTRY_ENABLED=true
```
### Testing Error Tracking
Verify Bugsink is receiving events:
```bash
npx tsx scripts/test-bugsink.ts
```
This sends test error and info events. Check the Bugsink UI for:
- `BugsinkTestError` in the backend project
- Info message "Test info message from test-bugsink.ts"
### Sentry SDK v10+ HTTP DSN Limitation
The Sentry SDK v10+ enforces HTTPS-only DSNs by default. Since Bugsink runs locally over HTTP, our implementation uses the Sentry Store API directly instead of the SDK's built-in transport. This is handled transparently by the `sentry.server.ts` and `sentry.client.ts` modules.
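For reference, a raw Store API submission looks roughly like this (a sketch; `<project-id>` and `<key>` are placeholders taken from your DSN):

```bash
curl -s "http://localhost:8000/api/<project-id>/store/" \
  -H "Content-Type: application/json" \
  -H "X-Sentry-Auth: Sentry sentry_version=7, sentry_key=<key>" \
  -d '{"message": "manual test event", "level": "error", "platform": "node"}'
```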
---
## Related Documentation
- [Database Setup](DATABASE.md) - PostgreSQL and PostGIS configuration
- [Authentication Setup](AUTHENTICATION.md) - OAuth provider configuration
- [Installation Guide](INSTALL.md) - Local development setup
- [Bare-Metal Server Setup](docs/BARE-METAL-SETUP.md) - Manual server installation guide

Dockerfile (modified)

@@ -7,7 +7,7 @@
#
# Base: Ubuntu 22.04 (LTS) - matches production server
# Node: v20.x (LTS) - matches production
# Includes: PostgreSQL client, Redis CLI, build tools, Bugsink, Logstash
# ============================================================================
FROM ubuntu:22.04
@@ -21,16 +21,23 @@ ENV DEBIAN_FRONTEND=noninteractive
# - curl: for downloading Node.js setup script and health checks
# - git: for version control operations
# - build-essential: for compiling native Node.js modules (node-gyp)
# - python3, python3-pip, python3-venv: for Bugsink (python3 is also used by Node.js build tools)
# - postgresql-client: for psql CLI (database initialization)
# - redis-tools: for redis-cli (health checks)
# - gnupg, apt-transport-https: for Elastic APT repository (Logstash)
# - openjdk-17-jre-headless: required by Logstash
RUN apt-get update && apt-get install -y \
curl \
git \
build-essential \
python3 \
python3-pip \
python3-venv \
postgresql-client \
redis-tools \
gnupg \
apt-transport-https \
openjdk-17-jre-headless \
&& rm -rf /var/lib/apt/lists/*
# ============================================================================
@@ -39,6 +46,204 @@ RUN apt-get update && apt-get install -y \
RUN curl -fsSL https://deb.nodesource.com/setup_20.x | bash - \
&& apt-get install -y nodejs
# ============================================================================
# Install Logstash (Elastic APT Repository)
# ============================================================================
# ADR-015: Log aggregation for Pino and Redis logs → Bugsink
RUN curl -fsSL https://artifacts.elastic.co/GPG-KEY-elasticsearch | gpg --dearmor -o /usr/share/keyrings/elastic-keyring.gpg \
&& echo "deb [signed-by=/usr/share/keyrings/elastic-keyring.gpg] https://artifacts.elastic.co/packages/8.x/apt stable main" | tee /etc/apt/sources.list.d/elastic-8.x.list \
&& apt-get update \
&& apt-get install -y logstash \
&& rm -rf /var/lib/apt/lists/*
# ============================================================================
# Install Bugsink (Python Package)
# ============================================================================
# ADR-015: Self-hosted Sentry-compatible error tracking
# Create a virtual environment for Bugsink to avoid conflicts
RUN python3 -m venv /opt/bugsink \
&& /opt/bugsink/bin/pip install --upgrade pip \
&& /opt/bugsink/bin/pip install bugsink gunicorn psycopg2-binary
# Create Bugsink directories and configuration
RUN mkdir -p /var/log/bugsink /var/lib/bugsink /opt/bugsink/conf
# Create Bugsink configuration file (Django settings module)
# This file is imported by bugsink-manage via DJANGO_SETTINGS_MODULE
# Based on bugsink/conf_templates/docker.py.template but customized for our setup
RUN echo 'import os\n\
from urllib.parse import urlparse\n\
\n\
from bugsink.settings.default import *\n\
from bugsink.settings.default import DATABASES, SILENCED_SYSTEM_CHECKS\n\
from bugsink.conf_utils import deduce_allowed_hosts, deduce_script_name\n\
\n\
IS_DOCKER = True\n\
\n\
# Security settings\n\
SECRET_KEY = os.getenv("SECRET_KEY")\n\
DEBUG = os.getenv("DEBUG", "False").lower() in ("true", "1", "yes")\n\
\n\
# Silence cookie security warnings for dev (no HTTPS)\n\
SILENCED_SYSTEM_CHECKS += ["security.W012", "security.W016"]\n\
\n\
# Database configuration from DATABASE_URL environment variable\n\
if os.getenv("DATABASE_URL"):\n\
DATABASE_URL = os.getenv("DATABASE_URL")\n\
parsed = urlparse(DATABASE_URL)\n\
\n\
if parsed.scheme in ["postgres", "postgresql"]:\n\
DATABASES["default"] = {\n\
"ENGINE": "django.db.backends.postgresql",\n\
"NAME": parsed.path.lstrip("/"),\n\
"USER": parsed.username,\n\
"PASSWORD": parsed.password,\n\
"HOST": parsed.hostname,\n\
"PORT": parsed.port or "5432",\n\
}\n\
\n\
# Snappea (background task runner) settings\n\
SNAPPEA = {\n\
"TASK_ALWAYS_EAGER": False,\n\
"WORKAHOLIC": True,\n\
"NUM_WORKERS": 2,\n\
"PID_FILE": None,\n\
}\n\
DATABASES["snappea"]["NAME"] = "/tmp/snappea.sqlite3"\n\
\n\
# Site settings\n\
_PORT = os.getenv("PORT", "8000")\n\
BUGSINK = {\n\
"BASE_URL": os.getenv("BASE_URL", f"http://localhost:{_PORT}"),\n\
"SITE_TITLE": os.getenv("SITE_TITLE", "Flyer Crawler Error Tracking"),\n\
"SINGLE_USER": os.getenv("SINGLE_USER", "True").lower() in ("true", "1", "yes"),\n\
"SINGLE_TEAM": os.getenv("SINGLE_TEAM", "True").lower() in ("true", "1", "yes"),\n\
"PHONEHOME": False,\n\
}\n\
\n\
ALLOWED_HOSTS = deduce_allowed_hosts(BUGSINK["BASE_URL"])\n\
\n\
# Console email backend for dev\n\
EMAIL_BACKEND = "bugsink.email_backends.QuietConsoleEmailBackend"\n\
' > /opt/bugsink/conf/bugsink_conf.py
# Create Bugsink startup script
# Uses DATABASE_URL environment variable (standard Docker approach per docs)
RUN echo '#!/bin/bash\n\
set -e\n\
\n\
# Build DATABASE_URL from individual env vars for flexibility\n\
export DATABASE_URL="postgresql://${BUGSINK_DB_USER:-bugsink}:${BUGSINK_DB_PASSWORD:-bugsink_dev_password}@${BUGSINK_DB_HOST:-postgres}:${BUGSINK_DB_PORT:-5432}/${BUGSINK_DB_NAME:-bugsink}"\n\
# SECRET_KEY is required by Bugsink/Django\n\
export SECRET_KEY="${BUGSINK_SECRET_KEY:-dev-bugsink-secret-key-minimum-50-characters-for-security}"\n\
\n\
# Create superuser if not exists (for dev convenience)\n\
if [ -n "$BUGSINK_ADMIN_EMAIL" ] && [ -n "$BUGSINK_ADMIN_PASSWORD" ]; then\n\
export CREATE_SUPERUSER="${BUGSINK_ADMIN_EMAIL}:${BUGSINK_ADMIN_PASSWORD}"\n\
fi\n\
\n\
# Wait for PostgreSQL to be ready\n\
until pg_isready -h ${BUGSINK_DB_HOST:-postgres} -p ${BUGSINK_DB_PORT:-5432} -U ${BUGSINK_DB_USER:-bugsink}; do\n\
echo "Waiting for PostgreSQL..."\n\
sleep 2\n\
done\n\
\n\
echo "PostgreSQL is ready. Starting Bugsink..."\n\
echo "DATABASE_URL: postgresql://${BUGSINK_DB_USER}:***@${BUGSINK_DB_HOST}:${BUGSINK_DB_PORT}/${BUGSINK_DB_NAME}"\n\
\n\
# Change to config directory so bugsink_conf.py can be found\n\
cd /opt/bugsink/conf\n\
\n\
# Run migrations\n\
echo "Running database migrations..."\n\
/opt/bugsink/bin/bugsink-manage migrate --noinput\n\
\n\
# Create superuser if CREATE_SUPERUSER is set (format: email:password)\n\
if [ -n "$CREATE_SUPERUSER" ]; then\n\
IFS=":" read -r ADMIN_EMAIL ADMIN_PASS <<< "$CREATE_SUPERUSER"\n\
/opt/bugsink/bin/bugsink-manage shell -c "\n\
from django.contrib.auth import get_user_model\n\
User = get_user_model()\n\
if not User.objects.filter(email='"'"'$ADMIN_EMAIL'"'"').exists():\n\
User.objects.create_superuser('"'"'$ADMIN_EMAIL'"'"', '"'"'$ADMIN_PASS'"'"')\n\
print('"'"'Superuser created'"'"')\n\
else:\n\
print('"'"'Superuser already exists'"'"')\n\
" || true\n\
fi\n\
\n\
# Start Bugsink with Gunicorn\n\
echo "Starting Gunicorn on port ${BUGSINK_PORT:-8000}..."\n\
exec /opt/bugsink/bin/gunicorn \\\n\
--bind 0.0.0.0:${BUGSINK_PORT:-8000} \\\n\
--workers ${BUGSINK_WORKERS:-2} \\\n\
--access-logfile - \\\n\
--error-logfile - \\\n\
bugsink.wsgi:application\n\
' > /usr/local/bin/start-bugsink.sh \
&& chmod +x /usr/local/bin/start-bugsink.sh
# ============================================================================
# Create Logstash Pipeline Configuration
# ============================================================================
# ADR-015: Pino and Redis logs → Bugsink
RUN mkdir -p /etc/logstash/conf.d /app/logs
RUN echo 'input {\n\
# Pino application logs\n\
file {\n\
path => "/app/logs/*.log"\n\
codec => json\n\
type => "pino"\n\
tags => ["app"]\n\
start_position => "beginning"\n\
sincedb_path => "/var/lib/logstash/sincedb_pino"\n\
}\n\
\n\
# Redis logs\n\
file {\n\
path => "/var/log/redis/*.log"\n\
type => "redis"\n\
tags => ["redis"]\n\
start_position => "beginning"\n\
sincedb_path => "/var/lib/logstash/sincedb_redis"\n\
}\n\
}\n\
\n\
filter {\n\
# Pino error detection (level 50 = error, 60 = fatal)\n\
if [type] == "pino" and [level] >= 50 {\n\
mutate { add_tag => ["error"] }\n\
}\n\
\n\
# Redis error detection\n\
if [type] == "redis" {\n\
grok {\n\
match => { "message" => "%%{POSINT:pid}:%%{WORD:role} %%{MONTHDAY} %%{MONTH} %%{TIME} %%{WORD:loglevel} %%{GREEDYDATA:redis_message}" }\n\
}\n\
if [loglevel] in ["WARNING", "ERROR"] {\n\
mutate { add_tag => ["error"] }\n\
}\n\
}\n\
}\n\
\n\
output {\n\
if "error" in [tags] {\n\
http {\n\
url => "http://localhost:8000/api/store/"\n\
http_method => "post"\n\
format => "json"\n\
}\n\
}\n\
\n\
# Debug output (comment out in production)\n\
stdout { codec => rubydebug }\n\
}\n\
' > /etc/logstash/conf.d/bugsink.conf
# Create Logstash sincedb directory
RUN mkdir -p /var/lib/logstash && chown -R logstash:logstash /var/lib/logstash
# ============================================================================
# Set Working Directory
# ============================================================================
@@ -52,6 +257,25 @@ ENV NODE_ENV=development
# Increase Node.js memory limit for large builds
ENV NODE_OPTIONS='--max-old-space-size=8192'
# Bugsink defaults (ADR-015)
ENV BUGSINK_DB_HOST=postgres
ENV BUGSINK_DB_PORT=5432
ENV BUGSINK_DB_NAME=bugsink
ENV BUGSINK_DB_USER=bugsink
ENV BUGSINK_DB_PASSWORD=bugsink_dev_password
ENV BUGSINK_PORT=8000
ENV BUGSINK_BASE_URL=http://localhost:8000
ENV BUGSINK_ADMIN_EMAIL=admin@localhost
ENV BUGSINK_ADMIN_PASSWORD=admin
# ============================================================================
# Expose Ports
# ============================================================================
# 3000 - Vite frontend
# 3001 - Express backend
# 8000 - Bugsink error tracking
EXPOSE 3000 3001 8000
# ============================================================================
# Default Command
# ============================================================================

INSTALL.md (new file, 168 lines)
# Installation Guide
This guide covers setting up a local development environment for Flyer Crawler.
## Prerequisites
- Node.js 20.x or later
- Access to a PostgreSQL database (local or remote)
- Redis instance (for session management)
- Google Gemini API key
- Google Maps API key (for geocoding)
## Quick Start
If you already have PostgreSQL and Redis configured:
```bash
# Install dependencies
npm install
# Run in development mode
npm run dev
```
---
## Development Environment with Podman (Recommended for Windows)
This approach uses Podman with an Ubuntu container for a consistent development environment.
### Step 1: Install Prerequisites on Windows
1. **Install WSL 2**: Podman on Windows relies on the Windows Subsystem for Linux. Run the following in an administrator PowerShell:
```powershell
wsl --install
```
2. **Install Podman Desktop**: Download and install [Podman Desktop for Windows](https://podman-desktop.io/).
### Step 2: Set Up Podman
1. **Initialize Podman**: Launch Podman Desktop. It will automatically set up its WSL 2 machine.
2. **Start Podman**: Ensure the Podman machine is running from the Podman Desktop interface.
### Step 3: Set Up the Ubuntu Container
1. **Pull Ubuntu Image**:
```bash
podman pull ubuntu:latest
```
2. **Create a Podman Volume** (persists node_modules between container restarts):
```bash
podman volume create node_modules_cache
```
3. **Run the Ubuntu Container**:
Open a terminal in your project's root directory and run:
```bash
podman run -it -p 3001:3001 -p 5173:5173 --name flyer-dev \
-v "$(pwd):/app" \
-v "node_modules_cache:/app/node_modules" \
ubuntu:latest
```
| Flag | Purpose |
| ------------------------------------------- | ------------------------------------------------ |
| `-p 3001:3001` | Forwards the backend server port |
| `-p 5173:5173` | Forwards the Vite frontend server port |
| `--name flyer-dev` | Names the container for easy reference |
| `-v "...:/app"` | Mounts your project directory into the container |
| `-v "node_modules_cache:/app/node_modules"` | Mounts the named volume for node_modules |
### Step 4: Configure the Ubuntu Environment
You are now inside the Ubuntu container's shell.
1. **Update Package Lists**:
```bash
apt-get update
```
2. **Install Dependencies**:
```bash
apt-get install -y curl git
curl -sL https://deb.nodesource.com/setup_20.x | bash -
apt-get install -y nodejs
```
3. **Navigate to Project Directory**:
```bash
cd /app
```
4. **Install Project Dependencies**:
```bash
npm install
```
### Step 5: Run the Development Server
```bash
npm run dev
```
### Step 6: Access the Application
- **Frontend**: http://localhost:5173
- **Backend API**: http://localhost:3001
### Managing the Container
| Action | Command |
| --------------------- | -------------------------------- |
| Stop the container | Press `Ctrl+C`, then type `exit` |
| Restart the container | `podman start -a -i flyer-dev` |
| Remove the container | `podman rm flyer-dev` |
---
## Environment Variables
This project is configured to run in a CI/CD environment and does not use `.env` files. All configuration must be provided as environment variables.
For local development, you can export these in your shell or use your IDE's environment configuration (see the example after the table):
| Variable | Description |
| --------------------------- | ------------------------------------- |
| `DB_HOST` | PostgreSQL server hostname |
| `DB_USER` | PostgreSQL username |
| `DB_PASSWORD` | PostgreSQL password |
| `DB_DATABASE_PROD` | Production database name |
| `JWT_SECRET` | Secret string for signing auth tokens |
| `VITE_GOOGLE_GENAI_API_KEY` | Google Gemini API key |
| `GOOGLE_MAPS_API_KEY` | Google Maps Geocoding API key |
| `REDIS_PASSWORD_PROD` | Production Redis password |
| `REDIS_PASSWORD_TEST` | Test Redis password |
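A minimal set of shell exports might look like this (all values are placeholders; see the table above for what each variable does):

```bash
export DB_HOST=localhost
export DB_USER=flyer_crawler_user
export DB_PASSWORD=change-me
export DB_DATABASE_PROD="flyer-crawler-prod"
export JWT_SECRET=$(openssl rand -hex 32)   # any long random string works
export VITE_GOOGLE_GENAI_API_KEY=your-gemini-key
export GOOGLE_MAPS_API_KEY=your-maps-key
export REDIS_PASSWORD_PROD=change-me
export REDIS_PASSWORD_TEST=change-me
```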
---
## Seeding Development Users
To create initial test accounts (`admin@example.com` and `user@example.com`):
```bash
npm run seed
```
After running, you may need to restart your IDE's TypeScript server to pick up any generated types.
---
## Next Steps
- [Database Setup](DATABASE.md) - Set up PostgreSQL with required extensions
- [Authentication Setup](AUTHENTICATION.md) - Configure OAuth providers
- [Deployment Guide](DEPLOYMENT.md) - Deploy to production

Some files were not shown because too many files have changed in this diff.