Compare commits


75 Commits

| Author | SHA1 | Message | Deploy to Test Environment / deploy-to-test (push) | Date |
| ------------- | ---------- | ------- | --------------------------------------------------- | ------------------------- |
| Gitea Actions | 33a1e146ab | ci: Bump version to 0.9.108 [skip ci] | | 2026-01-13 22:34:20 +05:00 |
| | 4f8216db77 | testing/staging fixin | Successful in 15m55s | 2026-01-13 09:33:38 -08:00 |
| Gitea Actions | 42d605d19f | ci: Bump version to 0.9.107 [skip ci] | | 2026-01-13 22:06:39 +05:00 |
| | 749350df7f | testing/staging fixin | Successful in 15m56s | 2026-01-13 09:03:42 -08:00 |
| Gitea Actions | ac085100fe | ci: Bump version to 0.9.106 [skip ci] | | 2026-01-13 21:43:43 +05:00 |
| | ce4ecd1268 | use port 3002 in test | Successful in 16m13s | 2026-01-13 08:42:34 -08:00 |
| Gitea Actions | a57cfc396b | ci: Bump version to 0.9.105 [skip ci] | | 2026-01-13 21:00:45 +05:00 |
| | 987badbf8d | use port 3002 in test | Successful in 15m41s | 2026-01-13 07:59:49 -08:00 |
| Gitea Actions | d38fcd21c1 | ci: Bump version to 0.9.104 [skip ci] | | 2026-01-13 08:11:38 +05:00 |
| | 6e36cc3b07 | logging + e2e test fixes | Successful in 16m34s | 2026-01-12 19:10:29 -08:00 |
| Gitea Actions | 62a8a8bf4b | ci: Bump version to 0.9.103 [skip ci] | | 2026-01-13 06:39:39 +05:00 |
| | 96038cfcf4 | logging work - almost there | Successful in 15m51s | 2026-01-12 17:38:58 -08:00 |
| Gitea Actions | 981214fdd0 | ci: Bump version to 0.9.102 [skip ci] | | 2026-01-13 06:27:55 +05:00 |
| | 92b0138108 | logging work - almost there | Failing after 1m2s | 2026-01-12 17:26:59 -08:00 |
| Gitea Actions | 27f0255240 | ci: Bump version to 0.9.101 [skip ci] | | 2026-01-13 05:57:55 +05:00 |
| | 4e06dde9e1 | logging work - almost there | Successful in 15m30s | 2026-01-12 16:57:18 -08:00 |
| Gitea Actions | b9a0e5b82c | ci: Bump version to 0.9.100 [skip ci] | | 2026-01-13 05:35:11 +05:00 |
| | bb7fe8dc2c | logging work - almost there | Successful in 15m28s | 2026-01-12 16:34:18 -08:00 |
| Gitea Actions | 81f1f2250b | ci: Bump version to 0.9.99 [skip ci] | | 2026-01-13 05:08:56 +05:00 |
| | c6c90bb615 | more new feature fixes + sentry logging | Successful in 15m53s | 2026-01-12 16:08:18 -08:00 |
| Gitea Actions | 60489a626b | ci: Bump version to 0.9.98 [skip ci] | | 2026-01-13 05:05:59 +05:00 |
| | 3c63e1ecbb | more new feature fixes + sentry logging | Cancelled | 2026-01-12 16:04:09 -08:00 |
| Gitea Actions | acbcb39cbe | ci: Bump version to 0.9.97 [skip ci] | | 2026-01-13 03:34:42 +05:00 |
| | a87a0b6af1 | unit test repairs | Successful in 17m12s | 2026-01-12 14:31:41 -08:00 |
| Gitea Actions | abdc3cb6db | ci: Bump version to 0.9.96 [skip ci] | | 2026-01-13 00:52:54 +05:00 |
| | 7a1bd50119 | unit test repairs | Successful in 17m42s | 2026-01-12 11:51:48 -08:00 |
| Gitea Actions | 87d75d0571 | ci: Bump version to 0.9.95 [skip ci] | | 2026-01-13 00:04:10 +05:00 |
| | faf2900c28 | unit test repairs | Successful in 16m43s | 2026-01-12 10:58:00 -08:00 |
| Gitea Actions | 5258efc179 | ci: Bump version to 0.9.94 [skip ci] | | 2026-01-12 21:11:57 +05:00 |
| | 2a5cc5bb51 | unit test repairs | Failing after 1m17s | 2026-01-12 08:10:37 -08:00 |
| Gitea Actions | 8eaee2844f | ci: Bump version to 0.9.93 [skip ci] | | 2026-01-12 08:57:24 +05:00 |
| | 440a19c3a7 | whoa - so much - new features (UPC,etc) - Sentry for app logging! so much more ! | Successful in 14m53s | 2026-01-11 19:55:10 -08:00 |
| | 4ae6d84240 | sql fix | Cancelled | 2026-01-11 19:49:13 -08:00 |
| Gitea Actions | 5870e5c614 | ci: Bump version to 0.9.92 [skip ci] | | 2026-01-12 08:20:09 +05:00 |
| | 2e7ebbd9ed | whoa - so much - new features (UPC,etc) - Sentry for app logging! so much more ! | Successful in 14m47s | 2026-01-11 19:18:52 -08:00 |
| Gitea Actions | dc3fa21359 | ci: Bump version to 0.9.91 [skip ci] | | 2026-01-12 08:08:50 +05:00 |
| | 11aeac5edd | whoa - so much - new features (UPC,etc) - Sentry for app logging! so much more ! | Failing after 1m10s | 2026-01-11 19:07:02 -08:00 |
| Gitea Actions | f6c0c082bc | ci: Bump version to 0.9.90 [skip ci] | | 2026-01-11 15:05:48 +05:00 |
| | 4e22213cd1 | all the new shiny things | Successful in 15m54s | 2026-01-11 02:04:52 -08:00 |
| Gitea Actions | 9815eb3686 | ci: Bump version to 0.9.89 [skip ci] | | 2026-01-11 12:58:20 +05:00 |
| | 2bf4a7c1e6 | google + github oauth | Successful in 15m39s | 2026-01-10 23:57:18 -08:00 |
| Gitea Actions | 5eed3f51f4 | ci: Bump version to 0.9.88 [skip ci] | | 2026-01-11 12:01:25 +05:00 |
| | d250932c05 | all tests fixed? can it be? | Successful in 15m28s | 2026-01-10 22:58:38 -08:00 |
| Gitea Actions | 7d1f964574 | ci: Bump version to 0.9.87 [skip ci] | | 2026-01-11 08:30:29 +05:00 |
| | 3b69e58de3 | remove useless windows testing files, fix testing? | Successful in 14m1s | 2026-01-10 19:29:54 -08:00 |
| Gitea Actions | 5211aadd22 | ci: Bump version to 0.9.86 [skip ci] | | 2026-01-11 08:05:21 +05:00 |
| | a997d1d0b0 | ranstack query fixes | Successful in 13m21s | 2026-01-10 19:03:40 -08:00 |
| | cf5f77c58e | Adopt TanStack Query fixes | | 2026-01-10 19:02:42 -08:00 |
| Gitea Actions | cf0f5bb820 | ci: Bump version to 0.9.85 [skip ci] | | 2026-01-11 06:44:28 +05:00 |
| | 503e7084da | Adopt TanStack Query fixes | Successful in 14m41s | 2026-01-10 17:42:45 -08:00 |
| Gitea Actions | d8aa19ac40 | ci: Bump version to 0.9.84 [skip ci] | | 2026-01-10 23:45:42 +05:00 |
| | dcd9452b8c | Adopt TanStack Query | Successful in 13m46s | 2026-01-10 10:45:10 -08:00 |
| Gitea Actions | 6d468544e2 | ci: Bump version to 0.9.83 [skip ci] | | 2026-01-10 23:14:18 +05:00 |
| | 2913c7aa09 | tanstack | Failing after 1m1s | 2026-01-10 03:20:40 -08:00 |
| Gitea Actions | 77f9cb6081 | ci: Bump version to 0.9.82 [skip ci] | | 2026-01-10 12:17:24 +05:00 |
| | 2f1d73ca12 | fix(tests): access wrapped API response data correctly (tests were accessing `response.body` directly instead of `response.body.data`, causing failures since `sendSuccess()` wraps responses in `{ success, data }`) | Failing after 3h0m5s | 2026-01-09 23:16:30 -08:00 |
| Gitea Actions | 402e2617ca | ci: Bump version to 0.9.81 [skip ci] | | 2026-01-10 11:40:07 +05:00 |
| | e14c19c112 | linting docs + some fixes go claude and gemini | Successful in 16m0s | 2026-01-09 22:38:57 -08:00 |
| Gitea Actions | ea46f66c7a | ci: Bump version to 0.9.80 [skip ci] | | 2026-01-10 11:00:30 +05:00 |
| | a42ee5a461 | unit tests - wheeee! Claude is the mvp | Successful in 15m11s | 2026-01-09 21:59:09 -08:00 |
| Gitea Actions | 71710c8316 | ci: Bump version to 0.9.79 [skip ci] | | 2026-01-10 09:32:36 +05:00 |
| | 1480a73ab0 | more compliance | Failing after 58s | 2026-01-09 20:30:52 -08:00 |
| Gitea Actions | b3efa3c756 | ci: Bump version to 0.9.78 [skip ci] | | 2026-01-10 08:01:56 +05:00 |
| | fb8fd57bb6 | huge linting fixes | Successful in 15m3s | 2026-01-09 19:01:05 -08:00 |
| Gitea Actions | 0a90d9d590 | ci: Bump version to 0.9.77 [skip ci] | | 2026-01-10 07:54:20 +05:00 |
| | 6ab473f5f0 | huge linting fixes | Failing after 58s | 2026-01-09 18:50:04 -08:00 |
| Gitea Actions | c46efe1474 | ci: Bump version to 0.9.76 [skip ci] | | 2026-01-10 06:59:56 +05:00 |
| | 25d6b76f6d | ADR-026: Client-Side Logging + linting fixes | Cancelled | 2026-01-09 17:58:21 -08:00 |
| Gitea Actions | 9ffcc9d65d | ci: Bump version to 0.9.75 [skip ci] | | 2026-01-10 03:25:25 +05:00 |
| | 1285702210 | adr-028 fixes for tests | Successful in 15m38s | 2026-01-09 14:24:20 -08:00 |
| Gitea Actions | d38b751b40 | ci: Bump version to 0.9.74 [skip ci] | | 2026-01-10 03:14:12 +05:00 |
| | e122d55ced | adr-028 fixes for tests | Failing after 1m1s | 2026-01-09 14:12:48 -08:00 |
| Gitea Actions | af9992f773 | ci: Bump version to 0.9.73 [skip ci] | | 2026-01-10 01:54:56 +05:00 |
| | 3912139273 | adr-028 and int tests | Successful in 16m24s | 2026-01-09 12:47:41 -08:00 |
| | b5f7f5e4d1 | adr-0028 and int test fixes | | 2026-01-09 12:35:55 -08:00 |
361 changed files with 50877 additions and 5638 deletions

.claude/hooks.json (new file, +16)

@@ -0,0 +1,16 @@
{
"$schema": "https://claude.ai/schemas/hooks.json",
"hooks": {
"PreToolUse": [
{
"matcher": "Bash",
"hooks": [
{
"type": "command",
"command": "node -e \"const cmd = process.argv[1] || ''; const isTest = /\\b(npm\\s+(run\\s+)?test|vitest|jest)\\b/i.test(cmd); const isWindows = process.platform === 'win32'; const inContainer = process.env.REMOTE_CONTAINERS === 'true' || process.env.DEVCONTAINER === 'true'; if (isTest && isWindows && !inContainer) { console.error('BLOCKED: Tests must run on Linux. Use Dev Container (Reopen in Container) or WSL.'); process.exit(1); }\" -- \"$CLAUDE_TOOL_INPUT\""
}
]
}
]
}
}


@@ -18,11 +18,9 @@
"Bash(PGPASSWORD=postgres psql:*)",
"Bash(npm search:*)",
"Bash(npx:*)",
"Bash(curl -s -H \"Authorization: token c72bc0f14f623fec233d3c94b3a16397fe3649ef\" https://gitea.projectium.com/api/v1/user)",
"Bash(curl:*)",
"Bash(powershell:*)",
"Bash(cmd.exe:*)",
"Bash(export NODE_ENV=test DB_HOST=localhost DB_USER=postgres DB_PASSWORD=postgres DB_NAME=flyer_crawler_dev REDIS_URL=redis://localhost:6379 FRONTEND_URL=http://localhost:5173 JWT_SECRET=test-jwt-secret:*)",
"Bash(npm run test:integration:*)",
"Bash(grep:*)",
"Bash(done)",
@@ -76,7 +74,28 @@
"Bash(timeout 60 podman machine start:*)",
"Bash(podman build:*)",
"Bash(podman network rm:*)",
"Bash(npm run lint)"
"Bash(npm run lint)",
"Bash(npm run typecheck:*)",
"Bash(npm run type-check:*)",
"Bash(npm run test:unit:*)",
"mcp__filesystem__move_file",
"Bash(git checkout:*)",
"Bash(podman image inspect:*)",
"Bash(node -e:*)",
"Bash(xargs -I {} sh -c 'if ! grep -q \"\"vi.mock.*apiClient\"\" \"\"{}\"\"; then echo \"\"{}\"\"; fi')",
"Bash(MSYS_NO_PATHCONV=1 podman exec:*)",
"Bash(docker ps:*)",
"Bash(find:*)",
"Bash(\"/c/Users/games3/.local/bin/uvx.exe\" markitdown-mcp --help)",
"Bash(git stash:*)",
"Bash(ping:*)",
"Bash(tee:*)",
"Bash(timeout 1800 podman exec flyer-crawler-dev npm run test:unit:*)",
"mcp__filesystem__edit_file",
"Bash(timeout 300 tail:*)",
"mcp__filesystem__list_allowed_directories",
"mcp__memory__add_observations",
"Bash(ssh:*)"
]
}
}


@@ -41,6 +41,14 @@ FRONTEND_URL=http://localhost:3000
# REQUIRED: Secret key for signing JWT tokens (generate a random 64+ character string)
JWT_SECRET=your-super-secret-jwt-key-change-this-in-production
# OAuth Providers (Optional - enable social login)
# Google OAuth - https://console.cloud.google.com/apis/credentials
GOOGLE_CLIENT_ID=
GOOGLE_CLIENT_SECRET=
# GitHub OAuth - https://github.com/settings/developers
GITHUB_CLIENT_ID=
GITHUB_CLIENT_SECRET=
# ===================
# AI/ML Services
# ===================
@@ -75,3 +83,22 @@ CLEANUP_WORKER_CONCURRENCY=10
# Worker lock duration in milliseconds (default: 2 minutes)
WORKER_LOCK_DURATION=120000
# ===================
# Error Tracking (ADR-015)
# ===================
# Sentry-compatible error tracking via Bugsink (self-hosted)
# DSNs are created in Bugsink UI at http://localhost:8000 (dev) or your production URL
# Backend DSN - for Express/Node.js errors
SENTRY_DSN=
# Frontend DSN - for React/browser errors (uses VITE_ prefix)
VITE_SENTRY_DSN=
# Environment name for error grouping (defaults to NODE_ENV)
SENTRY_ENVIRONMENT=development
VITE_SENTRY_ENVIRONMENT=development
# Enable/disable error tracking (default: true)
SENTRY_ENABLED=true
VITE_SENTRY_ENABLED=true
# Enable debug mode for SDK troubleshooting (default: false)
SENTRY_DEBUG=false
VITE_SENTRY_DEBUG=false


@@ -1,66 +0,0 @@
{
"mcpServers": {
"gitea-projectium": {
"command": "d:\\gitea-mcp\\gitea-mcp.exe",
"args": ["run", "-t", "stdio"],
"env": {
"GITEA_HOST": "https://gitea.projectium.com",
"GITEA_ACCESS_TOKEN": "c72bc0f14f623fec233d3c94b3a16397fe3649ef"
}
},
"gitea-torbonium": {
"command": "d:\\gitea-mcp\\gitea-mcp.exe",
"args": ["run", "-t", "stdio"],
"env": {
"GITEA_HOST": "https://gitea.torbonium.com",
"GITEA_ACCESS_TOKEN": "391c9ddbe113378bc87bb8184800ba954648fcf8"
}
},
"gitea-lan": {
"command": "d:\\gitea-mcp\\gitea-mcp.exe",
"args": ["run", "-t", "stdio"],
"env": {
"GITEA_HOST": "https://gitea.torbolan.com",
"GITEA_ACCESS_TOKEN": "YOUR_LAN_TOKEN_HERE"
},
"disabled": true
},
"podman": {
"command": "D:\\nodejs\\npx.cmd",
"args": ["-y", "podman-mcp-server@latest"],
"env": {
"DOCKER_HOST": "npipe:////./pipe/podman-machine-default"
}
},
"filesystem": {
"command": "d:\\nodejs\\node.exe",
"args": [
"c:\\Users\\games3\\AppData\\Roaming\\npm\\node_modules\\@modelcontextprotocol\\server-filesystem\\dist\\index.js",
"d:\\gitea\\flyer-crawler.projectium.com\\flyer-crawler.projectium.com"
]
},
"fetch": {
"command": "D:\\nodejs\\npx.cmd",
"args": ["-y", "@modelcontextprotocol/server-fetch"]
},
"io.github.ChromeDevTools/chrome-devtools-mcp": {
"type": "stdio",
"command": "npx",
"args": ["chrome-devtools-mcp@0.12.1"],
"gallery": "https://api.mcp.github.com",
"version": "0.12.1"
},
"markitdown": {
"command": "C:\\Users\\games3\\.local\\bin\\uvx.exe",
"args": ["markitdown-mcp"]
},
"sequential-thinking": {
"command": "D:\\nodejs\\npx.cmd",
"args": ["-y", "@modelcontextprotocol/server-sequential-thinking"]
},
"memory": {
"command": "D:\\nodejs\\npx.cmd",
"args": ["-y", "@modelcontextprotocol/server-memory"]
}
}
}


@@ -98,6 +98,9 @@ jobs:
VITE_APP_VERSION="$(date +'%Y%m%d-%H%M'):$(git rev-parse --short HEAD):$PACKAGE_VERSION" \
VITE_APP_COMMIT_URL="$GITEA_SERVER_URL/${{ gitea.repository }}/commit/${{ gitea.sha }}" \
VITE_APP_COMMIT_MESSAGE="$COMMIT_MESSAGE" \
VITE_SENTRY_DSN="${{ secrets.VITE_SENTRY_DSN }}" \
VITE_SENTRY_ENVIRONMENT="production" \
VITE_SENTRY_ENABLED="true" \
VITE_API_BASE_URL=/api VITE_API_KEY=${{ secrets.VITE_GOOGLE_GENAI_API_KEY }} npm run build
- name: Deploy Application to Production Server
@@ -130,6 +133,15 @@ jobs:
SMTP_USER: ''
SMTP_PASS: ''
SMTP_FROM_EMAIL: 'noreply@flyer-crawler.projectium.com'
# OAuth Providers
GOOGLE_CLIENT_ID: ${{ secrets.GOOGLE_CLIENT_ID }}
GOOGLE_CLIENT_SECRET: ${{ secrets.GOOGLE_CLIENT_SECRET }}
GITHUB_CLIENT_ID: ${{ secrets.GH_CLIENT_ID }}
GITHUB_CLIENT_SECRET: ${{ secrets.GH_CLIENT_SECRET }}
# Sentry/Bugsink Error Tracking (ADR-015)
SENTRY_DSN: ${{ secrets.SENTRY_DSN }}
SENTRY_ENVIRONMENT: 'production'
SENTRY_ENABLED: 'true'
run: |
if [ -z "$DB_HOST" ] || [ -z "$DB_USER" ] || [ -z "$DB_PASSWORD" ] || [ -z "$DB_NAME" ]; then
echo "ERROR: One or more production database secrets (DB_HOST, DB_USER, DB_PASSWORD, DB_DATABASE_PROD) are not set."
@@ -159,7 +171,7 @@ jobs:
else
echo "Version mismatch (Running: $RUNNING_VERSION -> Deployed: $NEW_VERSION) or app not running. Reloading PM2..."
fi
pm2 startOrReload ecosystem.config.cjs --env production --update-env && pm2 save
pm2 startOrReload ecosystem.config.cjs --update-env && pm2 save
echo "Production backend server reloaded successfully."
else
echo "Version $NEW_VERSION is already running. Skipping PM2 reload."


@@ -198,8 +198,8 @@ jobs:
--reporter=verbose --includeTaskLocation --testTimeout=10000 --silent=passed-only || true
echo "--- Running E2E Tests ---"
# Run E2E tests using the dedicated E2E config which inherits from integration config.
# We still pass --coverage to enable it, but directory and timeout are now in the config.
# Run E2E tests using the dedicated E2E config.
# E2E uses port 3098, integration uses 3099 to avoid conflicts.
npx vitest run --config vitest.config.e2e.ts --coverage \
--coverage.exclude='**/*.test.ts' \
--coverage.exclude='**/tests/**' \
@@ -240,7 +240,19 @@ jobs:
# Run c8: read raw files from the temp dir, and output an Istanbul JSON report.
# We only generate the 'json' report here because it's all nyc needs for merging.
echo "Server coverage report about to be generated..."
npx c8 report --exclude='**/*.test.ts' --exclude='**/tests/**' --exclude='**/mocks/**' --reporter=json --temp-directory .coverage/tmp/integration-server --reports-dir .coverage/integration-server
npx c8 report \
--include='src/**' \
--exclude='**/*.test.ts' \
--exclude='**/*.test.tsx' \
--exclude='**/tests/**' \
--exclude='**/mocks/**' \
--exclude='hostexecutor/**' \
--exclude='scripts/**' \
--exclude='*.config.js' \
--exclude='*.config.ts' \
--reporter=json \
--temp-directory .coverage/tmp/integration-server \
--reports-dir .coverage/integration-server
echo "Server coverage report generated. Verifying existence:"
ls -l .coverage/integration-server/coverage-final.json
@@ -280,12 +292,18 @@ jobs:
--reporter=html \
--report-dir .coverage/ \
--temp-dir "$NYC_SOURCE_DIR" \
--include "src/**" \
--exclude "**/*.test.ts" \
--exclude "**/*.test.tsx" \
--exclude "**/tests/**" \
--exclude "**/mocks/**" \
--exclude "**/index.tsx" \
--exclude "**/vite-env.d.ts" \
--exclude "**/vitest.setup.ts"
--exclude "**/vitest.setup.ts" \
--exclude "hostexecutor/**" \
--exclude "scripts/**" \
--exclude "*.config.js" \
--exclude "*.config.ts"
# Re-enable secret masking for subsequent steps.
echo "::secret-masking::"
@@ -368,6 +386,9 @@ jobs:
VITE_APP_VERSION="$(date +'%Y%m%d-%H%M'):$(git rev-parse --short HEAD):$PACKAGE_VERSION" \
VITE_APP_COMMIT_URL="$GITEA_SERVER_URL/${{ gitea.repository }}/commit/${{ gitea.sha }}" \
VITE_APP_COMMIT_MESSAGE="$COMMIT_MESSAGE" \
VITE_SENTRY_DSN="${{ secrets.VITE_SENTRY_DSN_TEST }}" \
VITE_SENTRY_ENVIRONMENT="test" \
VITE_SENTRY_ENABLED="true" \
VITE_API_BASE_URL="https://flyer-crawler-test.projectium.com/api" VITE_API_KEY=${{ secrets.VITE_GOOGLE_GENAI_API_KEY_TEST }} npm run build
- name: Deploy Application to Test Server
@@ -428,6 +449,10 @@ jobs:
SMTP_USER: '' # Using MailHog, no auth needed
SMTP_PASS: '' # Using MailHog, no auth needed
SMTP_FROM_EMAIL: 'noreply@flyer-crawler-test.projectium.com'
# Sentry/Bugsink Error Tracking (ADR-015)
SENTRY_DSN: ${{ secrets.SENTRY_DSN_TEST }}
SENTRY_ENVIRONMENT: 'test'
SENTRY_ENABLED: 'true'
run: |
# Fail-fast check to ensure secrets are configured in Gitea.
@@ -451,10 +476,11 @@ jobs:
echo "Cleaning up errored or stopped PM2 processes..."
node -e "const exec = require('child_process').execSync; try { const list = JSON.parse(exec('pm2 jlist').toString()); list.forEach(p => { if (p.pm2_env.status === 'errored' || p.pm2_env.status === 'stopped') { console.log('Deleting ' + p.pm2_env.status + ' process: ' + p.name + ' (' + p.pm2_env.pm_id + ')'); try { exec('pm2 delete ' + p.pm2_env.pm_id); } catch(e) { console.error('Failed to delete ' + p.pm2_env.pm_id); } } }); } catch (e) { console.error('Error cleaning up processes:', e); }"
# Use `startOrReload` with the ecosystem file. This is the standard, idempotent way to deploy.
# It will START the process if it's not running, or RELOAD it if it is.
# Use `startOrReload` with the TEST ecosystem file. This starts test-specific processes
# (flyer-crawler-api-test, flyer-crawler-worker-test, flyer-crawler-analytics-worker-test)
# that run separately from production processes.
# We also add `&& pm2 save` to persist the process list across server reboots.
pm2 startOrReload ecosystem.config.cjs --env test --update-env && pm2 save
pm2 startOrReload ecosystem-test.config.cjs --update-env && pm2 save
echo "Test backend server reloaded successfully."
# After a successful deployment, update the schema hash in the database.

.gitignore (vendored, +16)

@@ -11,6 +11,18 @@ node_modules
dist
dist-ssr
*.local
.env
*.tsbuildinfo
# Test coverage
coverage
.nyc_output
.coverage
# Test artifacts - flyer-images/ is a runtime directory
# Test fixtures are stored in src/tests/assets/ instead
flyer-images/
test-output.txt
# Editor directories and files
.vscode/*
@@ -22,3 +34,7 @@ dist-ssr
*.njsproj
*.sln
*.sw?
Thumbs.db
.claude
nul
tmpclaude*

.nycrc.json (new file, +5)

@@ -0,0 +1,5 @@
{
"text": {
"maxCols": 200
}
}

AUTHENTICATION.md (new file, +110)

@@ -0,0 +1,110 @@
# Authentication Setup
Flyer Crawler supports OAuth authentication via Google and GitHub. This guide walks through configuring both providers.
---
## Google OAuth
### Step 1: Create OAuth Credentials
1. Go to the [Google Cloud Console](https://console.cloud.google.com/)
2. Create a new project (or select an existing one)
3. Navigate to **APIs & Services > Credentials**
4. Click **Create Credentials > OAuth client ID**
5. Select **Web application** as the application type
### Step 2: Configure Authorized Redirect URIs
Add the callback URL where Google will redirect users after authentication:
| Environment | Redirect URI |
| ----------- | -------------------------------------------------- |
| Development | `http://localhost:3001/api/auth/google/callback` |
| Production | `https://your-domain.com/api/auth/google/callback` |
### Step 3: Save Credentials
After clicking **Create**, you'll receive:
- **Client ID**
- **Client Secret**
Store these securely as environment variables:
- `GOOGLE_CLIENT_ID`
- `GOOGLE_CLIENT_SECRET`
---
## GitHub OAuth
### Step 1: Create OAuth App
1. Go to your [GitHub Developer Settings](https://github.com/settings/developers)
2. Navigate to **OAuth Apps**
3. Click **New OAuth App**
### Step 2: Fill in Application Details
| Field | Value |
| -------------------------- | ---------------------------------------------------- |
| Application name | Flyer Crawler (or your preferred name) |
| Homepage URL | `http://localhost:5173` (dev) or your production URL |
| Authorization callback URL | `http://localhost:3001/api/auth/github/callback` |
### Step 3: Save GitHub Credentials
After clicking **Register application**, you'll receive:
- **Client ID**
- **Client Secret**
Store these securely as environment variables:
- `GITHUB_CLIENT_ID`
- `GITHUB_CLIENT_SECRET`
---
## Environment Variables Summary
| Variable | Description |
| ---------------------- | ---------------------------------------- |
| `GOOGLE_CLIENT_ID` | Google OAuth client ID |
| `GOOGLE_CLIENT_SECRET` | Google OAuth client secret |
| `GITHUB_CLIENT_ID` | GitHub OAuth client ID |
| `GITHUB_CLIENT_SECRET` | GitHub OAuth client secret |
| `JWT_SECRET` | Secret for signing authentication tokens |
---
## Production Considerations
When deploying to production:
1. **Update redirect URIs** in both Google Cloud Console and GitHub OAuth settings to use your production domain
2. **Use HTTPS** for all callback URLs in production
3. **Store secrets securely** using your CI/CD platform's secrets management (e.g., Gitea repository secrets)
---
## Troubleshooting
### "redirect_uri_mismatch" Error
The callback URL in your OAuth provider settings doesn't match what the application is sending. Verify:
- The URL is exactly correct (no trailing slashes, correct port)
- You're using the right environment (dev vs production URLs)
### "invalid_client" Error
The Client ID or Client Secret is incorrect. Double-check your environment variables.
---
## Related Documentation
- [Installation Guide](INSTALL.md) - Local development setup
- [Deployment Guide](DEPLOYMENT.md) - Production deployment

CLAUDE.md (new file, +391)

@@ -0,0 +1,391 @@
# Claude Code Project Instructions
## Communication Style: Ask Before Assuming
**IMPORTANT**: When helping with tasks, **ask clarifying questions before making assumptions**. Do not assume:
- What steps the user has or hasn't completed
- What the user already knows or has configured
- What external services (OAuth providers, APIs, etc.) are already set up
- What secrets or credentials have already been created
Instead, ask the user to confirm the current state before providing instructions or making recommendations. This prevents wasted effort and respects the user's existing work.
## Platform Requirement: Linux Only
**CRITICAL**: This application is designed to run **exclusively on Linux**. See [ADR-014](docs/adr/0014-containerization-and-deployment-strategy.md) for full details.
### Environment Terminology
- **Dev Container** (or just "dev"): The containerized Linux development environment (`flyer-crawler-dev`). This is where all development and testing should occur.
- **Host**: The Windows machine running Podman/Docker and VS Code.
When instructions say "run in dev" or "run in the dev container", they mean executing commands inside the `flyer-crawler-dev` container.
### Test Execution Rules
1. **ALL tests MUST be executed in the dev container** - the Linux container environment
2. **NEVER run tests directly on Windows host** - test results from Windows are unreliable
3. **Always use the dev container for testing** when developing on Windows
### How to Run Tests Correctly
```bash
# If on Windows, first open VS Code and "Reopen in Container"
# Then run tests inside the dev container:
npm test # Run all unit tests
npm run test:unit # Run unit tests only
npm run test:integration # Run integration tests (requires DB/Redis)
```
### Running Tests via Podman (from Windows host)
**Note:** This project has 2900+ unit tests. For AI-assisted development, pipe output to a file for easier processing.
The command to run unit tests in the dev container via podman:
```bash
# Basic (output to terminal)
podman exec -it flyer-crawler-dev npm run test:unit
# Recommended for AI processing: pipe to file
podman exec -it flyer-crawler-dev npm run test:unit 2>&1 | tee test-results.txt
```
The command to run integration tests in the dev container via podman:
```bash
podman exec -it flyer-crawler-dev npm run test:integration
```
For running specific test files:
```bash
podman exec -it flyer-crawler-dev npm test -- --run src/hooks/useAuth.test.tsx
```
### Why Linux Only?
- Path separators: Code uses POSIX-style paths (`/`) which may break on Windows
- Shell scripts in `scripts/` directory are Linux-only
- External dependencies like `pdftocairo` assume Linux installation paths
- Unix-style file permissions are assumed throughout
### Test Result Interpretation
- Tests that **pass on Windows but fail on Linux** = **BROKEN tests** (must be fixed)
- Tests that **fail on Windows but pass on Linux** = **PASSING tests** (acceptable)
## Development Workflow
1. Open project in VS Code
2. Use "Reopen in Container" (Dev Containers extension required) to enter the dev environment
3. Wait for dev container initialization to complete
4. Run `npm test` to verify the dev environment is working
5. Make changes and run tests inside the dev container
## Code Change Verification
After making any code changes, **always run a type-check** to catch TypeScript errors before committing:
```bash
npm run type-check
```
This prevents linting/type errors from being introduced into the codebase.
## Quick Reference
| Command | Description |
| -------------------------- | ---------------------------- |
| `npm test` | Run all unit tests |
| `npm run test:unit` | Run unit tests only |
| `npm run test:integration` | Run integration tests |
| `npm run dev:container` | Start dev server (container) |
| `npm run build` | Build for production |
| `npm run type-check` | Run TypeScript type checking |
## Database Schema Files
**CRITICAL**: The database schema files must be kept in sync with each other. When making schema changes:
| File | Purpose |
| ------------------------------ | ----------------------------------------------------------- |
| `sql/master_schema_rollup.sql` | Complete schema used by test database setup and reference |
| `sql/initial_schema.sql` | Base schema without seed data, used as standalone reference |
| `sql/migrations/*.sql` | Incremental migrations for production database updates |
**Maintenance Rules:**
1. **Keep `master_schema_rollup.sql` and `initial_schema.sql` in sync** - These files should contain the same table definitions
2. **When adding columns via migration**, also add them to both `master_schema_rollup.sql` and `initial_schema.sql`
3. **Migrations are for production deployments** - They use `ALTER TABLE` to add columns incrementally
4. **Schema files are for fresh installs** - They define the complete table structure
5. **Test database uses `master_schema_rollup.sql`** - If schema files are out of sync with migrations, tests will fail
**Example:** When `002_expiry_tracking.sql` adds `purchase_date` to `pantry_items`, that column must also exist in the `CREATE TABLE` statements in both `master_schema_rollup.sql` and `initial_schema.sql`.
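As a rough sketch of that rule (the table and column names come from the example above; everything else is illustrative):
```sql
-- sql/migrations/002_expiry_tracking.sql: incremental upgrade for production
ALTER TABLE pantry_items ADD COLUMN purchase_date DATE;

-- sql/master_schema_rollup.sql AND sql/initial_schema.sql: fresh-install definition
CREATE TABLE pantry_items (
    -- ...existing columns...
    purchase_date DATE  -- must match the migration above
);
```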
## Known Integration Test Issues and Solutions
This section documents common test issues encountered in integration tests, their root causes, and solutions. These patterns recur frequently.
### 1. Vitest globalSetup Runs in Separate Node.js Context
**Problem:** Vitest's `globalSetup` runs in a completely separate Node.js context from test files. This means:
- Singletons created in globalSetup are NOT the same instances as those in test files
- `global`, `globalThis`, and `process` are all isolated between contexts
- `vi.spyOn()` on module exports doesn't work cross-context
- Dependency injection via setter methods fails across contexts
**Affected Tests:** Any test trying to inject mocks into BullMQ worker services (e.g., AI failure tests, DB failure tests)
**Solution Options:**
1. Mark tests as `.todo()` until an API-based mock injection mechanism is implemented
2. Create test-only API endpoints that allow setting mock behaviors via HTTP
3. Use file-based or Redis-based mock flags that services check at runtime
**Example of affected code pattern:**
```typescript
// This DOES NOT work - different module instances
const { flyerProcessingService } = await import('../../services/workers.server');
flyerProcessingService._getAiProcessor()._setExtractAndValidateData(mockFn);
// The worker uses a different flyerProcessingService instance!
```
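For solution option 3 above, a minimal sketch of a Redis-backed mock flag (the key name, client wiring, and `realExtractAndValidateData` are hypothetical):
```typescript
import { createClient } from 'redis';

declare function realExtractAndValidateData(input: unknown): Promise<unknown>; // hypothetical

const redisClient = createClient({ url: process.env.REDIS_URL });
await redisClient.connect();

// Test context: raise the flag. The TTL keeps a crashed test from leaking it.
await redisClient.set('test:mock:ai-failure', '1', { EX: 60 });

// Worker context: because Redis is shared, the flag is visible here even
// though module instances are isolated from the test's Node.js context.
export async function extractAndValidateData(input: unknown) {
  if ((await redisClient.get('test:mock:ai-failure')) === '1') {
    throw new Error('Simulated AI failure (test mock flag)');
  }
  return realExtractAndValidateData(input);
}
```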
### 2. BullMQ Cleanup Queue Deleting Files Before Test Verification
**Problem:** The cleanup worker runs in the globalSetup context and processes cleanup jobs even when tests spy on `cleanupQueue.add()`. The spy intercepts calls in the test context, but jobs already queued run in the worker's context.
**Affected Tests:** EXIF/PNG metadata stripping tests that need to verify file contents before deletion
**Solution:** Drain and pause the cleanup queue before the test:
```typescript
const { cleanupQueue } = await import('../../services/queues.server');
await cleanupQueue.drain(); // Remove existing jobs
await cleanupQueue.pause(); // Prevent new jobs from processing
// ... run test ...
await cleanupQueue.resume(); // Restore normal operation
```
### 3. Cache Invalidation After Direct Database Inserts
**Problem:** Tests that insert data directly via SQL (bypassing the service layer) don't trigger cache invalidation. Subsequent API calls return stale cached data.
**Affected Tests:** Any test using `pool.query()` to insert flyers, stores, or other cached entities
**Solution:** Manually invalidate the cache after direct inserts:
```typescript
await pool.query('INSERT INTO flyers ...');
await cacheService.invalidateFlyers(); // Clear stale cache
```
### 4. Unique Filenames Required for Test Isolation
**Problem:** Multer generates predictable filenames in test environments, causing race conditions when multiple tests upload files concurrently or in sequence.
**Affected Tests:** Flyer processing tests, file upload tests
**Solution:** Always use unique filenames with timestamps:
```typescript
// In multer.middleware.ts
const uniqueSuffix = `${Date.now()}-${Math.round(Math.random() * 1e9)}`;
cb(null, `${file.fieldname}-${uniqueSuffix}-${sanitizedOriginalName}`);
```
### 5. Response Format Mismatches
**Problem:** API response formats may change, causing tests to fail when expecting old formats.
**Common Issues:**
- `response.body.data.jobId` vs `response.body.data.job.id`
- Nested objects vs flat response structures
- Type coercion (string vs number for IDs)
**Solution:** Always log response bodies during debugging and update test assertions to match actual API contracts.
### 6. External Service Availability
**Problem:** Tests depending on external services (PM2, Redis health checks) fail when those services aren't available in the test environment.
**Solution:** Use try/catch with graceful degradation or mock the external service checks.
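A minimal sketch of the graceful-degradation option (the `fetchPm2Health` helper is hypothetical):
```typescript
import { expect, it } from 'vitest';

declare function fetchPm2Health(): Promise<{ status: string }>; // hypothetical helper

it('reports external service health', async () => {
  let health: { status: string };
  try {
    health = await fetchPm2Health();
  } catch {
    console.warn('PM2 unavailable in this environment; skipping assertions');
    return; // degrade gracefully instead of failing the suite
  }
  expect(health.status).toBe('online');
});
```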
## Secrets and Environment Variables
**CRITICAL**: This project uses **Gitea CI/CD secrets** for all sensitive configuration. There is NO `/etc/flyer-crawler/environment` file or similar local config file on the server.
### Server Directory Structure
| Path | Environment | Notes |
| --------------------------------------------- | ----------- | ------------------------------------------------ |
| `/var/www/flyer-crawler.projectium.com/` | Production | NO `.env` file - secrets injected via CI/CD only |
| `/var/www/flyer-crawler-test.projectium.com/` | Test | Has `.env.test` file for test-specific config |
### How Secrets Work
1. **Gitea Secrets**: All secrets are stored in Gitea repository settings (Settings → Secrets)
2. **CI/CD Injection**: Secrets are injected during deployment via `.gitea/workflows/deploy-to-prod.yml` and `deploy-to-test.yml`
3. **PM2 Environment**: The CI/CD workflow passes secrets to PM2 via environment variables, which are then available to the application
### Key Files for Configuration
| File | Purpose |
| ------------------------------------- | ---------------------------------------------------- |
| `src/config/env.ts` | Centralized config with Zod schema validation |
| `ecosystem.config.cjs` | PM2 process config - reads from `process.env` |
| `.gitea/workflows/deploy-to-prod.yml` | Production deployment with secret injection |
| `.gitea/workflows/deploy-to-test.yml` | Test deployment with secret injection |
| `.env.example` | Template showing all available environment variables |
| `.env.test` | Test environment overrides (only on test server) |
### Adding New Secrets
To add a new secret (e.g., `SENTRY_DSN`):
1. Add the secret to Gitea repository settings
2. Update the relevant workflow file (e.g., `deploy-to-prod.yml`) to inject it:
```yaml
SENTRY_DSN: ${{ secrets.SENTRY_DSN }}
```
3. Update `ecosystem.config.cjs` to read it from `process.env`
4. Update `src/config/env.ts` schema if validation is needed
5. Update `.env.example` to document the new variable
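Steps 3 and 4 might look roughly like this (a sketch; the actual file layout may differ):
```javascript
// ecosystem.config.cjs (step 3): forward the CI-injected secret to the process.
module.exports = {
  apps: [
    {
      name: 'flyer-crawler-api',
      script: 'dist/server.js', // hypothetical entry point
      env: {
        SENTRY_DSN: process.env.SENTRY_DSN,
      },
    },
  ],
};
```
In `src/config/env.ts` (step 4), the Zod schema would gain a matching field such as `SENTRY_DSN: z.string().optional()`.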
### Current Gitea Secrets
**Shared (used by both environments):**
- `DB_HOST`, `DB_USER`, `DB_PASSWORD` - Database credentials
- `JWT_SECRET` - Authentication
- `GOOGLE_MAPS_API_KEY` - Google Maps
- `GOOGLE_CLIENT_ID`, `GOOGLE_CLIENT_SECRET` - Google OAuth
- `GH_CLIENT_ID`, `GH_CLIENT_SECRET` - GitHub OAuth
**Production-specific:**
- `DB_DATABASE_PROD` - Production database name
- `REDIS_PASSWORD_PROD` - Redis password (uses database 0)
- `VITE_GOOGLE_GENAI_API_KEY` - Gemini API key for production
- `SENTRY_DSN`, `VITE_SENTRY_DSN` - Bugsink error tracking DSNs (production projects)
**Test-specific:**
- `DB_DATABASE_TEST` - Test database name
- `REDIS_PASSWORD_TEST` - Redis password (uses database 1 for isolation)
- `VITE_GOOGLE_GENAI_API_KEY_TEST` - Gemini API key for test
- `SENTRY_DSN_TEST`, `VITE_SENTRY_DSN_TEST` - Bugsink error tracking DSNs (test projects)
### Test Environment
The test environment (`flyer-crawler-test.projectium.com`) uses **both** Gitea CI/CD secrets and a local `.env.test` file:
- **Gitea secrets**: Injected during deployment via `.gitea/workflows/deploy-to-test.yml`
- **`.env.test` file**: Located at `/var/www/flyer-crawler-test.projectium.com/.env.test` for local overrides
- **Redis database 1**: Isolates test job queues from production (which uses database 0)
- **PM2 process names**: Suffixed with `-test` (e.g., `flyer-crawler-api-test`)
### Dev Container Environment
The dev container runs its own **local Bugsink instance** - it does NOT connect to the production Bugsink server:
- **Local Bugsink**: Runs at `http://localhost:8000` inside the container
- **Pre-configured DSNs**: Set in `compose.dev.yml`, pointing to local instance
- **Admin credentials**: `admin@localhost` / `admin`
- **Isolated**: Dev errors stay local, don't pollute production/test dashboards
- **No Gitea secrets needed**: Everything is self-contained in the container
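For instance, the pre-configured DSNs might appear in `compose.dev.yml` along these lines (a hypothetical excerpt; the service name and keys are placeholders):
```yaml
services:
  app:
    environment:
      SENTRY_DSN: http://<backend-key>@localhost:8000/1
      VITE_SENTRY_DSN: http://<frontend-key>@localhost:8000/2
      SENTRY_ENVIRONMENT: development
      VITE_SENTRY_ENVIRONMENT: development
```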
---
## MCP Servers
The following MCP servers are configured for this project:
| Server | Purpose |
| --------------------- | ------------------------------------------- |
| gitea-projectium | Gitea API for gitea.projectium.com |
| gitea-torbonium | Gitea API for gitea.torbonium.com |
| podman | Container management |
| filesystem | File system access |
| fetch | Web fetching |
| markitdown | Convert documents to markdown |
| sequential-thinking | Step-by-step reasoning |
| memory | Knowledge graph persistence |
| postgres | Direct database queries (localhost:5432) |
| playwright | Browser automation and testing |
| redis | Redis cache inspection (localhost:6379) |
| sentry-selfhosted-mcp | Error tracking via Bugsink (localhost:8000) |
**Note:** MCP servers work in both **Claude CLI** and **Claude Code VS Code extension** (as of January 2026).
### Sentry/Bugsink MCP Server Setup (ADR-015)
To enable Claude Code to query and analyze application errors from Bugsink:
1. **Install the MCP server**:
```bash
# Clone the sentry-selfhosted-mcp repository
git clone https://github.com/ddfourtwo/sentry-selfhosted-mcp.git
cd sentry-selfhosted-mcp
npm install
```
2. **Configure Claude Code** (add to `.claude/mcp.json`):
```json
{
"sentry-selfhosted-mcp": {
"command": "node",
"args": ["/path/to/sentry-selfhosted-mcp/dist/index.js"],
"env": {
"SENTRY_URL": "http://localhost:8000",
"SENTRY_AUTH_TOKEN": "<get-from-bugsink-ui>",
"SENTRY_ORG_SLUG": "flyer-crawler"
}
}
}
```
3. **Get the auth token**:
- Navigate to Bugsink UI at `http://localhost:8000`
- Log in with admin credentials
- Go to Settings > API Keys
- Create a new API key with read access
4. **Available capabilities**:
- List projects and issues
- View detailed error events
- Search by error message or stack trace
- Update issue status (resolve, ignore)
- Add comments to issues
### SSH Server Access
Claude Code can execute commands on the production server via SSH:
```bash
# Basic command execution
ssh root@projectium.com "command here"
# Examples:
ssh root@projectium.com "systemctl status logstash"
ssh root@projectium.com "pm2 list"
ssh root@projectium.com "tail -50 /var/www/flyer-crawler.projectium.com/logs/app.log"
```
**Use cases:**
- Managing Logstash, PM2, NGINX, Redis services
- Viewing server logs
- Deploying configuration changes
- Checking service status
**Important:** SSH access requires the host machine to have SSH keys configured for `root@projectium.com`.

DATABASE.md (new file, +188)

@@ -0,0 +1,188 @@
# Database Setup
Flyer Crawler uses PostgreSQL with several extensions for full-text search, geographic data, and UUID generation.
---
## Required Extensions
| Extension | Purpose |
| ----------- | ------------------------------------------- |
| `postgis` | Geographic/spatial data for store locations |
| `pg_trgm` | Trigram matching for fuzzy text search |
| `uuid-ossp` | UUID generation for primary keys |
---
## Production Database Setup
### Step 1: Install PostgreSQL
```bash
sudo apt update
sudo apt install postgresql postgresql-contrib
```
### Step 2: Create Database and User
Switch to the postgres system user:
```bash
sudo -u postgres psql
```
Run the following SQL commands (replace `'a_very_strong_password'` with a secure password):
```sql
-- Create a new role for your application
CREATE ROLE flyer_crawler_user WITH LOGIN PASSWORD 'a_very_strong_password';
-- Create the production database
CREATE DATABASE "flyer-crawler-prod" WITH OWNER = flyer_crawler_user;
-- Connect to the new database
\c "flyer-crawler-prod"
-- Install required extensions (must be done as superuser)
CREATE EXTENSION IF NOT EXISTS postgis;
CREATE EXTENSION IF NOT EXISTS pg_trgm;
CREATE EXTENSION IF NOT EXISTS "uuid-ossp";
-- Exit
\q
```
### Step 3: Apply the Schema
Navigate to your project directory and run:
```bash
psql -U flyer_crawler_user -d "flyer-crawler-prod" -f sql/master_schema_rollup.sql
```
This creates all tables, functions, triggers, and seeds essential data (categories, master items).
### Step 4: Seed the Admin Account
Set the required environment variables and run the seed script:
```bash
export DB_USER=flyer_crawler_user
export DB_PASSWORD=your_password
export DB_NAME="flyer-crawler-prod"
export DB_HOST=localhost
npx tsx src/db/seed_admin_account.ts
```
---
## Test Database Setup
The test database is used by CI/CD pipelines and local test runs.
### Step 1: Create the Test Database
```bash
sudo -u postgres psql
```
```sql
-- Create the test database
CREATE DATABASE "flyer-crawler-test" WITH OWNER = flyer_crawler_user;
-- Connect to the test database
\c "flyer-crawler-test"
-- Install required extensions
CREATE EXTENSION IF NOT EXISTS postgis;
CREATE EXTENSION IF NOT EXISTS pg_trgm;
CREATE EXTENSION IF NOT EXISTS "uuid-ossp";
-- Grant schema ownership (required for test runner to reset schema)
ALTER SCHEMA public OWNER TO flyer_crawler_user;
-- Exit
\q
```
### Step 2: Configure CI/CD Secrets
Ensure these secrets are set in your Gitea repository settings:
| Secret | Description |
| ------------- | ------------------------------------------ |
| `DB_HOST` | Database hostname (e.g., `localhost`) |
| `DB_PORT` | Database port (e.g., `5432`) |
| `DB_USER` | Database user (e.g., `flyer_crawler_user`) |
| `DB_PASSWORD` | Database password |
---
## How the Test Pipeline Works
The CI pipeline uses a permanent test database that gets reset on each test run:
1. **Setup**: The vitest global setup script connects to `flyer-crawler-test`
2. **Schema Reset**: Executes `sql/drop_tables.sql` (`DROP SCHEMA public CASCADE`)
3. **Schema Application**: Runs `sql/master_schema_rollup.sql` to build a fresh schema
4. **Test Execution**: Tests run against the clean database
This approach is faster than creating/destroying databases and doesn't require sudo access.
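A minimal sketch of that reset, assuming `pg` and the schema files named above (the real global setup does more):
```typescript
import { readFileSync } from 'node:fs';
import { Pool } from 'pg';

export default async function globalSetup(): Promise<void> {
  const pool = new Pool({ database: 'flyer-crawler-test' }); // host/user/password from env
  // Step 2: drop everything (sql/drop_tables.sql runs DROP SCHEMA public CASCADE).
  await pool.query(readFileSync('sql/drop_tables.sql', 'utf8'));
  // Step 3: rebuild tables, functions, triggers, and seed data from the rollup.
  await pool.query(readFileSync('sql/master_schema_rollup.sql', 'utf8'));
  await pool.end();
}
```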
---
## Connecting to Production Database
```bash
psql -h localhost -U flyer_crawler_user -d "flyer-crawler-prod" -W
```
---
## Checking PostGIS Version
```sql
SELECT version();
SELECT PostGIS_Full_Version();
```
Example output:
```
PostgreSQL 14.19 (Ubuntu 14.19-0ubuntu0.22.04.1)
POSTGIS="3.2.0 c3e3cc0" GEOS="3.10.2-CAPI-1.16.0" PROJ="8.2.1"
```
---
## Schema Files
| File | Purpose |
| ------------------------------ | --------------------------------------------------------- |
| `sql/master_schema_rollup.sql` | Complete schema with all tables, functions, and seed data |
| `sql/drop_tables.sql` | Drops entire schema (used by test runner) |
| `sql/schema.sql.txt` | Legacy schema file (reference only) |
---
## Backup and Restore
### Create a Backup
```bash
pg_dump -U flyer_crawler_user -d "flyer-crawler-prod" -F c -f backup.dump
```
### Restore from Backup
```bash
pg_restore -U flyer_crawler_user -d "flyer-crawler-prod" -c backup.dump
```
---
## Related Documentation
- [Installation Guide](INSTALL.md) - Local development setup
- [Deployment Guide](DEPLOYMENT.md) - Production deployment

DEPLOYMENT.md (new file, +271)

@@ -0,0 +1,271 @@
# Deployment Guide
This guide covers deploying Flyer Crawler to a production server.
## Prerequisites
- Ubuntu server (22.04 LTS recommended)
- PostgreSQL 14+ with PostGIS extension
- Redis
- Node.js 20.x
- NGINX (reverse proxy)
- PM2 (process manager)
---
## Server Setup
### Install Node.js
```bash
curl -sL https://deb.nodesource.com/setup_20.x | sudo bash -
sudo apt-get install -y nodejs
```
### Install PM2
```bash
sudo npm install -g pm2
```
---
## Application Deployment
### Clone and Install
```bash
git clone <repository-url>
cd flyer-crawler.projectium.com
npm install
```
### Build for Production
```bash
npm run build
```
### Start with PM2
```bash
npm run start:prod
```
This starts three PM2 processes:
- `flyer-crawler-api` - Main API server
- `flyer-crawler-worker` - Background job worker
- `flyer-crawler-analytics-worker` - Analytics processing worker
---
## Environment Variables (Gitea Secrets)
For deployments using Gitea CI/CD workflows, configure these as **repository secrets**:
| Secret | Description |
| --------------------------- | ------------------------------------------- |
| `DB_HOST` | PostgreSQL server hostname |
| `DB_USER` | PostgreSQL username |
| `DB_PASSWORD` | PostgreSQL password |
| `DB_DATABASE_PROD` | Production database name |
| `REDIS_PASSWORD_PROD` | Production Redis password |
| `REDIS_PASSWORD_TEST` | Test Redis password |
| `JWT_SECRET` | Long, random string for signing auth tokens |
| `VITE_GOOGLE_GENAI_API_KEY` | Google Gemini API key |
| `GOOGLE_MAPS_API_KEY` | Google Maps Geocoding API key |
---
## NGINX Configuration
### Reverse Proxy Setup
Create a site configuration at `/etc/nginx/sites-available/flyer-crawler.projectium.com`:
```nginx
server {
listen 80;
server_name flyer-crawler.projectium.com;
location / {
proxy_pass http://localhost:5173;
proxy_http_version 1.1;
proxy_set_header Upgrade $http_upgrade;
proxy_set_header Connection 'upgrade';
proxy_set_header Host $host;
proxy_cache_bypass $http_upgrade;
}
location /api {
proxy_pass http://localhost:3001;
proxy_http_version 1.1;
proxy_set_header Upgrade $http_upgrade;
proxy_set_header Connection 'upgrade';
proxy_set_header Host $host;
proxy_cache_bypass $http_upgrade;
}
}
```
Enable the site:
```bash
sudo ln -s /etc/nginx/sites-available/flyer-crawler.projectium.com /etc/nginx/sites-enabled/
sudo nginx -t
sudo systemctl reload nginx
```
### MIME Types Fix for .mjs Files
If JavaScript modules (`.mjs` files) aren't loading correctly, add the proper MIME type.
**Option 1**: Edit the site configuration file directly:
```nginx
# Add inside the server block
types {
application/javascript js mjs;
}
```
**Option 2**: Edit `/etc/nginx/mime.types` globally:
```
# Change this line:
application/javascript js;
# To:
application/javascript js mjs;
```
After changes:
```bash
sudo nginx -t
sudo systemctl reload nginx
```
---
## PM2 Log Management
Install and configure pm2-logrotate to manage log files:
```bash
pm2 install pm2-logrotate
pm2 set pm2-logrotate:max_size 10M
pm2 set pm2-logrotate:retain 14
pm2 set pm2-logrotate:compress false
pm2 set pm2-logrotate:dateFormat YYYY-MM-DD_HH-mm-ss
```
---
## Rate Limiting
The application respects the Gemini AI service's rate limits. You can adjust the `GEMINI_RPM` (requests per minute) environment variable in production as needed without changing the code.
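For example, assuming `ecosystem.config.cjs` reads the value from `process.env` (as with the other secrets above):
```bash
export GEMINI_RPM=10   # throttle Gemini calls to 10 requests per minute
pm2 startOrReload ecosystem.config.cjs --update-env && pm2 save
```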
---
## CI/CD Pipeline
The project includes Gitea workflows in `.gitea/workflows/` (`deploy-to-prod.yml` and `deploy-to-test.yml`) that:
1. Run tests against a test database
2. Build the application
3. Deploy to production on successful builds
The workflow automatically:
- Sets up the test database schema before tests
- Tears down test data after tests complete
- Deploys to the production server
---
## Monitoring
### Check PM2 Status
```bash
pm2 status
pm2 logs
pm2 logs flyer-crawler-api --lines 100
```
### Restart Services
```bash
pm2 restart all
pm2 restart flyer-crawler-api
```
---
## Error Tracking with Bugsink (ADR-015)
Bugsink is a self-hosted Sentry-compatible error tracking system. See [docs/adr/0015-application-performance-monitoring-and-error-tracking.md](docs/adr/0015-application-performance-monitoring-and-error-tracking.md) for the full architecture decision.
### Creating Bugsink Projects and DSNs
After Bugsink is installed and running, you need to create projects and obtain DSNs:
1. **Access Bugsink UI**: Navigate to `http://localhost:8000`
2. **Log in** with your admin credentials
3. **Create Backend Project**:
- Click "Create Project"
- Name: `flyer-crawler-backend`
- Platform: Node.js
- Copy the generated DSN (format: `http://<key>@localhost:8000/<project_id>`)
4. **Create Frontend Project**:
- Click "Create Project"
- Name: `flyer-crawler-frontend`
- Platform: React
- Copy the generated DSN
5. **Configure Environment Variables**:
```bash
# Backend (server-side)
export SENTRY_DSN=http://<backend-key>@localhost:8000/<backend-project-id>
# Frontend (client-side, exposed to browser)
export VITE_SENTRY_DSN=http://<frontend-key>@localhost:8000/<frontend-project-id>
# Shared settings
export SENTRY_ENVIRONMENT=production
export VITE_SENTRY_ENVIRONMENT=production
export SENTRY_ENABLED=true
export VITE_SENTRY_ENABLED=true
```
### Testing Error Tracking
Verify Bugsink is receiving events:
```bash
npx tsx scripts/test-bugsink.ts
```
This sends test error and info events. Check the Bugsink UI for:
- `BugsinkTestError` in the backend project
- Info message "Test info message from test-bugsink.ts"
### Sentry SDK v10+ HTTP DSN Limitation
The Sentry SDK v10+ enforces HTTPS-only DSNs by default. Since Bugsink runs locally over HTTP, our implementation uses the Sentry Store API directly instead of the SDK's built-in transport. This is handled transparently by the `sentry.server.ts` and `sentry.client.ts` modules.
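A rough sketch of such a direct transport (the Store endpoint shape is standard Sentry protocol; the payload here is minimal and illustrative):
```typescript
// Parse the DSN: http://<key>@localhost:8000/<project_id>
const dsn = new URL(process.env.SENTRY_DSN ?? '');
const projectId = dsn.pathname.replace(/^\//, '');

// POST directly to the Store API instead of using the SDK's HTTPS-only transport.
await fetch(`${dsn.protocol}//${dsn.host}/api/${projectId}/store/`, {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
    'X-Sentry-Auth': `Sentry sentry_version=7, sentry_key=${dsn.username}`,
  },
  body: JSON.stringify({ message: 'BugsinkTestError', level: 'error', platform: 'node' }),
});
```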
---
## Related Documentation
- [Database Setup](DATABASE.md) - PostgreSQL and PostGIS configuration
- [Authentication Setup](AUTHENTICATION.md) - OAuth provider configuration
- [Installation Guide](INSTALL.md) - Local development setup
- [Bare-Metal Server Setup](docs/BARE-METAL-SETUP.md) - Manual server installation guide


@@ -7,7 +7,7 @@
#
# Base: Ubuntu 22.04 (LTS) - matches production server
# Node: v20.x (LTS) - matches production
# Includes: PostgreSQL client, Redis CLI, build tools
# Includes: PostgreSQL client, Redis CLI, build tools, Bugsink, Logstash
# ============================================================================
FROM ubuntu:22.04
@@ -21,16 +21,23 @@ ENV DEBIAN_FRONTEND=noninteractive
# - curl: for downloading Node.js setup script and health checks
# - git: for version control operations
# - build-essential: for compiling native Node.js modules (node-gyp)
# - python3: required by some Node.js build tools
# - python3, python3-pip, python3-venv: for Bugsink
# - postgresql-client: for psql CLI (database initialization)
# - redis-tools: for redis-cli (health checks)
# - gnupg, apt-transport-https: for Elastic APT repository (Logstash)
# - openjdk-17-jre-headless: required by Logstash
RUN apt-get update && apt-get install -y \
curl \
git \
build-essential \
python3 \
python3-pip \
python3-venv \
postgresql-client \
redis-tools \
gnupg \
apt-transport-https \
openjdk-17-jre-headless \
&& rm -rf /var/lib/apt/lists/*
# ============================================================================
@@ -39,6 +46,204 @@ RUN apt-get update && apt-get install -y \
RUN curl -fsSL https://deb.nodesource.com/setup_20.x | bash - \
&& apt-get install -y nodejs
# ============================================================================
# Install Logstash (Elastic APT Repository)
# ============================================================================
# ADR-015: Log aggregation for Pino and Redis logs → Bugsink
RUN curl -fsSL https://artifacts.elastic.co/GPG-KEY-elasticsearch | gpg --dearmor -o /usr/share/keyrings/elastic-keyring.gpg \
&& echo "deb [signed-by=/usr/share/keyrings/elastic-keyring.gpg] https://artifacts.elastic.co/packages/8.x/apt stable main" | tee /etc/apt/sources.list.d/elastic-8.x.list \
&& apt-get update \
&& apt-get install -y logstash \
&& rm -rf /var/lib/apt/lists/*
# ============================================================================
# Install Bugsink (Python Package)
# ============================================================================
# ADR-015: Self-hosted Sentry-compatible error tracking
# Create a virtual environment for Bugsink to avoid conflicts
RUN python3 -m venv /opt/bugsink \
&& /opt/bugsink/bin/pip install --upgrade pip \
&& /opt/bugsink/bin/pip install bugsink gunicorn psycopg2-binary
# Create Bugsink directories and configuration
RUN mkdir -p /var/log/bugsink /var/lib/bugsink /opt/bugsink/conf
# Create Bugsink configuration file (Django settings module)
# This file is imported by bugsink-manage via DJANGO_SETTINGS_MODULE
# Based on bugsink/conf_templates/docker.py.template but customized for our setup
RUN echo 'import os\n\
from urllib.parse import urlparse\n\
\n\
from bugsink.settings.default import *\n\
from bugsink.settings.default import DATABASES, SILENCED_SYSTEM_CHECKS\n\
from bugsink.conf_utils import deduce_allowed_hosts, deduce_script_name\n\
\n\
IS_DOCKER = True\n\
\n\
# Security settings\n\
SECRET_KEY = os.getenv("SECRET_KEY")\n\
DEBUG = os.getenv("DEBUG", "False").lower() in ("true", "1", "yes")\n\
\n\
# Silence cookie security warnings for dev (no HTTPS)\n\
SILENCED_SYSTEM_CHECKS += ["security.W012", "security.W016"]\n\
\n\
# Database configuration from DATABASE_URL environment variable\n\
if os.getenv("DATABASE_URL"):\n\
DATABASE_URL = os.getenv("DATABASE_URL")\n\
parsed = urlparse(DATABASE_URL)\n\
\n\
if parsed.scheme in ["postgres", "postgresql"]:\n\
DATABASES["default"] = {\n\
"ENGINE": "django.db.backends.postgresql",\n\
"NAME": parsed.path.lstrip("/"),\n\
"USER": parsed.username,\n\
"PASSWORD": parsed.password,\n\
"HOST": parsed.hostname,\n\
"PORT": parsed.port or "5432",\n\
}\n\
\n\
# Snappea (background task runner) settings\n\
SNAPPEA = {\n\
"TASK_ALWAYS_EAGER": False,\n\
"WORKAHOLIC": True,\n\
"NUM_WORKERS": 2,\n\
"PID_FILE": None,\n\
}\n\
DATABASES["snappea"]["NAME"] = "/tmp/snappea.sqlite3"\n\
\n\
# Site settings\n\
_PORT = os.getenv("PORT", "8000")\n\
BUGSINK = {\n\
"BASE_URL": os.getenv("BASE_URL", f"http://localhost:{_PORT}"),\n\
"SITE_TITLE": os.getenv("SITE_TITLE", "Flyer Crawler Error Tracking"),\n\
"SINGLE_USER": os.getenv("SINGLE_USER", "True").lower() in ("true", "1", "yes"),\n\
"SINGLE_TEAM": os.getenv("SINGLE_TEAM", "True").lower() in ("true", "1", "yes"),\n\
"PHONEHOME": False,\n\
}\n\
\n\
ALLOWED_HOSTS = deduce_allowed_hosts(BUGSINK["BASE_URL"])\n\
\n\
# Console email backend for dev\n\
EMAIL_BACKEND = "bugsink.email_backends.QuietConsoleEmailBackend"\n\
' > /opt/bugsink/conf/bugsink_conf.py
# Create Bugsink startup script
# Uses DATABASE_URL environment variable (standard Docker approach per docs)
RUN echo '#!/bin/bash\n\
set -e\n\
\n\
# Build DATABASE_URL from individual env vars for flexibility\n\
export DATABASE_URL="postgresql://${BUGSINK_DB_USER:-bugsink}:${BUGSINK_DB_PASSWORD:-bugsink_dev_password}@${BUGSINK_DB_HOST:-postgres}:${BUGSINK_DB_PORT:-5432}/${BUGSINK_DB_NAME:-bugsink}"\n\
# SECRET_KEY is required by Bugsink/Django\n\
export SECRET_KEY="${BUGSINK_SECRET_KEY:-dev-bugsink-secret-key-minimum-50-characters-for-security}"\n\
\n\
# Create superuser if not exists (for dev convenience)\n\
if [ -n "$BUGSINK_ADMIN_EMAIL" ] && [ -n "$BUGSINK_ADMIN_PASSWORD" ]; then\n\
export CREATE_SUPERUSER="${BUGSINK_ADMIN_EMAIL}:${BUGSINK_ADMIN_PASSWORD}"\n\
fi\n\
\n\
# Wait for PostgreSQL to be ready\n\
until pg_isready -h ${BUGSINK_DB_HOST:-postgres} -p ${BUGSINK_DB_PORT:-5432} -U ${BUGSINK_DB_USER:-bugsink}; do\n\
echo "Waiting for PostgreSQL..."\n\
sleep 2\n\
done\n\
\n\
echo "PostgreSQL is ready. Starting Bugsink..."\n\
echo "DATABASE_URL: postgresql://${BUGSINK_DB_USER}:***@${BUGSINK_DB_HOST}:${BUGSINK_DB_PORT}/${BUGSINK_DB_NAME}"\n\
\n\
# Change to config directory so bugsink_conf.py can be found\n\
cd /opt/bugsink/conf\n\
\n\
# Run migrations\n\
echo "Running database migrations..."\n\
/opt/bugsink/bin/bugsink-manage migrate --noinput\n\
\n\
# Create superuser if CREATE_SUPERUSER is set (format: email:password)\n\
if [ -n "$CREATE_SUPERUSER" ]; then\n\
IFS=":" read -r ADMIN_EMAIL ADMIN_PASS <<< "$CREATE_SUPERUSER"\n\
/opt/bugsink/bin/bugsink-manage shell -c "\n\
from django.contrib.auth import get_user_model\n\
User = get_user_model()\n\
if not User.objects.filter(email='"'"'$ADMIN_EMAIL'"'"').exists():\n\
User.objects.create_superuser('"'"'$ADMIN_EMAIL'"'"', '"'"'$ADMIN_PASS'"'"')\n\
print('"'"'Superuser created'"'"')\n\
else:\n\
print('"'"'Superuser already exists'"'"')\n\
" || true\n\
fi\n\
\n\
# Start Bugsink with Gunicorn\n\
echo "Starting Gunicorn on port ${BUGSINK_PORT:-8000}..."\n\
exec /opt/bugsink/bin/gunicorn \\\n\
--bind 0.0.0.0:${BUGSINK_PORT:-8000} \\\n\
--workers ${BUGSINK_WORKERS:-2} \\\n\
--access-logfile - \\\n\
--error-logfile - \\\n\
bugsink.wsgi:application\n\
' > /usr/local/bin/start-bugsink.sh \
&& chmod +x /usr/local/bin/start-bugsink.sh
# ============================================================================
# Create Logstash Pipeline Configuration
# ============================================================================
# ADR-015: Pino and Redis logs → Bugsink
RUN mkdir -p /etc/logstash/conf.d /app/logs
RUN echo 'input {\n\
# Pino application logs\n\
file {\n\
path => "/app/logs/*.log"\n\
codec => json\n\
type => "pino"\n\
tags => ["app"]\n\
start_position => "beginning"\n\
sincedb_path => "/var/lib/logstash/sincedb_pino"\n\
}\n\
\n\
# Redis logs\n\
file {\n\
path => "/var/log/redis/*.log"\n\
type => "redis"\n\
tags => ["redis"]\n\
start_position => "beginning"\n\
sincedb_path => "/var/lib/logstash/sincedb_redis"\n\
}\n\
}\n\
\n\
filter {\n\
# Pino error detection (level 50 = error, 60 = fatal)\n\
if [type] == "pino" and [level] >= 50 {\n\
mutate { add_tag => ["error"] }\n\
}\n\
\n\
# Redis error detection\n\
if [type] == "redis" {\n\
grok {\n\
match => { "message" => "%%{POSINT:pid}:%%{WORD:role} %%{MONTHDAY} %%{MONTH} %%{TIME} %%{WORD:loglevel} %%{GREEDYDATA:redis_message}" }\n\
}\n\
if [loglevel] in ["WARNING", "ERROR"] {\n\
mutate { add_tag => ["error"] }\n\
}\n\
}\n\
}\n\
\n\
output {\n\
if "error" in [tags] {\n\
http {\n\
url => "http://localhost:8000/api/store/"\n\
http_method => "post"\n\
format => "json"\n\
}\n\
}\n\
\n\
# Debug output (comment out in production)\n\
stdout { codec => rubydebug }\n\
}\n\
' > /etc/logstash/conf.d/bugsink.conf
# Create Logstash sincedb directory
RUN mkdir -p /var/lib/logstash && chown -R logstash:logstash /var/lib/logstash
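For reference, the producing side of this pipeline looks roughly like the following sketch, assuming the app writes Pino JSON lines under `/app/logs/` as the `file` input above expects; the logger options shown are illustrative, not the repo's actual logging setup.

```typescript
// Illustrative only: a Pino logger whose output lands where the Logstash
// `file` input above is watching. Pino writes one JSON object per line,
// with a numeric `level` field (30 = info, 50 = error, 60 = fatal) --
// exactly what the `[level] >= 50` filter keys on.
import pino from "pino";

const logger = pino(
  { base: { service: "flyer-crawler" } }, // extra fields flow through to Logstash unchanged
  pino.destination("/app/logs/app.log"),
);

logger.info("flyer upload started"); // level 30: read by Logstash but not tagged "error"
logger.error({ flyerId: 42 }, "flyer upload failed"); // level 50: tagged "error", forwarded to Bugsink
```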
# ============================================================================
# Set Working Directory
# ============================================================================
@@ -52,6 +257,25 @@ ENV NODE_ENV=development
# Increase Node.js memory limit for large builds
ENV NODE_OPTIONS='--max-old-space-size=8192'
# Bugsink defaults (ADR-015)
ENV BUGSINK_DB_HOST=postgres
ENV BUGSINK_DB_PORT=5432
ENV BUGSINK_DB_NAME=bugsink
ENV BUGSINK_DB_USER=bugsink
ENV BUGSINK_DB_PASSWORD=bugsink_dev_password
ENV BUGSINK_PORT=8000
ENV BUGSINK_BASE_URL=http://localhost:8000
ENV BUGSINK_ADMIN_EMAIL=admin@localhost
ENV BUGSINK_ADMIN_PASSWORD=admin
# ============================================================================
# Expose Ports
# ============================================================================
# 3000 - Vite frontend
# 3001 - Express backend
# 8000 - Bugsink error tracking
EXPOSE 3000 3001 8000
# ============================================================================
# Default Command
# ============================================================================

INSTALL.md Normal file

@@ -0,0 +1,168 @@
# Installation Guide
This guide covers setting up a local development environment for Flyer Crawler.
## Prerequisites
- Node.js 20.x or later
- Access to a PostgreSQL database (local or remote)
- Redis instance (for session management)
- Google Gemini API key
- Google Maps API key (for geocoding)
## Quick Start
If you already have PostgreSQL and Redis configured:
```bash
# Install dependencies
npm install
# Run in development mode
npm run dev
```
---
## Development Environment with Podman (Recommended for Windows)
This approach uses Podman with an Ubuntu container for a consistent development environment.
### Step 1: Install Prerequisites on Windows
1. **Install WSL 2**: Podman on Windows relies on the Windows Subsystem for Linux.
```powershell
wsl --install
```
Run this in an administrator PowerShell.
2. **Install Podman Desktop**: Download and install [Podman Desktop for Windows](https://podman-desktop.io/).
### Step 2: Set Up Podman
1. **Initialize Podman**: Launch Podman Desktop. It will automatically set up its WSL 2 machine.
2. **Start Podman**: Ensure the Podman machine is running from the Podman Desktop interface.
### Step 3: Set Up the Ubuntu Container
1. **Pull Ubuntu Image**:
```bash
podman pull ubuntu:latest
```
2. **Create a Podman Volume** (persists node_modules between container restarts):
```bash
podman volume create node_modules_cache
```
3. **Run the Ubuntu Container**:
Open a terminal in your project's root directory and run:
```bash
podman run -it -p 3001:3001 -p 5173:5173 --name flyer-dev \
-v "$(pwd):/app" \
-v "node_modules_cache:/app/node_modules" \
ubuntu:latest
```
| Flag | Purpose |
| ------------------------------------------- | ------------------------------------------------ |
| `-p 3001:3001` | Forwards the backend server port |
| `-p 5173:5173` | Forwards the Vite frontend server port |
| `--name flyer-dev` | Names the container for easy reference |
| `-v "...:/app"` | Mounts your project directory into the container |
| `-v "node_modules_cache:/app/node_modules"` | Mounts the named volume for node_modules |
### Step 4: Configure the Ubuntu Environment
You are now inside the Ubuntu container's shell.
1. **Update Package Lists**:
```bash
apt-get update
```
2. **Install Dependencies**:
```bash
apt-get install -y curl git
curl -sL https://deb.nodesource.com/setup_20.x | bash -
apt-get install -y nodejs
```
3. **Navigate to Project Directory**:
```bash
cd /app
```
4. **Install Project Dependencies**:
```bash
npm install
```
### Step 5: Run the Development Server
```bash
npm run dev
```
### Step 6: Access the Application
- **Frontend**: http://localhost:5173
- **Backend API**: http://localhost:3001
### Managing the Container
| Action | Command |
| --------------------- | -------------------------------- |
| Stop the container | Press `Ctrl+C`, then type `exit` |
| Restart the container | `podman start -a -i flyer-dev` |
| Remove the container | `podman rm flyer-dev` |
---
## Environment Variables
This project is configured to run in a CI/CD environment and does not use `.env` files. All configuration must be provided as environment variables.
For local development, you can export these in your shell or use your IDE's environment configuration:
| Variable | Description |
| --------------------------- | ------------------------------------- |
| `DB_HOST` | PostgreSQL server hostname |
| `DB_USER` | PostgreSQL username |
| `DB_PASSWORD` | PostgreSQL password |
| `DB_DATABASE_PROD` | Production database name |
| `JWT_SECRET` | Secret string for signing auth tokens |
| `VITE_GOOGLE_GENAI_API_KEY` | Google Gemini API key |
| `GOOGLE_MAPS_API_KEY` | Google Maps Geocoding API key |
| `REDIS_PASSWORD_PROD` | Production Redis password |
| `REDIS_PASSWORD_TEST` | Test Redis password |
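Because there is no `.env` fallback, a missing variable only surfaces at the point of use. As a sanity check, something like the following fail-fast guard can run at startup; this is a sketch, not code from this repo.

```typescript
// Hypothetical startup guard (not part of the repo): fail fast when any
// required variable from the table above is missing.
const REQUIRED_ENV_VARS = [
  "DB_HOST",
  "DB_USER",
  "DB_PASSWORD",
  "DB_DATABASE_PROD",
  "JWT_SECRET",
  "VITE_GOOGLE_GENAI_API_KEY",
  "GOOGLE_MAPS_API_KEY",
] as const;

const missing = REQUIRED_ENV_VARS.filter((name) => !process.env[name]);
if (missing.length > 0) {
  throw new Error(`Missing required environment variables: ${missing.join(", ")}`);
}
```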
---
## Seeding Development Users
To create initial test accounts (`admin@example.com` and `user@example.com`):
```bash
npm run seed
```
After running, you may need to restart your IDE's TypeScript server to pick up any generated types.
---
## Next Steps
- [Database Setup](DATABASE.md) - Set up PostgreSQL with required extensions
- [Authentication Setup](AUTHENTICATION.md) - Configure OAuth providers
- [Deployment Guide](DEPLOYMENT.md) - Deploy to production

README.md

@@ -1,424 +1,91 @@
# Flyer Crawler - Grocery AI Analyzer
Flyer Crawler is a web application that uses the Google Gemini AI to extract, analyze, and manage data from grocery store flyers. Users can upload flyer images or PDFs, and the application will automatically identify items, prices, and sale dates, storing the structured data in a PostgreSQL database for historical analysis, price tracking, and personalized deal alerts.
Flyer Crawler is a web application that uses Google Gemini AI to extract, analyze, and manage data from grocery store flyers. Users can upload flyer images or PDFs, and the application automatically identifies items, prices, and sale dates, storing structured data in a PostgreSQL database for historical analysis, price tracking, and personalized deal alerts.
We are working on an app to help people save money, by finding good deals that are only advertized in store flyers/ads. So, the primary purpose of the site is to make uploading flyers as easy as possible and as accurate as possible, and to store peoples needs, so sales can be matched to needs.
**Our mission**: Help people save money by finding good deals that are only advertised in store flyers. The app makes uploading flyers as easy and accurate as possible, and matches sales to users' needs.
---
## Features
- **AI-Powered Data Extraction**: Upload PNG, JPG, or PDF flyers to automatically extract store names, sale dates, and a detailed list of items with prices and quantities.
- **Bulk Import**: Process multiple flyers at once with a summary report of successes, skips (duplicates), and errors.
- **Database Integration**: All extracted data is saved to a PostgreSQL database, enabling long-term persistence and analysis.
- **Personalized Watchlist**: Authenticated users can create a "watchlist" of specific grocery items they want to track.
- **Active Deal Alerts**: The app highlights current sales on your watched items from all valid flyers in the database.
- **Price History Charts**: Visualize the price trends of your watched items over time.
- **Shopping List Management**: Users can create multiple shopping lists, add items from flyers or their watchlist, and track purchased items.
- **User Authentication & Management**: Secure user sign-up, login, and profile management, including a secure account deletion process.
- **Dynamic UI**: A responsive interface with dark mode and a choice between metric/imperial unit systems.
- **AI-Powered Data Extraction**: Upload PNG, JPG, or PDF flyers to automatically extract store names, sale dates, and detailed item lists with prices and quantities
- **Bulk Import**: Process multiple flyers at once with summary reports of successes, skips (duplicates), and errors
- **Personalized Watchlist**: Create a watchlist of specific grocery items you want to track
- **Active Deal Alerts**: See current sales on your watched items from all valid flyers
- **Price History Charts**: Visualize price trends of watched items over time
- **Shopping List Management**: Create multiple shopping lists, add items from flyers or your watchlist, and track purchased items
- **User Authentication**: Secure sign-up, login, profile management, and account deletion
- **Dynamic UI**: Responsive interface with dark mode and metric/imperial unit systems
---
## Tech Stack
- **Frontend**: React, TypeScript, Tailwind CSS
- **AI**: Google Gemini API (`@google/genai`)
- **Backend**: Node.js with Express
- **Database**: PostgreSQL
- **Authentication**: Passport.js
- **UI Components**: Recharts for charts
| Layer | Technology |
| -------------- | ----------------------------------- |
| Frontend | React, TypeScript, Tailwind CSS |
| AI | Google Gemini API (`@google/genai`) |
| Backend | Node.js, Express |
| Database | PostgreSQL with PostGIS |
| Authentication | Passport.js (Google, GitHub OAuth) |
| Charts | Recharts |
---
## Required Secrets & Configuration
This project is configured to run in a CI/CD environment and does not use `.env` files. All configuration and secrets must be provided as environment variables. For deployments using the included Gitea workflows, these must be configured as **repository secrets** in your Gitea instance.
- **`DB_HOST`, `DB_USER`, `DB_PASSWORD`**: Credentials for your PostgreSQL server. The port is assumed to be `5432`.
- **`DB_DATABASE_PROD`**: The name of your production database.
- **`REDIS_PASSWORD_PROD`**: The password for your production Redis instance.
- **`REDIS_PASSWORD_TEST`**: The password for your test Redis instance.
- **`JWT_SECRET`**: A long, random, and secret string for signing authentication tokens.
- **`VITE_GOOGLE_GENAI_API_KEY`**: Your Google Gemini API key.
- **`GOOGLE_MAPS_API_KEY`**: Your Google Maps Geocoding API key.
## Setup and Installation
### Step 1: Set Up PostgreSQL Database
1. **Set up a PostgreSQL database instance.**
2. **Run the Database Schema**:
- Connect to your database using a tool like `psql` or DBeaver.
- Open `sql/schema.sql.txt`, copy its entire contents, and execute it against your database.
- This will create all necessary tables, functions, and relationships.
### Step 2: Install Dependencies and Run the Application
1. **Install Dependencies**:
```bash
npm install
```
2. **Run the Application**:
```bash
npm run start:prod
```
### Step 3: Seed Development Users (Optional)
To create the initial `admin@example.com` and `user@example.com` accounts, you can run the seed script:
## Quick Start
```bash
npm run seed
# Install dependencies
npm install
# Run in development mode
npm run dev
```
After running, you may need to restart your IDE's TypeScript server to pick up the changes.
## NGINX mime types issue

Edit `/etc/nginx/mime.types` (e.g. `sudo nano /etc/nginx/mime.types`) and change `application/javascript js;` to `application/javascript js mjs;`, then test and reload NGINX with `sudo nginx -t` followed by `sudo systemctl reload nginx`.

Note: the proper change turned out to be making this edit in the `/etc/nginx/sites-available/flyer-crawler.projectium.com` file instead.
## for OAuth
1. Get Google OAuth Credentials
This is a crucial step that you must do outside the codebase:
Go to the Google Cloud Console.
Create a new project (or select an existing one).
In the navigation menu, go to APIs & Services > Credentials.
Click Create Credentials > OAuth client ID.
Select Web application as the application type.
Under Authorized redirect URIs, click ADD URI and enter the URL where Google will redirect users back to your server. For local development, this will be: http://localhost:3001/api/auth/google/callback.
Click Create. You will be given a Client ID and a Client Secret.
2. Get GitHub OAuth Credentials
You'll need to obtain a Client ID and Client Secret from GitHub:
Go to your GitHub profile settings.
Navigate to Developer settings > OAuth Apps.
Click New OAuth App.
Fill in the required fields:
Application name: A descriptive name for your app (e.g., "Flyer Crawler").
Homepage URL: The base URL of your application (e.g., http://localhost:5173 for local development).
Authorization callback URL: This is where GitHub will redirect users after they authorize your app. For local development, this will be: <http://localhost:3001/api/auth/github/callback>.
Click Register application.
You will be given a Client ID and a Client Secret.
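Once you have the credential pairs, wiring them into Passport looks roughly like the following. This is an illustrative sketch only: the env var names (`GOOGLE_CLIENT_ID`, `GOOGLE_CLIENT_SECRET`) and the user-lookup logic are assumptions, not code from this repo.

```typescript
// Illustrative Passport wiring; env var names and user lookup are assumed.
// The callbackURL must match the redirect URI registered in the Google
// Cloud Console (step 1 above).
import passport from "passport";
import { Strategy as GoogleStrategy } from "passport-google-oauth20";

passport.use(
  new GoogleStrategy(
    {
      clientID: process.env.GOOGLE_CLIENT_ID!,
      clientSecret: process.env.GOOGLE_CLIENT_SECRET!,
      callbackURL: "http://localhost:3001/api/auth/google/callback",
    },
    (_accessToken, _refreshToken, profile, done) => {
      // Look up or create a local user for this Google profile, then hand it to Passport.
      done(null, { id: profile.id, email: profile.emails?.[0]?.value });
    },
  ),
);
```

The GitHub strategy from step 2 is wired analogously, commonly via the `passport-github2` package.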
## connect to postgres on projectium.com
psql -h localhost -U flyer_crawler_user -d "flyer-crawler-prod" -W
## postgis
flyer-crawler-prod=> SELECT version();
version
See [INSTALL.md](INSTALL.md) for detailed setup instructions.
---
PostgreSQL 14.19 (Ubuntu 14.19-0ubuntu0.22.04.1) on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 11.4.0-1ubuntu1~22.04.2) 11.4.0, 64-bit
(1 row)
## Documentation
flyer-crawler-prod=> SELECT PostGIS_Full_Version();
postgis_full_version
| Document | Description |
| -------------------------------------- | ---------------------------------------- |
| [INSTALL.md](INSTALL.md) | Local development setup with Podman |
| [DATABASE.md](DATABASE.md) | PostgreSQL setup, schema, and extensions |
| [AUTHENTICATION.md](AUTHENTICATION.md) | OAuth configuration (Google, GitHub) |
| [DEPLOYMENT.md](DEPLOYMENT.md) | Production server setup, NGINX, PM2 |
---
POSTGIS="3.2.0 c3e3cc0" [EXTENSION] PGSQL="140" GEOS="3.10.2-CAPI-1.16.0" PROJ="8.2.1" LIBXML="2.9.12" LIBJSON="0.15" LIBPROTOBUF="1.3.3" WAGYU="0.5.0 (Internal)"
(1 row)
## Environment Variables
## production postgres setup
This project uses environment variables for configuration (no `.env` files). Key variables:
Part 1: Production Database Setup
This database will be the live, persistent storage for your application.
| Variable | Description |
| ----------------------------------- | -------------------------------- |
| `DB_HOST`, `DB_USER`, `DB_PASSWORD` | PostgreSQL credentials |
| `DB_DATABASE_PROD` | Production database name |
| `JWT_SECRET` | Authentication token signing key |
| `VITE_GOOGLE_GENAI_API_KEY` | Google Gemini API key |
| `GOOGLE_MAPS_API_KEY` | Google Maps Geocoding API key |
| `REDIS_PASSWORD_PROD` | Redis password |
Step 1: Install PostgreSQL (if not already installed)
First, ensure PostgreSQL is installed on your server.
See [INSTALL.md](INSTALL.md) for the complete list.
bash
sudo apt update
sudo apt install postgresql postgresql-contrib
Step 2: Create the Production Database and User
It's best practice to create a dedicated, non-superuser role for your application to connect with.
---
Switch to the postgres system user to get superuser access to the database.
## Scripts
bash
sudo -u postgres psql
Inside the psql shell, run the following SQL commands. Remember to replace 'a_very_strong_password' with a secure password that you will manage with a secrets tool or in your .env file.
| Command | Description |
| -------------------- | -------------------------------- |
| `npm run dev` | Start development server |
| `npm run build` | Build for production |
| `npm run start:prod` | Start production server with PM2 |
| `npm run test` | Run test suite |
| `npm run seed` | Seed development user accounts |
sql
-- Create a new role (user) for your application
CREATE ROLE flyer_crawler_user WITH LOGIN PASSWORD 'a_very_strong_password';
---
-- Create the production database and assign ownership to the new user
CREATE DATABASE "flyer-crawler-prod" WITH OWNER = flyer_crawler_user;
## License
-- Connect to the new database to install extensions within it.
\c "flyer-crawler-prod"
-- Install the required extensions as a superuser. This only needs to be done once.
CREATE EXTENSION IF NOT EXISTS postgis;
CREATE EXTENSION IF NOT EXISTS pg_trgm;
CREATE EXTENSION IF NOT EXISTS "uuid-ossp";
-- Exit the psql shell
Step 3: Apply the Master Schema
Now, you'll populate your new database with all the tables, functions, and initial data. Your master_schema_rollup.sql file is perfect for this.
Navigate to your project's root directory on the server.
Run the following command to execute the master schema script against your new production database. You will be prompted for the password you created in the previous step.
bash
psql -U flyer_crawler_user -d "flyer-crawler-prod" -f sql/master_schema_rollup.sql
This single command creates all tables, extensions (pg_trgm, postgis), functions, and triggers, and seeds essential data like categories and master items.
Step 4: Seed the Admin Account (If Needed)
Your application has a separate script to create the initial admin user. To run it, you must first set the required environment variables in your shell session.
bash
# Set variables for the current session
export DB_USER=flyer_crawler_user DB_PASSWORD=your_password DB_NAME="flyer-crawler-prod" ...
# Run the seeding script
npx tsx src/db/seed_admin_account.ts
Your production database is now ready!
Part 2: Test Database Setup (for CI/CD)
Your Gitea workflow (deploy.yml) already automates the creation and teardown of the test database during the pipeline run. The steps below are for understanding what the workflow does and for manual setup if you ever need to run tests outside the CI pipeline.
The process your CI pipeline follows is:
Setup (sql/test_setup.sql):
As the postgres superuser, it runs sql/test_setup.sql.
This creates a temporary role named test_runner.
It creates a separate database named "flyer-crawler-test" owned by test_runner.
Schema Application (src/tests/setup/global-setup.ts):
The test runner (vitest) executes the global-setup.ts file.
This script connects to the "flyer-crawler-test" database using the temporary credentials.
It then runs the same sql/master_schema_rollup.sql file, ensuring your test database has the exact same structure as production.
Test Execution:
Your tests run against this clean, isolated "flyer-crawler-test" database.
Teardown (sql/test_teardown.sql):
After tests complete (whether they pass or fail), the if: always() step in your workflow ensures that sql/test_teardown.sql is executed.
This script terminates any lingering connections to the test database, drops the "flyer-crawler-test" database completely, and drops the test_runner role.
Part 3: Test Database Setup (for CI/CD and Local Testing)
Your Gitea workflow and local test runner rely on a permanent test database. This database needs to be created once on your server. The test runner will automatically reset the schema inside it before every test run.
Step 1: Create the Test Database
On your server, switch to the postgres system user to get superuser access.
bash
sudo -u postgres psql
Inside the psql shell, create a new database. We will assign ownership to the same flyer_crawler_user that your application uses. This user needs to be the owner to have permission to drop and recreate the schema during testing.
sql
-- Create the test database and assign ownership to your existing application user
CREATE DATABASE "flyer-crawler-test" WITH OWNER = flyer_crawler_user;
-- Connect to the newly created test database
\c "flyer-crawler-test"
-- Install the required extensions as a superuser. This only needs to be done once.
CREATE EXTENSION IF NOT EXISTS postgis;
CREATE EXTENSION IF NOT EXISTS pg_trgm;
CREATE EXTENSION IF NOT EXISTS "uuid-ossp";
-- Grant ownership of the public schema within this database to your application user.
-- This is CRITICAL for allowing the test runner to drop and recreate the schema.
ALTER SCHEMA public OWNER TO flyer_crawler_user;
-- Exit the psql shell
\q
Step 2: Configure Gitea Secrets for Testing
Your CI pipeline needs to know how to connect to this test database. Ensure the following secrets are set in your Gitea repository settings:
DB_HOST: The hostname of your database server (e.g., localhost).
DB_PORT: The port for your database (e.g., 5432).
DB_USER: The user for the database (e.g., flyer_crawler_user).
DB_PASSWORD: The password for the database user.
The workflow file (.gitea/workflows/deploy.yml) is configured to use these secrets and will automatically connect to the "flyer-crawler-test" database when it runs the npm test command.
How the Test Workflow Works
The CI pipeline no longer uses sudo or creates/destroys the database on each run. Instead, the process is now:
Setup: The vitest global setup script (src/tests/setup/global-setup.ts) connects to the permanent "flyer-crawler-test" database.
Schema Reset: It executes sql/drop_tables.sql (which runs DROP SCHEMA public CASCADE) to completely wipe all tables, functions, and triggers.
Schema Application: It then immediately executes sql/master_schema_rollup.sql to build a fresh, clean schema and seed initial data.
Test Execution: Your tests run against this clean, isolated schema.
This approach is faster, more reliable, and removes the need for sudo access within the CI pipeline.
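A hedged sketch of that reset-then-rebuild flow, using node-postgres; the actual src/tests/setup/global-setup.ts may differ in detail:

```typescript
// Sketch of the test-database reset described above (assumptions: node-postgres
// is available and both SQL files can be executed as single scripts).
import { readFileSync } from "node:fs";
import { Client } from "pg";

export default async function globalSetup(): Promise<void> {
  const client = new Client({
    host: process.env.DB_HOST,
    port: Number(process.env.DB_PORT ?? 5432),
    user: process.env.DB_USER,
    password: process.env.DB_PASSWORD,
    database: "flyer-crawler-test",
  });
  await client.connect();
  try {
    // Wipe everything: drop_tables.sql runs DROP SCHEMA public CASCADE.
    await client.query(readFileSync("sql/drop_tables.sql", "utf8"));
    // Rebuild a fresh schema and seed initial data from the master rollup.
    await client.query(readFileSync("sql/master_schema_rollup.sql", "utf8"));
  } finally {
    await client.end();
  }
}
```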
gitea-runner@projectium:~$ pm2 install pm2-logrotate
[PM2][Module] Installing NPM pm2-logrotate module
[PM2][Module] Calling [NPM] to install pm2-logrotate ...
added 161 packages in 5s
21 packages are looking for funding
run `npm fund` for details
npm notice
npm notice New patch version of npm available! 11.6.3 -> 11.6.4
npm notice Changelog: https://github.com/npm/cli/releases/tag/v11.6.4
npm notice To update run: npm install -g npm@11.6.4
npm notice
[PM2][Module] Module downloaded
[PM2][WARN] Applications pm2-logrotate not running, starting...
[PM2] App [pm2-logrotate] launched (1 instances)
Module: pm2-logrotate
$ pm2 set pm2-logrotate:max_size 10M
$ pm2 set pm2-logrotate:retain 30
$ pm2 set pm2-logrotate:compress false
$ pm2 set pm2-logrotate:dateFormat YYYY-MM-DD_HH-mm-ss
$ pm2 set pm2-logrotate:workerInterval 30
$ pm2 set pm2-logrotate:rotateInterval 0 0 * * *
$ pm2 set pm2-logrotate:rotateModule true
Modules configuration. Copy/Paste line to edit values.
[PM2][Module] Module successfully installed and launched
[PM2][Module] Checkout module options: `$ pm2 conf`
┌────┬───────────────────────────────────┬─────────────┬─────────┬─────────┬──────────┬────────┬──────┬───────────┬──────────┬──────────┬──────────┬──────────┐
│ id │ name │ namespace │ version │ mode │ pid │ uptime │ ↺ │ status │ cpu │ mem │ user │ watching │
├────┼───────────────────────────────────┼─────────────┼─────────┼─────────┼──────────┼────────┼──────┼───────────┼──────────┼──────────┼──────────┼──────────┤
│ 2 │ flyer-crawler-analytics-worker │ default │ 0.0.0 │ fork │ 3846981 │ 7m │ 5 │ online │ 0% │ 55.8mb │ git… │ disabled │
│ 11 │ flyer-crawler-api │ default │ 0.0.0 │ fork │ 3846987 │ 7m │ 0 │ online │ 0% │ 59.0mb │ git… │ disabled │
│ 12 │ flyer-crawler-worker │ default │ 0.0.0 │ fork │ 3846988 │ 7m │ 0 │ online │ 0% │ 54.2mb │ git… │ disabled │
└────┴───────────────────────────────────┴─────────────┴─────────┴─────────┴──────────┴────────┴──────┴───────────┴──────────┴──────────┴──────────┴──────────┘
Module
┌────┬──────────────────────────────┬───────────────┬──────────┬──────────┬──────┬──────────┬──────────┬──────────┐
│ id │ module │ version │ pid │ status │ ↺ │ cpu │ mem │ user │
├────┼──────────────────────────────┼───────────────┼──────────┼──────────┼──────┼──────────┼──────────┼──────────┤
│ 13 │ pm2-logrotate │ 3.0.0 │ 3848878 │ online │ 0 │ 0% │ 20.1mb │ git… │
└────┴──────────────────────────────┴───────────────┴──────────┴──────────┴──────┴──────────┴──────────┴──────────┘
gitea-runner@projectium:~$ pm2 set pm2-logrotate:max_size 10M
[PM2] Module pm2-logrotate restarted
[PM2] Setting changed
Module: pm2-logrotate
$ pm2 set pm2-logrotate:max_size 10M
$ pm2 set pm2-logrotate:retain 30
$ pm2 set pm2-logrotate:compress false
$ pm2 set pm2-logrotate:dateFormat YYYY-MM-DD_HH-mm-ss
$ pm2 set pm2-logrotate:workerInterval 30
$ pm2 set pm2-logrotate:rotateInterval 0 0 * * *
$ pm2 set pm2-logrotate:rotateModule true
gitea-runner@projectium:~$ pm2 set pm2-logrotate:retain 14
[PM2] Module pm2-logrotate restarted
[PM2] Setting changed
Module: pm2-logrotate
$ pm2 set pm2-logrotate:max_size 10M
$ pm2 set pm2-logrotate:retain 14
$ pm2 set pm2-logrotate:compress false
$ pm2 set pm2-logrotate:dateFormat YYYY-MM-DD_HH-mm-ss
$ pm2 set pm2-logrotate:workerInterval 30
$ pm2 set pm2-logrotate:rotateInterval 0 0 * * *
$ pm2 set pm2-logrotate:rotateModule true
gitea-runner@projectium:~$
## dev server setup:
Here are the steps to set up the development environment on Windows using Podman with an Ubuntu container:
1. Install Prerequisites on Windows
Install WSL 2: Podman on Windows relies on the Windows Subsystem for Linux. Install it by running wsl --install in an administrator PowerShell.
Install Podman Desktop: Download and install Podman Desktop for Windows.
2. Set Up Podman
Initialize Podman: Launch Podman Desktop. It will automatically set up its WSL 2 machine.
Start Podman: Ensure the Podman machine is running from the Podman Desktop interface.
3. Set Up the Ubuntu Container
- Pull Ubuntu Image: Open a PowerShell or command prompt and pull the latest Ubuntu image:
podman pull ubuntu:latest
- Create a Podman Volume: Create a volume to persist node_modules and avoid installing them every time the container starts.
podman volume create node_modules_cache
- Run the Ubuntu Container: Start a new container with the project directory mounted and the necessary ports forwarded.
- Open a terminal in your project's root directory on Windows.
- Run the following command, replacing D:\gitea\flyer-crawler.projectium.com\flyer-crawler.projectium.com with the full path to your project:
podman run -it -p 3001:3001 -p 5173:5173 --name flyer-dev -v "D:\gitea\flyer-crawler.projectium.com\flyer-crawler.projectium.com:/app" -v "node_modules_cache:/app/node_modules" ubuntu:latest
-p 3001:3001: Forwards the backend server port.
-p 5173:5173: Forwards the Vite frontend server port.
--name flyer-dev: Names the container for easy reference.
-v "...:/app": Mounts your project directory into the container at /app.
-v "node_modules_cache:/app/node_modules": Mounts the named volume for node_modules.
4. Configure the Ubuntu Environment
You are now inside the Ubuntu container's shell.
- Update Package Lists:
apt-get update
- Install Dependencies: Install curl, git, and nodejs (which includes npm).
apt-get install -y curl git
curl -sL https://deb.nodesource.com/setup_20.x | bash -
apt-get install -y nodejs
- Navigate to Project Directory:
cd /app
- Install Project Dependencies:
npm install
5. Run the Development Server
- Start the Application:
npm run dev
6. Accessing the Application
- Frontend: Open your browser and go to http://localhost:5173.
- Backend: The frontend will make API calls to http://localhost:3001.
Managing the Environment
- Stopping the Container: Press Ctrl+C in the container terminal, then type exit.
- Restarting the Container:
podman start -a -i flyer-dev
## for me:
cd /mnt/d/gitea/flyer-crawler.projectium.com/flyer-crawler.projectium.com
podman run -it -p 3001:3001 -p 5173:5173 --name flyer-dev -v "$(pwd):/app" -v "node_modules_cache:/app/node_modules" ubuntu:latest
## rate limiting

The app respects the AI service's rate limits, which makes it more stable and robust. You can adjust the `GEMINI_RPM` environment variable in your production environment as needed without changing the code.
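As an illustration of what a `GEMINI_RPM`-driven gate can look like, here is a minimal sketch; this is not the repo's actual limiter, and the default of 15 requests per minute is an assumption:

```typescript
// Minimal requests-per-minute gate driven by GEMINI_RPM.
// Sketch only: the default of 15 RPM is assumed, not taken from the repo.
const rpm = Number(process.env.GEMINI_RPM ?? 15);
const minIntervalMs = 60_000 / rpm;
let nextSlot = 0;

export async function withGeminiRateLimit<T>(call: () => Promise<T>): Promise<T> {
  const now = Date.now();
  const wait = Math.max(0, nextSlot - now); // how long until our reserved slot
  nextSlot = Math.max(now, nextSlot) + minIntervalMs; // reserve the next slot
  if (wait > 0) await new Promise((resolve) => setTimeout(resolve, wait));
  return call();
}
```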
[Add license information here]

README.testing.md Normal file

@@ -0,0 +1,3 @@
Using PowerShell on Windows 10, use this command to run only the integration tests in the container:

podman exec -i flyer-crawler-dev npm run test:integration 2>&1 | Tee-Object -FilePath test-output.txt


@@ -5,7 +5,7 @@
# This file defines the local development environment using Docker/Podman.
#
# Services:
# - app: Node.js application (API + Frontend)
# - app: Node.js application (API + Frontend + Bugsink + Logstash)
# - postgres: PostgreSQL 15 with PostGIS extension
# - redis: Redis for caching and job queues
#
@@ -18,6 +18,10 @@
# VS Code Dev Containers:
# This file is referenced by .devcontainer/devcontainer.json for seamless
# VS Code integration. Open the project in VS Code and use "Reopen in Container".
#
# Bugsink (ADR-015):
# Access error tracking UI at http://localhost:8000
# Default login: admin@localhost / admin
# ============================================================================
version: '3.8'
@@ -43,6 +47,7 @@ services:
ports:
- '3000:3000' # Frontend (Vite default)
- '3001:3001' # Backend API
- '8000:8000' # Bugsink error tracking (ADR-015)
environment:
# Core settings
- NODE_ENV=development
@@ -62,6 +67,26 @@ services:
- JWT_SECRET=dev-jwt-secret-change-in-production
# Worker settings
- WORKER_LOCK_DURATION=120000
# Bugsink error tracking (ADR-015)
- BUGSINK_DB_HOST=postgres
- BUGSINK_DB_PORT=5432
- BUGSINK_DB_NAME=bugsink
- BUGSINK_DB_USER=bugsink
- BUGSINK_DB_PASSWORD=bugsink_dev_password
- BUGSINK_PORT=8000
- BUGSINK_BASE_URL=http://localhost:8000
- BUGSINK_ADMIN_EMAIL=admin@localhost
- BUGSINK_ADMIN_PASSWORD=admin
- BUGSINK_SECRET_KEY=dev-bugsink-secret-key-minimum-50-characters-for-security
# Sentry SDK configuration (points to local Bugsink)
- SENTRY_DSN=http://59a58583-e869-7697-f94a-cfa0337676a8@localhost:8000/1
- VITE_SENTRY_DSN=http://d5fc5221-4266-ff2f-9af8-5689696072f3@localhost:8000/2
- SENTRY_ENVIRONMENT=development
- VITE_SENTRY_ENVIRONMENT=development
- SENTRY_ENABLED=true
- VITE_SENTRY_ENABLED=true
- SENTRY_DEBUG=true
- VITE_SENTRY_DEBUG=true
depends_on:
postgres:
condition: service_healthy
@@ -93,9 +118,10 @@ services:
POSTGRES_INITDB_ARGS: '--encoding=UTF8 --locale=C'
volumes:
- postgres_data:/var/lib/postgresql/data
# Mount the extensions init script to run on first database creation
# The 00- prefix ensures it runs before any other init scripts
# Mount init scripts to run on first database creation
# Scripts run in alphabetical order: 00-extensions, 01-bugsink
- ./sql/00-init-extensions.sql:/docker-entrypoint-initdb.d/00-init-extensions.sql:ro
- ./sql/01-init-bugsink.sh:/docker-entrypoint-initdb.d/01-init-bugsink.sh:ro
# Healthcheck ensures postgres is ready before app starts
healthcheck:
test: ['CMD-SHELL', 'pg_isready -U postgres -d flyer_crawler_dev']
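On the backend, the Sentry variables above are typically consumed at startup along these lines; this is a sketch assuming `@sentry/node`, not the repo's actual bootstrap code:

```typescript
// Sketch: initialize the Sentry SDK against the local Bugsink instance
// using the compose-file variables above.
import * as Sentry from "@sentry/node";

Sentry.init({
  dsn: process.env.SENTRY_DSN, // points at Bugsink on localhost:8000
  environment: process.env.SENTRY_ENVIRONMENT, // "development" in this compose file
  debug: process.env.SENTRY_DEBUG === "true",
  enabled: process.env.SENTRY_ENABLED === "true",
});
```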

docs/BARE-METAL-SETUP.md Normal file

File diff suppressed because it is too large.


@@ -3,7 +3,7 @@
**Date**: 2025-12-12
**Implementation Date**: 2026-01-08
**Status**: Accepted and Implemented (Phases 1-5 complete, user + admin features migrated)
**Status**: Accepted and Fully Implemented (Phases 1-8 complete, 100% coverage)
## Context
@@ -23,18 +23,21 @@ We will adopt a dedicated library for managing server state, such as **TanStack
### Phase 1: Infrastructure & Core Queries (✅ Complete - 2026-01-08)
**Files Created:**
- [src/config/queryClient.ts](../../src/config/queryClient.ts) - Global QueryClient configuration
- [src/hooks/queries/useFlyersQuery.ts](../../src/hooks/queries/useFlyersQuery.ts) - Flyers data query
- [src/hooks/queries/useWatchedItemsQuery.ts](../../src/hooks/queries/useWatchedItemsQuery.ts) - Watched items query
- [src/hooks/queries/useShoppingListsQuery.ts](../../src/hooks/queries/useShoppingListsQuery.ts) - Shopping lists query
**Files Modified:**
- [src/providers/AppProviders.tsx](../../src/providers/AppProviders.tsx) - Added QueryClientProvider wrapper
- [src/providers/FlyersProvider.tsx](../../src/providers/FlyersProvider.tsx) - Refactored to use TanStack Query
- [src/providers/UserDataProvider.tsx](../../src/providers/UserDataProvider.tsx) - Refactored to use TanStack Query
- [src/services/apiClient.ts](../../src/services/apiClient.ts) - Added pagination params to fetchFlyers
**Benefits Achieved:**
- ✅ Removed ~150 lines of custom state management code
- ✅ Automatic caching of server data
- ✅ Background refetching for stale data (see the sketch below)
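A minimal sketch of the kind of global configuration Phase 1 introduces; the real src/config/queryClient.ts may choose different defaults:

```typescript
// Illustrative QueryClient defaults (values assumed, not taken from the repo).
import { QueryClient } from "@tanstack/react-query";

export const queryClient = new QueryClient({
  defaultOptions: {
    queries: {
      staleTime: 30_000, // serve cached data for 30s, then refetch in the background
      refetchOnWindowFocus: true,
      retry: 1,
    },
  },
});
```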
@@ -45,14 +48,17 @@ We will adopt a dedicated library for managing server state, such as **TanStack
### Phase 2: Remaining Queries (✅ Complete - 2026-01-08)
**Files Created:**
- [src/hooks/queries/useMasterItemsQuery.ts](../../src/hooks/queries/useMasterItemsQuery.ts) - Master grocery items query
- [src/hooks/queries/useFlyerItemsQuery.ts](../../src/hooks/queries/useFlyerItemsQuery.ts) - Flyer items query
**Files Modified:**
- [src/providers/MasterItemsProvider.tsx](../../src/providers/MasterItemsProvider.tsx) - Refactored to use TanStack Query
- [src/hooks/useFlyerItems.ts](../../src/hooks/useFlyerItems.ts) - Refactored to use TanStack Query
**Benefits Achieved:**
- ✅ Removed additional ~50 lines of custom state management code
- ✅ Per-flyer item caching (items cached separately for each flyer)
- ✅ Longer cache times for infrequently changing data (master items)
@@ -82,78 +88,154 @@ We will adopt a dedicated library for managing server state, such as **TanStack
**See**: [plans/adr-0005-phase-3-summary.md](../../plans/adr-0005-phase-3-summary.md) for detailed documentation
### Phase 4: Hook Refactoring (✅ Complete - 2026-01-08)
### Phase 4: Hook Refactoring (✅ Complete)
**Goal:** Refactor user-facing hooks to use TanStack Query mutation hooks.
**Files Modified:**
- [src/hooks/useWatchedItems.tsx](../../src/hooks/useWatchedItems.tsx) - Refactored to use mutation hooks
- [src/hooks/useShoppingLists.tsx](../../src/hooks/useShoppingLists.tsx) - Refactored to use mutation hooks
- [src/contexts/UserDataContext.ts](../../src/contexts/UserDataContext.ts) - Removed deprecated setters
- [src/providers/UserDataProvider.tsx](../../src/providers/UserDataProvider.tsx) - Removed setter stub implementations
- [src/contexts/UserDataContext.ts](../../src/contexts/UserDataContext.ts) - Clean read-only interface (no setters)
- [src/providers/UserDataProvider.tsx](../../src/providers/UserDataProvider.tsx) - Uses query hooks, no setter stubs
**Benefits Achieved:**
- ✅ Removed 52 lines of code from custom hooks (-17%)
- ✅ Eliminated all `useApi` dependencies from user-facing hooks
- ✅ Removed 150+ lines of manual state management
- ✅ Simplified useShoppingLists by 21% (222 → 176 lines)
- ✅ Maintained backward compatibility for hook consumers
- ✅ Cleaner context interface (read-only server state)
- ✅ Both hooks now use TanStack Query mutations
- ✅ Automatic cache invalidation after mutations (see the sketch below)
- ✅ Consistent error handling via mutation hooks
- ✅ Clean context interface (read-only server state)
- ✅ Backward compatible API for hook consumers
**See**: [plans/adr-0005-phase-4-summary.md](../../plans/adr-0005-phase-4-summary.md) for detailed documentation
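The mutation-with-invalidation pattern these hooks converge on looks roughly like this; the hook name, endpoint, and query key are placeholders rather than the repo's actual identifiers:

```typescript
// Sketch of the Phase 4 mutation pattern (names and endpoint assumed).
import { useMutation, useQueryClient } from "@tanstack/react-query";

export function useAddWatchedItem() {
  const queryClient = useQueryClient();
  return useMutation({
    mutationFn: async (itemName: string) => {
      const res = await fetch("/api/watched-items", {
        method: "POST",
        headers: { "Content-Type": "application/json" },
        body: JSON.stringify({ itemName }),
      });
      if (!res.ok) throw new Error(`Request failed: ${res.status}`);
      return res.json();
    },
    // Automatic cache invalidation: refetch watched items after a successful add.
    onSuccess: () => queryClient.invalidateQueries({ queryKey: ["watchedItems"] }),
  });
}
```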
### Phase 5: Admin Features (✅ Complete)
### Phase 5: Admin Features (✅ Complete - 2026-01-08)
**Goal:** Create query hooks for admin features.
**Files Created:**
- [src/hooks/queries/useActivityLogQuery.ts](../../src/hooks/queries/useActivityLogQuery.ts) - Activity log query with pagination
- [src/hooks/queries/useApplicationStatsQuery.ts](../../src/hooks/queries/useApplicationStatsQuery.ts) - Application statistics query
- [src/hooks/queries/useSuggestedCorrectionsQuery.ts](../../src/hooks/queries/useSuggestedCorrectionsQuery.ts) - Corrections query
- [src/hooks/queries/useCategoriesQuery.ts](../../src/hooks/queries/useCategoriesQuery.ts) - Categories query (public endpoint)
- [src/hooks/queries/useActivityLogQuery.ts](../../src/hooks/queries/useActivityLogQuery.ts) - Activity log with pagination
- [src/hooks/queries/useApplicationStatsQuery.ts](../../src/hooks/queries/useApplicationStatsQuery.ts) - Application statistics
- [src/hooks/queries/useSuggestedCorrectionsQuery.ts](../../src/hooks/queries/useSuggestedCorrectionsQuery.ts) - Corrections data
- [src/hooks/queries/useCategoriesQuery.ts](../../src/hooks/queries/useCategoriesQuery.ts) - Categories (public endpoint)
**Files Modified:**
**Components Migrated:**
- [src/pages/admin/ActivityLog.tsx](../../src/pages/admin/ActivityLog.tsx) - Refactored to use TanStack Query
- [src/pages/admin/AdminStatsPage.tsx](../../src/pages/admin/AdminStatsPage.tsx) - Refactored to use TanStack Query
- [src/pages/admin/CorrectionsPage.tsx](../../src/pages/admin/CorrectionsPage.tsx) - Refactored to use TanStack Query
- [src/pages/admin/ActivityLog.tsx](../../src/pages/admin/ActivityLog.tsx) - Uses useActivityLogQuery
- [src/pages/admin/AdminStatsPage.tsx](../../src/pages/admin/AdminStatsPage.tsx) - Uses useApplicationStatsQuery
- [src/pages/admin/CorrectionsPage.tsx](../../src/pages/admin/CorrectionsPage.tsx) - Uses useSuggestedCorrectionsQuery, useMasterItemsQuery, useCategoriesQuery
**Benefits Achieved:**
- ✅ Removed 121 lines from admin components (-32%)
- ✅ Eliminated manual state management from all admin queries
- ✅ Automatic parallel fetching (CorrectionsPage fetches 3 queries simultaneously)
- ✅ Consistent caching strategy across all admin features
- ✅ Smart refetching with appropriate stale times (30s to 1 hour)
- ✅ Automatic caching of admin data
- ✅ Parallel fetching (CorrectionsPage fetches 3 queries simultaneously)
- ✅ Consistent stale times (30s to 2 min based on data volatility)
- ✅ Shared cache across components (useMasterItemsQuery reused)
**See**: [plans/adr-0005-phase-5-summary.md](../../plans/adr-0005-phase-5-summary.md) for detailed documentation
### Phase 6: Analytics Features (✅ Complete - 2026-01-10)
### Phase 6: Cleanup (🔄 In Progress - 2026-01-08)
**Goal:** Migrate analytics and deals features.
**Completed:**
**Files Created:**
- ✅ Removed custom useInfiniteQuery hook (not used in production)
- ✅ Analyzed remaining useApi/useApiOnMount usage
- [src/hooks/queries/useBestSalePricesQuery.ts](../../src/hooks/queries/useBestSalePricesQuery.ts) - Best sale prices for watched items
- [src/hooks/queries/useFlyerItemsForFlyersQuery.ts](../../src/hooks/queries/useFlyerItemsForFlyersQuery.ts) - Batch fetch items for multiple flyers
- [src/hooks/queries/useFlyerItemCountQuery.ts](../../src/hooks/queries/useFlyerItemCountQuery.ts) - Count items across flyers
**Remaining:**
**Files Modified:**
- ⏳ Migrate auth features (AuthProvider, AuthView, ProfileManager) from useApi to TanStack Query
- ⏳ Migrate useActiveDeals from useApi to TanStack Query
- ⏳ Migrate AdminBrandManager from useApiOnMount to TanStack Query
- ⏳ Consider removal of useApi/useApiOnMount hooks once fully migrated
- ⏳ Update all tests for migrated features
- [src/pages/MyDealsPage.tsx](../../src/pages/MyDealsPage.tsx) - Now uses useBestSalePricesQuery
- [src/hooks/useActiveDeals.tsx](../../src/hooks/useActiveDeals.tsx) - Refactored to use TanStack Query hooks
**Note**: `useApi` and `useApiOnMount` are still actively used in 6 production files for authentication, profile management, and some admin features. Full migration of these critical features requires careful planning and is documented as future work.
**Benefits Achieved:**
- ✅ Removed useApi dependency from analytics features
- ✅ Automatic caching of deal data (2-5 minute stale times)
- ✅ Consistent error handling via TanStack Query
- ✅ Batch fetching for flyer items (single query for multiple flyers)
### Phase 7: Cleanup (✅ Complete - 2026-01-10)
**Goal:** Remove legacy hooks once migration is complete.
**Files Created:**
- [src/hooks/queries/useUserAddressQuery.ts](../../src/hooks/queries/useUserAddressQuery.ts) - User address fetching
- [src/hooks/queries/useAuthProfileQuery.ts](../../src/hooks/queries/useAuthProfileQuery.ts) - Auth profile fetching
- [src/hooks/mutations/useGeocodeMutation.ts](../../src/hooks/mutations/useGeocodeMutation.ts) - Address geocoding
**Files Modified:**
- [src/hooks/useProfileAddress.ts](../../src/hooks/useProfileAddress.ts) - Refactored to use TanStack Query
- [src/providers/AuthProvider.tsx](../../src/providers/AuthProvider.tsx) - Refactored to use TanStack Query
**Files Removed:**
- ~~src/hooks/useApi.ts~~ - Legacy hook removed
- ~~src/hooks/useApi.test.ts~~ - Test file removed
- ~~src/hooks/useApiOnMount.ts~~ - Legacy hook removed
- ~~src/hooks/useApiOnMount.test.ts~~ - Test file removed
**Benefits Achieved:**
- ✅ Removed all legacy `useApi` and `useApiOnMount` hooks
- ✅ Complete TanStack Query coverage for all data fetching
- ✅ Consistent error handling across the entire application
- ✅ Unified caching strategy for all server state
### Phase 8: Additional Component Migration (✅ Complete - 2026-01-10)
**Goal:** Migrate remaining components with manual data fetching to TanStack Query.
**Files Created:**
- [src/hooks/queries/useUserProfileDataQuery.ts](../../src/hooks/queries/useUserProfileDataQuery.ts) - Combined user profile + achievements query
- [src/hooks/queries/useLeaderboardQuery.ts](../../src/hooks/queries/useLeaderboardQuery.ts) - Public leaderboard data
- [src/hooks/queries/usePriceHistoryQuery.ts](../../src/hooks/queries/usePriceHistoryQuery.ts) - Historical price data for watched items
**Files Modified:**
- [src/hooks/useUserProfileData.ts](../../src/hooks/useUserProfileData.ts) - Refactored to use useUserProfileDataQuery
- [src/components/Leaderboard.tsx](../../src/components/Leaderboard.tsx) - Refactored to use useLeaderboardQuery
- [src/features/charts/PriceHistoryChart.tsx](../../src/features/charts/PriceHistoryChart.tsx) - Refactored to use usePriceHistoryQuery
**Benefits Achieved:**
- ✅ Parallel fetching for profile + achievements data
- ✅ Public leaderboard cached with 2-minute stale time
- ✅ Price history cached with 10-minute stale time (data changes infrequently)
- ✅ Backward-compatible setProfile function via queryClient.setQueryData
- ✅ Stable query keys with sorted IDs for price history (see the sketch below)
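To illustrate the stable-key point, here is a hedged sketch of the pattern; the real usePriceHistoryQuery and its endpoint may differ:

```typescript
// Illustration of "stable query keys with sorted IDs": sorting makes the key
// identical no matter what order callers pass IDs in, so [3,1,2] and [1,2,3]
// share one cache entry. The endpoint and hook shape are assumptions.
import { useQuery } from "@tanstack/react-query";

async function fetchPriceHistory(ids: number[]): Promise<unknown> {
  const res = await fetch(`/api/price-history?ids=${ids.join(",")}`); // hypothetical endpoint
  if (!res.ok) throw new Error(`Request failed: ${res.status}`);
  return res.json();
}

export function usePriceHistoryQuery(itemIds: number[]) {
  const sortedIds = [...itemIds].sort((a, b) => a - b);
  return useQuery({
    queryKey: ["priceHistory", sortedIds],
    queryFn: () => fetchPriceHistory(sortedIds),
    staleTime: 10 * 60 * 1000, // 10 minutes, matching the stale time noted above
    enabled: sortedIds.length > 0,
  });
}
```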
## Migration Status
Current Coverage: **85% complete**
Current Coverage: **100% complete**
- ✅ **User Features: 100%** - All core user-facing features fully migrated (queries + mutations + hooks)
- ✅ **Admin Features: 100%** - Activity log, stats, corrections now use TanStack Query
- ❌ **Auth/Profile Features: 0%** - Auth provider, profile manager still use useApi
- ❌ **Analytics Features: 0%** - Active Deals need migration
- ❌ **Brand Management: 0%** - AdminBrandManager still uses useApiOnMount
| Category | Total | Migrated | Status |
| ----------------------------- | ----- | -------- | ------- |
| Query Hooks (User) | 7 | 7 | ✅ 100% |
| Query Hooks (Admin) | 4 | 4 | ✅ 100% |
| Query Hooks (Analytics) | 3 | 3 | ✅ 100% |
| Query Hooks (Phase 8) | 3 | 3 | ✅ 100% |
| Mutation Hooks | 8 | 8 | ✅ 100% |
| User Hooks | 2 | 2 | ✅ 100% |
| Analytics Features | 2 | 2 | ✅ 100% |
| Component Migration (Phase 8) | 3 | 3 | ✅ 100% |
| Legacy Hook Cleanup | 4 | 4 | ✅ 100% |
**Completed:**
- ✅ Core query hooks (flyers, flyerItems, masterItems, watchedItems, shoppingLists)
- ✅ Admin query hooks (activityLog, applicationStats, suggestedCorrections, categories)
- ✅ Analytics query hooks (bestSalePrices, flyerItemsForFlyers, flyerItemCount)
- ✅ Auth/Profile query hooks (authProfile, userAddress)
- ✅ Phase 8 query hooks (userProfileData, leaderboard, priceHistory)
- ✅ All mutation hooks (watched items, shopping lists, geocode)
- ✅ Provider refactoring (AppProviders, FlyersProvider, MasterItemsProvider, UserDataProvider, AuthProvider)
- ✅ User hooks refactoring (useWatchedItems, useShoppingLists, useProfileAddress, useUserProfileData)
- ✅ Admin component migration (ActivityLog, AdminStatsPage, CorrectionsPage)
- ✅ Analytics features (MyDealsPage, useActiveDeals)
- ✅ Component migration (Leaderboard, PriceHistoryChart)
- ✅ Legacy hooks removed (useApi, useApiOnMount)
See [plans/adr-0005-master-migration-status.md](../../plans/adr-0005-master-migration-status.md) for complete tracking of all components.


@@ -25,15 +25,15 @@ We will formalize the testing pyramid for the project, defining the role of each
### Testing Framework Stack
| Tool | Version | Purpose |
| ---- | ------- | ------- |
| Vitest | 4.0.15 | Test runner for all test types |
| @testing-library/react | 16.3.0 | React component testing |
| @testing-library/jest-dom | 6.9.1 | DOM assertion matchers |
| supertest | 7.1.4 | HTTP assertion library for API testing |
| msw | 2.12.3 | Mock Service Worker for network mocking |
| testcontainers | 11.8.1 | Database containerization (optional) |
| c8 + nyc | 10.1.3 / 17.1.0 | Coverage reporting |
| Tool | Version | Purpose |
| ------------------------- | --------------- | --------------------------------------- |
| Vitest | 4.0.15 | Test runner for all test types |
| @testing-library/react | 16.3.0 | React component testing |
| @testing-library/jest-dom | 6.9.1 | DOM assertion matchers |
| supertest | 7.1.4 | HTTP assertion library for API testing |
| msw | 2.12.3 | Mock Service Worker for network mocking |
| testcontainers | 11.8.1 | Database containerization (optional) |
| c8 + nyc | 10.1.3 / 17.1.0 | Coverage reporting |
### Test File Organization
@@ -61,12 +61,12 @@ src/
### Configuration Files
| Config | Environment | Purpose |
| ------ | ----------- | ------- |
| `vite.config.ts` | jsdom | Unit tests (React components, hooks) |
| `vitest.config.integration.ts` | node | Integration tests (API routes) |
| `vitest.config.e2e.ts` | node | E2E tests (full user flows) |
| `vitest.workspace.ts` | - | Orchestrates all test projects |
| Config | Environment | Purpose |
| ------------------------------ | ----------- | ------------------------------------ |
| `vite.config.ts` | jsdom | Unit tests (React components, hooks) |
| `vitest.config.integration.ts` | node | Integration tests (API routes) |
| `vitest.config.e2e.ts` | node | E2E tests (full user flows) |
| `vitest.workspace.ts` | - | Orchestrates all test projects |
### Test Pyramid
@@ -150,9 +150,7 @@ describe('Auth API', () => {
});
it('GET /api/auth/me returns user profile', async () => {
const response = await request
.get('/api/auth/me')
.set('Authorization', `Bearer ${authToken}`);
const response = await request.get('/api/auth/me').set('Authorization', `Bearer ${authToken}`);
expect(response.status).toBe(200);
expect(response.body.user.email).toBeDefined();
@@ -212,13 +210,13 @@ it('creates flyer with items', () => {
### Test Utilities
| Utility | Purpose |
| ------- | ------- |
| Utility | Purpose |
| ----------------------- | ------------------------------------------ |
| `renderWithProviders()` | Wrap components with AppProviders + Router |
| `createAndLoginUser()` | Create user and return auth token |
| `cleanupDb()` | Database cleanup respecting FK constraints |
| `createTestApp()` | Create Express app for route testing |
| `poll()` | Polling utility for async operations |
| `createAndLoginUser()` | Create user and return auth token |
| `cleanupDb()` | Database cleanup respecting FK constraints |
| `createTestApp()` | Create Express app for route testing |
| `poll()` | Polling utility for async operations |
### Coverage Configuration
@@ -257,11 +255,11 @@ npm run clean
### Test Timeouts
| Test Type | Timeout | Rationale |
| --------- | ------- | --------- |
| Unit | 5 seconds | Fast, isolated tests |
| Integration | 60 seconds | AI service calls, DB operations |
| E2E | 120 seconds | Full user flow with multiple API calls |
| Test Type | Timeout | Rationale |
| ----------- | ----------- | -------------------------------------- |
| Unit | 5 seconds | Fast, isolated tests |
| Integration | 60 seconds | AI service calls, DB operations |
| E2E | 120 seconds | Full user flow with multiple API calls |
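These timeouts translate to vitest configuration roughly as follows (illustrative; the repo's actual config files may differ):

```typescript
// Sketch of vitest.config.integration.ts-style settings matching the table above.
import { defineConfig } from "vitest/config";

export default defineConfig({
  test: {
    environment: "node",
    testTimeout: 60_000, // integration tests: AI service calls, DB operations
    hookTimeout: 60_000,
  },
});
```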
## Best Practices
@@ -298,6 +296,62 @@ npm run clean
2. **Integration tests**: Mock only external APIs (AI services)
3. **E2E tests**: Minimal mocking, use real services where possible
### Testing Code Smells
**When testing requires any of the following patterns, treat it as a code smell indicating the production code needs refactoring:**
1. **Capturing callbacks through mocks**: If you need to capture a callback passed to a mock and manually invoke it to test behavior, the code under test likely has poor separation of concerns.
2. **Complex module resets**: If tests require `vi.resetModules()`, `vi.doMock()`, or careful ordering of mock setup to work correctly, the module likely has problematic initialization or hidden global state.
3. **Indirect verification**: If you can only verify behavior by checking that internal mocks were called with specific arguments (rather than asserting on direct outputs), the code likely lacks proper return values or has side effects that should be explicit.
4. **Excessive mock setup**: If setting up mocks requires more lines than the actual test assertions, consider whether the code under test has too many dependencies or responsibilities.
**The Fix**: Rather than writing complex test scaffolding, refactor the production code to be more testable:
- Extract pure functions that can be tested with simple input/output assertions
- Use dependency injection to make dependencies explicit and easily replaceable
- Return values from functions instead of relying on side effects
- Split modules with complex initialization into smaller, focused units
- Make async flows explicit and controllable rather than callback-based
**Example anti-pattern**:
```typescript
// BAD: Capturing callback to test behavior
let capturedCallback: (data: string) => void = vi.fn();
mockService.onEvent.mockImplementation((cb) => {
  capturedCallback = cb;
});
await initializeModule();
capturedCallback('test-data'); // Manually triggering to test
expect(mockOtherService.process).toHaveBeenCalledWith('test-data');
```
**Example preferred pattern**:
```typescript
// GOOD: Direct input/output testing
const result = await processEvent('test-data');
expect(result).toEqual({ processed: true, data: 'test-data' });
```
### Known Code Smell Violations (Technical Debt)
The following files contain acknowledged code smell violations that are deferred for future refactoring:
| File | Violations | Rationale for Deferral |
| ------------------------------------------------------ | ------------------------------------------------------ | ----------------------------------------------------------------------------------------- |
| `src/services/queueService.workers.test.ts` | Callback capture, `vi.resetModules()`, excessive setup | BullMQ workers instantiate at module load; business logic is tested via service classes |
| `src/services/workers.server.test.ts` | `vi.resetModules()` | Same as above - worker wiring tests |
| `src/services/queues.server.test.ts` | `vi.resetModules()` | Queue instantiation at module load |
| `src/App.test.tsx` | Callback capture, excessive setup | Component integration test; refactoring would require significant UI architecture changes |
| `src/features/voice-assistant/VoiceAssistant.test.tsx` | Multiple callback captures | WebSocket/audio APIs are inherently callback-based |
| `src/services/aiService.server.test.ts` | Multiple `vi.resetModules()` | AI service initialization complexity |
**Policy**: New code should follow the code smell guidelines. These existing violations are tracked here and will be addressed when the underlying modules are refactored or replaced.
## Key Files
- `vite.config.ts` - Unit test configuration

Some files were not shown because too many files have changed in this diff.