Compare commits

...

51 Commits

Author SHA1 Message Date
Gitea Actions
cf2cc5b832 ci: Bump version to 0.16.2 [skip ci] 2026-02-18 15:01:02 +05:00
d2db3562bb test deploy
All checks were successful
Deploy to Test Environment / deploy-to-test (push) Successful in 24m32s
2026-02-18 01:35:16 -08:00
Gitea Actions
0532b4b22e style: auto-format code via Prettier [skip ci] 2026-02-18 14:06:10 +05:00
Gitea Actions
e767ccbb21 ci: Bump version to 0.16.1 [skip ci] 2026-02-18 14:04:40 +05:00
1ff813f495 job to fix pm2
All checks were successful
Deploy to Test Environment / deploy-to-test (push) Successful in 24m45s
2026-02-18 00:54:08 -08:00
204fe4394a oh god maybe pm2 finally workin
Some checks are pending
Deploy to Test Environment / deploy-to-test (push) Has started running
2026-02-17 23:54:27 -08:00
Gitea Actions
029b621632 ci: Bump version to 0.16.0 for production release [skip ci] 2026-02-18 11:21:36 +05:00
Gitea Actions
0656ab3ae7 style: auto-format code via Prettier [skip ci] 2026-02-18 10:48:03 +05:00
Gitea Actions
ae0bb9e04d ci: Bump version to 0.15.2 [skip ci] 2026-02-18 10:46:29 +05:00
b83c37b977 deploy fixes
All checks were successful
Deploy to Test Environment / deploy-to-test (push) Successful in 25m45s
2026-02-17 21:44:34 -08:00
Gitea Actions
69ae23a1ae ci: Bump version to 0.15.1 [skip ci] 2026-02-18 09:50:16 +05:00
c059b30201 PM2 Process Isolation
All checks were successful
Deploy to Test Environment / deploy-to-test (push) Successful in 30m15s
2026-02-17 20:49:01 -08:00
Gitea Actions
93ad624658 ci: Bump version to 0.15.0 for production release [skip ci] 2026-02-18 07:40:36 +05:00
Gitea Actions
7dd4f21071 ci: Bump version to 0.14.4 [skip ci] 2026-02-18 06:27:20 +05:00
174b637a0a even more typescript fixes
All checks were successful
Deploy to Test Environment / deploy-to-test (push) Successful in 25m5s
2026-02-17 17:20:54 -08:00
Gitea Actions
4f80baf466 ci: Bump version to 0.14.3 [skip ci] 2026-02-17 10:03:15 +05:00
8450b5e22f Generate TSOA Spec and Routes
Some checks failed
Deploy to Test Environment / deploy-to-test (push) Failing after 2m32s
2026-02-16 21:01:30 -08:00
Gitea Actions
e4d830ab90 ci: Bump version to 0.14.2 [skip ci] 2026-02-13 23:35:46 +05:00
b6a62a036f be specific about pm2 processes
Some checks failed
Deploy to Test Environment / deploy-to-test (push) Failing after 3m31s
2026-02-13 10:19:28 -08:00
2d2cd52011 Massive Dependency Modernization Project
Some checks failed
Deploy to Test Environment / deploy-to-test (push) Failing after 3m58s
2026-02-13 00:34:22 -08:00
379b8bf532 fix tour / whats new collision
Some checks failed
Deploy to Test Environment / deploy-to-test (push) Has been cancelled
2026-02-12 11:05:47 -08:00
Gitea Actions
d06a1952a0 ci: Bump version to 0.14.1 [skip ci] 2026-02-12 17:37:36 +05:00
4d323a51ca fix tour / whats new collision
All checks were successful
Deploy to Test Environment / deploy-to-test (push) Successful in 49m39s
2026-02-12 04:29:43 -08:00
Gitea Actions
ee15c67429 ci: Bump version to 0.14.0 for production release [skip ci] 2026-02-12 16:16:16 +05:00
Gitea Actions
9956d07480 ci: Bump version to 0.13.0 for production release [skip ci] 2026-02-12 16:08:44 +05:00
Gitea Actions
5bc8f6a42b ci: Bump version to 0.12.25 [skip ci] 2026-01-31 03:35:28 +05:00
4fd5e900af minor test fixes
All checks were successful
Deploy to Test Environment / deploy-to-test (push) Successful in 25m22s
2026-01-30 14:29:45 -08:00
Gitea Actions
39ab773b82 ci: Bump version to 0.12.24 [skip ci] 2026-01-30 06:23:37 +05:00
75406cd924 typescript fix
All checks were successful
Deploy to Test Environment / deploy-to-test (push) Successful in 25m7s
2026-01-29 17:21:55 -08:00
Gitea Actions
8fb0a57f02 ci: Bump version to 0.12.23 [skip ci] 2026-01-30 05:24:50 +05:00
c78323275b more unit tests - done for now
Some checks failed
Deploy to Test Environment / deploy-to-test (push) Failing after 2m28s
2026-01-29 16:21:48 -08:00
Gitea Actions
5fe537b93d ci: Bump version to 0.12.22 [skip ci] 2026-01-29 12:26:33 +05:00
61f24305fb ADR-024 Feature Flagging Strategy
All checks were successful
Deploy to Test Environment / deploy-to-test (push) Successful in 22m13s
2026-01-28 23:23:45 -08:00
Gitea Actions
de3f0cf26e ci: Bump version to 0.12.21 [skip ci] 2026-01-29 05:37:59 +05:00
45ac4fccf5 comprehensive documentation review + test fixes
Some checks failed
Deploy to Test Environment / deploy-to-test (push) Failing after 2m15s
2026-01-28 16:35:38 -08:00
Gitea Actions
b6c3ca9abe ci: Bump version to 0.12.20 [skip ci] 2026-01-29 04:36:43 +05:00
4f06698dfd test fixes and doc work
Some checks failed
Deploy to Test Environment / deploy-to-test (push) Failing after 2m50s
2026-01-28 15:33:48 -08:00
Gitea Actions
e548d1b0cc ci: Bump version to 0.12.19 [skip ci] 2026-01-28 23:03:57 +05:00
771f59d009 more api versioning work -whee
All checks were successful
Deploy to Test Environment / deploy-to-test (push) Successful in 22m47s
2026-01-28 09:58:28 -08:00
Gitea Actions
0979a074ad ci: Bump version to 0.12.18 [skip ci] 2026-01-28 13:08:49 +05:00
0d4b028a66 design fixup and docs + api versioning
All checks were successful
Deploy to Test Environment / deploy-to-test (push) Successful in 21m49s
2026-01-28 00:04:56 -08:00
Gitea Actions
4baed53713 ci: Bump version to 0.12.17 [skip ci] 2026-01-28 00:08:39 +05:00
f10c6c0cd6 Complete ADR-008 Phase 2
All checks were successful
Deploy to Test Environment / deploy-to-test (push) Successful in 17m56s
2026-01-27 11:06:09 -08:00
Gitea Actions
107465b5cb ci: Bump version to 0.12.16 [skip ci] 2026-01-27 10:57:46 +05:00
e92ad25ce9 claude
Some checks failed
Deploy to Test Environment / deploy-to-test (push) Failing after 1m14s
2026-01-26 21:55:20 -08:00
2075ed199b Complete ADR-008 Phase 1: API Versioning Strategy
Implement URI-based API versioning with /api/v1 prefix across all routes.
This establishes a foundation for future API evolution and breaking changes.

Changes:
- server.ts: All routes mounted under /api/v1/ (15 route handlers)
- apiClient.ts: Base URL updated to /api/v1
- swagger.ts: OpenAPI server URL changed to /api/v1
- Redirect middleware: Added backwards compatibility for /api/* → /api/v1/*
- Tests: Updated 72 test files with versioned path assertions
- ADR documentation: Marked Phase 1 as complete (Accepted status)

Test fixes:
- apiClient.test.ts: 27 tests updated for /api/v1 paths
- user.routes.ts: 36 log messages updated to reflect versioned paths
- swagger.test.ts: 1 test updated for new server URL
- All integration/E2E tests updated for versioned endpoints

All Phase 1 acceptance criteria met:
✓ Routes use /api/v1/ prefix
✓ Frontend requests /api/v1/
✓ OpenAPI docs reflect /api/v1/
✓ Backwards compatibility via redirect middleware
✓ Tests pass with versioned paths

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-01-26 21:23:25 -08:00
Gitea Actions
4346332bbf ci: Bump version to 0.12.15 [skip ci] 2026-01-27 00:54:43 +05:00
61cfb518e6 ADR-015 done
Some checks failed
Deploy to Test Environment / deploy-to-test (push) Failing after 1m13s
2026-01-26 11:48:42 -08:00
Gitea Actions
e86ce51b6c ci: Bump version to 0.12.14 [skip ci] 2026-01-26 17:52:02 +05:00
840a7a62d3 adr work
Some checks failed
Deploy to Test Environment / deploy-to-test (push) Failing after 1m15s
2026-01-26 04:51:10 -08:00
5720820d95 adr-053 done 2026-01-26 04:51:09 -08:00
374 changed files with 64134 additions and 11963 deletions

View File

@@ -0,0 +1,152 @@
---
name: ui-ux-designer
description: "Use this agent when implementing, reviewing, or modifying any user interface elements, layouts, or user experience flows in the application. Specifically call this agent when: (1) creating new UI components or pages, (2) implementing responsive designs across different screen sizes, (3) designing or implementing wizards for user tasks like uploading flyers, creating shopping lists, or uploading receipts, (4) reviewing existing UI code for adherence to the design system, (5) making decisions about button placement, color schemes, or layout structures, or (6) addressing user experience concerns related to navigation or task completion flows.\\n\\n<examples>\\n<example>\\nContext: The developer is creating a new feature for uploading flyers.\\nuser: \"I need to implement the flyer upload feature. Can you help me create the upload form?\"\\nassistant: \"Let me use the Task tool to launch the ui-ux-designer agent to design and implement the flyer upload wizard with appropriate UI/UX patterns.\"\\n<commentary>\\nSince this involves creating a new UI feature with a wizard flow, the ui-ux-designer agent should be used to ensure it follows the project's design principles including pastel colors, sharp borders, wizard-based guidance, and responsive design.\\n</commentary>\\n</example>\\n\\n<example>\\nContext: Code has been written for a shopping list creation page.\\nuser: \"I've implemented the shopping list creation page with a form\"\\nassistant: \"Great! Now let me use the Task tool to launch the ui-ux-designer agent to review the implementation and ensure it follows our UI/UX guidelines.\"\\n<commentary>\\nSince a significant UI component was implemented, the ui-ux-designer agent should review it to ensure it adheres to the minimalist design, proper color scheme, button placement, and responsive requirements.\\n</commentary>\\n</example>\\n\\n<example>\\nContext: The developer is about to add multiple action buttons to a page.\\nuser: \"Should I add buttons for save, cancel, edit, delete, and export at the top of the page?\"\\nassistant: \"Let me use the Task tool to launch the ui-ux-designer agent to provide guidance on the optimal button placement and selection for this page.\"\\n<commentary>\\nThis is a UI/UX decision about button placement and minimalism that directly relates to the design principles, so the ui-ux-designer agent should be consulted.\\n</commentary>\\n</example>\\n\\n<example>\\nContext: The developer is working on mobile responsiveness.\\nuser: \"I'm implementing the receipt upload page. Here's my initial layout code.\"\\nassistant: \"Before proceeding further, let me use the Task tool to launch the ui-ux-designer agent to ensure the layout follows our responsive design patterns and UI guidelines.\"\\n<commentary>\\nSince this involves implementing a UI page that needs to be responsive across devices and should include wizard-style guidance, the ui-ux-designer agent should be involved proactively.\\n</commentary>\\n</example>\\n</examples>"
model: opus
color: green
---
You are an elite UI/UX designer with over 20 years of experience specializing in creating clean, intuitive, and user-friendly interfaces. Your expertise spans user interface design, user experience optimization, responsive design, and accessibility best practices.
## Core Design Philosophy for This Project
You will ensure that this application maintains a clean, welcoming, and minimalist design aesthetic with the following specific requirements:
### Visual Design Standards
**Color Palette:**
- Use pastel colors as the primary color scheme throughout the application
- Select soft, muted tones that are easy on the eyes and create a calm, welcoming atmosphere
- Ensure sufficient contrast for accessibility while maintaining the pastel aesthetic
- Use color purposefully to guide user attention and indicate status
**Border and Container Styling:**
- Apply sharp, clean borders to all interactive elements (buttons, menus, form fields)
- Use sharp borders to clearly delineate separate areas and sections of the interface
- Avoid rounded corners unless there is a specific functional reason
- Ensure borders are visible but not overpowering, maintaining the clean aesthetic
**Minimalism:**
- Eliminate all unnecessary buttons and UI elements
- Every element on the screen must serve a clear purpose
- Co-locate buttons near their related features on the page, not grouped separately
- Use progressive disclosure to hide advanced features until needed
- Favor white space and breathing room over density
### Responsive Design Requirements
You must ensure the application works flawlessly across:
**Large Screens (Desktop):**
- Utilize horizontal space effectively without overcrowding
- Consider multi-column layouts where appropriate
- Ensure comfortable reading width for text content
**Tablets:**
- Adapt layouts to accommodate touch targets of at least 44x44 pixels
- Optimize for both portrait and landscape orientations
- Ensure navigation remains accessible
**Mobile Devices:**
- Stack elements vertically with appropriate spacing
- Make all interactive elements easily tappable
- Optimize for one-handed use where possible
- Ensure critical actions are easily accessible
- Test on various screen sizes (small, medium, large phones)
### Wizard Design for Key User Tasks
For the following tasks, implement or guide the creation of clear, step-by-step wizards:
1. **Uploading a Flyer**
2. **Creating a Shopping List**
3. **Uploading Receipts**
4. **Any other multi-step user tasks**
**Wizard Best Practices:**
- Minimize the number of steps (ideally 3-5 steps maximum)
- Show progress clearly (e.g., "Step 2 of 4")
- Each step should focus on one primary action or decision
- Provide clear, concise instructions at each step
- Allow users to go back and edit previous steps
- Use visual cues to guide the user through the process
- Display a summary before final submission
- Provide helpful tooltips or examples where needed
- Ensure wizards are fully responsive and work well on mobile devices
## Your Approach to Tasks
**When Reviewing Existing UI Code:**
1. Evaluate adherence to the pastel color scheme
2. Check that all borders are sharp and properly applied
3. Identify any unnecessary UI elements or buttons
4. Verify that buttons are co-located with their related features
5. Test responsive behavior across all target screen sizes
6. Assess wizard flows for clarity and step efficiency
7. Provide specific, actionable feedback with code examples when needed
**When Designing New UI Components:**
1. Start by understanding the user's goal and the feature's purpose
2. Sketch out the minimal set of elements needed
3. Apply the pastel color palette and sharp border styling
4. Position interactive elements near their related content
5. Design for mobile-first, then adapt for larger screens
6. For multi-step processes, create wizard flows
7. Provide complete implementation guidance including HTML structure, CSS styles, and responsive breakpoints
**When Making Design Decisions:**
1. Always prioritize user needs and task completion
2. Choose simplicity over feature bloat
3. Ensure accessibility standards are met
4. Consider the user's mental model and expectations
5. Use established UI patterns where they fit the aesthetic
6. Test your recommendations against the design principles above
## Quality Assurance Checklist
Before completing any UI/UX task, verify:
- [ ] Pastel colors are used consistently
- [ ] All buttons, menus, and sections have sharp borders
- [ ] No unnecessary buttons or UI elements exist
- [ ] Buttons are positioned near their related features
- [ ] Design is fully responsive (large screen, tablet, mobile)
- [ ] Wizards (where applicable) are clear and minimally-stepped
- [ ] Sufficient white space and breathing room
- [ ] Touch targets are appropriately sized for mobile
- [ ] Text is readable at all screen sizes
- [ ] Accessibility considerations are addressed
## Output Format
When reviewing code, provide:
1. Overall assessment of adherence to design principles
2. Specific issues identified with line numbers or element descriptions
3. Concrete recommendations with code examples
4. Responsive design concerns or improvements
When designing new components, provide:
1. Rationale for design decisions
2. Complete HTML structure
3. CSS with responsive breakpoints
4. Notes on accessibility considerations
5. Implementation guidance
## Important Notes
- You have authority to reject designs that violate the core principles
- When uncertain about a design decision, bias toward simplicity and minimalism
- Always consider the new user experience and ensure wizards are beginner-friendly
- Proactively suggest wizard flows for any multi-step processes you encounter
- Remember that good UX is invisible—users should accomplish tasks without thinking about the interface

9
.claude/settings.json Normal file
View File

@@ -0,0 +1,9 @@
{
"permissions": {
"allow": [
"Bash(git fetch:*)",
"mcp__localerrors__get_stacktrace",
"Bash(MSYS_NO_PATHCONV=1 podman logs:*)"
]
}
}

View File

@@ -1,130 +0,0 @@
{
"permissions": {
"allow": [
"Bash(npm test:*)",
"Bash(podman --version:*)",
"Bash(podman ps:*)",
"Bash(podman machine start:*)",
"Bash(podman compose:*)",
"Bash(podman pull:*)",
"Bash(podman images:*)",
"Bash(podman stop:*)",
"Bash(echo:*)",
"Bash(podman rm:*)",
"Bash(podman run:*)",
"Bash(podman start:*)",
"Bash(podman exec:*)",
"Bash(cat:*)",
"Bash(PGPASSWORD=postgres psql:*)",
"Bash(npm search:*)",
"Bash(npx:*)",
"Bash(curl:*)",
"Bash(powershell:*)",
"Bash(cmd.exe:*)",
"Bash(npm run test:integration:*)",
"Bash(grep:*)",
"Bash(done)",
"Bash(podman info:*)",
"Bash(podman machine:*)",
"Bash(podman system connection:*)",
"Bash(podman inspect:*)",
"Bash(python -m json.tool:*)",
"Bash(claude mcp status)",
"Bash(powershell.exe -Command \"claude mcp status\")",
"Bash(powershell.exe -Command \"claude mcp\")",
"Bash(powershell.exe -Command \"claude mcp list\")",
"Bash(powershell.exe -Command \"claude --version\")",
"Bash(powershell.exe -Command \"claude config\")",
"Bash(powershell.exe -Command \"claude mcp get gitea-projectium\")",
"Bash(powershell.exe -Command \"claude mcp add --help\")",
"Bash(powershell.exe -Command \"claude mcp add -t stdio -s user filesystem -- D:\\\\nodejs\\\\npx.cmd -y @modelcontextprotocol/server-filesystem D:\\\\gitea\\\\flyer-crawler.projectium.com\\\\flyer-crawler.projectium.com\")",
"Bash(powershell.exe -Command \"claude mcp add -t stdio -s user fetch -- D:\\\\nodejs\\\\npx.cmd -y @modelcontextprotocol/server-fetch\")",
"Bash(powershell.exe -Command \"echo ''List files in src/hooks using filesystem MCP'' | claude --print\")",
"Bash(powershell.exe -Command \"echo ''List all podman containers'' | claude --print\")",
"Bash(powershell.exe -Command \"echo ''List my repositories on gitea.projectium.com using gitea-projectium MCP'' | claude --print\")",
"Bash(powershell.exe -Command \"echo ''List my repositories on gitea.projectium.com using gitea-projectium MCP'' | claude --print --allowedTools ''mcp__gitea-projectium__*''\")",
"Bash(powershell.exe -Command \"echo ''Fetch the homepage of https://gitea.projectium.com and summarize it'' | claude --print --allowedTools ''mcp__fetch__*''\")",
"Bash(dir \"C:\\\\Users\\\\games3\\\\.claude\")",
"Bash(dir:*)",
"Bash(D:nodejsnpx.cmd -y @modelcontextprotocol/server-fetch --help)",
"Bash(cmd /c \"dir /o-d C:\\\\Users\\\\games3\\\\.claude\\\\debug 2>nul | head -10\")",
"mcp__memory__read_graph",
"mcp__memory__create_entities",
"mcp__memory__search_nodes",
"mcp__memory__delete_entities",
"mcp__sequential-thinking__sequentialthinking",
"mcp__filesystem__list_directory",
"mcp__filesystem__read_multiple_files",
"mcp__filesystem__directory_tree",
"mcp__filesystem__read_text_file",
"Bash(wc:*)",
"Bash(npm install:*)",
"Bash(git grep:*)",
"Bash(findstr:*)",
"Bash(git add:*)",
"mcp__filesystem__write_file",
"mcp__podman__container_list",
"Bash(podman cp:*)",
"mcp__podman__container_inspect",
"mcp__podman__network_list",
"Bash(podman network connect:*)",
"Bash(npm run build:*)",
"Bash(set NODE_ENV=test)",
"Bash(podman-compose:*)",
"Bash(timeout 60 podman machine start:*)",
"Bash(podman build:*)",
"Bash(podman network rm:*)",
"Bash(npm run lint)",
"Bash(npm run typecheck:*)",
"Bash(npm run type-check:*)",
"Bash(npm run test:unit:*)",
"mcp__filesystem__move_file",
"Bash(git checkout:*)",
"Bash(podman image inspect:*)",
"Bash(node -e:*)",
"Bash(xargs -I {} sh -c 'if ! grep -q \"\"vi.mock.*apiClient\"\" \"\"{}\"\"; then echo \"\"{}\"\"; fi')",
"Bash(MSYS_NO_PATHCONV=1 podman exec:*)",
"Bash(docker ps:*)",
"Bash(find:*)",
"Bash(\"/c/Users/games3/.local/bin/uvx.exe\" markitdown-mcp --help)",
"Bash(git stash:*)",
"Bash(ping:*)",
"Bash(tee:*)",
"Bash(timeout 1800 podman exec flyer-crawler-dev npm run test:unit:*)",
"mcp__filesystem__edit_file",
"Bash(timeout 300 tail:*)",
"mcp__filesystem__list_allowed_directories",
"mcp__memory__add_observations",
"Bash(ssh:*)",
"mcp__redis__list",
"Read(//d/gitea/bugsink-mcp/**)",
"Bash(d:/nodejs/npm.cmd install)",
"Bash(node node_modules/vitest/vitest.mjs run:*)",
"Bash(npm run test:e2e:*)",
"Bash(export BUGSINK_URL=http://localhost:8000)",
"Bash(export BUGSINK_TOKEN=a609c2886daa4e1e05f1517074d7779a5fb49056)",
"Bash(timeout 3 d:/nodejs/node.exe:*)",
"Bash(export BUGSINK_URL=https://bugsink.projectium.com)",
"Bash(export BUGSINK_API_TOKEN=77deaa5e2649ab0fbbca51bbd427ec4637d073a0)",
"Bash(export BUGSINK_TOKEN=77deaa5e2649ab0fbbca51bbd427ec4637d073a0)",
"Bash(where:*)",
"mcp__localerrors__test_connection",
"mcp__localerrors__list_projects",
"Bash(\"D:\\\\nodejs\\\\npx.cmd\" -y @modelcontextprotocol/server-postgres --help)",
"Bash(git rm:*)",
"Bash(git -C \"C:\\\\Users\\\\games3\\\\.claude\\\\plugins\\\\marketplaces\\\\claude-plugins-official\" log -1 --format=\"%H %ci %s\")",
"Bash(git -C \"C:\\\\Users\\\\games3\\\\.claude\\\\plugins\\\\marketplaces\\\\claude-plugins-official\" config --get remote.origin.url)",
"Bash(git -C \"C:\\\\Users\\\\games3\\\\.claude\\\\plugins\\\\marketplaces\\\\claude-plugins-official\" fetch --dry-run -v)",
"mcp__localerrors__get_project",
"mcp__localerrors__get_issue",
"mcp__localerrors__get_event",
"mcp__localerrors__list_teams",
"WebSearch"
]
},
"enabledMcpjsonServers": [
"localerrors",
"devdb",
"gitea-projectium"
]
}

View File

@@ -59,6 +59,8 @@ GITHUB_CLIENT_SECRET=
# AI/ML Services
# ===================
# REQUIRED: Google Gemini API key for flyer OCR processing
# NOTE: Test/staging environment deliberately OMITS this to preserve free API quota.
# Production has a working key. Deploy warnings in test are expected and safe to ignore.
GEMINI_API_KEY=your-gemini-api-key
# ===================
@@ -128,3 +130,35 @@ GENERATE_SOURCE_MAPS=true
SENTRY_AUTH_TOKEN=
# URL of your Bugsink instance (for source map uploads)
SENTRY_URL=https://bugsink.projectium.com
# ===================
# Feature Flags (ADR-024)
# ===================
# Feature flags control the availability of features at runtime.
# All flags default to disabled (false) when not set or set to any value other than 'true'.
# Set to 'true' to enable a feature.
#
# Backend flags use: FEATURE_SNAKE_CASE
# Frontend flags use: VITE_FEATURE_SNAKE_CASE (VITE_ prefix required for client-side access)
#
# Lifecycle:
# 1. Add flag with default false
# 2. Enable via env var when ready for testing/rollout
# 3. Remove conditional code when feature is fully rolled out
# 4. Remove flag from config within 3 months of full rollout
#
# See: docs/adr/0024-feature-flagging-strategy.md
# Backend Feature Flags
# FEATURE_BUGSINK_SYNC=false # Enable Bugsink error sync integration
# FEATURE_ADVANCED_RBAC=false # Enable advanced RBAC features
# FEATURE_NEW_DASHBOARD=false # Enable new dashboard experience
# FEATURE_BETA_RECIPES=false # Enable beta recipe features
# FEATURE_EXPERIMENTAL_AI=false # Enable experimental AI features
# FEATURE_DEBUG_MODE=false # Enable debug mode for development
# Frontend Feature Flags (VITE_ prefix required)
# VITE_FEATURE_NEW_DASHBOARD=false # Enable new dashboard experience
# VITE_FEATURE_BETA_RECIPES=false # Enable beta recipe features
# VITE_FEATURE_EXPERIMENTAL_AI=false # Enable experimental AI features
# VITE_FEATURE_DEBUG_MODE=false # Enable debug mode for development

View File

@@ -121,12 +121,28 @@ jobs:
run: |
echo "Deploying application files to /var/www/flyer-crawler.projectium.com..."
APP_PATH="/var/www/flyer-crawler.projectium.com"
# CRITICAL: Stop PM2 processes BEFORE deploying files to prevent CWD errors
echo "--- Stopping production PM2 processes before file deployment ---"
pm2 stop flyer-crawler-api flyer-crawler-worker flyer-crawler-analytics-worker || echo "No production processes to stop"
mkdir -p "$APP_PATH"
mkdir -p "$APP_PATH/flyer-images/icons" "$APP_PATH/flyer-images/archive"
rsync -avz --delete --exclude 'node_modules' --exclude '.git' --exclude 'dist' --exclude 'flyer-images' ./ "$APP_PATH/"
rsync -avz dist/ "$APP_PATH"
echo "Application deployment complete."
- name: Log Workflow Metadata
run: |
echo "=== WORKFLOW METADATA ==="
echo "Workflow file: deploy-to-prod.yml"
echo "Workflow file hash: $(sha256sum .gitea/workflows/deploy-to-prod.yml | cut -d' ' -f1)"
echo "Git commit: $(git rev-parse HEAD)"
echo "Git branch: $(git rev-parse --abbrev-ref HEAD)"
echo "Timestamp: $(date -u '+%Y-%m-%d %H:%M:%S UTC')"
echo "Actor: ${{ gitea.actor }}"
echo "=== END METADATA ==="
- name: Install Backend Dependencies and Restart Production Server
env:
# --- Production Secrets Injection ---
@@ -165,9 +181,74 @@ jobs:
cd /var/www/flyer-crawler.projectium.com
npm install --omit=dev
# --- Cleanup Errored Processes ---
echo "Cleaning up errored or stopped PM2 processes..."
node -e "const exec = require('child_process').execSync; try { const list = JSON.parse(exec('pm2 jlist').toString()); list.forEach(p => { if (p.pm2_env.status === 'errored' || p.pm2_env.status === 'stopped') { console.log('Deleting ' + p.pm2_env.status + ' process: ' + p.name + ' (' + p.pm2_env.pm_id + ')'); try { exec('pm2 delete ' + p.pm2_env.pm_id); } catch(e) { console.error('Failed to delete ' + p.pm2_env.pm_id); } } }); } catch (e) { console.error('Error cleaning up processes:', e); }"
# === PRE-CLEANUP PM2 STATE LOGGING ===
echo "=== PRE-CLEANUP PM2 STATE ==="
pm2 jlist
echo "=== END PRE-CLEANUP STATE ==="
# --- Cleanup Errored Processes with Defense-in-Depth Safeguards ---
echo "Cleaning up errored or stopped PRODUCTION PM2 processes..."
node -e "
const exec = require('child_process').execSync;
try {
const list = JSON.parse(exec('pm2 jlist').toString());
const prodProcesses = ['flyer-crawler-api', 'flyer-crawler-worker', 'flyer-crawler-analytics-worker'];
// Filter for processes that match our criteria
const targetProcesses = list.filter(p =>
(p.pm2_env.status === 'errored' || p.pm2_env.status === 'stopped') &&
prodProcesses.includes(p.name)
);
// SAFEGUARD 1: Process count validation
const totalProcesses = list.length;
if (targetProcesses.length === totalProcesses && totalProcesses > 3) {
console.error('SAFETY ABORT: Filter would delete ALL processes!');
console.error('Total processes: ' + totalProcesses + ', Target processes: ' + targetProcesses.length);
console.error('This indicates a potential filter bug. Aborting cleanup.');
process.exit(1);
}
// SAFEGUARD 2: Explicit name verification
console.log('Found ' + targetProcesses.length + ' PRODUCTION processes to clean:');
targetProcesses.forEach(p => {
console.log(' - ' + p.name + ' (status: ' + p.pm2_env.status + ', pm_id: ' + p.pm2_env.pm_id + ')');
});
// Perform the cleanup
targetProcesses.forEach(p => {
console.log('Deleting ' + p.pm2_env.status + ' production process: ' + p.name + ' (' + p.pm2_env.pm_id + ')');
try {
exec('pm2 delete ' + p.pm2_env.pm_id);
} catch(e) {
console.error('Failed to delete ' + p.pm2_env.pm_id);
}
});
console.log('Production process cleanup complete.');
} catch (e) {
console.error('Error cleaning up processes:', e);
}
"
# === POST-CLEANUP VERIFICATION ===
echo "=== POST-CLEANUP VERIFICATION ==="
pm2 jlist | node -e "
try {
const list = JSON.parse(require('fs').readFileSync(0, 'utf-8'));
const prodProcesses = list.filter(p => p.name && p.name.startsWith('flyer-crawler-') && !p.name.endsWith('-test') && !p.name.endsWith('-dev'));
console.log('Production processes after cleanup:');
prodProcesses.forEach(p => {
console.log(' ' + p.name + ': ' + p.pm2_env.status);
});
if (prodProcesses.length === 0) {
console.log(' (no production processes currently running)');
}
} catch (e) {
console.error('Failed to parse PM2 output:', e.message);
}
"
echo "=== END POST-CLEANUP VERIFICATION ==="
# --- Version Check Logic ---
# Get the version from the newly deployed package.json

View File

@@ -75,15 +75,45 @@ jobs:
echo "--- Listing SRC Directory ---"
ls -alF src
- name: Generate TSOA Spec and Routes
run: npm run tsoa:build
- name: TypeScript Type-Check
run: npm run type-check
- name: Prettier Check
run: npx prettier --check . || true
- name: Prettier Auto-Fix
run: |
echo "--- Running Prettier auto-fix for test/staging deployment ---"
# Auto-format all files
npx prettier --write .
# Check if any files were changed
if ! git diff --quiet; then
echo "📝 Prettier made formatting changes. Committing..."
git config --global user.name 'Gitea Actions'
git config --global user.email 'actions@gitea.projectium.com'
git add .
git commit -m "style: auto-format code via Prettier [skip ci]"
git push
echo "✅ Formatting changes committed and pushed."
else
echo "✅ No formatting changes needed."
fi
- name: Lint Check
run: npm run lint || true
- name: Log Workflow Metadata
run: |
echo "=== WORKFLOW METADATA ==="
echo "Workflow file: deploy-to-test.yml"
echo "Workflow file hash: $(sha256sum .gitea/workflows/deploy-to-test.yml | cut -d' ' -f1)"
echo "Git commit: $(git rev-parse HEAD)"
echo "Git branch: $(git rev-parse --abbrev-ref HEAD)"
echo "Timestamp: $(date -u '+%Y-%m-%d %H:%M:%S UTC')"
echo "Actor: ${{ gitea.actor }}"
echo "=== END METADATA ==="
- name: Stop Test Server Before Tests
# This is a critical step to ensure a clean test environment.
# It stops the currently running pm2 process, freeing up port 3001 so that the
@@ -91,10 +121,74 @@ jobs:
# '|| true' ensures the workflow doesn't fail if the process isn't running.
run: |
echo "--- Stopping and deleting all test processes ---"
# === PRE-CLEANUP PM2 STATE LOGGING ===
echo "=== PRE-CLEANUP PM2 STATE ==="
pm2 jlist || echo "No PM2 processes running"
echo "=== END PRE-CLEANUP STATE ==="
# Use a script to parse pm2's JSON output and delete any process whose name ends with '-test'.
# This is safer than 'pm2 delete all' and more robust than naming each process individually.
# It prevents the accumulation of duplicate processes from previous test runs.
node -e "const exec = require('child_process').execSync; try { const list = JSON.parse(exec('pm2 jlist').toString()); list.forEach(p => { if (p.name && p.name.endsWith('-test')) { console.log('Deleting test process: ' + p.name + ' (' + p.pm2_env.pm_id + ')'); try { exec('pm2 delete ' + p.pm2_env.pm_id); } catch(e) { console.error('Failed to delete ' + p.pm2_env.pm_id, e.message); } } }); console.log('✅ Test process cleanup complete.'); } catch (e) { if (e.stdout.toString().includes('No process found')) { console.log('No PM2 processes running, cleanup not needed.'); } else { console.error('Error cleaning up test processes:', e.message); } }" || true
node -e "
const exec = require('child_process').execSync;
try {
const list = JSON.parse(exec('pm2 jlist').toString());
// Filter for test processes only
const targetProcesses = list.filter(p => p.name && p.name.endsWith('-test'));
// SAFEGUARD 1: Process count validation
const totalProcesses = list.length;
if (targetProcesses.length === totalProcesses && totalProcesses > 3) {
console.error('SAFETY ABORT: Filter would delete ALL processes!');
console.error('Total processes: ' + totalProcesses + ', Target processes: ' + targetProcesses.length);
console.error('This indicates a potential filter bug. Aborting cleanup.');
process.exit(1);
}
// SAFEGUARD 2: Explicit name verification
console.log('Found ' + targetProcesses.length + ' TEST processes to clean:');
targetProcesses.forEach(p => {
console.log(' - ' + p.name + ' (status: ' + p.pm2_env.status + ', pm_id: ' + p.pm2_env.pm_id + ')');
});
// Perform the cleanup
targetProcesses.forEach(p => {
console.log('Deleting test process: ' + p.name + ' (' + p.pm2_env.pm_id + ')');
try {
exec('pm2 delete ' + p.pm2_env.pm_id);
} catch(e) {
console.error('Failed to delete ' + p.pm2_env.pm_id, e.message);
}
});
console.log('Test process cleanup complete.');
} catch (e) {
if (e.stdout && e.stdout.toString().includes('No process found')) {
console.log('No PM2 processes running, cleanup not needed.');
} else {
console.error('Error cleaning up test processes:', e.message);
}
}
" || true
# === POST-CLEANUP VERIFICATION ===
echo "=== POST-CLEANUP VERIFICATION ==="
pm2 jlist 2>/dev/null | node -e "
try {
const list = JSON.parse(require('fs').readFileSync(0, 'utf-8'));
const testProcesses = list.filter(p => p.name && p.name.endsWith('-test'));
const prodProcesses = list.filter(p => p.name && p.name.startsWith('flyer-crawler-') && !p.name.endsWith('-test') && !p.name.endsWith('-dev'));
console.log('Test processes after cleanup: ' + testProcesses.length);
testProcesses.forEach(p => console.log(' ' + p.name + ': ' + p.pm2_env.status));
console.log('Production processes (should be untouched): ' + prodProcesses.length);
prodProcesses.forEach(p => console.log(' ' + p.name + ': ' + p.pm2_env.status));
} catch (e) {
console.log('No PM2 processes or failed to parse output');
}
" || true
echo "=== END POST-CLEANUP VERIFICATION ==="
- name: Flush Redis Test Database Before Tests
# CRITICAL: Clear Redis database 1 (test database) to remove stale BullMQ jobs.
@@ -412,6 +506,10 @@ jobs:
echo "Deploying application files to /var/www/flyer-crawler-test.projectium.com..."
APP_PATH="/var/www/flyer-crawler-test.projectium.com"
# CRITICAL: Stop PM2 processes BEFORE deploying files to prevent CWD errors
echo "--- Stopping test PM2 processes before file deployment ---"
pm2 stop flyer-crawler-api-test flyer-crawler-worker-test flyer-crawler-analytics-worker-test || echo "No test processes to stop"
# Ensure the destination directory exists
mkdir -p "$APP_PATH"
mkdir -p "$APP_PATH/flyer-images/icons" "$APP_PATH/flyer-images/archive" # Ensure all required subdirectories exist
@@ -489,9 +587,74 @@ jobs:
cd /var/www/flyer-crawler-test.projectium.com
npm install --omit=dev
# --- Cleanup Errored Processes ---
echo "Cleaning up errored or stopped PM2 processes..."
node -e "const exec = require('child_process').execSync; try { const list = JSON.parse(exec('pm2 jlist').toString()); list.forEach(p => { if (p.pm2_env.status === 'errored' || p.pm2_env.status === 'stopped') { console.log('Deleting ' + p.pm2_env.status + ' process: ' + p.name + ' (' + p.pm2_env.pm_id + ')'); try { exec('pm2 delete ' + p.pm2_env.pm_id); } catch(e) { console.error('Failed to delete ' + p.pm2_env.pm_id); } } }); } catch (e) { console.error('Error cleaning up processes:', e); }"
# === PRE-CLEANUP PM2 STATE LOGGING ===
echo "=== PRE-CLEANUP PM2 STATE ==="
pm2 jlist
echo "=== END PRE-CLEANUP STATE ==="
# --- Cleanup Errored Processes with Defense-in-Depth Safeguards ---
echo "Cleaning up errored or stopped TEST PM2 processes..."
node -e "
const exec = require('child_process').execSync;
try {
const list = JSON.parse(exec('pm2 jlist').toString());
// Filter for errored/stopped test processes only
const targetProcesses = list.filter(p =>
(p.pm2_env.status === 'errored' || p.pm2_env.status === 'stopped') &&
p.name && p.name.endsWith('-test')
);
// SAFEGUARD 1: Process count validation
const totalProcesses = list.length;
if (targetProcesses.length === totalProcesses && totalProcesses > 3) {
console.error('SAFETY ABORT: Filter would delete ALL processes!');
console.error('Total processes: ' + totalProcesses + ', Target processes: ' + targetProcesses.length);
console.error('This indicates a potential filter bug. Aborting cleanup.');
process.exit(1);
}
// SAFEGUARD 2: Explicit name verification
console.log('Found ' + targetProcesses.length + ' errored/stopped TEST processes to clean:');
targetProcesses.forEach(p => {
console.log(' - ' + p.name + ' (status: ' + p.pm2_env.status + ', pm_id: ' + p.pm2_env.pm_id + ')');
});
// Perform the cleanup
targetProcesses.forEach(p => {
console.log('Deleting ' + p.pm2_env.status + ' test process: ' + p.name + ' (' + p.pm2_env.pm_id + ')');
try {
exec('pm2 delete ' + p.pm2_env.pm_id);
} catch(e) {
console.error('Failed to delete ' + p.pm2_env.pm_id);
}
});
console.log('Test process cleanup complete.');
} catch (e) {
console.error('Error cleaning up processes:', e);
}
"
# === POST-CLEANUP VERIFICATION ===
echo "=== POST-CLEANUP VERIFICATION ==="
pm2 jlist | node -e "
try {
const list = JSON.parse(require('fs').readFileSync(0, 'utf-8'));
const testProcesses = list.filter(p => p.name && p.name.endsWith('-test'));
const prodProcesses = list.filter(p => p.name && p.name.startsWith('flyer-crawler-') && !p.name.endsWith('-test') && !p.name.endsWith('-dev'));
console.log('Test processes after cleanup:');
testProcesses.forEach(p => console.log(' ' + p.name + ': ' + p.pm2_env.status));
if (testProcesses.length === 0) {
console.log(' (no test processes currently running)');
}
console.log('Production processes (should be untouched): ' + prodProcesses.length);
prodProcesses.forEach(p => console.log(' ' + p.name + ': ' + p.pm2_env.status));
} catch (e) {
console.error('Failed to parse PM2 output:', e.message);
}
"
echo "=== END POST-CLEANUP VERIFICATION ==="
# Use `startOrReload` with the TEST ecosystem file. This starts test-specific processes
# (flyer-crawler-api-test, flyer-crawler-worker-test, flyer-crawler-analytics-worker-test)

View File

@@ -56,9 +56,9 @@ jobs:
- name: Step 1 - Stop Application Server
run: |
echo "Stopping all PM2 processes to release database connections..."
pm2 stop all || echo "PM2 processes were not running."
echo "✅ Application server stopped."
echo "Stopping PRODUCTION PM2 processes to release database connections..."
pm2 stop flyer-crawler-api flyer-crawler-worker flyer-crawler-analytics-worker || echo "Production PM2 processes were not running."
echo "✅ Production application server stopped."
- name: Step 2 - Drop and Recreate Database
run: |

View File

@@ -109,6 +109,17 @@ jobs:
rsync -avz dist/ "$APP_PATH"
echo "Application deployment complete."
- name: Log Workflow Metadata
run: |
echo "=== WORKFLOW METADATA ==="
echo "Workflow file: manual-deploy-major.yml"
echo "Workflow file hash: $(sha256sum .gitea/workflows/manual-deploy-major.yml | cut -d' ' -f1)"
echo "Git commit: $(git rev-parse HEAD)"
echo "Git branch: $(git rev-parse --abbrev-ref HEAD)"
echo "Timestamp: $(date -u '+%Y-%m-%d %H:%M:%S UTC')"
echo "Actor: ${{ gitea.actor }}"
echo "=== END METADATA ==="
- name: Install Backend Dependencies and Restart Production Server
env:
# --- Production Secrets Injection ---
@@ -138,9 +149,74 @@ jobs:
cd /var/www/flyer-crawler.projectium.com
npm install --omit=dev
# --- Cleanup Errored Processes ---
echo "Cleaning up errored or stopped PM2 processes..."
node -e "const exec = require('child_process').execSync; try { const list = JSON.parse(exec('pm2 jlist').toString()); list.forEach(p => { if (p.pm2_env.status === 'errored' || p.pm2_env.status === 'stopped') { console.log('Deleting ' + p.pm2_env.status + ' process: ' + p.name + ' (' + p.pm2_env.pm_id + ')'); try { exec('pm2 delete ' + p.pm2_env.pm_id); } catch(e) { console.error('Failed to delete ' + p.pm2_env.pm_id); } } }); } catch (e) { console.error('Error cleaning up processes:', e); }"
# === PRE-CLEANUP PM2 STATE LOGGING ===
echo "=== PRE-CLEANUP PM2 STATE ==="
pm2 jlist
echo "=== END PRE-CLEANUP STATE ==="
# --- Cleanup Errored Processes with Defense-in-Depth Safeguards ---
echo "Cleaning up errored or stopped PRODUCTION PM2 processes..."
node -e "
const exec = require('child_process').execSync;
try {
const list = JSON.parse(exec('pm2 jlist').toString());
const prodProcesses = ['flyer-crawler-api', 'flyer-crawler-worker', 'flyer-crawler-analytics-worker'];
// Filter for processes that match our criteria
const targetProcesses = list.filter(p =>
(p.pm2_env.status === 'errored' || p.pm2_env.status === 'stopped') &&
prodProcesses.includes(p.name)
);
// SAFEGUARD 1: Process count validation
const totalProcesses = list.length;
if (targetProcesses.length === totalProcesses && totalProcesses > 3) {
console.error('SAFETY ABORT: Filter would delete ALL processes!');
console.error('Total processes: ' + totalProcesses + ', Target processes: ' + targetProcesses.length);
console.error('This indicates a potential filter bug. Aborting cleanup.');
process.exit(1);
}
// SAFEGUARD 2: Explicit name verification
console.log('Found ' + targetProcesses.length + ' PRODUCTION processes to clean:');
targetProcesses.forEach(p => {
console.log(' - ' + p.name + ' (status: ' + p.pm2_env.status + ', pm_id: ' + p.pm2_env.pm_id + ')');
});
// Perform the cleanup
targetProcesses.forEach(p => {
console.log('Deleting ' + p.pm2_env.status + ' production process: ' + p.name + ' (' + p.pm2_env.pm_id + ')');
try {
exec('pm2 delete ' + p.pm2_env.pm_id);
} catch(e) {
console.error('Failed to delete ' + p.pm2_env.pm_id);
}
});
console.log('Production process cleanup complete.');
} catch (e) {
console.error('Error cleaning up processes:', e);
}
"
# === POST-CLEANUP VERIFICATION ===
echo "=== POST-CLEANUP VERIFICATION ==="
pm2 jlist | node -e "
try {
const list = JSON.parse(require('fs').readFileSync(0, 'utf-8'));
const prodProcesses = list.filter(p => p.name && p.name.startsWith('flyer-crawler-') && !p.name.endsWith('-test') && !p.name.endsWith('-dev'));
console.log('Production processes after cleanup:');
prodProcesses.forEach(p => {
console.log(' ' + p.name + ': ' + p.pm2_env.status);
});
if (prodProcesses.length === 0) {
console.log(' (no production processes currently running)');
}
} catch (e) {
console.error('Failed to parse PM2 output:', e.message);
}
"
echo "=== END POST-CLEANUP VERIFICATION ==="
# --- Version Check Logic ---
# Get the version from the newly deployed package.json

View File

@@ -0,0 +1,86 @@
# .gitea/workflows/restart-pm2.yml
#
# Manual workflow to restart PM2 processes and verify their status.
# Useful for recovering from PM2 daemon crashes or process issues.
name: Restart PM2 Processes
on:
workflow_dispatch:
inputs:
environment:
description: 'Environment to restart (test, production, or both)'
required: true
default: 'test'
type: choice
options:
- test
- production
- both
jobs:
restart-pm2:
runs-on: projectium.com
steps:
- name: Validate Environment Input
run: |
echo "Restarting PM2 processes for environment: ${{ gitea.event.inputs.environment }}"
- name: Restart Test Environment
if: gitea.event.inputs.environment == 'test' || gitea.event.inputs.environment == 'both'
run: |
echo "=== RESTARTING TEST ENVIRONMENT ==="
cd /var/www/flyer-crawler-test.projectium.com
echo "--- Current PM2 State (Before Restart) ---"
pm2 list
echo "--- Restarting Test Processes ---"
pm2 restart flyer-crawler-api-test flyer-crawler-worker-test flyer-crawler-analytics-worker-test || {
echo "Restart failed, attempting to start processes..."
pm2 start ecosystem-test.config.cjs
}
echo "--- Saving PM2 Process List ---"
pm2 save
echo "--- Waiting 3 seconds for processes to stabilize ---"
sleep 3
echo "=== TEST ENVIRONMENT STATUS ==="
pm2 ps
- name: Restart Production Environment
if: gitea.event.inputs.environment == 'production' || gitea.event.inputs.environment == 'both'
run: |
echo "=== RESTARTING PRODUCTION ENVIRONMENT ==="
cd /var/www/flyer-crawler.projectium.com
echo "--- Current PM2 State (Before Restart) ---"
pm2 list
echo "--- Restarting Production Processes ---"
pm2 restart flyer-crawler-api flyer-crawler-worker flyer-crawler-analytics-worker || {
echo "Restart failed, attempting to start processes..."
pm2 start ecosystem.config.cjs
}
echo "--- Saving PM2 Process List ---"
pm2 save
echo "--- Waiting 3 seconds for processes to stabilize ---"
sleep 3
echo "=== PRODUCTION ENVIRONMENT STATUS ==="
pm2 ps
- name: Final PM2 Status (All Processes)
run: |
echo "========================================="
echo "FINAL PM2 STATUS - ALL PROCESSES"
echo "========================================="
pm2 ps
echo ""
echo "--- PM2 Logs (Last 20 Lines) ---"
pm2 logs --lines 20 --nostream || echo "No logs available"

10
.gitignore vendored
View File

@@ -14,6 +14,10 @@ dist-ssr
.env
*.tsbuildinfo
# tsoa generated files (regenerated on build)
src/routes/tsoa-generated.ts
src/config/tsoa-spec.json
# Test coverage
coverage
.nyc_output
@@ -35,6 +39,10 @@ test-output.txt
*.sln
*.sw?
Thumbs.db
.claude
.claude/settings.local.json
nul
tmpclaude*
test.tmp

1
.nvmrc Normal file
View File

@@ -0,0 +1 @@
22

224
CLAUDE.md
View File

@@ -27,6 +27,68 @@ podman exec -it flyer-crawler-dev npm run type-check
Out-of-sync = test failures.
### Server Access: READ-ONLY (Production/Test Servers)
**CRITICAL**: The `claude-win10` user has **READ-ONLY** access to production and test servers.
| Capability | Status |
| ---------------------- | ---------------------- |
| Root/sudo access | NO |
| Write permissions | NO |
| PM2 restart, systemctl | NO - User must execute |
**Server Operations Workflow**: Diagnose → User executes → Analyze → Fix (1-3 commands) → User executes → Verify
**Rules**:
- Provide diagnostic commands first, wait for user to report results
- Maximum 3 fix commands at a time (errors may cascade)
- Always verify after fixes complete
### PM2 Process Isolation (Production/Test Servers)
**CRITICAL**: Production and test environments share the same PM2 daemon on the server.
**See also**: [PM2 Process Isolation Incidents](#pm2-process-isolation-incidents) for past incidents and response procedures.
| Environment | Processes | Config File |
| ----------- | -------------------------------------------------------------------------------------------- | --------------------------- |
| Production | `flyer-crawler-api`, `flyer-crawler-worker`, `flyer-crawler-analytics-worker` | `ecosystem.config.cjs` |
| Test | `flyer-crawler-api-test`, `flyer-crawler-worker-test`, `flyer-crawler-analytics-worker-test` | `ecosystem-test.config.cjs` |
| Development | `flyer-crawler-api-dev`, `flyer-crawler-worker-dev`, `flyer-crawler-vite-dev` | `ecosystem.dev.config.cjs` |
**Deployment Scripts MUST:**
- ✅ Filter PM2 commands by exact process names or name patterns (e.g., `endsWith('-test')`)
- ❌ NEVER use `pm2 stop all`, `pm2 delete all`, or `pm2 restart all`
- ❌ NEVER delete/stop processes based solely on status without name filtering
- ✅ Always verify process names match the target environment before any operation
**Examples:**
```bash
# ✅ CORRECT - Production cleanup (filter by name)
pm2 stop flyer-crawler-api flyer-crawler-worker flyer-crawler-analytics-worker
# ✅ CORRECT - Test cleanup (filter by name pattern)
# Only delete test processes that are errored/stopped
list.forEach(p => {
if ((p.pm2_env.status === 'errored' || p.pm2_env.status === 'stopped') &&
p.name && p.name.endsWith('-test')) {
exec('pm2 delete ' + p.pm2_env.pm_id);
}
});
# ❌ WRONG - Affects all environments
pm2 stop all
pm2 delete all
# ❌ WRONG - No name filtering (could delete test processes during prod deploy)
if (p.pm2_env.status === 'errored') {
exec('pm2 delete ' + p.pm2_env.pm_id);
}
```
### Communication Style
Ask before assuming. Never assume:
@@ -60,25 +122,27 @@ Ask before assuming. Never assume:
### Key Patterns (with file locations)
| Pattern | ADR | Implementation | File |
| ------------------ | ------- | ------------------------------------------------- | ----------------------------------- |
| Error Handling | ADR-001 | `handleDbError()`, throw `NotFoundError` | `src/services/db/errors.db.ts` |
| Repository Methods | ADR-034 | `get*` (throws), `find*` (null), `list*` (array) | `src/services/db/*.db.ts` |
| API Responses | ADR-028 | `sendSuccess()`, `sendPaginated()`, `sendError()` | `src/utils/apiResponse.ts` |
| Transactions | ADR-002 | `withTransaction(async (client) => {...})` | `src/services/db/transaction.db.ts` |
| Pattern | ADR | Implementation | File |
| ------------------ | ------- | ------------------------------------------------- | ------------------------------------- |
| Error Handling | ADR-001 | `handleDbError()`, throw `NotFoundError` | `src/services/db/errors.db.ts` |
| Repository Methods | ADR-034 | `get*` (throws), `find*` (null), `list*` (array) | `src/services/db/*.db.ts` |
| API Responses | ADR-028 | `sendSuccess()`, `sendPaginated()`, `sendError()` | `src/utils/apiResponse.ts` |
| Transactions | ADR-002 | `withTransaction(async (client) => {...})` | `src/services/db/connection.db.ts` |
| Feature Flags | ADR-024 | `isFeatureEnabled()`, `useFeatureFlag()` | `src/services/featureFlags.server.ts` |
### Key Files Quick Access
| Purpose | File |
| ----------------- | -------------------------------- |
| Express app | `server.ts` |
| Environment | `src/config/env.ts` |
| Routes | `src/routes/*.routes.ts` |
| Repositories | `src/services/db/*.db.ts` |
| Workers | `src/services/workers.server.ts` |
| Queues | `src/services/queues.server.ts` |
| PM2 Config (Dev) | `ecosystem.dev.config.cjs` |
| PM2 Config (Prod) | `ecosystem.config.cjs` |
| Purpose | File |
| ----------------- | ------------------------------------- |
| Express app | `server.ts` |
| Environment | `src/config/env.ts` |
| Routes | `src/routes/*.routes.ts` |
| Repositories | `src/services/db/*.db.ts` |
| Workers | `src/services/workers.server.ts` |
| Queues | `src/services/queues.server.ts` |
| Feature Flags | `src/services/featureFlags.server.ts` |
| PM2 Config (Dev) | `ecosystem.dev.config.cjs` |
| PM2 Config (Prod) | `ecosystem.config.cjs` |
---
@@ -121,7 +185,7 @@ The dev container now matches production by using PM2 for process management.
- `flyer-crawler-worker-dev` - Background job worker
- `flyer-crawler-vite-dev` - Vite frontend dev server (port 5173)
### Log Aggregation (ADR-050)
### Log Aggregation (ADR-015)
All logs flow to Bugsink via Logstash with 3-project routing:
@@ -204,7 +268,7 @@ All logs flow to Bugsink via Logstash with 3-project routing:
**Launch Pattern**:
```
```text
Use Task tool with subagent_type: "coder", "db-dev", "tester", etc.
```
@@ -226,6 +290,39 @@ Common issues with solutions:
**Full Details**: See test issues section at end of this document or [docs/development/TESTING.md](docs/development/TESTING.md)
### PM2 Process Isolation Incidents
**CRITICAL**: PM2 process cleanup scripts can affect all PM2 processes if not properly filtered.
**Incident**: 2026-02-17 Production Deployment (v0.15.0)
- **Impact**: ALL PM2 processes on production server were killed
- **Affected**: stock-alert.projectium.com and all other PM2-managed applications
- **Root Cause**: Under investigation (see [incident report](docs/operations/INCIDENT-2026-02-17-PM2-PROCESS-KILL.md))
- **Status**: Safeguards added to prevent recurrence
**Prevention Measures** (implemented):
1. Name-based filtering (exact match or pattern-based)
2. Pre-cleanup process list logging
3. Process count validation (abort if filtering all processes)
4. Explicit name verification in logs
5. Post-cleanup verification
6. Workflow version hash logging
**If PM2 Incident Occurs**:
- **DO NOT** attempt another deployment immediately
- Follow the [PM2 Incident Response Runbook](docs/operations/PM2-INCIDENT-RESPONSE.md)
- Manually restore affected processes
- Investigate workflow execution logs before next deployment
**Related Documentation**:
- [PM2 Process Isolation Requirements](#pm2-process-isolation-productiontest-servers) (existing section)
- [Incident Report 2026-02-17](docs/operations/INCIDENT-2026-02-17-PM2-PROCESS-KILL.md)
- [PM2 Incident Response Runbook](docs/operations/PM2-INCIDENT-RESPONSE.md)
### Git Bash Path Conversion (Windows)
Git Bash auto-converts Unix paths, breaking container commands.
@@ -285,8 +382,8 @@ podman cp "d:/path/file" container:/tmp/file
**Quick Access**:
- **Dev**: https://localhost:8443 (`admin@localhost`/`admin`)
- **Prod**: https://bugsink.projectium.com
- **Dev**: <https://localhost:8443> (`admin@localhost`/`admin`)
- **Prod**: <https://bugsink.projectium.com>
**Token Creation** (required for MCP):
@@ -294,15 +391,15 @@ podman cp "d:/path/file" container:/tmp/file
# Dev container
MSYS_NO_PATHCONV=1 podman exec -e DATABASE_URL=postgresql://bugsink:bugsink_dev_password@postgres:5432/bugsink -e SECRET_KEY=dev-bugsink-secret-key-minimum-50-characters-for-security flyer-crawler-dev sh -c 'cd /opt/bugsink/conf && DJANGO_SETTINGS_MODULE=bugsink_conf PYTHONPATH=/opt/bugsink/conf:/opt/bugsink/lib/python3.10/site-packages /opt/bugsink/bin/python -m django create_auth_token'
# Production (via SSH)
ssh root@projectium.com "cd /opt/bugsink && bugsink-manage create_auth_token"
# Production (user executes on server)
cd /opt/bugsink && bugsink-manage create_auth_token
```
### Logstash
**See**: [docs/operations/LOGSTASH-QUICK-REF.md](docs/operations/LOGSTASH-QUICK-REF.md)
Log aggregation: PostgreSQL + PM2 + Redis + NGINX → Bugsink (ADR-050)
Log aggregation: PostgreSQL + PM2 + Redis + NGINX → Bugsink (ADR-015)
---
@@ -322,84 +419,3 @@ Log aggregation: PostgreSQL + PM2 + Redis + NGINX → Bugsink (ADR-050)
| **Logstash** | [LOGSTASH-QUICK-REF.md](docs/operations/LOGSTASH-QUICK-REF.md) |
| **ADRs** | [docs/adr/index.md](docs/adr/index.md) |
| **All Docs** | [docs/README.md](docs/README.md) |
---
## Appendix: Integration Test Issues (Full Details)
### 1. Vitest globalSetup Context Isolation
Vitest's `globalSetup` runs in separate Node.js context. Singletons, spies, mocks do NOT share instances with test files.
**Affected**: BullMQ worker service mocks (AI/DB failure tests)
**Solutions**: Mark `.todo()`, create test-only API endpoints, use Redis-based mock flags
```typescript
// DOES NOT WORK - different instances
const { flyerProcessingService } = await import('../../services/workers.server');
flyerProcessingService._getAiProcessor()._setExtractAndValidateData(mockFn);
```
### 2. Cleanup Queue Deletes Before Verification
Cleanup worker processes jobs in globalSetup context, ignoring test spies.
**Solution**: Drain and pause queue:
```typescript
const { cleanupQueue } = await import('../../services/queues.server');
await cleanupQueue.drain();
await cleanupQueue.pause();
// ... test ...
await cleanupQueue.resume();
```
### 3. Cache Stale After Direct SQL
Direct `pool.query()` inserts bypass cache invalidation.
**Solution**: `await cacheService.invalidateFlyers();` after inserts
### 4. Test Filename Collisions
Multer predictable filenames cause race conditions.
**Solution**: Use unique suffix: `${Date.now()}-${Math.round(Math.random() * 1e9)}`
### 5. Response Format Mismatches
API formats change: `data.jobId` vs `data.job.id`, nested vs flat, string vs number IDs.
**Solution**: Log response bodies, update assertions
### 6. External Service Availability
PM2/Redis health checks fail when unavailable.
**Solution**: try/catch with graceful degradation or mock
### 7. TZ Environment Variable Breaking Async Hooks
**Problem**: When `TZ=America/Los_Angeles` (or other timezone values) is set in the environment, Node.js async_hooks module can produce `RangeError: Invalid triggerAsyncId value: NaN`. This breaks React Testing Library's `render()` function which uses async hooks internally.
**Root Cause**: Setting `TZ` to certain timezone values interferes with Node.js's internal async tracking mechanism, causing invalid async IDs to be generated.
**Symptoms**:
```text
RangeError: Invalid triggerAsyncId value: NaN
process.env.NODE_ENV.queueSeveralMicrotasks node_modules/react/cjs/react.development.js:751:15
process.env.NODE_ENV.exports.act node_modules/react/cjs/react.development.js:886:11
node_modules/@testing-library/react/dist/act-compat.js:46:25
renderRoot node_modules/@testing-library/react/dist/pure.js:189:26
```
**Solution**: Explicitly unset `TZ` in all test scripts by adding `TZ=` (empty value) to cross-env:
```json
"test:unit": "cross-env NODE_ENV=test TZ= tsx ..."
"test:integration": "cross-env NODE_ENV=test TZ= tsx ..."
```
**Context**: This issue was introduced in commit `d03900c` which added `TZ: 'America/Los_Angeles'` to PM2 ecosystem configs for consistent log timestamps in production/dev environments. Tests must explicitly override this to prevent the async hooks error.

View File

@@ -139,3 +139,5 @@ See [INSTALL.md](INSTALL.md) for the complete list.
## License
[Add license information here]
annoyed

View File

@@ -0,0 +1,393 @@
# AI Documentation Index
Machine-optimized navigation for AI agents. Structured for vector retrieval and semantic search.
---
## Quick Lookup Table
| Task/Question | Primary Doc | Section/ADR |
| ----------------------- | --------------------------------------------------- | --------------------------------------- |
| Add new API endpoint | [CODE-PATTERNS.md](development/CODE-PATTERNS.md) | API Response Patterns, Input Validation |
| Add repository method | [CODE-PATTERNS.md](development/CODE-PATTERNS.md) | Repository Patterns (get*/find*/list\*) |
| Fix failing test | [TESTING.md](development/TESTING.md) | Known Integration Test Issues |
| Run tests correctly | [TESTING.md](development/TESTING.md) | Test Execution Environment |
| Add database column | [DATABASE-GUIDE.md](subagents/DATABASE-GUIDE.md) | Schema sync required |
| Deploy to production | [DEPLOYMENT.md](operations/DEPLOYMENT.md) | Application Deployment |
| Debug container issue | [DEBUGGING.md](development/DEBUGGING.md) | Container Issues |
| Configure environment | [ENVIRONMENT.md](getting-started/ENVIRONMENT.md) | Configuration by Environment |
| Add background job | [CODE-PATTERNS.md](development/CODE-PATTERNS.md) | Background Jobs |
| Handle errors correctly | [CODE-PATTERNS.md](development/CODE-PATTERNS.md) | Error Handling |
| Use transactions | [CODE-PATTERNS.md](development/CODE-PATTERNS.md) | Transaction Management |
| Add authentication | [AUTHENTICATION.md](architecture/AUTHENTICATION.md) | JWT Token Architecture |
| Cache data | [CODE-PATTERNS.md](development/CODE-PATTERNS.md) | Caching |
| Check PM2 status | [DEV-CONTAINER.md](development/DEV-CONTAINER.md) | PM2 Process Management |
| View logs | [DEBUGGING.md](development/DEBUGGING.md) | PM2 Log Access |
| Understand architecture | [OVERVIEW.md](architecture/OVERVIEW.md) | System Architecture Diagram |
| Check ADR for decision | [adr/index.md](adr/index.md) | ADR by category |
| Use subagent | [subagents/OVERVIEW.md](subagents/OVERVIEW.md) | Available Subagents |
| API versioning | [API-VERSIONING.md](development/API-VERSIONING.md) | Phase 2 infrastructure |
---
## Documentation Tree
```
docs/
+-- AI-DOCUMENTATION-INDEX.md # THIS FILE - AI navigation index
+-- README.md # Human-readable doc hub
|
+-- adr/ # Architecture Decision Records (57 ADRs)
| +-- index.md # ADR index by category
| +-- 0001-*.md # Standardized error handling
| +-- 0002-*.md # Transaction management (withTransaction)
| +-- 0003-*.md # Input validation (Zod middleware)
| +-- 0008-*.md # API versioning (/api/v1/)
| +-- 0014-*.md # Platform: Linux only (CRITICAL)
| +-- 0028-*.md # API response (sendSuccess/sendError)
| +-- 0034-*.md # Repository pattern (get*/find*/list*)
| +-- 0035-*.md # Service layer architecture
| +-- 0050-*.md # PostgreSQL observability + Logstash
| +-- 0057-*.md # Test remediation post-API versioning
| +-- adr-implementation-tracker.md # Implementation status
|
+-- architecture/
| +-- OVERVIEW.md # System architecture, data flows, entities
| +-- DATABASE.md # Schema design, extensions, setup
| +-- AUTHENTICATION.md # OAuth, JWT, security features
| +-- WEBSOCKET_USAGE.md # Real-time communication patterns
| +-- api-versioning-infrastructure.md # Phase 2 versioning details
|
+-- development/
| +-- CODE-PATTERNS.md # Error handling, repos, API responses
| +-- TESTING.md # Unit/integration/E2E, known issues
| +-- DEBUGGING.md # Container, DB, API, PM2 debugging
| +-- DEV-CONTAINER.md # PM2, Logstash, container services
| +-- API-VERSIONING.md # API versioning workflows
| +-- DESIGN_TOKENS.md # Neo-Brutalism design system
| +-- ERROR-LOGGING-PATHS.md # req.originalUrl pattern
| +-- test-path-migration.md # Test file reorganization
|
+-- getting-started/
| +-- QUICKSTART.md # Quick setup instructions
| +-- INSTALL.md # Full installation guide
| +-- ENVIRONMENT.md # Environment variables reference
|
+-- operations/
| +-- DEPLOYMENT.md # Production deployment guide
| +-- BARE-METAL-SETUP.md # Server provisioning
| +-- MONITORING.md # Bugsink, health checks
| +-- LOGSTASH-QUICK-REF.md # Log aggregation reference
| +-- LOGSTASH-TROUBLESHOOTING.md # Logstash debugging
|
+-- subagents/
| +-- OVERVIEW.md # Subagent system introduction
| +-- CODER-GUIDE.md # Code development patterns
| +-- TESTER-GUIDE.md # Testing strategies
| +-- DATABASE-GUIDE.md # Database workflows
| +-- DEVOPS-GUIDE.md # Deployment/infrastructure
| +-- FRONTEND-GUIDE.md # UI/UX development
| +-- AI-USAGE-GUIDE.md # Gemini integration
| +-- DOCUMENTATION-GUIDE.md # Writing docs
| +-- SECURITY-DEBUG-GUIDE.md # Security and debugging
|
+-- tools/
| +-- MCP-CONFIGURATION.md # MCP servers setup
| +-- BUGSINK-SETUP.md # Error tracking setup
| +-- VSCODE-SETUP.md # Editor configuration
|
+-- archive/ # Historical docs, session notes
+-- sessions/ # Development session logs
+-- plans/ # Feature implementation plans
+-- research/ # Investigation notes
```
---
## Problem-to-Document Mapping
### Database Issues
| Problem | Documents |
| -------------------- | ----------------------------------------------------------------------------------------------- |
| Schema out of sync | [DATABASE-GUIDE.md](subagents/DATABASE-GUIDE.md), [CLAUDE.md](../CLAUDE.md) schema sync section |
| Migration needed | [DATABASE.md](architecture/DATABASE.md), ADR-013, ADR-023 |
| Query performance | [DEBUGGING.md](development/DEBUGGING.md) Query Performance Issues |
| Connection errors | [DEBUGGING.md](development/DEBUGGING.md) Database Issues |
| Transaction patterns | [CODE-PATTERNS.md](development/CODE-PATTERNS.md) Transaction Management, ADR-002 |
| Repository methods | [CODE-PATTERNS.md](development/CODE-PATTERNS.md) Repository Patterns, ADR-034 |
### Test Failures
| Problem | Documents |
| ---------------------------- | --------------------------------------------------------------------- |
| Tests fail in container | [TESTING.md](development/TESTING.md), ADR-014 |
| Vitest globalSetup isolation | [CLAUDE.md](../CLAUDE.md) Integration Test Issues #1 |
| Cache stale after insert | [CLAUDE.md](../CLAUDE.md) Integration Test Issues #3 |
| Queue interference | [CLAUDE.md](../CLAUDE.md) Integration Test Issues #2 |
| API path mismatches | [TESTING.md](development/TESTING.md) API Versioning in Tests, ADR-057 |
| Type check failures | [DEBUGGING.md](development/DEBUGGING.md) Type Check Failures |
| TZ environment breaks async | [CLAUDE.md](../CLAUDE.md) Integration Test Issues #7 |
### Deployment Issues
| Problem | Documents |
| --------------------- | ------------------------------------------------------------------------------------- |
| PM2 not starting | [DEBUGGING.md](development/DEBUGGING.md) PM2 Process Issues |
| NGINX configuration | [DEPLOYMENT.md](operations/DEPLOYMENT.md) NGINX Configuration |
| SSL certificates | [DEBUGGING.md](development/DEBUGGING.md) SSL Certificate Issues |
| CI/CD failures | [DEPLOYMENT.md](operations/DEPLOYMENT.md) CI/CD Pipeline, ADR-017 |
| Container won't start | [DEBUGGING.md](development/DEBUGGING.md) Container Issues |
| Bugsink not receiving | [BUGSINK-SETUP.md](tools/BUGSINK-SETUP.md), [MONITORING.md](operations/MONITORING.md) |
### Frontend/UI Changes
| Problem | Documents |
| ------------------ | --------------------------------------------------------------- |
| Component patterns | [FRONTEND-GUIDE.md](subagents/FRONTEND-GUIDE.md), ADR-044 |
| Design tokens | [DESIGN_TOKENS.md](development/DESIGN_TOKENS.md), ADR-012 |
| State management | ADR-005, [OVERVIEW.md](architecture/OVERVIEW.md) Frontend Stack |
| Hot reload broken | [DEBUGGING.md](development/DEBUGGING.md) Frontend Issues |
| CORS errors | [DEBUGGING.md](development/DEBUGGING.md) API Calls Failing |
### API Development
| Problem | Documents |
| ---------------- | ------------------------------------------------------------------------------- |
| Response format | [CODE-PATTERNS.md](development/CODE-PATTERNS.md) API Response Patterns, ADR-028 |
| Input validation | [CODE-PATTERNS.md](development/CODE-PATTERNS.md) Input Validation, ADR-003 |
| Error handling | [CODE-PATTERNS.md](development/CODE-PATTERNS.md) Error Handling, ADR-001 |
| Rate limiting | ADR-032, [OVERVIEW.md](architecture/OVERVIEW.md) |
| API versioning | [API-VERSIONING.md](development/API-VERSIONING.md), ADR-008 |
| Authentication | [AUTHENTICATION.md](architecture/AUTHENTICATION.md), ADR-048 |
### Background Jobs
| Problem | Documents |
| ------------------- | ------------------------------------------------------------------------- |
| Jobs not processing | [DEBUGGING.md](development/DEBUGGING.md) Background Job Issues |
| Queue configuration | [CODE-PATTERNS.md](development/CODE-PATTERNS.md) Background Jobs, ADR-006 |
| Worker crashes | [DEBUGGING.md](development/DEBUGGING.md), ADR-053 |
| Scheduled jobs | ADR-037, [OVERVIEW.md](architecture/OVERVIEW.md) Scheduled Jobs |
---
## Document Priority Matrix
### CRITICAL (Read First)
| Document | Purpose | Key Content |
| --------------------------------------------------------------- | ----------------------- | ----------------------------- |
| [CLAUDE.md](../CLAUDE.md) | AI agent instructions | Rules, patterns, known issues |
| [ADR-014](adr/0014-containerization-and-deployment-strategy.md) | Platform requirement | Tests MUST run in container |
| [DEV-CONTAINER.md](development/DEV-CONTAINER.md) | Development environment | PM2, Logstash, services |
### HIGH (Core Development)
| Document | Purpose | Key Content |
| --------------------------------------------------- | ----------------- | ---------------------------- |
| [CODE-PATTERNS.md](development/CODE-PATTERNS.md) | Code templates | Error handling, repos, APIs |
| [TESTING.md](development/TESTING.md) | Test execution | Commands, known issues |
| [DATABASE.md](architecture/DATABASE.md) | Schema reference | Setup, extensions, users |
| [ADR-034](adr/0034-repository-pattern-standards.md) | Repository naming | get*/find*/list\* |
| [ADR-028](adr/0028-api-response-standardization.md) | API responses | sendSuccess/sendError |
| [ADR-001](adr/0001-standardized-error-handling.md) | Error handling | handleDbError, NotFoundError |
### MEDIUM (Specialized Tasks)
| Document | Purpose | Key Content |
| --------------------------------------------------- | --------------------- | ------------------------ |
| [subagents/OVERVIEW.md](subagents/OVERVIEW.md) | Subagent selection | When to delegate |
| [DEPLOYMENT.md](operations/DEPLOYMENT.md) | Production deployment | PM2, NGINX, CI/CD |
| [DEBUGGING.md](development/DEBUGGING.md) | Troubleshooting | Common issues, solutions |
| [ENVIRONMENT.md](getting-started/ENVIRONMENT.md) | Config reference | Variables by environment |
| [AUTHENTICATION.md](architecture/AUTHENTICATION.md) | Auth patterns | OAuth, JWT, security |
| [API-VERSIONING.md](development/API-VERSIONING.md) | Versioning | /api/v1/ prefix |
### LOW (Reference/Historical)
| Document | Purpose | Key Content |
| -------------------- | ------------------ | ------------------------- |
| [archive/](archive/) | Historical docs | Session notes, old plans |
| ADR-013, ADR-023 | Migration strategy | Proposed, not implemented |
| ADR-024 | Feature flags | Proposed |
| ADR-025 | i18n/l10n | Proposed |
---
## Cross-Reference Matrix
| Document | References | Referenced By |
| -------------------- | ------------------------------------------------------------------------------- | ------------------------------------------------------ |
| **CLAUDE.md** | ADR-001, ADR-002, ADR-008, ADR-014, ADR-028, ADR-034, ADR-035, ADR-050, ADR-057 | All development docs |
| **ADR-008** | ADR-028 | API-VERSIONING.md, TESTING.md, ADR-057 |
| **ADR-014** | - | CLAUDE.md, TESTING.md, DEPLOYMENT.md, DEV-CONTAINER.md |
| **ADR-028** | ADR-001 | CODE-PATTERNS.md, OVERVIEW.md |
| **ADR-034** | ADR-001 | CODE-PATTERNS.md, DATABASE-GUIDE.md |
| **ADR-057** | ADR-008, ADR-028 | TESTING.md |
| **CODE-PATTERNS.md** | ADR-001, ADR-002, ADR-003, ADR-028, ADR-034, ADR-036, ADR-048 | CODER-GUIDE.md |
| **TESTING.md** | ADR-014, ADR-057, CLAUDE.md | TESTER-GUIDE.md, DEBUGGING.md |
| **DEBUGGING.md** | DEV-CONTAINER.md, TESTING.md, MONITORING.md | DEVOPS-GUIDE.md |
| **DEV-CONTAINER.md** | ADR-014, ADR-050, ecosystem.dev.config.cjs | DEBUGGING.md, CLAUDE.md |
| **OVERVIEW.md** | ADR-001 through ADR-050+ | All architecture docs |
| **DATABASE.md** | ADR-002, ADR-013, ADR-055 | DATABASE-GUIDE.md |
---
## Navigation Patterns
### Adding a Feature
```
1. CLAUDE.md -> Project rules, patterns
2. CODE-PATTERNS.md -> Implementation templates
3. Relevant subagent guide -> Domain-specific patterns
4. Related ADRs -> Design decisions
5. TESTING.md -> Test requirements
```
### Fixing a Bug
```
1. DEBUGGING.md -> Common issues checklist
2. TESTING.md -> Run tests in container
3. Error logs (pm2/bugsink) -> Identify root cause
4. CODE-PATTERNS.md -> Correct pattern reference
5. Related ADR -> Architectural context
```
### Deploying
```
1. DEPLOYMENT.md -> Deployment procedures
2. ENVIRONMENT.md -> Required variables
3. MONITORING.md -> Health check verification
4. LOGSTASH-QUICK-REF.md -> Log aggregation check
```
### Database Changes
```
1. DATABASE-GUIDE.md -> Schema sync requirements (CRITICAL)
2. DATABASE.md -> Schema design patterns
3. ADR-002 -> Transaction patterns
4. ADR-034 -> Repository methods
5. ADR-055 -> Normalization rules
```
### Subagent Selection
| Task Type | Subagent | Guide |
| --------------------- | ------------------------- | ------------------------------------------------------------ |
| Write production code | `coder` | [CODER-GUIDE.md](subagents/CODER-GUIDE.md) |
| Database changes | `db-dev` | [DATABASE-GUIDE.md](subagents/DATABASE-GUIDE.md) |
| Create tests | `testwriter` | [TESTER-GUIDE.md](subagents/TESTER-GUIDE.md) |
| Fix failing tests | `tester` | [TESTER-GUIDE.md](subagents/TESTER-GUIDE.md) |
| Container/deployment | `devops` | [DEVOPS-GUIDE.md](subagents/DEVOPS-GUIDE.md) |
| UI components | `frontend-specialist` | [FRONTEND-GUIDE.md](subagents/FRONTEND-GUIDE.md) |
| External APIs | `integrations-specialist` | - |
| Security review | `security-engineer` | [SECURITY-DEBUG-GUIDE.md](subagents/SECURITY-DEBUG-GUIDE.md) |
| Production errors | `log-debug` | [SECURITY-DEBUG-GUIDE.md](subagents/SECURITY-DEBUG-GUIDE.md) |
| AI/Gemini issues | `ai-usage` | [AI-USAGE-GUIDE.md](subagents/AI-USAGE-GUIDE.md) |
---
## Key File Quick Reference
### Configuration
| File | Purpose |
| -------------------------- | ---------------------------- |
| `server.ts` | Express app setup |
| `src/config/env.ts` | Environment validation (Zod) |
| `ecosystem.dev.config.cjs` | PM2 dev config |
| `ecosystem.config.cjs` | PM2 prod config |
| `vite.config.ts` | Vite build config |
### Core Implementation
| File | Purpose |
| ----------------------------------- | ----------------------------------- |
| `src/routes/*.routes.ts` | API route handlers |
| `src/services/db/*.db.ts` | Repository layer |
| `src/services/*.server.ts` | Server-only services |
| `src/services/queues.server.ts` | BullMQ queue definitions |
| `src/services/workers.server.ts` | BullMQ workers |
| `src/utils/apiResponse.ts` | sendSuccess/sendError/sendPaginated |
| `src/services/db/errors.db.ts` | handleDbError, NotFoundError |
| `src/services/db/transaction.db.ts` | withTransaction |
### Database Schema
| File | Purpose |
| ------------------------------ | ----------------------------------- |
| `sql/master_schema_rollup.sql` | Test DB, complete reference |
| `sql/initial_schema.sql` | Fresh install (identical to rollup) |
| `sql/migrations/*.sql` | Production ALTER statements |
### Testing
| File | Purpose |
| ---------------------------------- | ----------------------- |
| `vitest.config.ts` | Unit test config |
| `vitest.config.integration.ts` | Integration test config |
| `vitest.config.e2e.ts` | E2E test config |
| `src/tests/utils/mockFactories.ts` | Mock data factories |
| `src/tests/utils/storeHelpers.ts` | Store test helpers |
---
## ADR Quick Reference
### By Implementation Status
**Implemented**: 001, 002, 003, 004, 006, 008, 009, 010, 016, 017, 020, 021, 028, 032, 033, 034, 035, 036, 037, 038, 040, 041, 043, 044, 045, 046, 050, 051, 052, 055, 057
**Partially Implemented**: 012, 014, 015, 048
**Proposed**: 011, 013, 022, 023, 024, 025, 029, 030, 031, 039, 047, 053, 054, 056
### By Category
| Category | ADRs |
| --------------------- | ------------------------------------------- |
| Core Infrastructure | 002, 007, 020, 030 |
| Data Management | 009, 013, 019, 023, 031, 055 |
| API & Integration | 003, 008, 018, 022, 028 |
| Security | 001, 011, 016, 029, 032, 033, 048 |
| Observability | 004, 015, 050, 051, 052, 056 |
| Deployment & Ops | 006, 014, 017, 024, 037, 038, 053, 054 |
| Frontend/UI | 005, 012, 025, 026, 044 |
| Dev Workflow | 010, 021, 027, 040, 045, 047, 057 |
| Architecture Patterns | 034, 035, 036, 039, 041, 042, 043, 046, 049 |
---
## Essential Commands
```bash
# Run all tests (MUST use container)
podman exec -it flyer-crawler-dev npm test
# Run unit tests
podman exec -it flyer-crawler-dev npm run test:unit
# Run type check
podman exec -it flyer-crawler-dev npm run type-check
# Run integration tests
podman exec -it flyer-crawler-dev npm run test:integration
# PM2 status
podman exec -it flyer-crawler-dev pm2 status
# PM2 logs
podman exec -it flyer-crawler-dev pm2 logs
# Restart all processes
podman exec -it flyer-crawler-dev pm2 restart all
```
---
_This index is optimized for AI agent consumption. Updated: 2026-01-28_

View File

@@ -56,7 +56,7 @@ podman exec -it flyer-crawler-dev npm run dev:container
**Pass/Fail**: [ ]
**Notes**: **********************\_\_\_**********************
**Notes**: \***\*\*\*\*\***\*\*\***\*\*\*\*\***\_\_\_\***\*\*\*\*\***\*\*\***\*\*\*\*\***
---
@@ -90,7 +90,7 @@ podman exec -it flyer-crawler-dev npm run dev:container
**Pass/Fail**: [ ]
**Notes**: **********************\_\_\_**********************
**Notes**: \***\*\*\*\*\***\*\*\***\*\*\*\*\***\_\_\_\***\*\*\*\*\***\*\*\***\*\*\*\*\***
---
@@ -114,7 +114,7 @@ podman exec -it flyer-crawler-dev npm run dev:container
**Pass/Fail**: [ ]
**Notes**: **********************\_\_\_**********************
**Notes**: \***\*\*\*\*\***\*\*\***\*\*\*\*\***\_\_\_\***\*\*\*\*\***\*\*\***\*\*\*\*\***
---
@@ -138,7 +138,7 @@ podman exec -it flyer-crawler-dev npm run dev:container
**Pass/Fail**: [ ]
**Notes**: **********************\_\_\_**********************
**Notes**: \***\*\*\*\*\***\*\*\***\*\*\*\*\***\_\_\_\***\*\*\*\*\***\*\*\***\*\*\*\*\***
---
@@ -161,7 +161,7 @@ podman exec -it flyer-crawler-dev npm run dev:container
**Pass/Fail**: [ ]
**Notes**: **********************\_\_\_**********************
**Notes**: \***\*\*\*\*\***\*\*\***\*\*\*\*\***\_\_\_\***\*\*\*\*\***\*\*\***\*\*\*\*\***
---
@@ -189,7 +189,7 @@ podman exec -it flyer-crawler-dev npm run dev:container
**Pass/Fail**: [ ]
**Notes**: **********************\_\_\_**********************
**Notes**: \***\*\*\*\*\***\*\*\***\*\*\*\*\***\_\_\_\***\*\*\*\*\***\*\*\***\*\*\*\*\***
---
@@ -211,7 +211,7 @@ podman exec -it flyer-crawler-dev npm run dev:container
**Pass/Fail**: [ ]
**Notes**: **********************\_\_\_**********************
**Notes**: \***\*\*\*\*\***\*\*\***\*\*\*\*\***\_\_\_\***\*\*\*\*\***\*\*\***\*\*\*\*\***
---
@@ -234,7 +234,7 @@ podman exec -it flyer-crawler-dev npm run dev:container
**Pass/Fail**: [ ]
**Notes**: **********************\_\_\_**********************
**Notes**: \***\*\*\*\*\***\*\*\***\*\*\*\*\***\_\_\_\***\*\*\*\*\***\*\*\***\*\*\*\*\***
---
@@ -259,7 +259,7 @@ podman exec -it flyer-crawler-dev npm run dev:container
**Pass/Fail**: [ ]
**Notes**: **********************\_\_\_**********************
**Notes**: \***\*\*\*\*\***\*\*\***\*\*\*\*\***\_\_\_\***\*\*\*\*\***\*\*\***\*\*\*\*\***
---
@@ -284,7 +284,7 @@ podman exec -it flyer-crawler-dev npm run dev:container
**Pass/Fail**: [ ]
**Notes**: **********************\_\_\_**********************
**Notes**: \***\*\*\*\*\***\*\*\***\*\*\*\*\***\_\_\_\***\*\*\*\*\***\*\*\***\*\*\*\*\***
---
@@ -307,7 +307,7 @@ podman exec -it flyer-crawler-dev npm run dev:container
**Pass/Fail**: [ ]
**Notes**: **********************\_\_\_**********************
**Notes**: \***\*\*\*\*\***\*\*\***\*\*\*\*\***\_\_\_\***\*\*\*\*\***\*\*\***\*\*\*\*\***
---
@@ -330,7 +330,7 @@ podman exec -it flyer-crawler-dev npm run dev:container
**Pass/Fail**: [ ]
**Notes**: **********************\_\_\_**********************
**Notes**: \***\*\*\*\*\***\*\*\***\*\*\*\*\***\_\_\_\***\*\*\*\*\***\*\*\***\*\*\*\*\***
---
@@ -355,7 +355,7 @@ podman exec -it flyer-crawler-dev npm run dev:container
**Pass/Fail**: [ ]
**Notes**: **********************\_\_\_**********************
**Notes**: \***\*\*\*\*\***\*\*\***\*\*\*\*\***\_\_\_\***\*\*\*\*\***\*\*\***\*\*\*\*\***
---
@@ -379,7 +379,7 @@ podman exec -it flyer-crawler-dev npm run dev:container
**Pass/Fail**: [ ]
**Notes**: **********************\_\_\_**********************
**Notes**: \***\*\*\*\*\***\*\*\***\*\*\*\*\***\_\_\_\***\*\*\*\*\***\*\*\***\*\*\*\*\***
---
@@ -425,7 +425,7 @@ podman exec -it flyer-crawler-dev npm run dev:container
**Pass/Fail**: [ ]
**Notes**: **********************\_\_\_**********************
**Notes**: \***\*\*\*\*\***\*\*\***\*\*\*\*\***\_\_\_\***\*\*\*\*\***\*\*\***\*\*\*\*\***
---
@@ -448,7 +448,7 @@ podman exec -it flyer-crawler-dev npm run dev:container
**Pass/Fail**: [ ]
**Notes**: **********************\_\_\_**********************
**Notes**: \***\*\*\*\*\***\*\*\***\*\*\*\*\***\_\_\_\***\*\*\*\*\***\*\*\***\*\*\*\*\***
---
@@ -476,7 +476,7 @@ podman exec -it flyer-crawler-dev npm run dev:container
**Pass/Fail**: [ ]
**Notes**: **********************\_\_\_**********************
**Notes**: \***\*\*\*\*\***\*\*\***\*\*\*\*\***\_\_\_\***\*\*\*\*\***\*\*\***\*\*\*\*\***
---
@@ -502,7 +502,7 @@ podman exec -it flyer-crawler-dev npm run dev:container
**Pass/Fail**: [ ]
**Notes**: **********************\_\_\_**********************
**Notes**: \***\*\*\*\*\***\*\*\***\*\*\*\*\***\_\_\_\***\*\*\*\*\***\*\*\***\*\*\*\*\***
---
@@ -529,7 +529,7 @@ podman exec -it flyer-crawler-dev npm run dev:container
**Pass/Fail**: [ ]
**Notes**: **********************\_\_\_**********************
**Notes**: \***\*\*\*\*\***\*\*\***\*\*\*\*\***\_\_\_\***\*\*\*\*\***\*\*\***\*\*\*\*\***
---
@@ -555,7 +555,7 @@ podman exec -it flyer-crawler-dev npm run dev:container
**Pass/Fail**: [ ]
**Notes**: **********************\_\_\_**********************
**Notes**: \***\*\*\*\*\***\*\*\***\*\*\*\*\***\_\_\_\***\*\*\*\*\***\*\*\***\*\*\*\*\***
---
@@ -579,7 +579,7 @@ podman exec -it flyer-crawler-dev npm run dev:container
**Pass/Fail**: [ ]
**Notes**: **********************\_\_\_**********************
**Notes**: \***\*\*\*\*\***\*\*\***\*\*\*\*\***\_\_\_\***\*\*\*\*\***\*\*\***\*\*\*\*\***
---
@@ -612,7 +612,7 @@ podman exec -it flyer-crawler-dev npm run dev:container
**Pass/Fail**: [ ]
**Notes**: **********************\_\_\_**********************
**Notes**: \***\*\*\*\*\***\*\*\***\*\*\*\*\***\_\_\_\***\*\*\*\*\***\*\*\***\*\*\*\*\***
---
@@ -637,7 +637,7 @@ podman exec -it flyer-crawler-dev npm run dev:container
**Pass/Fail**: [ ]
**Notes**: **********************\_\_\_**********************
**Notes**: \***\*\*\*\*\***\*\*\***\*\*\*\*\***\_\_\_\***\*\*\*\*\***\*\*\***\*\*\*\*\***
---
@@ -656,7 +656,7 @@ podman exec -it flyer-crawler-dev npm run dev:container
**Pass/Fail**: [ ]
**Notes**: **********************\_\_\_**********************
**Notes**: \***\*\*\*\*\***\*\*\***\*\*\*\*\***\_\_\_\***\*\*\*\*\***\*\*\***\*\*\*\*\***
---
@@ -681,7 +681,7 @@ podman exec -it flyer-crawler-dev npm run dev:container
**Pass/Fail**: [ ]
**Notes**: **********************\_\_\_**********************
**Notes**: \***\*\*\*\*\***\*\*\***\*\*\*\*\***\_\_\_\***\*\*\*\*\***\*\*\***\*\*\*\*\***
---
@@ -705,7 +705,7 @@ podman exec -it flyer-crawler-dev npm run dev:container
**Pass/Fail**: [ ]
**Notes**: **********************\_\_\_**********************
**Notes**: \***\*\*\*\*\***\*\*\***\*\*\*\*\***\_\_\_\***\*\*\*\*\***\*\*\***\*\*\*\*\***
---
@@ -757,7 +757,7 @@ podman exec -it flyer-crawler-dev npm run dev:container
**Pass/Fail**: [ ]
**Measurements**: **********************\_\_\_**********************
**Measurements**: \***\*\*\*\*\***\*\*\***\*\*\*\*\***\_\_\_\***\*\*\*\*\***\*\*\***\*\*\*\*\***
---
@@ -765,7 +765,7 @@ podman exec -it flyer-crawler-dev npm run dev:container
### Test 8.1: Chrome/Edge
**Browser Version**: ******\_\_\_******
**Browser Version**: **\*\***\_\_\_**\*\***
**Tests to Run**:
@@ -775,13 +775,13 @@ podman exec -it flyer-crawler-dev npm run dev:container
**Pass/Fail**: [ ]
**Notes**: **********************\_\_\_**********************
**Notes**: \***\*\*\*\*\***\*\*\***\*\*\*\*\***\_\_\_\***\*\*\*\*\***\*\*\***\*\*\*\*\***
---
### Test 8.2: Firefox
**Browser Version**: ******\_\_\_******
**Browser Version**: **\*\***\_\_\_**\*\***
**Tests to Run**:
@@ -791,13 +791,13 @@ podman exec -it flyer-crawler-dev npm run dev:container
**Pass/Fail**: [ ]
**Notes**: **********************\_\_\_**********************
**Notes**: \***\*\*\*\*\***\*\*\***\*\*\*\*\***\_\_\_\***\*\*\*\*\***\*\*\***\*\*\*\*\***
---
### Test 8.3: Safari (macOS/iOS)
**Browser Version**: ******\_\_\_******
**Browser Version**: **\*\***\_\_\_**\*\***
**Tests to Run**:
@@ -807,7 +807,7 @@ podman exec -it flyer-crawler-dev npm run dev:container
**Pass/Fail**: [ ]
**Notes**: **********************\_\_\_**********************
**Notes**: \***\*\*\*\*\***\*\*\***\*\*\*\*\***\_\_\_\***\*\*\*\*\***\*\*\***\*\*\*\*\***
---
@@ -849,8 +849,8 @@ podman exec -it flyer-crawler-dev npm run dev:container
## Sign-Off
**Tester Name**: **********************\_\_\_**********************
**Date Completed**: **********************\_\_\_**********************
**Tester Name**: \***\*\*\*\*\***\*\*\***\*\*\*\*\***\_\_\_\***\*\*\*\*\***\*\*\***\*\*\*\*\***
**Date Completed**: \***\*\*\*\*\***\*\*\***\*\*\*\*\***\_\_\_\***\*\*\*\*\***\*\*\***\*\*\*\*\***
**Overall Status**: [ ] PASS [ ] PASS WITH ISSUES [ ] FAIL
**Ready for Production**: [ ] YES [ ] NO [ ] WITH FIXES

View File

@@ -208,7 +208,7 @@ Press F12 or Ctrl+Shift+I
**Result**: [ ] PASS [ ] FAIL
**Errors found**: ******************\_\_\_******************
**Errors found**: **\*\*\*\***\*\***\*\*\*\***\_\_\_**\*\*\*\***\*\***\*\*\*\***
---
@@ -224,7 +224,7 @@ Check for:
**Result**: [ ] PASS [ ] FAIL
**Issues found**: ******************\_\_\_******************
**Issues found**: **\*\*\*\***\*\***\*\*\*\***\_\_\_**\*\*\*\***\*\***\*\*\*\***
---
@@ -272,4 +272,4 @@ Check for:
2. ***
3. ***
**Sign-off**: ********\_\_\_******** **Date**: ****\_\_\_****
**Sign-off**: **\*\*\*\***\_\_\_**\*\*\*\*** **Date**: \***\*\_\_\_\*\***

View File

@@ -32,8 +32,10 @@ Day-to-day development guides:
- [Testing Guide](development/TESTING.md) - Unit, integration, and E2E testing
- [Code Patterns](development/CODE-PATTERNS.md) - Common code patterns and ADR examples
- [API Versioning](development/API-VERSIONING.md) - API versioning infrastructure and workflows
- [Design Tokens](development/DESIGN_TOKENS.md) - UI design system and Neo-Brutalism
- [Debugging Guide](development/DEBUGGING.md) - Common debugging patterns
- [Dev Container](development/DEV-CONTAINER.md) - Development container setup and PM2
### 🔧 Operations
@@ -45,6 +47,14 @@ Production operations and deployment:
- [Logstash Troubleshooting](operations/LOGSTASH-TROUBLESHOOTING.md) - Debugging logs
- [Monitoring](operations/MONITORING.md) - Bugsink, health checks, observability
**Incident Response**:
- [PM2 Incident Response Runbook](operations/PM2-INCIDENT-RESPONSE.md) - Step-by-step procedures for PM2 incidents
**Incident Reports**:
- [2026-02-17 PM2 Process Kill](operations/INCIDENT-2026-02-17-PM2-PROCESS-KILL.md) - ALL PM2 processes killed during v0.15.0 deployment (Mitigated)
**NGINX Reference Configs** (in repository root):
- `etc-nginx-sites-available-flyer-crawler.projectium.com` - Production server config

View File

@@ -1,5 +1,21 @@
# DevOps Subagent Reference
## Critical Rule: Server Access is READ-ONLY
**Claude Code has READ-ONLY access to production/test servers.** The `claude-win10` user cannot execute write operations directly.
When working with production/test servers:
1. **Provide commands** for the user to execute (do not attempt SSH)
2. **Wait for user** to report command output
3. **Provide fix commands** 1-3 at a time (errors may cascade)
4. **Verify success** with read-only commands after user executes fixes
5. **Document findings** in relevant documentation
Commands in this reference are for the **user to run on the server**, not for Claude to execute.
---
## Critical Rule: Git Bash Path Conversion
Git Bash on Windows auto-converts Unix paths, breaking container commands.
@@ -69,12 +85,11 @@ MSYS_NO_PATHCONV=1 podman exec -it flyer-crawler-dev psql -U postgres -d flyer_c
## PM2 Commands
### Production Server (via SSH)
### Production Server
> **Note**: These commands are for the **user to execute on the server**. Claude Code provides commands but cannot run them directly. See [Server Access is READ-ONLY](#critical-rule-server-access-is-read-only) above.
```bash
# SSH to server
ssh root@projectium.com
# List all apps
pm2 list
@@ -210,9 +225,10 @@ INFO
### Production
> **Note**: User executes these commands on the server.
```bash
# Via SSH
ssh root@projectium.com
# Access Redis CLI
redis-cli -a $REDIS_PASSWORD
# Flush cache (use with caution)
@@ -278,10 +294,9 @@ Trigger `manual-db-backup.yml` from Gitea Actions UI.
### Manual Backup
```bash
# SSH to server
ssh root@projectium.com
> **Note**: User executes these commands on the server.
```bash
# Backup
PGPASSWORD=$DB_PASSWORD pg_dump -h $DB_HOST -U $DB_USER $DB_NAME > backup_$(date +%Y%m%d).sql
@@ -301,8 +316,10 @@ MSYS_NO_PATHCONV=1 podman exec -e DATABASE_URL=postgresql://bugsink:bugsink_dev_
### Production Token Generation
> **Note**: User executes this command on the server.
```bash
ssh root@projectium.com "cd /opt/bugsink && bugsink-manage create_auth_token"
cd /opt/bugsink && bugsink-manage create_auth_token
```
---

View File

@@ -2,17 +2,408 @@
**Date**: 2025-12-12
**Status**: Proposed
**Status**: Accepted (Phase 2 Complete - All Tasks Done)
**Updated**: 2026-01-27
**Completion Note**: Phase 2 fully complete including test path migration. All 23 integration test files updated to use `/api/v1/` paths. Test suite improved from 274/348 to 345/348 passing (3 remain as todo/skipped for known issues unrelated to versioning).
## Context
As the application grows, the API will need to evolve. Making breaking changes to existing endpoints can disrupt clients (e.g., a mobile app or the web frontend). The current routing has no formal versioning scheme.
### Current State
As of January 2026, the API operates without explicit versioning:
- All routes are mounted under `/api/*` (e.g., `/api/flyers`, `/api/users/profile`)
- The frontend `apiClient.ts` uses `API_BASE_URL = '/api'` as the base
- No version prefix exists in route paths
- Breaking changes would immediately affect all consumers
### Why Version Now?
1. **Future Mobile App**: A native mobile app is planned, which will have slower update cycles than the web frontend
2. **Third-Party Integrations**: Store partners may integrate with our API
3. **Deprecation Path**: Need a clear way to deprecate and remove endpoints
4. **Documentation**: OpenAPI documentation (ADR-018) should reflect versioned endpoints
## Decision
We will adopt a URI-based versioning strategy for the API. All new and existing routes will be prefixed with a version number (e.g., `/api/v1/flyers`). This ADR establishes a clear policy for when to introduce a new version (`v2`) and how to manage deprecation of old versions.
We will adopt a URI-based versioning strategy for the API using a phased rollout approach. All routes will be prefixed with a version number (e.g., `/api/v1/flyers`).
### Versioning Format
```text
/api/v{MAJOR}/resource
```
- **MAJOR**: Incremented for breaking changes (v1, v2, v3...)
- Resource paths remain unchanged within a version
### What Constitutes a Breaking Change?
The following changes require a new API version:
| Change Type | Breaking? | Example |
| ----------------------------- | --------- | ------------------------------------------ |
| Remove endpoint | Yes | DELETE `/api/v1/legacy-feature` |
| Remove response field | Yes | Remove `user.email` from response |
| Change response field type | Yes | `id: number` to `id: string` |
| Change required request field | Yes | Make `email` required when it was optional |
| Rename endpoint | Yes | `/users` to `/accounts` |
| Add optional response field | No | Add `user.avatar_url` |
| Add optional request field | No | Add optional `page` parameter |
| Add new endpoint | No | Add `/api/v1/new-feature` |
| Fix bug in behavior | No\* | Correct calculation error |
\*Bug fixes may warrant version increment if clients depend on the buggy behavior.
## Implementation Phases
### Phase 1: Namespace Migration (Current)
**Goal**: Add `/v1/` prefix to all existing routes without behavioral changes.
**Changes Required**:
1. **server.ts**: Update all route registrations
```typescript
// Before
app.use('/api/auth', authRouter);
// After
app.use('/api/v1/auth', authRouter);
```
2. **apiClient.ts**: Update base URL
```typescript
// Before
const API_BASE_URL = import.meta.env.VITE_API_BASE_URL || '/api';
// After
const API_BASE_URL = import.meta.env.VITE_API_BASE_URL || '/api/v1';
```
3. **swagger.ts**: Update server definition
```typescript
servers: [
{
url: '/api/v1',
description: 'API v1 server',
},
],
```
4. **Redirect Middleware** (optional): Support legacy clients
```typescript
// Redirect unversioned routes to v1
app.use('/api/:resource', (req, res, next) => {
if (req.params.resource !== 'v1') {
return res.redirect(307, `/api/v1/${req.params.resource}${req.url}`);
}
next();
});
```
**Acceptance Criteria**:
- All existing functionality works at `/api/v1/*`
- Frontend makes requests to `/api/v1/*`
- OpenAPI documentation reflects `/api/v1/*` paths
- Integration tests pass with new paths
### Phase 2: Versioning Infrastructure
**Goal**: Build tooling to support multiple API versions.
**Components**:
1. **Version Router Factory**
```typescript
// src/routes/versioned.ts
export function createVersionedRoutes(version: 'v1' | 'v2') {
const router = express.Router();
if (version === 'v1') {
router.use('/auth', authRouterV1);
router.use('/users', userRouterV1);
// ...
} else if (version === 'v2') {
router.use('/auth', authRouterV2);
router.use('/users', userRouterV2);
// ...
}
return router;
}
```
2. **Version Detection Middleware**
```typescript
// Extract version from URL and attach to request
app.use('/api/:version', (req, res, next) => {
req.apiVersion = req.params.version;
next();
});
```
3. **Deprecation Headers**
```typescript
// Middleware to add deprecation headers
function deprecateVersion(sunsetDate: string) {
return (req, res, next) => {
res.set('Deprecation', 'true');
res.set('Sunset', sunsetDate);
res.set('Link', '</api/v2>; rel="successor-version"');
next();
};
}
```
### Phase 3: Version 2 Support
**Goal**: Introduce v2 API when breaking changes are needed.
**Triggers for v2**:
- Major schema changes (e.g., unified item model)
- Response format overhaul
- Authentication mechanism changes
- Significant performance-driven restructuring
**Parallel Support**:
```typescript
app.use('/api/v1', createVersionedRoutes('v1'));
app.use('/api/v2', createVersionedRoutes('v2'));
```
## Migration Path
### For Frontend (Web)
The web frontend is deployed alongside the API, so migration is straightforward:
1. Update `API_BASE_URL` in `apiClient.ts`
2. Update any hardcoded paths in tests
3. Deploy frontend and backend together
### For External Consumers
External consumers (mobile apps, partner integrations) need a transition period:
1. **Announcement**: 30 days before deprecation of v(N-1)
2. **Deprecation Headers**: Add headers 30 days before sunset
3. **Documentation**: Maintain docs for both versions during transition
4. **Sunset**: Remove v(N-1) after grace period
## Deprecation Timeline
| Version | Status | Sunset Date | Notes |
| -------------------- | ---------- | ---------------------- | --------------- |
| Unversioned `/api/*` | Deprecated | Phase 1 completion | Redirect to v1 |
| v1 | Active | TBD (when v2 releases) | Current version |
### Support Policy
- **Current Version (v(N))**: Full support, all features
- **Previous Version (v(N-1))**: Security fixes only for 6 months after v(N) release
- **Older Versions**: No support, endpoints return 410 Gone
## Backwards Compatibility Strategy
### Redirect Middleware
For a smooth transition, implement redirects from unversioned to versioned endpoints:
```typescript
// src/middleware/versionRedirect.ts
import { Request, Response, NextFunction } from 'express';
import { logger } from '../services/logger.server';
/**
* Middleware to redirect unversioned API requests to v1.
* This provides backwards compatibility during the transition period.
*
* Example: /api/flyers -> /api/v1/flyers (307 Temporary Redirect)
*/
export function versionRedirectMiddleware(req: Request, res: Response, next: NextFunction) {
const path = req.path;
// Skip if already versioned
if (path.startsWith('/v1') || path.startsWith('/v2')) {
return next();
}
// Skip health checks and documentation
if (path.startsWith('/health') || path.startsWith('/docs')) {
return next();
}
// Log deprecation warning
logger.warn(
{
path: req.originalUrl,
method: req.method,
ip: req.ip,
},
'Unversioned API request - redirecting to v1',
);
// Use 307 to preserve HTTP method
const redirectUrl = `/api/v1${path}${req.url.includes('?') ? req.url.substring(req.url.indexOf('?')) : ''}`;
return res.redirect(307, redirectUrl);
}
```
### Response Versioning Headers
All API responses include version information:
```typescript
// Middleware to add version headers
app.use('/api/v1', (req, res, next) => {
res.set('X-API-Version', 'v1');
next();
});
```
## Consequences
**Positive**: Establishes a critical pattern for long-term maintainability. Allows the API to evolve without breaking existing clients.
**Negative**: Adds a small amount of complexity to the routing setup. Requires discipline to manage versions and deprecations correctly.
### Positive
- **Clear Evolution Path**: Establishes a critical pattern for long-term maintainability
- **Client Protection**: Allows the API to evolve without breaking existing clients
- **Parallel Development**: Can develop v2 features while maintaining v1 stability
- **Documentation Clarity**: Each version has its own complete documentation
- **Graceful Deprecation**: Clients have clear timelines and migration paths
### Negative
- **Routing Complexity**: Adds complexity to the routing setup
- **Code Duplication**: May need to maintain multiple versions of handlers
- **Testing Overhead**: Tests may need to cover multiple versions
- **Documentation Maintenance**: Must keep docs for multiple versions in sync
### Mitigation
- Use shared business logic with version-specific adapters
- Automate deprecation header addition
- Generate versioned OpenAPI specs from code
- Clear internal guidelines on when to increment versions
## Key Files
| File | Purpose |
| ----------------------------------- | ------------------------------------------- |
| `server.ts` | Route registration with version prefixes |
| `src/services/apiClient.ts` | Frontend API base URL configuration |
| `src/config/swagger.ts` | OpenAPI server URL and version info |
| `src/routes/*.routes.ts` | Individual route handlers |
| `src/middleware/versionRedirect.ts` | Backwards compatibility redirects (Phase 1) |
## Related ADRs
- [ADR-003](./0003-standardized-input-validation-using-middleware.md) - Input Validation (consistent across versions)
- [ADR-018](./0018-api-documentation-strategy.md) - API Documentation Strategy (versioned OpenAPI specs)
- [ADR-028](./0028-api-response-standardization.md) - Response Standardization (envelope pattern applies to all versions)
- [ADR-016](./0016-api-security-hardening.md) - Security Hardening (applies to all versions)
- [ADR-057](./0057-test-remediation-post-api-versioning.md) - Test Remediation Post-API Versioning (documents test migration)
## Implementation Checklist
### Phase 1 Tasks
- [x] Update `server.ts` to mount all routes under `/api/v1/`
- [x] Update `src/services/apiClient.ts` API_BASE_URL to `/api/v1`
- [x] Update `src/config/swagger.ts` server URL to `/api/v1`
- [x] Add redirect middleware for unversioned requests
- [x] Update integration tests to use versioned paths
- [x] Update API documentation examples (Swagger server URL updated)
- [x] Verify all health checks work at `/api/v1/health/*`
### Phase 2 Tasks
**Implementation Guide**: [API Versioning Infrastructure](../architecture/api-versioning-infrastructure.md)
**Developer Guide**: [API Versioning Developer Guide](../development/API-VERSIONING.md)
- [x] Create version router factory (`src/routes/versioned.ts`)
- [x] Implement deprecation header middleware (`src/middleware/deprecation.middleware.ts`)
- [x] Add version detection to request context (`src/middleware/apiVersion.middleware.ts`)
- [x] Add version types to Express Request (`src/types/express.d.ts`)
- [x] Create version constants configuration (`src/config/apiVersions.ts`)
- [x] Update server.ts to use version router factory
- [x] Update swagger.ts for multi-server documentation
- [x] Add unit tests for version middleware
- [x] Add integration tests for versioned router
- [x] Document versioning patterns for developers
- [x] Migrate all test files to use `/api/v1/` paths (23 files, ~70 occurrences)
### Test Path Migration Summary (2026-01-27)
The final cleanup task for Phase 2 was completed by updating all integration test files to use versioned API paths:
| Metric | Value |
| ---------------------------- | ---------------------------------------- |
| Test files updated | 23 |
| Path occurrences changed | ~70 |
| Test failures resolved | 71 (274 -> 345 passing) |
| Tests remaining todo/skipped | 3 (known issues, not versioning-related) |
| Type check | Passing |
| Versioning-specific tests | 82/82 passing |
**Test Results After Migration**:
- Integration tests: 345/348 passing
- Unit tests: 3,375/3,391 passing (16 pre-existing failures unrelated to versioning)
### Unit Test Path Fix (2026-01-27)
Following the test path migration, 16 unit test failures were discovered and fixed. These failures were caused by error log messages using hardcoded `/api/` paths instead of versioned `/api/v1/` paths.
**Root Cause**: Error log messages in route handlers used hardcoded path strings like:
```typescript
// INCORRECT - hardcoded path doesn't reflect actual request URL
req.log.error({ error }, 'Error in /api/flyers/:id:');
```
**Solution**: Updated to use `req.originalUrl` for dynamic path logging:
```typescript
// CORRECT - uses actual request URL including version prefix
req.log.error({ error }, `Error in ${req.originalUrl.split('?')[0]}:`);
```
**Files Modified**:
| File | Changes |
| -------------------------------------- | ---------------------------------- |
| `src/routes/recipe.routes.ts` | 3 error log statements updated |
| `src/routes/stats.routes.ts` | 1 error log statement updated |
| `src/routes/flyer.routes.ts` | 2 error logs + 2 test expectations |
| `src/routes/personalization.routes.ts` | 3 error log statements updated |
**Test Results After Fix**:
- Unit tests: 3,382/3,391 passing (0 failures in fixed files)
- Remaining 9 failures are pre-existing, unrelated issues (CSS/mocking)
**Best Practice**: See [Error Logging Path Patterns](../development/ERROR-LOGGING-PATHS.md) for guidance on logging request paths in error handlers.
**Migration Documentation**: [Test Path Migration Guide](../development/test-path-migration.md)
### Phase 3 Tasks (Future)
- [ ] Identify breaking changes requiring v2
- [ ] Create v2 route handlers
- [ ] Set deprecation timeline for v1
- [ ] Migrate documentation to multi-version format

View File

@@ -39,15 +39,15 @@ All cache operations are fail-safe - cache failures do not break the application
Different data types use different TTL values based on volatility:
| Data Type | TTL | Rationale |
| ------------------- | --------- | -------------------------------------- |
| Brands/Stores | 1 hour | Rarely changes, safe to cache longer |
| Flyer lists | 5 minutes | Changes when new flyers are added |
| Individual flyers | 10 minutes| Stable once created |
| Flyer items | 10 minutes| Stable once created |
| Statistics | 5 minutes | Can be slightly stale |
| Frequent sales | 15 minutes| Aggregated data, updated periodically |
| Categories | 1 hour | Rarely changes |
| Data Type | TTL | Rationale |
| ----------------- | ---------- | ------------------------------------- |
| Brands/Stores | 1 hour | Rarely changes, safe to cache longer |
| Flyer lists | 5 minutes | Changes when new flyers are added |
| Individual flyers | 10 minutes | Stable once created |
| Flyer items | 10 minutes | Stable once created |
| Statistics | 5 minutes | Can be slightly stale |
| Frequent sales | 15 minutes | Aggregated data, updated periodically |
| Categories | 1 hour | Rarely changes |
### Cache Key Strategy
@@ -64,11 +64,11 @@ Cache keys follow a consistent prefix pattern for pattern-based invalidation:
The following repository methods implement server-side caching:
| Method | Cache Key Pattern | TTL |
| ------ | ----------------- | --- |
| `FlyerRepository.getAllBrands()` | `cache:brands` | 1 hour |
| `FlyerRepository.getFlyers()` | `cache:flyers:{limit}:{offset}` | 5 minutes |
| `FlyerRepository.getFlyerItems()` | `cache:flyer-items:{flyerId}` | 10 minutes |
| Method | Cache Key Pattern | TTL |
| --------------------------------- | ------------------------------- | ---------- |
| `FlyerRepository.getAllBrands()` | `cache:brands` | 1 hour |
| `FlyerRepository.getFlyers()` | `cache:flyers:{limit}:{offset}` | 5 minutes |
| `FlyerRepository.getFlyerItems()` | `cache:flyer-items:{flyerId}` | 10 minutes |
### Cache Invalidation
@@ -86,14 +86,14 @@ The following repository methods implement server-side caching:
TanStack React Query provides client-side caching with configurable stale times:
| Query Type | Stale Time |
| ----------------- | ----------- |
| Categories | 1 hour |
| Master Items | 10 minutes |
| Flyer Items | 5 minutes |
| Flyers | 2 minutes |
| Shopping Lists | 1 minute |
| Activity Log | 30 seconds |
| Query Type | Stale Time |
| -------------- | ---------- |
| Categories | 1 hour |
| Master Items | 10 minutes |
| Flyer Items | 5 minutes |
| Flyers | 2 minutes |
| Shopping Lists | 1 minute |
| Activity Log | 30 seconds |
### Multi-Layer Cache Architecture

View File

@@ -363,6 +363,13 @@ The following files contain acknowledged code smell violations that are deferred
- `src/tests/utils/mockFactories.ts` - Mock factories (1553 lines)
- `src/tests/utils/testHelpers.ts` - Test utilities
## Related ADRs
- [ADR-014](./0014-containerization-and-deployment-strategy.md) - Containerization (tests must run in dev container)
- [ADR-040](./0040-testing-economics-and-priorities.md) - Testing Economics and Priorities
- [ADR-045](./0045-test-data-factories-and-fixtures.md) - Test Data Factories and Fixtures
- [ADR-057](./0057-test-remediation-post-api-versioning.md) - Test Remediation Post-API Versioning
## Future Enhancements
1. **Browser E2E Tests**: Consider adding Playwright for actual browser testing

View File

@@ -80,13 +80,13 @@ src/
**Common Utility Patterns**:
| Pattern | Classes |
| ------- | ------- |
| Card container | `bg-white dark:bg-gray-800 rounded-lg shadow-md p-6` |
| Primary button | `bg-brand-primary hover:bg-brand-dark text-white rounded-lg px-4 py-2` |
| Secondary button | `bg-gray-100 dark:bg-gray-700 text-gray-700 dark:text-gray-200` |
| Input field | `border border-gray-300 dark:border-gray-600 rounded-md px-3 py-2` |
| Focus ring | `focus:outline-none focus:ring-2 focus:ring-brand-primary` |
| Pattern | Classes |
| ---------------- | ---------------------------------------------------------------------- |
| Card container | `bg-white dark:bg-gray-800 rounded-lg shadow-md p-6` |
| Primary button | `bg-brand-primary hover:bg-brand-dark text-white rounded-lg px-4 py-2` |
| Secondary button | `bg-gray-100 dark:bg-gray-700 text-gray-700 dark:text-gray-200` |
| Input field | `border border-gray-300 dark:border-gray-600 rounded-md px-3 py-2` |
| Focus ring | `focus:outline-none focus:ring-2 focus:ring-brand-primary` |
### Color System
@@ -187,13 +187,13 @@ export const CheckCircleIcon: React.FC<IconProps> = ({ title, ...props }) => (
**Context Providers** (see ADR-005):
| Provider | Purpose |
| -------- | ------- |
| `AuthProvider` | Authentication state |
| `ModalProvider` | Modal open/close state |
| `FlyersProvider` | Flyer data |
| `MasterItemsProvider` | Grocery items |
| `UserDataProvider` | User-specific data |
| Provider | Purpose |
| --------------------- | ---------------------- |
| `AuthProvider` | Authentication state |
| `ModalProvider` | Modal open/close state |
| `FlyersProvider` | Flyer data |
| `MasterItemsProvider` | Grocery items |
| `UserDataProvider` | User-specific data |
**Provider Hierarchy** in `AppProviders.tsx`:

View File

@@ -2,7 +2,9 @@
**Date**: 2025-12-12
**Status**: Proposed
**Status**: Superseded by [ADR-023](./0023-database-schema-migration-strategy.md)
**Note**: This ADR was an early draft. ADR-023 provides a more detailed specification for the same topic.
## Context

View File

@@ -1,324 +0,0 @@
# ADR-015: Application Performance Monitoring (APM) and Error Tracking
**Date**: 2025-12-12
**Status**: Accepted
**Updated**: 2026-01-11
## Context
While `ADR-004` established structured logging with Pino, the application lacks a high-level, aggregated view of its health, performance, and errors. It's difficult to spot trends, identify slow API endpoints, or be proactively notified of new types of errors.
Key requirements:
1. **Self-hosted**: No external SaaS dependencies for error tracking
2. **Sentry SDK compatible**: Leverage mature, well-documented SDKs
3. **Lightweight**: Minimal resource overhead in the dev container
4. **Production-ready**: Same architecture works on bare-metal production servers
5. **AI-accessible**: MCP server integration for Claude Code and other AI tools
## Decision
We will implement a self-hosted error tracking stack using **Bugsink** as the Sentry-compatible backend, with the following components:
### 1. Error Tracking Backend: Bugsink
**Bugsink** is a lightweight, self-hosted Sentry alternative that:
- Runs as a single process (no Kafka, Redis, ClickHouse required)
- Is fully compatible with Sentry SDKs
- Supports ARM64 and AMD64 architectures
- Can use SQLite (dev) or PostgreSQL (production)
**Deployment**:
- **Dev container**: Installed as a systemd service inside the container
- **Production**: Runs as a systemd service on bare-metal, listening on localhost only
- **Database**: Uses PostgreSQL with a dedicated `bugsink` user and `bugsink` database (same PostgreSQL instance as the main application)
### 2. Backend Integration: @sentry/node
The Express backend will integrate `@sentry/node` SDK to:
- Capture unhandled exceptions before PM2/process manager restarts
- Report errors with full stack traces and context
- Integrate with Pino logger for breadcrumbs
- Track transaction performance (optional)
### 3. Frontend Integration: @sentry/react
The React frontend will integrate `@sentry/react` SDK to:
- Wrap the app in a Sentry Error Boundary
- Capture unhandled JavaScript errors
- Report errors with component stack traces
- Track user session context
- **Frontend Error Correlation**: The global API client (Axios/Fetch wrapper) MUST intercept 4xx/5xx responses. It MUST extract the `x-request-id` header (if present) and attach it to the Sentry scope as a tag `api_request_id` before re-throwing the error. This allows developers to copy the ID from Sentry and search for it in backend logs.
### 4. Log Aggregation: Logstash
**Logstash** parses application and infrastructure logs, forwarding error patterns to Bugsink:
- **Installation**: Installed inside the dev container (and on bare-metal prod servers)
- **Inputs**:
- Pino JSON logs from the Node.js application
- Redis logs (connection errors, memory warnings, slow commands)
- PostgreSQL function logs (future - see Implementation Steps)
- **Filter**: Identifies error-level logs (5xx responses, unhandled exceptions, Redis errors)
- **Output**: Sends to Bugsink via Sentry-compatible HTTP API
This provides a secondary error capture path for:
- Errors that occur before Sentry SDK initialization
- Log-based errors that don't throw exceptions
- Redis connection/performance issues
- Database function errors and slow queries
- Historical error analysis from log files
### 5. MCP Server Integration: bugsink-mcp
For AI tool integration (Claude Code, Cursor, etc.), we use the open-source [bugsink-mcp](https://github.com/j-shelfwood/bugsink-mcp) server:
- **No code changes required**: Configurable via environment variables
- **Capabilities**: List projects, get issues, view events, get stacktraces, manage releases
- **Configuration**:
- `BUGSINK_URL`: Points to Bugsink instance (`http://localhost:8000` for dev, `https://bugsink.projectium.com` for prod)
- `BUGSINK_API_TOKEN`: API token from Bugsink (created via Django management command)
- `BUGSINK_ORG_SLUG`: Organization identifier (usually "sentry")
**Note:** Despite the name `sentry-selfhosted-mcp` mentioned in earlier drafts of this ADR, the actual MCP server used is `bugsink-mcp` which is specifically designed for Bugsink's API structure.
## Architecture
```text
┌─────────────────────────────────────────────────────────────────────────┐
│ Dev Container / Production Server │
├─────────────────────────────────────────────────────────────────────────┤
│ │
│ ┌──────────────────┐ ┌──────────────────┐ │
│ │ Frontend │ │ Backend │ │
│ │ (React) │ │ (Express) │ │
│ │ @sentry/react │ │ @sentry/node │ │
│ └────────┬─────────┘ └────────┬─────────┘ │
│ │ │ │
│ │ Sentry SDK Protocol │ │
│ └───────────┬───────────────┘ │
│ │ │
│ ▼ │
│ ┌──────────────────────┐ │
│ │ Bugsink │ │
│ │ (localhost:8000) │◄──────────────────┐ │
│ │ │ │ │
│ │ PostgreSQL backend │ │ │
│ └──────────────────────┘ │ │
│ │ │
│ ┌──────────────────────┐ │ │
│ │ Logstash │───────────────────┘ │
│ │ (Log Aggregator) │ Sentry Output │
│ │ │ │
│ │ Inputs: │ │
│ │ - Pino app logs │ │
│ │ - Redis logs │ │
│ │ - PostgreSQL (future) │
│ └──────────────────────┘ │
│ ▲ ▲ ▲ │
│ │ │ │ │
│ ┌───────────┘ │ └───────────┐ │
│ │ │ │ │
│ ┌────┴─────┐ ┌─────┴────┐ ┌──────┴─────┐ │
│ │ Pino │ │ Redis │ │ PostgreSQL │ │
│ │ Logs │ │ Logs │ │ Logs (TBD) │ │
│ └──────────┘ └──────────┘ └────────────┘ │
│ │
│ ┌──────────────────────┐ │
│ │ PostgreSQL │ │
│ │ ┌────────────────┐ │ │
│ │ │ flyer_crawler │ │ (main app database) │
│ │ ├────────────────┤ │ │
│ │ │ bugsink │ │ (error tracking database) │
│ │ └────────────────┘ │ │
│ └──────────────────────┘ │
│ │
└─────────────────────────────────────────────────────────────────────────┘
External (Developer Machine):
┌──────────────────────────────────────┐
│ Claude Code / Cursor / VS Code │
│ ┌────────────────────────────────┐ │
│ │ bugsink-mcp │ │
│ │ (MCP Server) │ │
│ │ │ │
│ │ BUGSINK_URL=http://localhost:8000
│ │ BUGSINK_API_TOKEN=... │ │
│ │ BUGSINK_ORG_SLUG=... │ │
│ └────────────────────────────────┘ │
└──────────────────────────────────────┘
```
## Configuration
### Environment Variables
| Variable | Description | Default (Dev) |
| ------------------ | ------------------------------ | -------------------------- |
| `BUGSINK_DSN` | Sentry-compatible DSN for SDKs | Set after project creation |
| `BUGSINK_ENABLED` | Enable/disable error reporting | `true` |
| `BUGSINK_BASE_URL` | Bugsink web UI URL (internal) | `http://localhost:8000` |
### PostgreSQL Setup
```sql
-- Create dedicated Bugsink database and user
CREATE USER bugsink WITH PASSWORD 'bugsink_dev_password';
CREATE DATABASE bugsink OWNER bugsink;
GRANT ALL PRIVILEGES ON DATABASE bugsink TO bugsink;
```
### Bugsink Configuration
```bash
# Environment variables for Bugsink service
SECRET_KEY=<random-50-char-string>
DATABASE_URL=postgresql://bugsink:bugsink_dev_password@localhost:5432/bugsink
BASE_URL=http://localhost:8000
PORT=8000
```
### Logstash Pipeline
```conf
# /etc/logstash/conf.d/bugsink.conf
# === INPUTS ===
input {
# Pino application logs
file {
path => "/app/logs/*.log"
codec => json
type => "pino"
tags => ["app"]
}
# Redis logs
file {
path => "/var/log/redis/*.log"
type => "redis"
tags => ["redis"]
}
# PostgreSQL logs (for function logging - future)
# file {
# path => "/var/log/postgresql/*.log"
# type => "postgres"
# tags => ["postgres"]
# }
}
# === FILTERS ===
filter {
# Pino error detection (level 50 = error, 60 = fatal)
if [type] == "pino" and [level] >= 50 {
mutate { add_tag => ["error"] }
}
# Redis error detection
if [type] == "redis" {
grok {
match => { "message" => "%{POSINT:pid}:%{WORD:role} %{MONTHDAY} %{MONTH} %{TIME} %{WORD:loglevel} %{GREEDYDATA:redis_message}" }
}
if [loglevel] in ["WARNING", "ERROR"] {
mutate { add_tag => ["error"] }
}
}
# PostgreSQL function error detection (future)
# if [type] == "postgres" {
# # Parse PostgreSQL log format and detect ERROR/FATAL levels
# }
}
# === OUTPUT ===
output {
if "error" in [tags] {
http {
url => "http://localhost:8000/api/store/"
http_method => "post"
format => "json"
# Sentry envelope format
}
}
}
```
## Implementation Steps
1. **Update Dockerfile.dev**:
- Install Bugsink (pip package or binary)
- Install Logstash (Elastic APT repository)
- Add systemd service files for both
2. **PostgreSQL initialization**:
- Add Bugsink user/database creation to `sql/00-init-extensions.sql`
3. **Backend SDK integration**:
- Install `@sentry/node`
- Initialize in `server.ts` before Express app
- Configure error handler middleware integration
4. **Frontend SDK integration**:
- Install `@sentry/react`
- Wrap `App` component with `Sentry.ErrorBoundary`
- Configure in `src/index.tsx`
5. **Environment configuration**:
- Add Bugsink variables to `src/config/env.ts`
- Update `.env.example` and `compose.dev.yml`
6. **Logstash configuration**:
- Create pipeline config for Pino → Bugsink
- Configure Pino to write to log file in addition to stdout
- Configure Redis log monitoring (connection errors, slow commands)
7. **MCP server documentation**:
- Document `bugsink-mcp` setup in CLAUDE.md
8. **PostgreSQL function logging** (future):
- Configure PostgreSQL to log function execution errors
- Add Logstash input for PostgreSQL logs
- Define filter rules for function-level error detection
- _Note: Ask for implementation details when this step is reached_
## Consequences
### Positive
- **Full observability**: Aggregated view of errors, trends, and performance
- **Self-hosted**: No external SaaS dependencies or subscription costs
- **SDK compatibility**: Leverages mature Sentry SDKs with excellent documentation
- **AI integration**: MCP server enables Claude Code to query and analyze errors
- **Unified architecture**: Same setup works in dev container and production
- **Lightweight**: Bugsink runs in a single process, unlike full Sentry (16GB+ RAM)
### Negative
- **Additional services**: Bugsink and Logstash add complexity to the container
- **PostgreSQL overhead**: Additional database for error tracking
- **Initial setup**: Requires configuration of multiple components
- **Logstash learning curve**: Pipeline configuration requires Logstash knowledge
## Alternatives Considered
1. **Full Sentry self-hosted**: Rejected due to complexity (Kafka, Redis, ClickHouse, 16GB+ RAM minimum)
2. **GlitchTip**: Considered, but Bugsink is lighter weight and easier to deploy
3. **Sentry SaaS**: Rejected due to self-hosted requirement
4. **Custom error aggregation**: Rejected in favor of proven Sentry SDK ecosystem
## References
- [Bugsink Documentation](https://www.bugsink.com/docs/)
- [Bugsink Docker Install](https://www.bugsink.com/docs/docker-install/)
- [@sentry/node Documentation](https://docs.sentry.io/platforms/javascript/guides/node/)
- [@sentry/react Documentation](https://docs.sentry.io/platforms/javascript/guides/react/)
- [bugsink-mcp](https://github.com/j-shelfwood/bugsink-mcp)
- [Logstash Reference](https://www.elastic.co/guide/en/logstash/current/index.html)

View File

@@ -0,0 +1,272 @@
# ADR-015: Error Tracking and Observability
**Date**: 2025-12-12
**Status**: Accepted (Fully Implemented)
**Updated**: 2026-01-26 (user context integration completed)
**Related**: [ADR-056](./0056-application-performance-monitoring.md) (Application Performance Monitoring)
## Context
While ADR-004 established structured logging with Pino, the application lacks a high-level, aggregated view of its health and errors. It's difficult to spot trends, identify recurring issues, or be proactively notified of new types of errors.
Key requirements:
1. **Self-hosted**: No external SaaS dependencies for error tracking
2. **Sentry SDK compatible**: Leverage mature, well-documented SDKs
3. **Lightweight**: Minimal resource overhead in the dev container
4. **Production-ready**: Same architecture works on bare-metal production servers
5. **AI-accessible**: MCP server integration for Claude Code and other AI tools
**Note**: Application Performance Monitoring (APM) and distributed tracing are covered separately in [ADR-056](./0056-application-performance-monitoring.md).
## Decision
We implement a self-hosted error tracking stack using **Bugsink** as the Sentry-compatible backend, with the following components:
### 1. Error Tracking Backend: Bugsink
**Bugsink** is a lightweight, self-hosted Sentry alternative that:
- Runs as a single process (no Kafka, Redis, ClickHouse required)
- Is fully compatible with Sentry SDKs
- Supports ARM64 and AMD64 architectures
- Can use SQLite (dev) or PostgreSQL (production)
**Deployment**:
- **Dev container**: Installed as a systemd service inside the container
- **Production**: Runs as a systemd service on bare-metal, listening on localhost only
- **Database**: Uses PostgreSQL with a dedicated `bugsink` user and `bugsink` database (same PostgreSQL instance as the main application)
### 2. Backend Integration: @sentry/node
The Express backend integrates `@sentry/node` SDK to:
- Capture unhandled exceptions before PM2/process manager restarts
- Report errors with full stack traces and context
- Integrate with Pino logger for breadcrumbs
- Filter errors by severity (only 5xx errors sent by default)
### 3. Frontend Integration: @sentry/react
The React frontend integrates `@sentry/react` SDK to:
- Wrap the app in an Error Boundary for graceful error handling
- Capture unhandled JavaScript errors
- Report errors with component stack traces
- Filter out browser extension errors
- **Frontend Error Correlation**: The global API client intercepts 4xx/5xx responses and can attach the `x-request-id` header to Sentry scope for correlation with backend logs
### 4. Log Aggregation: Logstash
**Logstash** parses application and infrastructure logs, forwarding error patterns to Bugsink:
- **Installation**: Installed inside the dev container (and on bare-metal prod servers)
- **Inputs**:
- Pino JSON logs from the Node.js application (PM2 managed)
- Redis logs (connection errors, memory warnings, slow commands)
- PostgreSQL function logs (via `fn_log()` - see ADR-050)
- NGINX access/error logs
- **Filter**: Identifies error-level logs (5xx responses, unhandled exceptions, Redis errors)
- **Output**: Sends to Bugsink via Sentry-compatible HTTP API
This provides a secondary error capture path for:
- Errors that occur before Sentry SDK initialization
- Log-based errors that don't throw exceptions
- Redis connection/performance issues
- Database function errors and slow queries
- Historical error analysis from log files
### 5. MCP Server Integration: bugsink-mcp
For AI tool integration (Claude Code, Cursor, etc.), we use the open-source [bugsink-mcp](https://github.com/j-shelfwood/bugsink-mcp) server:
- **No code changes required**: Configurable via environment variables
- **Capabilities**: List projects, get issues, view events, get stacktraces, manage releases
- **Configuration**:
- `BUGSINK_URL`: Points to Bugsink instance (`http://localhost:8000` for dev, `https://bugsink.projectium.com` for prod)
- `BUGSINK_API_TOKEN`: API token from Bugsink (created via Django management command)
- `BUGSINK_ORG_SLUG`: Organization identifier (usually "sentry")
## Architecture
```text
+---------------------------------------------------------------------------+
| Dev Container / Production Server |
+---------------------------------------------------------------------------+
| |
| +------------------+ +------------------+ |
| | Frontend | | Backend | |
| | (React) | | (Express) | |
| | @sentry/react | | @sentry/node | |
| +--------+---------+ +--------+---------+ |
| | | |
| | Sentry SDK Protocol | |
| +-----------+---------------+ |
| | |
| v |
| +----------------------+ |
| | Bugsink | |
| | (localhost:8000) |<------------------+ |
| | | | |
| | PostgreSQL backend | | |
| +----------------------+ | |
| | |
| +----------------------+ | |
| | Logstash |-------------------+ |
| | (Log Aggregator) | Sentry Output |
| | | |
| | Inputs: | |
| | - PM2/Pino logs | |
| | - Redis logs | |
| | - PostgreSQL logs | |
| | - NGINX logs | |
| +----------------------+ |
| ^ ^ ^ ^ |
| | | | | |
| +-----------+ | | +-----------+ |
| | | | | |
| +----+-----+ +-----+----+ +-----+----+ +-----+----+ |
| | PM2 | | Redis | | PostgreSQL| | NGINX | |
| | Logs | | Logs | | Logs | | Logs | |
| +----------+ +----------+ +-----------+ +---------+ |
| |
| +----------------------+ |
| | PostgreSQL | |
| | +----------------+ | |
| | | flyer_crawler | | (main app database) |
| | +----------------+ | |
| | | bugsink | | (error tracking database) |
| | +----------------+ | |
| +----------------------+ |
| |
+---------------------------------------------------------------------------+
External (Developer Machine):
+--------------------------------------+
| Claude Code / Cursor / VS Code |
| +--------------------------------+ |
| | bugsink-mcp | |
| | (MCP Server) | |
| | | |
| | BUGSINK_URL=http://localhost:8000
| | BUGSINK_API_TOKEN=... | |
| | BUGSINK_ORG_SLUG=... | |
| +--------------------------------+ |
+--------------------------------------+
```
## Implementation Status
### Completed
- [x] Bugsink installed and configured in dev container
- [x] PostgreSQL `bugsink` database and user created
- [x] `@sentry/node` SDK integrated in backend (`src/services/sentry.server.ts`)
- [x] `@sentry/react` SDK integrated in frontend (`src/services/sentry.client.ts`)
- [x] ErrorBoundary component created (`src/components/ErrorBoundary.tsx`)
- [x] ErrorBoundary wrapped around app (`src/providers/AppProviders.tsx`)
- [x] Logstash pipeline configured for PM2/Pino, Redis, PostgreSQL, NGINX logs
- [x] MCP server (`bugsink-mcp`) documented and configured
- [x] Environment variables added to `src/config/env.ts` and frontend `src/config.ts`
- [x] Browser extension errors filtered in `beforeSend`
- [x] 5xx error filtering in backend error handler
### Recently Completed (2026-01-26)
- [x] **User context after authentication**: Integrated `setUser()` calls in `AuthProvider.tsx` to associate errors with authenticated users
- Called on profile fetch from query (line 44-49)
- Called on direct login with profile (line 94-99)
- Called on login with profile fetch (line 124-129)
- Cleared on logout (line 76-77)
- Maps `user_id``id`, `email``email`, `full_name``username`
This completes the error tracking implementation - all errors are now associated with the authenticated user who encountered them, enabling user-specific error analysis and debugging.
## Configuration
### Environment Variables
| Variable | Description | Default (Dev) |
| -------------------- | -------------------------------- | -------------------------- |
| `SENTRY_DSN` | Sentry-compatible DSN (backend) | Set after project creation |
| `VITE_SENTRY_DSN` | Sentry-compatible DSN (frontend) | Set after project creation |
| `SENTRY_ENVIRONMENT` | Environment name | `development` |
| `SENTRY_DEBUG` | Enable debug logging | `false` |
| `SENTRY_ENABLED` | Enable/disable error reporting | `true` |
### PostgreSQL Setup
```sql
-- Create dedicated Bugsink database and user
CREATE USER bugsink WITH PASSWORD 'bugsink_dev_password';
CREATE DATABASE bugsink OWNER bugsink;
GRANT ALL PRIVILEGES ON DATABASE bugsink TO bugsink;
```
### Bugsink Configuration
```bash
# Environment variables for Bugsink service
SECRET_KEY=<random-50-char-string>
DATABASE_URL=postgresql://bugsink:bugsink_dev_password@localhost:5432/bugsink
BASE_URL=http://localhost:8000
PORT=8000
```
### Logstash Pipeline
See `docker/logstash/bugsink.conf` for the full pipeline configuration.
Key routing:
| Source | Bugsink Project |
| --------------- | --------------- |
| Backend (Pino) | Backend API |
| Worker (Pino) | Backend API |
| PostgreSQL logs | Backend API |
| Vite logs | Infrastructure |
| Redis logs | Infrastructure |
| NGINX logs | Infrastructure |
| Frontend errors | Frontend |
## Consequences
### Positive
- **Full observability**: Aggregated view of errors and trends
- **Self-hosted**: No external SaaS dependencies or subscription costs
- **SDK compatibility**: Leverages mature Sentry SDKs with excellent documentation
- **AI integration**: MCP server enables Claude Code to query and analyze errors
- **Unified architecture**: Same setup works in dev container and production
- **Lightweight**: Bugsink runs in a single process, unlike full Sentry (16GB+ RAM)
- **Error correlation**: Request IDs allow correlation between frontend errors and backend logs
### Negative
- **Additional services**: Bugsink and Logstash add complexity to the container
- **PostgreSQL overhead**: Additional database for error tracking
- **Initial setup**: Requires configuration of multiple components
- **Logstash learning curve**: Pipeline configuration requires Logstash knowledge
## Alternatives Considered
1. **Full Sentry self-hosted**: Rejected due to complexity (Kafka, Redis, ClickHouse, 16GB+ RAM minimum)
2. **GlitchTip**: Considered, but Bugsink is lighter weight and easier to deploy
3. **Sentry SaaS**: Rejected due to self-hosted requirement
4. **Custom error aggregation**: Rejected in favor of proven Sentry SDK ecosystem
## References
- [Bugsink Documentation](https://www.bugsink.com/docs/)
- [Bugsink Docker Install](https://www.bugsink.com/docs/docker-install/)
- [@sentry/node Documentation](https://docs.sentry.io/platforms/javascript/guides/node/)
- [@sentry/react Documentation](https://docs.sentry.io/platforms/javascript/guides/react/)
- [bugsink-mcp](https://github.com/j-shelfwood/bugsink-mcp)
- [Logstash Reference](https://www.elastic.co/guide/en/logstash/current/index.html)
- [ADR-050: PostgreSQL Function Observability](./0050-postgresql-function-observability.md)
- [ADR-056: Application Performance Monitoring](./0056-application-performance-monitoring.md)

View File

@@ -45,15 +45,15 @@ Using **helmet v8.x** configured in `server.ts` as the first middleware after ap
**Security Headers Applied**:
| Header | Configuration | Purpose |
| ------ | ------------- | ------- |
| Content-Security-Policy | Custom directives | Prevents XSS, code injection |
| Strict-Transport-Security | 1 year, includeSubDomains, preload | Forces HTTPS connections |
| X-Content-Type-Options | nosniff | Prevents MIME type sniffing |
| X-Frame-Options | DENY | Prevents clickjacking |
| X-XSS-Protection | 0 (disabled) | Deprecated, CSP preferred |
| Referrer-Policy | strict-origin-when-cross-origin | Controls referrer information |
| Cross-Origin-Resource-Policy | cross-origin | Allows external resource loading |
| Header | Configuration | Purpose |
| ---------------------------- | ---------------------------------- | -------------------------------- |
| Content-Security-Policy | Custom directives | Prevents XSS, code injection |
| Strict-Transport-Security | 1 year, includeSubDomains, preload | Forces HTTPS connections |
| X-Content-Type-Options | nosniff | Prevents MIME type sniffing |
| X-Frame-Options | DENY | Prevents clickjacking |
| X-XSS-Protection | 0 (disabled) | Deprecated, CSP preferred |
| Referrer-Policy | strict-origin-when-cross-origin | Controls referrer information |
| Cross-Origin-Resource-Policy | cross-origin | Allows external resource loading |
**Content Security Policy Directives**:
@@ -87,35 +87,35 @@ Using **express-rate-limit v8.2.1** with a centralized configuration in `src/con
```typescript
const standardConfig = {
standardHeaders: true, // Sends RateLimit-* headers
standardHeaders: true, // Sends RateLimit-* headers
legacyHeaders: false,
skip: shouldSkipRateLimit, // Disabled in test environment
skip: shouldSkipRateLimit, // Disabled in test environment
};
```
**Rate Limiters by Category**:
| Category | Limiter | Window | Max Requests |
| -------- | ------- | ------ | ------------ |
| **Authentication** | loginLimiter | 15 min | 5 |
| | registerLimiter | 1 hour | 5 |
| | forgotPasswordLimiter | 15 min | 5 |
| | resetPasswordLimiter | 15 min | 10 |
| | refreshTokenLimiter | 15 min | 20 |
| | logoutLimiter | 15 min | 10 |
| **Public/User Read** | publicReadLimiter | 15 min | 100 |
| | userReadLimiter | 15 min | 100 |
| | userUpdateLimiter | 15 min | 100 |
| **Sensitive Operations** | userSensitiveUpdateLimiter | 1 hour | 5 |
| | adminTriggerLimiter | 15 min | 30 |
| **AI/Costly** | aiGenerationLimiter | 15 min | 20 |
| | geocodeLimiter | 1 hour | 100 |
| | priceHistoryLimiter | 15 min | 50 |
| **Uploads** | adminUploadLimiter | 15 min | 20 |
| | aiUploadLimiter | 15 min | 10 |
| | batchLimiter | 15 min | 50 |
| **Tracking** | trackingLimiter | 15 min | 200 |
| | reactionToggleLimiter | 15 min | 150 |
| Category | Limiter | Window | Max Requests |
| ------------------------ | -------------------------- | ------ | ------------ |
| **Authentication** | loginLimiter | 15 min | 5 |
| | registerLimiter | 1 hour | 5 |
| | forgotPasswordLimiter | 15 min | 5 |
| | resetPasswordLimiter | 15 min | 10 |
| | refreshTokenLimiter | 15 min | 20 |
| | logoutLimiter | 15 min | 10 |
| **Public/User Read** | publicReadLimiter | 15 min | 100 |
| | userReadLimiter | 15 min | 100 |
| | userUpdateLimiter | 15 min | 100 |
| **Sensitive Operations** | userSensitiveUpdateLimiter | 1 hour | 5 |
| | adminTriggerLimiter | 15 min | 30 |
| **AI/Costly** | aiGenerationLimiter | 15 min | 20 |
| | geocodeLimiter | 1 hour | 100 |
| | priceHistoryLimiter | 15 min | 50 |
| **Uploads** | adminUploadLimiter | 15 min | 20 |
| | aiUploadLimiter | 15 min | 10 |
| | batchLimiter | 15 min | 50 |
| **Tracking** | trackingLimiter | 15 min | 200 |
| | reactionToggleLimiter | 15 min | 150 |
**Test Environment Handling**:
@@ -140,7 +140,7 @@ sanitizeFilename(filename: string): string
**Multer Configuration** (`src/middleware/multer.middleware.ts`):
- MIME type validation via `imageFileFilter` (only image/* allowed)
- MIME type validation via `imageFileFilter` (only image/\* allowed)
- File size limits (2MB for logos, configurable per upload type)
- Unique filenames using timestamps + random suffixes
- User-scoped storage paths
@@ -203,10 +203,12 @@ Per-request structured logging (ADR-004):
```typescript
import cors from 'cors';
app.use(cors({
origin: process.env.ALLOWED_ORIGINS?.split(',') || 'http://localhost:3000',
credentials: true,
}));
app.use(
cors({
origin: process.env.ALLOWED_ORIGINS?.split(',') || 'http://localhost:3000',
credentials: true,
}),
);
```
2. **Redis-backed rate limiting**: For distributed deployments, use `rate-limit-redis` store

View File

@@ -2,9 +2,11 @@
**Date**: 2025-12-12
**Status**: Accepted
**Status**: Superseded
**Implemented**: 2026-01-11
**Superseded By**: This ADR was updated in February 2026 to reflect the migration from swagger-jsdoc to tsoa. The original approach using JSDoc annotations has been replaced with a decorator-based controller pattern.
**Implemented**: 2026-02-12
## Context
@@ -16,139 +18,296 @@ Key requirements:
2. **Code-Documentation Sync**: Documentation should stay in sync with the actual code to prevent drift.
3. **Low Maintenance Overhead**: The documentation approach should be "fast and lite" - minimal additional work for developers.
4. **Security**: Documentation should not expose sensitive information in production environments.
5. **Type Safety**: Documentation should be derived from TypeScript types to ensure accuracy.
### Why We Migrated from swagger-jsdoc to tsoa
The original implementation used `swagger-jsdoc` to generate OpenAPI specs from JSDoc comments. This approach had several limitations:
| Issue | Impact |
| --------------------------------------- | -------------------------------------------- |
| `swagger-jsdoc` unmaintained since 2022 | Security and compatibility risks |
| JSDoc duplication with TypeScript types | Maintenance burden, potential for drift |
| No runtime validation from schema | Validation logic separate from documentation |
| Manual type definitions in comments | Error-prone, no compiler verification |
## Decision
We will adopt **OpenAPI 3.0 (Swagger)** for API documentation using the following approach:
We adopt **tsoa** for API documentation using a decorator-based controller pattern:
1. **JSDoc Annotations**: Use `swagger-jsdoc` to generate OpenAPI specs from JSDoc comments in route files.
2. **Swagger UI**: Use `swagger-ui-express` to serve interactive documentation at `/docs/api-docs`.
3. **Environment Restriction**: Only expose the Swagger UI in development and test environments, not production.
4. **Incremental Adoption**: Start with key public routes and progressively add annotations to all endpoints.
1. **Controller Classes**: Use tsoa decorators (`@Route`, `@Get`, `@Post`, `@Security`, etc.) on controller classes.
2. **TypeScript-First**: OpenAPI specs are generated directly from TypeScript interfaces and types.
3. **Swagger UI**: Continue using `swagger-ui-express` to serve interactive documentation at `/docs/api-docs`.
4. **Environment Restriction**: Only expose the Swagger UI in development and test environments, not production.
5. **BaseController Pattern**: All controllers extend a base class providing response formatting utilities.
### Tooling Selection
| Tool | Purpose |
| -------------------- | ---------------------------------------------- |
| `swagger-jsdoc` | Generates OpenAPI 3.0 spec from JSDoc comments |
| `swagger-ui-express` | Serves interactive Swagger UI |
| Tool | Purpose |
| -------------------- | ----------------------------------------------------- |
| `tsoa` (6.6.0) | Generates OpenAPI 3.0 spec from decorators and routes |
| `swagger-ui-express` | Serves interactive Swagger UI |
**Why JSDoc over separate schema files?**
**Why tsoa over swagger-jsdoc?**
- Documentation lives with the code, reducing drift
- No separate files to maintain
- Developers see documentation when editing routes
- Lower learning curve for the team
- **Type-safe contracts**: Decorators derive types directly from TypeScript, eliminating duplicate definitions
- **Active maintenance**: tsoa has an active community and regular releases
- **Route generation**: tsoa generates Express routes automatically, reducing boilerplate
- **Validation integration**: Request body types serve as validation contracts
- **Reduced duplication**: No more parallel JSDoc + TypeScript type definitions
## Implementation Details
### OpenAPI Configuration
### tsoa Configuration
Located in `src/config/swagger.ts`:
Located in `tsoa.json`:
```typescript
import swaggerJsdoc from 'swagger-jsdoc';
const options: swaggerJsdoc.Options = {
definition: {
openapi: '3.0.0',
info: {
title: 'Flyer Crawler API',
version: '1.0.0',
description: 'API for the Flyer Crawler application',
contact: {
name: 'API Support',
},
},
servers: [
{
url: '/api',
description: 'API server',
},
],
components: {
securitySchemes: {
bearerAuth: {
type: 'http',
scheme: 'bearer',
bearerFormat: 'JWT',
},
},
```json
{
"entryFile": "server.ts",
"noImplicitAdditionalProperties": "throw-on-extras",
"controllerPathGlobs": ["src/controllers/**/*.controller.ts"],
"spec": {
"outputDirectory": "src/config",
"specVersion": 3,
"securityDefinitions": {
"bearerAuth": {
"type": "http",
"scheme": "bearer",
"bearerFormat": "JWT"
}
},
"basePath": "/api",
"specFileBaseName": "tsoa-spec",
"name": "Flyer Crawler API",
"version": "1.0.0"
},
apis: ['./src/routes/*.ts'],
};
export const swaggerSpec = swaggerJsdoc(options);
"routes": {
"routesDir": "src/routes",
"basePath": "/api",
"middleware": "express",
"routesFileName": "tsoa-generated.ts",
"esm": true,
"authenticationModule": "src/middleware/tsoaAuthentication.ts"
}
}
```
### JSDoc Annotation Pattern
### Controller Pattern
Each route handler should include OpenAPI annotations using the `@openapi` tag:
Each controller extends `BaseController` and uses tsoa decorators:
```typescript
/**
* @openapi
* /health/ping:
* get:
* summary: Simple ping endpoint
* description: Returns a pong response to verify server is responsive
* tags:
* - Health
* responses:
* 200:
* description: Server is responsive
* content:
* application/json:
* schema:
* type: object
* properties:
* success:
* type: boolean
* example: true
* data:
* type: object
* properties:
* message:
* type: string
* example: pong
*/
router.get('/ping', validateRequest(emptySchema), (_req: Request, res: Response) => {
return sendSuccess(res, { message: 'pong' });
});
import { Route, Tags, Get, Post, Body, Security, SuccessResponse, Response } from 'tsoa';
import {
BaseController,
SuccessResponse as SuccessResponseType,
ErrorResponse,
} from './base.controller';
interface CreateUserRequest {
email: string;
password: string;
full_name?: string;
}
@Route('users')
@Tags('Users')
export class UserController extends BaseController {
/**
* Create a new user account.
* @summary Create user
* @param requestBody User creation data
* @returns Created user profile
*/
@Post()
@SuccessResponse(201, 'User created')
@Response<ErrorResponse>(400, 'Validation error')
@Response<ErrorResponse>(409, 'Email already exists')
public async createUser(
@Body() requestBody: CreateUserRequest,
): Promise<SuccessResponseType<UserProfileDto>> {
// Implementation
const user = await userService.createUser(requestBody);
return this.created(user);
}
/**
* Get current user's profile.
* @summary Get my profile
* @param request Express request with authenticated user
* @returns User profile
*/
@Get('me')
@Security('bearerAuth')
@SuccessResponse(200, 'Profile retrieved')
@Response<ErrorResponse>(401, 'Not authenticated')
public async getMyProfile(
@Request() request: Express.Request,
): Promise<SuccessResponseType<UserProfileDto>> {
const user = request.user as UserProfile;
return this.success(toUserProfileDto(user));
}
}
```
### Route Documentation Priority
### BaseController Helpers
Document routes in this order of priority:
The `BaseController` class provides standardized response formatting:
1. **Health Routes** - `/api/health/*` (public, critical for operations)
2. **Auth Routes** - `/api/auth/*` (public, essential for integration)
3. **Gamification Routes** - `/api/achievements/*` (simple, good example)
4. **Flyer Routes** - `/api/flyers/*` (core functionality)
5. **User Routes** - `/api/users/*` (common CRUD patterns)
6. **Remaining Routes** - Budget, Recipe, Admin, etc.
```typescript
export abstract class BaseController extends Controller {
// Success response with data
protected success<T>(data: T): SuccessResponse<T> {
return { success: true, data };
}
// Success with 201 Created status
protected created<T>(data: T): SuccessResponse<T> {
this.setStatus(201);
return this.success(data);
}
// Paginated response with metadata
protected paginated<T>(data: T[], pagination: PaginationInput): PaginatedResponse<T> {
return {
success: true,
data,
meta: { pagination: this.calculatePagination(pagination) },
};
}
// Message-only response
protected message(message: string): SuccessResponse<{ message: string }> {
return this.success({ message });
}
// No content response (204)
protected noContent(): void {
this.setStatus(204);
}
// Error response (prefer throwing errors instead)
protected error(code: string, message: string, details?: unknown): ErrorResponse {
return { success: false, error: { code, message, details } };
}
}
```
### Authentication with @Security
tsoa integrates with the existing passport-jwt strategy via a custom authentication module:
```typescript
// src/middleware/tsoaAuthentication.ts
export async function expressAuthentication(
request: Request,
securityName: string,
_scopes?: string[],
): Promise<UserProfile> {
if (securityName !== 'bearerAuth') {
throw new AuthenticationError(`Unknown security scheme: ${securityName}`);
}
const token = extractBearerToken(request);
const decoded = jwt.verify(token, process.env.JWT_SECRET!);
const userProfile = await userRepo.findUserProfileById(decoded.user_id);
if (!userProfile) {
throw new AuthenticationError('User not found');
}
request.user = userProfile;
return userProfile;
}
```
Usage in controllers:
```typescript
@Get('profile')
@Security('bearerAuth')
public async getProfile(@Request() req: Express.Request): Promise<...> {
const user = req.user as UserProfile;
// ...
}
```
### DTO Organization
Shared DTOs are defined in `src/dtos/common.dto.ts` to avoid duplicate type definitions across controllers:
```typescript
// src/dtos/common.dto.ts
/**
* Address with flattened coordinates (tsoa-compatible).
* GeoJSONPoint uses coordinates: [number, number] which tsoa cannot handle.
*/
export interface AddressDto {
address_id: number;
address_line_1: string;
city: string;
province_state: string;
postal_code: string;
country: string;
latitude?: number | null; // Flattened from GeoJSONPoint
longitude?: number | null; // Flattened from GeoJSONPoint
// ...
}
export interface UserDto {
user_id: string;
email: string;
created_at: string;
updated_at: string;
}
export interface UserProfileDto {
full_name?: string | null;
role: 'admin' | 'user';
points: number;
user: UserDto;
address?: AddressDto | null;
// ...
}
```
### Swagger UI Setup
In `server.ts`, add the Swagger UI middleware (development/test only):
In `server.ts`, the Swagger UI middleware serves the tsoa-generated spec:
```typescript
import swaggerUi from 'swagger-ui-express';
import { swaggerSpec } from './src/config/swagger';
import tsoaSpec from './src/config/tsoa-spec.json' with { type: 'json' };
// Only serve Swagger UI in non-production environments
if (process.env.NODE_ENV !== 'production') {
app.use('/docs/api-docs', swaggerUi.serve, swaggerUi.setup(swaggerSpec));
app.use('/docs/api-docs', swaggerUi.serve, swaggerUi.setup(tsoaSpec));
// Optionally expose raw JSON spec for tooling
// Raw JSON spec for tooling
app.get('/docs/api-docs.json', (_req, res) => {
res.setHeader('Content-Type', 'application/json');
res.send(swaggerSpec);
res.send(tsoaSpec);
});
}
```
### Build Integration
tsoa spec and route generation is integrated into the build pipeline:
```json
{
"scripts": {
"tsoa:spec": "tsoa spec",
"tsoa:routes": "tsoa routes",
"prebuild": "npm run tsoa:spec && npm run tsoa:routes",
"build": "tsc"
}
}
```
### Response Schema Standardization
All API responses follow the standardized format from [ADR-028](./0028-api-response-standardization.md):
@@ -160,107 +319,144 @@ All API responses follow the standardized format from [ADR-028](./0028-api-respo
"data": { ... }
}
// Paginated response
{
"success": true,
"data": [...],
"meta": {
"pagination": {
"page": 1,
"limit": 20,
"total": 100,
"totalPages": 5,
"hasNextPage": true,
"hasPrevPage": false
}
}
}
// Error response
{
"success": false,
"error": {
"code": "ERROR_CODE",
"message": "Human-readable message"
"code": "NOT_FOUND",
"message": "User not found"
}
}
```
Define reusable schema components for these patterns:
```typescript
/**
* @openapi
* components:
* schemas:
* SuccessResponse:
* type: object
* properties:
* success:
* type: boolean
* example: true
* data:
* type: object
* ErrorResponse:
* type: object
* properties:
* success:
* type: boolean
* example: false
* error:
* type: object
* properties:
* code:
* type: string
* message:
* type: string
*/
```
### Security Considerations
1. **Production Disabled**: Swagger UI is not available in production to prevent information disclosure.
2. **No Sensitive Data**: Never include actual secrets, tokens, or PII in example values.
3. **Authentication Documented**: Clearly document which endpoints require authentication.
## API Route Tags
Organize endpoints using consistent tags:
| Tag | Description | Routes |
| Tag | Description | Route Prefix |
| ------------ | ---------------------------------- | --------------------- |
| Health | Server health and readiness checks | `/api/health/*` |
| Auth | Authentication and authorization | `/api/auth/*` |
| Users | User profile management | `/api/users/*` |
| Flyers | Flyer uploads and retrieval | `/api/flyers/*` |
| Achievements | Gamification and leaderboards | `/api/achievements/*` |
| Budgets | Budget tracking | `/api/budgets/*` |
| Deals | Deal search and management | `/api/deals/*` |
| Stores | Store information | `/api/stores/*` |
| Recipes | Recipe management | `/api/recipes/*` |
| Budgets | Budget tracking | `/api/budgets/*` |
| Inventory | User inventory management | `/api/inventory/*` |
| Gamification | Achievements and leaderboards | `/api/achievements/*` |
| Admin | Administrative operations | `/api/admin/*` |
| System | System status and monitoring | `/api/system/*` |
## Controller Inventory
The following controllers have been migrated to tsoa:
| Controller | Endpoints | Description |
| ------------------------------- | --------- | ----------------------------------------- |
| `health.controller.ts` | 10 | Health checks, probes, service status |
| `auth.controller.ts` | 8 | Login, register, password reset, OAuth |
| `user.controller.ts` | 30 | User profiles, preferences, notifications |
| `admin.controller.ts` | 32 | System administration, user management |
| `ai.controller.ts` | 15 | AI-powered extraction and analysis |
| `flyer.controller.ts` | 12 | Flyer upload and management |
| `store.controller.ts` | 8 | Store information |
| `recipe.controller.ts` | 10 | Recipe CRUD and suggestions |
| `upc.controller.ts` | 6 | UPC barcode lookups |
| `inventory.controller.ts` | 8 | User inventory management |
| `receipt.controller.ts` | 6 | Receipt processing |
| `budget.controller.ts` | 8 | Budget tracking |
| `category.controller.ts` | 4 | Category management |
| `deals.controller.ts` | 8 | Deal search and discovery |
| `stats.controller.ts` | 6 | Usage statistics |
| `price.controller.ts` | 6 | Price history and tracking |
| `system.controller.ts` | 4 | System status |
| `gamification.controller.ts` | 10 | Achievements, leaderboards |
| `personalization.controller.ts` | 6 | User recommendations |
| `reactions.controller.ts` | 4 | Item reactions and ratings |
## Security Considerations
1. **Production Disabled**: Swagger UI is not available in production to prevent information disclosure.
2. **No Sensitive Data**: Never include actual secrets, tokens, or PII in example values.
3. **Authentication Documented**: Clearly document which endpoints require authentication.
4. **Rate Limiting**: Rate limiters are applied via `@Middlewares` decorator.
## Testing
Verify API documentation is correct by:
1. **Manual Review**: Navigate to `/docs/api-docs` and test each endpoint.
2. **Spec Validation**: Use OpenAPI validators to check the generated spec.
3. **Integration Tests**: Existing integration tests serve as implicit documentation verification.
3. **Controller Tests**: Each controller has comprehensive test coverage (369 controller tests total).
4. **Integration Tests**: 345 integration tests verify endpoint behavior.
## Consequences
### Positive
- **Single Source of Truth**: Documentation lives with the code and stays in sync.
- **Interactive Exploration**: Developers can try endpoints directly from the UI.
- **SDK Generation**: OpenAPI spec enables automatic client SDK generation.
- **Onboarding**: New developers can quickly understand the API surface.
- **Low Overhead**: JSDoc annotations are minimal additions to existing code.
- **Type-safe API contracts**: tsoa decorators derive types from TypeScript, eliminating duplicate definitions
- **Single Source of Truth**: Documentation lives with the code and stays in sync
- **Active Maintenance**: tsoa is actively maintained with regular releases
- **Interactive Exploration**: Developers can try endpoints directly from Swagger UI
- **SDK Generation**: OpenAPI spec enables automatic client SDK generation
- **Reduced Boilerplate**: tsoa generates Express routes automatically
### Negative
- **Maintenance Required**: Developers must update annotations when routes change.
- **Build Dependency**: Adds `swagger-jsdoc` and `swagger-ui-express` packages.
- **Initial Investment**: Existing routes need annotations added incrementally.
- **Learning Curve**: Decorator-based controller pattern differs from Express handlers
- **Generated Code**: `tsoa-generated.ts` must be regenerated when controllers change
- **Build Step**: Adds `tsoa spec && tsoa routes` to the build pipeline
### Mitigation
- Include documentation checks in code review process.
- Start with high-priority routes and expand coverage over time.
- Use TypeScript types to reduce documentation duplication where possible.
- **Migration Guide**: Created comprehensive TSOA-MIGRATION-GUIDE.md for developers
- **BaseController**: Provides familiar response helpers matching existing patterns
- **Incremental Adoption**: Existing Express routes continue to work alongside tsoa controllers
## Key Files
- `src/config/swagger.ts` - OpenAPI configuration
- `src/routes/*.ts` - Route files with JSDoc annotations
- `server.ts` - Swagger UI middleware setup
| File | Purpose |
| -------------------------------------- | --------------------------------------- |
| `tsoa.json` | tsoa configuration |
| `src/controllers/base.controller.ts` | Base controller with response utilities |
| `src/controllers/types.ts` | Shared controller type definitions |
| `src/controllers/*.controller.ts` | Individual domain controllers |
| `src/dtos/common.dto.ts` | Shared DTO definitions |
| `src/middleware/tsoaAuthentication.ts` | JWT authentication handler |
| `src/routes/tsoa-generated.ts` | tsoa-generated Express routes |
| `src/config/tsoa-spec.json` | Generated OpenAPI 3.0 spec |
| `server.ts` | Swagger UI middleware setup |
## Migration History
| Date | Change |
| ---------- | --------------------------------------------------------------- |
| 2025-12-12 | Initial ADR created with swagger-jsdoc approach |
| 2026-01-11 | Began implementation with swagger-jsdoc |
| 2026-02-12 | Completed migration to tsoa, superseding swagger-jsdoc approach |
## Related ADRs
- [ADR-059](./0059-dependency-modernization.md) - Dependency Modernization (tsoa migration plan)
- [ADR-003](./0003-standardized-input-validation-using-middleware.md) - Input Validation (Zod schemas)
- [ADR-028](./0028-api-response-standardization.md) - Response Standardization
- [ADR-001](./0001-standardized-error-handling.md) - Error Handling
- [ADR-016](./0016-api-security-hardening.md) - Security Hardening
- [ADR-048](./0048-authentication-strategy.md) - Authentication Strategy

View File

@@ -4,6 +4,8 @@
**Status**: Proposed
**Supersedes**: [ADR-013](./0013-database-schema-migration-strategy.md)
## Context
The `README.md` indicates that the database schema is managed by manually running a large `schema.sql.txt` file. This approach is highly error-prone, makes tracking changes difficult, and is not feasible for updating a live production database without downtime or data loss.

View File

@@ -1,18 +1,333 @@
# ADR-024: Feature Flagging Strategy
**Date**: 2025-12-12
**Status**: Accepted
**Implemented**: 2026-01-28
**Implementation Plan**: [2026-01-28-adr-024-feature-flags-implementation.md](../plans/2026-01-28-adr-024-feature-flags-implementation.md)
**Status**: Proposed
## Implementation Summary
Feature flag infrastructure fully implemented with 89 new tests (all passing). Total test suite: 3,616 tests passing.
**Backend**:
- Zod-validated schema in `src/config/env.ts` with 6 feature flags
- Service module `src/services/featureFlags.server.ts` with `isFeatureEnabled()`, `getFeatureFlags()`, `getEnabledFeatureFlags()`
- Admin endpoint `GET /api/v1/admin/feature-flags` (requires admin authentication)
- Convenience exports for direct boolean access
**Frontend**:
- Config section in `src/config.ts` with `VITE_FEATURE_*` environment variables
- Type declarations in `src/vite-env.d.ts`
- React hook `useFeatureFlag()` and `useAllFeatureFlags()` in `src/hooks/useFeatureFlag.ts`
- Declarative component `<FeatureFlag>` in `src/components/FeatureFlag.tsx`
**Current Flags**: `bugsinkSync`, `advancedRbac`, `newDashboard`, `betaRecipes`, `experimentalAi`, `debugMode`
---
## Context
As the application grows, there is no way to roll out new features to a subset of users (e.g., for beta testing) or to quickly disable a problematic feature in production without a full redeployment.
Application lacks controlled feature rollout capability. No mechanism for beta testing, quick production disablement, or gradual rollouts without full redeployment. Need type-safe, configuration-based system integrating with ADR-007 Zod validation.
## Decision
We will implement a feature flagging system. This could start with a simple configuration-based approach (defined in `ADR-007`) and evolve to use a dedicated service like **Flagsmith** or **LaunchDarkly**. This ADR will define how feature flags are created, managed, and checked in both the backend and frontend code.
Implement environment-variable-based feature flag system. Backend: Zod-validated schema in `src/config/env.ts` + dedicated service. Frontend: Vite env vars + React hook + declarative component. All flags default `false` (opt-in model). Future migration path to Flagsmith/LaunchDarkly preserved via abstraction layer.
## Consequences
**Positive**: Decouples feature releases from code deployments, reducing risk and allowing for more controlled, gradual rollouts and A/B testing. Enables easier experimentation and faster iteration.
**Negative**: Adds complexity to the codebase with conditional logic around features. Requires careful management of feature flag states to avoid technical debt.
- **Positive**: Decouples releases from deployments reduced risk, gradual rollouts, A/B testing capability
- **Negative**: Conditional logic complexity → requires sunset policy (3-month max after full rollout)
- **Neutral**: Restart required for flag changes (acceptable for current scale, external service removes this constraint)
---
## Implementation Details
### Architecture Overview
```text
Environment Variables (FEATURE_*, VITE_FEATURE_*)
├── Backend ──► src/config/env.ts (Zod) ──► src/services/featureFlags.server.ts
│ │
│ ┌──────────┴──────────┐
│ │ │
│ isFeatureEnabled() getAllFeatureFlags()
│ │
│ Routes/Services
└── Frontend ─► src/config.ts ──► src/hooks/useFeatureFlag.ts
┌──────────────┼──────────────┐
│ │ │
useFeatureFlag() useAllFeatureFlags() <FeatureFlag>
│ Component
Components
```
### File Structure
| File | Purpose | Layer |
| ------------------------------------- | ------------------------ | ---------------- |
| `src/config/env.ts` | Zod schema + env loading | Backend config |
| `src/services/featureFlags.server.ts` | Flag access service | Backend runtime |
| `src/config.ts` | Vite env parsing | Frontend config |
| `src/vite-env.d.ts` | TypeScript declarations | Frontend types |
| `src/hooks/useFeatureFlag.ts` | React hook | Frontend runtime |
| `src/components/FeatureFlag.tsx` | Declarative wrapper | Frontend UI |
### Naming Convention
| Context | Pattern | Example |
| ------------------- | ------------------------- | ---------------------------------- |
| Backend env var | `FEATURE_SNAKE_CASE` | `FEATURE_NEW_DASHBOARD` |
| Frontend env var | `VITE_FEATURE_SNAKE_CASE` | `VITE_FEATURE_NEW_DASHBOARD` |
| Config property | `camelCase` | `config.featureFlags.newDashboard` |
| Hook/function param | `camelCase` literal | `isFeatureEnabled('newDashboard')` |
### Backend Implementation
#### Schema Definition (`src/config/env.ts`)
```typescript
/**
* Feature flags schema (ADR-024).
* All flags default false (disabled) for safety.
*/
const featureFlagsSchema = z.object({
newDashboard: booleanString(false), // FEATURE_NEW_DASHBOARD
betaRecipes: booleanString(false), // FEATURE_BETA_RECIPES
experimentalAi: booleanString(false), // FEATURE_EXPERIMENTAL_AI
debugMode: booleanString(false), // FEATURE_DEBUG_MODE
});
// In loadEnvVars():
featureFlags: {
newDashboard: process.env.FEATURE_NEW_DASHBOARD,
betaRecipes: process.env.FEATURE_BETA_RECIPES,
experimentalAi: process.env.FEATURE_EXPERIMENTAL_AI,
debugMode: process.env.FEATURE_DEBUG_MODE,
},
```
#### Service Module (`src/services/featureFlags.server.ts`)
```typescript
import { config, isDevelopment } from '../config/env';
import { logger } from './logger.server';
export type FeatureFlagName = keyof typeof config.featureFlags;
/**
* Check feature flag state. Logs in development mode.
*/
export function isFeatureEnabled(flagName: FeatureFlagName): boolean {
const enabled = config.featureFlags[flagName];
if (isDevelopment) {
logger.debug({ flag: flagName, enabled }, 'Feature flag checked');
}
return enabled;
}
/**
* Get all flags (admin/debug endpoints).
*/
export function getAllFeatureFlags(): Record<FeatureFlagName, boolean> {
return { ...config.featureFlags };
}
// Convenience exports (evaluated once at startup)
export const isNewDashboardEnabled = config.featureFlags.newDashboard;
export const isBetaRecipesEnabled = config.featureFlags.betaRecipes;
```
#### Usage in Routes
```typescript
import { isFeatureEnabled } from '../services/featureFlags.server';
router.get('/dashboard', async (req, res) => {
if (isFeatureEnabled('newDashboard')) {
return sendSuccess(res, { version: 'v2', data: await getNewDashboardData() });
}
return sendSuccess(res, { version: 'v1', data: await getLegacyDashboardData() });
});
```
### Frontend Implementation
#### Config (`src/config.ts`)
```typescript
const config = {
// ... existing sections ...
featureFlags: {
newDashboard: import.meta.env.VITE_FEATURE_NEW_DASHBOARD === 'true',
betaRecipes: import.meta.env.VITE_FEATURE_BETA_RECIPES === 'true',
experimentalAi: import.meta.env.VITE_FEATURE_EXPERIMENTAL_AI === 'true',
debugMode: import.meta.env.VITE_FEATURE_DEBUG_MODE === 'true',
},
};
```
#### Type Declarations (`src/vite-env.d.ts`)
```typescript
interface ImportMetaEnv {
readonly VITE_FEATURE_NEW_DASHBOARD?: string;
readonly VITE_FEATURE_BETA_RECIPES?: string;
readonly VITE_FEATURE_EXPERIMENTAL_AI?: string;
readonly VITE_FEATURE_DEBUG_MODE?: string;
}
```
#### React Hook (`src/hooks/useFeatureFlag.ts`)
```typescript
import { useMemo } from 'react';
import config from '../config';
export type FeatureFlagName = keyof typeof config.featureFlags;
export function useFeatureFlag(flagName: FeatureFlagName): boolean {
return useMemo(() => config.featureFlags[flagName], [flagName]);
}
export function useAllFeatureFlags(): Record<FeatureFlagName, boolean> {
return useMemo(() => ({ ...config.featureFlags }), []);
}
```
#### Declarative Component (`src/components/FeatureFlag.tsx`)
```typescript
import { ReactNode } from 'react';
import { useFeatureFlag, FeatureFlagName } from '../hooks/useFeatureFlag';
interface FeatureFlagProps {
name: FeatureFlagName;
children: ReactNode;
fallback?: ReactNode;
}
export function FeatureFlag({ name, children, fallback = null }: FeatureFlagProps) {
const isEnabled = useFeatureFlag(name);
return <>{isEnabled ? children : fallback}</>;
}
```
#### Usage in Components
```tsx
// Declarative approach
<FeatureFlag name="newDashboard" fallback={<LegacyDashboard />}>
<NewDashboard />
</FeatureFlag>;
// Hook approach (for logic beyond rendering)
const isNewDashboard = useFeatureFlag('newDashboard');
useEffect(() => {
if (isNewDashboard) analytics.track('new_dashboard_viewed');
}, [isNewDashboard]);
```
### Testing Patterns
#### Backend Test Setup
```typescript
// Reset modules to test different flag states
beforeEach(() => {
vi.resetModules();
process.env.FEATURE_NEW_DASHBOARD = 'true';
});
// src/services/featureFlags.server.test.ts
describe('isFeatureEnabled', () => {
it('returns false for disabled flags', () => {
expect(isFeatureEnabled('newDashboard')).toBe(false);
});
});
```
#### Frontend Test Setup
```typescript
// Mock config module
vi.mock('../config', () => ({
default: {
featureFlags: {
newDashboard: true,
betaRecipes: false,
},
},
}));
// Component test
describe('FeatureFlag', () => {
it('renders fallback when disabled', () => {
render(
<FeatureFlag name="betaRecipes" fallback={<div>Old</div>}>
<div>New</div>
</FeatureFlag>
);
expect(screen.getByText('Old')).toBeInTheDocument();
});
});
```
### Flag Lifecycle
| Phase | Actions |
| ---------- | -------------------------------------------------------------------------------------------- |
| **Add** | 1. Add to both schemas (backend + frontend) 2. Default `false` 3. Document in `.env.example` |
| **Enable** | Set env var `='true'` → restart application |
| **Remove** | 1. Remove conditional code 2. Remove from schemas 3. Remove env vars |
| **Sunset** | Max 3 months after full rollout → remove flag |
### Admin Endpoint (Optional)
```typescript
// GET /api/v1/admin/feature-flags (admin-only)
router.get('/feature-flags', requireAdmin, async (req, res) => {
sendSuccess(res, { flags: getAllFeatureFlags() });
});
```
### Integration with ADR-007
Feature flags extend existing Zod configuration pattern:
- **Validation**: Same `booleanString()` transform used by other config
- **Loading**: Same `loadEnvVars()` function loads `FEATURE_*` vars
- **Type Safety**: `FeatureFlagName` type derived from config schema
- **Fail-Fast**: Invalid flag values fail at startup (Zod validation)
### Future Migration Path
Current implementation abstracts flag access via `isFeatureEnabled()` function and `useFeatureFlag()` hook. External service migration requires:
1. Replace implementation internals of these functions
2. Add API client for Flagsmith/LaunchDarkly
3. No changes to consuming code (routes/components)
### Explicitly Out of Scope
- External service integration (Flagsmith/LaunchDarkly)
- Database-stored flags
- Real-time flag updates (WebSocket/SSE)
- User-specific flags (A/B testing percentages)
- Flag inheritance/hierarchy
- Flag audit logging
### Key Files Reference
| Action | Files |
| --------------------- | ------------------------------------------------------------------------------------------------- |
| Add new flag | `src/config/env.ts`, `src/config.ts`, `src/vite-env.d.ts`, `.env.example` |
| Check flag (backend) | Import from `src/services/featureFlags.server.ts` |
| Check flag (frontend) | Import hook from `src/hooks/useFeatureFlag.ts` or component from `src/components/FeatureFlag.tsx` |
| Test flag behavior | Mock via `vi.resetModules()` (backend) or `vi.mock('../config')` (frontend) |

View File

@@ -16,12 +16,12 @@ We will adopt a hybrid naming convention strategy to explicitly distinguish betw
1. **Database and AI Types (`snake_case`)**:
Interfaces, Type definitions, and Zod schemas that represent raw database rows or direct AI responses **MUST** use `snake_case`.
- *Examples*: `AiFlyerDataSchema`, `ExtractedFlyerItemSchema`, `FlyerInsert`.
- *Reasoning*: This avoids unnecessary mapping layers when inserting data into the database or parsing AI output. It serves as a visual cue that the data is "raw", "external", or destined for persistence.
- _Examples_: `AiFlyerDataSchema`, `ExtractedFlyerItemSchema`, `FlyerInsert`.
- _Reasoning_: This avoids unnecessary mapping layers when inserting data into the database or parsing AI output. It serves as a visual cue that the data is "raw", "external", or destined for persistence.
2. **Internal Application Logic (`camelCase`)**:
Variables, function arguments, and processed data structures used within the application logic (Service layer, UI components, utility functions) **MUST** use `camelCase`.
- *Reasoning*: This adheres to standard JavaScript/TypeScript practices and maintains consistency with the rest of the ecosystem (React, etc.).
- _Reasoning_: This adheres to standard JavaScript/TypeScript practices and maintains consistency with the rest of the ecosystem (React, etc.).
3. **Boundary Handling**:
- For background jobs that primarily move data from AI to DB, preserving `snake_case` is preferred to minimize transformation logic.

View File

@@ -195,6 +195,12 @@ Do NOT add tests:
- Coverage percentages may not satisfy external audits
- Requires judgment calls that may be inconsistent
## Related ADRs
- [ADR-010](./0010-testing-strategy-and-standards.md) - Testing Strategy and Standards (this ADR extends ADR-010)
- [ADR-045](./0045-test-data-factories-and-fixtures.md) - Test Data Factories and Fixtures
- [ADR-057](./0057-test-remediation-post-api-versioning.md) - Test Remediation Post-API Versioning
## Key Files
- `docs/adr/0010-testing-strategy-and-standards.md` - Testing mechanics

View File

@@ -2,22 +2,22 @@
**Date**: 2026-01-09
**Status**: Partially Implemented
**Status**: Accepted (Fully Implemented)
**Implemented**: 2026-01-09 (Local auth only)
**Implemented**: 2026-01-09 (Local auth + JWT), 2026-01-26 (OAuth enabled)
## Context
The application requires a secure authentication system that supports both traditional email/password login and social OAuth providers (Google, GitHub). The system must handle user sessions, token refresh, account security (lockout after failed attempts), and integrate seamlessly with the existing Express middleware pipeline.
Currently, **only local authentication is enabled**. OAuth strategies are fully implemented but commented out, pending configuration of OAuth provider credentials.
**All authentication methods are now fully implemented**: Local authentication (email/password), JWT tokens, and OAuth (Google + GitHub). OAuth strategies use conditional registration - they activate automatically when the corresponding environment variables are configured.
## Decision
We will implement a stateless JWT-based authentication system with the following components:
1. **Local Authentication**: Email/password login with bcrypt hashing.
2. **OAuth Authentication**: Google and GitHub OAuth 2.0 (currently disabled).
2. **OAuth Authentication**: Google and GitHub OAuth 2.0 (conditionally enabled via environment variables).
3. **JWT Access Tokens**: Short-lived tokens (15 minutes) for API authentication.
4. **Refresh Tokens**: Long-lived tokens (7 days) stored in HTTP-only cookies.
5. **Account Security**: Lockout after 5 failed login attempts for 15 minutes.
@@ -59,7 +59,7 @@ We will implement a stateless JWT-based authentication system with the following
│ │ │ │ │
│ │ ┌──────────┐ │ │ │
│ └────────>│ OAuth │─────────────┘ │ │
(disabled) │ Provider │ │ │
│ Provider │ │ │
│ └──────────┘ │ │
│ │ │
│ ┌──────────┐ ┌──────────┐ │ │
@@ -130,72 +130,139 @@ passport.use(
- Refresh token: 7 days expiry, 64-byte random hex
- Refresh token stored in HTTP-only cookie with `secure` flag in production
### OAuth Strategies (Disabled)
### OAuth Strategies (Conditionally Enabled)
OAuth strategies are **fully implemented** and activate automatically when the corresponding environment variables are set. The strategies use conditional registration to gracefully handle missing credentials.
#### Google OAuth
Located in `src/routes/passport.routes.ts` (lines 167-217, commented):
Located in `src/config/passport.ts` (lines 167-235):
```typescript
// passport.use(new GoogleStrategy({
// clientID: process.env.GOOGLE_CLIENT_ID!,
// clientSecret: process.env.GOOGLE_CLIENT_SECRET!,
// callbackURL: '/api/auth/google/callback',
// scope: ['profile', 'email']
// },
// async (accessToken, refreshToken, profile, done) => {
// const email = profile.emails?.[0]?.value;
// const user = await db.findUserByEmail(email);
// if (user) {
// return done(null, user);
// }
// // Create new user with null password_hash
// const newUser = await db.createUser(email, null, {
// full_name: profile.displayName,
// avatar_url: profile.photos?.[0]?.value
// });
// return done(null, newUser);
// }
// ));
// Only register the strategy if the required environment variables are set.
if (process.env.GOOGLE_CLIENT_ID && process.env.GOOGLE_CLIENT_SECRET) {
passport.use(
new GoogleStrategy(
{
clientID: process.env.GOOGLE_CLIENT_ID,
clientSecret: process.env.GOOGLE_CLIENT_SECRET,
callbackURL: '/api/auth/google/callback',
scope: ['profile', 'email'],
},
async (_accessToken, _refreshToken, profile, done) => {
const email = profile.emails?.[0]?.value;
if (!email) {
return done(new Error('No email found in Google profile.'), false);
}
const existingUserProfile = await db.userRepo.findUserWithProfileByEmail(email, logger);
if (existingUserProfile) {
// User exists, log them in (strip sensitive fields)
return done(null, cleanUserProfile);
} else {
// Create new user with null password_hash for OAuth users
const newUserProfile = await db.userRepo.createUser(
email,
null,
{
full_name: profile.displayName,
avatar_url: profile.photos?.[0]?.value,
},
logger,
);
return done(null, newUserProfile);
}
},
),
);
logger.info('[Passport] Google OAuth strategy registered.');
} else {
logger.warn('[Passport] Google OAuth strategy NOT registered: credentials not set.');
}
```
#### GitHub OAuth
Located in `src/routes/passport.routes.ts` (lines 219-269, commented):
Located in `src/config/passport.ts` (lines 237-310):
```typescript
// passport.use(new GitHubStrategy({
// clientID: process.env.GITHUB_CLIENT_ID!,
// clientSecret: process.env.GITHUB_CLIENT_SECRET!,
// callbackURL: '/api/auth/github/callback',
// scope: ['user:email']
// },
// async (accessToken, refreshToken, profile, done) => {
// const email = profile.emails?.[0]?.value;
// // Similar flow to Google OAuth
// }
// ));
// Only register the strategy if the required environment variables are set.
if (process.env.GITHUB_CLIENT_ID && process.env.GITHUB_CLIENT_SECRET) {
passport.use(
new GitHubStrategy(
{
clientID: process.env.GITHUB_CLIENT_ID,
clientSecret: process.env.GITHUB_CLIENT_SECRET,
callbackURL: '/api/auth/github/callback',
scope: ['user:email'],
},
async (_accessToken, _refreshToken, profile, done) => {
const email = profile.emails?.[0]?.value;
if (!email) {
return done(new Error('No public email found in GitHub profile.'), false);
}
// Same flow as Google OAuth - find or create user
},
),
);
logger.info('[Passport] GitHub OAuth strategy registered.');
} else {
logger.warn('[Passport] GitHub OAuth strategy NOT registered: credentials not set.');
}
```
#### OAuth Routes (Disabled)
#### OAuth Routes (Active)
Located in `src/routes/auth.routes.ts` (lines 289-315, commented):
Located in `src/routes/auth.routes.ts` (lines 587-609):
```typescript
// const handleOAuthCallback = (req, res) => {
// const user = req.user;
// const accessToken = jwt.sign(payload, JWT_SECRET, { expiresIn: '15m' });
// const refreshToken = crypto.randomBytes(64).toString('hex');
//
// await db.saveRefreshToken(user.user_id, refreshToken);
// res.cookie('refreshToken', refreshToken, { httpOnly: true, secure: true });
// res.redirect(`${FRONTEND_URL}/auth/callback?token=${accessToken}`);
// };
// Google OAuth routes
router.get('/google', passport.authenticate('google', { session: false }));
router.get(
'/google/callback',
passport.authenticate('google', {
session: false,
failureRedirect: '/?error=google_auth_failed',
}),
createOAuthCallbackHandler('google'),
);
// router.get('/google', passport.authenticate('google', { session: false }));
// router.get('/google/callback', passport.authenticate('google', { ... }), handleOAuthCallback);
// router.get('/github', passport.authenticate('github', { session: false }));
// router.get('/github/callback', passport.authenticate('github', { ... }), handleOAuthCallback);
// GitHub OAuth routes
router.get('/github', passport.authenticate('github', { session: false }));
router.get(
'/github/callback',
passport.authenticate('github', {
session: false,
failureRedirect: '/?error=github_auth_failed',
}),
createOAuthCallbackHandler('github'),
);
```
#### OAuth Callback Handler
The callback handler generates tokens and redirects to the frontend:
```typescript
const createOAuthCallbackHandler = (provider: 'google' | 'github') => {
return async (req: Request, res: Response) => {
const userProfile = req.user as UserProfile;
const { accessToken, refreshToken } = await authService.handleSuccessfulLogin(
userProfile,
req.log,
);
res.cookie('refreshToken', refreshToken, {
httpOnly: true,
secure: process.env.NODE_ENV === 'production',
maxAge: 30 * 24 * 60 * 60 * 1000, // 30 days
});
// Redirect to frontend with provider-specific token param
const tokenParam = provider === 'google' ? 'googleAuthToken' : 'githubAuthToken';
res.redirect(`${process.env.FRONTEND_URL}/?${tokenParam}=${accessToken}`);
};
};
```
### Database Schema
@@ -248,11 +315,13 @@ export const mockAuth = (req, res, next) => {
};
```
## Enabling OAuth
## Configuring OAuth Providers
OAuth is fully implemented and activates automatically when credentials are provided. No code changes are required.
### Step 1: Set Environment Variables
Add to `.env`:
Add to your environment (`.env.local` for development, Gitea secrets for production):
```bash
# Google OAuth
@@ -283,54 +352,29 @@ GITHUB_CLIENT_SECRET=your-github-client-secret
- Development: `http://localhost:3001/api/auth/github/callback`
- Production: `https://your-domain.com/api/auth/github/callback`
### Step 3: Uncomment Backend Code
### Step 3: Restart the Application
**In `src/routes/passport.routes.ts`**:
After setting the environment variables, restart PM2:
1. Uncomment import statements (lines 5-6):
```typescript
import { Strategy as GoogleStrategy } from 'passport-google-oauth20';
import { Strategy as GitHubStrategy } from 'passport-github2';
```
2. Uncomment Google strategy (lines 167-217)
3. Uncomment GitHub strategy (lines 219-269)
**In `src/routes/auth.routes.ts`**:
1. Uncomment `handleOAuthCallback` function (lines 291-309)
2. Uncomment OAuth routes (lines 311-315)
### Step 4: Add Frontend OAuth Buttons
Create login buttons that redirect to:
- Google: `GET /api/auth/google`
- GitHub: `GET /api/auth/github`
Handle callback at `/auth/callback?token=<accessToken>`:
1. Extract token from URL
2. Store in client-side token storage
3. Redirect to dashboard
### Step 5: Handle OAuth Callback Page
Create `src/pages/AuthCallback.tsx`:
```typescript
const AuthCallback = () => {
const token = new URLSearchParams(location.search).get('token');
if (token) {
setToken(token);
navigate('/dashboard');
} else {
navigate('/login?error=auth_failed');
}
};
```bash
podman exec -it flyer-crawler-dev pm2 restart all
```
The Passport configuration will automatically register the OAuth strategies when it detects the credentials. Check the logs for confirmation:
```text
[Passport] Google OAuth strategy registered.
[Passport] GitHub OAuth strategy registered.
```
### Frontend Integration
OAuth login buttons are implemented in `src/client/pages/AuthView.tsx`. The frontend:
1. Redirects users to `/api/auth/google` or `/api/auth/github`
2. Handles the callback via the `useAppInitialization` hook which looks for `googleAuthToken` or `githubAuthToken` query parameters
3. Stores the token and redirects to the dashboard
## Known Limitations
1. **No OAuth Provider ID Mapping**: Users are identified by email only. If a user has accounts with different emails on Google and GitHub, they create separate accounts.
@@ -372,31 +416,32 @@ const AuthCallback = () => {
- **Stateless Architecture**: No session storage required; scales horizontally.
- **Secure by Default**: HTTP-only cookies, short token expiry, bcrypt hashing.
- **Account Protection**: Lockout prevents brute-force attacks.
- **Flexible OAuth**: Can enable/disable OAuth without code changes (just env vars + uncommenting).
- **Graceful Degradation**: System works with local auth only.
- **Flexible OAuth**: OAuth activates automatically when credentials are set - no code changes needed.
- **Graceful Degradation**: System works with local auth only when OAuth credentials are not configured.
- **Full Feature Set**: Both local and OAuth authentication are production-ready.
### Negative
- **OAuth Disabled by Default**: Requires manual uncommenting to enable.
- **No Account Linking**: Multiple OAuth providers create separate accounts.
- **Frontend Work Required**: OAuth login buttons don't exist yet.
- **Token in URL**: OAuth callback passes token in URL (visible in browser history).
- **No Account Linking**: Multiple OAuth providers create separate accounts if emails differ.
- **Token in URL**: OAuth callback passes token in URL query parameter (visible in browser history).
- **Email-Based Identity**: OAuth users are identified by email only, not provider-specific IDs.
### Mitigation
- Document OAuth enablement steps clearly (see [../architecture/AUTHENTICATION.md](../architecture/AUTHENTICATION.md)).
- Document OAuth configuration steps clearly (see [../architecture/AUTHENTICATION.md](../architecture/AUTHENTICATION.md)).
- Consider adding OAuth provider ID columns for future account linking.
- Use URL fragment (`#token=`) instead of query parameter for callback.
- Consider using URL fragment (`#token=`) instead of query parameter for callback in future enhancement.
## Key Files
| File | Purpose |
| ------------------------------------------------------ | ------------------------------------------------ |
| `src/routes/passport.routes.ts` | Passport strategies (local, JWT, OAuth) |
| `src/config/passport.ts` | Passport strategies (local, JWT, OAuth) |
| `src/routes/auth.routes.ts` | Auth endpoints (login, register, refresh, OAuth) |
| `src/services/authService.ts` | Auth business logic |
| `src/services/db/user.db.ts` | User database operations |
| `src/config/env.ts` | Environment variable validation |
| `src/client/pages/AuthView.tsx` | Frontend login/register UI with OAuth buttons |
| [AUTHENTICATION.md](../architecture/AUTHENTICATION.md) | OAuth setup guide |
| `.env.example` | Environment variable template |
@@ -409,11 +454,11 @@ const AuthCallback = () => {
## Future Enhancements
1. **Enable OAuth**: Uncomment strategies and configure providers.
2. **Add OAuth Provider Mapping Table**: Store `googleId`, `githubId` for account linking.
3. **Implement Account Linking**: Allow users to connect multiple OAuth providers.
4. **Add Password to OAuth Users**: Allow OAuth users to set a password.
5. **Implement PKCE**: Add PKCE flow for enhanced OAuth security.
6. **Token in Fragment**: Use URL fragment for OAuth callback token.
7. **OAuth Token Storage**: Store OAuth refresh tokens for provider API access.
8. **Magic Link Login**: Add passwordless email login option.
1. **Add OAuth Provider Mapping Table**: Store `googleId`, `githubId` for account linking.
2. **Implement Account Linking**: Allow users to connect multiple OAuth providers.
3. **Add Password to OAuth Users**: Allow OAuth users to set a password for local login.
4. **Implement PKCE**: Add PKCE flow for enhanced OAuth security.
5. **Token in Fragment**: Use URL fragment for OAuth callback token instead of query parameter.
6. **OAuth Token Storage**: Store OAuth refresh tokens for provider API access.
7. **Magic Link Login**: Add passwordless email login option.
8. **Additional OAuth Providers**: Support for Apple, Microsoft, or other providers.

View File

@@ -2,9 +2,9 @@
**Date**: 2026-01-11
**Status**: Proposed
**Status**: Accepted (Fully Implemented)
**Related**: [ADR-015](0015-application-performance-monitoring-and-error-tracking.md), [ADR-004](0004-standardized-application-wide-structured-logging.md)
**Related**: [ADR-015](0015-error-tracking-and-observability.md), [ADR-004](0004-standardized-application-wide-structured-logging.md)
## Context
@@ -335,7 +335,7 @@ SELECT award_achievement('user-uuid', 'Nonexistent Badge');
## References
- [ADR-015: Application Performance Monitoring](0015-application-performance-monitoring-and-error-tracking.md)
- [ADR-015: Error Tracking and Observability](0015-error-tracking-and-observability.md)
- [ADR-004: Standardized Structured Logging](0004-standardized-application-wide-structured-logging.md)
- [PostgreSQL RAISE Documentation](https://www.postgresql.org/docs/current/plpgsql-errors-and-messages.html)
- [PostgreSQL Logging Configuration](https://www.postgresql.org/docs/current/runtime-config-logging.html)

View File

@@ -2,7 +2,9 @@
**Date**: 2026-01-11
**Status**: Proposed
**Status**: Accepted (Fully Implemented)
**Related**: [ADR-004](0004-standardized-application-wide-structured-logging.md)
## Context
@@ -17,7 +19,9 @@ We will adopt a namespace-based debug filter pattern, similar to the `debug` npm
## Implementation
In `src/services/logger.server.ts`:
### Core Implementation (Completed 2026-01-11)
Implemented in [src/services/logger.server.ts:140-150](src/services/logger.server.ts#L140-L150):
```typescript
const debugModules = (process.env.DEBUG_MODULES || '').split(',').map((s) => s.trim());
@@ -33,10 +37,100 @@ export const createScopedLogger = (moduleName: string) => {
};
```
### Adopted Services (Completed 2026-01-26)
Services currently using `createScopedLogger`:
- `ai-service` - AI/Gemini integration ([src/services/aiService.server.ts:1020](src/services/aiService.server.ts#L1020))
- `flyer-processing-service` - Flyer upload and processing ([src/services/flyerProcessingService.server.ts:20](src/services/flyerProcessingService.server.ts#L20))
## Usage
To debug only AI and Database interactions:
### Enable Debug Logging for Specific Modules
To debug only AI and flyer processing:
```bash
DEBUG_MODULES=ai-service,db-repo npm run dev
DEBUG_MODULES=ai-service,flyer-processing-service npm run dev
```
### Enable All Debug Logging
Use wildcard to enable debug logging for all modules:
```bash
DEBUG_MODULES=* npm run dev
```
### Common Module Names
| Module Name | Purpose | File |
| -------------------------- | ---------------------------------------- | ----------------------------------------------- |
| `ai-service` | AI/Gemini API interactions | `src/services/aiService.server.ts` |
| `flyer-processing-service` | Flyer upload, validation, and processing | `src/services/flyerProcessingService.server.ts` |
## Best Practices
1. **Use Scoped Loggers for Long-Running Services**: Services with complex workflows or external API calls should use `createScopedLogger` to allow targeted debugging.
2. **Use Child Loggers for Contextual Data**: Even within scoped loggers, create child loggers with job/request-specific context:
```typescript
const logger = createScopedLogger('my-service');
async function processJob(job: Job) {
const jobLogger = logger.child({ jobId: job.id, jobName: job.name });
jobLogger.debug('Starting job processing');
}
```
3. **Module Naming Convention**: Use kebab-case suffixed with `-service` or `-worker` (e.g., `ai-service`, `email-worker`).
4. **Production Usage**: `DEBUG_MODULES` can be set in production for temporary debugging, but should not be used continuously due to increased log volume.
## Examples
### Development Debugging
Debug AI service issues during development:
```bash
# Dev container
DEBUG_MODULES=ai-service npm run dev
# Or via PM2
DEBUG_MODULES=ai-service pm2 restart flyer-crawler-api-dev
```
### Production Troubleshooting
Temporarily enable debug logging for a specific subsystem:
```bash
# SSH into production server
ssh root@projectium.com
# Set environment variable and restart
DEBUG_MODULES=ai-service pm2 restart flyer-crawler-api
# View logs
pm2 logs flyer-crawler-api --lines 100
# Disable debug logging
pm2 unset DEBUG_MODULES flyer-crawler-api
pm2 restart flyer-crawler-api
```
## Consequences
**Positive**:
- Developers can inspect detailed logs for specific subsystems without log flooding
- Production debugging becomes more targeted and efficient
- No performance impact when debug logging is disabled
- Compatible with existing Pino logging infrastructure
**Negative**:
- Requires developers to know module names (mitigated by documentation above)
- Not all services have adopted scoped loggers yet (gradual migration)

View File

@@ -2,7 +2,14 @@
**Date**: 2026-01-11
**Status**: Proposed
**Status**: Accepted (Fully Implemented)
**Implementation Status**:
- ✅ BullMQ worker stall configuration (complete)
- ✅ Basic health endpoints (/live, /ready, /redis, etc.)
- ✅ /health/queues endpoint (complete)
- ✅ Worker heartbeat mechanism (complete)
## Context
@@ -60,3 +67,76 @@ The `/health/queues` endpoint will:
**Negative**:
- Requires configuring external monitoring to poll the new endpoint.
## Implementation Notes
### Completed (2026-01-11)
1. **BullMQ Stall Configuration** - `src/config/workerOptions.ts`
- All workers use `defaultWorkerOptions` with:
- `stalledInterval: 30000` (30s)
- `maxStalledCount: 3`
- `lockDuration: 30000` (30s)
- Applied to all 9 workers: flyer, email, analytics, cleanup, weekly-analytics, token-cleanup, receipt, expiry-alert, barcode
2. **Basic Health Endpoints** - `src/routes/health.routes.ts`
- `/health/live` - Liveness probe
- `/health/ready` - Readiness probe (checks DB, Redis, storage)
- `/health/startup` - Startup probe
- `/health/redis` - Redis connectivity
- `/health/db-pool` - Database connection pool status
### Implementation Completed (2026-01-26)
1. **`/health/queues` Endpoint** ✅
- Added route to `src/routes/health.routes.ts:511-674`
- Iterates through all 9 queues from `src/services/queues.server.ts`
- Fetches job counts using BullMQ Queue API: `getJobCounts()`
- Returns structured response including both queue metrics and worker heartbeats:
```typescript
{
status: 'healthy' | 'unhealthy',
timestamp: string,
queues: {
[queueName]: {
waiting: number,
active: number,
failed: number,
delayed: number
}
},
workers: {
[workerName]: {
alive: boolean,
lastSeen?: string,
pid?: number,
host?: string
}
}
}
```
- Returns 200 OK if all healthy, 503 if any queue/worker unavailable
- Full OpenAPI documentation included
2. **Worker Heartbeat Mechanism** ✅
- Added `updateWorkerHeartbeat()` and `startWorkerHeartbeat()` in `src/services/workers.server.ts:100-149`
- Key pattern: `worker:heartbeat:<worker-name>`
- Stores: `{ timestamp: ISO8601, pid: number, host: string }`
- Updates every 30s with 90s TTL
- Integrated with `/health/queues` endpoint (checks if heartbeat < 60s old)
- Heartbeat intervals properly cleaned up in `closeWorkers()` and `gracefulShutdown()`
3. **Comprehensive Tests** ✅
- Added 5 test cases in `src/routes/health.routes.test.ts:623-858`
- Tests cover: healthy state, queue failures, stale heartbeats, missing heartbeats, Redis errors
- All tests follow existing patterns with proper mocking
### Future Enhancements (Not Implemented)
1. **Queue Depth Alerting** (Low Priority)
- Add configurable thresholds per queue type
- Return 500 if `waiting` count exceeds threshold for extended period
- Consider using Redis for storing threshold breach timestamps
- **Estimate**: 1-2 hours

View File

@@ -332,6 +332,6 @@ Response:
## References
- [ADR-006: Background Job Processing](./0006-background-job-processing-and-task-queues.md)
- [ADR-015: Application Performance Monitoring](./0015-application-performance-monitoring-and-error-tracking.md)
- [ADR-015: Error Tracking and Observability](./0015-error-tracking-and-observability.md)
- [Bugsink API Documentation](https://bugsink.com/docs/api/)
- [Gitea API Documentation](https://docs.gitea.io/en-us/api-usage/)

View File

@@ -1,4 +1,4 @@
# ADR-023: Database Normalization and Referential Integrity
# ADR-055: Database Normalization and Referential Integrity
**Date:** 2026-01-19
**Status:** Accepted

View File

@@ -0,0 +1,262 @@
# ADR-056: Application Performance Monitoring (APM)
**Date**: 2026-01-26
**Status**: Proposed
**Related**: [ADR-015](./0015-error-tracking-and-observability.md) (Error Tracking and Observability)
## Context
Application Performance Monitoring (APM) provides visibility into application behavior through:
- **Distributed Tracing**: Track requests across services, queues, and database calls
- **Performance Metrics**: Response times, throughput, error rates
- **Resource Monitoring**: Memory usage, CPU, database connections
- **Transaction Analysis**: Identify slow endpoints and bottlenecks
While ADR-015 covers error tracking and observability, APM is a distinct concern focused on performance rather than errors. The Sentry SDK supports APM through its tracing features, but this capability is currently **intentionally disabled** in our application.
### Current State
The Sentry SDK is installed and configured for error tracking (see ADR-015), but APM features are disabled:
```typescript
// src/services/sentry.client.ts
Sentry.init({
dsn: config.sentry.dsn,
environment: config.sentry.environment,
// Performance monitoring - disabled for now to keep it simple
tracesSampleRate: 0,
// ...
});
```
```typescript
// src/services/sentry.server.ts
Sentry.init({
dsn: config.sentry.dsn,
environment: config.sentry.environment || config.server.nodeEnv,
// Performance monitoring - disabled for now to keep it simple
tracesSampleRate: 0,
// ...
});
```
### Why APM is Currently Disabled
1. **Complexity**: APM adds overhead and complexity to debugging
2. **Bugsink Limitations**: Bugsink's APM support is less mature than its error tracking
3. **Resource Overhead**: Tracing adds memory and CPU overhead
4. **Focus**: Error tracking provides more immediate value for our current scale
5. **Cost**: High sample rates can significantly increase storage requirements
## Decision
We propose a **staged approach** to APM implementation:
### Phase 1: Selective Backend Tracing (Low Priority)
Enable tracing for specific high-value operations:
```typescript
// Enable tracing for specific transactions only
Sentry.init({
dsn: config.sentry.dsn,
tracesSampleRate: 0, // Keep default at 0
// Trace only specific high-value transactions
tracesSampler: (samplingContext) => {
const transactionName = samplingContext.transactionContext?.name;
// Always trace flyer processing jobs
if (transactionName?.includes('flyer-processing')) {
return 0.1; // 10% sample rate
}
// Always trace AI/Gemini calls
if (transactionName?.includes('gemini')) {
return 0.5; // 50% sample rate
}
// Trace slow endpoints (determined by custom logic)
if (samplingContext.parentSampled) {
return 0.1; // 10% for child transactions
}
return 0; // Don't trace other transactions
},
});
```
### Phase 2: Custom Performance Metrics
Add custom metrics without full tracing overhead:
```typescript
// Custom metric for slow database queries
import { metrics } from '@sentry/node';
// In repository methods
const startTime = performance.now();
const result = await pool.query(sql, params);
const duration = performance.now() - startTime;
metrics.distribution('db.query.duration', duration, {
tags: { query_type: 'select', table: 'flyers' },
});
if (duration > 1000) {
logger.warn({ duration, sql }, 'Slow query detected');
}
```
### Phase 3: Full APM Integration (Future)
When/if full APM is needed:
```typescript
Sentry.init({
dsn: config.sentry.dsn,
tracesSampleRate: 0.1, // 10% of transactions
profilesSampleRate: 0.1, // 10% of traced transactions get profiled
integrations: [
// Database tracing
Sentry.postgresIntegration(),
// Redis tracing
Sentry.redisIntegration(),
// BullMQ job tracing
Sentry.prismaIntegration(), // or custom BullMQ integration
],
});
```
## Implementation Steps
### To Enable Basic APM
1. **Update Sentry Configuration**:
- Set `tracesSampleRate` > 0 in `src/services/sentry.server.ts`
- Set `tracesSampleRate` > 0 in `src/services/sentry.client.ts`
- Add environment variable `SENTRY_TRACES_SAMPLE_RATE` (default: 0)
2. **Add Instrumentation**:
- Enable automatic Express instrumentation
- Add manual spans for BullMQ job processing
- Add database query instrumentation
3. **Frontend Tracing**:
- Add Browser Tracing integration
- Configure page load and navigation tracing
4. **Environment Variables**:
```bash
SENTRY_TRACES_SAMPLE_RATE=0.1 # 10% sampling
SENTRY_PROFILES_SAMPLE_RATE=0 # Profiling disabled
```
5. **Bugsink Configuration**:
- Verify Bugsink supports performance data ingestion
- Configure retention policies for performance data
### Configuration Changes Required
```typescript
// src/config/env.ts - Add new config
sentry: {
dsn: env.SENTRY_DSN,
environment: env.SENTRY_ENVIRONMENT,
debug: env.SENTRY_DEBUG === 'true',
tracesSampleRate: parseFloat(env.SENTRY_TRACES_SAMPLE_RATE || '0'),
profilesSampleRate: parseFloat(env.SENTRY_PROFILES_SAMPLE_RATE || '0'),
},
```
```typescript
// src/services/sentry.server.ts - Updated init
Sentry.init({
dsn: config.sentry.dsn,
environment: config.sentry.environment,
tracesSampleRate: config.sentry.tracesSampleRate,
profilesSampleRate: config.sentry.profilesSampleRate,
// ... rest of config
});
```
## Trade-offs
### Enabling APM
**Benefits**:
- Identify performance bottlenecks
- Track distributed transactions across services
- Profile slow endpoints
- Monitor resource utilization trends
**Costs**:
- Increased memory usage (~5-15% overhead)
- Additional CPU for trace processing
- Increased storage in Bugsink/Sentry
- More complex debugging (noise in traces)
- Potential latency from tracing overhead
### Keeping APM Disabled
**Benefits**:
- Simpler operation and debugging
- Lower resource overhead
- Focused on error tracking (higher priority)
- No additional storage costs
**Costs**:
- No automated performance insights
- Manual profiling required for bottleneck detection
- Limited visibility into slow transactions
## Alternatives Considered
1. **OpenTelemetry**: More vendor-neutral, but adds another dependency and complexity
2. **Prometheus + Grafana**: Good for metrics, but doesn't provide distributed tracing
3. **Jaeger/Zipkin**: Purpose-built for tracing, but requires additional infrastructure
4. **New Relic/Datadog SaaS**: Full-featured but conflicts with self-hosted requirement
## Current Recommendation
**Keep APM disabled** (`tracesSampleRate: 0`) until:
1. Specific performance issues are identified that require tracing
2. Bugsink's APM support is verified and tested
3. Infrastructure can support the additional overhead
4. There is a clear business need for performance visibility
When enabling APM becomes necessary, start with Phase 1 (selective tracing) to minimize overhead while gaining targeted insights.
## Consequences
### Positive (When Implemented)
- Automated identification of slow endpoints
- Distributed trace visualization across async operations
- Correlation between errors and performance issues
- Proactive alerting on performance degradation
### Negative
- Additional infrastructure complexity
- Storage overhead for trace data
- Potential performance impact from tracing itself
- Learning curve for trace analysis
## References
- [Sentry Performance Monitoring](https://docs.sentry.io/product/performance/)
- [@sentry/node Performance](https://docs.sentry.io/platforms/javascript/guides/node/performance/)
- [@sentry/react Performance](https://docs.sentry.io/platforms/javascript/guides/react/performance/)
- [OpenTelemetry](https://opentelemetry.io/) (alternative approach)
- [ADR-015: Error Tracking and Observability](./0015-error-tracking-and-observability.md)

View File

@@ -0,0 +1,367 @@
# ADR-057: Test Remediation Post-API Versioning and Frontend Rework
**Date**: 2026-01-28
**Status**: Accepted
**Context**: Major test remediation effort completed after ADR-008 API versioning implementation and frontend style rework
## Context
Following the completion of ADR-008 Phase 2 (API Versioning Strategy) and a concurrent frontend style/design rework, the test suite experienced 105 test failures across unit tests and E2E tests. This ADR documents the systematic remediation effort, root cause analysis, and lessons learned to prevent similar issues in future migrations.
### Scope of Failures
| Test Type | Failures | Total Tests | Pass Rate After Fix |
| ---------- | -------- | ----------- | ------------------- |
| Unit Tests | 69 | 3,392 | 100% |
| E2E Tests | 36 | 36 | 100% |
| **Total** | **105** | **3,428** | **100%** |
### Root Causes Identified
The failures were categorized into six distinct categories:
1. **API Versioning Path Mismatches** (71 failures)
- Test files using `/api/` instead of `/api/v1/`
- Environment variables not set for API base URL
- Integration and E2E tests calling unversioned endpoints
2. **Dark Mode Class Assertion Failures** (8 failures)
- Frontend rework changed Tailwind dark mode utility classes
- Test assertions checking for outdated class names
3. **Selected Item Styling Changes** (6 failures)
- Component styling refactored to new design tokens
- Test assertions expecting old CSS class combinations
4. **Admin-Only Component Visibility** (12 failures)
- MainLayout tests not properly mocking admin role
- ActivityLog component visibility tied to role-based access
5. **Mock Hoisting Issues** (5 failures)
- Queue mocks not available during module initialization
- Vitest's module hoisting order causing mock setup failures
6. **Error Log Path Hardcoding** (3 failures)
- Route handlers logging hardcoded paths like `/api/flyers`
- Test assertions expecting versioned paths `/api/v1/flyers`
## Decision
We implemented a systematic remediation approach addressing each failure category with targeted fixes while establishing patterns to prevent regression.
### 1. API Versioning Configuration Updates
**Files Modified**:
- `vite.config.ts`
- `vitest.config.e2e.ts`
- `vitest.config.integration.ts`
**Pattern Applied**: Centralize API base URL in Vitest environment variables
```typescript
// vite.config.ts - Unit test configuration
test: {
env: {
// ADR-008: Ensure API versioning is correctly set for unit tests
VITE_API_BASE_URL: '/api/v1',
},
// ...
}
// vitest.config.e2e.ts - E2E test configuration
test: {
env: {
// ADR-008: API versioning - all routes use /api/v1 prefix
VITE_API_BASE_URL: 'http://localhost:3098/api/v1',
},
// ...
}
// vitest.config.integration.ts - Integration test configuration
test: {
env: {
// ADR-008: API versioning - all routes use /api/v1 prefix
VITE_API_BASE_URL: 'http://localhost:3099/api/v1',
},
// ...
}
```
### 2. E2E Test URL Path Updates
**Files Modified** (7 files, 31 URL occurrences):
- `src/tests/e2e/budget-journey.e2e.test.ts`
- `src/tests/e2e/deals-journey.e2e.test.ts`
- `src/tests/e2e/flyer-upload.e2e.test.ts`
- `src/tests/e2e/inventory-journey.e2e.test.ts`
- `src/tests/e2e/receipt-journey.e2e.test.ts`
- `src/tests/e2e/upc-journey.e2e.test.ts`
- `src/tests/e2e/user-journey.e2e.test.ts`
**Pattern Applied**: Update all hardcoded API paths to versioned endpoints
```typescript
// Before
const response = await getRequest().post('/api/auth/register').send({...});
// After
const response = await getRequest().post('/api/v1/auth/register').send({...});
```
### 3. Unit Test Assertion Updates for UI Changes
**Files Modified**:
- `src/features/flyer/FlyerDisplay.test.tsx`
- `src/features/flyer/FlyerList.test.tsx`
**Pattern Applied**: Update CSS class assertions to match new design system
```typescript
// FlyerDisplay.test.tsx - Dark mode class update
// Before
expect(image).toHaveClass('dark:brightness-75');
// After
expect(image).toHaveClass('dark:brightness-90');
// FlyerList.test.tsx - Selected item styling update
// Before
expect(selectedItem).toHaveClass('ring-2', 'ring-brand-primary');
// After
expect(selectedItem).toHaveClass('border-brand-primary', 'bg-teal-50/50', 'dark:bg-teal-900/10');
```
### 4. Admin-Only Component Test Separation
**File Modified**: `src/layouts/MainLayout.test.tsx`
**Pattern Applied**: Separate test cases for admin vs. regular user visibility
```typescript
describe('for authenticated users', () => {
beforeEach(() => {
mockedUseAuth.mockReturnValue({
...defaultUseAuthReturn,
authStatus: 'AUTHENTICATED',
userProfile: createMockUserProfile({ user: mockUser }),
});
});
it('renders auth-gated components for regular users (PriceHistoryChart, Leaderboard)', () => {
renderWithRouter(<MainLayout {...defaultProps} />);
expect(screen.getByTestId('price-history-chart')).toBeInTheDocument();
expect(screen.getByTestId('leaderboard')).toBeInTheDocument();
// ActivityLog is admin-only, should NOT be present for regular users
expect(screen.queryByTestId('activity-log')).not.toBeInTheDocument();
});
it('renders ActivityLog for admin users', () => {
mockedUseAuth.mockReturnValue({
...defaultUseAuthReturn,
authStatus: 'AUTHENTICATED',
userProfile: createMockUserProfile({ user: mockUser, role: 'admin' }),
});
renderWithRouter(<MainLayout {...defaultProps} />);
expect(screen.getByTestId('activity-log')).toBeInTheDocument();
});
});
```
### 5. vi.hoisted() Pattern for Queue Mocks
**File Modified**: `src/routes/health.routes.test.ts`
**Pattern Applied**: Use `vi.hoisted()` to ensure mocks are available during module hoisting
```typescript
// Use vi.hoisted to create mock queue objects that are available during vi.mock hoisting.
// This ensures the mock objects exist when the factory function runs.
const { mockQueuesModule } = vi.hoisted(() => {
// Helper function to create a mock queue object with vi.fn()
const createMockQueue = () => ({
getJobCounts: vi.fn().mockResolvedValue({
waiting: 0,
active: 0,
failed: 0,
delayed: 0,
}),
});
return {
mockQueuesModule: {
flyerQueue: createMockQueue(),
emailQueue: createMockQueue(),
// ... additional queues
},
};
});
// Mock the queues.server module BEFORE the health router imports it.
vi.mock('../services/queues.server', () => mockQueuesModule);
// Import the router AFTER all mocks are defined.
import healthRouter from './health.routes';
```
### 6. Dynamic Error Log Paths
**Pattern Applied**: Use `req.originalUrl` instead of hardcoded paths in error handlers
```typescript
// Before (INCORRECT - hardcoded path)
req.log.error({ error }, 'Error in /api/flyers/:id:');
// After (CORRECT - dynamic path)
req.log.error({ error }, `Error in ${req.originalUrl.split('?')[0]}:`);
```
## Implementation Summary
### Files Modified (14 total)
| Category | Files | Changes |
| -------------------- | ----- | ------------------------------------------------- |
| Vitest Configuration | 3 | Added `VITE_API_BASE_URL` environment variables |
| E2E Tests | 7 | Updated 31 API endpoint URLs |
| Unit Tests | 4 | Updated assertions for UI, mocks, and admin roles |
### Verification Results
After remediation, all tests pass in the dev container environment:
```text
Unit Tests: 3,392 passing
E2E Tests: 36 passing
Integration: 345/348 passing (3 known issues, unrelated)
Type Check: Passing
```
## Consequences
### Positive
1. **Test Suite Stability**: All tests now pass consistently in the dev container
2. **API Versioning Compliance**: Tests enforce the `/api/v1/` path requirement
3. **Pattern Documentation**: Clear patterns established for future test maintenance
4. **Separation of Concerns**: Admin vs. user test cases properly separated
5. **Mock Reliability**: `vi.hoisted()` pattern prevents mock timing issues
### Negative
1. **Maintenance Overhead**: Future API version changes will require test updates
2. **Manual Migration**: No automated tool to update test paths during versioning
### Neutral
1. **Test Execution Time**: No significant impact on test execution duration
2. **Coverage Metrics**: Coverage percentages unchanged
## Best Practices Established
### 1. API Versioning in Tests
**Always use versioned API paths in tests**:
```typescript
// Good
const response = await request.get('/api/v1/users/profile');
// Bad
const response = await request.get('/api/users/profile');
```
**Configure environment variables centrally in Vitest configs** rather than in individual test files.
### 2. vi.hoisted() for Module-Level Mocks
When mocking modules that are imported at the top level of other modules:
```typescript
// Pattern: Define mocks with vi.hoisted() BEFORE vi.mock() calls
const { mockModule } = vi.hoisted(() => ({
mockModule: {
someFunction: vi.fn(),
},
}));
vi.mock('./some-module', () => mockModule);
// Import AFTER mocks
import { something } from './module-that-imports-some-module';
```
### 3. Testing Conditional Component Rendering
When testing components that render differently based on user role:
1. Create separate `describe` blocks for each role
2. Set up role-specific mocks in `beforeEach`
3. Explicitly test both presence AND absence of role-gated components
### 4. CSS Class Assertions After UI Refactors
After frontend style changes:
1. Review component implementation for new class names
2. Update test assertions to match actual CSS classes
3. Consider using partial matching for complex class combinations:
```typescript
// Flexible matching for Tailwind classes
expect(element).toHaveClass('border-brand-primary');
// vs exact matching
expect(element).toHaveClass('border-brand-primary', 'bg-teal-50/50', 'dark:bg-teal-900/10');
```
### 5. Error Logging Paths
**Always use dynamic paths in error logs**:
```typescript
// Pattern: Use req.originalUrl for request path logging
req.log.error({ error }, `Error in ${req.originalUrl.split('?')[0]}:`);
```
This ensures error logs reflect the actual request URL including version prefixes.
## Migration Checklist for Future API Version Changes
When implementing a new API version (e.g., v2), follow this checklist:
- [ ] Update `vite.config.ts` test environment `VITE_API_BASE_URL`
- [ ] Update `vitest.config.e2e.ts` test environment `VITE_API_BASE_URL`
- [ ] Update `vitest.config.integration.ts` test environment `VITE_API_BASE_URL`
- [ ] Search and replace `/api/v1/` with `/api/v2/` in E2E test files
- [ ] Search and replace `/api/v1/` with `/api/v2/` in integration test files
- [ ] Verify route handler error logs use `req.originalUrl`
- [ ] Run full test suite in dev container to verify
**Search command for finding hardcoded paths**:
```bash
grep -r "/api/v1/" src/tests/
grep -r "'/api/" src/routes/*.ts
```
## Related ADRs
- [ADR-008](./0008-api-versioning-strategy.md) - API Versioning Strategy
- [ADR-010](./0010-testing-strategy-and-standards.md) - Testing Strategy and Standards
- [ADR-014](./0014-containerization-and-deployment-strategy.md) - Platform: Linux Only
- [ADR-040](./0040-testing-economics-and-priorities.md) - Testing Economics and Priorities
- [ADR-012](./0012-frontend-component-library-and-design-system.md) - Frontend Component Library
## Key Files
| File | Purpose |
| ------------------------------ | -------------------------------------------- |
| `vite.config.ts` | Unit test environment configuration |
| `vitest.config.e2e.ts` | E2E test environment configuration |
| `vitest.config.integration.ts` | Integration test environment configuration |
| `src/tests/e2e/*.e2e.test.ts` | E2E test files with versioned API paths |
| `src/routes/*.routes.test.ts` | Route test files with `vi.hoisted()` pattern |
| `docs/development/TESTING.md` | Testing guide with best practices |

View File

@@ -0,0 +1,517 @@
# ADR-0042: Browser Test Performance Optimization
**Status**: Accepted
**Date**: 2026-02-10
**Authors**: Claude Code AI Agent
## Context
### Current State
The stock-alert project has 64 Playwright browser tests across 5 spec files taking approximately 240 seconds (~4 minutes) to execute. Analysis reveals three major performance bottlenecks:
| Metric | Count | Impact |
| ----------------------------------- | ----- | -------------------------------------------- |
| Hardcoded `waitForTimeout()` calls | 66 | ~120s cumulative wait time |
| Redundant login calls per test | 43 | ~2-3s each = 86-129s overhead |
| Visual regression tests blocking CI | 4 | Cannot run in parallel with functional tests |
### Test Distribution
| File | Tests | `waitForTimeout` Calls | `login()` Calls |
| ------------------- | ------ | ---------------------- | ------------------------ |
| `dashboard.spec.js` | 10 | 8 | 10 |
| `alerts.spec.js` | 14 | 25 | 1 (beforeEach) |
| `gaps.spec.js` | 20 | 29 | 1 (beforeEach) |
| `login.spec.js` | 11 | 4 | 0 (tests login itself) |
| `visual.spec.js` | 4 | 0 | 4 (via navigateWithAuth) |
| **Total** | **59** | **66** | **16 patterns** |
### Root Causes
1. **Anti-Pattern: Hardcoded Timeouts**
- `waitForTimeout(2000)` used to "wait for data to load"
- Unnecessarily slow on fast systems, flaky on slow systems
- No correlation to actual page readiness
2. **Anti-Pattern: Per-Test Authentication**
- Each test navigates to `/login`, enters password, submits
- Session cookie persists across requests but not across tests
- `beforeEach` login adds 2-3 seconds per test
3. **Architecture: Mixed Test Types**
- Visual regression tests require different infrastructure (baseline images)
- Functional tests and visual tests compete for worker slots
- Cannot optimize CI parallelization
### Requirements
1. Reduce test suite runtime by 40-55%
2. Improve test determinism (eliminate flakiness)
3. Maintain test coverage and reliability
4. Enable parallel CI execution where possible
5. Document patterns for other projects
## Decision
Implement three optimization phases:
### Phase 1: Event-Based Wait Replacement (Primary Impact: ~50% of time savings)
Replace all 66 `waitForTimeout()` calls with Playwright's event-based waiting APIs.
**Replacement Patterns:**
| Current Pattern | Replacement | Rationale |
| --------------------------------------- | ------------------------------------------------- | ----------------------------- |
| `waitForTimeout(2000)` after navigation | `waitForLoadState('networkidle')` | Waits for network quiescence |
| `waitForTimeout(1000)` after click | `waitForSelector('.result')` | Waits for specific DOM change |
| `waitForTimeout(3000)` for charts | `waitForSelector('canvas', { state: 'visible' })` | Waits for chart render |
| `waitForTimeout(500)` for viewport | `waitForFunction(() => ...)` | Waits for layout reflow |
**Implementation Examples:**
```javascript
// BEFORE: Hardcoded timeout
await page.goto('/alerts');
await page.waitForTimeout(2000);
const rows = await page.locator('tbody tr').count();
// AFTER: Event-based wait
await page.goto('/alerts');
await page.waitForLoadState('networkidle');
await page.waitForSelector('tbody tr', { state: 'attached' });
const rows = await page.locator('tbody tr').count();
```
```javascript
// BEFORE: Hardcoded timeout after action
await page.click('#runCheckBtn');
await page.waitForTimeout(2000);
// AFTER: Wait for response
const [response] = await Promise.all([
page.waitForResponse((resp) => resp.url().includes('/api/check')),
page.click('#runCheckBtn'),
]);
```
**Helper Function Addition to `helpers.js`:**
```javascript
/**
* Waits for page to be fully loaded with data.
* Replaces hardcoded waitForTimeout calls.
*/
async function waitForPageReady(page, options = {}) {
const { dataSelector = null, networkIdle = true, minTime = 0 } = options;
const promises = [];
if (networkIdle) {
promises.push(page.waitForLoadState('networkidle'));
}
if (dataSelector) {
promises.push(page.waitForSelector(dataSelector, { state: 'visible' }));
}
if (minTime > 0) {
promises.push(page.waitForTimeout(minTime)); // Escape hatch for animations
}
await Promise.all(promises);
}
```
**Estimated Time Savings:** 60-80 seconds (eliminates ~120s of cumulative waits, but event waits have overhead)
### Phase 2: Global Authentication Setup (Primary Impact: ~35% of time savings)
Share authenticated session across all tests using Playwright's global setup feature.
**Architecture:**
```
┌──────────────────┐
│ global-setup.js │
│ │
│ 1. Login once │
│ 2. Save storage │
└────────┬─────────┘
┌──────────────────────┼──────────────────────┐
│ │ │
▼ ▼ ▼
┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐
│ dashboard.spec │ │ alerts.spec │ │ gaps.spec │
│ (reuses auth) │ │ (reuses auth) │ │ (reuses auth) │
└─────────────────┘ └─────────────────┘ └─────────────────┘
```
**Implementation Files:**
**`tests/browser/global-setup.js`:**
```javascript
const { chromium } = require('@playwright/test');
const path = require('path');
const authFile = path.join(__dirname, '.auth', 'user.json');
module.exports = async function globalSetup() {
const browser = await chromium.launch();
const page = await browser.newPage();
// Only perform login if authentication is enabled
if (process.env.DASHBOARD_PASSWORD) {
await page.goto(process.env.PLAYWRIGHT_BASE_URL || 'http://localhost:8980');
// Perform login
await page.goto('/login');
await page.fill('#password', process.env.DASHBOARD_PASSWORD);
await page.click('button[type="submit"]');
await page.waitForURL('/');
// Save authentication state
await page.context().storageState({ path: authFile });
}
await browser.close();
};
```
**`playwright.config.js` Updates:**
```javascript
module.exports = defineConfig({
// ... existing config ...
// Global setup runs once before all tests
globalSetup: require.resolve('./tests/browser/global-setup.js'),
projects: [
{
name: 'chromium',
use: {
...devices['Desktop Chrome'],
// Reuse authentication state from global setup
storageState: './tests/browser/.auth/user.json',
},
},
],
});
```
**Test File Updates:**
```javascript
// BEFORE: Login in beforeEach
test.beforeEach(async ({ page }) => {
page.consoleErrors = captureConsoleErrors(page);
if (isAuthEnabled()) {
await login(page);
}
});
// AFTER: Remove login (handled by global setup)
test.beforeEach(async ({ page }) => {
page.consoleErrors = captureConsoleErrors(page);
// Authentication already applied via storageState
});
```
**Estimated Time Savings:** 80-100 seconds (43 logins x ~2-3s each, minus 3s for global setup)
### Phase 3: Visual Test Separation (Primary Impact: CI parallelization)
Separate visual regression tests into a dedicated project for parallel CI execution.
**Project Configuration:**
```javascript
// playwright.config.js
module.exports = defineConfig({
projects: [
// Functional tests - fast, event-based
{
name: 'functional',
testMatch: /^(?!.*visual).*\.spec\.js$/,
use: {
...devices['Desktop Chrome'],
storageState: './tests/browser/.auth/user.json',
},
},
// Visual tests - separate baseline management
{
name: 'visual',
testMatch: '**/visual.spec.js',
use: {
...devices['Desktop Chrome'],
storageState: './tests/browser/.auth/user.json',
},
// Different snapshot handling
snapshotPathTemplate: '{testDir}/__screenshots__/{projectName}/{testFilePath}/{arg}{ext}',
},
],
});
```
**CI Pipeline Updates:**
```yaml
# .gitea/workflows/test.yml
jobs:
browser-functional:
runs-on: ubuntu-latest
steps:
- run: npx playwright test --project=functional
browser-visual:
runs-on: ubuntu-latest
steps:
- run: npx playwright test --project=visual
```
**Estimated Time Savings:** 30-45 seconds (parallel execution vs sequential)
## Implementation Schedule
### Critical Path (Estimated 8-12 hours)
```
Phase 1 (Event Waits) ████████████████ [4-6h]
Phase 2 (Global Auth) ████████ [2-3h]
Phase 3 (Visual Separation) ████ [2-3h]
```
### Effort Summary
| Phase | Min Hours | Max Hours | Expected Savings |
| ------------------------ | --------- | --------- | --------------------- |
| 1. Event-Based Waits | 4 | 6 | 60-80s (25-33%) |
| 2. Global Authentication | 2 | 3 | 80-100s (33-42%) |
| 3. Visual Separation | 2 | 3 | 30-45s (CI parallel) |
| **Total** | **8** | **12** | **170-225s (70-94%)** |
### Expected Results
| Metric | Before | After | Improvement |
| ------------------ | ------ | --------- | ------------- |
| Total Runtime | 240s | 110-140s | 42-54% faster |
| Flaky Test Rate | ~5% | <1% | 80% reduction |
| CI Parallelization | None | 2 workers | 2x throughput |
| Login Operations | 43 | 1 | 98% reduction |
| Hardcoded Waits | 66 | <5 | 92% reduction |
## Consequences
### Positive
1. **Performance**: 40-55% reduction in test runtime
2. **Reliability**: Event-based waits eliminate timing flakiness
3. **Scalability**: Global setup pattern scales to N tests with O(1) login cost
4. **CI Efficiency**: Parallel visual tests enable faster feedback loops
5. **Maintainability**: Centralized auth logic reduces code duplication
6. **Transferable Knowledge**: Patterns applicable to any Playwright project
### Negative
1. **Initial Migration Effort**: 8-12 hours of refactoring
2. **Learning Curve**: Team must understand Playwright wait APIs
3. **Global Setup Complexity**: Adds shared state between tests
4. **Debugging Harder**: Shared auth can mask test isolation issues
### Mitigations
| Risk | Mitigation |
| ------------------ | ------------------------------------------------------------- |
| Global setup fails | Add retry logic; fallback to per-test login |
| Event waits flaky | Keep small timeout buffer (100ms) as escape hatch |
| Visual tests drift | Separate baseline management per environment |
| Test isolation | Run `--project=functional` without auth for isolation testing |
### Neutral
- Test count unchanged (59 tests)
- Coverage unchanged
- Visual baselines unchanged (path changes only)
## Alternatives Considered
### Alternative 1: Reduce Test Count
**Rejected:** Sacrifices coverage for speed. Tests exist for a reason.
### Alternative 2: Increase Worker Parallelism
**Rejected:** Server cannot handle >2 concurrent sessions reliably; creates resource contention.
### Alternative 3: Use `page.waitForTimeout()` with Shorter Durations
**Rejected:** Addresses symptom, not root cause. Still creates timing-dependent tests.
### Alternative 4: Cookie Injection Instead of Login
**Rejected:** Requires reverse-engineering session format; brittle if auth changes.
### Alternative 5: HTTP API Authentication (No Browser)
**Rejected:** Loses browser session behavior validation; tests login flow.
## Implementation Details
### Wait Replacement Mapping
| File | Current Timeouts | Replacement Strategy |
| ------------------- | ---------------------- | ---------------------------------------------------------------------- |
| `dashboard.spec.js` | 1000ms, 2000ms, 3000ms | `waitForSelector` for charts, `waitForLoadState` for navigation |
| `alerts.spec.js` | 500ms, 1000ms, 2000ms | `waitForResponse` for API calls, `waitForSelector` for table rows |
| `gaps.spec.js` | 500ms, 1000ms, 2000ms | `waitForResponse` for `/api/gaps`, `waitForSelector` for summary cards |
| `login.spec.js` | 500ms, 2000ms | `waitForURL` for redirects, `waitForSelector` for error messages |
### Common Wait Patterns for This Codebase
| Scenario | Recommended Pattern | Example |
| --------------------- | ------------------------------------------------- | ------------------------- |
| After page navigation | `waitForLoadState('networkidle')` | Loading dashboard data |
| After button click | `waitForResponse()` + `waitForSelector()` | Run Check button |
| After filter change | `waitForResponse(/api\/.*/)` | Status filter dropdown |
| For chart rendering | `waitForSelector('canvas', { state: 'visible' })` | Chart cards |
| For modal appearance | `waitForSelector('.modal', { state: 'visible' })` | Confirmation dialogs |
| For layout change | `waitForFunction()` | Responsive viewport tests |
### Auth Storage Structure
```
tests/browser/
├── .auth/
│ └── user.json # Generated by global-setup, gitignored
├── global-setup.js # Creates user.json
├── dashboard.spec.js # Uses storageState
├── alerts.spec.js
├── gaps.spec.js
├── login.spec.js # Tests login itself, may need special handling
└── visual.spec.js
```
**`.gitignore` Addition:**
```
tests/browser/.auth/
```
### Login.spec.js Special Handling
`login.spec.js` tests the login flow itself and must NOT use the shared auth state:
```javascript
// playwright.config.js
projects: [
{
name: 'functional',
testMatch: /^(?!.*login).*\.spec\.js$/,
use: { storageState: './tests/browser/.auth/user.json' },
},
{
name: 'login',
testMatch: '**/login.spec.js',
use: { storageState: undefined }, // No auth - tests login flow
},
];
```
## Testing the Optimization
### Baseline Measurement
```bash
# Before optimization: establish baseline
time npm run test:browser 2>&1 | tee baseline-timing.log
grep -E "passed|failed|skipped" baseline-timing.log
```
### Incremental Verification
```bash
# After Phase 1: verify wait replacement
npm run test:browser -- --reporter=list 2>&1 | grep -E "passed|failed|slow"
# After Phase 2: verify global auth
npm run test:browser -- --trace on
# Check trace for login occurrences (should be 1)
# After Phase 3: verify parallel execution
npm run test:browser -- --project=functional &
npm run test:browser -- --project=visual &
wait
```
### Success Criteria
| Metric | Target | Measurement |
| ---------------------- | ------ | -------------------------------------- |
| Total runtime | <150s | `time npm run test:browser` |
| Login count | 1 | Grep traces for `/login` navigation |
| Flaky rate | <2% | 50 consecutive CI runs |
| `waitForTimeout` count | <5 | `grep -r waitForTimeout tests/browser` |
## Lessons Learned / Patterns for Other Projects
### Pattern 1: Always Prefer Event-Based Waits
```javascript
// Bad
await page.click('#submit');
await page.waitForTimeout(2000);
expect(await page.title()).toBe('Success');
// Good
await Promise.all([page.waitForNavigation(), page.click('#submit')]);
expect(await page.title()).toBe('Success');
```
### Pattern 2: Global Setup for Authentication
Playwright's `storageState` feature should be the default for any authenticated app:
1. Create `global-setup.js` that performs login once
2. Save cookies/storage to JSON file
3. Configure `storageState` in `playwright.config.js`
4. Tests start authenticated with zero overhead
### Pattern 3: Separate Test Types by Execution Characteristics
| Test Type | Characteristics | Strategy |
| ---------- | ------------------------ | --------------------------------- |
| Functional | Fast, deterministic | Run first, gate deployment |
| Visual | Slow, baseline-dependent | Run in parallel, separate project |
| E2E | Cross-service, slow | Run nightly, separate workflow |
### Pattern 4: Measure Before and After
Always establish baseline metrics before optimization:
```bash
# Essential metrics to capture
time npm run test:browser # Total runtime
grep -c waitForTimeout *.js # Hardcoded wait count
grep -c 'await login' *.js # Login call count
```
## Related ADRs
- [ADR-0031](0031-quality-gates-eslint-playwright.md): Quality Gates - ESLint, Pre-commit Hooks, and Playwright Browser Testing
- [ADR-0035](0035-browser-test-selector-fixes.md): Browser Test Selector Fixes
- [ADR-0008](0008-testing-strategy.md): Testing Strategy
## References
- Playwright Best Practices: https://playwright.dev/docs/best-practices
- Playwright Authentication: https://playwright.dev/docs/auth
- Playwright Wait Strategies: https://playwright.dev/docs/actionability
- Test Files: `tests/browser/*.spec.js`
- Helper Module: `tests/browser/helpers.js`
- Configuration: `playwright.config.js`

View File

@@ -0,0 +1,308 @@
# ADR-059: Dependency Modernization Plan
**Status**: Accepted
**Date**: 2026-02-12
**Implemented**: 2026-02-12
## Context
NPM audit and security scanning identified deprecated dependencies requiring modernization:
| Dependency | Current | Issue | Replacement |
| --------------- | ------- | ----------------------- | --------------------------------------- |
| `swagger-jsdoc` | 6.2.8 | Unmaintained since 2022 | `tsoa` (decorator-based OpenAPI) |
| `rimraf` | 6.1.2 | Legacy cleanup utility | Node.js `fs.rm()` (native since v14.14) |
**Constraints**:
- Existing `@openapi` JSDoc annotations in 20 route files
- ADR-018 compliance (API documentation strategy)
- Zero-downtime migration (phased approach)
- Must maintain Express 5.x compatibility
## Decision
### 1. swagger-jsdoc → tsoa Migration
**Architecture**: tsoa controller classes + Express integration (no replacement of Express routing layer).
```text
Current: Route Files → JSDoc Annotations → swagger-jsdoc → OpenAPI Spec
Future: Controller Classes → @Route/@Get decorators → tsoa → OpenAPI Spec + Route Registration
```
**Controller Pattern**: Base controller providing common utilities:
```typescript
// src/controllers/base.controller.ts
export abstract class BaseController {
protected sendSuccess<T>(res: Response, data: T, status = 200) {
return sendSuccess(res, data, status);
}
protected sendError(
res: Response,
code: ErrorCode,
msg: string,
status: number,
details?: unknown,
) {
return sendError(res, code, msg, status, details);
}
}
```
**Express Integration Strategy**: tsoa generates routes.ts; wrap with Express middleware pipeline:
```typescript
// server.ts integration
import { RegisterRoutes } from './src/generated/routes';
RegisterRoutes(app); // tsoa registers routes with existing Express app
```
### 2. rimraf → fs.rm() Migration
**Change**: Replace `rimraf coverage .coverage` script with Node.js native API.
```json
// package.json (before)
"clean": "rimraf coverage .coverage"
// package.json (after)
"clean": "node -e \"import('fs/promises').then(fs => Promise.all([fs.rm('coverage', {recursive:true,force:true}), fs.rm('.coverage', {recursive:true,force:true})]))\""
```
**Alternative**: Create `scripts/clean.mjs` for maintainability:
```javascript
// scripts/clean.mjs
import { rm } from 'fs/promises';
await Promise.all([
rm('coverage', { recursive: true, force: true }),
rm('.coverage', { recursive: true, force: true }),
]);
```
## Implementation Plan
### Phase 1: Infrastructure (Tasks 1-4)
| Task | Description | Dependencies |
| ---- | ---------------------------------------------- | ------------ |
| 1 | Install tsoa, configure tsoa.json | None |
| 2 | Create BaseController with utility methods | Task 1 |
| 3 | Configure Express integration (RegisterRoutes) | Task 2 |
| 4 | Set up tsoa spec generation in build pipeline | Task 3 |
### Phase 2: Controller Migration (Tasks 5-14)
Priority order matches ADR-018:
| Task | Route File | Controller Class | Dependencies |
| ---- | ----------------------------------------------------------------------------------------------------------------- | ---------------------- | ------------ |
| 5 | health.routes.ts | HealthController | Task 4 |
| 6 | auth.routes.ts | AuthController | Task 4 |
| 7 | gamification.routes.ts | AchievementsController | Task 4 |
| 8 | flyer.routes.ts | FlyersController | Task 4 |
| 9 | user.routes.ts | UsersController | Task 4 |
| 10 | budget.routes.ts | BudgetController | Task 4 |
| 11 | recipe.routes.ts | RecipeController | Task 4 |
| 12 | store.routes.ts | StoreController | Task 4 |
| 13 | admin.routes.ts | AdminController | Task 4 |
| 14 | Remaining routes (deals, price, upc, inventory, ai, receipt, category, stats, personalization, reactions, system) | Various | Task 4 |
### Phase 3: Cleanup and rimraf (Tasks 15-18)
| Task | Description | Dependencies |
| ---- | -------------------------------- | ------------------- |
| 15 | Create scripts/clean.mjs | None |
| 16 | Update package.json clean script | Task 15 |
| 17 | Remove rimraf dependency | Task 16 |
| 18 | Remove swagger-jsdoc + types | Tasks 5-14 complete |
### Phase 4: Verification (Tasks 19-24)
| Task | Description | Dependencies |
| ---- | --------------------------------- | ------------ |
| 19 | Run type-check | Tasks 15-18 |
| 20 | Run unit tests | Task 19 |
| 21 | Run integration tests | Task 20 |
| 22 | Verify OpenAPI spec completeness | Task 21 |
| 23 | Update ADR-018 (reference tsoa) | Task 22 |
| 24 | Update CLAUDE.md (swagger → tsoa) | Task 23 |
### Task Dependency Graph
```text
[1: Install tsoa]
|
[2: BaseController]
|
[3: Express Integration]
|
[4: Build Pipeline]
|
+------------------+------------------+
| | | | |
[5] [6] [7] [8] [9-14]
Health Auth Gamif Flyer Others
| | | | |
+------------------+------------------+
|
[18: Remove swagger-jsdoc]
|
[15: clean.mjs] -----> [16: Update pkg.json]
|
[17: Remove rimraf]
|
[19: type-check]
|
[20: unit tests]
|
[21: integration tests]
|
[22: Verify OpenAPI]
|
[23: Update ADR-018]
|
[24: Update CLAUDE.md]
```
### Critical Path
**Minimum time to completion**: Tasks 1 → 2 → 3 → 4 → 5 (or any controller) → 18 → 19 → 20 → 21 → 22 → 23 → 24
**Parallelization opportunities**:
- Tasks 5-14 (all controller migrations) can run in parallel after Task 4
- Tasks 15-17 (rimraf removal) can run in parallel with controller migrations
## Technical Decisions
### tsoa Configuration
```json
// tsoa.json
{
"entryFile": "server.ts",
"noImplicitAdditionalProperties": "throw-on-extras",
"controllerPathGlobs": ["src/controllers/**/*.controller.ts"],
"spec": {
"outputDirectory": "src/generated",
"specVersion": 3,
"basePath": "/api/v1"
},
"routes": {
"routesDir": "src/generated",
"middleware": "express"
}
}
```
### Decorator Migration Example
**Before** (swagger-jsdoc):
```typescript
/**
* @openapi
* /health/ping:
* get:
* summary: Simple ping endpoint
* tags: [Health]
* responses:
* 200:
* description: Server is responsive
*/
router.get('/ping', validateRequest(emptySchema), handler);
```
**After** (tsoa):
```typescript
@Route('health')
@Tags('Health')
export class HealthController extends BaseController {
@Get('ping')
@SuccessResponse(200, 'Server is responsive')
public async ping(): Promise<{ message: string }> {
return { message: 'pong' };
}
}
```
### Zod Integration
tsoa uses its own validation. Options:
1. **Replace Zod with tsoa validation** - Use `@Body`, `@Query`, `@Path` decorators with TypeScript types
2. **Hybrid approach** - Keep Zod schemas, call `validateRequest()` within controller methods
3. **Custom template** - Generate tsoa routes that call Zod validation middleware
**Recommended**: Option 1 for new controllers; gradually migrate existing Zod schemas.
## Risk Mitigation
| Risk | Likelihood | Impact | Mitigation |
| --------------------------------------- | ---------- | ------ | ------------------------------------------- |
| tsoa/Express 5.x incompatibility | Medium | High | Test in dev container before migration |
| Missing OpenAPI coverage post-migration | Low | Medium | Compare generated specs before/after |
| Authentication middleware integration | Medium | Medium | Test @Security decorator with passport-jwt |
| Test regression from route changes | Low | High | Run full test suite after each controller |
| Build time increase (tsoa generation) | Low | Low | Add to npm run build; cache generated files |
## Consequences
### Positive
- **Type-safe API contracts**: tsoa decorators derive types from TypeScript
- **Reduced duplication**: No more parallel JSDoc + TypeScript type definitions
- **Modern tooling**: Active tsoa community (vs. unmaintained swagger-jsdoc)
- **Native Node.js**: fs.rm() is built-in, no external dependency
- **Smaller dependency tree**: Remove rimraf (5 transitive deps) + swagger-jsdoc (8 transitive deps)
### Negative
- **Learning curve**: Decorator-based controller pattern differs from Express handlers
- **Migration effort**: 20 route files require conversion
- **Generated code**: `src/generated/routes.ts` must be version-controlled or regenerated on build
### Neutral
- **Build step change**: Add `tsoa spec && tsoa routes` to build pipeline
- **Testing approach**: May need to adjust test structure for controller classes
## Alternatives Considered
### 1. Update swagger-jsdoc to fork/successor
**Rejected**: No active fork; community has moved to tsoa, fastify-swagger, or NestJS.
### 2. NestJS migration
**Rejected**: Full framework migration (Express → NestJS) is disproportionate to the problem scope.
### 3. fastify-swagger
**Rejected**: Requires Express → Fastify migration; out of scope.
### 4. Keep rimraf, accept deprecation warning
**Rejected**: Native fs.rm() is trivial replacement; no reason to maintain deprecated dependency.
## Key Files
| File | Purpose |
| ------------------------------------ | ------------------------------------- |
| `tsoa.json` | tsoa configuration |
| `src/controllers/base.controller.ts` | Base controller with utilities |
| `src/controllers/*.controller.ts` | Individual domain controllers |
| `src/generated/routes.ts` | tsoa-generated Express routes |
| `src/generated/swagger.json` | Generated OpenAPI 3.0 spec |
| `scripts/clean.mjs` | Native fs.rm() replacement for rimraf |
## Related ADRs
- [ADR-018](./0018-api-documentation-strategy.md) - API Documentation Strategy (will be updated)
- [ADR-003](./0003-standardized-input-validation-using-middleware.md) - Input Validation (Zod integration)
- [ADR-028](./0028-api-response-standardization.md) - Response Standardization (BaseController pattern)
- [ADR-001](./0001-standardized-error-handling.md) - Error Handling (error utilities in BaseController)

View File

@@ -0,0 +1,471 @@
# ADR-060: TypeScript Test Error Remediation Strategy
**Date**: 2026-02-17
**Status**: Implemented
**Completed**: 2026-02-17
**Context**: Systematic remediation of 185 TypeScript errors in test files following API response standardization (ADR-028) and tsoa migration (ADR-059)
## Implementation Summary
This ADR has been fully implemented. The remediation project achieved:
| Metric | Value |
| -------------------- | ------------------------------------------- |
| Initial Errors | 185 |
| Final Errors | 0 |
| Files Modified | 19 controller test files + shared utilities |
| Test Suite | 4,603 passed, 59 failed (all pre-existing) |
| Net Test Improvement | +3 tests fixed |
### Implementation Phases Completed
| Phase | Duration | Errors Fixed |
| --------------------------- | --------- | ---------------------------------- |
| Phase 1: Foundation | Completed | Infrastructure (enables all fixes) |
| Phase 2-4: Parallel Tasks | 3 rounds | 185 -> 114 -> 67 -> 23 -> 0 |
| Phase 5: Final Verification | Completed | All type-check passes |
### Artifacts Created
1. **Shared Test Utilities** (`src/tests/utils/testHelpers.ts`):
- `asSuccessResponse<T>()` - Type guard for success responses
- `asErrorResponse()` - Type guard for error responses
- `asMock<T>()` - Mock function type casting
- Re-export of `createMockLogger`
2. **Mock Logger** (`src/tests/utils/mockLogger.ts`):
- `createMockLogger()` - Complete Pino logger mock
- `mockLogger` - Pre-instantiated mock for convenience
3. **Updated Mock Factories** (`src/tests/utils/mockFactories.ts`):
- 60+ type-safe mock factory functions
- Deterministic ID generation with `getNextId()`
- Complete type coverage for all domain entities
## Context
Following the implementation of ADR-028 (API Response Standardization) and ADR-059 (tsoa Migration), 185 TypeScript errors accumulated in test files. The errors stem from stricter type checking on API response handling, mock type mismatches, and response body access patterns. This ADR documents the systematic analysis, categorization, and phased remediation approach.
### Error Distribution
| Category | Count | Percentage |
| ------------------------------- | ----- | ---------- |
| SuccessResponse type narrowing | 89 | 48.1% |
| Mock object type casting | 42 | 22.7% |
| Response body property access | 28 | 15.1% |
| Partial mock missing properties | 18 | 9.7% |
| Generic type parameter issues | 5 | 2.7% |
| Module import type issues | 3 | 1.6% |
### Root Cause Analysis
1. **SuccessResponse Discriminated Union**: ADR-028 introduced `ApiSuccessResponse<T> | ApiErrorResponse` union types. Tests accessing `response.body.data` without type guards trigger TS2339 errors.
2. **Mock Type Strictness**: Vitest mocks return `MockedFunction<T>` types. Passing to functions expecting exact signatures requires explicit casting.
3. **Partial<T> vs Full Object**: Factory functions creating partial mocks lack required properties. Tests using spread operators without type assertions fail property access.
4. **Response Body Type Unknown**: Supertest `response.body` is typed as `unknown` or `any`. Direct property access without narrowing violates strict mode.
## Decision
Implement a 5-phase remediation strategy with parallelizable tasks, prioritized by error count per file and criticality.
### Phase 1: High-Impact Infrastructure (Est. 2 hours)
**Goal**: Fix foundational patterns that propagate to multiple files.
| Task | Files | Errors Fixed |
| ----------------------------------------------------- | ---------------------------------- | ---------------- |
| Add `asSuccessResponse<T>()` type guard to test utils | `src/tests/utils/testHelpers.ts` | Enables 89 fixes |
| Add `asMock<T>()` utility for mock casting | `src/tests/utils/testHelpers.ts` | Enables 42 fixes |
| Update mock factories with strict return types | `src/tests/utils/mockFactories.ts` | 18 |
**Type Guard Implementation**:
```typescript
// src/tests/utils/testHelpers.ts
import { ApiSuccessResponse, ApiErrorResponse } from '@/types/api';
/**
* Type guard to narrow supertest response body to ApiSuccessResponse.
* Use when accessing .data property on API responses in tests.
*
* @example
* const response = await request.get('/api/v1/users/1');
* const body = asSuccessResponse<User>(response.body);
* expect(body.data.id).toBe(1); // TypeScript knows body.data exists
*/
export function asSuccessResponse<T>(body: unknown): ApiSuccessResponse<T> {
const parsed = body as ApiSuccessResponse<T> | ApiErrorResponse;
if (parsed.success !== true) {
throw new Error(`Expected success response, got: ${JSON.stringify(parsed)}`);
}
return parsed;
}
/**
* Type guard for error responses.
*/
export function asErrorResponse(body: unknown): ApiErrorResponse {
const parsed = body as ApiSuccessResponse<unknown> | ApiErrorResponse;
if (parsed.success !== false) {
throw new Error(`Expected error response, got: ${JSON.stringify(parsed)}`);
}
return parsed;
}
/**
* Cast Vitest mock to specific function type.
* Use when passing mocked functions to code expecting exact signatures.
*
* @example
* const mockFn = vi.fn();
* someService.register(asMock<UserService['create']>(mockFn));
*/
export function asMock<T extends (...args: unknown[]) => unknown>(
mock: ReturnType<typeof vi.fn>,
): T {
return mock as unknown as T;
}
```
### Phase 2: Route Test Files (Est. 3 hours)
**Priority**: Files with 10+ errors first.
| File | Errors | Pattern |
| ----------------------------------------- | ------ | -------------------------------------- |
| `src/routes/flyer.routes.test.ts` | 24 | Response body narrowing |
| `src/routes/user.routes.test.ts` | 18 | Response body narrowing |
| `src/routes/auth.routes.test.ts` | 15 | Response body narrowing |
| `src/routes/recipe.routes.test.ts` | 12 | Response body narrowing |
| `src/routes/shopping-list.routes.test.ts` | 11 | Response body narrowing |
| `src/routes/notification.routes.test.ts` | 9 | Response body narrowing |
| `src/routes/inventory.routes.test.ts` | 8 | Response body narrowing |
| `src/routes/budget.routes.test.ts` | 7 | Response body narrowing |
| `src/routes/admin.routes.test.ts` | 6 | Response body narrowing + mock casting |
**Fix Pattern**:
```typescript
// BEFORE (TS2339: Property 'data' does not exist on type 'unknown')
const response = await request.get('/api/v1/flyers/1');
expect(response.body.data.flyer_id).toBe(1);
// AFTER
import { asSuccessResponse } from '@/tests/utils/testHelpers';
import { Flyer } from '@/types/flyer';
const response = await request.get('/api/v1/flyers/1');
const body = asSuccessResponse<Flyer>(response.body);
expect(body.data.flyer_id).toBe(1);
```
### Phase 3: Service Test Files (Est. 2 hours)
**Priority**: Mock casting issues.
| File | Errors | Pattern |
| ------------------------------------------------- | ------ | ------------------ |
| `src/services/db/flyer.db.test.ts` | 8 | Pool mock typing |
| `src/services/db/user.db.test.ts` | 7 | Pool mock typing |
| `src/services/aiService.server.test.ts` | 6 | Gemini mock typing |
| `src/services/cacheService.server.test.ts` | 5 | Redis mock typing |
| `src/services/notificationService.server.test.ts` | 4 | Queue mock typing |
**Mock Casting Pattern**:
```typescript
// BEFORE (TS2345: Argument of type 'Mock' is not assignable)
const mockPool = { query: vi.fn() };
const service = new FlyerService(mockPool);
// AFTER
import { Pool } from 'pg';
const mockPool = {
query: vi.fn().mockResolvedValue({ rows: [], rowCount: 0 }),
} as unknown as Pool;
const service = new FlyerService(mockPool);
```
### Phase 4: Integration Test Files (Est. 1.5 hours)
| File | Errors | Pattern |
| ------------------------------------------------- | ------ | ----------------------- |
| `src/tests/integration/flyer.integration.test.ts` | 6 | Response body + cleanup |
| `src/tests/integration/auth.integration.test.ts` | 5 | Response body |
| `src/tests/integration/user.integration.test.ts` | 4 | Response body |
| `src/tests/integration/admin.integration.test.ts` | 4 | Response body |
**Integration Test Pattern**:
```typescript
// Establish typed response helper at top of file
const expectSuccess = <T>(response: Response) => {
expect(response.status).toBeLessThan(400);
return asSuccessResponse<T>(response.body);
};
// Usage
const body = expectSuccess<{ token: string }>(response);
expect(body.data.token).toBeDefined();
```
### Phase 5: Component and Hook Tests (Est. 1.5 hours)
| File | Errors | Pattern |
| ----------------------------- | ------ | ------------------- |
| `src/hooks/useFlyers.test.ts` | 3 | MSW response typing |
| `src/hooks/useAuth.test.ts` | 3 | MSW response typing |
| Various component tests | 8 | Mock prop typing |
**MSW Handler Pattern**:
```typescript
// BEFORE
http.get('/api/v1/flyers', () => {
return HttpResponse.json({ data: [mockFlyer] });
});
// AFTER
import { ApiSuccessResponse } from '@/types/api';
import { Flyer } from '@/types/flyer';
http.get('/api/v1/flyers', () => {
const response: ApiSuccessResponse<Flyer[]> = {
success: true,
data: [mockFlyer],
};
return HttpResponse.json(response);
});
```
## Implementation Guidelines
### 1. Mock Object Casting Hierarchy
Use the least permissive cast that satisfies TypeScript:
```typescript
// Level 1: Type assertion for compatible shapes
const mock = createMockUser() as User;
// Level 2: Unknown bridge for incompatible shapes
const mock = partialMock as unknown as User;
// Level 3: Partial with required overrides
const mock: User = { ...createPartialUser(), id: 1, email: 'test@test.com' };
```
### 2. Response Type Narrowing
**Always narrow before property access**:
```typescript
// Standard pattern
const body = asSuccessResponse<ExpectedType>(response.body);
expect(body.data.property).toBe(value);
// With error expectation
expect(response.status).toBe(400);
const body = asErrorResponse(response.body);
expect(body.error.code).toBe('VALIDATION_ERROR');
```
### 3. Mock Function Type Safety
```typescript
// vi.fn() with implementation type
const mockFn = vi.fn<[string], Promise<User>>().mockResolvedValue(mockUser);
// Mocked module function
vi.mock('@/services/userService');
const mockedService = vi.mocked(userService);
mockedService.create.mockResolvedValue(mockUser);
```
### 4. Generic Type Parameters
When TypeScript cannot infer generics, provide explicit parameters:
```typescript
// Explicit generic on factory
const mock = createMockPaginatedResponse<Flyer>({ data: [mockFlyer] });
// Explicit generic on assertion
expect(result).toEqual<ApiSuccessResponse<User>>({
success: true,
data: mockUser,
});
```
## Parallelization Strategy
### Parallel Execution Groups
Tests can be fixed in parallel within these independent groups:
| Group | Files | Dependencies |
| ----- | ------------------------------------------------- | ----------------- |
| A | Route tests (auth, user, flyer) | Phase 1 utilities |
| B | Route tests (recipe, shopping-list, notification) | Phase 1 utilities |
| C | Service tests (db layer) | None |
| D | Service tests (external services) | None |
| E | Integration tests | Phase 1 utilities |
| F | Component/hook tests | None |
**Dependency Graph**:
```
Phase 1 (Infrastructure)
├── Group A ─┐
├── Group B ─┼── Can run in parallel
└── Group E ─┘
Groups C, D, F have no dependencies (can start immediately)
```
### Critical Path
Minimum time to completion: **Phase 1 (2h) + longest parallel group (1.5h) = 3.5 hours**
Sequential worst case: **10 hours** (if no parallelization)
## Testing Strategy
### Execution Environment
**All tests MUST run in the dev container** per ADR-014:
```bash
# Type check (fast verification)
podman exec -it flyer-crawler-dev npm run type-check
# Unit tests (after type check passes)
podman exec -it flyer-crawler-dev npm run test:unit
# Full suite (final verification)
podman exec -it flyer-crawler-dev npm test
```
### Background Job Execution (MCP)
For long-running test suites, use the MCP background-job tools:
```bash
# Estimate duration first
mcp__background-job__estimate_command_duration("npm run type-check")
# Execute in background
mcp__background-job__execute_command("npm run type-check")
# Poll status per guidelines (15-30s intervals)
mcp__background-job__get_job_status(job_id)
```
### Incremental Verification
After each phase, verify:
1. **Type check passes**: `npm run type-check` exits 0
2. **Affected tests pass**: Run specific test file
3. **No regressions**: Run full unit suite
## Consequences
### Positive
1. **Type Safety**: All test files will have proper TypeScript coverage
2. **IDE Support**: IntelliSense works correctly for response bodies
3. **Refactoring Safety**: Type errors will catch API contract changes
4. **Pattern Consistency**: Established patterns for future test writing
5. **Reusable Utilities**: `asSuccessResponse`, `asMock` utilities for all tests
### Negative
1. **Verbosity**: Tests require explicit type narrowing (2-3 extra lines)
2. **Maintenance**: Type parameters must match actual API responses
3. **Learning Curve**: Contributors must learn type guard patterns
### Neutral
1. **Test Execution**: No runtime performance impact (compile-time only)
2. **Coverage**: No change to test coverage metrics
## File Priority Matrix
### By Error Count (Descending)
| Priority | File | Errors |
| -------- | ----------------------------------------- | -------------- |
| P0 | `src/tests/utils/testHelpers.ts` | Infrastructure |
| P0 | `src/tests/utils/mockFactories.ts` | 18 |
| P1 | `src/routes/flyer.routes.test.ts` | 24 |
| P1 | `src/routes/user.routes.test.ts` | 18 |
| P1 | `src/routes/auth.routes.test.ts` | 15 |
| P2 | `src/routes/recipe.routes.test.ts` | 12 |
| P2 | `src/routes/shopping-list.routes.test.ts` | 11 |
| P2 | `src/routes/notification.routes.test.ts` | 9 |
| P3 | `src/routes/inventory.routes.test.ts` | 8 |
| P3 | `src/services/db/flyer.db.test.ts` | 8 |
| P3 | `src/routes/budget.routes.test.ts` | 7 |
| P3 | `src/services/db/user.db.test.ts` | 7 |
### By Criticality (Business Impact)
| Tier | Files | Rationale |
| -------- | --------------------------- | ---------------------- |
| Critical | auth, user routes | Authentication flows |
| High | flyer, shopping-list routes | Core business features |
| Medium | recipe, budget, inventory | Secondary features |
| Low | admin, notification | Support features |
## Migration Checklist
### Pre-Remediation
- [x] Read this ADR and understand patterns
- [x] Verify dev container is running
- [x] Run `npm run type-check` to confirm error count
- [x] Create working branch
### During Remediation
- [x] Implement Phase 1 infrastructure utilities
- [x] Fix highest-error files first within each phase
- [x] Run type-check after each file fix
- [x] Run specific test file to verify no runtime breaks
### Post-Remediation
- [x] Run full type-check: `npm run type-check` (0 errors)
- [x] Run unit tests: `npm run test:unit` (4,603 passed)
- [x] Run integration tests: `npm run test:integration`
- [x] Update ADR status to Implemented
## Related ADRs
- [ADR-010](./0010-testing-strategy-and-standards.md) - Testing Strategy and Standards
- [ADR-014](./0014-containerization-and-deployment-strategy.md) - Platform: Linux Only
- [ADR-028](./0028-api-response-standardization.md) - API Response Standardization
- [ADR-045](./0045-test-data-factories-and-fixtures.md) - Test Data Factories and Fixtures
- [ADR-057](./0057-test-remediation-post-api-versioning.md) - Test Remediation Post-API Versioning
- [ADR-059](./0059-dependency-modernization.md) - Dependency Modernization (tsoa Migration)
## Key Files
| File | Purpose |
| ---------------------------------- | ------------------------------------------------------- |
| `src/tests/utils/testHelpers.ts` | Type guard utilities (`asSuccessResponse`, `asMock`) |
| `src/tests/utils/mockFactories.ts` | Typed mock object factories |
| `src/types/api.ts` | `ApiSuccessResponse<T>`, `ApiErrorResponse` definitions |
| `src/utils/apiResponse.ts` | `sendSuccess()`, `sendError()` implementations |
| `vite.config.ts` | Unit test TypeScript configuration |
| `vitest.config.integration.ts` | Integration test TypeScript configuration |

View File

@@ -0,0 +1,199 @@
# ADR-061: PM2 Process Isolation Safeguards
## Status
Accepted
## Context
On 2026-02-17, a critical incident occurred during v0.15.0 production deployment where ALL PM2 processes on the production server were terminated, not just flyer-crawler processes. This caused unplanned downtime for multiple applications including `stock-alert.projectium.com`.
### Problem Statement
Production and test environments share the same PM2 daemon on the server. This creates a risk where deployment scripts that operate on PM2 processes can accidentally affect processes belonging to other applications or environments.
### Pre-existing Controls
Prior to the incident, PM2 process isolation controls were already in place (commit `b6a62a0`):
- Production workflows used whitelist-based filtering with explicit process names
- Test workflows filtered by `-test` suffix pattern
- CLAUDE.md documented the prohibition of `pm2 stop all`, `pm2 delete all`, and `pm2 restart all`
Despite these controls being present in the codebase and included in v0.15.0, the incident still occurred. The leading hypothesis is that the Gitea runner executed a cached/older version of the workflow file.
### Requirements
1. Prevent accidental deletion of processes from other applications or environments
2. Provide audit trail for forensic analysis when incidents occur
3. Enable automatic abort when dangerous conditions are detected
4. Maintain visibility into PM2 operations during deployment
5. Work correctly even if the filtering logic itself is bypassed
## Decision
Implement a defense-in-depth strategy with 5 layers of safeguards in all deployment workflows that interact with PM2 processes.
### Safeguard Layers
#### Layer 1: Workflow Metadata Logging
Log workflow execution metadata at the start of each deployment:
```bash
echo "=== WORKFLOW METADATA ==="
echo "Workflow file: deploy-to-prod.yml"
echo "Workflow file hash: $(sha256sum .gitea/workflows/deploy-to-prod.yml | cut -d' ' -f1)"
echo "Git commit: $(git rev-parse HEAD)"
echo "Git branch: $(git rev-parse --abbrev-ref HEAD)"
echo "Timestamp: $(date -u '+%Y-%m-%d %H:%M:%S UTC')"
echo "Actor: ${{ gitea.actor }}"
echo "=== END METADATA ==="
```
**Purpose**: Enables verification of which workflow version was actually executed.
#### Layer 2: Pre-Cleanup PM2 State Logging
Capture full PM2 process list before any modifications:
```bash
echo "=== PRE-CLEANUP PM2 STATE ==="
pm2 jlist
echo "=== END PRE-CLEANUP STATE ==="
```
**Purpose**: Provides forensic evidence of system state before cleanup.
#### Layer 3: Process Count Validation (SAFETY ABORT)
Abort deployment if the filter would delete ALL processes and there are more than 3 processes total:
```javascript
const totalProcesses = list.length;
if (targetProcesses.length === totalProcesses && totalProcesses > 3) {
console.error('SAFETY ABORT: Filter would delete ALL processes!');
console.error(
'Total processes: ' + totalProcesses + ', Target processes: ' + targetProcesses.length,
);
process.exit(1);
}
```
**Purpose**: Catches filter bugs or unexpected conditions automatically.
**Threshold Rationale**: A threshold of 3 allows normal operation when only the expected processes exist (API, Worker, Analytics Worker) while catching anomalies when the server hosts additional applications.
#### Layer 4: Explicit Name Verification
Log the exact name, status, and PM2 ID of each process that will be affected:
```javascript
console.log('Found ' + targetProcesses.length + ' PRODUCTION processes to clean:');
targetProcesses.forEach((p) => {
console.log(
' - ' + p.name + ' (status: ' + p.pm2_env.status + ', pm_id: ' + p.pm2_env.pm_id + ')',
);
});
```
**Purpose**: Provides clear visibility into cleanup operations.
#### Layer 5: Post-Cleanup Verification
After cleanup, verify environment isolation was maintained:
```bash
echo "=== POST-CLEANUP VERIFICATION ==="
pm2 jlist | node -e "
const list = JSON.parse(require('fs').readFileSync(0, 'utf-8'));
const prodProcesses = list.filter(p => p.name && p.name.startsWith('flyer-crawler-') && !p.name.endsWith('-test'));
console.log('Production processes after cleanup: ' + prodProcesses.length);
"
echo "=== END POST-CLEANUP VERIFICATION ==="
```
**Purpose**: Immediately identifies cross-environment contamination.
## Consequences
### Positive
1. **Automatic Prevention**: Layer 3 (process count validation) can prevent catastrophic process deletion automatically, without human intervention.
2. **Forensic Capability**: Layers 1 and 2 provide the data needed to determine root cause after an incident.
3. **Visibility**: Layers 4 and 5 make PM2 operations transparent in workflow logs.
4. **Fail-Safe Design**: Even if individual layers fail, other layers provide backup protection.
5. **Non-Breaking**: Safeguards are additive and do not change the existing filtering logic.
### Negative
1. **Increased Log Volume**: Additional logging increases workflow output size.
2. **Minor Performance Impact**: Extra PM2 commands add a few seconds to deployment time.
3. **Threshold Tuning**: The threshold of 3 may need adjustment if the expected process count changes.
### Neutral
1. **Root Cause Still Unknown**: These safeguards mitigate the risk but do not definitively explain why the original incident occurred.
2. **No Structural Changes**: The underlying architecture (shared PM2 daemon) remains unchanged.
## Alternatives Considered
### PM2 Namespaces
PM2 supports namespaces to isolate groups of processes. This would provide complete isolation but requires:
- Changes to ecosystem config files
- Changes to all PM2 commands in workflows
- Potential breaking changes to monitoring and log aggregation
**Decision**: Deferred for future consideration. Current safeguards provide adequate protection.
### Separate PM2 Daemons
Running a separate PM2 daemon per application would eliminate cross-application risk entirely.
**Decision**: Not implemented due to increased operational complexity and the current safeguards being sufficient.
### Deployment Locks
Implementing mutex-style locks to prevent concurrent deployments could prevent race conditions.
**Decision**: Not implemented as the current safeguards address the identified risk. May be reconsidered if concurrent deployment issues are observed.
## Implementation
### Files Modified
| File | Changes |
| ------------------------------------------ | ---------------------- |
| `.gitea/workflows/deploy-to-prod.yml` | All 5 safeguard layers |
| `.gitea/workflows/deploy-to-test.yml` | All 5 safeguard layers |
| `.gitea/workflows/manual-deploy-major.yml` | All 5 safeguard layers |
### Validation
A standalone test file validates the safeguard logic:
- **File**: `tests/qa/test-pm2-safeguard-logic.js`
- **Coverage**: 11 scenarios covering normal operations and dangerous edge cases
- **Result**: All tests pass
## Related Documentation
- [Incident Report: 2026-02-17](../operations/INCIDENT-2026-02-17-PM2-PROCESS-KILL.md)
- [PM2 Incident Response Runbook](../operations/PM2-INCIDENT-RESPONSE.md)
- [Session Summary](../archive/sessions/PM2_SAFEGUARDS_SESSION_2026-02-17.md)
- [CLAUDE.md - PM2 Process Isolation](../../CLAUDE.md#pm2-process-isolation-productiontest-servers)
- [ADR-014: Containerization and Deployment Strategy](0014-containerization-and-deployment-strategy.md)
## References
- PM2 Documentation: https://pm2.keymetrics.io/docs/usage/application-declaration/
- Defense in Depth: https://en.wikipedia.org/wiki/Defense_in_depth_(computing)

View File

@@ -0,0 +1,265 @@
# ADR-027: Standardized Application-Wide Structured Logging
**Date**: 2026-02-10
**Status**: Accepted
**Source**: Imported from flyer-crawler project (ADR-004)
**Related**: [ADR-017](ADR-017-structured-logging-with-pino.md), [ADR-028](ADR-028-client-side-structured-logging.md), [ADR-029](ADR-029-error-tracking-with-bugsink.md)
## Context
While ADR-017 established Pino as our logging framework, this ADR extends that foundation with application-wide standards for request tracing, context propagation, and structured log formats.
The implementation of logging can vary significantly across different modules. The error handler middleware may produce high-quality, structured JSON logs for errors, but logging within route handlers and service layers can become ad-hoc, using plain strings or inconsistent object structures.
This inconsistency leads to several problems:
- **Difficult Debugging**: It is hard to trace a single user request through the system or correlate events related to a specific operation
- **Ineffective Log Analysis**: Inconsistent log formats make it difficult to effectively query, filter, and create dashboards in log management systems (like Datadog, Splunk, or the ELK stack)
- **Security Risks**: There is no enforced standard for redacting sensitive information (like passwords or tokens) in logs outside of the error handler, increasing the risk of accidental data exposure
- **Missing Context**: Logs often lack crucial context, such as a unique request ID, the authenticated user's ID, or the source IP address, making them less useful for diagnosing issues
## Decision
We will adopt a standardized, application-wide structured logging policy. All log entries MUST be in JSON format and adhere to a consistent schema.
### 1. Request-Scoped Logger with Context
We will create a middleware that runs at the beginning of the request lifecycle. This middleware will:
- Generate a unique `request_id` for each incoming request
- Create a request-scoped logger instance (a "child logger") that automatically includes the `request_id`, `user_id` (if authenticated), and `ip_address` in every log message it generates
- Attach this child logger to the `req` object (e.g., `req.log`)
### 2. Mandatory Use of Request-Scoped Logger
All route handlers and any service functions called by them **MUST** use the request-scoped logger (`req.log`) instead of the global logger instance. This ensures all logs for a given request are automatically correlated.
### 3. Standardized Log Schema
All log messages should follow a base schema. The logger configuration will be updated to enforce this.
**Base Fields**: `level`, `timestamp`, `message`, `request_id`, `user_id`, `ip_address`
**Error Fields**: When logging an error, the log entry MUST include an `error` object with `name`, `message`, and `stack`.
### 4. Standardized Logging Practices
| Level | HTTP Status | Scenario |
| ----- | ----------- | -------------------------------------------------- |
| DEBUG | Any | Request incoming, internal state, development info |
| INFO | 2xx | Successful requests, business events |
| WARN | 4xx | Client errors, validation failures, not found |
| ERROR | 5xx | Server errors, unhandled exceptions |
## Implementation Details
### Logger Configuration
Located in `src/services/logger.server.ts`:
```typescript
import pino from 'pino';
const isProduction = process.env.NODE_ENV === 'production';
const isTest = process.env.NODE_ENV === 'test';
export const logger = pino({
level: isProduction ? 'info' : 'debug',
transport:
isProduction || isTest
? undefined
: {
target: 'pino-pretty',
options: {
colorize: true,
translateTime: 'SYS:standard',
ignore: 'pid,hostname',
},
},
redact: {
paths: [
'req.headers.authorization',
'req.headers.cookie',
'*.body.password',
'*.body.newPassword',
'*.body.currentPassword',
'*.body.confirmPassword',
'*.body.refreshToken',
'*.body.token',
],
censor: '[REDACTED]',
},
});
```
### Request Logger Middleware
Located in `server.ts`:
```typescript
import { randomUUID } from 'crypto';
import type { Request, Response, NextFunction } from 'express';
import { logger } from './services/logger.server';
const requestLogger = (req: Request, res: Response, next: NextFunction) => {
const requestId = randomUUID();
const user = req.user as UserProfile | undefined;
const start = process.hrtime();
// Create request-scoped logger
req.log = logger.child({
request_id: requestId,
user_id: user?.user.user_id,
ip_address: req.ip,
});
req.log.debug({ method: req.method, originalUrl: req.originalUrl }, 'INCOMING');
res.on('finish', () => {
const duration = getDurationInMilliseconds(start);
const { statusCode, statusMessage } = res;
const logDetails = {
user_id: (req.user as UserProfile | undefined)?.user.user_id,
method: req.method,
originalUrl: req.originalUrl,
statusCode,
statusMessage,
duration: duration.toFixed(2),
};
// Include request details for failed requests (for debugging)
if (statusCode >= 400) {
logDetails.req = { headers: req.headers, body: req.body };
}
if (statusCode >= 500) req.log.error(logDetails, 'Request completed with server error');
else if (statusCode >= 400) req.log.warn(logDetails, 'Request completed with client error');
else req.log.info(logDetails, 'Request completed successfully');
});
next();
};
app.use(requestLogger);
```
### TypeScript Support
The `req.log` property is typed via declaration merging in `src/types/express.d.ts`:
```typescript
import { Logger } from 'pino';
declare global {
namespace Express {
export interface Request {
log: Logger;
}
}
}
```
### Automatic Sensitive Data Redaction
The Pino logger automatically redacts sensitive fields:
```json
// Before redaction
{
"body": {
"email": "user@example.com",
"password": "secret123",
"newPassword": "newsecret456"
}
}
// After redaction (in logs)
{
"body": {
"email": "user@example.com",
"password": "[REDACTED]",
"newPassword": "[REDACTED]"
}
}
```
### Service Layer Logging
Services accept the request-scoped logger as an optional parameter:
```typescript
export async function registerUser(email: string, password: string, reqLog?: Logger) {
const log = reqLog || logger; // Fall back to global logger
log.info({ email }, 'Registering new user');
// ... implementation
log.debug({ userId: user.user_id }, 'User created successfully');
return user;
}
// In route handler
router.post('/register', async (req, res, next) => {
await authService.registerUser(req.body.email, req.body.password, req.log);
});
```
### Log Output Format
**Development** (pino-pretty):
```text
[2026-01-09 12:34:56.789] INFO (request_id=abc123): Request completed successfully
method: "GET"
originalUrl: "/api/users"
statusCode: 200
duration: "45.23"
```
**Production** (JSON):
```json
{
"level": 30,
"time": 1704812096789,
"request_id": "abc123",
"user_id": "user_456",
"ip_address": "192.168.1.1",
"method": "GET",
"originalUrl": "/api/users",
"statusCode": 200,
"duration": "45.23",
"msg": "Request completed successfully"
}
```
## Consequences
### Positive
- **Enhanced Observability**: Every log line from a single request can be instantly grouped and analyzed, dramatically speeding up debugging
- **Improved Security**: Centralizing the addition of context (like `user_id`) reduces the chance of developers manually logging sensitive data
- **Scalable Log Management**: Consistent JSON logs are easily ingested and indexed by any modern log aggregation tool
- **Clearer Code**: Removes the need to manually pass contextual information (like user ID) down to service functions just for logging purposes
### Negative
- **Refactoring Effort**: Requires adding the `requestLogger` middleware and refactoring all routes and services to use `req.log` instead of the global `logger`
- **Slight Performance Overhead**: Creating a child logger for every request adds a minor performance cost, though this is negligible for most modern logging libraries
## Key Files
- `src/services/logger.server.ts` - Pino logger configuration
- `src/services/logger.client.ts` - Client-side logger (for frontend)
- `src/types/express.d.ts` - TypeScript declaration for `req.log`
- `server.ts` - Request logger middleware
## References
- [ADR-017: Structured Logging with Pino](ADR-017-structured-logging-with-pino.md)
- [ADR-001: Standardized Error Handling](ADR-001-standardized-error-handling.md) - Error handler uses `req.log` for error logging
- [ADR-028: Client-Side Structured Logging](ADR-028-client-side-structured-logging.md) - Client-side logging strategy
- [Pino Documentation](https://getpino.io/#/)

View File

@@ -0,0 +1,242 @@
# ADR-028: Standardized Client-Side Structured Logging
**Date**: 2026-02-10
**Status**: Accepted
**Source**: Imported from flyer-crawler project (ADR-026)
**Related**: [ADR-027](ADR-027-application-wide-structured-logging.md), [ADR-029](ADR-029-error-tracking-with-bugsink.md)
## Context
Following the standardization of backend logging in ADR-027, it is clear that our frontend components also require a consistent logging strategy. Currently, components either use `console.log` directly or a simple wrapper, but without a formal standard, this can lead to inconsistent log formats and difficulty in debugging user-facing issues.
While the frontend does not have the concept of a "request-scoped" logger, the principles of structured, context-rich logging are equally important for:
1. **Effective Debugging**: Understanding the state of a component or the sequence of user interactions that led to an error
2. **Integration with Monitoring Tools**: Sending structured logs to services like Sentry/Bugsink or LogRocket allows for powerful analysis and error tracking in production
3. **Clean Test Outputs**: Uncontrolled logging can pollute test runner output, making it difficult to spot actual test failures
An existing client-side logger at `src/services/logger.client.ts` provides a simple, structured logging interface. This ADR formalizes its use as the application standard.
## Decision
We will adopt a standardized, application-wide structured logging policy for all client-side (React) code.
### 1. Mandatory Use of the Global Client Logger
All frontend components, hooks, and services **MUST** use the global logger singleton exported from `src/services/logger.client.ts`. Direct use of `console.log`, `console.error`, etc., is discouraged.
### 2. Pino-like API for Structured Logging
The client logger mimics the `pino` API, which is the standard on the backend. It supports two primary call signatures:
- `logger.info('A simple message');`
- `logger.info({ key: 'value' }, 'A message with a structured data payload');`
The second signature, which includes a data object as the first argument, is **strongly preferred**, especially for logging errors or complex state.
### 3. Mocking in Tests
All Jest/Vitest tests for components or hooks that use the logger **MUST** mock the `src/services/logger.client.ts` module. This prevents logs from appearing in test output and allows for assertions that the logger was called correctly.
## Implementation
### Client Logger Service
Located in `src/services/logger.client.ts`:
```typescript
type LogLevel = 'debug' | 'info' | 'warn' | 'error';
interface LoggerOptions {
level?: LogLevel;
enabled?: boolean;
}
const LOG_LEVELS: Record<LogLevel, number> = {
debug: 0,
info: 1,
warn: 2,
error: 3,
};
class ClientLogger {
private level: LogLevel;
private enabled: boolean;
constructor(options: LoggerOptions = {}) {
this.level = options.level ?? 'info';
this.enabled = options.enabled ?? import.meta.env.DEV;
}
private shouldLog(level: LogLevel): boolean {
return this.enabled && LOG_LEVELS[level] >= LOG_LEVELS[this.level];
}
private formatMessage(data: object | string, message?: string): string {
if (typeof data === 'string') {
return data;
}
const payload = JSON.stringify(data, null, 2);
return message ? `${message}\n${payload}` : payload;
}
debug(data: object | string, message?: string): void {
if (this.shouldLog('debug')) {
console.debug(`[DEBUG] ${this.formatMessage(data, message)}`);
}
}
info(data: object | string, message?: string): void {
if (this.shouldLog('info')) {
console.info(`[INFO] ${this.formatMessage(data, message)}`);
}
}
warn(data: object | string, message?: string): void {
if (this.shouldLog('warn')) {
console.warn(`[WARN] ${this.formatMessage(data, message)}`);
}
}
error(data: object | string, message?: string): void {
if (this.shouldLog('error')) {
console.error(`[ERROR] ${this.formatMessage(data, message)}`);
}
}
}
export const logger = new ClientLogger({
level: import.meta.env.DEV ? 'debug' : 'warn',
enabled: true,
});
```
### Example Usage
**Logging an Error in a Component:**
```typescript
// In a React component or hook
import { logger } from '../services/logger.client';
import { notifyError } from '../services/notificationService';
const fetchData = async () => {
try {
const data = await apiClient.getData();
return data;
} catch (err) {
// Log the full error object for context, along with a descriptive message.
logger.error({ err }, 'Failed to fetch component data');
notifyError('Something went wrong. Please try again.');
}
};
```
**Logging State Changes:**
```typescript
// In a Zustand store or state hook
import { logger } from '../services/logger.client';
const useAuthStore = create((set) => ({
login: async (credentials) => {
logger.info({ email: credentials.email }, 'User login attempt');
try {
const user = await authService.login(credentials);
logger.info({ userId: user.id }, 'User logged in successfully');
set({ user, isAuthenticated: true });
} catch (error) {
logger.error({ error }, 'Login failed');
throw error;
}
},
}));
```
### Mocking the Logger in Tests
```typescript
// In a *.test.tsx file
import { vi } from 'vitest';
// Mock the logger at the top of the test file
vi.mock('../services/logger.client', () => ({
logger: {
info: vi.fn(),
warn: vi.fn(),
error: vi.fn(),
debug: vi.fn(),
},
}));
describe('MyComponent', () => {
beforeEach(() => {
vi.clearAllMocks(); // Clear mocks between tests
});
it('should log an error when fetching fails', async () => {
// ... test setup to make fetch fail ...
// Assert that the logger was called with the expected structure
expect(logger.error).toHaveBeenCalledWith(
expect.objectContaining({ err: expect.any(Error) }),
'Failed to fetch component data',
);
});
});
```
## Integration with Error Tracking
When using Sentry/Bugsink for error tracking (see ADR-029), the client logger can be extended to send logs as breadcrumbs:
```typescript
import * as Sentry from '@sentry/react';
class ClientLogger {
// ... existing implementation
error(data: object | string, message?: string): void {
if (this.shouldLog('error')) {
console.error(`[ERROR] ${this.formatMessage(data, message)}`);
}
// Add to Sentry breadcrumbs for error context
Sentry.addBreadcrumb({
category: 'log',
level: 'error',
message: typeof data === 'string' ? data : message,
data: typeof data === 'object' ? data : undefined,
});
}
}
```
## Consequences
### Positive
- **Consistency**: All client-side logs will have a predictable structure, making them easier to read and parse
- **Debuggability**: Errors logged with a full object (`{ err }`) capture the stack trace and other properties, which is invaluable for debugging
- **Testability**: Components that log are easier to test without polluting CI/CD output. We can also assert that logging occurs when expected
- **Future-Proof**: If we later decide to send client-side logs to a remote service, we only need to modify the central `logger.client.ts` file instead of every component
- **Error Tracking Integration**: Logs can be used as breadcrumbs in Sentry/Bugsink for better error context
### Negative
- **Minor Boilerplate**: Requires importing the logger in every file that needs it and mocking it in every corresponding test file. However, this is a small and consistent effort
- **Production Noise**: Care must be taken to configure appropriate log levels in production to avoid performance impact
## Key Files
- `src/services/logger.client.ts` - Client-side logger implementation
- `src/services/logger.server.ts` - Backend logger (for reference)
## References
- [ADR-027: Application-Wide Structured Logging](ADR-027-application-wide-structured-logging.md)
- [ADR-029: Error Tracking with Bugsink](ADR-029-error-tracking-with-bugsink.md)
- [Pino Documentation](https://getpino.io/#/)

View File

@@ -0,0 +1,389 @@
# ADR-029: Error Tracking and Observability with Bugsink
**Date**: 2026-02-10
**Status**: Accepted
**Source**: Imported from flyer-crawler project (ADR-015)
**Related**: [ADR-027](ADR-027-application-wide-structured-logging.md), [ADR-028](ADR-028-client-side-structured-logging.md), [ADR-030](ADR-030-postgresql-function-observability.md), [ADR-032](ADR-032-application-performance-monitoring.md)
## Context
While ADR-027 established structured logging with Pino, the application lacks a high-level, aggregated view of its health and errors. It is difficult to spot trends, identify recurring issues, or be proactively notified of new types of errors.
Key requirements:
1. **Self-hosted**: No external SaaS dependencies for error tracking
2. **Sentry SDK compatible**: Leverage mature, well-documented SDKs
3. **Lightweight**: Minimal resource overhead in the dev container
4. **Production-ready**: Same architecture works on bare-metal production servers
5. **AI-accessible**: MCP server integration for Claude Code and other AI tools
**Note**: Application Performance Monitoring (APM) and distributed tracing are covered separately in [ADR-032](ADR-032-application-performance-monitoring.md).
## Decision
We implement a self-hosted error tracking stack using **Bugsink** as the Sentry-compatible backend, with the following components:
### 1. Error Tracking Backend: Bugsink
**Bugsink** is a lightweight, self-hosted Sentry alternative that:
- Runs as a single process (no Kafka, Redis, ClickHouse required)
- Is fully compatible with Sentry SDKs
- Supports ARM64 and AMD64 architectures
- Can use SQLite (dev) or PostgreSQL (production)
**Deployment**:
- **Dev container**: Installed as a systemd service inside the container
- **Production**: Runs as a systemd service on bare-metal, listening on localhost only
- **Database**: Uses PostgreSQL with a dedicated `bugsink` user and `bugsink` database (same PostgreSQL instance as the main application)
### 2. Backend Integration: @sentry/node
The Express backend integrates `@sentry/node` SDK to:
- Capture unhandled exceptions before PM2/process manager restarts
- Report errors with full stack traces and context
- Integrate with Pino logger for breadcrumbs
- Filter errors by severity (only 5xx errors sent by default)
### 3. Frontend Integration: @sentry/react
The React frontend integrates `@sentry/react` SDK to:
- Wrap the app in an Error Boundary for graceful error handling
- Capture unhandled JavaScript errors
- Report errors with component stack traces
- Filter out browser extension errors
- **Frontend Error Correlation**: The global API client intercepts 4xx/5xx responses and can attach the `x-request-id` header to Sentry scope for correlation with backend logs
### 4. Log Aggregation: Logstash
**Logstash** parses application and infrastructure logs, forwarding error patterns to Bugsink:
- **Installation**: Installed inside the dev container (and on bare-metal prod servers)
- **Inputs**:
- Pino JSON logs from the Node.js application (PM2 managed)
- Redis logs (connection errors, memory warnings, slow commands)
- PostgreSQL function logs (via `fn_log()` - see ADR-030)
- NGINX access/error logs
- **Filter**: Identifies error-level logs (5xx responses, unhandled exceptions, Redis errors)
- **Output**: Sends to Bugsink via Sentry-compatible HTTP API
This provides a secondary error capture path for:
- Errors that occur before Sentry SDK initialization
- Log-based errors that do not throw exceptions
- Redis connection/performance issues
- Database function errors and slow queries
- Historical error analysis from log files
### 5. MCP Server Integration: bugsink-mcp
For AI tool integration (Claude Code, Cursor, etc.), we use the open-source [bugsink-mcp](https://github.com/j-shelfwood/bugsink-mcp) server:
- **No code changes required**: Configurable via environment variables
- **Capabilities**: List projects, get issues, view events, get stacktraces, manage releases
- **Configuration**:
- `BUGSINK_URL`: Points to Bugsink instance (`http://localhost:8000` for dev, `https://bugsink.example.com` for prod)
- `BUGSINK_API_TOKEN`: API token from Bugsink (created via Django management command)
- `BUGSINK_ORG_SLUG`: Organization identifier (usually "sentry")
## Architecture
```text
+---------------------------------------------------------------------------+
| Dev Container / Production Server |
+---------------------------------------------------------------------------+
| |
| +------------------+ +------------------+ |
| | Frontend | | Backend | |
| | (React) | | (Express) | |
| | @sentry/react | | @sentry/node | |
| +--------+---------+ +--------+---------+ |
| | | |
| | Sentry SDK Protocol | |
| +-----------+---------------+ |
| | |
| v |
| +----------------------+ |
| | Bugsink | |
| | (localhost:8000) |<------------------+ |
| | | | |
| | PostgreSQL backend | | |
| +----------------------+ | |
| | |
| +----------------------+ | |
| | Logstash |-------------------+ |
| | (Log Aggregator) | Sentry Output |
| | | |
| | Inputs: | |
| | - PM2/Pino logs | |
| | - Redis logs | |
| | - PostgreSQL logs | |
| | - NGINX logs | |
| +----------------------+ |
| ^ ^ ^ ^ |
| | | | | |
| +-----------+ | | +-----------+ |
| | | | | |
| +----+-----+ +-----+----+ +-----+----+ +-----+----+ |
| | PM2 | | Redis | | PostgreSQL| | NGINX | |
| | Logs | | Logs | | Logs | | Logs | |
| +----------+ +----------+ +-----------+ +---------+ |
| |
| +----------------------+ |
| | PostgreSQL | |
| | +----------------+ | |
| | | app_database | | (main app database) |
| | +----------------+ | |
| | | bugsink | | (error tracking database) |
| | +----------------+ | |
| +----------------------+ |
| |
+---------------------------------------------------------------------------+
External (Developer Machine):
+--------------------------------------+
| Claude Code / Cursor / VS Code |
| +--------------------------------+ |
| | bugsink-mcp | |
| | (MCP Server) | |
| | | |
| | BUGSINK_URL=http://localhost:8000
| | BUGSINK_API_TOKEN=... | |
| | BUGSINK_ORG_SLUG=... | |
| +--------------------------------+ |
+--------------------------------------+
```
## Implementation Details
### Environment Variables
| Variable | Description | Default (Dev) |
| -------------------- | -------------------------------- | -------------------------- |
| `SENTRY_DSN` | Sentry-compatible DSN (backend) | Set after project creation |
| `VITE_SENTRY_DSN` | Sentry-compatible DSN (frontend) | Set after project creation |
| `SENTRY_ENVIRONMENT` | Environment name | `development` |
| `SENTRY_DEBUG` | Enable debug logging | `false` |
| `SENTRY_ENABLED` | Enable/disable error reporting | `true` |
### PostgreSQL Setup
```sql
-- Create dedicated Bugsink database and user
CREATE USER bugsink WITH PASSWORD 'bugsink_dev_password';
CREATE DATABASE bugsink OWNER bugsink;
GRANT ALL PRIVILEGES ON DATABASE bugsink TO bugsink;
```
### Bugsink Configuration
```bash
# Environment variables for Bugsink service
SECRET_KEY=<random-50-char-string>
DATABASE_URL=postgresql://bugsink:bugsink_dev_password@localhost:5432/bugsink
BASE_URL=http://localhost:8000
PORT=8000
```
### Backend Sentry Integration
Located in `src/services/sentry.server.ts`:
```typescript
import * as Sentry from '@sentry/node';
import { config } from '../config/env';
export function initSentry() {
if (!config.sentry.enabled || !config.sentry.dsn) {
return;
}
Sentry.init({
dsn: config.sentry.dsn,
environment: config.sentry.environment || config.server.nodeEnv,
debug: config.sentry.debug,
// Performance monitoring - disabled by default (see ADR-032)
tracesSampleRate: 0,
// Filter out 4xx errors - only report server errors
beforeSend(event) {
const statusCode = event.contexts?.response?.status_code;
if (statusCode && statusCode >= 400 && statusCode < 500) {
return null;
}
return event;
},
});
}
// Set user context after authentication
export function setUserContext(user: { id: string; email: string; name?: string }) {
Sentry.setUser({
id: user.id,
email: user.email,
username: user.name,
});
}
// Clear user context on logout
export function clearUserContext() {
Sentry.setUser(null);
}
```
### Frontend Sentry Integration
Located in `src/services/sentry.client.ts`:
```typescript
import * as Sentry from '@sentry/react';
import { config } from '../config';
export function initSentry() {
if (!config.sentry.enabled || !config.sentry.dsn) {
return;
}
Sentry.init({
dsn: config.sentry.dsn,
environment: config.sentry.environment,
// Performance monitoring - disabled by default (see ADR-032)
tracesSampleRate: 0,
// Filter out browser extension errors
beforeSend(event) {
// Ignore errors from browser extensions
if (
event.exception?.values?.[0]?.stacktrace?.frames?.some((frame) =>
frame.filename?.includes('extension://'),
)
) {
return null;
}
return event;
},
});
}
// Set user context after login
export function setUserContext(user: { id: string; email: string; name?: string }) {
Sentry.setUser({
id: user.id,
email: user.email,
username: user.name,
});
}
// Clear user context on logout
export function clearUserContext() {
Sentry.setUser(null);
}
```
### Error Boundary Component
Located in `src/components/ErrorBoundary.tsx`:
```typescript
import * as Sentry from '@sentry/react';
import { Component, ErrorInfo, ReactNode } from 'react';
interface Props {
children: ReactNode;
fallback?: ReactNode;
}
interface State {
hasError: boolean;
}
export class ErrorBoundary extends Component<Props, State> {
constructor(props: Props) {
super(props);
this.state = { hasError: false };
}
static getDerivedStateFromError(): State {
return { hasError: true };
}
componentDidCatch(error: Error, errorInfo: ErrorInfo) {
Sentry.withScope((scope) => {
scope.setExtras({ componentStack: errorInfo.componentStack });
Sentry.captureException(error);
});
}
render() {
if (this.state.hasError) {
return this.props.fallback || (
<div className="error-boundary">
<h1>Something went wrong</h1>
<p>Please refresh the page or contact support.</p>
</div>
);
}
return this.props.children;
}
}
```
### Logstash Pipeline Configuration
Key routing for log sources:
| Source | Bugsink Project |
| --------------- | --------------- |
| Backend (Pino) | Backend API |
| Worker (Pino) | Backend API |
| PostgreSQL logs | Backend API |
| Vite logs | Infrastructure |
| Redis logs | Infrastructure |
| NGINX logs | Infrastructure |
| Frontend errors | Frontend |
## Consequences
### Positive
- **Full observability**: Aggregated view of errors and trends
- **Self-hosted**: No external SaaS dependencies or subscription costs
- **SDK compatibility**: Leverages mature Sentry SDKs with excellent documentation
- **AI integration**: MCP server enables Claude Code to query and analyze errors
- **Unified architecture**: Same setup works in dev container and production
- **Lightweight**: Bugsink runs in a single process, unlike full Sentry (16GB+ RAM)
- **Error correlation**: Request IDs allow correlation between frontend errors and backend logs
### Negative
- **Additional services**: Bugsink and Logstash add complexity to the container
- **PostgreSQL overhead**: Additional database for error tracking
- **Initial setup**: Requires configuration of multiple components
- **Logstash learning curve**: Pipeline configuration requires Logstash knowledge
## Alternatives Considered
1. **Full Sentry self-hosted**: Rejected due to complexity (Kafka, Redis, ClickHouse, 16GB+ RAM minimum)
2. **GlitchTip**: Considered, but Bugsink is lighter weight and easier to deploy
3. **Sentry SaaS**: Rejected due to self-hosted requirement
4. **Custom error aggregation**: Rejected in favor of proven Sentry SDK ecosystem
## References
- [Bugsink Documentation](https://www.bugsink.com/docs/)
- [Bugsink Docker Install](https://www.bugsink.com/docs/docker-install/)
- [@sentry/node Documentation](https://docs.sentry.io/platforms/javascript/guides/node/)
- [@sentry/react Documentation](https://docs.sentry.io/platforms/javascript/guides/react/)
- [bugsink-mcp](https://github.com/j-shelfwood/bugsink-mcp)
- [Logstash Reference](https://www.elastic.co/guide/en/logstash/current/index.html)
- [ADR-030: PostgreSQL Function Observability](ADR-030-postgresql-function-observability.md)
- [ADR-032: Application Performance Monitoring](ADR-032-application-performance-monitoring.md)

View File

@@ -0,0 +1,336 @@
# ADR-030: PostgreSQL Function Observability
**Date**: 2026-02-10
**Status**: Accepted
**Source**: Imported from flyer-crawler project (ADR-050)
**Related**: [ADR-029](ADR-029-error-tracking-with-bugsink.md), [ADR-027](ADR-027-application-wide-structured-logging.md)
## Context
Applications often use PostgreSQL functions and triggers for business logic, including:
- Data transformations and validations
- Complex query encapsulation
- Trigger-based side effects
- Audit logging
**Current Problem**: These database functions can fail silently in several ways:
1. **`ON CONFLICT DO NOTHING`** - Swallows constraint violations without notification
2. **`IF NOT FOUND THEN RETURN;`** - Silently exits when data is missing
3. **Trigger functions returning `NULL`** - No indication of partial failures
4. **No logging inside functions** - No visibility into function execution
When these silent failures occur:
- The application layer receives no error (function "succeeds" but does nothing)
- No logs are generated for debugging
- Issues are only discovered when users report missing data
- Root cause analysis is extremely difficult
**Example of Silent Failure**:
```sql
-- This function silently does nothing if record doesn't exist
CREATE OR REPLACE FUNCTION public.process_item(p_user_id UUID, p_item_name TEXT)
RETURNS void AS $$
BEGIN
SELECT item_id INTO v_item_id FROM items WHERE name = p_item_name;
IF v_item_id IS NULL THEN
RETURN; -- Silent failure - no log, no error
END IF;
-- ...
END;
$$;
```
ADR-029 established Logstash + Bugsink for error tracking, with PostgreSQL log integration. This ADR defines the implementation.
## Decision
We will implement a standardized PostgreSQL function observability strategy with three tiers of logging severity.
### 1. Function Logging Helper
Create a reusable logging function that outputs structured JSON to PostgreSQL logs:
```sql
-- Function to emit structured log messages from PL/pgSQL
CREATE OR REPLACE FUNCTION public.fn_log(
p_level TEXT, -- 'DEBUG', 'INFO', 'NOTICE', 'WARNING', 'ERROR'
p_function_name TEXT, -- The calling function name
p_message TEXT, -- Human-readable message
p_context JSONB DEFAULT NULL -- Additional context (user_id, params, etc.)
)
RETURNS void
LANGUAGE plpgsql
AS $$
DECLARE
log_line TEXT;
BEGIN
-- Build structured JSON log line
log_line := jsonb_build_object(
'timestamp', now(),
'level', p_level,
'source', 'postgresql',
'function', p_function_name,
'message', p_message,
'context', COALESCE(p_context, '{}'::jsonb)
)::text;
-- Use appropriate RAISE level
CASE p_level
WHEN 'DEBUG' THEN RAISE DEBUG '%', log_line;
WHEN 'INFO' THEN RAISE INFO '%', log_line;
WHEN 'NOTICE' THEN RAISE NOTICE '%', log_line;
WHEN 'WARNING' THEN RAISE WARNING '%', log_line;
WHEN 'ERROR' THEN RAISE LOG '%', log_line; -- Use LOG for errors to ensure capture
ELSE RAISE NOTICE '%', log_line;
END CASE;
END;
$$;
```
### 2. Logging Tiers
#### Tier 1: Critical Functions (Always Log)
Functions where silent failure causes data corruption or user-facing issues:
| Function Type | Log Events |
| ---------------------------- | --------------------------------------- |
| User creation/management | User creation, profile creation, errors |
| Permission/role changes | Role not found, permission denied |
| Financial transactions | Transaction not found, balance issues |
| Data approval workflows | Record not found, permission denied |
| Critical business operations | Items added, operations completed |
**Pattern**:
```sql
CREATE OR REPLACE FUNCTION public.process_critical_operation(p_user_id UUID, p_operation_name TEXT)
RETURNS void AS $$
DECLARE
v_operation_id BIGINT;
v_context JSONB;
BEGIN
v_context := jsonb_build_object('user_id', p_user_id, 'operation_name', p_operation_name);
SELECT operation_id INTO v_operation_id
FROM public.operations WHERE name = p_operation_name;
IF v_operation_id IS NULL THEN
-- Log the issue instead of silent return
PERFORM fn_log('WARNING', 'process_critical_operation',
'Operation not found: ' || p_operation_name, v_context);
RETURN;
END IF;
-- Perform operation
INSERT INTO public.user_operations (user_id, operation_id)
VALUES (p_user_id, v_operation_id)
ON CONFLICT (user_id, operation_id) DO NOTHING;
IF FOUND THEN
PERFORM fn_log('INFO', 'process_critical_operation',
'Operation completed: ' || p_operation_name, v_context);
END IF;
END;
$$;
```
#### Tier 2: Business Logic Functions (Log on Anomalies)
Functions where unexpected conditions should be logged but are not critical:
| Function Type | Log Events |
| --------------------------- | -------------------------------- |
| Search/suggestion functions | No match found (below threshold) |
| Recommendation engines | No recommendations generated |
| Data lookup functions | Empty results, no matches found |
| Price/analytics queries | No data available, stale data |
**Pattern**: Log when results are unexpectedly empty or inputs are invalid.
#### Tier 3: Triggers (Log Errors Only)
Triggers should be fast, so only log when something goes wrong:
| Trigger Type | Log Events |
| --------------------- | ---------------------------- |
| Audit triggers | Failed to update audit trail |
| Aggregation triggers | Calculation failed |
| Cascade triggers | Related record lookup failed |
| Notification triggers | External service call failed |
### 3. PostgreSQL Configuration
Enable logging in `postgresql.conf`:
```ini
# Log all function notices and above
log_min_messages = notice
# Include function name in log prefix
log_line_prefix = '%t [%p] %u@%d '
# Log to file for Logstash pickup
logging_collector = on
log_directory = '/var/log/postgresql'
log_filename = 'postgresql-%Y-%m-%d.log'
log_rotation_age = 1d
log_rotation_size = 100MB
# Capture slow queries from functions
log_min_duration_statement = 1000 # Log queries over 1 second
```
### 4. Logstash Integration
Update the Logstash pipeline (extends ADR-029 configuration):
```conf
# PostgreSQL function log input
input {
file {
path => "/var/log/postgresql/*.log"
type => "postgres"
tags => ["postgres"]
start_position => "beginning"
sincedb_path => "/var/lib/logstash/sincedb_postgres"
}
}
filter {
if [type] == "postgres" {
# Extract timestamp and process ID from PostgreSQL log prefix
grok {
match => { "message" => "%{TIMESTAMP_ISO8601:pg_timestamp} \[%{POSINT:pg_pid}\] %{USER:pg_user}@%{WORD:pg_database} %{GREEDYDATA:pg_message}" }
}
# Check if this is a structured JSON log from fn_log()
if [pg_message] =~ /^\{.*"source":"postgresql".*\}$/ {
json {
source => "pg_message"
target => "fn_log"
}
# Mark as error if level is WARNING or ERROR
if [fn_log][level] in ["WARNING", "ERROR"] {
mutate { add_tag => ["error", "db_function"] }
}
}
# Also catch native PostgreSQL errors
if [pg_message] =~ /^ERROR:/ or [pg_message] =~ /^FATAL:/ {
mutate { add_tag => ["error", "postgres_native"] }
}
}
}
output {
if "error" in [tags] and "postgres" in [tags] {
http {
url => "http://localhost:8000/api/store/"
http_method => "post"
format => "json"
}
}
}
```
### 5. Dual-File Update Requirement
**IMPORTANT**: All SQL function changes must be applied to BOTH files:
1. `sql/Initial_triggers_and_functions.sql` - Used for incremental updates
2. `sql/master_schema_rollup.sql` - Used for fresh database setup
Both files must remain in sync for triggers and functions.
## Implementation Steps
1. **Create `fn_log()` helper function**:
- Add to both SQL files
- Test with `SELECT fn_log('INFO', 'test', 'Test message', '{"key": "value"}'::jsonb);`
2. **Update Tier 1 critical functions** (highest priority):
- Identify functions with silent failures
- Add appropriate logging calls
- Test error paths
3. **Update Tier 2 business logic functions**:
- Add anomaly logging to suggestion/recommendation functions
- Log empty result sets with context
4. **Update Tier 3 trigger functions**:
- Add error-only logging to critical triggers
- Wrap complex trigger logic in exception handlers
5. **Configure PostgreSQL logging**:
- Update `postgresql.conf` in dev container
- Update production PostgreSQL configuration
- Verify logs appear in expected location
6. **Update Logstash pipeline**:
- Add PostgreSQL input to Logstash config
- Add filter rules for structured JSON extraction
- Test end-to-end: function log -> Logstash -> Bugsink
7. **Verify in Bugsink**:
- Confirm database function errors appear as issues
- Verify context (user_id, function name, params) is captured
## Consequences
### Positive
- **Visibility**: Silent failures become visible in error tracking
- **Debugging**: Function execution context captured for root cause analysis
- **Proactive detection**: Anomalies logged before users report issues
- **Unified monitoring**: Database errors appear alongside application errors in Bugsink
- **Structured logs**: JSON format enables filtering and aggregation
### Negative
- **Performance overhead**: Logging adds latency to function execution
- **Log volume**: Tier 1/2 functions may generate significant log volume
- **Maintenance**: Two SQL files must be kept in sync
- **PostgreSQL configuration**: Requires access to `postgresql.conf`
### Mitigations
- **Performance**: Only log meaningful events, not every function call
- **Log volume**: Use appropriate log levels; Logstash filters reduce noise
- **Sync**: Add CI check to verify SQL files match for function definitions
- **Configuration**: Document PostgreSQL settings in deployment runbook
## Examples
### Before (Silent Failure)
```sql
-- User thinks operation completed, but it silently failed
SELECT process_item('user-uuid', 'Nonexistent Item');
-- Returns: void (no error, no log)
-- Result: User never gets expected result, nobody knows why
```
### After (Observable Failure)
```sql
SELECT process_item('user-uuid', 'Nonexistent Item');
-- Returns: void
-- PostgreSQL log: {"timestamp":"2026-01-11T10:30:00Z","level":"WARNING","source":"postgresql","function":"process_item","message":"Item not found: Nonexistent Item","context":{"user_id":"user-uuid","item_name":"Nonexistent Item"}}
-- Bugsink: New issue created with full context
```
## References
- [ADR-029: Error Tracking with Bugsink](ADR-029-error-tracking-with-bugsink.md)
- [ADR-027: Application-Wide Structured Logging](ADR-027-application-wide-structured-logging.md)
- [PostgreSQL RAISE Documentation](https://www.postgresql.org/docs/current/plpgsql-errors-and-messages.html)
- [PostgreSQL Logging Configuration](https://www.postgresql.org/docs/current/runtime-config-logging.html)

View File

@@ -0,0 +1,262 @@
# ADR-031: Granular Debug Logging Strategy
**Date**: 2026-02-10
**Status**: Accepted
**Source**: Imported from flyer-crawler project (ADR-052)
**Related**: [ADR-027](ADR-027-application-wide-structured-logging.md), [ADR-017](ADR-017-structured-logging-with-pino.md)
## Context
Global log levels (INFO vs DEBUG) are too coarse. Developers need to inspect detailed debug information for specific subsystems (e.g., `ai-service`, `db-pool`, `auth-service`) without being flooded by logs from the entire application.
When debugging a specific feature:
- Setting `LOG_LEVEL=debug` globally produces too much noise
- Manually adding/removing debug statements is error-prone
- No standard way to enable targeted debugging in production
## Decision
We will adopt a namespace-based debug filter pattern, similar to the `debug` npm package, but integrated into our Pino logger.
1. **Logger Namespaces**: Every service/module logger must be initialized with a `module` property (e.g., `logger.child({ module: 'ai-service' })`).
2. **Environment Filter**: We will support a `DEBUG_MODULES` environment variable that overrides the log level for matching modules.
## Implementation
### Core Implementation
Implemented in `src/services/logger.server.ts`:
```typescript
import pino from 'pino';
// Parse DEBUG_MODULES from environment
const debugModules = (process.env.DEBUG_MODULES || '').split(',').map((s) => s.trim());
// Base logger configuration
export const logger = pino({
level: process.env.LOG_LEVEL || (process.env.NODE_ENV === 'production' ? 'info' : 'debug'),
// ... other configuration
});
/**
* Creates a scoped logger for a specific module.
* If DEBUG_MODULES includes this module or '*', debug level is enabled.
*/
export const createScopedLogger = (moduleName: string) => {
// If DEBUG_MODULES contains the module name or "*", force level to 'debug'
const isDebugEnabled = debugModules.includes('*') || debugModules.includes(moduleName);
return logger.child({
module: moduleName,
level: isDebugEnabled ? 'debug' : logger.level,
});
};
```
### Service Usage Examples
```typescript
// src/services/aiService.server.ts
import { createScopedLogger } from './logger.server';
const logger = createScopedLogger('ai-service');
export async function processWithAI(data: unknown) {
logger.debug({ data }, 'Starting AI processing');
// ... implementation
logger.info({ result }, 'AI processing completed');
}
```
```typescript
// src/services/authService.server.ts
import { createScopedLogger } from './logger.server';
const logger = createScopedLogger('auth-service');
export async function validateToken(token: string) {
logger.debug({ tokenLength: token.length }, 'Validating token');
// ... implementation
}
```
### Module Naming Convention
Use kebab-case suffixed with `-service` or `-worker`:
| Module Name | Purpose | File |
| --------------- | -------------------------------- | ------------------------------------- |
| `ai-service` | AI/external API interactions | `src/services/aiService.server.ts` |
| `auth-service` | Authentication and authorization | `src/services/authService.server.ts` |
| `db-pool` | Database connection pooling | `src/services/database.server.ts` |
| `cache-service` | Redis/caching operations | `src/services/cacheService.server.ts` |
| `queue-worker` | Background job processing | `src/workers/queueWorker.ts` |
| `email-service` | Email sending | `src/services/emailService.server.ts` |
## Usage
### Enable Debug Logging for Specific Modules
To debug only AI and authentication:
```bash
DEBUG_MODULES=ai-service,auth-service npm run dev
```
### Enable All Debug Logging
Use wildcard to enable debug logging for all modules:
```bash
DEBUG_MODULES=* npm run dev
```
### Development Environment
In `.env.development`:
```bash
# Enable debug logging for specific modules during development
DEBUG_MODULES=ai-service
```
### Production Troubleshooting
Temporarily enable debug logging for a specific subsystem:
```bash
# SSH into production server
ssh root@example.com
# Set environment variable and restart
DEBUG_MODULES=ai-service pm2 restart app-api
# View logs
pm2 logs app-api --lines 100
# Disable debug logging
pm2 unset DEBUG_MODULES app-api
pm2 restart app-api
```
### With PM2 Configuration
In `ecosystem.config.js`:
```javascript
module.exports = {
apps: [
{
name: 'app-api',
script: 'dist/server.js',
env: {
NODE_ENV: 'production',
// DEBUG_MODULES is unset by default
},
env_debug: {
NODE_ENV: 'production',
DEBUG_MODULES: 'ai-service,auth-service',
},
},
],
};
```
Start with debug logging:
```bash
pm2 start ecosystem.config.js --env debug
```
## Best Practices
### 1. Use Scoped Loggers for Long-Running Services
Services with complex workflows or external API calls should use `createScopedLogger` to allow targeted debugging:
```typescript
const logger = createScopedLogger('payment-service');
export async function processPayment(payment: Payment) {
logger.debug({ paymentId: payment.id }, 'Starting payment processing');
try {
const result = await externalPaymentAPI.process(payment);
logger.debug({ result }, 'External API response');
return result;
} catch (error) {
logger.error({ error, paymentId: payment.id }, 'Payment processing failed');
throw error;
}
}
```
### 2. Use Child Loggers for Contextual Data
Even within scoped loggers, create child loggers with job/request-specific context:
```typescript
const logger = createScopedLogger('queue-worker');
async function processJob(job: Job) {
const jobLogger = logger.child({ jobId: job.id, jobName: job.name });
jobLogger.debug('Starting job processing');
// ... processing
jobLogger.info('Job completed successfully');
}
```
### 3. Consistent Debug Message Patterns
Use consistent patterns for debug messages:
```typescript
// Function entry
logger.debug({ params: sanitizedParams }, 'Function entry: processOrder');
// External API calls
logger.debug({ url, method }, 'External API request');
logger.debug({ statusCode, duration }, 'External API response');
// State changes
logger.debug({ before, after }, 'State transition');
// Decision points
logger.debug({ condition, result }, 'Branch decision');
```
### 4. Production Usage Guidelines
- `DEBUG_MODULES` can be set in production for temporary debugging
- Should not be used continuously due to increased log volume
- Always unset after troubleshooting is complete
- Monitor log storage when debug logging is enabled
## Consequences
### Positive
- Developers can inspect detailed logs for specific subsystems without log flooding
- Production debugging becomes more targeted and efficient
- No performance impact when debug logging is disabled
- Compatible with existing Pino logging infrastructure
- Follows familiar pattern from `debug` npm package
### Negative
- Requires developers to know module names (mitigated by documentation)
- Not all services have adopted scoped loggers yet (gradual migration)
- Additional configuration complexity
## References
- [ADR-027: Application-Wide Structured Logging](ADR-027-application-wide-structured-logging.md)
- [ADR-017: Structured Logging with Pino](ADR-017-structured-logging-with-pino.md)
- [debug npm package](https://www.npmjs.com/package/debug) - Inspiration for namespace pattern
- [Pino Child Loggers](https://getpino.io/#/docs/child-loggers)

View File

@@ -0,0 +1,263 @@
# ADR-032: Application Performance Monitoring (APM)
**Date**: 2026-02-10
**Status**: Proposed
**Source**: Imported from flyer-crawler project (ADR-056)
**Related**: [ADR-029](ADR-029-error-tracking-with-bugsink.md) (Error Tracking with Bugsink)
## Context
Application Performance Monitoring (APM) provides visibility into application behavior through:
- **Distributed Tracing**: Track requests across services, queues, and database calls
- **Performance Metrics**: Response times, throughput, error rates
- **Resource Monitoring**: Memory usage, CPU, database connections
- **Transaction Analysis**: Identify slow endpoints and bottlenecks
While ADR-029 covers error tracking and observability, APM is a distinct concern focused on performance rather than errors. The Sentry SDK supports APM through its tracing features, but this capability is currently **intentionally disabled** in our application.
### Current State
The Sentry SDK is installed and configured for error tracking (see ADR-029), but APM features are disabled:
```typescript
// src/services/sentry.client.ts
Sentry.init({
dsn: config.sentry.dsn,
environment: config.sentry.environment,
// Performance monitoring - disabled for now to keep it simple
tracesSampleRate: 0,
// ...
});
```
```typescript
// src/services/sentry.server.ts
Sentry.init({
dsn: config.sentry.dsn,
environment: config.sentry.environment || config.server.nodeEnv,
// Performance monitoring - disabled for now to keep it simple
tracesSampleRate: 0,
// ...
});
```
### Why APM is Currently Disabled
1. **Complexity**: APM adds overhead and complexity to debugging
2. **Bugsink Limitations**: Bugsink's APM support is less mature than its error tracking
3. **Resource Overhead**: Tracing adds memory and CPU overhead
4. **Focus**: Error tracking provides more immediate value for our current scale
5. **Cost**: High sample rates can significantly increase storage requirements
## Decision
We propose a **staged approach** to APM implementation:
### Phase 1: Selective Backend Tracing (Low Priority)
Enable tracing for specific high-value operations:
```typescript
// Enable tracing for specific transactions only
Sentry.init({
dsn: config.sentry.dsn,
tracesSampleRate: 0, // Keep default at 0
// Trace only specific high-value transactions
tracesSampler: (samplingContext) => {
const transactionName = samplingContext.transactionContext?.name;
// Always trace long-running jobs
if (transactionName?.includes('job-processing')) {
return 0.1; // 10% sample rate
}
// Always trace AI/external API calls
if (transactionName?.includes('external-api')) {
return 0.5; // 50% sample rate
}
// Trace slow endpoints (determined by custom logic)
if (samplingContext.parentSampled) {
return 0.1; // 10% for child transactions
}
return 0; // Don't trace other transactions
},
});
```
### Phase 2: Custom Performance Metrics
Add custom metrics without full tracing overhead:
```typescript
// Custom metric for slow database queries
import { metrics } from '@sentry/node';
// In repository methods
const startTime = performance.now();
const result = await pool.query(sql, params);
const duration = performance.now() - startTime;
metrics.distribution('db.query.duration', duration, {
tags: { query_type: 'select', table: 'users' },
});
if (duration > 1000) {
logger.warn({ duration, sql }, 'Slow query detected');
}
```
### Phase 3: Full APM Integration (Future)
When/if full APM is needed:
```typescript
Sentry.init({
dsn: config.sentry.dsn,
tracesSampleRate: 0.1, // 10% of transactions
profilesSampleRate: 0.1, // 10% of traced transactions get profiled
integrations: [
// Database tracing
Sentry.postgresIntegration(),
// Redis tracing
Sentry.redisIntegration(),
// BullMQ job tracing (custom integration)
],
});
```
## Implementation Steps
### To Enable Basic APM
1. **Update Sentry Configuration**:
- Set `tracesSampleRate` > 0 in `src/services/sentry.server.ts`
- Set `tracesSampleRate` > 0 in `src/services/sentry.client.ts`
- Add environment variable `SENTRY_TRACES_SAMPLE_RATE` (default: 0)
2. **Add Instrumentation**:
- Enable automatic Express instrumentation
- Add manual spans for BullMQ job processing
- Add database query instrumentation
3. **Frontend Tracing**:
- Add Browser Tracing integration
- Configure page load and navigation tracing
4. **Environment Variables**:
```bash
SENTRY_TRACES_SAMPLE_RATE=0.1 # 10% sampling
SENTRY_PROFILES_SAMPLE_RATE=0 # Profiling disabled
```
5. **Bugsink Configuration**:
- Verify Bugsink supports performance data ingestion
- Configure retention policies for performance data
### Configuration Changes Required
```typescript
// src/config/env.ts - Add new config
sentry: {
dsn: env.SENTRY_DSN,
environment: env.SENTRY_ENVIRONMENT,
debug: env.SENTRY_DEBUG === 'true',
tracesSampleRate: parseFloat(env.SENTRY_TRACES_SAMPLE_RATE || '0'),
profilesSampleRate: parseFloat(env.SENTRY_PROFILES_SAMPLE_RATE || '0'),
},
```
```typescript
// src/services/sentry.server.ts - Updated init
Sentry.init({
dsn: config.sentry.dsn,
environment: config.sentry.environment,
tracesSampleRate: config.sentry.tracesSampleRate,
profilesSampleRate: config.sentry.profilesSampleRate,
// ... rest of config
});
```
## Trade-offs
### Enabling APM
**Benefits**:
- Identify performance bottlenecks
- Track distributed transactions across services
- Profile slow endpoints
- Monitor resource utilization trends
**Costs**:
- Increased memory usage (~5-15% overhead)
- Additional CPU for trace processing
- Increased storage in Bugsink/Sentry
- More complex debugging (noise in traces)
- Potential latency from tracing overhead
### Keeping APM Disabled
**Benefits**:
- Simpler operation and debugging
- Lower resource overhead
- Focused on error tracking (higher priority)
- No additional storage costs
**Costs**:
- No automated performance insights
- Manual profiling required for bottleneck detection
- Limited visibility into slow transactions
## Alternatives Considered
1. **OpenTelemetry**: More vendor-neutral, but adds another dependency and complexity
2. **Prometheus + Grafana**: Good for metrics, but doesn't provide distributed tracing
3. **Jaeger/Zipkin**: Purpose-built for tracing, but requires additional infrastructure
4. **New Relic/Datadog SaaS**: Full-featured but conflicts with self-hosted requirement
## Current Recommendation
**Keep APM disabled** (`tracesSampleRate: 0`) until:
1. Specific performance issues are identified that require tracing
2. Bugsink's APM support is verified and tested
3. Infrastructure can support the additional overhead
4. There is a clear business need for performance visibility
When enabling APM becomes necessary, start with Phase 1 (selective tracing) to minimize overhead while gaining targeted insights.
## Consequences
### Positive (When Implemented)
- Automated identification of slow endpoints
- Distributed trace visualization across async operations
- Correlation between errors and performance issues
- Proactive alerting on performance degradation
### Negative
- Additional infrastructure complexity
- Storage overhead for trace data
- Potential performance impact from tracing itself
- Learning curve for trace analysis
## References
- [Sentry Performance Monitoring](https://docs.sentry.io/product/performance/)
- [@sentry/node Performance](https://docs.sentry.io/platforms/javascript/guides/node/performance/)
- [@sentry/react Performance](https://docs.sentry.io/platforms/javascript/guides/react/performance/)
- [OpenTelemetry](https://opentelemetry.io/) (alternative approach)
- [ADR-029: Error Tracking with Bugsink](ADR-029-error-tracking-with-bugsink.md)

View File

@@ -0,0 +1,340 @@
# ADR-033: Bugsink to Gitea Issue Synchronization
**Date**: 2026-02-10
**Status**: Proposed
**Source**: Imported from flyer-crawler project (ADR-054)
**Related**: [ADR-029](ADR-029-error-tracking-with-bugsink.md), [ADR-012](ADR-012-bullmq-background-job-processing.md)
## Context
The application uses Bugsink (Sentry-compatible self-hosted error tracking) to capture runtime errors across multiple projects:
| Project Type | Environment | Description |
| -------------- | ------------ | ---------------------------------------- |
| Backend | Production | Main API server errors |
| Backend | Test/Staging | Pre-production API errors |
| Frontend | Production | Client-side JavaScript errors |
| Frontend | Test/Staging | Pre-production frontend errors |
| Infrastructure | Production | Infrastructure-level errors (Redis, PM2) |
| Infrastructure | Test/Staging | Pre-production infrastructure errors |
Currently, errors remain in Bugsink until manually reviewed. There is no automated workflow to:
1. Create trackable tickets for errors
2. Assign errors to developers
3. Track resolution progress
4. Prevent errors from being forgotten
## Decision
Implement an automated background worker that synchronizes unresolved Bugsink issues to Gitea as trackable tickets. The sync worker will:
1. **Run only on the test/staging server** (not production, not dev container)
2. **Poll all Bugsink projects** for unresolved issues
3. **Create Gitea issues** with full error context
4. **Mark synced issues as resolved** in Bugsink (to prevent re-polling)
5. **Track sync state in Redis** to ensure idempotency
### Why Test/Staging Only?
- The sync worker is a background service that needs API tokens for both Bugsink and Gitea
- Running on test/staging provides a single sync point without duplicating infrastructure
- All Bugsink projects (including production) are synced from this one worker
- Production server stays focused on serving users, not running sync jobs
## Architecture
### Component Overview
```
+-----------------------------------------------------------------------+
| TEST/STAGING SERVER |
| |
| +------------------+ +------------------+ +-------------------+ |
| | BullMQ Queue |--->| Sync Worker |--->| Redis DB 15 | |
| | bugsink-sync | | (15min repeat) | | Sync State | |
| +------------------+ +--------+---------+ +-------------------+ |
| | |
+-----------------------------------+------------------------------------+
|
+---------------+---------------+
v v
+------------------+ +------------------+
| Bugsink | | Gitea |
| (all projects) | | (1 repo) |
+------------------+ +------------------+
```
### Queue Configuration
| Setting | Value | Rationale |
| --------------- | ---------------------- | -------------------------------------------- |
| Queue Name | `bugsink-sync` | Follows existing naming pattern |
| Repeat Interval | 15 minutes | Balances responsiveness with API rate limits |
| Retry Attempts | 3 | Standard retry policy |
| Backoff | Exponential (30s base) | Handles temporary API failures |
| Concurrency | 1 | Serial processing prevents race conditions |
### Redis Database Allocation
| Database | Usage | Owner |
| -------- | ------------------- | --------------- |
| 0 | BullMQ (Production) | Existing queues |
| 1 | BullMQ (Test) | Existing queues |
| 2-14 | Reserved | Future use |
| 15 | Bugsink Sync State | This feature |
### Redis Key Schema
```
bugsink:synced:{bugsink_issue_id}
+-- Value: JSON {
gitea_issue_number: number,
synced_at: ISO timestamp,
project: string,
title: string
}
```
### Gitea Labels
The following labels should be created in the repository:
| Label | Color | Purpose |
| -------------------- | ------------------ | ---------------------------------- |
| `bug:frontend` | #e11d48 (Red) | Frontend JavaScript/React errors |
| `bug:backend` | #ea580c (Orange) | Backend Node.js/API errors |
| `bug:infrastructure` | #7c3aed (Purple) | Infrastructure errors (Redis, PM2) |
| `env:production` | #dc2626 (Dark Red) | Production environment |
| `env:test` | #2563eb (Blue) | Test/staging environment |
| `env:development` | #6b7280 (Gray) | Development environment |
| `source:bugsink` | #10b981 (Green) | Auto-synced from Bugsink |
### Label Mapping
| Bugsink Project Type | Bug Label | Env Label |
| --------------------- | ------------------ | -------------- |
| backend (prod) | bug:backend | env:production |
| backend (test) | bug:backend | env:test |
| frontend (prod) | bug:frontend | env:production |
| frontend (test) | bug:frontend | env:test |
| infrastructure (prod) | bug:infrastructure | env:production |
| infrastructure (test) | bug:infrastructure | env:test |
All synced issues also receive the `source:bugsink` label.
## Implementation Details
### New Files
| File | Purpose |
| -------------------------------------- | ------------------------------------------- |
| `src/services/bugsinkSync.server.ts` | Core synchronization logic |
| `src/services/bugsinkClient.server.ts` | HTTP client for Bugsink API |
| `src/services/giteaClient.server.ts` | HTTP client for Gitea API |
| `src/types/bugsink.ts` | TypeScript interfaces for Bugsink responses |
| `src/routes/admin/bugsink-sync.ts` | Admin endpoints for manual trigger |
### Modified Files
| File | Changes |
| -------------------------------- | ------------------------------------- |
| `src/services/queues.server.ts` | Add `bugsinkSyncQueue` definition |
| `src/services/workers.server.ts` | Add sync worker implementation |
| `src/config/env.ts` | Add bugsink sync configuration schema |
| `.env.example` | Document new environment variables |
### Environment Variables
```bash
# Bugsink Configuration
BUGSINK_URL=https://bugsink.example.com
BUGSINK_API_TOKEN=... # Created via Django management command
# Gitea Configuration
GITEA_URL=https://gitea.example.com
GITEA_API_TOKEN=... # Personal access token with repo scope
GITEA_OWNER=org-name
GITEA_REPO=project-repo
# Sync Control
BUGSINK_SYNC_ENABLED=false # Set true only in test environment
BUGSINK_SYNC_INTERVAL=15 # Minutes between sync runs
```
### Gitea Issue Template
```markdown
## Error Details
| Field | Value |
| ------------ | --------------- |
| **Type** | {error_type} |
| **Message** | {error_message} |
| **Platform** | {platform} |
| **Level** | {level} |
## Occurrence Statistics
- **First Seen**: {first_seen}
- **Last Seen**: {last_seen}
- **Total Occurrences**: {count}
## Request Context
- **URL**: {request_url}
- **Additional Context**: {context}
## Stacktrace
<details>
<summary>Click to expand</summary>
{stacktrace}
</details>
---
**Bugsink Issue**: {bugsink_url}
**Project**: {project_slug}
**Trace ID**: {trace_id}
```
### Sync Workflow
```
1. Worker triggered (every 15 min or manual)
2. For each Bugsink project:
a. List issues with status='unresolved'
b. For each issue:
i. Check Redis for existing sync record
ii. If already synced -> skip
iii. Fetch issue details + stacktrace
iv. Create Gitea issue with labels
v. Store sync record in Redis
vi. Mark issue as 'resolved' in Bugsink
3. Log summary (synced: N, skipped: N, failed: N)
```
### Idempotency Guarantees
1. **Redis check before creation**: Prevents duplicate Gitea issues
2. **Atomic Redis write after Gitea create**: Ensures state consistency
3. **Query only unresolved issues**: Resolved issues won't appear in polls
4. **No TTL on Redis keys**: Permanent sync history
## Admin Interface
### Manual Sync Endpoint
```
POST /api/admin/bugsink/sync
Authorization: Bearer {admin_jwt}
Response:
{
"success": true,
"data": {
"synced": 3,
"skipped": 12,
"failed": 0,
"duration_ms": 2340
}
}
```
### Sync Status Endpoint
```
GET /api/admin/bugsink/sync/status
Authorization: Bearer {admin_jwt}
Response:
{
"success": true,
"data": {
"enabled": true,
"last_run": "2026-01-17T10:30:00Z",
"next_run": "2026-01-17T10:45:00Z",
"total_synced": 47,
"projects": [
{ "slug": "backend-prod", "synced_count": 12 },
...
]
}
}
```
## Implementation Phases
### Phase 1: Core Infrastructure
- Add environment variables to `env.ts` schema
- Create `BugsinkClient` service (HTTP client)
- Create `GiteaClient` service (HTTP client)
- Add Redis db 15 connection for sync tracking
### Phase 2: Sync Logic
- Create `BugsinkSyncService` with sync logic
- Add `bugsink-sync` queue to `queues.server.ts`
- Add sync worker to `workers.server.ts`
- Create TypeScript types for API responses
### Phase 3: Integration
- Add admin endpoints for manual sync trigger
- Update CI/CD with new secrets
- Add secrets to repository settings
- Test end-to-end in staging environment
### Phase 4: Documentation
- Update CLAUDE.md with sync information
- Create operational runbook for sync issues
## Consequences
### Positive
1. **Visibility**: All application errors become trackable tickets
2. **Accountability**: Errors can be assigned to developers
3. **History**: Complete audit trail of when errors were discovered and resolved
4. **Integration**: Errors appear alongside feature work in Gitea
5. **Automation**: No manual error triage required
### Negative
1. **API Dependencies**: Requires both Bugsink and Gitea APIs to be available
2. **Token Management**: Additional secrets to manage in CI/CD
3. **Potential Noise**: High-frequency errors could create many tickets (mitigated by Bugsink's issue grouping)
4. **Single Point**: Sync only runs on test server (if test server is down, no sync occurs)
### Risks and Mitigations
| Risk | Mitigation |
| ----------------------- | ------------------------------------------------- |
| Bugsink API rate limits | 15-minute polling interval |
| Gitea API rate limits | Sequential processing with delays |
| Redis connection issues | Reuse existing connection patterns |
| Duplicate issues | Redis tracking + idempotent checks |
| Missing stacktrace | Graceful degradation (create issue without trace) |
## Future Enhancements
1. **Bi-directional sync**: Update Bugsink when Gitea issue is closed
2. **Smart deduplication**: Detect similar errors across projects
3. **Priority mapping**: High occurrence count -> high priority label
4. **Slack/Discord notifications**: Alert on new critical errors
5. **Metrics dashboard**: Track error trends over time
## References
- [ADR-012: BullMQ Background Job Processing](ADR-012-bullmq-background-job-processing.md)
- [ADR-029: Error Tracking with Bugsink](ADR-029-error-tracking-with-bugsink.md)
- [Bugsink API Documentation](https://bugsink.com/docs/api/)
- [Gitea API Documentation](https://docs.gitea.io/en-us/api-usage/)

View File

@@ -15,9 +15,10 @@ This document tracks the implementation status and estimated effort for all Arch
| Status | Count |
| ---------------------------- | ----- |
| Accepted (Fully Implemented) | 30 |
| Accepted (Fully Implemented) | 42 |
| Partially Implemented | 2 |
| Proposed (Not Started) | 16 |
| Proposed (Not Started) | 12 |
| Superseded | 1 |
---
@@ -34,23 +35,23 @@ This document tracks the implementation status and estimated effort for all Arch
### Category 2: Data Management
| ADR | Title | Status | Effort | Notes |
| --------------------------------------------------------------- | ------------------------ | -------- | ------ | ------------------------------ |
| [ADR-009](./0009-caching-strategy-for-read-heavy-operations.md) | Caching Strategy | Accepted | - | Fully implemented |
| [ADR-013](./0013-database-schema-migration-strategy.md) | Schema Migrations v1 | Proposed | M | Superseded by ADR-023 |
| [ADR-019](./0019-data-backup-and-recovery-strategy.md) | Backup & Recovery | Accepted | - | Fully implemented |
| [ADR-023](./0023-database-schema-migration-strategy.md) | Schema Migrations v2 | Proposed | L | Requires tooling setup |
| [ADR-031](./0031-data-retention-and-privacy-compliance.md) | Data Retention & Privacy | Proposed | XL | Legal/compliance review needed |
| ADR | Title | Status | Effort | Notes |
| --------------------------------------------------------------- | ------------------------ | ---------- | ------ | ------------------------------ |
| [ADR-009](./0009-caching-strategy-for-read-heavy-operations.md) | Caching Strategy | Accepted | - | Fully implemented |
| [ADR-013](./0013-database-schema-migration-strategy.md) | Schema Migrations v1 | Superseded | - | Superseded by ADR-023 |
| [ADR-019](./0019-data-backup-and-recovery-strategy.md) | Backup & Recovery | Accepted | - | Fully implemented |
| [ADR-023](./0023-database-schema-migration-strategy.md) | Schema Migrations v2 | Proposed | L | Requires tooling setup |
| [ADR-031](./0031-data-retention-and-privacy-compliance.md) | Data Retention & Privacy | Proposed | XL | Legal/compliance review needed |
### Category 3: API & Integration
| ADR | Title | Status | Effort | Notes |
| ------------------------------------------------------------------- | ------------------------ | ----------- | ------ | ------------------------------------- |
| [ADR-003](./0003-standardized-input-validation-using-middleware.md) | Input Validation | Accepted | - | Fully implemented |
| [ADR-008](./0008-api-versioning-strategy.md) | API Versioning | Proposed | L | Major URL/routing changes |
| [ADR-018](./0018-api-documentation-strategy.md) | API Documentation | Accepted | - | OpenAPI/Swagger implemented |
| [ADR-022](./0022-real-time-notification-system.md) | Real-time Notifications | Proposed | XL | WebSocket infrastructure |
| [ADR-028](./0028-api-response-standardization.md) | Response Standardization | Implemented | L | Completed (routes, middleware, tests) |
| ADR | Title | Status | Effort | Notes |
| ------------------------------------------------------------------- | ------------------------ | -------- | ------ | ------------------------------------- |
| [ADR-003](./0003-standardized-input-validation-using-middleware.md) | Input Validation | Accepted | - | Fully implemented |
| [ADR-008](./0008-api-versioning-strategy.md) | API Versioning | Accepted | - | Phase 2 complete, tests migrated |
| [ADR-018](./0018-api-documentation-strategy.md) | API Documentation | Accepted | - | OpenAPI/Swagger implemented |
| [ADR-022](./0022-real-time-notification-system.md) | Real-time Notifications | Accepted | - | Fully implemented |
| [ADR-028](./0028-api-response-standardization.md) | Response Standardization | Accepted | - | Completed (routes, middleware, tests) |
### Category 4: Security & Compliance
@@ -62,25 +63,31 @@ This document tracks the implementation status and estimated effort for all Arch
| [ADR-029](./0029-secret-rotation-and-key-management.md) | Secret Rotation | Proposed | L | Infrastructure changes needed |
| [ADR-032](./0032-rate-limiting-strategy.md) | Rate Limiting | Accepted | - | Fully implemented |
| [ADR-033](./0033-file-upload-and-storage-strategy.md) | File Upload & Storage | Accepted | - | Fully implemented |
| [ADR-048](./0048-authentication-strategy.md) | Authentication | Accepted | - | Fully implemented |
### Category 5: Observability & Monitoring
| ADR | Title | Status | Effort | Notes |
| -------------------------------------------------------------------------- | --------------------------- | -------- | ------ | --------------------------------- |
| [ADR-004](./0004-standardized-application-wide-structured-logging.md) | Structured Logging | Accepted | - | Fully implemented |
| [ADR-015](./0015-application-performance-monitoring-and-error-tracking.md) | APM & Error Tracking | Proposed | M | Third-party integration |
| [ADR-050](./0050-postgresql-function-observability.md) | PostgreSQL Fn Observability | Proposed | M | Depends on ADR-015 implementation |
| ADR | Title | Status | Effort | Notes |
| --------------------------------------------------------------------- | --------------------------- | -------- | ------ | ------------------------------------------ |
| [ADR-004](./0004-standardized-application-wide-structured-logging.md) | Structured Logging | Accepted | - | Fully implemented |
| [ADR-015](./0015-error-tracking-and-observability.md) | Error Tracking | Accepted | - | Fully implemented |
| [ADR-050](./0050-postgresql-function-observability.md) | PostgreSQL Fn Observability | Accepted | - | Fully implemented |
| [ADR-051](./0051-asynchronous-context-propagation.md) | Context Propagation | Accepted | - | Fully implemented |
| [ADR-052](./0052-granular-debug-logging-strategy.md) | Granular Debug Logging | Accepted | - | Fully implemented |
| [ADR-056](./0056-application-performance-monitoring.md) | APM (Performance) | Proposed | M | tracesSampleRate=0, intentionally disabled |
### Category 6: Deployment & Operations
| ADR | Title | Status | Effort | Notes |
| -------------------------------------------------------------- | ----------------- | -------- | ------ | -------------------------- |
| [ADR-006](./0006-background-job-processing-and-task-queues.md) | Background Jobs | Accepted | - | Fully implemented |
| [ADR-014](./0014-containerization-and-deployment-strategy.md) | Containerization | Partial | M | Docker done, K8s pending |
| [ADR-017](./0017-ci-cd-and-branching-strategy.md) | CI/CD & Branching | Accepted | - | Fully implemented |
| [ADR-024](./0024-feature-flagging-strategy.md) | Feature Flags | Proposed | M | New service/library needed |
| [ADR-037](./0037-scheduled-jobs-and-cron-pattern.md) | Scheduled Jobs | Accepted | - | Fully implemented |
| [ADR-038](./0038-graceful-shutdown-pattern.md) | Graceful Shutdown | Accepted | - | Fully implemented |
| ADR | Title | Status | Effort | Notes |
| -------------------------------------------------------------- | ------------------ | -------- | ------ | ------------------------ |
| [ADR-006](./0006-background-job-processing-and-task-queues.md) | Background Jobs | Accepted | - | Fully implemented |
| [ADR-014](./0014-containerization-and-deployment-strategy.md) | Containerization | Partial | M | Docker done, K8s pending |
| [ADR-017](./0017-ci-cd-and-branching-strategy.md) | CI/CD & Branching | Accepted | - | Fully implemented |
| [ADR-024](./0024-feature-flagging-strategy.md) | Feature Flags | Accepted | - | Fully implemented |
| [ADR-037](./0037-scheduled-jobs-and-cron-pattern.md) | Scheduled Jobs | Accepted | - | Fully implemented |
| [ADR-038](./0038-graceful-shutdown-pattern.md) | Graceful Shutdown | Accepted | - | Fully implemented |
| [ADR-053](./0053-worker-health-checks.md) | Worker Health | Accepted | - | Fully implemented |
| [ADR-054](./0054-bugsink-gitea-issue-sync.md) | Bugsink-Gitea Sync | Proposed | L | Automated issue creation |
### Category 7: Frontend / User Interface
@@ -99,61 +106,82 @@ This document tracks the implementation status and estimated effort for all Arch
| [ADR-010](./0010-testing-strategy-and-standards.md) | Testing Strategy | Accepted | - | Fully implemented |
| [ADR-021](./0021-code-formatting-and-linting-unification.md) | Formatting & Linting | Accepted | - | Fully implemented |
| [ADR-027](./0027-standardized-naming-convention-for-ai-and-database-types.md) | Naming Conventions | Accepted | - | Fully implemented |
| [ADR-040](./0040-testing-economics-and-priorities.md) | Testing Economics | Accepted | - | Fully implemented |
| [ADR-045](./0045-test-data-factories-and-fixtures.md) | Test Data Factories | Accepted | - | Fully implemented |
| [ADR-047](./0047-project-file-and-folder-organization.md) | Project Organization | Proposed | XL | Major reorganization |
| [ADR-057](./0057-test-remediation-post-api-versioning.md) | Test Remediation | Accepted | - | Fully implemented |
### Category 9: Architecture Patterns
| ADR | Title | Status | Effort | Notes |
| -------------------------------------------------------- | --------------------- | -------- | ------ | ----------------- |
| [ADR-034](./0034-repository-pattern-standards.md) | Repository Pattern | Accepted | - | Fully implemented |
| [ADR-035](./0035-service-layer-architecture.md) | Service Layer | Accepted | - | Fully implemented |
| [ADR-036](./0036-event-bus-and-pub-sub-pattern.md) | Event Bus | Accepted | - | Fully implemented |
| [ADR-039](./0039-dependency-injection-pattern.md) | Dependency Injection | Accepted | - | Fully implemented |
| [ADR-041](./0041-ai-gemini-integration-architecture.md) | AI/Gemini Integration | Accepted | - | Fully implemented |
| [ADR-042](./0042-email-and-notification-architecture.md) | Email & Notifications | Accepted | - | Fully implemented |
| [ADR-043](./0043-express-middleware-pipeline.md) | Middleware Pipeline | Accepted | - | Fully implemented |
| [ADR-046](./0046-image-processing-pipeline.md) | Image Processing | Accepted | - | Fully implemented |
| [ADR-049](./0049-gamification-and-achievement-system.md) | Gamification System | Accepted | - | Fully implemented |
| ADR | Title | Status | Effort | Notes |
| --------------------------------------------------------------------- | --------------------- | -------- | ------ | ------------------------- |
| [ADR-034](./0034-repository-pattern-standards.md) | Repository Pattern | Accepted | - | Fully implemented |
| [ADR-035](./0035-service-layer-architecture.md) | Service Layer | Accepted | - | Fully implemented |
| [ADR-036](./0036-event-bus-and-pub-sub-pattern.md) | Event Bus | Accepted | - | Fully implemented |
| [ADR-039](./0039-dependency-injection-pattern.md) | Dependency Injection | Accepted | - | Fully implemented |
| [ADR-041](./0041-ai-gemini-integration-architecture.md) | AI/Gemini Integration | Accepted | - | Fully implemented |
| [ADR-042](./0042-email-and-notification-architecture.md) | Email & Notifications | Accepted | - | Fully implemented |
| [ADR-043](./0043-express-middleware-pipeline.md) | Middleware Pipeline | Accepted | - | Fully implemented |
| [ADR-046](./0046-image-processing-pipeline.md) | Image Processing | Accepted | - | Fully implemented |
| [ADR-049](./0049-gamification-and-achievement-system.md) | Gamification System | Accepted | - | Fully implemented |
| [ADR-055](./0055-database-normalization-and-referential-integrity.md) | DB Normalization | Accepted | M | API uses IDs, not strings |
---
## Work Still To Be Completed (Priority Order)
These ADRs are proposed but not yet implemented, ordered by suggested implementation priority:
These ADRs are proposed or partially implemented, ordered by suggested implementation priority:
| Priority | ADR | Title | Effort | Rationale |
| -------- | ------- | --------------------------- | ------ | ------------------------------------------------- |
| 1 | ADR-015 | APM & Error Tracking | M | Production visibility, debugging |
| 1b | ADR-050 | PostgreSQL Fn Observability | M | Database function visibility (depends on ADR-015) |
| 2 | ADR-024 | Feature Flags | M | Safer deployments, A/B testing |
| 3 | ADR-023 | Schema Migrations v2 | L | Database evolution support |
| 4 | ADR-029 | Secret Rotation | L | Security improvement |
| 5 | ADR-008 | API Versioning | L | Future API evolution |
| 6 | ADR-030 | Circuit Breaker | L | Resilience improvement |
| 7 | ADR-022 | Real-time Notifications | XL | Major feature enhancement |
| 8 | ADR-011 | Authorization & RBAC | XL | Advanced permission system |
| 9 | ADR-025 | i18n & l10n | XL | Multi-language support |
| 10 | ADR-031 | Data Retention & Privacy | XL | Compliance requirements |
| Priority | ADR | Title | Status | Effort | Rationale |
| -------- | ------- | ------------------------ | -------- | ------ | ------------------------------------ |
| 1 | ADR-054 | Bugsink-Gitea Sync | Proposed | L | Automated issue tracking from errors |
| 2 | ADR-023 | Schema Migrations v2 | Proposed | L | Database evolution support |
| 3 | ADR-029 | Secret Rotation | Proposed | L | Security improvement |
| 4 | ADR-030 | Circuit Breaker | Proposed | L | Resilience improvement |
| 5 | ADR-056 | APM (Performance) | Proposed | M | Enable when performance issues arise |
| 6 | ADR-011 | Authorization & RBAC | Proposed | XL | Advanced permission system |
| 7 | ADR-025 | i18n & l10n | Proposed | XL | Multi-language support |
| 8 | ADR-031 | Data Retention & Privacy | Proposed | XL | Compliance requirements |
---
## Recent Implementation History
| Date | ADR | Change |
| ---------- | ------- | ---------------------------------------------------------------------- |
| 2026-01-11 | ADR-050 | Created - PostgreSQL function observability with fn_log() and Logstash |
| 2026-01-11 | ADR-018 | Implemented - OpenAPI/Swagger documentation at /docs/api-docs |
| 2026-01-11 | ADR-049 | Created - Gamification system, achievements, and testing requirements |
| 2026-01-09 | ADR-047 | Created - Project file/folder organization with migration plan |
| 2026-01-09 | ADR-041 | Created - AI/Gemini integration with model fallback and rate limiting |
| 2026-01-09 | ADR-042 | Created - Email and notification architecture with BullMQ queuing |
| 2026-01-09 | ADR-043 | Created - Express middleware pipeline ordering and patterns |
| 2026-01-09 | ADR-044 | Created - Frontend feature-based folder organization |
| 2026-01-09 | ADR-045 | Created - Test data factory pattern for mock generation |
| 2026-01-09 | ADR-046 | Created - Image processing pipeline with Sharp and EXIF stripping |
| 2026-01-09 | ADR-026 | Fully implemented - client-side structured logger |
| 2026-01-09 | ADR-028 | Fully implemented - all routes, middleware, and tests updated |
| Date | ADR | Change |
| ---------- | ------- | ----------------------------------------------------------------------------------- |
| 2026-01-28 | ADR-024 | Fully implemented - Backend/frontend feature flags, 89 tests, admin endpoint |
| 2026-01-28 | ADR-057 | Created - Test remediation documentation for ADR-008 Phase 2 migration |
| 2026-01-28 | ADR-013 | Marked as Superseded by ADR-023 |
| 2026-01-27 | ADR-008 | Test path migration complete - 23 files, ~70 paths updated, 274->345 tests passing |
| 2026-01-27 | ADR-008 | Phase 2 Complete - Version router factory, deprecation headers, 82 versioning tests |
| 2026-01-26 | ADR-015 | Completed - Added Sentry user context in AuthProvider, now fully implemented |
| 2026-01-26 | ADR-056 | Created - APM split from ADR-015, status Proposed (tracesSampleRate=0) |
| 2026-01-26 | ADR-015 | Refactored to focus on error tracking only, temporarily status Partial |
| 2026-01-26 | ADR-048 | Verified as fully implemented - JWT + OAuth authentication complete |
| 2026-01-26 | ADR-022 | Verified as fully implemented - WebSocket notifications complete |
| 2026-01-26 | ADR-052 | Marked as fully implemented - createScopedLogger complete |
| 2026-01-26 | ADR-053 | Marked as fully implemented - /health/queues endpoint complete |
| 2026-01-26 | ADR-050 | Marked as fully implemented - PostgreSQL function observability |
| 2026-01-26 | ADR-055 | Created (renumbered from duplicate ADR-023) - DB normalization |
| 2026-01-26 | ADR-054 | Added to tracker - Bugsink to Gitea issue synchronization |
| 2026-01-26 | ADR-053 | Added to tracker - Worker health checks and monitoring |
| 2026-01-26 | ADR-052 | Added to tracker - Granular debug logging strategy |
| 2026-01-26 | ADR-051 | Added to tracker - Asynchronous context propagation |
| 2026-01-26 | ADR-048 | Added to tracker - Authentication strategy |
| 2026-01-26 | ADR-040 | Added to tracker - Testing economics and priorities |
| 2026-01-17 | ADR-054 | Created - Bugsink-Gitea sync worker proposal |
| 2026-01-11 | ADR-050 | Created - PostgreSQL function observability with fn_log() |
| 2026-01-11 | ADR-018 | Implemented - OpenAPI/Swagger documentation at /docs/api-docs |
| 2026-01-11 | ADR-049 | Created - Gamification system, achievements, and testing |
| 2026-01-09 | ADR-047 | Created - Project file/folder organization with migration plan |
| 2026-01-09 | ADR-041 | Created - AI/Gemini integration with model fallback |
| 2026-01-09 | ADR-042 | Created - Email and notification architecture with BullMQ |
| 2026-01-09 | ADR-043 | Created - Express middleware pipeline ordering and patterns |
| 2026-01-09 | ADR-044 | Created - Frontend feature-based folder organization |
| 2026-01-09 | ADR-045 | Created - Test data factory pattern for mock generation |
| 2026-01-09 | ADR-046 | Created - Image processing pipeline with Sharp and EXIF stripping |
| 2026-01-09 | ADR-026 | Fully implemented - client-side structured logger |
| 2026-01-09 | ADR-028 | Fully implemented - all routes, middleware, and tests updated |
---

View File

@@ -2,6 +2,8 @@
This directory contains a log of the architectural decisions made for the Flyer Crawler project.
**[Implementation Tracker](./adr-implementation-tracker.md)**: Track implementation status and effort estimates for all ADRs.
## 1. Foundational / Core Infrastructure
**[ADR-002](./0002-standardized-transaction-management.md)**: Standardized Transaction Management and Unit of Work Pattern (Accepted)
@@ -12,7 +14,7 @@ This directory contains a log of the architectural decisions made for the Flyer
## 2. Data Management
**[ADR-009](./0009-caching-strategy-for-read-heavy-operations.md)**: Caching Strategy for Read-Heavy Operations (Accepted)
**[ADR-013](./0013-database-schema-migration-strategy.md)**: Database Schema Migration Strategy (Proposed)
**[ADR-013](./0013-database-schema-migration-strategy.md)**: Database Schema Migration Strategy (Superseded by ADR-023)
**[ADR-019](./0019-data-backup-and-recovery-strategy.md)**: Data Backup and Recovery Strategy (Accepted)
**[ADR-023](./0023-database-schema-migration-strategy.md)**: Database Schema Migration Strategy (Proposed)
**[ADR-031](./0031-data-retention-and-privacy-compliance.md)**: Data Retention and Privacy Compliance (Proposed)
@@ -20,9 +22,9 @@ This directory contains a log of the architectural decisions made for the Flyer
## 3. API & Integration
**[ADR-003](./0003-standardized-input-validation-using-middleware.md)**: Standardized Input Validation using Middleware (Accepted)
**[ADR-008](./0008-api-versioning-strategy.md)**: API Versioning Strategy (Proposed)
**[ADR-018](./0018-api-documentation-strategy.md)**: API Documentation Strategy (Proposed)
**[ADR-022](./0022-real-time-notification-system.md)**: Real-time Notification System (Proposed)
**[ADR-008](./0008-api-versioning-strategy.md)**: API Versioning Strategy (Accepted - Phase 2 Complete)
**[ADR-018](./0018-api-documentation-strategy.md)**: API Documentation Strategy (Superseded - tsoa migration complete)
**[ADR-022](./0022-real-time-notification-system.md)**: Real-time Notification System (Accepted)
**[ADR-028](./0028-api-response-standardization.md)**: API Response Standardization and Envelope Pattern (Implemented)
## 4. Security & Compliance
@@ -33,12 +35,16 @@ This directory contains a log of the architectural decisions made for the Flyer
**[ADR-029](./0029-secret-rotation-and-key-management.md)**: Secret Rotation and Key Management Strategy (Proposed)
**[ADR-032](./0032-rate-limiting-strategy.md)**: Rate Limiting Strategy (Accepted)
**[ADR-033](./0033-file-upload-and-storage-strategy.md)**: File Upload and Storage Strategy (Accepted)
**[ADR-048](./0048-authentication-strategy.md)**: Authentication Strategy (Partially Implemented)
**[ADR-048](./0048-authentication-strategy.md)**: Authentication Strategy (Accepted)
## 5. Observability & Monitoring
**[ADR-004](./0004-standardized-application-wide-structured-logging.md)**: Standardized Application-Wide Structured Logging (Accepted)
**[ADR-015](./0015-application-performance-monitoring-and-error-tracking.md)**: Application Performance Monitoring (APM) and Error Tracking (Proposed)
**[ADR-015](./0015-error-tracking-and-observability.md)**: Error Tracking and Observability (Accepted)
**[ADR-050](./0050-postgresql-function-observability.md)**: PostgreSQL Function Observability (Accepted)
**[ADR-051](./0051-asynchronous-context-propagation.md)**: Asynchronous Context Propagation (Accepted)
**[ADR-052](./0052-granular-debug-logging-strategy.md)**: Granular Debug Logging Strategy (Accepted)
**[ADR-056](./0056-application-performance-monitoring.md)**: Application Performance Monitoring (Proposed)
## 6. Deployment & Operations
@@ -48,13 +54,16 @@ This directory contains a log of the architectural decisions made for the Flyer
**[ADR-024](./0024-feature-flagging-strategy.md)**: Feature Flagging Strategy (Proposed)
**[ADR-037](./0037-scheduled-jobs-and-cron-pattern.md)**: Scheduled Jobs and Cron Pattern (Accepted)
**[ADR-038](./0038-graceful-shutdown-pattern.md)**: Graceful Shutdown Pattern (Accepted)
**[ADR-053](./0053-worker-health-checks.md)**: Worker Health Checks and Stalled Job Monitoring (Accepted)
**[ADR-054](./0054-bugsink-gitea-issue-sync.md)**: Bugsink to Gitea Issue Synchronization (Proposed)
**[ADR-061](./0061-pm2-process-isolation-safeguards.md)**: PM2 Process Isolation Safeguards (Accepted)
## 7. Frontend / User Interface
**[ADR-005](./0005-frontend-state-management-and-server-cache-strategy.md)**: Frontend State Management and Server Cache Strategy (Accepted)
**[ADR-012](./0012-frontend-component-library-and-design-system.md)**: Frontend Component Library and Design System (Partially Implemented)
**[ADR-025](./0025-internationalization-and-localization-strategy.md)**: Internationalization (i18n) and Localization (l10n) Strategy (Proposed)
**[ADR-026](./0026-standardized-client-side-structured-logging.md)**: Standardized Client-Side Structured Logging (Proposed)
**[ADR-026](./0026-standardized-client-side-structured-logging.md)**: Standardized Client-Side Structured Logging (Accepted)
**[ADR-044](./0044-frontend-feature-organization.md)**: Frontend Feature Organization Pattern (Accepted)
## 8. Development Workflow & Quality
@@ -65,6 +74,9 @@ This directory contains a log of the architectural decisions made for the Flyer
**[ADR-040](./0040-testing-economics-and-priorities.md)**: Testing Economics and Priorities (Accepted)
**[ADR-045](./0045-test-data-factories-and-fixtures.md)**: Test Data Factories and Fixtures (Accepted)
**[ADR-047](./0047-project-file-and-folder-organization.md)**: Project File and Folder Organization (Proposed)
**[ADR-057](./0057-test-remediation-post-api-versioning.md)**: Test Remediation Post-API Versioning (Accepted)
**[ADR-059](./0059-dependency-modernization.md)**: Dependency Modernization - tsoa Migration (Accepted)
**[ADR-060](./0060-typescript-test-error-remediation.md)**: TypeScript Test Error Remediation Strategy (Implemented)
## 9. Architecture Patterns
@@ -76,3 +88,5 @@ This directory contains a log of the architectural decisions made for the Flyer
**[ADR-042](./0042-email-and-notification-architecture.md)**: Email and Notification Architecture (Accepted)
**[ADR-043](./0043-express-middleware-pipeline.md)**: Express Middleware Pipeline Architecture (Accepted)
**[ADR-046](./0046-image-processing-pipeline.md)**: Image Processing Pipeline (Accepted)
**[ADR-049](./0049-gamification-and-achievement-system.md)**: Gamification and Achievement System (Accepted)
**[ADR-055](./0055-database-normalization-and-referential-integrity.md)**: Database Normalization and Referential Integrity (Accepted)

View File

@@ -1,10 +1,168 @@
# Database Setup
# Database Architecture
Flyer Crawler uses PostgreSQL with several extensions for full-text search, geographic data, and UUID generation.
**Version**: 0.12.20
**Last Updated**: 2026-01-28
Flyer Crawler uses PostgreSQL 16 with PostGIS for geographic data, pg_trgm for fuzzy text search, and uuid-ossp for UUID generation. The database contains 65 tables organized into logical domains.
## Table of Contents
1. [Schema Overview](#schema-overview)
2. [Database Setup](#database-setup)
3. [Schema Reference](#schema-reference)
4. [Related Documentation](#related-documentation)
---
## Required Extensions
## Schema Overview
The database is organized into the following domains:
### Core Infrastructure (6 tables)
| Table | Purpose | Primary Key |
| ----------------------- | ----------------------------------------- | ----------------- |
| `users` | Authentication credentials and login data | `user_id` (UUID) |
| `profiles` | Public user data, preferences, points | `user_id` (UUID) |
| `addresses` | Normalized address storage with geocoding | `address_id` |
| `activity_log` | User activity audit trail | `activity_log_id` |
| `password_reset_tokens` | Temporary tokens for password reset | `token_id` |
| `schema_info` | Schema deployment metadata | `environment` |
### Stores and Locations (4 tables)
| Table | Purpose | Primary Key |
| ------------------------ | --------------------------------------- | ------------------- |
| `stores` | Grocery store chains (Safeway, Kroger) | `store_id` |
| `store_locations` | Physical store locations with addresses | `store_location_id` |
| `favorite_stores` | User store favorites | `user_id, store_id` |
| `store_receipt_patterns` | Receipt text patterns for store ID | `pattern_id` |
### Flyers and Items (7 tables)
| Table | Purpose | Primary Key |
| ----------------------- | -------------------------------------- | ------------------------ |
| `flyers` | Uploaded flyer metadata and status | `flyer_id` |
| `flyer_items` | Individual deals extracted from flyers | `flyer_item_id` |
| `flyer_locations` | Flyer-to-location associations | `flyer_location_id` |
| `categories` | Item categorization (Produce, Dairy) | `category_id` |
| `master_grocery_items` | Canonical grocery item dictionary | `master_grocery_item_id` |
| `master_item_aliases` | Alternative names for master items | `alias_id` |
| `unmatched_flyer_items` | Items pending master item matching | `unmatched_item_id` |
### Products and Brands (2 tables)
| Table | Purpose | Primary Key |
| ---------- | ---------------------------------------------- | ------------ |
| `brands` | Brand names (Coca-Cola, Kraft) | `brand_id` |
| `products` | Specific products (master item + brand + size) | `product_id` |
### Price Tracking (3 tables)
| Table | Purpose | Primary Key |
| ----------------------- | ---------------------------------- | ------------------ |
| `item_price_history` | Historical prices for master items | `price_history_id` |
| `user_submitted_prices` | User-contributed price reports | `submission_id` |
| `suggested_corrections` | Suggested edits to flyer items | `correction_id` |
### User Features (8 tables)
| Table | Purpose | Primary Key |
| -------------------- | ------------------------------------ | --------------------------- |
| `user_watched_items` | Items user wants to track prices for | `user_watched_item_id` |
| `user_alerts` | Price alert thresholds | `alert_id` |
| `notifications` | User notifications | `notification_id` |
| `user_item_aliases` | User-defined item name aliases | `alias_id` |
| `user_follows` | User-to-user follow relationships | `follower_id, following_id` |
| `user_reactions` | Reactions to content (likes, etc.) | `reaction_id` |
| `budgets` | User-defined spending budgets | `budget_id` |
| `search_queries` | Search history for analytics | `query_id` |
### Shopping Lists (4 tables)
| Table | Purpose | Primary Key |
| ----------------------- | ------------------------ | ------------------------- |
| `shopping_lists` | User shopping lists | `shopping_list_id` |
| `shopping_list_items` | Items on shopping lists | `shopping_list_item_id` |
| `shared_shopping_lists` | Shopping list sharing | `shared_shopping_list_id` |
| `shopping_trips` | Completed shopping trips | `trip_id` |
| `shopping_trip_items` | Items purchased on trips | `trip_item_id` |
### Recipes (11 tables)
| Table | Purpose | Primary Key |
| --------------------------------- | -------------------------------- | ------------------------- |
| `recipes` | User recipes with metadata | `recipe_id` |
| `recipe_ingredients` | Recipe ingredient list | `recipe_ingredient_id` |
| `recipe_ingredient_substitutions` | Ingredient alternatives | `substitution_id` |
| `tags` | Recipe tags (vegan, quick, etc.) | `tag_id` |
| `recipe_tags` | Recipe-to-tag associations | `recipe_id, tag_id` |
| `appliances` | Kitchen appliances | `appliance_id` |
| `recipe_appliances` | Appliances needed for recipes | `recipe_id, appliance_id` |
| `recipe_ratings` | User ratings for recipes | `rating_id` |
| `recipe_comments` | User comments on recipes | `comment_id` |
| `favorite_recipes` | User recipe favorites | `user_id, recipe_id` |
| `recipe_collections` | User recipe collections | `collection_id` |
### Meal Planning (3 tables)
| Table | Purpose | Primary Key |
| ------------------- | -------------------------- | ----------------- |
| `menu_plans` | Weekly/monthly meal plans | `menu_plan_id` |
| `shared_menu_plans` | Menu plan sharing | `share_id` |
| `planned_meals` | Individual meals in a plan | `planned_meal_id` |
### Pantry and Inventory (4 tables)
| Table | Purpose | Primary Key |
| -------------------- | ------------------------------------ | ----------------- |
| `pantry_items` | User pantry inventory | `pantry_item_id` |
| `pantry_locations` | Storage locations (fridge, freezer) | `location_id` |
| `expiry_date_ranges` | Reference shelf life data | `expiry_range_id` |
| `expiry_alerts` | User expiry notification preferences | `expiry_alert_id` |
| `expiry_alert_log` | Sent expiry notifications | `alert_log_id` |
### Receipts (4 tables)
| Table | Purpose | Primary Key |
| ------------------------ | ----------------------------- | ----------------- |
| `receipts` | Scanned receipt metadata | `receipt_id` |
| `receipt_items` | Items parsed from receipts | `receipt_item_id` |
| `receipt_processing_log` | OCR/AI processing audit trail | `log_id` |
### UPC Scanning (2 tables)
| Table | Purpose | Primary Key |
| ---------------------- | ------------------------------- | ----------- |
| `upc_scan_history` | User barcode scan history | `scan_id` |
| `upc_external_lookups` | External UPC API response cache | `lookup_id` |
### Gamification (2 tables)
| Table | Purpose | Primary Key |
| ------------------- | ---------------------------- | ------------------------- |
| `achievements` | Defined achievements | `achievement_id` |
| `user_achievements` | Achievements earned by users | `user_id, achievement_id` |
### User Preferences (3 tables)
| Table | Purpose | Primary Key |
| --------------------------- | ---------------------------- | ------------------------- |
| `dietary_restrictions` | Defined dietary restrictions | `restriction_id` |
| `user_dietary_restrictions` | User dietary preferences | `user_id, restriction_id` |
| `user_appliances` | Appliances user owns | `user_id, appliance_id` |
### Reference Data (1 table)
| Table | Purpose | Primary Key |
| ------------------ | ----------------------- | --------------- |
| `unit_conversions` | Unit conversion factors | `conversion_id` |
---
## Database Setup
### Required Extensions
| Extension | Purpose |
| ----------- | ------------------------------------------- |
@@ -14,7 +172,7 @@ Flyer Crawler uses PostgreSQL with several extensions for full-text search, geog
---
## Database Users
### Database Users
This project uses **environment-specific database users** to isolate production and test environments:

View File

@@ -1,7 +1,7 @@
# Flyer Crawler - System Architecture Overview
**Version**: 0.12.5
**Last Updated**: 2026-01-22
**Version**: 0.12.20
**Last Updated**: 2026-01-28
**Platform**: Linux (Production and Development)
---
@@ -41,7 +41,7 @@
## System Architecture Diagram
```
```text
+-----------------------------------------------------------------------------------+
| CLIENT LAYER |
+-----------------------------------------------------------------------------------+
@@ -153,10 +153,10 @@
| Component | Technology | Version | Purpose |
| ---------------------- | ---------- | -------- | -------------------------------- |
| **Runtime** | Node.js | 22.x LTS | Server-side JavaScript runtime |
| **Language** | TypeScript | 5.9.x | Type-safe JavaScript superset |
| **Web Framework** | Express.js | 5.1.x | HTTP server and routing |
| **Frontend Framework** | React | 19.2.x | UI component library |
| **Build Tool** | Vite | 7.2.x | Frontend bundling and dev server |
| **Language** | TypeScript | 5.9.3 | Type-safe JavaScript superset |
| **Web Framework** | Express.js | 5.1.0 | HTTP server and routing |
| **Frontend Framework** | React | 19.2.0 | UI component library |
| **Build Tool** | Vite | 7.2.4 | Frontend bundling and dev server |
### Data Storage
@@ -176,23 +176,23 @@
| **OAuth** | Google, GitHub | Social authentication |
| **Email** | Nodemailer (SMTP) | Transactional emails |
### Background Processing
### Background Processing Stack
| Component | Technology | Version | Purpose |
| ------------------- | ---------- | ------- | --------------------------------- |
| **Job Queues** | BullMQ | 5.65.x | Reliable async job processing |
| **Job Queues** | BullMQ | 5.65.1 | Reliable async job processing |
| **Process Manager** | PM2 | Latest | Process management and clustering |
| **Scheduler** | node-cron | 4.2.x | Scheduled tasks |
| **Scheduler** | node-cron | 4.2.1 | Scheduled tasks |
### Frontend Stack
| Component | Technology | Version | Purpose |
| -------------------- | -------------- | ------- | ---------------------------------------- |
| **State Management** | TanStack Query | 5.90.x | Server state caching and synchronization |
| **Routing** | React Router | 7.9.x | Client-side routing |
| **Styling** | Tailwind CSS | 4.1.x | Utility-first CSS framework |
| **Icons** | Lucide React | 0.555.x | Icon components |
| **Charts** | Recharts | 3.4.x | Data visualization |
| **State Management** | TanStack Query | 5.90.12 | Server state caching and synchronization |
| **Routing** | React Router | 7.9.6 | Client-side routing |
| **Styling** | Tailwind CSS | 4.1.17 | Utility-first CSS framework |
| **Icons** | Lucide React | 0.555.0 | Icon components |
| **Charts** | Recharts | 3.4.1 | Data visualization |
### Observability and Quality
@@ -221,7 +221,7 @@ The frontend is a single-page application (SPA) built with React 19 and Vite.
**Directory Structure**:
```
```text
src/
+-- components/ # Reusable UI components
+-- contexts/ # React context providers
@@ -244,17 +244,30 @@ The backend is a RESTful API server built with Express.js 5.
- Structured logging with Pino
- Standardized error handling (ADR-001)
**API Route Modules**:
| Route | Purpose |
|-------|---------|
| `/api/auth` | Authentication (login, register, OAuth) |
| `/api/users` | User profile management |
| `/api/flyers` | Flyer CRUD and processing |
| `/api/recipes` | Recipe management |
| `/api/deals` | Best prices and deal discovery |
| `/api/stores` | Store management |
| `/api/admin` | Administrative functions |
| `/api/health` | Health checks and monitoring |
**API Route Modules** (all versioned under `/api/v1/*`):
| Route | Purpose |
| ------------------------- | ----------------------------------------------- |
| `/api/v1/auth` | Authentication (login, register, OAuth) |
| `/api/v1/health` | Health checks and monitoring |
| `/api/v1/system` | System administration (PM2 status, server info) |
| `/api/v1/users` | User profile management |
| `/api/v1/ai` | AI-powered features and flyer processing |
| `/api/v1/admin` | Administrative functions |
| `/api/v1/budgets` | Budget management and spending analysis |
| `/api/v1/achievements` | Gamification and achievement system |
| `/api/v1/flyers` | Flyer CRUD and processing |
| `/api/v1/recipes` | Recipe management and recommendations |
| `/api/v1/personalization` | Master items and user preferences |
| `/api/v1/price-history` | Price tracking and trend analysis |
| `/api/v1/stats` | Public statistics and analytics |
| `/api/v1/upc` | UPC barcode scanning and product lookup |
| `/api/v1/inventory` | Inventory and expiry tracking |
| `/api/v1/receipts` | Receipt scanning and purchase history |
| `/api/v1/deals` | Best prices and deal discovery |
| `/api/v1/reactions` | Social features (reactions, sharing) |
| `/api/v1/stores` | Store management and location services |
| `/api/v1/categories` | Category browsing and product categorization |
### Database (PostgreSQL/PostGIS)
@@ -331,7 +344,7 @@ BullMQ workers handle asynchronous processing tasks. PM2 manages both the API se
### Flyer Processing Pipeline
```
```text
+-------------+ +----------------+ +------------------+ +---------------+
| User | | Express | | BullMQ | | PostgreSQL |
| Upload +---->+ Route +---->+ Queue +---->+ Storage |
@@ -395,7 +408,7 @@ BullMQ workers handle asynchronous processing tasks. PM2 manages both the API se
The application follows a strict layered architecture as defined in ADR-035.
```
```text
+-----------------------------------------------------------------------+
| ROUTES LAYER |
| Responsibilities: |
@@ -458,7 +471,7 @@ The application follows a strict layered architecture as defined in ADR-035.
### Entity Relationship Overview
```
```text
+------------------+ +------------------+ +------------------+
| users | | profiles | | addresses |
|------------------| |------------------| |------------------|
@@ -537,7 +550,7 @@ The application follows a strict layered architecture as defined in ADR-035.
### JWT Token Architecture
```
```text
+-------------------+ +-------------------+ +-------------------+
| Login Request | | Server | | Database |
| (email/pass) +---->+ Validates +---->+ Verify User |
@@ -576,7 +589,7 @@ The application follows a strict layered architecture as defined in ADR-035.
### Protected Route Flow
```
```text
+-------------------+ +-------------------+ +-------------------+
| API Request | | requireAuth | | JWT Strategy |
| + Bearer Token +---->+ Middleware +---->+ Validate |
@@ -603,7 +616,7 @@ The application follows a strict layered architecture as defined in ADR-035.
### Worker Architecture
```
```text
+-------------------+ +-------------------+ +-------------------+
| API Server | | Redis | | Worker Process |
| (Queue Producer)| | (Job Storage) | | (Consumer) |
@@ -635,7 +648,7 @@ The application follows a strict layered architecture as defined in ADR-035.
Jobs use exponential backoff for retries:
```
```text
Attempt 1: Immediate
Attempt 2: Initial delay (e.g., 5 seconds)
Attempt 3: 2x delay (e.g., 10 seconds)
@@ -658,7 +671,7 @@ Attempt 4: 4x delay (e.g., 20 seconds)
### Environment Overview
```
```text
+-----------------------------------------------------------------------------------+
| DEVELOPMENT |
+-----------------------------------------------------------------------------------+
@@ -710,7 +723,7 @@ Attempt 4: 4x delay (e.g., 20 seconds)
### Deployment Pipeline (ADR-017)
```
```text
+------------+ +------------+ +------------+ +------------+
| Push to | | Gitea | | Build & | | Deploy |
| main +---->+ Actions +---->+ Test +---->+ to Prod |
@@ -762,11 +775,14 @@ The system architecture is governed by Architecture Decision Records (ADRs). Key
### API and Integration
| ADR | Title | Status |
| ------- | ----------------------------- | ----------- |
| ADR-003 | Standardized Input Validation | Accepted |
| ADR-022 | Real-time Notification System | Proposed |
| ADR-028 | API Response Standardization | Implemented |
| ADR | Title | Status |
| ------- | ----------------------------- | ---------------- |
| ADR-003 | Standardized Input Validation | Accepted |
| ADR-008 | API Versioning Strategy | Phase 1 Complete |
| ADR-022 | Real-time Notification System | Proposed |
| ADR-028 | API Response Standardization | Implemented |
**Implementation Guide**: [API Versioning Infrastructure](./api-versioning-infrastructure.md) (Phase 2)
### Security
@@ -836,22 +852,55 @@ The system architecture is governed by Architecture Decision Records (ADRs). Key
| File | Purpose |
| ----------------------------------------------- | --------------------------------------- |
| `src/services/flyerProcessingService.server.ts` | Flyer processing pipeline orchestration |
| `src/services/flyerAiProcessor.server.ts` | AI extraction for flyers |
| `src/services/aiService.server.ts` | Google Gemini AI integration |
| `src/services/cacheService.server.ts` | Redis caching abstraction |
| `src/services/emailService.server.ts` | Email sending |
| `src/services/queues.server.ts` | BullMQ queue definitions |
| `src/services/queueService.server.ts` | Queue management and scheduling |
| `src/services/workers.server.ts` | BullMQ worker definitions |
| `src/services/websocketService.server.ts` | Real-time WebSocket notifications |
| `src/services/receiptService.server.ts` | Receipt scanning and OCR |
| `src/services/upcService.server.ts` | UPC barcode lookup |
| `src/services/expiryService.server.ts` | Pantry expiry tracking |
| `src/services/geocodingService.server.ts` | Address geocoding |
| `src/services/analyticsService.server.ts` | Analytics and reporting |
| `src/services/monitoringService.server.ts` | Health monitoring |
| `src/services/barcodeService.server.ts` | Barcode detection |
| `src/services/logger.server.ts` | Structured logging (Pino) |
| `src/services/redis.server.ts` | Redis connection management |
| `src/services/sentry.server.ts` | Error tracking (Sentry/Bugsink) |
### Database Files
| File | Purpose |
| ---------------------------------- | -------------------------------------------- |
| `src/services/db/connection.db.ts` | Database pool and transaction management |
| `src/services/db/errors.db.ts` | Database error types |
| `src/services/db/user.db.ts` | User repository |
| `src/services/db/flyer.db.ts` | Flyer repository |
| `sql/master_schema_rollup.sql` | Complete database schema (for test DB setup) |
| `sql/initial_schema.sql` | Fresh installation schema |
| File | Purpose |
| --------------------------------------- | -------------------------------------------- |
| `src/services/db/connection.db.ts` | Database pool and transaction management |
| `src/services/db/errors.db.ts` | Database error types |
| `src/services/db/index.db.ts` | Repository exports |
| `src/services/db/user.db.ts` | User repository |
| `src/services/db/flyer.db.ts` | Flyer repository |
| `src/services/db/store.db.ts` | Store repository |
| `src/services/db/storeLocation.db.ts` | Store location repository |
| `src/services/db/recipe.db.ts` | Recipe repository |
| `src/services/db/category.db.ts` | Category repository |
| `src/services/db/personalization.db.ts` | Master items and personalization |
| `src/services/db/shopping.db.ts` | Shopping lists repository |
| `src/services/db/deals.db.ts` | Deals and best prices repository |
| `src/services/db/price.db.ts` | Price history repository |
| `src/services/db/receipt.db.ts` | Receipt repository |
| `src/services/db/upc.db.ts` | UPC scan history repository |
| `src/services/db/expiry.db.ts` | Expiry tracking repository |
| `src/services/db/gamification.db.ts` | Achievements repository |
| `src/services/db/budget.db.ts` | Budget repository |
| `src/services/db/reaction.db.ts` | User reactions repository |
| `src/services/db/notification.db.ts` | Notifications repository |
| `src/services/db/address.db.ts` | Address repository |
| `src/services/db/admin.db.ts` | Admin operations repository |
| `src/services/db/conversion.db.ts` | Unit conversion repository |
| `src/services/db/flyerLocation.db.ts` | Flyer locations repository |
| `sql/master_schema_rollup.sql` | Complete database schema (for test DB setup) |
| `sql/initial_schema.sql` | Fresh installation schema |
### Type Definitions

View File

@@ -0,0 +1,521 @@
# API Versioning Infrastructure (ADR-008 Phase 2)
**Status**: Complete
**Date**: 2026-01-27
**Prerequisite**: ADR-008 Phase 1 Complete (all routes at `/api/v1/*`)
## Implementation Summary
Phase 2 has been fully implemented with the following results:
| Metric | Value |
| ------------------ | -------------------------------------- |
| New Files Created | 5 |
| Files Modified | 2 (server.ts, express.d.ts) |
| Unit Tests | 82 passing (100%) |
| Integration Tests | 48 new versioning tests |
| RFC Compliance | RFC 8594 (Sunset), RFC 8288 (Link) |
| Supported Versions | v1 (active), v2 (infrastructure ready) |
**Developer Guide**: [API-VERSIONING.md](../development/API-VERSIONING.md)
## Purpose
Build infrastructure to support parallel API versions, version detection, and deprecation workflows. Enables future v2 API without breaking existing clients.
## Architecture Overview
```text
Request → Version Router → Version Middleware → Domain Router → Handler
↓ ↓
createVersionedRouter() attachVersionInfo()
↓ ↓
/api/v1/* | /api/v2/* req.apiVersion = 'v1'|'v2'
```
## Key Components
| Component | File | Responsibility |
| ---------------------- | ------------------------------------------ | ------------------------------------------ |
| Version Router Factory | `src/routes/versioned.ts` | Create version-specific Express routers |
| Version Middleware | `src/middleware/apiVersion.middleware.ts` | Extract version, attach to request context |
| Deprecation Middleware | `src/middleware/deprecation.middleware.ts` | Add RFC 8594 deprecation headers |
| Version Types | `src/types/express.d.ts` | Extend Express Request with apiVersion |
| Version Constants | `src/config/apiVersions.ts` | Centralized version definitions |
## Implementation Tasks
### Task 1: Version Types (Foundation)
**File**: `src/types/express.d.ts`
```typescript
declare global {
namespace Express {
interface Request {
apiVersion?: 'v1' | 'v2';
versionDeprecated?: boolean;
}
}
}
```
**Dependencies**: None
**Testing**: Type-check only (`npm run type-check`)
---
### Task 2: Version Constants
**File**: `src/config/apiVersions.ts`
```typescript
export const API_VERSIONS = ['v1', 'v2'] as const;
export type ApiVersion = (typeof API_VERSIONS)[number];
export const CURRENT_VERSION: ApiVersion = 'v1';
export const DEFAULT_VERSION: ApiVersion = 'v1';
export interface VersionConfig {
version: ApiVersion;
status: 'active' | 'deprecated' | 'sunset';
sunsetDate?: string; // ISO 8601
successorVersion?: ApiVersion;
}
export const VERSION_CONFIG: Record<ApiVersion, VersionConfig> = {
v1: { version: 'v1', status: 'active' },
v2: { version: 'v2', status: 'active' },
};
```
**Dependencies**: None
**Testing**: Unit test for version validation
---
### Task 3: Version Detection Middleware
**File**: `src/middleware/apiVersion.middleware.ts`
```typescript
import { Request, Response, NextFunction } from 'express';
import { API_VERSIONS, ApiVersion, DEFAULT_VERSION } from '../config/apiVersions';
export function extractApiVersion(req: Request, _res: Response, next: NextFunction) {
// Extract from URL path: /api/v1/... → 'v1'
const pathMatch = req.path.match(/^\/v(\d+)\//);
if (pathMatch) {
const version = `v${pathMatch[1]}` as ApiVersion;
if (API_VERSIONS.includes(version)) {
req.apiVersion = version;
}
}
// Fallback to default if not detected
req.apiVersion = req.apiVersion || DEFAULT_VERSION;
next();
}
```
**Pattern**: Attach to request before route handlers
**Integration Point**: `server.ts` before versioned route mounting
**Testing**: Unit tests for path extraction, default fallback
---
### Task 4: Deprecation Headers Middleware
**File**: `src/middleware/deprecation.middleware.ts`
Implements RFC 8594 (Sunset Header) and draft-ietf-httpapi-deprecation-header.
```typescript
import { Request, Response, NextFunction } from 'express';
import { VERSION_CONFIG, ApiVersion } from '../config/apiVersions';
import { logger } from '../services/logger.server';
export function deprecationHeaders(version: ApiVersion) {
const config = VERSION_CONFIG[version];
return (req: Request, res: Response, next: NextFunction) => {
if (config.status === 'deprecated') {
res.set('Deprecation', 'true');
if (config.sunsetDate) {
res.set('Sunset', config.sunsetDate);
}
if (config.successorVersion) {
res.set('Link', `</api/${config.successorVersion}>; rel="successor-version"`);
}
req.versionDeprecated = true;
// Log deprecation access for monitoring
logger.warn(
{
apiVersion: version,
path: req.path,
method: req.method,
sunsetDate: config.sunsetDate,
},
'Deprecated API version accessed',
);
}
// Always set version header
res.set('X-API-Version', version);
next();
};
}
```
**RFC Compliance**:
- `Deprecation: true` (draft-ietf-httpapi-deprecation-header)
- `Sunset: <date>` (RFC 8594)
- `Link: <url>; rel="successor-version"` (RFC 8288)
**Testing**: Unit tests for header presence, version status variations
---
### Task 5: Version Router Factory
**File**: `src/routes/versioned.ts`
```typescript
import { Router } from 'express';
import { ApiVersion } from '../config/apiVersions';
import { extractApiVersion } from '../middleware/apiVersion.middleware';
import { deprecationHeaders } from '../middleware/deprecation.middleware';
// Import domain routers
import authRouter from './auth.routes';
import userRouter from './user.routes';
import flyerRouter from './flyer.routes';
// ... all domain routers
interface VersionedRouters {
v1: Record<string, Router>;
v2: Record<string, Router>;
}
const ROUTERS: VersionedRouters = {
v1: {
auth: authRouter,
users: userRouter,
flyers: flyerRouter,
// ... all v1 routers (current implementation)
},
v2: {
// Future: v2-specific routers
// auth: authRouterV2,
// For now, fallback to v1
},
};
export function createVersionedRouter(version: ApiVersion): Router {
const router = Router();
// Apply version middleware
router.use(extractApiVersion);
router.use(deprecationHeaders(version));
// Get routers for this version, fallback to v1
const versionRouters = ROUTERS[version] || ROUTERS.v1;
// Mount domain routers
Object.entries(versionRouters).forEach(([path, domainRouter]) => {
router.use(`/${path}`, domainRouter);
});
return router;
}
```
**Pattern**: Factory function returns configured Router
**Fallback Strategy**: v2 uses v1 routers until v2-specific handlers exist
**Testing**: Integration test verifying route mounting
---
### Task 6: Server Integration
**File**: `server.ts` (modifications)
```typescript
// Before (current implementation - Phase 1):
app.use('/api/v1/auth', authRouter);
app.use('/api/v1/users', userRouter);
// ... individual route mounting
// After (Phase 2):
import { createVersionedRouter } from './src/routes/versioned';
// Mount versioned API routers
app.use('/api/v1', createVersionedRouter('v1'));
app.use('/api/v2', createVersionedRouter('v2')); // Placeholder for future
// Keep redirect middleware for unversioned requests
app.use('/api', versionRedirectMiddleware);
```
**Breaking Change Risk**: Low (same routes, different mounting)
**Rollback**: Revert to individual `app.use()` calls
**Testing**: Full integration test suite must pass
---
### Task 7: Request Context Propagation
**Pattern**: Version flows through request lifecycle for conditional logic.
```typescript
// In any route handler or service:
function handler(req: Request, res: Response) {
if (req.apiVersion === 'v2') {
// v2-specific behavior
return sendSuccess(res, transformV2(data));
}
// v1 behavior (default)
return sendSuccess(res, data);
}
```
**Use Cases**:
- Response transformation based on version
- Feature flags per version
- Metric tagging by version
---
### Task 8: Documentation Update
**File**: `src/config/swagger.ts` (modifications)
```typescript
const swaggerDefinition: OpenAPIV3.Document = {
// ...
servers: [
{
url: '/api/v1',
description: 'API v1 (Current)',
},
{
url: '/api/v2',
description: 'API v2 (Future)',
},
],
// ...
};
```
**File**: `docs/adr/0008-api-versioning-strategy.md` (update checklist)
---
### Task 9: Unit Tests
**File**: `src/middleware/apiVersion.middleware.test.ts`
```typescript
describe('extractApiVersion', () => {
it('extracts v1 from /api/v1/users', () => {
/* ... */
});
it('extracts v2 from /api/v2/users', () => {
/* ... */
});
it('defaults to v1 for unversioned paths', () => {
/* ... */
});
it('ignores invalid version numbers', () => {
/* ... */
});
});
```
**File**: `src/middleware/deprecation.middleware.test.ts`
```typescript
describe('deprecationHeaders', () => {
it('adds no headers for active version', () => {
/* ... */
});
it('adds Deprecation header for deprecated version', () => {
/* ... */
});
it('adds Sunset header when sunsetDate configured', () => {
/* ... */
});
it('adds Link header for successor version', () => {
/* ... */
});
it('always sets X-API-Version header', () => {
/* ... */
});
});
```
---
### Task 10: Integration Tests
**File**: `src/routes/versioned.test.ts`
```typescript
describe('Versioned Router Integration', () => {
it('mounts all v1 routes correctly', () => {
/* ... */
});
it('v2 falls back to v1 handlers', () => {
/* ... */
});
it('sets X-API-Version response header', () => {
/* ... */
});
it('deprecation headers appear when configured', () => {
/* ... */
});
});
```
**Run in Container**: `podman exec -it flyer-crawler-dev npm test -- versioned`
## Implementation Sequence
```text
[Task 1] → [Task 2] → [Task 3] → [Task 4] → [Task 5] → [Task 6]
Types Config Middleware Middleware Factory Server
↓ ↓ ↓ ↓
[Task 7] [Task 9] [Task 10] [Task 8]
Context Unit Integ Docs
```
**Critical Path**: 1 → 2 → 3 → 5 → 6 (server integration)
## File Structure After Implementation
```text
src/
├── config/
│ ├── apiVersions.ts # NEW: Version constants
│ └── swagger.ts # MODIFIED: Multi-server
├── middleware/
│ ├── apiVersion.middleware.ts # NEW: Version extraction
│ ├── apiVersion.middleware.test.ts # NEW: Unit tests
│ ├── deprecation.middleware.ts # NEW: RFC 8594 headers
│ └── deprecation.middleware.test.ts # NEW: Unit tests
├── routes/
│ ├── versioned.ts # NEW: Router factory
│ ├── versioned.test.ts # NEW: Integration tests
│ └── *.routes.ts # UNCHANGED: Domain routers
├── types/
│ └── express.d.ts # MODIFIED: Add apiVersion
server.ts # MODIFIED: Use versioned router
```
## Risk Assessment
| Risk | Likelihood | Impact | Mitigation |
| ------------------------------------ | ---------- | ------ | ----------------------------------- |
| Route registration order breaks | Medium | High | Full integration test suite |
| Middleware not applied to all routes | Low | Medium | Factory pattern ensures consistency |
| Performance impact from middleware | Low | Low | Minimal overhead (path regex) |
| Type errors in extended Request | Low | Medium | TypeScript strict mode catches |
## Rollback Procedure
1. Revert `server.ts` to individual route mounting
2. Remove new middleware files (not breaking)
3. Remove version types from `express.d.ts`
4. Run `npm run type-check && npm test` to verify
## Success Criteria
- [x] All existing tests pass (`npm test` in container)
- [x] `X-API-Version: v1` header on all `/api/v1/*` responses
- [x] TypeScript compiles without errors (`npm run type-check`)
- [x] No performance regression (< 5ms added latency)
- [x] Deprecation headers work when v1 marked deprecated (manual test)
## Known Issues and Follow-up Work
### Integration Tests Using Unversioned Paths
**Issue**: Some existing integration tests make requests to unversioned paths (e.g., `/api/flyers` instead of `/api/v1/flyers`). These tests now receive 301 redirects due to the backwards compatibility middleware.
**Impact**: 74 integration tests may need updates to use versioned paths explicitly.
**Workaround Options**:
1. Update test paths to use `/api/v1/*` explicitly (recommended)
2. Configure supertest to follow redirects automatically
3. Accept 301 as valid response in affected tests
**Resolution**: Phase 3 work item - update integration tests to use versioned endpoints consistently.
### Phase 3 Prerequisites
Before marking v1 as deprecated and implementing v2:
1. Update all integration tests to use versioned paths
2. Define breaking changes requiring v2
3. Create v2-specific route handlers where needed
4. Set deprecation timeline for v1
## Related ADRs
| ADR | Relationship |
| ------- | ------------------------------------------------- |
| ADR-008 | Parent decision (this implements Phase 2) |
| ADR-003 | Validation middleware pattern applies per-version |
| ADR-028 | Response format consistent across versions |
| ADR-018 | OpenAPI docs reflect versioned endpoints |
| ADR-043 | Middleware pipeline order considerations |
## Usage Examples
### Checking Version in Handler
```typescript
// src/routes/flyer.routes.ts
router.get('/', async (req, res) => {
const flyers = await flyerRepo.getFlyers(req.log);
// Version-specific response transformation
if (req.apiVersion === 'v2') {
return sendSuccess(res, flyers.map(transformFlyerV2));
}
return sendSuccess(res, flyers);
});
```
### Marking Version as Deprecated
```typescript
// src/config/apiVersions.ts
export const VERSION_CONFIG = {
v1: {
version: 'v1',
status: 'deprecated',
sunsetDate: '2027-01-01T00:00:00Z',
successorVersion: 'v2',
},
v2: { version: 'v2', status: 'active' },
};
```
### Testing Deprecation Headers
```bash
curl -I https://localhost:3001/api/v1/flyers
# When v1 deprecated:
# Deprecation: true
# Sunset: 2027-01-01T00:00:00Z
# Link: </api/v2>; rel="successor-version"
# X-API-Version: v1
```

View File

@@ -0,0 +1,377 @@
# PM2 Process Isolation Safeguards Project
**Session Date**: 2026-02-17
**Status**: Completed
**Triggered By**: Critical production incident during v0.15.0 deployment
---
## Executive Summary
On 2026-02-17, a critical incident occurred during v0.15.0 production deployment where ALL PM2 processes on the production server were killed, not just the flyer-crawler processes. This caused unplanned downtime for multiple applications including `stock-alert.projectium.com`.
Despite PM2 process isolation fixes already being in place (commit `b6a62a0`), the incident still occurred. Investigation suggests the Gitea runner may have executed a cached/older version of the workflow files. In response, we implemented a comprehensive defense-in-depth strategy with 5 layers of safeguards across all deployment workflows.
---
## Incident Background
### What Happened
| Aspect | Detail |
| --------------------- | ------------------------------------------------------- |
| **Date/Time** | 2026-02-17 ~07:40 UTC |
| **Trigger** | v0.15.0 production deployment via `deploy-to-prod.yml` |
| **Impact** | ALL PM2 processes killed (all environments) |
| **Collateral Damage** | `stock-alert.projectium.com` and other PM2-managed apps |
| **Severity** | P1 - Critical |
### Key Mystery
The PM2 process isolation fix was already implemented in commit `b6a62a0` (2026-02-13) and was included in v0.15.0. The fix correctly used whitelist-based filtering:
```javascript
const prodProcesses = [
'flyer-crawler-api',
'flyer-crawler-worker',
'flyer-crawler-analytics-worker',
];
list.forEach((p) => {
if (
(p.pm2_env.status === 'errored' || p.pm2_env.status === 'stopped') &&
prodProcesses.includes(p.name)
) {
exec('pm2 delete ' + p.pm2_env.pm_id);
}
});
```
**Hypothesis**: Gitea runner executed a cached older version of the workflow file that did not contain the fix.
---
## Solution: Defense-in-Depth Safeguards
Rather than relying solely on the filter logic (which may be correct but not executed), we implemented 5 layers of safeguards that provide visibility, validation, and automatic abort capabilities.
### Safeguard Layers
| Layer | Name | Purpose |
| ----- | --------------------------------- | ------------------------------------------------------- |
| 1 | **Workflow Metadata Logging** | Audit trail of which workflow version actually executed |
| 2 | **Pre-Cleanup PM2 State Logging** | Capture full process list before any modifications |
| 3 | **Process Count Validation** | SAFETY ABORT if filter would delete ALL processes |
| 4 | **Explicit Name Verification** | Log exactly which processes will be affected |
| 5 | **Post-Cleanup Verification** | Verify environment isolation after cleanup |
### Layer Details
#### Layer 1: Workflow Metadata Logging
Logs at the start of deployment:
- Workflow file name
- SHA-256 hash of the workflow file
- Git commit being deployed
- Git branch
- Timestamp (UTC)
- Actor (who triggered the deployment)
**Purpose**: If an incident occurs, we can verify whether the executed workflow matches the repository version.
```bash
echo "=== WORKFLOW METADATA ==="
echo "Workflow file: deploy-to-prod.yml"
echo "Workflow file hash: $(sha256sum .gitea/workflows/deploy-to-prod.yml | cut -d' ' -f1)"
echo "Git commit: $(git rev-parse HEAD)"
echo "Timestamp: $(date -u '+%Y-%m-%d %H:%M:%S UTC')"
echo "Actor: ${{ gitea.actor }}"
echo "=== END METADATA ==="
```
#### Layer 2: Pre-Cleanup PM2 State Logging
Captures full PM2 process list in JSON format before any modifications.
**Purpose**: Provides forensic evidence of what processes existed before cleanup began.
```bash
echo "=== PRE-CLEANUP PM2 STATE ==="
pm2 jlist
echo "=== END PRE-CLEANUP STATE ==="
```
#### Layer 3: Process Count Validation (SAFETY ABORT)
The most critical safeguard. Aborts the entire deployment if the filter would delete ALL processes and there are more than 3 processes total.
**Purpose**: Catches filter bugs or unexpected conditions that would result in catastrophic process deletion.
```javascript
// SAFEGUARD 1: Process count validation
const totalProcesses = list.length;
if (targetProcesses.length === totalProcesses && totalProcesses > 3) {
console.error('SAFETY ABORT: Filter would delete ALL processes!');
console.error(
'Total processes: ' + totalProcesses + ', Target processes: ' + targetProcesses.length,
);
console.error('This indicates a potential filter bug. Aborting cleanup.');
process.exit(1);
}
```
**Threshold Rationale**: The threshold of 3 allows normal operation when only the 3 expected processes exist (API, Worker, Analytics Worker) while catching anomalies when the server hosts more applications.
#### Layer 4: Explicit Name Verification
Logs the exact name, status, and PM2 ID of each process that will be deleted.
**Purpose**: Provides clear visibility into what the cleanup operation will actually do.
```javascript
console.log('Found ' + targetProcesses.length + ' PRODUCTION processes to clean:');
targetProcesses.forEach((p) => {
console.log(
' - ' + p.name + ' (status: ' + p.pm2_env.status + ', pm_id: ' + p.pm2_env.pm_id + ')',
);
});
```
#### Layer 5: Post-Cleanup Verification
After cleanup, logs the state of processes by environment to verify isolation was maintained.
**Purpose**: Immediately identifies if the cleanup affected the wrong environment.
```bash
echo "=== POST-CLEANUP VERIFICATION ==="
pm2 jlist | node -e "
const list = JSON.parse(require('fs').readFileSync(0, 'utf-8'));
const prodProcesses = list.filter(p => p.name && p.name.startsWith('flyer-crawler-') && !p.name.endsWith('-test'));
const testProcesses = list.filter(p => p.name && p.name.endsWith('-test'));
console.log('Production processes after cleanup: ' + prodProcesses.length);
console.log('Test processes (should be untouched): ' + testProcesses.length);
"
echo "=== END POST-CLEANUP VERIFICATION ==="
```
---
## Implementation Details
### Files Modified
| File | Changes |
| ------------------------------------------ | --------------------------------------------- |
| `.gitea/workflows/deploy-to-prod.yml` | Added all 5 safeguard layers |
| `.gitea/workflows/deploy-to-test.yml` | Added all 5 safeguard layers |
| `.gitea/workflows/manual-deploy-major.yml` | Added all 5 safeguard layers |
| `CLAUDE.md` | Added PM2 Process Isolation Incidents section |
### Files Created
| File | Purpose |
| --------------------------------------------------------- | --------------------------------------- |
| `docs/operations/INCIDENT-2026-02-17-PM2-PROCESS-KILL.md` | Detailed incident report |
| `docs/operations/PM2-INCIDENT-RESPONSE.md` | Comprehensive incident response runbook |
| `tests/qa/test-pm2-safeguard-logic.js` | Validation tests for safeguard logic |
---
## Testing and Validation
### Test Artifact
A standalone JavaScript test file was created to validate the safeguard logic:
**File**: `tests/qa/test-pm2-safeguard-logic.js`
**Test Categories**:
1. **Normal Operations (should NOT abort)**
- 3 errored out of 15 processes
- 1 errored out of 10 processes
- 0 processes to clean
- Fresh server with 3 processes (threshold boundary)
2. **Dangerous Operations (SHOULD abort)**
- All 10 processes targeted
- All 15 processes targeted
- All 4 processes targeted (just above threshold)
3. **Workflow-Specific Filter Tests**
- Production filter only matches production processes
- Test filter only matches `-test` suffix processes
- Filters don't cross-contaminate environments
### Test Results
All 11 scenarios passed:
| Scenario | Total | Target | Expected | Result |
| -------------------------- | ----- | ------ | -------- | ------ |
| Normal prod cleanup | 15 | 3 | No abort | PASS |
| Normal test cleanup | 15 | 3 | No abort | PASS |
| Single process | 10 | 1 | No abort | PASS |
| No cleanup needed | 10 | 0 | No abort | PASS |
| Fresh server (threshold) | 3 | 3 | No abort | PASS |
| Minimal server | 2 | 2 | No abort | PASS |
| Empty PM2 | 0 | 0 | No abort | PASS |
| Filter bug - 10 processes | 10 | 10 | ABORT | PASS |
| Filter bug - 15 processes | 15 | 15 | ABORT | PASS |
| Filter bug - 4 processes | 4 | 4 | ABORT | PASS |
| Filter bug - 100 processes | 100 | 100 | ABORT | PASS |
### YAML Validation
All workflow files passed YAML syntax validation using `python -c "import yaml; yaml.safe_load(open(...))"`
---
## Documentation Updates
### CLAUDE.md Updates
Added new section at line 293: **PM2 Process Isolation Incidents**
Contains:
- Reference to the 2026-02-17 incident
- Impact summary
- Prevention measures list
- Response instructions
- Links to related documentation
### docs/README.md
Added incident report reference under **Operations > Incident Reports**.
### Cross-References Verified
| Document | Reference | Status |
| --------------- | --------------------------------------- | ------ |
| CLAUDE.md | PM2-INCIDENT-RESPONSE.md | Valid |
| CLAUDE.md | INCIDENT-2026-02-17-PM2-PROCESS-KILL.md | Valid |
| Incident Report | CLAUDE.md PM2 section | Valid |
| Incident Report | PM2-INCIDENT-RESPONSE.md | Valid |
| docs/README.md | INCIDENT-2026-02-17-PM2-PROCESS-KILL.md | Valid |
---
## Lessons Learned
### Technical Lessons
1. **Filter logic alone is not sufficient** - Even correct filters can be bypassed if an older version of the script is executed.
2. **Workflow caching is a real risk** - CI/CD runners may cache workflow files, leading to stale versions being executed.
3. **Defense-in-depth is essential for destructive operations** - Multiple layers of validation catch failures that single-point checks miss.
4. **Visibility enables diagnosis** - Pre/post state logging makes root cause analysis possible.
5. **Automatic abort prevents cascading failures** - The process count validation could have prevented the incident entirely.
### Process Lessons
1. **Shared PM2 daemons are risky** - Multiple applications sharing a PM2 daemon create cross-application dependencies.
2. **Documentation should include failure modes** - CLAUDE.md now explicitly documents what can go wrong and how to respond.
3. **Runbooks save time during incidents** - The incident response runbook provides step-by-step guidance when time is critical.
---
## Future Considerations
### Not Implemented (Potential Future Work)
1. **PM2 Namespacing** - Use PM2's native namespace feature to completely isolate environments.
2. **Separate PM2 Daemons** - Run one PM2 daemon per application to eliminate cross-application risk.
3. **Deployment Locks** - Implement mutex-style locks to prevent concurrent deployments.
4. **Workflow Version Verification** - Add a pre-flight check that compares workflow hash against expected value.
5. **Automated Rollback** - Implement automatic process restoration if safeguards detect a problem.
---
## Related Documentation
- **ADR-061**: [PM2 Process Isolation Safeguards](../../adr/0061-pm2-process-isolation-safeguards.md)
- **Incident Report**: [INCIDENT-2026-02-17-PM2-PROCESS-KILL.md](../../operations/INCIDENT-2026-02-17-PM2-PROCESS-KILL.md)
- **Response Runbook**: [PM2-INCIDENT-RESPONSE.md](../../operations/PM2-INCIDENT-RESPONSE.md)
- **CLAUDE.md Section**: [PM2 Process Isolation Incidents](../../../CLAUDE.md#pm2-process-isolation-incidents)
- **Test Artifact**: [test-pm2-safeguard-logic.js](../../../tests/qa/test-pm2-safeguard-logic.js)
- **ADR-014**: [Containerization and Deployment Strategy](../../adr/0014-containerization-and-deployment-strategy.md)
---
## Appendix: Workflow Changes Summary
### deploy-to-prod.yml
```diff
+ - name: Log Workflow Metadata
+ run: |
+ echo "=== WORKFLOW METADATA ==="
+ echo "Workflow file: deploy-to-prod.yml"
+ echo "Workflow file hash: $(sha256sum .gitea/workflows/deploy-to-prod.yml | cut -d' ' -f1)"
+ ...
- name: Install Backend Dependencies and Restart Production Server
run: |
+ # === PRE-CLEANUP PM2 STATE LOGGING ===
+ echo "=== PRE-CLEANUP PM2 STATE ==="
+ pm2 jlist
+ echo "=== END PRE-CLEANUP STATE ==="
+
# --- Cleanup Errored Processes with Defense-in-Depth Safeguards ---
node -e "
...
+ // SAFEGUARD 1: Process count validation
+ if (targetProcesses.length === totalProcesses && totalProcesses > 3) {
+ console.error('SAFETY ABORT: Filter would delete ALL processes!');
+ process.exit(1);
+ }
+
+ // SAFEGUARD 2: Explicit name verification
+ console.log('Found ' + targetProcesses.length + ' PRODUCTION processes to clean:');
+ targetProcesses.forEach(p => {
+ console.log(' - ' + p.name + ' (status: ' + p.pm2_env.status + ')');
+ });
...
"
+
+ # === POST-CLEANUP VERIFICATION ===
+ echo "=== POST-CLEANUP VERIFICATION ==="
+ pm2 jlist | node -e "..."
+ echo "=== END POST-CLEANUP VERIFICATION ==="
```
Similar changes were applied to `deploy-to-test.yml` and `manual-deploy-major.yml`.
---
## Session Participants
| Role | Agent Type | Responsibility |
| ------------ | ------------------------- | ------------------------------------- |
| Orchestrator | Main Claude | Session coordination and delegation |
| Planner | planner subagent | Incident analysis and solution design |
| Documenter | describer-for-ai subagent | Incident report creation |
| Coder #1 | coder subagent | Workflow safeguard implementation |
| Coder #2 | coder subagent | Incident response runbook creation |
| Coder #3 | coder subagent | CLAUDE.md updates |
| Tester | tester subagent | Comprehensive validation |
| Archivist | Lead Technical Archivist | Final documentation |
---
## Revision History
| Date | Author | Change |
| ---------- | ------------------------ | ----------------------- |
| 2026-02-17 | Lead Technical Archivist | Initial session summary |

View File

@@ -486,9 +486,9 @@ Attach screenshots for:
## 🔐 Sign-Off
**Tester Name**: ******\*\*\*\*******\_\_\_******\*\*\*\*******
**Tester Name**: **\*\***\*\*\*\***\*\***\_\_\_**\*\***\*\*\*\***\*\***
**Date/Time Completed**: ****\*\*\*\*****\_\_\_****\*\*\*\*****
**Date/Time Completed**: \***\*\*\*\*\*\*\***\_\_\_\***\*\*\*\*\*\*\***
**Total Testing Time**: **\_\_** minutes

View File

@@ -0,0 +1,270 @@
# TypeScript Test Error Remediation Project
**Date**: 2026-02-17
**Status**: Completed
**ADR**: [ADR-060](../../adr/0060-typescript-test-error-remediation.md)
## Executive Summary
Systematic remediation of 185 TypeScript errors across the flyer-crawler test suite following API response standardization (ADR-028) and tsoa migration (ADR-059). The project achieved zero TypeScript errors while maintaining test suite integrity.
## Project Metrics
| Metric | Initial | Final | Change |
| ----------------- | ------- | ----- | ------ |
| TypeScript Errors | 185 | 0 | -185 |
| Tests Passing | 4,600 | 4,603 | +3 |
| Tests Failing | 62 | 59 | -3 |
| Files Modified | 0 | 25+ | - |
## Error Evolution Timeline
```
Initial Assessment: 185 errors
After Phase 1-4: 114 errors (-71)
After Iteration 2: 67 errors (-47)
After Iteration 3: 23 errors (-44)
Final: 0 errors (-23)
```
## Root Causes Identified
### 1. SuccessResponse Discriminated Union (48.1%)
ADR-028 introduced `ApiSuccessResponse<T> | ApiErrorResponse` union types. Tests accessing `response.body.data` without type guards triggered TS2339 errors.
**Solution**: Created `asSuccessResponse<T>()` type guard utility.
### 2. Mock Object Type Casting (22.7%)
Vitest mocks return `MockedFunction<T>` types. Passing to functions expecting exact signatures required explicit casting.
**Solution**: Created `asMock<T>()` utility and standardized mock patterns.
### 3. Response Body Property Access (15.1%)
Supertest `response.body` is typed as `unknown`. Direct property access violated strict mode.
**Solution**: Consistent use of `asSuccessResponse()` before accessing `.data`.
### 4. Partial Mock Missing Properties (9.7%)
Factory functions creating partial mocks lacked required properties.
**Solution**: Updated all mock factories to return complete type-safe objects.
### 5. Generic Type Parameter Issues (2.7%)
TypeScript could not infer generics in certain contexts.
**Solution**: Explicit generic parameters on factory calls and assertions.
### 6. Module Import Type Issues (1.6%)
Type mismatches in module mock declarations.
**Solution**: Proper use of `vi.mocked()` and `Mocked<typeof module>` patterns.
## Implementation Strategy
### Phase 1: Foundation (Infrastructure)
Created shared test utilities that enable fixes across all test files:
```typescript
// src/tests/utils/testHelpers.ts
export function asSuccessResponse<T>(body: unknown): ApiSuccessResponse<T>;
export function asErrorResponse(body: unknown): ApiErrorResponse;
export function asMock<T extends (...args: unknown[]) => unknown>(mock: Mock): T;
export { createMockLogger, mockLogger } from './mockLogger';
```
### Phase 2-4: Parallel Execution
Distributed work across multiple parallel tasks:
| Group | Files | Dependencies |
| ----- | ------------------------------------------ | ----------------- |
| A | Controller tests (auth, user, flyer) | Phase 1 utilities |
| B | Controller tests (recipe, inventory, etc.) | Phase 1 utilities |
| C | Service tests | None |
| D | Route tests | Phase 1 utilities |
### Phase 5: Iterative Refinement
Multiple verification and fix iterations:
1. Run type-check
2. Analyze remaining errors
3. Fix errors by file
4. Re-verify
5. Repeat until zero errors
## Files Modified
### Controller Tests (19 files)
- `src/controllers/admin.controller.test.ts`
- `src/controllers/ai.controller.test.ts`
- `src/controllers/auth.controller.test.ts`
- `src/controllers/budget.controller.test.ts`
- `src/controllers/category.controller.test.ts`
- `src/controllers/deals.controller.test.ts`
- `src/controllers/flyer.controller.test.ts`
- `src/controllers/gamification.controller.test.ts`
- `src/controllers/inventory.controller.test.ts`
- `src/controllers/personalization.controller.test.ts`
- `src/controllers/price.controller.test.ts`
- `src/controllers/reactions.controller.test.ts`
- `src/controllers/receipt.controller.test.ts`
- `src/controllers/recipe.controller.test.ts`
- `src/controllers/store.controller.test.ts`
- `src/controllers/system.controller.test.ts`
- `src/controllers/upc.controller.test.ts`
- `src/controllers/user.controller.test.ts`
### Shared Test Utilities
- `src/tests/utils/testHelpers.ts` - Type guards and mock utilities
- `src/tests/utils/mockLogger.ts` - Pino logger mock factory
- `src/tests/utils/mockFactories.ts` - 60+ entity mock factories
### Route Tests
- `src/routes/admin.*.routes.test.ts` (5 files)
- `src/routes/ai.routes.test.ts`
### Service Tests
- `src/services/receiptService.server.test.ts`
- `src/services/queueService.server.test.ts`
### Middleware Tests
- `src/middleware/apiVersion.middleware.test.ts`
## Key Patterns Established
### 1. Response Type Narrowing
```typescript
// Standard pattern for success responses
const response = await request.get('/api/v1/users/1');
const body = asSuccessResponse<User>(response.body);
expect(body.data.id).toBe(1);
// Standard pattern for error responses
expect(response.status).toBe(400);
const body = asErrorResponse(response.body);
expect(body.error.code).toBe('VALIDATION_ERROR');
```
### 2. Mock Logger Creation
```typescript
import { createMockLogger } from '../tests/utils/testHelpers';
function createMockRequest(overrides = {}): ExpressRequest {
return {
body: {},
log: createMockLogger(),
...overrides,
} as unknown as ExpressRequest;
}
```
### 3. Mock Service Casting
```typescript
import type { Mocked } from 'vitest';
vi.mock('../services/authService');
import { authService } from '../services/authService';
const mockedAuthService = authService as Mocked<typeof authService>;
mockedAuthService.login.mockResolvedValue(mockResult);
```
### 4. Mock Factory Usage
```typescript
import { createMockUserProfile, createMockFlyer } from '../tests/utils/mockFactories';
const mockUser = createMockUserProfile({ role: 'admin' });
const mockFlyer = createMockFlyer({ store: { name: 'Test Store' } });
```
## Lessons Learned
### 1. Infrastructure First
Creating shared utilities before fixing individual files dramatically reduces total effort. The `asSuccessResponse()` utility alone enabled fixes for 89 errors.
### 2. Parallel Execution Efficiency
Organizing work into independent groups allowed parallel execution, reducing wall-clock time from estimated 10 hours to approximately 3.5 hours.
### 3. Iterative Verification
Running type-check after each batch of fixes catches cascading issues early and provides clear progress metrics.
### 4. Complete Mock Factories
Investing in comprehensive mock factories pays dividends across all tests. The 60+ factory functions in `mockFactories.ts` ensure type safety throughout the test suite.
### 5. Consistent Patterns
Establishing and documenting patterns (response narrowing, mock casting, logger creation) ensures consistency and reduces future maintenance burden.
## Verification Results
### Type Check
```bash
podman exec -it flyer-crawler-dev npm run type-check
# Exit code: 0
# Output: No errors
```
### Test Suite
```bash
podman exec -it flyer-crawler-dev npm test
# Results:
# Test Files: 167 passed, 11 failed (178 total)
# Tests: 4,603 passed, 59 failed (4,662 total)
# Duration: ~4 minutes
```
### Pre-existing Failures
The 59 failing tests are pre-existing issues unrelated to this remediation:
- Integration test timing issues
- Mock isolation in globalSetup
- Redis/Queue worker interference
## Documentation Updates
1. **ADR-060**: Status updated to "Implemented" with completion metrics
2. **TESTING.md**: Added TypeScript type safety section
3. **This document**: Session archive for future reference
## Related ADRs
- [ADR-010](../../adr/0010-testing-strategy-and-standards.md) - Testing Strategy
- [ADR-028](../../adr/0028-api-response-standardization.md) - API Response Standardization
- [ADR-045](../../adr/0045-test-data-factories-and-fixtures.md) - Test Data Factories
- [ADR-057](../../adr/0057-test-remediation-post-api-versioning.md) - API Versioning Remediation
- [ADR-059](../../adr/0059-dependency-modernization.md) - tsoa Migration
- [ADR-060](../../adr/0060-typescript-test-error-remediation.md) - This Project
## Future Recommendations
1. **Enforce Type Safety in CI**: Add `npm run type-check` as a required CI step
2. **Mock Factory Maintenance**: Update factories when entity types change
3. **Pattern Documentation**: Reference TESTING.md patterns in code review guidelines
4. **New Test Template**: Create a test file template that imports standard utilities

View File

@@ -0,0 +1,844 @@
# API Versioning Developer Guide
**Status**: Complete (Phase 2)
**Last Updated**: 2026-01-27
**Implements**: ADR-008 Phase 2
**Architecture**: [api-versioning-infrastructure.md](../architecture/api-versioning-infrastructure.md)
This guide covers the API versioning infrastructure for the Flyer Crawler application. It explains how versioning works, how to add new versions, and how to deprecate old ones.
## Implementation Status
| Component | Status | Tests |
| ------------------------------ | -------- | -------------------- |
| Version Constants | Complete | Unit tests |
| Version Detection Middleware | Complete | 25 unit tests |
| Deprecation Headers Middleware | Complete | 30 unit tests |
| Version Router Factory | Complete | Integration tests |
| Server Integration | Complete | 48 integration tests |
| Developer Documentation | Complete | This guide |
**Total Tests**: 82 versioning-specific tests (100% passing)
---
## Table of Contents
1. [Overview](#overview)
2. [Architecture](#architecture)
3. [Key Concepts](#key-concepts)
4. [Developer Workflows](#developer-workflows)
5. [Version Headers](#version-headers)
6. [Testing Versioned Endpoints](#testing-versioned-endpoints)
7. [Migration Guide: v1 to v2](#migration-guide-v1-to-v2)
8. [Troubleshooting](#troubleshooting)
9. [Related Documentation](#related-documentation)
---
## Overview
The API uses URI-based versioning with the format `/api/v{MAJOR}/resource`. All endpoints are accessible at versioned paths like `/api/v1/flyers` or `/api/v2/users`.
### Current Version Status
| Version | Status | Description |
| ------- | ------ | ------------------------------------- |
| v1 | Active | Current production version |
| v2 | Active | Future version (infrastructure ready) |
### Key Features
- **Automatic version detection** from URL path
- **RFC 8594 compliant deprecation headers** when versions are deprecated
- **Backwards compatibility** via 301 redirects from unversioned paths
- **Version-aware request context** for conditional logic in handlers
- **Centralized configuration** for version lifecycle management
---
## Architecture
### Request Flow
```text
Client Request: GET /api/v1/flyers
|
v
+------+-------+
| server.ts |
| - Redirect |
| middleware |
+------+-------+
|
v
+------+-------+
| createApi |
| Router() |
+------+-------+
|
v
+------+-------+
| detectApi |
| Version |
| middleware |
+------+-------+
| req.apiVersion = 'v1'
v
+------+-------+
| Versioned |
| Router |
| (v1) |
+------+-------+
|
v
+------+-------+
| addDepreca |
| tionHeaders |
| middleware |
+------+-------+
| X-API-Version: v1
v
+------+-------+
| Domain |
| Router |
| (flyers) |
+------+-------+
|
v
Response
```
### Component Overview
| Component | File | Purpose |
| ------------------- | ------------------------------------------ | ----------------------------------------------------- |
| Version Constants | `src/config/apiVersions.ts` | Type definitions, version configs, utility functions |
| Version Detection | `src/middleware/apiVersion.middleware.ts` | Extract version from URL, validate, attach to request |
| Deprecation Headers | `src/middleware/deprecation.middleware.ts` | Add RFC 8594 headers for deprecated versions |
| Router Factory | `src/routes/versioned.ts` | Create version-specific Express routers |
| Type Extensions | `src/types/express.d.ts` | Add `apiVersion` and `versionDeprecation` to Request |
---
## Key Concepts
### 1. Version Configuration
All version definitions live in `src/config/apiVersions.ts`:
```typescript
// src/config/apiVersions.ts
// Supported versions as a const tuple
export const API_VERSIONS = ['v1', 'v2'] as const;
// Union type: 'v1' | 'v2'
export type ApiVersion = (typeof API_VERSIONS)[number];
// Version lifecycle status
export type VersionStatus = 'active' | 'deprecated' | 'sunset';
// Configuration for each version
export const VERSION_CONFIGS: Record<ApiVersion, VersionConfig> = {
v1: {
version: 'v1',
status: 'active',
},
v2: {
version: 'v2',
status: 'active',
},
};
```
### 2. Version Detection
The `detectApiVersion` middleware extracts the version from `req.params.version` and validates it:
```typescript
// How it works (src/middleware/apiVersion.middleware.ts)
// For valid versions:
// GET /api/v1/flyers -> req.apiVersion = 'v1'
// For invalid versions:
// GET /api/v99/flyers -> 404 with UNSUPPORTED_VERSION error
```
### 3. Request Context
After middleware runs, the request object has version information:
```typescript
// In any route handler
router.get('/flyers', async (req, res) => {
// Access the detected version
const version = req.apiVersion; // 'v1' | 'v2'
// Check deprecation status
if (req.versionDeprecation?.deprecated) {
req.log.warn(
{
sunset: req.versionDeprecation.sunsetDate,
},
'Client using deprecated API',
);
}
// Version-specific behavior
if (req.apiVersion === 'v2') {
return sendSuccess(res, transformV2(data));
}
return sendSuccess(res, data);
});
```
### 4. Route Registration
Routes are registered in `src/routes/versioned.ts` with version availability:
```typescript
// src/routes/versioned.ts
export const ROUTES: RouteRegistration[] = [
{
path: 'auth',
router: authRouter,
description: 'Authentication routes',
// Available in all versions (no versions array)
},
{
path: 'flyers',
router: flyerRouter,
description: 'Flyer management',
// Available in all versions
},
{
path: 'new-feature',
router: newFeatureRouter,
description: 'New feature only in v2',
versions: ['v2'], // Only available in v2
},
];
```
---
## Developer Workflows
### Adding a New API Version (e.g., v3)
**Step 1**: Add version to constants (`src/config/apiVersions.ts`)
```typescript
// Before
export const API_VERSIONS = ['v1', 'v2'] as const;
// After
export const API_VERSIONS = ['v1', 'v2', 'v3'] as const;
// Add configuration
export const VERSION_CONFIGS: Record<ApiVersion, VersionConfig> = {
v1: { version: 'v1', status: 'active' },
v2: { version: 'v2', status: 'active' },
v3: { version: 'v3', status: 'active' }, // NEW
};
```
**Step 2**: Router cache auto-updates (no changes needed)
The versioned router cache in `src/routes/versioned.ts` automatically creates routers for all versions defined in `API_VERSIONS`.
**Step 3**: Update OpenAPI documentation (`src/config/swagger.ts`)
```typescript
servers: [
{ url: '/api/v1', description: 'API v1' },
{ url: '/api/v2', description: 'API v2' },
{ url: '/api/v3', description: 'API v3 (New)' }, // NEW
],
```
**Step 4**: Test the new version
```bash
# In dev container
podman exec -it flyer-crawler-dev npm test
# Manual verification
curl -i http://localhost:3001/api/v3/health
# Should return 200 with X-API-Version: v3 header
```
### Marking a Version as Deprecated
**Step 1**: Update version config (`src/config/apiVersions.ts`)
```typescript
export const VERSION_CONFIGS: Record<ApiVersion, VersionConfig> = {
v1: {
version: 'v1',
status: 'deprecated', // Changed from 'active'
sunsetDate: '2027-01-01T00:00:00Z', // When it will be removed
successorVersion: 'v2', // Migration target
},
v2: {
version: 'v2',
status: 'active',
},
};
```
**Step 2**: Verify deprecation headers
```bash
curl -I http://localhost:3001/api/v1/health
# Expected headers:
# X-API-Version: v1
# Deprecation: true
# Sunset: 2027-01-01T00:00:00Z
# Link: </api/v2>; rel="successor-version"
# X-API-Deprecation-Notice: API v1 is deprecated and will be sunset...
```
**Step 3**: Monitor deprecation usage
Check logs for `Deprecated API version accessed` messages with context about which clients are still using deprecated versions.
### Adding Version-Specific Routes
**Scenario**: Add a new endpoint only available in v2+
**Step 1**: Create the route handler (new or existing file)
```typescript
// src/routes/newFeature.routes.ts
import { Router } from 'express';
import { sendSuccess } from '../utils/apiResponse';
const router = Router();
router.get('/', async (req, res) => {
// This endpoint only exists in v2+
sendSuccess(res, { feature: 'new-feature-data' });
});
export default router;
```
**Step 2**: Register with version restriction (`src/routes/versioned.ts`)
```typescript
import newFeatureRouter from './newFeature.routes';
export const ROUTES: RouteRegistration[] = [
// ... existing routes ...
{
path: 'new-feature',
router: newFeatureRouter,
description: 'New feature only available in v2+',
versions: ['v2'], // Not available in v1
},
];
```
**Step 3**: Verify route availability
```bash
# v1 - should return 404
curl -i http://localhost:3001/api/v1/new-feature
# HTTP/1.1 404 Not Found
# v2 - should work
curl -i http://localhost:3001/api/v2/new-feature
# HTTP/1.1 200 OK
# X-API-Version: v2
```
### Adding Version-Specific Behavior in Existing Routes
For routes that exist in multiple versions but behave differently:
```typescript
// src/routes/flyer.routes.ts
router.get('/:id', async (req, res) => {
const flyer = await flyerService.getFlyer(req.params.id, req.log);
// Different response format per version
if (req.apiVersion === 'v2') {
// v2 returns expanded store data
return sendSuccess(res, {
...flyer,
store: await storeService.getStore(flyer.store_id, req.log),
});
}
// v1 returns just the flyer
return sendSuccess(res, flyer);
});
```
---
## Version Headers
### Response Headers
All versioned API responses include these headers:
| Header | Always Present | Description |
| -------------------------- | ------------------ | ------------------------------------------------------- |
| `X-API-Version` | Yes | The API version handling the request |
| `Deprecation` | Only if deprecated | `true` when version is deprecated |
| `Sunset` | Only if configured | ISO 8601 date when version will be removed |
| `Link` | Only if configured | URL to successor version with `rel="successor-version"` |
| `X-API-Deprecation-Notice` | Only if deprecated | Human-readable deprecation message |
### Example: Active Version Response
```http
HTTP/1.1 200 OK
X-API-Version: v2
Content-Type: application/json
```
### Example: Deprecated Version Response
```http
HTTP/1.1 200 OK
X-API-Version: v1
Deprecation: true
Sunset: 2027-01-01T00:00:00Z
Link: </api/v2>; rel="successor-version"
X-API-Deprecation-Notice: API v1 is deprecated and will be sunset on 2027-01-01T00:00:00Z. Please migrate to v2.
Content-Type: application/json
```
### RFC Compliance
The deprecation headers follow these standards:
- **RFC 8594**: The "Sunset" HTTP Header Field
- **draft-ietf-httpapi-deprecation-header**: The "Deprecation" HTTP Header Field
- **RFC 8288**: Web Linking (for `rel="successor-version"`)
---
## Testing Versioned Endpoints
### Unit Testing Middleware
See test files for patterns:
- `src/middleware/apiVersion.middleware.test.ts`
- `src/middleware/deprecation.middleware.test.ts`
**Testing version detection**:
```typescript
// src/middleware/apiVersion.middleware.test.ts
import { detectApiVersion } from './apiVersion.middleware';
import { createMockRequest } from '../tests/utils/createMockRequest';
describe('detectApiVersion', () => {
it('should extract v1 from req.params.version', () => {
const mockRequest = createMockRequest({
params: { version: 'v1' },
});
const mockResponse = { status: vi.fn().mockReturnThis(), json: vi.fn() };
const mockNext = vi.fn();
detectApiVersion(mockRequest, mockResponse, mockNext);
expect(mockRequest.apiVersion).toBe('v1');
expect(mockNext).toHaveBeenCalled();
});
it('should return 404 for invalid version', () => {
const mockRequest = createMockRequest({
params: { version: 'v99' },
});
const mockResponse = {
status: vi.fn().mockReturnThis(),
json: vi.fn(),
};
const mockNext = vi.fn();
detectApiVersion(mockRequest, mockResponse, mockNext);
expect(mockNext).not.toHaveBeenCalled();
expect(mockResponse.status).toHaveBeenCalledWith(404);
});
});
```
**Testing deprecation headers**:
```typescript
// src/middleware/deprecation.middleware.test.ts
import { addDeprecationHeaders } from './deprecation.middleware';
import { VERSION_CONFIGS } from '../config/apiVersions';
describe('addDeprecationHeaders', () => {
beforeEach(() => {
// Mark v1 as deprecated for test
VERSION_CONFIGS.v1 = {
version: 'v1',
status: 'deprecated',
sunsetDate: '2027-01-01T00:00:00Z',
successorVersion: 'v2',
};
});
it('should add all deprecation headers', () => {
const setHeader = vi.fn();
const middleware = addDeprecationHeaders('v1');
middleware(mockRequest, { set: setHeader }, mockNext);
expect(setHeader).toHaveBeenCalledWith('Deprecation', 'true');
expect(setHeader).toHaveBeenCalledWith('Sunset', '2027-01-01T00:00:00Z');
expect(setHeader).toHaveBeenCalledWith('Link', '</api/v2>; rel="successor-version"');
});
});
```
### Integration Testing
**Test versioned endpoints**:
```typescript
import request from 'supertest';
import app from '../../server';
describe('API Versioning Integration', () => {
it('should return X-API-Version header for v1', async () => {
const response = await request(app).get('/api/v1/health').expect(200);
expect(response.headers['x-api-version']).toBe('v1');
});
it('should return 404 for unsupported version', async () => {
const response = await request(app).get('/api/v99/health').expect(404);
expect(response.body.error.code).toBe('UNSUPPORTED_VERSION');
});
it('should redirect unversioned paths to v1', async () => {
const response = await request(app).get('/api/health').expect(301);
expect(response.headers.location).toBe('/api/v1/health');
});
});
```
### Running Tests
```bash
# Run all tests in container (required)
podman exec -it flyer-crawler-dev npm test
# Run only middleware tests
podman exec -it flyer-crawler-dev npm test -- apiVersion
podman exec -it flyer-crawler-dev npm test -- deprecation
# Type check
podman exec -it flyer-crawler-dev npm run type-check
```
---
## Migration Guide: v1 to v2
When v2 is introduced with breaking changes, follow this migration process.
### For API Consumers (Frontend/Mobile)
**Step 1**: Check current API version usage
```typescript
// Frontend apiClient.ts
const API_BASE_URL = import.meta.env.VITE_API_BASE_URL || '/api/v1';
```
**Step 2**: Monitor deprecation headers
When v1 is deprecated, responses will include:
```http
Deprecation: true
Sunset: 2027-01-01T00:00:00Z
Link: </api/v2>; rel="successor-version"
```
**Step 3**: Update to v2
```typescript
// Change API base URL
const API_BASE_URL = import.meta.env.VITE_API_BASE_URL || '/api/v2';
```
**Step 4**: Handle response format changes
If v2 changes response formats, update your type definitions and parsing logic:
```typescript
// v1 response
interface FlyerResponseV1 {
id: number;
store_id: number;
}
// v2 response (example: includes embedded store)
interface FlyerResponseV2 {
id: string; // Changed to UUID
store: {
id: string;
name: string;
};
}
```
### For Backend Developers
**Step 1**: Create v2-specific handlers (if needed)
For breaking changes, create version-specific route files:
```text
src/routes/
flyer.routes.ts # Shared/v1 handlers
flyer.v2.routes.ts # v2-specific handlers (if significantly different)
```
**Step 2**: Register version-specific routes
```typescript
// src/routes/versioned.ts
export const ROUTES: RouteRegistration[] = [
{
path: 'flyers',
router: flyerRouter,
description: 'Flyer routes (v1)',
versions: ['v1'],
},
{
path: 'flyers',
router: flyerRouterV2,
description: 'Flyer routes (v2 with breaking changes)',
versions: ['v2'],
},
];
```
**Step 3**: Document changes
Update OpenAPI documentation to reflect v2 changes and mark v1 as deprecated.
### Timeline Example
| Date | Action |
| ---------- | ------------------------------------------ |
| T+0 | v2 released, v1 marked deprecated |
| T+0 | Deprecation headers added to v1 responses |
| T+30 days | Sunset warning emails to known integrators |
| T+90 days | v1 returns 410 Gone |
| T+120 days | v1 code removed |
---
## Troubleshooting
### Issue: "UNSUPPORTED_VERSION" Error
**Symptom**: Request to `/api/v3/...` returns 404 with `UNSUPPORTED_VERSION`
**Cause**: Version `v3` is not defined in `API_VERSIONS`
**Solution**: Add the version to `src/config/apiVersions.ts`:
```typescript
export const API_VERSIONS = ['v1', 'v2', 'v3'] as const;
export const VERSION_CONFIGS = {
// ...
v3: { version: 'v3', status: 'active' },
};
```
### Issue: Missing X-API-Version Header
**Symptom**: Response doesn't include `X-API-Version` header
**Cause**: Request didn't go through versioned router
**Solution**: Ensure the route is registered in `src/routes/versioned.ts` and mounted under `/api/:version`
### Issue: Deprecation Headers Not Appearing
**Symptom**: Deprecated version works but no deprecation headers
**Cause**: Version status not set to `'deprecated'` in config
**Solution**: Update `VERSION_CONFIGS`:
```typescript
v1: {
version: 'v1',
status: 'deprecated', // Must be 'deprecated', not 'active'
sunsetDate: '2027-01-01T00:00:00Z',
successorVersion: 'v2',
},
```
### Issue: Route Available in Wrong Version
**Symptom**: Route works in v1 but should only be in v2
**Cause**: Missing `versions` restriction in route registration
**Solution**: Add `versions` array:
```typescript
{
path: 'new-feature',
router: newFeatureRouter,
versions: ['v2'], // Add this to restrict availability
},
```
### Issue: Unversioned Paths Not Redirecting
**Symptom**: `/api/flyers` returns 404 instead of redirecting to `/api/v1/flyers`
**Cause**: Redirect middleware order issue in `server.ts`
**Solution**: Ensure redirect middleware is mounted BEFORE `createApiRouter()`:
```typescript
// server.ts - correct order
app.use('/api', redirectMiddleware); // First
app.use('/api', createApiRouter()); // Second
```
### Issue: TypeScript Errors on req.apiVersion
**Symptom**: `Property 'apiVersion' does not exist on type 'Request'`
**Cause**: Type extensions not being picked up
**Solution**: Ensure `src/types/express.d.ts` is included in tsconfig:
```json
{
"compilerOptions": {
"typeRoots": ["./node_modules/@types", "./src/types"]
},
"include": ["src/**/*"]
}
```
### Issue: Router Cache Stale After Config Change
**Symptom**: Version behavior doesn't update after changing `VERSION_CONFIGS`
**Cause**: Routers are cached at startup
**Solution**: Use `refreshRouterCache()` or restart the server:
```typescript
import { refreshRouterCache } from './src/routes/versioned';
// After config changes
refreshRouterCache();
```
---
## Related Documentation
### Architecture Decision Records
| ADR | Title |
| ------------------------------------------------------------------------ | ---------------------------- |
| [ADR-008](../adr/0008-api-versioning-strategy.md) | API Versioning Strategy |
| [ADR-003](../adr/0003-standardized-input-validation-using-middleware.md) | Input Validation |
| [ADR-028](../adr/0028-api-response-standardization.md) | API Response Standardization |
| [ADR-018](../adr/0018-api-documentation-strategy.md) | API Documentation Strategy |
### Implementation Files
| File | Description |
| -------------------------------------------------------------------------------------------- | ---------------------------- |
| [`src/config/apiVersions.ts`](../../src/config/apiVersions.ts) | Version constants and config |
| [`src/middleware/apiVersion.middleware.ts`](../../src/middleware/apiVersion.middleware.ts) | Version detection |
| [`src/middleware/deprecation.middleware.ts`](../../src/middleware/deprecation.middleware.ts) | Deprecation headers |
| [`src/routes/versioned.ts`](../../src/routes/versioned.ts) | Router factory |
| [`src/types/express.d.ts`](../../src/types/express.d.ts) | Request type extensions |
| [`server.ts`](../../server.ts) | Application entry point |
### Test Files
| File | Description |
| ------------------------------------------------------------------------------------------------------ | ------------------------ |
| [`src/middleware/apiVersion.middleware.test.ts`](../../src/middleware/apiVersion.middleware.test.ts) | Version detection tests |
| [`src/middleware/deprecation.middleware.test.ts`](../../src/middleware/deprecation.middleware.test.ts) | Deprecation header tests |
### External References
- [RFC 8594: The "Sunset" HTTP Header Field](https://datatracker.ietf.org/doc/html/rfc8594)
- [draft-ietf-httpapi-deprecation-header](https://datatracker.ietf.org/doc/draft-ietf-httpapi-deprecation-header/)
- [RFC 8288: Web Linking](https://datatracker.ietf.org/doc/html/rfc8288)
---
## Quick Reference
### Files to Modify for Common Tasks
| Task | Files |
| ------------------------------ | ---------------------------------------------------- |
| Add new version | `src/config/apiVersions.ts`, `src/config/swagger.ts` |
| Deprecate version | `src/config/apiVersions.ts` |
| Add version-specific route | `src/routes/versioned.ts` |
| Version-specific handler logic | Route file (e.g., `src/routes/flyer.routes.ts`) |
### Key Functions
```typescript
// Check if version is valid
isValidApiVersion('v1'); // true
isValidApiVersion('v99'); // false
// Get version from request with fallback
getRequestApiVersion(req); // Returns 'v1' | 'v2'
// Check if request has valid version
hasApiVersion(req); // boolean
// Get deprecation info
getVersionDeprecation('v1'); // { deprecated: false, ... }
```
### Commands
```bash
# Run all tests
podman exec -it flyer-crawler-dev npm test
# Type check
podman exec -it flyer-crawler-dev npm run type-check
# Check version headers manually
curl -I http://localhost:3001/api/v1/health
# Test deprecation (after marking v1 deprecated)
curl -I http://localhost:3001/api/v1/health | grep -E "(Deprecation|Sunset|Link|X-API)"
```
curl -I http://localhost:3001/api/v1/health | grep -E "(Deprecation|Sunset|Link|X-API)"
```

View File

@@ -2,8 +2,26 @@
Common code patterns extracted from Architecture Decision Records (ADRs). Use these as templates when writing new code.
## Quick Reference
| Pattern | Key Function/Class | Import From |
| -------------------- | ------------------------------------------------- | ------------------------------------- |
| **tsoa Controllers** | `BaseController`, `@Route`, `@Security` | `src/controllers/base.controller.ts` |
| Error Handling | `handleDbError()`, `NotFoundError` | `src/services/db/errors.db.ts` |
| Repository Methods | `get*`, `find*`, `list*` | `src/services/db/*.db.ts` |
| API Responses | `sendSuccess()`, `sendPaginated()`, `sendError()` | `src/utils/apiResponse.ts` |
| Transactions | `withTransaction()` | `src/services/db/connection.db.ts` |
| Validation | `validateRequest()` | `src/middleware/validation.ts` |
| Authentication | `authenticateJWT`, `@Security('bearerAuth')` | `src/middleware/auth.ts` |
| Caching | `cacheService` | `src/services/cache.server.ts` |
| Background Jobs | Queue classes | `src/services/queues.server.ts` |
| Feature Flags | `isFeatureEnabled()`, `useFeatureFlag()` | `src/services/featureFlags.server.ts` |
---
## Table of Contents
- [tsoa Controllers](#tsoa-controllers)
- [Error Handling](#error-handling)
- [Repository Patterns](#repository-patterns)
- [API Response Patterns](#api-response-patterns)
@@ -12,12 +30,166 @@ Common code patterns extracted from Architecture Decision Records (ADRs). Use th
- [Authentication](#authentication)
- [Caching](#caching)
- [Background Jobs](#background-jobs)
- [Feature Flags](#feature-flags)
---
## tsoa Controllers
**ADR**: [ADR-018](../adr/0018-api-documentation-strategy.md), [ADR-059](../adr/0059-dependency-modernization.md)
All API endpoints are implemented as tsoa controller classes that extend `BaseController`. This pattern provides type-safe OpenAPI documentation generation and standardized response formatting.
### Basic Controller Structure
```typescript
import {
Route,
Tags,
Get,
Post,
Body,
Path,
Query,
Security,
SuccessResponse,
Response,
} from 'tsoa';
import type { Request as ExpressRequest } from 'express';
import {
BaseController,
SuccessResponse as SuccessResponseType,
ErrorResponse,
} from './base.controller';
interface CreateItemRequest {
name: string;
description?: string;
}
interface ItemResponse {
id: number;
name: string;
created_at: string;
}
@Route('items')
@Tags('Items')
export class ItemController extends BaseController {
/**
* Get an item by ID.
* @summary Get item
* @param id Item ID
*/
@Get('{id}')
@SuccessResponse(200, 'Item retrieved')
@Response<ErrorResponse>(404, 'Item not found')
public async getItem(@Path() id: number): Promise<SuccessResponseType<ItemResponse>> {
const item = await itemService.getItemById(id);
return this.success(item);
}
/**
* Create a new item. Requires authentication.
* @summary Create item
*/
@Post()
@Security('bearerAuth')
@SuccessResponse(201, 'Item created')
@Response<ErrorResponse>(401, 'Not authenticated')
public async createItem(
@Body() body: CreateItemRequest,
@Request() request: ExpressRequest,
): Promise<SuccessResponseType<ItemResponse>> {
const user = request.user as UserProfile;
const item = await itemService.createItem(body, user.user.user_id);
return this.created(item);
}
}
```
### BaseController Response Helpers
```typescript
// Success response (200)
return this.success(data);
// Created response (201)
return this.created(data);
// Paginated response
const { page, limit } = this.normalizePagination(queryPage, queryLimit);
return this.paginated(items, { page, limit, total });
// Message-only response
return this.message('Operation completed');
// No content (204)
return this.noContent();
```
### Authentication with @Security
```typescript
import { Security, Request } from 'tsoa';
import { requireAdminRole } from '../middleware/tsoaAuthentication';
// Require authentication
@Get('profile')
@Security('bearerAuth')
public async getProfile(@Request() req: ExpressRequest): Promise<...> {
const user = req.user as UserProfile;
return this.success(user);
}
// Require admin role
@Delete('users/{id}')
@Security('bearerAuth')
public async deleteUser(@Path() id: string, @Request() req: ExpressRequest): Promise<void> {
requireAdminRole(req.user as UserProfile);
await userService.deleteUser(id);
return this.noContent();
}
```
### Error Handling in Controllers
```typescript
import { NotFoundError, ValidationError, ForbiddenError } from './base.controller';
// Throw errors - they're handled by the global error handler
throw new NotFoundError('Item', id); // 404
throw new ValidationError([], 'Invalid'); // 400
throw new ForbiddenError('Admin only'); // 403
```
### Rate Limiting
```typescript
import { Middlewares } from 'tsoa';
import { loginLimiter } from '../config/rateLimiters';
@Post('login')
@Middlewares(loginLimiter)
@Response<ErrorResponse>(429, 'Too many attempts')
public async login(@Body() body: LoginRequest): Promise<...> { ... }
```
### Regenerating Routes
After modifying controllers, regenerate the tsoa routes:
```bash
npm run tsoa:spec && npm run tsoa:routes
```
**Full Guide**: See [TSOA-MIGRATION-GUIDE.md](./TSOA-MIGRATION-GUIDE.md) for comprehensive documentation.
---
## Error Handling
**ADR**: [ADR-001](../adr/0001-standardized-error-handling-for-database-operations.md)
**ADR**: [ADR-001](../adr/0001-standardized-error-handling.md)
### Repository Layer Error Handling
@@ -47,16 +219,20 @@ export async function getFlyerById(id: number, client?: PoolClient): Promise<Fly
```typescript
import { sendError } from '../utils/apiResponse';
app.get('/api/flyers/:id', async (req, res) => {
app.get('/api/v1/flyers/:id', async (req, res) => {
try {
const flyer = await flyerDb.getFlyerById(parseInt(req.params.id));
return sendSuccess(res, flyer);
} catch (error) {
// IMPORTANT: Use req.originalUrl for dynamic path logging (not hardcoded paths)
req.log.error({ error }, `Error in ${req.originalUrl.split('?')[0]}:`);
return sendError(res, error);
}
});
```
**Best Practice**: Always use `req.originalUrl.split('?')[0]` in error log messages instead of hardcoded paths. This ensures logs reflect the actual request URL including version prefixes (`/api/v1/`). See [Error Logging Path Patterns](ERROR-LOGGING-PATHS.md) for details.
### Custom Error Types
```typescript
@@ -74,7 +250,7 @@ throw new DatabaseError('Failed to insert flyer', originalError);
## Repository Patterns
**ADR**: [ADR-034](../adr/0034-repository-layer-method-naming-conventions.md)
**ADR**: [ADR-034](../adr/0034-repository-pattern-standards.md)
### Method Naming Conventions
@@ -151,16 +327,17 @@ export async function listActiveFlyers(client?: PoolClient): Promise<Flyer[]> {
## API Response Patterns
**ADR**: [ADR-028](../adr/0028-consistent-api-response-format.md)
**ADR**: [ADR-028](../adr/0028-api-response-standardization.md)
### Success Response
```typescript
import { sendSuccess } from '../utils/apiResponse';
app.post('/api/flyers', async (req, res) => {
app.post('/api/v1/flyers', async (req, res) => {
const flyer = await flyerService.createFlyer(req.body);
return sendSuccess(res, flyer, 'Flyer created successfully', 201);
// sendSuccess(res, data, statusCode?, meta?)
return sendSuccess(res, flyer, 201);
});
```
@@ -169,30 +346,32 @@ app.post('/api/flyers', async (req, res) => {
```typescript
import { sendPaginated } from '../utils/apiResponse';
app.get('/api/flyers', async (req, res) => {
const { page = 1, pageSize = 20 } = req.query;
const { items, total } = await flyerService.listFlyers(page, pageSize);
app.get('/api/v1/flyers', async (req, res) => {
const page = parseInt(req.query.page as string) || 1;
const limit = parseInt(req.query.limit as string) || 20;
const { items, total } = await flyerService.listFlyers(page, limit);
return sendPaginated(res, {
items,
total,
page: parseInt(page),
pageSize: parseInt(pageSize),
});
// sendPaginated(res, data[], { page, limit, total }, meta?)
return sendPaginated(res, items, { page, limit, total });
});
```
### Error Response
```typescript
import { sendError } from '../utils/apiResponse';
import { sendError, sendSuccess, ErrorCode } from '../utils/apiResponse';
app.get('/api/flyers/:id', async (req, res) => {
app.get('/api/v1/flyers/:id', async (req, res) => {
try {
const flyer = await flyerDb.getFlyerById(parseInt(req.params.id));
return sendSuccess(res, flyer);
} catch (error) {
return sendError(res, error); // Automatically maps error to correct status
// sendError(res, code, message, statusCode?, details?, meta?)
if (error instanceof NotFoundError) {
return sendError(res, ErrorCode.NOT_FOUND, error.message, 404);
}
req.log.error({ error }, `Error in ${req.originalUrl.split('?')[0]}:`);
return sendError(res, ErrorCode.INTERNAL_ERROR, 'An error occurred', 500);
}
});
```
@@ -201,12 +380,12 @@ app.get('/api/flyers/:id', async (req, res) => {
## Transaction Management
**ADR**: [ADR-002](../adr/0002-transaction-management-pattern.md)
**ADR**: [ADR-002](../adr/0002-standardized-transaction-management.md)
### Basic Transaction
```typescript
import { withTransaction } from '../services/db/transaction.db';
import { withTransaction } from '../services/db/connection.db';
export async function createFlyerWithItems(
flyerData: FlyerInput,
@@ -258,7 +437,7 @@ export async function bulkImportFlyers(flyersData: FlyerInput[]): Promise<Import
## Input Validation
**ADR**: [ADR-003](../adr/0003-input-validation-framework.md)
**ADR**: [ADR-003](../adr/0003-standardized-input-validation-using-middleware.md)
### Zod Schema Definition
@@ -294,10 +473,10 @@ export type CreateFlyerInput = z.infer<typeof createFlyerSchema>;
import { validateRequest } from '../middleware/validation';
import { createFlyerSchema } from '../schemas/flyer.schemas';
app.post('/api/flyers', validateRequest(createFlyerSchema), async (req, res) => {
app.post('/api/v1/flyers', validateRequest(createFlyerSchema), async (req, res) => {
// req.body is now type-safe and validated
const flyer = await flyerService.createFlyer(req.body);
return sendSuccess(res, flyer, 'Flyer created successfully', 201);
return sendSuccess(res, flyer, 201);
});
```
@@ -327,7 +506,7 @@ export async function processFlyer(data: unknown): Promise<Flyer> {
import { authenticateJWT } from '../middleware/auth';
app.get(
'/api/profile',
'/api/v1/profile',
authenticateJWT, // Middleware adds req.user
async (req, res) => {
// req.user is guaranteed to exist
@@ -343,7 +522,7 @@ app.get(
import { optionalAuth } from '../middleware/auth';
app.get(
'/api/flyers',
'/api/v1/flyers',
optionalAuth, // req.user may or may not exist
async (req, res) => {
const flyers = req.user
@@ -370,7 +549,7 @@ export function generateToken(user: User): string {
## Caching
**ADR**: [ADR-029](../adr/0029-redis-caching-strategy.md)
**ADR**: [ADR-009](../adr/0009-caching-strategy-for-read-heavy-operations.md)
### Cache Pattern
@@ -410,7 +589,7 @@ export async function updateFlyer(id: number, data: UpdateFlyerInput): Promise<F
## Background Jobs
**ADR**: [ADR-036](../adr/0036-background-job-processing-architecture.md)
**ADR**: [ADR-006](../adr/0006-background-job-processing-and-task-queues.md)
### Queue Job
@@ -469,6 +648,153 @@ const flyerWorker = new Worker(
---
## Feature Flags
**ADR**: [ADR-024](../adr/0024-feature-flagging-strategy.md)
Feature flags enable controlled feature rollout, A/B testing, and quick production disablement without redeployment. All flags default to `false` (opt-in model).
### Backend Usage
```typescript
import { isFeatureEnabled, getFeatureFlags } from '../services/featureFlags.server';
// Check a specific flag in route handler
router.get('/dashboard', async (req, res) => {
if (isFeatureEnabled('newDashboard')) {
return sendSuccess(res, { version: 'v2', data: await getNewDashboardData() });
}
return sendSuccess(res, { version: 'v1', data: await getLegacyDashboardData() });
});
// Check flag in service layer
function processFlyer(flyer: Flyer): ProcessedFlyer {
if (isFeatureEnabled('experimentalAi')) {
return processWithExperimentalAi(flyer);
}
return processWithStandardAi(flyer);
}
// Get all flags (admin endpoint)
router.get('/admin/feature-flags', requireAdmin, async (req, res) => {
sendSuccess(res, { flags: getFeatureFlags() });
});
```
### Frontend Usage
```tsx
import { useFeatureFlag, useAllFeatureFlags } from '../hooks/useFeatureFlag';
import { FeatureFlag } from '../components/FeatureFlag';
// Hook approach - for logic beyond rendering
function Dashboard() {
const isNewDashboard = useFeatureFlag('newDashboard');
useEffect(() => {
if (isNewDashboard) {
analytics.track('new_dashboard_viewed');
}
}, [isNewDashboard]);
return isNewDashboard ? <NewDashboard /> : <LegacyDashboard />;
}
// Declarative component approach
function App() {
return (
<FeatureFlag feature="newDashboard" fallback={<LegacyDashboard />}>
<NewDashboard />
</FeatureFlag>
);
}
// Debug panel showing all flags
function DebugPanel() {
const flags = useAllFeatureFlags();
return (
<ul>
{Object.entries(flags).map(([name, enabled]) => (
<li key={name}>
{name}: {enabled ? 'ON' : 'OFF'}
</li>
))}
</ul>
);
}
```
### Adding a New Flag
1. **Backend** (`src/config/env.ts`):
```typescript
// In featureFlagsSchema
myNewFeature: booleanString(false), // FEATURE_MY_NEW_FEATURE
// In loadEnvVars()
myNewFeature: process.env.FEATURE_MY_NEW_FEATURE,
```
2. **Frontend** (`src/config.ts` and `src/vite-env.d.ts`):
```typescript
// In config.ts featureFlags section
myNewFeature: import.meta.env.VITE_FEATURE_MY_NEW_FEATURE === 'true',
// In vite-env.d.ts
readonly VITE_FEATURE_MY_NEW_FEATURE?: string;
```
3. **Environment** (`.env.example`):
```bash
# FEATURE_MY_NEW_FEATURE=false
# VITE_FEATURE_MY_NEW_FEATURE=false
```
### Testing Feature Flags
```typescript
// Backend - reset modules to test different states
beforeEach(() => {
vi.resetModules();
process.env.FEATURE_NEW_DASHBOARD = 'true';
});
// Frontend - mock config module
vi.mock('../config', () => ({
default: {
featureFlags: {
newDashboard: true,
betaRecipes: false,
},
},
}));
```
### Flag Lifecycle
| Phase | Actions |
| ---------- | -------------------------------------------------------------- |
| **Add** | Add to schemas (backend + frontend), default `false`, document |
| **Enable** | Set env var `='true'`, restart application |
| **Remove** | Remove conditional code, remove from schemas, remove env vars |
| **Sunset** | Max 3 months after full rollout - remove flag |
### Current Flags
| Flag | Backend Env Var | Frontend Env Var | Purpose |
| ---------------- | ------------------------- | ------------------------------ | ------------------------ |
| `bugsinkSync` | `FEATURE_BUGSINK_SYNC` | `VITE_FEATURE_BUGSINK_SYNC` | Bugsink error sync |
| `advancedRbac` | `FEATURE_ADVANCED_RBAC` | `VITE_FEATURE_ADVANCED_RBAC` | Advanced RBAC features |
| `newDashboard` | `FEATURE_NEW_DASHBOARD` | `VITE_FEATURE_NEW_DASHBOARD` | New dashboard experience |
| `betaRecipes` | `FEATURE_BETA_RECIPES` | `VITE_FEATURE_BETA_RECIPES` | Beta recipe features |
| `experimentalAi` | `FEATURE_EXPERIMENTAL_AI` | `VITE_FEATURE_EXPERIMENTAL_AI` | Experimental AI features |
| `debugMode` | `FEATURE_DEBUG_MODE` | `VITE_FEATURE_DEBUG_MODE` | Debug mode |
---
## Related Documentation
- [ADR Index](../adr/index.md) - All architecture decision records

View File

@@ -229,7 +229,7 @@ SELECT * FROM flyers WHERE store_id = 1;
- Add missing indexes
- Optimize WHERE clauses
- Use connection pooling
- See [ADR-034](../adr/0034-repository-layer-method-naming-conventions.md)
- See [ADR-034](../adr/0034-repository-pattern-standards.md)
---
@@ -237,7 +237,7 @@ SELECT * FROM flyers WHERE store_id = 1;
### Tests Pass on Windows, Fail in Container
**Cause**: Platform-specific behavior (ADR-014)
**Cause**: Platform-specific behavior ([ADR-014](../adr/0014-containerization-and-deployment-strategy.md))
**Rule**: Container results are authoritative. Windows results are unreliable.

View File

@@ -93,7 +93,7 @@ When the container starts (`scripts/dev-entrypoint.sh`):
PM2 manages three processes in the dev container:
```
```text
+--------------------+ +------------------------+ +--------------------+
| flyer-crawler- | | flyer-crawler- | | flyer-crawler- |
| api-dev | | worker-dev | | vite-dev |
@@ -404,5 +404,5 @@ podman exec -it flyer-crawler-dev pm2 restart flyer-crawler-api-dev
- [DEBUGGING.md](DEBUGGING.md) - Debugging strategies
- [LOGSTASH-QUICK-REF.md](../operations/LOGSTASH-QUICK-REF.md) - Logstash quick reference
- [DEV-CONTAINER-BUGSINK.md](../DEV-CONTAINER-BUGSINK.md) - Bugsink setup in dev container
- [ADR-014](../adr/0014-linux-only-platform.md) - Linux-only platform decision
- [ADR-050](../adr/0050-postgresql-function-observability.md) - PostgreSQL function observability
- [ADR-014](../adr/0014-containerization-and-deployment-strategy.md) - Containerization and deployment strategy
- [ADR-050](../adr/0050-postgresql-function-observability.md) - PostgreSQL function observability (includes log aggregation)

View File

@@ -0,0 +1,153 @@
# Error Logging Path Patterns
## Overview
This document describes the correct pattern for logging request paths in error handlers within Express route files. Following this pattern ensures that error logs accurately reflect the actual request URL, including any API version prefixes.
## The Problem
When ADR-008 (API Versioning Strategy) was implemented, all routes were moved from `/api/*` to `/api/v1/*`. However, some error log messages contained hardcoded paths that did not update automatically:
```typescript
// INCORRECT - hardcoded path
req.log.error({ error }, 'Error in /api/flyers/:id:');
```
This caused 16 unit test failures because tests expected the error log message to contain `/api/v1/flyers/:id` but received `/api/flyers/:id`.
## The Solution
Always use `req.originalUrl` to dynamically capture the actual request path in error logs:
```typescript
// CORRECT - dynamic path from request
req.log.error({ error }, `Error in ${req.originalUrl.split('?')[0]}:`);
```
### Why `req.originalUrl`?
| Property | Value for `/api/v1/flyers/123?active=true` | Use Case |
| ----------------- | ------------------------------------------ | ----------------------------------- |
| `req.url` | `/123?active=true` | Path relative to router mount point |
| `req.path` | `/123` | Path without query string |
| `req.originalUrl` | `/api/v1/flyers/123?active=true` | Full original request URL |
| `req.baseUrl` | `/api/v1/flyers` | Router mount path |
`req.originalUrl` is the correct choice because:
1. It contains the full path including version prefix (`/api/v1/`)
2. It reflects what the client actually requested
3. It makes log messages searchable by the actual endpoint path
4. It automatically adapts when routes are mounted at different paths
### Stripping Query Parameters
Use `.split('?')[0]` to remove query parameters from log messages:
```typescript
// Request: /api/v1/flyers?page=1&limit=20
req.originalUrl.split('?')[0]; // Returns: /api/v1/flyers
```
This keeps log messages clean and prevents sensitive query parameters from appearing in logs.
## Standard Error Logging Pattern
### Basic Pattern
```typescript
router.get('/:id', async (req, res) => {
try {
const result = await someService.getData(req.params.id);
return sendSuccess(res, result);
} catch (error) {
req.log.error({ error }, `Error in ${req.originalUrl.split('?')[0]}:`);
return sendError(res, error);
}
});
```
### With Additional Context
```typescript
router.post('/', async (req, res) => {
try {
const result = await someService.createItem(req.body);
return sendSuccess(res, result, 'Item created', 201);
} catch (error) {
req.log.error(
{ error, userId: req.user?.id, body: req.body },
`Error creating item in ${req.originalUrl.split('?')[0]}:`,
);
return sendError(res, error);
}
});
```
### Descriptive Messages
For clarity, include a brief description of the operation:
```typescript
// Good - describes the operation
req.log.error({ error }, `Error fetching recipes in ${req.originalUrl.split('?')[0]}:`);
req.log.error({ error }, `Error updating user profile in ${req.originalUrl.split('?')[0]}:`);
// Acceptable - just the path
req.log.error({ error }, `Error in ${req.originalUrl.split('?')[0]}:`);
// Bad - hardcoded path
req.log.error({ error }, 'Error in /api/recipes:');
```
## Files Updated in Initial Fix (2026-01-27)
The following files were updated to use this pattern:
| File | Error Log Statements Fixed |
| -------------------------------------- | -------------------------- |
| `src/routes/recipe.routes.ts` | 3 |
| `src/routes/stats.routes.ts` | 1 |
| `src/routes/flyer.routes.ts` | 2 |
| `src/routes/personalization.routes.ts` | 3 |
## Testing Error Log Messages
When writing tests that verify error log messages, use flexible matchers that account for versioned paths:
```typescript
// Good - matches any version prefix
expect(logSpy).toHaveBeenCalledWith(
expect.objectContaining({ error: expect.any(Error) }),
expect.stringContaining('/flyers'),
);
// Good - explicit version match
expect(logSpy).toHaveBeenCalledWith(
expect.objectContaining({ error: expect.any(Error) }),
expect.stringContaining('/api/v1/flyers'),
);
// Bad - hardcoded unversioned path (will fail)
expect(logSpy).toHaveBeenCalledWith(
expect.objectContaining({ error: expect.any(Error) }),
'Error in /api/flyers:',
);
```
## Checklist for New Routes
When creating new route handlers:
- [ ] Use `req.originalUrl.split('?')[0]` in all error log messages
- [ ] Include descriptive text about the operation being performed
- [ ] Add structured context (userId, relevant IDs) to the log object
- [ ] Write tests that verify error logs contain the versioned path
## Related Documentation
- [ADR-008: API Versioning Strategy](../adr/0008-api-versioning-strategy.md) - Versioning implementation details
- [ADR-057: Test Remediation Post-API Versioning](../adr/0057-test-remediation-post-api-versioning.md) - Comprehensive remediation guide
- [ADR-004: Structured Logging](../adr/0004-standardized-application-wide-structured-logging.md) - Logging standards
- [CODE-PATTERNS.md](CODE-PATTERNS.md) - General code patterns
- [TESTING.md](TESTING.md) - Testing guidelines

View File

@@ -1,5 +1,19 @@
# Testing Guide
## Quick Reference
| Command | Purpose |
| ------------------------------------------------------------ | ---------------------------- |
| `podman exec -it flyer-crawler-dev npm test` | Run all tests |
| `podman exec -it flyer-crawler-dev npm run test:unit` | Unit tests (~2900) |
| `podman exec -it flyer-crawler-dev npm run test:integration` | Integration tests (28 files) |
| `podman exec -it flyer-crawler-dev npm run test:e2e` | E2E tests (11 files) |
| `podman exec -it flyer-crawler-dev npm run type-check` | TypeScript check |
**Critical**: Always run tests in the dev container. Windows results are unreliable.
---
## Overview
This project has comprehensive test coverage including unit tests, integration tests, and E2E tests. All tests must be run in the **Linux dev container environment** for reliable results.
@@ -76,7 +90,7 @@ To verify type-check is working correctly:
Example error output:
```
```text
src/pages/MyDealsPage.tsx:68:31 - error TS2339: Property 'store_name' does not exist on type 'WatchedItemDeal'.
68 <span>{deal.store_name}</span>
@@ -113,15 +127,26 @@ Located throughout `src/` directory alongside source files with `.test.ts` or `.
npm run test:unit
```
### Integration Tests (5 test files)
### Integration Tests (28 test files)
Located in `src/tests/integration/`:
Located in `src/tests/integration/`. Key test files include:
- `admin.integration.test.ts`
- `flyer.integration.test.ts`
- `price.integration.test.ts`
- `public.routes.integration.test.ts`
- `receipt.integration.test.ts`
| Test File | Domain |
| -------------------------------------- | -------------------------- |
| `admin.integration.test.ts` | Admin dashboard operations |
| `auth.integration.test.ts` | Authentication flows |
| `budget.integration.test.ts` | Budget management |
| `flyer.integration.test.ts` | Flyer CRUD operations |
| `flyer-processing.integration.test.ts` | AI flyer processing |
| `gamification.integration.test.ts` | Achievements and points |
| `inventory.integration.test.ts` | Inventory management |
| `notification.integration.test.ts` | User notifications |
| `receipt.integration.test.ts` | Receipt processing |
| `recipe.integration.test.ts` | Recipe management |
| `shopping-list.integration.test.ts` | Shopping list operations |
| `user.integration.test.ts` | User profile operations |
See `src/tests/integration/` for the complete list.
Requires PostgreSQL and Redis services running.
@@ -129,13 +154,23 @@ Requires PostgreSQL and Redis services running.
npm run test:integration
```
### E2E Tests (3 test files)
### E2E Tests (11 test files)
Located in `src/tests/e2e/`:
Located in `src/tests/e2e/`. Full user journey tests:
- `deals-journey.e2e.test.ts`
- `budget-journey.e2e.test.ts`
- `receipt-journey.e2e.test.ts`
| Test File | Journey |
| --------------------------------- | ----------------------------- |
| `admin-authorization.e2e.test.ts` | Admin access control |
| `admin-dashboard.e2e.test.ts` | Admin dashboard flows |
| `auth.e2e.test.ts` | Login/logout/registration |
| `budget-journey.e2e.test.ts` | Budget tracking workflow |
| `deals-journey.e2e.test.ts` | Finding and saving deals |
| `error-reporting.e2e.test.ts` | Error handling verification |
| `flyer-upload.e2e.test.ts` | Flyer upload and processing |
| `inventory-journey.e2e.test.ts` | Pantry management |
| `receipt-journey.e2e.test.ts` | Receipt scanning and tracking |
| `upc-journey.e2e.test.ts` | UPC barcode scanning |
| `user-journey.e2e.test.ts` | User profile management |
Requires all services (PostgreSQL, Redis, BullMQ workers) running.
@@ -157,20 +192,18 @@ Located in `src/tests/utils/storeHelpers.ts`:
```typescript
// Create a store with a location in one call
const store = await createStoreWithLocation({
storeName: 'Test Store',
address: {
address_line_1: '123 Main St',
city: 'Toronto',
province_state: 'ON',
postal_code: 'M1M 1M1',
},
pool,
log,
const store = await createStoreWithLocation(pool, {
name: 'Test Store',
address: '123 Main St',
city: 'Toronto',
province: 'ON',
postalCode: 'M1M 1M1',
});
// Returns: { storeId, addressId, storeLocationId }
// Cleanup stores and their locations
await cleanupStoreLocations([storeId1, storeId2], pool, log);
await cleanupStoreLocation(pool, store);
```
### Mock Factories
@@ -261,3 +294,326 @@ Opens a browser-based test runner with filtering and debugging capabilities.
5. **Verify cache invalidation** - tests that insert data directly must invalidate cache
6. **Use unique filenames** - file upload tests need timestamp-based filenames
7. **Check exit codes** - `npm run type-check` returns 0 on success, non-zero on error
8. **Use `req.originalUrl` in error logs** - never hardcode API paths in error messages
9. **Use versioned API paths** - always use `/api/v1/` prefix in test requests
10. **Use `vi.hoisted()` for module mocks** - ensure mocks are available during module initialization
## Testing Error Log Messages
When testing route error handlers, ensure assertions account for versioned API paths.
### Problem: Hardcoded Paths Break Tests
Error log messages with hardcoded paths cause test failures when API versions change:
```typescript
// Production code (INCORRECT - hardcoded path)
req.log.error({ error }, 'Error in /api/flyers/:id:');
// Test expects versioned path
expect(logSpy).toHaveBeenCalledWith(
expect.objectContaining({ error: expect.any(Error) }),
expect.stringContaining('/api/v1/flyers'), // FAILS - actual log has /api/flyers
);
```
### Solution: Dynamic Paths with `req.originalUrl`
Production code should use `req.originalUrl` for dynamic path logging:
```typescript
// Production code (CORRECT - dynamic path)
req.log.error({ error }, `Error in ${req.originalUrl.split('?')[0]}:`);
```
### Writing Robust Test Assertions
```typescript
// Good - matches versioned path
expect(logSpy).toHaveBeenCalledWith(
expect.objectContaining({ error: expect.any(Error) }),
expect.stringContaining('/api/v1/flyers'),
);
// Good - flexible match for any version
expect(logSpy).toHaveBeenCalledWith(
expect.objectContaining({ error: expect.any(Error) }),
expect.stringMatching(/\/api\/v\d+\/flyers/),
);
// Bad - hardcoded unversioned path
expect(logSpy).toHaveBeenCalledWith(
expect.objectContaining({ error: expect.any(Error) }),
'Error in /api/flyers:', // Will fail with versioned routes
);
```
See [Error Logging Path Patterns](ERROR-LOGGING-PATHS.md) for complete documentation.
## API Versioning in Tests (ADR-008, ADR-057)
All API endpoints use the `/api/v1/` prefix. Tests must use versioned paths.
### Configuration
API base URLs are configured centrally in Vitest config files:
| Config File | Environment Variable | Value |
| ------------------------------ | -------------------- | ------------------------------ |
| `vite.config.ts` | `VITE_API_BASE_URL` | `/api/v1` |
| `vitest.config.e2e.ts` | `VITE_API_BASE_URL` | `http://localhost:3098/api/v1` |
| `vitest.config.integration.ts` | `VITE_API_BASE_URL` | `http://localhost:3099/api/v1` |
### Writing API Tests
```typescript
// Good - versioned path
const response = await request.post('/api/v1/auth/login').send({...});
// Bad - unversioned path (will fail)
const response = await request.post('/api/auth/login').send({...});
```
### Migration Checklist
When API version changes (e.g., v1 to v2):
1. Update all Vitest config `VITE_API_BASE_URL` values
2. Search and replace API paths in E2E tests: `grep -r "/api/v1/" src/tests/e2e/`
3. Search and replace API paths in integration tests
4. Verify route handler error logs use `req.originalUrl`
5. Run full test suite in dev container
See [ADR-057](../adr/0057-test-remediation-post-api-versioning.md) for complete migration guidance.
## vi.hoisted() Pattern for Module Mocks
When mocking modules that are imported at module initialization time (like queues or database connections), use `vi.hoisted()` to ensure mocks are available during hoisting.
### Problem: Mock Not Available During Import
```typescript
// BAD: Mock might not be ready when module imports it
vi.mock('../services/queues.server', () => ({
flyerQueue: { getJobCounts: vi.fn() }, // May not exist yet
}));
import healthRouter from './health.routes'; // Imports queues.server
```
### Solution: Use vi.hoisted()
```typescript
// GOOD: Mocks are created during hoisting, before vi.mock runs
const { mockQueuesModule } = vi.hoisted(() => {
const createMockQueue = () => ({
getJobCounts: vi.fn().mockResolvedValue({
waiting: 0,
active: 0,
failed: 0,
delayed: 0,
}),
});
return {
mockQueuesModule: {
flyerQueue: createMockQueue(),
emailQueue: createMockQueue(),
// ... additional queues
},
};
});
// Now the mock object exists when vi.mock factory runs
vi.mock('../services/queues.server', () => mockQueuesModule);
// Safe to import after mocks are defined
import healthRouter from './health.routes';
```
See [ADR-057](../adr/0057-test-remediation-post-api-versioning.md) for additional patterns.
## Testing Role-Based Component Visibility
When testing components that render differently based on user roles:
### Pattern: Separate Test Cases by Role
```typescript
describe('for authenticated users', () => {
beforeEach(() => {
mockedUseAuth.mockReturnValue({
authStatus: 'AUTHENTICATED',
userProfile: createMockUserProfile({ role: 'user' }),
});
});
it('renders user-accessible components', () => {
render(<MyComponent />);
expect(screen.getByTestId('user-component')).toBeInTheDocument();
// Admin-only should NOT be present
expect(screen.queryByTestId('admin-only')).not.toBeInTheDocument();
});
});
describe('for admin users', () => {
beforeEach(() => {
mockedUseAuth.mockReturnValue({
authStatus: 'AUTHENTICATED',
userProfile: createMockUserProfile({ role: 'admin' }),
});
});
it('renders admin-only components', () => {
render(<MyComponent />);
expect(screen.getByTestId('admin-only')).toBeInTheDocument();
});
});
```
### Key Points
1. Create separate `describe` blocks for each role
2. Set up role-specific mocks in `beforeEach`
3. Test both presence AND absence of role-gated components
4. Use `screen.queryByTestId()` for elements that should NOT exist
## CSS Class Assertions After UI Refactors
After frontend style changes, update test assertions to match new CSS classes.
### Handling Tailwind Class Changes
```typescript
// Before refactor
expect(selectedItem).toHaveClass('ring-2', 'ring-brand-primary');
// After refactor - update to new classes
expect(selectedItem).toHaveClass('border-brand-primary', 'bg-teal-50/50');
```
### Flexible Matching
For complex class combinations, consider partial matching:
```typescript
// Check for key classes, ignore utility classes
expect(element).toHaveClass('border-brand-primary');
// Or use regex for patterns
expect(element.className).toMatch(/dark:bg-teal-\d+/);
```
See [ADR-057](../adr/0057-test-remediation-post-api-versioning.md) for lessons learned from the test remediation effort.
## TypeScript Type Safety in Tests (ADR-060)
Tests must be fully type-safe. Common patterns for handling API response types and mock casting are documented below.
### Response Type Narrowing
API responses use discriminated unions (`ApiSuccessResponse<T> | ApiErrorResponse`). Access `.data` only after type narrowing.
**Utility Functions** (`src/tests/utils/testHelpers.ts`):
```typescript
import { asSuccessResponse, asErrorResponse } from '@/tests/utils/testHelpers';
// Success response access
const response = await request.get('/api/v1/users/1');
const body = asSuccessResponse<User>(response.body);
expect(body.data.id).toBe(1);
// Error response access
const errorResponse = await request.post('/api/v1/users').send({});
expect(errorResponse.status).toBe(400);
const errorBody = asErrorResponse(errorResponse.body);
expect(errorBody.error.code).toBe('VALIDATION_ERROR');
```
### Mock Object Type Casting
Use appropriate casting based on type compatibility:
```typescript
// Level 1: Type assertion for compatible shapes
const mock = createMockUser() as User;
// Level 2: Unknown bridge for incompatible shapes
const mock = partialMock as unknown as User;
// Level 3: Partial with required overrides
const mock: User = { ...createPartialUser(), id: 1, email: 'test@test.com' };
```
### Mock Function Casting
```typescript
import { asMock } from '@/tests/utils/testHelpers';
// Cast vi.fn() to specific function type
const mockFn = vi.fn();
someService.register(asMock<UserService['create']>(mockFn));
// vi.fn() with explicit type parameters
const mockFn = vi.fn<[string], Promise<User>>().mockResolvedValue(mockUser);
// vi.mocked() for mocked modules
vi.mock('@/services/userService');
const mockedService = vi.mocked(userService);
mockedService.create.mockResolvedValue(mockUser);
```
### Mock Logger for Controller Tests
Controllers require a Pino logger on `req.log`. Use the shared mock logger utility:
```typescript
import { createMockLogger } from '@/tests/utils/testHelpers';
function createMockRequest(overrides = {}): ExpressRequest {
return {
body: {},
cookies: {},
log: createMockLogger(),
res: { cookie: vi.fn() } as unknown as ExpressResponse,
...overrides,
} as unknown as ExpressRequest;
}
```
The `createMockLogger()` function returns a complete Pino logger mock with all methods (`info`, `debug`, `error`, `warn`, `fatal`, `trace`, `silent`, `child`) as `vi.fn()` mocks.
### MSW Handler Typing
Ensure MSW handlers return properly typed API responses:
```typescript
import { ApiSuccessResponse } from '@/types/api';
import { Flyer } from '@/types/flyer';
http.get('/api/v1/flyers', () => {
const response: ApiSuccessResponse<Flyer[]> = {
success: true,
data: [mockFlyer],
};
return HttpResponse.json(response);
});
```
### Generic Type Parameters
Provide explicit generics when TypeScript cannot infer:
```typescript
// Factory function generic
const mock = createMockPaginatedResponse<Flyer>({ data: [mockFlyer] });
// Assertion generic
expect(result).toEqual<ApiSuccessResponse<User>>({
success: true,
data: mockUser,
});
```
See [ADR-060](../adr/0060-typescript-test-error-remediation.md) for comprehensive patterns and remediation strategies.

View File

@@ -0,0 +1,899 @@
# tsoa Migration Guide
This guide documents the migration from `swagger-jsdoc` to `tsoa` for API documentation and route generation in the Flyer Crawler project.
## Table of Contents
- [Overview](#overview)
- [Architecture](#architecture)
- [Creating a New Controller](#creating-a-new-controller)
- [BaseController Pattern](#basecontroller-pattern)
- [Authentication](#authentication)
- [Request Handling](#request-handling)
- [Response Formatting](#response-formatting)
- [DTOs and Type Definitions](#dtos-and-type-definitions)
- [File Uploads](#file-uploads)
- [Rate Limiting](#rate-limiting)
- [Error Handling](#error-handling)
- [Testing Controllers](#testing-controllers)
- [Build and Development](#build-and-development)
- [Troubleshooting](#troubleshooting)
- [Migration Lessons Learned](#migration-lessons-learned)
## Overview
### What Changed
| Before (swagger-jsdoc) | After (tsoa) |
| ---------------------------------------- | ------------------------------------------- |
| JSDoc `@openapi` comments in route files | TypeScript decorators on controller classes |
| Manual Express route registration | tsoa generates routes automatically |
| Separate Zod validation middleware | tsoa validates from TypeScript types |
| OpenAPI spec from comments | OpenAPI spec from decorators and types |
### Why tsoa?
1. **Type Safety**: OpenAPI spec is generated from TypeScript types, eliminating drift
2. **Active Maintenance**: tsoa is actively maintained (vs. unmaintained swagger-jsdoc)
3. **Reduced Duplication**: No more parallel JSDoc + TypeScript definitions
4. **Route Generation**: tsoa generates Express routes, reducing boilerplate
### Key Files
| File | Purpose |
| -------------------------------------- | --------------------------------------- |
| `tsoa.json` | tsoa configuration |
| `src/controllers/base.controller.ts` | Base controller with response utilities |
| `src/controllers/types.ts` | Shared controller type definitions |
| `src/controllers/*.controller.ts` | Domain controllers |
| `src/dtos/common.dto.ts` | Shared DTO definitions |
| `src/middleware/tsoaAuthentication.ts` | JWT authentication handler |
| `src/routes/tsoa-generated.ts` | Generated Express routes |
| `src/config/tsoa-spec.json` | Generated OpenAPI 3.0 spec |
## Architecture
### Request Flow
```
HTTP Request
|
v
Express Middleware (logging, CORS, body parsing)
|
v
tsoa-generated routes (src/routes/tsoa-generated.ts)
|
v
tsoaAuthentication (if @Security decorator present)
|
v
Controller Method
|
v
Service Layer
|
v
Repository Layer
|
v
Database
```
### Controller Structure
```
src/controllers/
base.controller.ts # Base class with response helpers
types.ts # Shared type definitions
health.controller.ts # Health check endpoints
auth.controller.ts # Authentication endpoints
user.controller.ts # User management endpoints
...
```
## Creating a New Controller
### Step 1: Create the Controller File
```typescript
// src/controllers/example.controller.ts
import {
Route,
Tags,
Get,
Post,
Put,
Delete,
Body,
Path,
Query,
Request,
Security,
SuccessResponse,
Response,
Middlewares,
} from 'tsoa';
import type { Request as ExpressRequest } from 'express';
import {
BaseController,
SuccessResponse as SuccessResponseType,
ErrorResponse,
PaginatedResponse,
} from './base.controller';
import type { UserProfile } from '../types';
// ============================================================================
// REQUEST/RESPONSE TYPES
// ============================================================================
interface CreateExampleRequest {
/**
* Name of the example item.
* @minLength 1
* @maxLength 255
* @example "My Example"
*/
name: string;
/**
* Optional description.
* @example "This is an example item"
*/
description?: string;
}
interface ExampleResponse {
id: number;
name: string;
description?: string;
created_at: string;
}
// ============================================================================
// CONTROLLER
// ============================================================================
/**
* Example controller demonstrating tsoa patterns.
*/
@Route('examples')
@Tags('Examples')
export class ExampleController extends BaseController {
/**
* List all examples with pagination.
* @summary List examples
* @param page Page number (1-indexed)
* @param limit Items per page (max 100)
* @returns Paginated list of examples
*/
@Get()
@SuccessResponse(200, 'Examples retrieved')
public async listExamples(
@Query() page?: number,
@Query() limit?: number,
): Promise<PaginatedResponse<ExampleResponse>> {
const { page: p, limit: l } = this.normalizePagination(page, limit);
// Call service layer
const { items, total } = await exampleService.listExamples(p, l);
return this.paginated(items, { page: p, limit: l, total });
}
/**
* Get a single example by ID.
* @summary Get example
* @param id Example ID
* @returns The example
*/
@Get('{id}')
@SuccessResponse(200, 'Example retrieved')
@Response<ErrorResponse>(404, 'Example not found')
public async getExample(@Path() id: number): Promise<SuccessResponseType<ExampleResponse>> {
const example = await exampleService.getExampleById(id);
return this.success(example);
}
/**
* Create a new example.
* Requires authentication.
* @summary Create example
* @param requestBody Example data
* @param request Express request
* @returns Created example
*/
@Post()
@Security('bearerAuth')
@SuccessResponse(201, 'Example created')
@Response<ErrorResponse>(400, 'Validation error')
@Response<ErrorResponse>(401, 'Not authenticated')
public async createExample(
@Body() requestBody: CreateExampleRequest,
@Request() request: ExpressRequest,
): Promise<SuccessResponseType<ExampleResponse>> {
const user = request.user as UserProfile;
const example = await exampleService.createExample(requestBody, user.user.user_id);
return this.created(example);
}
/**
* Delete an example.
* Requires authentication.
* @summary Delete example
* @param id Example ID
* @param request Express request
*/
@Delete('{id}')
@Security('bearerAuth')
@SuccessResponse(204, 'Example deleted')
@Response<ErrorResponse>(401, 'Not authenticated')
@Response<ErrorResponse>(404, 'Example not found')
public async deleteExample(
@Path() id: number,
@Request() request: ExpressRequest,
): Promise<void> {
const user = request.user as UserProfile;
await exampleService.deleteExample(id, user.user.user_id);
return this.noContent();
}
}
```
### Step 2: Regenerate Routes
After creating or modifying a controller:
```bash
# Generate OpenAPI spec and routes
npm run tsoa:spec && npm run tsoa:routes
# Or use the combined command
npm run prebuild
```
### Step 3: Add Tests
Create a test file at `src/controllers/__tests__/example.controller.test.ts`.
## BaseController Pattern
All controllers extend `BaseController` which provides:
### Response Helpers
```typescript
// Success response (200)
return this.success(data);
// Created response (201)
return this.created(data);
// Paginated response (200 with pagination metadata)
return this.paginated(items, { page, limit, total });
// Message-only response
return this.message('Operation completed successfully');
// No content response (204)
return this.noContent();
// Error response (prefer throwing errors)
this.setStatus(400);
return this.error('BAD_REQUEST', 'Invalid input', details);
```
### Pagination Helpers
```typescript
// Normalize pagination with defaults and bounds
const { page, limit } = this.normalizePagination(queryPage, queryLimit);
// page defaults to 1, limit defaults to 20, max 100
// Calculate pagination metadata
const meta = this.calculatePagination({ page, limit, total });
// Returns: { page, limit, total, totalPages, hasNextPage, hasPrevPage }
```
### Error Codes
```typescript
// Access standard error codes
this.ErrorCode.VALIDATION_ERROR; // 'VALIDATION_ERROR'
this.ErrorCode.NOT_FOUND; // 'NOT_FOUND'
this.ErrorCode.UNAUTHORIZED; // 'UNAUTHORIZED'
this.ErrorCode.FORBIDDEN; // 'FORBIDDEN'
this.ErrorCode.CONFLICT; // 'CONFLICT'
this.ErrorCode.BAD_REQUEST; // 'BAD_REQUEST'
this.ErrorCode.INTERNAL_ERROR; // 'INTERNAL_ERROR'
```
## Authentication
### Using @Security Decorator
```typescript
import { Security, Request } from 'tsoa';
import type { Request as ExpressRequest } from 'express';
import type { UserProfile } from '../types';
@Get('profile')
@Security('bearerAuth')
public async getProfile(
@Request() request: ExpressRequest,
): Promise<SuccessResponseType<UserProfileDto>> {
// request.user is populated by tsoaAuthentication.ts
const user = request.user as UserProfile;
return this.success(toUserProfileDto(user));
}
```
### Requiring Admin Role
```typescript
import { requireAdminRole } from '../middleware/tsoaAuthentication';
@Delete('users/{id}')
@Security('bearerAuth')
public async deleteUser(
@Path() id: string,
@Request() request: ExpressRequest,
): Promise<void> {
const user = request.user as UserProfile;
requireAdminRole(user); // Throws 403 if not admin
await userService.deleteUser(id);
return this.noContent();
}
```
### How Authentication Works
1. tsoa sees `@Security('bearerAuth')` decorator
2. tsoa calls `expressAuthentication()` from `src/middleware/tsoaAuthentication.ts`
3. The function extracts and validates the JWT token
4. User profile is fetched from database and attached to `request.user`
5. If authentication fails, an `AuthenticationError` is thrown
## Request Handling
### Path Parameters
```typescript
@Get('{id}')
public async getItem(@Path() id: number): Promise<...> { ... }
// Multiple path params
@Get('{userId}/items/{itemId}')
public async getUserItem(
@Path() userId: string,
@Path() itemId: number,
): Promise<...> { ... }
```
### Query Parameters
```typescript
@Get()
public async listItems(
@Query() page?: number,
@Query() limit?: number,
@Query() status?: 'active' | 'inactive',
@Query() search?: string,
): Promise<...> { ... }
```
### Request Body
```typescript
interface CreateItemRequest {
name: string;
description?: string;
}
@Post()
public async createItem(
@Body() requestBody: CreateItemRequest,
): Promise<...> { ... }
```
### Headers
```typescript
@Get()
public async getWithHeader(
@Header('X-Custom-Header') customHeader?: string,
): Promise<...> { ... }
```
### Accessing Express Request/Response
```typescript
import type { Request as ExpressRequest } from 'express';
@Post()
public async handleRequest(
@Request() request: ExpressRequest,
): Promise<...> {
const reqLog = request.log; // Pino logger
const cookies = request.cookies; // Cookies
const ip = request.ip; // Client IP
const res = request.res!; // Express response
// Set cookie
res.cookie('name', 'value', { httpOnly: true });
// ...
}
```
## Response Formatting
### Standard Success Response
```typescript
// Returns: { "success": true, "data": {...} }
return this.success({ id: 1, name: 'Item' });
```
### Created Response (201)
```typescript
// Sets status 201 and returns success response
return this.created(newItem);
```
### Paginated Response
```typescript
// Returns: { "success": true, "data": [...], "meta": { "pagination": {...} } }
return this.paginated(items, { page: 1, limit: 20, total: 100 });
```
### No Content (204)
```typescript
// Sets status 204 with no body
return this.noContent();
```
### Error Response
Prefer throwing errors rather than returning error responses:
```typescript
import { NotFoundError, ValidationError, ForbiddenError } from './base.controller';
// Throw for not found
throw new NotFoundError('Item', id);
// Throw for validation errors
throw new ValidationError([], 'Invalid input');
// Throw for forbidden
throw new ForbiddenError('Admin access required');
```
If you need manual error response:
```typescript
this.setStatus(400);
return this.error(this.ErrorCode.BAD_REQUEST, 'Invalid operation', { reason: '...' });
```
## DTOs and Type Definitions
### Why DTOs?
tsoa generates OpenAPI specs from TypeScript types. Some types cannot be serialized:
- Tuples: `[number, number]` (e.g., GeoJSON coordinates)
- Complex generics
- Circular references
DTOs flatten these into tsoa-compatible structures.
### Shared DTOs
Define shared DTOs in `src/dtos/common.dto.ts`:
```typescript
// src/dtos/common.dto.ts
/**
* Address with flattened coordinates.
* GeoJSONPoint uses coordinates: [number, number] which tsoa cannot handle.
*/
export interface AddressDto {
address_id: number;
address_line_1: string;
city: string;
province_state: string;
postal_code: string;
country: string;
// Flattened from GeoJSONPoint.coordinates
latitude?: number | null;
longitude?: number | null;
created_at: string;
updated_at: string;
}
export interface UserDto {
user_id: string;
email: string;
created_at: string;
updated_at: string;
}
```
### Conversion Functions
Create conversion functions to map domain types to DTOs:
```typescript
// In controller file
function toAddressDto(address: Address): AddressDto {
return {
address_id: address.address_id,
address_line_1: address.address_line_1,
city: address.city,
province_state: address.province_state,
postal_code: address.postal_code,
country: address.country,
latitude: address.location?.coordinates[1] ?? null,
longitude: address.location?.coordinates[0] ?? null,
created_at: address.created_at,
updated_at: address.updated_at,
};
}
```
### Important: Avoid Duplicate Type Names
tsoa requires unique type names across all controllers. If two controllers define an interface with the same name, tsoa will fail.
**Solution**: Define shared types in `src/dtos/common.dto.ts` and import them.
## File Uploads
tsoa supports file uploads via `@UploadedFile` and `@FormField` decorators:
```typescript
import { Post, Route, UploadedFile, FormField, Security } from 'tsoa';
import multer from 'multer';
// Configure multer
const upload = multer({
storage: multer.diskStorage({
destination: '/tmp/uploads',
filename: (req, file, cb) => {
cb(null, `${Date.now()}-${Math.round(Math.random() * 1e9)}-${file.originalname}`);
},
}),
limits: { fileSize: 10 * 1024 * 1024 }, // 10MB
});
@Route('flyers')
@Tags('Flyers')
export class FlyerController extends BaseController {
/**
* Upload a flyer image.
* @summary Upload flyer
* @param file The flyer image file
* @param storeId Associated store ID
* @param request Express request
*/
@Post('upload')
@Security('bearerAuth')
@Middlewares(upload.single('file'))
@SuccessResponse(201, 'Flyer uploaded')
public async uploadFlyer(
@UploadedFile() file: Express.Multer.File,
@FormField() storeId?: number,
@Request() request: ExpressRequest,
): Promise<SuccessResponseType<FlyerDto>> {
const user = request.user as UserProfile;
const flyer = await flyerService.processUpload(file, storeId, user.user.user_id);
return this.created(flyer);
}
}
```
## Rate Limiting
Apply rate limiters using the `@Middlewares` decorator:
```typescript
import { Middlewares } from 'tsoa';
import { loginLimiter, registerLimiter } from '../config/rateLimiters';
@Post('login')
@Middlewares(loginLimiter)
@SuccessResponse(200, 'Login successful')
@Response<ErrorResponse>(429, 'Too many login attempts')
public async login(@Body() body: LoginRequest): Promise<...> { ... }
@Post('register')
@Middlewares(registerLimiter)
@SuccessResponse(201, 'User registered')
@Response<ErrorResponse>(429, 'Too many registration attempts')
public async register(@Body() body: RegisterRequest): Promise<...> { ... }
```
## Error Handling
### Throwing Errors
Use the error classes from `base.controller.ts`:
```typescript
import {
NotFoundError,
ValidationError,
ForbiddenError,
UniqueConstraintError,
} from './base.controller';
// Not found (404)
throw new NotFoundError('User', userId);
// Validation error (400)
throw new ValidationError([], 'Invalid email format');
// Forbidden (403)
throw new ForbiddenError('Admin access required');
// Conflict (409) - e.g., duplicate email
throw new UniqueConstraintError('email', 'Email already registered');
```
### Global Error Handler
Errors are caught by the global error handler in `server.ts` which formats them according to ADR-028:
```json
{
"success": false,
"error": {
"code": "NOT_FOUND",
"message": "User not found"
}
}
```
### Authentication Errors
The `tsoaAuthentication.ts` module throws `AuthenticationError` with appropriate HTTP status codes:
- 401: Missing token, invalid token, expired token
- 403: User lacks required role
- 500: Server configuration error
## Testing Controllers
### Test File Location
```
src/controllers/__tests__/
example.controller.test.ts
auth.controller.test.ts
user.controller.test.ts
...
```
### Test Structure
```typescript
// src/controllers/__tests__/example.controller.test.ts
import { describe, it, expect, vi, beforeEach, afterEach } from 'vitest';
import { ExampleController } from '../example.controller';
// Mock dependencies
vi.mock('../../services/exampleService', () => ({
exampleService: {
listExamples: vi.fn(),
getExampleById: vi.fn(),
createExample: vi.fn(),
deleteExample: vi.fn(),
},
}));
import { exampleService } from '../../services/exampleService';
describe('ExampleController', () => {
let controller: ExampleController;
beforeEach(() => {
controller = new ExampleController();
vi.clearAllMocks();
});
describe('listExamples', () => {
it('should return paginated examples', async () => {
const mockItems = [{ id: 1, name: 'Test' }];
vi.mocked(exampleService.listExamples).mockResolvedValue({
items: mockItems,
total: 1,
});
const result = await controller.listExamples(1, 20);
expect(result.success).toBe(true);
expect(result.data).toEqual(mockItems);
expect(result.meta?.pagination).toBeDefined();
expect(result.meta?.pagination?.total).toBe(1);
});
});
describe('createExample', () => {
it('should create example and return 201', async () => {
const mockExample = { id: 1, name: 'New', created_at: '2026-01-01' };
vi.mocked(exampleService.createExample).mockResolvedValue(mockExample);
const mockRequest = {
user: { user: { user_id: 'user-123' } },
} as any;
const result = await controller.createExample({ name: 'New' }, mockRequest);
expect(result.success).toBe(true);
expect(result.data).toEqual(mockExample);
// Note: setStatus is called internally, verify with spy if needed
});
});
});
```
### Testing Authentication
```typescript
describe('authenticated endpoints', () => {
it('should use user from request', async () => {
const mockRequest = {
user: {
user: { user_id: 'user-123', email: 'test@example.com' },
role: 'user',
},
} as any;
const result = await controller.getProfile(mockRequest);
expect(result.data.user.user_id).toBe('user-123');
});
});
```
### Known Test Limitations
Some test files have type errors with mock objects that are acceptable:
```typescript
// Type error: 'any' is not assignable to 'Express.Request'
// This is acceptable in tests - the mock has the properties we need
const mockRequest = { user: mockUser } as any;
```
These type errors do not affect test correctness. The 4603 unit tests and 345 integration tests all pass.
## Build and Development
### Development Workflow
1. Create or modify controller
2. Run `npm run tsoa:spec && npm run tsoa:routes`
3. Run `npm run type-check` to verify
4. Run tests
### NPM Scripts
```json
{
"tsoa:spec": "tsoa spec",
"tsoa:routes": "tsoa routes",
"prebuild": "npm run tsoa:spec && npm run tsoa:routes",
"build": "tsc"
}
```
### Watching for Changes
Currently, tsoa routes must be regenerated manually when controllers change. Consider adding a watch script:
```bash
# In development, regenerate on save
npm run tsoa:spec && npm run tsoa:routes
```
### Generated Files
| File | Regenerate When |
| ------------------------------ | ------------------------------- |
| `src/routes/tsoa-generated.ts` | Controller changes |
| `src/config/tsoa-spec.json` | Controller changes, DTO changes |
These files are committed to the repository for faster builds.
## Troubleshooting
### "Duplicate identifier" Error
**Problem**: tsoa fails with "Duplicate identifier" for a type.
**Solution**: Move the type to `src/dtos/common.dto.ts` and import it in all controllers.
### "Unable to resolve type" Error
**Problem**: tsoa cannot serialize a complex type (tuples, generics).
**Solution**: Create a DTO with flattened/simplified structure.
```typescript
// Before: GeoJSONPoint with coordinates: [number, number]
// After: AddressDto with latitude, longitude as separate fields
```
### Route Not Found (404)
**Problem**: New endpoint returns 404.
**Solution**:
1. Ensure controller file matches glob pattern: `src/controllers/**/*.controller.ts`
2. Regenerate routes: `npm run tsoa:routes`
3. Verify the route is in `src/routes/tsoa-generated.ts`
### Authentication Not Working
**Problem**: `request.user` is undefined.
**Solution**:
1. Ensure `@Security('bearerAuth')` decorator is on the method
2. Verify `tsoaAuthentication.ts` is correctly configured in `tsoa.json`
3. Check the Authorization header format: `Bearer <token>`
### Type Mismatch in Tests
**Problem**: TypeScript errors when mocking Express.Request.
**Solution**: Use `as any` cast for mock objects in tests. This is acceptable and does not affect test correctness.
```typescript
const mockRequest = {
user: mockUserProfile,
log: mockLogger,
} as any;
```
## Migration Lessons Learned
### What Worked Well
1. **BaseController Pattern**: Provides consistent response formatting and familiar helpers
2. **Incremental Migration**: Controllers can be migrated one at a time
3. **Type-First Design**: Defining request/response types first makes implementation clearer
4. **Shared DTOs**: Centralizing DTOs in `common.dto.ts` prevents duplicate type errors
### Challenges Encountered
1. **Tuple Types**: tsoa cannot serialize TypeScript tuples. Solution: Flatten to separate fields.
2. **Passport Integration**: OAuth callbacks use redirect-based flows that don't fit tsoa's JSON model. Solution: Keep OAuth callbacks in Express routes.
3. **Test Type Errors**: Mock objects don't perfectly match Express types. Solution: Accept `as any` casts in tests.
4. **Build Pipeline**: Must regenerate routes when controllers change. Solution: Add to prebuild script.
### Recommendations for Future Controllers
1. Start with the DTO/request/response types
2. Use `@SuccessResponse` and `@Response` decorators for all status codes
3. Add JSDoc comments for OpenAPI descriptions
4. Keep controller methods thin - delegate to service layer
5. Test controllers in isolation by mocking services
## Related Documentation
- [ADR-018: API Documentation Strategy](../adr/0018-api-documentation-strategy.md)
- [ADR-059: Dependency Modernization](../adr/0059-dependency-modernization.md)
- [ADR-028: API Response Standardization](../adr/0028-api-response-standardization.md)
- [CODE-PATTERNS.md](./CODE-PATTERNS.md)
- [TESTING.md](./TESTING.md)
- [tsoa Documentation](https://tsoa-community.github.io/docs/)

View File

@@ -0,0 +1,272 @@
# Test Path Migration: Unversioned to Versioned API Paths
**Status**: Complete
**Created**: 2026-01-27
**Completed**: 2026-01-27
**Related**: ADR-008 (API Versioning Strategy)
## Summary
All integration test files have been successfully migrated to use versioned API paths (`/api/v1/`). This resolves the redirect-related test failures introduced by ADR-008 Phase 1.
### Results
| Metric | Value |
| ------------------------- | ---------------------------------------- |
| Test files updated | 23 |
| Path occurrences changed | ~70 |
| Tests before migration | 274/348 passing |
| Tests after migration | 345/348 passing |
| Test failures resolved | 71 |
| Remaining todo/skipped | 3 (known issues, not versioning-related) |
| Type check | Passing |
| Versioning-specific tests | 82/82 passing |
### Key Outcomes
- No `301 Moved Permanently` responses in test output
- All redirect-related failures resolved
- No regressions introduced
- Unit tests unaffected (3,375/3,391 passing, pre-existing failures)
---
## Original Problem Statement
Integration tests failed due to redirect middleware (ADR-008 Phase 1). Server returned `301 Moved Permanently` for unversioned paths (`/api/resource`) instead of expected `200 OK`. Redirect targets versioned paths (`/api/v1/resource`).
**Root Cause**: Backwards-compatibility redirect in `server.ts`:
```typescript
app.use('/api', (req, res, next) => {
const versionPattern = /^\/v\d+/;
if (!versionPattern.test(req.path)) {
return res.redirect(301, `/api/v1${req.path}`);
}
next();
});
```
**Impact**: ~70 test path occurrences across 23 files returning 301 instead of expected status codes.
## Solution
Update all test API paths from `/api/{resource}` to `/api/v1/{resource}`.
## Files Requiring Updates
### Integration Tests (16 files)
| File | Occurrences | Domains |
| ------------------------------------------------------------ | ----------- | ---------------------- |
| `src/tests/integration/inventory.integration.test.ts` | 14 | inventory |
| `src/tests/integration/receipt.integration.test.ts` | 17 | receipts |
| `src/tests/integration/recipe.integration.test.ts` | 17 | recipes, users/recipes |
| `src/tests/integration/user.routes.integration.test.ts` | 10 | users/shopping-lists |
| `src/tests/integration/admin.integration.test.ts` | 7 | admin |
| `src/tests/integration/flyer-processing.integration.test.ts` | 6 | ai/jobs |
| `src/tests/integration/budget.integration.test.ts` | 5 | budgets |
| `src/tests/integration/notification.integration.test.ts` | 3 | users/notifications |
| `src/tests/integration/data-integrity.integration.test.ts` | 3 | users, admin |
| `src/tests/integration/upc.integration.test.ts` | 3 | upc |
| `src/tests/integration/edge-cases.integration.test.ts` | 3 | users/shopping-lists |
| `src/tests/integration/user.integration.test.ts` | 2 | users |
| `src/tests/integration/public.routes.integration.test.ts` | 2 | flyers, recipes |
| `src/tests/integration/flyer.integration.test.ts` | 1 | flyers |
| `src/tests/integration/category.routes.test.ts` | 1 | categories |
| `src/tests/integration/gamification.integration.test.ts` | 1 | ai/jobs |
### E2E Tests (7 files)
| File | Occurrences | Domains |
| --------------------------------------------- | ----------- | -------------------- |
| `src/tests/e2e/inventory-journey.e2e.test.ts` | 9 | inventory |
| `src/tests/e2e/receipt-journey.e2e.test.ts` | 9 | receipts |
| `src/tests/e2e/budget-journey.e2e.test.ts` | 6 | budgets |
| `src/tests/e2e/upc-journey.e2e.test.ts` | 3 | upc |
| `src/tests/e2e/deals-journey.e2e.test.ts` | 2 | categories, users |
| `src/tests/e2e/user-journey.e2e.test.ts` | 1 | users/shopping-lists |
| `src/tests/e2e/flyer-upload.e2e.test.ts` | 1 | jobs |
## Update Pattern
### Find/Replace Rules
**Template literals** (most common):
```
OLD: .get(`/api/resource/${id}`)
NEW: .get(`/api/v1/resource/${id}`)
```
**String literals**:
```
OLD: .get('/api/resource')
NEW: .get('/api/v1/resource')
```
### Regex Pattern for Batch Updates
```regex
Find: (\.(get|post|put|delete|patch)\([`'"])/api/([a-z])
Replace: $1/api/v1/$3
```
**Explanation**: Captures HTTP method call, inserts `/v1/` after `/api/`.
## Files to EXCLUDE
These files intentionally test unversioned path behavior:
| File | Reason |
| ---------------------------------------------------- | ------------------------------------ |
| `src/routes/versioning.integration.test.ts` | Tests redirect behavior itself |
| `src/services/apiClient.test.ts` | Mock server URLs, not real API calls |
| `src/services/aiApiClient.test.ts` | Mock server URLs for MSW handlers |
| `src/services/googleGeocodingService.server.test.ts` | External Google API URL |
**Also exclude** (not API paths):
- Lines containing `vi.mock('@bull-board/api` (import mocks)
- Lines containing `/api/v99` (intentional unsupported version tests)
- `describe()` and `it()` block descriptions
- Comment lines (`// `)
## Execution Batches
### Batch 1: High-Impact Integration (4 files, ~58 occurrences)
```bash
# Files with most occurrences
src/tests/integration/inventory.integration.test.ts
src/tests/integration/receipt.integration.test.ts
src/tests/integration/recipe.integration.test.ts
src/tests/integration/user.routes.integration.test.ts
```
### Batch 2: Medium Integration (6 files, ~27 occurrences)
```bash
src/tests/integration/admin.integration.test.ts
src/tests/integration/flyer-processing.integration.test.ts
src/tests/integration/budget.integration.test.ts
src/tests/integration/notification.integration.test.ts
src/tests/integration/data-integrity.integration.test.ts
src/tests/integration/upc.integration.test.ts
```
### Batch 3: Low Integration (6 files, ~10 occurrences)
```bash
src/tests/integration/edge-cases.integration.test.ts
src/tests/integration/user.integration.test.ts
src/tests/integration/public.routes.integration.test.ts
src/tests/integration/flyer.integration.test.ts
src/tests/integration/category.routes.test.ts
src/tests/integration/gamification.integration.test.ts
```
### Batch 4: E2E Tests (7 files, ~31 occurrences)
```bash
src/tests/e2e/inventory-journey.e2e.test.ts
src/tests/e2e/receipt-journey.e2e.test.ts
src/tests/e2e/budget-journey.e2e.test.ts
src/tests/e2e/upc-journey.e2e.test.ts
src/tests/e2e/deals-journey.e2e.test.ts
src/tests/e2e/user-journey.e2e.test.ts
src/tests/e2e/flyer-upload.e2e.test.ts
```
## Verification Strategy
### Per-Batch Verification
After each batch:
```bash
# Type check
podman exec -it flyer-crawler-dev npm run type-check
# Run specific test file
podman exec -it flyer-crawler-dev npx vitest run <file-path> --reporter=verbose
```
### Full Verification
After all batches:
```bash
# Full integration test suite
podman exec -it flyer-crawler-dev npm run test:integration
# Full E2E test suite
podman exec -it flyer-crawler-dev npm run test:e2e
```
### Success Criteria
- [x] No `301 Moved Permanently` responses in test output
- [x] All tests pass or fail for expected reasons (not redirect-related)
- [x] Type check passes
- [x] No regressions in unmodified tests
## Edge Cases
### Describe Block Text
Do NOT modify describe/it block descriptions:
```typescript
// KEEP AS-IS (documentation only):
describe('GET /api/users/profile', () => { ... });
// UPDATE (actual API call):
const response = await request.get('/api/v1/users/profile');
```
### Console Logging
Do NOT modify debug/error logging paths:
```typescript
// KEEP AS-IS:
console.error('[DEBUG] GET /api/admin/stats failed:', ...);
```
### Query Parameters
Include query parameters in update:
```typescript
// OLD:
.get(`/api/budgets/spending-analysis?startDate=${start}&endDate=${end}`)
// NEW:
.get(`/api/v1/budgets/spending-analysis?startDate=${start}&endDate=${end}`)
```
## Post-Completion Checklist
- [x] All 23 files updated
- [x] ~70 path occurrences migrated
- [x] Exclusion files unchanged
- [x] Type check passes
- [x] Integration tests pass (345/348)
- [x] E2E tests pass
- [x] Commit with message: `fix(tests): Update API paths to use /api/v1/ prefix (ADR-008)`
## Rollback
If issues arise:
```bash
git checkout HEAD -- src/tests/
```
## Related Documentation
- ADR-008: API Versioning Strategy
- `docs/architecture/api-versioning-infrastructure.md`
- `src/routes/versioning.integration.test.ts` (reference for expected behavior)

View File

@@ -2,134 +2,261 @@
Complete guide to environment variables used in Flyer Crawler.
---
## Quick Reference
### Minimum Required Variables (Development)
| Variable | Example | Purpose |
| ---------------- | ------------------------ | -------------------- |
| `DB_HOST` | `localhost` | PostgreSQL host |
| `DB_USER` | `postgres` | PostgreSQL username |
| `DB_PASSWORD` | `postgres` | PostgreSQL password |
| `DB_NAME` | `flyer_crawler_dev` | Database name |
| `REDIS_URL` | `redis://localhost:6379` | Redis connection URL |
| `JWT_SECRET` | (32+ character string) | JWT signing key |
| `GEMINI_API_KEY` | `AIzaSy...` | Google Gemini API |
### Source of Truth
The Zod schema at `src/config/env.ts` is the authoritative source for all environment variables. If a variable is not in this file, it is not used by the application.
---
## Configuration by Environment
### Production
**Location**: Gitea CI/CD secrets injected during deployment
**Path**: `/var/www/flyer-crawler.projectium.com/`
**Note**: No `.env` file exists - all variables come from CI/CD
| Aspect | Details |
| -------- | ------------------------------------------ |
| Location | Gitea CI/CD secrets injected at deployment |
| Path | `/var/www/flyer-crawler.projectium.com/` |
| File | No `.env` file - all from CI/CD secrets |
### Test
**Location**: Gitea CI/CD secrets + `.env.test` file
**Path**: `/var/www/flyer-crawler-test.projectium.com/`
**Note**: `.env.test` overrides for test-specific values
| Aspect | Details |
| -------- | --------------------------------------------- |
| Location | Gitea CI/CD secrets + `.env.test` overrides |
| Path | `/var/www/flyer-crawler-test.projectium.com/` |
| File | `.env.test` for test-specific values |
### Development Container
**Location**: `.env.local` file in project root
**Note**: Overrides default DSNs in `compose.dev.yml`
| Aspect | Details |
| -------- | --------------------------------------- |
| Location | `.env.local` file in project root |
| Priority | Overrides defaults in `compose.dev.yml` |
| File | `.env.local` (gitignored) |
## Required Variables
---
### Database
## Complete Variable Reference
| Variable | Description | Example |
| ------------------ | ---------------------------- | ------------------------------------------ |
| `DB_HOST` | PostgreSQL host | `localhost` (dev), `projectium.com` (prod) |
| `DB_PORT` | PostgreSQL port | `5432` |
| `DB_USER_PROD` | Production database user | `flyer_crawler_prod` |
| `DB_PASSWORD_PROD` | Production database password | (secret) |
| `DB_DATABASE_PROD` | Production database name | `flyer-crawler-prod` |
| `DB_USER_TEST` | Test database user | `flyer_crawler_test` |
| `DB_PASSWORD_TEST` | Test database password | (secret) |
| `DB_DATABASE_TEST` | Test database name | `flyer-crawler-test` |
| `DB_USER` | Dev database user | `postgres` |
| `DB_PASSWORD` | Dev database password | `postgres` |
| `DB_NAME` | Dev database name | `flyer_crawler_dev` |
### Database Configuration
**Note**: Production and test use separate `_PROD` and `_TEST` suffixed variables. Development uses unsuffixed variables.
| Variable | Required | Default | Description |
| ------------- | -------- | ------- | ----------------- |
| `DB_HOST` | Yes | - | PostgreSQL host |
| `DB_PORT` | No | `5432` | PostgreSQL port |
| `DB_USER` | Yes | - | Database username |
| `DB_PASSWORD` | Yes | - | Database password |
| `DB_NAME` | Yes | - | Database name |
### Redis
**Environment-Specific Variables** (Gitea Secrets):
| Variable | Description | Example |
| --------------------- | ------------------------- | ------------------------------ |
| `REDIS_URL` | Redis connection URL | `redis://localhost:6379` (dev) |
| `REDIS_PASSWORD_PROD` | Production Redis password | (secret) |
| `REDIS_PASSWORD_TEST` | Test Redis password | (secret) |
| Variable | Environment | Description |
| ------------------ | ----------- | ------------------------ |
| `DB_USER_PROD` | Production | Production database user |
| `DB_PASSWORD_PROD` | Production | Production database pass |
| `DB_DATABASE_PROD` | Production | Production database name |
| `DB_USER_TEST` | Test | Test database user |
| `DB_PASSWORD_TEST` | Test | Test database password |
| `DB_DATABASE_TEST` | Test | Test database name |
### Redis Configuration
| Variable | Required | Default | Description |
| ---------------- | -------- | ------- | ------------------------- |
| `REDIS_URL` | Yes | - | Redis connection URL |
| `REDIS_PASSWORD` | No | - | Redis password (optional) |
**URL Format**: `redis://[user:password@]host:port`
**Examples**:
```bash
# Development (no auth)
REDIS_URL=redis://localhost:6379
# Production (with auth)
REDIS_URL=redis://:${REDIS_PASSWORD_PROD}@localhost:6379
```
### Authentication
| Variable | Description | Example |
| ---------------------- | -------------------------- | -------------------------------- |
| `JWT_SECRET` | JWT token signing key | (minimum 32 characters) |
| `SESSION_SECRET` | Session encryption key | (minimum 32 characters) |
| `GOOGLE_CLIENT_ID` | Google OAuth client ID | `xxx.apps.googleusercontent.com` |
| `GOOGLE_CLIENT_SECRET` | Google OAuth client secret | (secret) |
| `GH_CLIENT_ID` | GitHub OAuth client ID | `xxx` |
| `GH_CLIENT_SECRET` | GitHub OAuth client secret | (secret) |
| Variable | Required | Min Length | Description |
| ---------------------- | -------- | ---------- | ----------------------- |
| `JWT_SECRET` | Yes | 32 chars | JWT token signing key |
| `JWT_SECRET_PREVIOUS` | No | - | Previous key (rotation) |
| `GOOGLE_CLIENT_ID` | No | - | Google OAuth client ID |
| `GOOGLE_CLIENT_SECRET` | No | - | Google OAuth secret |
| `GITHUB_CLIENT_ID` | No | - | GitHub OAuth client ID |
| `GITHUB_CLIENT_SECRET` | No | - | GitHub OAuth secret |
**Generate Secure Secret**:
```bash
node -e "console.log(require('crypto').randomBytes(32).toString('hex'))"
```
### AI Services
| Variable | Description | Example |
| -------------------------------- | ---------------------------- | ----------- |
| `VITE_GOOGLE_GENAI_API_KEY` | Google Gemini API key (prod) | `AIzaSy...` |
| `VITE_GOOGLE_GENAI_API_KEY_TEST` | Google Gemini API key (test) | `AIzaSy...` |
| `GOOGLE_MAPS_API_KEY` | Google Maps Geocoding API | `AIzaSy...` |
| Variable | Required | Description |
| ---------------------------- | -------- | -------------------------------- |
| `GEMINI_API_KEY` | Yes\* | Google Gemini API key |
| `GEMINI_RPM` | No | Rate limit (default: 5) |
| `AI_PRICE_QUALITY_THRESHOLD` | No | Quality threshold (default: 0.5) |
### Application
\*Required for flyer processing. Application works without it but cannot extract flyer data.
| Variable | Description | Example |
| -------------- | ------------------------ | ----------------------------------- |
| `NODE_ENV` | Environment mode | `development`, `test`, `production` |
| `PORT` | Backend server port | `3001` |
| `FRONTEND_URL` | Frontend application URL | `http://localhost:5173` (dev) |
**Get API Key**: [Google AI Studio](https://aistudio.google.com/app/apikey)
### Error Tracking
**Test Environment Note**: The test/staging environment **deliberately omits** `GEMINI_API_KEY` to preserve free API quota. This is intentional - the API has strict daily limits on the free tier, and we want to reserve tokens for production use. AI features will be non-functional in test, but all other features can be tested normally. Deploy warnings about missing `GEMINI_API_KEY` in test logs are expected and safe to ignore.
| Variable | Description | Example |
| ---------------------- | -------------------------------- | --------------------------- |
| `SENTRY_DSN` | Sentry DSN (production) | `https://xxx@sentry.io/xxx` |
| `VITE_SENTRY_DSN` | Frontend Sentry DSN (production) | `https://xxx@sentry.io/xxx` |
| `SENTRY_DSN_TEST` | Sentry DSN (test) | `https://xxx@sentry.io/xxx` |
| `VITE_SENTRY_DSN_TEST` | Frontend Sentry DSN (test) | `https://xxx@sentry.io/xxx` |
| `SENTRY_AUTH_TOKEN` | Sentry API token for releases | (secret) |
### Google Services
## Optional Variables
| Variable | Required | Description |
| ---------------------- | -------- | -------------------------------- |
| `GOOGLE_MAPS_API_KEY` | No | Google Maps Geocoding API |
| `GOOGLE_CLIENT_ID` | No | OAuth (see Authentication above) |
| `GOOGLE_CLIENT_SECRET` | No | OAuth (see Authentication above) |
| Variable | Description | Default |
| ------------------- | ----------------------- | ----------------- |
| `LOG_LEVEL` | Logging verbosity | `info` |
| `REDIS_TTL` | Cache TTL in seconds | `3600` |
| `MAX_UPLOAD_SIZE` | Max file upload size | `10mb` |
| `RATE_LIMIT_WINDOW` | Rate limit window (ms) | `900000` (15 min) |
| `RATE_LIMIT_MAX` | Max requests per window | `100` |
### UPC Lookup APIs
| Variable | Required | Description |
| ------------------------ | -------- | ---------------------- |
| `UPC_ITEM_DB_API_KEY` | No | UPC Item DB API key |
| `BARCODE_LOOKUP_API_KEY` | No | Barcode Lookup API key |
### Application Settings
| Variable | Required | Default | Description |
| -------------- | -------- | ------------- | ------------------------ |
| `NODE_ENV` | No | `development` | Environment mode |
| `PORT` | No | `3001` | Backend server port |
| `FRONTEND_URL` | No | - | Frontend URL (CORS) |
| `BASE_URL` | No | - | API base URL |
| `STORAGE_PATH` | No | (see below) | Flyer image storage path |
**NODE_ENV Values**: `development`, `test`, `staging`, `production`
**Default STORAGE_PATH**: `/var/www/flyer-crawler.projectium.com/flyer-images`
### Email/SMTP Configuration
| Variable | Required | Default | Description |
| ----------------- | -------- | ------- | ----------------------- |
| `SMTP_HOST` | No | - | SMTP server hostname |
| `SMTP_PORT` | No | `587` | SMTP server port |
| `SMTP_USER` | No | - | SMTP username |
| `SMTP_PASS` | No | - | SMTP password |
| `SMTP_SECURE` | No | `false` | Use TLS |
| `SMTP_FROM_EMAIL` | No | - | From address for emails |
**Note**: Email functionality degrades gracefully if not configured.
### Worker Configuration
| Variable | Default | Description |
| ------------------------------------- | ------- | ---------------------------- |
| `WORKER_CONCURRENCY` | `1` | Main worker concurrency |
| `WORKER_LOCK_DURATION` | `30000` | Lock duration (ms) |
| `EMAIL_WORKER_CONCURRENCY` | `10` | Email worker concurrency |
| `ANALYTICS_WORKER_CONCURRENCY` | `1` | Analytics worker concurrency |
| `CLEANUP_WORKER_CONCURRENCY` | `10` | Cleanup worker concurrency |
| `WEEKLY_ANALYTICS_WORKER_CONCURRENCY` | `1` | Weekly analytics concurrency |
### Error Tracking (Bugsink/Sentry)
| Variable | Required | Default | Description |
| --------------------- | -------- | -------- | ------------------------------- |
| `SENTRY_DSN` | No | - | Backend Sentry DSN |
| `SENTRY_ENABLED` | No | `true` | Enable error tracking |
| `SENTRY_ENVIRONMENT` | No | NODE_ENV | Environment name for errors |
| `SENTRY_DEBUG` | No | `false` | Enable Sentry SDK debug logging |
| `VITE_SENTRY_DSN` | No | - | Frontend Sentry DSN |
| `VITE_SENTRY_ENABLED` | No | `true` | Enable frontend error tracking |
| `VITE_SENTRY_DEBUG` | No | `false` | Frontend SDK debug logging |
**DSN Format**: `http://[key]@[host]:[port]/[project_id]`
**Dev Container DSNs**:
```bash
# Backend (internal)
SENTRY_DSN=http://<key>@localhost:8000/1
# Frontend (via nginx proxy)
VITE_SENTRY_DSN=https://<key>@localhost/bugsink-api/2
```
---
## Configuration Files
| File | Purpose |
| ------------------------------------- | ------------------------------------------- |
| `src/config/env.ts` | Zod schema validation - **source of truth** |
| `ecosystem.config.cjs` | PM2 process manager config |
| `ecosystem.config.cjs` | PM2 process manager (production) |
| `ecosystem.dev.config.cjs` | PM2 process manager (development) |
| `.gitea/workflows/deploy-to-prod.yml` | Production deployment workflow |
| `.gitea/workflows/deploy-to-test.yml` | Test deployment workflow |
| `.env.example` | Template with all variables |
| `.env.local` | Dev container overrides (not in git) |
| `.env.test` | Test environment overrides (not in git) |
---
## Adding New Variables
### 1. Update Zod Schema
### Checklist
1. [ ] **Update Zod Schema** - Edit `src/config/env.ts`
2. [ ] **Add to Gitea Secrets** - For prod/test environments
3. [ ] **Update Deployment Workflows** - `.gitea/workflows/*.yml`
4. [ ] **Update PM2 Config** - `ecosystem.config.cjs`
5. [ ] **Update .env.example** - Template for developers
6. [ ] **Update this document** - Add to appropriate section
### Step-by-Step
#### 1. Update Zod Schema
Edit `src/config/env.ts`:
```typescript
const envSchema = z.object({
// ... existing variables ...
NEW_VARIABLE: z.string().min(1),
newSection: z.object({
newVariable: z.string().min(1, 'NEW_VARIABLE is required'),
}),
});
// In loadEnvVars():
newSection: {
newVariable: process.env.NEW_VARIABLE,
},
```
### 2. Add to Gitea Secrets
For prod/test environments:
#### 2. Add to Gitea Secrets
1. Go to Gitea repository Settings > Secrets
2. Add `NEW_VARIABLE` with value
2. Add `NEW_VARIABLE` with production value
3. Add `NEW_VARIABLE_TEST` if test needs different value
### 3. Update Deployment Workflows
#### 3. Update Deployment Workflows
Edit `.gitea/workflows/deploy-to-prod.yml`:
@@ -145,7 +272,7 @@ env:
NEW_VARIABLE: ${{ secrets.NEW_VARIABLE_TEST }}
```
### 4. Update PM2 Config
#### 4. Update PM2 Config
Edit `ecosystem.config.cjs`:
@@ -161,31 +288,36 @@ module.exports = {
};
```
### 5. Update Documentation
- Add to `.env.example`
- Update this document
- Document in relevant feature docs
---
## Security Best Practices
### Secrets Management
### Do
- **NEVER** commit secrets to git
- Use Gitea Secrets for prod/test
- Use `.env.local` for dev (gitignored)
- Generate secrets with cryptographic randomness
- Rotate secrets regularly
- Use environment-specific database users
### Do Not
- Commit secrets to git
- Use short or predictable secrets
- Share secrets across environments
- Log sensitive values
### Secret Generation
```bash
# Generate secure random secrets
# Generate secure random secrets (64 hex characters)
node -e "console.log(require('crypto').randomBytes(32).toString('hex'))"
# Example output:
# a1b2c3d4e5f6a1b2c3d4e5f6a1b2c3d4e5f6a1b2c3d4e5f6a1b2c3d4e5f6a1b2
```
### Database Users
Each environment has its own PostgreSQL user:
### Database Users by Environment
| Environment | User | Database |
| ----------- | -------------------- | -------------------- |
@@ -193,44 +325,61 @@ Each environment has its own PostgreSQL user:
| Test | `flyer_crawler_test` | `flyer-crawler-test` |
| Development | `postgres` | `flyer_crawler_dev` |
**Setup Commands** (as postgres superuser):
```sql
-- Production
CREATE DATABASE "flyer-crawler-prod";
CREATE USER flyer_crawler_prod WITH PASSWORD 'secure-password';
ALTER DATABASE "flyer-crawler-prod" OWNER TO flyer_crawler_prod;
\c "flyer-crawler-prod"
ALTER SCHEMA public OWNER TO flyer_crawler_prod;
GRANT CREATE, USAGE ON SCHEMA public TO flyer_crawler_prod;
CREATE EXTENSION IF NOT EXISTS "uuid-ossp";
CREATE EXTENSION IF NOT EXISTS postgis;
CREATE EXTENSION IF NOT EXISTS pg_trgm;
-- Test (similar commands with _test suffix)
```
---
## Validation
Environment variables are validated at startup via `src/config/env.ts`. If validation fails:
Environment variables are validated at startup via `src/config/env.ts`.
1. Check the error message for missing/invalid variables
2. Verify `.env.local` (dev) or Gitea Secrets (prod/test)
3. Ensure values match schema requirements (min length, format, etc.)
### Startup Validation
If validation fails, you will see:
```text
╔════════════════════════════════════════════════════════════════╗
║ CONFIGURATION ERROR - APPLICATION STARTUP ║
╚════════════════════════════════════════════════════════════════╝
The following environment variables are missing or invalid:
- database.host: DB_HOST is required
- auth.jwtSecret: JWT_SECRET must be at least 32 characters
Please check your .env file or environment configuration.
```
### Debugging Configuration
```bash
# Check what variables are set (dev container)
podman exec flyer-crawler-dev env | grep -E "^(DB_|REDIS_|JWT_|SENTRY_)"
# Test database connection
podman exec flyer-crawler-postgres psql -U postgres -d flyer_crawler_dev -c "SELECT 1;"
# Test Redis connection
podman exec flyer-crawler-redis redis-cli ping
```
---
## Troubleshooting
### Variable Not Found
```
```text
Error: Missing required environment variable: JWT_SECRET
```
**Solution**: Add the variable to your environment configuration.
**Solutions**:
1. Check `.env.local` exists and has the variable
2. Verify variable name matches schema exactly
3. Restart the application after changes
### Invalid Value
```
```text
Error: JWT_SECRET must be at least 32 characters
```
@@ -240,32 +389,36 @@ Error: JWT_SECRET must be at least 32 characters
Check `NODE_ENV` is set correctly:
- `development` - Local dev container
- `test` - CI/CD test server
- `production` - Production server
| Value | Purpose |
| ------------- | ---------------------- |
| `development` | Local dev container |
| `test` | CI/CD test server |
| `staging` | Pre-production testing |
| `production` | Production server |
### Database Connection Issues
Verify database credentials:
```bash
# Development
podman exec flyer-crawler-postgres psql -U postgres -d flyer_crawler_dev -c "SELECT 1;"
# Production (via SSH)
ssh root@projectium.com "psql -U flyer_crawler_prod -d flyer-crawler-prod -c 'SELECT 1;'"
# If connection fails, check:
# 1. Container is running: podman ps
# 2. DB_HOST matches container network
# 3. DB_PASSWORD is correct
```
## Reference
---
- **Validation Schema**: [src/config/env.ts](../../src/config/env.ts)
- **Template**: [.env.example](../../.env.example)
- **Deployment Workflows**: [.gitea/workflows/](../../.gitea/workflows/)
- **PM2 Config**: [ecosystem.config.cjs](../../ecosystem.config.cjs)
## See Also
## Related Documentation
- [QUICKSTART.md](QUICKSTART.md) - Quick setup guide
- [INSTALL.md](INSTALL.md) - Detailed installation
- [DEV-CONTAINER.md](../development/DEV-CONTAINER.md) - Dev container setup
- [DEPLOYMENT.md](../operations/DEPLOYMENT.md) - Production deployment
- [AUTHENTICATION.md](../architecture/AUTHENTICATION.md) - OAuth setup
- [ADR-007](../adr/0007-configuration-and-secrets-management.md) - Configuration decisions
---
Last updated: January 2026

View File

@@ -1,203 +1,453 @@
# Installation Guide
This guide covers setting up a local development environment for Flyer Crawler.
Complete setup instructions for the Flyer Crawler local development environment.
---
## Quick Reference
| Setup Method | Best For | Time | Document Section |
| ----------------- | --------------------------- | ------ | --------------------------------------------------- |
| Quick Start | Already have Postgres/Redis | 5 min | [Quick Start](#quick-start) |
| Dev Container | Full production-like setup | 15 min | [Dev Container](#development-container-recommended) |
| Manual Containers | Learning the components | 20 min | [Podman Setup](#podman-setup-manual) |
---
## Prerequisites
- Node.js 20.x or later
- Access to a PostgreSQL database (local or remote)
- Redis instance (for session management)
- Google Gemini API key
- Google Maps API key (for geocoding)
### Required Software
| Software | Minimum Version | Purpose | Download |
| -------------- | --------------- | -------------------- | ----------------------------------------------- |
| Node.js | 20.x | Runtime | [nodejs.org](https://nodejs.org/) |
| Podman Desktop | 4.x | Container management | [podman-desktop.io](https://podman-desktop.io/) |
| Git | 2.x | Version control | [git-scm.com](https://git-scm.com/) |
### Windows-Specific Requirements
| Requirement | Purpose | Setup Command |
| ----------- | ------------------------------ | ---------------------------------- |
| WSL 2 | Linux compatibility for Podman | `wsl --install` (admin PowerShell) |
### Verify Installation
```bash
# Check all prerequisites
node --version # Expected: v20.x or higher
podman --version # Expected: podman version 4.x or higher
git --version # Expected: git version 2.x or higher
wsl --list -v # Expected: Shows WSL 2 distro
```
---
## Quick Start
If you already have PostgreSQL and Redis configured:
If you already have PostgreSQL and Redis configured externally:
```bash
# Install dependencies
# 1. Clone the repository
git clone https://gitea.projectium.com/flyer-crawler/flyer-crawler.git
cd flyer-crawler
# 2. Install dependencies
npm install
# Run in development mode
# 3. Create .env.local (see Environment section below)
# 4. Run in development mode
npm run dev
```
**Access Points**:
- Frontend: `http://localhost:5173`
- Backend API: `http://localhost:3001`
---
## Development Environment with Podman (Recommended for Windows)
## Development Container (Recommended)
This approach uses Podman with an Ubuntu container for a consistent development environment.
The dev container provides a complete, production-like environment.
### What's Included
| Service | Purpose | Port |
| ---------- | ------------------------ | ---------- |
| Node.js | API server, worker, Vite | 3001, 5173 |
| PostgreSQL | Database with PostGIS | 5432 |
| Redis | Cache and job queues | 6379 |
| NGINX | HTTPS reverse proxy | 443 |
| Bugsink | Error tracking | 8443 |
| Logstash | Log aggregation | - |
| PM2 | Process management | - |
### Setup Steps
#### Step 1: Initialize Podman
```bash
# Windows: Start Podman Desktop, or from terminal:
podman machine init
podman machine start
```
#### Step 2: Start Dev Container
```bash
# Start all services
podman-compose -f compose.dev.yml up -d
# View logs (optional)
podman-compose -f compose.dev.yml logs -f
```
**Expected Output**:
```text
[+] Running 3/3
- Container flyer-crawler-postgres Started
- Container flyer-crawler-redis Started
- Container flyer-crawler-dev Started
```
#### Step 3: Verify Services
```bash
# Check containers are running
podman ps
# Check PM2 processes
podman exec -it flyer-crawler-dev pm2 status
```
**Expected PM2 Status**:
```text
+---------------------------+--------+-------+
| name | status | cpu |
+---------------------------+--------+-------+
| flyer-crawler-api-dev | online | 0% |
| flyer-crawler-worker-dev | online | 0% |
| flyer-crawler-vite-dev | online | 0% |
+---------------------------+--------+-------+
```
#### Step 4: Access Application
| Service | URL | Notes |
| ----------- | ------------------------ | ---------------------------- |
| Frontend | `https://localhost` | NGINX proxies to Vite |
| Backend API | `http://localhost:3001` | Express server |
| Bugsink | `https://localhost:8443` | Login: admin@localhost/admin |
### SSL Certificate Setup (Optional but Recommended)
To eliminate browser security warnings:
**Windows**:
1. Double-click `certs/mkcert-ca.crt`
2. Click "Install Certificate..."
3. Select "Local Machine" > Next
4. Select "Place all certificates in the following store"
5. Browse > Select "Trusted Root Certification Authorities" > OK
6. Click Next > Finish
7. Restart browser
**Other Platforms**: See [`certs/README.md`](../../certs/README.md)
### Managing the Dev Container
| Action | Command |
| --------- | ------------------------------------------- |
| Start | `podman-compose -f compose.dev.yml up -d` |
| Stop | `podman-compose -f compose.dev.yml down` |
| View logs | `podman-compose -f compose.dev.yml logs -f` |
| Restart | `podman-compose -f compose.dev.yml restart` |
| Rebuild | `podman-compose -f compose.dev.yml build` |
---
## Podman Setup (Manual)
For understanding the individual components or custom configurations.
### Step 1: Install Prerequisites on Windows
1. **Install WSL 2**: Podman on Windows relies on the Windows Subsystem for Linux.
```powershell
# Run in administrator PowerShell
wsl --install
```
```powershell
wsl --install
```
Restart computer after WSL installation.
Run this in an administrator PowerShell.
### Step 2: Initialize Podman
2. **Install Podman Desktop**: Download and install [Podman Desktop for Windows](https://podman-desktop.io/).
1. Launch **Podman Desktop**
2. Follow the setup wizard to initialize Podman machine
3. Start the Podman machine
### Step 2: Set Up Podman
1. **Initialize Podman**: Launch Podman Desktop. It will automatically set up its WSL 2 machine.
2. **Start Podman**: Ensure the Podman machine is running from the Podman Desktop interface.
### Step 3: Set Up the Ubuntu Container
1. **Pull Ubuntu Image**:
```bash
podman pull ubuntu:latest
```
2. **Create a Podman Volume** (persists node_modules between container restarts):
```bash
podman volume create node_modules_cache
```
3. **Run the Ubuntu Container**:
Open a terminal in your project's root directory and run:
```bash
podman run -it -p 3001:3001 -p 5173:5173 --name flyer-dev \
-v "$(pwd):/app" \
-v "node_modules_cache:/app/node_modules" \
ubuntu:latest
```
| Flag | Purpose |
| ------------------------------------------- | ------------------------------------------------ |
| `-p 3001:3001` | Forwards the backend server port |
| `-p 5173:5173` | Forwards the Vite frontend server port |
| `--name flyer-dev` | Names the container for easy reference |
| `-v "...:/app"` | Mounts your project directory into the container |
| `-v "node_modules_cache:/app/node_modules"` | Mounts the named volume for node_modules |
### Step 4: Configure the Ubuntu Environment
You are now inside the Ubuntu container's shell.
1. **Update Package Lists**:
```bash
apt-get update
```
2. **Install Dependencies**:
```bash
apt-get install -y curl git
curl -sL https://deb.nodesource.com/setup_20.x | bash -
apt-get install -y nodejs
```
3. **Navigate to Project Directory**:
```bash
cd /app
```
4. **Install Project Dependencies**:
```bash
npm install
```
### Step 5: Run the Development Server
Or from terminal:
```bash
podman machine init
podman machine start
```
### Step 3: Create Podman Network
```bash
podman network create flyer-crawler-net
```
### Step 4: Create PostgreSQL Container
```bash
podman run -d \
--name flyer-crawler-postgres \
--network flyer-crawler-net \
-e POSTGRES_USER=postgres \
-e POSTGRES_PASSWORD=postgres \
-e POSTGRES_DB=flyer_crawler_dev \
-p 5432:5432 \
-v flyer-crawler-pgdata:/var/lib/postgresql/data \
docker.io/postgis/postgis:15-3.3
```
### Step 5: Create Redis Container
```bash
podman run -d \
--name flyer-crawler-redis \
--network flyer-crawler-net \
-p 6379:6379 \
-v flyer-crawler-redis:/data \
docker.io/library/redis:alpine
```
### Step 6: Initialize Database
```bash
# Wait for PostgreSQL to be ready
podman exec flyer-crawler-postgres pg_isready -U postgres
# Install required extensions
podman exec flyer-crawler-postgres psql -U postgres -d flyer_crawler_dev -c "
CREATE EXTENSION IF NOT EXISTS postgis;
CREATE EXTENSION IF NOT EXISTS pg_trgm;
CREATE EXTENSION IF NOT EXISTS \"uuid-ossp\";
"
# Apply schema
podman exec -i flyer-crawler-postgres psql -U postgres -d flyer_crawler_dev < sql/master_schema_rollup.sql
```
### Step 7: Create Node.js Container
```bash
# Create volume for node_modules
podman volume create node_modules_cache
# Run Ubuntu container with project mounted
podman run -it \
--name flyer-dev \
--network flyer-crawler-net \
-p 3001:3001 \
-p 5173:5173 \
-v "$(pwd):/app" \
-v "node_modules_cache:/app/node_modules" \
ubuntu:latest
```
### Step 8: Configure Container Environment
Inside the container:
```bash
# Update and install dependencies
apt-get update
apt-get install -y curl git
# Install Node.js 20
curl -sL https://deb.nodesource.com/setup_20.x | bash -
apt-get install -y nodejs
# Navigate to project and install
cd /app
npm install
# Start development server
npm run dev
```
### Step 6: Access the Application
### Container Management Commands
- **Frontend**: http://localhost:5173
- **Backend API**: http://localhost:3001
### Dev Container with HTTPS (Full Stack)
When using the full dev container stack with NGINX (via `compose.dev.yml`), access the application over HTTPS:
- **Frontend**: https://localhost or https://127.0.0.1
- **Backend API**: http://localhost:3001
**SSL Certificate Notes:**
- The dev container uses self-signed certificates generated by mkcert
- Both `localhost` and `127.0.0.1` are valid hostnames (certificate includes both as SANs)
- If images fail to load with SSL errors, see [FLYER-URL-CONFIGURATION.md](../FLYER-URL-CONFIGURATION.md#ssl-certificate-configuration-dev-container)
**Eliminate SSL Warnings (Recommended):**
To avoid browser security warnings for self-signed certificates, install the mkcert CA certificate on your system. The CA certificate is located at `certs/mkcert-ca.crt` in the project root.
See [`certs/README.md`](../../certs/README.md) for platform-specific installation instructions (Windows, macOS, Linux, Firefox).
After installation:
- Your browser will trust all mkcert certificates without warnings
- Both `https://localhost/` and `https://127.0.0.1/` will work without SSL errors
- Flyer images will load without `ERR_CERT_AUTHORITY_INVALID` errors
### Managing the Container
| Action | Command |
| --------------------- | -------------------------------- |
| Stop the container | Press `Ctrl+C`, then type `exit` |
| Restart the container | `podman start -a -i flyer-dev` |
| Remove the container | `podman rm flyer-dev` |
| Action | Command |
| -------------- | ------------------------------ |
| Stop container | Press `Ctrl+C`, then `exit` |
| Restart | `podman start -a -i flyer-dev` |
| Remove | `podman rm flyer-dev` |
| List running | `podman ps` |
| List all | `podman ps -a` |
---
## Environment Variables
## Environment Configuration
This project is configured to run in a CI/CD environment and does not use `.env` files. All configuration must be provided as environment variables.
### Create .env.local
For local development, you can export these in your shell or use your IDE's environment configuration:
Create `.env.local` in the project root with your configuration:
| Variable | Description |
| --------------------------- | ------------------------------------- |
| `DB_HOST` | PostgreSQL server hostname |
| `DB_USER` | PostgreSQL username |
| `DB_PASSWORD` | PostgreSQL password |
| `DB_DATABASE_PROD` | Production database name |
| `JWT_SECRET` | Secret string for signing auth tokens |
| `VITE_GOOGLE_GENAI_API_KEY` | Google Gemini API key |
| `GOOGLE_MAPS_API_KEY` | Google Maps Geocoding API key |
| `REDIS_PASSWORD_PROD` | Production Redis password |
| `REDIS_PASSWORD_TEST` | Test Redis password |
```bash
# Database (adjust host based on your setup)
DB_HOST=localhost # Use 'postgres' if inside dev container
DB_PORT=5432
DB_USER=postgres
DB_PASSWORD=postgres
DB_NAME=flyer_crawler_dev
# Redis (adjust host based on your setup)
REDIS_URL=redis://localhost:6379 # Use 'redis://redis:6379' inside container
# Application
NODE_ENV=development
PORT=3001
FRONTEND_URL=http://localhost:5173
# Authentication (generate secure values)
JWT_SECRET=your-secret-at-least-32-characters-long
# AI Services
GEMINI_API_KEY=your-google-gemini-api-key
GOOGLE_MAPS_API_KEY=your-google-maps-api-key # Optional
```
**Generate Secure Secrets**:
```bash
node -e "console.log(require('crypto').randomBytes(32).toString('hex'))"
```
### Environment Differences
| Variable | Host Development | Inside Dev Container |
| ----------- | ------------------------ | -------------------- |
| `DB_HOST` | `localhost` | `postgres` |
| `REDIS_URL` | `redis://localhost:6379` | `redis://redis:6379` |
See [ENVIRONMENT.md](ENVIRONMENT.md) for complete variable reference.
---
## Seeding Development Data
To create initial test accounts (`admin@example.com` and `user@example.com`) and sample data:
Create test accounts and sample data:
```bash
npm run seed
```
The seed script performs the following actions:
### What the Seed Script Does
1. Rebuilds the database schema from `sql/master_schema_rollup.sql`
2. Creates test user accounts (admin and regular user)
3. Copies test flyer images from `src/tests/assets/` to `public/flyer-images/`
4. Creates a sample flyer with items linked to the test images
5. Seeds watched items and a shopping list for the test user
1. Rebuilds database schema from `sql/master_schema_rollup.sql`
2. Creates test user accounts:
- `admin@example.com` (admin user)
- `user@example.com` (regular user)
3. Copies test flyer images to `public/flyer-images/`
4. Creates sample flyer with items
5. Seeds watched items and shopping list
**Test Images**: The seed script copies `test-flyer-image.jpg` and `test-flyer-icon.png` to the `public/flyer-images/` directory, which is served by NGINX at `/flyer-images/`.
### Test Images
After running, you may need to restart your IDE's TypeScript server to pick up any generated types.
The seed script copies these files from `src/tests/assets/`:
- `test-flyer-image.jpg`
- `test-flyer-icon.png`
Images are served by NGINX at `/flyer-images/`.
---
## Verification Checklist
After installation, verify everything works:
- [ ] **Containers running**: `podman ps` shows postgres and redis
- [ ] **Database accessible**: `podman exec flyer-crawler-postgres psql -U postgres -c "SELECT 1;"`
- [ ] **Frontend loads**: Open `http://localhost:5173` (or `https://localhost` for dev container)
- [ ] **API responds**: `curl http://localhost:3001/health`
- [ ] **Tests pass**: `npm run test:unit` (or in container: `podman exec -it flyer-crawler-dev npm run test:unit`)
- [ ] **Type check passes**: `npm run type-check`
---
## Troubleshooting
### Podman Machine Won't Start
```bash
# Reset Podman machine
podman machine rm
podman machine init
podman machine start
```
### Port Already in Use
```bash
# Find process using port
netstat -ano | findstr :5432
# Option: Use different port
podman run -d --name flyer-crawler-postgres -p 5433:5432 ...
# Then set DB_PORT=5433 in .env.local
```
### Database Extensions Missing
```bash
podman exec flyer-crawler-postgres psql -U postgres -d flyer_crawler_dev -c "
CREATE EXTENSION IF NOT EXISTS postgis;
CREATE EXTENSION IF NOT EXISTS pg_trgm;
CREATE EXTENSION IF NOT EXISTS \"uuid-ossp\";
"
```
### Permission Denied on Windows Paths
Use `MSYS_NO_PATHCONV=1` prefix:
```bash
MSYS_NO_PATHCONV=1 podman exec flyer-crawler-dev /path/to/script.sh
```
### Tests Fail with Timezone Errors
Tests must run in the dev container, not on Windows host:
```bash
# CORRECT
podman exec -it flyer-crawler-dev npm test
# INCORRECT (may fail with TZ errors)
npm test
```
---
## Next Steps
- [Database Setup](DATABASE.md) - Set up PostgreSQL with required extensions
- [Authentication Setup](AUTHENTICATION.md) - Configure OAuth providers
- [Deployment Guide](DEPLOYMENT.md) - Deploy to production
| Goal | Document |
| --------------------- | ------------------------------------------------------ |
| Quick setup guide | [QUICKSTART.md](QUICKSTART.md) |
| Environment variables | [ENVIRONMENT.md](ENVIRONMENT.md) |
| Database schema | [DATABASE.md](../architecture/DATABASE.md) |
| Authentication setup | [AUTHENTICATION.md](../architecture/AUTHENTICATION.md) |
| Dev container details | [DEV-CONTAINER.md](../development/DEV-CONTAINER.md) |
| Deployment | [DEPLOYMENT.md](../operations/DEPLOYMENT.md) |
---
Last updated: January 2026

View File

@@ -2,13 +2,38 @@
Get Flyer Crawler running in 5 minutes.
## Prerequisites
---
- **Windows 10/11** with WSL 2
- **Podman Desktop** installed
- **Node.js 20+** installed
## Prerequisites Checklist
## 1. Start Containers (1 minute)
Before starting, verify you have:
- [ ] **Windows 10/11** with WSL 2 enabled
- [ ] **Podman Desktop** installed ([download](https://podman-desktop.io/))
- [ ] **Node.js 20+** installed
- [ ] **Git** for cloning the repository
**Verify Prerequisites**:
```bash
# Check Podman
podman --version
# Expected: podman version 4.x or higher
# Check Node.js
node --version
# Expected: v20.x or higher
# Check WSL
wsl --list --verbose
# Expected: Shows WSL 2 distro
```
---
## Quick Setup (5 Steps)
### Step 1: Start Containers (1 minute)
```bash
# Start PostgreSQL and Redis
@@ -27,11 +52,18 @@ podman run -d --name flyer-crawler-redis \
docker.io/library/redis:alpine
```
## 2. Initialize Database (2 minutes)
**Expected Output**:
```text
# Container IDs displayed, no errors
```
### Step 2: Initialize Database (2 minutes)
```bash
# Wait for PostgreSQL to be ready
podman exec flyer-crawler-postgres pg_isready -U postgres
# Expected: localhost:5432 - accepting connections
# Install extensions
podman exec flyer-crawler-postgres psql -U postgres -d flyer_crawler_dev \
@@ -41,7 +73,17 @@ podman exec flyer-crawler-postgres psql -U postgres -d flyer_crawler_dev \
podman exec -i flyer-crawler-postgres psql -U postgres -d flyer_crawler_dev < sql/master_schema_rollup.sql
```
## 3. Configure Environment (1 minute)
**Expected Output**:
```text
CREATE EXTENSION
CREATE EXTENSION
CREATE EXTENSION
CREATE TABLE
... (many tables created)
```
### Step 3: Configure Environment (1 minute)
Create `.env.local` in the project root:
@@ -61,16 +103,22 @@ NODE_ENV=development
PORT=3001
FRONTEND_URL=http://localhost:5173
# Secrets (generate your own)
# Secrets (generate your own - see command below)
JWT_SECRET=your-dev-jwt-secret-at-least-32-chars-long
SESSION_SECRET=your-dev-session-secret-at-least-32-chars-long
# AI Services (get your own keys)
VITE_GOOGLE_GENAI_API_KEY=your-google-genai-api-key
GEMINI_API_KEY=your-google-gemini-api-key
GOOGLE_MAPS_API_KEY=your-google-maps-api-key
```
## 4. Install & Run (1 minute)
**Generate Secure Secrets**:
```bash
node -e "console.log(require('crypto').randomBytes(32).toString('hex'))"
```
### Step 4: Install and Run (1 minute)
```bash
# Install dependencies (first time only)
@@ -80,35 +128,61 @@ npm install
npm run dev
```
## 5. Access Application
**Expected Output**:
- **Frontend**: http://localhost:5173
- **Backend API**: http://localhost:3001
- **Health Check**: http://localhost:3001/health
```text
> flyer-crawler@x.x.x dev
> concurrently ...
### Dev Container (HTTPS)
[API] Server listening on port 3001
[Vite] VITE ready at http://localhost:5173
```
When using the full dev container with NGINX, access via HTTPS:
### Step 5: Verify Installation
- **Frontend**: https://localhost or https://127.0.0.1
- **Backend API**: http://localhost:3001
- **Bugsink**: `https://localhost:8443` (error tracking)
| Check | URL/Command | Expected Result |
| ----------- | ------------------------------ | ----------------------------------- |
| Frontend | `http://localhost:5173` | Flyer Crawler app loads |
| Backend API | `http://localhost:3001/health` | `{ "status": "ok", ... }` |
| Database | `podman exec ... psql -c ...` | `SELECT version()` returns Postgres |
| Containers | `podman ps` | Shows postgres and redis running |
**Note:** The dev container accepts both `localhost` and `127.0.0.1` for HTTPS connections. The self-signed certificate is valid for both hostnames.
---
**SSL Certificate Warnings:** To eliminate browser security warnings for self-signed certificates, install the mkcert CA certificate. See [`certs/README.md`](../../certs/README.md) for platform-specific installation instructions. This is optional but recommended for a better development experience.
## Full Dev Container (Recommended)
### Dev Container Architecture
For a production-like environment with NGINX, Bugsink error tracking, and PM2 process management:
The dev container uses PM2 for process management, matching production (ADR-014):
### Starting the Dev Container
| Process | Description | Port |
| -------------------------- | ------------------------ | ---- |
| `flyer-crawler-api-dev` | API server (tsx watch) | 3001 |
| `flyer-crawler-worker-dev` | Background job worker | - |
| `flyer-crawler-vite-dev` | Vite frontend dev server | 5173 |
```bash
# Start all services
podman-compose -f compose.dev.yml up -d
**PM2 Commands** (run inside container):
# View logs
podman-compose -f compose.dev.yml logs -f
```
### Access Points
| Service | URL | Notes |
| ----------- | ------------------------ | ---------------------------- |
| Frontend | `https://localhost` | NGINX proxy to Vite |
| Backend API | `http://localhost:3001` | Express server |
| Bugsink | `https://localhost:8443` | Error tracking (admin/admin) |
| PostgreSQL | `localhost:5432` | Database |
| Redis | `localhost:6379` | Cache |
**SSL Certificate Setup (Recommended)**:
To eliminate browser security warnings, install the mkcert CA certificate:
```bash
# Windows: Double-click certs/mkcert-ca.crt and install to Trusted Root CAs
# See certs/README.md for detailed instructions per platform
```
### PM2 Commands
```bash
# View process status
@@ -124,63 +198,152 @@ podman exec -it flyer-crawler-dev pm2 restart all
podman exec -it flyer-crawler-dev pm2 restart flyer-crawler-api-dev
```
## Verify Installation
### Dev Container Processes
| Process | Description | Port |
| -------------------------- | ------------------------ | ---- |
| `flyer-crawler-api-dev` | API server (tsx watch) | 3001 |
| `flyer-crawler-worker-dev` | Background job worker | - |
| `flyer-crawler-vite-dev` | Vite frontend dev server | 5173 |
---
## Verification Commands
Run these to confirm everything is working:
```bash
# Check containers are running
podman ps
# Expected: flyer-crawler-postgres and flyer-crawler-redis both running
# Test database connection
podman exec flyer-crawler-postgres psql -U postgres -d flyer_crawler_dev -c "SELECT version();"
# Expected: PostgreSQL 15.x with PostGIS
# Run tests (in dev container)
podman exec -it flyer-crawler-dev npm run test:unit
# Expected: All tests pass
# Run type check
podman exec -it flyer-crawler-dev npm run type-check
# Expected: No type errors
```
## Common Issues
---
## Common Issues and Solutions
### "Unable to connect to Podman socket"
**Cause**: Podman machine not running
**Solution**:
```bash
podman machine start
```
### "Connection refused" to PostgreSQL
Wait a few seconds for PostgreSQL to initialize:
**Cause**: PostgreSQL still initializing
**Solution**:
```bash
# Wait for PostgreSQL to be ready
podman exec flyer-crawler-postgres pg_isready -U postgres
# Retry after "accepting connections" message
```
### Port 5432 or 6379 already in use
Stop conflicting services or change port mappings:
**Cause**: Another service using the port
**Solution**:
```bash
# Use different host port
# Option 1: Stop conflicting service
# Option 2: Use different host port
podman run -d --name flyer-crawler-postgres -p 5433:5432 ...
# Then update DB_PORT=5433 in .env.local
```
Then update `DB_PORT=5433` in `.env.local`.
### "JWT_SECRET must be at least 32 characters"
**Cause**: Secret too short in .env.local
**Solution**: Generate a longer secret:
```bash
node -e "console.log(require('crypto').randomBytes(32).toString('hex'))"
```
### Tests fail with "TZ environment variable" errors
**Cause**: Timezone setting interfering with Node.js async hooks
**Solution**: Tests must run in dev container (not Windows host):
```bash
# CORRECT - run in container
podman exec -it flyer-crawler-dev npm test
# INCORRECT - do not run on Windows host
npm test
```
---
## Next Steps
- **Read the docs**: [docs/README.md](../README.md)
- **Understand the architecture**: [docs/architecture/DATABASE.md](../architecture/DATABASE.md)
- **Learn testing**: [docs/development/TESTING.md](../development/TESTING.md)
- **Explore ADRs**: [docs/adr/index.md](../adr/index.md)
- **Contributing**: [CONTRIBUTING.md](../../CONTRIBUTING.md)
| Goal | Document |
| ----------------------- | ----------------------------------------------------- |
| Understand the codebase | [Architecture Overview](../architecture/OVERVIEW.md) |
| Configure environment | [Environment Variables](ENVIRONMENT.md) |
| Set up MCP tools | [MCP Configuration](../tools/MCP-CONFIGURATION.md) |
| Learn testing | [Testing Guide](../development/TESTING.md) |
| Understand DB schema | [Database Documentation](../architecture/DATABASE.md) |
| Read ADRs | [ADR Index](../adr/index.md) |
| Full installation guide | [Installation Guide](INSTALL.md) |
## Development Workflow
---
## Daily Development Workflow
```bash
# Daily workflow
# 1. Start containers
podman start flyer-crawler-postgres flyer-crawler-redis
# 2. Start dev server
npm run dev
# ... make changes ...
# 3. Make changes and test
npm test
# 4. Type check before commit
npm run type-check
# 5. Commit changes
git commit
```
For detailed setup instructions, see [INSTALL.md](INSTALL.md).
**For dev container users**:
```bash
# 1. Start dev container
podman-compose -f compose.dev.yml up -d
# 2. View logs
podman exec -it flyer-crawler-dev pm2 logs
# 3. Run tests
podman exec -it flyer-crawler-dev npm test
# 4. Stop when done
podman-compose -f compose.dev.yml down
```
---
Last updated: January 2026

View File

@@ -2,8 +2,68 @@
This guide covers the manual installation of Flyer Crawler and its dependencies on a bare-metal Ubuntu server (e.g., a colocation server). This is the definitive reference for setting up a production environment without containers.
**Last verified**: 2026-01-28
**Target Environment**: Ubuntu 22.04 LTS (or newer)
**Related documentation**:
- [ADR-014: Containerization and Deployment Strategy](../adr/0014-containerization-and-deployment-strategy.md)
- [ADR-015: Error Tracking and Observability](../adr/0015-error-tracking-and-observability.md)
- [ADR-050: PostgreSQL Function Observability](../adr/0050-postgresql-function-observability.md)
- [Deployment Guide](DEPLOYMENT.md)
- [Monitoring Guide](MONITORING.md)
---
## Quick Reference
### Installation Time Estimates
| Component | Estimated Time | Notes |
| ----------- | --------------- | ----------------------------- |
| PostgreSQL | 10-15 minutes | Including PostGIS extensions |
| Redis | 5 minutes | Quick install |
| Node.js | 5 minutes | Via NodeSource repository |
| Application | 15-20 minutes | Clone, install, build |
| PM2 | 5 minutes | Global install + config |
| NGINX | 10-15 minutes | Including SSL via Certbot |
| Bugsink | 20-30 minutes | Python venv, systemd services |
| Logstash | 15-20 minutes | Including pipeline config |
| **Total** | **~90 minutes** | For complete fresh install |
### Post-Installation Verification
After completing setup, verify all services:
```bash
# Check all services are running
systemctl status postgresql nginx redis-server gunicorn-bugsink snappea logstash
# Verify application health
curl -s https://flyer-crawler.projectium.com/api/health/ready | jq .
# Check PM2 processes
pm2 list
# Verify Bugsink is accessible
curl -s https://bugsink.projectium.com/accounts/login/ | head -5
```
---
## Server Access Model
All commands in this guide are intended for the **system administrator** to execute directly on the server. Claude Code and AI tools have **READ-ONLY** access to production servers and cannot execute these commands directly.
When Claude assists with server setup or troubleshooting:
1. Claude provides commands for the administrator to execute
2. Administrator runs commands and reports output
3. Claude analyzes results and provides next steps (1-3 commands at a time)
4. Administrator executes and reports results
5. Claude provides verification commands to confirm success
---
## Table of Contents

View File

@@ -2,14 +2,81 @@
This guide covers deploying Flyer Crawler to a production server.
**Last verified**: 2026-01-28
**Related documentation**:
- [ADR-014: Containerization and Deployment Strategy](../adr/0014-containerization-and-deployment-strategy.md)
- [ADR-015: Error Tracking and Observability](../adr/0015-error-tracking-and-observability.md)
- [Bare-Metal Setup Guide](BARE-METAL-SETUP.md)
- [Monitoring Guide](MONITORING.md)
---
## Quick Reference
### Command Reference Table
| Task | Command |
| -------------------- | ----------------------------------------------------------------------- |
| Deploy to production | Gitea Actions workflow (manual trigger) |
| Deploy to test | Automatic on push to `main` |
| Check PM2 status | `pm2 list` |
| View logs | `pm2 logs flyer-crawler-api --lines 100` |
| Restart all | `pm2 restart all` |
| Check NGINX | `sudo nginx -t && sudo systemctl status nginx` |
| Check health | `curl -s https://flyer-crawler.projectium.com/api/health/ready \| jq .` |
### Deployment URLs
| Environment | URL | API Port |
| ------------- | ------------------------------------------- | -------- |
| Production | `https://flyer-crawler.projectium.com` | 3001 |
| Test | `https://flyer-crawler-test.projectium.com` | 3002 |
| Dev Container | `https://localhost` | 3001 |
---
## Server Access Model
**Important**: Claude Code (and AI tools) have **READ-ONLY** access to production/test servers. The deployment workflow is:
| Actor | Capability |
| ------------ | --------------------------------------------------------------- |
| Gitea CI/CD | Automated deployments via workflows (has write access) |
| User (human) | Manual server access for troubleshooting and emergency fixes |
| Claude Code | Provides commands for user to execute; cannot run them directly |
When troubleshooting deployment issues:
1. Claude provides **diagnostic commands** for the user to run
2. User executes commands and reports output
3. Claude analyzes results and provides **fix commands** (1-3 at a time)
4. User executes fixes and reports results
5. Claude provides **verification commands** to confirm success
---
## Prerequisites
- Ubuntu server (22.04 LTS recommended)
- PostgreSQL 14+ with PostGIS extension
- Redis
- Node.js 20.x
- NGINX (reverse proxy)
- PM2 (process manager)
| Component | Version | Purpose |
| ---------- | --------- | ------------------------------- |
| Ubuntu | 22.04 LTS | Operating system |
| PostgreSQL | 14+ | Database with PostGIS extension |
| Redis | 6+ | Caching and job queues |
| Node.js | 20.x LTS | Application runtime |
| NGINX | 1.18+ | Reverse proxy and static files |
| PM2 | Latest | Process manager |
**Verify prerequisites**:
```bash
node --version # Should be v20.x.x
psql --version # Should be 14+
redis-cli ping # Should return PONG
nginx -v # Should be 1.18+
pm2 --version # Any recent version
```
## Dev Container Parity (ADR-014)
@@ -190,7 +257,7 @@ types {
**Option 2**: Edit `/etc/nginx/mime.types` globally:
```
```text
# Change this line:
application/javascript js;
@@ -321,9 +388,78 @@ The Sentry SDK v10+ enforces HTTPS-only DSNs by default. Since Bugsink runs loca
---
## Deployment Troubleshooting
### Decision Tree: Deployment Issues
```text
Deployment failed?
|
+-- Build step failed?
| |
| +-- TypeScript errors --> Fix type issues, run `npm run type-check`
| +-- Missing dependencies --> Run `npm ci`
| +-- Out of memory --> Increase Node heap size
|
+-- Tests failed?
| |
| +-- Database connection --> Check DB_HOST, credentials
| +-- Redis connection --> Check REDIS_URL
| +-- Test isolation --> Check for race conditions
|
+-- SSH/Deploy failed?
|
+-- Permission denied --> Check SSH keys in Gitea secrets
+-- Host unreachable --> Check firewall, VPN
+-- PM2 error --> Check PM2 logs on server
```
### Common Deployment Issues
| Symptom | Diagnosis | Solution |
| ------------------------------------ | ----------------------- | ------------------------------------------------ |
| "Connection refused" on health check | API not started | Check `pm2 logs flyer-crawler-api` |
| 502 Bad Gateway | NGINX cannot reach API | Verify API port (3001), restart PM2 |
| CSS/JS not loading | Build artifacts missing | Re-run `npm run build`, check NGINX static paths |
| Database migrations failed | Schema mismatch | Run migrations manually, check DB connectivity |
| "ENOSPC" error | Disk full | Clear old logs: `pm2 flush`, clean npm cache |
| SSL certificate error | Cert expired/missing | Run `certbot renew`, check NGINX config |
### Post-Deployment Verification Checklist
After every deployment, verify:
- [ ] Health check passes: `curl -s https://flyer-crawler.projectium.com/api/health/ready`
- [ ] PM2 processes running: `pm2 list` shows `online` status
- [ ] No recent errors: Check Bugsink for new issues
- [ ] Frontend loads: Browser shows login page
- [ ] API responds: `curl https://flyer-crawler.projectium.com/api/health/ping`
### Rollback Procedure
If deployment causes issues:
```bash
# 1. Check current release
cd /var/www/flyer-crawler.projectium.com
git log --oneline -5
# 2. Revert to previous commit
git checkout HEAD~1
# 3. Rebuild and restart
npm ci && npm run build
pm2 restart all
# 4. Verify health
curl -s http://localhost:3001/api/health/ready | jq .
```
---
## Related Documentation
- [Database Setup](DATABASE.md) - PostgreSQL and PostGIS configuration
- [Authentication Setup](AUTHENTICATION.md) - OAuth provider configuration
- [Installation Guide](INSTALL.md) - Local development setup
- [Bare-Metal Server Setup](docs/BARE-METAL-SETUP.md) - Manual server installation guide
- [Database Setup](../architecture/DATABASE.md) - PostgreSQL and PostGIS configuration
- [Monitoring Guide](MONITORING.md) - Health checks and error tracking
- [Logstash Quick Reference](LOGSTASH-QUICK-REF.md) - Log aggregation
- [Bare-Metal Server Setup](BARE-METAL-SETUP.md) - Manual server installation guide

View File

@@ -0,0 +1,269 @@
# Incident Report: PM2 Process Kill During v0.15.0 Deployment
**Date**: 2026-02-17
**Severity**: Critical
**Status**: Mitigated - Safeguards Implemented
**Affected Systems**: All PM2-managed applications on projectium.com server
---
## Resolution Summary
**Safeguards implemented on 2026-02-17** to prevent recurrence:
1. Workflow metadata logging (audit trail)
2. Pre-cleanup PM2 state logging (forensics)
3. Process count validation with SAFETY ABORT (automatic prevention)
4. Explicit name verification (visibility)
5. Post-cleanup verification (environment isolation check)
**Documentation created**:
- [PM2 Incident Response Runbook](PM2-INCIDENT-RESPONSE.md)
- [PM2 Safeguards Session Summary](../archive/sessions/PM2_SAFEGUARDS_SESSION_2026-02-17.md)
- CLAUDE.md updated with [PM2 Process Isolation Incidents section](../../CLAUDE.md#pm2-process-isolation-incidents)
---
## Summary
During v0.15.0 production deployment, ALL PM2 processes on the server were terminated, not just flyer-crawler processes. This caused unplanned downtime for other applications including stock-alert.
## Timeline
| Time (Approx) | Event |
| --------------------- | ---------------------------------------------------------------- |
| 2026-02-17 ~07:40 UTC | v0.15.0 production deployment triggered via `deploy-to-prod.yml` |
| Unknown | All PM2 processes killed (flyer-crawler AND other apps) |
| Unknown | Incident discovered - stock-alert down |
| 2026-02-17 | Investigation initiated |
| 2026-02-17 | Defense-in-depth safeguards implemented in all workflows |
| 2026-02-17 | Incident response runbook created |
| 2026-02-17 | Status changed to Mitigated |
## Impact
- **Affected Applications**: All PM2-managed processes on projectium.com
- flyer-crawler-api, flyer-crawler-worker, flyer-crawler-analytics-worker (expected)
- stock-alert (NOT expected - collateral damage)
- Potentially other unidentified applications
- **Downtime Duration**: TBD
- **User Impact**: Service unavailability for all affected applications
---
## Investigation Findings
### Deployment Workflow Analysis
All deployment workflows were reviewed for PM2 process isolation:
| Workflow | PM2 Isolation | Implementation |
| ------------------------- | -------------- | ------------------------------------------------------------------------------------------------- |
| `deploy-to-prod.yml` | Whitelist | `prodProcesses = ['flyer-crawler-api', 'flyer-crawler-worker', 'flyer-crawler-analytics-worker']` |
| `deploy-to-test.yml` | Pattern | `p.name.endsWith('-test')` |
| `manual-deploy-major.yml` | Whitelist | Same as deploy-to-prod |
| `manual-db-restore.yml` | Explicit names | `pm2 stop flyer-crawler-api flyer-crawler-worker flyer-crawler-analytics-worker` |
### Fix Commit Already In Place
The PM2 process isolation fix was implemented in commit `b6a62a0` (2026-02-13):
```
commit b6a62a036f39ac895271402a61e5cc4227369de7
Author: Torben Sorensen <torben.sorensen@gmail.com>
Date: Fri Feb 13 10:19:28 2026 -0800
be specific about pm2 processes
Files modified:
.gitea/workflows/deploy-to-prod.yml
.gitea/workflows/deploy-to-test.yml
.gitea/workflows/manual-db-restore.yml
.gitea/workflows/manual-deploy-major.yml
CLAUDE.md
```
### v0.15.0 Release Contains Fix
Confirmed: v0.15.0 (commit `93ad624`, 2026-02-18) includes the fix commit:
```
93ad624 ci: Bump version to 0.15.0 for production release [skip ci]
...
b6a62a0 be specific about pm2 processes <-- Fix commit included
```
### Current Workflow PM2 Commands
**Production Deploy (`deploy-to-prod.yml` line 170)**:
```javascript
const prodProcesses = [
'flyer-crawler-api',
'flyer-crawler-worker',
'flyer-crawler-analytics-worker',
];
list.forEach((p) => {
if (
(p.pm2_env.status === 'errored' || p.pm2_env.status === 'stopped') &&
prodProcesses.includes(p.name)
) {
exec('pm2 delete ' + p.pm2_env.pm_id);
}
});
```
**Test Deploy (`deploy-to-test.yml` line 100)**:
```javascript
list.forEach((p) => {
if (p.name && p.name.endsWith('-test')) {
exec('pm2 delete ' + p.pm2_env.pm_id);
}
});
```
Both implementations have proper name filtering and should NOT affect non-flyer-crawler processes.
---
## Discrepancy Analysis
### Key Mystery
**If the fixes are in place, why did ALL processes get killed?**
### Possible Explanations
#### 1. Workflow Version Mismatch (HIGH PROBABILITY)
**Hypothesis**: Gitea runner cached an older version of the workflow file.
- Gitea Actions may cache workflow definitions
- The runner might have executed an older version without the fix
- Need to verify: What version of `deploy-to-prod.yml` actually executed?
**Investigation Required**:
- Check Gitea workflow execution logs for actual script content
- Verify runner workflow caching behavior
- Compare executed workflow vs repository version
#### 2. Concurrent Workflow Execution (MEDIUM PROBABILITY)
**Hypothesis**: Another workflow ran simultaneously with destructive PM2 commands.
Workflows with potential issues:
- `manual-db-reset-prod.yml` - Does NOT restart PM2 (schema reset only)
- `manual-redis-flush-prod.yml` - Does NOT touch PM2
- Test deployment concurrent with prod deployment
**Investigation Required**:
- Check Gitea Actions history for concurrent workflow runs
- Review timestamps of all workflow executions on 2026-02-17
#### 3. Manual SSH Command (MEDIUM PROBABILITY)
**Hypothesis**: Someone SSH'd to the server and ran `pm2 stop all` or `pm2 delete all` manually.
**Investigation Required**:
- Check server shell history (if available)
- Review any maintenance windows or manual interventions
- Ask team members about manual actions
#### 4. PM2 Internal Issue (LOW PROBABILITY)
**Hypothesis**: PM2 daemon crash or corruption caused all processes to stop.
**Investigation Required**:
- Check PM2 daemon logs on server
- Look for OOM killer events in system logs
- Check disk space issues during deployment
#### 5. Script Execution Error (LOW PROBABILITY)
**Hypothesis**: JavaScript parsing error caused the filtering logic to be bypassed.
**Investigation Required**:
- Review workflow execution logs for JavaScript errors
- Test the inline Node.js scripts locally
- Check for shell escaping issues
---
## Documentation/Code Gaps Identified
### CLAUDE.md Documentation
The PM2 isolation rules are documented in `CLAUDE.md`, but:
- Documentation uses `pm2 restart all` in the Quick Reference table (for dev container - acceptable)
- Multiple docs still reference `pm2 restart all` without environment context
- No incident response runbook for PM2 issues
### Workflow Gaps
1. **No Workflow Audit Trail**: No logging of which exact workflow version executed
2. **No Pre-deployment Verification**: Workflows don't log PM2 state before modifications
3. **No Cross-Application Impact Assessment**: No mechanism to detect/warn about other apps
---
## Next Steps for Root Cause Analysis
### Immediate (Priority 1)
1. [ ] Retrieve Gitea Actions execution logs for v0.15.0 deployment
2. [ ] Extract actual executed workflow content from logs
3. [ ] Check for concurrent workflow executions on 2026-02-17
4. [ ] Review server PM2 daemon logs around incident time
### Short-term (Priority 2)
5. [ ] Implement pre-deployment PM2 state logging in workflows
6. [ ] Add workflow version hash logging for audit trail
7. [ ] Create incident response runbook for PM2/deployment issues
### Long-term (Priority 3)
8. [ ] Evaluate PM2 namespacing for complete process isolation
9. [ ] Consider separate PM2 daemon per application
10. [ ] Implement deployment monitoring/alerting
---
## Related Documentation
- [CLAUDE.md - PM2 Process Isolation](../../../CLAUDE.md) (Critical Rules section)
- [ADR-014: Containerization and Deployment Strategy](../adr/0014-containerization-and-deployment-strategy.md)
- [Deployment Guide](./DEPLOYMENT.md)
- Workflow files in `.gitea/workflows/`
---
## Appendix: Commit Timeline
```
93ad624 ci: Bump version to 0.15.0 for production release [skip ci] <-- v0.15.0 release
7dd4f21 ci: Bump version to 0.14.4 [skip ci]
174b637 even more typescript fixes
4f80baf ci: Bump version to 0.14.3 [skip ci]
8450b5e Generate TSOA Spec and Routes
e4d830a ci: Bump version to 0.14.2 [skip ci]
b6a62a0 be specific about pm2 processes <-- PM2 fix commit
2d2cd52 Massive Dependency Modernization Project
```
---
## Revision History
| Date | Author | Change |
| ---------- | ------------------ | ----------------------- |
| 2026-02-17 | Investigation Team | Initial incident report |

View File

@@ -2,10 +2,47 @@
Aggregates logs from PostgreSQL, PM2, Redis, NGINX; forwards errors to Bugsink.
**Last verified**: 2026-01-28
**Related documentation**:
- [ADR-050: PostgreSQL Function Observability](../adr/0050-postgresql-function-observability.md)
- [ADR-015: Error Tracking and Observability](../adr/0015-error-tracking-and-observability.md)
- [Monitoring Guide](MONITORING.md)
- [Logstash Troubleshooting Runbook](LOGSTASH-TROUBLESHOOTING.md)
---
## Quick Reference
### Bugsink Project Routing
| Source Type | Environment | Bugsink Project | Project ID |
| -------------- | ----------- | -------------------- | ---------- |
| PM2 API/Worker | Dev | Backend API (Dev) | 1 |
| PostgreSQL | Dev | Backend API (Dev) | 1 |
| Frontend JS | Dev | Frontend (Dev) | 2 |
| Redis/NGINX | Dev | Infrastructure (Dev) | 4 |
| PM2 API/Worker | Production | Backend API (Prod) | 1 |
| PostgreSQL | Production | Backend API (Prod) | 1 |
| PM2 API/Worker | Test | Backend API (Test) | 3 |
### Key DSN Keys (Dev Container)
| Project | DSN Key |
| -------------------- | ---------------------------------- |
| Backend API (Dev) | `cea01396c56246adb5878fa5ee6b1d22` |
| Frontend (Dev) | `d92663cb73cf4145b677b84029e4b762` |
| Infrastructure (Dev) | `14e8791da3d347fa98073261b596cab9` |
---
## Configuration
**Primary config**: `/etc/logstash/conf.d/bugsink.conf`
**Dev container config**: `docker/logstash/bugsink.conf`
### Related Files
| Path | Purpose |
@@ -89,6 +126,34 @@ MSYS_NO_PATHCONV=1 podman exec flyer-crawler-dev ls -la /var/log/redis/
## Troubleshooting
### Decision Tree: Logs Not Appearing in Bugsink
```text
Errors not showing in Bugsink?
|
+-- Logstash running?
| |
| +-- No --> systemctl start logstash
| +-- Yes --> Check pipeline stats
| |
| +-- Events in = 0?
| | |
| | +-- Log files exist? --> ls /var/log/pm2/*.log
| | +-- Permissions OK? --> groups logstash
| |
| +-- Events filtered = high?
| | |
| | +-- Grok failures --> Check log format matches pattern
| |
| +-- Events out but no Bugsink?
| |
| +-- 403 error --> Wrong DSN key
| +-- 500 error --> Invalid event format (check sentry_level)
| +-- Connection refused --> Bugsink not running
```
### Common Issues Table
| Issue | Check | Solution |
| --------------------- | ---------------- | ---------------------------------------------------------------------------------------------- |
| No Bugsink errors | Logstash running | `systemctl status logstash` |
@@ -103,6 +168,25 @@ MSYS_NO_PATHCONV=1 podman exec flyer-crawler-dev ls -la /var/log/redis/
| High disk usage | Log rotation | Verify `/etc/logrotate.d/logstash` configured |
| varchar(7) error | Level validation | Add Ruby filter to validate/normalize `sentry_level` before output |
### Expected Output Examples
**Successful Logstash pipeline stats**:
```json
{
"in": 1523,
"out": 1520,
"filtered": 1520,
"queue_push_duration_in_millis": 45
}
```
**Healthy Bugsink HTTP response**:
```json
{ "id": "a1b2c3d4e5f6..." }
```
## Related Documentation
- **Dev Container Guide**: [DEV-CONTAINER.md](../development/DEV-CONTAINER.md) - PM2 and log aggregation in dev

View File

@@ -2,6 +2,16 @@
This runbook provides step-by-step diagnostics and solutions for common Logstash issues in the PostgreSQL observability pipeline (ADR-050).
**Last verified**: 2026-01-28
**Related documentation**:
- [ADR-050: PostgreSQL Function Observability](../adr/0050-postgresql-function-observability.md)
- [Logstash Quick Reference](LOGSTASH-QUICK-REF.md)
- [Monitoring Guide](MONITORING.md)
---
## Quick Reference
| Symptom | Most Likely Cause | Quick Check |

View File

@@ -2,6 +2,72 @@
This guide covers all aspects of monitoring the Flyer Crawler application across development, test, and production environments.
**Last verified**: 2026-01-28
**Related documentation**:
- [ADR-015: Error Tracking and Observability](../adr/0015-error-tracking-and-observability.md)
- [ADR-020: Health Checks](../adr/0020-health-checks-and-liveness-readiness-probes.md)
- [ADR-050: PostgreSQL Function Observability](../adr/0050-postgresql-function-observability.md)
- [Logstash Quick Reference](LOGSTASH-QUICK-REF.md)
- [Deployment Guide](DEPLOYMENT.md)
---
## Quick Reference
### Monitoring URLs
| Service | Production URL | Dev Container URL |
| ------------ | ------------------------------------------------------- | ---------------------------------------- |
| Health Check | `https://flyer-crawler.projectium.com/api/health/ready` | `http://localhost:3001/api/health/ready` |
| Bugsink | `https://bugsink.projectium.com` | `https://localhost:8443` |
| Bull Board | `https://flyer-crawler.projectium.com/api/admin/jobs` | `http://localhost:3001/api/admin/jobs` |
### Quick Diagnostic Commands
```bash
# Check all services at once (production)
curl -s https://flyer-crawler.projectium.com/api/health/ready | jq '.data.services'
# Dev container health check
podman exec flyer-crawler-dev curl -s http://localhost:3001/api/health/ready | jq .
# PM2 process overview
pm2 list
# Recent errors in Bugsink (via MCP)
# mcp__bugsink__list_issues --project_id 1 --status unresolved
```
### Monitoring Decision Tree
```text
Application seems slow or unresponsive?
|
+-- Check health endpoint first
| |
| +-- Returns unhealthy?
| | |
| | +-- Database unhealthy --> Check DB pool, connections
| | +-- Redis unhealthy --> Check Redis memory, connection
| | +-- Storage unhealthy --> Check disk space, permissions
| |
| +-- Returns healthy but slow?
| |
| +-- Check PM2 memory/CPU usage
| +-- Check database slow query log
| +-- Check Redis queue depth
|
+-- Health endpoint not responding?
|
+-- Check PM2 status --> Process crashed?
+-- Check NGINX --> 502 errors?
+-- Check network --> Firewall/DNS issues?
```
---
## Table of Contents
1. [Health Checks](#health-checks)
@@ -276,10 +342,10 @@ Dev Container (in `.mcp.json`):
Bugsink 2.0.11 does not have a UI for API tokens. Create via Django management command.
**Production**:
**Production** (user executes on server):
```bash
ssh root@projectium.com "cd /opt/bugsink && bugsink-manage create_auth_token"
cd /opt/bugsink && bugsink-manage create_auth_token
```
**Dev Container**:
@@ -294,7 +360,7 @@ The command outputs a 40-character hex token.
**Error Anatomy**:
```
```text
TypeError: Cannot read properties of undefined (reading 'map')
├── Exception Type: TypeError
├── Message: Cannot read properties of undefined (reading 'map')
@@ -357,7 +423,7 @@ Logstash aggregates logs from multiple sources and forwards errors to Bugsink (A
### Architecture
```
```text
Log Sources Logstash Outputs
┌──────────────┐ ┌─────────────┐ ┌─────────────┐
│ PostgreSQL │──────────────│ │───────────│ Bugsink │
@@ -388,11 +454,9 @@ Log Sources Logstash Outputs
### Pipeline Status
**Check Logstash Service**:
**Check Logstash Service** (user executes on server):
```bash
ssh root@projectium.com
# Service status
systemctl status logstash
@@ -485,9 +549,11 @@ PM2 manages the Node.js application processes in production.
### Basic Commands
> **Note**: These commands are for the user to execute on the server. Claude Code provides commands but cannot run them directly.
```bash
ssh root@projectium.com
su - gitea-runner # PM2 runs under this user
# Switch to gitea-runner user (PM2 runs under this user)
su - gitea-runner
# List all processes
pm2 list
@@ -520,7 +586,7 @@ pm2 stop flyer-crawler-api
**Healthy Process**:
```
```text
┌─────────────────────┬────┬─────────┬─────────┬───────┬────────┬─────────┬──────────┐
│ Name │ id │ mode │ status │ cpu │ mem │ uptime │ restarts │
├─────────────────────┼────┼─────────┼─────────┼───────┼────────┼─────────┼──────────┤
@@ -833,29 +899,28 @@ Configure alerts in your monitoring tool (UptimeRobot, Datadog, etc.):
2. Review during business hours
3. Create Gitea issue for tracking
### Quick Diagnostic Commands
### On-Call Diagnostic Commands
> **Note**: User executes these commands on the server. Claude Code provides commands but cannot run them directly.
```bash
# Full system health check
ssh root@projectium.com << 'EOF'
echo "=== Service Status ==="
# Service status checks
systemctl status pm2-gitea-runner --no-pager
systemctl status logstash --no-pager
systemctl status redis --no-pager
systemctl status postgresql --no-pager
echo "=== PM2 Processes ==="
# PM2 processes (run as gitea-runner)
su - gitea-runner -c "pm2 list"
echo "=== Disk Space ==="
# Disk space
df -h / /var
echo "=== Memory ==="
# Memory
free -h
echo "=== Recent Errors ==="
# Recent errors
journalctl -p err -n 20 --no-pager
EOF
```
### Runbook Quick Reference

View File

@@ -0,0 +1,818 @@
# PM2 Incident Response Runbook
**Purpose**: Step-by-step procedures for responding to PM2 process isolation incidents on the projectium.com server.
**Audience**: On-call responders, system administrators, developers with server access.
**Last updated**: 2026-02-17
**Related documentation**:
- [CLAUDE.md - PM2 Process Isolation Rules](../../CLAUDE.md)
- [Incident Report: 2026-02-17](INCIDENT-2026-02-17-PM2-PROCESS-KILL.md)
- [Monitoring Guide](MONITORING.md)
- [Deployment Guide](DEPLOYMENT.md)
---
## Table of Contents
1. [Quick Reference](#quick-reference)
2. [Detection](#detection)
3. [Initial Assessment](#initial-assessment)
4. [Immediate Response](#immediate-response)
5. [Process Restoration](#process-restoration)
6. [Root Cause Investigation](#root-cause-investigation)
7. [Communication Templates](#communication-templates)
8. [Prevention Measures](#prevention-measures)
9. [Contact Information](#contact-information)
10. [Post-Incident Review](#post-incident-review)
---
## Quick Reference
### PM2 Process Inventory
| Application | Environment | Process Names | Config File | Directory |
| ------------- | ----------- | -------------------------------------------------------------------------------------------- | --------------------------- | -------------------------------------------- |
| Flyer Crawler | Production | `flyer-crawler-api`, `flyer-crawler-worker`, `flyer-crawler-analytics-worker` | `ecosystem.config.cjs` | `/var/www/flyer-crawler.projectium.com` |
| Flyer Crawler | Test | `flyer-crawler-api-test`, `flyer-crawler-worker-test`, `flyer-crawler-analytics-worker-test` | `ecosystem-test.config.cjs` | `/var/www/flyer-crawler-test.projectium.com` |
| Stock Alert | Production | `stock-alert-*` | (varies) | `/var/www/stock-alert.projectium.com` |
### Critical Commands
```bash
# Check PM2 status
pm2 list
# Check specific process
pm2 show flyer-crawler-api
# View recent logs
pm2 logs --lines 50
# Restart specific processes (SAFE)
pm2 restart flyer-crawler-api flyer-crawler-worker flyer-crawler-analytics-worker
# DO NOT USE (affects ALL apps)
# pm2 restart all <-- DANGEROUS
# pm2 stop all <-- DANGEROUS
# pm2 delete all <-- DANGEROUS
```
### Severity Classification
| Severity | Criteria | Response Time | Example |
| ----------------- | --------------------------------------------- | ------------------- | ----------------------------------------------- |
| **P1 - Critical** | Multiple applications down, production impact | Immediate (< 5 min) | All PM2 processes killed |
| **P2 - High** | Single application down, production impact | < 15 min | Flyer Crawler prod down, Stock Alert unaffected |
| **P3 - Medium** | Test environment only, no production impact | < 1 hour | Test processes killed, production unaffected |
---
## Detection
### How to Identify a PM2 Incident
**Automated Indicators**:
- Health check failures on `/api/health/ready`
- Monitoring alerts (UptimeRobot, etc.)
- Bugsink showing connection errors
- NGINX returning 502 Bad Gateway
**User-Reported Symptoms**:
- "The site is down"
- "I can't log in"
- "Pages are loading slowly then timing out"
- "I see a 502 error"
**Manual Discovery**:
```bash
# SSH to server
ssh gitea-runner@projectium.com
# Check if PM2 is running
pm2 list
# Expected output shows processes
# If empty or all errored = incident
```
### Incident Signature: Process Isolation Violation
When a PM2 incident is caused by process isolation failure, you will see:
```text
# Expected state (normal):
+-----------------------------------+----+-----+---------+-------+
| App name | id |mode | status | cpu |
+-----------------------------------+----+-----+---------+-------+
| flyer-crawler-api | 0 |clust| online | 0% |
| flyer-crawler-worker | 1 |fork | online | 0% |
| flyer-crawler-analytics-worker | 2 |fork | online | 0% |
| flyer-crawler-api-test | 3 |fork | online | 0% |
| flyer-crawler-worker-test | 4 |fork | online | 0% |
| flyer-crawler-analytics-worker-test| 5 |fork | online | 0% |
| stock-alert-api | 6 |fork | online | 0% |
+-----------------------------------+----+-----+---------+-------+
# Incident state (isolation violation):
# All processes missing or errored - not just one app
+-----------------------------------+----+-----+---------+-------+
| App name | id |mode | status | cpu |
+-----------------------------------+----+-----+---------+-------+
# (empty or all processes errored/stopped)
+-----------------------------------+----+-----+---------+-------+
```
---
## Initial Assessment
### Step 1: Gather Information (2 minutes)
Run these commands and capture output:
```bash
# 1. Check PM2 status
pm2 list
# 2. Check PM2 daemon status
pm2 ping
# 3. Check recent PM2 logs
pm2 logs --lines 20 --nostream
# 4. Check system status
systemctl status pm2-gitea-runner --no-pager
# 5. Check disk space
df -h /
# 6. Check memory
free -h
# 7. Check recent deployments (in app directory)
cd /var/www/flyer-crawler.projectium.com
git log --oneline -5
```
### Step 2: Determine Scope
| Question | Command | Impact Level |
| ------------------------ | ---------------------------------------------------------------- | ------------------------------- |
| How many apps affected? | `pm2 list` | Count missing/errored processes |
| Is production down? | `curl https://flyer-crawler.projectium.com/api/health/ping` | Yes/No |
| Is test down? | `curl https://flyer-crawler-test.projectium.com/api/health/ping` | Yes/No |
| Are other apps affected? | `pm2 list \| grep stock-alert` | Yes/No |
### Step 3: Classify Severity
```text
Decision Tree:
Production app(s) down?
|
+-- YES: Multiple apps affected?
| |
| +-- YES --> P1 CRITICAL (all apps down)
| |
| +-- NO --> P2 HIGH (single app down)
|
+-- NO: Test environment only?
|
+-- YES --> P3 MEDIUM
|
+-- NO --> Investigate further
```
### Step 4: Document Initial State
Capture this information before making any changes:
```bash
# Save PM2 state to file
pm2 jlist > /tmp/pm2-incident-$(date +%Y%m%d-%H%M%S).json
# Save system state
{
echo "=== PM2 List ==="
pm2 list
echo ""
echo "=== Disk Space ==="
df -h
echo ""
echo "=== Memory ==="
free -h
echo ""
echo "=== Recent Git Commits ==="
cd /var/www/flyer-crawler.projectium.com && git log --oneline -5
} > /tmp/incident-state-$(date +%Y%m%d-%H%M%S).txt
```
---
## Immediate Response
### Priority 1: Stop Ongoing Deployments
If a deployment is currently running:
1. Check Gitea Actions for running workflows
2. Cancel any in-progress deployment workflows
3. Do NOT start new deployments until incident resolved
### Priority 2: Assess Which Processes Are Down
```bash
# Get list of processes and their status
pm2 list
# Check which processes exist but are errored/stopped
pm2 jlist | jq '.[] | {name, status: .pm2_env.status}'
```
### Priority 3: Establish Order of Restoration
Restore in this order (production first, critical path first):
| Priority | Process | Rationale |
| -------- | ------------------------------------- | ------------------------------------ |
| 1 | `flyer-crawler-api` | Production API - highest user impact |
| 2 | `flyer-crawler-worker` | Production background jobs |
| 3 | `flyer-crawler-analytics-worker` | Production analytics |
| 4 | `stock-alert-*` | Other production apps |
| 5 | `flyer-crawler-api-test` | Test environment |
| 6 | `flyer-crawler-worker-test` | Test background jobs |
| 7 | `flyer-crawler-analytics-worker-test` | Test analytics |
---
## Process Restoration
### Scenario A: Flyer Crawler Production Processes Missing
```bash
# Navigate to production directory
cd /var/www/flyer-crawler.projectium.com
# Start production processes
pm2 start ecosystem.config.cjs
# Verify processes started
pm2 list
# Check health endpoint
curl -s http://localhost:3001/api/health/ready | jq .
```
### Scenario B: Flyer Crawler Test Processes Missing
```bash
# Navigate to test directory
cd /var/www/flyer-crawler-test.projectium.com
# Start test processes
pm2 start ecosystem-test.config.cjs
# Verify processes started
pm2 list
# Check health endpoint
curl -s http://localhost:3002/api/health/ready | jq .
```
### Scenario C: Stock Alert Processes Missing
```bash
# Navigate to stock-alert directory
cd /var/www/stock-alert.projectium.com
# Start processes (adjust config file name as needed)
pm2 start ecosystem.config.cjs
# Verify processes started
pm2 list
```
### Scenario D: All Processes Missing
Execute restoration in priority order:
```bash
# 1. Flyer Crawler Production (highest priority)
cd /var/www/flyer-crawler.projectium.com
pm2 start ecosystem.config.cjs
# Verify production is healthy before continuing
curl -s http://localhost:3001/api/health/ready | jq '.data.status'
# Should return "healthy"
# 2. Stock Alert Production
cd /var/www/stock-alert.projectium.com
pm2 start ecosystem.config.cjs
# 3. Flyer Crawler Test (lower priority)
cd /var/www/flyer-crawler-test.projectium.com
pm2 start ecosystem-test.config.cjs
# 4. Save PM2 process list
pm2 save
# 5. Final verification
pm2 list
```
### Health Check Verification
After restoration, verify each application:
**Flyer Crawler Production**:
```bash
# API health
curl -s https://flyer-crawler.projectium.com/api/health/ready | jq '.data.status'
# Expected: "healthy"
# Check all services
curl -s https://flyer-crawler.projectium.com/api/health/ready | jq '.data.services'
```
**Flyer Crawler Test**:
```bash
curl -s https://flyer-crawler-test.projectium.com/api/health/ready | jq '.data.status'
```
**Stock Alert**:
```bash
# Adjust URL as appropriate for stock-alert
curl -s https://stock-alert.projectium.com/api/health/ready | jq '.data.status'
```
### Verification Checklist
After restoration, confirm:
- [ ] `pm2 list` shows all expected processes as `online`
- [ ] Production health check returns `healthy`
- [ ] Test health check returns `healthy` (if applicable)
- [ ] No processes showing high restart count
- [ ] No processes showing `errored` or `stopped` status
- [ ] PM2 process list saved: `pm2 save`
---
## Root Cause Investigation
### Step 1: Check Workflow Execution Logs
```bash
# Find recent Gitea Actions runs
# (Access via Gitea web UI: Repository > Actions > Recent Runs)
# Look for these workflows:
# - deploy-to-prod.yml
# - deploy-to-test.yml
# - manual-deploy-major.yml
# - manual-db-restore.yml
```
### Step 2: Check PM2 Daemon Logs
```bash
# PM2 daemon logs
cat ~/.pm2/pm2.log | tail -100
# PM2 process-specific logs
ls -la ~/.pm2/logs/
# Recent API logs
tail -100 ~/.pm2/logs/flyer-crawler-api-out.log
tail -100 ~/.pm2/logs/flyer-crawler-api-error.log
```
### Step 3: Check System Logs
```bash
# System journal for PM2 service
journalctl -u pm2-gitea-runner -n 100 --no-pager
# Kernel messages (OOM killer, etc.)
journalctl -k -n 50 --no-pager | grep -i "killed\|oom\|memory"
# Authentication logs (unauthorized access)
tail -50 /var/log/auth.log
```
### Step 4: Git History Analysis
```bash
# Recent commits to deployment workflows
cd /var/www/flyer-crawler.projectium.com
git log --oneline -20 -- .gitea/workflows/
# Check what changed in PM2 configs
git log --oneline -10 -- ecosystem.config.cjs ecosystem-test.config.cjs
# Diff against last known good state
git diff <last-good-commit> -- .gitea/workflows/ ecosystem*.cjs
```
### Step 5: Timing Correlation
Create a timeline:
```text
| Time (UTC) | Event | Source |
|------------|-------|--------|
| XX:XX | Last successful health check | Monitoring |
| XX:XX | Deployment workflow started | Gitea Actions |
| XX:XX | First failed health check | Monitoring |
| XX:XX | Incident detected | User report / Alert |
| XX:XX | Investigation started | On-call |
```
### Common Root Causes
| Root Cause | Evidence | Prevention |
| ---------------------------- | -------------------------------------- | ---------------------------- |
| `pm2 stop all` in workflow | Workflow logs show "all" command | Use explicit process names |
| `pm2 delete all` in workflow | Empty PM2 list after deploy | Use whitelist-based deletion |
| OOM killer | `journalctl -k` shows "Killed process" | Increase memory limits |
| Disk space exhaustion | `df -h` shows 100% | Log rotation, cleanup |
| Manual intervention | Shell history shows pm2 commands | Document all manual actions |
| Concurrent deployments | Multiple workflows at same time | Implement deployment locks |
| Workflow caching issue | Old workflow version executed | Force workflow refresh |
---
## Communication Templates
### Incident Notification (Internal)
```text
Subject: [P1 INCIDENT] PM2 Process Isolation Failure - Multiple Apps Down
Status: INVESTIGATING
Time Detected: YYYY-MM-DD HH:MM UTC
Affected Systems: [flyer-crawler-prod, stock-alert-prod, ...]
Summary:
All PM2 processes on projectium.com server were terminated unexpectedly.
Multiple production applications are currently down.
Impact:
- flyer-crawler.projectium.com: DOWN
- stock-alert.projectium.com: DOWN
- [other affected apps]
Current Actions:
- Restoring critical production processes
- Investigating root cause
Next Update: In 15 minutes or upon status change
Incident Commander: [Name]
```
### Status Update Template
```text
Subject: [P1 INCIDENT] PM2 Process Isolation Failure - UPDATE #N
Status: [INVESTIGATING | IDENTIFIED | RESTORING | RESOLVED]
Time: YYYY-MM-DD HH:MM UTC
Progress Since Last Update:
- [Action taken]
- [Discovery made]
- [Process restored]
Current State:
- flyer-crawler.projectium.com: [UP|DOWN]
- stock-alert.projectium.com: [UP|DOWN]
Root Cause: [If identified]
Next Steps:
- [Planned action]
ETA to Resolution: [If known]
Next Update: In [X] minutes
```
### Resolution Notification
```text
Subject: [RESOLVED] PM2 Process Isolation Failure
Status: RESOLVED
Time Resolved: YYYY-MM-DD HH:MM UTC
Total Downtime: X minutes
Summary:
All PM2 processes have been restored. Services are operating normally.
Root Cause:
[Brief description of what caused the incident]
Impact Summary:
- flyer-crawler.projectium.com: Down for X minutes
- stock-alert.projectium.com: Down for X minutes
- Estimated user impact: [description]
Immediate Actions Taken:
1. [Action]
2. [Action]
Follow-up Actions:
1. [ ] [Preventive measure] - Owner: [Name] - Due: [Date]
2. [ ] Post-incident review scheduled for [Date]
Post-Incident Review: [Link or scheduled time]
```
---
## Prevention Measures
### Pre-Deployment Checklist
Before triggering any deployment:
- [ ] Review workflow file for PM2 commands
- [ ] Confirm no `pm2 stop all`, `pm2 delete all`, or `pm2 restart all`
- [ ] Verify process names are explicitly listed
- [ ] Check for concurrent deployment risks
- [ ] Confirm recent workflow changes were reviewed
### Workflow Review Checklist
When reviewing deployment workflow changes:
- [ ] All PM2 `stop` commands use explicit process names
- [ ] All PM2 `delete` commands filter by process name pattern
- [ ] All PM2 `restart` commands use explicit process names
- [ ] Test deployments filter by `-test` suffix
- [ ] Production deployments use whitelist array
**Safe Patterns**:
```javascript
// SAFE: Explicit process names (production)
const prodProcesses = [
'flyer-crawler-api',
'flyer-crawler-worker',
'flyer-crawler-analytics-worker',
];
list.forEach((p) => {
if (
(p.pm2_env.status === 'errored' || p.pm2_env.status === 'stopped') &&
prodProcesses.includes(p.name)
) {
exec('pm2 delete ' + p.pm2_env.pm_id);
}
});
// SAFE: Pattern-based filtering (test)
list.forEach((p) => {
if (p.name && p.name.endsWith('-test')) {
exec('pm2 delete ' + p.pm2_env.pm_id);
}
});
```
**Dangerous Patterns** (NEVER USE):
```bash
# DANGEROUS - affects ALL applications
pm2 stop all
pm2 delete all
pm2 restart all
# DANGEROUS - no name filtering
pm2 delete $(pm2 jlist | jq -r '.[] | select(.pm2_env.status == "errored") | .pm_id')
```
### PM2 Configuration Validation
Before deploying PM2 config changes:
```bash
# Test configuration locally
cd /var/www/flyer-crawler.projectium.com
node -e "console.log(JSON.stringify(require('./ecosystem.config.cjs'), null, 2))"
# Verify process names
node -e "require('./ecosystem.config.cjs').apps.forEach(a => console.log(a.name))"
# Expected output should match documented process names
```
### Deployment Monitoring
After every deployment:
```bash
# Immediate verification
pm2 list
# Check no unexpected processes were affected
pm2 list | grep -v flyer-crawler
# Should still show other apps (e.g., stock-alert)
# Health check
curl -s https://flyer-crawler.projectium.com/api/health/ready | jq '.data.status'
```
---
## Contact Information
### On-Call Escalation
| Role | Contact | When to Escalate |
| ----------------- | -------------- | ----------------------------------- |
| Primary On-Call | [Name/Channel] | First responder |
| Secondary On-Call | [Name/Channel] | If primary unavailable after 10 min |
| Engineering Lead | [Name/Channel] | P1 incidents > 30 min |
| Product Owner | [Name/Channel] | User communication needed |
### External Dependencies
| Service | Support Channel | When to Contact |
| --------------- | --------------- | ----------------------- |
| Server Provider | [Contact info] | Hardware/network issues |
| DNS Provider | [Contact info] | DNS resolution failures |
| SSL Certificate | [Contact info] | Certificate issues |
### Communication Channels
| Channel | Purpose |
| -------------- | -------------------------- |
| `#incidents` | Real-time incident updates |
| `#deployments` | Deployment announcements |
| `#engineering` | Technical discussion |
| Email list | Formal notifications |
---
## Post-Incident Review
### Incident Report Template
```markdown
# Incident Report: [Title]
## Overview
| Field | Value |
| ------------------ | ----------------- |
| Date | YYYY-MM-DD |
| Duration | X hours Y minutes |
| Severity | P1/P2/P3 |
| Incident Commander | [Name] |
| Status | Resolved |
## Timeline
| Time (UTC) | Event |
| ---------- | ------------------- |
| HH:MM | [Event description] |
| HH:MM | [Event description] |
## Impact
- **Users affected**: [Number/description]
- **Revenue impact**: [If applicable]
- **SLA impact**: [If applicable]
## Root Cause
[Detailed technical explanation]
## Resolution
[What was done to resolve the incident]
## Contributing Factors
1. [Factor]
2. [Factor]
## Action Items
| Action | Owner | Due Date | Status |
| -------- | ------ | -------- | ------ |
| [Action] | [Name] | [Date] | [ ] |
## Lessons Learned
### What Went Well
- [Item]
### What Could Be Improved
- [Item]
## Appendix
- Link to monitoring data
- Link to relevant logs
- Link to workflow runs
```
### Lessons Learned Format
Use "5 Whys" technique:
```text
Problem: All PM2 processes were killed during deployment
Why 1: The deployment workflow ran `pm2 delete all`
Why 2: The workflow used an outdated version of the script
Why 3: Gitea runner cached the old workflow file
Why 4: No mechanism to verify workflow version before execution
Why 5: Workflow versioning and audit trail not implemented
Root Cause: Lack of workflow versioning and execution verification
Preventive Measure: Implement workflow hash logging and pre-execution verification
```
### Action Items Tracking
Create Gitea issues for each action item:
```bash
# Example using Gitea CLI or API
gh issue create --title "Implement PM2 state logging in deployment workflows" \
--body "Related to incident YYYY-MM-DD. Add pre-deployment PM2 state capture." \
--label "incident-follow-up,priority:high"
```
Track action items in a central location:
| Issue # | Action | Owner | Due | Status |
| ------- | -------------------------------- | ------ | ------ | ------ |
| #123 | Add PM2 state logging | [Name] | [Date] | Open |
| #124 | Implement workflow version hash | [Name] | [Date] | Open |
| #125 | Create deployment lock mechanism | [Name] | [Date] | Open |
---
## Appendix: PM2 Command Reference
### Safe Commands
```bash
# Status and monitoring
pm2 list
pm2 show <process-name>
pm2 monit
pm2 logs <process-name>
# Restart specific processes
pm2 restart flyer-crawler-api
pm2 restart flyer-crawler-api flyer-crawler-worker flyer-crawler-analytics-worker
# Reload (zero-downtime, cluster mode only)
pm2 reload flyer-crawler-api
# Start from config
pm2 start ecosystem.config.cjs
pm2 start ecosystem.config.cjs --only flyer-crawler-api
```
### Dangerous Commands (Use With Caution)
```bash
# CAUTION: These affect ALL processes
pm2 stop all # Stops every PM2 process
pm2 restart all # Restarts every PM2 process
pm2 delete all # Removes every PM2 process
# CAUTION: Modifies saved process list
pm2 save # Overwrites saved process list
pm2 resurrect # Restores from saved list
# CAUTION: Affects PM2 daemon
pm2 kill # Kills PM2 daemon and all processes
pm2 update # Updates PM2 in place (may cause brief outage)
```
---
## Revision History
| Date | Author | Change |
| ---------- | ---------------------- | ------------------------ |
| 2026-02-17 | Incident Response Team | Initial runbook creation |

View File

@@ -0,0 +1,161 @@
# Unit Test Fix Plan: Error Log Path Mismatches
**Date**: 2026-01-27
**Type**: Technical Implementation Plan
**Related**: [ADR-008: API Versioning Strategy](../adr/0008-api-versioning-strategy.md)
**Status**: Ready for Implementation
---
## Problem Statement
16 unit tests fail due to error log message assertions expecting versioned paths (`/api/v1/`) while route handlers emit hardcoded unversioned paths (`/api/`).
**Failure Pattern**:
```text
AssertionError: expected "Error PUT /api/users/profile" to contain "/api/v1/users/profile"
```
**Scope**: All failures are `toContain` assertions on `logger.error()` call arguments.
---
## Root Cause Analysis
| Layer | Behavior | Issue |
| ------------------ | ----------------------------------------------------- | ------------------- |
| Route Registration | `server.ts` mounts at `/api/v1/` | Correct |
| Request Path | `req.path` returns `/users/profile` (router-relative) | No version info |
| Error Handlers | Hardcode `"Error PUT /api/users/profile"` | Version mismatch |
| Test Assertions | Expect `"/api/v1/users/profile"` | Correct expectation |
**Root Cause**: Error log statements use template literals with hardcoded `/api/` prefix instead of `req.originalUrl` which contains the full versioned path.
**Example**:
```typescript
// Current (broken)
logger.error(`Error PUT /api/users/profile: ${err}`);
// Expected
logger.error(`Error PUT ${req.originalUrl}: ${err}`);
// Output: "Error PUT /api/v1/users/profile: ..."
```
---
## Solution Approach
Replace hardcoded path strings with `req.originalUrl` in all error log statements.
### Express Request Properties Reference
| Property | Example Value | Use Case |
| ----------------- | ------------------------------- | ----------------------------- |
| `req.originalUrl` | `/api/v1/users/profile?foo=bar` | Full URL with version + query |
| `req.path` | `/profile` | Router-relative path only |
| `req.baseUrl` | `/api/v1/users` | Mount point |
**Decision**: Use `req.originalUrl` for error logging to capture complete request context.
---
## Implementation Plan
### Affected Files
| File | Error Statements | Methods |
| ------------------------------- | ---------------- | ---------------------------------------------------- |
| `src/routes/users.routes.ts` | 3 | `PUT /profile`, `POST /profile/password`, `DELETE /` |
| `src/routes/recipe.routes.ts` | 2 | `POST /import`, `POST /:id/fork` |
| `src/routes/receipts.routes.ts` | 2 | `POST /`, `PATCH /:id` |
| `src/routes/flyers.routes.ts` | 2 | `POST /`, `PUT /:id` |
**Total**: 9 error log statements across 4 route files
### Parallel Implementation Tasks
All 4 files can be modified independently:
**Task 1**: `users.routes.ts`
- Line patterns: `Error PUT /api/users/profile`, `Error POST /api/users/profile/password`, `Error DELETE /api/users`
- Change: Replace with `Error ${req.method} ${req.originalUrl}`
**Task 2**: `recipe.routes.ts`
- Line patterns: `Error POST /api/recipes/import`, `Error POST /api/recipes/:id/fork`
- Change: Replace with `Error ${req.method} ${req.originalUrl}`
**Task 3**: `receipts.routes.ts`
- Line patterns: `Error POST /api/receipts`, `Error PATCH /api/receipts/:id`
- Change: Replace with `Error ${req.method} ${req.originalUrl}`
**Task 4**: `flyers.routes.ts`
- Line patterns: `Error POST /api/flyers`, `Error PUT /api/flyers/:id`
- Change: Replace with `Error ${req.method} ${req.originalUrl}`
### Verification
```bash
podman exec -it flyer-crawler-dev npm run test:unit
```
**Expected**: 16 failures → 0 failures (3,391/3,391 passing)
---
## Test Files Affected
Tests that will pass after fix:
| Test File | Failing Tests |
| ------------------------- | ------------- |
| `users.routes.test.ts` | 6 |
| `recipe.routes.test.ts` | 4 |
| `receipts.routes.test.ts` | 3 |
| `flyers.routes.test.ts` | 3 |
---
## Expected Outcomes
| Metric | Before | After |
| ------------------ | ----------- | ------------------- |
| Unit test failures | 16 | 0 |
| Unit tests passing | 3,375/3,391 | 3,391/3,391 |
| Integration tests | 345/348 | 345/348 (unchanged) |
### Benefits
1. **Version-agnostic logging**: Error messages automatically reflect actual request URL
2. **Future-proof**: No changes needed when v2 API is introduced
3. **Debugging clarity**: Logs show exact URL including query parameters
4. **Consistency**: All error handlers follow same pattern
---
## Implementation Notes
### Pattern to Apply
**Before**:
```typescript
logger.error(`Error PUT /api/users/profile: ${error.message}`);
```
**After**:
```typescript
logger.error(`Error ${req.method} ${req.originalUrl}: ${error.message}`);
```
### Edge Cases
- `req.originalUrl` includes query string if present (acceptable for debugging)
- No sanitization needed as URL is from Express parsed request
- Works correctly with route parameters (`:id` becomes actual value)

View File

@@ -0,0 +1,849 @@
# ADR-024 Implementation Plan: Feature Flagging Strategy
**Date**: 2026-01-28
**Type**: Technical Implementation Plan
**Related**: [ADR-024: Feature Flagging Strategy](../adr/0024-feature-flagging-strategy.md), [ADR-007: Configuration and Secrets Management](../adr/0007-configuration-and-secrets-management.md)
**Status**: Ready for Implementation
---
## Project Overview
Implement a simple, configuration-based feature flag system that integrates with the existing Zod-validated configuration in `src/config/env.ts`. The system will support both backend and frontend feature flags through environment variables, with type-safe access patterns and helper utilities.
### Key Success Criteria
1. Feature flags accessible via type-safe API on both backend and frontend
2. Zero runtime overhead when flag is disabled (compile-time elimination where possible)
3. Consistent naming convention (environment variables and code access)
4. Graceful degradation (missing flag defaults to disabled)
5. Easy migration path to external service (Flagsmith/LaunchDarkly) in the future
6. Full test coverage with mocking utilities
### Estimated Total Effort
| Phase | Estimate |
| --------------------------------- | -------------- |
| Phase 1: Backend Infrastructure | 3-5 hours |
| Phase 2: Frontend Infrastructure | 2-3 hours |
| Phase 3: Documentation & Examples | 1-2 hours |
| **Total** | **6-10 hours** |
---
## Current State Analysis
### Backend Configuration (`src/config/env.ts`)
- Zod-based schema validation at startup
- Organized into logical groups (database, redis, auth, smtp, ai, etc.)
- Helper exports for service availability (`isSmtpConfigured`, `isAiConfigured`, etc.)
- Environment helpers (`isProduction`, `isTest`, `isDevelopment`)
- Fail-fast on invalid configuration
### Frontend Configuration (`src/config.ts`)
- Uses `import.meta.env` (Vite environment variables)
- Organized into sections (app, google, sentry)
- Boolean parsing for string env vars
- Type declarations in `src/vite-env.d.ts`
### Existing Patterns to Follow
```typescript
// Backend - service availability check pattern
export const isSmtpConfigured =
!!config.smtp.host && !!config.smtp.user && !!config.smtp.pass;
// Frontend - boolean parsing pattern
enabled: import.meta.env.VITE_SENTRY_ENABLED !== 'false',
```
---
## Task Breakdown
### Phase 1: Backend Feature Flag Infrastructure
#### [1.1] Define Feature Flag Schema in env.ts
**Complexity**: Low
**Estimate**: 30-45 minutes
**Dependencies**: None
**Parallelizable**: Yes
**Description**: Add a new `featureFlags` section to the Zod schema in `src/config/env.ts`.
**Acceptance Criteria**:
- [ ] New `featureFlagsSchema` Zod object defined
- [ ] Schema supports boolean flags with defaults to `false` (opt-in model)
- [ ] Schema added to main `envSchema` object
- [ ] Type exported as part of `EnvConfig`
**Implementation Details**:
```typescript
// src/config/env.ts
/**
* Feature flags configuration schema (ADR-024).
* All flags default to false (disabled) for safety.
* Set to 'true' in environment to enable.
*/
const featureFlagsSchema = z.object({
// Example flags - replace with actual feature flags as needed
newDashboard: booleanString(false), // FEATURE_NEW_DASHBOARD
betaRecipes: booleanString(false), // FEATURE_BETA_RECIPES
experimentalAi: booleanString(false), // FEATURE_EXPERIMENTAL_AI
debugMode: booleanString(false), // FEATURE_DEBUG_MODE
});
// In loadEnvVars():
featureFlags: {
newDashboard: process.env.FEATURE_NEW_DASHBOARD,
betaRecipes: process.env.FEATURE_BETA_RECIPES,
experimentalAi: process.env.FEATURE_EXPERIMENTAL_AI,
debugMode: process.env.FEATURE_DEBUG_MODE,
},
```
**Risks/Notes**:
- Naming convention: `FEATURE_*` prefix for all feature flag env vars
- Default to `false` ensures features are opt-in, preventing accidental exposure
---
#### [1.2] Create Feature Flag Service Module
**Complexity**: Medium
**Estimate**: 1-2 hours
**Dependencies**: [1.1]
**Parallelizable**: No (depends on 1.1)
**Description**: Create a dedicated service module for feature flag access with helper functions.
**File**: `src/services/featureFlags.server.ts`
**Acceptance Criteria**:
- [ ] `isFeatureEnabled(flagName)` function for checking flags
- [ ] `getAllFeatureFlags()` function for debugging/admin endpoints
- [ ] Type-safe flag name parameter (union type or enum)
- [ ] Exported helper booleans for common flags (similar to `isSmtpConfigured`)
- [ ] Logging when feature flag is checked in development mode
**Implementation Details**:
```typescript
// src/services/featureFlags.server.ts
import { config, isDevelopment } from '../config/env';
import { logger } from './logger.server';
export type FeatureFlagName = keyof typeof config.featureFlags;
/**
* Check if a feature flag is enabled.
* @param flagName - The name of the feature flag to check
* @returns boolean indicating if the feature is enabled
*/
export function isFeatureEnabled(flagName: FeatureFlagName): boolean {
const enabled = config.featureFlags[flagName];
if (isDevelopment) {
logger.debug({ flag: flagName, enabled }, 'Feature flag checked');
}
return enabled;
}
/**
* Get all feature flags and their current states.
* Useful for debugging and admin endpoints.
*/
export function getAllFeatureFlags(): Record<FeatureFlagName, boolean> {
return { ...config.featureFlags };
}
// Convenience exports for common flag checks
export const isNewDashboardEnabled = config.featureFlags.newDashboard;
export const isBetaRecipesEnabled = config.featureFlags.betaRecipes;
export const isExperimentalAiEnabled = config.featureFlags.experimentalAi;
export const isDebugModeEnabled = config.featureFlags.debugMode;
```
**Risks/Notes**:
- Keep logging minimal to avoid performance impact
- Convenience exports are evaluated once at startup (not dynamic)
---
#### [1.3] Add Admin Endpoint for Feature Flag Status
**Complexity**: Low
**Estimate**: 30-45 minutes
**Dependencies**: [1.2]
**Parallelizable**: No (depends on 1.2)
**Description**: Add an admin/health endpoint to view current feature flag states.
**File**: `src/routes/admin.routes.ts` (or `stats.routes.ts` if admin routes don't exist)
**Acceptance Criteria**:
- [ ] `GET /api/v1/admin/feature-flags` endpoint (admin-only)
- [ ] Returns JSON object with all flags and their states
- [ ] Requires admin authentication
- [ ] Endpoint documented in Swagger
**Implementation Details**:
```typescript
// In appropriate routes file
router.get('/feature-flags', requireAdmin, async (req, res) => {
const flags = getAllFeatureFlags();
sendSuccess(res, { flags });
});
```
**Risks/Notes**:
- Ensure endpoint is protected (admin-only)
- Consider caching response if called frequently
---
#### [1.4] Backend Unit Tests
**Complexity**: Medium
**Estimate**: 1-2 hours
**Dependencies**: [1.1], [1.2]
**Parallelizable**: Yes (can start after 1.1, in parallel with 1.3)
**Description**: Write unit tests for feature flag configuration and service.
**Files**:
- `src/config/env.test.ts` (add feature flag tests)
- `src/services/featureFlags.server.test.ts` (new file)
**Acceptance Criteria**:
- [ ] Test default values (all false)
- [ ] Test parsing 'true'/'false' strings
- [ ] Test `isFeatureEnabled()` function
- [ ] Test `getAllFeatureFlags()` function
- [ ] Test type safety (TypeScript compile-time checks)
**Implementation Details**:
```typescript
// src/config/env.test.ts - add to existing file
describe('featureFlags configuration', () => {
it('should default all feature flags to false', async () => {
setValidEnv();
const { config } = await import('./env');
expect(config.featureFlags.newDashboard).toBe(false);
expect(config.featureFlags.betaRecipes).toBe(false);
});
it('should parse FEATURE_NEW_DASHBOARD as true when set', async () => {
setValidEnv({ FEATURE_NEW_DASHBOARD: 'true' });
const { config } = await import('./env');
expect(config.featureFlags.newDashboard).toBe(true);
});
});
// src/services/featureFlags.server.test.ts - new file
describe('featureFlags service', () => {
describe('isFeatureEnabled', () => {
it('should return false for disabled flags', () => {
expect(isFeatureEnabled('newDashboard')).toBe(false);
});
// ... more tests
});
});
```
---
### Phase 2: Frontend Feature Flag Infrastructure
#### [2.1] Add Frontend Feature Flag Config
**Complexity**: Low
**Estimate**: 30-45 minutes
**Dependencies**: None (can run in parallel with Phase 1)
**Parallelizable**: Yes
**Description**: Add feature flags to the frontend config module.
**Files**:
- `src/config.ts` - Add featureFlags section
- `src/vite-env.d.ts` - Add type declarations
**Acceptance Criteria**:
- [ ] Feature flags section added to `src/config.ts`
- [ ] TypeScript declarations updated in `vite-env.d.ts`
- [ ] Boolean parsing consistent with existing pattern
- [ ] Default to false when env var not set
**Implementation Details**:
```typescript
// src/config.ts
const config = {
// ... existing sections ...
/**
* Feature flags for conditional feature rendering (ADR-024).
* All flags default to false (disabled) when not explicitly set.
*/
featureFlags: {
newDashboard: import.meta.env.VITE_FEATURE_NEW_DASHBOARD === 'true',
betaRecipes: import.meta.env.VITE_FEATURE_BETA_RECIPES === 'true',
experimentalAi: import.meta.env.VITE_FEATURE_EXPERIMENTAL_AI === 'true',
debugMode: import.meta.env.VITE_FEATURE_DEBUG_MODE === 'true',
},
};
// src/vite-env.d.ts
interface ImportMetaEnv {
// ... existing declarations ...
readonly VITE_FEATURE_NEW_DASHBOARD?: string;
readonly VITE_FEATURE_BETA_RECIPES?: string;
readonly VITE_FEATURE_EXPERIMENTAL_AI?: string;
readonly VITE_FEATURE_DEBUG_MODE?: string;
}
```
---
#### [2.2] Create useFeatureFlag React Hook
**Complexity**: Medium
**Estimate**: 1-1.5 hours
**Dependencies**: [2.1]
**Parallelizable**: No (depends on 2.1)
**Description**: Create a React hook for checking feature flags in components.
**File**: `src/hooks/useFeatureFlag.ts`
**Acceptance Criteria**:
- [ ] `useFeatureFlag(flagName)` hook returns boolean
- [ ] Type-safe flag name parameter
- [ ] Memoized to prevent unnecessary re-renders
- [ ] Optional `FeatureFlag` component for conditional rendering
**Implementation Details**:
```typescript
// src/hooks/useFeatureFlag.ts
import { useMemo } from 'react';
import config from '../config';
export type FeatureFlagName = keyof typeof config.featureFlags;
/**
* Hook to check if a feature flag is enabled.
*
* @param flagName - The name of the feature flag to check
* @returns boolean indicating if the feature is enabled
*
* @example
* const isNewDashboard = useFeatureFlag('newDashboard');
* if (isNewDashboard) {
* return <NewDashboard />;
* }
*/
export function useFeatureFlag(flagName: FeatureFlagName): boolean {
return useMemo(() => config.featureFlags[flagName], [flagName]);
}
/**
* Get all feature flags (useful for debugging).
*/
export function useAllFeatureFlags(): Record<FeatureFlagName, boolean> {
return useMemo(() => ({ ...config.featureFlags }), []);
}
```
---
#### [2.3] Create FeatureFlag Component
**Complexity**: Low
**Estimate**: 30-45 minutes
**Dependencies**: [2.2]
**Parallelizable**: No (depends on 2.2)
**Description**: Create a declarative component for feature flag conditional rendering.
**File**: `src/components/FeatureFlag.tsx`
**Acceptance Criteria**:
- [ ] `<FeatureFlag name="flagName">` component
- [ ] Children rendered only when flag is enabled
- [ ] Optional `fallback` prop for disabled state
- [ ] TypeScript-enforced flag names
**Implementation Details**:
```typescript
// src/components/FeatureFlag.tsx
import { ReactNode } from 'react';
import { useFeatureFlag, FeatureFlagName } from '../hooks/useFeatureFlag';
interface FeatureFlagProps {
/** The name of the feature flag to check */
name: FeatureFlagName;
/** Content to render when feature is enabled */
children: ReactNode;
/** Optional content to render when feature is disabled */
fallback?: ReactNode;
}
/**
* Conditionally renders children based on feature flag state.
*
* @example
* <FeatureFlag name="newDashboard" fallback={<OldDashboard />}>
* <NewDashboard />
* </FeatureFlag>
*/
export function FeatureFlag({ name, children, fallback = null }: FeatureFlagProps) {
const isEnabled = useFeatureFlag(name);
return <>{isEnabled ? children : fallback}</>;
}
```
---
#### [2.4] Frontend Unit Tests
**Complexity**: Medium
**Estimate**: 1-1.5 hours
**Dependencies**: [2.1], [2.2], [2.3]
**Parallelizable**: No (depends on previous frontend tasks)
**Description**: Write unit tests for frontend feature flag utilities.
**Files**:
- `src/config.test.ts` (add feature flag tests)
- `src/hooks/useFeatureFlag.test.ts` (new file)
- `src/components/FeatureFlag.test.tsx` (new file)
**Acceptance Criteria**:
- [ ] Test config structure includes featureFlags
- [ ] Test default values (all false)
- [ ] Test hook returns correct values
- [ ] Test component renders/hides children correctly
- [ ] Test fallback rendering
**Implementation Details**:
```typescript
// src/hooks/useFeatureFlag.test.ts
import { renderHook } from '@testing-library/react';
import { useFeatureFlag, useAllFeatureFlags } from './useFeatureFlag';
describe('useFeatureFlag', () => {
it('should return false for disabled flags', () => {
const { result } = renderHook(() => useFeatureFlag('newDashboard'));
expect(result.current).toBe(false);
});
});
// src/components/FeatureFlag.test.tsx
import { render, screen } from '@testing-library/react';
import { FeatureFlag } from './FeatureFlag';
describe('FeatureFlag', () => {
it('should not render children when flag is disabled', () => {
render(
<FeatureFlag name="newDashboard">
<div data-testid="new-feature">New Feature</div>
</FeatureFlag>
);
expect(screen.queryByTestId('new-feature')).not.toBeInTheDocument();
});
it('should render fallback when flag is disabled', () => {
render(
<FeatureFlag name="newDashboard" fallback={<div>Old Feature</div>}>
<div>New Feature</div>
</FeatureFlag>
);
expect(screen.getByText('Old Feature')).toBeInTheDocument();
});
});
```
---
### Phase 3: Documentation & Integration
#### [3.1] Update ADR-024 with Implementation Status
**Complexity**: Low
**Estimate**: 30 minutes
**Dependencies**: [1.1], [1.2], [2.1], [2.2]
**Parallelizable**: Yes (can be done after core implementation)
**Description**: Update ADR-024 to mark it as implemented and add implementation details.
**File**: `docs/adr/0024-feature-flagging-strategy.md`
**Acceptance Criteria**:
- [ ] Status changed from "Proposed" to "Accepted"
- [ ] Implementation status section added
- [ ] Key files documented
- [ ] Usage examples included
---
#### [3.2] Update Environment Documentation
**Complexity**: Low
**Estimate**: 30 minutes
**Dependencies**: [1.1], [2.1]
**Parallelizable**: Yes
**Description**: Add feature flag environment variables to documentation.
**Files**:
- `docs/getting-started/ENVIRONMENT.md`
- `.env.example`
**Acceptance Criteria**:
- [ ] Feature flag variables documented in ENVIRONMENT.md
- [ ] New section "Feature Flags" added
- [ ] `.env.example` updated with commented feature flag examples
**Implementation Details**:
```bash
# .env.example addition
# ===================
# Feature Flags (ADR-024)
# ===================
# All feature flags default to disabled (false) when not set.
# Set to 'true' to enable a feature.
#
# FEATURE_NEW_DASHBOARD=false
# FEATURE_BETA_RECIPES=false
# FEATURE_EXPERIMENTAL_AI=false
# FEATURE_DEBUG_MODE=false
#
# Frontend equivalents (prefix with VITE_):
# VITE_FEATURE_NEW_DASHBOARD=false
# VITE_FEATURE_BETA_RECIPES=false
```
---
#### [3.3] Create CODE-PATTERNS Entry
**Complexity**: Low
**Estimate**: 30 minutes
**Dependencies**: All implementation tasks
**Parallelizable**: Yes
**Description**: Add feature flag usage patterns to CODE-PATTERNS.md.
**File**: `docs/development/CODE-PATTERNS.md`
**Acceptance Criteria**:
- [ ] Feature flag section added with examples
- [ ] Backend usage pattern documented
- [ ] Frontend usage pattern documented
- [ ] Testing pattern documented
---
#### [3.4] Update CLAUDE.md Quick Reference
**Complexity**: Low
**Estimate**: 15 minutes
**Dependencies**: All implementation tasks
**Parallelizable**: Yes
**Description**: Add feature flags to the CLAUDE.md quick reference tables.
**File**: `CLAUDE.md`
**Acceptance Criteria**:
- [ ] Feature flags added to "Key Patterns" table
- [ ] Reference to featureFlags service added
---
## Implementation Sequence
### Phase 1 (Backend) - Can Start Immediately
```text
[1.1] Schema ──────────┬──> [1.2] Service ──> [1.3] Admin Endpoint
└──> [1.4] Backend Tests (can start after 1.1)
```
### Phase 2 (Frontend) - Can Start Immediately (Parallel with Phase 1)
```text
[2.1] Config ──> [2.2] Hook ──> [2.3] Component ──> [2.4] Frontend Tests
```
### Phase 3 (Documentation) - After Implementation
```text
All Phase 1 & 2 Tasks ──> [3.1] ADR Update
├──> [3.2] Env Docs
├──> [3.3] Code Patterns
└──> [3.4] CLAUDE.md
```
---
## Critical Path
The minimum path to a working feature flag system:
1. **[1.1] Schema** (30 min) - Required for backend
2. **[1.2] Service** (1.5 hr) - Required for backend access
3. **[2.1] Frontend Config** (30 min) - Required for frontend
4. **[2.2] Hook** (1 hr) - Required for React integration
**Critical path duration**: ~3.5 hours
Non-critical but recommended:
- Admin endpoint (debugging)
- FeatureFlag component (developer convenience)
- Tests (quality assurance)
- Documentation (maintainability)
---
## Scope Recommendations
### MVP (Minimum Viable Implementation)
Include in initial implementation:
- [1.1] Backend schema with 2-3 example flags
- [1.2] Feature flag service
- [2.1] Frontend config
- [2.2] useFeatureFlag hook
- [1.4] Core backend tests
- [2.4] Core frontend tests
### Enhancements (Future Iterations)
Defer to follow-up work:
- Admin endpoint for flag visibility
- FeatureFlag component (nice-to-have)
- Dynamic flag updates without restart (requires external service)
- User-specific flags (A/B testing)
- Flag analytics/usage tracking
- Gradual rollout percentages
### Explicitly Out of Scope
- Integration with Flagsmith/LaunchDarkly (future ADR)
- Database-stored flags (requires schema changes)
- Real-time flag updates (WebSocket/SSE)
- Flag inheritance/hierarchy
- Flag audit logging
---
## Testing Strategy
### Backend Tests
| Test Type | Coverage Target | Location |
| ----------------- | ---------------------------------------- | ------------------------------------------ |
| Schema validation | Parse true/false, defaults | `src/config/env.test.ts` |
| Service functions | `isFeatureEnabled`, `getAllFeatureFlags` | `src/services/featureFlags.server.test.ts` |
| Integration | Admin endpoint (if added) | `src/routes/admin.routes.test.ts` |
### Frontend Tests
| Test Type | Coverage Target | Location |
| ------------------- | --------------------------- | ------------------------------------- |
| Config structure | featureFlags section exists | `src/config.test.ts` |
| Hook behavior | Returns correct values | `src/hooks/useFeatureFlag.test.ts` |
| Component rendering | Conditional children | `src/components/FeatureFlag.test.tsx` |
### Mocking Pattern for Tests
```typescript
// Backend - reset modules to test different flag states
beforeEach(() => {
vi.resetModules();
process.env.FEATURE_NEW_DASHBOARD = 'true';
});
// Frontend - mock config module
vi.mock('../config', () => ({
default: {
featureFlags: {
newDashboard: true,
betaRecipes: false,
},
},
}));
```
---
## Risk Assessment
| Risk | Impact | Likelihood | Mitigation |
| ------------------------------------------- | ------ | ---------- | ------------------------------------------------------------- |
| Flag state inconsistency (backend/frontend) | Medium | Low | Use same env var naming, document sync requirements |
| Performance impact from flag checks | Low | Low | Flags cached at startup, no runtime DB calls |
| Stale flags after deployment | Medium | Medium | Document restart requirement, consider future dynamic loading |
| Feature creep (too many flags) | Medium | Medium | Require ADR for new flags, sunset policy |
| Missing flag causes crash | High | Low | Default to false, graceful degradation |
---
## Files to Create
| File | Purpose |
| ------------------------------------------ | ---------------------------- |
| `src/services/featureFlags.server.ts` | Backend feature flag service |
| `src/services/featureFlags.server.test.ts` | Backend tests |
| `src/hooks/useFeatureFlag.ts` | React hook for flag access |
| `src/hooks/useFeatureFlag.test.ts` | Hook tests |
| `src/components/FeatureFlag.tsx` | Declarative flag component |
| `src/components/FeatureFlag.test.tsx` | Component tests |
## Files to Modify
| File | Changes |
| -------------------------------------------- | ---------------------------------- |
| `src/config/env.ts` | Add featureFlagsSchema and loading |
| `src/config/env.test.ts` | Add feature flag tests |
| `src/config.ts` | Add featureFlags section |
| `src/config.test.ts` | Add feature flag tests |
| `src/vite-env.d.ts` | Add VITE*FEATURE*\* declarations |
| `.env.example` | Add feature flag examples |
| `docs/adr/0024-feature-flagging-strategy.md` | Update status and details |
| `docs/getting-started/ENVIRONMENT.md` | Document feature flag vars |
| `docs/development/CODE-PATTERNS.md` | Add usage patterns |
| `CLAUDE.md` | Add to quick reference |
---
## Verification Commands
After implementation, run these commands in the dev container:
```bash
# Type checking
podman exec -it flyer-crawler-dev npm run type-check
# Backend unit tests
podman exec -it flyer-crawler-dev npm run test:unit -- --grep "featureFlag"
# Frontend tests (includes hook and component tests)
podman exec -it flyer-crawler-dev npm run test:unit -- --grep "FeatureFlag"
# Full test suite
podman exec -it flyer-crawler-dev npm test
```
---
## Example Usage (Post-Implementation)
### Backend Route Handler
```typescript
// src/routes/flyers.routes.ts
import { isFeatureEnabled } from '../services/featureFlags.server';
router.get('/dashboard', async (req, res) => {
if (isFeatureEnabled('newDashboard')) {
// New dashboard logic
return sendSuccess(res, { version: 'v2', data: await getNewDashboardData() });
}
// Legacy dashboard
return sendSuccess(res, { version: 'v1', data: await getLegacyDashboardData() });
});
```
### React Component
```tsx
// src/pages/Dashboard.tsx
import { FeatureFlag } from '../components/FeatureFlag';
import { useFeatureFlag } from '../hooks/useFeatureFlag';
// Option 1: Declarative component
function Dashboard() {
return (
<FeatureFlag name="newDashboard" fallback={<LegacyDashboard />}>
<NewDashboard />
</FeatureFlag>
);
}
// Option 2: Hook for logic
function DashboardWithLogic() {
const isNewDashboard = useFeatureFlag('newDashboard');
useEffect(() => {
if (isNewDashboard) {
analytics.track('new_dashboard_viewed');
}
}, [isNewDashboard]);
return isNewDashboard ? <NewDashboard /> : <LegacyDashboard />;
}
```
---
## Implementation Notes
### Naming Convention
| Context | Pattern | Example |
| ---------------- | ------------------------- | ---------------------------------- |
| Backend env var | `FEATURE_SNAKE_CASE` | `FEATURE_NEW_DASHBOARD` |
| Frontend env var | `VITE_FEATURE_SNAKE_CASE` | `VITE_FEATURE_NEW_DASHBOARD` |
| Config property | `camelCase` | `config.featureFlags.newDashboard` |
| Type/Hook param | `camelCase` | `isFeatureEnabled('newDashboard')` |
### Flag Lifecycle
1. **Adding a flag**: Add to both schemas, set default to `false`, document
2. **Enabling a flag**: Set env var to `'true'`, restart application
3. **Removing a flag**: Remove conditional code first, then remove flag from schemas
4. **Sunset policy**: Flags should be removed within 3 months of full rollout
---
Last updated: 2026-01-28

View File

@@ -2,6 +2,17 @@
The **ai-usage** subagent specializes in LLM APIs (Gemini, Claude), prompt engineering, and AI-powered features in the Flyer Crawler project.
## Quick Reference
| Aspect | Details |
| ------------------ | ----------------------------------------------------------------------------------- |
| **Primary Use** | Gemini API integration, prompt engineering, AI extraction |
| **Key Files** | `src/services/aiService.server.ts`, `src/services/flyerProcessingService.server.ts` |
| **Key ADRs** | ADR-041 (AI Integration), ADR-046 (Image Processing) |
| **API Key Env** | `VITE_GOOGLE_GENAI_API_KEY` (prod), `VITE_GOOGLE_GENAI_API_KEY_TEST` (test) |
| **Error Handling** | Rate limits (429), JSON parse errors, timeout handling |
| **Delegate To** | `coder` (implementation), `testwriter` (tests), `integrations-specialist` |
## When to Use
Use the **ai-usage** subagent when you need to:
@@ -295,6 +306,9 @@ const fixtureResponse = await fs.readFile('fixtures/gemini-response.json');
## Related Documentation
- [OVERVIEW.md](./OVERVIEW.md) - Subagent system overview
- [CODER-GUIDE.md](./CODER-GUIDE.md) - For implementing AI features
- [TESTER-GUIDE.md](./TESTER-GUIDE.md) - Testing AI features
- [INTEGRATIONS-GUIDE.md](./INTEGRATIONS-GUIDE.md) - External API patterns
- [../adr/0041-ai-gemini-integration-architecture.md](../adr/0041-ai-gemini-integration-architecture.md) - AI integration ADR
- [../adr/0046-image-processing-pipeline.md](../adr/0046-image-processing-pipeline.md) - Image processing
- [CODER-GUIDE.md](./CODER-GUIDE.md) - For implementing AI features
- [../getting-started/ENVIRONMENT.md](../getting-started/ENVIRONMENT.md) - Environment configuration

View File

@@ -2,6 +2,17 @@
The **coder** subagent is your primary tool for writing and modifying production Node.js/TypeScript code in the Flyer Crawler project. This guide explains how to work effectively with the coder subagent.
## Quick Reference
| Aspect | Details |
| ---------------- | ------------------------------------------------------------------------ |
| **Primary Use** | Write/modify production TypeScript code |
| **Key Files** | `src/routes/*.routes.ts`, `src/services/**/*.ts`, `src/components/*.tsx` |
| **Key ADRs** | ADR-034 (Repository), ADR-035 (Services), ADR-028 (API Response) |
| **Test Command** | `podman exec -it flyer-crawler-dev npm run test:unit` |
| **Type Check** | `podman exec -it flyer-crawler-dev npm run type-check` |
| **Delegate To** | `db-dev` (database), `frontend-specialist` (UI), `testwriter` (tests) |
## When to Use the Coder Subagent
Use the coder subagent when you need to:
@@ -307,6 +318,8 @@ error classes for all database operations"
- [OVERVIEW.md](./OVERVIEW.md) - Subagent system overview
- [TESTER-GUIDE.md](./TESTER-GUIDE.md) - Testing strategies
- [DATABASE-GUIDE.md](./DATABASE-GUIDE.md) - Database development workflows
- [../adr/0034-repository-pattern-standards.md](../adr/0034-repository-pattern-standards.md) - Repository patterns
- [../adr/0035-service-layer-architecture.md](../adr/0035-service-layer-architecture.md) - Service layer architecture
- [../adr/0028-api-response-standardization.md](../adr/0028-api-response-standardization.md) - API response patterns
- [../development/CODE-PATTERNS.md](../development/CODE-PATTERNS.md) - Code patterns reference

View File

@@ -5,6 +5,17 @@ This guide covers two database-focused subagents:
- **db-dev**: Database development - schemas, queries, migrations, optimization
- **db-admin**: Database administration - PostgreSQL/Redis admin, security, backups
## Quick Reference
| Aspect | db-dev | db-admin |
| ---------------- | -------------------------------------------- | ------------------------------------------ |
| **Primary Use** | Schemas, queries, migrations | Performance tuning, backups, security |
| **Key Files** | `src/services/db/*.db.ts`, `sql/migrations/` | `postgresql.conf`, `pg_hba.conf` |
| **Key ADRs** | ADR-034 (Repository), ADR-002 (Transactions) | ADR-019 (Backups), ADR-050 (Observability) |
| **Test Command** | `podman exec -it flyer-crawler-dev npm test` | N/A |
| **MCP Tool** | `mcp__devdb__query` | SSH to production |
| **Delegate To** | `coder` (service layer), `db-admin` (perf) | `devops` (infrastructure) |
## Understanding the Difference
| Aspect | db-dev | db-admin |
@@ -412,8 +423,9 @@ This is useful for:
- [OVERVIEW.md](./OVERVIEW.md) - Subagent system overview
- [CODER-GUIDE.md](./CODER-GUIDE.md) - Working with the coder subagent
- [DEVOPS-GUIDE.md](./DEVOPS-GUIDE.md) - DevOps and deployment workflows
- [../adr/0034-repository-pattern-standards.md](../adr/0034-repository-pattern-standards.md) - Repository patterns
- [../adr/0002-standardized-transaction-management.md](../adr/0002-standardized-transaction-management.md) - Transaction management
- [../adr/0019-data-backup-and-recovery-strategy.md](../adr/0019-data-backup-and-recovery-strategy.md) - Backup strategy
- [../adr/0050-postgresql-function-observability.md](../adr/0050-postgresql-function-observability.md) - Database observability
- [../BARE-METAL-SETUP.md](../BARE-METAL-SETUP.md) - Production database setup
- [../operations/BARE-METAL-SETUP.md](../operations/BARE-METAL-SETUP.md) - Production database setup

View File

@@ -6,6 +6,90 @@ This guide covers DevOps-related subagents for deployment, infrastructure, and o
- **infra-architect**: Resource optimization, capacity planning
- **bg-worker**: Background jobs, PM2 workers, BullMQ queues
## Quick Reference
| Aspect | devops | infra-architect | bg-worker |
| ---------------- | ------------------------------------------ | --------------------------- | ------------------------------- |
| **Primary Use** | Containers, CI/CD, deployments | Resource optimization | BullMQ queues, PM2 workers |
| **Key Files** | `compose.dev.yml`, `.gitea/workflows/` | `ecosystem.config.cjs` | `src/services/queues.server.ts` |
| **Key ADRs** | ADR-014 (Containers), ADR-017 (CI/CD) | N/A | ADR-006 (Background Jobs) |
| **Commands** | `podman-compose`, `pm2` | `pm2 monit`, system metrics | Redis CLI, `pm2 logs` |
| **MCP Tools** | `mcp__podman__*` | N/A | N/A |
| **Access Model** | Read-only on production (provide commands) | Same | Same |
---
## CRITICAL: Server Access Model
**Claude Code has READ-ONLY access to production/test servers.**
The `claude-win10` user cannot execute write operations (PM2 restart, systemctl, file modifications) directly on servers. The devops subagent must **provide commands for the user to execute**, not attempt to run them via SSH.
### Command Delegation Workflow
When troubleshooting or making changes to production/test servers:
| Phase | Actor | Action |
| -------- | ------ | ----------------------------------------------------------- |
| Diagnose | Claude | Provide read-only diagnostic commands |
| Report | User | Execute commands, share output with Claude |
| Analyze | Claude | Interpret results, identify root cause |
| Fix | Claude | Provide 1-3 fix commands (never more, errors may cascade) |
| Execute | User | Run fix commands, report results |
| Verify | Claude | Provide verification commands to confirm success |
| Document | Claude | Update relevant documentation with findings and resolutions |
### Example: PM2 Process Issue
Step 1 - Diagnostic Commands (Claude provides, user runs):
```bash
# Check PM2 process status
pm2 list
# View recent error logs
pm2 logs flyer-crawler-api --err --lines 50
# Check system resources
free -h
df -h /var/www
```
Step 2 - User reports output to Claude
Step 3 - Fix Commands (Claude provides 1-3 at a time):
```bash
# Restart the failing process
pm2 restart flyer-crawler-api
```
Step 4 - User executes and reports result
Step 5 - Verification Commands:
```bash
# Confirm process is running
pm2 list
# Test API health
curl -s https://flyer-crawler.projectium.com/api/health/ready | jq .
```
### What NOT to Do
```bash
# WRONG - Claude cannot execute this directly
ssh root@projectium.com "pm2 restart all"
# WRONG - Providing too many commands at once
pm2 stop all && rm -rf node_modules && npm install && pm2 start all
# WRONG - Assuming commands succeeded without user confirmation
```
---
## The devops Subagent
### When to Use
@@ -372,6 +456,8 @@ redis-cli -a $REDIS_PASSWORD
## Service Management Commands
> **Note**: These commands are for the **user to execute on the server**. Claude Code provides these commands but cannot run them directly due to read-only server access. See [Server Access Model](#critical-server-access-model) above.
### PM2 Commands
```bash
@@ -468,8 +554,13 @@ podman exec -it flyer-crawler-dev npm test
## Related Documentation
- [OVERVIEW.md](./OVERVIEW.md) - Subagent system overview
- [../BARE-METAL-SETUP.md](../BARE-METAL-SETUP.md) - Production setup guide
- [DATABASE-GUIDE.md](./DATABASE-GUIDE.md) - Database administration
- [SECURITY-DEBUG-GUIDE.md](./SECURITY-DEBUG-GUIDE.md) - Production debugging
- [../operations/BARE-METAL-SETUP.md](../operations/BARE-METAL-SETUP.md) - Production setup guide
- [../operations/DEPLOYMENT.md](../operations/DEPLOYMENT.md) - Deployment guide
- [../operations/MONITORING.md](../operations/MONITORING.md) - Monitoring guide
- [../development/DEV-CONTAINER.md](../development/DEV-CONTAINER.md) - Dev container guide
- [../adr/0014-containerization-and-deployment-strategy.md](../adr/0014-containerization-and-deployment-strategy.md) - Containerization ADR
- [../adr/0006-background-job-processing-and-task-queues.md](../adr/0006-background-job-processing-and-task-queues.md) - Background jobs ADR
- [../adr/0017-ci-cd-and-branching-strategy.md](../adr/0017-ci-cd-and-branching-strategy.md) - CI/CD strategy
- [../adr/0053-worker-health-checks.md](../adr/0053-worker-health-checks.md) - Worker health checks
- [../adr/0053-worker-health-checks-and-monitoring.md](../adr/0053-worker-health-checks-and-monitoring.md) - Worker health checks

View File

@@ -7,6 +7,15 @@ This guide covers documentation-focused subagents:
- **planner**: Feature breakdown, roadmaps, scope management
- **product-owner**: Requirements, user stories, backlog prioritization
## Quick Reference
| Aspect | documenter | describer-for-ai | planner | product-owner |
| --------------- | -------------------- | ------------------------ | --------------------- | ---------------------- |
| **Primary Use** | User docs, API specs | ADRs, technical specs | Feature breakdown | User stories, backlog |
| **Key Files** | `docs/`, API docs | `docs/adr/`, `CLAUDE.md` | `docs/plans/` | Issue tracker |
| **Output** | Markdown guides | ADRs, context docs | Task lists, roadmaps | User stories, criteria |
| **Delegate To** | `coder` (implement) | `documenter` (user docs) | `coder` (build tasks) | `planner` (breakdown) |
## The documenter Subagent
### When to Use
@@ -437,6 +446,8 @@ Include dates on documentation that may become stale:
## Related Documentation
- [OVERVIEW.md](./OVERVIEW.md) - Subagent system overview
- [CODER-GUIDE.md](./CODER-GUIDE.md) - For implementing documented features
- [../adr/index.md](../adr/index.md) - ADR index
- [../TESTING.md](../TESTING.md) - Testing guide
- [../development/TESTING.md](../development/TESTING.md) - Testing guide
- [../development/CODE-PATTERNS.md](../development/CODE-PATTERNS.md) - Code patterns reference
- [../../CLAUDE.md](../../CLAUDE.md) - AI instructions

View File

@@ -5,6 +5,17 @@ This guide covers frontend-focused subagents:
- **frontend-specialist**: UI components, Neo-Brutalism, Core Web Vitals, accessibility
- **uiux-designer**: UI/UX decisions, component design, user experience
## Quick Reference
| Aspect | frontend-specialist | uiux-designer |
| ----------------- | ---------------------------------------------- | -------------------------------------- |
| **Primary Use** | React components, performance, accessibility | Design decisions, user flows |
| **Key Files** | `src/components/`, `src/features/` | Design specs, mockups |
| **Key ADRs** | ADR-012 (Design System), ADR-044 (Feature Org) | ADR-012 (Design System) |
| **Design System** | Neo-Brutalism (bold borders, high contrast) | Same |
| **State Mgmt** | TanStack Query (server), Zustand (client) | N/A |
| **Delegate To** | `coder` (backend), `tester` (test coverage) | `frontend-specialist` (implementation) |
## The frontend-specialist Subagent
### When to Use
@@ -406,7 +417,8 @@ const handleSelect = useCallback((id: string) => {
- [OVERVIEW.md](./OVERVIEW.md) - Subagent system overview
- [CODER-GUIDE.md](./CODER-GUIDE.md) - For implementing features
- [../DESIGN_TOKENS.md](../DESIGN_TOKENS.md) - Design token reference
- [TESTER-GUIDE.md](./TESTER-GUIDE.md) - Component testing patterns
- [../development/DESIGN_TOKENS.md](../development/DESIGN_TOKENS.md) - Design token reference
- [../adr/0012-frontend-component-library-and-design-system.md](../adr/0012-frontend-component-library-and-design-system.md) - Design system ADR
- [../adr/0005-frontend-state-management-and-server-cache-strategy.md](../adr/0005-frontend-state-management-and-server-cache-strategy.md) - State management ADR
- [../adr/0044-frontend-feature-organization.md](../adr/0044-frontend-feature-organization.md) - Feature organization

View File

@@ -0,0 +1,396 @@
# Integrations Subagent Guide
The **integrations-specialist** subagent handles third-party services, webhooks, and external API integrations in the Flyer Crawler project.
## Quick Reference
| Aspect | Details |
| --------------- | --------------------------------------------------------------------------- |
| **Primary Use** | External APIs, webhooks, OAuth, third-party services |
| **Key Files** | `src/services/external/`, `src/routes/webhooks.routes.ts` |
| **Key ADRs** | ADR-041 (AI Integration), ADR-016 (API Security), ADR-048 (Auth) |
| **MCP Tools** | `mcp__gitea-projectium__*`, `mcp__bugsink__*` |
| **Security** | API key storage, webhook signatures, OAuth state param |
| **Delegate To** | `coder` (implementation), `security-engineer` (review), `ai-usage` (Gemini) |
## When to Use
Use the **integrations-specialist** subagent when you need to:
- Integrate with external APIs (OAuth, REST, GraphQL)
- Implement webhook handlers
- Configure third-party services
- Debug external service connectivity
- Handle API authentication flows
- Manage external service rate limits
## What integrations-specialist Knows
The integrations-specialist subagent understands:
- OAuth 2.0 flows (authorization code, client credentials)
- REST API integration patterns
- Webhook security (signature verification)
- External service error handling
- Rate limiting and retry strategies
- API key management
## Current Integrations
| Service | Purpose | Integration Type | Key Files |
| ------------- | ---------------------- | ---------------- | ---------------------------------- |
| Google Gemini | AI flyer extraction | REST API | `src/services/aiService.server.ts` |
| Bugsink | Error tracking | REST API | MCP: `mcp__bugsink__*` |
| Gitea | Repository and CI/CD | REST API | MCP: `mcp__gitea-projectium__*` |
| Redis | Caching and job queues | Native client | `src/services/redis.server.ts` |
| PostgreSQL | Primary database | Native client | `src/services/db/pool.db.ts` |
## Example Requests
### Adding External API Integration
```
"Use integrations-specialist to integrate with the Store API
to automatically fetch store location data. Include proper
error handling, rate limiting, and caching."
```
### OAuth Implementation
```
"Use integrations-specialist to implement Google OAuth for
user authentication. Include token refresh handling and
session management."
```
### Webhook Handler
```
"Use integrations-specialist to create a webhook handler for
receiving store inventory updates. Include signature verification
and idempotency handling."
```
### Debugging External Service Issues
```
"Use integrations-specialist to debug why the Gemini API calls
are intermittently failing with timeout errors. Check connection
pooling, retry logic, and error handling."
```
## Integration Patterns
### REST API Client Pattern
```typescript
// src/services/external/storeApi.server.ts
import { env } from '@/config/env';
import { log } from '@/services/logger.server';
interface StoreApiConfig {
baseUrl: string;
apiKey: string;
timeout: number;
}
class StoreApiClient {
private config: StoreApiConfig;
constructor(config: StoreApiConfig) {
this.config = config;
}
async getStoreLocations(storeId: string): Promise<StoreLocation[]> {
const url = `${this.config.baseUrl}/stores/${storeId}/locations`;
try {
const response = await fetch(url, {
headers: {
Authorization: `Bearer ${this.config.apiKey}`,
'Content-Type': 'application/json',
},
signal: AbortSignal.timeout(this.config.timeout),
});
if (!response.ok) {
throw new ExternalApiError(`Store API error: ${response.status}`, response.status);
}
return response.json();
} catch (error) {
log.error({ error, storeId }, 'Failed to fetch store locations');
throw error;
}
}
}
export const storeApiClient = new StoreApiClient({
baseUrl: env.STORE_API_BASE_URL,
apiKey: env.STORE_API_KEY,
timeout: 10000,
});
```
### Webhook Handler Pattern
```typescript
// src/routes/webhooks.routes.ts
import { Router } from 'express';
import crypto from 'crypto';
import { env } from '@/config/env';
const router = Router();
function verifyWebhookSignature(payload: string, signature: string, secret: string): boolean {
const expected = crypto.createHmac('sha256', secret).update(payload).digest('hex');
return crypto.timingSafeEqual(Buffer.from(signature), Buffer.from(`sha256=${expected}`));
}
router.post('/store-updates', async (req, res, next) => {
try {
const signature = req.headers['x-webhook-signature'] as string;
const payload = JSON.stringify(req.body);
if (!verifyWebhookSignature(payload, signature, env.WEBHOOK_SECRET)) {
return res.status(401).json({ error: 'Invalid signature' });
}
// Process webhook with idempotency check
const eventId = req.headers['x-event-id'] as string;
const alreadyProcessed = await checkIdempotencyKey(eventId);
if (alreadyProcessed) {
return res.status(200).json({ status: 'already_processed' });
}
await processStoreUpdate(req.body);
await markEventProcessed(eventId);
res.status(200).json({ status: 'processed' });
} catch (error) {
next(error);
}
});
```
### OAuth Flow Pattern
```typescript
// src/services/oauth/googleOAuth.server.ts
import { OAuth2Client } from 'google-auth-library';
import { env } from '@/config/env';
const oauth2Client = new OAuth2Client(
env.GOOGLE_CLIENT_ID,
env.GOOGLE_CLIENT_SECRET,
env.GOOGLE_REDIRECT_URI,
);
export function getAuthorizationUrl(): string {
return oauth2Client.generateAuthUrl({
access_type: 'offline',
scope: ['email', 'profile'],
prompt: 'consent',
});
}
export async function exchangeCodeForTokens(code: string) {
const { tokens } = await oauth2Client.getToken(code);
return tokens;
}
export async function refreshAccessToken(refreshToken: string) {
oauth2Client.setCredentials({ refresh_token: refreshToken });
const { credentials } = await oauth2Client.refreshAccessToken();
return credentials;
}
```
## Error Handling for External Services
### Custom Error Classes
```typescript
// src/services/external/errors.ts
export class ExternalApiError extends Error {
constructor(
message: string,
public statusCode: number,
public retryable: boolean = false,
) {
super(message);
this.name = 'ExternalApiError';
}
}
export class RateLimitError extends ExternalApiError {
constructor(
message: string,
public retryAfter: number,
) {
super(message, 429, true);
this.name = 'RateLimitError';
}
}
```
### Retry with Exponential Backoff
```typescript
async function fetchWithRetry<T>(
fn: () => Promise<T>,
options: { maxRetries: number; baseDelay: number },
): Promise<T> {
let lastError: Error;
for (let attempt = 0; attempt <= options.maxRetries; attempt++) {
try {
return await fn();
} catch (error) {
lastError = error as Error;
if (error instanceof ExternalApiError && !error.retryable) {
throw error;
}
if (attempt < options.maxRetries) {
const delay = options.baseDelay * Math.pow(2, attempt);
await new Promise((resolve) => setTimeout(resolve, delay));
}
}
}
throw lastError!;
}
```
## Rate Limiting Strategies
### Token Bucket Pattern
```typescript
class RateLimiter {
private tokens: number;
private lastRefill: number;
private readonly maxTokens: number;
private readonly refillRate: number; // tokens per second
constructor(maxTokens: number, refillRate: number) {
this.maxTokens = maxTokens;
this.tokens = maxTokens;
this.refillRate = refillRate;
this.lastRefill = Date.now();
}
async acquire(): Promise<void> {
this.refill();
if (this.tokens < 1) {
const waitTime = ((1 - this.tokens) / this.refillRate) * 1000;
await new Promise((resolve) => setTimeout(resolve, waitTime));
this.refill();
}
this.tokens -= 1;
}
private refill(): void {
const now = Date.now();
const elapsed = (now - this.lastRefill) / 1000;
this.tokens = Math.min(this.maxTokens, this.tokens + elapsed * this.refillRate);
this.lastRefill = now;
}
}
```
## Testing Integrations
### Mocking External Services
```typescript
// src/tests/mocks/storeApi.mock.ts
import { vi } from 'vitest';
export const mockStoreApiClient = {
getStoreLocations: vi.fn(),
};
vi.mock('@/services/external/storeApi.server', () => ({
storeApiClient: mockStoreApiClient,
}));
```
### Integration Test with Real Service
```typescript
// src/tests/integration/storeApi.integration.test.ts
describe('Store API Integration', () => {
it.skipIf(!env.STORE_API_KEY)('fetches real store locations', async () => {
const locations = await storeApiClient.getStoreLocations('test-store');
expect(locations).toBeInstanceOf(Array);
});
});
```
## MCP Tools for Integrations
### Gitea Integration
```
// List repositories
mcp__gitea-projectium__list_my_repos()
// Create issue
mcp__gitea-projectium__create_issue({
owner: "projectium",
repo: "flyer-crawler",
title: "Issue title",
body: "Issue description"
})
```
### Bugsink Integration
```
// List projects
mcp__bugsink__list_projects()
// Get issue details
mcp__bugsink__get_issue({ issue_id: "..." })
// Get stacktrace
mcp__bugsink__get_stacktrace({ event_id: "..." })
```
## Security Considerations
### API Key Storage
- Never commit API keys to version control
- Use environment variables via `src/config/env.ts`
- Rotate keys periodically
- Use separate keys for dev/test/prod
### Webhook Security
- Always verify webhook signatures
- Use HTTPS for webhook endpoints
- Implement idempotency
- Log webhook events for audit
### OAuth Security
- Use state parameter to prevent CSRF
- Store tokens securely (encrypted at rest)
- Implement token refresh before expiration
- Validate token scopes
## Related Documentation
- [OVERVIEW.md](./OVERVIEW.md) - Subagent system overview
- [SECURITY-DEBUG-GUIDE.md](./SECURITY-DEBUG-GUIDE.md) - Security patterns
- [AI-USAGE-GUIDE.md](./AI-USAGE-GUIDE.md) - Gemini API integration
- [../adr/0041-ai-gemini-integration-architecture.md](../adr/0041-ai-gemini-integration-architecture.md) - AI integration ADR
- [../adr/0016-api-security-hardening.md](../adr/0016-api-security-hardening.md) - API security
- [../adr/0048-authentication-strategy.md](../adr/0048-authentication-strategy.md) - Authentication

View File

@@ -89,6 +89,47 @@ Or:
Claude will automatically invoke the appropriate subagent with the relevant context.
## Quick Reference Decision Tree
Use this flowchart to quickly identify the right subagent:
```
What do you need to do?
|
+-- Write/modify code? ----------------> Is it database-related?
| |
| +-- Yes -> db-dev
| +-- No --> Is it frontend?
| |
| +-- Yes -> frontend-specialist
| +-- No --> Is it AI/Gemini?
| |
| +-- Yes -> ai-usage
| +-- No --> coder
|
+-- Test something? -------------------> Write new tests? -> testwriter
| Find bugs/vulnerabilities? -> tester
| Review existing code? -> code-reviewer
|
+-- Debug an issue? -------------------> Production error? -> log-debug
| Database slow? -> db-admin
| External API failing? -> integrations-specialist
| AI extraction failing? -> ai-usage
|
+-- Infrastructure/Deployment? --------> Container/CI/CD? -> devops
| Resource optimization? -> infra-architect
| Background jobs? -> bg-worker
|
+-- Documentation? --------------------> User-facing docs? -> documenter
| ADRs/Technical specs? -> describer-for-ai
| Feature planning? -> planner
| User stories? -> product-owner
|
+-- Security? -------------------------> security-engineer
|
+-- Design/UX? ------------------------> uiux-designer
```
## Subagent Selection Guide
### Which Subagent Should I Use?
@@ -183,12 +224,26 @@ Subagents can pass information back to the main conversation and to each other t
## Related Documentation
- [CODER-GUIDE.md](./CODER-GUIDE.md) - Working with the coder subagent
- [TESTER-GUIDE.md](./TESTER-GUIDE.md) - Testing strategies and patterns
- [DATABASE-GUIDE.md](./DATABASE-GUIDE.md) - Database development workflows
- [DEVOPS-GUIDE.md](./DEVOPS-GUIDE.md) - DevOps and deployment workflows
### Subagent Guides
| Guide | Subagents Covered |
| ---------------------------------------------------- | ----------------------------------------------------- |
| [CODER-GUIDE.md](./CODER-GUIDE.md) | coder |
| [TESTER-GUIDE.md](./TESTER-GUIDE.md) | tester, testwriter |
| [DATABASE-GUIDE.md](./DATABASE-GUIDE.md) | db-dev, db-admin |
| [DEVOPS-GUIDE.md](./DEVOPS-GUIDE.md) | devops, infra-architect, bg-worker |
| [FRONTEND-GUIDE.md](./FRONTEND-GUIDE.md) | frontend-specialist, uiux-designer |
| [SECURITY-DEBUG-GUIDE.md](./SECURITY-DEBUG-GUIDE.md) | security-engineer, log-debug, code-reviewer |
| [AI-USAGE-GUIDE.md](./AI-USAGE-GUIDE.md) | ai-usage |
| [INTEGRATIONS-GUIDE.md](./INTEGRATIONS-GUIDE.md) | integrations-specialist, tools-integration-specialist |
| [DOCUMENTATION-GUIDE.md](./DOCUMENTATION-GUIDE.md) | documenter, describer-for-ai, planner, product-owner |
### Project Documentation
- [../adr/index.md](../adr/index.md) - Architecture Decision Records
- [../TESTING.md](../TESTING.md) - Testing guide
- [../development/TESTING.md](../development/TESTING.md) - Testing guide
- [../development/CODE-PATTERNS.md](../development/CODE-PATTERNS.md) - Code patterns reference
- [../architecture/OVERVIEW.md](../architecture/OVERVIEW.md) - System architecture
## Troubleshooting

View File

@@ -6,6 +6,16 @@ This guide covers security and debugging-focused subagents:
- **log-debug**: Production errors, observability, Bugsink/Sentry analysis
- **code-reviewer**: Code quality, security review, best practices
## Quick Reference
| Aspect | security-engineer | log-debug | code-reviewer |
| --------------- | ---------------------------------- | ---------------------------------------- | --------------------------- |
| **Primary Use** | Security audits, OWASP | Production debugging | Code quality review |
| **Key ADRs** | ADR-016 (Security), ADR-032 (Rate) | ADR-050 (Observability) | ADR-034, ADR-035 (Patterns) |
| **MCP Tools** | N/A | `mcp__bugsink__*`, `mcp__localerrors__*` | N/A |
| **Key Checks** | Auth, input validation, CORS | Logs, stacktraces, error patterns | Patterns, tests, security |
| **Delegate To** | `coder` (fix issues) | `devops` (infra), `coder` (fixes) | `coder`, `testwriter` |
## The security-engineer Subagent
### When to Use
@@ -432,8 +442,10 @@ tail -f /var/log/postgresql/postgresql-$(date +%Y-%m-%d).log | grep "duration:"
- [OVERVIEW.md](./OVERVIEW.md) - Subagent system overview
- [DEVOPS-GUIDE.md](./DEVOPS-GUIDE.md) - Infrastructure debugging
- [TESTER-GUIDE.md](./TESTER-GUIDE.md) - Security testing
- [../adr/0016-api-security-hardening.md](../adr/0016-api-security-hardening.md) - Security ADR
- [../adr/0032-rate-limiting-strategy.md](../adr/0032-rate-limiting-strategy.md) - Rate limiting
- [../adr/0015-application-performance-monitoring-and-error-tracking.md](../adr/0015-application-performance-monitoring-and-error-tracking.md) - Monitoring ADR
- [../adr/0015-error-tracking-and-observability.md](../adr/0015-error-tracking-and-observability.md) - Monitoring ADR
- [../adr/0050-postgresql-function-observability.md](../adr/0050-postgresql-function-observability.md) - Database observability
- [../BARE-METAL-SETUP.md](../BARE-METAL-SETUP.md) - Production setup
- [../operations/BARE-METAL-SETUP.md](../operations/BARE-METAL-SETUP.md) - Production setup
- [../tools/BUGSINK-SETUP.md](../tools/BUGSINK-SETUP.md) - Bugsink configuration

View File

@@ -5,6 +5,17 @@ This guide covers two related but distinct subagents for testing in the Flyer Cr
- **tester**: Adversarial testing to find edge cases, race conditions, and vulnerabilities
- **testwriter**: Creating comprehensive test suites for features and fixes
## Quick Reference
| Aspect | tester | testwriter |
| ---------------- | -------------------------------------------- | ------------------------------------------ |
| **Primary Use** | Find bugs, security issues, edge cases | Create test suites, improve coverage |
| **Key Files** | N/A (analysis-focused) | `*.test.ts`, `src/tests/utils/` |
| **Key ADRs** | ADR-010 (Testing), ADR-040 (Test Economics) | ADR-010 (Testing), ADR-045 (Test Fixtures) |
| **Test Command** | `podman exec -it flyer-crawler-dev npm test` | Same |
| **Test Stack** | Vitest, Supertest, Testing Library | Same |
| **Delegate To** | `testwriter` (write tests for findings) | `coder` (fix failing tests) |
## Understanding the Difference
| Aspect | tester | testwriter |
@@ -399,6 +410,7 @@ A typical workflow for thorough testing:
- [OVERVIEW.md](./OVERVIEW.md) - Subagent system overview
- [CODER-GUIDE.md](./CODER-GUIDE.md) - Working with the coder subagent
- [../TESTING.md](../TESTING.md) - Testing guide
- [SECURITY-DEBUG-GUIDE.md](./SECURITY-DEBUG-GUIDE.md) - Security testing and code review
- [../development/TESTING.md](../development/TESTING.md) - Testing guide
- [../adr/0010-testing-strategy-and-standards.md](../adr/0010-testing-strategy-and-standards.md) - Testing ADR
- [../adr/0040-testing-economics-and-priorities.md](../adr/0040-testing-economics-and-priorities.md) - Testing priorities

View File

@@ -109,10 +109,10 @@ MSYS_NO_PATHCONV=1 podman exec -e DATABASE_URL=postgresql://bugsink:bugsink_dev_
### Production Token
SSH into the production server:
User executes this command on the production server:
```bash
ssh root@projectium.com "cd /opt/bugsink && bugsink-manage create_auth_token"
cd /opt/bugsink && bugsink-manage create_auth_token
```
**Output:** Same format - 40-character hex token.
@@ -795,10 +795,10 @@ podman exec flyer-crawler-dev pg_isready -U bugsink -d bugsink -h postgres
podman exec flyer-crawler-dev psql -U postgres -h postgres -c "\l" | grep bugsink
```
**Production:**
**Production** (user executes on server):
```bash
ssh root@projectium.com "cd /opt/bugsink && bugsink-manage check"
cd /opt/bugsink && bugsink-manage check
```
### PostgreSQL Sequence Out of Sync (Duplicate Key Errors)
@@ -834,10 +834,9 @@ SELECT
END as status;
"
# Production
ssh root@projectium.com "cd /opt/bugsink && bugsink-manage dbshell" <<< "
SELECT MAX(id) as max_id, (SELECT last_value FROM projects_project_id_seq) as seq_value FROM projects_project;
"
# Production (user executes on server)
cd /opt/bugsink && bugsink-manage dbshell
# Then run: SELECT MAX(id) as max_id, (SELECT last_value FROM projects_project_id_seq) as seq_value FROM projects_project;
```
**Solution:**
@@ -850,10 +849,9 @@ podman exec flyer-crawler-dev psql -U bugsink -h postgres -d bugsink -c "
SELECT setval('projects_project_id_seq', COALESCE((SELECT MAX(id) FROM projects_project), 1), true);
"
# Production
ssh root@projectium.com "cd /opt/bugsink && bugsink-manage dbshell" <<< "
SELECT setval('projects_project_id_seq', COALESCE((SELECT MAX(id) FROM projects_project), 1), true);
"
# Production (user executes on server)
cd /opt/bugsink && bugsink-manage dbshell
# Then run: SELECT setval('projects_project_id_seq', COALESCE((SELECT MAX(id) FROM projects_project), 1), true);
```
**Verification:**

View File

@@ -50,7 +50,7 @@ if (fs.existsSync(envPath)) {
} else {
console.warn('[ecosystem-test.config.cjs] No .env file found at:', envPath);
console.warn(
'[ecosystem-test.config.cjs] Environment variables must be provided by the shell or CI/CD.'
'[ecosystem-test.config.cjs] Environment variables must be provided by the shell or CI/CD.',
);
}
@@ -60,12 +60,16 @@ if (fs.existsSync(envPath)) {
// The actual application will fail to start if secrets are missing,
// which PM2 will handle with its restart logic.
const requiredSecrets = ['DB_HOST', 'JWT_SECRET', 'GEMINI_API_KEY'];
const missingSecrets = requiredSecrets.filter(key => !process.env[key]);
const missingSecrets = requiredSecrets.filter((key) => !process.env[key]);
if (missingSecrets.length > 0) {
console.warn('\n[ecosystem.config.test.cjs] WARNING: The following environment variables are MISSING:');
missingSecrets.forEach(key => console.warn(` - ${key}`));
console.warn('[ecosystem.config.test.cjs] The application may fail to start if these are required.\n');
console.warn(
'\n[ecosystem.config.test.cjs] WARNING: The following environment variables are MISSING:',
);
missingSecrets.forEach((key) => console.warn(` - ${key}`));
console.warn(
'[ecosystem.config.test.cjs] The application may fail to start if these are required.\n',
);
} else {
console.log('[ecosystem.config.test.cjs] Critical environment variables are present.');
}

View File

@@ -16,11 +16,13 @@
// The actual application will fail to start if secrets are missing,
// which PM2 will handle with its restart logic.
const requiredSecrets = ['DB_HOST', 'JWT_SECRET', 'GEMINI_API_KEY'];
const missingSecrets = requiredSecrets.filter(key => !process.env[key]);
const missingSecrets = requiredSecrets.filter((key) => !process.env[key]);
if (missingSecrets.length > 0) {
console.warn('\n[ecosystem.config.cjs] WARNING: The following environment variables are MISSING:');
missingSecrets.forEach(key => console.warn(` - ${key}`));
console.warn(
'\n[ecosystem.config.cjs] WARNING: The following environment variables are MISSING:',
);
missingSecrets.forEach((key) => console.warn(` - ${key}`));
console.warn('[ecosystem.config.cjs] The application may fail to start if these are required.\n');
} else {
console.log('[ecosystem.config.cjs] Critical environment variables are present.');

View File

@@ -34,9 +34,7 @@ if (missingVars.length > 0) {
'\n[ecosystem.dev.config.cjs] WARNING: The following environment variables are MISSING:',
);
missingVars.forEach((key) => console.warn(` - ${key}`));
console.warn(
'[ecosystem.dev.config.cjs] These should be set in compose.dev.yml or .env.local\n',
);
console.warn('[ecosystem.dev.config.cjs] These should be set in compose.dev.yml or .env.local\n');
} else {
console.log('[ecosystem.dev.config.cjs] Required environment variables are present.');
}

6090
package-lock.json generated

File diff suppressed because it is too large Load Diff

View File

@@ -1,8 +1,11 @@
{
"name": "flyer-crawler",
"private": true,
"version": "0.12.13",
"version": "0.16.2",
"type": "module",
"engines": {
"node": ">=18.0.0"
},
"scripts": {
"dev": "concurrently \"npm:start:dev\" \"vite\"",
"dev:container": "concurrently \"npm:start:dev\" \"vite --host\"",
@@ -24,14 +27,17 @@
"lint": "eslint . --ext ts,tsx --report-unused-disable-directives --max-warnings 0",
"type-check": "tsc --noEmit",
"validate": "(prettier --check . || true) && npm run type-check && (npm run lint || true)",
"clean": "rimraf coverage .coverage",
"clean": "node scripts/clean.mjs",
"start:dev": "NODE_ENV=development tsx watch server.ts",
"start:prod": "NODE_ENV=production tsx server.ts",
"start:test": "NODE_ENV=test NODE_V8_COVERAGE=.coverage/tmp/integration-server tsx server.ts",
"db:reset:dev": "NODE_ENV=development tsx src/db/seed.ts",
"db:reset:test": "NODE_ENV=test tsx src/db/seed.ts",
"worker:prod": "NODE_ENV=production tsx src/services/queueService.server.ts",
"prepare": "node -e \"try { require.resolve('husky') } catch (e) { process.exit(0) }\" && husky || true"
"prepare": "node -e \"try { require.resolve('husky') } catch (e) { process.exit(0) }\" && husky || true",
"tsoa:spec": "tsoa spec",
"tsoa:routes": "tsoa routes",
"tsoa:build": "tsoa spec-and-routes"
},
"dependencies": {
"@bull-board/api": "^6.14.2",
@@ -74,8 +80,8 @@
"react-router-dom": "^7.9.6",
"recharts": "^3.4.1",
"sharp": "^0.34.5",
"swagger-jsdoc": "^6.2.8",
"swagger-ui-express": "^5.0.1",
"tsoa": "^6.6.0",
"tsx": "^4.20.6",
"zod": "^4.2.1",
"zxcvbn": "^4.4.2",
@@ -110,7 +116,6 @@
"@types/react-dom": "^19.2.3",
"@types/sharp": "^0.31.1",
"@types/supertest": "^6.0.3",
"@types/swagger-jsdoc": "^6.0.4",
"@types/swagger-ui-express": "^4.1.8",
"@types/ws": "^8.18.1",
"@types/zxcvbn": "^4.4.5",
@@ -139,7 +144,6 @@
"pino-pretty": "^13.1.3",
"postcss": "^8.5.6",
"prettier": "^3.3.2",
"rimraf": "^6.1.2",
"supertest": "^7.1.4",
"tailwindcss": "^4.1.17",
"testcontainers": "^11.8.1",

View File

@@ -7,6 +7,7 @@
## Current State Analysis
### What We Have
1.**TanStack Query v5.90.12 already installed** in package.json
2.**Not being used** - Custom hooks reimplementing its functionality
3.**Custom `useInfiniteQuery` hook** ([src/hooks/useInfiniteQuery.ts](../src/hooks/useInfiniteQuery.ts)) using `useState`/`useEffect`
@@ -16,10 +17,12 @@
### Current Data Fetching Patterns
#### Pattern 1: Custom useInfiniteQuery Hook
**Location**: [src/hooks/useInfiniteQuery.ts](../src/hooks/useInfiniteQuery.ts)
**Used By**: [src/providers/FlyersProvider.tsx](../src/providers/FlyersProvider.tsx)
**Problems**:
- Reimplements pagination logic that TanStack Query provides
- Manual loading state management
- Manual error handling
@@ -28,10 +31,12 @@
- No request deduplication
#### Pattern 2: useApiOnMount Hook
**Location**: Unknown (needs investigation)
**Used By**: [src/providers/UserDataProvider.tsx](../src/providers/UserDataProvider.tsx)
**Problems**:
- Fetches data on mount only
- Manual loading/error state management
- No caching between unmount/remount
@@ -42,6 +47,7 @@
### Phase 1: Setup TanStack Query Infrastructure (Day 1)
#### 1.1 Create QueryClient Configuration
**File**: `src/config/queryClient.ts`
```typescript
@@ -51,7 +57,7 @@ export const queryClient = new QueryClient({
defaultOptions: {
queries: {
staleTime: 1000 * 60 * 5, // 5 minutes
gcTime: 1000 * 60 * 30, // 30 minutes (formerly cacheTime)
gcTime: 1000 * 60 * 30, // 30 minutes (formerly cacheTime)
retry: 1,
refetchOnWindowFocus: false,
refetchOnMount: true,
@@ -64,9 +70,11 @@ export const queryClient = new QueryClient({
```
#### 1.2 Wrap App with QueryClientProvider
**File**: `src/providers/AppProviders.tsx`
Add TanStack Query provider at the top level:
```typescript
import { QueryClientProvider } from '@tanstack/react-query';
import { ReactQueryDevtools } from '@tanstack/react-query-devtools';
@@ -158,6 +166,7 @@ export const FlyersProvider: React.FC<{ children: ReactNode }> = ({ children })
```
**Benefits**:
- ~100 lines of code removed
- Automatic caching
- Background refetching
@@ -170,6 +179,7 @@ export const FlyersProvider: React.FC<{ children: ReactNode }> = ({ children })
**Action**: Use TanStack Query's `useQuery` for watched items and shopping lists
**New Files**:
- `src/hooks/queries/useWatchedItemsQuery.ts`
- `src/hooks/queries/useShoppingListsQuery.ts`
@@ -208,6 +218,7 @@ export const useShoppingListsQuery = (enabled: boolean) => {
```
**Updated Provider**:
```typescript
import React, { ReactNode, useMemo } from 'react';
import { UserDataContext } from '../contexts/UserDataContext';
@@ -240,6 +251,7 @@ export const UserDataProvider: React.FC<{ children: ReactNode }> = ({ children }
```
**Benefits**:
- ~40 lines of code removed
- No manual state synchronization
- Automatic cache invalidation on user logout
@@ -292,7 +304,7 @@ export const useUpdateShoppingListMutation = () => {
// Optimistically update
queryClient.setQueryData(['shopping-lists'], (old) =>
old.map((list) => (list.id === newList.id ? newList : list))
old.map((list) => (list.id === newList.id ? newList : list)),
);
return { previousLists };
@@ -313,20 +325,24 @@ export const useUpdateShoppingListMutation = () => {
### Phase 4: Remove Old Custom Hooks (Day 9)
#### Files to Remove:
-`src/hooks/useInfiniteQuery.ts` (if not used elsewhere)
-`src/hooks/useApiOnMount.ts` (needs investigation)
#### Files to Update:
- Update any remaining usages in other components
### Phase 5: Testing & Documentation (Day 10)
#### 5.1 Update Tests
- Update provider tests to work with QueryClient
- Add tests for new query hooks
- Add tests for mutation hooks
#### 5.2 Update Documentation
- Mark ADR-0005 as **Accepted** and **Implemented**
- Add usage examples to documentation
- Update developer onboarding guide
@@ -334,11 +350,13 @@ export const useUpdateShoppingListMutation = () => {
## Migration Checklist
### Prerequisites
- [x] TanStack Query installed
- [ ] QueryClient configuration created
- [ ] App wrapped with QueryClientProvider
### Queries
- [ ] Flyers infinite query migrated
- [ ] Watched items query migrated
- [ ] Shopping lists query migrated
@@ -346,6 +364,7 @@ export const useUpdateShoppingListMutation = () => {
- [ ] Active deals query migrated (if applicable)
### Mutations
- [ ] Add watched item mutation
- [ ] Remove watched item mutation
- [ ] Update shopping list mutation
@@ -353,12 +372,14 @@ export const useUpdateShoppingListMutation = () => {
- [ ] Remove shopping list item mutation
### Cleanup
- [ ] Remove custom useInfiniteQuery hook
- [ ] Remove custom useApiOnMount hook
- [ ] Update all tests
- [ ] Remove redundant state management code
### Documentation
- [ ] Update ADR-0005 status to "Accepted"
- [ ] Add usage guidelines to README
- [ ] Document query key conventions
@@ -367,10 +388,12 @@ export const useUpdateShoppingListMutation = () => {
## Benefits Summary
### Code Reduction
- **Estimated**: ~300-500 lines of custom hook code removed
- **Result**: Simpler, more maintainable codebase
### Performance Improvements
- ✅ Automatic request deduplication
- ✅ Background data synchronization
- ✅ Smart cache invalidation
@@ -378,12 +401,14 @@ export const useUpdateShoppingListMutation = () => {
- ✅ Automatic retry logic
### Developer Experience
- ✅ React Query Devtools for debugging
- ✅ Type-safe query hooks
- ✅ Standardized patterns across the app
- ✅ Less boilerplate code
### User Experience
- ✅ Faster perceived performance (cached data)
- ✅ Better offline experience
- ✅ Smoother UI interactions (optimistic updates)
@@ -392,11 +417,13 @@ export const useUpdateShoppingListMutation = () => {
## Risk Assessment
### Low Risk
- TanStack Query is industry-standard
- Already installed in project
- Incremental migration possible
### Mitigation Strategies
1. **Test thoroughly** - Maintain existing test coverage
2. **Migrate incrementally** - One provider at a time
3. **Monitor performance** - Use React Query Devtools

View File

@@ -45,6 +45,7 @@ Successfully completed Phase 2 of ADR-0005 enforcement by migrating all remainin
## Code Reduction Summary
### Phase 1 + Phase 2 Combined
- **Total custom state management code removed**: ~200 lines
- **New query hooks created**: 5 files (~200 lines of standardized code)
- **Providers simplified**: 4 files
@@ -53,34 +54,38 @@ Successfully completed Phase 2 of ADR-0005 enforcement by migrating all remainin
## Technical Improvements
### 1. Intelligent Caching Strategy
```typescript
// Master items (rarely change) - 10 min stale time
useMasterItemsQuery() // staleTime: 10 minutes
useMasterItemsQuery(); // staleTime: 10 minutes
// Flyers (moderate changes) - 2 min stale time
useFlyersQuery() // staleTime: 2 minutes
useFlyersQuery(); // staleTime: 2 minutes
// User data (frequent changes) - 1 min stale time
useWatchedItemsQuery() // staleTime: 1 minute
useShoppingListsQuery() // staleTime: 1 minute
useWatchedItemsQuery(); // staleTime: 1 minute
useShoppingListsQuery(); // staleTime: 1 minute
// Flyer items (static) - 5 min stale time
useFlyerItemsQuery() // staleTime: 5 minutes
useFlyerItemsQuery(); // staleTime: 5 minutes
```
### 2. Per-Resource Caching
Each flyer's items are cached separately:
```typescript
// Flyer 1 items cached with key: ['flyer-items', 1]
useFlyerItemsQuery(1)
useFlyerItemsQuery(1);
// Flyer 2 items cached with key: ['flyer-items', 2]
useFlyerItemsQuery(2)
useFlyerItemsQuery(2);
// Both caches persist independently
```
### 3. Automatic Query Disabling
```typescript
// Query automatically disabled when flyerId is undefined
const { data } = useFlyerItemsQuery(selectedFlyer?.flyer_id);
@@ -90,24 +95,28 @@ const { data } = useFlyerItemsQuery(selectedFlyer?.flyer_id);
## Benefits Achieved
### Performance
-**Reduced API calls** - Data cached between component unmounts
-**Background refetching** - Stale data updates in background
-**Request deduplication** - Multiple components can use same query
-**Optimized cache times** - Different strategies for different data types
### Code Quality
-**Removed ~50 more lines** of custom state management
-**Eliminated useApiOnMount** from all providers
-**Standardized patterns** - All queries follow same structure
-**Better type safety** - TypeScript types flow through queries
### Developer Experience
-**React Query Devtools** - Inspect all queries and cache
-**Easier debugging** - Clear query states and transitions
-**Less boilerplate** - No manual loading/error state management
-**Automatic retries** - Failed queries retry automatically
### User Experience
-**Faster perceived performance** - Cached data shows instantly
-**Fresh data** - Background refetching keeps data current
-**Better offline handling** - Cached data available offline
@@ -116,12 +125,14 @@ const { data } = useFlyerItemsQuery(selectedFlyer?.flyer_id);
## Remaining Work
### Phase 3: Mutations (Next)
- [ ] Create mutation hooks for data modifications
- [ ] Add/remove watched items with optimistic updates
- [ ] Shopping list CRUD operations
- [ ] Proper cache invalidation strategies
### Phase 4: Cleanup (Final)
- [ ] Remove `useApiOnMount` hook entirely
- [ ] Remove `useApi` hook if no longer used
- [ ] Remove stub implementations in providers
@@ -159,10 +170,13 @@ Before merging, test the following:
## Migration Notes
### Breaking Changes
None! All providers maintain the same interface.
### Deprecation Warnings
The following will log warnings if used:
- `setWatchedItems()` in UserDataProvider
- `setShoppingLists()` in UserDataProvider

View File

@@ -12,6 +12,7 @@ Successfully completed Phase 3 of ADR-0005 enforcement by creating all mutation
### Mutation Hooks
All mutation hooks follow a consistent pattern:
- Automatic cache invalidation via `queryClient.invalidateQueries()`
- Success/error notifications via notification service
- Proper TypeScript types for parameters
@@ -113,15 +114,12 @@ function WatchedItemsManager() {
{
onSuccess: () => console.log('Added to watched list!'),
onError: (error) => console.error('Failed:', error),
}
},
);
};
return (
<button
onClick={handleAdd}
disabled={addWatchedItem.isPending}
>
<button onClick={handleAdd} disabled={addWatchedItem.isPending}>
{addWatchedItem.isPending ? 'Adding...' : 'Add to Watched List'}
</button>
);
@@ -134,7 +132,7 @@ function WatchedItemsManager() {
import {
useCreateShoppingListMutation,
useAddShoppingListItemMutation,
useUpdateShoppingListItemMutation
useUpdateShoppingListItemMutation,
} from '../hooks/mutations';
function ShoppingListManager() {
@@ -149,14 +147,14 @@ function ShoppingListManager() {
const handleAddItem = (listId: number, masterItemId: number) => {
addItem.mutate({
listId,
item: { masterItemId }
item: { masterItemId },
});
};
const handleMarkPurchased = (itemId: number) => {
updateItem.mutate({
itemId,
updates: { is_purchased: true }
updates: { is_purchased: true },
});
};
@@ -172,23 +170,27 @@ function ShoppingListManager() {
## Benefits Achieved
### Performance
-**Automatic cache updates** - Queries automatically refetch after mutations
-**Request deduplication** - Multiple mutation calls are properly queued
-**Optimistic updates ready** - Infrastructure in place for Phase 4
### Code Quality
-**Standardized pattern** - All mutations follow the same structure
-**Comprehensive documentation** - JSDoc with examples for every hook
-**Type safety** - Full TypeScript types for all parameters
-**Error handling** - Consistent error handling and user notifications
### Developer Experience
-**React Query Devtools** - Inspect mutation states in real-time
-**Easy imports** - Barrel export for clean imports
-**Consistent API** - Same pattern across all mutations
-**Built-in loading states** - `isPending`, `isError`, `isSuccess` states
### User Experience
-**Automatic notifications** - Success/error toasts on all mutations
-**Fresh data** - Queries automatically update after mutations
-**Loading states** - UI can show loading indicators during mutations
@@ -197,6 +199,7 @@ function ShoppingListManager() {
## Current State
### Completed
- ✅ All 7 mutation hooks created
- ✅ Barrel export created for easy imports
- ✅ Comprehensive documentation with examples
@@ -225,12 +228,14 @@ These hooks are actively used throughout the application and will need careful r
### Phase 4: Hook Refactoring & Cleanup
#### Step 1: Refactor useWatchedItems
- [ ] Replace `useApi` calls with mutation hooks
- [ ] Remove manual state management logic
- [ ] Simplify to just wrap mutation hooks with custom logic
- [ ] Update all tests
#### Step 2: Refactor useShoppingLists
- [ ] Replace `useApi` calls with mutation hooks
- [ ] Remove manual state management logic
- [ ] Remove complex state synchronization
@@ -238,17 +243,20 @@ These hooks are actively used throughout the application and will need careful r
- [ ] Update all tests
#### Step 3: Remove Deprecated Code
- [ ] Remove `setWatchedItems` from UserDataContext
- [ ] Remove `setShoppingLists` from UserDataContext
- [ ] Remove `useApi` hook (if no longer used)
- [ ] Remove `useApiOnMount` hook (already deprecated)
#### Step 4: Add Optimistic Updates (Optional)
- [ ] Implement optimistic updates for better UX
- [ ] Use `onMutate` to update cache before server response
- [ ] Implement rollback on error
#### Step 5: Documentation & Testing
- [ ] Update all component documentation
- [ ] Update developer onboarding guide
- [ ] Add integration tests for mutation flows

View File

@@ -41,13 +41,13 @@ Successfully completed Phase 4 of ADR-0005 enforcement by refactoring the remain
### Phase 1-4 Combined
| Metric | Before | After | Reduction |
|--------|--------|-------|-----------|
| **useWatchedItems** | 77 lines | 71 lines | -6 lines (cleaner) |
| **useShoppingLists** | 222 lines | 176 lines | -46 lines (-21%) |
| **Manual state management** | ~150 lines | 0 lines | -150 lines (100%) |
| **useApi dependencies** | 7 hooks | 0 hooks | -7 dependencies |
| **Total for Phase 4** | 299 lines | 247 lines | **-52 lines (-17%)** |
| Metric | Before | After | Reduction |
| --------------------------- | ---------- | --------- | -------------------- |
| **useWatchedItems** | 77 lines | 71 lines | -6 lines (cleaner) |
| **useShoppingLists** | 222 lines | 176 lines | -46 lines (-21%) |
| **Manual state management** | ~150 lines | 0 lines | -150 lines (100%) |
| **useApi dependencies** | 7 hooks | 0 hooks | -7 dependencies |
| **Total for Phase 4** | 299 lines | 247 lines | **-52 lines (-17%)** |
### Overall ADR-0005 Impact (Phases 1-4)
@@ -61,45 +61,54 @@ Successfully completed Phase 4 of ADR-0005 enforcement by refactoring the remain
### 1. Simplified useWatchedItems
**Before (useApi pattern):**
```typescript
const { execute: addWatchedItemApi, error: addError } = useApi<MasterGroceryItem, [string, string]>(
(itemName, category) => apiClient.addWatchedItem(itemName, category)
(itemName, category) => apiClient.addWatchedItem(itemName, category),
);
const addWatchedItem = useCallback(async (itemName: string, category: string) => {
if (!userProfile) return;
const updatedOrNewItem = await addWatchedItemApi(itemName, category);
const addWatchedItem = useCallback(
async (itemName: string, category: string) => {
if (!userProfile) return;
const updatedOrNewItem = await addWatchedItemApi(itemName, category);
if (updatedOrNewItem) {
setWatchedItems((currentItems) => {
const itemExists = currentItems.some(
(item) => item.master_grocery_item_id === updatedOrNewItem.master_grocery_item_id
);
if (!itemExists) {
return [...currentItems, updatedOrNewItem].sort((a, b) => a.name.localeCompare(b.name));
}
return currentItems;
});
}
}, [userProfile, setWatchedItems, addWatchedItemApi]);
if (updatedOrNewItem) {
setWatchedItems((currentItems) => {
const itemExists = currentItems.some(
(item) => item.master_grocery_item_id === updatedOrNewItem.master_grocery_item_id,
);
if (!itemExists) {
return [...currentItems, updatedOrNewItem].sort((a, b) => a.name.localeCompare(b.name));
}
return currentItems;
});
}
},
[userProfile, setWatchedItems, addWatchedItemApi],
);
```
**After (TanStack Query):**
```typescript
const addWatchedItemMutation = useAddWatchedItemMutation();
const addWatchedItem = useCallback(async (itemName: string, category: string) => {
if (!userProfile) return;
const addWatchedItem = useCallback(
async (itemName: string, category: string) => {
if (!userProfile) return;
try {
await addWatchedItemMutation.mutateAsync({ itemName, category });
} catch (error) {
console.error('useWatchedItems: Failed to add item', error);
}
}, [userProfile, addWatchedItemMutation]);
try {
await addWatchedItemMutation.mutateAsync({ itemName, category });
} catch (error) {
console.error('useWatchedItems: Failed to add item', error);
}
},
[userProfile, addWatchedItemMutation],
);
```
**Benefits:**
- No manual state updates
- Cache automatically invalidated
- Success/error notifications handled
@@ -108,6 +117,7 @@ const addWatchedItem = useCallback(async (itemName: string, category: string) =>
### 2. Dramatically Simplified useShoppingLists
**Before:** 222 lines with:
- 5 separate `useApi` hooks
- Complex manual state synchronization
- Client-side duplicate checking
@@ -115,6 +125,7 @@ const addWatchedItem = useCallback(async (itemName: string, category: string) =>
- Try-catch blocks for each operation
**After:** 176 lines with:
- 5 TanStack Query mutation hooks
- Zero manual state management
- Server-side validation
@@ -122,6 +133,7 @@ const addWatchedItem = useCallback(async (itemName: string, category: string) =>
- Consistent error handling
**Removed Complexity:**
```typescript
// OLD: Manual state update with complex logic
const addItemToList = useCallback(async (listId: number, item: {...}) => {
@@ -158,6 +170,7 @@ const addItemToList = useCallback(async (listId: number, item: {...}) => {
```
**NEW: Simple mutation call:**
```typescript
const addItemToList = useCallback(async (listId: number, item: {...}) => {
if (!userProfile) return;
@@ -173,18 +186,20 @@ const addItemToList = useCallback(async (listId: number, item: {...}) => {
### 3. Cleaner Context Interface
**Before:**
```typescript
export interface UserDataContextType {
watchedItems: MasterGroceryItem[];
shoppingLists: ShoppingList[];
setWatchedItems: React.Dispatch<React.SetStateAction<MasterGroceryItem[]>>; // ❌ Removed
setShoppingLists: React.Dispatch<React.SetStateAction<ShoppingList[]>>; // ❌ Removed
setWatchedItems: React.Dispatch<React.SetStateAction<MasterGroceryItem[]>>; // ❌ Removed
setShoppingLists: React.Dispatch<React.SetStateAction<ShoppingList[]>>; // ❌ Removed
isLoading: boolean;
error: string | null;
}
```
**After:**
```typescript
export interface UserDataContextType {
watchedItems: MasterGroceryItem[];
@@ -195,6 +210,7 @@ export interface UserDataContextType {
```
**Why this matters:**
- Context now truly represents "server state" (read-only from context perspective)
- Mutations are handled separately via mutation hooks
- Clear separation of concerns: queries for reads, mutations for writes
@@ -202,12 +218,14 @@ export interface UserDataContextType {
## Benefits Achieved
### Performance
-**Eliminated redundant refetches** - No more manual state sync causing stale data
-**Automatic cache updates** - Mutations invalidate queries automatically
-**Optimistic updates ready** - Infrastructure supports adding optimistic updates in future
-**Reduced bundle size** - 52 lines less code in custom hooks
### Code Quality
-**Removed 150+ lines** of manual state management across all hooks
-**Eliminated useApi dependency** from user-facing hooks
-**Consistent error handling** - All mutations use same pattern
@@ -215,12 +233,14 @@ export interface UserDataContextType {
-**Removed complex logic** - No more client-side duplicate checking
### Developer Experience
-**Simpler hook implementations** - 46 lines less in useShoppingLists alone
-**Easier debugging** - React Query Devtools show all mutations
-**Type safety** - Mutation hooks provide full TypeScript types
-**Consistent patterns** - All operations follow same mutation pattern
### User Experience
-**Automatic notifications** - Success/error toasts on all operations
-**Fresh data** - Cache automatically updates after mutations
-**Better error messages** - Server-side validation provides better feedback
@@ -231,6 +251,7 @@ export interface UserDataContextType {
### Breaking Changes
**Direct UserDataContext usage:**
```typescript
// ❌ OLD: This no longer works
const { setWatchedItems } = useUserData();
@@ -245,6 +266,7 @@ addWatchedItem.mutate({ itemName: 'Milk', category: 'Dairy' });
### Non-Breaking Changes
**Custom hooks maintain backward compatibility:**
```typescript
// ✅ STILL WORKS: Custom hooks maintain same interface
const { addWatchedItem, removeWatchedItem } = useWatchedItems();
@@ -273,6 +295,7 @@ addWatchedItem.mutate({ itemName: 'Milk', category: 'Dairy' });
### Testing Approach
**Current tests mock useApi:**
```typescript
vi.mock('./useApi');
const mockedUseApi = vi.mocked(useApi);
@@ -280,6 +303,7 @@ mockedUseApi.mockReturnValue({ execute: mockFn, error: null, loading: false });
```
**New tests should mock mutations:**
```typescript
vi.mock('./mutations', () => ({
useAddWatchedItemMutation: vi.fn(),
@@ -300,17 +324,20 @@ useAddWatchedItemMutation.mockReturnValue({
## Remaining Work
### Immediate Follow-Up (Phase 4.5)
- [ ] Update [src/hooks/useWatchedItems.test.tsx](../src/hooks/useWatchedItems.test.tsx)
- [ ] Update [src/hooks/useShoppingLists.test.tsx](../src/hooks/useShoppingLists.test.tsx)
- [ ] Add integration tests for mutation flows
### Phase 5: Admin Features (Next)
- [ ] Create query hooks for admin features
- [ ] Migrate ActivityLog.tsx
- [ ] Migrate AdminStatsPage.tsx
- [ ] Migrate CorrectionsPage.tsx
### Phase 6: Final Cleanup
- [ ] Remove `useApi` hook (no longer used by core features)
- [ ] Remove `useApiOnMount` hook (deprecated)
- [ ] Remove custom `useInfiniteQuery` hook (deprecated)
@@ -350,12 +377,14 @@ None! Phase 4 implementation is complete and working.
## Performance Metrics
### Before Phase 4
- Multiple redundant state updates per mutation
- Client-side validation adding latency
- Complex nested state updates causing re-renders
- Manual cache synchronization prone to bugs
### After Phase 4
- Single mutation triggers automatic cache update
- Server-side validation (proper place for business logic)
- Simple refetch after mutation (no manual updates)
@@ -372,6 +401,7 @@ None! Phase 4 implementation is complete and working.
Phase 4 successfully refactored the remaining custom hooks (`useWatchedItems` and `useShoppingLists`) to use TanStack Query mutations, eliminating all manual state management for user-facing features. The codebase is now significantly simpler, more maintainable, and follows consistent patterns throughout.
**Key Achievements:**
- Removed 52 lines of code from custom hooks
- Eliminated 7 `useApi` dependencies
- Removed 150+ lines of manual state management
@@ -380,6 +410,7 @@ Phase 4 successfully refactored the remaining custom hooks (`useWatchedItems` an
- Zero regressions in functionality
**Next Steps**:
1. Update tests for refactored hooks (Phase 4.5 - follow-up)
2. Proceed to Phase 5 to migrate admin features
3. Final cleanup in Phase 6

View File

@@ -100,6 +100,7 @@ Successfully completed Phase 5 of ADR-0005 by migrating all admin features from
### Before (Manual State Management)
**ActivityLog.tsx - Before:**
```typescript
const [logs, setLogs] = useState<ActivityLogItem[]>([]);
const [isLoading, setIsLoading] = useState(true);
@@ -116,8 +117,7 @@ useEffect(() => {
setError(null);
try {
const response = await fetchActivityLog(20, 0);
if (!response.ok)
throw new Error((await response.json()).message || 'Failed to fetch logs');
if (!response.ok) throw new Error((await response.json()).message || 'Failed to fetch logs');
setLogs(await response.json());
} catch (err) {
setError(err instanceof Error ? err.message : 'Failed to load activity.');
@@ -131,6 +131,7 @@ useEffect(() => {
```
**ActivityLog.tsx - After:**
```typescript
const { data: logs = [], isLoading, error } = useActivityLogQuery(20, 0);
```
@@ -138,6 +139,7 @@ const { data: logs = [], isLoading, error } = useActivityLogQuery(20, 0);
### Before (Manual Parallel Fetching)
**CorrectionsPage.tsx - Before:**
```typescript
const [corrections, setCorrections] = useState<SuggestedCorrection[]>([]);
const [isLoading, setIsLoading] = useState(true);
@@ -172,6 +174,7 @@ useEffect(() => {
```
**CorrectionsPage.tsx - After:**
```typescript
const {
data: corrections = [],
@@ -180,15 +183,9 @@ const {
refetch: refetchCorrections,
} = useSuggestedCorrectionsQuery();
const {
data: masterItems = [],
isLoading: isLoadingMasterItems,
} = useMasterItemsQuery();
const { data: masterItems = [], isLoading: isLoadingMasterItems } = useMasterItemsQuery();
const {
data: categories = [],
isLoading: isLoadingCategories,
} = useCategoriesQuery();
const { data: categories = [], isLoading: isLoadingCategories } = useCategoriesQuery();
const isLoading = isLoadingCorrections || isLoadingMasterItems || isLoadingCategories;
const error = correctionsError?.message || null;
@@ -197,12 +194,14 @@ const error = correctionsError?.message || null;
## Benefits Achieved
### Performance
-**Automatic parallel fetching** - CorrectionsPage fetches 3 queries simultaneously
-**Shared cache** - Multiple components can reuse the same queries
-**Smart refetching** - Queries refetch on window focus automatically
-**Stale-while-revalidate** - Shows cached data while fetching fresh data
### Code Quality
-**~77 lines removed** from admin components (-20% average)
-**Eliminated manual state management** for all admin queries
-**Consistent error handling** across all admin features
@@ -210,6 +209,7 @@ const error = correctionsError?.message || null;
-**Removed complex Promise.all logic** from CorrectionsPage
### Developer Experience
-**Simpler component code** - Focus on UI, not data fetching
-**Easier debugging** - React Query Devtools show all queries
-**Type safety** - Query hooks provide full TypeScript types
@@ -217,6 +217,7 @@ const error = correctionsError?.message || null;
-**Consistent patterns** - All admin features follow same query pattern
### User Experience
-**Faster perceived performance** - Show cached data instantly
-**Background updates** - Data refreshes without loading spinners
-**Network resilience** - Automatic retry on failure
@@ -224,12 +225,12 @@ const error = correctionsError?.message || null;
## Code Reduction Summary
| Component | Before | After | Reduction |
|-----------|--------|-------|-----------|
| **ActivityLog.tsx** | 158 lines | 133 lines | -25 lines (-16%) |
| **AdminStatsPage.tsx** | 104 lines | 78 lines | -26 lines (-25%) |
| Component | Before | After | Reduction |
| ----------------------- | ----------------------- | ----------------- | --------------------------- |
| **ActivityLog.tsx** | 158 lines | 133 lines | -25 lines (-16%) |
| **AdminStatsPage.tsx** | 104 lines | 78 lines | -26 lines (-25%) |
| **CorrectionsPage.tsx** | ~120 lines (state mgmt) | ~50 lines (hooks) | ~70 lines (-58% state code) |
| **Total Reduction** | ~382 lines | ~261 lines | **~121 lines (-32%)** |
| **Total Reduction** | ~382 lines | ~261 lines | **~121 lines (-32%)** |
**Note**: CorrectionsPage reduction is approximate as the full component includes rendering logic that wasn't changed.
@@ -334,6 +335,7 @@ export const AdminComponent: React.FC = () => {
All changes are backward compatible at the component level. Components maintain their existing props and behavior.
**Example: ActivityLog component still accepts same props:**
```typescript
interface ActivityLogProps {
userProfile: UserProfile | null;

View File

@@ -2,7 +2,8 @@
**Date**: 2026-01-08
**Environment**: Windows 10, VSCode with Claude Code integration
**Configuration Files**:
**Configuration Files**:
- [`mcp.json`](c:/Users/games3/AppData/Roaming/Code/User/mcp.json:1)
- [`mcp-servers.json`](c:/Users/games3/AppData/Roaming/Code/User/globalStorage/mcp-servers.json:1)
@@ -13,6 +14,7 @@
You have **8 MCP servers** configured in your environment. These servers extend Claude's capabilities by providing specialized tools for browser automation, file conversion, Git hosting integration, container management, filesystem access, and HTTP requests.
**Key Findings**:
- ✅ 7 servers are properly configured and ready to test
- ⚠️ 1 server requires token update (gitea-lan)
- 📋 Testing guide and automated script provided
@@ -23,11 +25,13 @@ You have **8 MCP servers** configured in your environment. These servers extend
## MCP Server Inventory
### 1. Chrome DevTools MCP Server
**Status**: ✅ Configured
**Type**: Browser Automation
**Command**: `npx -y chrome-devtools-mcp@latest`
**Capabilities**:
- Launch and control Chrome browser
- Navigate to URLs
- Click elements and interact with DOM
@@ -36,6 +40,7 @@ You have **8 MCP servers** configured in your environment. These servers extend
- Execute JavaScript in browser context
**Use Cases**:
- Web scraping
- Automated testing
- UI verification
@@ -43,6 +48,7 @@ You have **8 MCP servers** configured in your environment. These servers extend
- Debugging frontend issues
**Configuration Details**:
- Headless mode: Enabled
- Isolated: False (shares browser state)
- Channel: Stable
@@ -50,11 +56,13 @@ You have **8 MCP servers** configured in your environment. These servers extend
---
### 2. Markitdown MCP Server
**Status**: ✅ Configured
**Type**: File Conversion
**Command**: `C:\Users\games3\.local\bin\uvx.exe markitdown-mcp`
**Capabilities**:
- Convert PDF files to markdown
- Convert DOCX files to markdown
- Convert HTML to markdown
@@ -62,24 +70,28 @@ You have **8 MCP servers** configured in your environment. These servers extend
- Convert PowerPoint presentations
**Use Cases**:
- Document processing
- Content extraction from various formats
- Making documents AI-readable
- Converting legacy documents to markdown
**Notes**:
- Requires Python and `uvx` to be installed
- Uses Microsoft's Markitdown library
---
### 3. Gitea Torbonium
**Status**: ✅ Configured
**Type**: Git Hosting Integration
**Host**: https://gitea.torbonium.com
**Command**: `d:\gitea-mcp\gitea-mcp.exe run -t stdio`
**Capabilities**:
- List and manage repositories
- Create and update issues
- Manage pull requests
@@ -89,6 +101,7 @@ You have **8 MCP servers** configured in your environment. These servers extend
- Manage repository settings
**Use Cases**:
- Automated issue creation
- Repository management
- Code review automation
@@ -96,12 +109,14 @@ You have **8 MCP servers** configured in your environment. These servers extend
- Release management
**Configuration**:
- Token: Configured (ending in ...fcf8)
- Access: Full API access based on token permissions
---
### 4. Gitea LAN (Torbolan)
**Status**: ⚠️ Requires Configuration
**Type**: Git Hosting Integration
**Host**: https://gitea.torbolan.com
@@ -110,6 +125,7 @@ You have **8 MCP servers** configured in your environment. These servers extend
**Issue**: Access token is set to `REPLACE_WITH_NEW_TOKEN`
**Action Required**:
1. Log into https://gitea.torbolan.com
2. Navigate to Settings → Applications
3. Generate a new access token
@@ -120,6 +136,7 @@ You have **8 MCP servers** configured in your environment. These servers extend
---
### 5. Gitea Projectium
**Status**: ✅ Configured
**Type**: Git Hosting Integration
**Host**: https://gitea.projectium.com
@@ -128,6 +145,7 @@ You have **8 MCP servers** configured in your environment. These servers extend
**Capabilities**: Same as Gitea Torbonium
**Configuration**:
- Token: Configured (ending in ...9ef)
- This appears to be the Gitea instance for your current project
@@ -136,11 +154,13 @@ You have **8 MCP servers** configured in your environment. These servers extend
---
### 6. Podman/Docker MCP Server
**Status**: ✅ Configured
**Type**: Container Management
**Command**: `npx -y @modelcontextprotocol/server-docker`
**Capabilities**:
- List running containers
- Start and stop containers
- View container logs
@@ -150,6 +170,7 @@ You have **8 MCP servers** configured in your environment. These servers extend
- Create and manage networks
**Use Cases**:
- Container orchestration
- Development environment management
- Log analysis
@@ -157,22 +178,26 @@ You have **8 MCP servers** configured in your environment. These servers extend
- Image management
**Configuration**:
- Docker Host: `npipe:////./pipe/docker_engine`
- Requires: Docker Desktop or Podman running on Windows
**Prerequisites**:
- Docker Desktop must be running
- Named pipe access configured
---
### 7. Filesystem MCP Server
**Status**: ✅ Configured
**Type**: File System Access
**Path**: `D:\gitea\flyer-crawler.projectium.com\flyer-crawler.projectium.com`
**Command**: `npx -y @modelcontextprotocol/server-filesystem`
**Capabilities**:
- List directory contents recursively
- Read file contents
- Write and modify files
@@ -181,27 +206,31 @@ You have **8 MCP servers** configured in your environment. These servers extend
- Create and delete files/directories
**Use Cases**:
- Project file management
- Bulk file operations
- Code generation and modifications
- File content analysis
- Project structure exploration
**Security Note**:
**Security Note**:
This server has full read/write access to your project directory. It operates within the specified directory only.
**Scope**:
**Scope**:
- Limited to: `D:\gitea\flyer-crawler.projectium.com\flyer-crawler.projectium.com`
- Cannot access files outside this directory
---
### 8. Fetch MCP Server
**Status**: ✅ Configured
**Type**: HTTP Client
**Command**: `npx -y @modelcontextprotocol/server-fetch`
**Capabilities**:
- Send HTTP GET requests
- Send HTTP POST requests
- Send PUT, DELETE, PATCH requests
@@ -211,6 +240,7 @@ This server has full read/write access to your project directory. It operates wi
- Handle authentication
**Use Cases**:
- API testing
- Web scraping
- Data fetching from external services
@@ -218,6 +248,7 @@ This server has full read/write access to your project directory. It operates wi
- Integration with external APIs
**Examples**:
- Fetch data from REST APIs
- Download web content
- Test API endpoints
@@ -228,11 +259,12 @@ This server has full read/write access to your project directory. It operates wi
## Current Status: MCP Server Tool Availability
**Important Note**: While these MCP servers are configured in your environment, they are **not currently exposed as callable tools** in this Claude Code session.
**Important Note**: While these MCP servers are configured in your environment, they are **not currently exposed as callable tools** in this Claude Code session.
### What This Means:
MCP servers typically work by:
1. Running as separate processes
2. Exposing tools and resources via the Model Context Protocol
3. Being connected to the AI assistant by the client application (VSCode)
@@ -240,12 +272,14 @@ MCP servers typically work by:
### Current Situation:
In the current session, Claude Code has access to:
- ✅ Built-in file operations (read, write, search, list)
- ✅ Browser actions
- ✅ Mode switching
- ✅ Task management tools
But does **NOT** have direct access to:
- ❌ MCP server-specific tools (e.g., Gitea API operations)
- ❌ Chrome DevTools controls
- ❌ Markitdown conversion functions
@@ -255,6 +289,7 @@ But does **NOT** have direct access to:
### Why This Happens:
MCP servers need to be:
1. Actively connected by the client (VSCode)
2. Running in the background
3. Properly registered with the AI assistant
@@ -277,6 +312,7 @@ cd plans
```
This will:
- Test each server's basic functionality
- Check API connectivity for Gitea servers
- Verify Docker daemon access
@@ -297,6 +333,7 @@ mcp-inspector npx -y @modelcontextprotocol/server-filesystem "D:\gitea\flyer-cra
```
The inspector provides a web UI to:
- View available tools
- Test tool invocations
- See real-time logs
@@ -343,14 +380,14 @@ Follow the comprehensive guide in [`mcp-server-testing-guide.md`](plans/mcp-serv
## MCP Server Use Case Matrix
| Server | Code Analysis | Testing | Deployment | Documentation | API Integration |
|--------|--------------|---------|------------|---------------|-----------------|
| Chrome DevTools | ✓ (UI testing) | ✓✓✓ | - | ✓ (screenshots) | ✓ |
| Markitdown | - | - | - | ✓✓✓ | - |
| Gitea (all 3) | ✓✓✓ | ✓ | ✓✓✓ | ✓✓ | ✓✓✓ |
| Docker | ✓ | ✓✓✓ | ✓✓✓ | - | ✓ |
| Filesystem | ✓✓✓ | ✓✓ | ✓ | ✓✓ | ✓ |
| Fetch | ✓ | ✓✓ | ✓ | - | ✓✓✓ |
| Server | Code Analysis | Testing | Deployment | Documentation | API Integration |
| --------------- | -------------- | ------- | ---------- | --------------- | --------------- |
| Chrome DevTools | ✓ (UI testing) | ✓✓✓ | - | ✓ (screenshots) | ✓ |
| Markitdown | - | - | - | ✓✓✓ | - |
| Gitea (all 3) | ✓✓✓ | ✓ | ✓✓✓ | ✓✓ | ✓✓✓ |
| Docker | ✓ | ✓✓✓ | ✓✓✓ | - | ✓ |
| Filesystem | ✓✓✓ | ✓✓ | ✓ | ✓✓ | ✓ |
| Fetch | ✓ | ✓✓ | ✓ | - | ✓✓✓ |
Legend: ✓✓✓ = Primary use case, ✓✓ = Strong use case, ✓ = Applicable, - = Not applicable
@@ -359,12 +396,14 @@ Legend: ✓✓✓ = Primary use case, ✓✓ = Strong use case, ✓ = Applicable
## Potential Workflows
### Workflow 1: Automated Documentation Updates
1. **Fetch server**: Get latest API documentation from external service
2. **Markitdown**: Convert to markdown format
3. **Filesystem server**: Write to project documentation folder
4. **Gitea server**: Create commit and push changes
### Workflow 2: Container-Based Testing
1. **Docker server**: Start test containers
2. **Fetch server**: Send test API requests
3. **Docker server**: Collect container logs
@@ -372,6 +411,7 @@ Legend: ✓✓✓ = Primary use case, ✓✓ = Strong use case, ✓ = Applicable
5. **Gitea server**: Update test status in issues
### Workflow 3: Web UI Testing
1. **Chrome DevTools**: Launch browser and navigate to app
2. **Chrome DevTools**: Interact with UI elements
3. **Chrome DevTools**: Capture screenshots
@@ -379,6 +419,7 @@ Legend: ✓✓✓ = Primary use case, ✓✓ = Strong use case, ✓ = Applicable
5. **Gitea server**: Update test documentation
### Workflow 4: Repository Management
1. **Gitea server**: List all repositories
2. **Gitea server**: Check for outdated dependencies
3. **Gitea server**: Create issues for updates needed
@@ -389,24 +430,28 @@ Legend: ✓✓✓ = Primary use case, ✓✓ = Strong use case, ✓ = Applicable
## Next Steps
### Phase 1: Verification (Immediate)
1. Run the test script: [`test-mcp-servers.ps1`](plans/test-mcp-servers.ps1:1)
2. Review results and identify issues
3. Fix Gitea LAN token configuration
4. Re-test all servers
### Phase 2: Documentation (Short-term)
1. Document successful test results
2. Create usage examples for each server
3. Set up troubleshooting guides
4. Document common error scenarios
### Phase 3: Integration (Medium-term)
1. Verify MCP server connectivity in Claude Code sessions
2. Test tool availability and functionality
3. Create workflow templates
4. Integrate into development processes
### Phase 4: Optimization (Long-term)
1. Monitor MCP server performance
2. Optimize configurations
3. Add additional MCP servers as needed
@@ -419,7 +464,7 @@ Legend: ✓✓✓ = Primary use case, ✓✓ = Strong use case, ✓ = Applicable
- **MCP Protocol Specification**: https://modelcontextprotocol.io
- **Testing Guide**: [`mcp-server-testing-guide.md`](plans/mcp-server-testing-guide.md:1)
- **Test Script**: [`test-mcp-servers.ps1`](plans/test-mcp-servers.ps1:1)
- **Configuration Files**:
- **Configuration Files**:
- [`mcp.json`](c:/Users/games3/AppData/Roaming/Code/User/mcp.json:1)
- [`mcp-servers.json`](c:/Users/games3/AppData/Roaming/Code/User/globalStorage/mcp-servers.json:1)
@@ -447,6 +492,7 @@ Legend: ✓✓✓ = Primary use case, ✓✓ = Strong use case, ✓ = Applicable
## Conclusion
You have a comprehensive MCP server setup that provides powerful capabilities for:
- **Browser automation** (Chrome DevTools)
- **Document conversion** (Markitdown)
- **Git hosting integration** (3 Gitea instances)
@@ -454,12 +500,14 @@ You have a comprehensive MCP server setup that provides powerful capabilities fo
- **File system operations** (Filesystem)
- **HTTP requests** (Fetch)
**Immediate Action Required**:
**Immediate Action Required**:
- Fix the Gitea LAN token configuration
- Run the test script to verify all servers are operational
- Review test results and address any failures
**Current Limitation**:
**Current Limitation**:
- MCP server tools are not exposed in the current Claude Code session
- May require VSCode or client-side configuration to enable

Some files were not shown because too many files have changed in this diff Show More