# Claude Code Project Instructions

## Session Startup Checklist

**IMPORTANT**: At the start of every session, perform these steps:

1. **Check Memory First** - Use `mcp__memory__read_graph` or `mcp__memory__search_nodes` to recall:
   - Project-specific configurations and credentials
   - Previous work context and decisions
   - Infrastructure details (URLs, ports, access patterns)
   - Known issues and their solutions

2. **Review Recent Git History** - Check `git log --oneline -10` to understand recent changes

3. **Check Container Status** - Use `mcp__podman__container_list` to see what's running

---

## Project Instructions

### Things to Remember

Before writing any code:

1. State how you will verify this change works (test, bash command, browser check, etc.)
2. Write the test or verification step first
3. Then implement the code
4. Run verification and iterate until it passes
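
A minimal sketch of steps 1-4 when developing from a Windows host (the test file path below is a hypothetical example, following the single-file pattern shown later in this document):

```bash
# 1. Decide the check up front, e.g. "this test file must pass".
# 2. Write the test first and watch it fail:
podman exec -it flyer-crawler-dev npm test -- --run src/utils/parsePrice.test.ts

# 3. Implement the change.
# 4. Re-run the same command until it passes.
```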

## Git Bash / MSYS Path Conversion Issue (Windows Host)

**CRITICAL ISSUE**: Git Bash on Windows automatically converts Unix-style paths to Windows paths, which breaks Podman/Docker commands.

### Problem Examples

```bash
# This FAILS in Git Bash:
podman exec container /usr/local/bin/script.sh
# Git Bash converts to: C:/Program Files/Git/usr/local/bin/script.sh

# This FAILS in Git Bash:
podman exec container bash -c "cat /tmp/file.sql"
# Git Bash converts /tmp to C:/Users/user/AppData/Local/Temp
```

### Solutions

1. **Use `sh -c` instead of `bash -c`** for single-quoted commands:

   ```bash
   podman exec container sh -c '/usr/local/bin/script.sh'
   ```

2. **Use double slashes** to escape path conversion:

   ```bash
   podman exec container //usr//local//bin//script.sh
   ```

3. **Set the `MSYS_NO_PATHCONV` environment variable**:

   ```bash
   MSYS_NO_PATHCONV=1 podman exec container /usr/local/bin/script.sh
   ```

4. **Use Windows paths with forward slashes** when referencing host files:

   ```bash
   podman cp "d:/path/to/file" container:/tmp/file
   ```

**ALWAYS use one of these workarounds when running Bash commands on Windows that involve Unix paths inside containers.**
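
When a longer session will involve many container paths, it can be simpler to disable path conversion once for the whole shell rather than per command. A sketch only; exact behavior depends on the Git Bash version:

```bash
# Disable MSYS/Git Bash path conversion for the rest of this shell session
export MSYS_NO_PATHCONV=1
podman exec container /usr/local/bin/script.sh   # path is no longer rewritten
```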

## Communication Style: Ask Before Assuming

**IMPORTANT**: When helping with tasks, **ask clarifying questions before making assumptions**. Do not assume:

- What steps the user has or hasn't completed
- What the user already knows or has configured
- What external services (OAuth providers, APIs, etc.) are already set up
- What secrets or credentials have already been created

Instead, ask the user to confirm the current state before providing instructions or making recommendations. This prevents wasted effort and respects the user's existing work.

## Platform Requirement: Linux Only

**CRITICAL**: This application is designed to run **exclusively on Linux**. See [ADR-014](docs/adr/0014-containerization-and-deployment-strategy.md) for full details.

### Environment Terminology

- **Dev Container** (or just "dev"): The containerized Linux development environment (`flyer-crawler-dev`). This is where all development and testing should occur.
- **Host**: The Windows machine running Podman/Docker and VS Code.

When instructions say "run in dev" or "run in the dev container", they mean executing commands inside the `flyer-crawler-dev` container.
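
A quick way to confirm which environment a terminal is actually in (a sketch; the exact output depends on your setup):

```bash
# Inside the dev container this prints "Linux"; in Git Bash on the Windows host
# it prints something like "MINGW64_NT-10.0-...".
uname -s
```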

### Test Execution Rules

1. **ALL tests MUST be executed in the dev container** - the Linux container environment
2. **NEVER run tests directly on Windows host** - test results from Windows are unreliable
3. **Always use the dev container for testing** when developing on Windows
4. **TypeScript type-check MUST run in dev container** - `npm run type-check` on Windows does not reliably detect errors

See [docs/TESTING.md](docs/TESTING.md) for comprehensive testing documentation.

### How to Run Tests Correctly

```bash
# If on Windows, first open VS Code and "Reopen in Container"
# Then run tests inside the dev container:
npm test                  # Run all unit tests
npm run test:unit         # Run unit tests only
npm run test:integration  # Run integration tests (requires DB/Redis)
```

### Running Tests via Podman (from Windows host)

**Note:** This project has 2900+ unit tests. For AI-assisted development, pipe output to a file for easier processing.

To run unit tests in the dev container via Podman:

```bash
# Basic (output to terminal)
podman exec -it flyer-crawler-dev npm run test:unit

# Recommended for AI processing: pipe to a file
podman exec -it flyer-crawler-dev npm run test:unit 2>&1 | tee test-results.txt
```

To run integration tests in the dev container via Podman:

```bash
podman exec -it flyer-crawler-dev npm run test:integration
```

For running specific test files:

```bash
podman exec -it flyer-crawler-dev npm test -- --run src/hooks/useAuth.test.tsx
```

### Why Linux Only?

- Path separators: Code uses POSIX-style paths (`/`) which may break on Windows
- Shell scripts in `scripts/` directory are Linux-only
- External dependencies like `pdftocairo` assume Linux installation paths
- Unix-style file permissions are assumed throughout
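
To confirm the Linux-only dependencies are actually present in the dev container, a quick check (a sketch; `pdftocairo` is the only such binary named in this document):

```bash
# Should print a path such as /usr/bin/pdftocairo; an empty result means the
# dependency is missing from the container image.
podman exec -it flyer-crawler-dev sh -c 'command -v pdftocairo'
```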

### Test Result Interpretation

- Tests that **pass on Windows but fail on Linux** = **BROKEN tests** (must be fixed)
- Tests that **fail on Windows but pass on Linux** = **PASSING tests** (acceptable, since Linux is the authoritative environment)

## Development Workflow

1. Open project in VS Code
2. Use "Reopen in Container" (Dev Containers extension required) to enter the dev environment
3. Wait for dev container initialization to complete
4. Run `npm test` to verify the dev environment is working
5. Make changes and run tests inside the dev container

## Code Change Verification

After making any code changes, **always run a type-check** to catch TypeScript errors before committing:

```bash
npm run type-check
```

Run it inside the dev container (see Test Execution Rules above); this prevents type errors from being introduced into the codebase.
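
On a Windows host, the type-check and the unit suite can be chained in the dev container in one command (a sketch built from the commands above):

```bash
# Type-check first, then the unit tests; the second command only runs if the first passes.
podman exec -it flyer-crawler-dev sh -c 'npm run type-check && npm run test:unit'
```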

## Quick Reference

| Command                    | Description                  |
| -------------------------- | ---------------------------- |
| `npm test`                 | Run all unit tests           |
| `npm run test:unit`        | Run unit tests only          |
| `npm run test:integration` | Run integration tests        |
| `npm run dev:container`    | Start dev server (container) |
| `npm run build`            | Build for production         |
| `npm run type-check`       | Run TypeScript type checking |

## Database Schema Files

**CRITICAL**: The database schema files must be kept in sync with each other. When making schema changes:

| File                           | Purpose                                                      |
| ------------------------------ | ------------------------------------------------------------ |
| `sql/master_schema_rollup.sql` | Complete schema used by test database setup and reference    |
| `sql/initial_schema.sql`       | Base schema without seed data, used as standalone reference  |
| `sql/migrations/*.sql`         | Incremental migrations for production database updates       |

**Maintenance Rules:**

1. **Keep `master_schema_rollup.sql` and `initial_schema.sql` in sync** - These files should contain the same table definitions
2. **When adding columns via migration**, also add them to both `master_schema_rollup.sql` and `initial_schema.sql`
3. **Migrations are for production deployments** - They use `ALTER TABLE` to add columns incrementally
4. **Schema files are for fresh installs** - They define the complete table structure
5. **Test database uses `master_schema_rollup.sql`** - If schema files are out of sync with migrations, tests will fail

**Example:** When `002_expiry_tracking.sql` adds `purchase_date` to `pantry_items`, that column must also exist in the `CREATE TABLE` statements in both `master_schema_rollup.sql` and `initial_schema.sql`.
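
A quick way to verify the sync after a migration, using the example above (a sketch; substitute the column and migration file for the change at hand):

```bash
# Each of these files should mention the new column; a file with no match indicates drift.
grep -n "purchase_date" \
  sql/migrations/002_expiry_tracking.sql \
  sql/master_schema_rollup.sql \
  sql/initial_schema.sql
```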

## Known Integration Test Issues and Solutions

This section documents common test issues encountered in integration tests, their root causes, and solutions. These patterns recur frequently.

### 1. Vitest globalSetup Runs in Separate Node.js Context

**Problem:** Vitest's `globalSetup` runs in a completely separate Node.js context from test files. This means:

- Singletons created in globalSetup are NOT the same instances as those in test files
- `global`, `globalThis`, and `process` are all isolated between contexts
- `vi.spyOn()` on module exports doesn't work cross-context
- Dependency injection via setter methods fails across contexts

**Affected Tests:** Any test trying to inject mocks into BullMQ worker services (e.g., AI failure tests, DB failure tests)

**Solution Options:**

1. Mark tests as `.todo()` until an API-based mock injection mechanism is implemented
2. Create test-only API endpoints that allow setting mock behaviors via HTTP
3. Use file-based or Redis-based mock flags that services check at runtime

**Example of affected code pattern:**

```typescript
// This DOES NOT work - different module instances
const { flyerProcessingService } = await import('../../services/workers.server');
flyerProcessingService._getAiProcessor()._setExtractAndValidateData(mockFn);
// The worker uses a different flyerProcessingService instance!
```
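
For solution option 3, a Redis-backed flag could be toggled from the test context and read by the worker at runtime. This is a sketch only; the key name `test:mock:ai-failure` and the convention of services checking it are hypothetical, not something that exists in this codebase today:

```bash
# Set the flag before the test (the worker would need to check this key and
# simulate an AI failure while it is "1"):
redis-cli SET test:mock:ai-failure 1

# Clear it afterwards so later tests are unaffected:
redis-cli DEL test:mock:ai-failure
```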

### 2. BullMQ Cleanup Queue Deleting Files Before Test Verification

**Problem:** The cleanup worker runs in the globalSetup context and processes cleanup jobs even when tests spy on `cleanupQueue.add()`. The spy intercepts calls in the test context, but jobs already queued run in the worker's context.

**Affected Tests:** EXIF/PNG metadata stripping tests that need to verify file contents before deletion

**Solution:** Drain and pause the cleanup queue before the test:

```typescript
const { cleanupQueue } = await import('../../services/queues.server');
await cleanupQueue.drain(); // Remove existing jobs
await cleanupQueue.pause(); // Prevent new jobs from processing
// ... run test ...
await cleanupQueue.resume(); // Restore normal operation
```

### 3. Cache Invalidation After Direct Database Inserts

**Problem:** Tests that insert data directly via SQL (bypassing the service layer) don't trigger cache invalidation. Subsequent API calls return stale cached data.

**Affected Tests:** Any test using `pool.query()` to insert flyers, stores, or other cached entities

**Solution:** Manually invalidate the cache after direct inserts:

```typescript
await pool.query('INSERT INTO flyers ...');
await cacheService.invalidateFlyers(); // Clear stale cache
```

### 4. Unique Filenames Required for Test Isolation

**Problem:** Multer generates predictable filenames in test environments, causing race conditions when multiple tests upload files concurrently or in sequence.

**Affected Tests:** Flyer processing tests, file upload tests

**Solution:** Always use unique filenames with timestamps:

```typescript
// In multer.middleware.ts
const uniqueSuffix = `${Date.now()}-${Math.round(Math.random() * 1e9)}`;
cb(null, `${file.fieldname}-${uniqueSuffix}-${sanitizedOriginalName}`);
```

### 5. Response Format Mismatches

**Problem:** API response formats may change, causing tests to fail when expecting old formats.

**Common Issues:**

- `response.body.data.jobId` vs `response.body.data.job.id`
- Nested objects vs flat response structures
- Type coercion (string vs number for IDs)

**Solution:** Always log response bodies during debugging and update test assertions to match actual API contracts.

### 6. External Service Availability

**Problem:** Tests depending on external services (PM2, Redis health checks) fail when those services aren't available in the test environment.

**Solution:** Use try/catch with graceful degradation or mock the external service checks.

## Secrets and Environment Variables

**CRITICAL**: This project uses **Gitea CI/CD secrets** for all sensitive configuration. There is NO `/etc/flyer-crawler/environment` file or similar local config file on the server.

### Server Directory Structure

| Path                                           | Environment | Notes                                            |
| ---------------------------------------------- | ----------- | ------------------------------------------------ |
| `/var/www/flyer-crawler.projectium.com/`       | Production  | NO `.env` file - secrets injected via CI/CD only |
| `/var/www/flyer-crawler-test.projectium.com/`  | Test        | Has `.env.test` file for test-specific config    |
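
To confirm the expected layout on the server, a quick check over SSH (a sketch using the SSH access described later in this document):

```bash
# The test site should have a .env.test file:
ssh root@projectium.com "ls -la /var/www/flyer-crawler-test.projectium.com/.env.test"

# The production directory should have no .env file (an error from this command is the expected result):
ssh root@projectium.com "ls -la /var/www/flyer-crawler.projectium.com/.env"
```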

### How Secrets Work

1. **Gitea Secrets**: All secrets are stored in Gitea repository settings (Settings → Secrets)
2. **CI/CD Injection**: Secrets are injected during deployment via `.gitea/workflows/deploy-to-prod.yml` and `deploy-to-test.yml`
3. **PM2 Environment**: The CI/CD workflow passes secrets to PM2 via environment variables, which are then available to the application

### Key Files for Configuration

| File                                  | Purpose                                               |
| ------------------------------------- | ----------------------------------------------------- |
| `src/config/env.ts`                   | Centralized config with Zod schema validation         |
| `ecosystem.config.cjs`                | PM2 process config - reads from `process.env`         |
| `.gitea/workflows/deploy-to-prod.yml` | Production deployment with secret injection           |
| `.gitea/workflows/deploy-to-test.yml` | Test deployment with secret injection                 |
| `.env.example`                        | Template showing all available environment variables  |
| `.env.test`                           | Test environment overrides (only on test server)      |

### Adding New Secrets

To add a new secret (e.g., `SENTRY_DSN`):

1. Add the secret to Gitea repository settings
2. Update the relevant workflow file (e.g., `deploy-to-prod.yml`) to inject it:

   ```yaml
   SENTRY_DSN: ${{ secrets.SENTRY_DSN }}
   ```

3. Update `ecosystem.config.cjs` to read it from `process.env`
4. Update `src/config/env.ts` schema if validation is needed
5. Update `.env.example` to document the new variable
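
A quick way to confirm a new variable is wired through every layer (a sketch using `SENTRY_DSN` from the example above; substitute the variable you are adding):

```bash
# Each of these files should mention the variable once the steps above are done.
grep -n "SENTRY_DSN" \
  .gitea/workflows/deploy-to-prod.yml \
  ecosystem.config.cjs \
  src/config/env.ts \
  .env.example
```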

### Current Gitea Secrets

**Shared (used by both environments):**

- `DB_HOST` - Database host (shared PostgreSQL server)
- `JWT_SECRET` - Authentication
- `GOOGLE_MAPS_API_KEY` - Google Maps
- `GOOGLE_CLIENT_ID`, `GOOGLE_CLIENT_SECRET` - Google OAuth
- `GH_CLIENT_ID`, `GH_CLIENT_SECRET` - GitHub OAuth
- `SENTRY_AUTH_TOKEN` - Bugsink API token for source map uploads (create at Settings > API Keys in Bugsink)

**Production-specific:**

- `DB_USER_PROD`, `DB_PASSWORD_PROD` - Production database credentials (`flyer_crawler_prod`)
- `DB_DATABASE_PROD` - Production database name (`flyer-crawler`)
- `REDIS_PASSWORD_PROD` - Redis password (uses database 0)
- `VITE_GOOGLE_GENAI_API_KEY` - Gemini API key for production
- `SENTRY_DSN`, `VITE_SENTRY_DSN` - Bugsink error tracking DSNs (production projects)

**Test-specific:**

- `DB_USER_TEST`, `DB_PASSWORD_TEST` - Test database credentials (`flyer_crawler_test`)
- `DB_DATABASE_TEST` - Test database name (`flyer-crawler-test`)
- `REDIS_PASSWORD_TEST` - Redis password (uses database 1 for isolation)
- `VITE_GOOGLE_GENAI_API_KEY_TEST` - Gemini API key for test
- `SENTRY_DSN_TEST`, `VITE_SENTRY_DSN_TEST` - Bugsink error tracking DSNs (test projects)

### Test Environment

The test environment (`flyer-crawler-test.projectium.com`) uses **both** Gitea CI/CD secrets and a local `.env.test` file:

- **Gitea secrets**: Injected during deployment via `.gitea/workflows/deploy-to-test.yml`
- **`.env.test` file**: Located at `/var/www/flyer-crawler-test.projectium.com/.env.test` for local overrides
- **Redis database 1**: Isolates test job queues from production (which uses database 0)
- **PM2 process names**: Suffixed with `-test` (e.g., `flyer-crawler-api-test`)
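
To inspect the test job queues without touching production, point `redis-cli` at database 1 (a sketch run on the server over SSH; the `bull:*` key pattern assumes BullMQ's default key prefix):

```bash
# List BullMQ keys in the test Redis database (database 1); production uses database 0.
# Replace <redis-test-password> with the value of the REDIS_PASSWORD_TEST secret.
ssh root@projectium.com "redis-cli -a '<redis-test-password>' -n 1 --scan --pattern 'bull:*'"
```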

### Database User Setup (Test Environment)

**CRITICAL**: The test database requires specific PostgreSQL permissions to be configured manually. Schema ownership alone is NOT sufficient - explicit privileges must be granted.

**Database Users:**

| User                 | Database             | Purpose    |
| -------------------- | -------------------- | ---------- |
| `flyer_crawler_prod` | `flyer-crawler-prod` | Production |
| `flyer_crawler_test` | `flyer-crawler-test` | Testing    |

**Required Setup Commands** (run as `postgres` superuser):

```bash
# Connect as the postgres superuser
sudo -u postgres psql
```

Then, inside `psql`:

```sql
-- Create the test database and user (if they do not exist)
CREATE DATABASE "flyer-crawler-test";
CREATE USER flyer_crawler_test WITH PASSWORD 'your-password-here';

-- Grant ownership and privileges
ALTER DATABASE "flyer-crawler-test" OWNER TO flyer_crawler_test;
\c "flyer-crawler-test"
ALTER SCHEMA public OWNER TO flyer_crawler_test;
GRANT CREATE, USAGE ON SCHEMA public TO flyer_crawler_test;

-- Create the required extension (must be done by a superuser)
CREATE EXTENSION IF NOT EXISTS "uuid-ossp";
```

**Why These Steps Are Necessary:**

1. **Schema ownership alone is insufficient** - PostgreSQL requires explicit `GRANT CREATE, USAGE` privileges even when the user owns the schema
2. **uuid-ossp extension** - Required by the application for UUID generation; must be created by a superuser before the app can use it
3. **Separate users for prod/test** - Prevents accidental cross-environment data access; each environment has its own credentials in Gitea secrets

**Verification:**

```bash
# Check schema privileges (should show 'UC' for flyer_crawler_test)
psql -d "flyer-crawler-test" -c "\dn+ public"

# Expected output:
# Name   | Owner              | Access privileges
# -------+--------------------+------------------------------------------
# public | flyer_crawler_test | flyer_crawler_test=UC/flyer_crawler_test
```

### Dev Container Environment

The dev container runs its own **local Bugsink instance** - it does NOT connect to the production Bugsink server:

- **Local Bugsink**: Runs at `http://localhost:8000` inside the container
- **Pre-configured DSNs**: Set in `compose.dev.yml`, pointing to the local instance
- **Admin credentials**: `admin@localhost` / `admin`
- **Isolated**: Dev errors stay local, don't pollute production/test dashboards
- **No Gitea secrets needed**: Everything is self-contained in the container
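
To verify that the local Bugsink instance is up before relying on it, a quick probe from inside the dev container (a sketch; assumes `curl` is available in the container image):

```bash
# A 200 or 302 response code means the local Bugsink UI on port 8000 is reachable.
podman exec -it flyer-crawler-dev sh -c 'curl -s -o /dev/null -w "%{http_code}\n" http://localhost:8000'
```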

---

## MCP Servers

The following MCP servers are configured for this project:

| Server                | Purpose                                     |
| --------------------- | ------------------------------------------- |
| gitea-projectium      | Gitea API for gitea.projectium.com          |
| gitea-torbonium       | Gitea API for gitea.torbonium.com           |
| podman                | Container management                        |
| filesystem            | File system access                          |
| fetch                 | Web fetching                                |
| markitdown            | Convert documents to markdown               |
| sequential-thinking   | Step-by-step reasoning                      |
| memory                | Knowledge graph persistence                 |
| postgres              | Direct database queries (localhost:5432)    |
| playwright            | Browser automation and testing              |
| redis                 | Redis cache inspection (localhost:6379)     |
| sentry-selfhosted-mcp | Error tracking via Bugsink (localhost:8000) |

**Note:** MCP servers work in both **Claude CLI** and the **Claude Code VS Code extension** (as of January 2026).

### Sentry/Bugsink MCP Server Setup (ADR-015)

To enable Claude Code to query and analyze application errors from Bugsink:

1. **Install the MCP server**:

   ```bash
   # Clone the sentry-selfhosted-mcp repository
   git clone https://github.com/ddfourtwo/sentry-selfhosted-mcp.git
   cd sentry-selfhosted-mcp
   npm install
   ```

2. **Configure Claude Code** (add to `.claude/mcp.json`):

   ```json
   {
     "sentry-selfhosted-mcp": {
       "command": "node",
       "args": ["/path/to/sentry-selfhosted-mcp/dist/index.js"],
       "env": {
         "SENTRY_URL": "http://localhost:8000",
         "SENTRY_AUTH_TOKEN": "<get-from-bugsink-ui>",
         "SENTRY_ORG_SLUG": "flyer-crawler"
       }
     }
   }
   ```

3. **Get the auth token**:
   - Navigate to the Bugsink UI at `http://localhost:8000`
   - Log in with admin credentials
   - Go to Settings > API Keys
   - Create a new API key with read access

4. **Available capabilities**:
   - List projects and issues
   - View detailed error events
   - Search by error message or stack trace
   - Update issue status (resolve, ignore)
   - Add comments to issues

### SSH Server Access

Claude Code can execute commands on the production server via SSH:

```bash
# Basic command execution
ssh root@projectium.com "command here"

# Examples:
ssh root@projectium.com "systemctl status logstash"
ssh root@projectium.com "pm2 list"
ssh root@projectium.com "tail -50 /var/www/flyer-crawler.projectium.com/logs/app.log"
```

**Use cases:**

- Managing Logstash, PM2, NGINX, Redis services
- Viewing server logs
- Deploying configuration changes
- Checking service status

**Important:** SSH access requires the host machine to have SSH keys configured for `root@projectium.com`.