# Database Subagent Guide

This guide covers two database-focused subagents:

- **db-dev**: Database development - schemas, queries, migrations, optimization
- **db-admin**: Database administration - PostgreSQL/Redis admin, security, backups

## Quick Reference

| Aspect           | db-dev                                       | db-admin                                   |
| ---------------- | -------------------------------------------- | ------------------------------------------ |
| **Primary Use**  | Schemas, queries, migrations                 | Performance tuning, backups, security      |
| **Key Files**    | `src/services/db/*.db.ts`, `sql/migrations/` | `postgresql.conf`, `pg_hba.conf`           |
| **Key ADRs**     | ADR-034 (Repository), ADR-002 (Transactions) | ADR-019 (Backups), ADR-050 (Observability) |
| **Test Command** | `podman exec -it flyer-crawler-dev npm test` | N/A                                        |
| **MCP Tool**     | `mcp__devdb__query`                          | SSH to production                          |
| **Delegate To**  | `coder` (service layer), `db-admin` (perf)   | `devops` (infrastructure)                  |

## Understanding the Difference

| Aspect          | db-dev                              | db-admin                              |
| --------------- | ----------------------------------- | ------------------------------------- |
| **Focus**       | Application database code           | Infrastructure and operations         |
| **Tasks**       | Queries, migrations, repositories   | Performance tuning, backups, security |
| **Output**      | SQL migrations, repository methods  | Configuration, monitoring scripts     |
| **When to Use** | Adding features, optimizing queries | Production issues, capacity planning  |

## The db-dev Subagent

### When to Use

Use the **db-dev** subagent when you need to:

- Design new database tables or modify existing ones
- Write SQL queries or optimize existing ones
- Create database migrations
- Implement repository pattern methods
- Fix N+1 query problems
- Add indexes for performance
- Work with PostGIS spatial queries

### What db-dev Knows

The db-dev subagent has deep knowledge of:

- Project database schema (`sql/master_schema_rollup.sql`)
- Repository pattern standards (ADR-034)
- Transaction management (ADR-002)
- PostgreSQL-specific features (PostGIS, pg_trgm, etc.)
- Schema synchronization requirements

### Schema Synchronization (Critical)

> **Schema files MUST stay synchronized:**
>
> | File                           | Purpose                           |
> | ------------------------------ | --------------------------------- |
> | `sql/master_schema_rollup.sql` | Test DB setup, complete reference |
> | `sql/initial_schema.sql`       | Fresh install schema              |
> | `sql/migrations/*.sql`         | Production incremental changes    |

When db-dev creates a migration, it will also update the schema files.

### Example Requests

**Adding a new table:**

```
"Use db-dev to design a table for storing user recipe reviews.
Include fields for rating (1-5), review text, and relationships
to users and recipes. Create the migration and update schema files."
```

**Optimizing a slow query:**

```
"Use db-dev to optimize the query that lists flyers with their
item counts. It's currently doing N+1 queries and takes too long
with many flyers."
```

**Adding spatial search:**

```
"Use db-dev to add the ability to search stores within a radius
of a given location using PostGIS. Include the migration for
adding the geography column."
```

### Repository Pattern Standards

The db-dev subagent follows these naming conventions:

| Prefix    | Returns             | Behavior on Not Found                |
| --------- | ------------------- | ------------------------------------ |
| `get*`    | Single entity       | Throws `NotFoundError`               |
| `find*`   | Entity or `null`    | Returns `null`                       |
| `list*`   | Array               | Returns `[]`                         |
| `create*` | Created entity      | Throws on constraint violation       |
| `update*` | Updated entity      | Throws `NotFoundError` if not exists |
| `delete*` | `void` or `boolean` | Throws `NotFoundError` if not exists |

### Example Migration

```sql
-- sql/migrations/20260121_add_recipe_reviews.sql

-- Create recipe_reviews table
CREATE TABLE IF NOT EXISTS recipe_reviews (
  review_id UUID PRIMARY KEY DEFAULT uuid_generate_v4(),
  recipe_id UUID NOT NULL REFERENCES recipes(recipe_id) ON DELETE CASCADE,
  user_id UUID NOT NULL REFERENCES users(user_id) ON DELETE CASCADE,
  rating INTEGER NOT NULL CHECK (rating >= 1 AND rating <= 5),
  review_text TEXT,
  created_at TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP,
  updated_at TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP,
  UNIQUE (recipe_id, user_id)
);

-- Add indexes
CREATE INDEX idx_recipe_reviews_recipe_id ON recipe_reviews(recipe_id);
CREATE INDEX idx_recipe_reviews_user_id ON recipe_reviews(user_id);
CREATE INDEX idx_recipe_reviews_rating ON recipe_reviews(rating);

-- Add trigger for updated_at
CREATE TRIGGER update_recipe_reviews_updated_at
  BEFORE UPDATE ON recipe_reviews
  FOR EACH ROW
  EXECUTE FUNCTION update_updated_at_column();
```

### Example Repository Method

```typescript
// src/services/db/recipeReview.db.ts
import type { PoolClient } from 'pg';
import { handleDbError, NotFoundError } from './errors.db';
// Assumed to be provided elsewhere in the project: getPool, log, and the RecipeReview type.

export async function getReviewById(reviewId: string, client?: PoolClient): Promise<RecipeReview> {
  // Use the caller's transaction client when given, otherwise the shared pool.
  const queryable = client || getPool();

  try {
    const result = await queryable.query(`SELECT * FROM recipe_reviews WHERE review_id = $1`, [
      reviewId,
    ]);

    // get* methods throw when the entity does not exist.
    if (result.rows.length === 0) {
      throw new NotFoundError(`Review with ID ${reviewId} not found.`);
    }

    return result.rows[0];
  } catch (error) {
    handleDbError(
      error,
      log,
      'Database error in getReviewById',
      { reviewId },
      {
        entityName: 'RecipeReview',
        defaultMessage: 'Failed to fetch review.',
      },
    );
  }
}

export async function listReviewsByRecipeId(
  recipeId: string,
  options: { limit?: number; offset?: number } = {},
  client?: PoolClient,
): Promise<RecipeReview[]> {
  const queryable = client || getPool();
  const { limit = 50, offset = 0 } = options;

  try {
    const result = await queryable.query(
      `SELECT * FROM recipe_reviews
       WHERE recipe_id = $1
       ORDER BY created_at DESC
       LIMIT $2 OFFSET $3`,
      [recipeId, limit, offset],
    );

    // list* methods return [] when nothing matches.
    return result.rows;
  } catch (error) {
    handleDbError(
      error,
      log,
      'Database error in listReviewsByRecipeId',
      { recipeId, limit, offset },
      {
        entityName: 'RecipeReview',
        defaultMessage: 'Failed to list reviews.',
      },
    );
  }
}
```

## The db-admin Subagent

### When to Use

Use the **db-admin** subagent when you need to:

- Debug production database issues
- Configure PostgreSQL settings
- Set up database backups
- Analyze slow query logs
- Configure Redis for production
- Plan database capacity
- Manage database users and permissions
- Handle replication or failover

### What db-admin Knows

The db-admin subagent understands:

- PostgreSQL configuration and tuning
- Redis configuration for BullMQ queues
- Backup and recovery strategies (ADR-019)
- Connection pooling settings
- Production deployment setup
- Bugsink PostgreSQL observability (ADR-050)

### Example Requests

**Performance tuning:**

```
"Use db-admin to analyze why the database is running slow.
Check connection pool settings, identify slow queries, and
recommend PostgreSQL configuration changes."
```

**Backup configuration:**

```
"Use db-admin to set up daily automated backups for the
production database with 30-day retention."
```

**User management:**

```
"Use db-admin to create a read-only database user for
reporting purposes that can only SELECT from specific tables."
```

### Database Users

| User                 | Database             | Purpose                |
| -------------------- | -------------------- | ---------------------- |
| `flyer_crawler_prod` | `flyer-crawler-prod` | Production             |
| `flyer_crawler_test` | `flyer-crawler-test` | Testing                |
| `postgres`           | All                  | Superuser (admin only) |

### Creating Database Users

```sql
-- As postgres superuser
CREATE DATABASE "flyer-crawler-test";
CREATE USER flyer_crawler_test WITH PASSWORD 'secure_password';
ALTER DATABASE "flyer-crawler-test" OWNER TO flyer_crawler_test;

\c "flyer-crawler-test"
ALTER SCHEMA public OWNER TO flyer_crawler_test;
GRANT CREATE, USAGE ON SCHEMA public TO flyer_crawler_test;
CREATE EXTENSION IF NOT EXISTS "uuid-ossp";
```

### PostgreSQL Configuration Guidance

For production, db-admin may recommend settings like:

```ini
# /etc/postgresql/14/main/conf.d/performance.conf

# Connection settings
max_connections = 100
shared_buffers = 256MB

# Query optimization
effective_cache_size = 768MB
random_page_cost = 1.1

# Write performance
wal_buffers = 16MB
checkpoint_completion_target = 0.9

# Logging
log_min_duration_statement = 1000  # Log queries over 1 second
```

### Redis Configuration Guidance

For BullMQ queues:

```ini
# /etc/redis/redis.conf
# Note: redis.conf does not support trailing comments; keep comments on their own lines.

# Memory management
maxmemory 256mb
# BullMQ requires noeviction
maxmemory-policy noeviction

# Persistence
appendonly yes
appendfsync everysec

# Security
requirepass your_redis_password
```

## Common Database Tasks

### Running Migrations in Production

```bash
# SSH to production server
ssh root@projectium.com

# Run migration
cd /var/www/flyer-crawler.projectium.com
npm run db:migrate
```

### Checking Database Health

```bash
# Connection count
psql -c "SELECT count(*) FROM pg_stat_activity WHERE datname = 'flyer-crawler-prod';"

# Table sizes
psql -d "flyer-crawler-prod" -c "
SELECT
  tablename,
  pg_size_pretty(pg_total_relation_size(schemaname || '.' || tablename)) as size
FROM pg_tables
WHERE schemaname = 'public'
ORDER BY pg_total_relation_size(schemaname || '.' || tablename) DESC
LIMIT 10;"

# Slow queries (requires the pg_stat_statements extension)
psql -d "flyer-crawler-prod" -c "
SELECT
  calls,
  mean_exec_time::numeric(10,2) as avg_ms,
  query
FROM pg_stat_statements
ORDER BY mean_exec_time DESC
LIMIT 5;"
```

### Database Backup Commands

```bash
# Manual backup
pg_dump -U flyer_crawler_prod -h localhost "flyer-crawler-prod" > backup_$(date +%Y%m%d).sql

# Restore from backup
psql -U flyer_crawler_prod -h localhost "flyer-crawler-prod" < backup_20260121.sql
```

## N+1 Query Detection

The db-dev subagent is particularly skilled at identifying N+1 query problems:

**Problematic Pattern:**

```typescript
// BAD: N+1 queries
const flyers = await listFlyers();
for (const flyer of flyers) {
  flyer.items = await listItemsByFlyerId(flyer.flyer_id); // N queries!
}
```

**Optimized Pattern:**

```typescript
// GOOD: Single query with JOIN or separate batch query
const flyersWithItems = await listFlyersWithItems(); // 1 query

// Or with batching:
const flyers = await listFlyers();
const flyerIds = flyers.map((f) => f.flyer_id);
const allItems = await listItemsByFlyerIds(flyerIds); // 1 query
// Group items by flyer_id in application code
```

## Working with PostGIS

The project uses PostGIS for spatial queries. Example:

```sql
-- Find stores within 10km of a location
SELECT
  s.store_id,
  s.name,
  ST_Distance(
    sl.location::geography,
    ST_MakePoint(-79.3832, 43.6532)::geography
  ) / 1000 as distance_km
FROM stores s
JOIN store_locations sl ON s.store_id = sl.store_id
WHERE ST_DWithin(
  sl.location::geography,
  ST_MakePoint(-79.3832, 43.6532)::geography,
  10000  -- 10km in meters
)
ORDER BY distance_km;
```

## MCP Database Access

For direct database queries during development, use the MCP server:

```
// Query the dev database
mcp__devdb__query("SELECT * FROM flyers LIMIT 5")
```

This is useful for:

- Verifying data during debugging
- Checking schema state
- Testing queries before implementing

## Related Documentation

- [OVERVIEW.md](./OVERVIEW.md) - Subagent system overview
- [CODER-GUIDE.md](./CODER-GUIDE.md) - Working with the coder subagent
- [DEVOPS-GUIDE.md](./DEVOPS-GUIDE.md) - DevOps and deployment workflows
- [../adr/0034-repository-pattern-standards.md](../adr/0034-repository-pattern-standards.md) - Repository patterns
- [../adr/0002-standardized-transaction-management.md](../adr/0002-standardized-transaction-management.md) - Transaction management
- [../adr/0019-data-backup-and-recovery-strategy.md](../adr/0019-data-backup-and-recovery-strategy.md) - Backup strategy
- [../adr/0050-postgresql-function-observability.md](../adr/0050-postgresql-function-observability.md) - Database observability
- [../operations/BARE-METAL-SETUP.md](../operations/BARE-METAL-SETUP.md) - Production database setup
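
## Appendix: Grouping Batched Items in Application Code

The batching variant under N+1 Query Detection ends with "Group items by flyer_id in application code" but does not show that step. The following is a minimal sketch of one way to do it, not the project's actual implementation. The `Flyer` and `FlyerItem` shapes and the helper names `groupItemsByFlyerId` and `attachItems` are hypothetical; only the `flyer_id` field comes from the examples in this guide.

```typescript
// Hypothetical row shapes for illustration; the real project types may differ.
interface FlyerItem {
  flyer_item_id: string;
  flyer_id: string;
}

interface Flyer {
  flyer_id: string;
  items?: FlyerItem[];
}

// Group a flat, batched result set by flyer_id in a single pass.
function groupItemsByFlyerId(items: FlyerItem[]): Map<string, FlyerItem[]> {
  const grouped = new Map<string, FlyerItem[]>();
  for (const item of items) {
    const bucket = grouped.get(item.flyer_id);
    if (bucket) {
      bucket.push(item);
    } else {
      grouped.set(item.flyer_id, [item]);
    }
  }
  return grouped;
}

// Attach grouped items to their flyers; flyers without items get an empty array.
function attachItems(flyers: Flyer[], items: FlyerItem[]): Flyer[] {
  const grouped = groupItemsByFlyerId(items);
  return flyers.map((flyer) => ({
    ...flyer,
    items: grouped.get(flyer.flyer_id) ?? [],
  }));
}
```

With helpers like these, the batched pattern would finish with something like `attachItems(flyers, allItems)`. Using a `Map` keeps the grouping linear in the number of items, whereas filtering the item list once per flyer would be quadratic.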