Some checks failed
Deploy to Test Environment / deploy-to-test (push) Failing after 2m15s
432 lines
13 KiB
Markdown
432 lines
13 KiB
Markdown
# Database Subagent Guide
|
|
|
|
This guide covers two database-focused subagents:
|
|
|
|
- **db-dev**: Database development - schemas, queries, migrations, optimization
|
|
- **db-admin**: Database administration - PostgreSQL/Redis admin, security, backups
|
|
|
|
## Quick Reference
|
|
|
|
| Aspect | db-dev | db-admin |
|
|
| ---------------- | -------------------------------------------- | ------------------------------------------ |
|
|
| **Primary Use** | Schemas, queries, migrations | Performance tuning, backups, security |
|
|
| **Key Files** | `src/services/db/*.db.ts`, `sql/migrations/` | `postgresql.conf`, `pg_hba.conf` |
|
|
| **Key ADRs** | ADR-034 (Repository), ADR-002 (Transactions) | ADR-019 (Backups), ADR-050 (Observability) |
|
|
| **Test Command** | `podman exec -it flyer-crawler-dev npm test` | N/A |
|
|
| **MCP Tool** | `mcp__devdb__query` | SSH to production |
|
|
| **Delegate To** | `coder` (service layer), `db-admin` (perf) | `devops` (infrastructure) |
|
|
|
|
## Understanding the Difference
|
|
|
|
| Aspect | db-dev | db-admin |
|
|
| --------------- | ----------------------------------- | ------------------------------------- |
|
|
| **Focus** | Application database code | Infrastructure and operations |
|
|
| **Tasks** | Queries, migrations, repositories | Performance tuning, backups, security |
|
|
| **Output** | SQL migrations, repository methods | Configuration, monitoring scripts |
|
|
| **When to Use** | Adding features, optimizing queries | Production issues, capacity planning |
|
|
|
|
## The db-dev Subagent
|
|
|
|
### When to Use
|
|
|
|
Use the **db-dev** subagent when you need to:
|
|
|
|
- Design new database tables or modify existing ones
|
|
- Write SQL queries or optimize existing ones
|
|
- Create database migrations
|
|
- Implement repository pattern methods
|
|
- Fix N+1 query problems
|
|
- Add indexes for performance
|
|
- Work with PostGIS spatial queries
|
|
|
|
### What db-dev Knows
|
|
|
|
The db-dev subagent has deep knowledge of:
|
|
|
|
- Project database schema (`sql/master_schema_rollup.sql`)
|
|
- Repository pattern standards (ADR-034)
|
|
- Transaction management (ADR-002)
|
|
- PostgreSQL-specific features (PostGIS, pg_trgm, etc.)
|
|
- Schema synchronization requirements
|
|
|
|
### Schema Synchronization (Critical)
|
|
|
|
> **Schema files MUST stay synchronized:**
|
|
>
|
|
> | File | Purpose |
|
|
> | ------------------------------ | --------------------------------- |
|
|
> | `sql/master_schema_rollup.sql` | Test DB setup, complete reference |
|
|
> | `sql/initial_schema.sql` | Fresh install schema |
|
|
> | `sql/migrations/*.sql` | Production incremental changes |
|
|
|
|
When db-dev creates a migration, it will also update the schema files.
|
|
|
|
### Example Requests
|
|
|
|
**Adding a new table:**
|
|
|
|
```
|
|
"Use db-dev to design a table for storing user recipe reviews.
|
|
Include fields for rating (1-5), review text, and relationships
|
|
to users and recipes. Create the migration and update schema files."
|
|
```
|
|
|
|
**Optimizing a slow query:**
|
|
|
|
```
|
|
"Use db-dev to optimize the query that lists flyers with their
|
|
item counts. It's currently doing N+1 queries and takes too long
|
|
with many flyers."
|
|
```
|
|
|
|
**Adding spatial search:**
|
|
|
|
```
|
|
"Use db-dev to add the ability to search stores within a radius
|
|
of a given location using PostGIS. Include the migration for
|
|
adding the geography column."
|
|
```
|
|
|
|
### Repository Pattern Standards
|
|
|
|
The db-dev subagent follows these naming conventions:
|
|
|
|
| Prefix | Returns | Behavior on Not Found |
|
|
| --------- | ------------------- | ------------------------------------ |
|
|
| `get*` | Single entity | Throws `NotFoundError` |
|
|
| `find*` | Entity or `null` | Returns `null` |
|
|
| `list*` | Array | Returns `[]` |
|
|
| `create*` | Created entity | Throws on constraint violation |
|
|
| `update*` | Updated entity | Throws `NotFoundError` if not exists |
|
|
| `delete*` | `void` or `boolean` | Throws `NotFoundError` if not exists |
|
|
|
|
### Example Migration
|
|
|
|
```sql
|
|
-- sql/migrations/20260121_add_recipe_reviews.sql
|
|
|
|
-- Create recipe_reviews table
|
|
CREATE TABLE IF NOT EXISTS recipe_reviews (
|
|
review_id UUID PRIMARY KEY DEFAULT uuid_generate_v4(),
|
|
recipe_id UUID NOT NULL REFERENCES recipes(recipe_id) ON DELETE CASCADE,
|
|
user_id UUID NOT NULL REFERENCES users(user_id) ON DELETE CASCADE,
|
|
rating INTEGER NOT NULL CHECK (rating >= 1 AND rating <= 5),
|
|
review_text TEXT,
|
|
created_at TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP,
|
|
updated_at TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP,
|
|
UNIQUE (recipe_id, user_id)
|
|
);
|
|
|
|
-- Add indexes
|
|
CREATE INDEX idx_recipe_reviews_recipe_id ON recipe_reviews(recipe_id);
|
|
CREATE INDEX idx_recipe_reviews_user_id ON recipe_reviews(user_id);
|
|
CREATE INDEX idx_recipe_reviews_rating ON recipe_reviews(rating);
|
|
|
|
-- Add trigger for updated_at
|
|
CREATE TRIGGER update_recipe_reviews_updated_at
|
|
BEFORE UPDATE ON recipe_reviews
|
|
FOR EACH ROW
|
|
EXECUTE FUNCTION update_updated_at_column();
|
|
```
|
|
|
|
### Example Repository Method
|
|
|
|
```typescript
|
|
// src/services/db/recipeReview.db.ts
|
|
import { handleDbError, NotFoundError } from './errors.db';
|
|
|
|
export async function getReviewById(reviewId: string, client?: PoolClient): Promise<RecipeReview> {
|
|
const queryable = client || getPool();
|
|
try {
|
|
const result = await queryable.query<RecipeReview>(
|
|
`SELECT * FROM recipe_reviews WHERE review_id = $1`,
|
|
[reviewId],
|
|
);
|
|
if (result.rows.length === 0) {
|
|
throw new NotFoundError(`Review with ID ${reviewId} not found.`);
|
|
}
|
|
return result.rows[0];
|
|
} catch (error) {
|
|
handleDbError(
|
|
error,
|
|
log,
|
|
'Database error in getReviewById',
|
|
{ reviewId },
|
|
{
|
|
entityName: 'RecipeReview',
|
|
defaultMessage: 'Failed to fetch review.',
|
|
},
|
|
);
|
|
}
|
|
}
|
|
|
|
export async function listReviewsByRecipeId(
|
|
recipeId: string,
|
|
options: { limit?: number; offset?: number } = {},
|
|
client?: PoolClient,
|
|
): Promise<RecipeReview[]> {
|
|
const queryable = client || getPool();
|
|
const { limit = 50, offset = 0 } = options;
|
|
try {
|
|
const result = await queryable.query<RecipeReview>(
|
|
`SELECT * FROM recipe_reviews
|
|
WHERE recipe_id = $1
|
|
ORDER BY created_at DESC
|
|
LIMIT $2 OFFSET $3`,
|
|
[recipeId, limit, offset],
|
|
);
|
|
return result.rows;
|
|
} catch (error) {
|
|
handleDbError(
|
|
error,
|
|
log,
|
|
'Database error in listReviewsByRecipeId',
|
|
{ recipeId, limit, offset },
|
|
{
|
|
entityName: 'RecipeReview',
|
|
defaultMessage: 'Failed to list reviews.',
|
|
},
|
|
);
|
|
}
|
|
}
|
|
```
|
|
|
|
## The db-admin Subagent
|
|
|
|
### When to Use
|
|
|
|
Use the **db-admin** subagent when you need to:
|
|
|
|
- Debug production database issues
|
|
- Configure PostgreSQL settings
|
|
- Set up database backups
|
|
- Analyze slow query logs
|
|
- Configure Redis for production
|
|
- Plan database capacity
|
|
- Manage database users and permissions
|
|
- Handle replication or failover
|
|
|
|
### What db-admin Knows
|
|
|
|
The db-admin subagent understands:
|
|
|
|
- PostgreSQL configuration and tuning
|
|
- Redis configuration for BullMQ queues
|
|
- Backup and recovery strategies (ADR-019)
|
|
- Connection pooling settings
|
|
- Production deployment setup
|
|
- Bugsink PostgreSQL observability (ADR-050)
|
|
|
|
### Example Requests
|
|
|
|
**Performance tuning:**
|
|
|
|
```
|
|
"Use db-admin to analyze why the database is running slow.
|
|
Check connection pool settings, identify slow queries, and
|
|
recommend PostgreSQL configuration changes."
|
|
```
|
|
|
|
**Backup configuration:**
|
|
|
|
```
|
|
"Use db-admin to set up daily automated backups for the
|
|
production database with 30-day retention."
|
|
```
|
|
|
|
**User management:**
|
|
|
|
```
|
|
"Use db-admin to create a read-only database user for
|
|
reporting purposes that can only SELECT from specific tables."
|
|
```
|
|
|
|
### Database Users
|
|
|
|
| User | Database | Purpose |
|
|
| -------------------- | -------------------- | ---------------------- |
|
|
| `flyer_crawler_prod` | `flyer-crawler-prod` | Production |
|
|
| `flyer_crawler_test` | `flyer-crawler-test` | Testing |
|
|
| `postgres` | All | Superuser (admin only) |
|
|
|
|
### Creating Database Users
|
|
|
|
```sql
|
|
-- As postgres superuser
|
|
CREATE DATABASE "flyer-crawler-test";
|
|
CREATE USER flyer_crawler_test WITH PASSWORD 'secure_password';
|
|
ALTER DATABASE "flyer-crawler-test" OWNER TO flyer_crawler_test;
|
|
\c "flyer-crawler-test"
|
|
ALTER SCHEMA public OWNER TO flyer_crawler_test;
|
|
GRANT CREATE, USAGE ON SCHEMA public TO flyer_crawler_test;
|
|
CREATE EXTENSION IF NOT EXISTS "uuid-ossp";
|
|
```
|
|
|
|
### PostgreSQL Configuration Guidance
|
|
|
|
For production, db-admin may recommend settings like:
|
|
|
|
```ini
|
|
# /etc/postgresql/14/main/conf.d/performance.conf
|
|
|
|
# Connection settings
|
|
max_connections = 100
|
|
shared_buffers = 256MB
|
|
|
|
# Query optimization
|
|
effective_cache_size = 768MB
|
|
random_page_cost = 1.1
|
|
|
|
# Write performance
|
|
wal_buffers = 16MB
|
|
checkpoint_completion_target = 0.9
|
|
|
|
# Logging
|
|
log_min_duration_statement = 1000 # Log queries over 1 second
|
|
```
|
|
|
|
### Redis Configuration Guidance
|
|
|
|
For BullMQ queues:
|
|
|
|
```ini
|
|
# /etc/redis/redis.conf
|
|
|
|
# Memory management
|
|
maxmemory 256mb
|
|
maxmemory-policy noeviction # BullMQ requires this
|
|
|
|
# Persistence
|
|
appendonly yes
|
|
appendfsync everysec
|
|
|
|
# Security
|
|
requirepass your_redis_password
|
|
```
|
|
|
|
## Common Database Tasks
|
|
|
|
### Running Migrations in Production
|
|
|
|
```bash
|
|
# SSH to production server
|
|
ssh root@projectium.com
|
|
|
|
# Run migration
|
|
cd /var/www/flyer-crawler.projectium.com
|
|
npm run db:migrate
|
|
```
|
|
|
|
### Checking Database Health
|
|
|
|
```bash
|
|
# Connection count
|
|
psql -c "SELECT count(*) FROM pg_stat_activity WHERE datname = 'flyer-crawler-prod';"
|
|
|
|
# Table sizes
|
|
psql -d "flyer-crawler-prod" -c "
|
|
SELECT
|
|
tablename,
|
|
pg_size_pretty(pg_total_relation_size(schemaname || '.' || tablename)) as size
|
|
FROM pg_tables
|
|
WHERE schemaname = 'public'
|
|
ORDER BY pg_total_relation_size(schemaname || '.' || tablename) DESC
|
|
LIMIT 10;"
|
|
|
|
# Slow queries
|
|
psql -d "flyer-crawler-prod" -c "
|
|
SELECT
|
|
calls,
|
|
mean_exec_time::numeric(10,2) as avg_ms,
|
|
query
|
|
FROM pg_stat_statements
|
|
ORDER BY mean_exec_time DESC
|
|
LIMIT 5;"
|
|
```
|
|
|
|
### Database Backup Commands
|
|
|
|
```bash
|
|
# Manual backup
|
|
pg_dump -U flyer_crawler_prod -h localhost "flyer-crawler-prod" > backup_$(date +%Y%m%d).sql
|
|
|
|
# Restore from backup
|
|
psql -U flyer_crawler_prod -h localhost "flyer-crawler-prod" < backup_20260121.sql
|
|
```
|
|
|
|
## N+1 Query Detection
|
|
|
|
The db-dev subagent is particularly skilled at identifying N+1 query problems:
|
|
|
|
**Problematic Pattern:**
|
|
|
|
```typescript
|
|
// BAD: N+1 queries
|
|
const flyers = await listFlyers();
|
|
for (const flyer of flyers) {
|
|
flyer.items = await listItemsByFlyerId(flyer.flyer_id); // N queries!
|
|
}
|
|
```
|
|
|
|
**Optimized Pattern:**
|
|
|
|
```typescript
|
|
// GOOD: Single query with JOIN or separate batch query
|
|
const flyersWithItems = await listFlyersWithItems(); // 1 query
|
|
|
|
// Or with batching:
|
|
const flyers = await listFlyers();
|
|
const flyerIds = flyers.map((f) => f.flyer_id);
|
|
const allItems = await listItemsByFlyerIds(flyerIds); // 1 query
|
|
// Group items by flyer_id in application code
|
|
```
|
|
|
|
## Working with PostGIS
|
|
|
|
The project uses PostGIS for spatial queries. Example:
|
|
|
|
```sql
|
|
-- Find stores within 10km of a location
|
|
SELECT
|
|
s.store_id,
|
|
s.name,
|
|
ST_Distance(
|
|
sl.location::geography,
|
|
ST_MakePoint(-79.3832, 43.6532)::geography
|
|
) / 1000 as distance_km
|
|
FROM stores s
|
|
JOIN store_locations sl ON s.store_id = sl.store_id
|
|
WHERE ST_DWithin(
|
|
sl.location::geography,
|
|
ST_MakePoint(-79.3832, 43.6532)::geography,
|
|
10000 -- 10km in meters
|
|
)
|
|
ORDER BY distance_km;
|
|
```
|
|
|
|
## MCP Database Access
|
|
|
|
For direct database queries during development, use the MCP server:
|
|
|
|
```
|
|
// Query the dev database
|
|
mcp__devdb__query("SELECT * FROM flyers LIMIT 5")
|
|
```
|
|
|
|
This is useful for:
|
|
|
|
- Verifying data during debugging
|
|
- Checking schema state
|
|
- Testing queries before implementing
|
|
|
|
## Related Documentation
|
|
|
|
- [OVERVIEW.md](./OVERVIEW.md) - Subagent system overview
|
|
- [CODER-GUIDE.md](./CODER-GUIDE.md) - Working with the coder subagent
|
|
- [DEVOPS-GUIDE.md](./DEVOPS-GUIDE.md) - DevOps and deployment workflows
|
|
- [../adr/0034-repository-pattern-standards.md](../adr/0034-repository-pattern-standards.md) - Repository patterns
|
|
- [../adr/0002-standardized-transaction-management.md](../adr/0002-standardized-transaction-management.md) - Transaction management
|
|
- [../adr/0019-data-backup-and-recovery-strategy.md](../adr/0019-data-backup-and-recovery-strategy.md) - Backup strategy
|
|
- [../adr/0050-postgresql-function-observability.md](../adr/0050-postgresql-function-observability.md) - Database observability
|
|
- [../operations/BARE-METAL-SETUP.md](../operations/BARE-METAL-SETUP.md) - Production database setup
|