doc updates and test fixin
413
docs/subagents/DATABASE-GUIDE.md
Normal file
@@ -0,0 +1,413 @@
# Database Subagent Guide

This guide covers two database-focused subagents:

- **db-dev**: Database development - schemas, queries, migrations, optimization
- **db-admin**: Database administration - PostgreSQL/Redis admin, security, backups

## Understanding the Difference

| Aspect | db-dev | db-admin |
|--------|--------|----------|
| **Focus** | Application database code | Infrastructure and operations |
| **Tasks** | Queries, migrations, repositories | Performance tuning, backups, security |
| **Output** | SQL migrations, repository methods | Configuration, monitoring scripts |
| **When to Use** | Adding features, optimizing queries | Production issues, capacity planning |

## The db-dev Subagent

### When to Use

Use the **db-dev** subagent when you need to:

- Design new database tables or modify existing ones
- Write SQL queries or optimize existing ones
- Create database migrations
- Implement repository pattern methods
- Fix N+1 query problems
- Add indexes for performance
- Work with PostGIS spatial queries

### What db-dev Knows

The db-dev subagent has deep knowledge of:

- Project database schema (`sql/master_schema_rollup.sql`)
- Repository pattern standards (ADR-034)
- Transaction management (ADR-002)
- PostgreSQL-specific features (PostGIS, pg_trgm, etc.)
- Schema synchronization requirements

### Schema Synchronization (Critical)

> **Schema files MUST stay synchronized:**
>
> | File | Purpose |
> |------|---------|
> | `sql/master_schema_rollup.sql` | Test DB setup, complete reference |
> | `sql/initial_schema.sql` | Fresh install schema |
> | `sql/migrations/*.sql` | Production incremental changes |

When db-dev creates a migration, it will also update the schema files.

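
The synchronization rule above can be spot-checked mechanically. What follows is a minimal sketch, not a tool from the project: `extractTableNames` and `findMissingTables` are hypothetical helpers that compare the `CREATE TABLE` names found in migration SQL text against the text of a rollup file. It assumes simple table names and ignores columns, indexes, and triggers, so it can only catch a table that was never added to the rollup.

```typescript
// Hypothetical helper: collect table names from CREATE TABLE statements.
// Assumes simple identifiers (optionally schema-qualified or double-quoted).
function extractTableNames(sql: string): string[] {
  const pattern = /create\s+table\s+(?:if\s+not\s+exists\s+)?"?([\w.]+)"?/gi;
  const names: string[] = [];
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(sql)) !== null) {
    const name = match[1].toLowerCase();
    if (!names.includes(name)) {
      names.push(name);
    }
  }
  return names;
}

// Hypothetical helper: tables created by any migration that the rollup
// text never creates. An empty result means no table is missing.
function findMissingTables(rollupSql: string, migrationSqls: string[]): string[] {
  const known = new Set(extractTableNames(rollupSql));
  const missing: string[] = [];
  for (const sql of migrationSqls) {
    for (const name of extractTableNames(sql)) {
      if (!known.has(name) && !missing.includes(name)) {
        missing.push(name);
      }
    }
  }
  return missing;
}
```

In practice the inputs would be the contents of `sql/master_schema_rollup.sql` and the files under `sql/migrations/`; reading them from disk is left out to keep the sketch self-contained.
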
### Example Requests

**Adding a new table:**

```
"Use db-dev to design a table for storing user recipe reviews.
Include fields for rating (1-5), review text, and relationships
to users and recipes. Create the migration and update schema files."
```

**Optimizing a slow query:**

```
"Use db-dev to optimize the query that lists flyers with their
item counts. It's currently doing N+1 queries and takes too long
with many flyers."
```

**Adding spatial search:**

```
"Use db-dev to add the ability to search stores within a radius
of a given location using PostGIS. Include the migration for
adding the geography column."
```

### Repository Pattern Standards

The db-dev subagent follows these naming conventions:

| Prefix | Returns | Behavior on Not Found |
|--------|---------|----------------------|
| `get*` | Single entity | Throws `NotFoundError` |
| `find*` | Entity or `null` | Returns `null` |
| `list*` | Array | Returns `[]` |
| `create*` | Created entity | Throws on constraint violation |
| `update*` | Updated entity | Throws `NotFoundError` if the entity does not exist |
| `delete*` | `void` or `boolean` | Throws `NotFoundError` if the entity does not exist |

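
To make the difference between the read prefixes concrete, here is a small sketch that uses an in-memory `Map` in place of a table. The `Review` shape, the sample data, and the local `NotFoundError` class are stand-ins invented for illustration; the project's real error class lives in its own module.

```typescript
// Illustrative only: an in-memory stand-in for a table, so the
// get*/find*/list* contracts can be shown without a database.
class NotFoundError extends Error {}

interface Review {
  reviewId: string;
  recipeId: string;
  rating: number;
}

const reviews = new Map<string, Review>([
  ['r1', { reviewId: 'r1', recipeId: 'a', rating: 5 }],
  ['r2', { reviewId: 'r2', recipeId: 'a', rating: 3 }],
]);

// find*: returns the entity or null, never throws for a missing row.
function findReviewById(reviewId: string): Review | null {
  return reviews.get(reviewId) ?? null;
}

// get*: same lookup, but a missing row is an error.
function getReviewById(reviewId: string): Review {
  const review = findReviewById(reviewId);
  if (review === null) {
    throw new NotFoundError(`Review with ID ${reviewId} not found.`);
  }
  return review;
}

// list*: always returns an array, empty when nothing matches.
function listReviewsByRecipeId(recipeId: string): Review[] {
  return [...reviews.values()].filter((review) => review.recipeId === recipeId);
}
```

A caller that treats absence as a normal outcome would reach for `find*`, while a caller that cannot continue without the row would use `get*` and let the error propagate.
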
### Example Migration

```sql
-- sql/migrations/20260121_add_recipe_reviews.sql

-- Create recipe_reviews table
-- uuid_generate_v4() requires the uuid-ossp extension
CREATE TABLE IF NOT EXISTS recipe_reviews (
  review_id UUID PRIMARY KEY DEFAULT uuid_generate_v4(),
  recipe_id UUID NOT NULL REFERENCES recipes(recipe_id) ON DELETE CASCADE,
  user_id UUID NOT NULL REFERENCES users(user_id) ON DELETE CASCADE,
  rating INTEGER NOT NULL CHECK (rating >= 1 AND rating <= 5),
  review_text TEXT,
  created_at TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP,
  updated_at TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP,
  UNIQUE (recipe_id, user_id)  -- one review per user per recipe
);

-- Add indexes
CREATE INDEX idx_recipe_reviews_recipe_id ON recipe_reviews(recipe_id);
CREATE INDEX idx_recipe_reviews_user_id ON recipe_reviews(user_id);
CREATE INDEX idx_recipe_reviews_rating ON recipe_reviews(rating);

-- Add trigger for updated_at
-- (update_updated_at_column() must already exist in the schema)
CREATE TRIGGER update_recipe_reviews_updated_at
  BEFORE UPDATE ON recipe_reviews
  FOR EACH ROW
  EXECUTE FUNCTION update_updated_at_column();
```

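
The `CHECK (rating >= 1 AND rating <= 5)` constraint in the migration rejects bad rows at the database, but the caller then only sees a constraint error. One option is to mirror the rule in application code before the insert. This is a sketch under that assumption; `ReviewInput` and `validateReviewInput` are hypothetical names, not part of the project.

```typescript
// Hypothetical input shape for creating a review.
interface ReviewInput {
  rating: number;
  reviewText?: string | null; // review_text is nullable in the migration
}

// Returns a list of problems; an empty list means the rating would pass
// the INTEGER type and the 1..5 CHECK constraint.
function validateReviewInput(input: ReviewInput): string[] {
  const problems: string[] = [];
  if (!Number.isInteger(input.rating)) {
    problems.push('rating must be an integer');
  } else if (input.rating < 1 || input.rating > 5) {
    problems.push('rating must be between 1 and 5');
  }
  return problems;
}
```

The database constraint remains the source of truth; a check like this only produces a friendlier error earlier.
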
### Example Repository Method

```typescript
// src/services/db/recipeReview.db.ts
import type { PoolClient } from 'pg'; // assumes node-postgres
import { handleDbError, NotFoundError } from './errors.db';
// getPool, log, and the RecipeReview type come from project modules (imports not shown).

export async function getReviewById(
  reviewId: string,
  client?: PoolClient
): Promise<RecipeReview> {
  // Use the caller's client (e.g. inside a transaction) or fall back to the pool
  const queryable = client || getPool();
  try {
    const result = await queryable.query<RecipeReview>(
      `SELECT * FROM recipe_reviews WHERE review_id = $1`,
      [reviewId]
    );
    if (result.rows.length === 0) {
      throw new NotFoundError(`Review with ID ${reviewId} not found.`);
    }
    return result.rows[0];
  } catch (error) {
    handleDbError(
      error,
      log,
      'Database error in getReviewById',
      { reviewId },
      {
        entityName: 'RecipeReview',
        defaultMessage: 'Failed to fetch review.',
      }
    );
  }
}

export async function listReviewsByRecipeId(
  recipeId: string,
  options: { limit?: number; offset?: number } = {},
  client?: PoolClient
): Promise<RecipeReview[]> {
  const queryable = client || getPool();
  const { limit = 50, offset = 0 } = options;
  try {
    const result = await queryable.query<RecipeReview>(
      `SELECT * FROM recipe_reviews
       WHERE recipe_id = $1
       ORDER BY created_at DESC
       LIMIT $2 OFFSET $3`,
      [recipeId, limit, offset]
    );
    return result.rows;
  } catch (error) {
    handleDbError(
      error,
      log,
      'Database error in listReviewsByRecipeId',
      { recipeId, limit, offset },
      {
        entityName: 'RecipeReview',
        defaultMessage: 'Failed to list reviews.',
      }
    );
  }
}
```

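
The optional `client` parameter in the methods above is what lets several repository calls share one transaction. How the project actually wraps transactions is defined in ADR-002 and is not shown here; the following is only a generic sketch of the idea. `Queryable`, `withTransaction`, and `createRecordingClient` are hypothetical names, and the recording client is a fake used to show statement order without a database.

```typescript
// Minimal shape standing in for a database client; only `query` is needed.
interface Queryable {
  query(text: string, params?: unknown[]): Promise<unknown>;
}

// Hypothetical helper: run `work` between BEGIN and COMMIT,
// issuing ROLLBACK and rethrowing if `work` fails.
async function withTransaction<T>(
  client: Queryable,
  work: (client: Queryable) => Promise<T>
): Promise<T> {
  await client.query('BEGIN');
  try {
    const result = await work(client);
    await client.query('COMMIT');
    return result;
  } catch (error) {
    await client.query('ROLLBACK');
    throw error;
  }
}

// A fake client that records statements instead of talking to PostgreSQL.
function createRecordingClient(): Queryable & { statements: string[] } {
  const statements: string[] = [];
  return {
    statements,
    async query(text: string) {
      statements.push(text);
      return { rows: [] };
    },
  };
}
```

A repository method that receives the same `client` inside `work` runs on the transaction's connection; one that omits it would fall back to the pool and run outside the transaction.
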
## The db-admin Subagent

### When to Use

Use the **db-admin** subagent when you need to:

- Debug production database issues
- Configure PostgreSQL settings
- Set up database backups
- Analyze slow query logs
- Configure Redis for production
- Plan database capacity
- Manage database users and permissions
- Handle replication or failover

### What db-admin Knows

The db-admin subagent understands:

- PostgreSQL configuration and tuning
- Redis configuration for BullMQ queues
- Backup and recovery strategies (ADR-019)
- Connection pooling settings
- Production deployment setup
- Bugsink PostgreSQL observability (ADR-050)

### Example Requests

**Performance tuning:**

```
"Use db-admin to analyze why the database is running slow.
Check connection pool settings, identify slow queries, and
recommend PostgreSQL configuration changes."
```

**Backup configuration:**

```
"Use db-admin to set up daily automated backups for the
production database with 30-day retention."
```

**User management:**

```
"Use db-admin to create a read-only database user for
reporting purposes that can only SELECT from specific tables."
```

### Database Users

| User | Database | Purpose |
|------|----------|---------|
| `flyer_crawler_prod` | `flyer-crawler-prod` | Production |
| `flyer_crawler_test` | `flyer-crawler-test` | Testing |
| `postgres` | All | Superuser (admin only) |

### Creating Database Users

```sql
-- As postgres superuser, in psql (\c is a psql meta-command)
CREATE DATABASE "flyer-crawler-test";
-- 'secure_password' is a placeholder; substitute a real secret
CREATE USER flyer_crawler_test WITH PASSWORD 'secure_password';
ALTER DATABASE "flyer-crawler-test" OWNER TO flyer_crawler_test;
\c "flyer-crawler-test"
ALTER SCHEMA public OWNER TO flyer_crawler_test;
GRANT CREATE, USAGE ON SCHEMA public TO flyer_crawler_test;
CREATE EXTENSION IF NOT EXISTS "uuid-ossp";
```

### PostgreSQL Configuration Guidance

For production, db-admin may recommend settings like:

```ini
# /etc/postgresql/14/main/conf.d/performance.conf

# Connections and memory
max_connections = 100
shared_buffers = 256MB

# Query optimization
effective_cache_size = 768MB
random_page_cost = 1.1

# Write performance
wal_buffers = 16MB
checkpoint_completion_target = 0.9

# Logging
log_min_duration_statement = 1000  # Log queries over 1 second
```

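
The memory values in the example match a commonly cited starting point: roughly 25% of system RAM for `shared_buffers` and up to about 75% for `effective_cache_size`. On a machine with 1 GB of RAM that gives 256 MB and 768 MB. Whether the production server actually has 1 GB is an assumption, and `suggestMemorySettings` below is a hypothetical helper that only does the arithmetic.

```typescript
interface MemorySettings {
  sharedBuffersMb: number;
  effectiveCacheSizeMb: number;
}

// Rule-of-thumb starting points only; real values should come from
// measuring the workload on the actual server.
function suggestMemorySettings(totalRamMb: number): MemorySettings {
  return {
    sharedBuffersMb: Math.floor(totalRamMb * 0.25),
    effectiveCacheSizeMb: Math.floor(totalRamMb * 0.75),
  };
}
```

Note that `effective_cache_size` is only a planner hint about how much data is likely cached; it does not allocate memory.
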
### Redis Configuration Guidance

For BullMQ queues:

```ini
# /etc/redis/redis.conf

# Memory management
maxmemory 256mb
# BullMQ requires noeviction (redis.conf comments must be on their own line)
maxmemory-policy noeviction

# Persistence
appendonly yes
appendfsync everysec

# Security
requirepass your_redis_password
```

## Common Database Tasks

### Running Migrations in Production

```bash
# SSH to production server
ssh root@projectium.com

# Run migration
cd /var/www/flyer-crawler.projectium.com
npm run db:migrate
```

### Checking Database Health

```bash
# Connection count
psql -c "SELECT count(*) FROM pg_stat_activity WHERE datname = 'flyer-crawler-prod';"

# Table sizes (format('%I.%I') quotes identifiers that need it)
psql -d "flyer-crawler-prod" -c "
SELECT
  tablename,
  pg_size_pretty(pg_total_relation_size(format('%I.%I', schemaname, tablename)::regclass)) as size
FROM pg_tables
WHERE schemaname = 'public'
ORDER BY pg_total_relation_size(format('%I.%I', schemaname, tablename)::regclass) DESC
LIMIT 10;"

# Slow queries (requires the pg_stat_statements extension)
psql -d "flyer-crawler-prod" -c "
SELECT
  calls,
  mean_exec_time::numeric(10,2) as avg_ms,
  query
FROM pg_stat_statements
ORDER BY mean_exec_time DESC
LIMIT 5;"
```

### Database Backup Commands

```bash
# Manual backup
pg_dump -U flyer_crawler_prod -h localhost "flyer-crawler-prod" > backup_$(date +%Y%m%d).sql

# Restore from backup
psql -U flyer_crawler_prod -h localhost "flyer-crawler-prod" < backup_20260121.sql
```

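
Because the manual backup command encodes the date in the file name (`backup_YYYYMMDD.sql`), a retention policy such as the 30 days mentioned in the backup example request can be decided from names alone. The sketch below is illustrative: `parseBackupDate` and `selectExpiredBackups` are hypothetical helpers, and they treat the date in the name as UTC, which is an approximation since `date` uses the server's local time.

```typescript
// Parse the date out of names like backup_20260121.sql; null for anything else.
// The date is interpreted as midnight UTC (an approximation).
function parseBackupDate(fileName: string): Date | null {
  const match = /^backup_(\d{4})(\d{2})(\d{2})\.sql$/.exec(fileName);
  if (match === null) {
    return null;
  }
  const [, year, month, day] = match;
  return new Date(Date.UTC(Number(year), Number(month) - 1, Number(day)));
}

// Returns the backup files dated more than `retentionDays` before `now`.
// Files that do not match the naming pattern are never selected.
function selectExpiredBackups(
  fileNames: string[],
  now: Date,
  retentionDays: number
): string[] {
  const cutoff = now.getTime() - retentionDays * 24 * 60 * 60 * 1000;
  return fileNames.filter((name) => {
    const date = parseBackupDate(name);
    return date !== null && date.getTime() < cutoff;
  });
}
```

The function only selects candidates; deleting them, and verifying that newer backups restore correctly first, is a separate step.
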
## N+1 Query Detection

The db-dev subagent is particularly skilled at identifying N+1 query problems:

**Problematic Pattern:**

```typescript
// BAD: N+1 queries
const flyers = await listFlyers();
for (const flyer of flyers) {
  flyer.items = await listItemsByFlyerId(flyer.flyer_id); // N queries!
}
```

**Optimized Pattern:**

```typescript
// GOOD: Single query with JOIN or separate batch query
const flyersWithItems = await listFlyersWithItems(); // 1 query

// Or with batching:
const flyers = await listFlyers();
const flyerIds = flyers.map(f => f.flyer_id);
const allItems = await listItemsByFlyerIds(flyerIds); // 1 query
// Group items by flyer_id in application code
```

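
The batching variant ends with "group items by flyer_id in application code". A minimal sketch of that grouping step follows. The `Flyer` and `FlyerItem` shapes are assumptions made for illustration (only `flyer_id` appears in the examples above), and `attachItems` is a hypothetical helper name.

```typescript
// Assumed shapes; only flyer_id is taken from the examples above.
interface Flyer {
  flyer_id: string;
  name: string;
}

interface FlyerItem {
  item_id: string;
  flyer_id: string;
}

interface FlyerWithItems extends Flyer {
  items: FlyerItem[];
}

// Groups items by flyer_id in one pass, then attaches each group to its flyer.
// Flyers without items get an empty array.
function attachItems(flyers: Flyer[], allItems: FlyerItem[]): FlyerWithItems[] {
  const itemsByFlyerId = new Map<string, FlyerItem[]>();
  for (const item of allItems) {
    const bucket = itemsByFlyerId.get(item.flyer_id);
    if (bucket) {
      bucket.push(item);
    } else {
      itemsByFlyerId.set(item.flyer_id, [item]);
    }
  }
  return flyers.map((flyer) => ({
    ...flyer,
    items: itemsByFlyerId.get(flyer.flyer_id) ?? [],
  }));
}
```

Building the `Map` first keeps the work linear in the number of flyers plus items, rather than filtering the full item list once per flyer.
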
## Working with PostGIS

The project uses PostGIS for spatial queries. Example:

```sql
-- Find stores within 10km of a location
-- Note: ST_MakePoint takes (longitude, latitude)
SELECT
  s.store_id,
  s.name,
  ST_Distance(
    sl.location::geography,
    ST_MakePoint(-79.3832, 43.6532)::geography
  ) / 1000 as distance_km
FROM stores s
JOIN store_locations sl ON s.store_id = sl.store_id
WHERE ST_DWithin(
  sl.location::geography,
  ST_MakePoint(-79.3832, 43.6532)::geography,
  10000  -- 10km in meters
)
ORDER BY distance_km;
```

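
When testing a radius query like the one above, it can help to have an independent estimate of the expected distance. The sketch below uses the haversine formula on a sphere of radius 6371 km. This is not what PostGIS does for `geography` values, which are measured on a spheroid by default, so results can differ by a fraction of a percent. `haversineKm` and `isWithinRadiusKm` are hypothetical helper names.

```typescript
const EARTH_RADIUS_KM = 6371; // mean Earth radius, spherical approximation

function toRadians(degrees: number): number {
  return (degrees * Math.PI) / 180;
}

// Great-circle distance in kilometres between two (latitude, longitude) points.
// Note the argument order is latitude first, unlike ST_MakePoint(lon, lat).
function haversineKm(lat1: number, lon1: number, lat2: number, lon2: number): number {
  const dLat = toRadians(lat2 - lat1);
  const dLon = toRadians(lon2 - lon1);
  const a =
    Math.sin(dLat / 2) ** 2 +
    Math.cos(toRadians(lat1)) * Math.cos(toRadians(lat2)) * Math.sin(dLon / 2) ** 2;
  return 2 * EARTH_RADIUS_KM * Math.asin(Math.sqrt(a));
}

// Approximate counterpart of ST_DWithin for quick sanity checks in tests.
function isWithinRadiusKm(
  lat1: number,
  lon1: number,
  lat2: number,
  lon2: number,
  radiusKm: number
): boolean {
  return haversineKm(lat1, lon1, lat2, lon2) <= radiusKm;
}
```

For points close to the radius boundary the two methods may disagree, so a test built on this estimate should keep its fixtures well inside or well outside the radius.
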
## MCP Database Access

For direct database queries during development, use the MCP server:

```
// Query the dev database
mcp__devdb__query("SELECT * FROM flyers LIMIT 5")
```

This is useful for:
- Verifying data during debugging
- Checking schema state
- Testing queries before implementing

## Related Documentation

- [OVERVIEW.md](./OVERVIEW.md) - Subagent system overview
- [CODER-GUIDE.md](./CODER-GUIDE.md) - Working with the coder subagent
- [../adr/0034-repository-pattern-standards.md](../adr/0034-repository-pattern-standards.md) - Repository patterns
- [../adr/0002-standardized-transaction-management.md](../adr/0002-standardized-transaction-management.md) - Transaction management
- [../adr/0019-data-backup-and-recovery-strategy.md](../adr/0019-data-backup-and-recovery-strategy.md) - Backup strategy
- [../adr/0050-postgresql-function-observability.md](../adr/0050-postgresql-function-observability.md) - Database observability
- [../BARE-METAL-SETUP.md](../BARE-METAL-SETUP.md) - Production database setup