doc updates and test fixin
This commit is contained in:
406
docs/subagents/SECURITY-DEBUG-GUIDE.md
Normal file
406
docs/subagents/SECURITY-DEBUG-GUIDE.md
Normal file
@@ -0,0 +1,406 @@
|
||||
# Security and Debugging Subagent Guide
|
||||
|
||||
This guide covers security and debugging-focused subagents:
|
||||
|
||||
- **security-engineer**: Security audits, vulnerability scanning, OWASP, pentesting
|
||||
- **log-debug**: Production errors, observability, Bugsink/Sentry analysis
|
||||
- **code-reviewer**: Code quality, security review, best practices
|
||||
|
||||
## The security-engineer Subagent
|
||||
|
||||
### When to Use
|
||||
|
||||
Use the **security-engineer** subagent when you need to:
|
||||
|
||||
- Conduct security audits of code or features
|
||||
- Review authentication/authorization flows
|
||||
- Identify vulnerabilities (OWASP Top 10)
|
||||
- Review API security
|
||||
- Assess data protection measures
|
||||
- Plan security improvements
|
||||
|
||||
### What security-engineer Knows
|
||||
|
||||
The security-engineer subagent understands:
|
||||
|
||||
- OWASP Top 10 vulnerabilities
|
||||
- Node.js/Express security best practices
|
||||
- JWT authentication security
|
||||
- SQL injection prevention
|
||||
- XSS and CSRF protection
|
||||
- Rate limiting strategies (ADR-032)
|
||||
- API security hardening (ADR-016)
|
||||
|
||||
### Example Requests
|
||||
|
||||
**Security Audit:**
|
||||
```
|
||||
"Use security-engineer to audit the user registration and
|
||||
login flow for security vulnerabilities. Check for common
|
||||
issues like credential stuffing, brute force, and session
|
||||
management problems."
|
||||
```
|
||||
|
||||
**API Security Review:**
|
||||
```
|
||||
"Use security-engineer to review the flyer upload endpoint
|
||||
for security issues. Consider file type validation, size
|
||||
limits, malicious file handling, and authorization."
|
||||
```
|
||||
|
||||
**Vulnerability Assessment:**
|
||||
```
|
||||
"Use security-engineer to assess our exposure to the OWASP
|
||||
Top 10 vulnerabilities. Identify any gaps in our current
|
||||
security measures."
|
||||
```
|
||||
|
||||
### Security Checklist
|
||||
|
||||
The security-engineer subagent uses this checklist:
|
||||
|
||||
#### Authentication & Authorization
|
||||
- [ ] Password hashing with bcrypt (cost factor >= 10)
|
||||
- [ ] JWT tokens with appropriate expiration
|
||||
- [ ] Refresh token rotation
|
||||
- [ ] Session invalidation on password change
|
||||
- [ ] Role-based access control (RBAC)
|
||||
|
||||
#### Input Validation
|
||||
- [ ] All user input validated with Zod schemas
|
||||
- [ ] SQL queries use parameterized statements
|
||||
- [ ] File uploads validated for type and size
|
||||
- [ ] Path traversal prevention
|
||||
|
||||
#### Data Protection
|
||||
- [ ] Sensitive data encrypted at rest
|
||||
- [ ] HTTPS enforced in production
|
||||
- [ ] No secrets in source code
|
||||
- [ ] Proper error messages (no stack traces to users)
|
||||
|
||||
#### Rate Limiting
|
||||
- [ ] Login attempts limited
|
||||
- [ ] API endpoints rate limited
|
||||
- [ ] File upload rate limited
|
||||
|
||||
#### Headers & CORS
|
||||
- [ ] Security headers set (Helmet.js)
|
||||
- [ ] CORS configured appropriately
|
||||
- [ ] Content-Security-Policy defined
|
||||
|
||||
### Security Patterns in This Project
|
||||
|
||||
**Rate Limiting (ADR-032):**
|
||||
```typescript
|
||||
// src/config/rateLimiters.ts
|
||||
export const loginLimiter = rateLimit({
|
||||
windowMs: 15 * 60 * 1000, // 15 minutes
|
||||
max: 5, // 5 attempts per window
|
||||
message: 'Too many login attempts',
|
||||
});
|
||||
```
|
||||
|
||||
**Input Validation (ADR-003):**
|
||||
```typescript
|
||||
// src/middleware/validation.middleware.ts
|
||||
router.post(
|
||||
'/register',
|
||||
validateRequest(registerSchema),
|
||||
async (req, res, next) => { ... }
|
||||
);
|
||||
```
|
||||
|
||||
**Authentication (ADR-048):**
|
||||
```typescript
|
||||
// JWT with refresh tokens
|
||||
const accessToken = jwt.sign(payload, secret, { expiresIn: '15m' });
|
||||
const refreshToken = jwt.sign({ userId }, refreshSecret, { expiresIn: '7d' });
|
||||
```
|
||||
|
||||
## The log-debug Subagent
|
||||
|
||||
### When to Use
|
||||
|
||||
Use the **log-debug** subagent when you need to:
|
||||
|
||||
- Debug production errors
|
||||
- Analyze Bugsink/Sentry error reports
|
||||
- Investigate performance issues
|
||||
- Trace request flows through logs
|
||||
- Identify patterns in error occurrences
|
||||
|
||||
### What log-debug Knows
|
||||
|
||||
The log-debug subagent understands:
|
||||
|
||||
- Pino structured logging
|
||||
- Bugsink/Sentry error tracking
|
||||
- Log aggregation with Logstash
|
||||
- PostgreSQL function observability (ADR-050)
|
||||
- Request tracing patterns
|
||||
- Error correlation
|
||||
|
||||
### MCP Tools for Debugging
|
||||
|
||||
The log-debug subagent can use MCP tools to access error tracking:
|
||||
|
||||
```
|
||||
// Check Bugsink for production errors
|
||||
mcp__bugsink__list_projects()
|
||||
mcp__bugsink__list_issues({ project_id: 1 })
|
||||
mcp__bugsink__get_event({ event_id: "..." })
|
||||
mcp__bugsink__get_stacktrace({ event_id: "..." })
|
||||
|
||||
// Check local dev errors
|
||||
mcp__localerrors__list_issues({ project_id: 1 })
|
||||
```
|
||||
|
||||
### Example Requests
|
||||
|
||||
**Production Error Investigation:**
|
||||
```
|
||||
"Use log-debug to investigate the spike in 500 errors on the
|
||||
flyer processing endpoint yesterday. Check Bugsink for error
|
||||
patterns and identify the root cause."
|
||||
```
|
||||
|
||||
**Performance Analysis:**
|
||||
```
|
||||
"Use log-debug to analyze the slow response times on the deals
|
||||
page. Check logs for database query timing and identify any
|
||||
bottlenecks."
|
||||
```
|
||||
|
||||
**Error Pattern Analysis:**
|
||||
```
|
||||
"Use log-debug to identify patterns in the authentication
|
||||
failures over the past week. Are they coming from specific
|
||||
IPs or affecting specific users?"
|
||||
```
|
||||
|
||||
### Log Analysis Patterns
|
||||
|
||||
**Structured Log Format (Pino):**
|
||||
```json
|
||||
{
|
||||
"level": 50,
|
||||
"time": 1704067200000,
|
||||
"pid": 1234,
|
||||
"hostname": "server1",
|
||||
"module": "flyerService",
|
||||
"requestId": "abc-123",
|
||||
"userId": "user-456",
|
||||
"msg": "Flyer processing failed",
|
||||
"err": {
|
||||
"type": "AIExtractionError",
|
||||
"message": "Rate limit exceeded",
|
||||
"stack": "..."
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
**Request Tracing:**
|
||||
```typescript
|
||||
// Each request gets a unique ID for tracing
|
||||
app.use((req, res, next) => {
|
||||
req.requestId = crypto.randomUUID();
|
||||
req.log = logger.child({ requestId: req.requestId });
|
||||
next();
|
||||
});
|
||||
```
|
||||
|
||||
**Error Correlation:**
|
||||
- Same `requestId` across all logs for a request
|
||||
- Same `userId` for user-related errors
|
||||
- Same `flyerId` for flyer processing errors
|
||||
|
||||
### Bugsink Error Tracking
|
||||
|
||||
**Production Bugsink Projects:**
|
||||
|
||||
| Project | ID | Purpose |
|
||||
|---------|-----|---------|
|
||||
| flyer-crawler-backend | 1 | Backend errors |
|
||||
| flyer-crawler-frontend | 2 | Frontend errors |
|
||||
| flyer-crawler-backend-test | 3 | Test backend |
|
||||
| flyer-crawler-frontend-test | 4 | Test frontend |
|
||||
| flyer-crawler-infrastructure | 5 | Infra errors |
|
||||
|
||||
**Accessing Bugsink:**
|
||||
- Production: https://bugsink.projectium.com
|
||||
- Dev Container: http://localhost:8000
|
||||
|
||||
### Log File Locations
|
||||
|
||||
| Environment | Log Path |
|
||||
|-------------|----------|
|
||||
| Production | `/var/www/flyer-crawler.projectium.com/logs/app.log` |
|
||||
| Test | `/var/www/flyer-crawler-test.projectium.com/logs/app.log` |
|
||||
| Dev Container | `/app/logs/app.log` |
|
||||
|
||||
## The code-reviewer Subagent
|
||||
|
||||
### When to Use
|
||||
|
||||
Use the **code-reviewer** subagent when you need to:
|
||||
|
||||
- Review code quality before merging
|
||||
- Identify potential issues in implementations
|
||||
- Check adherence to project patterns
|
||||
- Review security implications
|
||||
- Assess test coverage
|
||||
|
||||
### What code-reviewer Knows
|
||||
|
||||
The code-reviewer subagent understands:
|
||||
|
||||
- Project architecture patterns (ADRs)
|
||||
- Repository pattern standards (ADR-034)
|
||||
- Service layer architecture (ADR-035)
|
||||
- Testing standards (ADR-010)
|
||||
- TypeScript best practices
|
||||
- Security considerations
|
||||
|
||||
### Example Requests
|
||||
|
||||
**Code Review:**
|
||||
```
|
||||
"Use code-reviewer to review the changes in the shopping list
|
||||
feature branch. Check for adherence to project patterns,
|
||||
potential bugs, and security issues."
|
||||
```
|
||||
|
||||
**Architecture Review:**
|
||||
```
|
||||
"Use code-reviewer to review the proposed changes to the
|
||||
caching layer. Does it follow our patterns? Are there
|
||||
potential issues with cache invalidation?"
|
||||
```
|
||||
|
||||
**Security-Focused Review:**
|
||||
```
|
||||
"Use code-reviewer to review the new file upload handling
|
||||
code with a focus on security. Check for path traversal,
|
||||
file type validation, and size limits."
|
||||
```
|
||||
|
||||
### Code Review Checklist
|
||||
|
||||
The code-reviewer subagent checks:
|
||||
|
||||
#### Code Quality
|
||||
- [ ] Follows TypeScript strict mode
|
||||
- [ ] No `any` types without justification
|
||||
- [ ] Proper error handling
|
||||
- [ ] Meaningful variable names
|
||||
- [ ] Appropriate comments
|
||||
|
||||
#### Architecture
|
||||
- [ ] Follows layer separation (Routes -> Services -> Repositories)
|
||||
- [ ] Uses correct file naming conventions
|
||||
- [ ] Repository methods follow naming patterns
|
||||
- [ ] Transactions used for multi-operation changes
|
||||
|
||||
#### Testing
|
||||
- [ ] New code has corresponding tests
|
||||
- [ ] Tests follow project patterns
|
||||
- [ ] Edge cases covered
|
||||
- [ ] Mocks used appropriately
|
||||
|
||||
#### Security
|
||||
- [ ] Input validation present
|
||||
- [ ] Authorization checks in place
|
||||
- [ ] No secrets in code
|
||||
- [ ] Error messages don't leak information
|
||||
|
||||
#### Performance
|
||||
- [ ] No obvious N+1 queries
|
||||
- [ ] Appropriate use of caching
|
||||
- [ ] Large data sets paginated
|
||||
- [ ] Expensive operations async/queued
|
||||
|
||||
### Review Output Format
|
||||
|
||||
```markdown
|
||||
## Code Review: [Feature/PR Name]
|
||||
|
||||
### Summary
|
||||
Brief overview of the changes reviewed.
|
||||
|
||||
### Issues Found
|
||||
|
||||
#### Critical
|
||||
- **[File:Line]** Description of critical issue
|
||||
- Impact: What could go wrong
|
||||
- Suggestion: How to fix
|
||||
|
||||
#### High Priority
|
||||
- **[File:Line]** Description
|
||||
|
||||
#### Medium Priority
|
||||
- **[File:Line]** Description
|
||||
|
||||
#### Low Priority / Suggestions
|
||||
- **[File:Line]** Description
|
||||
|
||||
### Positive Observations
|
||||
- Good patterns followed
|
||||
- Well-tested areas
|
||||
- Clean implementations
|
||||
|
||||
### Recommendations
|
||||
1. Priority items to address before merge
|
||||
2. Items for follow-up tickets
|
||||
```
|
||||
|
||||
## Debugging Workflow
|
||||
|
||||
### 1. Error Investigation
|
||||
|
||||
```
|
||||
1. Identify the error in Bugsink
|
||||
mcp__bugsink__list_issues({ project_id: 1, status: "unresolved" })
|
||||
|
||||
2. Get error details
|
||||
mcp__bugsink__get_issue({ issue_id: "..." })
|
||||
|
||||
3. Get full stacktrace
|
||||
mcp__bugsink__get_stacktrace({ event_id: "..." })
|
||||
|
||||
4. Check for patterns across events
|
||||
mcp__bugsink__list_events({ issue_id: "..." })
|
||||
```
|
||||
|
||||
### 2. Log Correlation
|
||||
|
||||
```bash
|
||||
# Find related logs by request ID
|
||||
grep "requestId\":\"abc-123\"" /var/www/flyer-crawler.projectium.com/logs/app.log
|
||||
|
||||
# Find all errors in a time range
|
||||
jq 'select(.level >= 50 and .time >= 1704067200000)' app.log
|
||||
```
|
||||
|
||||
### 3. Database Query Analysis
|
||||
|
||||
```bash
|
||||
# Check slow query log
|
||||
tail -f /var/log/postgresql/postgresql-$(date +%Y-%m-%d).log | grep "duration:"
|
||||
```
|
||||
|
||||
### 4. Root Cause Analysis
|
||||
|
||||
- Correlate error timing with deployments
|
||||
- Check for resource constraints (memory, connections)
|
||||
- Review recent code changes
|
||||
- Check external service status
|
||||
|
||||
## Related Documentation
|
||||
|
||||
- [OVERVIEW.md](./OVERVIEW.md) - Subagent system overview
|
||||
- [DEVOPS-GUIDE.md](./DEVOPS-GUIDE.md) - Infrastructure debugging
|
||||
- [../adr/0016-api-security-hardening.md](../adr/0016-api-security-hardening.md) - Security ADR
|
||||
- [../adr/0032-rate-limiting-strategy.md](../adr/0032-rate-limiting-strategy.md) - Rate limiting
|
||||
- [../adr/0015-application-performance-monitoring-and-error-tracking.md](../adr/0015-application-performance-monitoring-and-error-tracking.md) - Monitoring ADR
|
||||
- [../adr/0050-postgresql-function-observability.md](../adr/0050-postgresql-function-observability.md) - Database observability
|
||||
- [../BARE-METAL-SETUP.md](../BARE-METAL-SETUP.md) - Production setup
|
||||
Reference in New Issue
Block a user