flyer-crawler.projectium.com/docs/adr/0029-secret-rotation-and-key-management.md
Torben Sorensen 4a04e478c4
2026-01-09 05:56:19 -08:00

ADR-029: Secret Rotation and Key Management Strategy

Date: 2026-01-09

Status: Proposed

Context

While ADR-007 covers configuration validation at startup, it does not address the lifecycle management of secrets:

  1. JWT Secrets: If the JWT_SECRET is rotated, all existing user sessions are immediately invalidated
  2. Database Credentials: No documented procedure for rotating database passwords without downtime
  3. API Keys: External service API keys (AI services, geocoding) have no rotation strategy
  4. Emergency Revocation: No process for immediately invalidating compromised credentials

Current risks:

  • Long-lived secrets that never change become high-value targets
  • No ability to rotate secrets without application restart
  • No audit trail of when secrets were last rotated
  • Compromised keys could remain active indefinitely

Decision

We will implement a secret rotation and key management strategy covering JWT secrets, database credentials, external API keys, emergency revocation, and an audit trail.

1. JWT Secret Rotation with Dual-Key Support

Support multiple JWT secrets simultaneously to enable zero-downtime rotation:

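// Environment variables
// JWT_SECRET=<current secret>
// JWT_SECRET_PREVIOUS=<old secret>  (optional, set only during the transition period)

import jwt from 'jsonwebtoken';

// Token verification tries the current secret first, then falls back to the previous one.
// AuthenticationError is assumed to be the application's own error class.
const verifyToken = (token: string) => {
  const secrets = [process.env.JWT_SECRET, process.env.JWT_SECRET_PREVIOUS].filter(
    (s): s is string => Boolean(s),
  );
  for (const secret of secrets) {
    try {
      return jwt.verify(token, secret);
    } catch {
      // Signature or claims check failed with this secret; try the next one.
    }
  }
  // Every configured secret rejected the token (or none is configured).
  throw new AuthenticationError('Invalid token');
};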
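
The rotation step itself can be sketched as a pure function over the two variables. The names `JwtSecrets`, `rotateJwtSecret`, and `endGracePeriod` are hypothetical and not part of the codebase; the sketch assumes the new secret is generated elsewhere and that the returned values are then written to the environment by the deployment tooling.

```typescript
// Hypothetical shape of the two JWT-related settings described above.
interface JwtSecrets {
  JWT_SECRET: string;
  JWT_SECRET_PREVIOUS?: string;
}

// Returns the settings to deploy for a rotation.
// Planned rotation: the outgoing secret moves to JWT_SECRET_PREVIOUS so tokens
// signed with it keep verifying during the grace period.
// Emergency rotation: the outgoing secret is dropped so it stops verifying at once.
function rotateJwtSecret(
  current: JwtSecrets,
  newSecret: string,
  options: { emergency?: boolean } = {},
): JwtSecrets {
  if (newSecret.length === 0) {
    throw new Error('New secret must not be empty');
  }
  if (newSecret === current.JWT_SECRET) {
    throw new Error('New secret must differ from the current secret');
  }
  if (options.emergency) {
    return { JWT_SECRET: newSecret };
  }
  return { JWT_SECRET: newSecret, JWT_SECRET_PREVIOUS: current.JWT_SECRET };
}

// Ends the grace period: only the current secret remains valid.
function endGracePeriod(current: JwtSecrets): JwtSecrets {
  return { JWT_SECRET: current.JWT_SECRET };
}
```

A planned rotation would call `rotateJwtSecret` once, then `endGracePeriod` after the transition window has passed.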

2. Database Credential Rotation

Document and implement a procedure for PostgreSQL credential rotation:

  1. Create new database user with identical permissions
  2. Update application configuration to use new credentials
  3. Restart application instances (rolling restart)
  4. Remove the old database user after all instances have been updated
  5. Log rotation event for audit purposes
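
As a rough illustration of steps 1 and 4, the SQL for a rotation could be generated as follows. This is a sketch under stated assumptions: privileges are held by a shared group role (hypothetically `app_rw`) so that "identical permissions" reduces to one `GRANT`, and the helper names (`buildDbRotationPlan`, `quoteIdent`, `quoteLiteral`) are invented for this example. The statements themselves (`CREATE ROLE`, `GRANT`, `REASSIGN OWNED`, `DROP OWNED`, `DROP ROLE`) are standard PostgreSQL commands.

```typescript
// Quote a PostgreSQL identifier by doubling embedded double quotes.
function quoteIdent(name: string): string {
  return '"' + name.replace(/"/g, '""') + '"';
}

// Quote a PostgreSQL string literal by doubling embedded single quotes
// (assumes standard_conforming_strings is on, the PostgreSQL default).
function quoteLiteral(value: string): string {
  return "'" + value.replace(/'/g, "''") + "'";
}

interface DbRotationPlan {
  before: string[]; // run before the rolling restart (step 1)
  after: string[]; // run once no instance uses the old user (step 4)
}

// Builds the SQL for one rotation. `groupRole` is an assumed shared role
// that carries the application's privileges.
function buildDbRotationPlan(params: {
  oldUser: string;
  newUser: string;
  newPassword: string;
  groupRole: string;
}): DbRotationPlan {
  if (params.oldUser === params.newUser) {
    throw new Error('New user must differ from the old user');
  }
  const oldUser = quoteIdent(params.oldUser);
  const newUser = quoteIdent(params.newUser);
  return {
    before: [
      `CREATE ROLE ${newUser} WITH LOGIN PASSWORD ${quoteLiteral(params.newPassword)};`,
      `GRANT ${quoteIdent(params.groupRole)} TO ${newUser};`,
    ],
    after: [
      // DROP ROLE fails while the role still owns objects or holds privileges,
      // so ownership is transferred first. Both OWNED commands act on the
      // current database only and must be repeated in each database used.
      `REASSIGN OWNED BY ${oldUser} TO ${newUser};`,
      `DROP OWNED BY ${oldUser};`,
      `DROP ROLE ${oldUser};`,
    ],
  };
}
```

The `before` statements would run ahead of the configuration change and rolling restart; the `after` statements would run only once every instance has reconnected with the new user.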

3. API Key Management

For external service API keys (Google AI, geocoding services):

  1. Naming Convention: {SERVICE}_API_KEY and {SERVICE}_API_KEY_PREVIOUS
  2. Fallback Logic: Try primary key, fall back to previous on 401/403
  3. Health Checks: Validate API keys on startup
  4. Usage Logging: Track which key is being used for each request
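
The fallback and usage-logging points above could look roughly like the wrapper below. It is a sketch, not the project's client code: `callWithKeyFallback`, `ApiResult`, and the `log` callback are hypothetical names, and the actual HTTP call is passed in as a function so the example stays independent of any particular provider SDK.

```typescript
// Result shape assumed for illustration: the HTTP status plus a parsed body.
interface ApiResult<T> {
  status: number;
  body?: T;
}

type KeyLabel = 'primary' | 'previous';

// Calls `request` with the primary key and retries once with the previous key
// when the provider answers 401 or 403. `log` is told which key was used and
// what status came back; it never receives the key value itself.
async function callWithKeyFallback<T>(
  keys: { primary: string; previous?: string },
  request: (apiKey: string) => Promise<ApiResult<T>>,
  log: (label: KeyLabel, status: number) => void = () => {},
): Promise<ApiResult<T>> {
  const first = await request(keys.primary);
  log('primary', first.status);
  const rejected = first.status === 401 || first.status === 403;
  if (!rejected || !keys.previous) {
    return first;
  }
  const second = await request(keys.previous);
  log('previous', second.status);
  return second;
}
```

Other failures (timeouts, 5xx responses) are deliberately not retried with the previous key, since a different key would not fix them.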

4. Emergency Revocation Procedures

Document emergency procedures for:

  • JWT Compromise: Set a new JWT_SECRET, unset JWT_SECRET_PREVIOUS so the compromised secret stops verifying tokens, and clear all refresh tokens from the database
  • Database Compromise: Rotate credentials immediately, audit access logs
  • API Key Compromise: Regenerate at provider, update environment, restart

5. Secret Audit Trail

Track secret lifecycle events:

  • When secrets were last rotated
  • Who initiated the rotation
  • Which instances are using which secrets
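
One possible shape for these records is sketched below. The type and function names (`SecretRotationEvent`, `InstanceSecretReport`, `lastRotation`, `instancesOnStaleKey`) are assumptions for illustration, as is the idea of a non-secret `keyId` label (for example a version number) that lets instances report which secret they hold without exposing its value.

```typescript
// One entry per rotation. The secret value itself is never stored.
interface SecretRotationEvent {
  secretName: string; // e.g. 'JWT_SECRET'
  keyId: string; // non-secret label of the new value, e.g. 'v7'
  rotatedAt: Date;
  initiatedBy: string;
}

// What a running instance reports about the secrets it has loaded.
interface InstanceSecretReport {
  instanceId: string;
  secretName: string;
  keyId: string;
}

// Most recent rotation of the given secret, or undefined if none is recorded.
function lastRotation(
  events: SecretRotationEvent[],
  secretName: string,
): SecretRotationEvent | undefined {
  let latest: SecretRotationEvent | undefined;
  for (const event of events) {
    if (event.secretName !== secretName) continue;
    if (!latest || event.rotatedAt.getTime() > latest.rotatedAt.getTime()) {
      latest = event;
    }
  }
  return latest;
}

// Instances still running with something other than the latest key.
function instancesOnStaleKey(
  events: SecretRotationEvent[],
  reports: InstanceSecretReport[],
  secretName: string,
): string[] {
  const latest = lastRotation(events, secretName);
  if (!latest) return [];
  return reports
    .filter((r) => r.secretName === secretName && r.keyId !== latest.keyId)
    .map((r) => r.instanceId);
}
```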

Implementation Approach

Phase 1: Dual JWT Secret Support

  • Modify token verification to support fallback secret
  • Add JWT_SECRET_PREVIOUS to configuration schema
  • Update documentation

Phase 2: Rotation Scripts

  • Create scripts/rotate-jwt-secret.sh
  • Create scripts/rotate-db-credentials.sh
  • Add rotation instructions to operations runbook

Phase 3: API Key Fallback

  • Wrap external API clients with fallback logic
  • Add key validation to health checks
  • Implement key usage logging

Consequences

Positive

  • Zero-Downtime Rotation: Secrets can be rotated without invalidating all sessions
  • Reduced Risk: Regular rotation limits exposure window for compromised credentials
  • Audit Trail: Clear record of when secrets were changed
  • Emergency Response: Documented procedures for security incidents

Negative

  • Complexity: Dual-key logic adds code complexity
  • Operations Overhead: Regular rotation requires operational discipline
  • Testing: Rotation procedures need to be tested periodically

Implementation Status

What's Implemented

  • Not yet implemented

What Needs To Be Done

  1. Implement dual JWT secret verification
  2. Create rotation scripts
  3. Document emergency procedures
  4. Add secret validation to health checks
  5. Create rotation schedule recommendations

Key Files (To Be Created)

  • src/utils/secretManager.ts - Secret rotation utilities
  • scripts/rotate-jwt-secret.sh - JWT rotation script
  • scripts/rotate-db-credentials.sh - Database credential rotation
  • docs/operations/secret-rotation.md - Operations runbook

Rotation Schedule Recommendations

| Secret Type | Rotation Frequency | Grace Period |
| --- | --- | --- |
| JWT_SECRET | 90 days | 7 days (dual-key) |
| Database Passwords | 180 days | Rolling restart |
| AI API Keys | On suspicion of compromise | Immediate |
| Refresh Tokens | 7-day max age | N/A (per-token) |
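
The time-based rows of this schedule can be checked mechanically. The sketch below encodes only the JWT_SECRET and database password rows; `RotationPolicy`, `isRotationDue`, and `graceEndsAt` are hypothetical names, and treating the database "rolling restart" grace period as zero days is an assumption made for the example.

```typescript
const DAY_MS = 24 * 60 * 60 * 1000;

interface RotationPolicy {
  frequencyDays: number; // rotate at least this often
  graceDays: number; // how long the previous secret stays valid
}

// Values taken from the schedule; the database grace period of 0 is an assumption.
const policies: Record<string, RotationPolicy> = {
  JWT_SECRET: { frequencyDays: 90, graceDays: 7 },
  DATABASE_PASSWORD: { frequencyDays: 180, graceDays: 0 },
};

// True once the secret has been in use for at least `frequencyDays`.
function isRotationDue(lastRotated: Date, policy: RotationPolicy, now: Date): boolean {
  const ageMs = now.getTime() - lastRotated.getTime();
  return ageMs >= policy.frequencyDays * DAY_MS;
}

// The moment after which the previous secret should be removed.
function graceEndsAt(rotatedAt: Date, policy: RotationPolicy): Date {
  return new Date(rotatedAt.getTime() + policy.graceDays * DAY_MS);
}
```

A scheduled job could call `isRotationDue` with the last rotation date from the audit trail and raise an alert when it returns true.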