flyer-crawler.projectium.com/docs/adr/0015-error-tracking-and-observability.md
Torben Sorensen 61cfb518e6
2026-01-26 11:48:42 -08:00

ADR-015: Error Tracking and Observability

Date: 2025-12-12

Status: Accepted (Fully Implemented)

Updated: 2026-01-26 (user context integration completed)

Related: ADR-056 (Application Performance Monitoring)

Context

While ADR-004 established structured logging with Pino, the application lacks a high-level, aggregated view of its health and errors. It's difficult to spot trends, identify recurring issues, or be proactively notified of new types of errors.

Key requirements:

  1. Self-hosted: No external SaaS dependencies for error tracking
  2. Sentry SDK compatible: Leverage mature, well-documented SDKs
  3. Lightweight: Minimal resource overhead in the dev container
  4. Production-ready: Same architecture works on bare-metal production servers
  5. AI-accessible: MCP server integration for Claude Code and other AI tools

Note: Application Performance Monitoring (APM) and distributed tracing are covered separately in ADR-056.

Decision

We implement a self-hosted error tracking stack using Bugsink as the Sentry-compatible backend, with the following components:

1. Error Tracking Backend: Bugsink

Bugsink is a lightweight, self-hosted Sentry alternative that:

  • Runs as a single process (no Kafka, Redis, ClickHouse required)
  • Is fully compatible with Sentry SDKs
  • Supports ARM64 and AMD64 architectures
  • Can use SQLite (dev) or PostgreSQL (production)

Deployment:

  • Dev container: Installed as a systemd service inside the container
  • Production: Runs as a systemd service on bare-metal, listening on localhost only
  • Database: Uses PostgreSQL in both environments, with a dedicated bugsink user and bugsink database on the same PostgreSQL instance as the main application

2. Backend Integration: @sentry/node

The Express backend integrates @sentry/node SDK to:

  • Capture unhandled exceptions before PM2 (or another process manager) restarts the process
  • Report errors with full stack traces and context
  • Integrate with Pino logger for breadcrumbs
  • Filter errors by severity (only 5xx errors sent by default)
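
The severity filter in the last bullet can be sketched as a small predicate. This is a minimal illustration, not the project's actual error handler: the helper name shouldReportError and the HttpErrorLike shape are hypothetical, and treating errors without a status code as 500 is an assumption.

```typescript
// Hypothetical helper (not from the project's code): decides whether an
// error caught by the Express error handler should be sent to Bugsink.
interface HttpErrorLike {
  status?: number; // property used by many HTTP error libraries
  statusCode?: number; // alternative property name some libraries use
}

function shouldReportError(err: HttpErrorLike): boolean {
  // Assumption: an error with no explicit status is an unexpected failure (500).
  const status = err.status ?? err.statusCode ?? 500;
  // Only server-side failures (5xx) are reported; 4xx client errors are skipped.
  return status >= 500 && status <= 599;
}
```

In an Express error-handling middleware, a check like this would sit in front of the Sentry.captureException(err) call, so that validation and authorization failures do not create issues in Bugsink.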

3. Frontend Integration: @sentry/react

The React frontend integrates @sentry/react SDK to:

  • Wrap the app in an Error Boundary for graceful error handling
  • Capture unhandled JavaScript errors
  • Report errors with component stack traces
  • Filter out browser extension errors
  • Frontend Error Correlation: The global API client intercepts 4xx/5xx responses and can attach the x-request-id header to the Sentry scope, so frontend errors can be correlated with backend logs
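
One way to filter out browser extension errors is to inspect the stack frames of an event and drop it when any frame originates from an extension URL. The sketch below is an assumption about how such a filter could look, not the project's code: the types are local stand-ins for the subset of the Sentry event shape used here, and filterExtensionErrors is a hypothetical name.

```typescript
// Minimal local stand-ins for the parts of a Sentry event this filter reads.
interface FrameLike {
  filename?: string;
}
interface ExceptionLike {
  stacktrace?: { frames?: FrameLike[] };
}
interface EventLike {
  exception?: { values?: ExceptionLike[] };
}

// URL schemes under which browser extensions load their scripts.
const EXTENSION_SCHEMES = [
  "chrome-extension://",
  "moz-extension://",
  "safari-extension://",
  "safari-web-extension://",
];

function isExtensionFrame(frame: FrameLike): boolean {
  const filename = frame.filename ?? "";
  return EXTENSION_SCHEMES.some((scheme) => filename.indexOf(scheme) === 0);
}

// Returns null to drop the event, which is the contract of a beforeSend hook;
// returns the event unchanged when no frame comes from an extension.
function filterExtensionErrors<T extends EventLike>(event: T): T | null {
  const exceptions = event.exception?.values ?? [];
  const fromExtension = exceptions.some((ex) =>
    (ex.stacktrace?.frames ?? []).some(isExtensionFrame),
  );
  return fromExtension ? null : event;
}
```

A function of this shape could be passed as the beforeSend option when initializing @sentry/react. Dropping an event as soon as a single extension frame appears is a deliberately coarse choice; a stricter variant might only drop events whose top frame is an extension script.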

4. Log Aggregation: Logstash

Logstash parses application and infrastructure logs, forwarding error patterns to Bugsink:

  • Installation: Installed inside the dev container (and on bare-metal prod servers)
  • Inputs:
    • Pino JSON logs from the Node.js application (PM2 managed)
    • Redis logs (connection errors, memory warnings, slow commands)
    • PostgreSQL function logs (via fn_log() - see ADR-050)
    • NGINX access/error logs
  • Filter: Identifies error-level logs (5xx responses, unhandled exceptions, Redis errors)
  • Output: Sends to Bugsink via Sentry-compatible HTTP API

This provides a secondary error capture path for:

  • Errors that occur before Sentry SDK initialization
  • Log-based errors that don't throw exceptions
  • Redis connection/performance issues
  • Database function errors and slow queries
  • Historical error analysis from log files
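
The filter stage described above has to decide which log lines count as errors. The actual pipeline is written in Logstash configuration (see docker/logstash/bugsink.conf); the TypeScript below only illustrates the decision logic for Pino lines. It relies on Pino's default numeric levels (50 is error, 60 is fatal) and assumes, hypothetically, that HTTP responses are logged under res.statusCode.

```typescript
// Pino's default numeric levels: 10 trace, 20 debug, 30 info, 40 warn,
// 50 error, 60 fatal.
const PINO_ERROR_LEVEL = 50;

interface ParsedLog {
  level?: number;
  msg?: string;
  res?: { statusCode?: number }; // assumed field layout for HTTP response logs
}

// Illustrative classifier: true when a raw log line should be forwarded
// to the error tracker.
function isErrorLog(line: string): boolean {
  let entry: ParsedLog | null;
  try {
    entry = JSON.parse(line) as ParsedLog | null;
  } catch {
    return false; // not a JSON line, so not a Pino entry
  }
  if (entry === null || typeof entry !== "object") return false;
  const level = typeof entry.level === "number" ? entry.level : 0;
  const status = entry.res?.statusCode ?? 0;
  // Forward error/fatal entries and any logged 5xx response.
  return level >= PINO_ERROR_LEVEL || status >= 500;
}
```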

5. MCP Server Integration: bugsink-mcp

For AI tool integration (Claude Code, Cursor, etc.), we use the open-source bugsink-mcp server:

  • No code changes required: Configurable via environment variables
  • Capabilities: List projects, get issues, view events, get stacktraces, manage releases
  • Configuration:
    • BUGSINK_URL: Points to Bugsink instance (http://localhost:8000 for dev, https://bugsink.projectium.com for prod)
    • BUGSINK_API_TOKEN: API token from Bugsink (created via Django management command)
    • BUGSINK_ORG_SLUG: Organization identifier (usually "sentry")
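
Because the MCP server is configured purely through environment variables, a misconfiguration usually shows up as a missing or empty variable. The following sketch checks for that before launching the server. The function name missingMcpVars is hypothetical, and treating all three variables as required is an assumption; bugsink-mcp itself may apply defaults.

```typescript
// Variable names taken from the configuration list above.
const REQUIRED_MCP_VARS = ["BUGSINK_URL", "BUGSINK_API_TOKEN", "BUGSINK_ORG_SLUG"] as const;

// Returns the names of variables that are unset or blank in the given
// environment map (for example, a copy of process.env).
function missingMcpVars(env: Record<string, string | undefined>): string[] {
  return REQUIRED_MCP_VARS.filter((name) => {
    const value = env[name];
    return value === undefined || value.trim() === "";
  });
}
```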

Architecture

+---------------------------------------------------------------------------+
|                      Dev Container / Production Server                     |
+---------------------------------------------------------------------------+
|                                                                            |
|  +------------------+        +------------------+                          |
|  |    Frontend      |        |     Backend      |                          |
|  |    (React)       |        |    (Express)     |                          |
|  |  @sentry/react   |        |   @sentry/node   |                          |
|  +--------+---------+        +--------+---------+                          |
|           |                           |                                    |
|           |    Sentry SDK Protocol    |                                    |
|           +-----------+---------------+                                    |
|                       |                                                    |
|                       v                                                    |
|           +----------------------+                                         |
|           |      Bugsink         |                                         |
|           |   (localhost:8000)   |<------------------+                     |
|           |                      |                   |                     |
|           |  PostgreSQL backend  |                   |                     |
|           +----------------------+                   |                     |
|                                                      |                     |
|           +----------------------+                   |                     |
|           |      Logstash        |-------------------+                     |
|           |   (Log Aggregator)   |   Sentry Output                         |
|           |                      |                                         |
|           |  Inputs:             |                                         |
|           |  - PM2/Pino logs     |                                         |
|           |  - Redis logs        |                                         |
|           |  - PostgreSQL logs   |                                         |
|           |  - NGINX logs        |                                         |
|           +----------------------+                                         |
|                   ^   ^   ^   ^                                            |
|                   |   |   |   |                                            |
|       +-----------+   |   |   +-----------+                                |
|       |               |   |               |                                |
|  +----+-----+   +-----+----+   +-----+----+   +-----+----+                 |
|  |  PM2     |   |  Redis   |   | PostgreSQL|   |  NGINX  |                 |
|  |  Logs    |   |  Logs    |   |   Logs    |   |  Logs   |                 |
|  +----------+   +----------+   +-----------+   +---------+                 |
|                                                                            |
|           +----------------------+                                         |
|           |     PostgreSQL       |                                         |
|           |  +----------------+  |                                         |
|           |  | flyer_crawler  |  |  (main app database)                    |
|           |  +----------------+  |                                         |
|           |  |   bugsink      |  |  (error tracking database)              |
|           |  +----------------+  |                                         |
|           +----------------------+                                         |
|                                                                            |
+---------------------------------------------------------------------------+

External (Developer Machine):
+------------------------------------------+
|  Claude Code / Cursor / VS Code          |
|  +------------------------------------+  |
|  |  bugsink-mcp                       |  |
|  |  (MCP Server)                      |  |
|  |                                    |  |
|  |  BUGSINK_URL=http://localhost:8000 |  |
|  |  BUGSINK_API_TOKEN=...             |  |
|  |  BUGSINK_ORG_SLUG=...              |  |
|  +------------------------------------+  |
+------------------------------------------+

Implementation Status

Completed

  • Bugsink installed and configured in dev container
  • PostgreSQL bugsink database and user created
  • @sentry/node SDK integrated in backend (src/services/sentry.server.ts)
  • @sentry/react SDK integrated in frontend (src/services/sentry.client.ts)
  • ErrorBoundary component created (src/components/ErrorBoundary.tsx)
  • ErrorBoundary wrapped around app (src/providers/AppProviders.tsx)
  • Logstash pipeline configured for PM2/Pino, Redis, PostgreSQL, NGINX logs
  • MCP server (bugsink-mcp) documented and configured
  • Environment variables added to src/config/env.ts and frontend src/config.ts
  • Browser extension errors filtered in beforeSend
  • 5xx error filtering in backend error handler

Recently Completed (2026-01-26)

  • User context after authentication: Integrated setUser() calls in AuthProvider.tsx to associate errors with authenticated users
    • Called on profile fetch from query (lines 44-49)
    • Called on direct login with profile (lines 94-99)
    • Called on login with profile fetch (lines 124-129)
    • Cleared on logout (lines 76-77)
    • Maps user_id → id, email → email, full_name → username

This completes the error tracking implementation: errors raised during an authenticated session are now associated with the user who encountered them, enabling user-specific error analysis and debugging.
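
The field mapping used for the setUser() calls (user_id to id, email to email, full_name to username) can be sketched as a pure function. The UserProfile and SentryUserLike types and the toSentryUser name are illustrative assumptions; the real profile type in AuthProvider.tsx may carry more fields.

```typescript
// Assumed shape of the application's user profile (only the mapped fields).
interface UserProfile {
  user_id: string;
  email: string;
  full_name?: string | null;
}

// Subset of the Sentry user object populated by this mapping.
interface SentryUserLike {
  id: string;
  email: string;
  username?: string;
}

function toSentryUser(profile: UserProfile): SentryUserLike {
  const user: SentryUserLike = { id: profile.user_id, email: profile.email };
  // Only set username when a non-empty full name is available.
  if (profile.full_name) {
    user.username = profile.full_name;
  }
  return user;
}
```

The result would be passed to Sentry.setUser(...) after login or profile fetch, and Sentry.setUser(null) would clear it on logout.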

Configuration

Environment Variables

| Variable | Description | Default (Dev) |
| --- | --- | --- |
| SENTRY_DSN | Sentry-compatible DSN (backend) | Set after project creation |
| VITE_SENTRY_DSN | Sentry-compatible DSN (frontend) | Set after project creation |
| SENTRY_ENVIRONMENT | Environment name | development |
| SENTRY_DEBUG | Enable debug logging | false |
| SENTRY_ENABLED | Enable/disable error reporting | true |
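
As an illustration of how the backend variables and their dev defaults might be resolved at startup, here is a minimal sketch. It is not the contents of src/config/env.ts: the names resolveSentryConfig and parseBool are hypothetical, and disabling reporting when no DSN is set is an assumption rather than documented behavior.

```typescript
interface SentryConfig {
  dsn: string | undefined;
  environment: string;
  debug: boolean;
  enabled: boolean;
}

// Parses common truthy spellings; falls back when the variable is unset or blank.
function parseBool(value: string | undefined, fallback: boolean): boolean {
  if (value === undefined || value.trim() === "") return fallback;
  return ["true", "1", "yes"].indexOf(value.trim().toLowerCase()) !== -1;
}

function resolveSentryConfig(env: Record<string, string | undefined>): SentryConfig {
  const dsn = env.SENTRY_DSN?.trim() || undefined;
  return {
    dsn,
    environment: env.SENTRY_ENVIRONMENT?.trim() || "development",
    debug: parseBool(env.SENTRY_DEBUG, false),
    // Assumption: without a DSN there is nowhere to send events,
    // so reporting is treated as disabled.
    enabled: parseBool(env.SENTRY_ENABLED, true) && dsn !== undefined,
  };
}
```

On the backend this could be called as resolveSentryConfig(process.env); the frontend would read VITE_SENTRY_DSN through Vite's import.meta.env instead.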

PostgreSQL Setup

-- Create dedicated Bugsink database and user
CREATE USER bugsink WITH PASSWORD 'bugsink_dev_password';
CREATE DATABASE bugsink OWNER bugsink;
GRANT ALL PRIVILEGES ON DATABASE bugsink TO bugsink;

Bugsink Configuration

# Environment variables for Bugsink service
SECRET_KEY=<random-50-char-string>
DATABASE_URL=postgresql://bugsink:bugsink_dev_password@localhost:5432/bugsink
BASE_URL=http://localhost:8000
PORT=8000

Logstash Pipeline

See docker/logstash/bugsink.conf for the full pipeline configuration.

Key routing:

| Source | Bugsink Project |
| --- | --- |
| Backend (Pino) | Backend API |
| Worker (Pino) | Backend API |
| PostgreSQL logs | Backend API |
| Vite logs | Infrastructure |
| Redis logs | Infrastructure |
| NGINX logs | Infrastructure |
| Frontend errors | Frontend |
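
The routing table amounts to a fixed lookup from log source to Bugsink project. The sketch below restates it as a typed map; the source keys (backend, worker, and so on) are hypothetical labels chosen for illustration, and the real routing lives in the Logstash pipeline rather than in application code.

```typescript
// Hypothetical source labels; the project names come from the routing table.
type LogSource = "backend" | "worker" | "postgresql" | "vite" | "redis" | "nginx" | "frontend";
type BugsinkProject = "Backend API" | "Infrastructure" | "Frontend";

const PROJECT_BY_SOURCE: Record<LogSource, BugsinkProject> = {
  backend: "Backend API",
  worker: "Backend API",
  postgresql: "Backend API",
  vite: "Infrastructure",
  redis: "Infrastructure",
  nginx: "Infrastructure",
  frontend: "Frontend",
};

function routeToProject(source: LogSource): BugsinkProject {
  return PROJECT_BY_SOURCE[source];
}
```

Typing the map as Record<LogSource, BugsinkProject> means the compiler flags any source added to LogSource without a corresponding project.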

Consequences

Positive

  • Full observability: Aggregated view of errors and trends
  • Self-hosted: No external SaaS dependencies or subscription costs
  • SDK compatibility: Leverages mature Sentry SDKs with excellent documentation
  • AI integration: MCP server enables Claude Code to query and analyze errors
  • Unified architecture: Same setup works in dev container and production
  • Lightweight: Bugsink runs in a single process, unlike self-hosted Sentry, which needs 16GB+ RAM
  • Error correlation: Request IDs allow correlation between frontend errors and backend logs

Negative

  • Additional services: Bugsink and Logstash add complexity to the container
  • PostgreSQL overhead: Additional database for error tracking
  • Initial setup: Requires configuration of multiple components
  • Logstash learning curve: Pipeline configuration requires Logstash knowledge

Alternatives Considered

  1. Full Sentry self-hosted: Rejected due to complexity (Kafka, Redis, ClickHouse, 16GB+ RAM minimum)
  2. GlitchTip: Considered, but Bugsink is lighter weight and easier to deploy
  3. Sentry SaaS: Rejected due to self-hosted requirement
  4. Custom error aggregation: Rejected in favor of proven Sentry SDK ecosystem

References