# ADR-015: Error Tracking and Observability

**Date**: 2025-12-12

**Status**: Accepted (Fully Implemented)

**Updated**: 2026-01-26 (user context integration completed)

**Related**: [ADR-056](./0056-application-performance-monitoring.md) (Application Performance Monitoring)

## Context

While ADR-004 established structured logging with Pino, the application lacks a high-level, aggregated view of its health and errors. It's difficult to spot trends, identify recurring issues, or be proactively notified of new types of errors.

Key requirements:

1. **Self-hosted**: No external SaaS dependencies for error tracking
2. **Sentry SDK compatible**: Leverage mature, well-documented SDKs
3. **Lightweight**: Minimal resource overhead in the dev container
4. **Production-ready**: Same architecture works on bare-metal production servers
5. **AI-accessible**: MCP server integration for Claude Code and other AI tools

**Note**: Application Performance Monitoring (APM) and distributed tracing are covered separately in [ADR-056](./0056-application-performance-monitoring.md).

## Decision

We implement a self-hosted error tracking stack using **Bugsink** as the Sentry-compatible backend, with the following components:

### 1. Error Tracking Backend: Bugsink

**Bugsink** is a lightweight, self-hosted Sentry alternative that:

- Runs as a single process (no Kafka, Redis, ClickHouse required)
- Is fully compatible with Sentry SDKs
- Supports ARM64 and AMD64 architectures
- Can use SQLite (dev) or PostgreSQL (production)

**Deployment**:

- **Dev container**: Installed as a systemd service inside the container
- **Production**: Runs as a systemd service on bare-metal, listening on localhost only
- **Database**: Uses PostgreSQL with a dedicated `bugsink` user and `bugsink` database (same PostgreSQL instance as the main application)

### 2. Backend Integration: @sentry/node

The Express backend integrates the `@sentry/node` SDK to:

- Capture unhandled exceptions before PM2 (or another process manager) restarts the process
- Report errors with full stack traces and context
- Integrate with Pino logger for breadcrumbs
- Filter errors by severity (only 5xx errors sent by default)
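
The 5xx-only rule above can be sketched as a small predicate that an Express error handler consults before reporting. This is a minimal sketch under stated assumptions, not the project's code (which lives in `src/services/sentry.server.ts`): the `HttpError` shape and the `statusOf` and `shouldReport` names are hypothetical.

```typescript
// Hypothetical error shape: assumes errors carry `status` or `statusCode`.
interface HttpError {
  status?: number;
  statusCode?: number;
}

// Resolve the HTTP status for an error, defaulting to 500 for untyped throws.
function statusOf(err: unknown): number {
  if (typeof err === "object" && err !== null) {
    const { status, statusCode } = err as HttpError;
    if (typeof status === "number") return status;
    if (typeof statusCode === "number") return statusCode;
  }
  return 500;
}

// Only server-side failures (5xx) are reported; 4xx responses are
// client mistakes and stay in the logs.
function shouldReport(err: unknown): boolean {
  return statusOf(err) >= 500;
}
```

In an error middleware this would gate the SDK call, for example `if (shouldReport(err)) Sentry.captureException(err);`.
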

### 3. Frontend Integration: @sentry/react

The React frontend integrates the `@sentry/react` SDK to:

- Wrap the app in an Error Boundary for graceful error handling
- Capture unhandled JavaScript errors
- Report errors with component stack traces
- Filter out browser extension errors
- **Frontend Error Correlation**: The global API client intercepts 4xx/5xx responses and can attach the `x-request-id` header to the Sentry scope for correlation with backend logs
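
One way to filter browser extension errors is to inspect the stack frames of an event and drop it when any frame was loaded from an extension URL. The sketch below is illustrative only: the local `TrackedEvent` type is a hand-written subset of the Sentry event shape, and the function names and the list of URL schemes are assumptions rather than the project's actual `beforeSend` logic.

```typescript
// Minimal, hand-written subset of the Sentry event shape used by this check.
interface EventFrame {
  filename?: string;
}
interface ExceptionValue {
  stacktrace?: { frames?: EventFrame[] };
}
interface TrackedEvent {
  exception?: { values?: ExceptionValue[] };
}

// URL schemes that browsers use for extension scripts (assumed list).
const EXTENSION_SCHEMES = [
  "chrome-extension://",
  "moz-extension://",
  "safari-extension://",
  "safari-web-extension://",
];

// True when any stack frame of the event comes from an extension script.
function isFromBrowserExtension(event: TrackedEvent): boolean {
  const values = event.exception?.values ?? [];
  return values.some((value) =>
    (value.stacktrace?.frames ?? []).some((frame) =>
      EXTENSION_SCHEMES.some((scheme) => frame.filename?.startsWith(scheme) ?? false),
    ),
  );
}

// Shaped like a `beforeSend` callback: returning null drops the event.
function dropExtensionErrors<E extends TrackedEvent>(event: E): E | null {
  return isFromBrowserExtension(event) ? null : event;
}
```

A function like `dropExtensionErrors` could be passed as the `beforeSend` option of `Sentry.init`, which discards an event when the callback returns `null`.
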

### 4. Log Aggregation: Logstash

**Logstash** parses application and infrastructure logs, forwarding error patterns to Bugsink:

- **Installation**: Installed inside the dev container (and on bare-metal prod servers)
- **Inputs**:
  - Pino JSON logs from the Node.js application (PM2 managed)
  - Redis logs (connection errors, memory warnings, slow commands)
  - PostgreSQL function logs (via `fn_log()` - see ADR-050)
  - NGINX access/error logs
- **Filter**: Identifies error-level logs (5xx responses, unhandled exceptions, Redis errors)
- **Output**: Sends to Bugsink via Sentry-compatible HTTP API

This provides a secondary error capture path for:

- Errors that occur before Sentry SDK initialization
- Log-based errors that don't throw exceptions
- Redis connection/performance issues
- Database function errors and slow queries
- Historical error analysis from log files
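
The filter step for Pino logs amounts to a per-line decision: is this an error-level entry or a 5xx response? The real rules live in the Logstash pipeline configuration. The TypeScript sketch below only illustrates that decision, relying on Pino's default numeric levels (50 = error, 60 = fatal). The `statusCode` field and the `isErrorLogLine` name are assumptions.

```typescript
// Pino's default numeric levels: 50 = error, 60 = fatal.
const PINO_ERROR_LEVEL = 50;

interface PinoLine {
  level?: number;
  msg?: string;
  // Hypothetical field: assumes request logs carry the response status here.
  statusCode?: number;
}

// Decide whether a raw log line should be forwarded to the error tracker.
function isErrorLogLine(rawLine: string): boolean {
  let parsed: PinoLine | null;
  try {
    parsed = JSON.parse(rawLine) as PinoLine | null;
  } catch {
    return false; // not JSON (for example a startup banner): ignore it
  }
  if (typeof parsed !== "object" || parsed === null) return false;
  const isErrorLevel =
    typeof parsed.level === "number" && parsed.level >= PINO_ERROR_LEVEL;
  const isServerError =
    typeof parsed.statusCode === "number" && parsed.statusCode >= 500;
  return isErrorLevel || isServerError;
}
```
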

### 5. MCP Server Integration: bugsink-mcp

For AI tool integration (Claude Code, Cursor, etc.), we use the open-source [bugsink-mcp](https://github.com/j-shelfwood/bugsink-mcp) server:

- **No code changes required**: Configurable via environment variables
- **Capabilities**: List projects, get issues, view events, get stacktraces, manage releases
- **Configuration**:
  - `BUGSINK_URL`: Points to Bugsink instance (`http://localhost:8000` for dev, `https://bugsink.projectium.com` for prod)
  - `BUGSINK_API_TOKEN`: API token from Bugsink (created via Django management command)
  - `BUGSINK_ORG_SLUG`: Organization identifier (usually "sentry")

## Architecture

```text
+---------------------------------------------------------------------------+
|                     Dev Container / Production Server                     |
+---------------------------------------------------------------------------+
|                                                                           |
|  +------------------+        +------------------+                         |
|  |     Frontend     |        |     Backend      |                         |
|  |     (React)      |        |    (Express)     |                         |
|  |  @sentry/react   |        |   @sentry/node   |                         |
|  +--------+---------+        +--------+---------+                         |
|           |                           |                                   |
|           |    Sentry SDK Protocol    |                                   |
|           +-----------+---------------+                                   |
|                       |                                                   |
|                       v                                                   |
|            +----------------------+                                       |
|            |       Bugsink        |                                       |
|            |   (localhost:8000)   |<------------------+                   |
|            |                      |                   |                   |
|            |  PostgreSQL backend  |                   |                   |
|            +----------------------+                   |                   |
|                                                       |                   |
|            +----------------------+                   |                   |
|            |       Logstash       |-------------------+                   |
|            |   (Log Aggregator)   |    Sentry Output                      |
|            |                      |                                       |
|            |  Inputs:             |                                       |
|            |  - PM2/Pino logs     |                                       |
|            |  - Redis logs        |                                       |
|            |  - PostgreSQL logs   |                                       |
|            |  - NGINX logs        |                                       |
|            +----------------------+                                       |
|                   ^   ^     ^   ^                                         |
|                   |   |     |   |                                         |
|      +------------+   |     |   +------------+                            |
|      |                |     |                |                            |
| +----+-----+ +--------+-+ +-+--------+ +-----+----+                       |
| |   PM2    | |  Redis   | |PostgreSQL| |  NGINX   |                       |
| |   Logs   | |   Logs   | |   Logs   | |   Logs   |                       |
| +----------+ +----------+ +----------+ +----------+                       |
|                                                                           |
|            +----------------------+                                       |
|            |      PostgreSQL      |                                       |
|            |  +----------------+  |                                       |
|            |  | flyer_crawler  |  |  (main app database)                  |
|            |  +----------------+  |                                       |
|            |  |    bugsink     |  |  (error tracking database)            |
|            |  +----------------+  |                                       |
|            +----------------------+                                       |
|                                                                           |
+---------------------------------------------------------------------------+

External (Developer Machine):
+--------------------------------------------+
|       Claude Code / Cursor / VS Code       |
|  +--------------------------------------+  |
|  |             bugsink-mcp              |  |
|  |             (MCP Server)             |  |
|  |                                      |  |
|  |  BUGSINK_URL=http://localhost:8000   |  |
|  |  BUGSINK_API_TOKEN=...               |  |
|  |  BUGSINK_ORG_SLUG=...                |  |
|  +--------------------------------------+  |
+--------------------------------------------+
```

## Implementation Status

### Completed

- [x] Bugsink installed and configured in dev container
- [x] PostgreSQL `bugsink` database and user created
- [x] `@sentry/node` SDK integrated in backend (`src/services/sentry.server.ts`)
- [x] `@sentry/react` SDK integrated in frontend (`src/services/sentry.client.ts`)
- [x] ErrorBoundary component created (`src/components/ErrorBoundary.tsx`)
- [x] ErrorBoundary wrapped around app (`src/providers/AppProviders.tsx`)
- [x] Logstash pipeline configured for PM2/Pino, Redis, PostgreSQL, NGINX logs
- [x] MCP server (`bugsink-mcp`) documented and configured
- [x] Environment variables added to `src/config/env.ts` and frontend `src/config.ts`
- [x] Browser extension errors filtered in `beforeSend`
- [x] 5xx error filtering in backend error handler

### Recently Completed (2026-01-26)

- [x] **User context after authentication**: Integrated `setUser()` calls in `AuthProvider.tsx` to associate errors with authenticated users
  - Called on profile fetch from query (lines 44-49)
  - Called on direct login with profile (lines 94-99)
  - Called on login with profile fetch (lines 124-129)
  - Cleared on logout (lines 76-77)
  - Maps `user_id` → `id`, `email` → `email`, `full_name` → `username`

This completes the error tracking implementation. Errors raised during an authenticated session are now associated with the user who encountered them, enabling user-specific error analysis and debugging.
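
The field mapping listed above can be sketched as a small pure function. This is an illustration, not the code in `AuthProvider.tsx`: the `UserProfile` shape (including the assumption that `user_id` is a string and `full_name` may be null) and the `toSentryUser` name are hypothetical.

```typescript
// Hypothetical profile shape, based on the field names listed above.
interface UserProfile {
  user_id: string;
  email: string;
  full_name: string | null;
}

// Subset of the Sentry user object used here.
interface SentryUser {
  id: string;
  email: string;
  username?: string;
}

// Map an application profile to the user fields attached to error reports.
function toSentryUser(profile: UserProfile): SentryUser {
  const user: SentryUser = { id: profile.user_id, email: profile.email };
  // Leave `username` unset rather than sending an empty or null value.
  if (profile.full_name) user.username = profile.full_name;
  return user;
}
```

The result would be passed to `Sentry.setUser(...)` after login, and `Sentry.setUser(null)` clears the user on logout.
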

## Configuration

### Environment Variables

| Variable             | Description                      | Default (Dev)              |
| -------------------- | -------------------------------- | -------------------------- |
| `SENTRY_DSN`         | Sentry-compatible DSN (backend)  | Set after project creation |
| `VITE_SENTRY_DSN`    | Sentry-compatible DSN (frontend) | Set after project creation |
| `SENTRY_ENVIRONMENT` | Environment name                 | `development`              |
| `SENTRY_DEBUG`       | Enable debug logging             | `false`                    |
| `SENTRY_ENABLED`     | Enable/disable error reporting   | `true`                     |
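
A minimal sketch of how the backend variables in the table could be read into a typed config with the listed defaults. The `readSentryConfig` and `parseFlag` names are hypothetical, and treating a missing DSN as "reporting disabled" is an assumption of this sketch, not documented behavior of `src/config/env.ts`.

```typescript
type Env = Record<string, string | undefined>;

interface SentryConfig {
  dsn: string | undefined;
  environment: string;
  debug: boolean;
  enabled: boolean;
}

// Parse "true"/"false" style flags, falling back when the variable is unset.
function parseFlag(value: string | undefined, fallback: boolean): boolean {
  if (value === undefined || value.trim() === "") return fallback;
  return value.trim().toLowerCase() === "true";
}

// Build the config from an environment map, applying the table's defaults.
function readSentryConfig(env: Env): SentryConfig {
  const dsn = env["SENTRY_DSN"]?.trim() || undefined;
  return {
    dsn,
    environment: env["SENTRY_ENVIRONMENT"] ?? "development",
    debug: parseFlag(env["SENTRY_DEBUG"], false),
    // Assumption: without a DSN there is nowhere to send events,
    // so reporting stays off even if SENTRY_ENABLED is true.
    enabled: parseFlag(env["SENTRY_ENABLED"], true) && dsn !== undefined,
  };
}
```

In a Node.js process this would typically be called as `readSentryConfig(process.env)`.
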

### PostgreSQL Setup

```sql
-- Create dedicated Bugsink database and user
CREATE USER bugsink WITH PASSWORD 'bugsink_dev_password';
CREATE DATABASE bugsink OWNER bugsink;
GRANT ALL PRIVILEGES ON DATABASE bugsink TO bugsink;
```

### Bugsink Configuration

```bash
# Environment variables for Bugsink service
SECRET_KEY=<random-50-char-string>
DATABASE_URL=postgresql://bugsink:bugsink_dev_password@localhost:5432/bugsink
BASE_URL=http://localhost:8000
PORT=8000
```

### Logstash Pipeline

See `docker/logstash/bugsink.conf` for the full pipeline configuration.

Key routing:

| Source          | Bugsink Project |
| --------------- | --------------- |
| Backend (Pino)  | Backend API     |
| Worker (Pino)   | Backend API     |
| PostgreSQL logs | Backend API     |
| Vite logs       | Infrastructure  |
| Redis logs      | Infrastructure  |
| NGINX logs      | Infrastructure  |
| Frontend errors | Frontend        |
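
The routing table can also be read as a lookup from log source to Bugsink project. The sketch below restates it in TypeScript for illustration only. The actual routing is done in the Logstash pipeline, and the `LogSource` keys are hypothetical labels chosen for this example.

```typescript
type BugsinkProject = "Backend API" | "Infrastructure" | "Frontend";

// Hypothetical source labels, one per row of the routing table.
type LogSource =
  | "backend"
  | "worker"
  | "postgresql"
  | "vite"
  | "redis"
  | "nginx"
  | "frontend";

// `Record<LogSource, ...>` makes the compiler reject a source without a route.
const PROJECT_BY_SOURCE: Record<LogSource, BugsinkProject> = {
  backend: "Backend API",
  worker: "Backend API",
  postgresql: "Backend API",
  vite: "Infrastructure",
  redis: "Infrastructure",
  nginx: "Infrastructure",
  frontend: "Frontend",
};

// Look up the Bugsink project that receives events from a given source.
function projectFor(source: LogSource): BugsinkProject {
  return PROJECT_BY_SOURCE[source];
}
```
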

## Consequences

### Positive

- **Full observability**: Aggregated view of errors and trends
- **Self-hosted**: No external SaaS dependencies or subscription costs
- **SDK compatibility**: Leverages mature Sentry SDKs with excellent documentation
- **AI integration**: MCP server enables Claude Code to query and analyze errors
- **Unified architecture**: Same setup works in dev container and production
- **Lightweight**: Bugsink runs in a single process, unlike full Sentry (16GB+ RAM)
- **Error correlation**: Request IDs allow correlation between frontend errors and backend logs

### Negative

- **Additional services**: Bugsink and Logstash add complexity to the container
- **PostgreSQL overhead**: Additional database for error tracking
- **Initial setup**: Requires configuration of multiple components
- **Logstash learning curve**: Pipeline configuration requires Logstash knowledge

## Alternatives Considered

1. **Full Sentry self-hosted**: Rejected due to complexity (Kafka, Redis, ClickHouse, 16GB+ RAM minimum)
2. **GlitchTip**: Considered, but Bugsink is lighter weight and easier to deploy
3. **Sentry SaaS**: Rejected due to self-hosted requirement
4. **Custom error aggregation**: Rejected in favor of proven Sentry SDK ecosystem

## References

- [Bugsink Documentation](https://www.bugsink.com/docs/)
- [Bugsink Docker Install](https://www.bugsink.com/docs/docker-install/)
- [@sentry/node Documentation](https://docs.sentry.io/platforms/javascript/guides/node/)
- [@sentry/react Documentation](https://docs.sentry.io/platforms/javascript/guides/react/)
- [bugsink-mcp](https://github.com/j-shelfwood/bugsink-mcp)
- [Logstash Reference](https://www.elastic.co/guide/en/logstash/current/index.html)
- [ADR-050: PostgreSQL Function Observability](./0050-postgresql-function-observability.md)
- [ADR-056: Application Performance Monitoring](./0056-application-performance-monitoring.md)