
# AI Usage Subagent Guide

The **ai-usage** subagent specializes in LLM APIs (Gemini, Claude), prompt engineering, and AI-powered features in the Flyer Crawler project.

## Quick Reference

| Aspect             | Details                                                                             |
| ------------------ | ----------------------------------------------------------------------------------- |
| **Primary Use**    | Gemini API integration, prompt engineering, AI extraction                           |
| **Key Files**      | `src/services/aiService.server.ts`, `src/services/flyerProcessingService.server.ts` |
| **Key ADRs**       | ADR-041 (AI Integration), ADR-046 (Image Processing)                                |
| **API Key Env**    | `VITE_GOOGLE_GENAI_API_KEY` (prod), `VITE_GOOGLE_GENAI_API_KEY_TEST` (test)         |
| **Error Handling** | Rate limits (429), JSON parse errors, timeout handling                              |
| **Delegate To**    | `coder` (implementation), `testwriter` (tests), `integrations-specialist`           |

## When to Use

Use the **ai-usage** subagent when you need to:

- Integrate with the Gemini API for flyer extraction
- Debug AI extraction failures
- Optimize prompts for better accuracy
- Handle rate limiting and API errors
- Implement new AI-powered features
- Fine-tune extraction schemas

## What ai-usage Knows

The ai-usage subagent understands:

- Google Generative AI (Gemini) API
- Flyer extraction prompts and schemas
- Error handling for AI services
- Rate limiting strategies
- Token optimization
- AI service architecture (ADR-041)

## AI Architecture Overview

```
┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
│  Flyer Upload   │───►│   AI Service    │───►│  Gemini API     │
│                 │    │                 │    │                 │
└─────────────────┘    │ - Preprocessing │    │ - Vision model  │
                       │ - Prompt build  │    │ - JSON output   │
                       │ - Response parse│    │                 │
                       └─────────────────┘    └─────────────────┘
                              │
                              ▼
                       ┌─────────────────┐
                       │  Validation &   │
                       │  Normalization  │
                       └─────────────────┘
```

## Key Files

| File                                            | Purpose                              |
| ----------------------------------------------- | ------------------------------------ |
| `src/services/aiService.server.ts`              | Gemini API integration               |
| `src/services/flyerProcessingService.server.ts` | Flyer extraction pipeline            |
| `src/schemas/flyer.schemas.ts`                  | Zod schemas for AI output validation |
| `src/types/ai.types.ts`                         | TypeScript types for AI responses    |

## Example Requests

### Debugging Extraction Failures

```
"Use ai-usage to debug why flyer extractions are failing for
multi-page PDFs. The error logs show 'Invalid JSON response'
but only for certain stores."
```

### Optimizing Prompts

```
"Use ai-usage to improve the item extraction prompt. Currently
it's missing unit prices when items show 'X for $Y' pricing
(e.g., '3 for $5')."
```

### Handling Rate Limits

```
"Use ai-usage to implement exponential backoff for Gemini API
rate limits. We're seeing 429 errors during high-volume uploads."
```

### Adding New AI Features

```
"Use ai-usage to add a feature that uses Gemini to categorize
extracted items into grocery categories (produce, dairy, meat, etc.)."
```

## Extraction Pipeline

### 1. Image Preprocessing

```typescript
// Convert PDF to images, resize large images
const processedImages = await imageProcessor.prepareForAI(uploadedFile);
```

### 2. Prompt Construction

The extraction prompt includes:

- System instructions for the AI model
- Expected output schema (JSON)
- Examples of correct extraction
- Handling instructions for edge cases
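
The four parts listed above could be assembled roughly as follows. This is a minimal sketch, not the project's actual prompt builder: `PromptParts` and `buildExtractionPrompt` are hypothetical names introduced for illustration.

```typescript
// Hypothetical shape for the pieces of an extraction prompt (not the project's real types).
interface PromptParts {
  systemInstructions: string;
  outputSchema: string;
  examples: string[];
  edgeCaseRules: string[];
}

// Join the parts into one prompt string, skipping the optional sections when they are empty.
function buildExtractionPrompt(parts: PromptParts): string {
  const sections: string[] = [
    parts.systemInstructions.trim(),
    `Output MUST be valid JSON matching this schema:\n${parts.outputSchema.trim()}`,
  ];
  if (parts.examples.length > 0) {
    sections.push(`Example extractions:\n${parts.examples.map((e) => `- ${e}`).join('\n')}`);
  }
  if (parts.edgeCaseRules.length > 0) {
    sections.push(`Special cases:\n${parts.edgeCaseRules.map((r) => `- ${r}`).join('\n')}`);
  }
  return sections.join('\n\n');
}
```

Keeping the sections as separate inputs makes it easier to change one part (for example, adding an edge-case rule) without touching the rest of the prompt.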

### 3. API Call

```typescript
const response = await aiService.extractFlyerData(processedImages, storeContext, extractionOptions);
```

### 4. Response Validation

```typescript
// Validate against the Zod schema; parse() throws a ZodError on mismatch
const validatedItems = flyerItemsSchema.parse(response.items);
```
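
Schema validation only helps once the model's text has been parsed. Models sometimes wrap JSON in a markdown code fence, which is one plausible cause of "Invalid JSON response" errors. The sketch below shows a defensive parse step that could run before validation; `parseModelJson` is a hypothetical helper, not an existing function in `aiService.server.ts`.

```typescript
// Hypothetical helper: pull a JSON value out of raw model text before schema validation.
type ParseResult = { ok: true; value: unknown } | { ok: false; error: string };

// Built with repeat() so this example does not contain a literal markdown fence.
const FENCE = '`'.repeat(3);
const FENCED_BLOCK = new RegExp(FENCE + '(?:json)?\\s*([\\s\\S]*?)' + FENCE, 'i');

function parseModelJson(rawText: string): ParseResult {
  // Use the contents of the first fenced block if there is one, else the whole text.
  const fenced = rawText.match(FENCED_BLOCK);
  const candidate = (fenced ? fenced[1] : rawText).trim();
  try {
    return { ok: true, value: JSON.parse(candidate) };
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    return { ok: false, error: `Invalid JSON response: ${message}` };
  }
}
```

Returning a result object instead of throwing lets the caller log the raw text alongside the error before deciding whether to retry.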

### 5. Normalization

```typescript
// Normalize prices, units, quantities
const normalizedItems = normalizeExtractedItems(validatedItems);
```
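
As an illustration of what normalization might involve, the sketch below canonicalizes unit strings and rounds prices to cents. The alias table, the `'each'` default, and both function names are assumptions made for this example; the real `normalizeExtractedItems` may apply different rules.

```typescript
// Hypothetical alias table mapping common spellings to a canonical unit.
const UNIT_ALIASES: Record<string, string> = {
  lb: 'lb', lbs: 'lb', pound: 'lb', pounds: 'lb',
  kg: 'kg', kilo: 'kg', kilogram: 'kg', kilograms: 'kg',
  ea: 'each', each: 'each', pc: 'each', pcs: 'each',
};

// Lowercase, trim, and drop a trailing period, then look up the canonical unit.
// Unknown units pass through unchanged; a missing unit falls back to 'each' (assumption).
function normalizeUnit(raw: string | null | undefined): string {
  const key = (raw ?? '').trim().toLowerCase().replace(/\.$/, '');
  return UNIT_ALIASES[key] ?? (key === '' ? 'each' : key);
}

// Round a dollar amount to two decimal places.
function roundToCents(price: number): number {
  return Math.round(price * 100) / 100;
}
```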

## Common Issues and Solutions

### Issue: Inconsistent Price Extraction

**Symptoms**: The same item is extracted with different prices on different runs.

**Solution**: List the price formats explicitly in the prompt:

```
"Price formats to recognize:
- $X.XX (regular price)
- X for $Y.YY (multi-buy)
- $X.XX/lb or $X.XX/kg (unit price)
- $X.XX each (per item)
- SAVE $X.XX (discount amount, not item price)"
```
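
The same formats can also be checked in code after extraction, for example to cross-check a price string the model returned. The following is a minimal sketch assuming US-style dollar strings; `ParsedPrice` and `parsePriceText` are hypothetical names, and the field layout is illustrative rather than the project's schema.

```typescript
// Hypothetical result shape for a parsed price string.
interface ParsedPrice {
  kind: 'regular' | 'multi-buy' | 'unit' | 'each' | 'discount';
  price: number | null; // price for one item or one unit; null when only a discount is shown
  unit: string;
  quantity: number;
}

const AMOUNT = '(\\d+(?:\\.\\d{1,2})?)';

function parsePriceText(text: string): ParsedPrice | null {
  const t = text.trim();

  // "SAVE $1.50" is a discount amount, not an item price.
  if (/^save\s+\$\d/i.test(t)) {
    return { kind: 'discount', price: null, unit: 'each', quantity: 1 };
  }
  // "3 for $5": divide the total by the quantity and round to cents.
  const multi = t.match(new RegExp(`^(\\d+)\\s+for\\s+\\$${AMOUNT}$`, 'i'));
  if (multi) {
    const quantity = Number(multi[1]);
    if (quantity === 0) return null;
    const perItem = Math.round((Number(multi[2]) / quantity) * 100) / 100;
    return { kind: 'multi-buy', price: perItem, unit: 'each', quantity };
  }
  // "$4.99/lb" or "$4.99/kg"
  const perUnit = t.match(new RegExp(`^\\$${AMOUNT}\\s*/\\s*(lb|kg)$`, 'i'));
  if (perUnit) {
    return { kind: 'unit', price: Number(perUnit[1]), unit: perUnit[2].toLowerCase(), quantity: 1 };
  }
  // "$1.99 each"
  const each = t.match(new RegExp(`^\\$${AMOUNT}\\s+each$`, 'i'));
  if (each) {
    return { kind: 'each', price: Number(each[1]), unit: 'each', quantity: 1 };
  }
  // "$2.49"
  const regular = t.match(new RegExp(`^\\$${AMOUNT}$`));
  if (regular) {
    return { kind: 'regular', price: Number(regular[1]), unit: 'each', quantity: 1 };
  }
  return null; // unrecognized format
}
```

For `'3 for $5'` this returns a per-item price of `1.67` with `quantity: 3`. Whether the stored price should be per item or the multi-buy total is a design decision this sketch does not settle.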

### Issue: Missing Items from Dense Flyers

**Symptoms**: Flyers with many items on one page have missing extractions.

**Solution**:

1. Split the page into quadrants and extract each one separately
2. Increase the token limit for the response
3. Use structured, grid-based prompting
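
The quadrant split in step 1 could compute crop regions like this. It is a sketch under assumptions: `splitIntoQuadrants` is a hypothetical helper, dimensions are in pixels, and the `{ left, top, width, height }` shape is chosen because many image libraries accept a crop rectangle in that form.

```typescript
interface Region {
  left: number;
  top: number;
  width: number;
  height: number;
}

// Split a page into four crops. Each crop extends overlapPx past the midline,
// so adjacent crops overlap by roughly 2 * overlapPx along each seam.
function splitIntoQuadrants(pageWidth: number, pageHeight: number, overlapPx = 0): Region[] {
  const width = Math.min(pageWidth, Math.ceil(pageWidth / 2) + overlapPx);
  const height = Math.min(pageHeight, Math.ceil(pageHeight / 2) + overlapPx);
  return [
    { left: 0, top: 0, width, height }, // top-left
    { left: pageWidth - width, top: 0, width, height }, // top-right
    { left: 0, top: pageHeight - height, width, height }, // bottom-left
    { left: pageWidth - width, top: pageHeight - height, width, height }, // bottom-right
  ];
}
```

The overlap keeps items that straddle a seam fully visible in at least one crop, at the cost of possible duplicates, so results from the four crops would need deduplication before they are merged.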

### Issue: Rate Limit Errors (429)

**Symptoms**: `429 Too Many Requests` errors during bulk uploads.

**Solution**: Queue extraction jobs with retries and exponential backoff instead of calling the API directly:

```typescript
// Add to job queue instead of direct call
await flyerQueue.add(
  'extract',
  {
    flyerId,
    images,
  },
  {
    attempts: 3,
    backoff: {
      type: 'exponential',
      delay: 2000,
    },
  },
);
```
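
For calls that do not go through the queue, the same backoff idea can be applied in process. The sketch below is a generic retry wrapper, not part of the project's code: `withRetryOn429` and `backoffDelayMs` are hypothetical names, and it assumes the thrown error exposes an HTTP `status` property, which should be checked against the error type the Gemini client actually throws.

```typescript
// Hypothetical error shape: assumes the thrown error carries an HTTP status code.
interface HttpLikeError {
  status?: number;
}

// Delay before the next try; attempt is 1-based: 2000, 4000, 8000, ... capped at maxDelayMs.
function backoffDelayMs(attempt: number, baseDelayMs = 2000, maxDelayMs = 60_000): number {
  return Math.min(maxDelayMs, baseDelayMs * 2 ** (attempt - 1));
}

async function withRetryOn429<T>(
  call: () => Promise<T>,
  maxAttempts = 3,
  baseDelayMs = 2000,
  sleep: (ms: number) => Promise<void> = (ms) => new Promise((resolve) => setTimeout(resolve, ms)),
): Promise<T> {
  for (let attempt = 1; ; attempt++) {
    try {
      return await call();
    } catch (err) {
      const status = (err as HttpLikeError | null)?.status;
      // Rethrow anything that is not a rate limit, or when attempts are exhausted.
      if (status !== 429 || attempt >= maxAttempts) throw err;
      await sleep(backoffDelayMs(attempt, baseDelayMs));
    }
  }
}
```

The injectable `sleep` parameter exists so tests can replace the real timer. Adding random jitter to the delay is a common refinement when many workers retry at once.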

### Issue: Hallucinated Items

**Symptoms**: The extraction contains items that do not appear in the flyer.

**Solution**:

1. Add confidence scoring to the extraction
2. Request bounding box coordinates for verification
3. Validate extracted items against the source image after extraction
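
Steps 1 and 2 could feed a simple triage step such as the one sketched below. It assumes the prompt asks the model for a `confidence` between 0 and 1 and a normalized bounding box `[x, y, width, height]`; both fields, the `0.7` threshold, and the function names are illustrative assumptions. Self-reported model confidence is not calibrated, so low-trust items are routed to review rather than discarded.

```typescript
// Hypothetical fields requested from the model via the prompt.
interface ScoredItem {
  name: string;
  confidence: number; // 0..1, self-reported by the model
  box: [number, number, number, number]; // normalized [x, y, width, height], each in 0..1
}

// A box is plausible when it has positive area and lies entirely inside the page.
function isPlausibleBox([x, y, w, h]: ScoredItem['box']): boolean {
  const EPS = 1e-9; // tolerance for floating-point sums
  const inRange = [x, y, w, h].every((v) => Number.isFinite(v) && v >= 0 && v <= 1);
  return inRange && w > 0 && h > 0 && x + w <= 1 + EPS && y + h <= 1 + EPS;
}

// Split items into those trusted as-is and those that need a second look.
function partitionByTrust<T extends ScoredItem>(items: T[], minConfidence = 0.7) {
  const accepted: T[] = [];
  const needsReview: T[] = [];
  for (const item of items) {
    const trusted = item.confidence >= minConfidence && isPlausibleBox(item.box);
    (trusted ? accepted : needsReview).push(item);
  }
  return { accepted, needsReview };
}
```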

## Prompt Engineering Best Practices

### 1. Be Specific About Output Format

```
Output MUST be valid JSON matching this schema:
{
  "items": [
    {
      "name": "string (product name as shown)",
      "brand": "string or null",
      "price": number (in dollars),
      "unit": "string (each, lb, kg, etc.)",
      "quantity": number (default 1)
    }
  ]
}
```

### 2. Provide Examples

```
Example extractions:
- "Chicken Breast $4.99/lb" -> {"name": "Chicken Breast", "price": 4.99, "unit": "lb"}
- "Coca-Cola 12pk $5.99" -> {"name": "Coca-Cola", "quantity": 12, "price": 5.99, "unit": "each"}
```

### 3. Handle Edge Cases Explicitly

```
Special cases:
- If "LIMIT X" is shown, record it in notes; it does not affect the price
- If "SAVE $X" is shown without a base price, set price to null
- If the item is "FREE with purchase", set price to 0
```

### 4. Request Structured Thinking

```
For each item:
1. Identify the product name and brand
2. Find the associated price
3. Determine if price is per-unit or total
4. Extract any quantity information
```

## Monitoring AI Performance

### Metrics to Track

| Metric                  | Description                             | Target          |
| ----------------------- | --------------------------------------- | --------------- |
| Extraction success rate | % of flyers processed without error     | >95%            |
| Items per flyer         | Average items extracted                 | Varies by store |
| Price accuracy          | Match rate vs manual verification       | >98%            |
| Response time           | Time from upload to extraction complete | <30s            |
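
A rough sketch of how some of these metrics might be computed from per-flyer records follows. `ExtractionRecord` and `summarize` are hypothetical names. The table does not say whether the response-time target refers to a mean or a percentile, so the sketch reports the 95th percentile as one reasonable choice.

```typescript
// Hypothetical per-flyer record; field names are illustrative.
interface ExtractionRecord {
  succeeded: boolean;
  itemCount: number;
  durationMs: number;
}

interface ExtractionMetrics {
  successRate: number; // fraction in 0..1
  avgItemsPerFlyer: number; // averaged over successful extractions only
  p95DurationMs: number; // nearest-rank 95th percentile over all records
}

function summarize(records: ExtractionRecord[]): ExtractionMetrics | null {
  if (records.length === 0) return null;
  const ok = records.filter((r) => r.succeeded);
  const durations = records.map((r) => r.durationMs).sort((a, b) => a - b);
  // Nearest-rank percentile: the value at position ceil(0.95 * n), 1-based.
  const p95Index = Math.max(0, Math.ceil(0.95 * durations.length) - 1);
  return {
    successRate: ok.length / records.length,
    avgItemsPerFlyer: ok.length === 0 ? 0 : ok.reduce((sum, r) => sum + r.itemCount, 0) / ok.length,
    p95DurationMs: durations[p95Index],
  };
}
```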

### Logging

```typescript
log.info(
  {
    flyerId,
    itemCount: extractedItems.length,
    processingTime: duration,
    modelVersion: response.model,
    tokenUsage: response.usage,
  },
  'Flyer extraction completed',
);
```

## Environment Configuration

| Variable                         | Purpose                     |
| -------------------------------- | --------------------------- |
| `VITE_GOOGLE_GENAI_API_KEY`      | Gemini API key (production) |
| `VITE_GOOGLE_GENAI_API_KEY_TEST` | Gemini API key (test)       |

**Note**: Use separate API keys for production and test to avoid rate limit conflicts and enable separate billing tracking.
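
One way to keep the two keys from being mixed up is to resolve the key in a single place and fail fast when it is missing. This is a sketch only: `resolveGeminiApiKey` is a hypothetical helper, and how the project decides that it is running in test mode is an assumption left to the caller.

```typescript
// Pick the variable name by environment and fail fast if it is unset or blank.
function resolveGeminiApiKey(env: Record<string, string | undefined>, isTest: boolean): string {
  const name = isTest ? 'VITE_GOOGLE_GENAI_API_KEY_TEST' : 'VITE_GOOGLE_GENAI_API_KEY';
  const value = env[name]?.trim();
  if (!value) {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}
```

On the server this might be called as `resolveGeminiApiKey(process.env, process.env.NODE_ENV === 'test')`, assuming `NODE_ENV` is how test runs are identified.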

## Testing AI Features

### Unit Tests

Mock the Gemini API response:

```typescript
vi.mock('@google/generative-ai', () => ({
  GoogleGenerativeAI: vi.fn().mockImplementation(() => ({
    getGenerativeModel: () => ({
      generateContent: vi.fn().mockResolvedValue({
        response: {
          text: () => JSON.stringify({ items: mockItems }),
        },
      }),
    }),
  })),
}));
```

### Integration Tests

Use recorded responses for deterministic testing:

```typescript
// Load a previously recorded API response from fixtures and parse it
import { readFile } from 'node:fs/promises';

const fixtureResponse = JSON.parse(await readFile('fixtures/gemini-response.json', 'utf-8'));
```

## Related Documentation

- [OVERVIEW.md](./OVERVIEW.md) - Subagent system overview
- [CODER-GUIDE.md](./CODER-GUIDE.md) - For implementing AI features
- [TESTER-GUIDE.md](./TESTER-GUIDE.md) - Testing AI features
- [INTEGRATIONS-GUIDE.md](./INTEGRATIONS-GUIDE.md) - External API patterns
- [../adr/0041-ai-gemini-integration-architecture.md](../adr/0041-ai-gemini-integration-architecture.md) - AI integration ADR
- [../adr/0046-image-processing-pipeline.md](../adr/0046-image-processing-pipeline.md) - Image processing
- [../getting-started/ENVIRONMENT.md](../getting-started/ENVIRONMENT.md) - Environment configuration