# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## Project Overview

**Discord Moderation Watcher Bot** — A monitoring bot that captures voice, text messages, and images from Discord servers. It records audio from voice channels, captures all text messages (new, edited, deleted) from channels and threads, and uploads attachments to external storage. Messages and attachment metadata are stored in SQLite, audio is written to OGG segments on disk, and a real-time dashboard displays the captured data.

Built with **Node.js/pnpm** + **discord.js-selfbot-v13** + **@discordjs/voice** + **Express** + **WebSocket**.

## Architecture

### High-Level Flow

1. **Bot Entry** (`src/index.ts`) — Initializes Discord client, registers event listeners, starts webserver
2. **Message Capture** (`src/moderation/messageCapture.ts`) — Listens to Discord events (messageCreate, messageUpdate, messageDelete)
3. **Message Store** (`src/moderation/messageStore.ts`) — Database operations for messages and attachments
4. **Attachment Uploader** (`src/moderation/attachmentUploader.ts`) — Downloads from Discord, uploads to picser, stores URLs
5. **Voice Controller** (`src/voiceController.ts`) — Manages voice channel connections
6. **Recorder** (`src/recorder.ts`) — Records voice audio to OGG segments
7. **Web Server** (`src/webserver.ts`) — Express + WebSocket for REST API and real-time updates
8. **Dashboard** (`public/dashboard.html`) — Web UI with three tabs (Text, Images, Voice)

### Key Modules

**Moderation Subsystem** (`src/moderation/`):
- `types.ts` — TypeScript types for messages, attachments, voice segments
- `messageCapture.ts` — Discord event listeners (messageCreate, messageUpdate, messageDelete)
- `messageStore.ts` — Database CRUD operations (insert, update, query)
- `attachmentUploader.ts` — Picser integration with retry logic and error handling

**Database Schema** (SQLite):
- `messages` table — text messages with edit/delete tracking, user metadata, timestamps
- `attachments` table — attachment metadata, Discord URLs, picser URLs, upload status
- Indexes on channel_id, user_id, created_at for fast queries

**Voice Recording**:
- `recorder.ts` — Joins voice channel, subscribes to user audio streams
- `recorder/audioStream.ts` — Opus packet subscription
- `recorder/decoder.ts` — Opus decoder with runtime checks
- `recorder/segment.ts` — OGG file rotation (5s segments)

**Web Interface**:
- REST API: `/api/messages?channel=<id>&type=text|image`
- WebSocket: real-time events (message_created, message_updated, message_deleted, attachment_uploaded)
- Dashboard: three tabs (Text Messages, Images, Voice) with channel filtering
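
A minimal sketch of how a client might build the query URL for the REST endpoint listed above. The helper name `buildMessagesUrl` and the base URL are illustrative assumptions; only the path and the `channel` and `type` parameters come from this document.

```typescript
// Hypothetical helper, not part of the repository.
type MessageKind = "text" | "image";

function buildMessagesUrl(baseUrl: string, channelId: string, kind: MessageKind): string {
  const url = new URL("/api/messages", baseUrl);
  // URLSearchParams handles escaping of the query values.
  url.searchParams.set("channel", channelId);
  url.searchParams.set("type", kind);
  return url.toString();
}

const imagesUrl = buildMessagesUrl("http://localhost:3000", "123456789012345678", "image");
console.log(imagesUrl);
```

The response shape is not documented here, so the sketch stops at building the request URL.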

### Recording Structure

```
recordings/
  ├── <user-id>/
  │   ├── <user-id>-<session-start>-0.ogg
  │   ├── <user-id>-<session-start>-0.json
  │   └── ...

messages (SQLite):
  ├── id, guild_id, channel_id, thread_id
  ├── user_id, username, avatar_url
  ├── content, edited_content
  ├── created_at, edited_at, deleted_at
  └── type (text|edited|deleted)

attachments (SQLite):
  ├── id, message_id, guild_id, channel_id, user_id
  ├── filename, size, type (MIME)
  ├── discord_url, uploaded_url (picser raw_commit)
  ├── upload_status (pending|uploaded|failed), upload_error
  └── created_at, uploaded_at
```
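
The two tables above could be mirrored in TypeScript roughly as follows. This is a sketch, not the contents of `types.ts`: nullability, the use of epoch milliseconds for timestamps, and the helper `deriveMessageType` are assumptions. The `upload_error` column is taken from the "Debug attachment uploads" notes.

```typescript
type MessageType = "text" | "edited" | "deleted";
type UploadStatus = "pending" | "uploaded" | "failed";

interface MessageRow {
  id: string;
  guild_id: string;
  channel_id: string;
  thread_id: string | null; // assumed null outside threads
  user_id: string;
  username: string;
  avatar_url: string | null;
  content: string;
  edited_content: string | null;
  created_at: number; // assumed epoch milliseconds
  edited_at: number | null;
  deleted_at: number | null;
  type: MessageType;
}

interface AttachmentRow {
  id: string;
  message_id: string;
  guild_id: string;
  channel_id: string;
  user_id: string;
  filename: string;
  size: number;
  type: string; // MIME type
  discord_url: string;
  uploaded_url: string | null; // picser raw_commit URL once uploaded
  upload_status: UploadStatus;
  upload_error: string | null;
  created_at: number;
  uploaded_at: number | null;
}

// Hypothetical helper: derives the `type` column from the tracking timestamps.
// Deletion wins over editing because a deleted message may also have been edited.
function deriveMessageType(row: Pick<MessageRow, "edited_at" | "deleted_at">): MessageType {
  if (row.deleted_at !== null) return "deleted";
  if (row.edited_at !== null) return "edited";
  return "text";
}
```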

## Development Commands

```bash
# Install dependencies
pnpm install

# Development (auto-restart on file changes)
pnpm run dev

# Production
pnpm run start

# Type checking
pnpm run typecheck

# Linting (Biome)
pnpm run lint

# Format code (Biome)
pnpm run format

# Run tests
pnpm run test

# Build TypeScript
pnpm run build
```

## Configuration

All configuration is read from `.env` (see `.env.example`). Key variables:

**Discord & Monitoring:**
- `DISCORD_TOKEN` — Discord token used by the selfbot client (required)
- `MONITOR_GUILD_ID` — Target server to monitor (required for moderation)
- `GUILD_ID` — Legacy voice channel guild (optional)
- `VOICE_CHANNEL_ID` — Legacy voice channel ID (optional)

**Recording:**
- `RECORDINGS_DIR` — Where to save audio files (default: `./recordings`)
- `RECORDING_SEGMENT_MS` — OGG segment duration (default: 5000ms)

**Decoder:**
- `DECODER_ROTATE_MS` — Opus decoder rotation interval (default: 5000ms)
- `DECODER_COOLDOWN_MS` — Cooldown after decoder error (default: 30000ms)

**Attachments:**
- `PICSER_UPLOAD_URL` — Picser upload endpoint (default: https://picser.asepharyana.tech/api/upload)
- `ATTACHMENT_UPLOAD_TIMEOUT_MS` — Upload timeout (default: 30000ms)
- `ATTACHMENT_MAX_SIZE_MB` — Max file size (default: 100MB)
- `ATTACHMENT_RETRY_ATTEMPTS` — Retry count (default: 3)

**Web Server:**
- `WEBSERVER_PORT` — HTTP/WebSocket port (default: 3000)

**Connection:**
- `VOICE_CONNECTION_TIMEOUT_MS` — Voice join timeout (default: 15000ms)
- `RECONNECT_TIMEOUT_MS` — Reconnect timeout (default: 5000ms)
- `AUDIO_STREAM_SILENCE_DURATION_MS` — Silence threshold before ending a stream (default: 3000ms)

**Logging:**
- `LOG_LEVEL` — Pino log level (default: info)
- `VERBOSE` — Enable debug logging (default: false)
- `NODE_ENV` — Environment (development|production|test)
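
To illustrate how required values and defaults interact, here is a dependency-free sketch covering a few of the variables above. The real schema lives in `src/config.ts` and uses Zod; the names `loadConfig`, `intFromEnv`, and `RecorderConfig` are hypothetical stand-ins, and the validation rules (non-negative integers) are assumptions.

```typescript
type Env = Record<string, string | undefined>;

interface RecorderConfig {
  DISCORD_TOKEN: string;
  RECORDINGS_DIR: string;
  RECORDING_SEGMENT_MS: number;
  ATTACHMENT_MAX_SIZE_MB: number;
  ATTACHMENT_RETRY_ATTEMPTS: number;
  WEBSERVER_PORT: number;
}

// Parses an integer variable, falling back to the documented default when unset.
function intFromEnv(env: Env, key: string, fallback: number): number {
  const raw = env[key];
  if (raw === undefined || raw.trim() === "") return fallback;
  const value = Number(raw);
  if (!Number.isInteger(value) || value < 0) {
    throw new Error(`${key} must be a non-negative integer, got "${raw}"`);
  }
  return value;
}

function loadConfig(env: Env): RecorderConfig {
  const token = env["DISCORD_TOKEN"];
  if (!token) throw new Error("DISCORD_TOKEN is required");
  return {
    DISCORD_TOKEN: token,
    RECORDINGS_DIR: env["RECORDINGS_DIR"] ?? "./recordings",
    RECORDING_SEGMENT_MS: intFromEnv(env, "RECORDING_SEGMENT_MS", 5000),
    ATTACHMENT_MAX_SIZE_MB: intFromEnv(env, "ATTACHMENT_MAX_SIZE_MB", 100),
    ATTACHMENT_RETRY_ATTEMPTS: intFromEnv(env, "ATTACHMENT_RETRY_ATTEMPTS", 3),
    WEBSERVER_PORT: intFromEnv(env, "WEBSERVER_PORT", 3000),
  };
}

// In the application this would typically be called as loadConfig(process.env).
```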

## Testing

Tests use **Vitest** and live in the `tests/` directory. Run them with `pnpm run test`.

**Test Coverage:**
- `tests/moderation/messageStore.test.ts` — Message store CRUD operations
- `tests/moderation/attachmentUploader.test.ts` — Picser response parsing
- `tests/config.test.ts` — Configuration validation
- `tests/decoder.test.ts` — Opus decoder runtime detection and native Opus availability

## Code Style

- **Formatter**: Biome (2-space indent)
- **Linter**: Biome with custom rules (warn on non-null assertions, noExplicitAny)
- **Language**: TypeScript with strict mode
- **Logging**: Use `createChildLogger(context)` for scoped logs
- **Errors**: Throw custom AppError subclasses with code + statusCode
- **Database**: Use prepared statements, never string interpolation
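
The error convention above might look like the following. Only the class names `AppError` and `ValidationError` and the `code` + `statusCode` pairing come from this document; the constructor signatures and the `"VALIDATION_ERROR"` code string are assumptions, so check `src/errors.ts` for the real definitions.

```typescript
class AppError extends Error {
  readonly code: string;
  readonly statusCode: number;

  constructor(message: string, code: string, statusCode: number) {
    super(message);
    this.name = new.target.name;
    this.code = code;
    this.statusCode = statusCode;
    // Keeps `instanceof` working even when compiled to older targets.
    Object.setPrototypeOf(this, new.target.prototype);
  }
}

class ValidationError extends AppError {
  constructor(message: string) {
    super(message, "VALIDATION_ERROR", 400);
  }
}

// Callers can branch on the base class and read the structured fields.
function toHttpStatus(err: unknown): number {
  return err instanceof AppError ? err.statusCode : 500;
}
```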

## Key Patterns

### Message Capture Lifecycle

1. Discord event fires (messageCreate, messageUpdate, messageDelete)
2. Check if guild matches MONITOR_GUILD_ID
3. Extract message metadata (user, channel, content, timestamp)
4. Insert into messages table
5. Broadcast WebSocket event to connected clients
6. If attachments exist:
   - Insert into attachments table with status='pending'
   - Start async upload to picser (non-blocking)
   - On success: update uploaded_url, status='uploaded'
   - On failure: store error, status='failed'
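
The lifecycle above can be sketched with the storage, broadcast, and upload steps injected as plain functions. All names here (`handleMessageCreate`, `CaptureDeps`, the field names) are hypothetical; the actual handler in `messageCapture.ts` works with discord.js message objects and the real store.

```typescript
interface CapturedMessage {
  id: string;
  guildId: string;
  channelId: string;
  userId: string;
  content: string;
  createdAt: number;
  attachments: { id: string; filename: string; url: string }[];
}

interface StoredAttachment {
  id: string;
  messageId: string;
  discordUrl: string;
  uploadStatus: "pending" | "uploaded" | "failed";
}

interface CaptureDeps {
  monitorGuildId: string;
  insertMessage(msg: CapturedMessage): void;
  insertAttachment(att: StoredAttachment): void;
  broadcast(event: { type: string; data: unknown }): void;
  // Must return immediately; the upload itself runs in the background.
  queueUpload(att: StoredAttachment): void;
}

// Returns false when the message is ignored, true when it was captured.
function handleMessageCreate(msg: CapturedMessage, deps: CaptureDeps): boolean {
  if (msg.guildId !== deps.monitorGuildId) return false; // step 2: guild filter
  deps.insertMessage(msg); // steps 3-4: persist metadata
  deps.broadcast({ type: "message_created", data: msg }); // step 5
  for (const a of msg.attachments) {
    // step 6: record as pending, then hand off without waiting
    const pendingRow: StoredAttachment = {
      id: a.id,
      messageId: msg.id,
      discordUrl: a.url,
      uploadStatus: "pending",
    };
    deps.insertAttachment(pendingRow);
    deps.queueUpload(pendingRow);
  }
  return true;
}
```

Injecting the dependencies keeps the ordering testable without a Discord connection or a database.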

### Attachment Upload Flow

1. Download from Discord URL (with timeout)
2. Validate file size against ATTACHMENT_MAX_SIZE_MB
3. Upload to picser with retry logic (exponential backoff)
4. Parse response, extract raw_commit URL
5. Update database with uploaded_url and status
6. Broadcast attachment_uploaded event
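
Steps 2 and 3 can be illustrated with a small, dependency-free retry loop. The project lists `p-retry` as a dependency, so treat this as an illustration of the idea rather than the uploader's code: the helper names, the 1000ms base delay, and the reading of `ATTACHMENT_RETRY_ATTEMPTS` as the total number of attempts are all assumptions.

```typescript
// Delay before the retry that follows the given failed attempt (1-based): base, 2x base, 4x base, ...
function backoffDelayMs(attempt: number, baseMs: number = 1000): number {
  return baseMs * Math.pow(2, attempt - 1);
}

// Size check against ATTACHMENT_MAX_SIZE_MB (megabytes taken as 1024 * 1024 bytes).
function exceedsMaxSize(sizeBytes: number, maxSizeMb: number): boolean {
  return sizeBytes > maxSizeMb * 1024 * 1024;
}

// Runs `task` up to `attempts` times, sleeping with exponential backoff between failures.
// `sleep` is injected so the loop can be exercised without real timers.
async function withRetry<T>(
  task: () => Promise<T>,
  attempts: number,
  sleep: (ms: number) => Promise<void>,
  baseMs: number = 1000,
): Promise<T> {
  let lastError: unknown = new Error("withRetry called with attempts < 1");
  for (let attempt = 1; attempt <= attempts; attempt++) {
    try {
      return await task();
    } catch (err) {
      lastError = err;
      if (attempt < attempts) await sleep(backoffDelayMs(attempt, baseMs));
    }
  }
  throw lastError;
}
```

In production the `sleep` argument would wrap `setTimeout`; when every attempt fails, the last error is rethrown so the caller can record it and mark the attachment as failed.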

### WebSocket Protocol

**Inbound (browser → bot):**
- Binary: Raw PCM buffers (24kHz mono s16le) for voice transmission

**Outbound (bot → browser):**
- Binary: 4-byte user ID hash + PCM chunk (voice)
- JSON: `{ type: "user_state", users: [...] }` (active speakers; sent on connect and on user activity change)
- JSON: `{ type: "message_created", data: {...} }` (new text message)
- JSON: `{ type: "message_updated", data: {...} }` (edited message)
- JSON: `{ type: "message_deleted", data: {...} }` (deleted message)
- JSON: `{ type: "attachment_uploaded", data: {...} }` (image uploaded)
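
The binary voice frame (4-byte user ID hash followed by a PCM chunk) could be packed and unpacked as below. This document does not say which hash function or byte order is used, so 32-bit FNV-1a and big-endian are stand-in assumptions, as are the function names.

```typescript
// Hypothetical hash: 32-bit FNV-1a over the characters of the user ID string.
function hashUserId(userId: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < userId.length; i++) {
    hash ^= userId.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash >>> 0;
}

// Frame layout: [4-byte user hash][PCM bytes].
function encodeVoiceFrame(userHash: number, pcm: Uint8Array): Uint8Array {
  const out = new Uint8Array(4 + pcm.length);
  new DataView(out.buffer).setUint32(0, userHash, false); // big-endian (assumed)
  out.set(pcm, 4);
  return out;
}

function decodeVoiceFrame(data: Uint8Array): { userHash: number; pcm: Uint8Array } {
  if (data.length < 4) throw new Error("voice frame shorter than its 4-byte header");
  const view = new DataView(data.buffer, data.byteOffset, data.byteLength);
  // subarray() returns a view onto the same memory, so no PCM bytes are copied.
  return { userHash: view.getUint32(0, false), pcm: data.subarray(4) };
}
```

A browser client would distinguish the two outbound kinds by message type: binary messages go through a decoder like `decodeVoiceFrame`, text messages through `JSON.parse` followed by a switch on `type`.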

### Graceful Shutdown

Handles SIGINT/SIGTERM/uncaughtException/unhandledRejection:
1. Stop voice connection
2. Pause player
3. Destroy Discord client
4. Exit process
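
A sketch of the shutdown ordering, assuming each step is an injectable function. `createShutdown` and `ShutdownStep` are hypothetical names, and the exit-code policy (0 for signals, 1 otherwise or when a step fails) is an assumption rather than documented behavior.

```typescript
interface ShutdownStep {
  name: string;
  run: () => void | Promise<void>;
}

// Returns a handler that runs the steps in order exactly once, then exits.
function createShutdown(
  steps: ShutdownStep[],
  exit: (code: number) => void,
): (reason: string) => Promise<void> {
  let started = false;
  return async (reason: string): Promise<void> => {
    if (started) return; // a second signal must not re-run the cleanup
    started = true;
    let exitCode = reason === "SIGINT" || reason === "SIGTERM" ? 0 : 1;
    for (const step of steps) {
      try {
        await step.run();
      } catch {
        exitCode = 1; // keep going so later steps still get a chance to clean up
      }
    }
    exit(exitCode);
  };
}

// Wiring, shown as comments because the real objects live elsewhere:
//   const shutdown = createShutdown(steps, (code) => process.exit(code));
//   process.on("SIGINT", () => void shutdown("SIGINT"));
//   process.on("SIGTERM", () => void shutdown("SIGTERM"));
```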

## Dashboard Usage

**Access:** `http://localhost:3000/dashboard.html`

**Features:**
- Three tabs: Text Messages | Images | Voice
- Channel/thread filter dropdown
- Real-time WebSocket updates
- Polling fallback if WebSocket disconnects
- Message display with metadata (author, timestamp, edits, deletions)
- Image grid with previews and upload status
- Voice segment list (future enhancement)

**UI interactions:**
- Click tab to switch content type
- Select channel to filter
- Click image to view full size
- WebSocket status indicator (green = connected)

## Common Tasks

### Add a new config variable

1. Add to `configSchema` in `src/config.ts` with Zod validation
2. Add to `.env.example` with description
3. Use via `config.VARIABLE_NAME`

### Add a new REST endpoint

1. Add route in `src/webserver.ts` (Express)
2. Use database functions from `src/moderation/messageStore.ts`
3. Wrap in try-catch, pass errors to Express error handler
4. Return JSON response
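
The steps above might produce a handler shaped like this. To keep the sketch self-contained it declares minimal `Req`, `Res`, and `Next` types that structurally match the parts of Express it uses; the factory name, the injected `queryMessages` function, and the response body shape are assumptions.

```typescript
interface Req { query: Record<string, unknown>; }
interface Res { status(code: number): Res; json(body: unknown): Res; }
type Next = (err?: unknown) => void;

type MessageKindParam = "text" | "image";

// `queryMessages` stands in for a function exported by messageStore.ts.
function makeMessagesHandler(queryMessages: (channelId: string, kind: MessageKindParam) => unknown[]) {
  return (req: Req, res: Res, next: Next): void => {
    try {
      const channel = req.query["channel"];
      const kind = req.query["type"];
      if (typeof channel !== "string" || (kind !== "text" && kind !== "image")) {
        res.status(400).json({ error: "channel and type=text|image are required" });
        return;
      }
      res.json({ messages: queryMessages(channel, kind) });
    } catch (err) {
      next(err); // defer to the Express error handler
    }
  };
}

// Registration would look like: app.get("/api/messages", makeMessagesHandler(queryMessages));
```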

### Add a new WebSocket event

1. Define broadcast function in `src/webserver.ts` (attach to globalThis)
2. Call from event handler (e.g., messageCapture.ts)
3. Send JSON with `{ type, data, timestamp }`
4. Handle in dashboard JavaScript
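
A broadcast function following the `{ type, data, timestamp }` envelope could look like this sketch. `SocketLike` is a structural stand-in for a `ws` client, and the choice of epoch milliseconds for `timestamp` is an assumption.

```typescript
interface SocketLike {
  readyState: number;
  send(data: string): void;
}

const SOCKET_OPEN = 1; // numeric value of WebSocket.OPEN

// Sends one JSON event to every open client and returns how many received it.
function broadcastEvent(
  clients: SocketLike[],
  type: string,
  data: unknown,
  now: () => number = Date.now,
): number {
  // Serialize once, then reuse the same string for every client.
  const payload = JSON.stringify({ type, data, timestamp: now() });
  let delivered = 0;
  for (const client of clients) {
    if (client.readyState !== SOCKET_OPEN) continue; // skip connecting/closing sockets
    client.send(payload);
    delivered++;
  }
  return delivered;
}

// With the `ws` server, the client list would come from Array.from(wss.clients).
```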

### Debug message capture

- Set `VERBOSE=true` in `.env` for detailed logging
- Check `/health` endpoint for active users/connections
- Monitor `/metrics` endpoint (Prometheus format)
- Check `recordings/<user-id>/` for voice segments
- Query SQLite directly: `sqlite3 .muxer-queue.db "SELECT * FROM messages LIMIT 10;"`

### Debug attachment uploads

- Check `upload_status` in attachments table
- View `upload_error` field for failure reasons
- Monitor logs for "Attachment upload" messages
- Verify picser endpoint is accessible
- Check file size against ATTACHMENT_MAX_SIZE_MB

## Dependencies

**Core:**
- **discord.js-selfbot-v13** — Discord client (selfbot variant)
- **@discordjs/voice** — Voice connection management
- **@discordjs/opus** — Native Opus codec (optional dependency; needed only for web PCM decoding)
- **prism-media** — Audio encoding/decoding (Opus, OGG)

**Web:**
- **express** — HTTP server
- **ws** — WebSocket server
- **helmet** — Security headers

**Data:**
- **better-sqlite3** — SQLite database
- **zod** — Config validation

**Logging & Monitoring:**
- **pino** — Structured logging
- **pino-http** — HTTP request logging
- **prom-client** — Prometheus metrics

**Utilities:**
- **p-retry** — Retry logic with backoff
- **class-transformer** — Object transformation
- **class-validator** — Data validation

**Dev:**
- **Biome** — Linting/formatting
- **Vitest** — Testing framework
- **TypeScript** — Type checking

## Notes

- Bot uses selfbot variant (user account) rather than standard bot token — check Discord ToS
- Opus decoding requires native `@discordjs/opus` under Node.js
- OGG segments include metadata JSON for each segment (user info, timestamps, duration)
- WebSocket broadcasts PCM in real-time; browser can transmit audio back to Discord
- Graceful shutdown ensures clean disconnection and resource cleanup
- All database operations use prepared statements to prevent SQL injection
- Attachment uploads run asynchronously so they never block message capture
- Message capture continues even if attachment upload fails
- Dashboard uses textContent for XSS prevention (not innerHTML)

## Future Enhancements

- Reaction tracking
- Full-text message search
- Moderation actions (flag, delete, mute)
- Export/archive functionality
- Retention policies (auto-delete old data)
- Voice segment metadata in dashboard
- User activity analytics
- Audit log export


## Architecture

### High-Level Flow

1. **Bot Entry** (`src/index.ts`) — Initializes Discord client, sets up graceful shutdown, starts webserver
2. **Voice Controller** (`src/voiceController.ts`) — Manages guild/channel selection and connection lifecycle
3. **Recorder** (`src/recorder.ts`) — Joins voice channel, subscribes to user audio streams, handles Opus decoding and segment rotation
4. **Web Server** (`src/webserver.ts`) — Express + WebSocket server for:
   - REST API: guild/channel listing, connect/disconnect
   - WebSocket: real-time PCM broadcast to browser, browser-to-Discord audio transmission
5. **Muxer Queue** (`src/muxer-queue.ts`) — SQLite-backed job queue for post-processing audio segments (future use)

### Key Modules

- **Recorder subsystem** (`src/recorder/`):
  - `audioStream.ts` — Subscribes to Discord audio receiver, emits Opus packets
  - `decoder.ts` — Opus decoder with runtime checks, cooldown/rotation logic for web PCM broadcast
  - `segment.ts` — Manages OGG file rotation (5s default segments per user)
  - `metadata.ts` — Collects user/role info, creates segment metadata JSON

- **Voice Connection** — Uses `@discordjs/voice` receiver to subscribe to speaking users; each user gets their own stream
- **Audio Pipeline**:
  - Discord → Opus packets → PacketFilter → OGG segments (disk) + OpusDecoder → PCM (web broadcast)
  - Browser → 24kHz mono PCM → upsample to 48kHz stereo → Opus encoder → OGG → Discord player

- **Metrics** (`src/metrics.ts`) — Prometheus metrics for audio levels, recordings, connections, WebSocket clients
- **Logging** (`src/logger.ts`) — Pino logger with pretty-print in dev, JSON in prod
- **Config** (`src/config.ts`) — Zod-validated environment variables with sensible defaults
- **Error Handling** (`src/errors.ts`) — Custom error classes (AppError, ConfigError, AudioError, VoiceConnectionError, ValidationError)
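
The two sample-rate conversions in the audio pipeline above can be illustrated with deliberately naive resamplers over 16-bit PCM. The recorder's actual resampling method is not described in this document, so sample duplication (upsampling) and four-sample averaging (downsampling) are assumptions chosen for clarity, not fidelity.

```typescript
// 24 kHz mono -> 48 kHz stereo (interleaved L,R): every input sample becomes
// two output frames, each with identical left and right values.
function upsampleMonoToStereo48k(mono24k: Int16Array): Int16Array {
  const out = new Int16Array(mono24k.length * 4);
  for (let i = 0; i < mono24k.length; i++) {
    out.fill(mono24k[i], i * 4, i * 4 + 4);
  }
  return out;
}

// 48 kHz stereo (interleaved L,R) -> 24 kHz mono: average two consecutive
// stereo frames (four samples) into one output sample.
function downsampleStereoToMono24k(stereo48k: Int16Array): Int16Array {
  const frames = Math.floor(stereo48k.length / 4);
  const out = new Int16Array(frames);
  for (let i = 0; i < frames; i++) {
    const o = i * 4;
    out[i] = Math.round((stereo48k[o] + stereo48k[o + 1] + stereo48k[o + 2] + stereo48k[o + 3]) / 4);
  }
  return out;
}
```

A production resampler would normally apply a low-pass filter before decimating to limit aliasing; that step is omitted here.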

### Recording Structure

```
recordings/
  ├── <user-id>/
  │   ├── <user-id>-<session-start>-0.ogg
  │   ├── <user-id>-<session-start>-0.json
  │   ├── <user-id>-<session-start>-1.ogg
  │   ├── <user-id>-<session-start>-1.json
  │   └── ...
```

Each segment is 5s (configurable). Metadata JSON includes user info, roles, timestamps, duration.
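
The naming scheme above can be expressed as a small helper. The function names are hypothetical, and two details are assumptions: that `<session-start>` is an epoch-millisecond timestamp, and that the segment index advances once per `RECORDING_SEGMENT_MS` of elapsed session time.

```typescript
// Builds the paired .ogg and .json paths for one segment of one user.
function segmentPaths(
  recordingsDir: string,
  userId: string,
  sessionStart: number,
  index: number,
): { ogg: string; json: string } {
  const base = `${recordingsDir}/${userId}/${userId}-${sessionStart}-${index}`;
  return { ogg: `${base}.ogg`, json: `${base}.json` };
}

// Index of the segment that contains a given offset into the session.
function segmentIndexAt(elapsedMs: number, segmentMs: number = 5000): number {
  return Math.floor(elapsedMs / segmentMs);
}

const examplePaths = segmentPaths("./recordings", "42", 1700000000000, segmentIndexAt(12000));
console.log(examplePaths.ogg);
```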

### Database

- **Muxer Queue** (`.muxer-queue.db`) — SQLite with WAL mode, tracks pending/processing/completed/failed jobs for audio post-processing

## Key Patterns

### Voice Connection Lifecycle

1. `VoiceController.connect(guildId, channelId)` → calls `startRecording()`
2. `startRecording()` joins channel, sets up receiver, subscribes to speaking users
3. On user speak: create stream, segment manager, decoder; pipe to OGG + web broadcast
4. On silence (3s): close stream, save metadata JSON
5. `VoiceController.disconnect()` → calls `stopRecording()` → destroys connection
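
Steps 3 and 4 amount to tracking, per user, when the last audio packet arrived. The sketch below models that bookkeeping with an injected clock. `SpeakerTracker` is a hypothetical class, and the recorder may instead rely on the end-of-stream behavior provided by `@discordjs/voice`.

```typescript
class SpeakerTracker {
  private readonly lastPacketAt = new Map<string, number>();
  private readonly silenceMs: number;

  constructor(silenceMs: number = 3000) {
    this.silenceMs = silenceMs;
  }

  // Records a packet; returns true when this starts a new stream for the user.
  onPacket(userId: string, now: number): boolean {
    const isNewStream = !this.lastPacketAt.has(userId);
    this.lastPacketAt.set(userId, now);
    return isNewStream;
  }

  // Returns the users whose streams should be closed because they have been
  // silent for at least `silenceMs`, and forgets them.
  sweep(now: number): string[] {
    const ended: string[] = [];
    this.lastPacketAt.forEach((last, userId) => {
      if (now - last >= this.silenceMs) ended.push(userId);
    });
    for (const userId of ended) this.lastPacketAt.delete(userId);
    return ended;
  }
}
```

On each user returned by `sweep`, the recorder would close the stream and write the segment's metadata JSON.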

### Audio Decoding (Web Broadcast)

- OpusDecoder wraps prism decoder with error recovery
- Rotates decoder every 5s to prevent memory leaks
- Cools down for 30s after error before retrying
- Downsamples 48kHz stereo → 24kHz mono for web transmission
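
The rotation and cooldown rules can be modelled as a small time-based gate, shown here with an injected clock so the behavior is easy to check. `DecoderGate` is a hypothetical name and this is not the `OpusDecoder` wrapper itself; the defaults mirror `DECODER_ROTATE_MS` and `DECODER_COOLDOWN_MS`.

```typescript
class DecoderGate {
  private createdAt: number;
  private cooldownUntil = 0;
  private readonly rotateMs: number;
  private readonly cooldownMs: number;

  constructor(now: number, rotateMs: number = 5000, cooldownMs: number = 30000) {
    this.createdAt = now;
    this.rotateMs = rotateMs;
    this.cooldownMs = cooldownMs;
  }

  // True once the current decoder instance has lived for the rotation interval.
  shouldRotate(now: number): boolean {
    return now - this.createdAt >= this.rotateMs;
  }

  // Call after replacing the decoder with a fresh instance.
  markRotated(now: number): void {
    this.createdAt = now;
  }

  // Call when decoding throws; decoding stays disabled until the cooldown ends.
  recordError(now: number): void {
    this.cooldownUntil = now + this.cooldownMs;
  }

  canDecode(now: number): boolean {
    return now >= this.cooldownUntil;
  }
}
```

While `canDecode` is false, only the web PCM broadcast would pause; per the audio pipeline, OGG segments are written from the Opus packets directly and do not depend on the decoder.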

## Future Expansion (Text/Image Monitoring)

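The voice pipeline was the original scope. The moderation subsystem now covers text message capture and attachment logging; the additions planned at that stage were: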
- Text channel message capture
- Image/attachment logging
- Per-channel/per-user filtering
- Moderation action triggers

These will likely require:
- Additional event listeners in recorder
- Extended metadata schema
- New storage/indexing strategy
- Webhook/alert system

## Common Tasks

### Add a new REST endpoint

1. Add route in `src/webserver.ts` (Express)
2. Use `VoiceController` methods or create new ones
3. Wrap in try-catch, pass errors to Express error handler

### Add metrics

1. Define gauge/counter/histogram in `src/metrics.ts`
2. Update in relevant code paths
3. Metrics exposed at `/metrics` endpoint (Prometheus format)

### Debug audio issues

- Set `VERBOSE=true` in `.env` for detailed logging
- Check `/health` endpoint for active users/connections
- Monitor audio levels via `/metrics` (audio_level_db gauge)
- Check segment files in `recordings/<user-id>/` directory
