# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## Project Overview

**Discord Moderation Watcher Bot** — A comprehensive monitoring bot that captures voice, text messages, and images from Discord servers. Records audio from voice channels, captures all text messages (new/edited/deleted) from channels and threads, and uploads attachments to external storage. All data stored in SQLite with real-time dashboard.

Built with **Node.js/pnpm** + **discord.js-selfbot-v13** + **@discordjs/voice** + **Express** + **WebSocket**.

## Architecture

### High-Level Flow

1. **Bot Entry** (`src/index.ts`) — Initializes Discord client, registers event listeners, starts webserver
2. **Message Capture** (`src/moderation/messageCapture.ts`) — Listens to Discord events (messageCreate, messageUpdate, messageDelete)
3. **Message Store** (`src/moderation/messageStore.ts`) — Database operations for messages and attachments
4. **Attachment Uploader** (`src/moderation/attachmentUploader.ts`) — Downloads from Discord, uploads to picser, stores URLs
5. **Voice Controller** (`src/voiceController.ts`) — Manages voice channel connections
6. **Recorder** (`src/recorder.ts`) — Records voice audio to OGG segments
7. **Web Server** (`src/webserver.ts`) — Express + WebSocket for REST API and real-time updates
8. **Dashboard** (`public/dashboard.html`) — Web UI with three tabs (Text, Images, Voice)

### Key Modules

**Moderation Subsystem** (`src/moderation/`):
- `types.ts` — TypeScript types for messages, attachments, voice segments
- `messageCapture.ts` — Discord event listeners (messageCreate, messageUpdate, messageDelete)
- `messageStore.ts` — Database CRUD operations (insert, update, query)
- `attachmentUploader.ts` — Picser integration with retry logic and error handling

**Database Schema** (SQLite):
- `messages` table — text messages with edit/delete tracking, user metadata, timestamps
- `attachments` table — attachment metadata, Discord URLs, picser URLs, upload status
- Indexes on channel_id, user_id, created_at for fast queries

**Voice Recording** (existing, unchanged):
- `recorder.ts` — Joins voice channel, subscribes to user audio streams
- `recorder/audioStream.ts` — Opus packet subscription
- `recorder/decoder.ts` — Opus decoder with runtime checks
- `recorder/segment.ts` — OGG file rotation (5s segments)

**Web Interface**:
- REST API: `/api/messages?channel=&type=text|image`
- WebSocket: real-time events (message_created, message_updated, message_deleted, attachment_uploaded)
- Dashboard: three tabs (Text Messages, Images, Voice) with channel filtering

### Recording Structure

```
recordings/
├── /
│   ├── --0.ogg
│   ├── --0.json
│   └── ...

messages (SQLite):
├── id, guild_id, channel_id, thread_id
├── user_id, username, avatar_url
├── content, edited_content
├── created_at, edited_at, deleted_at
└── type (text|edited|deleted)

attachments (SQLite):
├── id, message_id, guild_id, channel_id, user_id
├── filename, size, type (MIME)
├── discord_url, uploaded_url (picser raw_commit)
├── upload_status (pending|uploaded|failed)
└── created_at, uploaded_at
```

## Development Commands

```bash
# Install dependencies
pnpm install

# Development (auto-restart on file changes)
pnpm run dev

# Production
pnpm run start

# Type checking
pnpm run typecheck

# Linting (Biome)
pnpm run lint

# Format code (Biome)
pnpm run format

# Run tests
pnpm run test

# Build TypeScript
pnpm run build
```

## Configuration

All config via `.env` (see `.env.example`). Key variables:

**Discord & Monitoring:**
- `DISCORD_TOKEN` — Bot token (required)
- `MONITOR_GUILD_ID` — Target server to monitor (required for moderation)
- `GUILD_ID` — Legacy voice channel guild (optional)
- `VOICE_CHANNEL_ID` — Legacy voice channel ID (optional)

**Recording:**
- `RECORDINGS_DIR` — Where to save audio files (default: `./recordings`)
- `RECORDING_SEGMENT_MS` — OGG segment duration (default: 5000ms)

**Decoder:**
- `DECODER_ROTATE_MS` — Opus decoder rotation interval (default: 5000ms)
- `DECODER_COOLDOWN_MS` — Cooldown after decoder error (default: 30000ms)

**Attachments:**
- `PICSER_UPLOAD_URL` — Picser upload endpoint (default: https://picser.asepharyana.tech/api/upload)
- `ATTACHMENT_UPLOAD_TIMEOUT_MS` — Upload timeout (default: 30000ms)
- `ATTACHMENT_MAX_SIZE_MB` — Max file size (default: 100MB)
- `ATTACHMENT_RETRY_ATTEMPTS` — Retry count (default: 3)

**Web Server:**
- `WEBSERVER_PORT` — HTTP/WebSocket port (default: 3000)

**Connection:**
- `VOICE_CONNECTION_TIMEOUT_MS` — Voice join timeout (default: 15000ms)
- `RECONNECT_TIMEOUT_MS` — Reconnect timeout (default: 5000ms)
- `AUDIO_STREAM_SILENCE_DURATION_MS` — Silence threshold (default: 3000ms)

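
To make the defaults listed so far concrete, here is a minimal, dependency-free sketch of how a few of the integer variables might be resolved. The repository validates config with Zod in `src/config.ts`; the `intFromEnv` and `loadConfigSketch` helpers and the `ConfigSketch` type below are hypothetical and only illustrate the fallback behaviour, not the actual schema.

```typescript
// Hypothetical sketch: resolve a few integer settings with the documented defaults.
// The real project is described as validating these with Zod in src/config.ts.
type Env = Record<string, string | undefined>;

interface ConfigSketch {
  RECORDING_SEGMENT_MS: number;
  ATTACHMENT_UPLOAD_TIMEOUT_MS: number;
  ATTACHMENT_RETRY_ATTEMPTS: number;
  WEBSERVER_PORT: number;
}

// Parse a non-negative integer from the environment, using the fallback when unset or blank.
function intFromEnv(env: Env, name: string, fallback: number): number {
  const raw = env[name];
  if (raw === undefined || raw.trim() === "") return fallback;
  const value = Number(raw);
  if (!Number.isInteger(value) || value < 0) {
    throw new Error(`${name} must be a non-negative integer, got "${raw}"`);
  }
  return value;
}

// Build the config object; defaults mirror the values documented above.
function loadConfigSketch(env: Env): ConfigSketch {
  return {
    RECORDING_SEGMENT_MS: intFromEnv(env, "RECORDING_SEGMENT_MS", 5000),
    ATTACHMENT_UPLOAD_TIMEOUT_MS: intFromEnv(env, "ATTACHMENT_UPLOAD_TIMEOUT_MS", 30000),
    ATTACHMENT_RETRY_ATTEMPTS: intFromEnv(env, "ATTACHMENT_RETRY_ATTEMPTS", 3),
    WEBSERVER_PORT: intFromEnv(env, "WEBSERVER_PORT", 3000),
  };
}
```

In Node.js this would typically be called as `loadConfigSketch(process.env)`; failing fast on a malformed value keeps a typo in `.env` from surfacing later as a confusing runtime error.
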
**Logging:**
- `LOG_LEVEL` — Pino log level (default: info)
- `VERBOSE` — Enable debug logging (default: false)
- `NODE_ENV` — Environment (development|production|test)

## Testing

Tests use **Vitest** in `tests/` directory. Run with `pnpm run test`.

**Test Coverage:**
- `tests/moderation/messageStore.test.ts` — Message store CRUD operations
- `tests/moderation/attachmentUploader.test.ts` — Picser response parsing
- `tests/config.test.ts` — Configuration validation
- `tests/decoder.test.ts` — Opus decoder runtime detection

## Code Style

- **Formatter**: Biome (2-space indent)
- **Linter**: Biome with custom rules (warn on non-null assertions, noExplicitAny)
- **Language**: TypeScript with strict mode
- **Logging**: Use `createChildLogger(context)` for scoped logs
- **Errors**: Throw custom AppError subclasses with code + statusCode
- **Database**: Use prepared statements, never string interpolation

## Key Patterns

### Message Capture Lifecycle

1. Discord event fires (messageCreate, messageUpdate, messageDelete)
2. Check if guild matches MONITOR_GUILD_ID
3. Extract message metadata (user, channel, content, timestamp)
4. Insert into messages table
5. Broadcast WebSocket event to connected clients
6. If attachments exist:
   - Insert into attachments table with status='pending'
   - Start async upload to picser (non-blocking)
   - On success: update uploaded_url, status='uploaded'
   - On failure: store error, status='failed'

### Attachment Upload Flow

1. Download from Discord URL (with timeout)
2. Validate file size against ATTACHMENT_MAX_SIZE_MB
3. Upload to picser with retry logic (exponential backoff)
4. Parse response, extract raw_commit URL
5. Update database with uploaded_url and status
6. Broadcast attachment_uploaded event

### WebSocket Protocol

**Inbound (browser → bot):**
- Binary: Raw PCM buffers (24kHz mono s16le) for voice transmission

**Outbound (bot → browser):**
- Binary: 4-byte user ID hash + PCM chunk (voice)
- JSON: `{ type: "user_state", users: [...] }` (active speakers)
- JSON: `{ type: "message_created", data: {...} }` (new text message)
- JSON: `{ type: "message_updated", data: {...} }` (edited message)
- JSON: `{ type: "message_deleted", data: {...} }` (deleted message)
- JSON: `{ type: "attachment_uploaded", data: {...} }` (image uploaded)

### Graceful Shutdown

Handles SIGINT/SIGTERM/uncaughtException/unhandledRejection:

1. Stop voice connection
2. Pause player
3. Destroy Discord client
4. Exit process

## Dashboard Usage

**Access:** `http://localhost:3000/dashboard.html`

**Features:**
- Three tabs: Text Messages | Images | Voice
- Channel/thread filter dropdown
- Real-time WebSocket updates
- Polling fallback if WebSocket disconnects
- Message display with metadata (author, timestamp, edits, deletions)
- Image grid with previews and upload status
- Voice segment list (future enhancement)

**Keyboard/UI:**
- Click tab to switch content type
- Select channel to filter
- Click image to view full size
- WebSocket status indicator (green = connected)

## Common Tasks

### Add a new config variable

1. Add to `configSchema` in `src/config.ts` with Zod validation
2. Add to `.env.example` with description
3. Use via `config.VARIABLE_NAME`

### Add a new REST endpoint

1. Add route in `src/webserver.ts` (Express)
2. Use database functions from `src/moderation/messageStore.ts`
3. Wrap in try-catch, pass errors to Express error handler
4. Return JSON response

### Add a new WebSocket event

1. Define broadcast function in `src/webserver.ts` (attach to globalThis)
2. Call from event handler (e.g., messageCapture.ts)
3. Send JSON with `{ type, data, timestamp }`
4. Handle in dashboard JavaScript

### Debug message capture

- Set `VERBOSE=true` in `.env` for detailed logging
- Check `/health` endpoint for active users/connections
- Monitor `/metrics` endpoint (Prometheus format)
- Check `recordings//` for voice segments
- Query SQLite directly: `sqlite3 .muxer-queue.db "SELECT * FROM messages LIMIT 10;"`

### Debug attachment uploads

- Check `upload_status` in attachments table
- View `upload_error` field for failure reasons
- Monitor logs for "Attachment upload" messages
- Verify picser endpoint is accessible
- Check file size against ATTACHMENT_MAX_SIZE_MB

## Dependencies

**Core:**
- **discord.js-selfbot-v13** — Discord client (selfbot variant)
- **@discordjs/voice** — Voice connection management
- **@discordjs/opus** — Native Opus codec (optional, required for web PCM)
- **prism-media** — Audio encoding/decoding (Opus, OGG)

**Web:**
- **express** — HTTP server
- **ws** — WebSocket server
- **helmet** — Security headers

**Data:**
- **better-sqlite3** — SQLite database
- **zod** — Config validation

**Logging & Monitoring:**
- **pino** — Structured logging
- **pino-http** — HTTP request logging
- **prom-client** — Prometheus metrics

**Utilities:**
- **p-retry** — Retry logic with backoff
- **class-transformer** — Object transformation
- **class-validator** — Data validation

**Dev:**
- **Biome** — Linting/formatting
- **Vitest** — Testing framework
- **TypeScript** — Type checking

## Notes

- Bot uses selfbot variant (user account) rather than standard bot token — check Discord ToS
- Opus decoding requires native `@discordjs/opus` under Node.js
- OGG segments include metadata JSON for each segment (user info, timestamps, duration)
- WebSocket broadcasts PCM in real-time; browser can transmit audio back to Discord
- Graceful shutdown ensures clean disconnection and resource cleanup
- All database operations use prepared statements to prevent SQL injection
- Attachment uploads are non-blocking (async) to avoid blocking message capture
- Message capture continues even if attachment upload fails
- Dashboard uses textContent for XSS prevention (not innerHTML)

## Future Enhancements

- Reaction tracking
- Message search/full-text search
- Moderation actions (flag, delete, mute)
- Export/archive functionality
- Retention policies (auto-delete old data)
- Voice segment metadata in dashboard
- User activity analytics
- Audit log export

## Architecture

### High-Level Flow

1. **Bot Entry** (`src/index.ts`) — Initializes Discord client, sets up graceful shutdown, starts webserver
2. **Voice Controller** (`src/voiceController.ts`) — Manages guild/channel selection and connection lifecycle
3. **Recorder** (`src/recorder.ts`) — Joins voice channel, subscribes to user audio streams, handles Opus decoding and segment rotation
4. **Web Server** (`src/webserver.ts`) — Express + WebSocket server for:
   - REST API: guild/channel listing, connect/disconnect
   - WebSocket: real-time PCM broadcast to browser, browser-to-Discord audio transmission
5. **Muxer Queue** (`src/muxer-queue.ts`) — SQLite-backed job queue for post-processing audio segments (future use)

### Key Modules

- **Recorder subsystem** (`src/recorder/`):
  - `audioStream.ts` — Subscribes to Discord audio receiver, emits Opus packets
  - `decoder.ts` — Opus decoder with runtime checks, cooldown/rotation logic for web PCM broadcast
  - `segment.ts` — Manages OGG file rotation (5s default segments per user)
  - `metadata.ts` — Collects user/role info, creates segment metadata JSON
- **Voice Connection** — Uses `@discordjs/voice` receiver to subscribe to speaking users; each user gets their own stream
- **Audio Pipeline**:
  - Discord → Opus packets → PacketFilter → OGG segments (disk) + OpusDecoder → PCM (web broadcast)
  - Browser → 24kHz mono PCM → upsample to 48kHz stereo → Opus encoder → OGG → Discord player
- **Metrics** (`src/metrics.ts`) — Prometheus metrics for audio levels, recordings, connections, WebSocket clients
- **Logging** (`src/logger.ts`) — Pino logger with pretty-print in dev, JSON in prod
- **Config** (`src/config.ts`) — Zod-validated environment variables with sensible defaults
- **Error Handling** (`src/errors.ts`) — Custom error classes (AppError, ConfigError, AudioError, VoiceConnectionError, ValidationError)

### Recording Structure

```
recordings/
├── /
│   ├── --0.ogg
│   ├── --0.json
│   ├── --1.ogg
│   ├── --1.json
│   └── ...
```

Each segment is 5s (configurable). Metadata JSON includes user info, roles, timestamps, duration.

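
The rotation rule can be illustrated with a little arithmetic. This is a minimal sketch assuming segments are cut at fixed intervals measured from the moment a user's stream starts; `segmentIndexFor` and `segmentWindow` are hypothetical helpers, not functions from `segment.ts`, and the real rotation logic may differ.

```typescript
// Hypothetical helpers illustrating fixed-interval segment rotation.
// Assumption: segment N covers [start + N * segmentMs, start + (N + 1) * segmentMs).

// Return the zero-based index of the segment that a packet timestamp falls into.
function segmentIndexFor(streamStartMs: number, packetMs: number, segmentMs = 5000): number {
  if (segmentMs <= 0) throw new RangeError("segmentMs must be positive");
  if (packetMs < streamStartMs) throw new RangeError("packet precedes stream start");
  return Math.floor((packetMs - streamStartMs) / segmentMs);
}

// Return the half-open time window [startMs, endMs) covered by a segment index.
function segmentWindow(
  streamStartMs: number,
  index: number,
  segmentMs = 5000,
): { startMs: number; endMs: number } {
  const startMs = streamStartMs + index * segmentMs;
  return { startMs, endMs: startMs + segmentMs };
}
```

With the default 5000 ms setting, a packet arriving 12.3 s into a stream falls into segment index 2, which covers the window from 10 s to 15 s.
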
### Database

- **Muxer Queue** (`.muxer-queue.db`) — SQLite with WAL mode, tracks pending/processing/completed/failed jobs for audio post-processing

## Configuration

All config via `.env` (see `.env.example`). Key variables:

- `DISCORD_TOKEN` — Bot token (required)
- `RECORDINGS_DIR` — Where to save audio files (default: `./recordings`)
- `RECORDING_SEGMENT_MS` — OGG segment duration (default: 5000ms)
- `DECODER_ROTATE_MS` — Opus decoder rotation interval (default: 5000ms)
- `DECODER_COOLDOWN_MS` — Cooldown after decoder error (default: 30000ms)
- `WEBSERVER_PORT` — HTTP/WebSocket port (default: 3000)
- `VOICE_CONNECTION_TIMEOUT_MS` — Voice join timeout (default: 15000ms)
- `AUDIO_STREAM_SILENCE_DURATION_MS` — Silence threshold before ending stream (default: 3000ms)
- `LOG_LEVEL` — Pino log level (default: info)
- `VERBOSE` — Enable debug logging (default: false)

## Testing

Tests use **Vitest** in `tests/` directory. Run with `pnpm run test`.

Example: `tests/decoder.test.ts` tests Opus decoder runtime detection and native opus availability.

## Key Patterns

### Voice Connection Lifecycle

1. `VoiceController.connect(guildId, channelId)` → calls `startRecording()`
2. `startRecording()` joins channel, sets up receiver, subscribes to speaking users
3. On user speak: create stream, segment manager, decoder; pipe to OGG + web broadcast
4. On silence (3s): close stream, save metadata JSON
5. `VoiceController.disconnect()` → calls `stopRecording()` → destroys connection

### Audio Decoding (Web Broadcast)

- OpusDecoder wraps prism decoder with error recovery
- Rotates decoder every 5s to prevent memory leaks
- Cools down for 30s after error before retrying
- Downsamples 48kHz stereo → 24kHz mono for web transmission

### WebSocket Protocol

- **Inbound** (browser → bot): Raw PCM buffers (24kHz mono s16le)
- **Outbound** (bot → browser):
  - Binary: 4-byte user ID hash + PCM chunk
  - JSON: `{ type: "user_state", users: [...] }` on connect/user activity change

## Future Expansion (Text/Image Monitoring)

Current scope: voice only. Planned additions:

- Text channel message capture
- Image/attachment logging
- Per-channel/per-user filtering
- Moderation action triggers

These will likely require:

- Additional event listeners in recorder
- Extended metadata schema
- New storage/indexing strategy
- Webhook/alert system

## Common Tasks

### Add a new config variable

1. Add to `configSchema` in `src/config.ts` with Zod validation
2. Add to `.env.example`
3. Use via `config.VARIABLE_NAME`

### Add a new REST endpoint

1. Add route in `src/webserver.ts` (Express)
2. Use `VoiceController` methods or create new ones
3. Wrap in try-catch, pass errors to Express error handler

### Add metrics

1. Define gauge/counter/histogram in `src/metrics.ts`
2. Update in relevant code paths
3. Metrics exposed at `/metrics` endpoint (Prometheus format)

### Debug audio issues

- Set `VERBOSE=true` in `.env` for detailed logging
- Check `/health` endpoint for active users/connections
- Monitor audio levels via `/metrics` (audio_level_db gauge)
- Check segment files in `recordings//` directory

## Dependencies

- **discord.js-selfbot-v13** — Discord client (selfbot variant for user account access)
- **@discordjs/voice** — Voice connection management
- **@discordjs/opus** — Native Opus codec (optional, required for web PCM decode)
- **prism-media** — Audio encoding/decoding (Opus, OGG)
- **express** — HTTP server
- **ws** — WebSocket server
- **better-sqlite3** — SQLite database (muxer queue)
- **pino** — Structured logging
- **prom-client** — Prometheus metrics
- **zod** — Config validation
- **Biome** — Linting/formatting
- **Vitest** — Testing framework

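
The outbound binary frame described under WebSocket Protocol, together with the 48kHz stereo to 24kHz mono step from Audio Decoding, can be sketched as follows. This document does not specify how the 4-byte user ID hash is computed or its byte order, so the FNV-1a hash and big-endian layout here are assumptions. The downsampler is a naive channel-average-and-decimate with no low-pass filter, which may not match what `decoder.ts` does, and all three function names are hypothetical.

```typescript
// Hypothetical 32-bit FNV-1a hash of a user ID string.
// Assumption: the real hash function is not specified in this document.
function hashUserId(userId: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < userId.length; i++) {
    hash ^= userId.charCodeAt(i) & 0xff;
    hash = Math.imul(hash, 0x01000193);
  }
  return hash >>> 0; // force an unsigned 32-bit result
}

// Naive downsample: interleaved 48kHz stereo s16 -> 24kHz mono s16.
// Averages left/right, then keeps every second frame (no anti-aliasing filter).
function downsampleToMono24k(stereo48k: Int16Array): Int16Array {
  const frames = Math.floor(stereo48k.length / 2);
  const out = new Int16Array(Math.ceil(frames / 2));
  for (let frame = 0, o = 0; frame < frames; frame += 2, o++) {
    const left = stereo48k[2 * frame];
    const right = stereo48k[2 * frame + 1];
    out[o] = (left + right) >> 1;
  }
  return out;
}

// Build one outbound frame: 4-byte hash (big-endian, assumed) followed by the PCM bytes.
function buildVoiceFrame(userId: string, pcm: Int16Array): Uint8Array {
  const frame = new Uint8Array(4 + pcm.byteLength);
  new DataView(frame.buffer).setUint32(0, hashUserId(userId), false);
  // Int16Array uses platform byte order, which is little-endian (s16le) on common hardware.
  frame.set(new Uint8Array(pcm.buffer, pcm.byteOffset, pcm.byteLength), 4);
  return frame;
}
```

A receiver would read the first four bytes to pick the speaker and treat the remainder as s16le samples. Because a 32-bit hash can collide, the dashboard presumably resolves the hash against the `user_state` list rather than treating it as a unique ID.
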