Files
dc-recorder/CLAUDE.md

18 KiB

CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

Project Overview

Discord Moderation Watcher Bot — A comprehensive monitoring bot that captures voice, text messages, and images from Discord servers. Records audio from voice channels, captures all text messages (new/edited/deleted) from channels and threads, and uploads attachments to external storage. All data stored in SQLite with real-time dashboard.

Built with Node.js/pnpm + discord.js-selfbot-v13 + @discordjs/voice + Express + WebSocket.

Architecture

High-Level Flow

  1. Bot Entry (src/index.ts) — Initializes Discord client, registers event listeners, starts webserver
  2. Message Capture (src/moderation/messageCapture.ts) — Listens to Discord events (messageCreate, messageUpdate, messageDelete)
  3. Message Store (src/moderation/messageStore.ts) — Database operations for messages and attachments
  4. Attachment Uploader (src/moderation/attachmentUploader.ts) — Downloads from Discord, uploads to picser, stores URLs
  5. Voice Controller (src/voiceController.ts) — Manages voice channel connections
  6. Recorder (src/recorder.ts) — Records voice audio to OGG segments
  7. Web Server (src/webserver.ts) — Express + WebSocket for REST API and real-time updates
  8. Dashboard (public/dashboard.html) — Web UI with three tabs (Text, Images, Voice)

Key Modules

Moderation Subsystem (src/moderation/):

  • types.ts — TypeScript types for messages, attachments, voice segments
  • messageCapture.ts — Discord event listeners (messageCreate, messageUpdate, messageDelete)
  • messageStore.ts — Database CRUD operations (insert, update, query)
  • attachmentUploader.ts — Picser integration with retry logic and error handling

Database Schema (SQLite):

  • messages table — text messages with edit/delete tracking, user metadata, timestamps
  • attachments table — attachment metadata, Discord URLs, picser URLs, upload status
  • Indexes on channel_id, user_id, created_at for fast queries

Voice Recording (existing, unchanged):

  • recorder.ts — Joins voice channel, subscribes to user audio streams
  • recorder/audioStream.ts — Opus packet subscription
  • recorder/decoder.ts — Opus decoder with runtime checks
  • recorder/segment.ts — OGG file rotation (5s segments)

Web Interface:

  • REST API: /api/messages?channel=<id>&type=text|image
  • WebSocket: real-time events (message_created, message_updated, message_deleted, attachment_uploaded)
  • Dashboard: three tabs (Text Messages, Images, Voice) with channel filtering

Recording Structure

recordings/
  ├── <user-id>/
  │   ├── <user-id>-<session-start>-0.ogg
  │   ├── <user-id>-<session-start>-0.json
  │   └── ...

messages (SQLite):
  ├── id, guild_id, channel_id, thread_id
  ├── user_id, username, avatar_url
  ├── content, edited_content
  ├── created_at, edited_at, deleted_at
  └── type (text|edited|deleted)

attachments (SQLite):
  ├── id, message_id, guild_id, channel_id, user_id
  ├── filename, size, type (MIME)
  ├── discord_url, uploaded_url (picser raw_commit)
  ├── upload_status (pending|uploaded|failed)
  └── created_at, uploaded_at

Development Commands

# Install dependencies
pnpm install

# Development (auto-restart on file changes)
pnpm run dev

# Production
pnpm run start

# Type checking
pnpm run typecheck

# Linting (Biome)
pnpm run lint

# Format code (Biome)
pnpm run format

# Run tests
pnpm run test

# Build TypeScript
pnpm run build

Configuration

All config via .env (see .env.example). Key variables:

Discord & Monitoring:

  • DISCORD_TOKEN — Bot token (required)
  • MONITOR_GUILD_ID — Target server to monitor (required for moderation)
  • GUILD_ID — Legacy voice channel guild (optional)
  • VOICE_CHANNEL_ID — Legacy voice channel ID (optional)

Recording:

  • RECORDINGS_DIR — Where to save audio files (default: ./recordings)
  • RECORDING_SEGMENT_MS — OGG segment duration (default: 5000ms)

Decoder:

  • DECODER_ROTATE_MS — Opus decoder rotation interval (default: 5000ms)
  • DECODER_COOLDOWN_MS — Cooldown after decoder error (default: 30000ms)

Attachments:

  • PICSER_UPLOAD_URL — Picser upload endpoint (default: https://picser.asepharyana.tech/api/upload)
  • ATTACHMENT_UPLOAD_TIMEOUT_MS — Upload timeout (default: 30000ms)
  • ATTACHMENT_MAX_SIZE_MB — Max file size (default: 100MB)
  • ATTACHMENT_RETRY_ATTEMPTS — Retry count (default: 3)

Web Server:

  • WEBSERVER_PORT — HTTP/WebSocket port (default: 3000)

Connection:

  • VOICE_CONNECTION_TIMEOUT_MS — Voice join timeout (default: 15000ms)
  • RECONNECT_TIMEOUT_MS — Reconnect timeout (default: 5000ms)
  • AUDIO_STREAM_SILENCE_DURATION_MS — Silence threshold (default: 3000ms)

Logging:

  • LOG_LEVEL — Pino log level (default: info)
  • VERBOSE — Enable debug logging (default: false)
  • NODE_ENV — Environment (development|production|test)

Testing

Tests use Vitest in tests/ directory. Run with pnpm run test.

Test Coverage:

  • tests/moderation/messageStore.test.ts — Message store CRUD operations
  • tests/moderation/attachmentUploader.test.ts — Picser response parsing
  • tests/config.test.ts — Configuration validation
  • tests/decoder.test.ts — Opus decoder runtime detection

Code Style

  • Formatter: Biome (2-space indent)
  • Linter: Biome with custom rules (warn on non-null assertions, noExplicitAny)
  • Language: TypeScript with strict mode
  • Logging: Use createChildLogger(context) for scoped logs
  • Errors: Throw custom AppError subclasses with code + statusCode
  • Database: Use prepared statements, never string interpolation

Key Patterns

Message Capture Lifecycle

  1. Discord event fires (messageCreate, messageUpdate, messageDelete)
  2. Check if guild matches MONITOR_GUILD_ID
  3. Extract message metadata (user, channel, content, timestamp)
  4. Insert into messages table
  5. Broadcast WebSocket event to connected clients
  6. If attachments exist:
    • Insert into attachments table with status='pending'
    • Start async upload to picser (non-blocking)
    • On success: update uploaded_url, status='uploaded'
    • On failure: store error, status='failed'

Attachment Upload Flow

  1. Download from Discord URL (with timeout)
  2. Validate file size against ATTACHMENT_MAX_SIZE_MB
  3. Upload to picser with retry logic (exponential backoff)
  4. Parse response, extract raw_commit URL
  5. Update database with uploaded_url and status
  6. Broadcast attachment_uploaded event

WebSocket Protocol

Inbound (browser → bot):

  • Binary: Raw PCM buffers (24kHz mono s16le) for voice transmission

Outbound (bot → browser):

  • Binary: 4-byte user ID hash + PCM chunk (voice)
  • JSON: { type: "user_state", users: [...] } (active speakers)
  • JSON: { type: "message_created", data: {...} } (new text message)
  • JSON: { type: "message_updated", data: {...} } (edited message)
  • JSON: { type: "message_deleted", data: {...} } (deleted message)
  • JSON: { type: "attachment_uploaded", data: {...} } (image uploaded)

Graceful Shutdown

Handles SIGINT/SIGTERM/uncaughtException/unhandledRejection:

  1. Stop voice connection
  2. Pause player
  3. Destroy Discord client
  4. Exit process

Dashboard Usage

Access: http://localhost:3000/dashboard.html

Features:

  • Three tabs: Text Messages | Images | Voice
  • Channel/thread filter dropdown
  • Real-time WebSocket updates
  • Polling fallback if WebSocket disconnects
  • Message display with metadata (author, timestamp, edits, deletions)
  • Image grid with previews and upload status
  • Voice segment list (future enhancement)

Keyboard/UI:

  • Click tab to switch content type
  • Select channel to filter
  • Click image to view full size
  • WebSocket status indicator (green = connected)

Common Tasks

Add a new config variable

  1. Add to configSchema in src/config.ts with Zod validation
  2. Add to .env.example with description
  3. Use via config.VARIABLE_NAME

Add a new REST endpoint

  1. Add route in src/webserver.ts (Express)
  2. Use database functions from src/moderation/messageStore.ts
  3. Wrap in try-catch, pass errors to Express error handler
  4. Return JSON response

Add a new WebSocket event

  1. Define broadcast function in src/webserver.ts (attach to globalThis)
  2. Call from event handler (e.g., messageCapture.ts)
  3. Send JSON with { type, data, timestamp }
  4. Handle in dashboard JavaScript

Debug message capture

  • Set VERBOSE=true in .env for detailed logging
  • Check /health endpoint for active users/connections
  • Monitor /metrics endpoint (Prometheus format)
  • Check recordings/<user-id>/ for voice segments
  • Query SQLite directly: sqlite3 .muxer-queue.db "SELECT * FROM messages LIMIT 10;"

Debug attachment uploads

  • Check upload_status in attachments table
  • View upload_error field for failure reasons
  • Monitor logs for "Attachment upload" messages
  • Verify picser endpoint is accessible
  • Check file size against ATTACHMENT_MAX_SIZE_MB

Dependencies

Core:

  • discord.js-selfbot-v13 — Discord client (selfbot variant)
  • @discordjs/voice — Voice connection management
  • @discordjs/opus — Native Opus codec (optional, required for web PCM)
  • prism-media — Audio encoding/decoding (Opus, OGG)

Web:

  • express — HTTP server
  • ws — WebSocket server
  • helmet — Security headers

Data:

  • better-sqlite3 — SQLite database
  • zod — Config validation

Logging & Monitoring:

  • pino — Structured logging
  • pino-http — HTTP request logging
  • prom-client — Prometheus metrics

Utilities:

  • p-retry — Retry logic with backoff
  • class-transformer — Object transformation
  • class-validator — Data validation

Dev:

  • Biome — Linting/formatting
  • Vitest — Testing framework
  • TypeScript — Type checking

Notes

  • Bot uses selfbot variant (user account) rather than standard bot token — check Discord ToS
  • Opus decoding requires native @discordjs/opus under Node.js
  • OGG segments include metadata JSON for each segment (user info, timestamps, duration)
  • WebSocket broadcasts PCM in real-time; browser can transmit audio back to Discord
  • Graceful shutdown ensures clean disconnection and resource cleanup
  • All database operations use prepared statements to prevent SQL injection
  • Attachment uploads are non-blocking (async) to avoid blocking message capture
  • Message capture continues even if attachment upload fails
  • Dashboard uses textContent for XSS prevention (not innerHTML)

Future Enhancements

  • Reaction tracking
  • Message search/full-text search
  • Moderation actions (flag, delete, mute)
  • Export/archive functionality
  • Retention policies (auto-delete old data)
  • Voice segment metadata in dashboard
  • User activity analytics
  • Audit log export

Architecture

High-Level Flow

  1. Bot Entry (src/index.ts) — Initializes Discord client, sets up graceful shutdown, starts webserver
  2. Voice Controller (src/voiceController.ts) — Manages guild/channel selection and connection lifecycle
  3. Recorder (src/recorder.ts) — Joins voice channel, subscribes to user audio streams, handles Opus decoding and segment rotation
  4. Web Server (src/webserver.ts) — Express + WebSocket server for:
    • REST API: guild/channel listing, connect/disconnect
    • WebSocket: real-time PCM broadcast to browser, browser-to-Discord audio transmission
  5. Muxer Queue (src/muxer-queue.ts) — SQLite-backed job queue for post-processing audio segments (future use)

Key Modules

  • Recorder subsystem (src/recorder/):

    • audioStream.ts — Subscribes to Discord audio receiver, emits Opus packets
    • decoder.ts — Opus decoder with runtime checks, cooldown/rotation logic for web PCM broadcast
    • segment.ts — Manages OGG file rotation (5s default segments per user)
    • metadata.ts — Collects user/role info, creates segment metadata JSON
  • Voice Connection — Uses @discordjs/voice receiver to subscribe to speaking users; each user gets their own stream

  • Audio Pipeline:

    • Discord → Opus packets → PacketFilter → OGG segments (disk) + OpusDecoder → PCM (web broadcast)
    • Browser → 24kHz mono PCM → upsample to 48kHz stereo → Opus encoder → OGG → Discord player
  • Metrics (src/metrics.ts) — Prometheus metrics for audio levels, recordings, connections, WebSocket clients

  • Logging (src/logger.ts) — Pino logger with pretty-print in dev, JSON in prod

  • Config (src/config.ts) — Zod-validated environment variables with sensible defaults

  • Error Handling (src/errors.ts) — Custom error classes (AppError, ConfigError, AudioError, VoiceConnectionError, ValidationError)

Recording Structure

recordings/
  ├── <user-id>/
  │   ├── <user-id>-<session-start>-0.ogg
  │   ├── <user-id>-<session-start>-0.json
  │   ├── <user-id>-<session-start>-1.ogg
  │   ├── <user-id>-<session-start>-1.json
  │   └── ...

Each segment is 5s (configurable). Metadata JSON includes user info, roles, timestamps, duration.

Database

  • Muxer Queue (.muxer-queue.db) — SQLite with WAL mode, tracks pending/processing/completed/failed jobs for audio post-processing

Development Commands

# Install dependencies
pnpm install

# Development (auto-restart on file changes)
pnpm run dev

# Production
pnpm run start

# Type checking
pnpm run typecheck

# Linting (Biome)
pnpm run lint

# Format code (Biome)
pnpm run format

# Run tests
pnpm run test

# Build TypeScript
pnpm run build

Configuration

All config via .env (see .env.example). Key variables:

  • DISCORD_TOKEN — Bot token (required)
  • RECORDINGS_DIR — Where to save audio files (default: ./recordings)
  • RECORDING_SEGMENT_MS — OGG segment duration (default: 5000ms)
  • DECODER_ROTATE_MS — Opus decoder rotation interval (default: 5000ms)
  • DECODER_COOLDOWN_MS — Cooldown after decoder error (default: 30000ms)
  • WEBSERVER_PORT — HTTP/WebSocket port (default: 3000)
  • VOICE_CONNECTION_TIMEOUT_MS — Voice join timeout (default: 15000ms)
  • AUDIO_STREAM_SILENCE_DURATION_MS — Silence threshold before ending stream (default: 3000ms)
  • LOG_LEVEL — Pino log level (default: info)
  • VERBOSE — Enable debug logging (default: false)

Testing

Tests use Vitest in tests/ directory. Run with pnpm run test.

Example: tests/decoder.test.ts tests Opus decoder runtime detection and native opus availability.

Code Style

  • Formatter: Biome (2-space indent)
  • Linter: Biome with custom rules (warn on non-null assertions, noExplicitAny)
  • Language: TypeScript with strict mode
  • Logging: Use createChildLogger(context) for scoped logs
  • Errors: Throw custom AppError subclasses with code + statusCode

Key Patterns

Voice Connection Lifecycle

  1. VoiceController.connect(guildId, channelId) → calls startRecording()
  2. startRecording() joins channel, sets up receiver, subscribes to speaking users
  3. On user speak: create stream, segment manager, decoder; pipe to OGG + web broadcast
  4. On silence (3s): close stream, save metadata JSON
  5. VoiceController.disconnect() → calls stopRecording() → destroys connection

Audio Decoding (Web Broadcast)

  • OpusDecoder wraps prism decoder with error recovery
  • Rotates decoder every 5s to prevent memory leaks
  • Cools down for 30s after error before retrying
  • Downsamples 48kHz stereo → 24kHz mono for web transmission

WebSocket Protocol

  • Inbound (browser → bot): Raw PCM buffers (24kHz mono s16le)
  • Outbound (bot → browser):
    • Binary: 4-byte user ID hash + PCM chunk
    • JSON: { type: "user_state", users: [...] } on connect/user activity change

Graceful Shutdown

Handles SIGINT/SIGTERM/uncaughtException/unhandledRejection:

  1. Stop voice connection
  2. Pause player
  3. Destroy Discord client
  4. Exit process

Future Expansion (Text/Image Monitoring)

Current scope: voice only. Planned additions:

  • Text channel message capture
  • Image/attachment logging
  • Per-channel/per-user filtering
  • Moderation action triggers

These will likely require:

  • Additional event listeners in recorder
  • Extended metadata schema
  • New storage/indexing strategy
  • Webhook/alert system

Common Tasks

Add a new config variable

  1. Add to configSchema in src/config.ts with Zod validation
  2. Add to .env.example
  3. Use via config.VARIABLE_NAME

Add a new REST endpoint

  1. Add route in src/webserver.ts (Express)
  2. Use VoiceController methods or create new ones
  3. Wrap in try-catch, pass errors to Express error handler

Add metrics

  1. Define gauge/counter/histogram in src/metrics.ts
  2. Update in relevant code paths
  3. Metrics exposed at /metrics endpoint (Prometheus format)

Debug audio issues

  • Set VERBOSE=true in .env for detailed logging
  • Check /health endpoint for active users/connections
  • Monitor audio levels via /metrics (audio_level_db gauge)
  • Check segment files in recordings/<user-id>/ directory

Dependencies

  • discord.js-selfbot-v13 — Discord client (selfbot variant for user account access)
  • @discordjs/voice — Voice connection management
  • @discordjs/opus — Native Opus codec (optional, required for web PCM decode)
  • prism-media — Audio encoding/decoding (Opus, OGG)
  • express — HTTP server
  • ws — WebSocket server
  • better-sqlite3 — SQLite database (muxer queue)
  • pino — Structured logging
  • prom-client — Prometheus metrics
  • zod — Config validation
  • Biome — Linting/formatting
  • Vitest — Testing framework

Notes

  • Bot uses selfbot variant (user account) rather than standard bot token — check Discord ToS
  • Opus decoding requires native @discordjs/opus under Node.js
  • OGG segments include metadata JSON for each segment (user info, timestamps, duration)
  • WebSocket broadcasts PCM in real-time; browser can transmit audio back to Discord
  • Graceful shutdown ensures clean disconnection and resource cleanup