# Media Music Phase 1 Design

## Goal

Add a first media playback phase focused on music playback: users can queue, play, skip, and stop audio sources from the dashboard, while the existing Discord voice recorder, browser microphone transmit, and moderation capture flows keep working unchanged.

## Scope

Phase 1 implements audio-only playback and queue control. Screen share/video streaming is intentionally reserved for phase 2, but the controller shape should leave room for a later `screen` mode using the already-vendored `@dank074/discord-video-stream` APIs seen in `MythEclipse/StreamBot`.

## Recommended Architecture

Create a small media subsystem under `src/media/`:

- `mediaTypes.ts` defines `MediaMode`, `MediaQueueItem`, `MediaState`, and request/response types.
- `mediaQueue.ts` owns in-memory queue operations: add, current, next, remove current, clear, and snapshot.
- `mediaResolver.ts` resolves the initially supported sources. Phase 1 should support direct HTTP(S) URLs and local file paths. YouTube/search can be added later because it requires adding or wrapping yt-dlp behavior.
- `musicPlayer.ts` converts a media source to Ogg Opus using ffmpeg and feeds the existing `discordPlayer.playStream()`.
- `mediaController.ts` coordinates queue state, voice connection assumptions, play/skip/stop, and WebSocket broadcast state.

The existing `VoiceController` remains the owner of joining/leaving voice channels. Phase 1 does not create a second voice connection path. Music playback requires the bot to already be connected through the existing voice UI or `/api/connect`; otherwise the media route returns `409 VOICE_NOT_CONNECTED`.
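
As a rough illustration of `mediaTypes.ts` and `mediaQueue.ts`, the sketch below shows one possible shape. Only the type names and the operation names come from this design; the fields (`id`, `kind`, `mode`) and the choice to keep the current item at the head of the queue are assumptions for illustration.

```typescript
// Hypothetical shapes: field names are assumptions, not the final contract.
export type MediaMode = "music"; // phase 2 may widen this to include "screen"

export interface MediaQueueItem {
  id: number;
  source: string;
  title: string;
  kind: "url" | "file";
  mode: MediaMode;
}

export interface MediaState {
  playing: boolean;
  current: MediaQueueItem | null;
  queue: MediaQueueItem[];
}

// In-memory queue; the head of the list is treated as the current item.
export class MediaQueue {
  private items: MediaQueueItem[] = [];
  private nextId = 1;

  add(input: Omit<MediaQueueItem, "id">): MediaQueueItem {
    const item: MediaQueueItem = { ...input, id: this.nextId++ };
    this.items.push(item);
    return item;
  }

  current(): MediaQueueItem | null {
    return this.items[0] ?? null;
  }

  next(): MediaQueueItem | null {
    return this.items[1] ?? null;
  }

  removeCurrent(): MediaQueueItem | null {
    return this.items.shift() ?? null;
  }

  clear(): void {
    this.items = [];
  }

  // Returns copies so callers cannot mutate queue internals.
  snapshot(): MediaQueueItem[] {
    return this.items.map((item) => ({ ...item }));
  }
}
```

Keeping the queue free of playback concerns means it can be unit tested without Discord or ffmpeg.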

## Data Flow

1. Browser submits a source to `/api/media/queue` with `{ source }`.
2. `mediaResolver` validates and resolves the source into `{ source, title, kind }`.
3. `mediaQueue` appends a `MediaQueueItem`.
4. If no item is playing, `mediaController` starts playback of the current queue item.
5. `musicPlayer` spawns ffmpeg and outputs Ogg Opus to `discordPlayer.playStream()`.
6. When playback finishes, the controller removes the completed item and starts the next item.
7. State changes are broadcast over the existing moderation broadcaster as a JSON WebSocket event, or via a small media broadcaster wrapper if that keeps types cleaner.
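
Steps 3 to 7 can be sketched as a small drain loop. This is a minimal illustration under stated assumptions: `PlayFn` and `BroadcastFn` are hypothetical injected collaborators standing in for `musicPlayer` and the broadcaster, and the class name is made up for this example.

```typescript
interface Item { source: string; title: string }
interface State { playing: boolean; current: Item | null; queue: Item[] }

// Hypothetical collaborators, injected so the flow runs without Discord or ffmpeg.
type PlayFn = (item: Item) => Promise<void>; // resolves when playback finishes
type BroadcastFn = (state: State) => void;

export class MediaControllerSketch {
  private items: Item[] = [];
  private playing = false;
  private readonly play: PlayFn;
  private readonly broadcast: BroadcastFn;

  constructor(play: PlayFn, broadcast: BroadcastFn) {
    this.play = play;
    this.broadcast = broadcast;
  }

  state(): State {
    return { playing: this.playing, current: this.items[0] ?? null, queue: [...this.items] };
  }

  // Steps 3-4: append, then start playback only if nothing is playing.
  enqueue(item: Item): State {
    this.items.push(item);
    this.broadcast(this.state());
    if (!this.playing) void this.drain();
    return this.state();
  }

  // Steps 5-6: play the head item, remove it when done, continue with the next.
  private async drain(): Promise<void> {
    this.playing = true;
    while (this.items.length > 0) {
      this.broadcast(this.state());
      try {
        await this.play(this.items[0]!);
      } catch {
        // A failed item is dropped below so the queue still advances.
      }
      this.items.shift();
    }
    this.playing = false;
    this.broadcast(this.state()); // step 7: final idle state
  }
}
```

Because the head item stays in the list until playback ends, the reported state always shows what is currently playing.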

## API Design

Add `src/routes/mediaRoutes.ts` mounted under `/api`:

- `GET /api/media/status` returns `{ playing, current, queue }`.
- `POST /api/media/queue` accepts `{ source: string }`, queues it, and returns the updated state.
- `POST /api/media/skip` skips the current item and starts the next if present.
- `POST /api/media/stop` stops playback and clears the queue.

All routes should use `AppError` for boundary validation. An empty source returns `400 MISSING_MEDIA_SOURCE`. A missing voice connection returns `409 VOICE_NOT_CONNECTED`.
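
The boundary checks above might look like the following framework-agnostic sketch. The `AppError` class here is a hypothetical stand-in, since the project's real `AppError` signature is not shown in this document; `parseQueueBody` and `assertVoiceConnected` are illustrative helper names.

```typescript
// Hypothetical stand-in for the project's AppError; the real class may differ.
export class AppError extends Error {
  readonly status: number;
  readonly code: string;

  constructor(status: number, code: string, message: string) {
    super(message);
    this.status = status;
    this.code = code;
  }
}

// Validates the POST /api/media/queue payload at the route boundary.
export function parseQueueBody(body: unknown): { source: string } {
  const source =
    typeof body === "object" && body !== null
      ? (body as Record<string, unknown>).source
      : undefined;
  if (typeof source !== "string" || source.trim() === "") {
    throw new AppError(400, "MISSING_MEDIA_SOURCE", "source is required");
  }
  return { source: source.trim() };
}

// Guards playback routes; the caller supplies the current connection flag.
export function assertVoiceConnected(isConnected: boolean): void {
  if (!isConnected) {
    throw new AppError(409, "VOICE_NOT_CONNECTED", "Connect to a voice channel first");
  }
}
```

Keeping validation in plain functions lets the route tests check status codes without starting an HTTP server.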

## Dashboard Design

Add a compact Media card to the existing voice tab for phase 1:

- Source input: URL or local path.
- Buttons: Queue/Play, Skip, Stop.
- Current item label and queue list.

Do not add a separate full media tab yet. The voice tab already owns voice channel selection and connection state, so colocating music controls there reduces user confusion.

## Playback Details

Use ffmpeg directly or the existing `src/audio/ffmpegProcess.ts` helper if it already fits. The target stream should be Ogg Opus because `DiscordPlayer.playStream()` currently expects `StreamType.OggOpus`.

Recommended ffmpeg output shape:

- Input: local file or HTTP(S) URL.
- Output format: `ogg`.
- Audio codec: `libopus`.
- Sample rate: `48000`.
- Channels: `2`.

The controller owns an `AbortController` or child process handle so skip/stop can terminate ffmpeg. Stop must also call `discordPlayer.stop()` so the audio player releases the current resource.
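
One way to express the output shape and the abort handle, assuming ffmpeg is spawned directly rather than through `src/audio/ffmpegProcess.ts`. The function names `buildFfmpegArgs` and `spawnOggOpus` are illustrative, and `ffmpeg` is assumed to be on `PATH`.

```typescript
import { spawn } from "node:child_process";
import type { Readable } from "node:stream";

// Builds ffmpeg arguments matching the recommended output shape.
export function buildFfmpegArgs(source: string): string[] {
  return [
    "-hide_banner",
    "-loglevel", "error",
    "-i", source,       // local file or HTTP(S) URL
    "-vn",              // ignore any video stream
    "-c:a", "libopus",
    "-ar", "48000",
    "-ac", "2",
    "-f", "ogg",
    "pipe:1",           // write the Ogg Opus stream to stdout
  ];
}

// Spawns ffmpeg and returns its stdout. Aborting the signal terminates the
// child process, which is how skip/stop would end the transcode.
export function spawnOggOpus(
  source: string,
  signal: AbortSignal,
  onError: (error: Error) => void,
): Readable {
  const child = spawn("ffmpeg", buildFfmpegArgs(source), {
    signal,
    stdio: ["ignore", "pipe", "pipe"],
  });
  // Fires on spawn failure and on abort; the controller decides how to advance.
  child.on("error", onError);
  return child.stdout;
}
```

The returned stream would then be handed to `discordPlayer.playStream()`; that call is left out because its exact signature is not shown in this document.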

## Concurrency Rules

- Only one media item plays at a time.
- Browser microphone transmit and music playback both use `discordPlayer`; phase 1 should disable music start while `isStreaming` is true, or stop browser transmit before playback. Prefer returning `409 BROWSER_STREAM_ACTIVE` to avoid surprising the user.
- Voice recording can continue while music plays because recording uses the receiver pipeline and music uses the player pipeline.
- Skip is serialized: concurrent skip calls should return the same resulting state or reject with `409 MEDIA_SKIP_IN_PROGRESS`.
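
The skip rule could be implemented by sharing a single in-flight promise, which is the "return the same resulting state" option rather than the `409` option. `SkipGate` and its `doSkip` callback are hypothetical names for this sketch.

```typescript
interface State { playing: boolean; current: string | null; queue: string[] }

// Serializes skip: callers that arrive while a skip is running share its result.
export class SkipGate {
  private inFlight: Promise<State> | null = null;
  private readonly doSkip: () => Promise<State>;

  constructor(doSkip: () => Promise<State>) {
    this.doSkip = doSkip;
  }

  skip(): Promise<State> {
    if (this.inFlight === null) {
      // Clear the slot once the skip settles, whether it succeeded or failed.
      this.inFlight = this.doSkip().finally(() => {
        this.inFlight = null;
      });
    }
    return this.inFlight;
  }
}
```

With this shape, a double click on Skip advances the queue once instead of twice.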

## Error Handling

- Unsupported source format: `400 UNSUPPORTED_MEDIA_SOURCE`.
- ffmpeg spawn failure: the current item is marked failed, and playback advances to the next queued item if present.
- ffmpeg runtime failure: log a stderr summary, mark the item failed, and advance the queue.
- Stop is idempotent: stopping while idle returns the current idle state.
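
The unsupported-source rule would most naturally live in `mediaResolver`. The sketch below is one possible classification under assumptions: `MediaSourceError` stands in for the project's `AppError`, the title derivation is invented for illustration, and `fileExists` is injectable so the logic can be tested without touching the disk.

```typescript
import { statSync } from "node:fs";
import { basename } from "node:path";

export interface ResolvedMedia { source: string; title: string; kind: "url" | "file" }

// Hypothetical error type standing in for the project's AppError.
export class MediaSourceError extends Error {
  readonly status = 400;
  readonly code: string;

  constructor(code: string, message: string) {
    super(message);
    this.code = code;
  }
}

function isExistingFile(path: string): boolean {
  try {
    return statSync(path).isFile();
  } catch {
    return false;
  }
}

export function resolveMediaSource(
  raw: string,
  fileExists: (path: string) => boolean = isExistingFile,
): ResolvedMedia {
  const source = raw.trim();
  if (source === "") {
    throw new MediaSourceError("MISSING_MEDIA_SOURCE", "source is required");
  }

  let url: URL | null = null;
  try {
    url = new URL(source);
  } catch {
    url = null; // not an absolute URL; fall through to the file check
  }

  if (url !== null && (url.protocol === "http:" || url.protocol === "https:")) {
    const lastSegment = url.pathname.split("/").filter(Boolean).pop();
    return { source: url.href, title: lastSegment ?? url.hostname, kind: "url" };
  }
  if (fileExists(source)) {
    return { source, title: basename(source), kind: "file" };
  }
  throw new MediaSourceError("UNSUPPORTED_MEDIA_SOURCE", `Cannot play: ${source}`);
}
```

Checking the protocol before the file system keeps Windows paths such as `C:\music\a.mp3`, which parse as URLs with a `c:` scheme, on the file branch.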

## Tests

Unit tests should cover:

- Queue add/next/remove/clear behavior.
- Resolver accepts HTTP(S) URLs and existing local paths, rejects empty/unsupported input.
- Controller rejects playback when voice is not connected.
- Controller starts next item after completion.
- Skip aborts current playback and advances queue.
- Routes validate payloads and call controller methods.

Manual verification should cover:

- Connect to a voice channel, queue a short audio URL or local file, hear playback in Discord.
- Queue two items, confirm automatic advance.
- Skip moves to the next item.
- Stop clears playback and queue.
- Existing voice recording and text moderation still work after media playback.

## Phase 2 Compatibility

Phase 2 can add `MediaMode = "screen"` and a `screenSharePlayer.ts` using StreamBot's pattern:

- `new Streamer(client)`
- `streamer.joinVoice(guildId, channelId)` only if phase 2 decides to own its own connection path
- `prepareStream(source, videoOptions, signal)`
- `playStream(output, streamer, { type: "go-live" }, signal)`

Phase 1 should not instantiate `Streamer`; it should only reserve type and controller seams so adding screen share later does not rewrite queue/status APIs.
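
A possible shape for that seam, shown without any dependency on the video library. `MediaPlayerBackend` and `PlayerRegistry` are hypothetical names; phase 1 would register only a `"music"` backend.

```typescript
// "screen" is listed only to reserve the seam; phase 1 registers "music" alone.
export type MediaMode = "music" | "screen";

// Hypothetical backend contract the controller would depend on.
export interface MediaPlayerBackend {
  play(source: string, signal: AbortSignal): Promise<void>;
  stop(): void;
}

export class PlayerRegistry {
  private readonly backends = new Map<MediaMode, MediaPlayerBackend>();

  register(mode: MediaMode, backend: MediaPlayerBackend): void {
    this.backends.set(mode, backend);
  }

  get(mode: MediaMode): MediaPlayerBackend {
    const backend = this.backends.get(mode);
    if (backend === undefined) {
      throw new Error(`No player registered for mode "${mode}"`);
    }
    return backend;
  }
}
```

If the controller looks up its player by mode, adding screen share later becomes a new registration rather than a change to the queue or status APIs.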