Audio stream multiplexer that routes mic and system audio to separate transcription backends, merging responses with speaker attribution.
| Endpoint | Description |
|---|---|
| /ws | WebSocket connection endpoint |
| /health | Health check |
Send audio as binary WebSocket frames, each prefixed with a one-byte type marker:
| Prefix | Type | Speaker | Backend |
|---|---|---|---|
| 0x01 | Microphone | you | MIC_BACKEND_URL |
| 0x02 | System Audio | other | SYSTEM_BACKEND_URL |
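
The framing in the table above can be sketched as a small helper. This is a minimal illustration, not the project's client code: the names `MIC_PREFIX`, `SYSTEM_PREFIX`, and `frame_audio` are hypothetical, and it assumes the audio payload follows the prefix byte directly with no further header. The audio encoding itself is not specified here, so the chunk is treated as opaque bytes.

```python
# Prefix values taken from the table above; constant names are illustrative.
MIC_PREFIX = 0x01     # microphone audio, attributed to speaker "you"
SYSTEM_PREFIX = 0x02  # system audio, attributed to speaker "other"


def frame_audio(prefix: int, chunk: bytes) -> bytes:
    """Return one binary frame: the type byte followed by the audio chunk.

    Assumption: the payload is appended immediately after the prefix byte.
    """
    if prefix not in (MIC_PREFIX, SYSTEM_PREFIX):
        raise ValueError(f"unknown stream prefix: {prefix:#04x}")
    return bytes([prefix]) + chunk


# Example: wrap a (dummy) two-byte chunk from the microphone stream.
mic_frame = frame_audio(MIC_PREFIX, b"\x10\x20")
```

The resulting bytes would then be sent as a single binary WebSocket message with whichever client library you use.
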
If the API_KEY environment variable is set on the server, authenticate with the api_key query parameter:
ws://host:port/ws?api_key=YOUR_KEY
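
As a hedged sketch, the connection URL could be assembled like this so that keys containing reserved characters are percent-encoded. The function name `build_ws_url` and the example host and port are assumptions for illustration only.

```python
from typing import Optional
from urllib.parse import quote, urlencode


def build_ws_url(host: str, port: int, api_key: Optional[str] = None) -> str:
    """Build the /ws URL, appending api_key only when a key is supplied."""
    base = f"ws://{host}:{port}/ws"
    if api_key is None:
        return base
    # quote_via=quote encodes spaces as %20 rather than '+'.
    query = urlencode({"api_key": api_key}, quote_via=quote)
    return f"{base}?{query}"


# Example with a hypothetical local server.
url = build_ws_url("localhost", 8080, "YOUR_KEY")
```
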
Transcription responses:
{
"id": "uuid",
"text": "transcribed text",
"final": false,
"rev": 2,
"lang": "en",
"speaker": "you" | "other" | "person:N"
}
Status messages:
{
"id": "uuid",
"type": "connected" | "error" | "disconnected",
"message": "description"
}
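
A client needs to tell the two message shapes apart. The sketch below assumes that the presence of a "type" field marks a status message and that a higher "rev" supersedes earlier revisions of the same "id"; neither rule is stated above, so treat both as assumptions. The names `Transcript`, `Status`, `parse_message`, and `apply_transcript` are hypothetical.

```python
import json
from dataclasses import dataclass
from typing import Dict, Union


@dataclass
class Transcript:
    id: str
    text: str
    final: bool
    rev: int
    lang: str
    speaker: str  # "you", "other", or "person:N"


@dataclass
class Status:
    id: str
    type: str  # "connected", "error", or "disconnected"
    message: str


def parse_message(raw: str) -> Union[Transcript, Status]:
    """Decode one text message into a Transcript or a Status."""
    data = json.loads(raw)
    if "type" in data:  # assumption: only status messages carry "type"
        return Status(data["id"], data["type"], data["message"])
    return Transcript(
        data["id"], data["text"], data["final"],
        data["rev"], data["lang"], data["speaker"],
    )


def apply_transcript(latest: Dict[str, Transcript], msg: Transcript) -> None:
    """Keep the newest revision per id (assumed meaning of "rev")."""
    current = latest.get(msg.id)
    if current is None or msg.rev >= current.rev:
        latest[msg.id] = msg
```

Keying the latest revision by "id" lets a UI replace interim text in place until a message with "final": true arrives.
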