- Add activeRequests counter to track in-flight AI requests
- Limit concurrent requests to 1 (MAX_CONCURRENT_REQUESTS)
- drainQueue now waits while at max concurrency before processing the next message
- Prevents overwhelming the streaming LLM API with multiple concurrent requests
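
For reference, the gating described above could look roughly like the sketch below. It is a minimal illustration, not the code in this commit: only activeRequests, MAX_CONCURRENT_REQUESTS, and drainQueue come from the change itself, while MessageQueue, SendFn, enqueue, pendingCount, and the send callback are hypothetical names. It also assumes "waiting" means drainQueue returns while at the limit and is re-invoked when a request settles.

```typescript
// Illustrative sketch only. MessageQueue, SendFn, enqueue, pendingCount and
// `send` are assumed names; the real implementation may differ.
const MAX_CONCURRENT_REQUESTS = 1;

// Stand-in for the streaming LLM call; assumed to return a promise.
type SendFn = (message: string) => Promise<void>;

class MessageQueue {
  private pending: string[] = [];
  private send: SendFn;
  activeRequests = 0; // number of in-flight requests

  constructor(send: SendFn) {
    this.send = send;
  }

  get pendingCount(): number {
    return this.pending.length;
  }

  enqueue(message: string): void {
    this.pending.push(message);
    this.drainQueue();
  }

  // Starts queued messages only while below the concurrency limit.
  // At the limit it simply returns; the completion callback below
  // calls drainQueue again, which is how the "wait" is modeled here.
  private drainQueue(): void {
    while (
      this.activeRequests < MAX_CONCURRENT_REQUESTS &&
      this.pending.length > 0
    ) {
      const message = this.pending.shift()!;
      this.activeRequests++;
      const done = (): void => {
        this.activeRequests--;
        this.drainQueue();
      };
      // Release the slot on success and on failure alike.
      // Error reporting is out of scope for this sketch.
      this.send(message).then(done, done);
    }
  }
}
```

With a limit of 1, enqueueing several messages in a row starts only the first; each later message starts once the previous request settles, so streamed responses never overlap.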
This resolves:
- LLM API overload caused by concurrent requests
- Streaming response conflicts
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>