VoicePulse (hosted publicly at actualvoice.ai) combines batch + streaming components to process thousands of voice clips per minute:
- Ingestion: clients upload audio straight to Supabase Storage via pre-signed URLs, an Edge Function enqueues each clip, and worker pods normalize the audio (see the upload sketch after this list).
- Analysis: Whisper.cpp for transcription, followed by LLM classification (sentiment, intent, compliance) with automatic retries when rate limits are hit (see the retry sketch below).
- Scalability: queue-depth-driven autoscaling plus tenant-aware rate limiting to keep OpenAI quotas under control (see the token-bucket sketch below).
- Observability: Grafana dashboards track processing latency, token burn, and failure buckets, triggering red/amber alerts (see the metrics sketch below).
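A minimal sketch of the ingestion hand-off, written as a Supabase Edge Function. The bucket name (`clips`), queue table (`ingest_queue`), and request shape are illustrative assumptions, not VoicePulse's actual schema:

```typescript
// Hypothetical Edge Function: hands the client a pre-signed upload URL and
// queues the clip for normalization. Bucket ("clips"), table ("ingest_queue"),
// and env var names are assumptions for illustration.
import { createClient } from "npm:@supabase/supabase-js@2";

Deno.serve(async (req) => {
  const { tenantId, fileName } = await req.json();
  const supabase = createClient(
    Deno.env.get("SUPABASE_URL")!,
    Deno.env.get("SUPABASE_SERVICE_ROLE_KEY")!,
  );

  // Pre-signed upload: the client PUTs the audio directly to Storage,
  // so this function never proxies the file body.
  const path = `${tenantId}/${crypto.randomUUID()}-${fileName}`;
  const { data, error } = await supabase.storage
    .from("clips")
    .createSignedUploadUrl(path);
  if (error) return new Response(error.message, { status: 500 });

  // Enqueue for the worker pods that normalize the audio.
  await supabase.from("ingest_queue").insert({ tenant_id: tenantId, path });

  return Response.json({ uploadUrl: data.signedUrl, token: data.token, path });
});
```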
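The retry logic around the classification call looks roughly like this. The endpoint is the standard OpenAI chat completions API; the model choice, prompt, and backoff constants are illustrative assumptions:

```typescript
// A minimal retry sketch for the LLM classification step. Retries on 429s
// (rate limits) and 5xx responses with exponential backoff and jitter.
async function classifyTranscript(transcript: string, apiKey: string): Promise<string> {
  const maxAttempts = 5;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const res = await fetch("https://api.openai.com/v1/chat/completions", {
      method: "POST",
      headers: {
        Authorization: `Bearer ${apiKey}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({
        model: "gpt-4o-mini", // illustrative model choice
        messages: [
          { role: "system", content: "Label sentiment, intent, and compliance risk." },
          { role: "user", content: transcript },
        ],
      }),
    });

    if (res.status === 429 || res.status >= 500) {
      // Honor Retry-After if the server sends it; otherwise back off
      // exponentially with a little jitter to avoid thundering herds.
      const retryAfter = Number(res.headers.get("retry-after"));
      const delayMs = retryAfter > 0
        ? retryAfter * 1000
        : Math.min(30_000, 500 * 2 ** attempt) + Math.random() * 250;
      await new Promise((resolve) => setTimeout(resolve, delayMs));
      continue;
    }

    if (!res.ok) throw new Error(`classification failed: HTTP ${res.status}`);
    const body = await res.json();
    return body.choices[0].message.content;
  }
  throw new Error("still rate-limited after all retry attempts");
}
```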
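Tenant-aware rate limiting can be as simple as a per-tenant token bucket in front of the LLM calls. A minimal in-memory sketch, with illustrative capacity and refill numbers; a multi-pod deployment would keep this state somewhere shared (e.g. Redis) rather than in process memory:

```typescript
// Per-tenant token bucket: each tenant can burst up to CAPACITY requests,
// then is throttled to REFILL_PER_SEC requests per second.
interface Bucket {
  tokens: number;
  lastRefill: number; // epoch ms
}

const CAPACITY = 60;      // max burst per tenant (assumption)
const REFILL_PER_SEC = 1; // steady-state rate per tenant (assumption)
const buckets = new Map<string, Bucket>();

function tryAcquire(tenantId: string): boolean {
  const now = Date.now();
  const bucket = buckets.get(tenantId) ?? { tokens: CAPACITY, lastRefill: now };

  // Refill proportionally to elapsed time, capped at capacity.
  const elapsedSec = (now - bucket.lastRefill) / 1000;
  bucket.tokens = Math.min(CAPACITY, bucket.tokens + elapsedSec * REFILL_PER_SEC);
  bucket.lastRefill = now;

  if (bucket.tokens < 1) {
    buckets.set(tenantId, bucket);
    return false; // caller requeues or delays the job instead of dropping it
  }
  bucket.tokens -= 1;
  buckets.set(tenantId, bucket);
  return true;
}
```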
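For the dashboards, workers expose Prometheus metrics that Grafana reads. A minimal sketch using prom-client; the metric names, label sets, and histogram buckets are assumptions for illustration:

```typescript
// Hypothetical worker instrumentation: latency histogram + token counter,
// scraped by Prometheus and visualized in Grafana.
import client from "prom-client";

const processingLatency = new client.Histogram({
  name: "voicepulse_clip_processing_seconds", // assumed metric name
  help: "End-to-end clip processing latency",
  labelNames: ["stage"],
  buckets: [0.5, 1, 2, 5, 10, 30, 60],
});

const tokenBurn = new client.Counter({
  name: "voicepulse_llm_tokens_total", // assumed metric name
  help: "LLM tokens consumed",
  labelNames: ["tenant", "model"],
});

// Wrap any pipeline stage to record its duration under a stage label.
async function timed<T>(stage: string, fn: () => Promise<T>): Promise<T> {
  const end = processingLatency.startTimer({ stage });
  try {
    return await fn();
  } finally {
    end(); // records elapsed seconds into the histogram
  }
}

// Usage: await timed("transcribe", () => runWhisper(clip));
//        tokenBurn.inc({ tenant: tenantId, model: "gpt-4o-mini" }, usage.total_tokens);
```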
I use VoicePulse as a playground for benchmarking vendor LLMs and experimenting with hybrid edge/cloud architectures.