ai-api
AssemblyAI
Best-in-class speech-to-text API — Universal models, 99 languages, low-latency streaming
Speech-to-text (STT) API. Models: Universal-3 Pro (multilingual, most accurate tier), Universal-2 (99 languages), Universal-Streaming, and Whisper Streaming. Built-in diarization, PII redaction, and medical mode.
Pricing
| Tier | Price | Notes |
|---|---|---|
| Free Credits | Free | $50 in free credits on signup. Full API access. |
| Pay-as-you-go | Usage-based, from $0.15/hr | Per-hour billing by model. No minimum. |
| Enterprise | Custom | Custom contracts. SLA, private deployments, BAA. |
Limits
| Metric | Value | Notes |
|---|---|---|
| Diarization add-on | $0.02-$0.12/hr | Speaker diarization |
| Keyterms add-on | $0.04-$0.05/hr | Keyterm prompting |
| Languages supported | 99 (Universal-2) / 6 (Universal-3 Pro Streaming) | Language coverage |
| Medical mode | $0.15/hr add-on | Medical mode |
| Multilingual streaming rate | $0.15/hr (6 languages realtime) | Universal-Streaming ML |
| Universal-2 rate | $0.15/hr (99 languages, general purpose) | Universal-2 |
| Universal-3 Pro rate | $0.21/hr (most accurate, multilingual) | Universal-3 Pro |
| Universal-3 Pro Streaming rate | $0.45/hr (premium realtime) | Universal-3 Pro Streaming |
| Universal-Streaming rate | $0.15/hr (English realtime) | Universal-Streaming English |
| Whisper Streaming rate | $0.30/hr (OSS Whisper on AAI infra) | Whisper Streaming |
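To make the per-hour rates above concrete, here is a minimal cost-estimate sketch. It assumes that add-ons are billed per audio hour on top of the base model rate, which the table suggests but does not state outright. The dictionary keys are hypothetical labels chosen for this example, not API model identifiers. Because the diarization and keyterms add-ons are listed as ranges, the caller passes the add-on rate explicitly.

```python
from decimal import Decimal

# Base rates in USD per audio hour, copied from the rates listed above.
# Keys are illustrative labels, not official model identifiers.
BASE_RATES = {
    "universal-2": Decimal("0.15"),
    "universal-3-pro": Decimal("0.21"),
    "universal-streaming-english": Decimal("0.15"),
    "universal-streaming-multilingual": Decimal("0.15"),
    "universal-3-pro-streaming": Decimal("0.45"),
    "whisper-streaming": Decimal("0.30"),
}


def estimate_cost(model, audio_hours, addon_rates=()):
    """Estimate USD cost as (base rate + sum of add-on rates) * hours.

    Assumes add-ons are additive per-hour charges; rates and hours are
    passed as strings so Decimal avoids float rounding surprises.
    """
    hourly = BASE_RATES[model] + sum(
        (Decimal(rate) for rate in addon_rates), Decimal("0")
    )
    return (hourly * Decimal(audio_hours)).quantize(Decimal("0.01"))


if __name__ == "__main__":
    # 100 hours on Universal-2 plus medical mode ($0.15/hr add-on).
    print(estimate_cost("universal-2", "100", ["0.15"]))  # 30.00
```

Under these assumptions the $50 signup credit would cover roughly 333 hours of Universal-2 audio without add-ons (50 / 0.15). Reconfirm current rates on the vendor site before budgeting.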
Features
- Advanced Prompting — Streaming with disfluency + code-switching + realtime diarization.
- Audio Intelligence — Sentiment, topic detection, summarization, entity detection, content safety, IAB classification.
- Auto Punctuation — Smart capitalization + punctuation.
- Keyterm Prompting — Boost accuracy for domain vocabulary.
- LeMUR (LLM framework) — Run LLMs over transcripts: Q&A, summary, action items.
- Medical Mode — Specialized for clinical + medical vocabulary.
- PII Redaction — Auto-redact credit cards, SSNs, addresses, emails.
- Pre-recorded Transcription — Upload audio/video URL or file → transcript.
- Realtime Streaming — WebSocket-based low-latency STT.
- Speaker Diarization — Identify who spoke when.
- Webhooks — Auto-notify when transcription finishes.
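As a rough illustration of how several of these features combine in one pre-recorded transcription job, the sketch below builds (but does not send) a request with diarization, PII redaction, and a webhook enabled. The endpoint URL, the `authorization` header, and the field names `audio_url`, `speaker_labels`, `redact_pii`, `redact_pii_policies`, and `webhook_url` reflect the v2 REST API as commonly documented; treat them as assumptions and check the official API reference. `build_transcript_request` is a hypothetical helper, not part of any SDK.

```python
import json
import urllib.request

# Assumed v2 endpoint for creating a transcript; verify against the vendor docs.
API_URL = "https://api.assemblyai.com/v2/transcript"


def build_transcript_request(api_key, audio_url, webhook_url=None,
                             speaker_labels=False, redact_pii_policies=None):
    """Build a POST request for a pre-recorded transcription job.

    Nothing is sent here; the caller decides when to call urlopen().
    """
    payload = {"audio_url": audio_url}
    if speaker_labels:
        payload["speaker_labels"] = True  # speaker diarization
    if redact_pii_policies:
        payload["redact_pii"] = True
        payload["redact_pii_policies"] = list(redact_pii_policies)
    if webhook_url:
        payload["webhook_url"] = webhook_url  # notified when the job finishes
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Authorization": api_key, "Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    req = build_transcript_request(
        api_key="YOUR_API_KEY",                      # placeholder
        audio_url="https://example.com/call.mp3",    # placeholder
        webhook_url="https://example.com/hooks/stt",  # placeholder
        speaker_labels=True,
        redact_pii_policies=["credit_card_number", "us_social_security_number"],
    )
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
```

Passing the returned object to `urllib.request.urlopen` would submit the job. Keeping construction separate from sending makes the payload easy to inspect and test without network access or a real key.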
Developer interfaces
| Slug | Name | Kind | Version |
|---|---|---|---|
| sdk-go | assemblyai-go | sdk | 1.x |
| sdk-node | assemblyai (Node) | sdk | 4.x |
| sdk-python | assemblyai (Python) | sdk | 0.x |
| rest-api | AssemblyAI REST API | rest | v2 |
| sdk-ruby | assemblyai (Ruby) | sdk | 1.x |
| streaming-ws | Streaming WebSocket | other | v3 |
| webhooks | Webhooks | other | — |
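For the webhooks interface, a receiving service mostly needs to parse the notification body and decide what to do next. The sketch below assumes the notification is a JSON object with `transcript_id` and `status` fields, where `status` is `completed` or `error`; that shape is an assumption based on the vendor's documented behavior and should be verified. `handle_webhook` and its return labels are hypothetical names for this example.

```python
import json


def handle_webhook(body):
    """Map a webhook notification body (bytes or str) to a next action.

    Returns a (action, transcript_id) tuple, where action is one of
    "fetch", "failed", or "ignore". Field names are assumed, see above.
    """
    event = json.loads(body)
    transcript_id = event.get("transcript_id")
    if not transcript_id:
        raise ValueError("webhook body has no transcript_id")
    status = event.get("status")
    if status == "completed":
        return ("fetch", transcript_id)   # go retrieve the finished transcript
    if status == "error":
        return ("failed", transcript_id)  # log or retry the job
    return ("ignore", transcript_id)      # unknown status, do nothing


if __name__ == "__main__":
    sample = b'{"transcript_id": "abc123", "status": "completed"}'
    print(handle_webhook(sample))  # ('fetch', 'abc123')
```

In a real service this function would sit behind an HTTP endpoint, and the "fetch" action would trigger a GET for the transcript through the REST API or one of the SDKs listed above.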
Compare AssemblyAI with
Side-by-side breakdowns:
- AssemblyAI vs Anthropic API
- AssemblyAI vs Deepgram
- AssemblyAI vs ElevenLabs
- AssemblyAI vs Google Gemini API
- AssemblyAI vs Groq
- AssemblyAI vs OpenAI API
- AssemblyAI vs Replicate
- AssemblyAI vs Together AI
Staxly is an independent catalog of developer platforms. Outbound links to AssemblyAI are plain references to their official pages. Pricing is verified at publication time — reconfirm on the vendor site before buying.