Daily and Vapi partner to deliver AI Voice Assistants as an API

Today, we’re thrilled to partner with Vapi as they launch the first omni-platform AI voice assistant API on the market.

Daily’s mission has always been to help developers build powerful real-time communications experiences leveraging the power of WebRTC. The AI platform shift is happening faster than any previous technology wave. We recently shipped a toolkit designed to power real-time AI: voice-driven LLM apps, bots and characters, video and vision features, and speech-to-speech experiences.

Built on Daily's global audio infrastructure and real-time AI toolkit, Vapi’s platform delivers low-latency, customizable, and reliable real-time conversations with AI. Vapi assistants are available on every platform Daily supports.

Leveraging voice-enabled generative AI technology at scale

Over the past few years, audio and video communications between co-workers, between service providers and clients, and between companies and customers have become an everyday experience in almost every industry. Now, voice-enabled generative AI is poised to become the norm for many kinds of operational, educational, and commercial conversations. Conversational AI will become ubiquitous.

ScaleConvo, a YC W24 batch company, is an example of an early adopter that implemented Vapi to manage thousands of AI-driven conversations for property management. “Vapi does the legwork of immediately parsing unstructured conversations, turning it into action asynchronously, while still on a voice call. It lets us focus on building. It’s like having a high-performing customer success agent at a fraction of the cost.”

Vapi in action

Tech Stack & Capabilities 

Vapi is built on top of the best media transport, speech-to-text, text-to-speech, and LLM technologies available.

Daily’s global WebRTC infrastructure and extensively tuned client SDKs are key to delivering the fastest response times and the best possible output from Vapi’s conversational agents. Daily’s real-time bandwidth management and low average first-hop latency of 13ms worldwide ensure that audio packets reach the cloud quickly and reliably. High-quality audio makes accurate speech-to-text transcription possible, which in turn helps Vapi’s LLMs perform at their best.

Low latency is critical in real-time conversation applications, so Vapi uses Deepgram at the start of its response pipeline to transcribe what's said in under 300ms. Deepgram is an industry leader in both overall accuracy and the flexibility of its speech-to-text models.
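To make that flow concrete, here is a minimal sketch of this kind of response pipeline in TypeScript. The names and signatures (transcribeAudio, generateReply, synthesizeSpeech) are illustrative placeholders rather than Vapi's or Deepgram's actual APIs; they simply stand in for the speech-to-text, LLM, and text-to-speech stages described above.

```typescript
// Illustrative types for each stage of a voice response pipeline.
// These are not real Vapi or Deepgram APIs; they sketch the shape of the
// speech-to-text -> LLM -> text-to-speech flow described above.
type AudioChunk = ArrayBuffer;

interface PipelineStages {
  transcribeAudio(audio: AudioChunk): Promise<string>;  // STT stage (e.g. Deepgram-backed)
  generateReply(transcript: string): Promise<string>;   // LLM turn
  synthesizeSpeech(text: string): Promise<AudioChunk>;  // TTS stage
}

// One conversational turn: audio in, audio out.
// Keeping each stage fast is what makes the conversation feel natural;
// this sketch only shows the ordering of the stages.
async function respondToTurn(
  stages: PipelineStages,
  incomingAudio: AudioChunk
): Promise<AudioChunk> {
  const transcript = await stages.transcribeAudio(incomingAudio);
  const reply = await stages.generateReply(transcript);
  return stages.synthesizeSpeech(reply);
}
```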

Together, these technologies form a best-in-class tech stack for powerful generative voice AI. 

Adding Vapi to your site or application 

It's hard to create and scale voice AI experiences that feel natural to talk to. Vapi handles the complexity of managing the voice AI pipeline and the real-time call infrastructure, and makes this easy. It’s as simple as: 

  1. Write a prompt to create an assistant ("you're an assistant for....")
  2. Buy a phone number, or add a snippet to your website to deliver your assistant ("vapi.start()"; see the sketch below)
  3. That's it: your users can talk to your assistant by voice
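As a rough sketch of what step 2 can look like on the web, the snippet below assumes Vapi's browser SDK is installed as @vapi-ai/web and that you have a public API key and an assistant ID from your Vapi dashboard. Treat the package name, constructor arguments, and event names as assumptions to check against Vapi's documentation.

```typescript
// Minimal browser-side sketch, assuming the Vapi web SDK package name
// (@vapi-ai/web) and placeholder credentials; check Vapi's docs for the exact API.
import Vapi from "@vapi-ai/web";

// Public key from your Vapi dashboard (placeholder value).
const vapi = new Vapi("YOUR_PUBLIC_API_KEY");

// Start a voice session with the assistant you created from a prompt.
// "YOUR_ASSISTANT_ID" is a placeholder for that assistant's ID.
vapi.start("YOUR_ASSISTANT_ID");

// Optionally react to call lifecycle events to update your UI
// (event names here are assumptions; consult the SDK docs).
vapi.on("call-start", () => console.log("Assistant call started"));
vapi.on("call-end", () => console.log("Assistant call ended"));
```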

The Vapi and Daily teams are excited to see what you build. If you have questions or suggestions, or want to show off your real-time AI projects, feel free to post in our peerConnection community.
