Voice AI Funding Surges 8X as Businesses Humanize Chatbots

Highlights

Voice AI startup funding surged eightfold in 2024, fueled by advances in the technology that enabled real-time, humanlike voices from OpenAI, ElevenLabs and others.

Businesses are using voice agents to cut costs and boost availability, starting with tasks like after-hours calls and appointment booking.

Despite rising adoption, challenges remain around accuracy and trust, especially in high-stakes or public-facing scenarios.

A revolution is underway in customer communication, and it is being led not by keyboards or screens but by voices.

    Voice-based artificial intelligence (AI) agents, long a promise more than a product, are advancing to such a degree that they are now outperforming call centers and beginning to replace human labor in industries from healthcare to retail, according to venture capital firm Andreessen Horowitz.

    “Voice is one of the most powerful unlocks for AI application companies,” Olivia Moore, a partner at Andreessen Horowitz, wrote in a 2025 AI Voice Agent Update blog post. “It is the most frequent and information-dense form of communication, made programmable for the first time due to AI.”

    Voice is naturally unstructured and messy: people interrupt, change topics or use slang. Making it programmable means AI can now interpret, respond to and act on spoken queries with more accuracy and reliability despite that messiness.

    Moore said voice AI lets businesses respond to customers 24/7 instead of making them wait until an office is staffed. As for consumers, she wrote, “We believe voice will be the first — and perhaps primary — way people interact with AI.”

    According to the PYMNTS Intelligence report “How the World Does Digital,” 30.4% of Gen Z consumers already shop by voice every week. Millennials came in second, at 27.6%. Across all age groups, an average of 17.9% of consumers use voice to shop.

    Last year, voice AI startups raised $2.1 billion, up eightfold from 2023, according to research firm CB Insights. The 2024 bumper crop included ElevenLabs’ $180 million fundraising round.

    Growth was driven by advances in voice AI models, such as OpenAI’s Realtime API for speech-to-speech applications, that boosted applications across a range of use cases, the research firm said.

    “It’s really in the last 12 to 18 months that we’ve seen AI voice agents performing as well or better than humans,” Alex Levin, co-founder and CEO of voice AI company Regal, told The Wall Street Journal.

    In March, Yum! Brands, which owns Taco Bell, KFC and Pizza Hut, announced a partnership with Nvidia to deploy AI solutions. This includes implementing voice AI across its brands in call centers to handle phone orders when demand surges.

    Jersey Mike’s has tapped SoundHound’s AI to handle voice ordering in 50 stores. Recently, SoundHound partnered with Allina Health to deploy “Alli,” an AI agent that can answer calls from patients. It helps patients manage appointments and will soon be able to refill medications, find doctors and locations, and answer non-clinical questions.

    Read more: How the World Does Digital: A Deep Dive Into Global Digital Engagement

    Pivotal Moment for Voice AI

    In the past year, the underlying AI infrastructure for voice has radically improved.

    A year ago, OpenAI unveiled a “voice mode” built on top of GPT-4o that offered real-time voice responsiveness, the ability to be interrupted, and a range of emotional tones rather than robotic responses.

    ElevenLabs followed with Conversational AI in November, with version 2.0 coming out last month. Meanwhile, players like Kyutai and Speechmatics brought real-time, full-duplex conversations into production, Moore said.

    These models also became more affordable as latency dropped. OpenAI cut GPT-4o API costs by up to 87.5% last December, according to Moore.

    “Conversational quality is now largely a solved problem,” Moore noted. Startups are racing to deploy voice as the “wedge,” or entry point, into broader enterprise platforms, since businesses are starting small: handling FAQs, booking appointments or conducting initial screenings.

    Ketan Babaria, chief digital officer of insurance marketplace eHealth, told the Journal that voice AI has gotten quite good.

    “Suddenly, we noticed these agents become very humanlike,” Babaria said. “It’s getting to a point where our customers are not able to differentiate between the two.”

    The next advance would be AI voice agents that can use the phone to complete tasks independently, such as making restaurant reservations, closing sales and placing orders, PolyAI CEO Nikola Mrksic told the Journal.

    Voice AI use cases include the following, according to Moore:

    • After-hours or overflow calls: These are calls that would otherwise go to voicemail. A voice agent can collect or share information and may even be able to complete a booking or transaction.
    • Net-new outbound calls such as customer check calls, activation calls, lead calls and the like.
    • Back office calls to vendors, suppliers, and the like.

    Despite the rapid adoption, voice AI still faces hurdles. Reputational risk is high, especially when systems fail in public. McDonald’s, for example, pulled its voice AI pilot with IBM after videos of botched orders went viral.