Date: 2024-12-05

A new research paper reveals artificial intelligence (AI) systems using large language models can now effectively control computer interfaces through natural interaction. These agents can autonomously navigate software, complete tasks and manipulate interfaces as humans do.

“This technology will change our relationship to software,” Joan Palmiter Bajorek, CEO and founder of Clarity AI, told PYMNTS. “For most people, speaking is one of the most natural ways of interacting with another person. So, instead of manually clicking on buttons, interfaces that incorporate voice AI could be some of the most naturalistic we use. Instead of typing a prompt into ChatGPT, you could simply speak your request out loud.”

As the world becomes increasingly digital, voice technology is something consumers expect to be more widely available, according to a PYMNTS Intelligence survey of 2,939 consumers in the U.S. examining how they use voice technology. That report showed that 54% of consumers prefer voice technology because it’s easier and faster than typing or clicking through websites. That same data revealed almost half of those surveyed think voice tech will be as smart and reliable as humans within five years.

AI Agents as Helpers

AI agents, which are autonomous software systems capable of performing tasks with minimal human intervention, are reshaping industries by improving efficiency and enabling complex problem-solving. These agents use machine learning and natural language processing to understand, act and adapt in real time.

For instance, financial institutions deploy AI agents to detect fraud by analyzing transaction patterns and identifying anomalies. Skyfire Systems launched a payment network in October that enables AI agents to handle financial transactions autonomously, signaling a shift toward agent-based commerce. Similarly, customer service is being transformed with conversational AI agents like ChatGPT and Google Gemini, which can resolve customer inquiries 24/7 with human-like interaction.

Voice control of computers is advancing with AI, enabling more natural and accurate interactions. Developers are focusing on hands-free navigation, transcription and specialized tools for tasks like coding. Companies are refining virtual assistants to handle increasingly complex commands, marking a shift toward more practical and accessible applications of voice technology.

Speaking Naturally

The Microsoft-backed research found that rather than relying on specific commands, users can make natural language requests that the AI translates into interface actions like clicking, typing and navigating between applications. This development could transform how people interact with software by making complex tasks more accessible through conversational AI assistance.

Unlike traditional automation tools, these agents interpret visual interfaces contextually while understanding user intent, enabling more flexible and intuitive computer interaction.

“These agents represent a paradigm shift, enabling users to perform intricate, multi-step tasks through simple conversational commands,” the researchers wrote. “Their applications span across web navigation, mobile app interactions and desktop automation, offering a transformative user experience that revolutionizes how individuals interact with software.”

AI interface automation in customer service could help streamline operations and reduce waiting periods by directing clients to appropriate resources when they contact the help center, Daniel Balaceanu, co-founder and chief product officer at Druid AI, told PYMNTS.

“Not every individual will be calling for the same issue, so having an AI agent direct them at the beginning of a call can reduce their waiting time as they are sent directly to the individual who can help them with their specific problem,” he said.

Balaceanu said that AI agents and small language models (SLMs) would allow businesses to reduce operational costs by automating routine tasks. This will give them more resources to use on growth initiatives and enable them to maximize efficiency without having to compromise on innovation. By using automation to streamline their processes and lower expenses, they can reinvest those savings into projects that will drive expansion.

“However, the benefits of AI go far beyond cost savings,” he said. “In fact, aligning success metrics in conversational AI with broader business objectives is crucial for ensuring that a conversational AI platform’s performance directly contributes to the organization’s overall success. Each business may have unique goals, such as improving customer satisfaction, optimizing support operations or increasing conversion rates. By aligning metrics with these objectives, organizations can measure a conversational AI solution’s impact in a meaningful way.”