Anthropic Unveils AI Assistant That Can Navigate Computer Screens

Anthropic has launched a new artificial intelligence capability called “computer use” that enables its AI to interact with users’ computer screens.

The feature, released in beta to developers on October 22, 2024, represents a significant advancement in AI assistance technology, moving beyond simple chatbot interactions to more sophisticated computer operations.

The new tool allows Anthropic’s AI to interpret on-screen content and, with user permission, perform tasks such as web browsing, button clicking and text input. It works by processing real-time screen activity rather than relying on back-end application integrations.

During a demonstration, the system reportedly showcased its abilities by planning a morning hike near the Golden Gate Bridge, autonomously searching for trails, checking sunrise times and creating calendar invites with detailed information about appropriate attire.

The release comes amid growing industry interest in AI agents that can operate with minimal human oversight. While companies like Microsoft and Salesforce have recently introduced their own agent tools for workplace tasks, Anthropic’s approach differs in that it focuses on direct screen interpretation rather than application-specific integration.

However, the technology faces significant limitations and safety considerations. The company said the system struggles with everyday computer actions like scrolling, dragging and zooming. To address security concerns, Anthropic has implemented various safeguards, including restrictions on social media engagement, account creation and government website interaction. Developers can also add human oversight requirements and limit when the tool can access a user’s computer.
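For developers wondering what such a guardrail could look like in practice, the following is a minimal Python sketch of a review step that blocks restricted categories and asks a human to approve sensitive actions. It is illustrative only and is not Anthropic’s API: the names `Action`, `OversightPolicy`, `BLOCKED_CATEGORIES` and the category labels are hypothetical, and the sketch assumes the developer’s own code has already labeled each proposed action.

```python
from dataclasses import dataclass, field
from typing import Callable, Set

# Hypothetical category labels, loosely mirroring the restrictions
# described in the article. They are assumptions, not real API values.
BLOCKED_CATEGORIES = {"social_media", "account_creation", "government_site"}


@dataclass
class Action:
    kind: str          # e.g. "click", "type", "open_url"
    category: str      # hypothetical label assigned by the developer's code
    detail: str = ""   # free-text description shown to the human reviewer


@dataclass
class OversightPolicy:
    # Categories that are never allowed, regardless of approval.
    blocked: Set[str] = field(default_factory=lambda: set(BLOCKED_CATEGORIES))
    # Action kinds that require a human to approve before they run.
    needs_approval: Set[str] = field(default_factory=lambda: {"type", "open_url"})

    def review(self, action: Action, approve: Callable[[Action], bool]) -> str:
        """Return "blocked", "rejected" or "allowed" for a proposed action."""
        if action.category in self.blocked:
            return "blocked"
        if action.kind in self.needs_approval and not approve(action):
            return "rejected"
        return "allowed"
```

In this sketch, `approve` stands in for whatever confirmation step a developer chooses, such as a dialog box or a command-line prompt. Blocked categories are checked first so that human approval cannot override them.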

Several companies have already begun testing the technology, with Canva, Asana and Replit applying it to graphic design, project management and coding, respectively. The computer use feature is currently available to developers using Anthropic’s Claude technology.

Alongside this release, Anthropic has introduced an upgraded version of Claude 3.5 Sonnet with enhanced coding and reasoning capabilities, as well as a new, faster model, Claude 3.5 Haiku.

“Early customer feedback suggests the upgraded Claude 3.5 Sonnet represents a significant leap for AI-powered coding,” Anthropic wrote in its blog post. “GitLab, which tested the model for DevSecOps tasks, found it delivered stronger reasoning (up to 10% across use cases) with no added latency, making it an ideal choice to power multi-step software development processes. Cognition uses the new Claude 3.5 Sonnet for autonomous AI evaluations, and experienced substantial improvements in coding, planning, and problem-solving compared to the previous version. The Browser Company, in using the model for automating web-based workflows, noted Claude 3.5 Sonnet outperformed every model they’ve tested before.”
