OpenAI Warns Malicious Links Could Undermine Agentic AI

OpenAI, cybersecurity

OpenAI is elevating a risk that has quietly grown alongside autonomous AI systems: links.

    Get the Full Story

    Complete the form to unlock this article and enjoy unlimited free access to all PYMNTS content — no additional logins required.

    yesSubscribe to our daily newsletter, PYMNTS Today.

    By completing this form, you agree to receive marketing communications from PYMNTS and to the sharing of your information with our sponsor, if applicable, in accordance with our Privacy Policy and Terms and Conditions.

    In new guidance on artificial intelligence (AI) agent link safety, the company said on Wednesday (Jan. 29) that malicious links are emerging as one of the most exploitable surfaces as agents move beyond conversation into action.

    The concern is not abstract. As AI agents increasingly browse the web, retrieve information and complete tasks on behalf of users, links become gateways that can expose sensitive data or manipulate behavior if left unchecked.

    The warning comes at a moment when AI usage is becoming habitual. PYMNTS Intelligence data shows that more than 60% of consumers now start at least one daily task with AI. As autonomy increases, so does the cost of failure.

    OpenAI’s position is that links should be treated as a core security risk for agentic systems, on par with prompts and permissions. That framing reflects a broader shift in how AI safety is being operationalized as these systems move closer to commerce, payments and enterprise workflows.

    Why Links Can Be Exploited When AI Acts Autonomously

    In traditional browsing, humans decide whether to click a link and implicitly accept the risk. In agentic AI, that decision can be automated. An AI agent researching a product, managing a workflow or completing a transaction may encounter dozens of links in a single task. If even one of those links is malicious, the system can be manipulated into revealing information or taking actions the user never intended.

    Advertisement: Scroll to Continue

    OpenAI highlights the risk of malicious links that embed hidden instructions or deceptive redirects inside web content. When an AI agent consumes that content, it may treat those instructions as legitimate context rather than as an attack. This is especially dangerous when agents have access to tools, credentials or downstream systems.

    The problem scales with adoption. PYMNTS research shows that consumer trust in AI handling transactions is uneven, with a majority of shoppers saying they trust banks more than retailers to let AI buy on their behalf. That trust is fragile. A single high-profile failure tied to unsafe automation could slow adoption across entire categories.

    How OpenAI Is Preventing Link-Based Attacks

    To address the risk, OpenAI outlines a layered approach designed to reduce exposure without undermining usability. One core safeguard is link transparency. AI agents are trained to distinguish between links that already exist publicly on the open web and links that are introduced or modified within a conversation. If a link cannot be independently verified as preexisting, the system treats it as higher risk.

    This verification step helps prevent attackers from injecting custom URLs designed to capture sensitive data or trigger unintended behavior. Instead of silently following such links, the agent pauses and surfaces the decision to the user.

    OpenAI also applies constrained browsing, limiting what agents are allowed to do automatically when interacting with external content. Rather than granting blanket permission to fetch or execute actions from any link, the system narrows the scope of autonomous behavior. This reduces the chance that a single malicious page can cascade into broader compromise.

    For actions that involve elevated risk, OpenAI requires explicit human approval. If an agent encounters ambiguity or a task that could expose private information or initiate a meaningful action, it does not proceed on its own. This introduces friction by design, reinforcing the idea that autonomy should expand only where confidence is high.

    The company is transparent that these safeguards do not eliminate risk entirely. Instead, they are meant to make attacks harder, more visible and easier to interrupt. That tradeoff is intentional.