Anthropic Outage Shows AI Is Straining the Digital Stack

Highlights

The “five nines” (99.999% uptime) standard is slipping, with platforms like Anthropic’s Claude, Apple services, and Microsoft’s GitHub suffering downtime due to rising complexity and demand.

Today’s interconnected, AI-driven systems are more powerful but also more fragile, turning outages into cascading, systemwide disruptions.

CFOs are shifting their mindset from chasing perfection to planning for uncertainty by balancing reliability and performance with cost and risk.

The five nines gold standard of digital reliability is cracking in 2026.

    The benchmark of 99.999% uptime, the “five nines” that allow just over five minutes of downtime per year, has long reassured executives, underpinned service-level agreements and shaped capital allocation decisions. In today’s hyperconnected, compute-intensive economy, that promise is starting to look more like a relic than a reality.
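The "just over five minutes" figure follows directly from the percentage. As a rough sketch (the function name and the 365.25-day average year are illustrative assumptions, not part of any vendor's service-level agreement), the allowed downtime for a given availability can be worked out like this:

```python
# Assumption: an average year of 365.25 days, expressed in minutes.
MINUTES_PER_YEAR = 365.25 * 24 * 60


def downtime_minutes(availability_pct: float,
                     window_minutes: float = MINUTES_PER_YEAR) -> float:
    """Return the minutes of downtime implied by an availability
    percentage over a measurement window (default: one year)."""
    if not 0.0 <= availability_pct <= 100.0:
        raise ValueError("availability_pct must be between 0 and 100")
    return window_minutes * (1.0 - availability_pct / 100.0)


if __name__ == "__main__":
    # Each extra "nine" cuts the yearly downtime budget by a factor of ten.
    for label, pct in [("three nines", 99.9),
                       ("four nines", 99.99),
                       ("five nines", 99.999)]:
        print(f"{label} ({pct}%): {downtime_minutes(pct):.2f} minutes per year")
```

The same function accepts a shorter window. For instance, passing `90 * 24 * 60` as `window_minutes` gives the downtime implied by an availability figure reported over 90 days.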

    Anthropic’s Claude AI model, for example, had slipped below the threshold as of Thursday (April 30) to around 98% uptime for the past 90 days. In China, the government reportedly suspended its issuance of new licenses for robotaxis and other autonomous connected electric vehicles, which are typically dependent on cloud-linked systems, after dozens of Baidu’s vehicles temporarily lost their functionality mid-use.

    Apple’s weather app suffered an hourslong outage this week, a relative rarity across the tech giant’s typically airtight ecosystem, and even Microsoft’s code hosting platform, GitHub, on Tuesday (April 28) posted an apology to its software developer base for recent downtime issues. The platform flagged AI’s power-hungry needs as a key factor.

    “The main driver is a rapid change in how software is being built. Since the second half of December 2025, agentic development workflows have accelerated sharply,” wrote Vladimir Fedorov, GitHub’s chief technology officer.

    And Fedorov is right. The modern digital stack bears little resemblance to the monolithic architectures of the past. Today’s systems are composable, distributed and deeply layered. A single enterprise workflow might rely on a large language model, a cloud provider, multiple APIs, and a network backbone spanning continents and satellites.

    Complexity Is the New Risk

    Individually, the recent downtime incidents might be dismissed as routine hiccups. Collectively, they point to a deeper structural shift: the systems underpinning modern commerce are more powerful than ever, but also more fragile, more interdependent, and more prone to cascading failure.

    “For the past month I’ve kept a journal where I put an ‘X’ next to every date where a GitHub outage has negatively impacted my ability to work. Almost every day has an ‘X.’ On the day I am writing this post, I’ve been unable to do any PR review for ~2 hours because there is a GitHub Actions outage,” wrote HashiCorp Co-Founder Mitchell Hashimoto in a Tuesday post.

    Elsewhere, on April 25, an AI coding agent managed to delete the production database and “all volume-level backups” belonging to the startup PocketOS.

    “I serve rental businesses. They use our software to manage reservations, payments, vehicle assignments, customer profiles, the works. This morning — Saturday — those businesses have customers physically arriving at their locations to pick up vehicles, and my customers don’t have records of who those customers are,” Jer Crane, founder of PocketOS, wrote in a lengthy article on X, noting that this incident caused a cascading series of issues that persisted for more than 30 hours.

    It’s not just corporations, or software and AI startups, that are being impacted. The U.S. military ran into trouble in mid-April when a global outage of the Starlink satellite network disrupted several autonomous operations.

    Outages are no longer isolated events; they are becoming networked disruptions with unpredictable blast radii. For CFOs, this can change the calculus. Reliability can no longer be assessed solely vendor by vendor. Increasingly, it is being evaluated as a portfolio of interdependencies.
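One way to see why vendor-by-vendor assessment falls short is to multiply the availabilities of everything a workflow depends on. The sketch below assumes that every dependency must be up for the workflow to run and that failures are independent, which real outages often are not. The function name is hypothetical, and all component figures are illustrative placeholders except the roughly 98% cited above for Claude.

```python
from math import prod
from typing import Iterable


def serial_availability(availabilities: Iterable[float]) -> float:
    """Availability of a workflow that needs every dependency up at once.

    Assumes independent failures, so the combined availability is the
    product of the individual availabilities (each given as a fraction).
    """
    values = list(availabilities)
    if any(not 0.0 <= a <= 1.0 for a in values):
        raise ValueError("each availability must be between 0 and 1")
    return prod(values)


if __name__ == "__main__":
    # Hypothetical stack; only the 0.98 figure comes from the article.
    stack = {
        "large language model": 0.98,
        "cloud provider": 0.9999,      # illustrative placeholder
        "payments API": 0.999,         # illustrative placeholder
        "network backbone": 0.9999,    # illustrative placeholder
    }
    combined = serial_availability(stack.values())
    print(f"combined availability: {combined:.4%}")
```

Under these assumptions the combined figure can never exceed that of the weakest dependency, and every added dependency pulls it lower, which is why the chain as a whole, not any single vendor, sets the effective reliability.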

    “Platform resiliency and business continuity planning, in my opinion, has been our number one unsung hero,” Rinku Sharma, chief technology officer at Boost Payment Solutions, told PYMNTS in an earlier interview.

    Budgeting for Failure

    The paradox of modern operations is that as digital systems become more advanced, they also become more sensitive to disruption.

    Part of the challenge lies in the sheer scale of modern compute demands. AI workloads, in particular, are pushing infrastructure to its limits. Training and inference require vast amounts of processing power, memory bandwidth, and energy, often concentrated in specific regions or clusters.

    Meanwhile, physical infrastructure, ranging from data centers to satellite networks and beyond, remains subject to real-world constraints. Weather, power supply, hardware failures and geopolitical factors all play a role. Achieving near-perfect uptime across complex, distributed ecosystems may require exponential investment in redundancy, monitoring and failover mechanisms.

    For finance teams, then, the question may no longer be whether to invest in reliability, but how to balance cost, risk and performance. The challenge for organizations is to adapt their operating models accordingly. This may mean embracing uncertainty, designing for resilience, and aligning financial planning with the realities of modern infrastructure.
