SoftBank and Nvidia Redesign Computing Architecture for AI Workloads

The unspoken secret of artificial intelligence (AI) is its massive computing requirements.

The cost to power that hardware is nothing to sniff at, either. As new AI applications come online, peak demand will grow, and power and computing requirements will grow along with it.

SoftBank recently announced that it is building data centers, in collaboration with Nvidia, that can host generative AI and wireless 5G applications alike on a multitenant common server platform designed both to reduce costs and to improve energy efficiency.

That’s because as 5G networks come online around the world, they will create a literal network effect powering the rapid, worldwide deployment of generative AI applications and services.

By using 5G data centers simultaneously for AI applications, businesses can earn a return on both their infrastructure investment and their computing and processing power.

“As we enter an era where society coexists with AI, the demand for data processing and electricity requirements will rapidly increase,” said Junichi Miyakawa, president and CEO of SoftBank, in a release announcing the partnership.

“Demand for accelerated computing and generative AI is driving a fundamental change in the architecture of data centers,” added Jensen Huang, founder and CEO of Nvidia. “Nvidia Grace Hopper is a revolutionary computing platform designed to process and scale out generative AI services.”

The platform will first be rolled out across new, distributed AI data centers in Japan that can house both generative AI and wireless apps on a shared server platform.

The rising number of data centers needed to store and manage data securely has, in turn, increased demand for AI-powered storage.

But despite the promising potential of AI, our current computing infrastructure isn’t built to handle the workloads AI will throw at it.

That’s because for the last half century, most computing architectures have been CPU-centric, while the future of generative AI requires higher-performance, more energy-efficient computing capabilities.

The size of AI networks has grown tenfold per year over the last five years, and observers predict that by 2027, 1 in 5 Ethernet switch ports in data centers will be dedicated to AI, machine learning and accelerated computing.
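To put that rate in perspective, here is a quick back-of-envelope calculation. The tenfold-per-year figure is the one cited above; the arithmetic simply compounds it:

```python
# Back-of-envelope check on the growth claim cited above: 10x per year,
# compounded over five years, implies a 100,000x increase in network size.
growth_per_year = 10
years = 5

total_growth = growth_per_year ** years
print(f"Total growth over {years} years: {total_growth:,}x")  # 100,000x
```

Even if the real figure were half that rate, the compounding alone explains why existing infrastructure is straining.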

To escape this spiral, we need to rethink computing architecture from the ground up. One solution is to completely disaggregate compute platforms, eliminating interdependencies between CPUs, GPUs, DPUs, memory, storage, networking, and so on.
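To make the idea concrete, here is a minimal sketch of disaggregation, assuming hypothetical resource names and pool sizes; none of this reflects Nvidia’s or SoftBank’s actual designs. Each resource type becomes an independent pool, and a workload composes a logical server from exactly the mix it needs:

```python
from dataclasses import dataclass

# Hypothetical illustration of disaggregation: each resource type lives in
# its own independent pool instead of being bundled inside a fixed server.
@dataclass
class ResourcePool:
    name: str
    capacity: int          # abstract units (cores, GPUs, GB, Gbps, ...)
    allocated: int = 0

    def allocate(self, units: int) -> bool:
        if self.allocated + units > self.capacity:
            return False
        self.allocated += units
        return True

# Independent pools: a workload draws only what it needs from each.
pools = {
    "cpu": ResourcePool("cpu", capacity=1024),
    "gpu": ResourcePool("gpu", capacity=256),
    "dpu": ResourcePool("dpu", capacity=128),
    "memory_gb": ResourcePool("memory_gb", capacity=65536),
}

def compose_workload(requirements: dict[str, int]) -> bool:
    """Compose a logical server from independent pools; all-or-nothing."""
    granted = []
    for resource, units in requirements.items():
        if pools[resource].allocate(units):
            granted.append((resource, units))
        else:
            # Roll back partial grants so the pools stay consistent.
            for r, u in granted:
                pools[r].allocated -= u
            return False
    return True

# A GPU-heavy generative AI job and a DPU-heavy 5G job draw different mixes
# from the same pools, with no fixed CPU:GPU ratio stranding capacity.
print(compose_workload({"cpu": 32, "gpu": 64, "memory_gb": 4096}))  # True
print(compose_workload({"cpu": 16, "dpu": 32, "memory_gb": 1024}))  # True
```

The point of keeping the pools independent is that a shortage of one resource never strands capacity in another, the way a fixed server configuration does.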

“We see a significant underutilization of the networks being built, and the return on investment (ROI) on 5G has been relatively low,” explained Ronnie Vasishta, senior vice president of telecom at Nvidia, during a briefing with analysts.

That’s why Nvidia is making its 5G infrastructure not just virtualized, but also completely software-defined, so that it’s possible to run a high-performance, efficient 5G network alongside AI applications, all within the same data center.
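As an illustration of how such multitenancy might work in principle, here is a minimal sketch of a priority scheduler that time-slices shared GPU capacity between latency-critical 5G RAN processing and throughput-oriented AI jobs. The class, job names, and priority values are hypothetical, not Nvidia’s actual software:

```python
import heapq

# Hypothetical sketch of multitenancy on a shared, software-defined platform:
# a priority scheduler time-slices GPU capacity between latency-critical 5G
# RAN signal processing and throughput-oriented AI inference batches.
class SharedGpuScheduler:
    def __init__(self, gpu_slots: int):
        self.gpu_slots = gpu_slots
        self.queue: list[tuple[int, int, str]] = []  # (priority, seq, job)
        self.seq = 0

    def submit(self, job: str, priority: int) -> None:
        # Lower number = higher priority; 5G RAN gets priority 0 so radio
        # deadlines are met, and AI batches backfill the remaining capacity.
        heapq.heappush(self.queue, (priority, self.seq, job))
        self.seq += 1

    def run_slice(self) -> list[str]:
        """Dispatch up to gpu_slots jobs for this scheduling interval."""
        batch = []
        while self.queue and len(batch) < self.gpu_slots:
            _, _, job = heapq.heappop(self.queue)
            batch.append(job)
        return batch

sched = SharedGpuScheduler(gpu_slots=2)
sched.submit("ai-inference-batch-1", priority=5)
sched.submit("5g-ran-beamforming", priority=0)
sched.submit("ai-inference-batch-2", priority=5)
print(sched.run_slice())  # ['5g-ran-beamforming', 'ai-inference-batch-1']
```

The design intuition is the one Vasishta describes: 5G capacity sits idle much of the time, so letting AI work backfill those idle slots raises utilization without compromising the radio network’s deadlines.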

This moment in the evolution of technical infrastructure is marked by the convergence of two pivotal shifts in the tech industry. The first is a change in computing architectures, and the second is the rapid emergence and commercialization of generative AI.

Generative AI, now a household term, requires scale-out architectures, which is driving tremendous demand for networked “AI factories,” or data centers like those Nvidia and SoftBank have partnered to create.

The challenges of AI are daunting, but they all fall within the limits of our imagination. The return on effort will compound as innovations made for AI trickle down to all other forms of computing.