How AI Firms Plan to Build, Then Control, Superhuman Intelligence

At the center of many of humanity’s most compelling problems and innovations lie inscrutable paradoxes. 

Nearly the entirety of advanced mathematics, for example, is built around the concept of imaginary numbers. Without them, the field would collapse. 

It is the same with physics: Were it not for black holes, nearly every natural law could not be supported, yet the science behind this phenomenon is not fully understood. 

The emergence of artificial intelligence (AI) comes with its own central questions. Namely, if the capacity exists to build a superintelligent AI system able to “outsmart” humans, just how will this hypothetical AI system be controlled? 

And if such a system cannot be controlled to the end users’ benefit, why are tech companies racing to build one? 

It is on this question that OpenAI and Anthropic, among others, were originally founded. 

On Monday (Dec. 18), OpenAI laid out a new Preparedness Framework, which claims a scientific approach to measuring catastrophic risk in its most advanced frontier AI systems. 

“The study of frontier AI risks has fallen far short of what is possible and where we need to be. To address this gap and systematize our safety thinking, we are adopting the initial version of our Preparedness Framework. It describes OpenAI’s processes to track, evaluate, forecast, and protect against catastrophic risks posed by increasingly powerful models,” the company wrote. 

But while “superhuman AI,” or artificial general intelligence (AGI), will certainly need oversight if it is ever realized, the AI field is split in two over the hypothesis that these all-powerful AI systems are even possible, much less that they represent an uncontrollable existential threat.

Just What Is Superalignment, Anyway?

At the heart of the AI sector’s plans for creating and corralling AGI is a concept known as superalignment. 

Superalignment represents a holistic approach to achieving alignment in AI, going beyond mere technical specifications to encompass a broader understanding of the societal impact and ethical considerations associated with AI development.

It extends the traditional notion of alignment, which primarily focuses on aligning AI systems with human values during the training phase. While this is a crucial aspect, superalignment recognizes the need for ongoing alignment throughout the lifecycle of AI systems, including deployment, adaptation and evolution.

Last week (Dec. 14), OpenAI published a research paper describing a technique for aligning hypothetical future models that are “much smarter” than humans. The approach the AI firm proposes is to use a less powerful large language model (LLM) to supervise and train a more powerful one — treating the less powerful model as a proxy for human oversight of the more powerful, superintelligent AI. 

“We believe superintelligence — AI vastly smarter than humans — could be developed within the next ten years,” OpenAI said in the paper. 

The study looked at how GPT-2, one of the original AI systems that OpenAI released five years ago, could supervise GPT-4, the company’s latest LLM. 
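
The weak-supervising-strong setup can be illustrated with a toy sketch. This is not OpenAI's method or code, only a minimal illustration under stated assumptions: the "weak supervisor" is a hypothetical labeler that is wrong at random 20% of the time, and the "strong student" is a simple threshold classifier fit only to those noisy labels. All function names and parameters here are invented for the example.

```python
import random


def true_label(x):
    """Ground truth for the toy task: is x positive?"""
    return 1 if x > 0 else 0


def weak_supervisor(x, rng, error_rate=0.2):
    """Hypothetical weak model: gives the right label, except for random mistakes."""
    y = true_label(x)
    return 1 - y if rng.random() < error_rate else y


def fit_threshold(xs, labels):
    """Toy 'strong student': choose the cutoff that best agrees with the labels it is given."""
    best_t, best_agree = 0.0, -1
    for t in sorted(xs):
        agree = sum((1 if x > t else 0) == y for x, y in zip(xs, labels))
        if agree > best_agree:
            best_t, best_agree = t, agree
    return best_t


def run_experiment(seed=0, n=1000):
    """Return (weak supervisor accuracy, student accuracy) against the ground truth."""
    rng = random.Random(seed)
    train_xs = [rng.uniform(-1, 1) for _ in range(n)]
    weak_labels = [weak_supervisor(x, rng) for x in train_xs]

    # The student never sees the ground truth, only the weak labels.
    t = fit_threshold(train_xs, weak_labels)

    weak_acc = sum(w == true_label(x) for x, w in zip(train_xs, weak_labels)) / n
    test_xs = [rng.uniform(-1, 1) for _ in range(n)]
    student_acc = sum((1 if x > t else 0) == true_label(x) for x in test_xs) / n
    return weak_acc, student_acc


if __name__ == "__main__":
    weak_acc, student_acc = run_experiment()
    print(f"weak supervisor accuracy: {weak_acc:.3f}")
    print(f"strong student accuracy:  {student_acc:.3f}")
```

In this toy, the student can end up more accurate than its supervisor because the supervisor's mistakes are random, so they tend to cancel out when the student fits a single cutoff. Real language models offer no such guarantee, and whether a stronger model reliably generalizes beyond a weaker supervisor is the open question the research examines.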

Is AGI an Arms Race or a Hype Race?

Still, many observers and industry luminaries believe that focusing on AI’s doomsday scenarios distracts from the immediate risks of current AI systems, which include misinformation, encoded bias, copyright infringement and higher computing costs relative to alternative options.

Yann LeCun, Meta’s chief AI scientist, has described the conversation around AI risks as “unhinged.” 

“The emergence of superhuman AI will not be an event … Fight people who want to regulate AI R&D by claiming that it’s too dangerous for everyone but them to have access to it,” LeCun wrote in a post on X, formerly known as Twitter. 

And he is not alone in his doubts about the industry’s fixation on (preventing) doomsday scenarios. 

“Many of the hypothetical forms of harm, like AI ‘taking over’, are based on highly questionable hypotheses about what technology that does not currently exist might do. Every field should examine both future and current problems. But is there any other engineering discipline where this much attention is on hypothetical problems rather than actual problems?” Andrew Ng posted on X. 

Speaking to PYMNTS in November, Akli Adjaoute, AI entrepreneur and general partner at venture capital fund Exponion, emphasized that there is a growing need for a widespread, realistic comprehension of what AI can achieve in order to avoid the pitfalls of hype-driven perceptions around the technology.

“When you ask people, a lot of them don’t know much about AI — only that it is a technology that will change everything … At the end of the day,” Adjaoute said, AI is “merely probabilistics and statistics … A detailed unraveling of AI’s intricacies reveals that the innovation is truly just a sequence of weighted numbers.”

When end users understand this, and the “black box” of uncontrollable AI is demystified, then the technology can be better put to use.
