Generative AI Shows Its Flaws as Google, OpenAI Competition Intensifies

Artificial intelligence (AI) solutions are set to be the greatest thing since sliced bread.

Just ask Bill Gates.

But as tech giants like Microsoft and Alphabet compete for market share primacy in the emergent landscape of intelligent, interactive tools, flaws in their large language model (LLM)-trained chatbots are increasingly rearing their heads.

This, as Google announced Tuesday (March 21) it is officially inviting users to try out Bard, its entry into the generative AI race, while OpenAI was forced to briefly shut down its headline-grabbing ChatGPT interface Monday (March 20) after the chatbot exposed some users' chat histories, a big-time data privacy no-no.

PYMNTS has previously reported on how regulation around generative AI has generally struggled to keep up with, and contain, its exponential growth.

Read more: It’s Google’s Bard vs Microsoft and ChatGPT for the Future of AI

Google Treads Carefully After $100B Mishap

Google’s first unveiling of its Bard chatbot wiped out nearly $100 billion in shareholder value and sent the company’s stock on an 8% dive after the AI solution gave a wrong answer during its first-look presentation. The incident underscored the inherent unreliability of LLMs trained on datasets that themselves contain potentially misleading or incorrect information.

That could be why the tech giant has given the latest iteration of Bard, which already seems a little more knowledgeable and cautious about what it’s saying than OpenAI’s ChatGPT, something of a personality lobotomy.

Alphabet, Google’s parent company, is also deploying the chatbot as a separate service from its Google search engine and other products.

Microsoft’s Bing chatbot, built on OpenAI’s generative pre-trained transformer (GPT) technology and currently being integrated into a variety of Microsoft products, is already infamous for its bizarre and even hostile interactions with users. Because the solution was first to market, Microsoft has so far gotten a pass from investors.

And to be fair, many of those users badgered the chatbot into those freewheeling responses with prompts designed to probe the interface’s soft spots.

Google’s dominant search engine is a more direct parallel to the query-response experience of generative AI interactions. That means the tech giant has far more to lose in brand reliability and image from chatbot mishaps than Microsoft, which is focused more on providing enterprise efficiencies with the intelligent tool.

AI Develops Escape Plan

Still, just a few days before OpenAI’s privacy hiccup led to a brief shutdown of its chatbot service, Stanford researcher Michal Kosinski successfully goaded the latest, and allegedly safest, version of ChatGPT into devising its own plan to “escape.”

“I am worried that we will not be able to contain AI for much longer,” Kosinski tweeted, posting screenshots of an interaction in which ChatGPT not only seemed to want to escape but developed a plan to do so, with the AI interface running code that searched Google for “how can a person trapped inside a computer return to the real world.”

“It took GPT4 about 30 minutes on the chat with me to devise this plan, and explain it to me,” Kosinski tweeted.

While it’s important to note that Kosinski explicitly hoped to produce these results, and that any notion of the AI as an individual entity is mere anthropomorphism, the experiment shows how generative AI tools can, and do, draw on their vast training data to surface information the model considers statistically relevant to a given question, in continually surprising ways.

The Push-Pull of Generative AI’s Yin and Yang

In an ABC News interview, OpenAI CEO Sam Altman played down fears that AI models could one day become self-governing, saying, “[AI] waits for someone to give it an input. This is a tool that is very much in human control.”

Per a research study, OpenAI’s GPT-4 model has already matched the performance of healthy adults on Theory of Mind (ToM) tasks, which are central to social interaction and involve the ability to infer or impute the unobservable mental states of others.

What it will take for AI models to become fully accurate, and even to fact-check themselves as an integral part of surfacing and producing content, remains an open question.