MIT Researchers Propose Bot Debates to Beat AI ‘Hallucinations’

OpenAI, Microsoft, and Google have all made strides in developing generative chatbots, but one major flaw plagues them all: AI hallucinations. These hallucinations occur when chatbots produce responses that sound plausible but are factually incorrect or unrelated to the context. To address this issue, a group of MIT researchers has released a new paper finding that debates between chatbots can improve the reasoning and factual accuracy of large language models (LLMs).

“It’s like a bot debate club, except the bot can essentially debate iterations of itself,” Yilun Du, a researcher at MIT and one of the paper’s authors, told Fortune. “The debates can occur in a single model (or bot). A single language model is replicated multiple times to generate multiple bots. Given a question, each bot then generates a different answer (the learned model behind the bot is the same across bots). The bots can then debate each other.”
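In rough outline, the debate loop works like this: one model is sampled several times to produce independent answers, and each copy is then shown the others' answers and asked to revise its own over a few rounds. The sketch below illustrates that pattern under those assumptions; it is not the researchers' actual implementation, and `ask_model` is a hypothetical stand-in for a call to whatever language model is being used.

```python
def ask_model(prompt: str) -> str:
    """Placeholder for a call to a single shared language model."""
    raise NotImplementedError("Wire this up to the LLM of your choice.")


def debate(question: str, num_agents: int = 3, num_rounds: int = 2) -> list[str]:
    # Round 0: each "bot" (the same underlying model, sampled independently)
    # proposes its own answer to the question.
    answers = [ask_model(f"Answer concisely: {question}") for _ in range(num_agents)]

    # Debate rounds: each bot sees the other bots' latest answers and is
    # asked to reconsider, nudging the group toward a consistent answer.
    for _ in range(num_rounds):
        updated = []
        for i, own in enumerate(answers):
            others = "\n".join(a for j, a in enumerate(answers) if j != i)
            prompt = (
                f"Question: {question}\n"
                f"Your previous answer: {own}\n"
                f"Other agents answered:\n{others}\n"
                "Considering these answers, give your updated final answer."
            )
            updated.append(ask_model(prompt))
        answers = updated
    return answers
```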

This approach differs from other attempts to handle AI hallucinations, such as increasing the amount of memory available to the models, curating training datasets to align more tightly with the data the models will encounter, and constraining their answers. OpenAI has also published a paper showing that its latest model, GPT-4, produces fewer hallucinations than previous versions. However, some leading AI researchers say hallucinations should be embraced.

Meanwhile, some firms are using human trainers to rewrite the bots’ answers and feed them back into the machine with the goal of making them smarter. Companies are also spending time and money improving their models by testing them with real people. Vectara Inc. has raised $28 million in seed funding led by Race Capital to empower developers with a new capability that greatly reduces AI errors when producing search results. Vectara provides a cloud-based conversational generative large language model “search-as-a-service” that allows businesses to hold intelligent conversations with their own data, such as documents, knowledge bases, and code. Vectara’s proprietary AI is similar to OpenAI LP’s ChatGPT but works on businesses’ own data.
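Grounding answers in a company's own data typically means retrieving relevant passages from its documents and prepending them to the prompt, so the model answers from supplied facts rather than from memory. The sketch below shows that general pattern only; the retrieval and generation helpers are hypothetical placeholders, not Vectara's actual API.

```python
def retrieve_passages(query: str, top_k: int = 3) -> list[str]:
    """Placeholder: return the most relevant snippets from indexed documents."""
    raise NotImplementedError


def generate(prompt: str) -> str:
    """Placeholder: call the underlying language model."""
    raise NotImplementedError


def grounded_answer(query: str) -> str:
    # Pull supporting passages from the business's own data first...
    context = "\n".join(retrieve_passages(query))
    # ...then instruct the model to answer only from that supplied context,
    # which narrows the room for hallucinated facts.
    prompt = (
        "Answer the question using only the context below. "
        "If the context is insufficient, say so.\n\n"
        f"Context:\n{context}\n\nQuestion: {query}"
    )
    return generate(prompt)
```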

As AI continues to evolve, it is crucial to ensure that it remains accurate and trustworthy. While different approaches are being taken to tackle this problem, the use of debates between chatbots and the augmentation of queries with facts drawn from a company's own data are promising solutions.