The Future of AI? RAG Combines Language and Search

As experts suggest that large language models (LLMs) may be approaching their technical limits, the spotlight is turning to retrieval-augmented generation (RAG), a promising technique that could redefine artificial intelligence (AI) by merging information retrieval with natural language generation.

LLMs have driven recent advances in AI and improved a wide range of applications. However, their tendency to generate false information, known as hallucination, has limited their potential. RAG addresses this by letting an AI system retrieve specific external data and incorporate it into its responses, making them more accurate and useful.

“The main advantage of RAGs over LLMs is the fact that the former is based entirely on a proprietary data set that the owner of said RAG can control, allowing for more targeted applications,” Renat Abyasov, CEO of the AI company Wonderslide, told PYMNTS. “Let’s say a doctor wants to deploy a chatbot for their patients; using an RAG will allow them to ensure that the advice provided by said chatbot will be reliable and consistent. That reliability is much harder to achieve with LLMs, systems trained on massive amounts of publicly available and sometimes rather dubious data.”

RAGs to Riches?

RAG models are cutting-edge AI systems that combine language understanding with real-time information retrieval. This allows them to provide more accurate and up-to-date answers by grounding responses in the latest relevant data from external sources. RAG models excel in dynamic fields like news, research and customer support, where the ability to incorporate fresh information makes them highly adaptable and valuable.
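The pattern behind this is simple: retrieve the documents most relevant to a query, then hand them to the model as context. The sketch below is a minimal illustration, not any vendor's implementation; the word-overlap scoring is a toy stand-in for the vector search production systems use, and call_llm is a hypothetical placeholder for any chat-completion API.

```python
# Minimal RAG pattern: retrieve relevant documents, then prepend them
# to the prompt so the model can ground its answer in them.

def retrieve(query: str, documents: list[str], top_k: int = 2) -> list[str]:
    """Rank documents by naive word overlap with the query (toy scoring;
    production systems use embeddings and a vector database)."""
    query_words = set(query.lower().split())
    return sorted(
        documents,
        key=lambda doc: len(query_words & set(doc.lower().split())),
        reverse=True,
    )[:top_k]

def build_prompt(query: str, documents: list[str]) -> str:
    """Assemble a grounded prompt: retrieved context first, question last."""
    context = "\n".join(retrieve(query, documents))
    return (
        "Answer using only the context below.\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}"
    )

docs = [
    "The clinic is open Monday through Friday, 8 a.m. to 5 p.m.",
    "Patients should fast for 12 hours before a lipid panel.",
    "Parking is free in the north lot for all visitors.",
]
prompt = build_prompt("How long should I fast before a lipid test?", docs)
# answer = call_llm(prompt)  # hypothetical model call
print(prompt)
```

Because the model only sees curated documents at answer time, the operator controls exactly what the system can draw on, which is the reliability Abyasov describes.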

In head-to-head tests, some RAG implementations clearly outperform others. Tonic.ai, the company behind the proprietary benchmarking platform Tonic Validate, recently conducted a series of evaluations of RAG systems. One of these tests put the spotlight on CustomGPT.ai, a no-code tool that enables businesses to deploy ChatGPT-style solutions backed by RAG databases.

To assess CustomGPT.ai’s performance, Tonic.ai compared it against OpenAI’s built-in RAG functions. The evaluation dataset comprised several hundred essays written by Paul Graham and a set of 55 benchmark questions with ground-truth answers derived from the text. The primary objective was to evaluate the platforms’ ability to generate accurate and contextually relevant responses.
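In outline, such an evaluation is a loop over question-answer pairs: send each benchmark question to the system under test, then score the generated answer against the ground truth. The sketch below mirrors that structure only; it is not Tonic Validate's actual API, and the token-overlap metric is a toy substitute for the LLM-judged or semantic scoring real benchmarks use.

```python
# Sketch of a RAG benchmark loop: score each generated answer against
# a known ground-truth answer and average the results.

def similarity(answer: str, reference: str) -> float:
    """Toy Jaccard overlap between token sets; real benchmarks use
    semantic similarity or an LLM judge instead."""
    a, r = set(answer.lower().split()), set(reference.lower().split())
    return len(a & r) / len(a | r) if a | r else 0.0

def run_benchmark(system, qa_pairs: list[tuple[str, str]]) -> float:
    """Average score of `system` (any question -> answer callable)
    over all benchmark items."""
    scores = [similarity(system(q), truth) for q, truth in qa_pairs]
    return sum(scores) / len(scores)

# Illustrative stand-ins for the benchmark's question/ground-truth pairs.
qa_pairs = [
    ("What does the essay say about startups?", "Startups grow fast."),
    ("Where did the author work?", "The author worked at Viaweb."),
]
echo_system = lambda question: "Startups grow fast."  # dummy system under test
print(f"mean score: {run_benchmark(echo_system, qa_pairs):.2f}")
```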

The test results revealed that both CustomGPT.ai and OpenAI’s tools could produce high-quality answers. However, CustomGPT.ai outperformed its competitor by consistently providing more precise responses to complex queries. This outcome highlights the effectiveness of CustomGPT.ai’s streamlined approach to deploying generative AI chatbots, making it an attractive option for businesses seeking to implement such solutions without extensive programming knowledge.

Using RAG could have real-world benefits. A recent report by Stanford University researchers and collaborators, published in the NEJM AI journal, suggests that RAG can significantly improve the performance of LLMs in answering medical questions.

The study found that RAG-enhanced versions of GPT-4 and other programs performed better than standard LLMs when answering questions written by board-certified physicians. The authors believe RAG is essential for safely using generative AI in clinical settings.

Even medical-specific LLMs, like Google DeepMind’s Med-PaLM, still struggle with hallucinations and may not accurately handle clinically relevant tasks.

Relatedly, MedPerf, a new initiative, aims to speed up the development of medical AI while protecting data privacy. Its emergence underscores the growing need for secure and reliable data integration methods, such as RAG, to keep AI-generated responses in healthcare accurate and relevant.

The RAG Advantage

Andrew Gamino-Cheong, CTO of Trustible, told PYMNTS that many LLMs are trained on fairly generic information that can be easily collected from the internet. He stressed that RAG is a powerful and cost-effective way to enhance LLMs: by integrating confidential or up-to-date information, it enables them to provide more accurate and relevant responses. This approach lets businesses leverage the full potential of LLMs while maintaining the security and specificity of their proprietary data.

“A lot of use cases of LLMs are limited by data that might be older, and RAG patterns are the most effective way of keeping them up to date without spending millions on fully retraining them,” he added. “One secret is that a lot of LLM providers would love for users to add RAG pipelines or outright fine-tune their foundational models because it radically shifts a lot of product liability.” 
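The cost asymmetry Gamino-Cheong describes is easy to see in code: with RAG, refreshing what the system "knows" is a cheap write to the retrieval store, while the model's weights never change. The names below are illustrative, not any vendor's API.

```python
# With RAG, a knowledge update is a data write, not a training run.

knowledge_store: list[str] = [
    "Q3 guidance: revenue of $120M, announced October 1.",  # illustrative fact
]

def update_knowledge(store: list[str], fact: str) -> None:
    """Refresh the system's knowledge; the LLM's weights stay frozen."""
    store.append(fact)

# New information arrives: no GPUs, gradients or retraining involved.
update_knowledge(knowledge_store, "The October board meeting moved to the 14th.")

# At query time, the freshest facts are retrieved and passed to the
# unchanged model as context (retrieval step shown in the earlier sketch).
print(knowledge_store)
```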

Abyasov explained that RAG models are most frequently used to build automated technical support assistants and conversational AI interfaces.

“RAGs have been used for this application for years before LLMs even appeared on the public’s radar,” he added. “Overall, practically any application that requires you to have a tightly controlled dataset will favor using an RAG, as they allow for less surprises and much more consistent results across the board.”
