Startup Patronus AI has uncovered limitations in artificial intelligence (AI) models when analyzing Securities and Exchange Commission (SEC) filings.
The study's findings highlight the challenges AI models face and the need for improvement before they can meet the demands of regulated industries, particularly finance, CNBC reported Tuesday (Dec. 19).
The research focused on large language models (LLMs) that are commonly used to analyze SEC filings, according to the report. The study found that even the best-performing AI model configuration tested achieved only 79% accuracy in answering questions when it was given the entire filing alongside the question.
One of the major issues identified by the researchers was the AI models’ tendency either to refuse to answer questions or to give incorrect answers containing information that is not present in the SEC filings, the report said. This lack of accuracy and reliability poses significant concerns, especially in regulated industries where precision is crucial.
The finance industry values the ability to quickly extract important data and analyze financial narratives, per the report. If AI models could accurately summarize SEC filings or promptly answer questions about their content, they could give users an advantage in the competitive financial sector.
However, the entry of AI models into the industry has not been without challenges, according to the report.
One of the main challenges highlighted by Patronus AI is the nondeterministic nature of LLMs, the report said. These models do not consistently produce the same output for the same input, making rigorous testing essential to ensure accurate and reliable results.
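To illustrate why nondeterminism calls for repeated testing, here is a minimal Python sketch that asks the same question several times and measures how often the answers agree with each other and with a known reference answer. It is not Patronus AI's method. The function names, the question, the dollar figures and the `fake_model` stand-in are all hypothetical and invented for illustration; a real test would call an actual model instead.

```python
from collections import Counter
from itertools import cycle
from typing import Callable


def consistency_report(
    ask: Callable[[str], str], question: str, expected: str, runs: int = 5
) -> dict:
    """Ask the same question `runs` times and summarize how the answers vary."""
    answers = [ask(question).strip() for _ in range(runs)]
    counts = Counter(answers)
    majority_answer, majority_count = counts.most_common(1)[0]
    return {
        "distinct_answers": len(counts),         # 1 would mean fully consistent
        "agreement": majority_count / runs,      # share of runs giving the top answer
        "accuracy": counts[expected] / runs,     # share of runs matching the reference
        "majority_answer": majority_answer,
    }


# Hypothetical stand-in for a model call. The outputs are made up and simply
# cycle through a fixed list to mimic a model that answers inconsistently.
_fake_outputs = cycle([
    "$4.2 billion",
    "$4.2 billion",
    "$4.5 billion",
    "I cannot answer that.",
    "$4.2 billion",
])


def fake_model(question: str) -> str:
    return next(_fake_outputs)


report = consistency_report(
    fake_model,
    "What was total revenue in fiscal 2022?",  # invented example question
    expected="$4.2 billion",                   # invented reference answer
)
print(report)
```

A single run of this stand-in could return the right figure, a wrong figure or a refusal, which is why one passing answer says little about reliability and repeated runs are needed.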
Patronus AI, founded by Anand Kannappan and Rebecca Qian, aims to address this challenge by automating LLM testing with software, per the report.
Despite the challenges and limitations identified in the study, the co-founders of Patronus AI remain optimistic about the potential of LLMs to assist professionals in the finance industry, according to the report.
They believe that with continued improvement, these models can provide valuable support to analysts and investors. However, for now, human involvement is necessary to ensure accuracy and reliability.
PYMNTS Intelligence has found that financial chatbots are evolving into highly capable problem-solvers. In the future of consumer banking, digital assistants will not only listen but also understand and anticipate consumers’ needs, according to “AI and Banking’s New Dawn: From Conversations to Conversions,” a PYMNTS and Galileo collaboration.