The problem with artificial intelligence (AI) is that its intelligence is, well, artificial.
And AI-generated content can spread potentially harmful misinformation through synthetic media, including images, text, video and more.
As it becomes increasingly difficult to distinguish between human and AI-generated content, even some of AI’s early leaders are throwing in the towel. Industry pioneer OpenAI decided to shut down its AI classifier tool because it could not reliably distinguish AI-generated content from human creations.
“In the world of generative AI, not only is it advancing so quickly, it is truly fundamentally indistinguishable,” Shaunt Sarkissian, founder and CEO at AI-ID, told PYMNTS CEO Karen Webster.
The images look indistinguishable, the sounds are indistinguishable, and the voiceprints are looking more and more indistinguishable, Sarkissian added.
That’s why fraud detection techniques that work well in areas where data tends to be more static, such as payments and other industries, can’t simply be carried over to AI.
“The things that you try to do prior are going to be just very, very difficult,” Sarkissian said. “It’s a fallacy from the jump.”
Still, identifying AI-generated content is important for transparency, legal compliance and ethics. It helps build trust between users and organizations and ensures that AI is used responsibly.
“You can’t simply look at something and say, ‘Is that AI or not AI?’” he said. “You really have to know the source of the content’s creation. I think it is truly one of the challenges that will exist in the AI industry for a long time.”
“When you’re building [large language models (LLMs)] to mimic human behavior, they are going to look like humans,” he added.
A rose by any other name would smell as sweet, Shakespeare wrote in “Romeo and Juliet,” and now his famous line is getting a 21st-century update.
But AI roses don’t smell as sweet; the sophistication of AI models has made computer-generated content virtually indistinguishable from content created by humans, posing challenges in industries like journalism, where authenticity and credibility are crucial.
Two levels of analysis are needed. Beyond identifying whether a piece of media is human-made or AI-created, there is the intellectual property (IP) question of determining the origins of AI-generated content and tracking the sources it draws on.
“You really have to track what it’s made of, otherwise you’re going to lose trust,” Sarkissian said.
The first step, though, is knowing whether something has been created by AI. Then, if it is extracted and used by another program or process, a “digital signature” can be attached to it so that “going forward you can track the provenance,” he explained.
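As a rough illustration only, and not a description of AI-ID’s actual system, a provenance signature of this kind could look something like the sketch below: the generator signs a small record of what it produced and what it drew on, and any downstream process can verify that the content still matches that record. The key, field names and functions here are hypothetical.

```python
import hashlib
import hmac
import json
import time

# Hypothetical per-model signing key held by the content generator.
GENERATOR_KEY = b"example-model-signing-key"

def sign_content(text: str, model_id: str, sources: list[str]) -> dict:
    """Attach a provenance record and digital signature to generated content."""
    record = {
        "model_id": model_id,            # which AI model produced this
        "sources": sources,              # inputs the output was derived from
        "created_at": int(time.time()),
        "content_hash": hashlib.sha256(text.encode()).hexdigest(),
    }
    payload = json.dumps(record, sort_keys=True).encode()
    record["signature"] = hmac.new(GENERATOR_KEY, payload, hashlib.sha256).hexdigest()
    return {"content": text, "provenance": record}

def verify(item: dict) -> bool:
    """Check that the content still matches its signed provenance record."""
    record = dict(item["provenance"])
    signature = record.pop("signature")
    payload = json.dumps(record, sort_keys=True).encode()
    expected = hmac.new(GENERATOR_KEY, payload, hashlib.sha256).hexdigest()
    content_ok = record["content_hash"] == hashlib.sha256(item["content"].encode()).hexdigest()
    return content_ok and hmac.compare_digest(signature, expected)
```

In a sketch like this, any edit to the content or its provenance record breaks verification, which is what allows provenance to be tracked as the content moves between systems.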
“The bigger challenge will be content that is an entirely and truly autonomous AI output,” Sarkissian added.
That’s when determining the sources and derivative pieces of information used by AI models becomes a complex task, making it difficult to establish ownership and legality of AI-generated content.
The consequences of inaccurate or unreliable AI-generated content can be significant, so ensuring the accuracy and value of AI-generated content is crucial.
Sarkissian explained that one attractive approach is to build systems that automatically (and virtually) “footnote” AI-produced content, so that the systems doing the generation know which inputs were used and can embed that record into the output.
“If an AI model has a certain identifier, say everything that comes from ChatGPT is tagged that it’s ChatGPT, it will almost act like a credit history,” Sarkissian said. “Either the outputs will be trusted and gain reputability over time, or they will be derailed by a compounding history of false statements.”
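To illustrate the “credit history” idea in the simplest terms (a hypothetical sketch, not a system anyone has specified), outputs tagged with a model identifier could feed a reputation ledger that tracks how often a model’s checked statements hold up:

```python
from collections import defaultdict

class ModelReputation:
    """Hypothetical ledger: each model identifier accrues a record of outputs
    later verified as accurate or flagged as false."""

    def __init__(self):
        self.history = defaultdict(lambda: {"accurate": 0, "false": 0})

    def record(self, model_id: str, accurate: bool) -> None:
        key = "accurate" if accurate else "false"
        self.history[model_id][key] += 1

    def score(self, model_id: str) -> float:
        """Share of this model's checked outputs that held up, akin to a credit score."""
        h = self.history[model_id]
        total = h["accurate"] + h["false"]
        return h["accurate"] / total if total else 0.0
```

Over time, a tag that consistently resolves to accurate content gains reputability, while one tied to a compounding history of false statements loses it.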
But he said he doesn’t think that companies alone can pull something like that off internally.
“The failure of OpenAI trying to screen their own data shows that [these solutions] can’t be really inbound or inwardly focused,” he said. “Organizations need to establish how they can work together with an outside vendor or third party … and I think the larger companies that have more experience in dealing with these types of situations, the Googles, Microsofts and others, I think you’re going to see them handle this a bit better.”
By establishing rules, building transparency systems and assessing reliability, firms can navigate these challenges and harness the power of AI to deliver accurate and valuable content to their audiences.
“This is where things like legislation help versus self-attestation,” Sarkissian said. “I do think different vendors will do different things based on what their LLMs or what their AIs are focused on, and they’re going to be a little fragmented in their approach if they are left to their own devices.”
The big things, he explained, are firms having to uniquely identify content as AI-generated, disclose what it is made up of, and inform regulators about what their models are trained on.
The tech industry must come together to address these challenges and establish itself as a responsible and trustworthy player in the AI landscape.
“At the end of the day, this is something that if it’s not solved, it’s going to harm the entire industry,” Sarkissian said. “This is the moment.”