
Hallucinating with AI: What It Does Well…And Where It Falls Short

July 19, 2023

A Friend You Can’t Fully Trust

Imagine you had a friend who knows just about everything there is to know. You could ask them about art, science, culture; it doesn’t matter. They’re engaging, smart, and funny on any topic. There’s just one catch: they don’t let the facts get in the way of a good story.

That’s pretty much where we are at the moment with generative AI. It is the most powerful and exciting technology in a generation. It is fueling an unprecedented boom in innovation with far-reaching implications for every industry. And every so often, with complete confidence, it will present fiction as fact.

How do we come to terms with this flaw in AI?

The Problem of Hallucination

I’ve worked with AI models and tools for years. The solutions of just a few months ago, while impressive at the time, now appear very limited. The acceleration of accuracy, understanding, and performance is shocking. The results are sometimes stunning and insightful. They are often helpful and relevant. But occasionally, they are simply wrong. As capable as the new technology is, it still can give out inaccurate “facts,” make up fake citations, and behave unpredictably.

The problem of AI generating false results, a phenomenon called “hallucination,” is not new. It is simply getting more attention now given the incredible leap forward that the technology has taken. Essentially, this happens because generative AI is trained to produce convincing results for humans. And humans don’t always require facts to be convinced.

While there are things you can do to minimize hallucination, there is no current way to eliminate it. And it is hard to see that changing any time soon. Generative AI is trying to create a convincing response; companies that build models try to steer it towards truth, but there is no way to completely stop inaccurate and false outputs.

So a certain amount of hallucination is inherent to using AI. How should that change the way we invest in and trust this new technology?

Weighing Trust and Value

To answer that, think about your imaginary know-it-all friend. The extent to which you would trust them might depend on the following questions:

  • Value. How valuable and helpful are their accurate responses?
  • Frequency. How frequently do they lie?
  • Mitigation. Are there ways that you can lower the error rate? For example, by phrasing your question differently?

If the value wasn’t there, dealing with a potentially inaccurate resource wouldn’t be worth it. But when it works, AI works incredibly well. Already, AI bots and apps are changing how students learn, how factories work, and how people create art and media, pushing out the limits of what is possible. And remember: we are at the very beginning of things. Just imagine where the technology will take us in a year or two!

As to frequency, this is one reason why the problem of AI hallucination is so insidious: the rate of “lying” is relatively low today and likely to remain relatively small. At the moment, the error rate is low enough not to discourage most people from using the technology.

The most interesting question is mitigation, because there are, in fact, ways to reduce the frequency and severity of hallucination. Understanding and using these practices should be a top priority for anyone building solutions with AI, or depending on the technology in their personal or work life.

Making AI Work for You

 

Check for confidence.

Perhaps the most important thing to remember when it comes to mitigating AI hallucination is that you are dealing with a technology that does not, in and of itself, ever say “I don’t know.” Unless you build some kind of confidence boundary into the design of the tool or into the prompt you give it, it will “answer” any question you put to it, to the best of its ability.

Some simple changes to the way you design prompts can help a lot. For instance, you can try adding “If you don’t know, tell me that” to your next prompt.
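To make that concrete, here is a minimal sketch in Python, using the OpenAI chat client purely as an example; the model name, question, and exact wording are all placeholders, and any chat-based model works the same way:

```python
import openai  # assumes the OpenAI Python client; any chat model client works similarly

openai.api_key = "YOUR_API_KEY"  # placeholder

# A hypothetical question the model may not actually know the answer to.
question = "What does section 12.3 of the Acme master services agreement say?"

# Add an explicit confidence boundary so the model has permission to decline.
prompt = (
    question
    + "\n\nIf you don't know, or aren't confident in your answer, "
    + "tell me that instead of guessing."
)

response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",  # example model name
    messages=[{"role": "user", "content": prompt}],
    temperature=0,  # a lower temperature tends to reduce embellishment
)

print(response.choices[0].message.content)
```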

Ask for help.

Human beings tend to make a lot of assumptions. Just because we have something in our head, we may take it for granted that the person – or AI tool – in front of us can somehow understand it. A lot of AI error is at least partially the result of us not providing enough parameters, or missing important details. In other words: we are asking AI to read our mind.

The AI itself is sometimes the best tool to help you design better, more specific prompts that are less likely to result in errors. So try asking it for what you want, and then add “Tell me what you need to know to answer this question.”
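One way to sketch that pattern in code, with a hypothetical helper function and example wording:

```python
def ask_for_requirements(question: str) -> str:
    """Wrap a question so the model first lists what it needs from you.

    Hypothetical helper: send the result to whatever chat model you already
    use, answer its clarifying questions, and then ask again.
    """
    return (
        "I want to ask you the following question: " + question + "\n\n"
        "Before answering, tell me what you need to know from me "
        "to answer this question accurately."
    )

# Example usage: the model should respond with clarifying questions
# (parties, jurisdiction, dates, and so on) rather than guessing.
print(ask_for_requirements("Is this non-compete clause enforceable?"))
```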

Validate sources and logic.

Would you accept questionably sourced or reasoned thinking from a human being? Don’t do it with AI. Two prompts that can help are “Explain your reasoning” and “Cite your sources.”
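As a small sketch, with hypothetical wording, the follow-ups can be sent as the next messages in the same conversation as the original answer, so the model knows exactly what to justify:

```python
# Hypothetical follow-up prompts: send each as the next message in the same
# conversation as the model's original answer, then verify the output yourself.
follow_ups = [
    "Explain your reasoning step by step.",
    "Cite your sources. If you cannot point to a real source, say so explicitly.",
]

for follow_up in follow_ups:
    print(follow_up)  # in practice, send these through your chat client of choice
```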

Coming to Terms With AI’s Possibility – And Limits

Ultimately, mitigation is the right strategy for dealing with AI hallucination. The technology is simply too powerful for us not to keep developing and building with it. And while there will likely always be a failure rate, it can be diminished and managed by engaging with the AI carefully.

Part of the reason we are at a dangerous moment with AI is that it is, for almost everyone, so new. We don’t know how much to trust it, or what to trust it with. There are few established legal or practical safeguards in place, the technology is raw, and there is so much incentive to throw it at new problems. In a year or two, we will likely have evolved some standards and practices around AI; for now, it is a bit of a free-for-all.

 


Cai GoGwilt is CTO and Co-Founder of Ironclad. Before founding Ironclad, he was a software engineer at Palantir Technologies. He holds a B.S. and M.Eng. in Computer Science from the Massachusetts Institute of Technology.