Understanding LLM hallucinations
Hallucination in large language models (LLMs) refers to the generation of information that appears accurate but is actually false. Despite their impressive capabilities in natural language understanding and generation, LLMs sometimes produce content with no basis in reality. For example, a recent study of ChatGPT-generated medical content found that, out of 115 references generated by the model, only 7% were both real and accurate, while 47% were completely fabricated and 46% were real but inaccurate. These AI hallucinations pose significant risks because they contribute to the spread of misinformation. It is therefore essential to understand the causes of AI hallucinations, as this knowledge enables the development of effective strategies to minimize their occurrence.
Causes of hallucinations in large language models
Large language models (LLMs) hallucinate for several reasons. First, a vague or non-specific prompt can cause the model to generate inaccurate or fabricated responses as it attempts to fill the gaps with plausible-sounding information. Second, limited or outdated domain knowledge, often the result of insufficient training data, can lead to hallucinations when the model encounters topics outside its expertise and accessible information. It is therefore essential to ensure that the model has access to high-quality data about the relevant topic and domain. Taken together, these factors highlight what to keep in mind to ensure the reliability of LLM outputs.
Types of Hallucinations in LLMs
Hallucinations in large language models (LLMs) are categorized into factuality and faithfulness hallucinations. Factuality hallucinations arise when an LLM generates factually incorrect content. They usually stem from the model's limited contextual understanding or from noise and errors in the training data.
Factuality hallucinations can be further divided into two categories: Factual inconsistency and factual fabrication.
Let’s take an example to understand these concepts better. Factual inconsistency means the LLM provides incorrect information about something real, for example stating that the Great Wall of China was built in the 15th century by European explorers. In contrast, factual fabrication is the invention of entirely unsupported information, such as generating a narrative about a secret society of time-traveling monks in medieval Europe with advanced technology. Faithfulness hallucinations, on the other hand, occur when the model produces content that deviates from the provided source or instruction, demonstrating inconsistency with the original material. For instance, a user may ask the model to translate the sentence 'What is the capital of Germany?' and the LLM answers with 'Berlin,' which is factually correct but not the requested translation.
How to prevent LLM hallucinations: Techniques
Reducing hallucinations in large language models (LLMs) requires a combination of techniques, each with its own advantages. A brief description of each technique is provided below.
Restrict open-ended or ambiguous prompts
Design prompts with clear, specific instructions to minimize the risk of the model generating false or misleading information. This ensures the model has a well-defined context to work within.
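As a simple illustration, the snippet below contrasts an open-ended prompt with a constrained one. It is a minimal sketch assuming the official openai Python client; the model name, wording, and temperature setting are placeholder assumptions, not recommendations.

```python
# Minimal sketch: replacing an open-ended prompt with a specific, bounded one.
# Assumes the official `openai` Python client and a placeholder model name.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

vague_prompt = "Tell me about the Great Wall of China."

specific_prompt = (
    "In at most three sentences, state when the major Ming-era sections of the "
    "Great Wall of China were built and who built them. "
    "If you are unsure about a date, say so explicitly instead of guessing."
)

response = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder model name
    messages=[{"role": "user", "content": specific_prompt}],
    temperature=0,  # lower temperature further discourages speculative output
)
print(response.choices[0].message.content)
```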
Fine-tuning the model
Fine-tuning a model involves taking pre-trained models, which are typically trained on large and diverse datasets, and refining them on smaller, specific datasets to enhance their performance in particular tasks or domains, e.g. through Reinforcement Learning or preference optimization. This process leverages the general capabilities of the pre-trained model, turning it into a specialized model tailored to specific needs. By training the model on a more focused dataset, fine-tuning allows it to adapt and optimize its parameters for the nuances and intricacies of the new task, thereby improving its accuracy and effectiveness. This approach is efficient because it builds on the extensive learning already embedded in the pre-trained model, requiring fewer computational resources and less time than training a model from scratch.
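The sketch below shows what a basic supervised fine-tuning loop can look like with the Hugging Face Transformers library. The base model, the dataset file, and all hyperparameters are placeholder assumptions chosen only to keep the example small and runnable.

```python
# Minimal supervised fine-tuning sketch with Hugging Face Transformers.
# Base model, dataset file, and hyperparameters are placeholders.
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

model_name = "gpt2"  # placeholder base model
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(model_name)

# Assume a small domain-specific corpus stored as plain text, one example per line.
dataset = load_dataset("text", data_files={"train": "domain_corpus.txt"})["train"]
tokenized = dataset.map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=512),
    batched=True,
    remove_columns=["text"],
)

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="finetuned-model",
        num_train_epochs=1,
        per_device_train_batch_size=4,
    ),
    train_dataset=tokenized,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```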
Retrieval-Augmented Generation (RAG)
Implement RAG techniques, which combine the generative capabilities of LLMs with external data retrieval mechanisms. By accessing and incorporating relevant information from external sources, the model can produce more accurate and contextually appropriate responses.
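A minimal RAG sketch is shown below: it retrieves the most relevant document with a simple TF-IDF similarity search and instructs the model to answer only from that context. It assumes scikit-learn and the openai client; the documents, question, and model name are placeholders.

```python
# Minimal RAG sketch: TF-IDF retrieval followed by grounded generation.
# Assumes scikit-learn and the `openai` client; documents and model name are placeholders.
from openai import OpenAI
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

documents = [
    "The Great Wall of China was built by successive Chinese dynasties; "
    "most surviving sections date from the Ming dynasty (1368-1644).",
    "Berlin has been the capital of reunified Germany since 1990.",
]
question = "Who built the Great Wall of China?"

# Retrieval step: rank documents by cosine similarity to the question.
vectorizer = TfidfVectorizer()
doc_vectors = vectorizer.fit_transform(documents)
query_vector = vectorizer.transform([question])
best_doc = documents[cosine_similarity(query_vector, doc_vectors).argmax()]

# Generation step: the model is told to answer only from the retrieved context.
client = OpenAI()
response = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder model name
    messages=[
        {"role": "system",
         "content": "Answer using only the provided context. "
                    "If the context is insufficient, say you do not know."},
        {"role": "user", "content": f"Context:\n{best_doc}\n\nQuestion: {question}"},
    ],
)
print(response.choices[0].message.content)
```

In a production setting the TF-IDF search would typically be replaced by dense embeddings and a vector database, but the retrieve-then-generate structure stays the same.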
Incorporating reasoning capabilities
Enhance the model's reasoning abilities by prompting it to provide evidence for its claims or to generate alternative explanations. This step involves the model cross-verifying its responses and improving the logical consistency of its output.
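One lightweight way to apply this idea is a two-step "answer, then verify" loop, sketched below. It assumes the openai client; the question, prompts, and model name are placeholders.

```python
# Minimal "answer, then verify" sketch: the model first states its evidence,
# then cross-checks its own answer. Assumes the `openai` client; prompts and
# model name are placeholders.
from openai import OpenAI

client = OpenAI()
model = "gpt-4o-mini"  # placeholder model name

question = "In which century were the Ming-era sections of the Great Wall of China built?"

# Step 1: ask for an answer together with the evidence behind it.
draft = client.chat.completions.create(
    model=model,
    messages=[{"role": "user",
               "content": f"{question}\nAnswer, then list the evidence that "
                          "supports your answer step by step."}],
).choices[0].message.content

# Step 2: ask the model to flag and correct any claims not supported by that evidence.
review = client.chat.completions.create(
    model=model,
    messages=[{"role": "user",
               "content": "Check the following answer for claims that are not "
                          "supported by the evidence given, and correct them:\n\n"
                          f"{draft}"}],
).choices[0].message.content

print(review)
```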
Few-shot learning
Few-shot learning is an advanced training technique where a model learns to make accurate predictions by training on a very small number of labeled examples. The primary goal in traditional few-shot frameworks is to develop a similarity function that can effectively map the similarities between classes in the support and query sets. By providing the model with a few examples or demonstrations related to the task, few-shot learning helps establish a clearer context, guiding the model towards generating more accurate and relevant responses. This approach is particularly beneficial in scenarios where obtaining a large amount of labeled data is impractical, allowing the model to quickly adapt to new tasks with minimal data.
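For LLMs, few-shot prompting is often realized purely in context: a handful of worked examples in the prompt establishes the expected format and level of caution. The sketch below assumes the openai client; the demonstrations and model name are placeholders.

```python
# Minimal few-shot prompting sketch: demonstrations in the prompt show the model
# how to answer and when to admit uncertainty. Assumes the `openai` client;
# examples and model name are placeholders.
from openai import OpenAI

client = OpenAI()

few_shot_messages = [
    {"role": "system",
     "content": "Answer factual questions concisely. If unsure, answer 'I don't know.'"},
    # Demonstrations: question/answer pairs the model should imitate.
    {"role": "user", "content": "What is the capital of France?"},
    {"role": "assistant", "content": "Paris"},
    {"role": "user", "content": "Who wrote the novel 'Qwertyuiop Chronicles' (1850)?"},
    {"role": "assistant", "content": "I don't know."},
    # The actual query.
    {"role": "user", "content": "What is the capital of Germany?"},
]

response = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder model name
    messages=few_shot_messages,
)
print(response.choices[0].message.content)
```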
Decomposing complex tasks
Simplify complex tasks by breaking them down into smaller, more manageable components. This decomposition enables the model to focus on specific aspects of the task, thereby reducing the likelihood of generating hallucinations and improving overall reliability.
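The sketch below illustrates one possible decomposition pattern: answer narrowly scoped sub-questions first, then compose the final answer from the intermediate results. It assumes the openai client; the sub-questions, helper function, and model name are placeholders.

```python
# Minimal task-decomposition sketch: answer sub-questions separately, then
# compose the final answer from them. Assumes the `openai` client; the `ask`
# helper, sub-questions, and model name are placeholders.
from openai import OpenAI

client = OpenAI()
model = "gpt-4o-mini"  # placeholder model name

def ask(prompt: str) -> str:
    """Send a single prompt and return the model's reply."""
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content

complex_question = ("Compare the construction periods of the Great Wall of China "
                    "and Hadrian's Wall.")

# Step 1: answer narrowly scoped sub-questions one at a time.
sub_questions = [
    "When were the major Ming-era sections of the Great Wall of China built?",
    "When was Hadrian's Wall built?",
]
sub_answers = [ask(q) for q in sub_questions]

# Step 2: compose the final answer only from the intermediate findings.
context = "\n".join(f"- {q}\n  {a}" for q, a in zip(sub_questions, sub_answers))
final_answer = ask(f"Using only these findings:\n{context}\n\n{complex_question}")
print(final_answer)
```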