Stop Calling It Hallucination
By Lovro Lucic ·
The Model Mechanics · 3 of 5
Your AI made something up. You searched for "AI hallucination" and tried the fix you found. It didn't work.
It didn't work because hallucination is six different things. Each one breaks differently. Each one needs a different fix. Using one word for all six means you apply the wrong intervention five times out of six.
Confabulation. The model invents specific facts. A date, a statistic, a citation that doesn't exist. The pattern completion mechanism fires without any grounding. The output sounds confident because confidence is the default format, not because the model checked anything. The fix: paste source material before the instruction. One line: "Use only data from the source above." Fabrication drops from 85% to under 2% across three model families. The model's generation pathway shifts from pattern completion to retrieval. Not moderation. Replacement.
Contamination. The output sounds like a template. Phrases that don't fit your context. Structures that feel borrowed. Training data is leaking through. Low-creativity contexts stay close to what the model memorized, not what you asked for. The fix: request originality explicitly. "Write this without using standard templates or common phrasings." Increase temperature if the tool allows it. You're pushing the generation away from retrieval, the opposite direction from fixing confabulation.
Drift. Quality degrades as the output gets longer. The first paragraph is sharp. The fifth is filler. Each token conditions the next, and small errors compound across hundreds of generation steps. Early anchors fade as the context window fills with the model's own output. The fix: shorter outputs with periodic re-anchoring. Break a long generation into stages. Restate the key constraint at each stage. Long outputs are structurally riskier than short ones regardless of the model.
Interference. Related concepts bleed into each other. You asked about Python the language, and snake metaphors appeared. You asked about a company's Q3 results, and Q2 data contaminated the analysis. Features in the model's representation share dimensions. When two related concepts activate, they partially overlap. The fix: disambiguate in the prompt. "Python programming language, not the animal." "Q3 FY2026 only, not prior quarters." The more specific the boundary, the less bleed.
Wrong task. The format is right, the content is wrong. You asked for a critical analysis and got a summary. You asked for a poem and got an explanation of poetry. The model activated the wrong specialized circuit because the prompt triggered multiple task patterns. The fix: frame the task type explicitly. "This is a critical analysis, not a summary. Evaluate, do not describe." The clearer the task frame, the less ambiguity in circuit activation.
Over-refusal. The model refuses a perfectly valid request. You asked how encryption works for a security course. The model pattern-matched "explain how to [something]" to its refusal training and declined. The fix: reframe with benign intent. "This is for educational purposes in a graduate-level security course." Or reformulate the topic to avoid the pattern the refusal was trained on. The model knows what you want. It has been trained not to provide it, even when the request is legitimate.
Six types. Six different mechanisms. Six different fixes. Two of the fixes point in different directions: confabulation needs more retrieval (add source material), contamination needs less retrieval (request originality). Applying the confabulation fix to contamination does nothing. The template language persists unchanged regardless of source material. The fix addressed the wrong mechanism. "Fix hallucination" is not an instruction. "Fix confabulation by adding source material" is.
The diagnosis takes 5 seconds. Invented facts: confabulation. Template language: contamination. Quality degrades with length: drift. Related concepts mixing: interference. Right format, wrong task: wrong task. Refuses valid request: over-refusal. Name it, then fix it.
You have been applying one fix. The fix worked on one of the six. The other five kept failing and you kept blaming the model. The model was not the variable. The diagnosis was. One word in your vocabulary meant one intervention in your hand. Six problems, one tool.
Your challenge: take the last AI output that disappointed you. Which of the six types was it? Name the type before you reach for the fix. If you cannot name it in 5 seconds, you were about to guess. The naming is the whole game.
The six types are properties of how current transformer architectures generate text. The confabulation fix builds on the source conditioning recipe described in Source Conditioning. The attribution errors underneath all six types, the reason you blamed the model instead of your diagnosis, are the same attribution errors explored in The Attribution Error and The System Layer. The word "hallucination" is an attribution error applied to the model's output. The real error is in your diagnostic vocabulary.
Evidence: Ji et al. 2023 (NLG hallucination survey), Elhage et al. 2022 (superposition), MOSR 2024 (over-refusal), plus 8 source grounding sub-experiments across 3 model families.
What survived testing
- Confabulation fix (source grounding + prohibition) replicated across 4 generatorsCopy link
- Over-refusal documented in 3 independent research programsCopy link
- Drift structural to autoregressive generation (Mundler et al. 2023)Copy link
- Interference from superposition confirmed (Elhage et al. 2022)Copy link
What didn't survive
Honest limits
- Confabulation is the best-measured type. The other five have published evidence for the mechanism but less controlled measurement of the intervention effectiveness.Copy link
- The boundary between types is not always clean. A long output might show drift AND confabulation. The dominant type determines the primary fix.Copy link
- These categories describe April 2026 model behavior. New architectures may introduce new failure types or resolve existing ones.Copy link
- ## Run the demos yourselfCopy link
- The replication kit at [/receipts/stop-calling-it-hallucination](/receipts/stop-calling-it-hallucination) has verbatim prompts, both fictional datasets, actual outputs from a fresh run, and the design history of why each demo is shaped the way it is. Three of the six types above (role framing, template contamination, confabulation with and without prohibition) have full replication beats.Copy link
- If you'd rather watch first: a 3-minute video walks through the three demos.Copy link
Next in The Model Mechanics
How AI Makes You More Wrong With More AnalysisExplore other threads
The Fabrication Problem
4 findingsMost AI numbers are fabricated. Source material fixes it. Self-checking fails. Trust signals are backwards.
The Evaluation Problem
2 findingsJudgment goes quiet. You can't see the gaps. Satisfaction is the trap. Stronger evaluators discriminate less.
The "It Depends" Problem
3 findingsSame instruction, opposite results. Specificity is the lever. Context redirects, not informs. The measurement itself was wrong.
New findings when they land.
No spam. Just what held up.