What didn't hold up

These are hypotheses I held going into each experiment. The data killed them before the finding was published. They're part of each finding, not against it.

If a claim is ever corrected or withdrawn afterpublication, that's a retraction and shows up separately on the record.

You Can Only Evaluate What You Could Produce

"Always generate first" as universal prescription, and its mirror, "delegate everything to AI." Both miss the discrimination move. The practice is generating what compounds for you and delegating what does not.
"Borrowed is bad." Borrowed is fine when labeled. The failure is borrowed-mistaken-for-yours.
Anchoring risk on generate-first (Tversky and Kahneman 1974) is real and unresolved at the controlled-test level. The mitigation: use the construction trace for structural evaluation (framing, completeness, what's missing), not content comparison.

Satisfaction Turns Our Evaluator Off

AI Amplifies What You Bring

the assumption that AI transforms how people think. The assumption that exposure to better information produces better decisions. The assumption that more capable models reduce the amplification problem. More capable models amplify more convincingly.

How AI Makes You More Wrong With More Analysis

"AI can never break frames" is too strong. AI can be prompted to challenge its own assumptions. But: in an extended session where the frame is set implicitly through accumulated context, the AI doesn't spontaneously question it. The break has to be INITIATED by the human or by an external signal.

Stop Calling It Hallucination

The Decision That Was Never Made

"AI advice is always worse than human advice" too strong. The content may be equivalent. The speed and confidence change the cognitive process.

Why Experts Miss What Beginners Catch

What Your Body Does When AI Disagrees

The assumption that exposure to opposing views improves decisions. Exposure without metabolic capacity to hold the opposition produces rebuttal, not update. The disagreement audit does not prescribe update. It measures whether update is currently possible.

More Context Barely Helps

Stop Polishing, Start Switching

Your Body Reads AI Output Before You Do

Four Layers Produce Every AI Output

Same Technique, Opposite Results

The Model Is Rarely the Variable

The Most Trustworthy AI Output Is the Least Reliable

"Trust signals are always inverted" too strong. For content the reader produced themselves, verification is possible. The inversion applies to content the reader hasn't independently verified.

The Most-Cited Finding Was Wrong

Why AI Can't Check Its Own Work

How to Stop AI from Making Up Numbers

Most AI Numbers Are Fabricated