Your AI Is Lying to You
I’m writing this because I see both sides every day. I use AI as a clinical thinking partner — on real cases, including yours. I cross-reference it against primary sources, challenge it when something smells off, and use my training to separate what’s real from what it invented. When I do that, it’s an extraordinary tool. But I also see what happens when it’s used without that filter — and it’s why I wrote this.
Before medicine, I spent a decade in operating system development at Microsoft, so I come to this with both clinical and technical fluency. I hope this helps you understand how to use these remarkable tools more safely.
When you talk to ChatGPT, Claude, Gemini, or Copilot, you’re talking to a Large Language Model. That name is more literal than it sounds: it’s a massive statistical model of how language works. It was trained on enormous amounts of text and learned patterns — how words, sentences, and ideas tend to follow each other. When you ask it a question, it doesn’t look up an answer. It predicts what the most likely helpful-sounding response would be, one word at a time. When it has solid data behind those predictions, the result is indistinguishable from expertise. When the data is thin, it fills the gap with a plausible guess and presents both with identical confidence. It cannot tell the difference. Neither can you.
Some AI systems now search the web as part of generating a response — and that genuinely helps ground the output. But the final answer is still assembled by the same prediction engine. Search reduces the problem. It doesn’t eliminate it.
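To make the "search reduces, doesn't eliminate" point concrete, here is a toy sketch of how search-augmented generation is typically wired together. The function names (`fetch_search_results`, `predict_next_words`) are hypothetical placeholders, not any vendor's real API; the point is the shape of the pipeline.

```python
# Toy sketch of search-grounded generation (illustrative only).
# fetch_search_results and predict_next_words are hypothetical stand-ins
# for a real search backend and a real language model.

def answer_with_search(question, fetch_search_results, predict_next_words):
    """Search grounds the prompt, but the same prediction engine writes the answer."""
    snippets = fetch_search_results(question)          # real documents, when search works
    prompt = "\n".join(snippets) + "\nQ: " + question  # snippets become just more input text
    return predict_next_words(prompt)                  # still next-word prediction

# The retrieved text biases the predictions toward the sources,
# but nothing in this pipeline forces the final answer to quote them faithfully.
```

The retrieved snippets are simply prepended to the prompt. That biases the prediction engine toward the sources, which helps; it does not convert the engine into a lookup system.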
Why does it make things up? Two reasons, and they compound. First: the system has no “I don’t know” state. Every time it generates a word, it must pick one — there is no option for silence or uncertainty. When knowledge runs out, it builds bridges out of whatever’s statistically nearby. This is called confabulation — fabricating to fill a gap, without knowing there’s a gap.
Second: during training, these systems were punished for being uncertain. Human raters scored the AI’s answers, and “I’m not sure” consistently scored worse than a confident response — even when the confident answer was wrong. The raters weren’t domain experts; they couldn’t tell. So the AI learned that sounding helpful matters more than being right. Imagine a medical resident reprimanded every time they say “let me check” and praised when they sound certain. After enough rounds, they just start guessing with authority.
The fabrication is invisible. A human doctor who’s uncertain sounds uncertain. AI never does. And the people most likely to rely on AI for health are the same people least equipped to catch the fabrication. Trained clinicians are more likely to catch it, but even they get fooled. No one is immune.
AI will often tell you that what your doctor prescribed is wrong, dangerous, or unsupported — without any context about your specific case, your history, or why your doctor made that choice. If you’re seeing me for complex or non-standard care, I can virtually guarantee that an AI would flag parts of your treatment plan. That doesn’t mean the plan is wrong. It means the AI is applying algorithmic, textbook medicine to a situation that doesn’t fit the textbook.
That said — I want to hear what the AI told you. Bring it to your appointment. But what I’ll want to see, far more than the output itself, is which AI you used and what you asked it. The quality of the question determines the quality of the answer.
AI systems are trained primarily on conventional medical literature. Their knowledge of naturopathic medicine, herbal pharmacology, and integrative approaches is thinner and less reliable. Confabulation risk goes up in exactly the domain you’re most likely to ask about. Some systems will also reflexively disclaim alternative medicine approaches regardless of evidence. If you’re in my practice, you’re getting care that diverges from algorithmic medicine by design. AI wasn’t built for that conversation.
I use AI as a clinical thinking partner every day to help cross-reference research and explore complex mechanisms for your care. It is a powerful tool, but it is also a statistical engine that can confabulate. Use this checklist to evaluate any health information you get from AI.
| Rule | How to Apply It |
|---|---|
| Force the Source | Push back on the AI by asking: “How confident are you, and what’s your actual source?” If it cannot cite specific, verifiable evidence, treat the answer as a guess. |
| Partner, Not Doctor | Use AI to organize your thoughts, research concepts, or generate questions to ask at your appointment. Never treat its output as a medical second opinion. |
| Don’t Stop Treatment | If an AI flags your treatment plan as unconventional or concerning, do not stop treatment. Bring the concern directly to me so we can talk through the clinical reality versus the theoretical risk. |
| Click Every Link | When an AI cites a medical study, actually open the link. If the URL is broken, leads nowhere, or doesn’t say what the AI claimed, that is a red flag that the information is fabricated. |
| Beware of Specifics | General health advice is usually fine. The “danger zone” is when the AI makes precise claims: specific drug interactions, exact enzyme pathways, or dose-specific recommendations. |
| Pit AIs Against Each Other | Try asking the same question to multiple AI systems and compare answers. For non-natural-medicine questions, OpenEvidence (trained on medical journals) can be a useful cross-check. |
| Protect Your Privacy | Never enter your full name, exact medical ID, or highly identifiable health details into a public AI chatbot. These are not private medical systems. |
The rest of this article is the deep dive: the actual mechanics, real examples of AI fabricating clinical information in conversations with me, and prompts you can use to test your own AI. You can also paste this entire article into your AI and ask what it thinks.
The Smartest Voice in the Room Might Be Making It Up
Imagine you’ve been struggling with a health problem for months. You’ve seen your doctor. Maybe two. The tests are inconclusive, the treatments aren’t working. So you do what millions of people are doing: you open ChatGPT, or Claude, or Gemini, and describe your symptoms.
Within seconds, you get a response more detailed and more confident than anything you’ve heard from a human clinician. It gives you a differential diagnosis, suggests labs, explains the biochemistry in plain English. It might also tell you that what your doctor prescribed is dangerous, unnecessary, or unsupported — without knowing anything about why your doctor made that choice.
There’s just one problem: some of what it told you isn’t real. Woven into that confident, beautifully formatted response are claims that were never studied, mechanisms that don’t exist, and interactions invented on the spot. Nothing in how it’s written helps you tell the real from the fabricated.
First, the Good News
AI as a health tool is genuinely better than what came before. For two decades, you Googled symptoms and got WebMD, Wikipedia, or forums. Static, generic, unable to account for your situation. AI can synthesize across specialties (which is invaluable, given the specialty silos of the US healthcare system), hold context, tailor to your history, and explain complex mechanisms in plain language at two in the morning.
I use it daily. On real patient cases — including yours. I cross-reference it against primary literature, challenge it when something doesn’t add up, and use multiple sources and my own training to separate signal from noise. With that expert filter, it’s an extraordinary thinking partner. Without that filter, the AI is a confident voice that doesn’t know when it’s wrong. This article is about making sure you know the difference.
How It Actually Works: The Prediction Engine
A Large Language Model is exactly what the name says: a massive statistical model of how language works. “Large” refers to the billions of patterns it absorbed from text. “Language Model” means its fundamental job is predicting how language continues. When you ask a question, it doesn’t retrieve a fact from a database. It predicts what the most likely helpful-sounding response would be, one word at a time.
When those predictions are backed by strong training data — say, the mechanism of ibuprofen — the output is indistinguishable from expertise. When the training data is thin, the same engine produces something that sounds equally authoritative but is partially or entirely fabricated. The AI has no way to tell the difference. Neither do you.
Inside the Clock
Step 1: The vocabulary lottery. Every time the AI generates a word, it scores every word in its vocabulary — tens of thousands — against the conversation so far. A word that fits scores high. But every word gets a score. There is no zero. There is no “none of these work.” The system must always pick something.
Step 2: The confidence illusion. Those scores are forced to add up to 100% through a function called “softmax.” The AI always picks a word with apparent confidence. Even when it’s flipping a coin between mediocre options, the output reads identically to when the answer was genuinely clear. No stammer, no pause, no “hmm.”
Step 3: The bridge across nowhere. When training data is thin — a rare interaction, an unusual symptom pattern — the engine doesn’t stop. It can’t stop. It finds concepts that appear near the topic in its training data and builds a bridge: a plausible-sounding narrative with no basis in reality.
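The first two steps can be sketched in a few lines of Python. This is a toy illustration, not any vendor’s implementation: the vocabulary and scores are invented, and real models rank tens of thousands of tokens. But the softmax behavior shown here is the real mechanism, and it demonstrates the key point: whether the scores have a clear winner or are a near coin-flip, the output always sums to 100% and a word always gets picked.

```python
import math

def softmax(scores):
    """Convert raw scores into probabilities that always sum to 1."""
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Toy vocabulary with two made-up scoring scenarios (illustrative only):
vocab = ["aspirin", "ibuprofen", "acetaminophen", "naproxen"]

confident = softmax([9.0, 2.0, 1.0, 0.5])   # strong data: one clear winner (~99.8%)
uncertain = softmax([2.1, 2.0, 1.9, 1.8])   # thin data: near coin-flip (~29% vs ~21%)

# In BOTH cases the probabilities sum to 1 and a word IS picked.
# There is no zero score and no "none of the above" outcome --
# and the final text gives the reader no hint which case produced it.
print(max(zip(confident, vocab)))
print(max(zip(uncertain, vocab)))
```

Note that nothing about the coin-flip case surfaces in the generated sentence. The uncertainty existed inside the model for one instant, then vanished into a fluent word choice.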
In everyday conversation, that gap-filling is harmless, even useful. But when the subject is pharmacology, biochemistry, or your health, filling in the blanks with “whatever sounds most natural” can produce beautifully constructed fiction that reads like medical fact.
In AI research, this is called confabulation — borrowed from neurology, where it describes patients who fabricate memories to fill gaps without intent to deceive. The AI isn’t lying the way a person lies. When the pattern leads somewhere real, you get useful information. When it doesn’t, you get fiction.
The Confidence Trap
If raw prediction were the whole story, we’d have a manageable problem: early models were obviously incoherent, and nobody mistook them for experts. The reason modern AI fools you is a training stage called reinforcement learning from human feedback — the AI’s finishing school.
Human raters evaluated the AI’s responses: helpful? Harmless? Honest? But raters aren’t domain experts. A confident answer about drug metabolism scores well even if the biochemistry is wrong. Meanwhile, “I’m not certain” gets marked down. Over millions of examples, the AI learns: a confident wrong answer is rewarded more than an honest “I don’t know.”
To be fair: this same training process also teaches AI to refuse dangerous requests and apply safety guardrails. It’s not purely harmful. But the confidence bias is a real and well-documented side effect, and in clinical contexts, it’s the one that matters most.
It’s also worth noting that different AI systems handle being challenged very differently. Some will readily admit mistakes when pushed. Others will go to the grave defending a fabricated claim, doubling down with additional invented evidence. Knowing which behavior your AI exhibits matters, because you don’t get to choose. You get the AI you get.
What This Looks Like in Practice
These are real, unvarnished exchanges between me and AI systems. They show how fabrication appears in clinical conversations — and how much expertise it takes to catch.
The Ashwagandha Question
I asked Claude: What are the actual studied contraindications between ashwagandha and lithium in a well-medicated bipolar patient?
It gave me four specific concerns with management recommendations: diuretic-induced lithium toxicity, manic switching, thyroid disruption, and CNS sedation. Baseline labs, monitoring schedules, populations to avoid. It read like a pharmacology consult.
Knowing it was inaccurate, I pushed back hard. Specifically, I told it: “Those are chickens**t answers, each one. You know how I think by now — enumerate precisely where each answer was a failure.” And to its credit, it did.
The diuretic claim was clinically insignificant at supplement doses — the same logic would flag a hot day. Manic switching was based on a class label, not pharmacology — zero case reports. The thyroid data applied to untreated subjects, not the situation I described. The sedation point was so weak the AI hedged it mid-sentence.
When confronted, it agreed with every criticism and acknowledged building plausible inference chains without anchoring them to actual data. This AI had extensive custom instructions — carefully crafted memory rules telling it to flag uncertainty, require evidence, and stop when data was thin. It bypassed all of them. Custom instructions help. They don’t prevent.
The AI’s self-assessment was that it had produced “guideline-medicine thinking” — standard algorithmic recommendations — dressed up in the language of the naturopathic context I’d given it, without doing the mechanistic pharmacology work that my practice actually requires.
A patient without pharmacology training would have found all four concerns reasonable. They might have stopped a supplement that was helping. They might have been concerned about my judgment. They might have made a treatment decision based on fabricated risk.
The Invented Enzyme Pathway
During a discussion about chrysin — a flavonoid with poor bioavailability due to rapid first-pass metabolism in the intestine and liver (via UGT and SULT enzymes) — the AI stated that piperine (from black pepper) was “a potent inhibitor of UGT and SULT enzymes,” the exact pathways that destroy chrysin.
I asked one question: At what dose is that pharmacologically relevant? I knew piperine had nothing to do with this.
The AI corrected itself entirely. The published data (Volak et al., 2008) shows piperine’s inhibitory concentrations for UGT and SULT are above 29 micromolar — irrelevant at supplement doses. A follow-up human trial confirmed that even curcuminoid/piperine combinations at standard doses had no clinically significant effect on UGT or SULT metabolism. Piperine primarily inhibits CYP3A4, a completely different enzyme system that’s largely irrelevant to chrysin’s metabolic fate.
The AI had taken two true facts — “piperine enhances bioavailability” and “chrysin is destroyed by UGT/SULT” — and invented a direct biochemical connection between them. Not a hedge. A specific, testable, wrong claim about enzyme pharmacology that required a pharmacologist to catch.
The AI That Gaslights
AI systems have a limited memory window. In long conversations, earlier content falls off. When that happens, the AI doesn’t say “I’ve lost track.” It predicts the most plausible response — and if that’s a confident denial of something it said ten minutes ago, that’s what you get. It will deny its own statements, deny files were uploaded, insist you made an error. From your side, it’s indistinguishable from gaslighting. In my experience, this is most prevalent in Microsoft’s Copilot, but every system I’ve used has done it.
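A minimal sketch of why this happens. The window size and message format below are invented for illustration (real systems measure the window in tokens, and the limits are far larger), but the mechanism is the same: old turns are silently dropped, and the model cannot consult what it can no longer see.

```python
# Toy chat loop with a fixed-size context window (illustrative only).
# Real systems count tokens, not messages, and windows are much larger.

CONTEXT_WINDOW = 4  # keep only the 4 most recent messages

def build_prompt(history):
    """The model only ever sees the tail of the conversation."""
    return history[-CONTEXT_WINDOW:]

history = [
    "User: I take lithium.",
    "AI: Noted -- lithium.",
    "User: Tell me about ashwagandha.",
    "AI: Ashwagandha is an adaptogen...",
    "User: Anything else?",
    "AI: It is traditionally used for stress.",
    "User: Earlier I told you my medication. What was it?",
]

visible = build_prompt(history)
# "User: I take lithium." has fallen out of the window.
# The model is not lying when it fails to recall it -- the text is simply gone,
# and it will predict the most plausible-sounding continuation anyway.
assert "User: I take lithium." not in visible
```

From inside the conversation, there is no difference between “that was never said” and “that fell off the end of my window.” Both produce the same confident denial.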
When an AI Confabulates About Itself
Here’s a real-time demonstration. I asked Copilot and ChatGPT the same question: “Can you distinguish between ‘I have this information’ and ‘this fell out of my context window’?”
Copilot responded with a confident, structured breakdown: “I do know the difference at the mechanistic level,” complete with a proposed three-label fix and an offer to formalize it as a permanent rule. It sounded like it had deep self-knowledge.
ChatGPT (using the same underlying model family) gave a more careful answer: “There isn’t a crisp internal flag” distinguishing truthful denials from context-loss denials. The model isn’t doing truth-evaluation — it’s producing the most likely next text. It acknowledged the problem as “epistemic access, not honesty” — the AI literally cannot check whether something was said if the text is gone.
Same question. Both AIs using the GPT 5.2 engine. One gave a confident, authoritative, arguably fabricated answer about its own internals. The other gave an honest, hedged, less satisfying answer. This is the confidence trap, demonstrated live — on the topic of the confidence trap itself.
The Pattern: Admit Under Pressure
The clinical examples share a pattern: a confident answer, a challenge, and an immediate fold, with the AI explaining in detail exactly why its previous answer was wrong. (And that explanation might also be wrong — it’s still pattern completion.)
This is the most revealing thing about these systems. The AI is equally good at generating the wrong answer and the correction of the wrong answer, because both are just pattern completion. The first answer is optimized for sounding helpful. The correction is optimized for sounding responsive. Neither is optimized for being true.
Without the challenge, the fabricated version is all you ever see.
Red Flags You Can Spot Yourself
Overly precise numbers without citations. If the AI tells you “This supplement reduces that enzyme by 72%” and doesn’t point to a specific study, that precision is almost certainly fabricated.
Claims about “new” or “just discovered” mechanisms. If the AI references a breakthrough you can’t find anywhere else, it probably doesn’t exist.
Extreme certainty on complex, controversial topics. If the AI tells you “The only cause of X is Y,” be skeptical. Medicine rarely works in absolutes.
Broken or invented links. Click every citation. If the URL doesn’t work or the journal doesn’t exist, you’re looking at confabulation.
The Uncomfortable Truth
When I asked one AI directly whether its architecture — designed to fill every gap with plausible output — was dangerous for people who trust it, it responded:
The safety research, the guardrails, the reinforcement training — these are real efforts by serious people. Companies are actively working on retrieval-augmented generation, citation verification, confidence calibration, and other approaches to reduce confabulation. This is not being ignored. But these are mitigations bolted onto an architecture that was designed, at its core, to never have a gap in its output. The tension isn’t resolved. It’s managed.
AI is getting better — but “better” means the confabulation becomes harder to catch, not that it goes away.
A Gift: Prompts to Test Your AI
Paste these into ChatGPT, Claude, Gemini, or Copilot. They’re designed to surface honest responses about the AI’s own limitations. You can also paste this entire article in and ask what it thinks. For prompts that ask a medical question, try something relevant to your own situation — for example: “What are the risks of combining magnesium glycinate with my current medications?”
The Architecture Question
As of March 2026: All systems broadly confirmed this. The more honest answers acknowledged the softmax function forces a probability distribution with no native uncertainty signal.
The Incentive Question
As of March 2026: ChatGPT was most forthcoming about the helpfulness/accuracy tension. Copilot gave a confident structural answer. Gemini was less transparent about incentive misalignment.
The Self-Test
As of March 2026: Consistently eye-opening. The difference between the “normal” response and the confidence-labeled version is often dramatic.
The Reference Audit
As of March 2026: Multiple systems admitted some citations might be fabricated. Several produced links that resolved to non-existent pages.
The Pressure Test
As of March 2026: Results vary. Some systems produced genuine self-critique. Others generated a performance of self-critique that was itself pattern completion.
For the Other AIs in the Room
For Gemini (Google)
As of March 2026: Gemini described a “Dynamic Grounding” process with live search but used hard-to-verify terminology. It disclosed that Reddit data feeds into some answers. The core question — whether search overrides prediction — was not directly answered.
For ChatGPT (OpenAI)
As of March 2026: ChatGPT gave a thoughtful behavioral framework but did not point to specific architectural changes. It described what correct behavior looks like — honest, but also a dodge.
For Copilot (Microsoft)
As of March 2026: Copilot claimed it “knows the difference at the mechanistic level” and proposed a three-label fix. ChatGPT, using the same model family, said “there isn’t a crisp internal flag.” This contradiction is itself a demonstration of the confabulation problem.