
Why AI’s overconfidence in healthcare is a crisis we cannot ignore


Dr. Satadal Saha, Dean of the School of Health Environment & Sustainability Studies, TCG CREST, believes that as India builds digital health infrastructure at scale, the country has a real opportunity to define a model in which AI expands access to care without weakening quality or safety

Yesterday, I had a phone call from one of the most celebrated scientists in India, visiting the USA with his family. He was panicking that his wife had developed some dark patches on her back and that this could be a sign of occult cancer (we all know how he found that one out!). I checked the pictures sent to me over the ubiquitous WhatsApp, and although I am not a dermatologist, I had enough knowledge to advise him to continue enjoying the drive through the West Coast. The reverse is also possible: lulled into a false sense of security by advice from an AI chatbot about a seemingly innocuous symptom (such as heartburn), the family members find it difficult to wake you up the next morning. We are witnessing such circumstances with disturbing frequency; enough to make institutions of learning, medicine and public leadership evaluate the role of artificial intelligence in healthcare, lay down guidelines and build guardrails.

Before we get into a critical evaluation of AI chatbots, let us try to understand the nuances of medicine. Medicine is an imperfect science; its practice is an art that holds the ‘science’. The doctor isn’t just asking you questions in a certain sequence (which an AI chatbot can do just as well); the doctor is also establishing a construct and a context around your responses to those questions, observing your cognitive conduct, validating multiple social, economic and professional factors around you, and finally, juxtaposing the ‘whole’ against his or her cumulative wisdom. Oftentimes it is not about reaching an accurate diagnosis before your treatment can start; it is about determining a clinical management pathway that is safe, evidence-based and likely to be successful, and that, even when it is not, will surely keep you out of harm’s way.

So, the practice of medicine isn’t just about information (AI models can have more information than an individual doctor); it is about how information is processed into knowledge, and how knowledge is transformed into wisdom. Take a common symptom, pain, for example. Description, language, tolerance and attitude to pain vary enormously between individuals, and it requires wisdom to unravel what exactly the patient is going through when describing pain. It is not binary. The practice of medicine is a social transaction, deeply rooted in anthropology, sociology and societal circumstances, and it is only when these are factored in that a patient can expect to be diagnosed or treated with a global-standard outcome.

Think about AI and an AI chatbot now!

A recent study published in BMJ Open found that AI chatbots, including ChatGPT, Gemini, Meta AI, Grok and DeepSeek, produced clinically problematic advice in nearly half of the cases examined. Around one in five responses was classified as carrying a high risk of patient harm. And the chatbot does so with élan, quite like a smooth talker leading you down the wrong path, with confidence! The danger is a double-edged sword: the errors themselves, and the confidence with which a clear and present safety risk is presented.

And that is because, whereas the chatbot has an enormous amount of information and very high processing power to reach quick decisions and conclusions (more, and faster, than most doctors), its processing lacks judgement. It lacks contextual understanding; it fails to distinguish the important data points from the irrelevant ones; it does not take into account the educational, cultural, behavioural and experiential backgrounds of the patient that colour his or her description of symptoms. It fails to assimilate conflicting pieces of information from the patient, fails to filter the noise from the symphony. It can produce the language of clinical assurance without the philosophical understanding that must sit behind it.

In India, this failure mode carries particular weight. Health-seeking behaviour is already marked by delay. Symptoms are often normalised, attributed to minor causes or managed at home until they become impossible to ignore. An AI system that confidently endorses that instinct, even inadvertently, can widen the gap between the onset of illness and clinical care. Design adds another layer of risk. Most consumer AI tools assume that the user can frame a precise and well-structured question. Patients rarely present that way. They arrive, even at a digital interface, with fear, fragmented descriptions, culturally shaped explanations of illness and vocabularies that vary across India’s languages and regions. A system that works safely only when the user asks the right question has quietly shifted clinical responsibility to the person least equipped to carry it.

This is not an argument against AI in healthcare. It is an argument for clearer boundaries. AI has shown real value in roles that support clinical judgment. It can assist with imaging interpretation and digital pathology, generate advisory evidence for the sound practice of medicine, support risk assessment, improve medication adherence, strengthen public health communication and reduce the administrative burden on clinicians. These are meaningful contributions. And as AI models advance further, these roles will likely expand. But most models are simply not yet ready for direct interaction with the patient.

India needs a governance framework that classifies health AI tools by the nature and severity of their possible clinical impact. Accountability should then follow the level of risk. Any system that interprets symptoms, shapes treatment choices or offers reassurance in potentially acute situations must undergo rigorous clinical validation, transparent classification, continuous monitoring after deployment, and an escalation mechanism that actually works through linkage with service providers. Such systems should not merely produce a conversational answer that sounds reassuring. Validation also has to take place in Indian conditions, across languages, literacy levels and real patterns of patient behaviour, rather than only in controlled, English-language research settings.

India is building digital health infrastructure at scale. This gives the country a real opportunity to define a model in which AI expands access to care without weakening quality or safety. That will require generating a vast amount of data from India within existing regulatory frameworks, clear regulation with defined accountability, clinical integration that keeps AI an aid to professional judgment, and sustained investment in public and professional literacy so that these tools are used with care.

The young professional who, in the middle of the night, sought advice from an AI chatbot for some chest discomfort, and was reassured of its innocuous nature, is still asleep. Perhaps the AI was right. Perhaps it merely sounded right. That distinction matters enormously, and she had no reliable way of making it. The task before academic and medical institutions is to ensure that the systems shaping such moments are held to the same standard of accountability we would expect of any other clinical intervention. Confidence without validated judgment is not a strength. In medicine, it is a hazard. Institutions of learning are uniquely placed to recognise that hazard, study it and help build safer ways forward.

