A doctor I once spoke with at a small clinic in Karachi has two stacks of paper on her desk. One is patient histories. The other is lab reports she has been meaning to digitize for several months. She estimates that the scanner in her office correctly reads perhaps 60% of what she writes by hand, and she laughs about it in that weary way that doctors do.
She fixes the rest by hand. It's a small detail, but it sits at the core of one of the stranger problems in healthcare technology right now: teaching machines to read the language of medicine.
| Field | Detail |
|---|---|
| Topic | Building datasets for AI interpretation of medical terminology |
| Core Framework Cited | RF-AI framework (Recognition, Formatting, AI-Processing) |
| Primary Technologies | OCR, multimodal LLMs, NLP, deep learning |
| Notable Medical Model | Med-PaLM (Google Research) |
| Common Data Sources | Electronic Health Records, free-text clinical notes, lab reports |
| Largest Ethical Concerns | Bias, transparency, patient consent and confidentiality, accountability |
| Institutions Active in Research | Dartmouth, Google Health, Stanford HAI |
| Adoption Stage | Early clinical pilots, limited deployment |
| Key Challenge | Non-representative, unstructured medical data |
| Regulation Status | Fragmented, lagging behind innovation |
| Year of Recent Acceleration | 2023–2025 |
| Common Failure Mode | Hallucinated diagnoses, missed negatives |
For all the hype around large language models in healthcare, few people appreciate how hard it is to build genuinely useful datasets. Medical English is not a standardized language. It is not even a single language. A radiologist in Boston, a cardiologist in Lahore, and a general practitioner in rural Spain may describe the same condition in three different shorthand styles, each using acronyms unique to their own hospital. Feed that into an AI model trained mostly on tidy American textbook data, and the cracks show quickly.
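To make that concrete, here is a minimal Python sketch of one common preprocessing step: expanding institution-specific shorthand before text ever reaches a model. The abbreviation table is illustrative only; real pipelines draw on curated, hospital-specific dictionaries or ontologies, and nothing here should be read as a clinical resource.

```python
import re

# Illustrative shorthand map; real systems use curated, institution-specific
# dictionaries. These four entries are examples, not a clinical resource.
ABBREVIATIONS = {
    "sob": "shortness of breath",
    "htn": "hypertension",
    "mi": "myocardial infarction",
    "r/o": "rule out",
}

def expand_abbreviations(note: str) -> str:
    """Expand known clinical shorthand to full terms, case-insensitively."""
    def replace(match: re.Match) -> str:
        return ABBREVIATIONS[match.group(0).lower()]
    # One alternation over all keys, longest first so "r/o" is tried before "mi".
    pattern = "|".join(
        re.escape(k) for k in sorted(ABBREVIATIONS, key=len, reverse=True)
    )
    return re.sub(rf"(?<!\w)({pattern})(?!\w)", replace, note, flags=re.IGNORECASE)

print(expand_abbreviations("Pt c/o SOB, hx of HTN. R/O MI."))
# -> "Pt c/o shortness of breath, hx of hypertension. rule out myocardial infarction."
```

Even this toy version shows the core difficulty: the same map that works in one hospital will mistranslate in another, which is exactly why the datasets are so hard to build.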
Researchers describe a field advancing at two speeds at once. In demos, the models keep getting faster, smarter, and more impressive. The data underneath them, however, remains inconsistent. Electronic health records were supposed to fix this.

In practice, much of the useful information still lives in free-text notes, hurriedly written between patients and riddled with negations, hedges, and phrases like "rule out" that mean nearly the opposite of what they appear to say. A model that reads "no evidence of malignancy" as "evidence of malignancy" is not a minor flaw. It is a lawsuit waiting to happen.
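Rule-based tools such as NegEx and its descendants exist precisely for this. Below is a toy sketch of the idea, with a deliberately tiny cue list that no one should mistake for a clinically validated one:

```python
import re

# A NegEx-style toy: flag a finding as negated if a negation cue appears
# before it in the same sentence. Real tools (NegEx, ConText, negspaCy)
# also handle scope termination, pseudo-negations, and uncertainty.
NEGATION_CUES = r"\b(no evidence of|no|denies|without|negative for|rule out)\b"

def is_negated(sentence: str, finding: str) -> bool:
    """Return True if `finding` appears after a negation cue in `sentence`."""
    s = sentence.lower()
    idx = s.find(finding.lower())
    if idx == -1:
        return False  # finding not mentioned at all
    # Does any cue occur before the finding in this sentence?
    return bool(re.search(NEGATION_CUES, s[:idx]))

print(is_negated("No evidence of malignancy.", "malignancy"))   # True
print(is_negated("Biopsy confirms malignancy.", "malignancy"))  # False
print(is_negated("Admitted to rule out myocardial infarction.",
                 "myocardial infarction"))                      # True
```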
The Dartmouth team and others have worked on this for years. Their early efforts relied on hand-engineered features: laborious, step-by-step instructions for how a computer should look at a slide or a scan. Deep learning changed that, mostly for the better. But it also created a new dependency. A model is only as inclusive, accurate, and fair as the data it was trained on, and medical data almost always reflects whoever happened to be in the hospital that decade.
It is hard to ignore how often the same datasets recur from paper to paper: Western, urban, and often skewed toward patients who already had access to good care. Underrepresented communities barely show up, and when they do, models perform worse on them, which is exactly the opposite of what equitable medicine is supposed to look like. Researchers know this, and many say so honestly. But the fix is expensive, slow, and politically complicated.
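The disparity is at least straightforward to measure once subgroup labels exist. Here is a minimal sketch of a per-group accuracy audit; the records are fabricated placeholders, not real patient data, and the group names are purely illustrative:

```python
from collections import defaultdict

# Fabricated (group, predicted_label, true_label) triples for illustration.
predictions = [
    ("urban", 1, 1), ("urban", 0, 0), ("urban", 1, 1), ("urban", 0, 0),
    ("rural", 1, 0), ("rural", 0, 1), ("rural", 1, 1), ("rural", 0, 0),
]

def accuracy_by_group(rows):
    """Compute accuracy separately for each subgroup."""
    correct, total = defaultdict(int), defaultdict(int)
    for group, pred, truth in rows:
        total[group] += 1
        correct[group] += int(pred == truth)
    return {g: correct[g] / total[g] for g in total}

scores = accuracy_by_group(predictions)
print(scores)                                               # {'urban': 1.0, 'rural': 0.5}
print("gap:", max(scores.values()) - min(scores.values()))  # gap: 0.5
```

Measuring the gap is the easy part; closing it means collecting data from the communities the existing datasets missed.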
Consent is another issue. Buried in the forms patients sign is a clause allowing their data to train an algorithm they will never see. Some ethicists find that acceptable. Others don't. The lines have not yet been drawn.
Still, something is shifting. Multimodal models such as Med-PaLM are beginning to handle the messy middle ground between an image, a lab value, and a casual sentence in a discharge note. Whether they are reliable enough for a weary doctor at two in the morning is another matter entirely. The technology is remarkable, but it is still catching up to the data feeding it. Until that gap closes, a handwritten prescription on paper will keep humbling the smartest AI in the room.
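On the engineering side, the first step is usually just representation: getting an image reference, structured labs, and free text into one record before any model sees them. A hypothetical sketch follows; the class and field names are invented for illustration and are not Med-PaLM's actual input format.

```python
from dataclasses import dataclass, field

@dataclass
class MultimodalCase:
    """Hypothetical bundle of one patient encounter's modalities."""
    image_path: str                                        # e.g. a chest X-ray file
    labs: dict[str, float] = field(default_factory=dict)   # lab name -> value
    note: str = ""                                         # free-text clinical note

    def to_prompt(self) -> str:
        """Flatten the structured pieces into one text prompt; the image
        itself would be passed separately to a multimodal model."""
        lab_lines = "\n".join(f"{k}: {v}" for k, v in self.labs.items())
        return f"LABS:\n{lab_lines}\n\nNOTE:\n{self.note}"

case = MultimodalCase(
    image_path="cxr_0421.png",
    labs={"WBC (10^9/L)": 14.2, "CRP (mg/L)": 88.0},
    note="Rule out pneumonia; no evidence of effusion.",
)
print(case.to_prompt())
```

Notice that the note in this toy record contains both a "rule out" and a negation: the multimodal packaging does nothing to solve the text problems described above, which is why the data, not the architecture, remains the bottleneck.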
