The Cultural Empathy Gap in Machine Learning: Can AI Ever Truly Be Bilingual?

A few months ago, in a tiny café near Hongdae in Seoul, a friend of mine tried to explain the word “jeong” to me. “It’s not love,” she said, pausing mid-sentence to look into her coffee. “It’s not devotion. Even between people who are not particularly fond of one another, it develops on its own.” Laughing, she continued, “Good luck translating that into English. Good luck telling that to ChatGPT.”

While reading the recent paper on the Korean Empathetic Dialogues benchmark, or KoED, I was reminded of that conversation. KoED is a meticulous study that does something most AI evaluations do not: it asks whether a model can feel in Korean, not just whether it can translate Korean.

Field	Details
Topic Focus	Cultural Empathy Gap in Multilingual Large Language Models
Core Research Reference	Korean Empathetic Dialogues (KoED) benchmark study
Original Dataset	EmpatheticDialogues (ED), English-language baseline
Key Cultural Concepts Tested	Jeong (정), Han (한) — Korean emotional terms with no direct English equivalent
Models Examined	GPT-class multilingual models, EXAONE (Korean-centric LLM)
Standout Finding	Korean-centric models show notably higher cultural appropriateness than general multilingual models
Underlying Principle	“Data provenance effect” — pre-training data origin shapes empathy quality
Industry Context	Machine translation post-editing is among the fastest-growing language-service skills globally
Geographic Relevance	South Korea, with growing implications across Asia, the Middle East, and Africa
Forward Question	Can AI move from linguistic fluency to cultural fluency — or is that a human-only territory?

Predictably, and somewhat painfully, the answer is mostly no. Leading multilingual LLMs performed worse on KoED than on the English-language EmpatheticDialogues set, particularly on culturally specific emotions. The exception was EXAONE, a Korean-centric model trained primarily on Korean data, which scored significantly higher on cultural appropriateness. The researchers call this the “data provenance effect,” a polite, scholarly way of saying that machines take on the worldview of the data they are fed.

Put these models into a culture they were not truly raised in, and it is hard to ignore how confidently they perform. They handle grammar. They handle vocabulary. They even manage Korean politeness levels better than most beginning learners do.

However, something flattens when you ask them to respond to a bereaved widow speaking in the slow, clipped sentences typical of older Koreans. The empathy becomes clichéd. The responses read as if a well-intentioned stranger had written them from a phrasebook.

The industry dislikes talking about this. Multilingualism is marketed as a finished feature, a checkbox, but in practice it is still mostly a translation layer sitting atop an English-shaped brain. There is no English equivalent, and probably never will be, for terms like han, the long-lasting, inherited grief that Koreans describe as practically a national inheritance. A model trained primarily on English-language online text can identify, define, and even paraphrase such words. But acknowledging an emotion and expressing it are two different things.

Watching this play out, it seems we are making the same old mistake. Similar claims were made for early machine translation, and for years tourists mocked menus that translated “grilled bass” into something unprintable. The low stakes made those mistakes funny. The stakes are high now: AI is used to build mental health screeners, customer service responses, therapy chatbots, and condolence messages. In those settings, a cultural empathy gap is not an oddity. It’s a silent kind of damage.

Investors appear to think that scale will resolve this: more parameters, more languages, more data. Perhaps. However, the KoED results point to something more intriguing and more modest: empathy is a function of upbringing rather than size. Just as someone raised in a culture picks up on its subtle cues without conscious thought, a model raised on Korean conversations responds to Korean suffering more instinctively. Years ago, Tesla’s self-driving promises were questioned, and the gap between demo and reality kept growing. A similar reckoning may be coming for multilingual AI.

Whether the upcoming models will bridge this gap or just better conceal it behind more polished writing is still up for debate. In light of the research and that conversation in the café, it is evident that being bilingual was never solely about language. It was always about the meaning that the words conveyed.

