A friend of mine tried to explain the word “jeong” to me a few months ago in a tiny café close to Hongdae in Seoul. “It’s not love,” she said after pausing mid-sentence to look into her coffee. “It’s not devotion. Even between people who are not particularly fond of one another, it develops on its own.” Laughing, she continued, “Good luck translating that into English. Good luck telling that to ChatGPT.”
While reading the recent paper on the Korean Empathetic Dialogues benchmark, or KoED, I was reminded of that conversation. KoED is a meticulous study that does something most AI evaluations do not. It asks whether a model can feel in Korean rather than whether it can translate Korean.
| Field | Details |
|---|---|
| Topic Focus | Cultural Empathy Gap in Multilingual Large Language Models |
| Core Research Reference | Korean Empathetic Dialogues (KoED) benchmark study |
| Original Dataset | EmpatheticDialogues (ED), English-language baseline |
| Key Cultural Concepts Tested | Jeong (정), Han (한) — Korean emotional terms with no direct English equivalent |
| Models Examined | GPT-class multilingual models, EXAONE (Korean-centric LLM) |
| Standout Finding | Korean-centric models show notably higher cultural appropriateness than general multilingual models |
| Underlying Principle | “Data provenance effect” — pre-training data origin shapes empathy quality |
| Industry Context | Machine translation post-editing is among the fastest-growing language-service skills globally |
| Geographic Relevance | South Korea, with growing implications across Asia, the Middle East, and Africa |
| Forward Question | Can AI move from linguistic fluency to cultural fluency — or is that a human-only territory? |
Predictably and somewhat painfully, the answer to whether these models can feel in Korean is mostly no. Leading multilingual LLMs performed worse on KoED than on the English-language EmpatheticDialogues set, particularly when it came to culturally specific emotions. EXAONE, a Korean-centric model trained primarily on Korean data, was the exception, scoring significantly higher on cultural appropriateness. The researchers refer to this as the “data provenance effect,” which is a polite, scholarly way of saying that machines take on the worldview of whoever feeds them.
When you put these models in the context of a culture they were not truly raised in, it’s difficult to ignore how confidently they perform. They handle grammar. They handle vocabulary. They even manage Korean politeness levels better than most beginning learners do.

However, something flattens when you ask them to react to a bereaved widow speaking in the slow, clipped sentences typical of older Koreans. The empathy becomes cliched. The responses seem to have come from a phrasebook and were written by a well-intentioned stranger.
The industry dislikes talking about this. Multilingualism is marketed as a finished feature, a checkbox, but in practice it is still primarily a translation layer sitting atop an English-shaped brain. There isn’t and probably never will be an English equivalent for terms like han, the long-lasting, inherited grief that Koreans describe as practically a national inheritance. A model trained primarily on English-language online text is able to identify, define, and even paraphrase such words. But recognizing an emotion and expressing it are two different things.
As we watch this play out, it seems like we’re making the same old mistake. Similar claims were made for early machine translation, and for years, tourists mocked menus that translated “grilled bass” into something that could not be printed. The low stakes made those mistakes humorous. The stakes are high now. AI is now used to create mental health screeners, customer service responses, therapy chatbots, and condolence messages. In those situations, a cultural empathy gap is not an oddity. It’s a silent kind of damage.
Investors appear to think that scale will resolve this: more parameters, more languages, more data. Perhaps. However, the KoED results point to something more intriguing and modest: empathy is a function of upbringing rather than size. In the same way that someone raised in a culture picks up on its subtle cues without conscious thought, a model raised on Korean conversations reacts to Korean suffering more instinctively. Years ago, Tesla’s self-driving promises were questioned, and the gap between demo and reality continued to grow. A similar reckoning may be imminent for multilingual AI.
Whether the upcoming models will bridge this gap or just better conceal it behind more polished writing is still up for debate. In light of the research and that conversation in the café, it is evident that being bilingual was never solely about language. It was always about the meaning that the words conveyed.
