The AI That Reads Mexican Spanish, Cuban Spanish, and Argentinian Spanish — And Knows the Difference

Like many of these things, it began with a minor annoyance. When a linguist in Madrid asked a well-known chatbot to write something in “Spanish,” it nearly always responded in Spanish, with clipped consonants, the vosotros form, and vocabulary that would sound a little stiff in Mexico City and a little strange in Buenos Aires. Not exactly incorrect. Simply narrow. As though 600 million people had decided to speak in unison somewhere offstage. She didn’t.

Even though the dataset that resulted from that annoyance is tiny by AI standards—just thirty questions—it has accomplished something that more sophisticated systems haven’t quite been able to. In simple terms, it asks the models if they can distinguish between four regional varieties of Spanish, including Mexican, Cuban, and Argentinian. Then it waits for them to fail, almost courteously.

Project Information	Details
Project Name	It’s the same but not the same: Do LLMs distinguish Spanish varieties?
Field	Natural Language Processing, Computational Linguistics
Origin	Madrid, Spain
Dataset Size	30 expert-crafted multiple-choice questions
Dialects Covered	Andean, Antillean, Chilean, Continental Caribbean, Mexican and Central American, European Peninsular, Rioplatense
Question Categories	Lexical variation, morphosyntactic structure
Review Process	Three-expert peer review with collaborative refinement
Public Repository	Zenodo, DOI 10.5281/zenodo.15101403
Related Publication	Accepted in Procesamiento del Lenguaje Natural
Native Speakers Affected	Over 600 million globally

A lot of them do. Some people achieve success in peculiar, partial ways, either by correctly answering lexical questions but struggling with grammar, or vice versa. Based on the preliminary findings, it appears that these models have learned a lot of Spanish without actually understanding that it is plural.

The questions themselves are surprisingly easy. The question is whether “Llegas tarde, vávete y corre” or “Llegas tarde, võete y córrele” sounds more natural. The second is unmistakable to a Mexican ear; that tiny le tucked onto the verb is like a fingerprint, the kind of detail you grow up with but never give much thought to. It may appear to be a typo to a model that was primarily trained on Peninsular text.

Whether you “levantarse” or “pararse” to stand up is another question. Pararse sounds like you’ve stopped in Madrid. It’s how you get out of a chair in Mexico City or Havana. It’s not trivia. They are the places where identity resides, the tiny joints of a language.

The test was developed slowly by the team. The questions were written by one expert, reviewed by two others, and revised until they were clear enough to be understood in every area they addressed. The questions were then passed through a number of LLMs, sometimes asking the model directly and other times telling it to respond as though it were born in Montevideo, San Salvador, or Lima. The role-playing is important. It shows if a model has a single voice and a costume rack, or if it can switch dialects at will.

It’s difficult to ignore how late this work is coming in. Spanish, despite its enormous size, has frequently been treated as a single block, while English NLP has long debated dialects. The default has been to translate an English benchmark into “generic” Spanish, whatever that may be, which makes many speakers barely noticeable. Large labs and investors discuss multilingual AI; the more difficult and truthful question is whether their multilingual AI is also multidialectal.

Cautious optimism is warranted. The methodology is transparent, the dataset is available, and other researchers are already experimenting with it. It’s still unclear if the upcoming generation of models will take this to heart or continue to reduce Spanish to a single, polite standard. As you watch this happen, it seems like technology is finally being asked to listen instead of just talk. That’s a minor change. However, minor changes add up in a language this diverse.

Disclaimer

London Bilingualism's content on health, medicine, and weight loss is solely meant for general educational and informational purposes. This website does not offer any diagnosis, treatment recommendations, or medical advice.

We consistently compile and disseminate the most recent information, findings, and advancements from the medical, health, and weight loss sectors. When content contains opinions, commentary, or viewpoints from professionals, industry leaders, or other people, it is published exactly as it is and reflects those people's opinions rather than London Bilingualism's editorial stance.

We strongly advise all readers to consult a qualified medical professional before acting on any medical, health, dietary, or pharmaceutical information found on this website. Since every person's health situation is different, only a qualified healthcare provider who is familiar with your medical history can offer you advice that is suitable for you.

In a similar vein, any legal, regulatory, or compliance-related information found on this platform is provided solely for informational purposes and should not be used without first obtaining independent legal counsel from a licensed attorney.

You understand and agree that London Bilingualism, its editors, contributors, and affiliated parties are not responsible for any decisions made using the information on this website.

The AI That Reads Mexican Spanish, Cuban Spanish, and Argentinian Spanish — And Knows the Difference

TranslatePress Multilingual , The WordPress Translation Plugin That Lets You See Exactly What Your Site Looks Like in Every Language Before Anyone Else Does

2026 College Baseball Super Regionals , Troy’s Historic First Trip to Omaha and Eight Stories That Defined the Weekend

Milwaukee Baseball College Scene , How UWM’s Cinderella Run to the NCAA Regional Finals Rewrote What’s Possible for the Panthers

What You Actually Get With Polylang Pro — And What Nobody Tells You About the Cost

Kobe Bryant Education: Why Skipping College Was the Smartest Move He Ever Made

Belred Bilingual Academy: The Quiet Bellevue School That’s Raising Tomorrow’s Bilingual Thinkers

NBCC Early Childhood Education: The Program That’s Quietly Changing How New Brunswick Raises Its Kids

Types of Multilingualism: Why Speaking Two Languages Is Never the Same Experience Twice

Donald Trump Education: From Queens to Wharton — The Making of a President’s Mind

Babyland Bilingual Academy Is Quietly Changing How Florida Kids Learn Two Languages Before Age Five

Your Child’s Brain Is Being Rewired Every Time They Switch Languages — Here’s Why That’s a Good Thing

What Does It Actually Mean to Be Multilingual? The Answer Is More Complicated Than You Think

ClassLink SAISD: How San Antonio Schools Are Finally Getting Digital Access Right

Must Read

The Spanish-Speaking ICU: How Bilingual Nurses Are Saving Lives No One Else Can

What New York City Can Learn From London’s Booming Trilingual Public Schools

How Dungeons and Dragons Became the Most Unexpected Mental Health Intervention of the Decade

Early Childhood Education Program: What the World Keeps Getting Wrong About the First Eight Years

The AI That Reads Mexican Spanish, Cuban Spanish, and Argentinian Spanish — And Knows the Difference

Related Posts