In a Dartmouth research lab, a graduate student named Ivory Yang is teaching a machine to read a script that women once passed to one another on silk fans and handkerchiefs. The writing system, known as Nüshu, was developed four centuries ago by Yao women in Hunan Province, China, who used it for covert communication. When Yang was younger, her grandmother taught her a few words. She is now feeding those words, along with just 35 pairs of matched sentences, to a large language model in an attempt to teach it something no AI has ever known.
It worked, in a way. The NüshuRescue model started producing Chinese-to-Nüshu translations that weren’t included in its training set. Thirty-five pairs of sentences were all it took to open a door that had been shutting for many years. That figure is both encouraging and deeply unsettling: it is a reminder of how little material is left for languages that once carried entire worlds of meaning.
It is difficult to overstate the scope of what is disappearing. The United Nations estimates that an indigenous language vanishes about every two weeks. Nearly half of the world’s seven thousand languages are in danger of going extinct by the end of the century, according to UNESCO. Vocabulary is not the only thing lost when a language dies. Entire systems of memory, oral law, medicine, and ecological knowledge fall silent. We may have already lost more than we will ever be able to quantify.
That loss cannot be undone by artificial intelligence alone, and no serious researcher in this field says otherwise. However, the tools currently being developed are documenting, transcribing, and translating languages at a speed that human linguists working alone cannot match, something that was genuinely impossible ten years ago. Using community-recorded audio in over thirty languages, including Louisiana Creole and Ainu, Google’s Woolaroo app allows users to point a phone camera at an object and hear its name spoken in an endangered language. In New Zealand, Te Hiku Media developed a speech recognition system for Te Reo Māori that is said to achieve over 90% transcription accuracy, converting spoken Māori into text for digital content, news, and education. Luganda and Quechua are just two of the hundreds of low-resource languages that Meta’s No Language Left Behind initiative has pushed machine translation into, allowing speakers to use Facebook or WhatsApp without automatically switching to English or Spanish.

Every one of these projects carries a subtle tension of its own. Woolaroo lowers a barrier rather than eliminating one; it does not teach grammar or take the place of teachers. Meta’s automated systems have already made serious mistakes, producing inaccurate translations into Mohawk and Mi’kmaq that appeared in a published book and drew harsh criticism. Inclusion without accountability, it turns out, can do harm of its own. And almost everything is clouded by the issue of data ownership. Te Hiku Media treats language data as a cultural treasure governed by Māori principles rather than corporate policy, in stark contrast to most AI development, where community recordings are extracted, processed, and commercialized with little meaningful consent.
As this develops, it’s difficult to ignore the fact that the most promising work occurs when technologists take a backseat and let communities take the lead. Rolando Coto Solano, a linguist at Dartmouth who develops speech-recognition models for Cook Islands Māori and for Costa Rican languages like Cabécar and Bribri, began his work after a coworker joked that she would die before completing her transcriptions. The humor concealed something urgent. Because transcription is specialized and slow, machines are well suited to handle it, freeing up human experts to concentrate on cultural context, teaching, and interpretation, tasks that no algorithm can match.
Researchers at the University of Hawaii recently unveiled FormosanBench, the first evaluation benchmark created to assess AI performance on native Formosan languages like Atayal and Amis. Without such benchmarks, no one truly knows how well these models work outside of English. Assumptions are doing a lot of the heavy lifting, and assumptions tend to fall apart under scrutiny. The field still seems to be figuring out what responsible preservation looks like, and whether AI becomes a true ally or just another technology that takes more than it gives. The answer most likely depends less on the machines themselves than on who gets to decide how they are used.
