Somewhere in a New Zealand recording studio, an elder speaks into a microphone. The words are in te reo Māori, a language that came dangerously close to generational silence during the twentieth century. The recording is fed into a machine learning system developed by Te Hiku Media, a Māori-owned nonprofit that decided years ago that Māori people, rather than a Silicon Valley company, would be the ones to digitize their language. The system now transcribes the language with over 90% accuracy. The voices of the elders are no longer merely recollections.
Variations of that scene are playing out in a growing number of communities facing the same underlying crisis. According to UN estimates, an indigenous language vanishes approximately every two weeks. UNESCO warns that nearly half of the world’s 7,000 languages are at risk of disappearing by the end of this century. The loss of a language is not just linguistic; it also takes with it ecological knowledge encoded in vocabulary, medical practices, oral law, and ways of naming the world that do not translate into any dominant tongue. Generations of accumulated knowledge disappear along with the words. Whether AI can address that loss, and under what circumstances, is becoming an increasingly urgent question.
Some of the best-known initiatives use deliberately simple tools. With Google’s Woolaroo app, users can point a smartphone camera at a commonplace object and hear its name spoken in an endangered language, such as Louisiana Creole, Yiddish, or Japanese Ainu. The app links the image to the word, while native speakers supply the audio. Families in Louisiana have reportedly used it to teach children vocabulary that had dropped out of everyday use a generation ago. It doesn’t teach grammar or conversation skills. However, it places a language in a child’s hands during an ordinary moment of daily life, which is exactly where languages either survive or do not.

More ambitious initiatives are also under way. By extending automated translation to hundreds of under-resourced languages, such as Luganda and Quechua, Meta’s No Language Left Behind initiative has made it possible for speakers to communicate across platforms without resorting to English or Spanish. In 2025, researchers at the University of Hawaii unveiled FormosanBench, a benchmark created specifically to gauge how well AI models handle indigenous languages of Taiwan such as Atayal and Amis. The preliminary findings showed something unsettling: even sophisticated models struggle with linguistic features that differ sharply from those of the dominant global languages. That is not a small technical flaw. Systems that most people assume are generally capable turn out to be skewed toward the languages they were built around.
The ethical dimension deserves serious attention, because it is where things truly become complicated. Language data is not neutral. For many indigenous communities, words carry sacred weight, encode knowledge that was deliberately suppressed during colonialism, and belong to the community as a whole, none of which corporate data pipelines are naturally designed to respect. Te Hiku Media addressed this directly by creating the Kaitiakitanga License, a governance framework that keeps data under Māori control. Other communities are more circumspect. Some Shoshone communities in the American West have opposed any attempt to standardize their language in writing, preferring to preserve its oral nature rather than risk having transcription flatten it.
This is a reasonable form of skepticism. AI can record, transcribe, and teach vocabulary. On its own, it cannot convince an adolescent to speak their grandmother’s language instead of English. The deeper factors that determine whether a language is passed down include economic pressures, migration trends, and educational systems. Technology can make documentation less fragile, lower the barrier to participation, and keep a language accessible long enough for communities to decide what to do with it. Depending on variables that no algorithm can control, that may or may not be sufficient.
As this develops, the clearest lesson appears to be that technology works best when the community takes the lead, not as a recipient of external tools but as the architect of how its own language is managed.
