On a sunny morning in San Sebastián, one that probably smelled faintly of the Atlantic, computer scientist Mikel Artetxe posed a question that sounds almost absurd. Give someone a stack of Chinese books and a stack of Arabic books, none of them translations of one another, and ask them to learn to translate between the two languages. Impossible, isn’t it? That was the point. Yet the machine he had been training was starting to do just that.
For years, the dream of fluid, automatic translation has rested on one uncomfortable fact. Every contemporary translation tool is powered by neural networks, brain-inspired algorithms that have to be fed. Endlessly. What they are fed is millions of carefully aligned sentence pairs, produced by humans over decades of multilingual paperwork. That works well for French and English. For Basque, Swahili, or any of the thousands of languages that never made it into the UN archives, it works far less well.

| Field | Details |
| --- | --- |
| Topic | Unsupervised Machine Translation |
| Lead Researchers | Mikel Artetxe (UPV) & Guillaume Lample (Facebook AI Research) |
| Institutions Involved | University of the Basque Country, Spain · Facebook AI, Paris |
| Original Coverage | Reported in Science Magazine |
| Method Used | Unsupervised neural machine translation |
| Key Techniques | Back translation and denoising |
| Benchmark Score | BLEU score of roughly 15 on English-French pairings |
| Comparable Supervised Score | Google Translate, approximately 40 |
| Human Translator Score | Above 50 |
| Status | Submitted to ICLR 2018, not yet peer reviewed |
| Significance | First credible attempt at translation without parallel text |

Now two independent research teams have proposed something unusual, uploading their papers to arXiv within a day of one another. It turns out that parallel text may not be necessary for translation at all. The trick lies in the way words keep company with one another. In practically every language ever spoken, the words for table and chair tend to appear together. So do the words for socks and shoes. If a computer carefully maps these clusters, the maps from two different languages begin to resemble each other, in the same way that two cities viewed from a satellite start to share the same general shape, the same arteries, and the same logic of human life.
Lay one map on top of the other and you get a rough, approximate correspondence between words that is actually quite useful: a bilingual dictionary, built without a teacher. There is something slightly philosophical in the idea that languages, despite their apparent differences, share a deep structural rhythm that a machine can detect even when a person cannot articulate it.
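
The map-matching idea can be illustrated with a toy sketch. It is not the method of either paper. It assumes each language is already a matrix of word vectors, that both vocabularies have the same size, and that the second "language" is simply a rotated, reordered copy of the first, which real languages never are. The function names `similarity_profile` and `induce_dictionary` are hypothetical. The point it shows is that how similar a word is to its neighbours in its own language can be enough to find its counterpart in another language, with no word pairs given in advance.

```python
import numpy as np

def similarity_profile(emb):
    """For each word, the sorted cosine similarities to every word in its own language."""
    unit = emb / np.linalg.norm(emb, axis=1, keepdims=True)
    sims = unit @ unit.T
    # Sorting each row makes the profile independent of vocabulary order.
    return np.sort(sims, axis=1)

def induce_dictionary(src_emb, tgt_emb):
    """Match each source word to the target word with the closest similarity profile."""
    src_prof = similarity_profile(src_emb)
    tgt_prof = similarity_profile(tgt_emb)
    # Pairwise squared distances between source and target profiles.
    dists = ((src_prof[:, None, :] - tgt_prof[None, :, :]) ** 2).sum(axis=2)
    return dists.argmin(axis=1)

# Toy data (an assumption for illustration): 8 "words" in 4 dimensions.
rng = np.random.default_rng(0)
src = rng.normal(size=(8, 4))

# The second language is the first one rotated and shuffled, so tgt[j] is src[perm[j]].
rotation, _ = np.linalg.qr(rng.normal(size=(4, 4)))
perm = rng.permutation(8)
tgt = (src @ rotation)[perm]

# mapping[i] is the index of the target word matched to source word i.
mapping = induce_dictionary(src, tgt)
```

In this idealized setting the dictionary can be recovered exactly, because a rotation leaves every word-to-word similarity unchanged. With real embeddings the two maps only roughly resemble each other, so a match of this kind would be a noisy starting point at best.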

The teams then use two techniques with almost poetic names to refine their systems. In back translation, a sentence is roughly translated into the other language and then translated back again; the result is compared with the original, and the networks are adjusted when the two do not match. In denoising, words in a sentence are shuffled or removed and the model has to reconstruct the original, which forces it to learn the structure of the language rather than memorize sentences. It feels less like programming than like teaching a child by handing them a torn page and asking them to guess what is missing.
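
The corruption half of denoising can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the code used by either team: the function name `add_noise`, the drop probability, and the shift limit are all hypothetical choices. It drops some words at random and lets the rest drift a few positions, producing the "torn page" that a model would then be trained to repair.

```python
import random

def add_noise(tokens, drop_prob=0.1, max_shift=3, seed=None):
    """Return a corrupted copy of `tokens`: some words dropped, the rest lightly shuffled."""
    if not tokens:
        return []
    rng = random.Random(seed)

    # Drop each word independently with probability drop_prob.
    kept = [t for t in tokens if rng.random() >= drop_prob]
    if not kept:
        # Never hand the model an empty sentence.
        kept = [rng.choice(tokens)]

    # Local shuffle: add a random offset to each position and sort by the result,
    # so a word can drift only a few places (about max_shift) from where it started.
    keys = [i + rng.uniform(0, max_shift + 1) for i in range(len(kept))]
    order = sorted(range(len(kept)), key=keys.__getitem__)
    return [kept[i] for i in order]

sentence = "the cat sat quietly on the old mat".split()
noisy = add_noise(sentence, drop_prob=0.2, max_shift=2, seed=1)
```

A training pair would then be (`noisy`, `sentence`): the model sees the corrupted version and is asked to produce the original. Keeping the shuffle local is a design choice in this sketch, so that the corrupted sentence stays recognizable rather than becoming a bag of words.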
In absolute terms, the results are not yet impressive. When translating between English and French, both systems scored about 15 on BLEU, a standard measure of how closely machine output matches human reference translations. Google’s supervised model scores close to 40. A competent human scores well over 50. No one is replacing qualified translators any time soon. But the trajectory is unusual, and the implications are quietly large.
Something seems to have changed. Di He, a Beijing-based Microsoft researcher whose earlier work inspired both papers, called it shocking. It is difficult to pass over that word when reading it. For a long time, translation has been treated as a brute-force data problem. These studies raise the possibility that it is also a problem of structure and pattern, perhaps even of the kind of universal grammar that linguists like Chomsky have long debated. Watching this happen, you get the impression that the field’s old assumptions are loosening.
What comes next is anyone’s guess. Artetxe’s co-author, Eneko Agirre, took care to describe the work as being in its infancy, a doorway rather than a destination. Still, that doorway is worth watching closely for the world’s smaller languages, for medical jargon and regional slang, and for the half-forgotten dialects that have always sat outside the reach of big tech.
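
For readers unfamiliar with BLEU, the sketch below shows roughly what such a score measures. It is a simplified, single-sentence, single-reference version written for illustration; published scores like the ones above are computed over whole test sets with standard tooling, so numbers from this sketch are not comparable to them. The function names `ngrams` and `simple_bleu` are hypothetical.

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """Count every run of n consecutive tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def simple_bleu(candidate, reference, max_n=4):
    """Simplified BLEU on a 0-100 scale for one candidate and one reference."""
    log_precisions = []
    for n in range(1, max_n + 1):
        cand = ngrams(candidate, n)
        ref = ngrams(reference, n)
        # Clipped overlap: a candidate n-gram counts only as often as it appears in the reference.
        overlap = sum(min(count, ref[gram]) for gram, count in cand.items())
        total = sum(cand.values())
        if overlap == 0 or total == 0:
            # No smoothing in this sketch: any empty n-gram level gives a score of zero.
            return 0.0
        log_precisions.append(math.log(overlap / total))

    # Brevity penalty: candidates shorter than the reference are penalized.
    if len(candidate) >= len(reference):
        brevity = 1.0
    else:
        brevity = math.exp(1 - len(reference) / len(candidate))

    # Geometric mean of the n-gram precisions, scaled to 0-100.
    return 100 * brevity * math.exp(sum(log_precisions) / max_n)

reference = "the cat sat on a mat".split()
perfect = simple_bleu(reference, reference)
partial = simple_bleu("the cat sat on the mat".split(), reference)
unrelated = simple_bleu("completely different words here".split(), reference)
```

An exact match scores at the top of the scale, a sentence sharing no words with the reference scores zero, and a near miss lands in between. That is the sense in which 15, 40, and 50 mark out distance from human-quality output.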
