AI can help to save endangered languages | The Drum


Pioneering translators welcome LLMs, says Hannes Ben of Locaria. While the models still need human input, they provide hope that endangered languages can be saved.

Artificial intelligence (AI) and large language models (LLMs) like GPT have sparked considerable debate among language professionals. Rather than being a threat to jobs, AI and LLMs are essential for language and cultural preservation.

They can ensure that all cultures and languages not only survive but thrive, maintaining their uniqueness while adapting to a globalizing world. For language specialists, AI can help create accessible, high-quality content in every language, bridging divides that human effort alone cannot overcome.

Lost in translation

A linguist’s goal is to keep languages alive by making information accessible in every language, a daunting task given the 7,000 languages spoken worldwide. English, Spanish, and Mandarin dominate global communication, but thousands of other languages are underrepresented digitally. According to industry estimates, over 75% of all web content is in English, while only about 10% of the global population speaks English as a first language.

The vast number of language pairs (combinations of a source and target language) and the limited number of translators mean only a small fraction of content is professionally translated. Most translations occur between a few dominant pairs (English-Spanish, English-French, English-Chinese, and a few others), leaving many languages with little to no translation. This makes vast amounts of global knowledge inaccessible to billions in their native languages.

Human translators alone cannot meet this demand, no matter how skilled, and a scalable solution is needed. This is where AI and LLMs become indispensable.

Human input

AI-driven translation models like LLMs have begun addressing this gap, processing massive amounts of text and performing translations quickly across a broader range of languages than any human team. While LLMs are far from perfect, they represent a significant step forward in enabling access to information in multiple languages, especially for underrepresented languages and regions with fewer translators.

However, it’s essential to acknowledge where LLMs excel and where they still fall short. They perform best with widely spoken languages and simple texts, and struggle with niche languages and specialized content like legal or medical documents. AI-generated translations can lack the nuance, accuracy, and cultural sensitivity of human translators, and errors can be problematic in critical documents. For languages with little digital text, the limited data available to train on directly affects translation quality.

Factual inaccuracies in AI-generated content pose another challenge. LLMs can sound highly confident even when wrong, which is risky in fields requiring high accuracy, such as legal or technical domains. Improving factual reliability is therefore essential for AI’s continued evolution.

Despite these challenges, LLMs are improving with each iteration, becoming better at understanding grammar, tone, and context. Future models are expected to handle complex linguistic tasks and smaller language pairs, positioning AI as a supportive tool and, in the process, upending the traditional role of translators themselves.

This new dynamic between AI and linguists is redefining the profile of a professional translator. Pioneering linguists embracing the change do not fear AI; they want LLMs to improve. They have an unparalleled mastery of two languages, most likely in a specific field, and are uniquely placed to correct the rapid output of any given LLM, ‘filling in the gaps’, and meeting the precise requirements of a client and target audience.

In this way, a partnership between humans and advanced LLMs has the potential to democratize access to information, address the shortage of professional translators, and ensure high-quality translations for all languages at a pace unmatched by humans alone. This would level the playing field for smaller languages and cultures often overlooked.

Global diversity

The future clearly lies in collaboration between AI, the set of tools that will enhance accessibility and cultural preservation, and a new type of language professional: individuals whose expertise extends beyond language alone to a deep understanding of culture and technology, and who can strike the right balance between linguists and LLMs.

For this vision to be realized, ongoing investment is needed in AI development, particularly in underrepresented languages, cultural nuances, and specialized translation fields. We need faster, more accurate AI systems that understand the subtleties of language. These systems must be trained in diverse linguistic and cultural contexts, not just dominant languages. AI should be seen as a partner, not a competitor, in preserving languages and cultures and making knowledge accessible globally.

Language professionals are not afraid of AI and LLMs – on the contrary: we want more, faster, better, and in more languages. AI’s evolution is not a challenge for linguists, but a powerful ally in the quest to keep every language and culture alive and thriving in a connected world.