Preserving Endangered Languages with the Help of Technology: The Story of Project ELLORA
Every two weeks, a language somewhere in the world becomes extinct, a phenomenon known as language death.
Project ELLORA (Enabling Low Resource Languages) was started in 2015 to preserve endangered languages and to ensure that their speakers can participate and interact in the digital world.
Project ELLORA is an initiative of Microsoft Research India that works with local communities and native speakers to create the base datasets needed to build AI technologies for languages that lack a strong presence in the digital world. Its first step was to map out the resources already available and classify languages into six tiers, with the top tier comprising resource-rich languages and the bottom tiers comprising languages with little to no resources.
The project aims both to preserve a language for posterity and to meet the digital needs of its speakers. Researchers work with each community to define its needs and identify which technologies can help meet them. For example, the project is currently working on Hindi-to-Mundari text translation and speech recognition models, as well as a text-to-speech model, to give the Munda community access to more content in their language. By involving the community in the data collection process, the researchers hope to create a dataset that is both accurate and culturally relevant.