Google announces new language features for India

Our Bureau Updated - December 06, 2021 at 09:57 AM.

Google India is launching a range of new products to extend the power of the internet to more Indian language users. Anand Rangarajan, Engineering Director at Google India, says: “Google’s mission is to organise the world’s information to make it universally accessible and useful...which is harder to achieve when it comes to languages. India has over 150 languages. How you achieve this goal at this scale is a fascinating problem.”

Google has announced four new language features to further its ongoing product development and improve the experience of Indian language users.

The first feature improves the handling of queries written in local languages. As Rangarajan explains: “People often find it difficult to use the Indian language keyboard to write queries in their local language and thus use the Roman/Latin, i.e. English, keyboard to write the same query, although in Roman characters. Thus the Google system detects an English query and offers English language results rather than the more relevant and often preferred local language results, which would have better served bilingual users. Search will now show relevant content in supported Indian languages, wherever appropriate, even if the local language query is typed using Latin or Roman characters.” This feature will be rolled out in five Indian languages: Hindi, Bengali, Marathi, Tamil and Telugu.
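To illustrate the problem Rangarajan describes, the short Python sketch below shows why a check based on script alone labels a romanised Hindi query as Latin-script text. This is not Google's method: the function name `dominant_script` and the sample queries are assumptions made purely for illustration, and recognising that a Latin-script query is actually Hindi would need a language-identification model on top of a check like this.

```python
import unicodedata
from collections import Counter


def dominant_script(query: str) -> str:
    """Return the most common script in `query` (hypothetical helper).

    The script is taken from the first word of each character's Unicode
    name, e.g. "DEVANAGARI LETTER KA" gives "DEVANAGARI".
    """
    counts = Counter()
    for ch in query:
        # Count letters and combining marks (vowel signs); skip spaces,
        # digits and punctuation.
        if ch.isalpha() or unicodedata.category(ch).startswith("M"):
            name = unicodedata.name(ch, "")
            if name:
                counts[name.split()[0]] += 1
    return counts.most_common(1)[0][0] if counts else "UNKNOWN"


# The same Hindi question ("how is the weather"), typed two ways.
print(dominant_script("mausam kaisa hai"))  # romanised Hindi
print(dominant_script("मौसम कैसा है"))       # Hindi in Devanagari
```

A script check sees the first query as Latin text, which is why a system relying on script alone would treat it as English and return English results.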

The next feature will allow users to toggle search results between English and four more Indian languages: Tamil, Telugu, Bengali and Marathi. Previously, this toggle was available only between English and Hindi.

The third feature builds on an earlier one that allowed users to set the language of Google Assistant and Discover to be different from the phone language. This ability is now being extended to Google Maps, so users will be able to change their Maps experience to one of nine Indian languages. Rangarajan says, “We know that people prefer to use different languages in different contexts, which is the motivation for this feature.”

The fourth feature allows users to snap a photo of a math problem from the Google search bar using Google Lens and then learn how to solve it on their own, in English or in Hindi. Rangarajan says, “This is one of the interesting ways in which we have combined image recognition with machine learning to take on a real-world problem and do it in an Indian language.”

Natural language understanding

Google Research also has breakthroughs to report in natural language understanding, the technology seen in action in products such as Google Translate. Over the past year, Google Research has developed a new approach to the computational challenges involved, called Multilingual Representations for Indian Languages, or MuRIL. Dr Partha Talukdar, Research Scientist, Google Research India, explains: “The traditional approach to address problems in the area of natural language understanding has been to build a model specifically for a language. The challenge here is that most of the time we don’t have language-specific data to train the model on, making it a labour-intensive process. With the large number of languages and dialects that exist in India, this solution is not very scalable.”

MuRIL, a multilingual model developed by Google India, is essentially a single model capable of handling multiple languages. Talukdar says: “This model has made it possible to transfer knowledge of training from one language to another. This will also help us in addressing nuanced problems pertinent to the Indian context such as translation, spelling deviation or mixed usage.”

MuRIL currently supports 16 Indian languages and English, the highest coverage of Indian languages among publicly available models of its kind. The model makes the computational process more efficient and accurate in the Indian context, and it also supports converting Hindi written in Roman script to Devanagari script, a capability that was previously missing. MuRIL is open source and available to download and use free of charge from TensorFlow Hub.

Talukdar concludes: “MuRIL is a starting point in the big evolution for Indian language understanding. We hope this will prove to be a better foundation for researchers, start-ups and anyone else interested in building Indian language technologies. We are really excited about what use the ecosystem puts it to.”

 

 
Published on December 17, 2020 11:22