Meta's Latest India Push: Cracking the Vernacular Code for AI Chatbots – Entrepreneur


Copyright © 2025 Entrepreneur Media, LLC All rights reserved. Entrepreneur® and its related marks are registered trademarks of Entrepreneur Media LLC
By Kul Bhushan
Opinions expressed by Entrepreneur contributors are their own.
You’re reading Entrepreneur India, an international franchise of Entrepreneur Media.
India, with its massive internet user base, remains an attractive destination for internet companies. For Meta, India is among the largest markets for its WhatsApp, Instagram, and Facebook services. Now, the social media giant is looking to the market for the growth of its next service: chatbots.
Now that Meta and the likes of Google and Amazon have been in the market for over a decade, they understand the need to offer their key services in languages that locals understand. However, serving India’s diverse set of languages remains a daunting challenge for them. AI is no exception, and when it comes to chatbots, local nuances are extremely important.
Training AI for Hindi
Business Insider reports that Meta is hiring contractors in the US to train AI chatbots for Hindi-speaking users, with job listings on platforms such as Aquent Talents offering up to USD 55 per hour. According to the job description, these contractors will work to make the chatbots more intuitive and interactive for Hindi users.
It’s worth highlighting that Meta is now gradually integrating different kinds of chatbots into its services, including Instagram. These are separate from the Meta AI chatbot, a ChatGPT-like service available on Meta’s apps.
To put things in perspective, Meta and other internet companies hire contractors to work on local language versions of their services. In the case of chatbots, however, the idea is to help the AI understand local nuances, slang, humour, and social context much better than it does now. One widely used technique for this is Reinforcement Learning from Human Feedback (RLHF), in which people rate or rank a model’s responses and those judgments are used to fine-tune it.
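As a rough illustration of how human feedback can become a training signal, the sketch below computes the pairwise preference loss commonly used when training a reward model for RLHF. The reward scores and the function name are assumptions made for illustration; nothing here is drawn from Meta's systems.

```python
import math


def preference_loss(reward_chosen: float, reward_rejected: float) -> float:
    """Pairwise (Bradley-Terry style) loss, -log(sigmoid(margin)).

    The loss is small when the reply a human reviewer preferred gets the
    higher reward score, and large when the ranking is the wrong way round.
    """
    margin = reward_chosen - reward_rejected
    # Numerically stable form of log(1 + exp(-margin)).
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


# Hypothetical scores for two Hindi replies to the same prompt.
agrees_with_reviewer = preference_loss(reward_chosen=2.0, reward_rejected=0.0)
disagrees_with_reviewer = preference_loss(reward_chosen=0.0, reward_rejected=2.0)
print(agrees_with_reviewer < disagrees_with_reviewer)
```

In a real system the scores would come from a learned reward model, and this loss would be averaged over many human-ranked reply pairs before the chatbot itself is fine-tuned.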
Without specialized, well-curated data in vernacular languages, LLMs (large language models) can deliver responses that miss these nuances. For example, a literal translation of a popular phrase in your language often won’t make sense in another. Moreover, specialized training helps to reduce biases.
Complications involved
While specialised training for vernacular languages is clearly necessary, it remains a complicated job. One of the biggest challenges is the availability of clean, good-quality data.
“We are trying to solve this to some extent by ensuring our pipelines weed out and normalize noisy data as much as we possibly can. Also, we put in efforts to create data which will mimic the real user as much as possible. Most datasets are biased toward global languages, so sourcing data for regional ones is difficult,” Pranjal Nayak, head of speech R&D at Reverie Language Technologies, told Entrepreneur India.
Reverie focuses on language solutions, offering text, voice, and video localisation automation.
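The cleaning step Nayak describes, weeding out and normalizing noisy data, might look something like the following in miniature. This is a minimal sketch under assumed rules (Unicode NFC normalization, whitespace collapsing, and exact-duplicate removal). It is not Reverie's pipeline, and the function names are hypothetical.

```python
import re
import unicodedata


def normalize_line(line: str) -> str:
    """Put one line of text into a consistent form."""
    # NFC gives canonically equivalent character sequences a single encoding,
    # so visually identical Devanagari strings compare as equal.
    line = unicodedata.normalize("NFC", line)
    # Collapse runs of whitespace and trim both ends.
    return re.sub(r"\s+", " ", line).strip()


def clean_corpus(lines):
    """Normalize lines, then drop empty lines and exact duplicates."""
    seen = set()
    cleaned = []
    for raw in lines:
        line = normalize_line(raw)
        if not line or line in seen:
            continue
        seen.add(line)
        cleaned.append(line)
    return cleaned


noisy = ["  नमस्ते   दुनिया ", "नमस्ते दुनिया", "", "आप कैसे हैं?"]
print(clean_corpus(noisy))
```

Production pipelines typically go further, for example language identification, near-duplicate detection, and profanity or spam filtering. Those steps are omitted here.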
Nayak says there have been government-backed initiatives at educational institutions such as the IITs, IISc, CIIL, and state universities. These focus on creating datasets and models for various regional languages, covering both speech and text.
He further said that the company uses these limited public datasets, and also scrapes regional websites, crowdsources data, or partners with local institutions.
“Quality conversational data is especially hard to find. We try to collect them by doing activities in-house, sometimes extending to friends and family. There has also been growth of data building companies in the past 3-4 years who focus on sourcing and creating datasets that companies like us can consume,” he added.
Devnagri AI cofounder Nakul Kundra further explains that Hindi presents unique challenges such as limited high-quality parallel datasets, heavy use of code-mixed Hindi-English, multiple dialects, and inconsistent spellings. Issues such as transliteration between Devanagari and Roman scripts further complicate accuracy.
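To make the code-mixing and script problem concrete, here is a small sketch that labels each token in a sentence by its script, using the Unicode Devanagari block (U+0900 to U+097F). The function names and the sample sentence are assumptions for illustration and do not represent Devnagri AI's method.

```python
def script_of(token: str) -> str:
    """Label a token by the script of the characters it contains."""
    has_devanagari = any("\u0900" <= ch <= "\u097F" for ch in token)
    has_latin = any(ch.isascii() and ch.isalpha() for ch in token)
    if has_devanagari and has_latin:
        return "mixed"
    if has_devanagari:
        return "devanagari"
    if has_latin:
        return "latin"
    return "other"


def tag_scripts(text: str) -> list[tuple[str, str]]:
    """Split on whitespace and tag every token with its script."""
    return [(token, script_of(token)) for token in text.split()]


# A code-mixed sentence: Hindi in Devanagari, an English word,
# and a Hindi word ("hai") typed in Roman script.
sample = "मुझे recharge करना hai"
for token, script in tag_scripts(sample):
    print(token, script)
```

Note that "recharge" and "hai" both come out as "latin" even though only the first is English. Telling Romanized Hindi apart from English needs a lexicon or a trained model, which is part of why transliteration complicates accuracy.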
Kundra further said that his startup is trying to address this challenge through large-scale corpus creation, active learning loops, and human-in-the-loop reviews that refine domain-specific glossaries. His startup fine-tunes models with sector-focused data to achieve contextual accuracy.
“Moreover, advanced translation and segmentation techniques help stabilise results across formats. A rigorous quality assurance framework and real-time feedback dashboards close the loop. For us, success is not only about benchmark improvements but ensuring end users can clearly understand, learn, and transact in their own language,” he added.
Global firms, local efforts
The demand for vernacular Artificial Intelligence is driven by business needs, not just research ambition. Across India, banks, ecommerce platforms, telecom providers, and public services serve millions who prefer native languages. Multilingual chatbots, voice assistants, and customer support are now becoming increasingly critical to engagement, compliance, and trust.
This shift explains why startups, enterprises, and even government agencies are investing in vernacular Artificial Intelligence specialists.
“It is no longer limited to global firms like Google or Meta; regional players see real conversion lifts, stronger customer loyalty, and measurable impact when they operate in local languages,” says Kundra.
Moreover, the increasing adoption of voice interfaces is leading to an increase in Indian-language usage, as voice comes naturally and has no learning curve. With the growth of voice AI, many companies and startups are creating solutions for Indian-language users, which is creating the need for AI specialists in these languages. The arrival of Artificial Intelligence, especially Generative AI (GenAI), is a potential turning point in how we interact with human languages.
GenAI learns from massive amounts of human language data, so its ability to understand and create text directly depends on how common a language’s form or dialect is in its training data. This means AI will naturally perform better with more common versions of a language, which in turn will lead to those versions being used even more and becoming even more dominant.
“AI is clearly becoming a bigger part of our daily lives. As more people use AI systems, the impact on language diversity becomes clearer. While this increase in AI use is expected to connect people globally through language, it could also subtly but significantly shift things. Languages might become more uniform, leaning towards their most common dialects or forms, even as they become more widespread and accessible,” said Nayak.
Summing up
Even as Meta works to boost its AI presence through specialization in the local language, more and more firms will be making similar moves to fine-tune their LLMs and SLMs (small language models) for different geographies and languages. This opens a new window of opportunity for those looking to acquire new skills, such as becoming language specialists in the AI era.
Beyond basic translation and description, the adoption of specialized AI will grow in banking, education, retail, and governance, especially in tier-two and tier-three cities.