India should build its own ChatGPT – Hindustan Times
In June 2023, OpenAI chief Sam Altman told a large audience in India that the country was not capable of building a ChatGPT. Two months later, India’s Chandrayaan-3 made a perfect landing near the largely unexplored south pole of the Moon, putting India alongside the United States (US) and China in the exclusive club of nations with working lunar rovers.
Altman later dialled back his comments but his Freudian slip illustrated the ignorance of Silicon Valley. Altman and his colleagues are the wrong ones to lead us into the Artificial Intelligence (AI) future because they are largely driven by profit and disconnected from the world’s realities.
It is not only space travel. India has already built the world’s most advanced cancer care infrastructure and is rolling it out on an unprecedented scale, leading the White House to announce a partnership with India. We can do the same with AI. India can take Silicon Valley’s open-source technology and build something that benefits the masses, just as it is doing with cancer.
Altman’s comments suggested he is unaware that Indian scientists played a massive role in AI’s recent progress. They published important research papers and were key players inside the technology giants that have driven the development of foundational large language models (LLMs) like GPT-4 (which powers ChatGPT) and Google’s Bard. Needless to say, the CEOs of several US technology companies building AI, including Google, Microsoft, and IBM, are of Indian origin.
India is also the only country that has designed and fostered an intelligent technology regulation strategy to maintain open and free markets in key aspects of technology like e-commerce and finance. Its Unified Payments Interface (UPI) system is levelling the playing field and preventing market abuse by technology giants.
Research shows that market dysfunction was created by Google, Amazon, Facebook, and other large players who dominate e-commerce, advertising, and online information sharing. We are already seeing levers of dominance enabling Big Tech to position themselves to dominate AI. The shortage of GPUs and massive lobbying dollars spent requesting expensive regulation that would lock out start-ups are just two examples of this troubling trend.
AI will fundamentally change society and billions of lives. Its development is too important to be left to Silicon Valley elites. India is well positioned to break this dominance and level the AI playing field, accelerating technology innovation and benefiting all of humankind.
To start, India can build and tune a massive foundational LLM trained on data legally aggregated with full permission from data creators or owners. Notably, social media data should be underweighted, as large chunks of it are toxic and unhelpful. This will also make a homegrown LLM safer than current models. As data scientists always say: garbage in, garbage out.
In addition, India’s LLM should be trained on data representing diverse world views and situations, something OpenAI and the developers of Stable Diffusion neglected. The cardinal sin of much of today’s AI and algorithms is the discrimination baked into their fabric through poor data design. This leads to the next key component of India’s LLM gift to the world: complete transparency and traceability. Unlike a few years ago, it is now largely possible to shine a light on the inner workings of the deep neural networks that underlie LLMs. This requires innovation and advances, but India is up to the task. Transparent AI would radically change the game.
Sharing such a system with the world will give us greater safety by allowing the brightest minds to work on the best technology and counterbalance dominant technology companies and evil-doers such as rogue States and criminal groups. This is particularly important because the AI cat is already out of the bag. LLM code and weights for large models such as Meta’s Llama are in the wild. Restrictions on transmission are no longer useful and only serve to handicap innovation.
Building, sharing, and maintaining a truly open LLM is not sufficient. Half of the challenge with LLMs is training, which is extraordinarily expensive and computation-intensive, costing tens of millions of dollars for each model version. Costs will fall, and researchers and AI companies are already figuring out how to train effectively with less data, less computation, and more specificity, often generating superior results for discrete tasks.
AI is a public good, and the best way to support its creation is to ensure all resources required are publicly accessible. Part of why Google and Meta have leapt to the forefront of AI is their brute access to massive technology infrastructure, something few university researchers could dream of. To cement AI’s future development as an open resource, India should build a massive training cloud for AI and offer it up at cost to Indian start-ups and researchers, and to the world at large. A good model for this is a system of massive telescopes that scan the skies and offer usage time to astronomers everywhere, with a small slice dedicated to the operating institution.
India can reap enormous benefits by fostering AI innovation in its own economy and helping its AI community leap forward with the resources required to do the work. Even better, India and the US can join forces in this quest to foster truly open and publicly available AI. LLMs are foundational technology components that can benefit all and develop faster with a motivated community. The Linux operating system, which now dominates global enterprise computing, is a great example of this dynamic.
A public, community-driven version of AI will accelerate innovation, reduce bias, ensure greater transparency, and provide a better outcome for all.
Vivek Wadhwa is an academic, entrepreneur, and author. Vinita Gupta is the first woman of Indian origin to take a company public in the US. The views expressed are personal.