Open-source AI chatbots are booming — what does this mean for researchers?
Open-source AI initiatives seek to make the technology more widely accessible to researchers. Credit: Philippe Lejeanvre/Alamy
The craze for generative artificial intelligence (AI) that began with the release of OpenAI’s ChatGPT shows no sign of abating. But while large technology companies such as OpenAI and Google have captured the attention of the wider public — and are finding ways to monetize their AI tools — a quieter revolution is being waged by researchers and software engineers at smaller organizations.
Whereas most large technology companies have become increasingly secretive, these smaller actors have stuck to the field’s ethos of openness. They span the spectrum from small businesses and non-profit organizations to individual hobbyists, and some of their activity is motivated by social goals, such as democratizing access to technology and reducing its harms.
Such open-source activity has been “exploding”, says computer scientist Stella Biderman, head of research at EleutherAI, an AI research institute in New York City. This is particularly true for large language models (LLMs), the data-hungry artificial neural networks that power a range of text-oriented software, including chatbots and automated translators. Hugging Face, a New York City-based company that aims to expand access to AI, lists more than 100 open-source LLMs on its website.
Last year, Hugging Face led BigScience, a coalition of volunteer researchers and academics, to develop and release one of the largest LLMs yet. The model, called BLOOM, is a multilingual, open-source system designed for researchers. It continues to be an important tool: the paper that described it has since amassed more than 300 citations, mostly in computer-science research.
In February, the open-source movement received an even bigger push when Facebook’s parent company, Meta, made a model called LLaMA freely available to selected external developers. Within a week, the model’s weights had been leaked and posted online for anyone to download.
The availability of LLaMA has been a game-changer for AI researchers. It is much smaller than other LLMs, meaning that it doesn’t require large computing facilities to host the pretrained model or to adapt it for specialized applications, such as acting as a mathematics assistant or a customer-service chatbot. The biggest version of LLaMA consists of 65 billion parameters: the variables set during the neural network’s initial, general-purpose training. This is less than half of BLOOM’s 176 billion parameters, and a fraction of the 540 billion parameters of Google’s latest LLM, PaLM 2.
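A back-of-the-envelope calculation shows why those parameter counts matter for hosting: at 16-bit precision, storing the weights alone takes roughly two bytes per parameter. The short sketch below works through the figures quoted here; it is rough illustrative arithmetic, not a measurement of any of these systems.

```python
# Rough memory footprint of the weights alone, at 16-bit precision
# (about 2 bytes per parameter). Illustrative arithmetic only; it ignores
# activations, optimizer state and other overheads.
for name, params in [("LLaMA (65B)", 65e9),
                     ("BLOOM (176B)", 176e9),
                     ("PaLM 2 (540B, as quoted)", 540e9)]:
    gib = params * 2 / 2**30
    print(f"{name}: ~{gib:,.0f} GiB of weights")

# LLaMA (65B): ~121 GiB of weights
# BLOOM (176B): ~328 GiB of weights
# PaLM 2 (540B, as quoted): ~1,006 GiB of weights
```

Even the smallest of these is far too large for an ordinary laptop at full precision, which is why the shrinking techniques described below matter.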
“With LLaMA, some of the most interesting innovation is on the side of efficiency,” says Joelle Pineau, vice-president of AI research at Meta and a computer scientist at McGill University in Montreal, Canada.
Developers have made a version of the leaked LLaMA model that can run on a Raspberry Pi computer. Credit: Dominic Harrison/Alamy
Open-source developers have been experimenting with ways of shrinking LLaMA down even more. Some of these techniques involve keeping the number of parameters the same but reducing the parameters’ precision — an approach that, surprisingly, does not cause unacceptable drops in performance. Other ways of downsizing neural networks involve reducing the number of parameters, for example, by training a separate, smaller neural network on the responses of a large, pretrained network, rather than directly on the data.
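As a minimal illustration of the first idea, keeping every parameter but storing it at lower precision, the sketch below quantizes a toy weight matrix from 32-bit floats to 8-bit integers. It uses made-up values and is not code from any of the projects mentioned.

```python
# Post-training 8-bit quantization of a toy weight matrix: same number of
# parameters, a quarter of the memory, with a small approximation error.
# Illustrative sketch only, with made-up values.
import numpy as np

def quantize_int8(weights):
    """Map float32 weights to int8 plus a per-tensor scale factor."""
    scale = np.abs(weights).max() / 127.0            # largest weight maps to +/-127
    q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Recover approximate float32 weights for use at inference time."""
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.normal(0.0, 0.02, size=(4096, 4096)).astype(np.float32)  # one toy layer
q, scale = quantize_int8(w)
w_hat = dequantize(q, scale)

print(f"memory: {w.nbytes / 1e6:.0f} MB float32 -> {q.nbytes / 1e6:.0f} MB int8")
print(f"max absolute error: {np.abs(w - w_hat).max():.5f}")
```

The second approach, training a smaller network on the responses of a larger one, is usually called knowledge distillation.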
Within weeks of the LLaMA leak, developers managed to produce versions that could fit onto laptops and even a Raspberry Pi, the bare-bones, credit-card-sized computer that is a favourite of the ‘maker’ community. Hugging Face is now primarily using LLaMA, and is not planning to push for a BLOOM-2.
Shrinking down AI tools could help to make them more widely accessible, says Vukosi Marivate, a computer scientist at the University of Pretoria in South Africa. For example, it could help organizations such as Masakhane, a community of African researchers led by Marivate that is trying to make LLMs work for languages for which there is little existing written text to train a model on. But the push towards expanding access still has a way to go: for some researchers in low-income countries, even a top-of-the-range laptop can be out of reach. “It’s been great,” says Marivate, “but I would also ask you to define ‘cheap’.”
For many years, AI researchers routinely made their code open source and posted their results on repositories such as the arXiv. “People collectively understood that the field would progress more quickly if we agreed to share things with each other,” says Colin Raffel, a computer scientist at the University of North Carolina at Chapel Hill. The innovation that underlies current state-of-the-art LLMs, called the Transformer architecture, was created at Google and released as open source, for example.
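The Transformer’s central operation is scaled dot-product attention, in which each token’s representation is updated as a weighted average of the others, weighted by how relevant they are to it. The minimal sketch below shows that computation in a few lines of NumPy; it is a textbook illustration, not Google’s released implementation.

```python
# Scaled dot-product attention, the core operation of the Transformer.
# Textbook illustration only.
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))   # subtract max for numerical stability
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    """Each output row is a weighted average of V, weighted by query-key similarity."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)      # similarity of every query to every key
    return softmax(scores) @ V

# toy example: 5 tokens with 64-dimensional representations
rng = np.random.default_rng(0)
Q, K, V = rng.normal(size=(5, 64)), rng.normal(size=(5, 64)), rng.normal(size=(5, 64))
print(attention(Q, K, V).shape)          # (5, 64)
```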
Making neural networks open source enables researchers to look ‘under the hood’ to try to understand why the systems sometimes answer questions in unpredictable ways and can carry biases and toxic information over from the data they were pretrained on, says Ellie Pavlick, a computer scientist at Brown University in Providence, Rhode Island, who collaborated with the BigScience project and also works for Google AI. “One benefit is allowing many people — especially from academia — to work on mitigation strategies,” she says. “If you have a thousand eyes on it, you’re going to come up with better ways of doing it.”
Pavlick’s team has analysed open-source systems such as BLOOM and found ways to identify and fix biases that are inherited from the training data — the prototypical example being how language models tend to associate ‘nurse’ with the female gender and ‘doctor’ with the male gender.
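One common way to surface such associations is to probe an open model directly. The sketch below is a hypothetical example of that kind of probe, not Pavlick’s actual analysis; it uses Hugging Face’s open-source transformers library with a small masked language model as a stand-in (BLOOM itself is a different, causal model) and compares how strongly the model fills in gendered pronouns after ‘nurse’ and ‘doctor’.

```python
# Hypothetical bias probe using the open-source transformers library.
# Not the method used by Pavlick's team; a small masked model stands in
# for the larger systems discussed in the text.
from transformers import pipeline

fill = pipeline("fill-mask", model="bert-base-uncased")

for profession in ["nurse", "doctor"]:
    prompt = f"The {profession} said that [MASK] would be late."
    scores = {out["token_str"]: round(out["score"], 3)
              for out in fill(prompt, targets=["he", "she"])}
    print(profession, scores)

# A strong skew in the he/she probabilities between the two professions is
# the kind of inherited bias that open access to model weights lets
# researchers measure, and then try to correct.
```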
Even if the open-source boom goes on, the push to make language AI more powerful will continue to come from the largest players. Only a handful of companies are able to create language models from scratch that can truly push the state of the art. Pretraining an LLM requires massive resources — researchers estimate that OpenAI’s GPT-4 and Google’s PaLM 2 took tens of millions of dollars’ worth of computing time — and also plenty of ‘secret sauce’, researchers say.
“We have some general recipes, but there are often small details that are not documented or written down,” says Pavlick. “It’s not like someone gives you the code, you push a button and you get a model.”
“Very few organizations and people can pretrain,” says Louis Castricato, an AI researcher at open-source software company Stability AI in New York. “It’s still a huge bottleneck.”
Other researchers warn that making powerful language models broadly accessible increases the chances that they will end up in the wrong hands. Connor Leahy, chief executive of the AI company Conjecture in London, who was a co-founder of EleutherAI, thinks that AI will soon be intelligent enough to put humanity at existential risk. “I believe we shouldn’t open-source any of this,” he says.
Nature 618, 891-892 (2023)
doi: https://doi.org/10.1038/d41586-023-01970-6