ChatGPT has prompted a ‘code red’ – and other, alternative chatbots could be coming for it – The Independent


OpenAI’s system is behind both on technical prowess and personality
Almost exactly three years ago, everything changed: OpenAI launched ChatGPT and upended the entire world in a moment. Since then, artificial intelligence has transformed industries, become one of the most popular talking points in the world and much more besides.
Throughout all that, ChatGPT has remained so popular that it is almost synonymous with generative AI and chatbots. But that dominance might be under threat: this week, it was reported that OpenAI boss Sam Altman has issued a “code red” and urged staff to improve ChatGPT, amid fears it could be overtaken by rivals.
There are many such rivals, each offering a system that uses a large language model to answer questions typed into a box, just like ChatGPT. They include Google’s Gemini – the model improving so rapidly that it worried Mr Altman – as well as Elon Musk’s Grok, Perplexity and Anthropic’s Claude.
For the most part, the systems are more similar than they are different. All of them were created by training a large language model on vast amounts of text, and then putting that into a chat system so that people can easily interact with them. All of them will aim to answer questions and fulfil requests as helpfully as possible.
Broadly, the different systems are better or worse at different things: Claude tends to be better for coding, for instance, while Gemini can call on its connection with the rest of Google search and produce better answers about real-time events.
There are leaderboards that aim to offer objective ways of comparing the different systems. AI company Hugging Face, for instance, operates one that evaluates models on a variety of criteria: how much context they can bring into a conversation, how quickly they work, and how well they perform on a series of tests. Broadly, it suggests that Google’s top-end models are the best, followed by Anthropic’s Claude and then OpenAI’s ChatGPT.
Similarly, researchers have created a benchmark called Humanity’s Last Exam, which aims to evaluate how close AI systems are to expert-level humans by asking them questions from a set of 2,500 advanced problems. Google is winning there too: Gemini sits at the top of the leaderboard, followed by two different releases from OpenAI and then Claude. (None of them is doing all that well yet: the top score is under 38 per cent, suggesting that expert humans still have the edge for now.)
All of these more objective rankings have their issues, however. One is that models released after a leaderboard’s tests were created may have been trained specifically to do well on them. Another is that excelling on those tests does not necessarily mean a model will be more helpful in practice.
The more obvious, and perhaps more substantial, difference between the models is one of personality. What chiefly sets the main systems apart is the training process they have been through, during which they are built to favour different kinds of answers: one system might be more playful, for instance, or more verbose. Users can discover those stylistic differences by spending time with each system.
OpenAI seems to have been panicked into its “code red” by the technical superiority of Google’s Gemini. But ChatGPT has also received sustained criticism for problems with its style. When the company launched GPT-5 in the summer – after intense attempts to build hype around its reveal – it was hit by criticism from users who found it cold, unfriendly or less fun. The company rushed to respond, restoring access to the older model and making quick tweaks to the new one to make it more satisfying to users.
For now, ChatGPT remains by far the most popular of the chatbots. But the panic at OpenAI seemingly reflects a real threat: that, whether because of rivals’ technical prowess or because it is simply a bit annoying, people might stop talking to it.
