Google's new AI chatbot beats OpenAI, human experts in tests – The Australian Financial Review
Google has launched what it says is the most advanced artificial intelligence invented, Gemini, leapfrogging OpenAI in the race to render human brains redundant.
The “multimodal” AI, trained from the ground up to handle questions asked via a mix of audio, photos, video and text, outperformed every other AI including OpenAI’s GPT-4 model in 30 out of the 32 most popular industry benchmarks, Google officials claimed.
It was also the first AI to outperform human experts in a key benchmark known as the Massive Multitask Language Understanding (MMLU) test, Google said.
The most advanced version of the AI, known as Gemini Ultra, will not be available until next year, but an intermediate version known as Gemini Pro will begin to power Google’s free chatbot, Bard, as of Thursday, Google said.
The version of Bard powered by Gemini Ultra, known as Bard Advanced, might well be a paid-for service.
“We’ll explore what monetisation might look like, but we don’t have anything specific on that right now,” Sissie Hsiao, vice president in charge of Assistant and Bard at Google, said.
A third, cut-down version of the AI, Gemini Nano, will appear in Android phones starting with Google’s Pixel 8 Pro phone, to answer complex voice, video, photo and written questions on the phone itself, without the need for an internet connection.
In a global launch event, the company showed off Gemini Ultra performing a range of tasks which, until now, were generally reserved for humans.
In one pre-recorded demonstration, Gemini Ultra was shown a photo of a child’s physics homework, and was able to read it, mark it, and explain the maths and physics errors the child had made, going into levels of detail far beyond what most parents would be capable of.
In another demonstration, two objects were held up in front of a webcam – an orange and a fidget spinner – and the AI was able to identify them both and explain that citrus and the spinner had something in common: they both could be “calming”.
Eli Collins, the vice president in charge of product at Google DeepMind, which developed Gemini, said one of the main features of Gemini was that it was less likely to “hallucinate” than other AIs.
“Improving the accuracy of responses was one of the core training objectives of the model. When we talk about getting a better score on these benchmarks, it’s often a result of improving Gemini’s ability to reason and to answer questions factually,” he said.
(And, indeed, the Google search engine does contain plenty of references to the “calming” effects of both citrus and fidget spinners.)
When the orange was replaced by a Rubik’s Cube, the AI identified they were both examples of toys that adults, as well as children, play with.
Google also showed off Gemini figuring out what a complex join-the-dots puzzle was depicting, before anyone even joined the dots (“This is a picture of a crab,” the AI pre-empted). The AI also watched as someone performed a simple sleight of hand with a ball and three cups, and correctly predicted the ball would be in the left cup.
But Gemini would not just be used for homework, puzzles and party tricks, Google said.
In tests where it was pitted against (presumably talented) human software developers attending a coding competition, Gemini was better than 85 per cent of them, Mr Collins said.
Starting on Wednesday, Gemini would be used to power Google’s software-writing platform, AlphaCode, where it would be able to solve “nearly twice as many problems” as the previous AI, he said.
Not only was the new AI inherently multimodal and able to program in a number of programming languages, it was also inherently multilingual, having been trained on more than 100 human languages.
But, as with Google’s previous AIs, it will speak only one language at first: English. Other languages would quickly follow, Mr Collins said.