Alibaba’s Latest A.I. Beats GPT-3.5, Claude In Multiple Benchmark Tests – Wccftech
This is not investment advice. The author has no position in any of the stocks mentioned. Wccftech.com has a disclosure and ethics policy.
With 2024 marking a strong start to the global artificial intelligence race, Chinese technology giant Alibaba Group has announced the latest iteration of its Qwen artificial intelligence model. Apart from OpenAI’s ChatGPT, the most well-known A.I. chatbot in the world, consumers and businesses can also choose models such as Meta’s Llama and Amazon partner Anthropic’s Claude when picking an A.I. platform for their needs.
Alibaba’s latest Qwen iteration is Qwen 1.5, and according to benchmarks shared on the social media platform X, the model beats both ChatGPT and Claude in some benchmark scores.
Just like operating systems that run on computers or smartphones, an artificial intelligence model is also a piece of software. This allows software engineers and analysts to evaluate its performance, and when it comes to Alibaba’s latest Qwen 1.5, some scores show that it outperforms Anthropic’s Claude and OpenAI’s ChatGPT.
Benchmarks for operating systems evaluate their ability to process instructions and run applications, while those for artificial intelligence models typically test the models’ ability to generate outputs.
Two such benchmarks are MT-bench and Alpaca-Eval, and scores shared on X show that a variant of Alibaba’s Qwen 1.5 has surpassed ChatGPT and Claude in both. MT-bench tests a model’s ability to answer a set of predefined questions that seek not only to differentiate it from other chatbots but also to determine whether the model can ‘hold its ground’ in a tough conversational setting in which two parties rapidly engage with each other.
The benchmark scores show that Qwen was the fourth highest scorer in MT-bench, and it only lagged behind GPT-4 Turbo and the first two GPT-4 releases, namely versions 0613 and 0314.
Alibaba releases Qwen 1.5
demo: https://t.co/goMcWMsIzT
largest open-source Qwen1.5-72B-Chat, exhibits superior performance, surpassing Claude-2.1, GPT-3.5-Turbo-0613, on both MT-Bench and Alpaca-Eval v2 pic.twitter.com/50dNuUpEBx
— AK (@_akhaliq) February 5, 2024
Alpaca-Eval is a benchmark that uses a reference model to emulate human interactions and determine the extent to which an A.I. model being tested delivers results in line with the baseline. It also provides users with a leaderboard to track their tests, and today’s benchmarks show that Qwen 1.5’s Alpaca-Eval performance only lags behind GPT-4 Turbo and 01.AI’s Yi-34B.
Qwen 1.5 is one of the largest open source models of its kind, and it’s backed by Alibaba’s massive computing resources. An open source A.I., like open source software, makes its code available to users and developers so that they can understand the model and make their own variants. Meta’s Llama, which also appears in today’s scores, is an open source model as well.
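To make the baseline comparison described above concrete, here is a minimal sketch of how a score of this kind could be computed. It assumes the result is summarized as a win rate over pairwise judgments between the tested model and the baseline. The function name, the label strings and the sample judgments are all hypothetical and are not taken from the benchmark's own code or from the scores shared on X.

```python
from typing import Sequence


def win_rate(preferences: Sequence[str]) -> float:
    """Percentage of comparisons won by the tested model.

    Each entry is a hypothetical label: "candidate" if the tested model's
    answer was preferred, "baseline" if the baseline's answer was preferred,
    or "tie". A tie counts as half a win (an assumption of this sketch).
    """
    if not preferences:
        raise ValueError("need at least one comparison")
    score = 0.0
    for label in preferences:
        if label == "candidate":
            score += 1.0
        elif label == "tie":
            score += 0.5
        elif label != "baseline":
            raise ValueError(f"unknown label: {label!r}")
    return 100.0 * score / len(preferences)


# Made-up judgments for four prompts, for illustration only.
judgments = ["candidate", "baseline", "candidate", "tie"]
print(win_rate(judgments))  # 62.5
```

A higher number means the tested model's answers were preferred over the baseline's more often, which is the kind of figure a leaderboard can then rank.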
The start of 2024 has seen renewed focus on A.I. from Wall Street and companies. Earnings reports of mega cap technology giants such as Meta, Microsoft and Alphabet have all focused on A.I. Meta’s chief Mark Zuckerberg aims to buy hundreds of thousands of GPUs this year to power up Llama, and at the firm’s earnings call the executive explained that his decision to beef up computing capacity at Meta follows earlier oversights that led to the firm being under capacity.
Similarly, the managements of chip maker TSMC and chip designer AMD have expressed optimism about the future of A.I. in their earnings reports. TSMC’s management is confident that the firm has stable footing to capture any A.I. demand, while AMD is of the view that A.I. could be worth hundreds of billions of dollars by the end of the decade.
© 2024 WCCF TECH INC. 700 – 401 West Georgia Street, Vancouver, BC, Canada