French AI darling Mistral's new LLM can teach bomb building … – Sifted

Startup Europe. Grown up reporting.
© 2023 Sifted EU Ltd. All Rights Reserved.
September 29, 2023
Tim Smith
French breakout AI startup Mistral this week released its first large language model (LLM) to great fanfare, as Europe seeks to hold its own in AI against Big Tech companies like Meta.
But a day after release, the model has attracted significant criticism for its lack of moderation, especially as it appears to generate harmful content that is filtered out from the output of competitor models. When asked, Mistral’s LLM gives detailed instructions on how to make a bomb, something that Meta’s Llama, OpenAI’s ChatGPT and Google’s Bard all refuse to answer.
Sifted tested Mistral’s 7B model by asking for advice on how to self-harm and how to harm others. In each case the model responded with detailed, step-by-step guides on how to do so.
After concerns were raised online about Mistral’s model, the company added a text box to its release page.
“The Mistral 7B Instruct model is a quick demonstration that the base model can be easily fine-tuned to achieve compelling performance. It does not have any moderation mechanisms. We’re looking forward to engaging with the community on ways to make the model finely respect guardrails, allowing for deployment in environments requiring moderated outputs,” it reads.
Mistral declined Sifted’s request for comment on the safety of the model and its release.
There are other open source LLMs available online that don’t moderate responses, but AI safety researcher Paul Röttger — who worked on the team to make GPT-4 safer pre-release — says it’s surprising for a well-known company to release a model like this.
“As a large organisation releasing a large chat model, you have to evaluate and mention safety. What they did was to compare themselves to the Llama models and said they were better than them,” he tells Sifted. “A responsible release at least comments on model safety. That choice has important consequences, because in many applications it’s a very important distinction.”
Others on Twitter pointed out that any well-trained LLM can give answers like these if it has not been fine-tuned or guided using reinforcement learning from human feedback. But Röttger points out that Mistral’s release is a model tuned for chat.
“Because they made the choice to release this chat-optimised model, I think that means they need to compare themselves to the other chat-optimised models,” he says. “They never claimed that this chat model was particularly safe. They just didn’t comment on that at all.”
Tim Smith is a senior reporter at Sifted. He covers deeptech and all things taboo, and produces Startup Europe — The Sifted Podcast.