AI Labs Lack Robust Safety Measures, Study Finds
A new study has revealed that some of the leading AI labs around the world lack sufficient safety measures, with Elon Musk’s xAI identified as the worst offender.
The French nonprofit SaferAI released its initial ratings on Wednesday, evaluating the risk-management practices of top AI companies. Siméon Campos, the founder of SaferAI, explains that the purpose of the ratings is to establish a clear standard for how AI companies handle risk as these emerging systems grow in power and usage. AI systems have already demonstrated the capacity to cause harm. Governments have been slow to implement regulatory frameworks: a California bill aimed at regulating the AI industry in the state was vetoed by Governor Gavin Newsom.
“AI is a rapidly evolving technology, but AI risk management isn’t keeping pace,” Campos says. “Our ratings are here to fill the gap until we have governments conducting these assessments themselves.”
To evaluate each company, SaferAI researchers assessed the “red teaming” of models—technical efforts to uncover flaws and vulnerabilities—along with the companies’ strategies for modeling threats and mitigating risks.
Among the six companies assessed, xAI ranked last with a score of 0 out of 5. Meta and Mistral AI were also categorized as having “very weak” risk management. OpenAI and Google DeepMind received “weak” ratings, while Anthropic led the pack with a “moderate” score of 2.2 out of 5.
xAI received the lowest possible score because it has published little information on risk management, Campos explains. He hopes the company will prioritize risk management now that its model Grok 2 is competing with ChatGPT and other systems. “I hope this is temporary: that they will publish something in the next six months, and then we can update their grade accordingly,” he says.
Campos believes these ratings could encourage companies to improve their internal processes, potentially reducing model bias, curbing the spread of misinformation, or making models less susceptible to misuse by malicious actors. He also hopes the companies will adopt principles similar to those used in high-risk industries like nuclear power, biosafety, and aviation safety. “Despite these industries dealing with vastly different areas, they share very similar principles and risk management frameworks,” he says.
SaferAI’s grading framework has been designed to align with some of the world’s most important AI standards, including those set forth by the EU AI Act and the G7 Hiroshima Process. SaferAI is part of the U.S. AI Safety Institute Consortium, which was established by the White House in February. The nonprofit is primarily funded by the tech nonprofit Founders Pledge and the investor Jaan Tallinn.
Yoshua Bengio, a highly respected figure in AI, endorsed the ratings system, stating that he hopes it will “guarantee the safety of the models [companies] develop and deploy…We can’t let them grade their own homework.”
Correction, Oct. 2: The original version of this story misstated how SaferAI graded the companies. Its researchers assessed the “red teaming” procedures of the models; they did not conduct their own red teaming.