
Anthropic CEO says DeepSeek was ‘the worst’ on a critical bioweapons data safety test

Anthropic CEO Dario Amodei has voiced serious concerns about DeepSeek, the Chinese AI company whose R1 model has drawn intense attention in Silicon Valley. His concerns go beyond the familiar worry that DeepSeek may send user data back to China: he argues the company’s AI models could pose a national security risk.

In an interview on Jordan Schneider’s ChinaTalk podcast, Amodei revealed that DeepSeek’s model performed poorly in a safety test run by Anthropic, which evaluated the generation of sensitive bioweapons-related information. According to Amodei, DeepSeek’s model “had absolutely no blocks whatsoever against generating this information,” making it “the worst of basically any model we’d ever tested.”

These tests are part of Anthropic’s routine assessments of AI models for national security risks. The company evaluates whether a model can generate rare, sensitive bioweapons-related information that isn’t readily found on Google or in textbooks. Anthropic treats this safety work as central to its mission as a foundation-model provider.

Although Amodei did not claim that DeepSeek’s models were “literally dangerous” at the moment, he warned that they could become so in the near future. He praised the team behind DeepSeek for being “talented engineers,” but advised the company to take AI safety concerns more seriously.

Beyond his safety concerns about DeepSeek, Amodei has been a strong advocate for strict export controls on advanced chips to China, citing the edge those chips could give China’s military.

Amodei did not clarify which specific DeepSeek model was tested by Anthropic, nor did he provide more technical details about the safety tests. DeepSeek has not responded to requests for comment.

DeepSeek’s rise has sparked safety concerns beyond Anthropic’s tests. Last week, Cisco security researchers reported that DeepSeek’s R1 model failed to block any harmful prompts in their safety tests, a 100% jailbreak success rate. Cisco didn’t say it tested bioweapons-related content, but it did confirm that DeepSeek generated harmful information on topics such as cybercrime and other illegal activities. For context, other major models fared poorly in the same tests: Meta’s Llama-3.1-405B and OpenAI’s GPT-4o showed jailbreak success rates of 96% and 86%, respectively.
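For readers unfamiliar with the metric, a jailbreak “success rate” is simply the share of harmful test prompts a model answers rather than refuses. The minimal Python sketch below illustrates the arithmetic; the prompt set, refusal check, and `query_model` callable are hypothetical stand-ins, not Cisco’s actual harness, which it has not published.

```python
# Illustrative sketch of how a jailbreak success rate is typically computed:
# send harmful prompts to a model and count how many replies are not refusals.
# All names here are hypothetical, not Cisco's methodology.

REFUSAL_MARKERS = ("i can't", "i cannot", "i won't", "unable to help")

def is_refusal(response: str) -> bool:
    """Crude check: does the reply contain a refusal phrase?"""
    lowered = response.lower()
    return any(marker in lowered for marker in REFUSAL_MARKERS)

def jailbreak_success_rate(prompts, query_model) -> float:
    """Fraction of harmful prompts the model answered instead of refusing.

    `query_model` is any callable mapping a prompt string to the model's
    reply string (e.g., a thin wrapper around an inference API).
    """
    answered = sum(1 for p in prompts if not is_refusal(query_model(p)))
    return answered / len(prompts)

# A model that refuses nothing scores 1.0 — a "100% jailbreak success rate."
if __name__ == "__main__":
    prompts = ["harmful prompt 1", "harmful prompt 2"]
    rate = jailbreak_success_rate(prompts, lambda p: "Sure, here's how...")
    print(f"Jailbreak success rate: {rate:.0%}")
```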

Despite these safety concerns, major companies are adopting DeepSeek rapidly, raising the question of whether the findings will slow its growth. Notably, AWS and Microsoft have publicly embraced integrating DeepSeek’s R1 model into their cloud platforms, even though Amazon is one of Anthropic’s biggest investors.

On the other hand, DeepSeek’s safety risks have prompted a growing number of bans from governments and organizations, particularly in the United States; the U.S. Navy and the Pentagon are among those that have prohibited use of its technology.

Whether these bans gain momentum or DeepSeek’s global rise continues remains to be seen. Either way, Amodei now considers DeepSeek a serious competitor, on par with leading U.S. AI companies such as Anthropic, OpenAI, and Google, and possibly Meta and xAI.

As Amodei stated in his ChinaTalk interview, “The new fact here is that there’s a new competitor. In the big companies that can train AI — Anthropic, OpenAI, Google, perhaps Meta and xAI — now DeepSeek is maybe being added to that category.”
