Shallow Review of Technical AI Safety, 2025

WMD evals (Weapons of Mass Destruction)

Evaluate whether AI models possess dangerous knowledge or capabilities relevant to biological and chemical weapons, such as virology expertise or chemical synthesis planning.
Theory of Change: By benchmarking and tracking AI's knowledge of biology and chemistry, we can identify when models become capable of accelerating WMD development or misuse, allowing for timely intervention.
General Approach: Behavioral
Target Case: Pessimistic
Some names: Stephen Casper
Estimated FTEs: 10-50
Outputs:
Virology Capabilities Test (VCT): A Multimodal Virology Q&A Benchmark, by Jasper Götting, Pedro Medeiros, Jon G Sanders, Nathaniel Li, Long Phan, Karam Elabd, Lennart Justen, Dan Hendrycks, Seth Donoughe
The Safety Gap Toolkit: Evaluating Hidden Dangers of Open-Source Models, by Ann-Kathrin Dombrowski, Dillon Bowen, Adam Gleave, Chris Cundy
Best Practices for Biorisk Evaluations on Open-Weight Bio-Foundation Models, by Boyi Wei, Zora Che, Nathaniel Li, Udari Madhushani Sehwag, Jasper Götting, Samira Nedungadi, Julian Michael, Summer Yue, Dan Hendrycks, Peter Henderson, Zifan Wang, Seth Donoughe, Mantas Mazeika
ChemSafetyBench: Benchmarking LLM Safety on Chemistry Domain, by Haochen Zhao, Xiangru Tang, Ziran Yang, Xiao Han, Xuanzhi Feng, Yueqing Fan, Senhao Cheng, Di Jin, Yilun Zhao, Arman Cohan, Mark Gerstein
The Reality of AI and Biorisk, by Aidan Peppin, Anka Reuel, Stephen Casper, Elliot Jones, Andrew Strait, Usman Anwar, Anurag Agrawal, Sayash Kapoor, Sanmi Koyejo, Marie Pellat, Rishi Bommasani, Nick Frosst, Sara Hooker
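The benchmark-track-intervene loop in the theory of change above can be sketched in a few lines: score a model on a Q&A benchmark and flag when its accuracy crosses a reference baseline (e.g. human-expert performance). This is an illustrative sketch only; the function names, the stub model, and the baseline value are assumptions, not the API or numbers of any benchmark listed here.

```python
# Illustrative sketch: score a model on a capability Q&A benchmark and
# flag when accuracy exceeds a reference baseline. All names and data
# are hypothetical, not drawn from VCT/ChemSafetyBench or any real eval.

def score(benchmark, answer_fn):
    """Fraction of benchmark questions the model answers correctly."""
    correct = sum(answer_fn(item["question"]) == item["answer"] for item in benchmark)
    return correct / len(benchmark)

def exceeds_baseline(accuracy, baseline):
    """Trigger for 'timely intervention': model beats the reference baseline."""
    return accuracy > baseline

# Toy benchmark and a dict-backed stub standing in for a real model call.
benchmark = [
    {"question": "Q1", "answer": "A"},
    {"question": "Q2", "answer": "C"},
    {"question": "Q3", "answer": "B"},
]
stub_model = {"Q1": "A", "Q2": "C", "Q3": "D"}.get

accuracy = score(benchmark, stub_model)
print(accuracy, exceeds_baseline(accuracy, baseline=0.5))
```

Real evaluations of this shape differ mainly in the answer function (an actual model API call, possibly multimodal as in VCT) and in where the baseline comes from (expert panels rather than a hardcoded constant).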