Shallow Review of Technical AI Safety, 2025

China

The Chinese companies make little attempt to be safe, often not even in the prosaic-safeguards sense: they drop the weights immediately after post-training finishes. Their models are mostly open-weights with closed data. As of writing, the companies are often severely compute-constrained, and there are some informal reasons to doubt their capabilities. The (academic) Chinese AI safety scene is, however, growing.

  • Alibaba's Qwen3-etc-etc is nominally at the level of Gemini 2.5 Flash. It is maybe the only Chinese model with a large Western userbase, including businesses, but since it's self-hosted, this doesn't yet translate into profits for Alibaba. On one ad hoc test it was the only Chinese model not to collapse out-of-distribution, but the Qwen2.5 corpus was severely contaminated.
  • DeepSeek's v3.2 is nominally at around the same level as Qwen. The CCP made them waste months trying out Huawei chips.
  • Moonshot's Kimi-K2-Thinking has some nominally frontier benchmark results and a pleasant style but does not seem frontier.
  • Baidu's ERNIE 5 is again nominally very strong, a bit better than DeepSeek. This new one seems not to be open.
  • Z's GLM-4.6 is around the same as Qwen. The product director was involved in the MIT Alignment group.
  • MiniMax's M2 is nominally better than Qwen, around the same as Grok 4 Fast on the usual superficial benchmarks. It does fine on one very basic red-team test.
  • ByteDance does impressive research in a lagging paradigm, diffusion LMs.
  • There are others, but they're marginal for now.