Shallow Review of Technical AI Safety, 2025

Target Cases

This axis categorizes agendas by their assumptions about how difficult alignment is: "pessimistic" approaches assume alignment is hard, while "optimistic" approaches assume current techniques may be sufficient.

Inspired by: Defining Alignment Research

Average Case (25 agendas)

Focuses on typical expected outcomes rather than extreme scenarios. Emphasizes practical safety measures that work well in normal operation, without necessarily handling every edge case.

Pessimistic (19 agendas)

Assumes alignment is difficult and that achieving safe AI requires substantial effort, novel breakthroughs, or solutions to hard open problems. Prioritizes robustness against adversarial or deceptive AI behavior.

Worst Case (18 agendas)

Designs for the most challenging possible scenarios, including highly capable adversarial AI systems. Prioritizes formal guarantees and provable safety properties over practical convenience.