Target Cases
Assumptions about how difficult alignment is. "Optimistic" approaches assume current techniques may be sufficient, "pessimistic" approaches assume alignment is hard, and "worst-case" approaches require safety to hold even against adversarial systems.
Inspired by: Defining Alignment Research
Typical Case
Focuses on typical expected outcomes rather than extreme scenarios, emphasizing practical safety measures that work well in normal operation without necessarily handling every edge case.
Pessimistic Case
Assumes AI alignment is difficult and that achieving safe AI requires substantial effort, novel breakthroughs, or solutions to hard open problems. Prioritizes robustness against adversarial or deceptive AI behavior.
Worst Case
Designs for the most challenging possible scenarios, including highly capable adversarial AI systems. Prioritizes formal guarantees and provable safety properties over practical convenience.