Safety by construction
Approaches that aim to obtain assurances about system outputs while remaining scalable.
Guaranteed-Safe AI
5 papers
Have an AI system generate outputs (e.g. code, control systems, or RL policies) that it can quantitatively guarantee comply with a formal safety specification and world model.
Scientist AI
2 papers
Develop powerful, non-agentic, uncertain world models that accelerate scientific progress while avoiding the risks of agentic AIs.
Brain-like AGI Safety
6 papers
Social and moral instincts are (partly) implemented in particular hardwired brain circuitry; let's figure out what those circuits are and how they work (this will involve symbol grounding). The agenda models AGI as "a yet-to-be-invented variation on actor-critic model-based reinforcement learning".