Aligning what?
Develop alternatives to agent-level models of alignment, by treating human-AI interactions, AI-assisted institutions, AI economic or cultural systems, drives within one AI, and other causal/constitutive processes as subject to alignment
Theory of Change: Modeling multiple reality-shaping processes above and below the level of the individual AI, some of which are themselves quasi-agential (e.g. cultures) or intelligence-like (e.g. markets), will develop AI alignment into a mature science for managing the transition to an AGI civilization
Some names: Richard Ngo, Emmett Shear, Softmax, Full Stack Alignment, AI Objectives Institute, Jan Kulveit
Estimated FTEs: 5-10
Outputs:
Towards a scale-free theory of intelligent agency— Richard Ngo
Alignment first, intelligence later— Chris Lakin
Collective cooperative intelligence— Wolfram Barfuss, Jessica Flack, Chaitanya S. Gokhale, Lewis Hammond, Christian Hilbe, Edward Hughes, Joel Z. Leibo, Tom Lenaerts, Naomi Leonard, Simon Levin, Udari Madhushani Sehwag, Alex McAvoy, Janusz M. Meylahn, Fernando P. Santos
Multipolar AI is Underrated— Allison Duettmann
A Phylogeny of Agents— Equilibria
Hierarchical Agency: A Missing Piece in AI Alignment— Jan Kulveit
Emmett Shear on Building AI That Actually Cares: Beyond Control and Steering— Emmett Shear, Erik Torenberg, Séb Krier