Our Take
Sequent is betting that alignment methods need principled theory, not just empirical tweaks. That is a direct critique of how frontier labs work, and it assumes the field has years, not months, to make the shift.
Why it matters
Recursive self-improvement and autonomous research delegation will demand alignment confidence we don't yet have. Independent research organizations have the institutional freedom to raise alarms; labs often do not.
Do this week
Safety engineers: review Sequent's research directions (scalable oversight, learning theory, game theory, personas) and map which ones address your deployment's top failure mode before committing to lab-only methods.
New nonprofit tackles alignment theory before superintelligence
Researchers from the UK AI Security Institute's Alignment team and alignment theory startup Timaeus have formed Sequent, a nonprofit research organization focused on developing alignment techniques that provide confidence in superintelligent AI safety before deployment.
Sequent's founding statement is direct: "Artificial superintelligence may be developed in the next few years. It is unclear whether alignment is on track to be ready on the same timeframe." The organization plans to reach 40–80 full-time employees within two years. It is initially seeking $100–150 million and plans to raise an order of magnitude more if its research directions prove successful.
The research strategy diverges from frontier lab approaches. Sequent aims to find "principled reasons for being confident that the alignment we observe in situations we control (for example, in training, or during evaluations in chosen environments) generalizes to alignment in situations we cannot easily control." This contrasts with what Sequent describes as the "essentially reactive" methods of major AI labs, which are "functional" but "do not yield principled insight into if or when they will fail."
Sequent's research portfolio includes scalable oversight, learning theory, heuristic arguments, game theory, and personas. The organization expects synergies across these directions: for instance, combining insights from learning theory and personas to identify which training variables can be adjusted, then using scalable oversight to determine adjustment magnitudes.
Independence is the point
Today's AI systems are broadly aligned but exhibit sharp, surprising failures in real-world deployment. The industry has managed this through monitoring and iteration. But as systems grow more capable and begin to improve themselves recursively, humans lose observability and control.
Sequent exists partly because independent organizations can raise public alarms that labs cannot. As Sequent notes in its announcement, "we might need to yell." The structure creates space for researchers to publish negative findings or push back on deployment timelines without internal pressure to minimize safety concerns or avoid slowing product schedules.
The timing matters. If superintelligence arrives within the next few years and empirical alignment methods remain reactive rather than theory-grounded, the field will face deployment decisions with incomplete confidence. Sequent is explicitly betting that the gap between "methods that work on tasks we control" and "methods that guarantee safety on tasks we do not" is the core technical problem of the next phase.
Audit where your alignment confidence comes from
If your deployment relies on alignment techniques developed and validated only within a single lab's ecosystem, map which research directions, whether Sequent's or those of comparable organizations, address your specific failure modes. Lab teams optimize for capability metrics and shipping timelines; independent researchers optimize for theoretical robustness and worst-case behavior. Neither is wrong, but they are not interchangeable.
For teams deploying systems that will make autonomous research decisions or undergo recursive self-improvement, confirm that your safety case rests on more than empirical tuning. Ask whether your lab's alignment approach yields "principled insight into if or when" it will fail. If the answer is "not yet," Sequent's portfolio suggests which research directions to monitor or fund independently.