Organize new AI‑safety organizations around heavy use of AI automation and agentic workflows (evaluations, red‑teaming, data curation, reporting) so that a small, lean team can scale safety work against rapidly improving capabilities. Such labs treat automated tooling and agentic pipelines as the core product, rather than as an augmentation to large human teams.
If successful, such labs would change who can produce credible safety evaluations, accelerate the development of safety tooling, and shift regulatory and funding questions toward provenance, auditability, and the governance of automated testing pipelines.
Tyler Cowen
2026.04.15
85% relevant
The quoted claim that "Anthropic’s automated alignment researchers already outperform humans" directly exemplifies automation‑first approaches to alignment research, reinforcing the pattern that safety work itself is being automated (actor: Anthropic; claim: automated alignment researchers outperforming humans).
Scott Alexander
2026.01.05
100% relevant
ACX grantee Jacob Arbeid is soliciting a cofounder for a grant‑funded "automation‑first" AI safety lab that would scale evaluations and safety engineering with AI agents (article item #2).