AI safety research based in Tokyo.
We are a research lab focused on AI safety, alignment, and governance. Our work spans evolutionary approaches, multi-agent coordination, and building interpretable AI systems.
News
Vulnerability Research in the Age of AI: Deconstructing Mythos
A review of automated zero-day discovery across operating systems, hypervisors, and browsers, with Pwn2Own winner Thanh Do.
Paper accepted at ICML 2026 Workshop TAIGR
The paper "Direct Causation in International Humanitarian Law and the Challenge of AI-Mediated Civilian Cyber Operations," authored by Alice and Harold, has been accepted at the ICML 2026 Workshop TAIGR!
Paper 'Evolving Interpretable Constitutions' accepted to ICML 2026
The paper "Evolving Interpretable Constitutions for Multi-Agent Coordination," authored by Ujwal K. with co-authors Alice Saito, Hershraj Niranjani, and Rayan Yessou, has been accepted as a regular paper at the ICML 2026 Main Conference!
Strahinja Janjusevic Is Securing Critical Maritime Infrastructure with AI
Strahinja (Strajo) Janjusevic, a graduate researcher at the MIT Laboratory for Information and Decision Systems (LIDS), is developing AI-driven solutions to secure critical maritime infrastructure. His work bridges technical defense and policy, training AI systems to identify spoofed signals and help operators distinguish technical glitches from strategic cyberattacks.
Why We Study Multi-Agent Coordination
A look into why multi-agent coordination is central to AI alignment research.
Evolving Interpretable Constitutions for Multi-Agent Coordination
We introduce a method for evolving constitutions in multi-agent systems to improve interpretability and alignment.