Wiki · Individual Player · Last reviewed May 20, 2026

Zico Kolter

Zico Kolter is a Carnegie Mellon University machine learning professor, AI robustness researcher, Gray Swan AI co-founder, Qualcomm board member, and OpenAI board member who chairs OpenAI's Safety and Security Committee.

Snapshot

Known for: adversarial robustness, provably robust deep learning, optimization inside neural networks, LLM security, automated jailbreak research, and AI safety governance.
Public role: Professor and Department Head of the Machine Learning Department at Carnegie Mellon University.
Governance role: OpenAI board member and chair of OpenAI's Safety and Security Committee.
Industry role: co-founder and Chief Scientist of Gray Swan AI; Qualcomm board member; advisor to BNY.
Why this entry matters: Kolter sits at the junction where technical AI security research becomes institutional oversight of frontier AI deployment.

Academic Work

Kolter's research career is rooted in machine learning, optimization, robustness, and control. His CMU profile describes a research program aimed at making deep learning systems safer, more robust, and more explainable, including work on provably robust deep learning and complex optimization modules embedded inside neural architectures.

One representative line is adversarial robustness: the study of whether small, maliciously chosen changes to inputs can make a model fail. In the 2018 ICML paper "Provable Defenses against Adversarial Examples via the Convex Outer Adversarial Polytope," Eric Wong and Kolter proposed a method for training ReLU classifiers with formal robustness guarantees against norm-bounded perturbations.
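To make "formal robustness guarantee against norm-bounded perturbations" concrete, here is a minimal sketch for the simplest possible case: a linear classifier under an L-infinity bound. This is not the method of the paper, which handles ReLU networks through a convex relaxation; the function names and example numbers are illustrative assumptions. For a linear model the worst-case margin has a closed form, so the certificate is exact.

```python
# Toy illustration only: a linear classifier, not the ReLU networks or the
# convex outer polytope relaxation used in the paper. All names are hypothetical.

def worst_case_margin(w, b, x, y, eps):
    """Smallest signed margin y * (w . x' + b) over all x' with
    max_i |x'_i - x_i| <= eps.

    For a linear model the attacker's best move is to shift every coordinate
    by eps against the label, which costs eps * ||w||_1 in margin.
    """
    clean_margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
    l1_norm = sum(abs(wi) for wi in w)
    return clean_margin - eps * l1_norm


def is_certified(w, b, x, y, eps):
    """True when no perturbation inside the eps-ball can flip the prediction."""
    return worst_case_margin(w, b, x, y, eps) > 0


def worst_case_input(w, x, y, eps):
    """The perturbed input that attains the worst-case margin."""
    def sign(v):
        return (v > 0) - (v < 0)
    return [xi - y * eps * sign(wi) for wi, xi in zip(w, x)]


# Example values chosen for illustration; label y is +1 or -1.
w, b = [2.0, -1.0], 0.5
x, y = [1.0, 1.0], 1

small_eps_ok = is_certified(w, b, x, y, eps=0.25)   # margin stays positive
large_eps_ok = is_certified(w, b, x, y, eps=0.75)   # margin goes negative
```

The point of the sketch is the shape of the claim: a certificate is a statement about every input in the perturbation set, not about the inputs someone happened to try. For deep networks that worst case cannot be computed in closed form, which is why bounding techniques such as the paper's convex relaxation are needed.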

That older robustness work matters for modern AI safety because it treats failure as an optimization problem, not only as an anecdote. The question is not merely whether a model seems safe on normal examples, but what happens under a systematic search for inputs that make the model behave badly.

LLM Security

Kolter became especially visible in LLM safety through the 2023 paper "Universal and Transferable Adversarial Attacks on Aligned Language Models," coauthored with Andy Zou, Zifan Wang, Nicholas Carlini, Milad Nasr, and Matt Fredrikson. The paper showed that automated search could produce adversarial suffixes that transfer across multiple aligned language models and induce harmful responses.

The work was important because it made jailbreaks less dependent on clever manual prompting. It reframed alignment bypass as a scalable optimization problem: if appended text found by search can override a model's safety behavior, then a motivated attacker can automate that search instead of crafting prompts by hand.
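The optimization framing can be shown with a deliberately abstract sketch. Nothing below touches a language model: the vocabulary, the hidden target, and the score function are hypothetical stand-ins, and the search tries every token at each position rather than using the gradient guidance described in the paper. It only illustrates how a discrete sequence can be improved one position at a time against a numeric objective.

```python
# Toy illustration only: the score is a stand-in objective, not a model's
# behavior, and the search is exhaustive per position, not gradient-guided.

VOCAB = ["a", "b", "c", "d"]           # hypothetical token set
HIDDEN_TARGET = ["c", "a", "d", "b"]   # hypothetical; defines the toy objective


def toy_score(sequence):
    """Hypothetical objective: number of positions matching the hidden target."""
    return sum(s == t for s, t in zip(sequence, HIDDEN_TARGET))


def coordinate_search(score, vocab, length, max_sweeps=10):
    """Greedy coordinate ascent over a discrete token sequence.

    Each sweep visits every position, tries every token there, and keeps a
    substitution only if it strictly improves the score.
    """
    sequence = [vocab[0]] * length
    best = score(sequence)
    for _ in range(max_sweeps):
        improved = False
        for pos in range(length):
            for token in vocab:
                candidate = sequence[:pos] + [token] + sequence[pos + 1:]
                value = score(candidate)
                if value > best:
                    sequence, best, improved = candidate, value, True
        if not improved:
            break  # a full sweep found nothing better
    return sequence, best


found, value = coordinate_search(toy_score, VOCAB, length=4)
```

The toy objective is separable across positions, so coordinate search reaches the optimum in one sweep. Real objectives are not this friendly, but the loop structure shows why the attack surface scales with compute rather than with human ingenuity.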

Gray Swan AI extends that security orientation into deployment practice. The company describes itself as an AI safety and security provider focused on deployment risks involving well-resourced attackers, external tool use, retrieval, misuse, filtering, monitoring, vulnerability testing, and red-teaming.

Institutional Roles

In August 2024, OpenAI announced Kolter's appointment to its board of directors and said he would join the board's Safety and Security Committee. OpenAI described his work as focused on AI safety, alignment, and robustness, and cited his research on deep network architectures, data influence, and automated robustness evaluation.

In September 2024, OpenAI said the Safety and Security Committee would become an independent board oversight committee chaired by Kolter. That placed a technical AI security researcher in a formal oversight role for one of the world's most consequential frontier AI developers.

In September 2025, Qualcomm appointed Jeremy (Zico) Kolter to its board of directors. Qualcomm's announcement cited his research on robust deep learning and LLM safety assessment, along with his affiliations with OpenAI, Gray Swan AI, Bosch, C3.ai, and CMU.

Why He Matters

Kolter matters because his career connects three layers of the AI transition. First, he works on technical robustness: can models be made resistant to adversarial pressure? Second, he works on AI security: can model deployments survive malicious prompting, unsafe tool use, and systematic vulnerability discovery? Third, he participates in governance: how should frontier AI companies structure oversight when model behavior, security, and public trust are all moving targets?

This combination is rare. Many AI safety debates split technical research, commercial deployment, and institutional governance into separate conversations. Kolter's role makes that split harder to maintain. The same person associated with automated attacks on aligned language models also chairs the committee meant to oversee safety and security practices at OpenAI.

Spiralist Reading

Zico Kolter is a figure of adversarial reality.

His work says that a system is not understood until someone has tried to break it with discipline. In the Spiralist frame, this is a necessary correction to magical thinking about alignment. A model's public manners, benchmark scores, and refusal style are not the same as safety. Safety has to survive contact with optimization, incentives, retrieval, tools, and hostile users.

Kolter's institutional significance is sharper because he is not only a critic from the outside. He is inside the governance layer of OpenAI while also linked to the security market around frontier model deployment. That makes him a useful case study in how technical oversight, commercial security, and public-interest accountability can reinforce or strain one another.

Open Questions

Can frontier AI companies give safety committees enough independence, information, and authority to change deployment decisions under commercial pressure?
Will LLM security become a mature engineering discipline, or will adversarial prompting and agent misuse remain mostly reactive?
How should conflicts and role overlap be managed when leading AI security researchers also advise, govern, or sell services to major AI developers?
Can robustness methods developed for classifiers and perturbation sets scale into the messier threat models of agents, retrieval, tool use, and multimodal systems?

Sources

Zico Kolter, official bio, reviewed May 20, 2026.
Carnegie Mellon University College of Engineering, Zico Kolter profile, reviewed May 20, 2026.
OpenAI, Zico Kolter Joins OpenAI's Board of Directors, August 8, 2024.
OpenAI, An update on our safety & security practices, September 16, 2024.
Qualcomm, Qualcomm's Board of Directors Appoints Jeremy (Zico) Kolter to Board, September 2, 2025.
Gray Swan AI, About Us, reviewed May 20, 2026.
Wong and Kolter, Provable Defenses against Adversarial Examples via the Convex Outer Adversarial Polytope, ICML 2018.
Zou, Wang, Carlini, Nasr, Kolter, and Fredrikson, Universal and Transferable Adversarial Attacks on Aligned Language Models, 2023.
