The Great Security Update: AI ∧ Formal Methods with Kathleen Fisher of RAND & Byron Cook of AWS
Hello, and welcome back to the Cognitive Revolution!
Two quick notes before we jump in:
First, applications for the MATS 2026 Summer program are now open. At NeurIPS this year, MATS fellows presented many papers at mechanistic interpretability and alignment workshops, and 6 MATS fellows had spotlight papers – so if you're interested in a career in AI safety research, you should definitely consider applying. I'm also excited to say that I've booked MATS Executive Director Ryan Kidd for a full episode early in the new year, so listen to that and be prepared to submit your application by the January 18th deadline.
Second, we're planning to record another AMA episode, probably in the first week of 2026. Visit our website, cognitiverevolution.ai, and click the link at the top to submit your question, or feel free to DM me on your favorite social network.
With that, today, we're diving into the world of automated reasoning and formal verification of software.
My guests, Kathleen Fisher and Byron Cook, are legends in this underappreciated but increasingly important field.
Kathleen famously led the High-Assurance Cyber Military Systems (HACMS) project at DARPA and is currently the Director of the Cybersecurity Initiative at RAND. Starting in February, she will take over as CEO of the UK's Advanced Research and Invention Agency, ARIA, which is often described as the UK's DARPA, though its mission goes beyond military technology, aiming to unlock scientific and technological breakthroughs that benefit everyone.
Byron, meanwhile, is Vice President and Distinguished Scientist at Amazon, where he's made a major contribution to cloud security by leading the application of formal methods to distributed systems at AWS, which, despite being arguably the world's biggest target for cyberattackers, has maintained an amazingly strong security record.
In all honesty, I don't think I've ever felt more outclassed by my guests than I did in this conversation. My own math career topped out at differential equations and complex analysis in college, and I've never been strong when it comes to mathematical proofs or formal logic.
Nevertheless, I'm hearing more and more in AI circles, not just about the need to harden critical infrastructure against cyberattacks before AI-powered hacking becomes ubiquitous, but specifically about the unique power of formal methods to deliver true information security guarantees. So I was eager to learn as much as I could, and honored that these masters were willing to answer my remedial questions.
We cover a lot of ground in this conversation, from the nature of the cybersecurity threats that AI poses, to the relationship between software specifications and the proofs generated by formal methods, to the critical role of assumptions, and how we can be confident that many low-level logical statements do in fact add up to system-level guarantees.
We also discuss how these methods can be used to create a reward signal for coding models, and why, despite what we've seen from AI coding assistants to date, we should expect that LLMs will, over the next generation or two, achieve superhuman levels of code security and thus create the opportunity for a great society-wide software rewrite that could dramatically and durably reduce cybersecurity risks.
Finally, we explore how AWS is applying these methods to AI agents in the form of their new "automated reasoning checks" product. In short, the service translates natural language policies, such as a company's HR handbook or a city's zoning laws, into a set of variables and rules that a user can iteratively refine by running test cases, and then maps an AI agent's outputs onto that structure, making it possible to use formal methods to check the agent's work for policy compliance. While the translation from natural language to formal rules does create some gray area that isn't present in the traditional process of formally verifying explicitly coded software, this approach stands out to me as the strongest paradigm I've seen for implementing AI agent guardrails.
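To make the shape of that workflow concrete, here is a minimal, purely illustrative Python sketch, not AWS's actual API or encoding: a hypothetical HR policy is reduced to a few made-up variables and rules, and an agent's answer, mapped onto those variables, is checked against every rule. A real automated reasoning check would use a logical solver rather than hand-written predicates, but the translate, test, and verify loop has the same basic structure.

```python
# Conceptual sketch only (hypothetical policy and variable names, not the AWS service).
from dataclasses import dataclass
from typing import Callable, Dict, List

# An "assignment" maps policy variables to values extracted from an agent's answer.
Assignment = Dict[str, object]

@dataclass
class Rule:
    name: str
    check: Callable[[Assignment], bool]  # returns True if the assignment satisfies the rule

# Hypothetical rules distilled from an HR-handbook-style policy:
# remote work may only be approved with 6+ months of tenure and manager sign-off.
rules: List[Rule] = [
    Rule("tenure_required",
         lambda a: not a["remote_work_approved"] or a["tenure_months"] >= 6),
    Rule("manager_signoff",
         lambda a: not a["remote_work_approved"] or a["manager_approved"]),
]

def check_policy(assignment: Assignment) -> List[str]:
    """Return the names of any rules the assignment violates."""
    return [r.name for r in rules if not r.check(assignment)]

# The agent told the employee "yes, you can work remotely"; its claim is mapped onto the variables.
agent_claim: Assignment = {
    "remote_work_approved": True,
    "tenure_months": 3,
    "manager_approved": True,
}

violations = check_policy(agent_claim)
print(violations or "Consistent with policy")  # -> ['tenure_required']
```

The payoff of encoding the policy this way is that every rule is checked mechanically and exhaustively, so a violation like the one above is caught even when the agent's answer sounds perfectly reasonable.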
So much so, in fact, that toward the end, for the first time ever, I found myself worrying that AIs might one day become so consistent in their adherence to policy that we could lose the flexibility to make exceptions when official policy doesn't fit a specific situation.
That, ultimately, is a problem for another day. For now, the bottom line is that, with threats ramping up everywhere, the techniques we discuss today constitute one of very few strategies I'm aware of that could "really work", in the sense that they're capable of resolving specific critical vulnerabilities once and for all.
This is challenging material. If you're like me, and find that some of this conversation is over your head, I would encourage you to pause and chat with your favorite frontier language model to clarify key concepts. I'm confident that if you do, your AI worldview will be stronger for it.
And so, I hope you enjoy this technically demanding but deeply optimistic conversation about the future of secure systems, with Kathleen Fisher and Byron Cook.
Watch now!
Thank you for being part of The Cognitive Revolution,
Nathan Labenz