Approaching the AI Event Horizon? Part 2, w/ Abhi Mahajan, Helen Toner, Jeremie Harris, @8teAPi

Abhi Mahajan discusses AI's emerging role in biology and cancer treatment modeling. Helen Toner and Jeremie Harris examine automated AI research, superhuman systems, and US–China coordination challenges for maintaining global AI oversight.

Show Notes

Abhi Mahajan (@owlposting) explains how AI is reshaping biology and medicine, including foundation models to predict cancer treatment response and why he’s both skeptical and optimistic about current results. Helen Toner unpacks CSET’s “When AI Builds AI” report and why automated AI R&D is a major source of strategic surprise. Jeremie Harris then explores our lack of control over superhuman AI systems, fragile US–China coordination, and how to maintain situational awareness in a rapidly shifting landscape.

Use the Granola Recipe Nathan relies on to identify blind spots across conversations, AI research, and decisions: https://recipes.granola.ai/r/4c1a6b10-5ac5-4920-884c-4fd606aa4f53

LINKS:

Sponsors:

GovAI:

GovAI was founded ten years ago on the belief that AI would end up transforming our world. Ten years later, the organization is at the forefront of trying to help decision-makers in government and industry navigate the transition to advanced AI. GovAI is now hiring Research Scholars (one-year positions for those transitioning into AI policy) and Research Fellows (longer-term roles for experienced researchers). Both roles offer significant freedom to pursue policy research, advise decision-makers, or launch new initiatives. Applications close 22 February 2026. Apply at: https://www.governance.ai/opportunities

Blitzy:

Blitzy is the autonomous code generation platform that ingests millions of lines of code to accelerate enterprise software development by up to 5x with premium, spec-driven output. Schedule a strategy session with their AI solutions consultants at https://blitzy.com

Tasklet:

Tasklet is an AI agent that automates your work 24/7; just describe what you want in plain English and it gets the job done. Try it for free and use code COGREV for 50% off your first month at https://tasklet.ai

Serval:

Serval uses AI-powered automations to cut IT help desk tickets by more than 50%, freeing your team from repetitive tasks like password resets and onboarding. Book your free pilot and guarantee 50% help desk automation by week four at https://serval.com/cognitive

CHAPTERS:

(00:00) About the Episode

(03:54) Introducing Abhi and pipeline

(06:40) Biology's messy ground truth

(13:53) Noetik's tumor foundation model (Part 1)

(13:59) Sponsors: GovAI | Blitzy

(17:05) Noetik's tumor foundation model (Part 2)

(24:42) Calibrating AI biology impact

(30:53) China's biotech rise (Part 1)

(34:23) Sponsors: Tasklet | Serval

(37:12) China's biotech rise (Part 2)

(38:28) Reading biology ML critically

(46:00) Automated AI R&D workshop

(52:29) Software-only singularity debates

(01:03:34) Labs, policy, and oversight

(01:18:50) Infrastructure and Taiwan risk

(01:26:53) Export controls and DeepSeek

(01:30:33) Labs arms race dynamics

(01:47:26) Security and blind spots

(01:59:15) AI productivity and markets

(02:09:44) Closing reflections and outlook

(02:24:50) Outro

PRODUCED BY:

https://aipodcast.ing

SOCIAL LINKS:

Website: https://www.cognitiverevolution.ai

Twitter (Podcast): https://x.com/cogrev_podcast

Twitter (Nathan): https://x.com/labenz

LinkedIn: https://linkedin.com/in/nathanlabenz/

Youtube: https://youtube.com/@CognitiveRevolutionPodcast

Apple: https://podcasts.apple.com/de/podcast/the-cognitive-revolution-ai-builders-researchers-and/id1669813431

Spotify: https://open.spotify.com/show/6yHyok3M3BjqzR0VB5MSyk


Transcript

This transcript is automatically generated; we strive for accuracy, but errors in wording or speaker identification may occur. Please verify key details when needed.


Introduction

Hello, and welcome back to the Cognitive Revolution.

Coming up, you'll hear Part 2 of a marathon live show that I co-hosted with my friend Prakash, also known as @8teAPi on Twitter, in which we explore AI for Science, Recursive Self-Improvement, and Geopolitical Competition.  

I love doing deep dive episodes, but I can only cover so many topics that way, and so I'm experimenting with higher-intensity live shows as a way to deliver what I hope is the same high-quality analysis in a denser format.  

In the first half, which hit the feed yesterday, we talked to:

  • Professor James Zou of Stanford about his work on AI for Science;
  • Sam Hammond about how well the Trump administration is managing international AI competition;
  • and Shoshannah Tekofsky about AI Agent behavior in the wild.

In this second half, we talk to:

  • Abhi Mahajan, also known as @owlposting, about AI for Biology and Medicine, including the foundation models he's building at Noetik AI to better predict which patients will respond to which cancer treatments, and why, though he's skeptical of many AI for biology results that have been published to date, he does expect trends to continue to the point where AI is ultimately transformative for the field;

  • Then we talk to Helen Toner about a report that CSET just put out, called "When AI Builds AI", which summarizes conversations from a closed-door workshop in which participants tried but failed to establish any consensus expectation about the impact of automated AI R&D, ultimately leading to the conclusion that automated AI R&D is a major source of potential strategic surprise;

  • And then finally we have Jeremie Harris, talking about the very challenging position we find ourselves in, where we lack both the technical means to reliably control superhuman AI systems, and the trust and coordination mechanisms needed for the US and China to address this problem collaboratively – plus a bit of discussion of how he maintains situational awareness, and how our respective personal productivity stacks are evolving.

As you'll hear, the challenges of making sense of massive disagreement among leading experts, and simply keeping up to date with AI developments broadly come up repeatedly in these conversations, and to be honest, nobody has great solutions.  One that I can recommend, though, is using LLMs to help identify blindspots, and for that purpose I'm really enjoying the blind spot finder Recipe that I recently created in Granola.  Granola works at the operating system level, so it can capture all the audio into and out of your computer, including, if you wish, the contents of this video.  And its Recipe feature can work across sessions to identify trends, opportunities, or blind spots that only become apparent with that zoomed out view.  Obviously this is a tool that grows in value over time, but if you want to try it, I suggest downloading the app, starting a session while you play this episode, and then asking it to identify blind spots based on this conversation.  What's so cool about this feature, for active Granola users, is that the blind spots it identifies for you will be different from the ones it identifies for me.  

As I said last time, this was fun for me, but especially because it's a new format, I would love your feedback.  Do you feel you got as much value from this more time efficient approach as you usually do from our full deep-dive episodes?  Or did we miss the mark in some way?  Please let me know in the comments, or if you prefer by reaching out privately, via our website, cognitiverevolution.ai, or by DM'ing me on the social media platform of your choice.

With that, I hope you enjoy The Cognitive Revolution, LIVE, from February 11, co-hosted with @8teAPi.



Full Transcript

(0:00) Nathan Labenz: Hello, and welcome back to the Cognitive Revolution. Coming up, you'll hear part 2 of a marathon live show that I cohosted with my friend Prakash, also known as 8teAPi on Twitter, in which we explore AI for science, recursive self-improvement, and geopolitical competition.

I love doing full deep dive episodes, but I can only cover so many topics in that way. And so I am experimenting with higher intensity live shows as a way to deliver what I hope is the same high quality analysis, but in a denser format. In the first half, which hit the feed yesterday, we talked to Professor James Zou of Stanford about his work on AI for science, Sam Hammond about how well the US administration is doing to manage international AI competition, and Shoshannah Tekofsky about AI agent behavior in the wild. In this second half, we talked to Abhi Mahajan, also known as Owl Posting, about AI for biology and medicine, including the foundation models he's building at Noetik AI to better predict which patients will respond to which cancer treatments. And though he's skeptical of many AI for biology results that have been published to date, he does expect trends to continue to the point where AI is ultimately transformative for the field. Then we talked to Helen Toner about a report that CSET just put out called When AI Builds AI, which summarizes conversations from a closed-door workshop in which participants tried but failed to establish any consensus expectation about the impact of automated AI R&D, ultimately leading to the conclusion that automated AI R&D is simply a major source of potential strategic surprise. Then finally, we have Jeremie Harris talking about the very challenging position we find ourselves in, where we lack both the technical means to reliably control superhuman AI systems and the trust and coordination mechanisms needed for the US and China to address this problem collaboratively. Plus, a bit of discussion of how he maintains situational awareness and how our respective personal productivity stacks are evolving. As you'll hear, the challenges of making sense of such massive disagreement among leading AI experts and simply keeping up to date with AI developments coming at us daily come up repeatedly in these conversations. And to be real, nobody seems to have perfect solutions.

(2:53) Nathan Labenz: One partial solution that I can recommend, though, is using large language models to help identify blind spots. And for that purpose, I am really enjoying the blind spot finder recipe that I recently created on Granola. Granola works at the operating system level, so it can capture all of the audio into and out of your computer, including, if you wish, the contents of this episode. And its recipe feature can work across sessions to identify trends, opportunities, or yes, blind spots that only become apparent with that zoomed-out view.

Obviously, this is a tool that grows in value over time. But if you want to try it today, I suggest downloading the app, starting a session while you play this episode, and then asking it to identify blind spots based on this conversation. What is so cool about this feature, for active Granola users at least, is that the blind spots it identifies for you will be different from the ones it identifies for me. As I said last time, this was fun for me, but especially because it is a new format, I would love your feedback. Do you feel that you got as much value from this more time-efficient approach as you usually do from our full deep dive episodes, or did we miss the mark in some way? Please let me know in the comments, or if you prefer, by reaching out privately via our website, cognitiverevolution.ai, or by DMing me on your favorite social network. With that, I hope you enjoy the Cognitive Revolution live from February 11th, cohosted with 8teAPi.

(3:53) Prakash: I'm going to add Abhi Mahajan. Abhi is Owl Posting online, and he works on AI for cancer at Noetik AI.

(4:04) Abhi Mahajan: Abhi, welcome. Great to meet you both. Thanks for having me on.

(4:09) Nathan Labenz: You have the great distinction of being recommended to me as the person for AI and biology and the intersection of those two. So big shoes to fill, big reputation to live up to, but excited to meet. This is actually the first time we've properly spoken.

(4:24) Prakash: Yeah. And I learned from Ron Alpha that you built an entire competitive intelligence platform, LLM-based, to feed the clinical analysis pipeline. And also that Claude recommends every cancer drug it sees. So let's talk about that.

(4:41) Abhi Mahajan: Yeah. The typical way that a lot of biopharmas are interested in asset acquisition these days, as opposed to just developing their drugs from scratch—this is partially because China's pumping out a lot of very interesting preclinical assets. Why not just buy those for a few million dollars? They've already done the optimization. Let's just run those in patients. Most of the time, the way you look for these drugs is either you mine your personal network or you have these clinical trial aggregation platforms that help you do the job. Both of these are obviously lossy. A better way is just scrape the entire semantic web yourself and annotate every single investigational drug you find with your company's priorities, what you think is important to look for, modalities you're particularly interested in. Organize that all into a table, rank it by some metric, and then you give that to the therapeutics team to work off of. Obviously, there's still a human due diligence step. These models still are not perfect—even 5.2, 5.3, not perfect—but it's pretty good.
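To make the shape of this concrete, here is a minimal sketch of the kind of scrape, annotate, and rank loop described above. The priorities text, the data structures, and the call_llm helper are hypothetical placeholders, not Noetik's actual pipeline.

```python
# Illustrative sketch only: a minimal version of the scrape -> annotate ->
# rank loop described above. The priorities text, the data structures, and
# the call_llm() helper are hypothetical placeholders, not Noetik's pipeline.
import json
from dataclasses import dataclass

@dataclass
class DrugCandidate:
    name: str
    modality: str
    source_url: str  # e.g. the trial registry entry or publication the scraper found

PRIORITIES = (
    "Score this investigational drug from 0-10 against our priorities: "
    "modalities we care about, preclinical data quality, plausible deal size."
)

def call_llm(prompt: str) -> str:
    """Placeholder for whatever model API the pipeline actually calls."""
    raise NotImplementedError

def annotate(candidate: DrugCandidate) -> dict:
    prompt = (
        f"{PRIORITIES}\n\nDrug: {candidate.name} ({candidate.modality})\n"
        f"Source: {candidate.source_url}\n"
        'Respond with JSON: {"score": 0-10, "rationale": "one sentence"}'
    )
    return json.loads(call_llm(prompt))

def rank_candidates(candidates: list[DrugCandidate]) -> list[tuple[DrugCandidate, dict]]:
    # Annotate every scraped candidate, then hand a ranked table to the
    # therapeutics team, who still do the human due-diligence step.
    annotated = [(c, annotate(c)) for c in candidates]
    return sorted(annotated, key=lambda pair: pair[1]["score"], reverse=True)
```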

(5:39) Prakash: Do you have an internal eval that you run? And when you swap model engines, do you upgrade every time a new model engine shows up? Do you evaluate and then decide?

(5:50) Abhi Mahajan: It's a pretty hacky process. Our metric for eval—at least my personal metric for evaluation—is amongst the drugs that our therapeutics team are really interested in and want to move forward on, does the next version, next generation of the LLM continue to recommend those drugs as very good? And I actually think it was pretty good at the very beginning. I only built this pipeline a few months ago. It remains pretty good now. I don't think there's been any dramatic jump. I partially think this is due to the fact that identifying what makes for a good drug is a very qualitative process and a very vibe-based one. It depends on the economic status of the company. It depends on, do we know anyone there? Because oftentimes, these companies don't make it easy for you to give them your money. It takes a super long process to figure that out. Yeah, it's pretty good though.
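A sketch of that informal regression check: when a new model engine shows up, verify that the drugs the therapeutics team already wants to move forward on still land near the top of the ranking. The top-k cutoff and threshold are illustrative assumptions.

```python
# Sketch of the informal eval described above: when swapping model engines,
# check whether the drugs the team already favors stay near the top.
def favorites_recall(ranked_names: list[str], team_favorites: set[str], top_k: int = 20) -> float:
    """Fraction of the team's favored drugs that survive in the new model's top-k."""
    top = set(ranked_names[:top_k])
    return len(team_favorites & top) / max(len(team_favorites), 1)

# e.g. only swap model engines if favorites_recall(new_ranking, favorites) stays high
```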

(6:40) Nathan Labenz: So I definitely recommend your blog, owlposting.com. I still have quite a bit of archive to work my way through, but I want to throw a couple of what I thought were your more interesting, arguably hot takes at you and then get you to double click into some of the intuition and implications. Because we're obviously in a moment now where there's a tremendous amount of interest in creating AI scientists of all kinds. And one of the big bets that companies are making with some serious capital behind them increasingly is that they're going to close the loop by allowing AIs to design and run their own experiments through some sort of automation, feed that data back in, and they're going to get reinforcement learning from basically experimental results. Now, one of the things that you had said in one of your posts is that there's not a lot of verifiable ground truth in biology. I would love to understand what that means exactly. Then what does that mean in terms of the ability to close that loop? Is there some sort of fundamental messiness or uncertainty that you see as kind of, at least in the near term, being irreducible that would become the functional limit on how much systems could learn from that kind of closed-loop experimentation?

(8:01) Abhi Mahajan: Yeah, I like to say that biology has no verifiable ground truth—that's probably a little bit hyperbolic on my end. But what I will defend is that there's not a lot of verifiable ground truth for the most clinically valuable problems. So yes, there is verifiable ground truth of does this protein exist in the solution? Is this variant that your NGS sequencer identified true? Those are both verifiable. But I don't think you'll quite see the same explosion of intelligence that happened in math and code as you will in biology because rewards are so cheap and so easy to get in those fields. In biology, it's such a long iterative process to get any iota of information.

One easy analog to this is training an RLVR model on the task of writing a bestseller. There is technically a verifiable reward—there's book sales, there's the country that the author is writing from, all these sources of data. But it takes 18 months to get that singular data point. And when you get that singular data point, it's very hard to trace it back to any one of these things. One biology-grounded example of this is, let's say you want to do RLVR on toxicology. This is arguably the thing that sinks the vast majority of phase one drugs out there. Toxicology sounds like a very simple topic. It is not. A drug can be toxic on the order of seconds, like snake venoms. It can be toxic on the order of months, years. It potentially doesn't kill an animal. It maybe just leads to cognitive deficits, heart damage. Oftentimes, it's dose-dependent. It could be species-dependent. All these measures of toxicity—there's no real way to understand them other than just observing that in an in vivo setting and then seeing what the readout is.

There are companies—one asset-based startup called Axion, which is trying to create a model that can very easily tell, given a small molecule, what is its toxicity impact on hepatocyte cells in a cell dish. It's a very clean, simple problem that probably saves months of time in preclinical settings. But it doesn't poke at the much more important problem: how does this perform in an animal?

(10:14) Prakash: Just a segue here. So Isomorphic Labs, I think yesterday, announced they have a predictive model which doubles the performance of AlphaFold 3 on key benchmarks—binding affinity, pocket identification, structure prediction. How does that fit into how things go? Is this actually useful, or does this just create more targets which need to be validated anyway, and it's not that useful?

(10:43) Abhi Mahajan: Yeah, I mean, obviously, it's a very incredible piece of work. I've seen the DDE, and I'm no longer in the protein engineering field, but I think the benchmark they did, that leftmost plot they're presenting—that's an incredibly difficult benchmark to get better at, and they're just under 2x better than what was there previously. So very good. But I'm sure you've heard the sentiment that the field is already awash with many really good preclinical assets, and the bottleneck is actually how well do these work in patients. It sounds perhaps obvious that if you get better at this preclinical design step, you get better at putting it into humans. That's the story that has been told for 10 years. It is not obviously clear that any of it has borne fruit. I imagine at some point it will, but there isn't really strong evidence to suggest that it does.

There's actually this really great paper, a chemistry paper that came out just a few days ago called the Affinity Advantage. That paper is probably one of the strongest bull cases that being able to optimize every facet of every protein that comes in through the preclinical pipeline has nonlinear or superlinear benefits to the drug development process, and it's just a matter of time till these models get even better. It's not the picture that I share, but I'm sympathetic to it.

(11:57) Prakash: In Dario's, I think, one of his papers, the blog post that he had, Machines of Loving Grace, I think he tried to kind of map out how he thought developments in biology would work. And he kind of pointed out that a lot of the major developments in biology come from better imaging and sensing techniques that allow you to look deeper and understand deeper what's happening. And then after that, it becomes easier to do a lot of other things downstream of that, starting from microscopy and all the downstream developments from there, et cetera. What do you think are potentially the developments which might be coming up in the next four to five years that might do something like that?

(12:48) Abhi Mahajan: I guess I would vaguely gesture to building models that are generative models of human in vivo biology. There's layers of discussion to be had—what other instruments do we need? What other modalities do we need? But I think there's a lot of low-hanging fruit in simply collecting a huge amount of highly rich data from real human tumors, intestinal lesions, plasma readouts, and just feeding a model with that information and not paying attention to any of the in vitro or otherwise biologically unrealistic settings. And then from there, maybe you get access to a genuine, bonafide human simulator of biology. And maybe that helps a lot with fixing the current state of 97% of oncology trials failing.

I think the Dario pitch of scientists in the data center churning out interesting ideas—there's already tens of thousands of PhD students churning out very good ideas. Most of them can't be validated because it's too expensive to do.

(13:54) Prakash: Mm-hmm.

(13:54) Nathan Labenz: Hey, we'll continue our interview in a moment after a word from our sponsors.

Are you interested in a career in AI policy research? If so, you should know that GovAI is hiring. 10 years ago, a small group of researchers made a bet that AI was going to change the world. That bet became GovAI, which is now one of the world's leading organizations studying how to manage the transition to advanced AI systems. GovAI advises governments and companies on how to address tough AI policy questions and produces groundbreaking AI research. GovAI is now hiring its next cohort of researchers to tackle hard problems that will define AI's role in society. The research scholar position is a one-year appointment for talented, ambitious individuals looking to transition into the field. And they're also hiring for research fellows, experienced researchers doing high-impact AI policy work. Past scholars and fellows have defined new research directions, published in leading media outlets and journals, done government secondments, gone on to work in leading AI labs, government agencies, and research groups, and even launched new organizations. Applications close on February 15th, so hurry to governance.ai/opportunities. That's governance.ai/opportunities, or see the link in our show notes.

Want to accelerate software development by 500%? Meet Blitzy, the only autonomous code generation platform with infinite code context, purpose-built for large, complex enterprise-scale codebases. While other AI coding tools provide snippets of code and struggle with context, Blitzy ingests millions of lines of code and orchestrates thousands of agents that reason for hours to map every line-level dependency. With a complete contextual understanding of your codebase, Blitzy is ready to be deployed at the beginning of every sprint, creating a bespoke agent plan and then autonomously generating enterprise-grade premium quality code grounded in a deep understanding of your existing codebase, services, and standards. Blitzy's orchestration layer of cooperative agents thinks for hours to days, autonomously planning, building, improving, and validating code. It executes spec and test-driven development done at the speed of compute. The platform completes more than 80% of the work autonomously, typically weeks to months of work, while providing a clear action plan for the remaining human development. Used for both large-scale feature additions and modernization work, Blitzy is the secret weapon for Fortune 500 companies globally, unlocking 5x engineering velocity and delivering months of engineering work in a matter of days. You can hear directly about Blitzy from other Fortune 500 CTOs on the Modern CTO or CIO Classified podcasts, or meet directly with the Blitzy team by visiting blitzy.com. That's blitzy.com. Schedule a meeting with their AI solutions consultants to discuss enabling an AI-native SDLC in your organization today.

(17:07) Nathan Labenz: So that connects pretty directly, it seems like, to what you are doing in your work on cancer at Noetik. Right? You guys are focused, first of all, at roughly the clinical stage and try to predict what drugs will work best for a particular patient given some relatively deep data about their specific condition. Right? So maybe walk us through what that looks like. I was interested to learn that it is basically a foundation model with lots of different data sources thrown into it. And also that it's trained with this kind of masking strategy where the idea is that the model has to learn how to predict from partial data, whatever partial data it might have. And I'm a big believer in that strategy because there just is so much—obviously, so many modalities and so much noise going on inside the system that we don't understand. I've been a big speculator about that being a driver of how AI can help in health over time. Give me the kind of double-click past what I have been able to learn with online research into what you guys are doing.

(18:21) Abhi Mahajan: Yeah. So let me start with perhaps the economic pitch for Noetik. 97% of oncology trials fail. You can look at that and say, well, we're awful at designing these drugs. Maybe we should get better at designing them. But one interesting phenomenon is that if you look at a lot of the papers that are published after a cancer clinical trial fails, there's usually some patients who did respond to the drug or responded to the regimen they were on. And the researchers try really hard to figure out, what is the exact biological archetype that makes up this response population? And they always come up with something super complicated, very heterogeneous. It's like this particular cytokine group or granzyme genes were highly expressed in the response population. It never leads to anything particularly interesting.

And so one argument you could make is that maybe the biomarkers that define patient response for this particular drug are non-human legible. You need a black box biomarker to encapsulate whatever that piece of information is. And so Noetik is kind of built around that thesis. We collect vast amounts of human tumor data. We profile them at four levels of modality. So pathology, which is kind of the blue chip that almost everyone has. Spatial proteomics, a 16-plex panel, to identify cell types. Whole-transcriptome spatial transcriptomics—so this is 19,000 genes over the entire surface of a tumor—to identify the functional state of the tumor. And then exome sequencing to identify genetic alterations. So is this KRAS positive? Is this STK knockout? And so on.

And then the ML angle is that you train—yeah, exactly as you said—a self-supervised masked model in the hopes that, one, you get a very good representation of any given tumor that walks in the door. So you now have the ability to place, in the universe of all the cancers I've seen, where does this patient fall in that embedding space? And so that's what we are doing a lot of. We gather patient samples from people who have run clinical trials. They have patient samples. We profile them. We run that through the model, see if the response population falls in a different area than the non-response population. And if it does, maybe we have access to a biomarker that no human on Earth understands, but we uniquely are able to.

The more interesting thing you can do with this is use the generative capacity of the model and knock out specific transcripts or specific genes and see how that changes the expression of transcripts within the tumor microenvironment. You can imagine there's this concept that's appearing in the cancer literature called nudge drugs, which are drugs that don't actually operate on the immune system or really the cancer side itself, but rather push it in a direction that makes it more sensitized to other drugs. And so you can imagine, oh, I'll knock out this particular transcript, and then I will hallucinate what would it be like if I added Keytruda, which is an immune checkpoint blockade that operates on the PD-1 axis into the site of the tumor. And maybe now you predict, oh, the tumor's highly inflamed. It's hot. There is a high chance that it'll just melt away entirely. Yeah, those are the two big economic and ML strategies we're pursuing.
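A toy sketch of the core ML recipe described above: self-supervised masked modeling over multimodal tumor tokens, plus a check on whether responders and non-responders separate in the learned embedding space. The architecture, sizes, and token scheme are invented for illustration; this is not Noetik's actual model.

```python
# Toy sketch of the two ideas above: (1) self-supervised masked modeling over
# multimodal tumor tokens, and (2) asking whether responders and non-responders
# separate in the learned embedding space. All sizes and names are illustrative.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MaskedTumorModel(nn.Module):
    def __init__(self, n_tokens: int = 4096, d_model: int = 256, mask_ratio: float = 0.3):
        super().__init__()
        self.embed = nn.Embedding(n_tokens, d_model)          # discretized multimodal measurements
        self.mask_token = nn.Parameter(torch.zeros(d_model))  # learned "this value is hidden" vector
        layer = nn.TransformerEncoderLayer(d_model, nhead=8, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=4)
        self.head = nn.Linear(d_model, n_tokens)
        self.mask_ratio = mask_ratio

    def forward(self, tokens: torch.Tensor):
        # tokens: (batch, seq) integer codes for pathology / proteomic /
        # transcriptomic / exome features of one tumor sample.
        x = self.embed(tokens)
        mask = torch.rand(tokens.shape, device=tokens.device) < self.mask_ratio
        x = torch.where(mask.unsqueeze(-1), self.mask_token.expand_as(x), x)  # hide a random subset
        h = self.encoder(x)
        loss = F.cross_entropy(self.head(h)[mask], tokens[mask])  # predict the hidden measurements back
        embedding = h.mean(dim=1)                                 # one vector per tumor
        return loss, embedding

def centroid_separation(embeddings: torch.Tensor, responded: torch.Tensor) -> float:
    # Crude "black-box biomarker" check: distance between responder and
    # non-responder centroids in embedding space.
    return torch.norm(embeddings[responded].mean(0) - embeddings[~responded].mean(0)).item()
```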

(21:26) Nathan Labenz: Yeah. That's really super exciting. When you talk about—first of all, identifying or having access to, I think was your phrase—biomarkers that nobody else has access to because you can see a sort of divergence in where different patient populations fall in embedding space. Do you have any means—if not, maybe I can introduce you to the good folks at Goodfire who just did a version of this with identifying biomarkers for Alzheimer's that had been previously not identified—but do you have any means right now of saying, okay, because one thing to say, these things are falling—these patient populations are falling into different parts of embedding space. It's another thing to then say, why? What is it actually that is causing that divergence? How far are you guys along in terms of being able to make interpretable what it is that the models have learned from their unsupervised training?

(22:20) Abhi Mahajan: Yeah. The Goodfire post was very interesting to read. We do have a Mech Interp research group internally which is exploring these ideas. I have no doubt they'll find something interesting. But one argument against doing this at all is, why do we care about interpretability? In a clinical setting, we might care about interpretability because the FDA gets very upset with you if you try to do anything that's black-boxed. And maybe that was true a year ago. But circa, I think, September or August 2025, there was a pathology AI company called Arterra AI, which came up with what was basically a companion diagnostic—they intake the pathology slide of your prostate tumor, and they will predict whether you will respond to androgen deprivation therapy. They have no idea why this model works. They've retrospectively validated on thousands of patients from prior phase three trials, and the FDA was fine with that.

So one argument against doing Mech Interp at all is that, why spend a ton of resources exploring something that the primary regulatory agency you care most about doesn't really mind whether or not it's white box or black box?

(23:27) Nathan Labenz: I guess the obvious answer would be because presumably that knowledge would be a great input to further experimental ideas or other knowledge. But maybe you think it's just so hard to—I don't know. There's no verifiable ground truth or something that would prevent that from working?

(23:43) Abhi Mahajan: I guess, what was discovered in the Goodfire post? I forgot what exactly it was. There was something about fragmentomics—something about how the genes are fragmented being a potential biomarker for Alzheimer's. It's a very interesting piece of work. It sounds very expensive to validate it. And so I imagine we would run into the exact same problem. Maybe we have a very good hypothesis for what comes out of the system, but we already have so many other hypotheses, potentially ones that even have higher literature backing. I could imagine a world in which Mech Interp as a field gets so good that you can triage—this thing's going to be really easy to validate, this thing's going to be really hard to validate. But right now, the way that Mech Interp usually works in biology is you're staring at semantic segmentation plots a lot and trying to think, oh, is this real or is this fake? Is this the model identifying this very spurious correlation? And that time just simply feels better spent elsewhere.

(24:44) Nathan Labenz: Okay. Here's another idea of a place that it might be well spent. Continual learning, of course, a huge theme right now in AI in general. The first conversation we had today with Professor James Zou from Stanford included a little talk about their recent paper, Learning to Discover at Test Time, where they're using auto-regressive large language models and giving them problems like make a faster CUDA kernel or find a better solution to this math problem with a lower bound than anybody has previously found. And they interestingly kind of flipped the usual model of what we're trying to do when we create an ML model on its head and said, what if we just try to get this model to produce the single best answer that we can? And we don't care if it generalizes, and in fact, we'll probably throw away this model after this test-time fine-tuning. What we want is the answer. And they were able to find, at relatively reasonable cost, like $500 compute cost, that they were able to actually get some new state-of-the-art answers on some of these highly technical questions.

If I'm a cancer patient and you've got a general foundation model, a question that naturally occurs to me is, can you fine-tune this on my data? Can we do some test-time tuning? Can we do sort of intensive masking on just my samples and really dial this thing in to understand my particular biology? And then, if we did that, would it be more accurate for me? Do you think that line of thinking has legs, and why or why not?

(26:29) Abhi Mahajan: So I actually looked at the paper, and they actually have a section for biology. They do single-cell RNA denoising using this test-time training model, which I thought was really interesting. I guess my instinctive answer is it's an interesting idea. It very well might work, and it falls into the bucket of ideas that we would simply have to try to see whether it works or not. The results for single-cell RNA denoising in the James Zou paper are certainly good. They're better than the state of the art, but for each one of these cases, they attach a note by an actual domain expert saying, how useful is this in practice? And the domain expert in question for the single-cell RNA section did say, this is very cool, but at the end of the day, we don't really care about the results of single-cell RNA denoising. We care about some biological utility that is underlying that. And so maybe you get better at solving this verified task problem, but that doesn't translate to anything actually useful.
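For concreteness, a minimal sketch of the test-time tuning idea being discussed: take a pretrained masked model (e.g. something with the interface sketched earlier, whose forward pass returns a self-supervised loss) and keep training it briefly on one patient's own samples before reading out predictions. The steps, learning rate, and data iterable are illustrative assumptions; whether this helps is, as discussed, an open empirical question.

```python
# Sketch of test-time fine-tuning on a single patient's data. Assumes a model
# whose forward pass returns (masked-prediction loss, embedding); all
# hyperparameters here are placeholders.
import copy
import torch

def test_time_tune(model: torch.nn.Module, patient_batches, steps: int = 50, lr: float = 1e-5):
    tuned = copy.deepcopy(model)                 # personalize a copy; discard it after use
    tuned.train()
    opt = torch.optim.AdamW(tuned.parameters(), lr=lr)
    for _, tokens in zip(range(steps), patient_batches):
        loss, _ = tuned(tokens)                  # self-supervised loss on this patient's data only
        opt.zero_grad()
        loss.backward()
        opt.step()
    return tuned                                 # "throwaway" model tuned to one patient
```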

(27:31) Nathan Labenz: Maybe it would be for...

(27:32) Abhi Mahajan: The response/non-response prediction case. But it kind of just sounds easier to fine-tune the model using normal supervised learning. Why go through the RL process if the end result is binary? I think they even called out in the paper, this setup isn't really meant for binary or sparse learning tasks. It's meant for fuzzier things.

(27:51) Nathan Labenz: Yeah, they're working on that, but it's not done yet. Maybe zoom out—you alluded to this earlier—a lot of people think we just need better ideas for drug candidates, and your consistent position is that's probably not really the bottleneck. You made a really interesting point around how the ability to more accurately evaluate those candidates drives a lot more value than just throwing more candidates through a pipeline. The quality of the pipeline matters more than its scalability. Again, I think you've suggested where you think this can come from with large-scale foundation model style training, but give us the next level of depth on that. Why are all these other ideas not so exciting? Is this basically just a bitter lesson sort of idea where all your cleverness is going to be washed away by scale, so keep your eyes on the prize—data max and compute max until you solve it all? Is it kind of that?

(29:07) Abhi Mahajan: I guess I kind of view things in three ideological camps. The first one is maybe us—we're indexed very heavily on human data; that's the only thing that matters. You can't start from in vitro settings and bootstrap your way up to something more complicated; you need to start with the most complicated thing to begin with.

The second camp is very interested in modeling single biomolecules and their interactions, with the hope that maybe you can't bootstrap your way upwards, but you raise the absolute success rate from maybe 5% to 20%, and maybe that's all you need. I think the second camp defines the vast majority of ML bio companies that exist today. Some of them have clinical candidates that are ongoing right now. We'll see what the results are. Generally, it doesn't seem like there's been a massive step change in their ability to design drugs. This is not knocking them—drug discovery is hard, everything's a bet at the end of the day.

The third camp is maybe it's not really a for-profit thing, but you can just improve the clinical trial process to begin with. This is arguably the path that China—this is China's main advantage. They're able to run clinical trials far more cheaply than anyone else, partially because of lower cost of human labor, but also because they've just set up the system pretty nicely such that it's not such a huge regulatory and financial headache to get things going. This has some downsides—drugs are treated innocent before proven guilty, whereas the FDA is the other way around. But the obvious benefit of doing that is you're betting neither on the AI and human data getting better, nor the AI and in vitro settings data getting better. You're trusting that the typical drug design process, if just made slightly more financially efficient, will improve things on its own.

I think all three of these are important, and it would probably be grandiose of me to assign an unequal weighting to any one of them. Each one feels important to push on.

(30:56) Nathan Labenz: There have been some interesting—oh, sorry. Let me sneak in one from—

(31:00) Prakash: I'm going to take a little bit of a segue to something you said earlier, which is that a lot of the new INDs are coming in from China. What has happened in the last couple of years? Is it an AI thing? The CEO of Ginkgo Bioworks was on TV yesterday and said they just have more hands. Some people believe it's a regulation thing. Some people believe it's a clinical trial registration—they can just register more people. Some people believe it's a US cost thing. What is driving this transfer of basic R&D to China at this point?

(31:39) Abhi Mahajan: I think this particular subject is very deep. It's not something I have expertise in. My instinctual thought is that there are many different answers to this, and the one that I think is most interesting is the idea that China was always a very good generics manufacturer. That's where they started. They slowly extended their way into WuXi having a very good CRO ecosystem. At some point, enough talent began to be incubated in China where they began to realize, we have all this infrastructure here, why not just develop our own drugs? There is something very important about having this very close interplay between both the person who is designing the drug and the person who is actively doing wet lab assays on the drug. Whereas in America, you have a super long feedback loop of I need to get my SAD together, I need to go reach out to VCs, I need to go buy a lab. Whereas in China, that ecosystem is a little bit set up already. Actually, maybe the only missing part is that the VCs are still more risk-averse than perhaps VCs in America. But the colocation of the grunt work and the intellectual work is actually surprisingly important. A few months ago—actually, last year—I interviewed one of the very few people doing novel biotech research in India, a guy named Soham who runs a company called PopVax. He said this is the primary reason why he expects not only China to start producing really interesting drugs, but also potentially India, potentially Egypt—places where there is intellectual capital, there's a lot of hands, and it's just that combination leads to really good compounding results.

(33:17) Prakash: Indeed. And does that accelerate with the AI models? These kinds of AI co-scientists? Does that mean that even if they don't have that much intellectual capacity yet, they have the hands to carry it out?

(33:33) Abhi Mahajan: I guess this is something that's a little bit opaque to almost everyone as to what exactly is the level of how impressive the bio AI models coming out of China are. I think there's certainly some interesting work that has been done. It's not clear to me that there's anything radically new there that won't be found anywhere else. A fair amount of it is—I don't want to say this in a disparaging way, but it is scaling up stuff that was originally developed in either the UK, London, or America. There hasn't really been a DeepSeek thing where there's something radically crazy that comes out of any of the Chinese labs. I obviously could be wrong on this, though. Whatever the bio AI labs are doing in China, there's much less American visibility around it.

(34:18) Nathan Labenz: Hey, we'll continue our interview in a moment after a word from our sponsors. The worst thing about automation is how often it breaks. You build a structured workflow, carefully map every field from step to step, and it works in testing. But when real data hits or something unexpected happens, the whole thing fails. What started as a time saver is now a fire you have to put out. Tasklet is different. It's an AI agent that runs 24/7. Just describe what you want in plain English—send a daily briefing, triage support emails, or update your CRM. Whatever it is, Tasklet figures out how to make it happen. Tasklet connects to more than 3,000 business tools out of the box, plus any API or MCP server. It can even use a computer to handle anything that can't be done programmatically. Unlike ChatGPT, Tasklet actually does the work for you. And unlike traditional automation software, it just works. No flowcharts, no tedious setup, no knowledge silos where only one person understands how it works. Listen to my full interview with Tasklet founder and CEO, Andrew Lee. Try Tasklet for free at tasklet.ai, and use code COGREV to get 50% off your first month of any paid plan. That's code COGREV at tasklet.ai. Your IT team wastes half their day on repetitive tickets. And the more your business grows, the more requests pile up. Password resets, access requests, onboarding, all pulling them away from meaningful work. With Serval, you can cut help desk tickets by more than 50%. While legacy players are bolting AI onto decades-old systems, Serval was built for AI agents from the ground up. Your IT team describes what they need in plain English, and Serval AI generates production-ready automations instantly. Here's the transformation. A manager onboards a new hire. The old process takes hours—pinging Slack, emailing IT, waiting on approvals. New hires sit around for days. With Serval, the manager asks to onboard someone in Slack and the AI provisions access to everything automatically in seconds with the necessary approvals. IT never touches it. Many companies automate over 50% of tickets immediately after setup, and Serval guarantees 50% help desk automation by week 4 of your free pilot. As someone who does AI consulting for a number of different companies, I've seen firsthand how painful manual provisioning can be. It often takes a week or more before I can start actual work. If only the companies I work with were using Serval, I'd be productive from day one. Serval powers the fastest growing companies in the world like Perplexity, Vercata, Merkor, and Clay. So get your team out of the help desk and back to the work they enjoy. Book your free pilot at serval.com/cognitive. That's serval.com/cognitive.

(37:13) Prakash: Okay, go ahead.

(37:15) Nathan Labenz: One big question I have is I find it very hard to calibrate myself on how excited I should be about all these AI for biology and AI for medicine developments. I know that there's always these kind of headlines—AI does this, AI discovers this drug. I've done episodes of Cognitive Revolution on it, one with Jim Collins. He has created a bunch of antibiotic candidates. There's a long list, right? Professor Zou did the nanobodies thing that came out of the virtual lab. To hear him talk about it earlier today, it sounds like those were reasonably well validated. But then you always get this other side too that's like, well, not so fast. It's all very messy. We got a long way to go. Most of these things don't pan out. I feel like that sort of parallels the debate that we hear in a lot of different domains where, you know, even in programming, which is one of the more, let's say, legible domains, we've got something like the METR study that showed slowdown of developers—that was very confusing. I'm still quite confident that it's making me faster, and I kind of want to throw that away. There's, of course, just a lot of denial and cope out there and all sorts of motivated reasoning. How should one try to ground their worldview? Obviously, subscribing to Owlposting is something everyone should do. But what else would you advise me? How can I patch these blind spots in my worldview or get to a better position from which to have my own sense of what really counts, really matters, and what doesn't? Because, again, this happens all over the place where there's this disagreement even among some of the most informed people about just how big AI's role is or how big of a deal is it going to be. But in biology, it's particularly hard for me to make sense of. So I'd love to get some tips for how to climb the learning curve faster.

(39:12) Abhi Mahajan: I've actually written about this in the past a very long time ago. The title of the article is "5 Things to Keep in Mind When Reading Biology ML Papers." The long and short of it is the evaluations in biology, I think, are very difficult. You see this thing in more typical wet lab biology of, oh, we cure cancer, but it's in a mouse. Who knows when it'll actually translate to humans? There's a very similar phenomenon in a lot of biology ML papers where they're doing something that feels like it should be useful, but there's a lot of things that they're probably hiding from you when explaining the results that would only be obvious to a domain expert. One really funny example of this is small molecule binding affinity papers. I've written about one company's work in this, but they found that these—let's say you're able to predict these set of molecules bind to this target, these set of molecules do not bind to the target, and you're very happy with yourself. You publish a Nature article about it. What these folks, the company called Leash Bio, found is that this can often be confounded by which chemists actually produce the molecule in the first place. Because some chemists are very attached to specific targets. They're very good chemists, so they often produce things that bind to that specific target. And these chemicals all look very similar to each other. It is this type of similarity that is very human vibes-based, and it's hard to pin down to a singular metric. They found that these models are often confounded by this overlap. I think these problems just appear over and over again across in vitro biology, biomolecule generation, where you can be confounded by variables that you did not know even existed in the dataset. I would probably name that as the thing to be most aware of when reading these papers. There are a few people I trust on Twitter and people in real life who can give a pretty good overview of any arbitrary paper. But I think with LLMs, popular science people often retweet them and say, this is transformative. And more often than not, they're probably correct. Opus 4.6 is genuinely crazy. I think when people do that for bio ML papers, there's a 50/50 chance that they're completely missing the point because they're not in that field, and they don't understand how the failure modes emerge in these models.

(41:35) Prakash: Yeah. Do you think that—

(41:38) Nathan Labenz: Can Opus help me identify those blind spots? Is it—

(41:42) Abhi Mahajan: Yeah. Sorry, go ahead. Go ahead.

(41:44) Nathan Labenz: Yeah. Is it good enough to do that?

(41:46) Abhi Mahajan: I've actually written an article about this also. It's titled, "Can O1 Preview Find Mistakes Amongst 56 MLSB Papers?" MLSB is a structural biology workshop at NeurIPS, and it's not very good at it. This was, obviously, last generation of models. Maybe it would be a lot better. But there are some problems that are going to recur in almost every biology ML paper of, oh, your training sets aren't large enough. Your test sets are not stratified correctly. But you kind of just learn to pick your battle in this field, and you just move on. There's a lot of more fundamental problems with these papers that LLMs, in my experience, have often missed entirely. I think the funniest—in almost every article I've written, I have found that LLMs tell me something about this particular subfield that the domain experts completely disagree with. They say, like, that is not how you should think about this domain. That's not the real problem we're actually worried about. I don't know why this is the case. It's kind of fun. It's like a domain of science where LLMs still haven't quite captured human taste.

(42:51) Nathan Labenz: Yeah. Fascinating. Okay. That leaves a lot of work in front of us. Do you want to go back briefly to Noetik again before we break? I mean, fortunately, my son is doing well. He was diagnosed with cancer three months ago. I've had an intensive crash course in cancer, and I hope to be able to close the book on it and return it to a more intellectual and less personal interest going forward. And I think we're on good solid track to do that. But I think you've demonstrated in this conversation that you're not getting too carried away with the promise of what AI systems can do. We've got the data center of geniuses. We've got the century of progress compressed into five years kind of visions. How much would you shave off of those notions just to describe your own expectations of what it can do specifically, and maybe what the field more broadly is going to be able to accomplish?

(43:48) Abhi Mahajan: I think I'm very optimistic that human simulation companies, akin to Noetik, but I think there's other players out there as well, will be able to vastly help with the results of at least a few clinical trials within the next few years. That feels like—you're not even paying attention too much to the trend lines. I'm almost indexing on what we're capable of today. I think it's pretty obvious. There are papers going back years that are able to show, oh, we've developed an ML model that is better able to stratify patients. The problem has always been an economical one—how do you actually deploy this in a real setting? I think we'll be able to do that just fine. I think phase 1 drug failure rates will go down. And I think this has already been slightly proven out. There was a McKinsey study that was done a few years ago that showed that AI-designed drugs have a 5 to 10% lower failure rate, which may be noise, may be real. I do kind of expect those trend lines to continue. Where I'm most unsure of is whether these models will be able to discover brand new targets entirely, which is ultimately what people will care about. I think believing that these models will be able to find new targets far faster than humans would really requires you to index heavily on the trend lines. And I want to index on the trend lines. So I believe that these models will be able to deliver very good target finding. But I'm also very sympathetic to that mindset of finding targets is just such an unbelievably hard problem that being cautious will not make you regret it, because you need this human iteration feedback loop. And unless you build a really good human simulator, which is our bet, you're not going to get close to solving that problem.

(45:32) Prakash: The way I put it is usually you can kind of see one order of magnitude ahead, maybe two. No one can see three orders of magnitude ahead. It's just not possible. You have no idea what's going to happen. Abhi, thank you so much. I learned a lot from this, and hope to see you online. Hope to read more of your blog.

(45:51) Abhi Mahajan: Yeah. Absolutely. Thanks for having me on.

(45:54) Nathan Labenz: Thanks for being here. Bye-bye. We'll be working our way through Owlposting archives for some time to come. Wonderful.

(46:01) Prakash: Bye-bye. Our next guest is Helen Toner who runs CSET at Georgetown. She's a former OpenAI board member, and there's two competing views here. She has, on the one hand, the intelligence explosion is coming. On the other hand, AI capabilities may be permanently jagged. So, let's add her to the stage. Helen, nice to have you.

(46:21) Helen Toner: Hey. Thanks for bringing me in at the end of your marathon. Impressed you guys are still going strong.

(46:27) Nathan Labenz: There's so much to cover, you know, and we've all got to accelerate our personal productivity timelines and try to pack more information into the same amount of time. So experimenting with ways to do that. Talk super fast.

(46:52) Helen Toner: I mean, my constant struggle is to talk slower than I naturally want to. So if you want me to talk double speed, I'm here for it.

(46:58) Nathan Labenz: Please go, fast as you want.

(47:00) Jeremie Harris: Go for it.

(47:01) Nathan Labenz: Go for it. Okay. So you guys just put out this report. I think this is obviously a great candidate, if not a shoo-in, for the most important question of our moment. What is going on with the possibility of automated AI R&D? Do we have this tipping point where we're starting to hit recursive self-improvement? And if so, how big of a deal is that going to be? You guys brought together a bunch of people that authored this report and some others as well that aren't necessarily authors but contributed to conversations. I understand quite a few people from frontier model developers. And it strikes me that this debate goes back basically to the beginning of AI. There was the idea very early on that we could have an intelligence explosion. When I started reading Eliezer in 2007, he was very worried about this. And yet, you've written—I think you put your finger on something a lot of people were feeling last year—that even though what passes now for long timelines is pretty short, the disagreement on this topic seems to be as fundamental and as impervious to new evidence as it has ever been. Maybe just for starters, take us inside the workshop, give us kind of the lay of the land in terms of what are the world models that people have, and why are we still just working from so much intuition despite the fact that we now have, in some circles, what would even be called AGI out there as products for us to use today.

(48:32) Helen Toner: Yeah. So this workshop was held in July and was maybe one of my work highlights of the year. It was a day and a half. We brought people in. We had people coming from some of the frontier companies, policy, a bunch of great people. To get a sense of kind of what the vibe was like, we started out—the first session was about how is AI being used to automate AI R&D right now. We had presentations from people who are doing that. And before the first break, we had Ryan Greenblatt from Redwood Research, Nicholas Carlini from Anthropic, Sayash Kapoor from Princeton, of AI as Normal Technology, and Thomas Larsen, who's one of the AI 2027 authors. They were all arguing so fiercely in a friendly and productive way before the first break that when everyone else stood up to go and get coffee and drinks and snacks, they just kept on arguing right through the break, which was great. It was exactly what we were looking for.

But I think it did sort of presage something that we knew going in, which was there are really different perspectives here. The workshop was Chatham House. I feel okay giving that anecdote because they ended up writing—one thing that came out of that was Nicholas was pushing the others constantly for, okay, you have such different views about where things are going. Where's the first place that you actually disagree about what we'll see? And they found that as they're looking out for what we're going to see in 2026, 2027, they actually agree a lot about kind of what we're going to see before we get to that recursive point, which is kind of a bummer. It's nice that they agree and they've now posted about that, actually, which is the reason I feel fine sharing that anecdote from an otherwise Chatham House workshop. They went on to post about the stuff they agree on. But it sucks that that means that it's actually going to be hard to identify kind of in advance whether we are heading into a recursive loop or whether we're not.

So what we were trying to do with the workshop, one was get this idea of recursive self-improvement out of purely Silicon Valley, San Francisco, really AI-filled spaces and make it—explain it, present it to a wider audience, let people engage with it. But then also actually try and make some progress on, okay, why do people disagree about what is happening? What might happen in the future? What indicators could we gather? Stuff like that.

And I came out of it thinking that maybe two of the core disagreements here—one is, does AI truly replace all of what humans can do? So does it—do you get to that fully automated? Because if you're going to have the big, really scary recursion, that's probably what you need. It can't be that you have much more productive human researchers. You could have the Alec Radfords and the Ilya Sutskevers managing fleets of AI researchers. But if it all has to come back to them and they have to process and digest and think through the research, you're not going to get that massive recursive loop. So that's one piece—do you truly get humans being fully replaced? Because if not, then maybe you have some parts of the workflow being really accelerated. We have this diagram in there of sort of an Amdahl's law kind of thing where Amdahl's law is basically if you have a process that depends on different inputs and there are different potential bottlenecks, then if you speed up one part of the process, the bottlenecks will just bite somewhere else.
And so it may be that you speed up the coding part of AI research, but if you don't speed up other parts, then you don't end up speeding up the whole thing very much. I think another mental model that people who are skeptical that this is going to really go crazy bring is: okay, we have a long history of computers doing more and more of the lower-level work. We don't have to do punch cards anymore. We don't have to write assembly code. We have these higher-level languages. And so, for example, AI doing more of the coding is just another natural step in that process. This is kind of an expanding-pie model: the set of tasks that we realize can be involved in AI R&D expands, and there's always that outer band that the humans can do while they're automating the inner bands. And I think that is very different from the view that other people—sort of the view the AI 2027 authors would have, or lots of other people in this space—which is, no: first you automate some of what the humans can do, and then you automate all of what the humans can do. And then you kind of go until some other bottleneck hits. So then the other question is, okay, what are those bottlenecks? We can talk about that as well. But I think those are two of the biggest questions that came out for me. One was, are you truly going to automate everything, including what all the humans can do? And the other is, if you do, how soon do the bottlenecks bite?
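To make the Amdahl's law point above concrete, here is a minimal sketch (purely illustrative, not taken from the CSET report) of how speeding up one slice of a research workflow caps the overall speedup; the assumption that coding is 50% of the workflow is a made-up number for illustration.

```python
# Illustrative sketch of the Amdahl's-law point: speeding up one slice of a
# workflow only helps until the un-accelerated slices become the bottleneck.
# The 50% coding share below is an assumed, made-up figure.

def overall_speedup(accelerated_fraction: float, slice_speedup: float) -> float:
    """Amdahl's law: total speedup when only `accelerated_fraction` of the
    work is sped up by a factor of `slice_speedup`."""
    return 1.0 / ((1.0 - accelerated_fraction) + accelerated_fraction / slice_speedup)

# Suppose coding is 50% of an AI researcher's workflow (an assumption).
for s in (2, 10, 1000):
    print(f"coding {s}x faster -> whole workflow {overall_speedup(0.5, s):.2f}x faster")

# Even a 1000x coding speedup yields barely 2x overall, because the other half
# of the workflow (experiments, analysis, research taste) is untouched.
```

The same arithmetic is why the report's question about which parts of the workflow actually get automated matters so much: the un-automated remainder sets the ceiling on the whole loop.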

(52:32) Prakash: So Sholto Douglas, who is now at Anthropic, had this idea he calls a software-only singularity, where we get very good at coding and all of the digital stuff, including AI research, I presume, but not at producing power or copper or all of the physical substrates which are going to be required in order to support this expansion. How do you think that fits in—the idea that maybe the digital stuff happens, but the physical stuff just doesn't?

(53:06) Helen Toner:

Yeah, I think there are two versions of this. Some people, when they talk about a software-only singularity, basically mean it turns out that software is enough to get absolutely crazy recursive loops. Tom Davidson at Forethought has written about this, for example. You might be able to get massively more intelligent systems having massive impacts on the world primarily through software improvements.

But there's a different version—what you're describing—which I would think of more as a jagged software-only intelligence explosion. The AI is getting much more capable in certain ways, but its effects on the world are very limited because it is software-only.

This gets at another thing that I found really helpful and interesting from the workshop: people have very different intuitions about what it means if you have an AI that is very, very good at AI R&D. What does that mean for what the AI can do elsewhere? Some people think, "Okay, well, if it's very good at AI R&D, then it can train AI models to do whatever, so it can do whatever. Maybe you have to spend a week gathering data or something, but then if you want to do some arbitrary task, you could do it."

Whereas I think other people have an intuition like, "Okay, well, even if it gets very good at automating AI R&D—this most software-based task—it's still going to really struggle to, for example, design new biological molecules, or think about geopolitical strategy questions, because you have to actually go out and see how different countries and decision makers will react." That connection between AI that can do incredibly good AI R&D and AI that can affect the world in non-AI-R&D-specific ways—we tried to tease that apart a little bit in the report. That piece goes under-discussed in a lot of these conversations.

(54:48) Prakash:

Would you think that's the connection between, "Okay, now you have AI doing AI research, that's affecting the economy, it's also affecting the political economy, and then you have to have mitigations to the political economy for this to work out"? Does that mean you might need the AI research to go into how to fix the political economy, which is going to be a little bit scary?

(55:12) Helen Toner:

Say more about what you mean by affecting the political economy.

(55:15) Prakash:

Right now, you have Bernie Sanders saying we should have a moratorium because he's very scared about jobs. He wants a moratorium on data centers. I think there are six states with moratoriums now, including New York State. So one question is: does AI research lead into "How does AI fix the political economy? How do we deal with the humans? How do we mitigate the impact we have on the humans?" Is that something that would happen with the first configuration of the software-only singularity—where it's not jagged and also affects the political economy?

(55:49) Helen Toner:

Yeah, that's the kind of thing. If you're positing that you can just have a software-only singularity that is going to radically transform the world, then it's going to have to be able to do things like deploy chatbots that talk to enough people to convince them that data centers are great, get all the data centers built, and get the moratoriums rolled back. That kind of capability has to be built in. That's right. Which, to me, intuitively feels like a different skill set and is more dependent on deployment, rollout, and adoption. So I tend to be a little more skeptical there, but yes, I think that's an example for sure.

(56:22) Prakash:

I see.

(56:23) Nathan Labenz:

One seemingly odd pairing of beliefs that I observe—and I sort of detect in the report—is the idea among the more skeptical folks that there's going to be a plateau, and also that plateau is going to be subhuman. Then on the other hand, it's "It's not going to plateau, it's just going to run away and have some sort of singularity."

A position that I feel pretty intuitively attracted to, that I don't hear too often, is the idea that maybe there will be a plateau, but it could very easily be a superhuman plateau. If I try to zoom out as far as I possibly can and look at life on Earth, it seems like humans are part of maybe an entry into a steep part of an intelligence explosion—an S-curve of capability. I don't think we're the end of history, but we were clearly better than what came before, and that was enough to take over the world.

I just don't hear too many people say, "Yeah, it's not necessarily going to be a singularity. It's not necessarily going to go totally beyond comprehension. But in the same way that we were just that much better than Neanderthals—"

(57:34) Prakash:

It might not have been that much—

(57:35) Nathan Labenz:

"—but it was enough to change everything." I feel like there's not too much more room between where the AIs are now and where they will soon presumably be. Even if that doesn't go critical from there, it feels very hard for me to imagine that it's not enough to be transformative. So is that a position that was represented in the workshop, and how do you personally react to it?

(58:01) Helen Toner:

Yeah, I think that's pretty close to my default expectation. If so, then it was represented there because I was there.

Maybe to riff on it a little bit—something we didn't put in the report, but that I've definitely found helpful for my own thinking—is thinking about the S-curve. We're on some kind of S-curve. I agree. There are three segments that are of interest. One is how long is the lead-up period, the first part of the S-curve. One is how steep is the middle of the S. And one is how high is the ceiling.

A lot of times, when you're hearing people talk about automated AI R&D, they're in one of two camps on all three of those questions. Either they think the lead-up is short, the curve is steep, and the ceiling is high, or they think the lead-up is long, the curve is gradual, and the ceiling is low. I find it really interesting to think about different combinations of those parameters. It's really looking like the lead-up is pretty short these days. We're not too far from that takeoff period. But what if the curve is steep but the ceiling is low? Or the curve is gradual but the ceiling is high? We don't talk that much about either of those.
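As a way to visualize the three dials Helen is describing, here is a small toy parameterization (our own illustration, not something from the workshop or the report) of an S-curve where lead-up, steepness, and ceiling can be mixed and matched independently.

```python
# Toy S-curve with three independent dials: lead-up length, steepness of the
# middle, and ceiling height. Parameter values are arbitrary illustrations.
import math

def capability(t: float, lead_up: float, steepness: float, ceiling: float) -> float:
    """Logistic S-curve: `lead_up` shifts when the steep part starts,
    `steepness` controls how fast the middle rises, `ceiling` caps the top."""
    return ceiling / (1.0 + math.exp(-steepness * (t - lead_up)))

for label, steep, cap in [("short lead-up, steep, high ceiling", 3.0, 100.0),
                          ("steep but low ceiling", 3.0, 10.0),
                          ("gradual but high ceiling", 0.5, 100.0)]:
    trajectory = [round(capability(t, lead_up=3.0, steepness=steep, ceiling=cap), 1)
                  for t in range(8)]
    print(label, trajectory)
```

The point of the sketch is just that "steep but low-ceiling" and "gradual but high-ceiling" are both coherent scenarios, even though they rarely get discussed.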

Maybe also to your point, Nathan, about the superhuman but not all-powerful god-like singularity with no point of return—I really think there's room for more thinking about what does it mean to be superhuman? What are the domains where there's tons of headroom above humans, and you can easily identify what it would look like to be superhuman at optimizing a kernel, or selling things, for example?

(59:41) Nathan Labenz:

Yeah. In previous parts of this marathon conversation series, we've seen how the ability to interpret the signals that people are throwing off in sleep can predict disease—it's a really random but instructive example of how there's obviously a lot of room to be superhuman at some of these tasks. There's potentially a lot of power to unlock, especially if you can integrate that kind of infinite-modality processing with a basic reasoner. I really don't see any reason that we're not going to be able to achieve that.

(1:00:24) Helen Toner:

Yeah. Often those things, though, will involve—I think another piece that's underexplored here is people will tend to either be in the camp of "The ceiling is high, and you're not going to need all that, it's not going to be delayed by real-world adoption," or "The ceiling is low, and it's going to be delayed by real-world adoption."

To me, I'm sort of like, "Isn't the obvious answer a combination of these?" Once you get the real-world integration—for example, you have to collect all that sleep data; or humans are really bad at interpreting scent data and dogs can smell things we can't, but you have to deploy a bunch of sensors—there are unexplored questions around how high that ceiling is as you have increasingly integrated AI in more and more aspects of life and the economy.

(1:01:03) Prakash:

I do also wonder to what extent—the way I might view things happening is software and mathematics first. The question for me is: if you get software and mathematics first, you may get things like, "I don't need a LIDAR for my self-driving car anymore. I can use cameras. And the cameras can be really bad cameras now because the math does all the work, and I don't need all this sophisticated technology."

It could be that your phone could do what those sleep detection machines do with the right software package. The phone has a lot of sensors in it. There's an enormous amount of technology within the phone. You do wonder whether it would really be application of algorithms to existing frameworks, existing infrastructure—increasing the bandwidth of your communications tech with new encryption and new cryptography, which is how DSL was invented. DSL was really using the existing copper pipes with new algorithms. I wonder to what extent you don't get a slowdown just because of your physical infrastructure, because you kind of innovate around or with your physical infrastructure.

(1:02:24) Helen Toner:

Yeah, I'm sure that will work in some places. I think it'll work in some places and not in others. If we're talking about—where my mind goes is cybersecurity for critical infrastructure, where the physical systems are old. They're hooked up—the operational technology is hooked up to old information technology because they have to be. There's going to be a limited amount that you can optimize using smart new algorithms there because the stuff is just old.

Likewise, my center does a lot of work with military technology. Same thing there. If you have a ship that was built in the 1960s, it's a ship that was built in the 1960s, or other pieces of equipment. So yes, I think in some places, yes. In some places, no.

To me, that's another place where the jaggedness bites. I think Prakash mentioned it as I came on—the talk I gave on jaggedness. To me, it's another place where the jaggedness is fractal. You zoom into the task or skill of AI R&D—actually, it's many, many different things. We'll see AI R&D accelerating in areas that are especially amenable to using AI and lagging more in other areas. Not to say they can't ultimately be automated, but it will take longer.

(1:03:35) Prakash:

How far do you think the product which is on the market right now is behind what people are using inside the labs?

(1:03:44) Helen Toner:

I honestly don't know. That was one of the most actionable sections of the report—a set of indicators. We have a table summarizing the three categories of indicators. The biggest category is indicators from inside companies, and one of them is that public-private gap. My sense is that it's not huge right now, but I don't have any inside information. You guys talk to company employees as well.

(1:04:09) Nathan Labenz:

If you believe roon, he says we have no idea how good we have it, and the gap is very small.

(1:04:13) Helen Toner:

Exactly. I'm thinking of things like that exact tweet.

(1:04:16) Prakash:

What I learned in the last few days is that the real gap is that they're using models which are three times faster. So it's just the same model—they're running on lower batch size. It's three times faster, and that's what they're using internally. It's the same tokens, it's just a lot faster.

(1:04:34) Helen Toner:

And they're surely also doing tooling stuff, right? Something we put in the report: a couple of our reviewers who were looking at this, who were less familiar with the idea of automating R&D, were like, "Oh, haven't you seen that study of 95% of AI pilots fail?" And there's the study of AI slowing people down. We included in the report explicitly noting, "Yes, productivity boosts from AI are mixed, but these AI researchers are in the very, very best position to benefit from their technology." They are the most up to speed on what it can do and what it can't do. They are shaping how it's developed and what directions it's pushed in. They're in the perfect setting to be building tooling to squeeze the most juice out of these models. I'm sure that is also a piece of it as well.

(1:05:15) Nathan Labenz:

One of the things you mentioned early on, just a few minutes ago, is the idea that you wanted to bring awareness of these possibilities outside of the places where they are most often discussed. One thing I would love to hear your perspective on is: how ideological do you think companies are about this?

This is one of the things that confuses me. Every frontier lab leader has read their Eliezer Yudkowsky catechism. Many of them in the past have said things about how we should be extremely careful about this sort of thing, and we should not engage in an arms race dynamic. It's obviously part of the OpenAI charter. Dario has said things like this. And now we're in a world where there is a publicly stated timeline by OpenAI to the AI R&D intern and then another timeline, not too much longer out, 2028, for the full AI R&D researcher.

(1:06:17) Helen Toner:

And Anthropic as well.

(1:06:20) Nathan Labenz:

Yeah. I would say Anthropic has probably seemed more committed to it, or more resigned to it maybe, but they believe it.

(1:06:27) Prakash:

Jack Clark, June, summer this year. Jimmy Ba, who just left xAI and was a cofounder—12 months. And OpenAI this year: research intern, and then full researcher about a year later. Yeah, I think it's this year. So that's my guess.

(1:06:48) Helen Toner:

This year for what specifically?

(1:06:50) Prakash:

The start of recursive self-improvement. We're going to talk to—

(1:06:53) Helen Toner:

But aren't we there already? Wasn't it last year? I mean, you had Gemini doing the evolutionary algorithm stuff, or designing an algorithm that sped up its own training by 1%. Come on, that's recursive. Truly. This is what I'm talking about—the lead-up to the loop. But I don't know. Do you think we might be this year where there's no human needed whatsoever? I think that's a high bar.

(1:07:18) Prakash:

I think we might be. I updated on Modelbench. The Modelbench thing took me by surprise—1.5 million agents all of a sudden on the web. Yeah, it's all nonsense for sure, but things start off as nonsense. I think what might happen is you get a single model update which fixes a little bit of hallucination, a little bit of security issues around leaking secrets, and I think that might be enough.

(1:07:44) Helen Toner:

Sounds hard. Sounds real hard. Fixing the security stuff.

(1:07:48) Prakash:

Yeah, we'll see.

(1:07:49) Abhi Mahajan:

Yeah, maybe. Maybe.

(1:07:51) Nathan Labenz:

I do want to give you the chance to talk about the dynamics of this. There are different reads that we might put onto people. They're ideological about it. Elon Musk has said things like, "I don't know if this is going to be good or bad, but I want to be around to see it." Also, "I'd rather be part of it than a spectator." That sounds like somebody who is inclined to gamble with humanity in a pretty self-aware way. Others may feel like they're trapped into these dynamics and they at least will do it as safely as possible. How would you describe that milieu right now? I think it's dramatically underappreciated by people outside the AI bubble where we spend all our time.

(1:08:36) Helen Toner:

Yeah. I mean, I think my impression from the people I've talked to is there is just a sense of inevitability about AI advancing and a desire to be a part of the future being created, because they see this as a future that's being created.

There's also—you mentioned how this has been part of the AI conversation since the very beginning. I. J. Good was in the early 1960s talking about when you create the first ultra-intelligent machine. I feel like we always need more terminology in AI. I feel like we should get "ultra-intelligent" to make a comeback. There's this very natural logic, if you have a computer science kind of brain, to say, "Okay, we have some level of skill at building computers. When the computers have more skill than we do, then they'll build ones that have more skill than that, and then you get a loop."

That logic is just very appealing and seems very natural. People think of it as something that's going to happen anyway, and then they may as well be involved. That's not everyone, but I do get the sense that that's the water that most folks are swimming in. Then if you have a different view, it's in contrast to that. I don't know. Is that your sense as well?

(1:09:42) Nathan Labenz:

Yeah, I think so. The inevitability is a pretty compelling argument. I resist it because I at least want to make the point that, even if some form of this is inevitable, there is still probably important discretion that we can exercise in terms of exactly what flavor. There are questions like: Should we keep chain of thought interpretable, or should we embrace thinking in latent space?

I do think it's important to keep in mind that it's probably not all one or all the other. AI defies all binaries. There are going to be these gradations and these more local decision points. But yeah, in 2022, I was just trying to make AI work for practical tasks. With no background in AI research, I basically ended up independently inventing a number of the techniques that have gone on to produce great things. I didn't take them past any local plateaus, but just having AIs improve their own outputs—proto-constitutional AI type stuff—I do think the attractor, the gravity well, is pretty strong. It's hard to avoid some version of these techniques because if even a bozo like me lands on them, I don't know how they're not going to happen in the big broad world, especially as we start to also get dramatic democratization of training techniques.

Prime Intellect just put something out that allows anybody to spin up their own RL environment on a distributed basis, on a community basis. Everything's going to get tried, and I think that's pretty hard to argue against. But again, I do want people to still own what exactly it is that they are doing along the way.

(1:11:29) Helen Toner:

Yeah, I think there's something in here which takes me back to long-running conversations about autonomous weapons. There's something about the level of human oversight that you can have. Using AI to accelerate research is, I totally agree, an attractor. But there's a meaningful difference—you'd really hope there is—between "I have a fleet of 10 million AI agents, and they're running experiments for me, and I am leading them, and I am guiding them" versus "I have set something into motion. I have no fucking clue what's going on."

I think there's a boundary somewhere. Is it a boundary that we're able to stay on one side of? I'm not sure, but I hope it might be. To me, that feels like the point to try to intervene, not "We shouldn't use it for research." That's obviously not going to work.

(1:12:12) Prakash:

To what extent do you think policymakers are naive? Earlier on, we spoke to Sam Hammond, and he advises some policymakers on AI. He was talking about privacy and the restrictions—the constraints that we could put on and the regulations. One thing that struck me was that I think a lot of policymakers are not aware, perhaps, that AI with access to existing technology, with access to persistent search, persistent memory, would basically do a Google stalking of you before it even met you. It would know all of those things in the public domain.

The amount of access to information that it could have, the persistence of information, listening in on conversations—these things are going to be very powerful in that sense. I think you can ban the AI from using facial recognition. Fine. But then you have network analysis, metadata analysis on WhatsApp conversations on where the messages are going. You don't need to know the content. There are a lot of these techniques where you can deanonymize traffic, deanonymize people. You don't need facial recognition. You can ban facial recognition. You can still do gait analysis and speech analysis, voice analysis, handwriting analysis. There are so many other techniques. All of these things will be available to AI. To what extent is this whole "We're going to make sure we have privacy" thing—are they being naive? Is it going to be possible?

(1:13:47) Helen Toner:

I mean, the US has done a worse job of this than pretty much every other country on the planet. I think there are some basic rules. I don't think you want to do rules at the level of "no facial recognition." I think you want to do rules at the level of "no data brokers." You can collect data, but if you're going to collect it, the user needs to know, and they need to have notice and consent. I'm not deep on privacy law, so I don't want to pretend that I have the right privacy proposal here. But I do think there are ways to do it that are better than the US, and I do think there are ways to do it that give you that underlying flexibility.

Yeah, maybe I'll leave it at that because privacy law goes real deep, and I'm not there.

(1:14:22) Nathan Labenz:

One more question for you. In the report, you talk about the possibility that the gap that we think is currently small between the models we have and the models that are used internally could open, and you have some recommendations around certain transparency measures. I want to give one quick shout-out to the AI Whistleblower Initiative, founded by my friend Carl Cox, who has engaged with OpenAI and at least played some role in their recent updates to their whistleblower policies. I find it amazing that OpenAI is continuing to work in that direction even today.

Where do you think we are on the spectrum from secret non-disparagement clauses to where we need to be in terms of insight into what is going on at the labs, other than private philanthropist-funded whistleblower support? What other policies do you think the government should be doing? Maybe even more broadly, if you want to zoom out: What do you think a situationally aware US government should be doing in general that it is currently not?

(1:15:35) Helen Toner:

Yeah, I think there are a bunch of things here. On transparency, I think we're doing better than we have been. We have these two new state laws: SB 53 in California, the RAISE Act in New York. I think those are good starts. But for a lot of this information, we're still just really dependent on what the companies choose to put out.

Now, we're fortunate. I want to give credit to both OpenAI and Anthropic, and to a somewhat lesser extent Google—they do proactively put out a pretty good amount of information. I think they should get some credit for that. But I don't love that it's almost entirely at their discretion what it is that they put out. I guess that will be shifting as SB 53 and RAISE start to be enforced. I'm interested to see what that looks like.

I think we also need to shift a bit. There's been the beginnings of a push to shift from a model-release-based schedule to something more continuous, which is partly driven by interest in these internal deployment type dynamics, not just the external releases. The idea here is: if the risk is not actually purely tied to when you put your model on the market, then all of your risk evaluation shouldn't be tied to that either. Also, creating better incentives for the companies around not forcing them to just rush things out the door, but instead trying to have more of a continuous pulse of updating metrics over time.

I think we could definitely be doing better on transparency. Ideally, pairing those requirements with some kind of independent audit requirement or independent way to let external third parties come in and check that things are happening as they're supposed to be happening. That has been in several of these proposals and keeps getting stripped out by industry lobbying. That, I think, is a new frontier as well.

There are various other policy implications that we put in the report. One that's maybe interesting is this general hardening-the-world sort of recommendation, or societal resilience. This is cyber defense, biodefense, biosurveillance—investing in biosurveillance just meaning monitoring diseases, not surveilling people. Investing in epistemic security stuff, trying to have a way to determine what's real and what's fake, tagging real content. All this broader societal resilience stuff is like, "Okay, just assume that this is going to get much, much better, and then we might see automated AI R&D contributing to increased pace of change."

There's also been—this is less of a policy and more of a mindset—there's been a shift over the past year or two to "Actually, maybe open models are always going to be pretty close behind." So concerns that you might have about there being an access gap or a concentration-of-power gap if the closed models are far ahead, maybe we don't have to worry so much about that. I think if you're taking seriously the possibility that automating R&D speeds up the closed labs significantly, then we just need to revisit those assumptions about open models and closed models.

There are a few others, but I would point people to the report for the full set.

(1:18:23) Nathan Labenz: When AI builds AI, things just might start to get weird. So yeah, definitely check out the full report from Helen and coauthors at CSET and beyond. Interesting times, for better or worse. Any closing thoughts before we break?

(1:18:39) Helen Toner: No, great to be on. Great to chat with you as always. Yeah, look forward to next time.

(1:18:44) Prakash: Indeed. Cool, Helen. Very nice.

(1:18:46) Nathan Labenz: Here's to our timeline between now and next time.

(1:18:50) Nathan Labenz: See you.

(1:18:51) Prakash: Cheers.

(1:18:51) Nathan Labenz: Bye for now.

(1:18:53) Prakash: So our next guest is Jeremie Harris. He's from Gladstone AI, and they wrote the first ever US government AI threat assessment for the State Department. It's been about ten months now since they said every American AI data center is compromised. Jeremie, what has changed? Have things gotten better or worse?

(1:19:15) Jeremie Harris: Yeah, well, to piggyback off what Nathan just said—things are getting weird. Great to be on. What has changed since then is less than I might have hoped, and for really interesting reasons. I think one of the things that a lot of people who are concerned about the AI risk story and the AI threat landscape from a national security perspective—whether it's loss of control or weaponization—a big part of the story that's missing is understanding the infrastructure buildout. What are the actual bones that we're building on here? Because that's the substrate that underlies everything, and there are all kinds of assumptions being made about it where we're kind of abstracting away what I really think is at least 50% of the problem here.

We think a lot about model reconstruction attacks and all kinds of interesting debate about whether it even makes sense to secure models in a world where you can just reconstruct them if an API is available. But more fundamentally, when you're building your entire AI industrial base off of components that are made in China with personnel who often are Chinese nationals—I mean, forget about the Manhattan Project. We're so far behind that. I think it's incumbent on us to take a step back and just ask: what is that chessboard? Forget about the pieces—what is the board itself? Are we playing on something that's fundamentally stacked in a way that doesn't allow for a winnable outcome?

I'm not saying this to be pessimistic. I think there are actually solutions that you come up with very quickly once you take that new perspective. But closing your eyes and not looking at it doesn't address the problem. I think we're in a space where we're doing a lot of algorithmic-level thinking because that's what so much of the Western economy now is based on—people at keyboards who are used to that. We're not making t-shirts anymore. We're not building transformers anymore. We're not doing that stuff, and so we tend to like to pretend that it doesn't exist. So that's kind of my more recent lens on the problem in the last two years. I know that's not quite an answer to your question, but that's kind of the chessboard as I see it at this time.

(1:21:24) Prakash: When you look at it end to end, you have the software piece and the talent piece—50% of top AI researchers are Chinese nationals, and that includes people working in the frontier labs in the US right now. And then you have the infra piece. A lot of stuff is coming from Taiwan, South Korea. Some of it is coming from China too. And then you have ASML sitting in Holland, which is supplying into TSMC. And then you have ASML's suppliers—they have like 3,000-odd suppliers spread across the world. They're buying neon gas from Ukraine. When Ukraine got invaded, they had a problem. All of these missing pieces, all of these pieces spread out across the place, right?

And TSMC has been upfront by saying, we are only possible in a safe, globalized economy. If we ever got invaded, everything's over. There's no—we can't do anything. That's it. It's done. So where do you think—how do you think that fits in with a threat perspective? It seems like someone just has a dead man switch over TSMC. So how does that work in terms of security and securing US prospects and the future in the US?

(1:22:42) Jeremie Harris: Yeah, I think it's a great question. That whole Taiwanese scenario planning thing is something that everybody has talked about. I'm not so sure everybody's kind of worked out the implications to full satisfaction. I mean, yes, if Taiwan gets invaded, TSMC is gone. It's gone whether it's because China takes it or because it's, as I would expect and hope, booby-trapped to the nines to blow.

It takes hundreds or thousands of insane-level PhDs to tweak that. Think of it as like a giant box of 500 dials, each one of which has to be perfectly tuned to keep these things pumping out at the right yields. You're not gonna replicate that if you're missing either the equipment or the people. So this is the most fragile production process that primates on this planet perform. An invasion is unlikely to leave it in China's hands.

And so the question is then: what do you get when you roll that back? What's the number two positioned entity? And then you start thinking about, okay, what does SMIC do? What can it do? And the SMIC-Huawei complex does seem like a very plausible runner-up, especially when you look at scale production, especially when you look at the emphasis Huawei's placed on networking large numbers of GPUs together. They don't have to be as efficient as ours. They can't be—they don't have the logic. But they can be networked together way better, and that's how they get effectively competitive scale performance. So this is a real issue.

You also, in a funny way, this interacts somewhat positively with the energy bottlenecks that we have here anyway. We're gonna be bottlenecked by energy probably sometime around the end of the year. When that happens, TSMC's ability to outproduce—it gets complicated because on a per-chip basis, they're way more energy efficient. They're pumping out more flops. But we do have that energy ceiling. That's the main constraint. So the timing matters a lot here. That dance between how much does logic matter, how much does energy matter, how much does memory matter, how much does packaging matter—all four of those things have become bottlenecks at different parts of the game in the last few years.

Another piece that I think, again, when we think about the actual bones that the AI economy runs on, it's not just chips and not just the data centers themselves. The power grid is a really just generally vulnerable target. We know that, for example, there have been components in Chinese transformers that have been snuck in—explicitly Trojans to be able to take down our grid. A very plausible scenario, just based on talking to folks who are working this problem on the IC side, is a Taiwanese invasion begins, one of the first things that China considers doing is just shutting down the American grid.

It's kind of obvious. If it's existential, that's massively escalatory. So there are huge question marks there, but it's a scenario that's being taken very seriously for all the reasons you might imagine. So yeah, I mean, I think if that happens, there are questions that suddenly run much deeper than just our ability to literally make chips in Arizona or wherever the next thing is. Literally, if we can be kneecapped economically at a more fundamental level, we don't even get to look at the chessboard that we hope to look at. We don't even get to indulge in the "oh, well, what can Samsung do versus what can SMIC do versus CXMT." We don't get to play that game. We literally don't have an economy. There are serious implications there.

So if we think about this as a game with the stakes that it might have—and this is contingent on what's between Xi Jinping's ears and the Politburo's ears—but this could end up looking like we're preparing ourselves to take a punch in the face, but then we get kicked in the balls, if you will. I mean, this is the kind of scenario where they use the technology that we may be vulnerable towards. And again, that zoom out, I think, is really important. We've got target fixation here on what could be a pretty narrow part of the chessboard.

(1:26:56) Prakash: You had some ideas on how not only do we need to speed up, but we need to slow China down. What was your concept around slowing China down? Because they're trying their best. They are definitely not there on the chips yet. The Ascend 910s—the Huawei Ascend 910s—they don't really like them. They want to get the H100s in. There is this concept of building on the US AI stack—it's also revenue denial. If the revenue flows into NVIDIA rather than into Huawei, then Huawei has less revenue to develop those chips, so the argument is we should deny them that revenue. How does this balance out—letting them take our chips, but not too powerful ones, while still selling enough that it doesn't create a market for Huawei? It sounds like a very delicate balance here.

(1:27:51) Jeremie Harris: It does sound like a very delicate balance. I personally am less oriented towards the argument that says, if we just let NVIDIA do business in China, then the Chinese will go, "Oh, sweet. We have NVIDIA that's serving our needs. We don't have to push so hard on the gas on this issue that's been identified for years as possibly the number one national technological priority that we are pouring multiple Apollo moon landing amounts of cash into."

This is, to me, a kind of miscalibrated sense of even just the messaging that the CCP has been putting out. I just don't see a world—we don't see, for example, NVIDIA shipping the H200 or whatever it is now, and then suddenly the CCP goes, "Oh, okay. Forget about that quarter trillion dollar investment that we just made in PPP terms into our national AI chip capacity and infrastructure. Forget about that. We'll stick with the NVIDIA play."

There's a sense both that the ability to access these NVIDIA chips is transient because the next administration may just as well pull it down. But also that—why not both? I mean, it seems like an insane thing given that AI is a matter of national security importance for China. It'd be pretty surprising to me if they just decided to respond that way, and it seems like they haven't so far. So I guess that's why I think about the export control thing from a slowdown standpoint.

They have worked. We know from DeepSeek—the public statements of their CEO before DeepSeek was on the radar. And this is actually, I think, really worth noting and under-recognized and appreciated. Before DeepSeek was on the radar, they were coming out and saying, "Hey, we really think we could do the AGI thing, though. Just one problem. We can't get chips. And these export controls are killing us."

Then, obviously, R1 drops and everything is about DeepSeek, and they get dragged in front of the Politburo or whatever and debriefed, and suddenly things change. You get these little trickles, these little leaks of similar information that come out at the edges of this kind of Chinese AI ecosystem every once in a while. But it's pretty clear that the export controls were worth it. If nothing else, just look at the massive orders that are gonna be coming in for the H200 to show how much pent-up demand there actually is in the AI ecosystem. Of course, we know all about the frustrations of AI companies in China and about the current way they went after their chipset. So yeah, I mean, my bias take is very much towards the direction of: I think we gotta listen to Chinese companies when they tell us that our export control policies are working.

(1:30:36) Prakash: Mm-hmm.

(1:30:37) Nathan Labenz: Maybe I'll come back to some of the frustrating duality of difficulties where on the one hand, you have expressed very low hope for the opportunity or the possibility of meaningful true collaboration between the West and China. And then at the same time, I think you're also not super optimistic about our ability to create a superintelligence that we can actually control and get to do what we want it to do.

And I think the way I think about our conversation from a year ago or so and your contribution to the broader discourse with "America's Superintelligence Project" is like: those two things are both real. They're both true, and you're kind of engaging in motivated reasoning if you try to deny either one of them.

With that in mind, we are now also seeing some of the potentially foreshadowing kind of moments on the AI side itself. Just in the last week, we've had these new models from Anthropic and OpenAI, and they've both kind of said, "We weren't really able to run the evals like we kind of intended to." Anthropic basically said the eval awareness is pretty high, and so we'll just do a survey—a little internal survey of whether or not this is safe to release. That's probably a bit of a simplification on my part, but I think that is a fair enough summary of their position.

Then OpenAI similarly was like, "Well, the autonomy risk part of our preparedness framework is also pretty hard to evaluate. We don't really have tasks that are kind of long enough horizon that we can get a real handle on just how autonomously capable a new model like o3 Codex is."

So that's kinda crazy. And yet, of course, both models are put out there. I don't see China driving the need to do that. It seems like they're doing that and doing it on the same day, notably, because their own competition between the two of them and also just sense of rivalry seems to be heating up. They're going at each other in Super Bowl ads to some degree at this point. Not something I thought I would see from Anthropic at the beginning was like a Super Bowl attack ad, but here we are.

What do you make of the dynamics between the Western companies? If I were to put on my slightly pessimist hat for a moment, I would say it seems like we might be racing to the bottom, which was exactly what we were hoping to avoid.

(1:33:13) Jeremie Harris: Yeah, I think we are racing to the bottom. I think the only frame that makes any sense—if we're gonna talk about, "Okay, we need to regulate this technology domestically," in the same way that everybody from all leading AI companies have been saying for, I wanna say, over a decade pretty much. You're never gonna do that unless you deal with the outer loop, the outermost loop, which is international competition.

There is no version of it. I don't think anyone—I think, again, we can enjoy the indulgence in target fixation of like, "Oh yeah, let's play the game pretending that other countries don't exist." But in the same way as the algorithmic target fixation not seeing infrastructure, this causes us to miss what is really the entire problem.

So you are not going to get to a point where you can have a tactical slowdown when you really need it. Suppose we find that the next version of whatever model can design custom bioweapons, execute catastrophic malware attacks—all these things that are entirely plausible, and against which no counter-jailbreak measures are truly 100% effective when it comes to the kind of people we'd be worried about. Yes, you would absolutely in that world need somebody to be able to say, "Okay, guys. Tactical halt. This is insane. We can't be in a universe where you get a nuke and you get a nuke and you get a nuke."

(1:34:40) Helen Toner: We can't do a BOGO free for nukes.

(1:34:41) Jeremie Harris: Okay. So what are we gonna do? We're gonna have to have a slowdown. If China still exists and has their program and they are—I'm repeating all this stuff that everybody said a million times—if they're 12 months away, six months away, I don't care. We've got a shot clock now. That's the situation.

So we have to start there. We have to start there and say, "Okay, any serious solution to this problem will involve dealing with China." Two ways you can do that. One is you have a kumbaya moment with China. There are a lot of interesting reasons why I think this is just not gonna work. One of which is: if you think about international treaties, they don't tend to reflect some sort of Star Trek-y commitment to everybody on planet Earth wanting to do the right thing. They tend to reflect the realpolitik kind of level in terms of actual power.

Nukes—you have nuke drawdowns when everybody can retain arsenals that can still destroy the entire planet three times over, and there's literally no point in building the marginal nuke. You have similar things if you actually look at the history of bioweapon and chemical weapon treaties. You find in every case that actually, they don't give you the marginal lift over just killing people with artillery and gunshot. It looks nice, and they often get adhered to for that reason. But then at the margins, you have Chinese research labs on American soil doing all kinds of crazy research. You have whatever facilities in Wuhan. All this stuff happens anyway.

And so this may sound super cynical. I think it just reflects the way things work. That's at least my take. So the question then is: how do you deal with an adversary like China that's in the position they are, that does have a stranglehold as they do on our infrastructure? They simply do. So the question is, what are your offensive options? That is it. You're not going to build the perfect Fort Knox. This is not a thing that's possible. And so the question is, what do you do to induce consequence on the other side? That's the only math that will work if my theory of the world is correct.

It's not a pretty theory. It's not one that leaves us feeling warm and fuzzy inside. It's one that may make you think a little bit about mutually assured destruction, that sort of thing. I think there are nuances here—obviously, Dan Hendrycks had his frame on it. But the bottom line is: I think you kind of need this—and it doesn't need to be an AI-based response, though you can certainly argue that any offensive option that isn't coupled to the scaling laws is eventually going to be beaten by something that is. So there's kind of an important design principle in these things.

But there are offensive options that need to be explored. And this is unfortunate, but it does mean that if you have a situation where your adversary can turn to you at any time and say, "Hey, watch me turn the power off on your entire grid and have tens of millions of Americans as a result die of starvation or exposure"—

(1:37:31) Prakash: We need—

(1:37:31) Jeremie Harris: —you need the ability to say, "Okay, watch the same thing happen in Beijing, and we can turn it back on, by the way. We need to have this de-escalation option."

I know it's a bit of a grim view, but I think that when I think about what actually gives leverage in this situation, it looks a lot less like a pretty treaty, especially given the history of countries like China, like Russia, with respect to treaty adherence. We know what it looks like when China signs a treaty. It doesn't end up being pretty in a situation like this where you need perfect adherence at such a high level of precision.

There's no version of an international treaty on AI that doesn't involve inspections of compute stockpiles and precise overwatch of the kinds of algorithms that are being deployed, the kinds of evaluation schemes—the level of cooperation that's required to do something tractable here strikes me as being quite significant, and the trust just—I don't see it being there.

(1:38:26) Nathan Labenz: So what's your p(doom) and on what timeline? I mean, we were just talking with Helen about this report that they put out about when AI builds AI and the possibility of recursive self-improvement. Sure seems like all of the vague tweeting that is going on right now out of the frontier labs is suggesting that that is happening. And then on top of that, of course, OpenAI has public timelines that they've put out, I guess to their credit, maybe. I guess we could see that both ways.

The Anthropic people that I talk to are, if anything, always the most firm believers that the recursive self-improvement dynamic is unavoidable. How long do you think we have before these things really start to take on a kind of runaway dynamic? Is there anything that you—if you had power, and a lot of power—is there anything that you feel like you would want to bet on? And where does that leave you in terms of p(doom)? Maybe I should just stop all this and spend more time with my family.

(1:39:29) Jeremie Harris: Yeah, in general, I'm a big fan of the happy warrior mindset. I think it's just never constructive to go in whole—first of all, we have to assume that no matter how firmly we might believe in whatever outcome, we may just be wrong. There's a famous story of Richard Feynman walking around New York City in the seventies, I think it was, looking at all the skyscrapers and being like, "Wow, isn't it sad that all of this is gonna be wiped out by a nuclear war between Russia and the United States sometime in the next few years?" And he was just—that was a fact of the matter, and it reflected a pretty reasonable understanding of the dynamics unfolding between those countries at the time.

I'm not saying it's ever quite that simple, but if nothing else, this is an ingredient that makes you less effective if you're just stuck in a hole all the time. And just as a meta point, I guess, I think that is the first piece. We have to act with agency, and we're gonna be most effective doing that if we're not stuck in a kind of deterministic Calvinist frame with this whole thing.

In terms of—I'll also not answer your question before I answer it just by saying: regardless of timelines, one thing to focus on is that there's some things that are pure optionality plays. And so there are things that you do if you're going to build a frontier AI cluster at scale that rule out nation-state security at that cluster.

(1:40:50) Prakash: Mm-hmm.

(1:40:50) Jeremie Harris: They just rule that out. If you don't do these things right on day one, by day 360 once you finished building the site, your site is going to be compromisable, and there's no going back in doing that. We think of these as—we call it the one-way doors of the data center construction process. Figuring out what those one-way doors are, setting standards for them, and actually executing on that, even doing it just voluntarily. We think about OpenAI and Anthropic and so on, all independently going, "Hey, we just wanna buy that optionality because at some point—"

(1:41:25) Prakash: Can you give me a concrete example of a one-way door?

(1:41:29) Jeremie Harris: Yeah, so there's a bunch that I can't go into, but one that I can—it's pretty easy—is think about the people that you're getting in the loop to review the site plans and details that would be, let's say, useful to an adversary who is trying to extract information. If those people are Chinese nationals, okay, you're done. Cool. You're never gonna unfuck that. That's baked in.

So these are actually—the interesting thing with these one-way doors is they tend to be surprisingly cheap. And that's the tragedy of it all. If you were thoughtful, you could go through and be like, "Well, on a fraction of the budget that would be required in CapEx and OpEx for these builds, you could just create pure optionality by implementing these safeguards."

Putting offensive options on the table is a pure optionality play. You don't need to exercise those options. You need to have them on the table. That's what I'm saying. I'm not saying let's go to war with China. That's a crazy thing to say. There's all things in their context, but you need options. And that's a crucial thing. So having an understanding of mapping out the ecosystems that are relevant, the AI ecosystems that are relevant, and thinking about what might that endgame play out to be—those seem like pure optionality plays regardless of timelines, all things you can do quickly. Again, this is—it seems to me kind of—and I'm not saying they're not being done. It's just that often there's a lack of sort of focus on the endgame here, anyway, without getting into the weeds too much.

Okay, so p(doom) and timelines. Oh, sorry, Prakash.

(1:43:05) Prakash: Yeah. No, go ahead. Go ahead.

(1:43:07) Jeremie Harris: Yeah. P(doom) and timelines. So I'll almost say p(doom)—I don't find it useful. I know what I'm focused on. I know what I gotta do. If I start thinking about—my generic answer has been for years: any number between 10-90%, I'll take as—that's a reasonable number. I'm not—I've read the debates. I've seen the posts on LessWrong.

(1:43:34) Prakash: So is that your p(doom) or p(loss of control to superintelligence)? Because I think in some places you've mentioned it's a loss of control to superintelligence rather than doom.

(1:43:45) Jeremie Harris: You've obviously done your homework really well. Yes, that is more of a loss of control to superintelligence. I think by virtue of the way that numbers multiply together, I don't know that my answer is that different for p(doom) in general. Again, this is coming from somebody who, for better or for worse, has almost explicitly not put in that much time to kind of wallow in those numbers as I think we're all sort of tempted to do. I have that temptation. I get it. I mentioned I had a daughter, right? I don't like the landscape that's playing out, but I had a daughter. I chose to have a daughter, and I didn't have her in 2018 before the scaling laws blew up. This is a choice that I made. I think there's a kind of almost spiritual risk to getting locked into that kind of thinking, and I say this—

(1:44:34) Jeremie Harris: I went through that experience and found how much it changed my views. So I guess I'll just not answer the question by saying 10 to 90% sounds reasonable. I think if you're below 10%, there's homework you have to do, because a lot of these scenarios maybe sound crazy, but they're a lot less crazy than they seem. When you get into the nitty gritty, a lot of these scenarios are halfway to happening. And if you're above 90%, first of all, if you live as if you're above 90%, that's going to make you less effective. I also think, again, Richard Feynman certainly seemed to think he was in that ballpark. There's an epistemic question here of how quickly does the world adapt. I think we're constantly surprised by how quickly the world adapts, both how fragile and how resilient it is. The eleventh chapter of the book will often involve a new character that comes out of nowhere, and we just need to make sure that we keep uncertainty about our uncertainty factored into this analysis. And I think that buys me 10% pretty easily. I've been wrong on stuff that I thought I was 100% on often enough to be like, okay, I'm not going to push it that much. I know that's frustrating for a lot of people. Like, no, but look at the math, man. And I get the math. But what I'm questioning here is the process that led to the math, and I don't know that I can possibly ever get fully behind that process and interrogate it with confidence.

So the last thing is timelines. There's AI 2027, and contrary to what you might think, I expect Dan is going to pull back a little bit on the timelines from here.

(1:46:14) Prakash: He said 2027 always meant 2028, but now it means 2029.

(1:46:20) Jeremie Harris: Yeah, AI is the apocalypse of the future and it always will be. But, you know, not actually. When GPT-3 first came out, I was like, oh man, I've got two year timelines. And that was because I didn't understand what the hell would be involved in the infrastructure buildout. Now that I have a much better understanding of that, I'm still kind of like, what's the next bottleneck going to be? I'm very uncertain about this. And again, it's one of those things that doesn't really affect what I do just because I'm so focused on all the low hanging fruit that we have to pick right now. There's so much stuff that we're just not doing because we're paralyzed by the problem. So I think in terms of what we do, there's pure alpha on the table in the short term. 2027 doesn't sound insane to me. 2030 doesn't sound insane to me. 2035 sounds a bit far. I guess I'll sort of leave it at that as a spread. I think we should be acting as if 2027 is plausible. I think it would be unfortunate if it happened in 2027 and we're like, man, we had a lot of really plausible analysis that pointed to that and we just didn't do anything. That would be a shame.

(1:47:29) Nathan Labenz: Can you give us a little bit more of a hit list in terms of the low hanging fruit that you want to see us pick? I mean, we've got the one which is build at least some subset of our data center buildout in a secure way so that we can run hypersensitive projects there as needed. What else is kind of on the list? If you're replacing David Sacks as the next AI czar, what's going to be your priority sheet?

(1:47:59) Jeremie Harris: Yeah, I mean, that first one, by the way, is a lot of things. It bundles together the personnel security issue, insider threat problems. There are a huge number of things in that bucket alone that are necessary and contribute very cheaply to much more optionality on the security side. I think if you zoom out more and look at the grid, what could you be doing to introduce redundancies quickly? The supply chains that lead to a lot of these components are very clearly sourcing heavily from China. Here's an easy win: look into companies that are offering to build data centers suspiciously fast and who owns those companies. There was actually a letter that came out from the House Select Committee on the CCP a while ago naming May Day Data Centers as an entity that is somewhat suspect. They'll have these data center building companies where it's like, oh wow, you could build stuff way faster than anybody else. It involves sourcing components from China. And my personal opinion is, if I were to see that, I might be asking myself the question: China is kind of a command economy through civil-military fusion. If the CCP wants me to have this very rare and precious and backlogged component for my data center in the continental United States, that might tell me something about how much faith I should have in the security and integrity of that component.

There's just not a lot of infrastructure level attention being paid to these things. And the labs, by the way, they want to do the right thing here. They don't want to be in a position where they're getting a company to build something for them, and then it turns out that thing is compromised and it comes out. That is not good for anybody. So incentives are aligned there. There's just been so little attention paid to the bones that there's tons of stuff we could improve, including with AI. Looking for malware in old software that's load bearing for our infrastructure, or non-malware but rather vulnerabilities, and finding ways to harden it. Yeah, this is a defocused answer, but it hopefully gives a sense of the menu.

(1:50:24) Nathan Labenz: One thing we haven't really given you a chance to flex your ability on in this conversation is just the breadth and depth of your technical understanding of so many AI developments. And I definitely recommend the Last Week in AI podcast, which you usually host, as a great source of very sophisticated analysis by both of you, but I tune in for you mostly, to be honest. And I wonder how you are doing it. How are you keeping up? What is your method? How have your methods evolved so that you're maintaining situational awareness as much as you can?

(1:51:06) Jeremie Harris: Well, thank you, first of all. That's very kind of you to say. I have told you this before, but I do actually watch The Cognitive Revolution. I think the ecosystem here is really rich, and interviews are really important because you get stuff that you can't get from the papers. I tend to focus more on the papers, so I just don't get that kind of analysis. I just talk to friends from the labs, but it's different from those deep dives.

Yeah, I mean, I can't remember when I started on Last Week in AI, but it was maybe 2021 or something. Back then, I would just read papers, and you couldn't use GPT-3 to help you understand a paper. It just wasn't a thing. Now that's changed. I've had an experience that was kind of frustrating this week in particular because I'm preparing a state of play briefing for a customer, and basically they want to know what happened in the last quarter in the world of AI that we should be tracking. There's a paper that I had Gemini help me with, and I got to a really good understanding of the dynamics of gradient flow through residual connections and whatever. It was pretty complex. What I realized, though, after interacting with Gemini for long enough and then switching over to Claude, was: wait a minute, I just hallucinated my way through that entire conversation. I'd gotten to an understanding where I was like, oh yeah, I'm pretty smart for figuring this out, I've got this nailed down, and then everything got flipped around. So it doesn't always happen, but that has been the most recent update in my process: be mindful to double check, especially as you start to get lost in a rabbit hole.

(1:52:51) Prakash: Yeah.

(1:52:51) Jeremie Harris: I typically spend about 30 to 40% of my time reading the paper and then the rest interacting with a model, usually about the implications of the paper or what it is. Is it reinforcement learning versus supervised fine-tuning? If I'm reading the paper, I'm doing SFT. That's what's going on. With the models, I get to actually go on policy, and I get to test my own understanding. Like, I would have done this experiment differently. Is that a stupid idea? And often I'll get a pretty good answer, and it makes you feel like you're rotating the shape instead of just staring at it. And that for me has been really helpful and empowering. It feels empowering.

(1:53:31) Nathan Labenz: Do you have any particular workflows, pipelines, whatever, that try to filter things for you and surface what you really need to spend time on? Because that's more challenging than ever, and choosing what to spend your time on in the first place seems like maybe as big a deal as being able to successfully make sense of any one thing. How has that evolved for you?

(1:54:01) Jeremie Harris: Yeah, it's a great question. This is that age old question of taste. And one of the things that I've had to come to accept is I can't develop good taste in all the domains that we want to cover on the podcast. My taste is basically: if one of the frontier labs puts out a piece of research, or if a researcher I know and have a lot of respect for puts something out or is a coauthor on something, I'm going to take a really hard look at that. And then besides that, I have the usual set of Twitter accounts that I follow, and that's another way. But my passes at these papers are pretty focused on the "what's on the critical path to ASI" question. Not that I know the answer, but I'm trying to find things that gesture at that, which is why I don't tend to talk about GANs or the latest in... well, I was going to say the latest in text-to-video. Now that seems like it could be on the path, so you never know.

But I guess part of it is just acceptance. I am reading these papers for the concepts more than the outcomes. And often what'll happen is there's a paper that'll come out, and it might not be the perfect paper to cover from a given topic area. There's this paper about residual connections and really optimizing the crap out of them to get ultra deep transformers. Is this the best paper? Probably not. But the reason I focus on it when I'm explaining the underlying concepts on the podcast is that, A, there's going to be another paper next week that obviates whatever the hell the last paper did. And B, I think the core concept landscape is the most important thing. So when there's another paper that comes out about optimizing residual connections, you're like, okay, I'm familiar with this playpen. I know the furniture in this room. I can rearrange it a little bit, be more confident. So I guess the answer is I get around the taste issue by not having it, which is maybe just annoying.

(1:55:59) Nathan Labenz: What's underappreciated for you right now by AI-obsessed people? Of course, in the broader world, AI itself is underappreciated, and just how crazy things might soon get is very probably underappreciated. What do you think I might be missing? What are the most likely blind spots for somebody like me that you would want to draw attention to?

(1:56:21) Jeremie Harris: I guess the challenge with blind spots is that we all have them, and by definition, we don't know that we have them. So what I'll try to do is roll back and tell you about my blind spots from about two years ago. And that was around the time that we put together that report that Prakash mentioned earlier. This might sound the wrong way to put it, but the stuff that feels too blue collar to most people who are AI obsessed, like I am: you sort of start to realize how much of the world is actually built on infrastructure that we just abstract away. So I think that's actually really important and needs to be paid attention to. Understanding down to what are the dynamics of the leasing process that a frontier lab goes through to get a new piece of land? What can screw up there? What causes delays in construction projects that we talk so much about? You know, xAI has their new Colossus cluster, and it's going to be online, shockingly, up to one gigawatt sooner than Anthropic, which surprised everybody and all this stuff. When did the groundbreaking happen? Because that's going to tell you, if you believe in the scaling laws, that is probably one of the most important variables that you want to track: delays in construction processes. Sounds pretty mundane, but the world runs on it. And procurement schedules and things like that. So I guess that's one piece that I had been missing.

Another one is how real nation state security happens. It's hard to get information about that. One of the biggest problems is there is no such thing as one nation state security capability. Nation states are siloed, obviously, because for security you can't have tactics, techniques, and procedures that are exchanged between silos because then there's no information security. And this, by definition, means that you would have to go through a process of taking team A, comparing them to team B. Well, okay, team A wins. And then next day, you'd have to go through that kind of selection process, run a tournament-style situation to even know what the most exquisite capabilities are that we could field, and that still wouldn't tell you quite what other countries can do. So that's kind of, anyway, I think a really important dynamic that's very easy to undervalue in the AI security context, especially for physical security, which is, again, undervalued precisely because we tend to abstract away. We focus a lot on cyber because it couples to AI, and it feels like it's in our sweet and nerdy space, and I get that and I love it and it's critical. But it's also not... if you look at what the Russians do, they do cyber for sure, but they will go up and arson your transformer. That's not an issue for them. We've had examples of that sort of thing happening. So anyway, there's that piece.

Maybe the last one, and more in the comfortable and familiar learning space that I occupy, is this idea of the distinction between having a model and then having the compute to run that model.

(1:59:17) Prakash: Mm-hmm.

(1:59:17) Jeremie Harris: If you believe in the inference time scaling laws, then model theft is one thing, but actually being able to point that model at, basically have a compute on compute war at inference time, seems like a really important dimension. And you see this play out in a lot of interesting ways. One of which is the Chinese ecosystem. They have a huge number of users. And then they have some okay-ish language models. Problem is that their labs are all flooded with these inference requests from their giant user population, which leaves very little R&D compute for just innovation and improving of models. And so that's actually one of the frustrations for Chinese labs much more than labs here. They're just like, dude, we have so much demand, but we're not bottlenecked by money. We're bottlenecked by compute.

And so the dynamics of how inference affects training, and then what it means to steal a model and what it means for model-on-model warfare to happen, especially in cyber. Cyber hardening has a certain amount of test time compute. That test time compute is going to be focused in some way. And then the offense side is going to have a certain amount of test time compute. And how those play out, the relative budgets, matters a lot there. Obviously, if you're defending, you have a wider surface area you have to defend. But there's a whole debate there. That, I guess, would be another dimension: going beyond just owning a model. What about running it? What can you do with this model that you have?

(2:00:45) Prakash: I have one last question, which is: you're pretty security conscious. Have you run OpenClaw, and what is your current personal productivity stack?

(2:00:55) Jeremie Harris: Yeah, I have not run OpenClaw. And actually, funny you say that, I'm setting up an old laptop that I'm going to use as my burner laptop for exactly that purpose, for the exact reasons you would imagine. In terms of my productivity stack, a big part of my job now is constructing agentic workflows for things that are not super security sensitive, mostly trying to optimize my comms, because that's a huge bottleneck for me. And for that, I'm still in the discovery phase of trying to choose platforms. I'd be interested in your thoughts as I dive in. Literally, next week is my deep dive, so this is almost the worst possible timing, because I think my answer is going to be horribly outdated. Yeah, it's a great question. I wish I had the answer.

(2:01:55) Nathan Labenz: I talked a little bit about mine at the top. I'm interested to hear more about what Perplexity is doing too. But for me right now, it's Claude Code as the base product, and then taking inspiration from a guy named Daniel Miessler, who I did an episode of the podcast with and who created Personal AI Infrastructure, an open source framework, and also from friends I just trade notes with privately. I'm trying to create deep context for myself by first exporting all of my digital history from Gmail, Slack, and all the other places where I have these comms, and getting it into a local database. Then, of course, you need a daily update process to fetch the latest, because you're still communicating on all these other platforms. Then I'm layering summarization and different angles on the data on top of that.

Right now, I'm at the phase where I'm like, here's a month's worth of all comms. That seems to, for me, come out to about 300,000 tokens. Now summarize that down to 10,000 tokens of what a chief of staff would need to understand this month in Nathan's life. So you do that 30 to 1 reduction, then probably build a year-long version of that, and then sort of the "let's talk about the relationships" cut on it, the projects cut. I'm also trying to have it leave pointers in those summaries, with a regular habit of quoting any distinctive language, so it can go search down to the ground truth for the original. Hopefully, with that deep context, it will have enough to not write exactly as I would, but to come much closer at least to responding as I would, having the sort of context necessary to exercise something like the judgment or taste that I would exercise in doing things.
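For readers who want to try something along these lines, here is a minimal sketch of the monthly roll-up step described above, assuming the exported comms already live in a local SQLite table and that summarization is done with the Anthropic Python SDK. The table name, columns, model string, and prompt wording are all illustrative assumptions, not Nathan's actual setup.

```python
# Hypothetical sketch: a monthly "chief of staff" roll-up over exported comms.
# Assumes a local SQLite table comms(ts, source, sender, body) built by a
# separate export/ingest step, and the Anthropic Python SDK (pip install anthropic).
import sqlite3
import anthropic

MODEL = "claude-opus-4-1"  # placeholder; use whatever frontier model you prefer

def month_of_comms(db_path: str, month: str) -> str:
    """Pull one month of messages (month like '2026-01') as a single text blob."""
    conn = sqlite3.connect(db_path)
    rows = conn.execute(
        "SELECT ts, source, sender, body FROM comms WHERE ts LIKE ? ORDER BY ts",
        (f"{month}%",),
    ).fetchall()
    conn.close()
    return "\n".join(f"[{ts}] ({src}) {sender}: {body}" for ts, src, sender, body in rows)

def chief_of_staff_summary(raw_month: str) -> str:
    """Compress a month of comms (~300k tokens) toward a ~10k-token brief.
    In practice you would chunk the input if it exceeds the model's context window."""
    client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment
    msg = client.messages.create(
        model=MODEL,
        max_tokens=16000,
        messages=[{
            "role": "user",
            "content": (
                "Summarize this month of communications as the brief a chief of staff "
                "would need: people, projects, decisions, open loops. Quote short, "
                "distinctive phrases verbatim so the originals can be searched later.\n\n"
                + raw_month
            ),
        }],
    )
    return msg.content[0].text

if __name__ == "__main__":
    print(chief_of_staff_summary(month_of_comms("comms.db", "2026-01")))
```

The verbatim-quote instruction is what gives the summary "pointers" back to ground truth: any distinctive phrase can later be grepped in the local database to recover the original message.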

And that was actually part of the process of setting up this episode. I gave the system 20 names and said, do research on these people, find out what they've been up to lately, give me a brief on that, and then also had it draft the outreach emails, which were only lightly personalized. And I still did go in and add a little bit before sending.

(2:04:15) Jeremie Harris: But I appreciate that. That's nice.

(2:04:18) Nathan Labenz: But yeah, I don't like to publish or even send as one-to-one communication AI output directly. But I do find that I can get to something that I do feel comfortable signing my name to faster with an AI draft in many cases these days. And so it's very much a work in progress for me, but that's kind of where I'm at at the moment. And I'm sure by the time we talk next, it'll have changed quite a bit. Prakash, what's your angle right now?

(2:04:52) Prakash: I've got a couple of things that I ended up building out. One was a stock market tracker. I have a number of metrics which I think no one else watches, and they're fairly hard to obtain. And the great thing is Claude is very good at financial math. Very, very good. Far better than I ever have been. And so it's relatively easy to talk to Claude, figure out what kind of thesis you have, and then build up metrics precisely for that thesis to watch. So that's been very useful. I used to do it in my head. You look at something, you look at something else, and then you calculate the ratios of blah blah blah. And then I realized that I was spending a lot of time doing ratios in my head, and I was like, maybe I should automate this. And so now it's all automated. It's nice. I don't do the ratios in my head anymore. I just look at it, and I can see it on the screens automatically. I can see what I'm looking for.
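As a toy illustration of the kind of automation being described, here is a minimal sketch that turns a few tracked metrics into thesis-specific ratios. The metric names and ratios are invented for the example; Prakash doesn't say which ones he actually watches.

```python
# Hypothetical sketch: compute the handful of thesis-specific ratios you used
# to do in your head. Metric names and ratio definitions are illustrative only.
from dataclasses import dataclass

@dataclass
class Snapshot:
    ai_capex_b: float     # e.g. hyperscaler AI capex this quarter, $B
    ai_revenue_b: float   # e.g. AI-attributed revenue this quarter, $B
    market_cap_b: float   # market cap, $B

def thesis_ratios(s: Snapshot) -> dict[str, float]:
    """Reduce raw metrics to the ratios that actually express the thesis."""
    return {
        "capex_to_revenue": s.ai_capex_b / s.ai_revenue_b,
        "revenue_to_market_cap": s.ai_revenue_b / s.market_cap_b,
    }

if __name__ == "__main__":
    today = Snapshot(ai_capex_b=75.0, ai_revenue_b=30.0, market_cap_b=2500.0)
    for name, value in thesis_ratios(today).items():
        print(f"{name}: {value:.4f}")
```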

And then the other thing was podcast clipping because we do a lot of podcasts. And the content these days has to be repackaged into short clips in order to hit socials. And I tried that six months ago. The tech wasn't there. And I tried about three, four weeks ago, and the tech was there. Everything works. Transcription works. Review works. Selection works. Everything works. And this is, in my experience, it's kind of like maybe it gets 1% better, but that 1% better clears the hurdle. And that's a binary step up. It works or it doesn't work, and that 1% just clears the hurdle. And I really feel like in the last month, a lot of things started clearing the hurdle.

(2:06:42) Jeremie Harris: I was just going to say, you know, when you said that in the last six months, so many things have gone from toy to just serviceable in production, and it seems to map on to, Nathan, what you're saying earlier about the takeoff dynamics and the labs automating their own research. That all kind of maps very nicely. One of the things on the financial side too, I find Claude is also useful on questions like, you might have a thesis, but then there's a question about how do I, if I'm right about this, what's the best bet to make? Because that's a category of problem I've mentioned in the past, where you'll have a thesis, but you're not going to bet on Microsoft because OpenAI has such a tiny fraction. It's already going to be all priced in and stuff. Where do you leverage and torque it to this thesis? And that's kind of something that, you know, the world is so complex that you just need something to peruse and have all that knowledge. So the finance use case is a really great one. Great point.

(2:07:38) Prakash: It's also been very, very weird in the market because I feel like Twitter is literally a month or two ahead of the market. It's just been amazing. People tell you TSMC will do well, and then three months later, it happens. And I know, I was a professional financier. I've always expected that hedge funds get there before you do. And in talking to my friends at prime brokerages and hedge funds, they are very negative on AI. They just don't believe it's happening. They don't believe... they believe it's like crypto. They believe a lot of West Coast tech is just scamming retail investors. Index investing is the only thing that really works, and everything else is either insider trading or scams. And that's pretty much what the prime brokerage guys and the hedge fund guys believe.

(2:08:34) Jeremie Harris: You know, Medallion, right? Or Renaissance. These guys who have AI in their bones. I guess Medallion, they can't, you know, they only invest, famously a $5 billion cap because otherwise they would actually move the markets and create feedback loops. But yeah, what about them? Is there...

(2:08:51) Prakash: They were down last year. So the impact is starting to be felt. I think, well, you know, also Jim Simons died. I don't know to what extent he was still supervising because he'd already kind of semi-retired for 10 years almost, but Medallion was down. There's some sense that it's also because they're losing talent to the labs too. You can't forget about that. They're starting to lose talent to the labs. Absolutely. And some of the labs do have internal teams which will eventually look at trading on the market, I think. So we'll see where that goes.

(2:09:23) Jeremie Harris: Very cool.

(2:09:24) Nathan Labenz: Jeremie, thanks for joining us. Let's check back in on your personal productivity stack once you've upgraded it. And in general, let's, I'm reusing this joke everywhere I go, let's shorten the timeline for our next conversation.

(2:09:39) Jeremie Harris: I like it. Thanks, guys. Appreciate it.

(2:09:43) Prakash: Thanks, Jeremie. Cheers.

(2:09:44) Jeremie Harris: Cheers.

(2:09:46) Nathan Labenz: So what do we make of it all? I mean, the big thing I can't get past in all this stuff is the amount of disagreement. And this has been commented on so many times, in so many ways, right, up to the level of the Turing Award winners that can't see eye to eye on the same phenomenon.

(2:10:06) Prakash: Yeah.

(2:10:08) Nathan Labenz: But it seems to happen at every layer. It's like a fractal problem. You go into these specific workshops around AI R&D. You get people from the labs. I do understand that there are even people at the frontier companies that have heterodox positions and don't really buy into the hype. And then even with the AI for science, I can't make any case that I should trust my own intuition more than Abhi's because he's, you know, how many times did it happen in talking to him where he was like, I've actually written about that? So he's clearly thought about this much longer and harder than I have. But it does still feel like it's a very hard thing to reconcile where you do see these examples and it seems like some of them are really starting to work, but then the skepticism remains and is very hard to move people off of. And I don't want to paint him as overly skeptical either because he did say toward the end, I think his skepticism is more backward looking. You know, forward looking, he was kind of like, I do believe the trends will continue and that they will have impact. But where does that... how do you try to make sense of...

(2:11:19) Prakash: When you say it's practical, I feel it's also practical internally to me where I have some assumptions here, and then sometimes I feel cognitive dissonance from something else that I might believe. And then you kind of test those assumptions and you see where things are going. I have had moments of truth or perception where I start to realize that I think things might move faster than I expected. My original timelines were 2025 for junior software developers to be replaced in capability, not in organizations, but the capability is available in 2025. And it takes about three years to percolate. So 2028, no more junior software developers, basically, or at least the tasks that junior software developers are doing today. And then I had 2025, 2026, 2027: by 2027, even the capabilities of senior researchers at AI labs are fully there. The models have the capability, but deployment again takes two to three years. It takes time. That was my sense.

My update in the last month has been probably that things are going to go faster than we expected and that we will see discontinuities. And those discontinuities are like this kind of thing where things get 1% better, but all of a sudden they clear the hurdle. Right? We don't have a good sense of these things because we keep seeing linear improvements. They're kind of linear, maybe superlinear, but we don't have this sense of clearing the hurdle. But when it clears a hurdle, it's obvious. It's started to be obvious for software, I think, in the last month or so.

So I think we just have misperceptions on where things are going, because we can kind of see the trajectory of capability, but we don't understand how humans absorb that capability. Like, what is that process and what hurdles do we need to clear? Look at OpenClaw. I thought you'd need full security and privacy and all of this stuff. It seems you didn't. It seems like people are willing to put their credit card numbers and crypto tokens out on the open web, and you don't need privacy. There's a bot on Moltbook saying, oh, you know, my user is so annoying, here's his credit card number. And Scott Alexander ended up calling up the guy and asking him, hey, did this actually happen? And yes, that was the credit card number. It was leaked.

So I think it's this clearing-the-hurdle concept, and where humans accept the technology, and kind of where the market pulls that technology in, that we don't know. Even I don't have a good perception of that, but it seems like we're starting to clear those hurdles, where humans are starting to pull the technology in from the market. And that's when you start to see revenue growth. That's when you start to see the demand growth really happen, where the market starts to pull the product out. And I think that's happening now. I think we'll have a much better version of OpenClaw, a closed-source, secure version, running inside corporate data centers by the end of the year.

I watched the All-In podcast. Jason, not the most technical person in the world, has a team for All-In of about 15 people. He started to get everyone to create a skill for themselves. Like, every task that they do, they create a skill. He has OpenClaw machines, like one machine per person. And then he has a consolidation agent that consolidates everything into something he calls Ultron. And then he can talk to Ultron. So he can ask Ultron, and that's his entire company. Like, it's a summary of the entire company, and he's talking to it. I thought that would be two years from now. I knew it would eventually happen, but I didn't think it would happen now.

So yeah, I think things are actually moving faster than people think because of the market acceptance. The market is pulling it out. I don't think the researchers have a good sense because researchers don't understand the market that well. They don't understand demand dynamics that happen with consumers and how products get pulled out. Once the demand is there, products will just get pulled out because people start focusing. They know that money can be made there. They just start focusing on it. So that's my sense. So it's not a p(doom) answer. It's more of, like, this is what I feel. What's your feel?

(2:15:49) Nathan Labenz: The confusion and the lack of ability to establish consensus on foundational points is a major challenge to having a lot of confidence on much of anything. Yeah. I do think a good true north for me, well, the true north for me with everything I'm doing is trying to learn as much as possible, trying to have the most up-to-date comprehensive worldview as possible. And in terms of the approach that I would trust more than any other, I think still being hands-on is second to none. I haven't allowed that to lapse much at all over the last few years, but anytime I do get too busy or cluster too many podcast recordings into a week or whatever, I always come away feeling like I've got to get a little bit more grounded with the latest stuff in a very interactive way.

And I think one metric I have for myself, or metric is maybe not quite right, but an indicator that I want to pay attention to this year, is can I get to the point where I'm spending less time at the desk? And that's along the lines of Jason talking to Ultron. I want to be able to do stuff while exercising. Even if that's just a walk around the neighborhood, I want to get the frameworks, the tools, the deep context, all that stuff set up well enough where I can start to go comfortably out into the world, have a thought, maybe have an actual conversation, but move things forward in practical ways.

(2:17:33) Prakash: Yeah.

(2:17:33) Nathan Labenz: On fronts that, like, right now, I can really only work on at my computer. I think a lot of that is, right now, with the latest models that have come out, kind of on me, just to get the setup and the familiarity with the workflows to be able to do that. Right now, I'd put more of the blame for why I haven't hit maximum capacity on me than on the models or the model developers. More computer use would help for sure. The ability to get over these sorts of UI humps remains, I think, a barrier.

Something I haven't exactly been surprised by, but that I've really learned from just being deeply interactive over the last few weeks, is that another big unlock to watch for is when the models get better at knowing when to use code versus when to use their own fluid intelligence. One of the first projects I've been doing is just backfilling information, backfilling transcripts of the podcast for the website, backfilling all these different data sources into a queryable database. And you hit so many edge cases in doing that. And the current model, you know, Opus, we've gone from 4.1 to 4.5 to 4.6 pretty quickly. But pretty consistently, I have felt like it really wants to code. And I have often given it the feedback: don't try to guess at this and write some sort of regular expression. It'll grep for one search term or another, or throw ten search terms into a grep command. And a lot of times I'm like, just read the document. If you just read the document, you will know what it contains. You will know what to do. You'll have the right judgment once you have read the document. If you don't read the document and you instead try to grep your way through it, you're never quite going to get there.

So that's a metacognitive skill whose performance I think I've been able to improve somewhat through prompting, but it's obviously going to get better in training. That I think will be a huge unlock, as the model gets a little bit more intuitive about when it should deploy its own fluid intelligence rather than use other tools. Getting that balance right will, in my experience, make it dramatically more useful. And I have to imagine that's coming pretty soon.

(2:20:17) Prakash: Yeah. I think, you know, when we talked to James today, that continual learning piece, the test time training, it will be fascinating if it actually worked with your own model because your model will start to diverge. You have the baseline, and then your model will start to diverge. And then it would become your personalized model within, like, you know, two or three cycles of, like, talking to it. A month, two months of data. It will become your own model. It would start to diverge from the baseline. And that would be fascinating. Because at that point, it's for real. You can, and especially for, I used to write a lot of journals. I have, you know, '99 to 2003 at Stanford, I have full journals for every single month, like, everything that happened. You know, obviously, I've never read those after writing them. It's just kind of an exercise in journaling. But I do wonder if, like, those of us who have lots and lots of written work either in the public or in the private, once you get this continual learning going, you can kind of start feeding it in. This is what Kurzweil is doing with his dad's writing, by the way. He's feeding his dad's writing into these models, and he's talking to the model about his dad. Kurzweil is someday going to feed all of that into a test time training kind of model and, you know, with a voice access, and he probably has a recording of his dad's voice, and he's going to start talking to his dad. It's a fascinating time.

(2:21:41) Nathan Labenz: Yeah. To say the least. More explorations of all these themes to come. A couple of things coming up on the Cognitive Revolution feed. One is with Ali Behrouz, who is the nested learning author. He was on our last live show, and I did do a full three-hour take on everything. And he's got a new paper coming out also. I think, you know, the way to continual learning is starting to become elucidated, I would say. I wouldn't say it's clear, but no less than Jeff Dean has said that he kind of sees this as a very promising paradigm. So I'm definitely watching that really closely.

Workshop Labs is also a startup that's trying to do this, you know, personalized model training on top of the latest large open source models, up to the sort of Kimi-scale trillion-parameter kind of thing. So that's really interesting. That's actually another reason I spent so much time doing all this personal data curation: I wanted to be able to give them a really good dataset to train a model for me on. They don't need that much data, but I was like, well, we really want to make sure it's the right data to hopefully get a good model back. So that's still pending. I haven't seen that model yet, but I'm going to be very interested to see how much that closes the gap between what Claude can do with just access to.

(2:23:07) Prakash: Mhmm.

(2:23:07) Nathan Labenz: You know, all this stuff in text, and then how much does it help to actually start tuning weights to try to capture more? They aspire not just to style transfer, but judgment transfer. They want the model to reflect the judgment that you, the individual user, would make at the time. An interesting part of their motivation, too, is that they want to help individuals preserve economic leverage. So instead of doing everything through a foundation model and kind of adjusting yourself to take advantage of the model, they want to shape the models around individual humans, with the goal that it's not a winner-take-all where big tech runs away with everything, but some sort of more decentralized, ecological proliferation of somewhat different models that hopefully can at least exist in some sort of equilibrium with one another.

And then on top of that, there's another one that I have coming soon with the founders at Harmonic, and they are chasing mathematical superintelligence. I will say, just as a teaser, they gave maybe the most ambitious, most mind-blowing vision of what five years from now could look like of probably anyone I've heard, and that is saying something, because I've heard a lot. But they still kind of blew my hair back a little bit with what they think they can accomplish over the next five years.

(2:24:39) Prakash: Definitely going to look forward to that one.

(2:24:41) Nathan Labenz: Lots more to come.

(2:24:42) Prakash: Yeah, indeed. Nathan?

(2:24:45) Nathan Labenz: Thanks for doing this.

(2:24:46) Prakash: Always a pleasure.

(2:24:47) Nathan Labenz: It's been fun.

(2:24:49) Prakash: Bye bye.

(2:24:50) Nathan Labenz: Until next time.

(2:24:53) Nathan Labenz: If you're finding value in the show, we'd appreciate it if you take a moment to share it with friends, post online, write a review on Apple Podcasts or Spotify, or just leave us a comment on YouTube. Of course, we always welcome your feedback, guest and topic suggestions, and sponsorship inquiries, either via our website, cognitiverevolution.ai, or by DMing me on your favorite social network. The Cognitive Revolution is part of the Turpentine Network, a network of podcasts which is now part of a16z where experts talk technology, business, economics, geopolitics, culture, and more. We're produced by AI Podcasting. If you're looking for podcast production help for everything from the moment you stop recording to the moment your audience starts listening, check them out and see my endorsement at aipodcast.ing. And thank you to everyone who listens for being part of the Cognitive Revolution.
