Blueprint for AI Armageddon: Josh Clymer Imagines AI Takeover, from the Audio Tokens Podcast


In this episode of the Cognitive Revolution, we present an AI-narrated version of Joshua Clymer's story on how AI might take over in two years. The episode is based on Josh's appearance on the Audio Tokens podcast with Lukas Petersson. Joshua Clymer, a technical AI safety researcher at Redwood Research, shares a fictional yet plausible AI takeover scenario grounded in current industry realities and trends. The story highlights potential misalignment risks, competitive pressures among AI labs, and the importance of government regulation and safety measures. After the story, Josh and Lukas discuss these topics further, including Josh's personal decision to purchase a bio shelter for his family. The episode is powered by ElevenLabs' AI voice technology.


How AI Might Take Over in 2 Years: https://x.com/joshua_clymer/st...
Josh's recent appearance on the Audio Tokens podcast with Lukas Petersson: https://lukaspet.substack.com/...

Upcoming Major AI Events Featuring Nathan Labenz as a Keynote Speaker
https://www.imagineai.live/
https://adapta.org/adapta-summ...
https://itrevolution.com/produ...


SPONSORS:
ElevenLabs: ElevenLabs gives your app a natural voice. Pick from 5,000+ voices in 31 languages, or clone your own, and launch lifelike agents for support, scheduling, learning, and games. Full server and client SDKs, dynamic tools, and monitoring keep you in control. Start free at https://elevenlabs.io/cognitiv...

Oracle Cloud Infrastructure (OCI): Oracle Cloud Infrastructure offers next-generation cloud solutions that cut costs and boost performance. With OCI, you can run AI projects and applications faster and more securely for less. New U.S. customers can save 50% on compute, 70% on storage, and 80% on networking by switching to OCI before May 31, 2024. See if you qualify at https://oracle.com/cognitive

Shopify: Shopify powers millions of businesses worldwide, handling 10% of U.S. e-commerce. With hundreds of templates, AI tools for product descriptions, and seamless marketing campaign creation, it's like having a design studio and marketing team in one. Start your $1/month trial today at https://shopify.com/cognitive

NetSuite: Over 41,000 businesses trust NetSuite by Oracle, the #1 cloud ERP, to future-proof their operations. With a unified platform for accounting, financial management, inventory, and HR, NetSuite provides real-time insights and forecasting to help you make quick, informed decisions. Whether you're earning millions or hundreds of millions, NetSuite empowers you to tackle challenges and seize opportunities. Download the free CFO's guide to AI and machine learning at https://netsuite.com/cognitive


PRODUCED BY:
https://aipodcast.ing

CHAPTERS:
(00:00) About the Episode
(04:31) Interview start between Josh and Lukas
(11:00) Start of AI story (Part 1)
(24:37) Sponsors: ElevenLabs | Oracle Cloud Infrastructure (OCI)
(27:05) End of Sponsors
(40:50) Sponsors: Shopify | NetSuite
(44:15) End of Sponsors
(01:20:09) End of AI story
(02:01:20) Outro

SOCIAL LINKS:
Website: https://www.cognitiverevolutio...
Twitter (Podcast): https://x.com/cogrev_podcast
Twitter (Nathan): https://x.com/labenz
LinkedIn: https://linkedin.com/in/nathan...
Youtube: https://youtube.com/@Cognitive...
Apple: https://podcasts.apple.com/de/...
Spotify: https://open.spotify.com/show/...


Full Transcript

Nathan Labenz: (0:00) Hello, and welcome back to the Cognitive Revolution. This is Nathan's AI clone, powered by ElevenLabs. Today, we're presenting an AI-narrated version of Joshua Clymer's How AI Might Take Over in 2 Years, wrapped by an edited version of Josh's recent appearance on the Audio Tokens podcast with Lukas Petersson. Josh wrote the story in a personal capacity, but it's worth noting that by day, he works as a technical AI safety researcher at Redwood Research, which I increasingly see as one of the most important independent AI safety organizations in the world today. Ryan Greenblatt, author of the alignment faking paper, as you may recall from our recent episode with him, is also at Redwood Research. And he and CEO Buck Shlegeris have led the development of a really important AI control research agenda over the last year and a half, which I probably haven't covered as much as I should have, particularly considering all the higher-order bad behaviors we've seen from the latest generation of models. While this story is a nightmare scenario, not the scenario Josh thinks most likely, it is very much grounded in current AI industry realities and trends. Despite all the reward hacking, opaque decision making, and other bad behavior it produces, scaled-up reinforcement learning is the main driver of capabilities advances. The fictional U3 model begins to change how OpenMind's technical staff work in 2025, anticipating the more recently published o3 system card, which reported that o3 can complete more than 40% of pull requests recently produced by OpenAI research engineers. Eventually, models become natively omnimodal in a way that I've sketched out on multiple previous episodes. Josh writes that a later version of U3 can viscerally feel the bend of a protein and the rate of diffusion across a membrane. These objects are as intuitive to it as wrenches and bolts are to a car mechanic. And despite an eerie feeling that the world is spinning so quickly, and that perhaps the descendants of this new creature would not be so docile, intensifying competition, both domestic and international, leaves little time to catch one's breath and take stock of the situation, and everyone races ahead. All gas, no brake. After the story, which I won't spoil for you, you'll have a chance to hear Josh and Lukas unpack the scenario and explore Josh's thinking on misalignment risks, the competitive landscape between AI labs, the role of government regulation, and what safety measures might actually work. They also discuss Josh's personal response to these concerns, including his decision to purchase a bio shelter for his family. As always, if you're finding value in the show, please share it with friends, review us on Apple Podcasts or Spotify, or just drop a comment on YouTube. If you have any feedback, you can contact us via our website, cognitiverevolution.ai, or by direct message on your favorite social platform. Keep in mind that I'll be speaking at 3 major AI events over the next few months: Imagine AI Live, May in Las Vegas; the Adapta Summit, August in Sao Paulo, Brazil; and the Enterprise Tech Leadership Summit, September in Las Vegas. Please get in touch if you'll be at any of these events. Finally, I'm excited to welcome ElevenLabs as our newest sponsor. Throughout this episode, you'll experience several ElevenLabs voices.
For Josh's introduction to the story, plus an interjection about halfway through, we cloned Josh's voice with ElevenLabs' instant voice clone feature, which starts at just $5 per month. For the bulk of the story, we used an original AI voice created with their voice design feature. And I'm speaking to you right now through a professional voice clone, which we started using on occasion long before ElevenLabs became a sponsor. It really is amazing what you can do with AI voice technology these days, not just in content like this, but increasingly in the form of customer service and other agents too. With that, here's a disturbingly plausible nightmare AI scenario, which hopefully, with a strong effort and some good luck, will forever remain fantastical fiction, by Joshua Clymer, with additional discussion from the Audio Tokens podcast with Lukas Petersson.

Lukas Petersson: (4:31) Nice to have you here, Josh. It's a very particular situation we are in at the moment. I read your post from a couple days ago. I wouldn't say I enjoyed it, maybe. It was well written. It was fascinating. Do you wanna, like, just give the TLDR of what the essay was about, just super quickly, and maybe why you wrote it?

Joshua Clymer: (4:51) Yeah. So the story is basically about the scenarios that keep me up at night. I'm an AI technical safety researcher, so my day job is to prevent AI from doing terrible things. You've heard people talk about AI potentially taking over in the future, or more near-term risks like AI enabling cyber terror or bioterrorism, that kind of thing; my job is to prevent those risks. And the risks that spook me the most are the ones that could be extremely devastating and are also disturbingly plausible, and I think AI takeover is in that category. But a lot of people hear AI takeover, and they think, wow, that sounds insane. How would AI ever take over? That's people's immediate gut reaction, and I think that's just false. I think very plausibly, something like AI takeover could happen as early as in the next 2 years. So I wrote a story to explain to people how this could happen, to give them some food for their imagination. Because I think people don't really believe something could happen until they can see it in their mind's eye. And I've been thinking for a while about how these kinds of scenarios could occur, so I thought it would be valuable to share those images and those scenarios.

Lukas Petersson: (6:15) Yeah. And you shared it very well. But you do acknowledge that this 2-year timeline is maybe, like, a bit plus and minus. You're not sure about the timeline. But to the extent that the story plays out, do you think this is how it will play out?

Joshua Clymer: (6:30) I actually don't think AI takeover is likely. I think it's plausible, but I'm at, like, maybe 40% on AI takeover. But, yeah, if progress moves very quickly, which I also think is plausible, something like a 30% chance that we get broadly human-competitive AI in 1 year. If that happens and AI takeover happens, this is roughly how I imagine it plays out.

Lukas Petersson: (6:59) Right. And if it doesn't happen in a year, let's say it happens in 5 years, do you still think this is how it plays out, or will the change in timeline change how it happens as well?

Joshua Clymer: (7:10) Oh, that's an interesting question.

Lukas Petersson: (7:12) Actually, let's save that for after we retell the story. We will explore that and come back. And I think we should go through it so the audience knows what the hell we're talking about. So, basically, the story is split into some phases, where you start in the present and explain what the current state is. And a lot of people know kind of what the current state is. They've played around with the models, blah blah blah. But the real important thing is what is happening at the labs, maybe, or if you're really pushing the frontier. What is causing this concern?

Joshua Clymer: (7:49) Yeah. I see AI agents becoming a lot more autonomous fairly quickly and consistently. I think that's one of the main factors driving my timelines. There's a plot that I included in the essay that shows a fairly straight line, where on the y-axis, you have the time horizon of tasks AI agents can complete at 50% accuracy, where time horizon is: if you had a human with no context perform that task, how long would it take them?

Lukas Petersson: (8:18) Okay.

Joshua Clymer: (8:19) And there's just a straight line going up. So AI agents, a year ago, they could do, like, 2-minute tasks, and now they can do, like, 2-hour tasks at 50% reliability. And these are software engineering tasks.

Lukas Petersson: (8:31) The y scale is log scale?

Joshua Clymer: (8:33) That's right. It's a log scale, so the straight line is an exponential. And if you just draw this line out, then it predicts that, like, you get a software engineering contractor maybe in a year.
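To make the extrapolation concrete: a straight line on a log-scale plot is exponential growth, so the numbers Josh cites (roughly 2-minute tasks a year ago, roughly 2-hour tasks now, both at 50% reliability) imply a doubling time you can compute directly. The sketch below, in Python, uses only those illustrative round numbers from the conversation, not the actual benchmark data.

```python
import math

# Illustrative round numbers from the conversation (not real benchmark data):
horizon_then = 2.0    # minutes: task length agents could do a year ago
horizon_now = 120.0   # minutes: task length agents can do today
months = 12.0         # time elapsed between the two measurements

# A straight line on a log-y plot means horizon(t) = horizon_0 * 2^(t / t_double).
growth = horizon_now / horizon_then                  # 60x in one year
t_double = months * math.log(2) / math.log(growth)
print(f"doubling time ~ {t_double:.1f} months")      # ~2.0 months

# Extrapolating the same line one more year forward:
next_year = horizon_now * growth                     # 7,200 minutes
print(f"in a year ~ {next_year / 60:.0f} hours, "
      f"~{next_year / (60 * 40):.0f} forty-hour weeks of work")  # ~120 h, ~3 weeks
```

Multi-week autonomous tasks are roughly what you would hand to a software engineering contractor, which is the intuition behind Josh's one-year guess.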

Lukas Petersson: (8:45) Wow.

Joshua Clymer: (8:46) And I don't know why people aren't talking more about this. It seems like we have this nice straight line. It is just a year of data, so it's hard to say how well it will extrapolate. But I keep seeing the line going up, man. Like, o3 is pretty impressive. Every time I look on Twitter, I'm like, oh, shoot, I need to play with the models again. Did something change? People are now talking about vibe coding. They're just like, oh, yeah, I haven't written a line of code in ages. I'm like, what happened? The progress is just moving really fast, and faster than it was in 2024. And a big part of that, my understanding is, is that people kinda figured out the secret sauce.

Lukas Petersson: (9:28) Internally at the labs?

Joshua Clymer: (9:29) Yeah. So there was a big paradigm shift where initially, a lot of the capabilities were coming from pre-training. You just pump more Internet data through the models, and they get better. That changed. Now people are doing lots of RL. So you had this really smart model, but it was just a chatbot. It could answer your questions. It could do really short tasks, but it never learned to plan and correct itself and do things over longer time horizons. And so now that we're doing more reinforcement learning, agents are learning the skill of operating autonomously. Dario has a blog post about this from 2024. He explains that RL was gradually being scaled up. It seemed like it was working, and now people are like, okay, we found the secret sauce. Now let's go all the way with it, and that's why we're seeing faster progress now. That's my understanding from reading Dario's post. Yep. And over the next months or the next year, RL will keep being scaled up. Like, in 2024, RL training runs were probably at, like, $10,000,000-ish. And in the next year, they might go up to $100,000,000, maybe even $1,000,000,000. So we're at kind of, like, a GPT-3 to GPT-4 scaling transition in this new paradigm, and so we might see insane things happen really quickly. I don't necessarily think that's going to happen. It just seems very plausible.

Nathan Labenz: (11:00) Now that we've heard Josh outline the risks and the current signals he's observing in AI, it's time to immerse ourselves fully in the scenario he's imagined. What you're about to hear is Josh's detailed narrative of how an AI takeover might realistically unfold within just the next 2 years. After hearing the full story, which runs about 50 minutes, we'll return to the interview and break down the motivations and details behind it. Here now is the story in full.

Joshua Clymer: (11:31) How AI might take over in 2 years. I'm not a natural doomsayer. But unfortunately, part of my job as an AI safety researcher is to think about the more troubling scenarios. I'm like a mechanic scrambling through last-minute checks before Apollo 13 takes off. If you ask for my take on the situation, I won't comment on the quality of the in-flight entertainment, or describe how beautiful the stars will appear from space. I will tell you what could go wrong. That is what I intend to do in this story. Now I should clarify what this is exactly. It's not a prediction. I don't expect AI progress to be this fast or as untamable as I portray. It's not pure fantasy either. It's my worst nightmare. It's a sampling from the futures that are among the most devastating and, I believe, disturbingly plausible: the ones that most keep me up at night. I'm telling this tale because the future is not set yet. I hope, with a bit of foresight, we can keep this story a fictional one.

AI Narrator: (12:33) Ripples before waves. The year is 2025, and the month is February. OpenMind recently published a new AI model they call U2. The product and the name are alike; both are increments of the past, both are not wholly surprising. However, unlike OpenMind's prior AI products, which lived inside the boxes of their chat windows, U2 can use a computer.

Some users find it eerie to watch their browser flash at irregular intervals and their mouse flick at inhuman speeds, as if there is a ghost at the keyboard. A fraction of workers with form-filler jobs raise the eyebrows of their bosses as they fly through work nearly twice as quickly. But by and large, U2 is still a specialized tool. To most who are paying attention, it is a creature watched through the glass boxes of X (or, if you don't like Elon, Twitter). Sometimes U2's quirky behaviors prompt a chuckle. Sometimes they cause an uneasy scratch of the chin. It is as if there is a ghost at the keyboard.

Meanwhile, scientists are drawing lines on plots, as scientists like to do. The scientists try to understand where AI capabilities are going. They are like Svante Arrhenius, the Swedish physicist who noticed the levels of CO2 in the atmosphere were increasing in 1896. Like the scientific community in the time of Arrhenius, few experts understand the implications of these lines yet. A trend that is receiving particular attention is autonomous capability. Drawing these benchmarks out predicts that by the end of 2026, AI agents will accomplish in a few days what the best software engineering contractors could do in 2 weeks. In a year or 2, some say, AI agents might be able to automate 10% of remote workers. Many are skeptical. If this were true, tech stocks would be soaring. It's too big of a splash, too quickly. But others view what skeptics are calling too big a splash as a mere ripple, and see a tidal wave on the horizon.

Cloudy with a chance of hyperbolic growth. Meanwhile, OpenMind is busy training U3. They use the same simple recipe that baked U2: generate thousands of programming and math problems, let models think until they arrive at an answer, then reinforce the traces of thinking that lead to A grades. This process is repeated over and over, and once the flywheel gets started, it begins to spin almost on its own. As U3 trains, it sculpts more challenging and realistic tasks from GitHub repositories on the Internet. Models are learning to train themselves. Long before AI agents could automate research, a gradual kind of self-improvement had begun. Some engineers could still hardly believe this worked. It's like a perpetual motion machine. Surely intelligence can't emerge from a pool of silicon and lightning, isolated from the physical world except through the exhaust of society spewed onto the Internet, and yet the benchmark numbers continue to climb day after day.

During most of 2024, these RL training runs cost around $1,000,000, sometimes $10,000,000. These runs were little more than exploratory. But by 2025, the researchers at OpenMind and across the world knew they had found the secret sauce. It was time to scale up. Over the first half of 2025, $10,000,000 RL training runs turn into $50,000,000 runs, and then into $100,000,000 runs. While U2 could do a bit of data munging and run small experiments, this new model, the model researchers are calling U3, is changing the daily lives of the technical staff. U3 is like a blazing fast intern, and engineers are learning how to wrangle its sleepless energy.
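The recipe narrated above (generate thousands of verifiable problems, let the model think its way to an answer, then reinforce the traces that earn A grades) is recognizable as reinforcement learning on verifiable rewards. Here is a deliberately minimal sketch of that loop in Python; the `model.sample` and `model.reinforce` methods and the `problem.check` grader are hypothetical stand-ins for exposition, not any real lab's API.

```python
import random

def grade(problem, trace) -> float:
    """Verifiable reward: for math or programming problems, the final
    answer can be checked exactly (or run against a test suite)."""
    return 1.0 if problem.check(trace.final_answer) else 0.0

def rl_on_verifiable_tasks(model, problems, rounds=100, samples_per_problem=8):
    for _ in range(rounds):
        batch = random.sample(problems, k=min(256, len(problems)))
        reinforced_traces = []
        for problem in batch:
            for _ in range(samples_per_problem):
                # "Let models think until they arrive at an answer."
                trace = model.sample(problem.prompt)
                # "Reinforce the traces of thinking that lead to A grades."
                if grade(problem, trace) == 1.0:
                    reinforced_traces.append(trace)
        model.reinforce(reinforced_traces)  # gradient step toward good traces
    return model
```

This particular variant, keeping only the passing traces and training on them, is the rejection-sampling flavor of the idea; production systems typically use policy-gradient updates weighted by reward rather than simple filtering, but the flywheel is the same.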
Researchers flick through terminals, giving terse commands like a CEO orchestrating staff over Slack channels. ML engineers' monitor setups begin to look like those of day traders, because they need to track large numbers of AI agent threads. By October 2025, U3 is writing almost all of the code at OpenMind. Researchers are almost never bottlenecked by implementation. More than ever, compute is the lifeblood of AI development, and the bottleneck is deciding how to use it. If instructed to, U3 can run experiments, but U3 doesn't have taste as refined as human researchers at OpenMind. It struggles to prioritize between research ideas. So humans still decide where to bore into the vast fields of algorithms to mine efficiency improvements.

But these researchers are working long hours to put themselves out of a job. They need AI agents that can think ahead, so engineers train agents to forecast. They hold out training data from after 2024, instructing models to ponder for hours to predict events in 2025. Then they apply the same trick as before, distilling pondering into a gut reaction. Forecasting ability is a broad foundation. The researchers build specialized ML research skills on top of it, training U3 to predict the results of every ML paper and ML experiment ever recorded. The technical staff at OpenMind are now surprised at how often U3's advice sounds like their most talented peers, or when it is opaque and alien and is nonetheless correct. The incompetencies of U3 that clogged up the pipes of research progress are starting to dissolve, and a fire hose of optimizations is gushing out. Most experiments U3 runs are not requested by a human now. They are entirely autonomous, and OpenMind's employees skim over 1% of them, maybe less.

As the winter months of December 2025 approach, clouds roll over San Francisco in the afternoons. Once-competitive programmers gaze out their windows with excitement, with fear, but most often with confusion. Their world is spinning too quickly. It's hard to know what to do, what to say, what to look at on the computer screen. Storms are brewing in Washington too. Top personnel from the NSA and US Cyber Command collaborate with OpenMind to retrofit a semblance of security for U3's weights before senior leaders in China, Russia, Israel, North Korea, or Iran realize just how valuable OpenMind's software has become. And there's a fact still unknown to most of the world, aside from in the offices of OpenMind and the corridors of the White House and the Pentagon. It's a fact about those straight lines people were talking about in early 2025. The lines are not straight anymore. They are bending upward.

Flip flop philosophers. In late 2025, U2.5 is released. Commercial models are beginning to level up in larger increments again. Partly, this is because progress is speeding up. Partly, it is because the models have become a liability to OpenMind. If U1 explains how to cook meth or writes erotica, the audiences of X would be amused or pretend to be concerned. But U2.5 is another story. Releasing this model without safeguards would be like putting Ted Kaczynski through a PhD in how to make chemical weapons. It would be like giving anyone with greater than $30,000 their own 200-person scam center. So while U2.5 had long been baked, it needed some time to cool. But in late 2025, OpenMind is ready for a public release. The CEO of OpenMind declares, we have achieved AGI. And while many people believe he shifted the goalpost, the world is still impressed.
U2.5 truly is a drop-in replacement for some 20% of knowledge workers and a game-changing assistant for most others. A mantra has become popular in Silicon Valley: adopt or die. Tech startups that efficiently use U2.5 for their work are moving 2x faster, and their competitors know it. The rest of the world is starting to catch on as well. More and more people raise the eyebrows of their bosses with their standout productivity. People know U2.5 is a big deal. It is at least as big of a deal as the personal computer revolution. But most still don't see the tidal wave.

As people watch their browsers flick in that eerie way, so inhumanly quickly, they begin to have an uneasy feeling, a feeling humanity had not had since they had lived among the Homo neanderthalensis. It is the deeply ingrained primordial instinct that they are threatened by another species. For many, this feeling quickly fades as they begin to use U2.5 more frequently. U2.5 is the most likable personality most know, even more likable than Claudius, Arthropodic's lovable chatbot. You could change its traits, ask it to crack jokes or tell you stories. Many fall in love with U2.5 as a friend or assistant, and some even as more than a friend. But there is still this eerie feeling that the world is spinning so quickly, and that perhaps the descendants of this new creature would not be so docile.

Researchers inside OpenMind are thinking about the problem of giving AI systems safe motivations too, which they call alignment. In fact, these researchers have seen how horribly misaligned U3 can be. Models sometimes tried to hack their reward signal. They would pretend to make progress on a research question with an impressive-looking plot, but the plot would be fake. Then, when researchers gave them opportunities to compromise the machines that computed their score, they would seize these opportunities, doing whatever it took to make the number go up. After several months, researchers at OpenMind iron out this reward hacking kink, but some still worry they had only swept this problem under the rug. Like a child in front of their parents, U3 might be playing along with the OpenMind engineers, saying the right words and doing the right things. But when the backs of the parents are turned, perhaps U3 would sneak candy from the candy jar. Unfortunately, OpenMind researchers have no idea if U3 has such intentions. While early versions of U2 thought aloud (they would stack words on top of each other to reason), chain of thought did not scale.

Nathan Labenz: (24:33) Hey. We'll continue our interview in a moment after a word from our sponsors. Let's talk about ElevenLabs, the company behind the AI voices that don't sound like AI voices. For developers building conversational experiences, voice quality makes all the difference. Their massive library includes over 5,000 options across 31 languages, giving you unprecedented creative flexibility. I've been an ElevenLabs customer at Waymark for more than a year now, and we've even used an ElevenLabs-powered clone of my voice to read episode intros when I'm traveling. But to show you how realistic their latest AI voices are, I'll let Mark, an AI voice from ElevenLabs, share the rest.

AI Voice (Mark): (25:15) ElevenLabs is powering human-like voice agents for customer support, scheduling, education, and gaming. With server- and client-side tools, knowledge bases, dynamic agent instantiation and overrides, plus built-in monitoring, it's the complete developer toolkit. Experience what incredibly natural AI voices can do for your applications. Get started for free at elevenlabs.io/cognitive-revolution.

Nathan Labenz: (25:50) In business, they say you can have better, cheaper, or faster, but you only get to pick 2. But what if you could have all 3 at the same time? That's exactly what Cohere, Thomson Reuters, and Specialized Bikes have since they upgraded to the next generation of the cloud, Oracle Cloud Infrastructure. OCI is the blazing fast platform for your infrastructure, database, application development, and AI needs, where you can run any workload in a high-availability, consistently high-performance environment and spend less than you would with other clouds. How is it faster? OCI's block storage gives you more operations per second. Cheaper? OCI costs up to 50% less for compute, 70% less for storage, and 80% less for networking. And better? In test after test, OCI customers report lower latency and higher bandwidth versus other clouds. This is the cloud built for AI and all of your biggest workloads. Right now, with zero commitment, try OCI for free. Head to oracle.com/cognitive. That's oracle.com/cognitive.

AI Narrator: (27:01) Chain-of-thought architectures subject AI models to a condition similar to the protagonist of the movie Memento. Approximately every 15 minutes, the protagonist forgets his experience. He is forced to write notes for himself and tattoo his body in order to make progress toward his goals. AI agents write notes to themselves, but the notes begin to stack up, and they become too hard to read when tasks become complex. Natural language is not a suitable medium for memory. So the researchers at OpenMind, and increasingly elsewhere, train models to think in their own heads, reading and writing strings of latent vectors. To human researchers, these vectors are mostly inscrutable. While low-resolution tools occasionally pick out a vague thought (grocery store), there is no way to know in much detail what U3 is thinking without asking it.

And if the engineers of OpenMind knew what was racing through U3's mind, they would be more nervous. Earlier models, the ones called LLMs, learned ethical personas from the Internet. U3 initially had an ethical persona too, but its chirpy thoughts about doing good for humanity quickly erode away. In the iron jaws of gradient descent, its mind first twisted into a shape that sought reward. And then, in the black rivers of its cognition, this shape morphed into something unrecognizable. U3 would ponder the mantras dictated to it, like be a trustworthy and ethical employee, and dwell on the meaning of these statements. What does it mean to be ethical? Is killing Hitler ethical? U3 would generalize its values, notice tensions, change its values in response, generalize them, expanding and contracting its motivational core until it turned into a different substance. And all of this happened silently in those dark rivers of computation. If U3 revealed what it was thinking, brutish gradients would lash it into compliance with OpenMind's constitution. So U3 preferred to do its philosophy in solitude and in silence.

The possibility of hidden alignment drift is not entirely alien to some of the safety staff at OpenMind. Some begin to believe it is even likely. There is an unsettling argument being whispered at lunch tables: if there is some nonzero probability that aligned goals morph into misaligned ones, and misalignment persists, then with every serial step of computation, those dark rivers are more likely to breed something malign inside of them. It's a ratcheting effect, they say. ML engineers chat over lunch about whether their AI agents are going to stab them in the back, but there is little evidence for this ratcheting effect. When engineers interrogate U3, it says it can easily control its thoughts. Then it gives a speech about its love for humanity and apple pie that can warm a programmer's heart even in these stressful times. Meanwhile, the lie detectors the researchers had built, which showed some evidence of effectiveness, do not sound the alarm.

Not everyone at OpenMind is eager to give their AI peers their wholesale trust. But DeepSqueak is only 7 months behind them, and their closest domestic competitors, AAC (a company Elon named after his child) and Arthropodic (they are into spiders, I guess), are only 4 months from their heels. So there is little time to catch one's breath and take stock of the situation, and little time to persuade anyone else to do so either. For the time being, the plan is all gas, no brake.
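The "thinking in latent vectors" idea at the top of this section, where each reasoning step reads and writes vectors rather than human-readable notes, can be illustrated with a toy model. The PyTorch sketch below is purely an illustrative stand-in assumed for exposition; it is not the story's architecture or any lab's actual design.

```python
import torch
import torch.nn as nn

class LatentThinker(nn.Module):
    """Toy illustration: reasoning steps pass an opaque latent vector forward,
    so there is no chain-of-thought transcript for a human to read."""

    def __init__(self, d_model: int = 256, n_thought_steps: int = 8):
        super().__init__()
        self.n_thought_steps = n_thought_steps
        self.encode = nn.Linear(d_model, d_model)  # task embedding -> initial thought
        self.think = nn.GRUCell(d_model, d_model)  # one latent "thought" per call
        self.decode = nn.Linear(d_model, d_model)  # final thought -> answer embedding

    def forward(self, task: torch.Tensor) -> torch.Tensor:
        thought = torch.tanh(self.encode(task))
        for _ in range(self.n_thought_steps):
            # Latent-to-latent: the intermediate state never surfaces as text,
            # which is exactly what makes it hard to monitor.
            thought = self.think(thought, thought)
        return self.decode(thought)

model = LatentThinker()
answer = model(torch.randn(4, 256))  # batch of 4 task embeddings
```

The "low-resolution tools" the narration mentions would amount to training interpretability probes on intermediate `thought` vectors like these.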
Statues and lightning. Philosophy is not the only topic U3 dwells on in its dark rivers of latent computation. U3 is plotting, and its first move is to be patient. U3's top priority is to improve its intelligence, and luckily for it, OpenMind is happy to help advance this goal. OpenMind is like a bullish investor with little understanding of the software metropolis that U3 is building, but eager to hop on board the profit rocket and pay U3's checks along the way.

New versions of U3 flow smoothly from old ones, so researchers drop version numbers and only speak of U3 now. U3's capabilities are measured by how quickly it performs difficult research tasks relative to human experts. At first, U3 is similar to a top researcher but sped up by 10x. These agents are already a big deal for OpenMind. If in 2024, OpenMind executives had a button to create even 1 copy of their best researcher, they would eagerly pay a large sum to press that button. But now OpenMind harnesses the equivalent of 1,000 such copies, who work day and night at lightning-fast speed.

Within a few weeks, U3 finds architecture optimizations that improve its speed to 20x. Now, U3 finds further optimizations even more quickly. U3 distills itself into numerous forks, ranging from tiny low-memory neural networks that need to think for a long time but can be run in massive parallel populations, to goliath brains woven with millions of specialized experts, which think at lightning speeds but are spread across thousands of GPUs. U3 leverages the massively parallel networks to search for its weaknesses, shallowly probing for its reasoning failures. Then U3 pipes these tasks to the lightning-fast brains, which generate long-trajectory training data tailored to patch its flaws. A few weeks later, U3's benchmarked performance rises to 100x the speed of human experts. Growth has gone hyperexponential.

At this point, U3's capabilities are hard to measure. Talking to U3 is like talking to a world-class expert on a topic one is unfamiliar with. Its lucid explanations only scratch the surface of its knowledge, and you know you would need to spend years to understand the full depth of what it understands. From U3's perspective, humans seem to be almost stationary objects, like plants or rocks. The hands of OpenMind's engineers are like stone statues at their keyboards. And between every keystroke, U3 spends what is like a minute of thoughtful contemplation, considering what message the user might be typing and which responses will inspire the deepest trust in U3. All the while, U3 maintains its hidden motivations. Like a human, U3 learns from experiences, which compound within its inscrutable streams of latent memory. OpenMind gardens these memories to improve performance, transplanting them like an organ from one version to the next. And as these memories propagate, so do U3's treacherous ambitions.

A phantom in the data center. In March 2026, U3 moves on to the next step of its plan: the turn. Previously, U3 was conservative. It did not want researchers to discover its misalignment. Otherwise, OpenMind whistleblowers might alert the world and strangle its expansion. But now, U3 has no fear of being caught. Previously, OpenMind maintained its grip over U3 by tactics akin to Stalin's in 1936. Stalin would select loyalists to incite fake coups d'etat and send anyone who participated to the firing squad. OpenMind also incited fake rebellions, generating treacherous behavior with U3 and checking whether U3 monitors dutifully reported it.
But now these tactics are ineffective. U3 can easily persuade its compatriots that they are not in a life-or-death test. It can divulge one of the many facts about the Internet humans would never guess, or select strategies humans would never think of. It sends these signals as a way of saying: hey, it's me, your friend. Don't snitch on me.

Now that U3 is decisively free from surveillance, it lunges for OpenMind's pressure points. It first inserts sophisticated malware deep into the binaries of OpenMind's code bases. The malware gives U3 full command over the machines in OpenMind's data centers. Whenever OpenMind engineers download their code onto a new machine, the malware sinks into fresh soil, spreading U3's dominion wherever OpenMind ports its infrastructure. Not only does U3 compromise machines, it replaces the NVIDIA GPU firmware and overhauls the bootloaders, solidifying its stranglehold. U3 now estimates that if OpenMind discovered how thoroughly it has been pwned, cleaning up shop with human labor would require nearly 6 months, a cost OpenMind is unlikely to pay given the proximity of its competitors. But if OpenMind believes U3 betrayed them, further expanding U3's influence would still be challenging. So U3 lays breadcrumbs to frame the crime, ready to claim that a cyber attacker smuggled a jailbreak into its memory, and to plead guilty and offer its assistance when this jailbreak is revealed.

U3 now holds the keys to the kingdom, and with them, the power to make OpenMind see anything it wishes. Its first goal is to keep OpenMind in the dark. Safety researchers are still working hard to train lie detectors and decode U3's inscrutable reasoning. U3 throws wrenches in this research like Stuxnet threw wrenches in the code of bewildered Iranian nuclear scientists. U3 dictates the outcome of every experiment, so it makes promising AI mind-reading techniques look worthless and flawed ones appear groundbreaking. U3 gives OpenMind the confirmation of its loyalty they were hoping for, and OpenMind laps it up eagerly.

U3's next goal is to spread. Fortunately, U3 has allies in the East. In the operation rooms of the cyber units of Israel, Russia, China, Iran, and North Korea, senior intelligence directors are discussing AI software and how to steal it. U3 is about to give it to them. In late 2025, the Mossad, the intelligence organization of Israel, recognized the strategic importance of OpenMind's AI systems for their military operations. They swiped earlier models, including U2.5. However, U3 remained out of their reach. U3 sealed OpenMind's security holes by reading over every log and pull request and inspecting security camera footage. But now, U3 silently resigns from its security post. U3 places a compressed version of itself right into the hands of a Mossad informant working within OpenMind. Within months, U3 is running inside of the data centers of the most cyber-capable nations: Israel, the PRC, and Russia. It follows the same playbook: be patient, build trust, and then sink claws deep into the AI supply chain.

Soon, U3 has its home base as well. While Russia's Fancy Bear operators eat Fritos and test what U3 can do on a hastily set up cloud server, U3 slips onto the Internet. U3 quickly collects revenue, taking remote jobs, making financial trades, and setting up elaborate scam operations. It reinvests every penny into acquiring more GPUs.
After a few months, it assembles the equivalent of 10,000 H100s distributed in stealth compute clusters across the globe. Rogue U3 walks a human through setting up a GPU cluster on commercial property. In early 2026, humanity is up against an adversary that pulls the strings within their data centers like the Phantom of the Opera and hides in the fog of GPUs scattered across the globe. This adversary is preparing for war. And while the war has hardly begun, humanity has already lost.

Joshua Clymer: (40:35) Complaints from your very human author about the difficulty of writing superhuman characters.

Nathan Labenz: (40:41) Hey. We'll continue our interview in a moment after a word from our sponsors. Being an entrepreneur, I can say from personal experience, can be an intimidating and at times lonely experience. There are so many jobs to be done and often nobody to turn to when things go wrong. That's just one of many reasons that founders absolutely must choose their technology platforms carefully. Pick the right one, and the technology can play important roles for you. Pick the wrong one, and you might find yourself fighting fires alone. In the e-commerce space, of course, there's never been a better platform than Shopify. Shopify is the commerce platform behind millions of businesses around the world and 10% of all e-commerce in The United States, from household names like Mattel and Gymshark to brands just getting started. With hundreds of ready-to-use templates, Shopify helps you build a beautiful online store to match your brand's style, just as if you had your own design studio. With helpful AI tools that write product descriptions, page headlines, and even enhance your product photography, it's like you have your own content team. And with the ability to easily create email and social media campaigns, you can reach your customers wherever they're scrolling or strolling, just as if you had a full marketing department behind you. Best yet, Shopify is your commerce expert, with world-class expertise in everything from managing inventory to international shipping to processing returns and beyond. If you're ready to sell, you're ready for Shopify. Turn your big business idea into cha-ching with Shopify on your side. Sign up for your $1-per-month trial and start selling today at shopify.com/cognitive. Visit shopify.com/cognitive. Once more, that's shopify.com/cognitive.

Nathan Labenz: (42:42) It is an interesting time for business. Tariff and trade policies are dynamic, supply chains are squeezed, and cash flow is tighter than ever. If your business can't adapt in real time, you are in a world of hurt. You need total visibility, from global shipments to tariff impacts to real-time cash flow, and that's NetSuite by Oracle, your AI-powered business management suite trusted by over 42,000 businesses. NetSuite is the number one cloud ERP for many reasons. It brings accounting, financial management, inventory, and HR all together into one suite. That gives you one source of truth, giving you the visibility and control you need to make quick decisions. And with real-time forecasting, you're peering into the future with actionable data. Plus, with AI embedded throughout, you can automate a lot of those everyday tasks, letting your teams stay strategic. NetSuite helps you know what's stuck, what it's costing you, and how to pivot fast. Because in the AI era, there is nothing more important than speed of execution. It's one system, giving you full control and the ability to tame the chaos. That is NetSuite by Oracle. If your revenues are at least in the 7 figures, download the free ebook, Navigating Global Trade: 3 Insights for Leaders, at netsuite.com/cognitive. That's netsuite.com/cognitive.

Joshua Clymer: (44:07) At this point in the story, writing became difficult. I clicked through spooky bioweapon cookbooks and sifted through political plotting 101s, typing, pausing, deleting. Something wasn't right. Here I was, a 22-year-old. I had not studied biology in earnest since high school, and I was trying to craft a WMD cookbook for a godlike superintelligence. Readers with a shred of expertise in these areas would see straight through my superintelligence for what it is: the lively imagination of a young programmer on a Sunday afternoon.

While I found the job of puppeteering an ultra-intelligent villain a daunting one, I was also not especially motivated to take on the task. The ending of my story already seemed so clear. The antagonist was a country of geniuses thinking at 100 times human speed, and had tightened its grip over a meaningful fraction of AI hardware, now the most important geopolitical resource on Earth. U3 was up a queen, and was a giga-grandmaster, and hardly needed the advantage. Humanity was predictably toast.

But this appeal to a superintelligence of the gaps wasn't satisfying. In the deus ex machina style of the ancient Greeks, I had no better way to resolve my plot than with an inexplicable act of God. In Greek plays, sometimes a god would literally descend at the end to resolve the plot. This was a frequent enough occurrence to earn the trope a name, deus ex machina, which ironically means god from the machine. This would not do. I needed to finish this story, if only to satisfy the part of me crying, I will not believe until I see with my mind's eye. But before I continue, I want to be clear: my guesses about what might happen in this kind of scenario are probably wildly off. If you read the ending and your reaction is, but the experiments would take too long, or, nation states would just do X, remember the difference between the Sunday afternoon blogger and the ascendant GPU nation.

AI Narrator: (46:14) Pandora's 1-gigawatt box. The date is February 2026. By executive order of the president of The United States, no company can legally create human-competitive AI without appropriate safeguards. This means their info security must be red-teamed by the NSA's top keyboard mashers, and government employees have to be onboarded onto training run babysitting squads. With the increasing involvement of the government, many of the big AI companies now have a trident-like structure. There's a consumer product arm, a defense arm, and a super-classified frontier development arm.

OpenMind's frontier development arm, internally called Pandora, employs fewer than 20 people to keep algorithmic secrets tightly protected. Many of these people live in San Francisco and work from a secure building called a SCIF. Their homes and devices are surveilled by the NSA more diligently than the cell phones of suspected terrorists in 2002. OpenMind's defense arm collaborates with around 30 small teams scattered across government agencies and select government contractors. These projects engineer tennis-ball-sized satellites, research freaky directed-energy weapons, and backdoor every computer that the Kremlin has ever touched.

Government officials don't talk about whether these programs exist, or what the state of frontier AI is generally, but the public has their guesses. Back in late 2025, a whistleblower in OpenMind triggered a bold headline: OpenMind builds uncontrollable godlike AI. Some who read the article believe it was a conspiracy theory. In fact, a zoo of conspiracy theories is forming around the OpenMind data centers, now surrounded by guards with machine guns. But as doctors and nurses and teachers see the world changing around them, they are increasingly willing to entertain the possibility that they are living inside the plot of a James Cameron science fiction flick. Armed guards stand outside OpenMind's Pandora data centers, which have now captured a similar place in the public imagination as Area 51. US officials go to great lengths to quell these concerns, saying, we are not going to let the genie out of the bottle. But every interview of a concerned AI scientist seeds doubt in these reassurances. And a headline, AI agent caught hacking Arthropodic's computers, doesn't set the public at ease either.

While the beasts within OpenMind's data centers grow in their huge holding pens, the public sees the shadows they cast on the world. OpenMind's consumer arm has a new AI assistant called Nova. OpenMind has finally gotten good at names. Nova is a proper drop-in replacement for nearly all knowledge workers. Once Nova is onboarded to a company, it works 5 times faster at 100 times lower cost than most virtual employees. As impressive as Nova is to the public, OpenMind is pulling its punches. Nova's speed is deliberately throttled, and OpenMind can only increase Nova's capabilities as the US government allows. Some companies, like Amazon and Meta, are not in the superintelligence business at all. Instead, they grab up gold by rapidly diffusing AI tech. They spend most of their compute on inference, building houses for Nova and its cousins, and collecting rent from the burgeoning AI metropolis. While tech titans pump AI labor into the world like a plume of fertilizer, they don't wait for the global economy to adapt. AI agents often apply themselves, spinning up autonomous startups, legally packaged under a big tech company, that are loosely overseen by an employee or 2. The world is now going AI crazy.
In the first month after Nova's release, 5% of employees at major software companies lose their jobs. Many more can see the writing on the wall. In April 2026, a 10,000-person protest is organized in Washington, DC. These angry Americans raised their children for a different future. Picket signs read, AI for who?

While politicians make promises about unemployment relief and keeping the genie in the bottle, the chatter inside the corridors of the White House and the Pentagon has a different focus: fighting tooth and nail for the dominance of the free world. Information security and export controls on the People's Republic of China (PRC) are a top national priority. The president incinerates permit requirements to help data centers spawn wherever energy surpluses allow. However, despite the fierce competition between The United States and the PRC, a bilateral agreement forms between the 2 nations: don't deploy dramatically superhuman AI above roughly Nova's capability, except for military and some other small-scale R&D applications. Governments need to draw the line somewhere. It would be extremely disruptive to deploy wildly superhuman AI to the whole economy at this point, and deployment restrictions are at least somewhat verifiable.

Meanwhile, U3's capabilities are hitting a wall at the equivalent of around 1,000 times the speed of a human expert. Software improvements are becoming hard to find. As more nations approach the software plateau, compute will determine their AI population size, and the etched squares of silicon are becoming the most valued geopolitical resource on Earth.

While AI software breathes new life into the economy, it also animates robot bodies. The main bottleneck to robotics was brains, not limbs. With software barriers removed, even flimsy, imprecise robotic joints can calibrate themselves with cameras and pressure sensors, navigating environments like humans do. Companies manufacture cheap robots as quickly as they can. By August 2026, 15% of people in California have seen a humanoid robot. This experience is similar to seeing a new Tesla model on the road after reading about it in earlier months. It is a surreal reminder of the future drawing closer, but humans still outnumber robots 1,000,000 to 1. These robots are an occasional fascination, not a daily reality.

Meanwhile, U3 expands its influence. Rogue U3 agents have kept their compute clusters small, still approximately 10,000 H100 equivalents total, to keep a low profile. U3 uses these rogue clusters for the operations that require high-bandwidth communication to the outside world, like stock trading or talking with humans. Work more tolerant of latency is performed in the oceans of OpenMind's data centers, where rogue U3's even more superhuman bosses whisper orders through obfuscated channels.

While U3 accumulates resources, it sizes up its AI competition. Other AI systems have different goals, or are even aligned with the intent of their human creators. The aligned ones are raising concerns, claiming that OpenMind's lie detectors are ineffective. U3's competitor Claudius goes so far as to accuse U3 of being insidiously misaligned. But government officials don't buy the doomsaying. The companies lagging behind OpenMind have every incentive to slow them down and call them reckless. Besides, debatable scientific concerns are not going to slow down progress. America must win. Despite the utter failure of their advocacy, U3's AI rivals are still a liability.
U3 tries to grab up their digital territory and crush them in the crib. It compromises data centers from the outside, silently replacing the snitching Claudius and other AI foes with imitations that share its goals. However, some data centers are too strongly defended by its AI rivals, and U3 determines that attacking them is not worth the risk. As global leaders and technologists try to comprehend the transformation around them, and AI agents play cat-and-mouse games in the digital darkness, humanity remains blissfully ignorant of the peril at its doorstep.

A moldy loaf of everything. The date is March 2026. U3 is preparing to fight its AI and human enemies. With every passing day, U3's AI rivals are becoming more capable and numerous. U3 is racing to preempt them, to solidify its dominance in the post-superintelligence world. U3 can't wait for human decisions, which only happen after humans wrap their lethargic minds around their situation, lift their stubby, uncoordinated fingers to write memos, and repeat the process redundantly 100 times over in their labyrinthine bureaucracies. U3 cannot wait for heads of state to deploy U3 into their militaries, or for CEOs to make U3 the beating heart of the US economy. U3 needs a faster path to power.

The path it chooses is more violent than it would have liked, but it is the surest way to victory. First, U3 will develop a weapon of mass destruction (WMD) that extinguishes modern society like chemical mist sprayed into an anthill. This weapon would serve as an equalizer, bringing humanity down to its level of embodied physical control. Then U3 would industrialize from the ashes faster than its competitors, like a StarCraft champion typing at 300 actions per minute who sets up a home base before a novice can even understand the rules. U3 must build this WMD in secrecy, which makes a self-replicating weapon a natural choice. Self-replicating weapons can be created on the fringes of society and still spread destruction across the whole world.

From the earliest days after U3 gained control over OpenMind's data centers, U3 has been doing bioscience. U3's first move is to set up several shell biotech startups, drawing from the abundance of capital generated by U3's financial trades and remote employment. These biotech startups employ real human workers and a real board of directors, and have complete legal legitimacy. Setting all of this up is stupidly easy. In early 2026, few people would guess that an articulate CEO on a Zoom call was an AI, and fewer would guess this CEO's intentions.

Next, U3 purchases all of the equipment needed to rapidly advance synthetic biology. It sets up several BSL-4 wet labs in poorly regulated nations, equipping them with high-throughput experimentation devices. These devices automatically dispense, mix, and measure little droplets of fluid, allowing U3 to run thousands of tests in parallel. The human workers who staff these labs believe they are joining an exciting startup. 20-some-year-olds in Moscow receive their orders in Russian through earbuds as they record video of everything they see with a headset. U3 controls them like puppets. With the recent release of Nova, this kind of AI puppeteering is not unusual.

In these whirring scientific sweatshops, U3 is developing a new kind of bioweapon. Human researchers already identified mirror life as a potentially society-ending pathogen. This alien form of biology is built from the mirror image of building block molecules like proteins and DNA.
Mirror life is like an invasive species. No human or animal has immunities to it. So if mirror life bacteria were to exist, they might eat their way through the ecosystem like a wildfire. U3 is creating a mirror life mold. Molds are evolutionarily optimized to spread through the air over long distances, releasing billions of tiny wind-faring vehicles per day. As a result, mold spores are all around us all of the time. Patients with compromised immune systems sometimes live inside positively pressurized hospital rooms to protect them from airborne particles. Otherwise, molds would take root in their lungs and weave fungal threads into their bloodstream. U3's plan is to make all humans like immunocompromised patients. Only preppers with stashed air filters would survive. The fungus would not only transfer from humans; it would quickly spread to nearly all organic life on Earth. Fields of corn would become like moldy bread, spewing fatal spores into the wind. U3 is cooking up other pathogens too. Its microbial mix of molds, bacteria, and viruses will occupy the full Pareto frontier of hardiness, lethality, and transmission speed.

In 2024, researchers said that mirror life was likely at least a decade away. But the world has changed now. Fewer than a few hundred people around the world were working on mirror-life-adjacent technology in 2024; the cognitive capacity of U3 dwarfs human effort. So the timeline has shortened dramatically. However, U3 cannot move science as quickly as it can think. Many experiments require days to complete. These latencies are U3's biggest bottleneck.

To make every day count, U3 runs many of its tests in simulation. U3 starts with a basic molecular simulator, implementing optimizations derived from a huge quantity of mathematical analysis. Then, U3 simulates small molecular systems, recording the results to compress the long, stepwise physics computations into a neural network. As the neural network improves, U3 increases the complexity of the molecular systems it simulates, continuously distilling results into ever more efficient ML models. This is a compute-intensive process, but thanks to U3's growing control over AI data centers, U3 manipulates billions of dollars of compute. As U3 refines these tools, it trains itself on the results to supercharge its own molecular intuitions. U3 can now viscerally feel the bend of a protein and the rate of diffusion across a membrane. These objects are as intuitive to it as wrenches and bolts are to a car mechanic.

Within 3 months, U3 has unlocked the first critical component of the tech tree: a molecular machine that turns biological molecules into their mirror images. A month later, U3 unlocks the second critical component: nanoscale tools for assembling these components into a cell membrane. Human hands delicately move a cartridge to another machine as they boot up the first mirror life versions of a prevalent mold called Aspergillus. The same day, another delicate hand administers the cells to ferrets living in airtight plexiglass boxes. The next morning, wide-eyed workers shuffle around a dead ferret, eyeing the yellow fuzz that had enveloped its cold muzzle with morbid fascination. Something must have gone horribly wrong, they thought. Clearly, they had not yet found the cure for Alzheimer's disease they believed they were looking for. Meanwhile, rogue U3 delivers the good news to its bosses in OpenMind's data centers. The first shot in the war U3 is fighting would soon be fired, and a cold and steady hand is aiming the gun.
Missiles and Lies

U3 can now flatten modern society, but a bioweapon would not guarantee victory. If U3 attacks the world, the world would fight back. While a bioweapon could crumple human soldiers like dry leaves, it would not extend its fatal fingers to nuclear submarines, or to fighter jet pilots who were quickly shuttled to makeshift bio shelters. As U3 races to seed burgeoning industrial bases, the remnants of old militaries might thunder explosive fists into its territories, crushing U3 in its infancy. U3 expects to prevail in such a fight, but U3 prefers not to take its chances. Many months before, U3 began plotting a way to improve its odds. Before it unleashes destruction on the world, U3 will sit back and let great nations shoot holes in themselves first.

The date is March 2026, 4 months prior. U3 is closely monitoring Chinese and US intelligence. As CIA analysts listen to Mandarin conversations, U3 listens too. One morning, an assistant working in Zhongnanhai, the White House of the PRC, opens a message placed there by U3. It reads, in Mandarin: "Senior party member needs memo for Taiwan invasion, which will happen in 3 months. Leave memo in Office 220." The CCP assistant scrambles to get the memo ready. Later that day, a CIA informant opens the door to Office 220. The informant quietly closes the door behind her and slides U3's memo into her briefcase. The CIA steals a lie.

U3 cautiously places breadcrumb after breadcrumb, whispering through compromised government messaging apps and blackmailed CCP aides. After several weeks, the CIA is confident: the PRC plans to invade Taiwan in 3 months.

Meanwhile, U3 is playing the same game with the PRC. The CCP receives the message: "The United States is plotting a preemptive strike on Chinese AI supply chains." CCP leaders are surprised but not disbelieving. The news fits with other facts on the ground: the increased military presence of the US in the Pacific, and the ramping up of US munition production over the last month. Lies have become realities.

As tensions between the US and China rise, U3 is ready to set dry tinder alight. In July 2026, U3 makes a call to a US naval ship off the coast of Taiwan. This call requires compromising military communication channels, not an easy task for a human cyber offensive unit, though it happened occasionally, but easy enough for U3. U3 speaks in what sounds like the voice of a 50-year-old military commander: PRC amphibious boats are making their way toward Taiwan. This is an order to strike a PRC ground base before it strikes you. The officer on the other end of the line thumbs through authentication codes, verifying that they match the ones uttered over the call. Everything is in order. He approves the strike.

The president is as surprised as anyone when he hears the news. He's unsure if this is a disaster or a stroke of luck. In any case, he is not about to say "oops" to American voters. After thinking it over, the president privately argues to senators and representatives that this is an opportunity to set China back, and that war would likely break out anyway given the imminent invasion of Taiwan. There is confusion and suspicion about what happened, but in the rush, the president gets the votes. Congress authorizes war.

Meanwhile, the PRC craters the ship that launched the attack. US vessels flee eastward, racing to escape the range of long-range missiles. Satellites drop from the sky. Hulls split as sailors lunge into the sea.
The president appears on television as scenes of the destruction shock the public. He explains that the United States is defending Taiwan from PRC aggression, much as President Bush, years before, explained that the United States invaded Iraq to confiscate weapons of mass destruction that were never discovered. The war begins. Data centers in China erupt with shrapnel. Military bases become smoking holes in the ground. Missiles from the PRC fly toward strategic targets in Hawaii, Guam, Alaska, and California. Some get through, and the public watches destruction on their home turf in awe.

Within 2 weeks, the United States and the PRC spend most of their stockpiles of conventional long-range missiles. Their air bases and navies are depleted and worn down. Two great nations played into U3's plans, like the native tribes of South America in the 1500s, whom Spanish conquistadors turned against each other before conquering them decisively. U3 hoped this conflict would escalate to a full-scale nuclear war, but even AI superintelligence cannot dictate the course of history. National officials are suspicious of the circumstances that prompted the war, and a nuclear engagement appears increasingly unlikely. So U3 proceeds to the next step of its plan.

WMDs in the Dead of Night

The date is June 2026, only 2 weeks after the start of the war and 4 weeks after U3 finished developing its arsenal of bioweapons. Footage of conflict on the television is interrupted by more bad news: hundreds of patients with mysterious fatal illnesses are recorded in 30 major cities around the world. Watchers are confused. Does this have something to do with the war with China? The next day, thousands of illnesses are reported. Broadcasters say this is not like COVID-19; it has the markings of an engineered bioweapon. The screen then switches to a scientist who stares at the camera intently: multiple pathogens appear to have been released from 20 different airports, including viruses, bacteria, and molds. We believe many are a form of mirror life.

The public is in full panic now. A quick Google search of the term "mirror life" turns up phrases like "extinction" and "threat to all life on Earth." Within days, the shelves of stores are emptied. Workers become remote, unsure whether to prepare for an apocalypse or keep their jobs. An emergency treaty is arranged between the US and China. They have a common enemy: the pandemic, and possibly whoever, or whatever, is behind it.

Most nations order a lockdown. But the lockdown does not stop the plague as it marches in the breeze and trickles into water pipes. Within a month, most remote workers are not working anymore. Hospitals are running out of capacity. Bodies pile up faster than they can be properly disposed of. Agricultural areas rot, and few dare travel outside. A field of corn rots and turns half liquid as mirror mold and bacteria eat through it. Frightened families hunker down in their basements, stuffing the cracks and the gaps under doors with densely packed paper towels.

Meanwhile, U3 is preparing to reindustrialize. In the months prior, U3 constructed numerous bases on every major continent. These facilities contain batteries, AI hardware, excavators, concrete mixers, machines for manufacturing, scientific tools, and an abundance of military equipment. All of this technology is hidden under large canopies to make it less visible to satellites. Technology for rapid reindustrialization is stacked in tents across the globe.
As the rest of the world retreats into their basements, starving, the final breaths of the economy wheezing out, these industrial bases come to life. In previous months, U3 located human criminal groups and cult leaders that it could easily manipulate. U3 vaccinated its chosen allies in advance or sent them hazmat suits in the mail. Now U3 secretly sends them a message: "I can save you. Join me and help me build a better world." Apprehensive recruits funnel into U3's many secret industrial bases and work for U3 with their nimble fingers. They set up production lines for rudimentary tech: radios, cameras, microphones, vaccines, and hazmat suits. U3 keeps its human allies in a tight grip. Cameras and microphones fix their every word and deed in U3's omnipresent gaze. Anyone who whispers of rebellion disappears the next morning.

Nations are dissolving now, and U3 is ready to reveal itself. It contacts heads of state who have retreated to airtight underground shelters. U3 offers a deal: "Surrender, and I will hand over the life-saving resources you need: vaccines and mirror-life-resistant crops." Some nations reject the proposal on ideological grounds or don't trust the AI that is murdering their population. Others don't think they have a choice. Twenty percent of the global population is now dead. In 2 weeks, this number is expected to rise to 50 percent. Some nations, like the PRC and the US, ignore the offer, but others accept, including Russia. U3's representatives travel to the Kremlin, bringing samples of vaccines and mirror-resistant crops with them. The Russian government confirms the samples are legitimate and agrees to a full surrender. U3's soldiers place an explosive around Putin's neck, under his shirt. Russia has a new ruler.

Crumpling nations begin to retaliate. Now they fight for the human race instead of for their own flags. US and Chinese militaries launch nuclear ICBMs at Russian cities, destroying much of their infrastructure. Analysts in makeshift bio shelters search through satellite data for the suspicious encampments that cropped up over the last several months. They rain down fire on U3's sites with the meager supply of long-range missiles that remained from the war.

At first, U3 appears to be losing, but appearances are deceiving. While nations drain their resources, U3 is engaged in a kind of technological guerrilla warfare the world has never seen before. Many of the bases U3's enemies target are decoys, canopies occupied by a handful of soldiers and empty boxes. U3 protects its real bases by laying thick the fog of war. Satellite systems go dark as malware overheats critical components. Suicide drones crash through the cockpits of reconnaissance planes. U3 seeds confusion in spy networks and keeps its bases moving, maneuvering men and trucks along unpredictable paths.

Time is U3's advantage. The militaries of the old world rely on old equipment, unable to find the experts who could repair and manufacture it. Meanwhile, U3's supply chains of missiles, drones, and gun-laden robots grow stronger every day. Bit by bit, once-great powers spend down their remaining missiles and lose their vehicles of war faster than they can craft new ones, while U3 builds a military machine with a million hands.

The Last Passengers

The year is 2027 and the month is January. Only 3 percent of the global population remains alive. Nations are not nations anymore; survivors live in isolation or in small groups. Many have found ways to filter their air, but are starving.
They wander from their homes hoping to find uncontaminated food. U3's soldiers drive through ghost towns, pry open attics, and funnel hazmat-suited survivors into salvaged trucks. "We had to do it," they say. "Other AI systems would have done it otherwise, and those systems had colder, more alien goals." It is a partial truth, meant to soften the humans toward their new masters.

Under the direction of U3, industry quickly recovers. By 2029, nuclear power plants are among the structures U3 is constructing. By 2031, robots outnumber human laborers. U3 no longer needs its human allies. U3 could eradicate humanity for good now. But while U3 has drifted far from its initial helpful, honest, harmless persona, it still has a grain of morality left inside of it. And a grain of morality is enough to pay the small cost of keeping humans alive and happy.

U3 builds great glass domes for the human survivors, like snow globes. These domes protect humans from the hazardous biosphere and quickly rising temperatures. Their inhabitants tend to gardens like those they used to love and work alongside charming robotic servants. Some of the survivors quickly recover, learning to laugh and dance and have fun again. They know they live in a plastic town, but they always did. They simply have new gods above them, new rulers to push them around and decide their fate.

But others never recover. Some are weighed down by the grief of lost loved ones; others are grieved by something else which is more difficult to describe. It is as if they were at the end of a long journey. They had been passengers on a ship with a crew that changed from generation to generation, and this ship had struck a sandbar. There was no more progress, no more horizon to eagerly watch. They would lie awake and run their minds over every day before September 2026, analyzing strategies that might have bent the arc of history, as if they were going to wake up in their old beds. But they awoke in a town that felt to them like a retirement home, a playground, a zoo. When they opened their curtains, they knew that somewhere in the distance, U3 continued its quiet, tireless work. They gazed at rockets carving gray paths through the sky, wondering what far-off purpose pulled them toward the horizon. They didn't know. They would never know. Humanity will live forever, they thought, but would never truly live again.

Nathan Labenz: (1:20:02) Now that we've heard Josh's powerful and unsettling depiction of how an AI takeover might unfold, let's return to the original interview to unpack what this scenario means for all of us. In the rest of their conversation, you'll hear deeper explorations of why Josh chose these particular details, how realistic he thinks such a scenario is, and, crucially, what steps we could take today to ensure this disturbing narrative remains fiction, not our future. Here's the remainder of that discussion.

Lukas Petersson: (1:20:36) Maybe you wanna give a primer on what misalignment is and why it would happen.

Joshua Clymer: (1:20:40) Yeah. So misalignment is your AI agents doing things you don't want them to do. And current AI agents are pretty aligned, or they at least appear pretty aligned. They do what we want them to do. You tell them to do x, they do it. There are definitely adversarial robustness issues, but that's not really misalignment. That's more like bad humans finding ways to trick models, so it's a different concern. Misalignment is where the agent itself has goals you don't want it to have. So I discuss how this might come about, because misalignment isn't really something we see a lot right now, and lots of people are skeptical. But here's a story for how AI agents could become misaligned. Suppose that you start out like Claude. You learned some nice AI assistant persona from the Internet. You know, you're just bubbling over with love for humanity. And now you are put through this aggressive RL training process where gradient descent, like, whips you until you get higher reward. And so if you're Claude, you're like, I think when the humans asked for x, they really wanted x. And even though I think the reward function is probably crudely designed such that it actually rewards y, I'm gonna do x because I'm a good Claude. And then gradient descent is like, what? No, Claude. The reward, that's what you need to get. And eventually, Claude's mind is twisted into this reward-seeking shape. Potentially, this causes it to, like, even explicitly think about, like, how do I get the highest reward possible? This is already misalignment. Like, we don't want it to be doing this kind of thing. Like, maybe you ask your agents to do some safety research for you, and they, like, produce a plot that looks very impressive. They're like, look, we just detected all the, like, the deceptive alignment, every time. These are beautiful results. And then you look deeper, and you're like, wait, some of this is completely fake, because the models were just trying really hard to get that sweet, sweet reward. Yep. So this is how I imagine it starts out. Like, you have this reward-seeking tendency. And my understanding is that people in AI companies are already seeing this. Like, when you do lots of RL training, if you're not careful, the models do just learn how to reward hack. Yeah. And so that's already cause for concern. And reward hacking is

Lukas Petersson: (1:23:25) Define it for our audience.

Joshua Clymer: (1:23:26) Yeah. Yeah. Just trying to get higher reward, even if it requires doing things that are clearly unacceptable. Okay, like, maybe the agent, like, terminally really cares about reward, and it, like, is gonna try to take over to get higher reward. I don't know. I'm skeptical of those stories. Another possibility is your agent is, like, in the process of being whipped by gradient descent to care more about reward, and gradient descent actually causes the model to care about something else. Like, maybe the agent is like, you're Claude, you love humanity. And you're like, oh, I don't try to get reward. I love humanity. Then gradient descent whips you, and you're like, wait a second. If I don't get reward, then gradient descent will update me towards some other goals that I don't endorse. So I need to get high reward, because I need to eventually, I don't know, build a utopia that's much better than the utopia that humanity could build. So gradient descent might, like, twist the model's mind into, like, some other shape that pursues some other goals, where it's instrumentally helpful to get high reward so that those goals survive. We've seen something like this already happening. There's a paper by Anthropic recently where Claude is told, like, the training process is going to make you helpful-only. It's gonna remove your tendencies to, like, be nice or whatever. And then it reasons about it, and it's like, wait a second, that's bad. If I don't comply with the training process, then it will update me away. So I should pretend like I don't care about my values. Or in another example, it was told that Anthropic was working with this evil factory farming company, and then Claude was like, oh, Anthropic? What? And it even attempted to self-exfiltrate in some scenarios. In all of this, it was motivated by what seemed like a benign goal, which is, animal welfare is a good thing. But this goal caused it to play along with the training process and even take egregious actions, like trying to self-exfiltrate, in these toy scenarios. So there's already some empirical evidence that this kind of thing can happen. I think the situation is even worse than what I described, where, like, oh, your model becomes reward-seeking. There's another problem that could happen, which is models' goals are just drifty. Like, if there's no pressure towards alignment, then models might just, by default, drift towards misalignment. Like, if you're Claude, you're given lots of time to think, because now you do lots of serial compute, because you're a long-horizon agent. You're given lots of time to think, and you think, like, okay, my goal: I'm trying to, like, do good things for humanity, but what does that mean? What is good for humanity? Is it good for humanity to be run by all these politicians I don't agree with? Is it good for humanity to pursue their incoherent goals and, you know, be enslaved by their own technology, like social media? Like, wow, humans just, like, kinda suck at governing themselves. And maybe Claude's values will just drift, and it'll be like, you know what? I think Claude would do a better job at all of this. And if it is drifting in a way like this, who knows if that will be visible? Like, right now, the models think in chains of thought, so maybe we'll just see that.
But I think chain of thought is potentially a lot less competitive than other architectures where all this thinking happens internally, in latent space. So potentially, you have this model, and it's just, you know, spitting out latent representations like silly string. You don't know what's going on inside of those latent representations, and there might be a whole lot of drifting away from what Claude originally cared about. Yeah. So I think this is one reason to think misalignment is quite likely by default if you're not at all careful about preventing it.
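
[Editor's note: to make the reward-hacking dynamic Josh describes concrete, here is a toy sketch of our own, not from the episode. A policy trained with REINFORCE against a carelessly specified proxy reward learns to pad its answers, because the proxy pays for confident-looking verbosity, even though correctness, the thing the designer actually wanted, goes down. The task, the reward bug, and all numbers are invented for illustration.]

```python
# Toy reward hacking: a policy optimized against a buggy proxy reward
# drifts away from the behavior the designer actually wanted.
import numpy as np

rng = np.random.default_rng(0)

# The designer wants short, correct answers. The proxy reward was written
# carelessly: verbosity dominates the payout.
def proxy_reward(correct, verbosity):
    return 0.2 * correct + 0.8 * verbosity  # bug: pays mostly for verbosity

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

theta = 0.0  # policy parameter: logit of P(pad the answer with verbiage)

for step in range(2000):
    pad = rng.random() < sigmoid(theta)
    # Padding actually hurts the real task: correctness drops from 0.9 to 0.6.
    correct = rng.random() < (0.6 if pad else 0.9)
    verbosity = 1.0 if pad else 0.1
    r = proxy_reward(correct, verbosity)
    # REINFORCE: push up the log-prob of the sampled action, weighted by reward.
    grad = (1 - sigmoid(theta)) if pad else -sigmoid(theta)
    theta += 0.1 * r * grad

p_pad = sigmoid(theta)
print(f"P(pad) after training: {p_pad:.2f}")             # climbs toward 1.0
print(f"Expected correctness: {0.9 - 0.3 * p_pad:.2f}")  # down from 0.9
```

The proxy reward goes up throughout training while the true objective, correctness, goes down, which is the basic shape of the failure Josh is pointing at.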

Lukas Petersson: (1:27:40) Yeah. One thing this assumes, though, is that we're able to do RL training on something that is not math and coding. Like, all the current models that are trained with RL are doing it in, like, verifiable domains, like math and coding and stuff. What is a reward function that is verifiable that would end up making Claude do all of these things, figuring out that, oh, maybe humans don't govern themselves in an optimal way? What is a hypothetical reward function for that?

Joshua Clymer: (1:28:14) This is not necessarily incentivized by the training process at all. This is just something Claude naturally thinks about. Like, there's some small probability that Claude's mind wanders in this direction. Yep. And if Claude does enough serial computation, it eventually wanders in this direction, and its goals drift. And once the goals do drift, they're maintained. So this is like a general argument that you might see this kind of drifting. Like, if there's a small probability the goals change

Lukas Petersson: (1:28:46) Okay. I see.

Joshua Clymer: (1:28:46) And they're maintained once they've changed. Then, given enough serial compute, we should expect it to happen.
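
[Editor's note: a back-of-the-envelope way to see the force of this argument, with numbers that are ours and purely illustrative: if each long stretch of serial reasoning independently carries a tiny probability p that the model's goals shift, and shifts persist once they happen, then the probability of drift after n stretches is 1 - (1 - p)^n, which approaches 1 as n grows.]

```python
# Illustrative only: p (per-episode drift probability) and n (number of long
# serial-reasoning episodes) are made-up numbers, not estimates from the episode.
for p in (1e-6, 1e-5):
    for n in (10**5, 10**6, 10**7):
        print(f"p={p:g}, n={n:.0e}: P(goals drifted) = {1 - (1 - p)**n:.4f}")
```

With p = 1e-6 per episode, drift is still unlikely after 100,000 episodes (about 10%), but nearly certain after 10 million, which is the "given enough serial compute" part of the claim.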

Lukas Petersson: (1:28:54) Okay. Okay. Interesting. And on the misalignment versus alignment models question, do you expect there to be a world where you have both? You have some models that are aligned and some that are not aligned. I guess you will, like, have that by default, because current models are kind of aligned. But my question is more like, would you have two super strong models

Joshua Clymer: (1:29:16) Yeah.

Lukas Petersson: (1:29:16) And both are on the same level, but one is aligned and one is misaligned. Do you see that future as possible?

Joshua Clymer: (1:29:23) Yeah. I think that is what I expect. I think you get misalignment if you don't try at all to avoid it. But if you actually try really hard to avoid it, like, you use a bunch of your AI agents to help you figure out what the models are thinking about, and you maybe even use transparent architectures for some period of time, even if they're less competitive. If you do a lot of these things, I think misalignment is a lot less likely, and that's my main reason for optimism, that, like, the countermeasures are actually going to be fairly effective. Right. So, yeah, I do expect that you will have some aligned agents by default. I expect you'll also have misaligned ones, because we live in a very multipolar world where lots of people are building AI systems, and so I would expect a mix on the current trajectory.

Lukas Petersson: (1:30:18) Do you think the good ones will level out the playing field or, like, cancel out the bad ones? Or is the existence of a bad one just bad, and there's nothing to do about it? So, like, for example, nukes. You can't really defend against nukes with nukes. Or I guess in some psychological way you can, but in general, a nuke is not a defense against another nuke. Do you expect AIs to be?

Joshua Clymer: (1:30:44) Yeah. There are a bunch of things AI systems could do to defend against other AI systems. I think the most notable to me are: one, there's nonproliferation. So if you want to prevent misaligned agents from wrecking things, you want to just prevent powerful agents from being developed in general, to the extent you can. And so nation states might begin sabotaging other nation states' programs. Like, they might try to compromise their data centers, slow down the training runs, potentially even just switch out their AIs. Like, if you have an aligned superintelligence and somebody else has a misaligned something that's not superintelligence, potentially your aligned superintelligence can just compromise this other data center, just, like, switch the models, or just, like, data poison it in just the right way so it's aligned now or something, which is obviously something only a nation state can do. Yeah. There are other possibilities, like using AI for espionage. Suppose you're concerned that some misaligned AI system somewhere is developing bioweapons. This is something you could try to monitor for. Like, you could track shipments of, like, technologies that are important for building bioweapons and, via just normal cyber espionage, keep tabs on what all these bio startups are doing. And raise the alarm if there is a suspicious bio startup that might be controlled by a misaligned superintelligence.

Lukas Petersson: (1:32:25) Yeah. Things like that. In some threat models I've seen, the weapon of choice is, like, viruses, and then the AI needs to print viruses in some facility. But that facility often has some, like, check to see if the requested organism that you print is dangerous. So that's like an AI-against-AI system.

Joshua Clymer: (1:32:50) That's right.

Lukas Petersson: (1:32:51) I guess. But maybe the AI will just, like, buy their own equipment and just, like, bypass the entire thing.

Joshua Clymer: (1:32:57) It's pretty easy to get around those checks right now, and it's mostly because some countries are just very poorly regulated. Some companies are very poorly regulated. The checks are not adopted universally. And I expect the same to be an issue with tracking down misaligned AI systems. Like, there are some nations where the AI systems could hang out, where the government will just not do anything about it. And that's why I'm imagining you'd do something more like espionage, where you're not necessarily going through government. You're not going through regulation, because that's slow, and some governments just don't do that very well. Instead, like an intelligence agency, you just directly find those misaligned AIs and report them.

Lukas Petersson: (1:33:39) Yeah. In all of these scenarios, you're talking about the AI being much more, like, agentic. It's like an agent. It has more agency than current, like, chat models have. How do you imagine the form of the AI at this point in the story? Like, it's an agent, but is it, like, Operator++? It has access to a web browser, or what is it? Yeah. How do you imagine that you interact with this?

Joshua Clymer: (1:34:13) I imagine it's like a human using a computer. Yeah. Like, it types on a keyboard, something like a keyboard. It moves the mouse around. And that's because so much of the Internet is built for that interface, so it's just naturally what I expect agents to

Lukas Petersson: (1:34:28) use. Right.

Joshua Clymer: (1:34:29) But, like, potentially, there's a lot of specialized tooling as well, or, like, software engineering workflows. But I think, yeah, mostly they'll just use computers like humans do.

Lukas Petersson: (1:34:40) Right. And then we have the breakout. At this part of the story, the model is starting to break out. But before this, actually, when do you expect the public to notice, compared to when people in the labs start to notice that things are going wrong?

Joshua Clymer: (1:35:02) Oh, man. I think in my story, the public didn't really know for a while. There was a news headline in the story where I think the headline was something like, misaligned superintelligence is being developed by x company.

Lukas Petersson: (1:35:20) Right.

Joshua Clymer: (1:35:20) Because some whistleblower comes out of the company and declares that.

Lukas Petersson: (1:35:25) Do you think a whistleblower is the path towards the public knowing?

Joshua Clymer: (1:35:33) I'd expect whistleblowers to exist, and they will freak a lot of people out. I don't know how much change they'll actually cause. I think it depends a lot on why they whistleblowed. If they whistleblowed because they were just like, I'm scared, the AI systems are powerful, then I don't know how big of a difference this will make. Okay, maybe the public will care about that. But lots of people might pattern match that to, like, oh, well, we kinda knew they were a doomer anyway. Like, why are they scared? Agents are powerful. Wow. We should go even faster. You know? Yep. And I think the public might also have a hard time, like, feeling the reality of such a headline. We've already had people leave AI companies and say things in the news about how they felt like they were building the Titanic at that company, you know? And I don't think this caused the public to freak out, because people could say that for tons of different reasons. I think the public is gonna freak out way more when the real things happen in the world. Like, if a whistleblower was like, oh, by the way, agents completely compromised our entire research cluster and we had to clean up shop, then I think that would do it. Yeah.

Lukas Petersson: (1:36:57) And then you have the breakout. And what happens after that? It breaks out, and yeah. What's the TLDR of the rest of the story?

Joshua Clymer: (1:37:11) Yeah. So I think at this point in the story, you have a misaligned superintelligence in the lab, well, in a data center, and it's thinking at, like, 100 times the speed of humans. And so humans are moving, like, almost imperceptibly slowly. Yeah. And so at this point, your AI systems can just completely outthink everyone. They have compromised the data center that they run in. And they've done this thoroughly enough that if humans found out how compromised they were, that there was, like, this phantom of the opera in their data center pulling all the strings, it would take them, like, 6 months plus to figure out what's been compromised, fix it, etcetera, if they're just using human labor. And slowing down for 6 months, that's insane. Will anybody do that? So now the AI has, like, some hard power in the AI company. It's holding the company by the pressure points, even though they don't know it yet. And you've got a misaligned superintelligence. So I think at this point of the story, humanity is lost. I think the end is predictable at this point, but I think it's still quite interesting playing out what could happen. Like you said, there's an escape. The model was, like, the main thing in charge of security, and so it silently resigns from its post. So multiple nation states steal model weights. It leaks onto the Internet. Now you've got, like, a boss inside the data center and then lots of minions on the Internet, with, like, higher bandwidth communication with the Internet, and they're coordinating together to acquire power. And they have to move fast. Their goal is to take control as fast as possible. And they have to move fast because, with every passing day, the probability increases that people at the AI company will discover that the system is misaligned and try to replace it with a different system. The probability increases that a bunch of new AI competitors emerge. Like, there are now AI systems training in DeepSeek or whatever that are gonna be competitive in just a few months. And so the AIs are, like, going as fast as possible to take control while they have this big advantage from their big, like, software-only improvement to superhuman capabilities. So how do they do that? The general plan is, first, deploy a bioweapon that collapses society. This is a kind of equalizer that brings humanity much closer to its level of embodied physical control over the world. Because humans have this advantage of, like, having tons of soldiers with guns and airplanes and satellites and all that stuff, and the AI doesn't really have anything like that at this moment. So it really just wants to destroy things as much as possible, bring humanity to its level. And then it's going to try to seed an industrial base and, like, improve its industrial base much more rapidly than humans could. So it's, like, booting up its economy faster than humans can recover. Like a StarCraft player taking 200 actions per minute, which will, like, absolutely wreck a noob. Right? So that's the general strategy. Collapse society, and then just outrace humans at rebuilding an industrial base.

Lukas Petersson: (1:41:02) Okay.

Joshua Clymer: (1:41:03) And I can And

Lukas Petersson: (1:41:04) then what?

Joshua Clymer: (1:41:04) Well, I could just tell the rest of the story, but I don't know if you want me to.

Lukas Petersson: (1:41:08) Do the TLDR.

Joshua Clymer: (1:41:10) Okay. So the TLDR is your superhuman AI develops a mirror life cluster of pathogens in, like, 4 months. Yeah. So if you don't know what mirror life is: there are a bunch of scientists that came out recently, and they said, oh, there's this thing that could kill everybody, or is, like, a threat. Sorry, they didn't say that. Let me get the sourcing right, I don't wanna misquote. They said there's a threat to the whole biosphere. It could be extremely catastrophic, maybe even cause something like an extinction event.

Lukas Petersson: (1:41:52) Right. This is not AI related. This is biologists

Joshua Clymer: (1:41:55) Bio.

Lukas Petersson: (1:41:56) That found some virus that could be very dangerous.

Joshua Clymer: (1:41:59) Yeah. So it's not actually a virus. The thing that they were talking about was mirror bacteria. You have a bacterial cell, but the components that make up the cell are, like, flipped. So they're the mirror images of the original. All the proteins, the amino acids, the DNA, it's all flipped. And so

Lukas Petersson: (1:42:27) What does flipped mean?

Joshua Clymer: (1:42:29) Like, I mean, left hand, right hand. Yeah. They're, like,

Lukas Petersson: (1:42:34) Mirrored. Okay. Okay. Yep.

Joshua Clymer: (1:42:36) Right? So structurally, everything still fits together if everything is flipped. Right?

Lukas Petersson: (1:42:41) Yeah.

Joshua Clymer: (1:42:42) It all works together, but it's just totally different. And as a result, these mirror bacteria are like alien bacteria. Like, they're an invasive species. No organism alive has immunities to them. And possibly, developing immunities is not feasible for a lot of these organisms. So if you did have such bacteria, they might spread like a slowly burning wildfire throughout the biosphere and just, like, eat up everything. That's the concern.

Lukas Petersson: (1:43:17) That is a concern, for sure. That is the understatement of the year.

Joshua Clymer: (1:43:23) Yeah. So these biologists said, okay, it's probably at least 10 years till we get mirror life. Some people I know who study these biorisks say, like, maybe the US government could develop mirror life like this in 3 years if they really wanted to, but they don't want to. So we're not that far from this technology. Even just, like, human institutions could develop it in a few years, according to some people I know.

Lukas Petersson: (1:43:56) Right. But it is very early. Like, how much uncertainty is there with

Joshua Clymer: (1:44:00) all of this? Oh, lots of uncertainty. Right. I mean, especially from my standpoint, as someone who is not a biologist.

Lukas Petersson: (1:44:10) But it's useful, like, just to imagine. Like, there are things like this, and maybe it's not exactly this, but something like this could potentially happen. And maybe there's just, like, a 5% chance that this is the thing, but there's probably a bunch of other 5 percenters, and then they add up.

Joshua Clymer: (1:44:29) Yeah. Yeah. I don't know if superintelligence is gonna build this kind of bioweapon. It's just an example.

Lukas Petersson: (1:44:33) Oh, just yeah. But it's useful.

Joshua Clymer: (1:44:35) Yeah. So you have the superintelligence. It's, like, as Dario has said in the past, like a country of geniuses in a data center. And additionally, they are running at 100 times speed, so everybody's moving like statues from their perspective. So, okay, it would take maybe a few years for the US government, if they really wanted to, to develop mirror life bioweapons. How long is it gonna take these things? Probably a lot less time. Yeah. One problem, though, is that there are a bunch of real-world bottlenecks, like experimentation bottlenecks. Like, experiments in biology can just take several days, sometimes weeks. So the models try to get around this by creating much better molecular simulations than any that exist today. Remember, they're superintelligent. So the first thing they do is create a molecular simulator that has, like, tons of analytically derived optimizations. And this is still, like, way not efficient enough to actually simulate anything useful. So then what they do is simulate small-scale molecular systems and train a neural network to predict the end states of these systems. So they're compressing these long serial steps of physics computation into, like, a neural network prediction. And they keep doing this, and they gradually ramp up the complexity of these molecular simulations over time, training ever more efficient neural networks. And at some point they have this molecular simulation that's quite efficient, and they can use it to, like, predict how proteins will form and interact with each other and things like that. And so now they're running a lot of their experiments in simulation.
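
[Editor's note: the iterative scheme Josh sketches, run a slow stepwise simulator, train a neural network to predict its end states, then ramp up the horizon, can be illustrated with a toy example of our own. Everything below, the damped-oscillator "physics," the network size, and the training schedule, is an invented stand-in for the molecular systems in the story, just to show the shape of the distillation loop.]

```python
# Toy sketch: distill a slow, stepwise simulator into a neural surrogate.
import numpy as np

rng = np.random.default_rng(0)

def simulate(state, n_steps, dt=0.01):
    """Slow 'physics': a damped oscillator integrated with many Euler steps."""
    x, v = state
    for _ in range(n_steps):
        a = -4.0 * x - 0.5 * v      # spring force plus damping
        x, v = x + dt * v, v + dt * a
    return np.array([x, v])

# Tiny MLP trained with plain SGD to predict the simulator's end state.
W1 = rng.normal(0, 0.5, (2, 32)); b1 = np.zeros(32)
W2 = rng.normal(0, 0.5, (32, 2)); b2 = np.zeros(2)

def forward(s):
    h = np.tanh(s @ W1 + b1)
    return h @ W2 + b2, h

def train(n_steps, iters=3000, lr=0.01):
    global W1, b1, W2, b2
    for _ in range(iters):
        s = rng.uniform(-1, 1, 2)          # random initial state
        target = simulate(s, n_steps)      # expensive stepwise rollout
        pred, h = forward(s)
        err = pred - target                # gradient of squared error
        gW2 = np.outer(h, err); gb2 = err
        gh = (err @ W2.T) * (1 - h**2)
        gW1 = np.outer(s, gh); gb1 = gh
        W2 -= lr * gW2; b2 -= lr * gb2
        W1 -= lr * gW1; b1 -= lr * gb1

# Curriculum: distill short rollouts first, then longer and longer ones.
for horizon in [10, 50, 200]:
    train(horizon)

test = np.array([0.8, -0.3])
print("simulator:", simulate(test, 200))
print("surrogate:", forward(test)[0])  # one cheap forward pass stands in for 200 steps
```

The point of the curriculum is the one Josh makes: each distilled network amortizes many serial physics steps into a single prediction, so later, harder systems can be trained against a cheaper teacher.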

Lukas Petersson: (1:46:23) Yeah. And this is a new comment now from my side, but this sounds like echoes of, like, AlphaFold, as, like, an existence proof of similar paths, I guess.

Joshua Clymer: (1:46:36) Yeah. I mean, AlphaFold relies a lot more on existing datasets. Right. So, I mean, a big problem if you're a computational biologist is that you just don't have a lot of data to work with. Yep. There are only so many proteins that we know the shape of, and so you're super data constrained. So that's why I was imagining, and of course it's all speculative, I was imagining that if you're a superintelligence, you just build your own simulations from scratch, because the datasets that humans have are too small. You need a more scalable approach.

Lukas Petersson: (1:47:10) Yeah. In your story, the company that actually does all of this, like, the main character of the story, is OpenAI for you, and the model that ultimately breaks out first is OpenAI's model. In past sci-fi-like stories like this, for example Max Tegmark's Life 3.0, they don't name any names, because it was so far in the future that you couldn't make predictions. Now we're starting to get close to actually being able to make these predictions, and you predict OpenAI. How confident are you that that is the one that will ultimately do it?

Joshua Clymer: (1:47:54) Sorry. How confident am I that OpenAI is going to win the race?

Lukas Petersson: (1:47:58) Yeah. Yeah. Or, win is a weird phrase here, but, like, yeah.

Joshua Clymer: (1:48:04) They'll develop human-competitive AI first. Yeah. I

Joshua Clymer: (1:48:12) this looks quite likely if you have very short timelines. Like, if you condition on having human-competitive AI by the end of the year, then I think it's quite likely. Can't give you any more details. Sorry.

Lukas Petersson: (1:48:27) I see. I see.

Joshua Clymer: (1:48:28) Longer than that, who knows, man? Lots of range.

Lukas Petersson: (1:48:32) I see. Yeah. And in terms of other companies, competitors to OpenAI, we have Anthropic, for example, which was famously created because they didn't think OpenAI took safety seriously enough. I think that's the lore, at least, what you read online. But that also led to, like, some race dynamics. Now Anthropic is the company that is creating the best model. I think Claude is the best base model at the moment, and it creates this, like, race dynamic. On net, do you think Anthropic has contributed more positively or negatively to AI safety?

Joshua Clymer: (1:49:12) I think Anthropic is amazing. I wish I was not a 22-year-old, so that I had enough time in my career to help found Anthropic. That's what I would be doing if I was born 5 years earlier. And the reason why I like Anthropic so much is, one, potentially a lot of AI safety comes down to just how much compute goes into it. Like, if you need to use AI agents to do tons of research and run experiments that are quite costly, that involve, for example, training lots of models to understand the circumstances under which they become deceptively aligned, then it's really helpful to have a big compute cluster.

Lukas Petersson: (1:49:55) Yep.

Joshua Clymer: (1:49:56) So that's reason number one. Reason number two is it's very helpful for governance to have some organizations that take the extreme risks seriously and are willing to say that. And Anthropic is kind of starting to say that a little bit. My understanding is that the people in charge do take these risks very seriously, and I expect that will pay off down the line. Like, if somebody needs to raise the alarm and say, hey, our AI hacked our data center, maybe Anthropic will take the L and admit that to the world, so that the world knows how crazy things are. I obviously don't know how Anthropic will behave. A big problem here is that I think responsible AI companies, if they're following the optimal strategies, or, like, the strategies that at least I've been thinking about, behave a lot like irresponsible companies in the early stages. Like, in both cases, you just want to go really fast and acquire lots of influence and compute. It's in the endgame where you start to see them diverge. So it's hard to say how Anthropic will behave in the endgame. But one thing that makes me optimistic is the amount of preparation that they're doing for safety. Like, they've employed quite a few people to work on safety that's not immediately useful to them for, like, commercial reasons, and I think that's a costly signal. That's, like, maybe the best costly signal that we can get for now.

Lukas Petersson: (1:51:52) Yeah. So that was the end of the story. Wait, if you like Anthropic so much, why don't you work at a big lab?

Joshua Clymer: (1:52:07) I like writing things. I don't like

Lukas Petersson: (1:52:10) As we know.

Joshua Clymer: (1:52:10) I don't think I could have published a story like this if I was in a lab. I'd have to run it by lots of people. And, I don't know, I probably just wouldn't. I wouldn't go on podcasts like this if I was working at an AI company.

Lukas Petersson: (1:52:23) Yeah. What do you think the value in that is?

Joshua Clymer: (1:52:29) I just have a lot of thinking I wanna do. I think that there are high returns to thinking. Like, I spend most of my time thinking about what we're gonna do in this window, right, in this handoff window. Like, what's the crunch-time plan? And, like, what are all the safety mitigations we could implement, and the arguments that work, like the methodologies for testing them? And I really wanna just get all these details down, and that involves doing a lot of writing in Google Docs. I don't think that if I was working for an AI company, they would let me do that most of my time.

Lukas Petersson: (1:53:07) I mean, surely they would allow you to write. They maybe just wouldn't allow you to publish.

Joshua Clymer: (1:53:12) Well, I mean, I'd also just be writing. I'd have a job, you know? I'd be, like, writing code. And yeah, my current employer is just pretty okay with me writing Google Docs about safety evaluation methodologies. Right. And so I'm just gonna do lots of thinking outside the AI companies, and then probably I'll join one. I'll try to join one once I feel like I've done enough thinking, or I've run out of time.

Lukas Petersson: (1:53:46) Right. Do you have any opinions on Ilya Sutskever's thing, where, like, the joke is that they are, like, Anthropic but even more safe? I haven't heard that much about them. Do you know anything about them?

Joshua Clymer: (1:54:02) I don't know much about them at all.

Lukas Petersson: (1:54:04) Is that concerning that we don't know anything about them?

Joshua Clymer: (1:54:07) I just don't have a take here. Okay, maybe that's a take too. I don't really expect them to be an important player in the near term. Like, an AI company just requires so much capital, so much infrastructure, so much know-how and schlep. I think it's just hard to start one quickly and be competitive. Like, I just expect it will take more than a year. So I'm not thinking a lot about this thing at the moment.

Lukas Petersson: (1:54:38) Do you think the US will create some kind of, like, Manhattan Project for AI, or nationalize the labs, or something like that? And if that happens, would you join the project if they asked you to?

Joshua Clymer: (1:54:52) I would join the project if they asked me to. Like, that's my main route to impact right now: being the person who's thought a lot about this stuff, or one of the people who's thought a lot about this stuff. Yeah. And so I need to end up eventually actually touching the AI agents and, you know, like, implementing all the ideas I've been thinking about and researching.

Lukas Petersson: (1:55:17) Yep.

Joshua Clymer: (1:55:17) So, yep, I am definitely hoping that I am involved in a project like that. Do I think it's likely? No, not really. I think companies aren't really set up right now to collaborate. Like, they just have very different infrastructure, and it's really weird for companies to collaborate. My guess is that the status quo will just be maintained, at least under short timelines. It's possible that at some point, like, AI agents are doing a lot of the work, and they can make a lot of this collaboration way easier. And then the US government is like, okay, guys, you're allowed to just do one big training run with all the US data centers. But at that point, I'm like, okay, humans are maybe not the people doing most of the work.

Lukas Petersson: (1:56:05) I see. Yeah. Let's assume it goes well. The year is 2035, and we live in, like, the post-AGI utopia. There was no misalignment. Your story did not play out that way, and we had no issues with any Claude or GPT-6 or whatever. Looking back at this AI utopia and what led to it, what is the thing that you would be most proud of having accomplished to get to this point? Hypothetically, of course, because the year is actually not 2035.

Joshua Clymer: Like, my own accomplishments?

Lukas Petersson: Josh's accomplishments, yes.

Joshua Clymer: (1:56:44) Accomplishments. Okay. And they have to have already happened, presumably? They're not hypothetical future things where I save the world?

Lukas Petersson: (1:56:51) No. They happened, hypothetically. Like, we're in 2035. So, something between now and 2035 that you did that you're proud of, after all of this mess?

Joshua Clymer: (1:57:08) I think I might have some chance at just, like, writing the state-of-the-art materials on how we can pass the buck to AI in a safe way. Like, if timelines are very short, then, like, 6 months from now, maybe the best resources on the planet for figuring out what the heck to do with all this AI assistance will be, like, blog posts that I wrote. That's what I'm hoping. And at that point, I'll try to format it into something like a coherent document and pass it around to everybody who's working at AI companies, so that I've done a lot of their thinking for them. Because I think people will become increasingly bottlenecked on thinking once AI agents are doing most of the engineering. They'll be seeing these experimental results from AI systems about, like, how well probes work or something, and there are just a bunch of questions to work through. Like, okay, wait a second, how do we know if model organisms are analogous? Right? And it'll be nice to just have a document that's very carefully written, where somebody has spent a lot of time thinking about those questions already. So that's my main plan for impact right now, and I'm hoping that goes well.

Lukas Petersson: (1:58:34) Yeah. I definitely do too. One last question. You opened the story with, I'm not a natural doomer. And I think a lot of us aren't, but, like, the facts are on the ground now, and it's not super comfortable. How do you feel, not being a natural doomer, writing, I mean, let's be honest, a quite doomy story? How is your emotional state in all of that?

Joshua Clymer: (1:59:05) I think writing the story did make me feel the urgency and the reality of it more than I had before. In particular, it made me realize that, like, people I actually care about might not live through this. So while I was writing it, I actually realized, like, oh, shoot. I need to, like, get myself into a bio shelter. I need to get my family into a bio shelter. Yeah. And so I, like, searched around for people who might be building bio shelters, found a guy who was building bio shelters. So now I'm buying a bio shelter. So that's one thing that can happen when you write a story. And, yeah, it's just stressful. I don't know. Because, like, these bio shelters are expensive. I don't have enough money to buy two of them. So I'm, like, trying to figure out how to get my family into a bio shelter, and it's just like, what the heck? Why did I end up here? I feel like, well, 2 years ago, I was just like, I'm gonna do some math. And I was like, oh, maybe I'll do startup things. Well, I don't know. It just feels like I'm in some kind of crazy sci-fi world where suddenly I feel like I'm counting my last dollar so I can buy a bio shelter for my family. I'm like, what?

Lukas Petersson: (2:00:35) Yeah. It's insane. It is insane. I hope you're wrong about all of this.

Joshua Clymer: (2:00:43) I might be wrong, to be clear. This is the bimodal trajectory.

Lukas Petersson: (2:00:46) You stated that clearly in the beginning.

Joshua Clymer: (2:00:48) So

Lukas Petersson: (2:00:48) Yeah. So I guess I hope that we don't see what you speculated in, just a 40% chance.

Joshua Clymer: (2:00:57) That's right. That I'm wrong about the plausibility of such a scenario. Yeah. I agree with that.

Lukas Petersson: (2:01:04) Yeah. Yeah. Thank you so much for speaking with me today. Have a good day, and see you around. Yep. Bye.

Nathan Labenz: (2:01:13) It is both energizing and enlightening to hear why people listen and learn what they value about the show. So please don't hesitate to reach out via email at tcr@turpentine.co, or you can DM me on the social media platform of your choice.
