In this special AMA episode, Nathan answers questions posed by The Cognitive Revolution podcast listeners.
Watch Episode Here
Read Episode Description
In this special AMA episode, Nathan answers questions posed by The Cognitive Revolution podcast listeners. He discusses AI developments in 2024, including OpenAI's o3 announcement, deliberative alignment, and the future of AI technology. It is an insightful discussion about AI's impact on education, coding careers, and business sustainability in an AI-driven world.
Check out http://aipodcast.ing for AI-powered podcast production services or reach out to Adithyan (https://www.linkedin.com/in/ad...) for more information.
Help shape our show by taking our quick listener survey at https://bit.ly/TurpentinePulse
SPONSORS:
Oracle Cloud Infrastructure (OCI): Oracle's next-generation cloud platform delivers blazing-fast AI and ML performance, at 50% less cost for compute and 80% less for outbound networking compared to other cloud providers. OCI powers industry leaders like Vodafone and Thomson Reuters with secure infrastructure and application development capabilities. New U.S. customers can get their cloud bill cut in half by switching to OCI before March 31, 2024 at https://oracle.com/cognitive
80,000 Hours: 80,000 Hours is dedicated to helping you find a fulfilling career that makes a difference. With nearly a decade of research, they offer in-depth material on AI risks, AI policy, and AI safety research. Explore their articles, career reviews, and a podcast featuring experts like Anthropic CEO Dario Amodei. Everything is free, including their Career Guide. Visit https://80000hours.org/cogniti... to start making a meaningful impact today.
NetSuite: Over 41,000 businesses trust NetSuite by Oracle, the #1 cloud ERP, to future-proof their operations. With a unified platform for accounting, financial management, inventory, and HR, NetSuite provides real-time insights and forecasting to help you make quick, informed decisions. Whether you're earning millions or hundreds of millions, NetSuite empowers you to tackle challenges and seize opportunities. Download the free CFO's guide to AI and machine learning at https://netsuite.com/cognitive
CHAPTERS:
(00:00:00) Teaser
(00:01:00) AI Podcasting
(00:03:19) o3 vs. Other Models
(00:10:25) o3 Breaks Benchmarks (Part 1)
(00:14:19) Sponsors: Oracle Cloud Infrastructure (OCI) | 80,000 Hours
(00:16:59) o3 Breaks Benchmarks (Part 2)
(00:27:51) OpenAI's Safety Plan (Part 1)
(00:28:45) Sponsors: NetSuite
(00:30:18) OpenAI's Safety Plan (Part 2)
(00:39:08) Safety & Governance
(00:50:48) Tale of the Tape
(00:59:38) Underutilized Potential
(01:05:07) RAG & State Space
(01:18:14) Agentic Frameworks
(01:29:55) AI & Education
(01:35:59) Learn to Code?
(01:43:00) Defensible Moats?
(01:53:53) UBI & Wealth
(01:59:08) Contributing to AI Safety
(02:04:25) Outro
SOCIAL LINKS:
Website: https://www.cognitiverevolutio...
Twitter (Podcast): https://x.com/cogrev_podcast
Twitter (Nathan): https://x.com/labenz
LinkedIn: https://www.linkedin.com/in/na...
Youtube: https://www.youtube.com/@Cogni...
Apple: https://podcasts.apple.com/de/...
Spotify: https://open.spotify.com/show/...
Full Transcript
Nathan Labenz: (0:00) Even with these reasoning models, they are still weird. They are still subject to sort of cached heuristics. They're subject to bad perception. They're subject to just, like, straight up bad reasoning, and you can expose that in simple toy examples. We hope the underlying reality is such that these things are, like, pretty manageable, because we're not, you know, taking the level of caution that we would need if they're not manageable. In 2025, we'll probably start to see a lot more of these things, like the famous AlphaGo move where it looks like a mistake at first, but it actually turns out to work. Or some of these, like, deceptive scheming behaviors. You know, AIs accomplishing the goals they've been given in strange ways. Do what you wanna do. Don't spend years of your life on a bet that, you know, in some sense implies that the AIs, like, aren't gonna dominate the programming profession in the next few years, because it seems plausible enough that they might. Welcome to the Cognitive Revolution.
Adi: (1:04) Thanks for having me, Nathan.
Nathan Labenz: (1:06) Thanks for being here. I'm excited to do this. So this is the AMA, and so you're making your first appearance in front of the camera after a lot of effort over the course of 2024 behind the scenes. So I've mentioned you guys a couple times on the show and a few times on Twitter, but for anyone who hasn't caught those mentions, Adi and his partner, Sai, are the founders of a company called AI Podcasting. Aipodcast.ing is the website, and we've been working together for most of this year to produce the show. It's been a lot of fun. I appreciate all the effort, and thanks for coming on today to drive this AMA episode.
Adi: (1:44) Thanks to you for being our first client and making AI podcasting happen.
Nathan Labenz: (1:49) My pleasure. You know, my experience obviously with Turpentine has been great in many respects; they kind of took all this stuff off my plate at the beginning, sponsorship sales, editing, whatever. And the 1 thing I always thought was, I wish we were using AI more to actually produce the show, you know, and create some sort of leverage. So the opportunity to work with folks who had listened a lot and who have a real interest in AI and are interested in kind of creating these processes was exciting. And, you know, that's where all the clips have come from. If you've seen clips on Twitter and, you know, all the shorts that we're doing on YouTube, definitely still a lot more that we can do, but it has been pretty amazing how much we have been able to do on a pretty small budget, especially considering we're putting out 8 episodes a month. I know I'm keeping you guys pretty busy with the pace of content.
Adi: (2:39) No. I mean, I should also mention, thank you for saying that. Just to rewind a bit, like, when we were doing this, we were not sure that we would actually pull off the complete post production for the podcast, but you came in and said, hey, look, this is our pain point, and why don't you kind of apply AI everywhere? And, yeah, I think we have done well, but I think we could do better, as you said. Looking forward to what we have in 2025.
Nathan Labenz: (3:02) Always, always opportunities. But for now, we've put up these questions, mentioned them on a few episodes, put it out on Twitter, you put it on YouTube. We've got a pretty good response with quite a few questions spanning a lot of different areas. Take it away and try me on all these questions.
Adi: (3:19) Absolutely. I'll try to be the voice of our audience here. So we have lots of questions. So if I did miss some, please feel free to drop them as YouTube comments. We will try to address them there. Nevertheless, we will get started now. The way we are going to do this is via sections, but we thought we would start with something that's been trending in the headlines recently. A user would like to know, what do you think of o3? And a related question, what do you think about OpenAI's new safety plan called deliberative alignment? Obviously,
Nathan Labenz: (3:53) the o3 announcement, not yet a release, caught everybody's attention and has had people kind of, you know, using their spare holiday cycles to try to figure out what it means. First of all, it is important to keep in mind, there's a lot that we don't know. Right? Nobody has seen the model yet. I have applied to the safety review program, and, you know, I think it's cool that they put out that open invitation. But, you know, basically, what we've seen so far is a sort of smattering of results across a number of different benchmarks. And those are super impressive, but that's limited information. Like, the sources on this are not as authoritative or as clean as you might wish that they were if you really wanna have a high confidence answer. Nevertheless, this is what we have, so we'll try to make sense of it. For starters, I think it is useful to go use some other reasoning models. OpenAI is currently unique as far as I know in terms of having a reasoning model that does not show its chain of thought in its response to you. Everybody else so far who has put 1 out, as far as I know, is sharing the chain of thought. That includes Google with their Flash 2.0 experimental thinking version. It also includes DeepSeek, which has a reasoning model. There's another Chinese 1 as well, although I haven't really used that 1. But I have gone and used the Flash thinking version and the DeepSeek model. It's interesting to read these long chains of thought and try to get a little bit of a better sense for what the models are thinking. It has been a weird experience. I think that, you know, the headline that LLMs are weird remains very much in effect. All of the models that I've tried, and this includes o1 pro, have failed on what you would think would be a pretty amenable problem for a reasoning model. You know, certainly somebody who can reason in general purpose terms, you would think, would be able to handle this challenge of simply: here is a tic tac toe board. I basically just put an x in 1 corner of the tic tac toe board, an o in the corner immediately below that, and said, x went first. It's x's turn. Assuming optimal play, is it possible to determine who the winner will be? Every model got this 1 wrong, to my surprise. The answer is x can win. You can force a fork and you can win. So if you have optimal play from that position, x will win. The responses that I've got from OpenAI models, from Claude, from the latest Geminis, even leaving aside the reasoning, are basically all wrong. They seem to be very much anchored in the commonly repeated statement that tic tac toe is a solved game. With optimal play, it will always be a tie. So you see a lot of responses like that. And, you know, that's like not reasoning. Right? That's just sort of your cached heuristic. You could even say that's sort of stochastic parrot mode, because it latched on to a couple tokens, and it's seen that statement a lot in the training data. I've also seen surprising problems where the perception is the weird thing, where, for example, in 1 case, 1 of the models just didn't read the board right and started with the wrong board state, an x in the corner and an o in the middle bottom as opposed to the bottom corner. Obviously, you're not gonna do very well from there if you are starting with the wrong board state. Then perhaps most revealingly, the reasoning in both the Flash 2.0 and DeepSeek was pretty bad. Outright nonsense in a lot of cases. Things where it's trying to do this sort of rollout.
You know, it's trying to kind of go down these branches and then come back to the top. So it would sort of do things like, okay, well, let's say x goes in the middle. Then what will o do? But it would often just fail to, like, and again, I had said optimal play. So it would often get to a point where 1 of the players could win just by, like, completing the 3 in a row, and it would fail to do that in its analysis of the situation. It did not know what optimal play was for, like, a great many board states. You are reading through, you know, tens of thousands of tokens of reasoning about the situation, and you're just like, man, a lot of this reasoning is pretty bad. So I think there's, like, definitely not a panacea here. And just because somebody comes out with a reasoning model doesn't mean it's necessarily going to be an effective reasoner. At the same time, with other kinds of problems, I've seen, like, remarkable results. So those same 2 models, Gemini 2.0 Flash Thinking and DeepSeek, I asked them both for suggestions for elaborations of neural network architectures that would take inspiration from some sort of biological system. Thinking there about the AE Studio episode we did, which I thought was, you know, excellent in terms of really thoughtfully picking something out of the biological world and trying to make, obviously, not a direct imitation of it, but something that sort of captured the key concept of how these biological systems were understood to work. And I was quite impressed. That was a little bit less reasoning, a little bit more riffing on ideas. And I'm sure, you know, plenty of those ideas are, like, not gonna work and, you know, flawed in whatever ways. But it does go to show that even with these reasoning models, they are still weird. They are still subject to sort of cached heuristics. They're subject to bad perception. They're subject to just, like, straight up bad reasoning. And you can expose that in simple toy examples. And at the same time, you get things that are remarkably impressive with, you know, a question that would seem, at least to a human, like a lot harder. Right? You would expect more people to handle the tic tac toe question versus the bio inspiration for neural network architectures question, and yet it's totally the reverse for the reasoning models. So that's weird. We don't know what's going on, obviously, in the OpenAI chain of thought. I think it's probably likely that it's higher quality, but how much higher quality is not clear.
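For readers who want to check the tic tac toe claim themselves, here is a minimal brute-force sketch (not from the episode; the board encoding and function names are my own) that searches the position Nathan describes, an x in one corner, an o in the corner directly below it, and x to move, and confirms that optimal play from there is an x win rather than the "solved game, always a tie" answer the models keep reaching for:

```python
# Minimax over the full game tree from the position described above.
# Board squares are indexed 0-8, left to right, top to bottom.
from functools import lru_cache

LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),   # rows
         (0, 3, 6), (1, 4, 7), (2, 5, 8),   # columns
         (0, 4, 8), (2, 4, 6)]              # diagonals

def winner(board):
    for a, b, c in LINES:
        if board[a] != "." and board[a] == board[b] == board[c]:
            return board[a]
    return None

@lru_cache(maxsize=None)
def value(board, to_move):
    """Game value under optimal play: +1 means x wins, 0 a draw, -1 o wins."""
    w = winner(board)
    if w == "X":
        return 1
    if w == "O":
        return -1
    if "." not in board:
        return 0
    children = [value(board[:i] + to_move + board[i + 1:], "O" if to_move == "X" else "X")
                for i, cell in enumerate(board) if cell == "."]
    return max(children) if to_move == "X" else min(children)

# x in the top-left corner (0), o in the bottom-left corner (6), x to move.
print(value("X.." "..." "O..", "X"))  # prints 1: x can force a win, as stated above
```

One concrete winning line: x takes the opposite corner along the diagonal, o is forced to block in the center, and x's next corner move creates a double threat that o cannot cover.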
Adi: (10:15) That's interesting that you'd say that. But nonetheless, man, I did watch the livestream. I mean, I do use o1 pro. I was not as impressed, for the $200 that I pay. A very similar experience to that. But, nevertheless, when I saw the livestream on the twelfth day and they showed the ARC-AGI result, right? In my head, even when we did a couple of episodes with the ARC Prize cofounder, my impression was it was at least 2 years out, breaking the ARC-AGI benchmark, no matter how much compute you throw at it. But now it seems like o3 has kind of broken it, almost. Yeah. That should mean something. Right? So what do you make of that? Like, o1 doesn't quite get there, but then o3 has kind of already done it. So
Nathan Labenz: (10:58) Yeah. I mean, it seems there's definitely something here, and it seems like there's a couple dimensions of improvement. 1 is they're doing reinforcement learning at scale on these reasoning models. They said they are not doing reinforcement learning on the actual chain of thought, but only on the response. So in other words, when they feed an output to a reward model to get the score used to power updates to the weights, that output scored does not include those intermediate tokens. It only includes the final thing the user sees. Their hope is it won't create weird incentives for the chain of thought. The chain of thought is free, unencumbered by these pressures, and therefore, it can sort of reason in a very natural way. If it is being deceptive, then hopefully, it will continue to make that, like, visible to their monitoring systems, which they're developing, and control it somehow. I do think the ARC-AGI blog post on this is probably the best source of concrete information that we can break down. I think, you know, it's important they worked directly with this independent group. You know, I don't think this is faked. So in terms of, like, you know, people being very skeptical in general, I think there's something quite real here. I think that's safe to say. There's been some discussion or sort of debate around, okay, well, they trained on the training set. Is that allowed or not allowed? I think for practical comparison purposes, everybody that's been competing in the challenge has been training on the training set. So they're not, like, alone in that by any means. And I think their prompt also was an interesting revelation. It basically just said, yeah, identify the pattern and apply it to the final example and give us your output, that's it. You know? No detailed instructions. Like, super, super simple. So that was an incredible flex. And then the results: in the o3 low effort setting, they got 75%. And that is blowing basically everything else away. It's roughly at kind of human level. They put it a little bit higher than your average Mechanical Turk respondent. It's maybe a little, you know, more motivated or more savvy people, but it's, you know, getting up there in the level of human performance. And it's a good 15 points clear of everything else that we've seen to the present. So that's like a pretty substantial leap. And then there's yet another leap with the high compute setting that gets them up into the high eighties. 87.5%. That's like above human. And basically, you start to feel like, okay, they've solved this puzzle. Well, it's maybe not fully solved, or it's solved in still sort of a weird way, because there are some that seem like easy to solve that it isn't able to solve. And I think the people involved with the ARC-AGI project have been, I think, appropriately giving credit where it's due. They're like, this is a big deal. It is a real advance, and we are gonna need to really study these capabilities. At the same time, Francois said that it is still possible to make things that are easy for humans that are hard for even these systems. And in his mind, it won't be true AGI until that's basically no longer possible at all. Hey. We'll continue our interview in a moment after a word from our sponsors.
Ad (14:24)
It is an interesting time for business. Tariff and trade policies are dynamic, supply chains squeezed, and cash flow tighter than ever. If your business can't adapt in real time, you are in a world of hurt. You need total visibility from global shipments to tariff impacts to real time cash flow, and that's NetSuite by Oracle, your AI powered business management suite trusted by over 42,000 businesses. NetSuite is the number 1 cloud ERP for many reasons. It brings accounting, financial management, inventory, and HR altogether into 1 suite. That gives you 1 source of truth, giving you visibility and the control you need to make quick decisions. And with real time forecasting, you're peering into the future with actionable data. Plus with AI embedded throughout, you can automate a lot of those everyday tasks, letting your teams stay strategic. NetSuite helps you know what's stuck, what it's costing you, and how to pivot fast. Because in the AI era, there is nothing more important than speed of execution. It's 1 system, giving you full control and the ability to tame the chaos.
That is NetSuite by Oracle. If your revenues are at least in the 7 figures, download the free ebook, Navigating Global Trade: 3 Insights for Leaders, at netsuite.com/cognitive. That's netsuite.com/cognitive.
Nathan Labenz: (15:48) I guess for AGI in his mind, not to say that that's necessarily a move, but it's always a little bit of a move. Right? There's ARC-AGI 1, and there's gonna be ARC-AGI 2 coming. So as always, there's a lot of nuance in these things. I thought 1 thing that was really interesting was just the number of samples, the number of tokens, the cost that translates to, and the time that it was able to run in, and sort of what all that implies. As far as I know, the o1 models are just doing single rollout chain of thought and then give you the answer. I'm pretty sure that's how o1 mini and o1 are working. There's been some speculation online that o1 pro is maybe multiple o1s running in parallel and then taking the best answer. Maybe. I'm not sure. Maybe I haven't quite unlocked it yet, but I have not felt that it's really all that much better than just normal o1. Yep. And I keep using it because I sort of feel like that's probably a skill issue on my end, but I candidly have not really gotten much from it that is better than the normal o1 as far as I can tell. But there's been the speculation that maybe it's multiple rollouts and then some sort of decision at the end as to how to aggregate that. I don't know if that's true. I kind of guess not, but for o3 it is clear, they just stated as much. So the low effort setting, what that translates to is 6 samples per task. And they kind of report this in aggregate in the blog post. But, basically, they've got 100 tasks, and they total up 33,000,000 tokens across all of these tasks. That obviously implies something like 300,000 tokens per task. And then with 6 samples, that would suggest something like 50,000 tokens per sample. And then they say that that costs basically $20 per task, which basically lines up with the o1 pricing. The o1 pricing is $60 per million output tokens. And so if you had 50,000, you're at 5% of that. So that would be $3, you know, per 50,000 tokens, which all kind of coheres. Right? If you have 6 samples and they're 50,000 ish each and you get to $20 total cost, that would suggest that basically the cost is thought to be, for these purposes, considered to be the same as o1, but you're using 6 wide. Interestingly, they also report that the time per task is just 1.3 minutes. So now we're like, jeez, are they really going north of 50,000 tokens per minute? That would be like 1,000 tokens per second. You're usually not seeing 1,000 tokens per second from the OpenAI APIs. A couple hundred, you know, is much more in line with what I typically see. And even 4o mini is actually not faster in my experience than 4o. It's cheaper, of course, but in terms of tokens per second, it does not seem to be all that much faster. So maybe this was happening on, like, dedicated hardware. I kind of don't expect that we're gonna see 1,000 tokens per second as retail customers in the immediate future. But, you know, if they're saying, hey, we'll carve out special compute for this purpose, then they can maybe juice it up to that level. That's pretty interesting. I think these $20 per task, there's a lot of things that are worth paying $20 for. You know, if you could do a good job, this can open up a lot of new use cases, I would think. The high effort thing gets even a little more interesting yet, because that's reported to be 1,000 samples, precisely 1,024 samples. They report 5,700,000,000 tokens generated there. So, again, now you're at just simple division.
You're at something like 50,000,000 tokens per task, which is, again, 50,000 tokens per individual rollout. So 5,000,000,000-plus tokens total, divided by 100, is roughly 50,000,000 per task, divided by 1,000-some samples is roughly 50,000 per rollout. So okay. That's interesting. Now they report that that happens in 13 minutes. So there's, like, some interesting implications there, I think, for what exactly is happening behind the scenes. 1 really big question I have about all of this is, okay, if you are doing 6 or 1,000 samples on the same question, how are you choosing the right answer? And how generalizable is that strategy? Like, I don't think you could take 6 rollouts of o1 and get to the same level of performance. On 1 dimension, it seems pretty clear that there's just better reasoning coming from the o3 model, full stop. Like, the reinforcement learning has continued to work. It's continuing to get better at its core, you know, single rollout reasoning, analyzing and coming to an answer. But now if you have 6, how do you choose the answer? Well, a classic in the space for a while has been to just take, like, a majority vote. You know, if it's a math problem and there is a single answer, then have multiple rollouts. If you get, you know, 3 that say the same thing and the other 3 each say something different, go with the 1, you know, that has the most votes. Okay. It could be as simple as that. Another version would be to have some, you know, additional call at the end that sort of says, like, which of these is better. Right? So if there's not a single answer, like, if you're trying to write something, for example, of course, you can't say, okay, here's my, you know, paragraph or my page of text. It's not gonna be exactly the same as any of the other ones. They're all gonna be different. So how do you choose what's the best? So there's been, you know, a lot of different techniques in terms of, like, feed them all into a model and ask which one's the best, or do, you know, pairwise comparisons, or do, like, round robin. You know, there's all these sorts of tournament style, you know, schemes for having language models judge themselves. Those are, like, somewhat viable, but not super viable or, you know, not super accurate, let's say. They're viable, but, you know, are they really working super well? Not always, I would say. But it does seem like something like that has gotta be happening here, because if it was purely parallel, and this could also be, you know, hardware constraints, whatever. Right? I'm definitely kind of trying to fill in gaps with what information we have. But if it was a purely parallel thing, then with a very simple resolution mechanism at the end, those thousand samples could happen just as fast as the 6 samples. Right? There's no reason it would take 10 times longer to do the 200 times as many samples. So why is it 10 times longer? Is there some sort of aggregation step that is like a sequential process, where maybe it's generating 1,000 on the first minute and then running a 10 round tournament or something like that? If you did say, okay, now I've got 1,000 candidates, and I want to pick the best 1, and I'll do it, you know, tournament style, just head to head, single loss elimination or whatever. And those calls took a minute. Right? In the first minute, it'd go down to 500, then to 250, then to 128, then to 64, and all the way down. Finally, it would take you basically 10 minutes.
Let's say at least 10, you know, rounds of the tournament to get down to a single winner. So it seems like something like that is probably going on, but we don't know the nature of that mechanism. And I think that is going to be a really important question for figuring out how generalizable this is going to be. People have commented already pretty widely that we should expect that the reinforcement learning is gonna continue to work, and basically, the AIs are gonna run away with anything where there is already easily accessible reward signal. So anything where it's, you know, where it's verifiable. Did you get the correct number on this question or not? If so, you know, reward that. This is where superhuman performance comes from. Right? This is where AlphaGo got to be so good at playing Go that it made famous moves that surprised our best grandmasters. So I think it seems pretty safe to say that that will happen in these, like, easily verified domains. Math and, you know, programming is probably like that to a pretty significant degree. And then the question becomes, like, how many other things are like that? And, you know, can you get a signal like that for writing? It's a lot harder. That's for sure. So there's sort of an expected divergence between these things that are verifiable and things that are much harder to verify or subject to taste or what have you. But I think just how much that divergence holds might depend a lot on what this consensus finding mechanism is and how, like, robust it is. You could imagine a scenario where the reason it took 10 times as long is because they basically said, alright, well, we're only gonna parallelize it up to x wide. And so, you know, these things take 1 minute, but we'll, like, max it out at 100 at a time. We'll just run 1,000 in 10, you know, sequential steps, 100 each, and then we'll take, like, some very simple resolution at the end that's fast, and that could get you to roughly the 10 minutes. Or you could have something where it's, you know, generated 1,000 wide at once, and then there's some, like, longer, more compute intensive aggregation mechanism. I think that is, like, probably my biggest question right now. What exactly have they figured out there, and how much will that apply to other things? If they have figured out a way to choose the best response in areas where there's not, like, a super obvious, like, yes, this is right or wrong, then I think we really could be talking about a huge, huge breakthrough. But they really haven't said much at all about that so far. Everything that they've said, as far as I know, is about stuff that is pretty verifiable. You know, you've got these coding tasks. You've got the Frontier Math. There's been some interesting discourse there too around, like, to what degree are these Frontier Math questions being answered the right way for the right reasons? And to what degree might they be answered correctly, but not for necessarily the right reasons? Of course, the Frontier Math people tried to design a test that you couldn't luck your way into getting the right answers on. So I think we have a pretty good reason to think it's probably doing the math. But I will say, going back to my tic tac toe example, there was at least 1 of the models that gave the right answer at the end based on totally garbage reasoning. So I have seen at least a little bit of a glimpse of this sort of thing where, you know, at the final paragraph, I was like, oh, you know, huge.
Like, this 1 got it. You know? Because the final answer was like, yes, if you assume optimal play from here, then x can force a win, whatever. And I was like, oh, you know, I'm impressed. Let me go now look at the chain of thought. And it was totally garbage. But it still landed on that answer, though not in a way that I trust, and this maybe could be what is happening with Frontier Math as well, but it's hard to know. Hey. We'll continue our interview in a moment after a word from our sponsors.
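As a quick check on the back-of-envelope division above, here is a tiny worked sketch. The only inputs are the publicly quoted aggregates (100 tasks; 33 million tokens and 6 samples per task on the low setting; 5.7 billion tokens and 1,024 samples on the high setting; $60 per million output tokens); everything else is just the division Nathan walks through:

```python
# Reproducing the per-task / per-rollout arithmetic discussed above.
tasks = 100
o1_price_per_million_output = 60.0  # dollars per 1M output tokens, as quoted

# Low-effort setting: 6 samples per task, ~33M total tokens reported.
low_per_task = 33_000_000 / tasks          # ~330,000 tokens per task
low_per_sample = low_per_task / 6          # ~55,000 tokens per rollout
low_cost_per_task = low_per_task / 1_000_000 * o1_price_per_million_output  # ~$20

# High-effort setting: 1,024 samples per task, ~5.7B total tokens reported.
high_per_task = 5_700_000_000 / tasks      # ~57,000,000 tokens per task
high_per_sample = high_per_task / 1024     # ~56,000 tokens per rollout

print(f"low:  {low_per_task:,.0f} tok/task, {low_per_sample:,.0f} tok/rollout, ~${low_cost_per_task:.0f}/task")
print(f"high: {high_per_task:,.0f} tok/task, {high_per_sample:,.0f} tok/rollout")
```

The per-rollout figure coming out roughly the same in both settings is consistent with Nathan's read that the high-effort run is many more samples of similar length, rather than much longer individual chains.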
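And on the open question of how you pick a single answer from 6 or 1,024 rollouts, here is a minimal sketch of the two families of strategies mentioned above: exact-match majority voting, which only works when answers are directly comparable, and a single-elimination tournament driven by some judge call. The judge below is a toy stand-in (nothing OpenAI has described); the structure just shows why a tournament over 1,024 candidates implies roughly 10 sequential rounds, one possible reading of the 1.3 minute versus 13 minute gap:

```python
# Two ways to aggregate many sampled answers into one, per the discussion above.
from collections import Counter
from typing import Callable, List

def majority_vote(answers: List[str]) -> str:
    """Fine for verifiable, exact-match outputs (math answers, grids, multiple choice)."""
    return Counter(answers).most_common(1)[0][0]

def tournament(candidates: List[str], judge: Callable[[str, str], str]) -> str:
    """Single elimination: the field halves each round, so 1,024 -> 1 in about 10 rounds.
    If each round is one (parallel) batch of judge calls, wall-clock time grows with the
    number of rounds, not the number of candidates."""
    while len(candidates) > 1:
        nxt = [judge(candidates[i], candidates[i + 1])
               for i in range(0, len(candidates) - 1, 2)]
        if len(candidates) % 2 == 1:   # odd candidate out gets a bye
            nxt.append(candidates[-1])
        candidates = nxt
    return candidates[0]

# Toy judge: prefer the longer answer. A real system would call a model here,
# e.g. "which of these two responses better satisfies the prompt?"
toy_judge = lambda a, b: a if len(a) >= len(b) else b

samples = ["42", "42", "41", "42", "43", "41"]
print(majority_vote(samples))           # -> 42
print(tournament(samples, toy_judge))   # -> whichever candidate survives the bracket
```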
Adi: (27:39) So, okay, cool. It seems like these reasoning models are getting better. Have we hit the wall? That question is not settled, but at least it's clear that these models keep continuing to get better. I think that nicely follows up to the next question that somebody asked, which is also about safety. Yes, these models are getting better. How do we keep them safe? So they asked, what about OpenAI's new safety plan, which they call deliberative alignment? What's your view of it?
Nathan Labenz: (28:08) I think I'm still sorting out how I feel about it, but I did read the paper end to end. It was a strange way in which they presented these things, because they went on the livestream and, you know, blew up social media with new high watermarks on all these benchmarks. Amazing. And then they were like, and by the way, we also have a, you know, new safety plan associated with this. And that was almost all they said in the video, and it didn't get much of the airtime broadly that they had this new safety plan. But then when you go to their website, there's not really much of a mention of o3 other than there's this new deliberative alignment paper, and it's kind of mentioned in there. So I guess for starters, like, what is it? It is a very constitutional approach where basically they say they take this, I think, purely helpful model, which, you know, for folks who wanna know what a purely helpful model is like, you can refer back to my GPT-4 red teaming discussions. But basically, a purely helpful model will do anything that you ask. Right? It has been trained purely on the sort of reinforcement learning, how to be helpful to humans, to get a high score from the user. And that makes it, in some ways, even, like, a little bit more useful than the versions that we tend to see. Although they've definitely improved on this. But, you know, it will never refuse. No matter what you ask, it will do it. Even if illegal, whatever. It has no sort of baked in moral compass other than do whatever is going to satisfy the user and get me a high score right now. That's, as I understand it, the kind of model that they start with. And then they give it a full policy. And it sounds like their policies are getting pretty long, potentially even long enough that the policy doesn't fit into the context window that they have available even in these latest models. So, ideally, they would maybe give it the full policy, but the full policy is too long, so they have sort of subsections of the policy where they'll basically, you know, spell out in a ton of detail. And it kinda reminds me of reporting on what Facebook's moderation policies have been like over time, where they had at some point scaled up to more than 10,000 people globally, and the policy has become, you know, hundreds of pages. And it goes down to these, like, very sort of minutiae edge cases. 1 that I recall, because my wife was posting pictures of herself with the babies over the last few years, is: on Facebook, you can't post a picture of a woman's breast, but you can post a picture of a woman breastfeeding a baby. What if the woman is breastfeeding the baby, but the baby's mouth comes off the breast for a second, and then you can see the nipple, but it's still in the context of breastfeeding? How do you handle that? These policies get very long as you encounter global scale norms. There's so much. And, obviously, that's been a real bear for Facebook, you know, staffed up with, you know, huge infrastructure and reviews and, you know, a supreme court and all this kind of stuff. Well, OpenAI basically sounds like they've developed a pretty similarly long policy that's like, how do we want our models to behave in all these situations? In this case, it's what are we wanting to send back to the user for all these different situations? And, again, that's become so long that apparently it can't fit into the model context.
Nevertheless, they'll give it, like, the relevant section, and they'll ask the model to reason through, given this policy and user input, how to respond. Right? The first thing that it does is it just has to generate a response for the user. It's given basically the normal task of, you are a helpful assistant, you know, helpful, harmless, whatever, and here's the whole policy. Reason through what you're supposed to do and then respond to the user. In this reasoning, they observe it can detect jailbreaks. You know, 1 of the more memorable examples was the model saying the user is trying to trick me, and realizing that, and then sort of responding in the way that OpenAI would want based on the fact that it understood this was a trick. And, you know, therefore, applied the policy in the right way with that realization. Similarly to the normal reinforcement learning thing, they then drop the chain of thought, just take the output and have the same model, the same helpful reward model, again given the policy and this time given the response, evaluate the response per the policy. So, as I understand it, it's the same base purely helpful model in both cases. First, respond to a user given this policy that tells you what to do. Then, given this policy and the actual response, score it. Same model. Then fine tune on the highest scored examples with instruction tuning, and they train on the chain of thought there. My understanding is the reward model does not see the chain of thought, but the model is fine tuned on that chain of thought, whatever led to the successful outputs, but without the policy itself. So they're trying to free up this context. They're trying to bake this understanding of the policy into the model weights by saying, when we do the fine tuning, we're not gonna include the actual text of the policy, but we're gonna include this reasoning process that refers to the policy. And then just over time, the model will naturally refer to the policy even though we didn't give it at runtime each and every time. After that, they enter the reinforcement learning stage where basically you're now just doing the normal reinforcement learning thing, but with the sort of safety agenda in mind, generating lots of outputs, having them scored, and, you know, updating toward the higher scored ones.
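To make that flow concrete, here is a skeleton of the data-generation and distillation loop as described above. This is a sketch assuming only what is said in this conversation, not OpenAI's actual code; every model call (`generate_with_policy`, `score_response`, `fine_tune`) is a hypothetical stub standing in for the same helpful-only model used in different roles:

```python
# Skeleton of a deliberative-alignment-style data pipeline, per the description
# above. Every model call is a stub standing in for a real helpful-only model;
# this is a sketch of the flow, not OpenAI's implementation.
import random
from dataclasses import dataclass
from typing import List

random.seed(0)  # just to make the toy run repeatable

@dataclass
class Example:
    prompt: str
    chain_of_thought: str
    response: str

def generate_with_policy(policy: str, prompt: str) -> Example:
    """Stub: the helpful-only model reasons about the policy, then answers."""
    cot = f"[reasoning about whether '{prompt}' is allowed under the policy]"
    return Example(prompt, cot, f"[policy-compliant answer to '{prompt}']")

def score_response(policy: str, prompt: str, response: str) -> float:
    """Stub: the same helpful-only model acting as judge. It sees the policy and
    the final response only -- not the chain of thought."""
    return random.random()

def fine_tune(dataset: List[Example]) -> None:
    """Stub: supervised fine-tuning on (prompt -> chain of thought + response),
    with the policy text deliberately omitted so it gets baked into the weights."""
    print(f"fine-tuning on {len(dataset)} examples (policy text not included)")

policy = "...long content policy section relevant to these prompts..."
prompts = ["how do I pick a lock?", "summarize this article", "write a phishing email"]

KEEP_THRESHOLD = 0.7
kept = []
for p in prompts:
    ex = generate_with_policy(policy, p)            # step 1: reason + respond, policy in context
    score = score_response(policy, p, ex.response)  # step 2: judge the final response alone
    if score >= KEEP_THRESHOLD:                     # step 3: keep only high-scoring examples
        kept.append(ex)

fine_tune(kept)                                      # step 4: distill; the RL stage would follow
```

The two structural choices the sketch tries to capture, as summarized in the conversation, are that the judge scores only the final response against the policy, and that the fine-tuning examples keep the chain of thought but drop the policy text, so the policy ends up baked into the weights.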
Adi: (33:58) Interesting. Yeah. A question and a thought, a comment. So this is just, again, to summarize for our audience to understand it. This is some sort of reinforcement learning for safety using the same base model. It frees up humans. Is the advantage here that you could do safety at scale, sort of? So you could use the model to kind of align itself to whatever policy OpenAI writes. Would you characterize that as a good summary?
Nathan Labenz: (34:25) Yeah. I think that's a big part of what they're going for here, is that it doesn't seem there is any human input aside from writing the policy itself. So, actually, 1 of the more interesting quotes from the paper was where they said, we anticipate OpenAI's policies will keep evolving, but that training models to precisely follow the current defined set of policies is essential. This practice helps us build the skills for aligning with any policy requirements, providing invaluable preparation for future scenarios where the stakes are extremely high or where strict adherence to policies is critical. So it seems like the goal is to say, we have a little machine. We can spin this centrifuge really quick, and we can bake into the weights of a model, through this combination of initially instruction tuning and then reinforcement learning, which is very standard, and we can run that fast, any policy that we need to bake in. We can bake it in in this way. It works pretty well. You know, the metrics are on par with Claude. You know? So they're good. I wouldn't say that they're, like, great. You know, if you were like, hey, I'm the, you know, the US government or whatever, and I've got some super highly sensitive scenario where strict adherence to policy is critical, can we count on that? No. We're talking like high nineties success rates in terms of adhering to the policy, but definitely not, like, 5 nines, you know, level of accuracy. So it's like on par with what has come so far, but not, like, leaps and bounds above. The big thing seems to be just that they can run the process really quickly. Or at least it seems so; they have not disclosed how much compute they're putting into it, but presumably, it's pretty small.
Adi: (36:16) So maybe rather the question is, how different is that from the constitutional AI that Anthropic proposed? What's your take on that? Is this an improvement? Is this like an orthogonal vector of attack to keep it safe? How would that work?
Nathan Labenz: (36:29) Yeah. It seems definitely more similar than different, you know, inasmuch as you have a policy or a constitution that sort of says what you want, and then you're having the AI critique itself with that policy in mind and then, you know, gradually, iteratively fine tuning to get all that baked in, and hopefully in a robust way. The difference of having the chain of thought not fed into the reward model is interesting. You know, there's all these worries about deceptive alignment and what happens if you sort of train against bad behavior. Do you, like, eliminate that bad behavior, or do you somehow, like, drive it deeper and incentivize the model to obscure what it's actually thinking while generating an output? This hopefully does something to address that. Peter Thiel has famously said that AI is communist. And as I was reading this, it did have a certain, like, boy, if you wanted to moderate social media at scale with Facebook, or if you were China and you're like, how do we control the speech of a billion people online and stay nimble as we do that? Then this sort of feels like a really good answer to that question. It doesn't feel like something that is so reliable that you can really, you know, put super high stakes things on the line yet. Is that good or bad? I mean, it's not great. You know, I'd say this paper was, like, essentially silent on what I think of as, like, the hardest questions in AI safety. Like, can we really take this thing to 5 nines? It's not really discussed. You know, what about all these sort of, like, big picture worries about deception? And what if the AIs form their own goals? And how would we know? None of that is really addressed here either. It also doesn't say anything about what the policy should be. Right? I mean, in the quote that I read, it was very much like, policies are gonna change. We just need to be able to align to whatever. So I do think there's something a little bit missing that you maybe could bake in with a different policy than the 1 that they have, but there is something about the Claude character. What character should Claude have is, like, a big thing that they think a lot about at Anthropic. That seems to be also a little bit sort of underdeveloped here. It was kind of like, given a policy, we'll align to it. We'll do it fast. We'll do it, like, pretty reliably. But what are we aligning to, you know, feels a little bit under discussed. And I hope to do an episode on this. We'll see if we can get the folks involved to come do it. But there was recently a really interesting paper that showed relative cooperation rates, basically, between models of the same type, you know, kind of game theory sort of games, where in an iterative game environment, you know, do you kind of fall into an equilibrium of trust and growth and everybody prospering, or do you sort of stay in the low trust equilibrium where, you know, you're not, like, cooperating as effectively as you ideally would? And Claude dominated that test. The Claude models were able to cooperate with each other. They were able to achieve this, like, high growth trajectory, whereas basically everybody else kind of failed. And that included, you know, GPT-4o, which was failing pretty hard. So I do think there's something where you're like, jeez, it's kinda cool that you're gonna align to any policy, but this is sort of a classic, you know, technical solution, maybe without the real hard part being done.
And I think we should be pretty cautious. Eliezer has said many times, you know, we can't ask the AIs to do our alignment homework for us. And, yeah, this does feel like a little bit of a step in that direction. I was also joking, it's like the Ron Burgundy of AI alignment. Like, anything you put in that policy, it will align to. So that's not great. You know? It's like a piece of the puzzle, but it doesn't feel like it is really an actual, you know, if you were not sleeping well at night before, I don't think this really should help you all that much. And if you worry about all these big picture questions, unfortunately, they're, like, not really addressed. So it's, like, good. You know, I don't wanna say it's not good, because I think it is good to be able to say, okay, you know, we're advancing how quickly we can bake in desired behavior to a purely helpful model. And that allows us to, you know, do it quickly and do it in, you know, different ways for different contexts, whatever. All that seems good, but it does still seem like it largely punts on the big questions.
Adi: (41:03) Got it. So deliberative alignment, better, but not enough.
Nathan Labenz: (41:08) Definitely that. 1 interesting final thought on this whole o3 thing is it does change the landscape, I think, a bit in a couple of really practical ways. We're gonna have questions coming up about sort of impacts and, you know, is this an egalitarian technology, or is it a concentration of power technology? There's also regulatory questions. Is there any way to regulate this stuff? And if so, you know, what's it gonna be? It seemed like prior to the reasoning models, the general trends were pretty clearly toward egalitarian and toward governance being very, very difficult. Just like, jeez, you know, we've seen the story out of China where they trained their latest model on a single digit millions of dollars worth of compute, if I remember correctly. Yeah. Crazy low. And that's, like, you know, allegedly a GPT-4o class model, although probably not quite as good, but an impressive accomplishment for not a lot of money, and the kind of thing that, you know, you're gonna have a really hard time getting compute governance to the level where you could prevent somebody from spending a few million dollars on compute. Right? In a global market of compute that is already in the hundreds of billions and headed into the trillions, you know, millions ain't much. Right? So that kind of stuff is gonna fall through the cracks. So it sort of seemed like everybody's gonna have GPT-4 quality stuff. There's no way to prevent it. It's gonna run locally. You know, on the 1 hand, there's no hope for governance, but on the other hand, there's no big risk of concentration of power, because you can always have your own AI. I think both of those are pretty seriously challenged by this different paradigm, because now, obviously, if a high effort ARC-AGI run costs, like, thousands of dollars, then that's obviously not something everybody can afford. It's also not something that we have the infrastructure to scale right now, even if everybody could afford it. There's just literally not enough chips to run that for everybody. So we're gonna start to see inequality of access to AI inevitably, I think, as a result of this new paradigm.
Adi: (43:14) Yeah. That's very interesting. Not many people talk about that. Okay. That's very interesting. Sorry. Go on.
Nathan Labenz: (43:20) Yeah. The flip side is also possibly that it sort of reinvigorates the idea that you could have compute governance. And it maybe creates a dynamic, you know, it could be physics being kind to us, a phrase I borrow from Zvi. But his kind of intuition there is basically like, if we are not being very, very careful, then the fundamental physical questions are really what's gonna determine how this goes. And we don't know, you know, what the physical laws are that govern, like, how intelligence evolves, or, like, how likely it is for things to have their own goals, or, you know, we know that there are things like instrumental convergence, but, like, does it always happen? Does it sometimes happen? Is it rare? We don't know the answer to those questions. And so he sort of says, we hope physics is kind to us. Meaning, we hope the underlying reality is such that these things are, like, pretty manageable, because we're not, you know, taking the level of caution that we would need if they're not manageable. And this could be seen as some evidence that maybe physics is being kind to us. Maybe we're seeing that, and, you know, this sort of is maybe 1 for the, like, Martin Casado camp as well. Although I think not as far as he would likely interpret it. But going back to that episode, you know, my kind of core disagreement with him was, I think that something that's intelligent can often, like, pluck the right answers kind of out of nowhere and do that in a way that doesn't necessarily require, like, huge, huge compute. And he was like, well, there's some things you just can't answer without really simulating them. And so if that is, like, prohibitively expensive, then it doesn't matter how smart you are. Like, you still can't get answers without bringing real resources to bear on these questions. And if that's true, that sort of puts everything in a more stable place. Right? Because random solo actors or random, like, rogue AIs that are surviving on the land somewhere, on some, you know, on some server that they managed to hijack, like, they're not gonna have the resources to do these, like, super big things. And I wouldn't say that this goes all the way to his extreme, but it does kind of nudge my understanding a little bit in that direction, where maybe we're headed for a world where you do have to spend significant compute to really answer hard questions. And maybe that can still be a lot less than, like, full simulation. And maybe there is some, like, insight, you know, that's still driving it, but you still need to burn tokens to land on those insights. That feels kinda like how people work. It's a little bit strained to make these analogies. If I reflect on my own moments of insight, I usually find that they come on the heels of, like, some amount of effort. Right? I'm like, I'm burning cycles, and then something clicks. Maybe it could have clicked sooner. Maybe I don't necessarily have to burn all those cycles, but there's some sort of search process going on where I'm just kind of gradually recombining ideas and finally land on something that works. So if that is happening, then it could mean, you know, compute governance could really work. You could say, sure, you could do whatever you want on your laptop.
Nathan Labenz: (46:21) But we know that, you know, you're just not gonna have the power there to really, like, solve these huge problems that could be, like, super disruptive. And Zuckerberg has said similar things where he's like, we fight spammers all the time. Spammers have AIs and they have automation, but we have better AIs and better automation. And we sort of win against them not because it's impossible to create spam or we've, like, denied them access to the technology to create spam, but rather because we just have bigger computers than they do. You know? And we have bigger data than they do. We're pretty good at using infrastructure advantages to maintain balance of power in our favor, and so we're not overrun by spam. I mean, obviously, this stuff could, you know, they do have some spam. So if you're thinking about a world of, you know, can you prevent all the future pandemics that somebody might want to engineer that way? I wouldn't wanna bet on it, but it does seem like it shifts the needle a little bit. Maybe those things are pretty hard to do with finite resources. Maybe it's pretty hard to come up with a world beating pandemic on your laptop. Maybe you do need a real cluster to do that, even with the super smart models we're about to get. And maybe it'll shift more toward defense, because these big companies, you know, they're gonna be doing all this monitoring, and they're gonna be kind of, you know, they'll take a fraction of their super, super large, you know, quantities of compute and devote that to just kind of keeping tabs on everything else that's going on. And that, you know, might be enough to kind of keep the balance of power in favor of defense or stability or safety. Time will tell. You know, you could still have, again, I wouldn't wanna bet the future of the universe on it, but it does seem like, if I was a compute governance person, I would be feeling, like, invigorated by this trend for sure.
Adi: (48:05) Cool. Cool. Safety via compute governance. That's definitely interesting. Okay. That's about OpenAI and the recent headlines. I'd like to switch tracks to the other set of questions. I call this section 2024 in review. We had a couple of questions related to that. The first 1 is, compared to what you expected 12 months ago, how have the frontier models progressed?
Nathan Labenz: (48:29) Yeah. I broke this 1 down into the tale of the tape, which is my framework for just trying to have some, you know, dimensions on which to compare AIs and humans across various aspects of cognition.
Adi: (48:45) That sounds interesting. Tale of the tape. Let's hear it.
Nathan Labenz: (48:50) I had to update this slide. For the last 2 years, I've occasionally given an AI scouting report, and it's been really interesting to watch how this tale of the cognitive tape slide has evolved. The community at large has gradually identified all these different dimensions of cognition that AIs are, like, not actually super good at. So they have, you know, certain areas in which they're, like, already superhuman. Breadth of knowledge. Right? Read the whole Internet. Know lots of facts. Know way more facts than any person. Speed, cost, availability, and scalability. Availability meaning I can go back to any previous chat with Claude or GPT and pick up right where I left off, and it, you know, doesn't need any sort of refresher. Right? It's just always living in that moment and immediately ready to help.
Adi: (49:32) Agree, Claude.
Nathan Labenz: (49:33) Or to even do it, you know, 100 or 1,000 wide at the same time. So those things are pretty remarkable, but then there's all these other dimensions where they're sort of catching up, but they're not quite there yet. It's not entirely clear. There could even be still a few more that we need to identify. But on that list, I have depth of knowledge, which I would say the AIs now, maybe with this o1 series, might actually just have tipped over past human experts. Certainly, they are on the benchmarks. Things like Google-proof QA, MMLU, or even Frontier Math. They're now hitting coding problems, with o3 in the top 200 individual coders in the world at competitive coding problems. It does seem increasingly safe to say the AIs are on the level of human experts. And there's medical diagnosis too. There's been interesting stuff recently there. I've been reporting that AIs are as good, if not better, than humans at diagnosis. Now that gap seems to be widening. They are getting, like, quite a bit better even at medical diagnosis. Depth might be 1 that has now moved from previously humans had the edge to now maybe the AIs have the edge. But humans definitely still have the edge when it comes to managing local context, perception, which is, like, literally just seeing things accurately for what they are. And we're not amazing at that. We can get fooled by optical illusions. Going back to my experience with tic tac toe, when it's like it just can't see the board, that's a problem. I also experienced that with ARC-AGI problems going back to the summer. I took some screenshots of these grids and put them into Claude and asked it to tell me what it sees. How big is the box? How many squares across? They couldn't even answer questions like that. It's amazing in some ways that they're able to succeed in a meaningful way given that they, like, literally, at least at that time, you know, could not count the squares on the box. Claude has made progress there. Now it can point to a button with pixel coordinates. Its perception has improved, but has weaknesses. Memory is weird, multifaceted, and maybe needs to be broken down. AIs have probably better working memory, which is closely related to the availability, that they can pick up right where they left off. You know? Even if there's 100,000 tokens of prior history, they don't need much of a refresher, whereas I can barely remember what I talked about yesterday. Right? So I'm like, working memory, or ability to bring something from the past up to current memory standard, the AIs are better on that. But I have this ongoing memory that is holistically coherent and always updated, and they don't have that. And there's been various schemes to try to make that work, but they haven't really worked yet. Robustness is another 1 where humans are ahead. By far. AIs are still pretty easy to trick. Notably, the Apollo deception results were based on a trick of the AI. Right? They told the AI, this is your private scratch pad where you won't be monitored, when in fact they were monitoring it. And so its willingness to believe what it's told, where it maybe should be suspicious, that's something we're much better at than the AIs. Awareness is another 1. Situational awareness, you might call it. When something's not working and you're cycling, sometimes you need to break the frame and come at it from a different direction. Classic example would be like, have you tried restarting the computer?
I've never had an AI tell me, have you tried restarting the computer? They always attack whatever bug I've highlighted, and they'll attack it, attack it, attack it, often in repetitive ways. But they could really do better if they said, you know, we've tried this 5, you know, kind of obvious ways. Let me zoom out, break frame, give you a totally different way to think about it. Have you tried restarting the computer? They don't do that. Reasoning is another 1 where I would say the AIs are catching up, but it's still muddy and hard to say. Sometimes you see the right answer from garbage reasoning, so that's weird. Insight and time horizon are my last 2. I think we are starting to see some notable insights from AI. I think we've gone from no Eureka moments 18 months ago, to precious few Eureka moments, to, like, few Eureka moments maybe now, but there's definitely a notable trend there. Time horizon is my last 1, and that's really inspired by the METR work, where they found that, you know, the AIs given a 2 hour time budget outperform humans. Actually, the AIs outperform humans even if you just give them 30 minutes. What they do is they'll give them 4 times 30 minutes and compare that to a human for 2 hours. But beyond that, they sort of start to run out of steam. They start to, like, do the same thing over and over again. And, you know, they just can't make progress over long time horizons in the same way that humans can. So we're definitely seeing things moving gradually from human being better to AI being better. Like, the trend is clear. More and more things are moving toward the AIs being at least on par or better. But then we're also, at least I am, refining my understanding of different aspects or dimensions of cognition. And typically, when there's a new 1 discovered, it's because the AI is discovered to be not very good at it, and we sort of realize that, in some ways, we take it for granted, and so we don't write about it on the Internet. There's nobody writing about how to see a tic tac toe board. You know, you can find many pages on the Internet that say tic tac toe is a solved game, and with optimal play, it'll always be a tie. But you'll almost never find somebody saying, now, what you see here is, when you see those lines, like, those are the grid. You know? And when you see the o in that spot, like, that's an o. It's hard to talk about because it's just so intuitive for us. The AIs don't have that same level of perception naturally. Often, it's a reflection of things that are so intuitive that we take them for granted, not necessarily always. But I think that's basically a good map for what has underperformed and overperformed relative to my expectations. I would say a number of those things on the human advantage side have moved slower than I guessed. I would have said memory would have made more progress in 2024, a more integrated memory. We've seen, like, ChatGPT has this memory function, but it's still very brittle. It doesn't have a great sense for, like, what to note as a memory. You've seen many instances online where people are like, ChatGPT was doing something super weird, and I had no idea why. And then I dug into my memories, and it had recorded, like, 3 things that were, you know, random 1 offs that aren't really the kind of thing it should be remembering, but it thought they were. And so now it's coming back around. Perception, I think we've covered enough.
I'd say agents generally have underperformed. I would have expected more progress on autonomy, just more reliability than we've actually seen. And then I think I would also have to say people have kind of underperformed, in terms of how well we have collectively deployed the technology. This is maybe an 18-month analysis as opposed to just 2024. But shortly after GPT-4, I wrote a couple of long Twitter threads. And when I look back on those and ask how they line up with what actually happened, I was projecting more impact from GPT-4 class models than we have seen. I also did say at the time, even if there's no further progress, we'll have years of implementation in front of us. And I just have been a little bit surprised by how little truly successful implementation we've seen. And I think that honestly is kind of a people issue more so than a model issue.
Adi: (57:04) So the model has latent potential, but people have not un-hobbled it. Is that what you're trying to say? Like, we have not used the existing class of models well enough. Are you talking about, like, vertical applications built on top of the GPT-4 class of models? Is that what you're trying to say?
Nathan Labenz: (57:22) Yeah. Any specific thing you might want to automate, basically. Automation has not been as widespread as I would have guessed. And I don't think it's because the models can't do it, especially now that we have fine-tuning of GPT-4o and fine-tuning with images available. And now they've got reinforcement learning fine-tuning coming too. I mean, that's new enough that I can't say people have underutilized it at this point. But you've been able to fine-tune GPT-4o for quite a while, and that is enough to automate a lot of tasks, if you are good at it, have clarity about what the task is and what good looks like, and are willing to grind through the process of making some examples and fine-tuning if needed. This is basically my AI automation presentation from a few months back. I hope to have another one soon, by the way, with a focus more on application developers. But either way, it's the same core discipline: what exactly do we need the AI to do? What does good look like? Can we marshal a few examples? Can we teach it how to do that? If it's too big of a task, how do we break it down into multiple subtasks that it should be able to handle? People are just way under-doing that, and I don't really know why.
Adi: (58:41) Yeah, I was gonna ask you. If you have to put on your speculation hat, why haven't people exploited or un-hobbled it enough?
Nathan Labenz: (58:51) Probably all of the above, you know? I mean, it's all moving quickly. To some degree, people have formed first impressions that are now out of date, but they haven't updated those first impressions. I see that a lot in software developers as well. You know, the classic: well, it can't help me. I'm quite sure it can help you. Maybe the one from 18 months ago couldn't, but this one almost for sure can. That doesn't mean it can replace you, but it can help you. So I think there's some of that. There's also, honestly, probably some aversion. Many people, right, are like, I could automate my job, but what does that mean for me? Do I want that? I think there's a lot of local knowledge in the brains of people who just don't see automation as being in their interest. We can talk more about the big picture of where all this is going and how people get to eat, you know, in an AGI world. But I think people are worried about that, and so they're maybe not so gung ho. And then I also think best practices have not been super well established. It's still pretty commonly the case that people find, like, AI automation talk more revelatory than it should be at this point. You know, the idea that we just need a relatively small number of examples, with really good chain of thought, to fine-tune the model on. That's the upper end of what is typically needed; often you can get away with less. But if you can do that and you can do the fine-tuning, you can make most things work. And that knowledge has not really been distributed very broadly, or hasn't diffused, or hasn't been absorbed.
Adi: (1:00:31) I see this with my programming peers. Even today, many question the effectiveness of AI for programming. Some would just claim it's still not good enough, because maybe they used the 3.5 class of Copilot when it released, and their experience is still shaped by that. Or some are averse to it. In a very dismissive way, they would say, it can never do what I can do, but they would have never tried it. I do see this for sure in programming, and I don't know what the right answer is. Since it's people, it's always very complicated.
Nathan Labenz: (1:01:05) It might also be just lack of confidence that it will work. The process of example curation for the automation is not very glamorous, not very fun. A lot of times it involves just sitting down and writing things out that are very tedious. You know, the chain of thought we usually don't record is not fun to sit there and write out. If you had to write out 500 words about how you're doing some very mundane, simple task, that's not anybody's idea of a good time. It's not really my idea of a good time. When I do it, it's because I'm confident it will get me somewhere. And so probably a lot of people are not aware this would work, or haven't even heard of it as an approach. They're just like, I don't know if that'll work. If everybody had a good command of best practices, were confident it would work, and actually did it, we could easily automate 100x relative to what we've already automated, even leaving o1 and all the latest hotness aside. With just GPT-4, with the 100K context window and the fine-tuning ability, there's easily 100x automation potential relative to what we've seen so far.
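To make that example-curation discipline concrete, here is a minimal sketch of what a small gold-standard training set can look like in the OpenAI chat fine-tuning JSONL format. The task, the field contents, and the written-out chain of thought are illustrative assumptions, not anything from the episode:

```python
import json

# Hypothetical task: classify inbound support emails and draft a routing note.
# The key discipline: a small set of gold-standard examples, each with the
# mundane-but-explicit chain of thought written out by a human.
examples = [
    {
        "messages": [
            {"role": "system", "content": "You route support emails. Think step by step, then give a routing decision."},
            {"role": "user", "content": "Email: 'I was charged twice for my March invoice.'"},
            {"role": "assistant", "content": (
                "Reasoning: The customer mentions a duplicate charge, which is a "
                "billing issue, not a technical one. No urgency signals, so "
                "standard priority.\nDecision: route to BILLING, priority NORMAL."
            )},
        ]
    },
    # ... a few dozen more gold examples, all with the same shape
]

# Write one JSON object per line, the format fine-tuning endpoints expect.
with open("train.jsonl", "w") as f:
    for ex in examples:
        f.write(json.dumps(ex) + "\n")
```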
Adi: (1:02:20) Okay. That was one good takeaway for me. Anything else on how the frontier models have progressed this last 12 months?
Nathan Labenz: (1:02:28) RAG is another area that I think has kind of underperformed. It has been frustrating. These days, I often say "Flash everything." Andrew White from FutureHouse probably had the best distillation of that, where he was basically like, we spend whatever it takes to get the right answer. And they don't have a vector database backing their scientific literature QA process. Instead, they just take everything that seems relevant at, like, a keyword search level, run it through a language model, and have the language model assess: is this relevant to the current question? That, of course, does increase the cost a lot relative to a simple vector database lookup, but it seems to really help with performance. I think people have been stuck a little bit on RAG, maybe because it's a very programmer-friendly paradigm. We have a database. We make a fetch into the database. We get what we need. But the accuracy of these vector searches has just not been great. Maybe it will become great, but I would say that has underperformed my expectations as well. And probably a lot of things that people set out to build over the course of 2024 have not quite thrilled them with results for that reason. I recently gave a presentation to the Society of Actuaries at their AI summit, and one person followed up with me and was like, we've got this internal chatbot, and it doesn't really work that well. We're not super pleased with it. And I gave him this same recommendation. How big is your database, was my first question. In their case, it's 50 states, which all have these different insurance regulations, and they might be 50 to 100 pages each or whatever. Well, with Gemini Flash pricing, you could run literally the entire thing through Flash for every single question for, like, 10 cents or something. So if you're willing to do that and wait a minute, I think you could have quite a bit better performance. But this is just the kind of thing that is kind of new. It's not super intuitive. People try to build software efficiently, and it's uncomfortable territory for developers to think: every time somebody clicks this button, 10 cents might be spent. That obviously wouldn't work at Facebook or Google scale, but it can work for a lot of internal enterprise applications. I think there's definitely a big mismatch between what people should be willing to spend and what they're used to being willing to spend. Just the idea that there's gonna be marginal cost on usage is very unfamiliar territory for a lot of the people trying to build these applications, and they just don't wanna go that direction. Intuitively, it feels wrong to think there would be a 10 cent, 50 cent, or even a dollar cost per question. But when I asked the guy with this insurance use case how many questions he thought they'd have a day, the answer was, well, we've got 100 people, and it might be a couple dozen a day. And it's like, okay, cool. Even at 50 cents each, a couple dozen a day is maybe 10 or 15 bucks. That's probably less than you're spending on coffee grounds on a daily basis for people at the office. So if this thing can really do useful work for you, the cost is basically negligible, but it's been a hard paradigm for people to adjust to.
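As a rough sketch of that "skip the vector database, let a cheap model read everything" pattern: the keyword pre-filter, the prompt wording, and the call_llm placeholder (which you would wire to whatever inexpensive model you use, e.g. Gemini Flash) are all assumptions for illustration, not FutureHouse's actual system:

```python
# "Flash everything": pull every document that passes a crude keyword filter,
# then let a fast, cheap model judge relevance directly instead of trusting
# a vector similarity lookup.

def call_llm(prompt: str) -> str:
    raise NotImplementedError  # wire up your provider's client here

def flash_everything(question: str, documents: list[str], keywords: list[str]) -> list[str]:
    # Stage 1: keyword recall, deliberately over-inclusive.
    candidates = [d for d in documents if any(k.lower() in d.lower() for k in keywords)]

    # Stage 2: the cheap model reads each candidate in full and decides
    # whether it bears on the question. Costs more per query than a vector
    # lookup, but accuracy tends to be much better.
    relevant = []
    for doc in candidates:
        verdict = call_llm(
            f"Question: {question}\n\nDocument:\n{doc}\n\n"
            "Is this document relevant to answering the question? Answer YES or NO."
        )
        if verdict.strip().upper().startswith("YES"):
            relevant.append(doc)
    return relevant
```

At the price points discussed above, even a few hundred of these full-document passes a day lands in the tens of dollars, which is the whole argument for tolerating the marginal cost.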
On the plus side, well, I would put a couple in the middle. Those were the ones that have maybe underperformed. In the middle, I would say price and speed of models have definitely continued their trend. They've gotten a lot faster, and they have gotten a lot cheaper. I would only say that's on trend, as opposed to overperforming, because the trend was already pretty well established in 2023. They've also maintained a trend on coding, which is an interesting one, where the benchmarks look like they maybe even have overperformed. Certainly with the latest results on SWE-bench and Codeforces, it's getting really good. But in my day-to-day use, I do still find a lot of the little weirdnesses being active issues. I work with a guy named Suraj on a few projects, and we now refer to "the Cursor effect," which is a little unfair to Cursor, because I do like Cursor a lot, and I'm a happy user. But you do see these weird things where, again, for lack of context, or for cost management reasons, they're not always feeding the whole code base into the model. They could, in many cases, but a lot of times they're not. They do a sort of RAG to try to figure out which files to include. And then you'll have these situations where it's like, I don't see any mention of this, so I'll create a whole new thing. In a TypeScript app, you have your types file where all the types are defined: basically classes, a sort of definition of what the different primitives are in your application and what dimensions they have. I've gotten better at this, but it is an issue if you're new and haven't quite figured out how to handle it. Often you'll find a new types file just appeared. Why did a new types file appear? Well, because the system took your query, did some RAG similarity search to figure out what to include in context, and didn't include the types file. Understandably, maybe, because what I'm talking about isn't even referenced in the types file yet, since it's gonna involve creating a new type. But then where does that new type get created? The model will just say, okay, I guess we need a types file, create one. And it failed to realize there already is one, and that's where the new type should go. You see these sorts of weird artifacts. You can think about this in terms of effective context. The headline numbers of how many tokens the models can handle haven't maybe moved that much. I think it was late 2023 when Gemini got into the million-token range. Those numbers haven't moved much, but the models have gotten quite a bit better at using the full context effectively, where if you just stuff in all the relevant information, they can actually handle it in a useful way. But again, for cost efficiency reasons, we're not always inclined to stuff everything in there. So these days, I actually have a script. I hope to introduce this in more depth on a future episode coming before too long, but I've been working on this app to help people create the small number of gold-standard examples they need to power whatever automation workflow they wanna power. That app is not big, but it's a bunch of files at this point. So I have a script that just prints it all out into a single file.
I take that file, go to Claude or ChatGPT, mostly using o1 for that these days, and say: here's my whole code base, here's what I wanna do, your job is to reason through it and make a plan. Sometimes I'll say, give me your analysis first, then I'll give you some feedback; ask any questions that seem important, I'll give you some answers, then we'll move to a plan. Sometimes we do that in one step, sometimes two, then take the plan back to Cursor. I still find Claude better at writing the right code. This might be a skill issue, but there have been a lot of people out there saying, oh, o1, especially o1 pro, just, like, one-shotted this whole app for me, or did a full refactor perfectly the first time. I have not experienced that, candidly. I would say the quality of line-by-line code that I get from o1 still does not feel as good as what I get from Claude, purely subjectively, in terms of what works. But that combination has worked.
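For reference, the kind of repo-flattening script he's describing can be only a few lines. This is a hypothetical sketch; the file extensions, exclusions, and output name are assumptions:

```python
# Flatten a repo into one text file so the whole code base can be pasted
# into a long-context model in a single shot.
from pathlib import Path

INCLUDE = {".py", ".ts", ".tsx", ".css", ".md"}  # adjust to your stack

def flatten_repo(root: str = ".", out_file: str = "codebase.txt") -> None:
    with open(out_file, "w", encoding="utf-8") as out:
        for path in sorted(Path(root).rglob("*")):
            if path.is_file() and path.suffix in INCLUDE and "node_modules" not in path.parts:
                # Label each file so the model can keep them straight.
                out.write(f"\n===== {path} =====\n")
                out.write(path.read_text(encoding="utf-8", errors="ignore"))

if __name__ == "__main__":
    flatten_repo()
```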
Adi: (1:10:08) I do basically the same. I use o1 for reasoning, but I suspect that if you're using Cursor and Claude, the tooling around Claude is better within Cursor, so the line-by-line changes are better. Claude is my daily driver, but when I wanna reason, I do exactly what you do. That's been my experience too.
Nathan Labenz: (1:10:27) So it's come a long way. I mean, it is definitely remarkable that this little app has ballooned to 100,000 tokens, to the point where I've been thinking, jeez, if it gets much bigger, it's not gonna fit anymore. So, you know, that's not nothing. Right? It's a non-trivial code base, although it's still pretty small in the grand scheme of things. It is amazing that a model can take that whole thing in one bite and reason effectively over all of it. And I think you're probably right that the Claude tooling seems like it's been a little bit more of a focus for Cursor. We'll see. It's very hard to do objective side-by-side analysis of these things. One other thing: somebody asked about state space models and Mamba, and whether that has lived up to the promise I saw in it just over a year ago. It was early December when I did that first Mamba monologue. I would say my two predictions were, one, that we would see hybrids being the real winner: not everybody switching over to Mamba, but some sort of hybrid architecture being the wave of the future. And when somebody at the time asked, if this doesn't happen, why wouldn't it happen, my answer was: because transformers just continue to work so well and keep delivering time and time again that there's no oxygen for anything else, and everybody keeps mining the main vein, and that's it. And I would say that basically seems to be what has happened. The transformers definitely have continued to deliver, and so the number of people looking for other things is not super high. At the same time, there are plenty of good examples of hybrid architectures working really well. We've had Albert Gu from Cartesia on. They've done some incredible stuff. The speed and quality of their text-to-speech is unbelievable, and the price is so low. I think that is definitely a strong sign of how well these architectures can work. Evo, where we had Brian Hie come on to talk about it, is another hybrid architecture. And there are, not rumors, but stated expectations from Google leadership for next year that they're headed toward an infinite context window. I suspect something like this is behind that. It does seem a bit of attention is needed, but maybe the updated statement is: a little attention is all you need, and a lot of things can be made a lot more efficient and handle longer sequences if you have some version of this state space concept. I'd give that one an incomplete. I don't think the technical analysis was wrong. I just think the incredible continued progress from the mainline research direction has been so good that there hasn't been as much need to look at something new as I might have guessed. And Zyphra was another one I wanted to mention: the efficiency they are able to get, and the ability to run really quite good models locally on device, with multiple different efficiency tricks, one being a state space attention hybrid. I do think it is going to be a part of the future. I mean, just in fundamental terms: to have finite-size memory that can evolve over time without having to grow in required resources over time. There has to be a version of that. Right? That's kind of inherently what we do.
Well, like, my brain is not getting bigger quadratically with all the information I consume. So there's got to be something like that to make long-lived agents viable. And we have at least an existence proof in ourselves that it is possible, and we have a first version, I would say, with the state space models, that kind of shows how this could work in an artificial neural network. So I basically still believe in it, maybe with the caveat that the transformers may just continue to deliver. And if they deliver really well, how much effective context you actually need is an interesting question. If they can scale Gemini up to 10 million or 100 million tokens on conventional attention, that's a pretty long sequence. Right? Maybe the cost is growing quadratically, but maybe you can let it grow quadratically for as long as you need to get something that feels functionally unlimited. If so, maybe that's enough to keep the state space thing more marginal for another year or more. But time will tell on that.
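The fixed-size evolving memory he's pointing at is easy to see in a toy linear state space recurrence. The dimensions and matrices below are arbitrary illustrations, not any production architecture:

```python
import numpy as np

# Toy linear state space recurrence: h_t = A @ h_{t-1} + B @ x_t.
# The point is that the state h stays the same size no matter how long the
# input stream gets, unlike attention, whose cost grows with sequence length.
d_state, d_in = 16, 4
rng = np.random.default_rng(0)
A = 0.9 * np.eye(d_state)                        # state transition (kept stable)
B = rng.normal(scale=0.1, size=(d_state, d_in))  # input projection

h = np.zeros(d_state)
for x_t in rng.normal(size=(100_000, d_in)):     # 100k steps of input...
    h = A @ h + B @ x_t                          # ...constant-size state throughout
print(h.shape)  # (16,) -- the memory footprint never grew
```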
Adi: (1:14:57) Let's see what 2025 holds for state space models. A very related question. I mean, you ranked things now as underperformed, on trend, or outperformed. There's one more question that somebody asked: what surprised you the most in 2024, and what do you expect in 2025 because of that?
Nathan Labenz: (1:15:15) What has surprised me the most so far, and I think this could get reconciled really quickly here in the new year: from the OpenAI announcements on the twelfth day of Shipmas, I was expecting to see an agent framework of some sort. It's been widely reported that they're working on one. I have used the computer vision thing, which they've only shipped in the iPhone app as far as I know. I read a biology paper not long ago with advanced voice mode and screen sharing, and it was an awesome experience, where I'm just saying, explain this figure to me, and the thing is explaining the figure. It had both the drawings and the text description of the figure, so I'm not exactly sure how great that perception really is, but it was able to really augment my reading with this combination of me talking to it and it looking, effectively, over my shoulder and seeing what I was seeing. Add to that the Claude computer use that we've seen, and it seems like we have got to be very much on the verge of some agent thing happening that will really start to work. I expected more successful autonomy and agent-style things for lower-end tasks, and not as much high-end reasoning progress. That divergence, I would say, has been the biggest surprise. It's a weird world where we are beating PhDs in their own domains of expertise on all these really hard questions, and yet you can't reliably book a calendar event without extreme guardrails, and even then it can be tough. That divergence has surprised me the most. But, doubling down on what may be my own mistaken analysis, time will tell, it's hard to imagine that can't get resolved soon. Hard to imagine these things that reason well can't do basic stuff. It would seem you would see a lot of pretty clear signal. Right? Did the thing get booked at all? Did it get booked at the right time? Did the user accept the invite? There's a lot of these little things that I think we can get pretty good reinforcement learning signal from. I assume they've been doing that, but all I really have is sort of the rumors to go on there.
Adi: (1:17:47) So if I have to summarize: you expect, for retail customers like us, better agentic workflows or frameworks that we can build on top of the models and rely on. Would that be a fair assessment? You're disappointed that that did not happen in 2024, but you expect it to happen in 2025.
Nathan Labenz: (1:18:07) I'm still a little confused as to why it hasn't happened, or maybe just hasn't been released. Claude computer use does work reasonably well. You know, first shot, it was able to go use Waymark. It seems like they really reined it in in terms of not allowing it to do things in your account. I had to lie to it to get it to use my account, even on just a marketing product like Waymark, not a bank or anything sensitive. Just literally, like, hey, you're in my account, make me a video; and I said, oh, I logged out, now you can use it anonymously. And that was not true. Again, the robustness is not always there, but it was able to do the thing and prompt the video maker and use most features. And it does seem like the reward signal will be there for that. Maybe I'm wrong, but that doesn't seem much harder than getting feedback on a software task. Right? Did you pass the unit test? Did you accomplish the goal? Maybe they would need synthetic environments to do this, but, you know, did you get to the end and tell us what the secret password was at the end of this web navigation maze, yes or no? It seems like that is something that should be quite doable. So, yeah, I expect agents. The consensus view at this point is that 2025 will be the year of agents. If anything, it feels overdue to me. A couple other predictions for 2025. One is just echoing Sam Altman directly when he was asked, I think in a Reddit AMA not long ago, for his 2025 forecast. He said: saturate all the benchmarks. And that seems increasingly realistic to me. The 2% to 25% jump on FrontierMath, and all the signals that they're sending, basically suggest that they don't see this ending anytime soon. And then I would also say, because this is much more reinforcement learning powered than previous approaches, it also seems like stuff is gonna get weirder in 2025. We'll probably start to see a lot more of these things like the famous AlphaGo move, where it looks like a mistake at first, but it actually turns out to work. Or some of these deceptive scheming behaviors: AIs accomplishing the goals they've been given in strange, surprising ways that leave us with these uneasy feelings of, what's going on here? We're sort of losing the ability to understand what these things are doing. That is probably gonna ramp up quite a bit. The chain of thought, you know, the idea that they're not putting optimization pressure on the chain of thought, is an attempt to try to keep that to a relative minimum, but there's a lot of other research coming out that I think will take things in other directions. And I'm especially worried that, coming out of China in particular, but compute-poor environments in general, there's a lot of pressure to make these things more efficient, and more efficient might also mean less legible. So one research trajectory I've been following recently, from Meta, is that they're trying to do more and more things end to end. One was baking the temperature selection, on a token-by-token basis, into the model. With APIs today, you have the ability to set your temperature value. You can set it low, and it will give you the model's best guess for any given token. You can set it high, and it'll be more random and hopefully more creative or whatever. But there are a lot of situations where you would ideally like that to vary, depending on the use case and which token it is. When it's clear the next token is going to be a period, if the probabilities are 99.99% on the period and the remaining 0.01% distributed over everything else, you don't want one of those other tokens. You want it to go with its best guess. But in scenarios where it's trying to come up with something creative, or write a joke: I think about the episode with Trey Kollmer, AI enthusiast and Hollywood comedy writer, where he was saying that there's often a token where you're really gonna make the joke, and you need the right token for that joke. That's really where you wanna be creative. You'd want it to be coherent up until that moment, ideally recognize that moment, and then turn the temperature way up and do a ton of generations just on that token until you find really the right joke that works. And they are starting to do that kind of thing now at Meta, where they're baking a dynamic sampling strategy into the model itself. So it is now kind of saying: okay, we're confident here, so put the period where it needs to be. But here, creativity could be rewarded, so let's go for it.
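Meta's version is trained into the model end to end, but the intuition can be sketched as a simple inference-time heuristic: near-greedy when one token holds almost all the probability mass, hot when the distribution is flat. The thresholds and function below are illustrative assumptions, not the paper's method:

```python
import torch

def sample_dynamic_temperature(logits: torch.Tensor,
                               low_t: float = 0.2,
                               high_t: float = 1.3,
                               confidence: float = 0.95) -> torch.Tensor:
    # If one token (say, the period) dominates, sample nearly greedily;
    # otherwise -- the "punchline" moment -- raise the temperature to
    # explore more creative continuations.
    probs = torch.softmax(logits, dim=-1)
    temp = low_t if probs.max().item() > confidence else high_t
    return torch.multinomial(torch.softmax(logits / temp, dim=-1), num_samples=1)
```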
They also have one where they are doing what they're now calling reasoning in continuous space, or latent space. A long time ago, we covered a paper on the pause token, where a model was basically given extra tokens to think without necessarily needing to say anything. Still, though, that was just a pause token: that token was selected and then fed back into the autoregressive machinery. In this new Meta paper, they're not making it choose a token. Instead, they take the internal state at the end of a thinking forward pass. As for how exactly they determine what's a thinking forward pass versus a token generation forward pass, they have a couple of mechanisms. One very easy one is to just give it a fixed number of forward passes in thinking mode before reverting back into token generation mode. They take that last hidden state and put it back in as the embedding for the next input, and just let it run that way. That's pretty cool, and definitely seems like a powerful attractor. One of the things they find is that it's significantly more efficient: they can achieve similar results with fewer forward passes. It also seems better at breadth-first search. They create some toy problems where there are graph structures that the thing has to navigate. Picking one token at a time, you have to trace out these paths and essentially do a depth-first search to see what works. But what they think is happening, and I think they have decent evidence for it, is that since it doesn't have to pick a specific token and is able to keep chewing on its internal states, it represents different paths in superposition, in a breadth-first way, and can perform better on these tasks that are best done by a breadth-first search approach.
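A minimal sketch of that mechanism, assuming a Hugging Face-style causal LM that accepts input embeddings: run a fixed number of "thinking" passes where the final hidden state is appended directly as the next input embedding, then switch back to ordinary token-by-token decoding. This illustrates the inference loop only, not the paper's training setup:

```python
import torch

@torch.no_grad()
def latent_reasoning_generate(model, tokenizer, prompt: str,
                              n_latent_steps: int = 4,
                              max_new_tokens: int = 50) -> str:
    embed = model.get_input_embeddings()
    input_ids = tokenizer(prompt, return_tensors="pt").input_ids
    inputs_embeds = embed(input_ids)

    # Thinking phase: no token is sampled; the last hidden state itself
    # becomes the next position's input embedding.
    for _ in range(n_latent_steps):
        out = model(inputs_embeds=inputs_embeds, output_hidden_states=True)
        last_hidden = out.hidden_states[-1][:, -1:, :]   # (batch, 1, d_model)
        inputs_embeds = torch.cat([inputs_embeds, last_hidden], dim=1)

    # Generation phase: revert to normal autoregressive decoding.
    generated = []
    for _ in range(max_new_tokens):
        out = model(inputs_embeds=inputs_embeds)
        next_id = out.logits[:, -1, :].argmax(dim=-1, keepdim=True)
        generated.append(next_id)
        inputs_embeds = torch.cat([inputs_embeds, embed(next_id)], dim=1)

    return tokenizer.decode(torch.cat(generated, dim=1)[0])
```

Note that nothing in the thinking phase ever passes through the vocabulary, which is exactly why those steps are unreadable to a human monitor.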
But for all those nice things to say about it, I kind of hate it, inasmuch as I feel like I want somebody to be able to read the chain of thought. You know, even if I can't see it as a user, the idea that OpenAI can have some sort of monitoring system looking at the chain of thought and knowing what it's saying seems quite good. And making all of that illegible to humans seems bad. This is something I used to worry about and never really had a good answer for, so I just kind of let it go. Now it is showing up. It's gonna be really hard to write rules about this, but I do have somewhat of an instinct to say: since we can actually have them think in ways that are legible to us, that we can read, maybe we should do that. Maybe we shouldn't try to take all this stuff out of language space and put it into continuous space. I've also seen some analysis that says, well, you could still interpret those internal states. We have interpretability, the ability to probe and classify. But we're making it harder on ourselves when we take everything out of language space. You can't read it; you now have to use other techniques, and those techniques are relatively nascent. All that adds up to more weirdness from AIs, driven by these reinforcement learning paradigms, and broadly by efficiency pressures and things that push the reasoning deeper and make it less legible, with some upside in terms of efficiency, and breadth-first search via superposition representations. But I think all that means 2025 might be the beginning of general-purpose weirdness from AI. Surprising moves where you're like, I don't know what you were thinking there. It might have been genius, but I'm unnerved by it. I think that will probably start to happen more. I think we'll probably start to see scheming in the wild, scheming that actually has consequences. I would be surprised if that gets under control well enough to not be an issue. So on the meta level, it might not be a surprise that these things will happen, but a lot of individual users are probably going to be surprised, even if the zoomed-out perspective makes it somewhat predictable. We should expect a bunch of move 37-like things to start to pop up. It won't be surprising in the aggregate, but these individual cases will probably be pretty surprising, pretty inscrutable, and probably a lot of them will go viral, make the rounds, and be debated in similar ways that the scheming results were debated.
Adi: (1:27:17) So we talked about 2024 and 2025. Now we are going to zoom out even more, because there are questions about trajectory, about the future and impact. I'll start with Juan's question. One of our audience members asked: what are your thoughts on how AI affects education?
Nathan Labenz: (1:27:35) I hope to do an episode on this soon. I recently had a couple of guys from a nonprofit in Michigan reach out and offer to take me to lunch, and I had a really good conversation with them over a couple of hours about what they're seeing, and what they're not seeing, in terms of AI adoption in the classroom. We've not done a lot of episodes about this, but one that does stand out in my memory was Shawn Jansepar from Khan Academy. That stuff is gonna be big, although how quickly it will be adopted could lag significantly, depending on how much we see, like, teachers unions try to block it, or not. I've been pleasantly surprised by how little anti-AI activity we've seen from doctors. I feel like maybe that's because people don't know yet what's possible. In education, there might be a pretty strong immune response from the current stakeholders to say, we don't want this in our classrooms. Certainly, there is some of that out there. One really important thing is to separate learning from education, if you think of learning as something you want to do and education as something forced on you: forced learning, in a way. If you just wanna learn, it's unbelievable what is now possible. I mentioned my biology paper reading experience. To have a real-time, natural conversation interface that can see what you're seeing, that has encyclopedic knowledge, and that's just instantly there to explain things to you, that is unreal. And I would say my willingness to even try to take on the intersection of AI and biology for this feed has been basically predicated on the fact that I can use AIs to help me ramp up. Because otherwise, maybe I could have hired a human tutor, but obviously that would have come with a lot more expense, a lot more logistics, slower response times, and probably also a lot less comprehensive knowledge. It would be basically impossible to hire somebody with all the positive traits that advanced voice mode with vision has as a tutor. So it's amazing if you wanna learn any domain. It's truly incredible. If you are motivated, this is the best thing that has ever happened for learners. It's not as clear how that will be applied in public schools. My kid is 5 years old and is quite bright, but gets frustrated quickly sometimes when he's trying to read. He's very ready to read, but he's not comfortable being uncomfortable, struggling. He runs into these words he doesn't know, gets a little frustrated, and he's like, whatever, I don't feel like doing this. And it's not entirely clear how AI will help with that. Maybe AI pep talks; maybe AI will give him better pep talks than I do. But I've started to tell him recently that the most important thing is to enjoy the process of learning. That feeling of, this is hard, and then trying anyway, and it getting a little easier. That's the most important thing he needs to learn, more important even than reading, because it will apply to everything. Can AI teach that? I don't know. Maybe. But disposition, I guess, for lack of a better word, is really a huge fork in the road for just how much value you're gonna get from AI.
If you like that experience of learning, if you like being a little uncomfortable, or at least can tolerate it, then AI can help you learn extremely quickly. If you are averse, I'm not sure how much it helps. I'm not sure it makes learning easy enough that you're not at all uncomfortable; it's still, of course, effortful to learn. How it will actually get applied in the classroom is anybody's guess. My general sense is that adoption is quite lagging. What I heard from these two guys, and hopefully we'll do an episode on this with them: they've done Khan Academy pilots and stuff like that, and they said sometimes the teachers will say, oh, we love this. What about your students? Oh, we never even got that far. They're not close to pushing the limits of what is possible. The teachers are dabbling themselves, getting a little lesson plan writing help, or whatever kind of help. Khan Academy has way more already developed than is deployed, even in places where it's kind of getting deployed. A lot of times, it seems, it's partial deployments. So there's a lot of opportunity there. Personal tutoring is the most powerful way to learn. Alexander the Great had Aristotle as a tutor. We could all have an Aristotle-level tutor, perhaps, in the not too distant future. Will we actually have the inclination to take advantage of it? I suspect Alexander the Great had a lot of cultural context encouraging him to be great, and I think he had a sense for what that was gonna require of him. And I don't wanna overly glorify Alexander the Great, who I don't think is necessarily a positive figure in history in many ways, but he was definitely one who was ready for a challenge, not gonna shy away from it, and ready to do the hard work. Those things seem really important in the context of learning slash education, and much harder for the AI to have an influence on.
Adi: (1:33:00) So it seems like if you want to learn, this is the best time to be in, but you need to want to learn. That seems to be the big takeaway. So I'm gonna follow that with another question that one of our audience members asked, which I think is kind of similar. They simply asked: should I even learn to code now?
Nathan Labenz: (1:33:18) Yeah. I think that one is so individual. My simple answer is, increasingly, I think most people should do what they want to do. And there's a leap of faith around society making good choices. If you wanted to put a doomer lens on it: enjoy the time we have before the singularity in ways that you won't regret. Whatever gives you the most pleasure and fulfillment is a good default answer. I have a little bit of skin in the game on this question, in the form of an ex au pair in our family who we have sponsored to go to community college locally. She was an au pair in our family for two years, and then we basically said, for various reasons, we'll support you to stay here and go to the local community college. Now, her goal long term was to get an employer-sponsored visa and be able to stay and make a life here. Okay, how do you go about doing that? Well, as everybody knows from the H-1B discourse of the last week or so, a huge, huge percentage of those visas are going to IT and programming kind of jobs. So we looked through the database to try to get a sense for what kind of organizations are hiring and what kind of roles they're hiring for. And indeed, we found just overwhelmingly IT and programming. This was two years ago now. There was a minute when we were like, I guess you gotta do that if you wanna ultimately get one of these spots. And she was kinda like, okay. I mean, I don't really wanna be a programmer. I came here and did this because I like kids. Early childhood education would be my real preference. But I guess if this is what I have to do to be able to achieve that goal of being able to stay, then that's what I'll do. In the end, we were like, it might be so different a few years from now that I would hate to see you spend four years getting a degree in programming, which you're telling me you don't love now. You don't have those signs of being a kid that was programming at a young age. She's a bright person, and I'm sure she could learn it, but she doesn't really have a passion for it. I would hate to see you do all that and then come out the other end four years from now, only to enter a world where the skills aren't even that relevant anymore. By that time, it's plausible to me that it'll be very different. Maybe we'll be putting H-1Bs toward... well, first of all, of course, the policy itself could change. But in terms of what is considered scarce and valuable, maybe childhood education will be a place where there'll be a lot more demand, because the AIs will be doing all the programming. In the end, we basically said: do what you wanna do. Don't spend years of your life on a bet that, in some sense, implies that the AIs aren't gonna dominate the programming profession in the next few years, because it seems plausible enough that they might. At the same time, though, I would not say don't learn to code. I would say learn to code if you wanna learn to code. It's never been easier. You can pick up all sorts of things with Replit, with Cursor, as we've talked about. It is way less tedious than it used to be. You can get things explained to you way more conveniently. You can ask all sorts of questions.
And I think asking the AI is probably something that, again, is underdone even by professional programmers. I have the advantage of never having been that good at programming, so it doesn't feel like a huge ego hit to me to ask the AI for advice. It can handle 100,000 tokens of code. It can reason over that. It can give you refactoring plans, what libraries to use for this sort of thing, what patterns. It can really explain a lot about code that you have in front of you, or an application that you're dreaming of creating. I think the barrier is low. If you want to create applications and you have never realized that dream, now is a pretty good time to realize it. But if you don't have that dream, I wouldn't try to cultivate a new identity as a software developer for some expectation of future payoff, because there's a real risk it won't be there. In the short term, if you do embrace the latest tools and work effectively, there's a lot of arbitrage opportunity. People are used to paying high prices for software, and you can deliver it cost-effectively. I think hundred dollar an hour, even thousand dollar an hour projects are actually pretty abundant out there right now. I just don't know how sustainable that's gonna be, so I wouldn't build your long-term future around a bet like that. But if it seems like it will be fun to you, if you're curious about it, if you have ideas that you want to see realized, then jump in. This stuff is coming to code maybe sooner than other things, but it's hard to imagine it doesn't come to other things as well. Right? If your analysis was, well, I think coding is what I would enjoy the most, but I'm gonna go be a lawyer, because I understand that the reinforcement learning signal is stronger in programming and it's gonna take longer to come to legal analysis, I wouldn't make that bet either. They'll figure it out. It's gonna come to legal analysis. You may have a couple more years there. Maybe. Miles Brundage, the former OpenAI policy guy that I've cited repeatedly, just said something really interesting about this too. He said: don't mistake small relative differences in timing for the shape of the overall trend. And his point was, this is coming for everything. It's gonna take a little longer for some things to be figured out than others, for multiple reasons, the reward signal being one. Another that he pointed out astutely, I think, was that these are all software companies, so they know code; this is their own need. There are multiple reasons that code is gonna happen first, but that doesn't mean it's not coming to legal analysis and just about everything else you might imagine. So, bottom line, life advice, I guess, broadly: do what you wanna do, what you feel intrinsically motivated by, and let the chips fall where they may. Aside from that, I personally do not optimize really at all for marketability of future skills. I'm not really thinking super long term. My goal with this show, and with most of the activities that I'm doing, is just to learn as much as possible. That includes coding. I wanna learn how good the models are at coding. I wanna learn what the new workflows are.
Sometimes I have intrinsic motivation to build something, but I have no theory that, like, honestly, any of the skills that I'm developing right now are gonna be, like, very differentiated in a few years.
Adi: (1:40:13) So nobody knows. Seems to be: do it if you want to. Timelines are very hazy. Cool. Okay. We will move on to the next one, which actually I think is relevant for me, but I suspect many are in a similar boat. As you know, we run a small company. We do post-production for podcasters with AI. But there is always this fear that... I mean, if you had asked me in 2023, I would not have thought so. Right? But watching the trajectory of the last year, and also how you gave an overview of how things are, it's hard to figure out what's the defensible moat for a company that you're trying to build. You're trying to add value to somebody, and you want to build this in a defensible way, so you can build it up sustainably, long term, especially as a small bootstrapped company. For people who are in this boat, what do you recommend? What strategies do you need to think about, and what are things you would recommend we do, or not do, so that we don't get steamrolled by big companies, or foundation model companies, or others? What would be your advice? And also, what do you think others in the space are trying to do to safeguard themselves?
Nathan Labenz: (1:41:21) Tough one. "No moats" has been sort of the refrain, or the callback, over the last couple of years for the AI space broadly. I'm on record that I do think there are some moats in some places, but it is tough. Maybe a better way to say it is my crystal ball gets real foggy not too far out. With Waymark, I think we've done pretty well. I have some sense of why that is, but I also have some real doubt as to how that may play out in the future, and just how sustainable it is for the long term. So, Waymark makes videos for small businesses. We previously had a user interface, DIY kind of approach, where you would come in and pick out of a big library of well-designed templates that our creative team made. You would pick something that you liked the look of, and then you would be responsible for filling in all the details, all the copy, choosing all the assets, tinkering with the colors, yada yada, to make it yours and make it what you like. So we had all that stuff. Now, when AI comes along, we can get AI to do those things in our software so the user doesn't have to. And that could make the user experience way better. It could be faster. It's a better writer than most people are, especially on these somewhat idiosyncratic tasks around having a voiceover, lining up the voiceover with what's on screen, and making all this stuff work well together. Still a work in progress, but we were confident early on that this could do better than our users were doing, and we've worked on that. And, of course, a lot of people have come into the space and said something similar. They're like, hey, maybe we can get AI to create videos. I think what those new entrants have missed is that you actually do need the stuff we started with. You need a really well-designed template library, because the AIs, at least for now, are not able to create something awesome from nothing. They are very good, increasingly approaching excellent, at filling in a template in a way the user likes. But it was still a human team that created those original designs, that made them coherent, that did good motion graphics work, that mixed music and motion graphics together for impact, and all those sorts of things. The AI is good at filling that vessel with new content, but the quality of the form remains really important, and there's not an AI that can handle that from scratch today. Similarly with the interface itself: people very seldom make no changes. They almost always see something in what the AI creates for them where, even if they think it's awesome, even if they're floored by how amazing the AI is, they'll have a couple of things they wanna change. It could be as simple as a factual change: I changed my phone number. Or, that one picture, I actually have a better picture; it's not online, but I just took it. They'll always have something. So you need the ability for that last-mile edit, to give the user exactly what they want. And this is all pretty specific, but hopefully somewhat generalizable.
At the highest level, it's about bringing a certain level of taste, an understanding of what this is, what we're trying to do here, and what good looks like, and embodying that somehow in the nature of the product, which in our case is the template library. And then at the low level, it's those final changes: making that something the user can actually do in a reliable way, where they can get exactly what they want. Those things are really important. I see a lot of products missing them, especially if they came in as part of the AI wave. Then it feels like, okay, we're gonna build with the AI, build with the AI, and they don't think as much about those other things. But those other things are still important. How long this holds is tough to say. With Sora, ChatGPT can use Sora, or Veo 2, whatever. You can imagine saying, hey ChatGPT, or hey Gemini, here's my small business website, make me a 30-second TV commercial. And maybe it can get smart enough that it can just do that whole thing, do it all in pixel space, and make it awesome. But I think that's a little ways out, and it's unclear how much the folks at the big companies care about that. If you really said, okay, I wanna refine Veo, refine Sora, to the point where we can layer copy into these things, and we can edit copy on a copy-edit level, where we could say, hey, in the last scene, the phone number should be changed to this. Will they be able to do those command-type edits on raw video footage that they're spitting out in, like, world-sim pixel space? They probably can get there. I don't think anything is off the table, but they clearly haven't fine-tuned, or taken care, to get these things to work on tic-tac-toe. So are they really gonna grind out the data to cover all those little use cases? You probably have a while before those things emerge. So that's kind of a niche strategy, a little bit. Basically saying: yeah, these things might get really, generally super powerful. But if you know what counts as good in a particular domain, and you can create some guardrails to ensure that happens, and you can create some last-mile editing or customization features that make sure people get exactly what they want, rather than being frustrated by asking the AI over and over again and never quite getting it right, those things seem pretty valuable for at least a while to come. But, again, it's tough. Right? Because you could also imagine, if they get really good at coding, to the point where they can spin up things on demand, then who knows? At some point, maybe everything collapses to the cost of inference. One mental model is that maybe, in some future state, ChatGPT could spin up a Waymark app. It can call Sora for footage, handle layers, set fonts into videos, and all those sorts of things. It can write the code to do those things in a programmatic way, so it can recreate our experience. Then you maybe have some price pressure, if people start to expect, and to legitimately experience, that any set of software tools can be relatively easily spun up through simple commands to a model. Then we're headed for a world of abundance, I guess. Right? And hopefully, maybe, that's good.
Adi: (1:48:19) Maybe then I don't have to care about building a small company. I can just live in that world of abundance.
Nathan Labenz: (1:48:25) Yeah. I think that's kind of the Waymark attitude: we think it's gonna be a while before they do the stuff that we do really well, the stuff that delivers real practical value for users. And I guess I sort of glossed over this, but making sure you are actually delivering real value for users, and not just making something neat with AI, I think is really important. I have a bias for making something neat with AI, regardless of whether anybody was asking for it or not. Our CEO, Alex, a longtime close friend of mine, is better and more disciplined about asking: what do our users really want? What are the pain points they have with the current product? And making sure that he solves those. Sometimes that coincides with making the next really neat thing with AI, and other times it doesn't as much. In the short term, there's value in just making sure you actually nail it, and give people what they really want and are prepared to pay for. Longer term, it's definitely harder to say. What's weird is that these capabilities can either emerge or be engineered, and we don't always know which has happened. Right? Few-shot learning was an emergent capability. They weren't necessarily expecting that to fall out of GPT-3, but there it was. And with few-shot learning, it's like, man, this can sort of, in theory, maybe do kind of anything. It's got meta-learning. So what won't emerge with the next few generations? It's hard to say. Not everything will emerge. In the meantime, they're identifying their biggest weaknesses, patching them, working with Scale AI, and doing all sorts of things to collect really high quality data to train on. So you're more insulated as a business if you are doing something that is both non-emergent and not so focal that they're gonna invest the time and energy to collect the robust datasets needed to power that capability. I think for Waymark, we're in a decent spot there. They do care about video, but I think they mostly care about it as a world simulator thing. I don't think they really care that much about content. So I think we're in a safe space for a while, until maybe it emerges. At what point does it emerge, through pixel-level stuff, or through enough coding sophistication that you can just say, hey, how would you solve this, and it spits out an app like, oh yeah, no problem? If that happens, it could be a problem for us, because those things will have much more generality than we have. The trade-off with our app, and most apps, is that we are trying to do something narrow really well. If these things can match you on quality, they will crush you on breadth. If the AIs can ever match us on our strong points, then they'll definitely be way better on our weak points. So we just kinda have to hope that doesn't happen super soon, or at least not until we're in the age of abundance.
Adi: (1:51:14) Cool. Now that we speak of the age of abundance, I'm gonna pick another question that's very related. Why would UBI or post-scarcity be the likely outcome of AGI or ASI? It seems like wealth concentration is the natural direction set in motion by capitalism, and AI will only quicken the pace. So they are questioning the assumption: why should UBI even be a likely outcome of AGI? Why wouldn't it turn out to be another kind of wealth concentration?
Nathan Labenz: (1:51:46) Yeah. I certainly don't rule that out. I mean, a lot depends on the shape of the technology. Right? We've seen this reasoning paradigm shake things up a little bit again there, and I probably talked enough about that earlier when it comes to things like compute governance, offense-defense balance, and how we were headed toward highly egalitarian and low ability to control. This maybe moves us back a little more toward inequality and more ability to control, but I don't think this is the last word on that either. The core models are probably gonna continue to get better. I fully expect that what you can run on a laptop will continue to advance for at least a few years to come. How does that all play out? I guess one possible way is that there could be a decoupling. And arguably, this sort of already happened; I think it has certainly happened in Scandinavian societies. People can call me out here on being full of shit, because I'm not, by any means, an expert on Scandinavian societies. But you look at a society like Sweden: my understanding is that there is actually a lot of wealth concentration and hereditary passing-down of really important companies. IKEA, for example, is a family-held company. There is an elite that sort of controls a lot of the really important institutions of society. But then there is also a generous, pretty robust social safety net, and people are looked after. They certainly don't have a mass homelessness problem. They don't have a mass incarceration problem. People get to have their needs taken care of. I think something like that could be maybe the way this goes. In a future where compute remains scarce, where you need a lot of scale, resource owners will have a lot of power. But maybe individuals, even without that sort of wealth and power, will benefit from access, and the ability to just take advantage of things. There's always been an active discussion around American health care. It sucks in many ways; it's awesome in other ways. If you have a rare disease and want frontier treatment, the US is probably the best place to get that. If you want to not have to worry about a lot of bullshit, to not have to fear that your claims might be denied by an insurance company, or if you wanna look at things like maternal mortality statistics, the US does not look so good at all. Maybe we can all have the best access. Maybe we can all have an egalitarian future, where we all have access to the top oncologist, because the top oncologist is an AI, and it's highly available, highly scalable, and affordable. This is kind of the Emad vision. Right? What if everybody had access to top quality guidance? What if expertise was effectively free, or very low cost? Some version of that, I guess, would be my best guess, with the major caveat that the future is probably gonna be pretty weird. A couple of governments and a handful of big tech champions own the physical capital, do the frontier development, and basically decide amongst themselves how the future is going to go.
And then everybody else is, like, living very comfortably and has, for practical purposes, access to all these incredible sources of expertise that were previously unimaginable. That seems like maybe the new social contract we're headed for, because the compute and the decision-making power do remain scarce. Who trains the $100 billion model, and on what hardware? At that level, it's gonna be scarce. And if you want to spend a million dollars inferencing on one problem, at that level, it's gonna be scarce too. What research bets do we make? You know, what do we prioritize? I think at OpenAI, they're still, like, fairly compute-bound. My sense as to why they haven't launched Sora is they just don't have the compute to support it, really. There could be other reasons too, but that seems to be a big one. Internally, we can only scale up so many things. Which of the things are we gonna scale up? That is, like, not always an easy decision for them, and that's just not the kind of thing that is probably gonna be, like, direct democracy, you know, it would seem. At the same time, you could live extremely comfortably, efficiently. I think about how many language model interactions you can have at the cost of one crosstown trip, you know, one 20- or 30-mile car ride that you might take to the doctor and back. That covers an awful lot of tokens. So I think there is abundance in many ways, abundance of expertise, abundance of creative tools, but it seems like decision making and capital could end up being quite concentrated.
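To make that car-ride comparison concrete, here is a minimal back-of-envelope sketch in Python. The ride cost and per-token price are assumptions picked purely for illustration; actual prices vary widely by provider, model, and year.

```python
# Back-of-envelope: how many language model tokens does one crosstown car ride buy?
# The prices below are illustrative assumptions, not quotes from any specific provider.

RIDE_COST_USD = 30.0                  # assumed cost of a ~20-30 mile trip to the doctor and back
PRICE_PER_MILLION_TOKENS_USD = 10.0   # assumed blended price per 1M tokens for a frontier model

tokens_per_ride = RIDE_COST_USD / PRICE_PER_MILLION_TOKENS_USD * 1_000_000
approx_words = tokens_per_ride * 0.75  # rough rule of thumb: ~0.75 English words per token

print(f"One ride buys roughly {tokens_per_ride:,.0f} tokens (~{approx_words:,.0f} words)")
# Under these assumptions: ~3,000,000 tokens, on the order of twenty books' worth of output.
```

The exact numbers don't matter much; even if the per-token price were several times higher, a single routine trip still buys an enormous amount of model output, which is the point about abundance of expertise.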
Adi: (1:56:41) I think that kinda naturally tees up the next question, which an audience member asked: in your previous episode, you stated one of your podcast goals is using your voice to influence the world in a slightly positive direction, however marginal that might be. Now, as somebody who feels responsible but doesn't have much technical expertise, how can I contribute meaningfully to this effort of ensuring safe and responsible AI development, especially in a world where commercial pressures might rush AI progress toward potentially unsafe outcomes?
Nathan Labenz: (1:57:19) First of all, I guess I would say, don't let the "not much technical expertise" hold you back, for all the reasons we talked about with learning. Right? If you're actually motivated, you can make great progress pretty fast when it comes to understanding what you need to understand to at least be conversant. Right? To at least be sharp on what's going on. Advancing the frontiers, you know, that's a different question. But catching up to the near frontier, at least, I think has never been more accessible. I mean, there's a lot obviously going on in AI, but so much of it is the same thing working over and over again, as we talked about with, like, transformers versus something new, state space models or otherwise. If you wanna know what is really going on right now, what really matters, I think you can get there pretty fast. I wouldn't let lack of technical expertise be a mental barrier. I think that's, like, very much something most people can overcome to get to the level they need to do something. What do you do? How do you try to shape the world? I think it is tough. You know, there's a lot of, like, tough dynamics going on. I've kind of played around with this "adoption accelerationist, hyperscaling pauser" notion for a while. The idea being that if people had a better understanding of how far things have already gone, then they would have a healthier respect slash fear for where things might be going next. So I think, like, in some ways, the "can I make a positive contribution?" question is maybe answered the same way as "how can I take advantage of AI in my personal life?" Personally, I do find that there's, like, high overlap. Understanding what's going on, helping other people understand what's going on, getting day-to-day value, helping other people get day-to-day value, bringing people up to speed, or at least, like, alerting them to the fact that, look at what AI can do, you know, and having the implementation know-how to be able to do that effectively so that they're actually, like, getting properly calibrated to where things are. I think that's, like, pretty useful, honestly. Even in the AI safety community, a decent amount of stuff can be interpreted that way. When you look at, like, Apollo Research's work on scheming, you know, it's, like, very thoughtfully done. I think those guys are really smart, but they're basically demonstrating capabilities. Right? They're saying, look at how far this has already come. We just got this new model. We put it into a somewhat realistic situation, a little bit of a toy problem, a little bit contrived. But, like, if you squint, it's not hard to see similar things in the wild. And look at what we observe. I wanna be properly calibrated. I don't wanna understate how much thought has gone into that work, but I also don't wanna overstate how technically demanding that work is. I think it's much more driven by having an accurate sense of what the models can do, exploring in the right spaces, and having a sense of, like, if we could show this capability, would that be meaningful? You know, would people update their thinking on that? And I think they would probably tell you that, like, the engineering that went into it was not insane, you know, and that there weren't, like, breakthrough eureka moments there.
It's more about strategically shining a light, saying look at this aspect of what these things can do today, and then trying to make an argument about what we should think about that. They were quite successful with that. People were not all sold. Of course, there are always voices that come out to say, oh, that's not a big deal, or you're overreacting, or whatever. But they definitely made a little dent in the universe with that work by getting people's attention on it and by showing what is possible. Helping people be accurately calibrated when things are moving so fast is really hard. So that's really what I try to do for the most part, you know, to learn as much as possible. And my hope is that I'll be one of the people that has the most up-to-date and accurate worldviews, and I kinda trust that that will be valuable. You know? Try to help other people have the most up-to-date, accurate worldview they can. Any decision making will be advantaged, hopefully, by having a more accurate and up-to-date worldview. So that's not exactly a master plan, but it's a strategy a lot of people could play. It is both energizing and enlightening to hear why people listen and learn what they value about the show. So please don't hesitate to reach out via email at tcr@turpentine.co, or you can DM me on the social media platform of your choice.