Is AGI Far? With Robin Hanson, Economist at George Mason University

Nathan and Robin Hanson discuss AI's economic impact, human-AI relationships, and the moral implications of artificial intelligence in a thought-provoking episode.


Video Description

In this episode, Nathan sits down with Robin Hanson, associate professor of economics at George Mason University and researcher at Oxford’s Future of Humanity Institute. They discuss the comparison of human brains to LLMs and legacy software systems, what it would take for AI and automation to significantly impact the economy, our relationships with AI and the moral weight it has, and much more. Try the Brave search API for free for up to 2000 queries per month at https://brave.com/api


LINKS:
- Robin’s Book, The Age of Em: https://ageofem.com/
- Robin’s essay on Automation: https://www.overcomingbias.com/p/no-recent-automation-revolutionhtml
- Robin’s Blog: https://www.overcomingbias.com/
- AI Scouting Report: https://www.youtube.com/watch?v=0hvtiVQ_LqQ&list=PLVfJCYRuaJIXooK_KWju5djdVmEpH81ee&pp=iAQB
- Dr. Isaac Kohane Episode: https://www.youtube.com/watch?v=pS5Vye671Xg


SPONSORS:

The Brave search API can be used to assemble a data set to train your AI models and help with retrieval augmentation at the time of inference. All while remaining affordable with developer first pricing, integrating the Brave search API into your workflow translates to more ethical data sourcing and more human representative data sets. Try the Brave search API for free for up to 2000 queries per month at https://brave.com/api

Omneky is an omnichannel creative generation platform that lets you launch hundreds of thousands of ad iterations that actually work, customized across all platforms, with a click of a button. Omneky combines generative AI and real-time advertising data. Mention "Cog Rev" for 10% off www.omneky.com

NetSuite has 25 years of experience providing financial software for all your business needs. More than 36,000 businesses have already upgraded to NetSuite by Oracle, gaining visibility and control over their financials, inventory, HR, eCommerce, and more. If you're looking for an ERP platform, ✅ head to NetSuite: http://netsuite.com/cognitive and download your own customized KPI checklist.


X/SOCIAL:
@labenz
@robinhanson (Robin)
@CogRev_Podcast

TIMESTAMPS
(00:00) Preview
(07:10) Why our current time is a “dream time” and the move back to a Malthusian world
(13:30) What sort of world should we be striving for?
(13:40) Sponsor - Brave
(17:50) Distinguishing value talk from factual talk
(18:00) Comparing and contrasting Ems to LLMs
(22:30) The comparison of human brains to legacy software systems
(30:52) Sponsor - Netsuite
(41:01) AIs in medicine
(53:30) A several century innovation pause
(55:30) Achieving full human level AI in the next 60-90 years
(1:03:55) Chess and routine benchmarks not a good predictor of AI performance in the economy
(1:07:44) Reaching and exceeding human-level AI in the next 1000 years
(1:11:40) Losing technologies tied to scale economies
(1:12:00) Why AI is hard to maintain in the long run
(1:12:20) Standard deviation in automation
(1:14:05) Computing power grows exponentially but automation grows steadily
(1:15:50) AI art generation and deepfakes
(1:21:42) The economics of AI-powered coding
(1:33:51) Merging LLMs
(1:36:02) Rot in software and the human brain
(1:40:18) Parallelism in LLMs and brain design
(1:41:00) Moral weight for AIs, enslavement, and cooperation with AI
(1:47:10) What would change Robin’s mind about the future
(1:49:18) Wrap



Full Transcript


Nathan Labenz: (0:00) We're on this upward growth trajectory. We have the potential to take a big chunk of the universe and do things with it, and I'm excited by that potential. So I want us to keep growing, and I see how much we've changed to get to where we are. My book Age of Em is about brain emulations. That's where you take a particular human brain and scan it to find spatial chemical detail where you fill in for each cell a computer model of that cell. If you've got good enough models for cells and a good map of the brain, then basically the input-output of this model should be the same as the input-output of the original brain. If we can get full human-level AI in the next 60 to 90 years with the progress, then this population decline won't matter so much because we will basically have AIs take over most of the jobs, which can allow the world economy to keep growing.

Robin Hanson: (0:46) Hello, and welcome to the Cognitive Revolution, where we interview visionary researchers, entrepreneurs, and builders working on the frontier of artificial intelligence. Each week, we'll explore their revolutionary ideas, and together, we'll build a picture of how AI technology will transform work, life, and society in the coming years. I'm Nathan Labenz, joined by my co-host Erik Torenberg. Hello and welcome back to the Cognitive Revolution. My guest today is Robin Hanson, Professor of Economics at George Mason University and author of the blog Overcoming Bias, where Robin has published consistently on a wide range of topics since 2006, and where Eliezer Yudkowsky published early versions of what has become some of his most influential writing on AI. Robin is an undeniable polymath whose approach to futurism is unusually non-romantic. Rather than trying to identify value buddies, Robin aims to apply first principles thinking to the future and to describe what is likely to happen without claiming that you should feel any particular way about it. I set this conversation up late last year after my deep dive into the new Mamba state space model architecture, because Robin's 2016 book, The Age of Em, which analyzes a scenario in which human emulations can be run on computers, suddenly seemed a lot more relevant. My plan originally was to consider how his analysis from the Age of Em would compare to similar analyses for a hypothetical age of LLMs or perhaps even an age of SSMs. In practice, we ended up doing some of that, but for the most part took a different direction as it became clear early on in the conversation that Robin was not buying some of my core premises. Taking the outside view as he's famous for doing and noting that AI experts have repeatedly thought that they were close to AGI in the past, Robin questions whether this time really is different and doubts whether we are really close to transformative AI at all. This perspective naturally challenged my worldview, and I listened back to this conversation in full to make sure that I wasn't missing anything important before writing this introduction. Ultimately, I do remain quite firmly convinced that today's AIs are powerful enough to drive economic transformation, and I would cite the release of Google's Gemini 1.5, which happened in just the few short weeks between recording and publishing this episode, as evidence that progress is not yet slowing down. Yet, at the same time, Robin did get me thinking more about the disconnect between feasibility and actual widespread implementation and automation. Beyond the question of what AI systems can do, there are also questions of legal regulation, of course, and perhaps even more importantly, just how eager people are to use AI tools in the first place. When Robin reported that his son's software firm had recently determined that LLMs were not useful for routine application development, I was honestly kind of shocked, because if nothing else, I'm extremely confident about the degree to which LLMs accelerate my own programming work. Since then, though, I have heard a couple of other stories which, combined with Robin's, helped me develop a better theory of what's going on. First, an AI educator told me that failure to form new habits is the most common cause of failure with AI in general. In his courses, he emphasizes hands-on exercises because he's learned that simple awareness of AI capabilities does not lead to human behavioral change. 
Second, a friend told me that his company hosted a Microsoft GitHub salesperson for a lunch hour demo, and it turned out that one of their own team members had far more knowledge about GitHub Copilot than the rep himself did. If Microsoft sales reps are struggling to keep up with Copilot's capabilities, we should perhaps adjust our expectations for the rest of the economy. And third, in my own experience helping people address process bottlenecks with AI, I've repeatedly seen how unnatural it can be for people to break their own work down into the sort of discrete tasks that LLMs can handle effectively today. Most people were never trained to think this way, and it's going to take time before it becomes common practice across the economy. All this means that change may be slower to materialize than those of us on the frontiers of AI adoption might expect. And while that does suggest more of an opportunity and indeed advantage for us in the meantime, on balance, I do have to view it as a negative sign about our preparedness and our ability to adapt overall. Regardless of your views, and I do suspect that most listeners will find themselves agreeing with me more than with Robin, his insights are always thought-provoking, and I think you'll find it very well worthwhile to engage with the challenges that he presents in this conversation. As always, if you're finding value in the show, we would appreciate it if you'd share it with friends, post a review on Apple Podcasts or Spotify, or just leave a comment on YouTube. And of course, I always love to hear from listeners, so please don't hesitate to DM me on the social media platform of your choice. Now, I hope you enjoy this conversation with Professor Robin Hanson. Robin Hanson, Professor of Economics at George Mason University and notable polymath, welcome to the Cognitive Revolution. Nice to meet you, Nathan. Let's talk. I'm excited about this. So I have followed your work for a long time. It's super wide-ranging and always very interesting. People can find your thoughts on just about everything over the years on Overcoming Bias, your blog. But today, I wanted to revisit what I think is one of your destined to be perhaps one of your most influential works, which is the book The Age of Em, which came out in 2016 and envisions a future which basically amounts to putting humans on machines, and we can unpack that in more detail, and then explores that in a ton of different directions. Where we actually are now as we enter into 2024 is not exactly that, certainly. But I've come to believe recently that it's maybe bending back a little bit more toward that, certainly more than my expectations a year ago. So I've revisited the book, and I'm excited to bring a bunch of questions and kind of compare and contrast your scenario versus the current scenario that we seem to be evolving into.

Nathan Labenz: (7:07) Okay, let's do it.

Robin Hanson: (7:09) One big theme of your work, always, I think, is that we live in this strange dream time and that our reality as modern humans is quite different than the reality of those that came before us and likely those that will come after us for some pretty fundamental reasons. Do you want to just sketch out your big picture argument that our times are exceptional and not likely to go on like this forever?

Nathan Labenz: (7:35) The first thing to notice is that we are in a period of very rapid growth, very rapid change, which just can't continue for very long on a cosmological timescale. Ten thousand years would be way longer than it could manage, and therefore, we're going to have to go back to a period of slower change. And plausibly, then a period of slower change will be a period where population can grow faster relative to the growth rate of the economy and the universe. Therefore, we will move back more toward a Malthusian world, if competition remains, such as almost all our ancestors were in until a few hundred years ago. So we're in this unusual period of being rich per person and in very rapid change and also sort of globally integrated. That is, our distant ancestors were fragmented culturally across the globe and each talked to a small group of people near them. And our distant descendants will be fragmented across the universe, and they won't be able to talk all across the universe instantaneously. So future culture and past culture were both very fragmented, and we're in a period where our entire civilization can talk rapidly to each other. The time delay of communication is very small compared to the doubling time of our very rapid growth economy. So we are now an integrated civilization. We're rich, growing very fast. And there's a number of consequences of being rich, which is that we don't have to pay that much attention to functionality. We're not pressured to do what it takes to survive in the way our ancestors and our descendants will be. So we can indulge our delusions or whatever other inclinations we have. They aren't disciplined very rapidly by survival and functionality, and that makes us a dream time. That is, our dreams drive us. Our abstract thoughts, our vague impressions, our emotions, our visions. We do things that are dramatic and exciting and meaningful in our view according to this dream time mind we have, which isn't again that disciplined by functionality. That is, the mind we inherited from our distant ancestors was functional there. It was disciplined there. We're in a very different world, but our mind hasn't changed to be functional in this world. And so we are expressing this momentum of what we used to be in this strange new world. That's the dream time.

Robin Hanson: (10:18) So let me just try to rephrase that or frame it slightly differently and tell me if you agree with this framing. I would maybe interpret it as we're in a punctuated equilibrium sort of situation where we're in the transition from one equilibrium to another. There have probably been however many of these through history, not like a huge number, but a decent number. Think of such phrases as the Cambrian explosion, perhaps as another dream time. These moments happen when some external shock happens to the system, whether that's like an asteroid that takes out a lot of life or human brains come on the scene. And there's a period in which the normal constraints are temporarily relaxed. But then in the long term, there's just no escaping the logic of natural selection. Is that basically the framework?

Nathan Labenz: (11:07) Your analogy of the Cambrian explosion could be that we discovered multicellularity, we discovered being able to make large animals, and that happened at a moment. There was the moment of multicellularity, and then evolution took time to adapt to that new opportunity. And the Cambrian explosion is the period of adaptation. Then after the Cambrian explosion, we've adapted to that new opportunity, and then we're more in a stasis. So you're imagining this period of adaptation to a sudden change. But for humans today, we keep having sudden changes and they keep coming fast. There wasn't this one thing that happened 300 years ago or 10,000 years ago that we're slowly adapting to. We keep having more big changes that keep changing the landscape of what it is to adapt to. So we won't see this slow adaptation to the new thing until we get a stable new thing, which we haven't gotten yet. Things keep changing.

Robin Hanson: (12:06) I want to maybe circle back in a minute to what would be the conditions under which things would restabilize. I think the Em scenario is one of them, but there may be others that might even be more imminent at this point. Before doing that, I just wanted to touch on another big theme of your work, which is, and I really appreciate how you introduced the book this way, with the idea that I'm just trying to figure out what is likely to happen in this scenario. I'm not telling you you should like it. I'm not telling you you should dislike it. I'm not trying to judge it. I'm just trying to extrapolate from a scenario using the tools of science and social science to try to figure out what might happen. I love that, and I try to do something similar with this show around understanding AI. I think there's so much emotional valence brought to so many parts of the discussion. And I always say, we need to first figure out what is, and even in the current moment, what capabilities exist, what can be done, what is still out of reach of current systems before we can really get serious about what ought to be done about it. I guess I'd invite you to add any additional perspective to that. And then I'm also curious, I think that's very admirable, but could you give us a little window into your own kind of biases or preferences? Like, what sort of world do you think we should be striving for? Or do you think that's just so futile to even attempt to influence against these grand constraints that it doesn't matter? We'll continue our interview in a moment after a word from our sponsors.

Nathan Labenz: (13:42) Pretty much all big grand talk is mostly oriented around people sharing values. That's what people want to do when they talk big politics, when they talk world politics or world events, when they talk the future. People want to jump quickly to, do I share your values? Here's my values. What are your values? Do we agree on values? Are we value buddies? And people are so eager to get to that that they are willing to skip over the analysis of the details. Say, if you want to talk about, I don't know, the war in Ukraine, people want to go, which side are you on? Do we have the right values? Then they don't care to talk about who has how much armaments, who will run out soon, or who can afford what. All those details of the war, they don't want to go there. They just want to go to the values and agree on them. And that happens in futurism too. People just want to jump to the value. So for the purposes people have, they're doing roughly the right thing. They don't really care about the world and they don't really care about the future. What they care about is finding value buddies, or if they find a value conflict, having a value war. That's what people just want to do. And so if you actually want to figure out the world or national politics or national policy or you want to figure out the future, you really have to resist that. And you have to try to pause and go through an analysis first, a neutral analysis of what the options are, what the situation is. I am afraid literally that if I express many values that the discussion will just go there and we'll never talk about anything else. And that's why I resist talking about that. But I think my simplest value with respect to the future is I really like the fact that humanity has grown and achieved vast things compared to where it started. We're on this upward growth trajectory. We have the potential to take a big chunk of the universe and do things with it, and I'm excited by that potential. So my first cut is I want us to keep growing. And I see how much we've changed to get to where we are, and I can see that had people from a million years ago insisted that their values be maintained and that the world be familiar and comfortable to them, if they'd been able to enforce that, we would not have gotten where we are now. That would have prevented a lot of change. So I kind of see that if I want us to get big and grand, I'm going to have to give a lot on how similar the future is to me and my world. I'm going to have to compromise a lot on that. I just don't see any way around that. So I get it that if you want the future to be really comfortable for you and to share a lot of your values and your styles, you're going to have to prevent it from changing. And you may have a shot at that. I would not like that, but you might. So, again, even as part of the value framework, even when I talk values with you, I want to be clear to distinguish my value talk from the factual talk. I'm going to be happy to tell you what it would take for you to get your values even if they aren't mine. So maybe we should talk about the facts of LLMs. You want to go there in terms of comparing Ems and LLMs? Right? So first of all, for our audience, we should say my book Age of Em is about brain emulations. So that's where you take a particular human brain and you scan it to find spatial chemical detail to figure out which cells are where, connected to what other cells through what synapses. 
You make a map of that, and then you make a computer model that matches that map where you fill in for each cell a computer model of that cell. And if you've got good enough models for cells and a good enough map of the brain, then basically the input-output of this model should be the same as the input-output of the original brain, which means you could hook it up with artificial eyes, ears, hands, mouth, and then it would behave the same as the original human would in the same situation, in which case you can use these as substitutes for humans throughout the entire economy. And then my exercise in the Age of Em book was to figure out what that world looks like. And a primary purpose was to actually be able to show that it's possible to do that sort of thing. It's possible to take a specific technical assumption and work out a lot of consequences. And many people have said they didn't want so many details. They'd rather have fiction or something else, but I was trying to prove how much I could say. And I hope you'll admit I proved I could say a lot and that almost no other futurist work does that. And so I'm trying to inspire other futurists to get into that level of detail to try to take some assumptions and work out a lot of consequences. So that's my book, The Age of Em. You'd like us to compare that to current large language models and to think about what we can say about the future of large language models. So in my mind, the first thing to say there is, well, an Em is a full human substitute. It can do everything a human can do, basically. A large language model is not that yet. So a key question here would be, how far are we going to go in trying to imagine a descendant of a large language model that is more capable of substituting for humans across a wide range of contexts? If we stick with current large language models, they're really only useful in a rather limited range of contexts. And so if you're going to do forecasting of them, it's more like forecasting the future with a microwave oven or something. You think about, well, where can you use a microwave oven and how much will it cost and what other heating methods will it displace and what sort of inputs would be complements to that. It would be more of a small-scale future forecasting exercise. Whereas the Age of Em was purposely this very grand exercise because the Ems actually change everything. Whereas most futurism, like if you're trying to analyze the consequences of a microwave oven, you have a much more limited scope because, in fact, it'll have a limited impact. So that would be the question I have for you first, which is, are we going to talk about the implications of something close to the current large language models, or are we going to try to imagine some generalized version of them that has much wider capabilities?
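To make Robin's recipe a bit more concrete, here is a deliberately tiny, purely illustrative sketch of the emulation idea he outlines: a scanned "map" of which cells connect to which, plus a simple per-cell model, stepped forward in time. The cell model, the random connectome, and every number here are invented for illustration; a real emulation would require vastly richer cell models and an actual scan.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 50                                                        # number of "cells"
W = rng.normal(0, 0.4, (N, N)) * (rng.random((N, N)) < 0.1)   # sparse synapse "map"
v = np.zeros(N)                                               # per-cell state (membrane potential)
threshold, decay = 1.0, 0.9

for t in range(100):
    sensory = rng.random(N) * 0.2                   # stand-in for artificial eyes/ears input
    spikes = (v > threshold).astype(float)          # cells whose model says "fire" this tick
    v = decay * v * (1.0 - spikes)                  # fired cells reset, the rest decay
    v = v + W @ spikes + sensory                    # input arrives via the scanned connection map

print("cells firing on the last tick:", int(spikes.sum()))
```

The only point of the sketch is that, once you have the map and the cell models, running the whole thing is ordinary software: it can be copied, paused, and sped up like any other program, which is where the rest of the book's analysis starts.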

Robin Hanson: (20:10) Yeah. Very good question. I think maybe two different levels of this would be instructive. One of the key things that jumps out and I think a lot of stuff flows from is the assumption that Ems can be copied cheaply, paused and stored indefinitely cheaply, but not understood very well in terms of their internal mechanism. Right? Very much like the similar understanding to what we have of the brain where we can kind of poke and prod at it a little bit, but we really don't have a deep understanding of how it works. We can't do very localized optimizations. But we do have this radical departure from the status quo, which is you can infinitely clone them, you can infinitely freeze and store them. So this creates all sorts of elasticities that just don't exist in the current environment.

Nathan Labenz: (21:00) So a number of those features are going to be general to anything that can be represented as computer files and run on a computer. Any form of artificial intelligence will be of the sort in general that you could have a digital representation of, archive it, make a copy of it, pause it, run it faster or slower. That's going to be just generically true of any kind of AI, including Ems. The ability to sort of modify it usefully, I mean, yes, with human brains initially, they're just a big mess. You don't understand them. But honestly, most legacy software systems are pretty similar. So today, large legacy software systems, you mostly have to take them as they are. You can only make modest modifications to them. That's close to what I'm assuming for Ems, so I'm actually not assuming that they are that different from large legacy software systems. They're just a big mess that even though you could go look at any one piece and maybe understand it, that doesn't really help you usefully in modifying the entire thing. You basically have to take the whole thing as a unit and can only make some minor changes. But you can copy the whole thing. You can run it fast or slow. You can move it at speed, transfer at the speed of light around the Earth, through the universe. Even those things are true of pretty much any AI that could be represented as a computer file run on a computer.

Robin Hanson: (22:27) Yeah. I think these dimensions are a really useful way to break this down. And I took some inspiration from you in a presentation that I created called the AI Scouting Report, where I have the tale of the cognitive tape that compares human strengths and weaknesses to LLM strengths and weaknesses. And I think for the purposes of this discussion, maybe we might even have like four different kinds of things to consider. One is humans. Second would be Ems. Third is, let's say, transformer language models of the general class that we have today. Although I think we can predictably expect at a minimum that they will continue to have longer context windows and have generally more pre-training and generally more capability, at least within a certain range. And then the fourth one that I'm really interested in and has been kind of an obsession for me recently is the new state space model paradigm, which actually has some things now in common again with the humans and the Ems that the transformer models lack. The state space models, this has been, of course, a line of research that's been going on for a couple of years kind of in parallel with transformers. Transformers have taken up the vast majority of the energy and the public focus because they have been the highest performing over the last couple of years. But that has maybe just changed with a couple of recent papers, most notably one called Mamba, that basically shows parity, rough parity with the transformer on your standard language modeling tasks, but does have a totally different architecture that I think opens up some notably different strengths and weaknesses. Whereas the transformer really just has the weights and then the next token prediction, the state space model has this additional concept of the state, which is, and I recall from the book, sort of taking an information processing lens to the human or, you know, where you spend more of your focus is on the Em, you have the current state plus some new input information, sensory or whatever, and then that propagates into some action, some output, and a new internal state. And that I think is really the heart of what the new state space models do is that they add that additional component where they have not only the weights, like a transformer has static weights, but they also have this state which is of a fixed size, evolves through time, and is something that gets output at each inference step so that there is this internal state that propagates through time and can change and have long history. I think it is likely to bring about a much more integrated medium and long-term memory than the transformers have and create more sort of long episode conditioning where these models, I think, will be more amenable to employee onboarding style training, which is something also that the Ems have in your scenario. Right? You can kind of train a base Em to be an employee for you. You can even get it to that mental state where it's really excited and ready to work, and then you can freeze it, store it, boot it up when necessary, boot it up end times as necessary. The transformers don't really have that same feature right now. They're just kind of in their monolithic base form at all times, but the state space models start to add some of that back. Obviously, it's not going to be one-to-one with the humans or the Ems.

Nathan Labenz: (26:30) Here's going to be my problem with that number four. If I look at the history of AI over the history of computers and even the history of automation before that, we see this history where a really wide range of approaches have been tried, a really wide range of paradigms and concepts and structures have been introduced. And over time, we've found ways in some sense to subsume prior structures within new ones, but we've just gone through a lot of them. And there's been this tendency, unfortunately, that when people reach the next new paradigm, the next new structure, they get really excited by it and they consistently say, are we almost done? They said that centuries ago. They said that half a century ago. Every new decade, every new kind of approach that comes along, people go, there's basically typically some demo, some new capability that's a new system can do that none of the prior systems have been able to do. It's exciting and it's shocking even and exciting, but people consistently say, so we must be almost done. Right? Surely, this is enough to do everything and pretty soon humans will be displaced by automation based on this new approach. And that just happens over and over again, over and over again. And so we've had enough of those that I've got to say the chance that the next exciting new paradigm is the last one we'll need is a priori pretty low. We've had this long road to go, and we still have a long way to go ahead of us, and therefore, it's unlikely that the next new thing is the last thing. So that's my stance. I would think, okay, I can talk to you about LLMs because they're the latest thing. We can talk about what new things they can do and what exciting options that generates in the near future. Then we could ask, well, what's the chance it's the last thing we'll need or that the next one is the last thing we'll need? One way to cash that out is to ask, what do we think the chances are that within a decade or even two, basically all human jobs will be replaced by machines based on this new approach? Most of the forecasting that's done out there is excited about near-term progress in a lot of ways, but when you ask the question, when will most jobs be replaced, they give you forecasts that are way out there because they think, no, we're not close to that. And I don't think we're close to that. So then the question is, now we could say, what will happen when we eventually get to the point where AIs are good enough to do everything? And we don't know what that approach is, but we can still talk about that point and what's likely to, what the transition rate would be and the transition scenario and who would get rich and who would be unhappy and all the different things we could talk about there. But now we're talking about whatever approach eventually gets us past being able to do pretty much all human tasks, which is not where we are now. Or we can talk about where we are now and what these things can do and what exciting things might happen in the next decade. Robin Hanson: (29:48)

Hey, we'll continue our interview in a moment after a word from our sponsors. Well, I'm tempted by all of those options. So maybe for starters, I would be interested to hear how you would develop a sort of cognitive tale of the tape between humans and EMs. By presumption, they have the same cognitive abilities, but these different external properties of copyability and so on. The large language model today, the transformer, is a remarkably simple architecture. When you really just look at the wiring diagram, it's way simpler than the human brain is. And not shockingly, it can only do certain things. There are really important traits that the human brain has that the language models don't have. I've identified one of those as integrated, ever-evolving medium and long-term memory. I wonder what else you would flag there. I don't know if you have a taxonomy of what are the core competencies of humans that you could then say, oh, and here are the things that language models currently lack. I'm trying to develop something like this in general because it does seem to me that the large language models have hit not genius human level, but closing in on expert human level at some very important, dare I say, even core aspect of information processing. They can do things that I would say are qualitatively different than any earlier AI system could do. It certainly seems like we're getting closer to whatever the last step is than we used to be.

Robin Hanson: (31:29)


But just notice that phrase you just gave was true for most of all the previous ones as well. They could also do a thing that the previous ones before it couldn't do. It's always been exciting. We've found a new fundamental capability with each new paradigm, each new approach, and it's always been of the sort that it allowed the system to do fundamental things you couldn't do before that seemed to be near the core of what it was to think. So there's apparently a lot of things near the core of what it is to think. That's the key thing to realize. What it is to think is a big thing. There's a lot of things in there.

Nathan Labenz: (32:06)

Well, let's list some. I can't come up with that many honestly. I would love to hear how many can you name? I have all day. So could you begin to break down what it is to think into key components?

Robin Hanson: (32:19)

I was an AI researcher from 1984 to 1993. I was full-time at NASA and then Lockheed. Certainly, at that time, I understood the range of approaches people had and could talk about the kinds of things systems then could do or not do in expert terms relating to the then-current tasks and issues. I am not up to date at the moment on the full range of AI approaches, and I don't want to pretend to be an expert on that. But I have listened to experts, and the experts I hear basically consistently say, this is exciting, this is great, but we're not close to being able to do all the other things. And they would be much better than I am at making a list of that. And I feel like they should make the list, not me. I mean, as a polymath, you call me. I want to be very careful to know when I'm an expert on something and when I'm not. And I want to defer to other people on areas where I can find people who know more than I. And when I think I'm near the state of the art, as good as anyone on a topic, then I will feel more free to generate my own thoughts and think they're worth contributing.

Nathan Labenz: (33:29)

Fair, certainly. I think where you do still bring something very differentiated to the discussion is just the willingness to stare reality in the face or at least try to.

Robin Hanson: (33:47)

The simplest thing is if I start talking to a large language model, there's a whole bunch of things I can ask it to do that it just can't do. I'm not so sure how to organize that in terms of large major categories, but it's really obvious that there's a certain kind of thinking it can do and a bunch of other kinds of thinking it can't do. And I don't know exactly why it can't do them, but I'm talking to you, there's a bunch of things I could ask you to do in this conversation. You would probably do a decent job of them. If I were talking to a large language model, it just couldn't do those things. So it's just really obvious to me that this has a limited capability. It's really impressive compared to what you might have expected 5 or 10 years ago. It's wow, I never would have thought that would be feasible this soon, but you just try asking it a bunch of other things and it just can't do them.

Nathan Labenz: (34:34)

Yeah. I mean, I think that in my view, a lot of those things are kind of overemphasized relative to what maybe really matters. You see a lot of things online where people, and there's different categories of this, some of the things you'll see online are literally people just using non-frontier models and muddying the water. So always watch out for that. I have a long-standing practice: the first thing I do when I see somebody say GPT-4 can't do something is try it myself. And I would honestly say like two-thirds of the time, it's just straight-up misinformation, and it, in fact, can do it. But there's still the one-third of the time that matters. They're not very adversarially robust. They're easy to trick. They're easy to sort of get on the wrong track. And then they seem to get kind of stuck in a mode is a good term for it, I think, where once they're kind of on a certain track, this is how they can often get jailbroken. If you can get them to say, okay, I'll be happy to help you with that, then they'll go on and do whatever you asked because they've already kind of got into that mode.

Robin Hanson: (35:49)

Yeah. I'm much less worried about them doing things you don't want them to do than being able to get them to do things at all. Humans can be made to do all sorts of things you might not want them to do, but we survived that. I mean, to me, the main thing is if you imagine treating a large language model as a new employee in some workplace where you're trying to show them how to do something and get them to do it instead of you, that's the main thing that will be economically valuable in the world. That is when you have a thing like that, that can be introduced into a workplace, trained roughly and said, watch how I do this, you try to do it now, et cetera, then that will be the thing that makes an enormous difference in the economy because that's how we get people to do things. So that I think is, in a sense, the fundamental main task in the economy, which is a bunch of people are doing something, you have a new thing and you say, you come watch us and ask us questions and we'll ask you questions and figure out how to help us and be part of what we're doing. That is the fundamental problem in the economy. So that, in some sense, is the fundamental task that any AI has to be held up to. In the past, of course, we don't even bother to have a conversation to show how to do. We actually say, well, let's make a machine to do this thing, and then we design a machine to do this thing, and then we train it up to do this thing all with the idea of the whole thing having in mind the thing we're going to have it do. That's how AI has been usually in the economy so far. But now if you're imagining a thing that could just be trained to do a new job, well, that would be great. Sure. Then we won't have to design the AI ahead of time for the particular task, but you'll have to have a thing that's up to that. And large language models today are just clearly not up to that. You can't say, I'm about to train you how to do the following thing. Pay attention. I just did this. Now would you do it?

Nathan Labenz: (37:39)

Well, you can do that quite a bit, right? I mean, that was the main finding in GPT-3. And I'm not sure if this is verbatim, but the title of that paper was Language Models Are Few-Shot Learners. And the big breakthrough observation there, which I don't think they designed for, was, you know, there's a whole quagmire of what should count as emergent or not emergent. But my understanding is they didn't specifically train for this few-shot imitation capability, but they nevertheless got to the point where at runtime today, you can give a few examples of what you want. And in fact, that is a best practice that OpenAI and Anthropic recommend for how to get the most from their systems. They'll say some things are hard. They also have now trained them to follow instructions just verbatim or explicitly, but they will still say that some things are better shown by example than described in terms of what to do. So do that, and you'll get a lot better performance. It seems to me that there is, borrowing from medicine, watch one, do one, teach one. It seems like we're on the do one step, and that does seem to be a pretty qualitative threshold that has been passed. Now they obviously can continue to get better at that.
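For readers who haven't seen it in practice, here is roughly what the few-shot pattern Nathan describes looks like in code: show the model a couple of worked examples at runtime, then hand it a new case in the same format. The task, the examples, and the commented-out API call are illustrative stand-ins, not anything from the episode.

```python
# Invented toy task: classify customer notes, teaching the format by example.
examples = [
    ("Order #1412 arrived broken and support ignored me.", "negative"),
    ("Setup took two minutes and it works perfectly.", "positive"),
]

messages = [{"role": "system",
             "content": "Classify the sentiment of each customer note as positive or negative."}]
for note, label in examples:                       # the "watch one" demonstrations
    messages.append({"role": "user", "content": note})
    messages.append({"role": "assistant", "content": label})

messages.append({"role": "user",                   # the new case: "do one"
                 "content": "Delivery was late but the product itself is great."})

# reply = client.chat.completions.create(model="gpt-4", messages=messages)
# (call shape assumes the current OpenAI Python SDK; swap in whichever chat API you use)
```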

Robin Hanson: (39:03)

Right. But it's the range of things they can do that's the question. Yes, it's great that you could say, here's some examples, give me another one, but the range of things you can do that for is limited. Most people in most jobs, they couldn't have a large language model swap in for many of their main tasks that way. There are some and that's exciting and I hope to see people develop that and improve it. But again, the key question is how close are we to the end of this long path we've been on for a while?

Nathan Labenz: (39:32)

Yeah. I guess I think about it a little bit differently in terms of rather than thinking about the end of the path, I think of how close are we to key thresholds that will bring in qualitatively different dynamics relative to the current situation. So one threshold that I think has recently been passed in a pretty striking way, this should get more discussion than it does in my view, is Google DeepMind just put out a paper not long ago where they showed basically a 2 to 1 advantage for a large language model in medical diagnosis versus human doctors. And then, of course, they also compared to human plus AI, and that was in the middle. So on these cases that they lined up, in the scenario of chatting with your doctor, 60% accuracy from the language model, 30% accuracy from the human.

Robin Hanson: (40:29)

I was in AI from 1983 to 1994. And at the beginning, one of the reasons I came into AI was there were these big journal articles and national media coverage about studies where they showed that the best AI of the time, which they called expert systems, were able to do human-level medical diagnosis. This was in the early 1980s, right? We're talking 40 years ago. And obviously, the computer capacity is vastly larger than that. So either they were lying back then and messing with the data, or they did have human-level diagnosis back then, but they weren't allowed to apply it because of medical licensing. And we're still not allowed to apply it because of medical licensing. So this is exactly the sort of ability that won't give substantial economic impact because we had it 40 years ago and it didn't have an impact then.

Nathan Labenz: (41:30)

Yeah, I don't know. So I think one qualitative difference between that earlier system and this system, I won't claim to be an expert in the earlier expert systems, but I would guess that a huge difference is that you can take today a totally uninitiated person who has a medical concern and say, sit in front of this computer, talk to this doctor. They don't even need to know it's an AI doctor. They can just talk to it.

Robin Hanson: (42:00)

That wasn't the problem back then. They could have made these expert systems usable by ordinary people with modest effort. That wasn't the problem in using them. The problem was just you're not legally allowed to use them. Only doctors are allowed to give medical diagnoses, and so only doctors are allowed to use these systems to talk to people. That was the main obstacle, and it still is today. You could make such a system today that ordinary people could talk to, but they're not allowed to talk to it, and they won't be allowed to talk to it for a long time.

Nathan Labenz: (42:31)

I think there is a qualitative difference between these systems. If I were to sit down in front of the early eighties thing and I were to say, what's different today is the chat system could say, Robin, tell me how you're feeling. Tell me about your experience. And you can just go on in your own language, however you want to express yourself, and it can get you. And then it can ask you specific follow-up, but you're not going through a wizard and going down an expert system tree and asked for numeric scores you don't understand and don't know. You can literally just express yourself. That was not there then, right?
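To illustrate the interface difference Nathan is pointing at, here is a caricature of the two interaction styles: a fixed question tree standing in for an early-80s-style expert system intake, and an open-ended prompt standing in for the modern chat experience. All of the rules and wording are invented for illustration, not taken from any real system.

```python
def expert_system_intake(answers):
    """1980s-style fixed intake: the patient answers the system's questions,
    in the system's order, in the system's terms. Toy rules, invented here."""
    if answers.get("fever") == "y":
        if answers.get("temp_c", 0) >= 39.0 and answers.get("stiff_neck") == "y":
            return "refer: possible meningitis"
        return "likely viral illness: rest and fluids"
    if answers.get("chest_pain") == "y":
        return "refer: cardiac workup"
    return "no rule matched; see a doctor"

print(expert_system_intake({"fever": "y", "temp_c": 39.4, "stiff_neck": "y"}))

# The LLM-era intake is an open-ended conversation instead of a fixed tree:
chat_prompt = ("You are a medical intake assistant. Ask me follow-up questions one "
               "at a time. Here's what's going on: I've felt feverish and achy since "
               "Tuesday, and this morning my neck feels stiff...")
```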

Robin Hanson: (43:02)

But that's not the limiting factor, right? I mean, you couldn't have a fancy graphics interface back then either. This was early 1980s, right? But again, the limiting factor is the legal barrier. It was back then and still is, and that legal barrier doesn't look like it's about to go away. So if you're going to make us excited about applications, it'll have to be something that's legal.

Nathan Labenz: (43:25)

My model of this is that the consumer surplus of this type of thing is going to be so great.

Robin Hanson: (43:32)

It already was 40 years ago. It would have been a huge consumer surplus 40 years ago.

Nathan Labenz: (43:36)

But there was never a groundswell. I don't know. I'm just not buying this. I'm not buying that there was an experience that is qualitatively like the one that we have today, such that I think today, if you show people what Google has, they will say it is not acceptable to me that you keep this locked up behind some paywall. I don't think that was the general consumer reaction to early eighties expert systems, and it seems like that political economy pressure could change things.

Robin Hanson: (44:08)

Consider the analogy of nuclear power. The world has definitely been convinced for a long time that nuclear power is powerful. It is full of potential and power. And if we had let it go wild, we would have vastly cheaper energy today, but it was that power that scared people, which is why we don't have that energy today. The very vision of nuclear energy being powerful is what caused us not to have it. We overregulated it to death, and we made sure that the power of nuclear power was not released. We believed the power was there. It was not at all an issue of not believing that nuclear power was powerful. It was believing it was too powerful, scary, dangerous, powerful, and there's a risk that we'll do that with AI today. We will make people believe it's powerful, so powerful that they should be scared of it, and it should be locked down and not released into the wild where it might do us terrible danger.

Nathan Labenz: (45:07)

Yeah. Well, that's certainly a tragic outcome in the case of the nuclear power, and I think it would also be a tragic outcome if people are denied their AI doctors of the future on that basis. And it could happen. Certainly wouldn't rule out the possibility that AI research probably gets made illegal. This time, we do have, I mean, again, I do think we're in a different regime now where enough has been discovered, enough has been put into the hands of millions. There is sort of the open source hacker level.

Robin Hanson: (45:41)

Medical diagnosis is not. We have not put medical diagnosis AI in the hands of ordinary people. And if you tried it, you would find out just how quickly you'd get slapped down.

Nathan Labenz: (45:51)

Yeah, I think I know someone who actually may be about to try this, and it'll be very interesting to see how quickly and how hard they get slapped down and how they may respond from it. I've actually been very encouraged by response from the medical community. I would say, obviously, it's not a monolithic thing, but I did an earlier episode with Zak Kohane, who is a professor at Harvard Medical School and who had early access to GPT-4. He came out with a book basically to coincide with the launch of GPT-4 called GPT-4 and the Revolution in Medicine. Broadly, I have been encouraged by how much the medical establishment has seemingly been inclined to embrace this sort of stuff. I don't know if it's just that they're also overworked these days.

Robin Hanson: (46:40)

Well, they'll embrace the internal use of it. Again, it's always been doctors allowed to use these things. The main reason they didn't get more popular is doctors couldn't be bothered to type in and input all the information because they want to have short meetings with patients. Even today, of course, if you've gone to a modern doctor, most of your meeting with the doctor is them typing in information to their computer as they talk to you, and they don't want to spend much more time typing in more. So they don't want to use computer aids in their diagnosis, and that's been true for a long time. Computer diagnosis aids have been available for a long time that would give them better diagnoses at the cost of them having to spend more time with them, and they've chosen not to spend more time. That's been true for many decades now.

Nathan Labenz: (47:24)

Have you personally used GPT-4 for any advanced things like this, medical or legal advice or whatever?

Robin Hanson: (47:32)

No. I'm an economics professor, so I've used it to check to see what my students might try to use it to answer my exam questions or essay questions or things like that. I've asked it things that I wanted to know and tried to check on them. I haven't used it for legal or medical questions. Those are areas which are heavily regulated. It's always been possible for other people to offer substitutes. For example, many decades ago, there were experiments where for the purpose of general practice doctors, we compared doctors to nurses, nurse practitioners, or paramedics. We found that those other groups did just as well and much cheaper at doing the first level of general practice, but they haven't been allowed. So that right there is enormous value that could have been released. We could have all this time been having nurse practitioners and paramedics do our first level of general practice medicine, and they would save at least a factor of 2 or 3 in cost, and that's been true for decades. We've had randomized experiments showing that for decades.

Nathan Labenz: (48:31)

So going back to the Age of Em then for a second, are you just assuming that that scenario doesn't happen in Em land for some reason? The first objection to the Age of Em seems like it maybe should be EMs will be made illegal. Nobody will be allowed to do it. Basically, you're just in the analysis saying, well, let's just assume that doesn't happen because it's a short book if they just get made illegal too early. Is that the idea?

Robin Hanson: (49:01)

Well, first of all, I say transitions are harder to analyze than equilibria of new worlds. So I try to avoid analyzing the transition, although I do try to discuss it some towards the end of the book. But I admit, I can just say less about a transition. It does seem like that compared to a scenario where everyone eagerly adopted Em technology as soon as it was available, more likely there will be resistance. There will be ways in which there are obstacles to Em technology early on. And therefore, at some point, there would basically be the breaking of a dam flooding out where a bunch of things that had been held back were released and then caused a lot of disruption, faster disruption than would have happened had you adopted things as soon as they were available. That can be a very disturbing transition then. If all of a sudden large numbers of people are disrupted in ways they weren't expecting in a very rapid way because of a dam suddenly broke open, then I think there will be a lot of unhappy people in that sort of a transition and maybe a lot of dead people. Imagine the Em technology slowly just gets cheaper over time, but it's not very widely adopted. Then there'll be a point at which it eventually gets so cheap that if some ambitious nation like, say, North Korea said, gee, if we went whole hog in adopting this thing, we could get this big economic and military advantage over our competitors, then eventually somebody would do that. Now, it might take a long time. That is the world could coordinate to resist this technology for a long time, but I don't think they could hold it back for 1,000 years. So then I feel somewhat confident. Eventually, the Age of Em happens, and then eventually, there's a thing to think about, and then I'm analyzing that world. So I don't want to presume in the Age of Em that this transition happened smoothly or soon or as fast as it could. But I want to say eventually, there'll be this new world, and here's how it would play out. So I don't know if you know that in the last few months, I've dramatically changed my vision of the future to say that there's probably going to be a several century innovation pause, probably before the Age of Em happens. And then the world that would eventually produce AI and EMs would be a very different world from ours and somewhat hard to think about. That is rising population will stop rising and will fall due to falling fertility. That will basically make innovation grind to a halt. Then the world population will continue to fall until insular fertile subcultures like the Amish grow from their very small current levels to become the dominant population of the world. And then when that becomes large enough compared to our current economy, then innovation would turn on again and then we would restart the AI and Em path. And then eventually the Age of Em would happen. Trying to anticipate how transitions would happen in a world we can just hardly even imagine seems tough, right? That is, okay, imagine the descendants of the Amish become a large, powerful civilization. They've always been somewhat resistant to technology and very picky about which technologies they're allowed, but eventually I would predict there would be competition within them and that would push them to adopt technologies like AI and EMs. But we're looking a long way down the line. And this isn't what I wish would happen, to go back to your initial thing. 
I would rather we continued growing at the past rate of the past century and continue that for a few more centuries, by which time I'm pretty sure we'll eventually get EMs and human-level AI, although question in what order. But I got to say at the moment, that's not looking so good. So basically, I've estimated that if we were to continue on a steady growth path, we would eventually reach a point where we had the same amount of innovation as we will get over the entire interval of this several centuries pause. I've estimated that to be roughly 60 to 90 years' worth of progress. So if we can get full human-level AI in the next 60 to 90 years' worth of progress, then this population decline won't matter so much because we will basically have AIs take over most of the jobs, and then that can allow the world economy to keep growing. I think that's iffy whether we can do that, whether we can achieve full human-level AI in 60 to 90 years. And I know many people think it's going to happen in the next 10 years, they're sure. Sure, of course, it'll happen in 60 to 90 years, but I look at the history and I go, look, I've seen over and over again, people get really excited by the next new kind of AI, and they're typically pretty sure, a lot of them are pretty sure that we must be near the end, and pretty soon, we'll have it all. And it just keeps not happening.

Nathan Labenz: (53:58)

The main change I want to suggest to that paradigm is replacing the end with meaningful thresholds along the way. I think there are probably several that we will hit on some timescale. And it feels to me like at least a couple of the big ones are pretty close. And then my crystal ball gets very foggy beyond a pretty short timescale. But I'm struggling with the early 80s expert systems. It really does seem like in my lifetime, I have not seen anything that remotely resembles the experience of going to a doctor. I've seen WebMD. I'm familiar with expert systems to a degree, but I've never seen anything that, Ilya Sutskever from OpenAI puts this really well. He's like, the most shocking thing about the current AIs is that I can speak to them, and I feel that I'm understood. And that is a qualitatively different experience and clearly, I think, reflects some qualitative advance in terms of what kind of information processing is going on. If I had to say what is that under the hood, I would say it's a high-dimensional representation of concepts that are really relevant to us that have previously been limited to language-level compressed encoding. But now we are actually starting to get to the point where we can look at the middle layers of even just the systems we have today, the transformers, and say can we identify concepts like positivity, or paranoia, or love? And we are starting to be able to. It's still pretty messy. We have an analogous problem to understanding what's going on inside the brain. And it's just a mess in there still in the transformers. But we are starting to be able to see these high-dimensional representations where it's like, that is a numeric representation of some of these big concepts. And we're even starting to get to the point where we can steer the language model behavior by injecting these concepts. So you can say, for example, inject safety into the middle layers of a transformer and get a safer response, or danger, or rule-breaking, and then they'll be more likely to break their rules.

Robin Hanson: (56:31) What you're focused on at the moment is telling me about how the latest generation adds capabilities that previous generations didn't have. But every previous generation had that same conversation where they focused on the new capabilities their new generation had that the ones before didn't have. The conversation you're participating in is continuing the past trend. But the fundamental question is, when will AIs be able to do what fraction of the tasks that we have in the human economy? If they can't do a large fraction of them, no matter how impressive they are at the tasks they can do, we will see this economic decline as the population declines. They need to be able to do pretty much all the tasks in order to prevent the economic decline and then the halting of innovation. I did this study of innovation in the United States over 20 years, from 1999 to 2019. And that was a period that encompassed what many people at times said was enormous AI progress. Many people in the period were talking about how there was this revolution in AI that was causing or about to cause a revolution in society in this period from 1999 to 2019. So we did a study, a coauthor and I, Keller Scholl, looking at all jobs in the US, basically roughly 900 different kinds of jobs. And over that 20-year period, we had measures of how automated was each job in each year. And then we could do statistics to say, when jobs got more automated, did wages go up or down?
Did the number of workers in those jobs go up or down? We could say, what about jobs predicts how automated they are? And did the things that determine which jobs are how automated change over that 20-year period? That is, if there had been some revolution in the nature of automation, then the things that predicted which jobs would be more automated would have changed over time. What we found was that when jobs got more or less automated, that had no effect on average on wages or number of workers, and that the predictors of automation didn't change at all over that 20-year period, and they remain very simple-minded predictors that you might expect about automation from long ago. The nature of automation hasn't changed in the aggregate in the economy. Main predictors of automation are whether the job has nice clear measures of how well you've done it, whether it's in a clean environment with fewer disruptions, and whether tasks nearby have been automated, because there's a way in which task automation spreads through the network of nearby tasks. So that study suggested at least up until 2019, there had been no change in the nature of automation. Basically, there's a Gaussian distribution of how automated jobs are, and the median automation had moved roughly a third of a standard deviation through that distribution. So jobs had gotten more automated substantially in that 20-year period, but still most jobs aren't that automated. And that would be my rough prediction for the next 20 years, to say the pattern of the last 20 years will continue. That is, we'll slowly get more jobs more automated, but most automation will be very basic stuff. So far, we just haven't seen much at all of advanced AI kinds of automation making a dent in the larger economy.
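For readers who want a concrete picture of the concept-injection idea Nathan sketches a couple of turns above, here is a minimal, hypothetical illustration in code. It uses a toy transformer built from PyTorch's stock modules rather than any production language model, and the "steering vector" is random; in real interpretability work that direction would be estimated by contrasting activations on prompts that do and do not express the concept.

```python
# A minimal sketch of the "concept injection" idea described above: add a
# concept direction to a middle layer's activations via a forward hook.
# The toy transformer, layer choice, and random steering vector are all
# stand-ins; in real work the direction would be estimated, not random.
import torch
import torch.nn as nn

d_model, n_layers = 64, 6
layer = nn.TransformerEncoderLayer(d_model=d_model, nhead=4, batch_first=True)
model = nn.TransformerEncoder(layer, num_layers=n_layers).eval()

steer = torch.randn(d_model)
steer = steer / steer.norm()          # unit-length "concept" direction

def inject(module, inputs, output, alpha=4.0):
    # Returning a value from a forward hook replaces the layer's output.
    return output + alpha * steer

handle = model.layers[n_layers // 2].register_forward_hook(inject)

tokens = torch.randn(1, 10, d_model)  # stand-in for embedded prompt tokens
with torch.no_grad():
    steered = model(tokens)
    handle.remove()
    unsteered = model(tokens)
print((steered - unsteered).abs().mean())  # downstream activations shift
```

The only load-bearing idea is that adding a fixed direction to one middle layer's activations shifts everything computed downstream of it.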

Nathan Labenz: (1:00:01) So what do you make of things like the MMLU benchmark or BIG-bench? I'm sure you're familiar with them.

Robin Hanson: (1:00:10) Is this a machine learning set of tests to benchmark performance?

Nathan Labenz: (1:00:16) Yes. I believe it's Massive Multitask Language Understanding. The great Dan Hendrycks and team.

Robin Hanson: (1:00:22) So basically a bunch of language understanding benchmarks?

Nathan Labenz: (1:00:25) Yeah. They basically went and took final exams from university and early grad school courses across every domain and compiled them into this massive benchmark. There have been a couple different efforts like this, but this is basically the gold standard on which all the language models are measured. And we now have a high-80s to 90% accuracy rate across all fields from a single model, namely GPT-4, and now Google claims that its Gemini is hitting that level as well. I would agree that these have not been broadly customized to the last-mile specifications they need to work in the context of different firms and cultural contexts and all that sort of thing. But the way I typically describe it is that AIs are now better at routine tasks than the average person, and that they are closing in on expert performance on routine tasks. And that's measured by these medical diagnosis benchmarks, these MMLU-type things, and so on.

Robin Hanson: (1:01:32) So let me remind you that in the 1960s, AI researchers took chess as a paradigm: if you can make a machine that can do that, well, obviously you'll have to have solved most of the major problems in thinking, because chess involves most of the major problems in thinking. So when we finally have human-level chess abilities, we will have human-level AI. That was the thinking in the 60s, and they could look at the rate at which AI was getting better at chess and forecast, long before it happened, that the late 1990s was exactly when chess would reach human-level ability, and that's when it did happen. And that was 25 years ago. And clearly, we were just wrong about the idea that you couldn't do chess without solving all the major thinking problems. We repeatedly have this sort of phenomenon where people look at something and they go, if you can do that, surely you can do most everything. Then we can do that, and we aren't near to doing most everything. I've just got to say this benchmark argument is wrong. It's not true that if you can do this language benchmark, you are near to doing most everything. You are not near.

Nathan Labenz: (1:02:45) Yeah, I would frame my position as: I think you're near to being able to do all the routine things that are well documented in the training data.

Robin Hanson: (1:02:53) Well, but the question is, of all the things we need doing in the economy, how close are you to that? I'd say you're not close.

Nathan Labenz: (1:03:00) We're seeing just the very beginning of...

Robin Hanson: (1:03:06) What do you think was going on in their heads in the 1960s when they looked at chess? They looked at chess and they said, it takes really smart people to do chess. Look at all these complicated things people are doing when they play chess in order to achieve in chess. They said to themselves, that's the sort of thing we should work on, because if we can get a machine to do that, surely we must be close to general artificial intelligence. There is a sense in which, when you have general intelligence, you can use all of it to do clever things about chess, but it's not true that you need all those general things in order to be good at chess. It turns out there's a way to be good at chess without doing all those other things, and that's repeatedly been the problem, and that could be the problem today. It could turn out there's a way to do these exam-answering things that doesn't require the full range of general intelligence. It's hard to pick a good range of tasks that encompasses the full range of intelligence, because you teach to the test and you end up finding a way to solve that problem without achieving general intelligence.

Nathan Labenz: (1:04:09) This does seem different, though. I mean, I agree with your characterization that basically it turned out there was an easier way, a more direct, narrower way, to solve chess. And it's interesting that it's rather different: it involves these sort of superhuman tree-search capabilities.

Robin Hanson: (1:04:28) But that wasn't just true of chess. There were a dozen other sorts of really hard problems that people in the 1960s took as exemplars of things that would require general intelligence, and a great many of them have been achieved.

Nathan Labenz: (1:04:41) But when I look at the current situation, I'm like, this does look a lot more like human intelligence, and I would say that from any number of different directions.

Robin Hanson: (1:04:53) And that was true in every decade for the last century. Every decade has seen advances that were not the sort that previous systems could achieve.

Nathan Labenz: (1:05:03) It's clear, or at least I think it's clear, that you don't see the human brain, the human level of achievement, as some sort of maximum, right?

Robin Hanson: (1:05:13) Oh, of course not. Absolutely.

Nathan Labenz: (1:05:16) So there's got to be a finite number of breakthroughs that need to happen.

Robin Hanson: (1:05:22) We will eventually get full human-level AI. I have no doubt about that. And not long after, it will be vastly exceeded. That will happen, and it will plausibly happen within the next thousand years.

Nathan Labenz: (1:05:33) It also seems like you would probably agree that it need not be point for point. The em scenario is a great one to play out and analyze, but it need not be the case.

Robin Hanson: (1:05:45) Right. So the AIs could be much better than humans in some ways and still much worse in others. That will probably actually be true for a long time. That is, it'll take a lot longer until AIs are better than humans at most everything than until they are better than humans at, say, half of the things people do today. But of course, you have to realize, if you looked at what humans were doing two centuries ago, we're already at the point where machines do those things much better than humans can. In fact, most tasks that humans were doing two centuries ago are already long since automated. We've now switched our attention to the sorts of tasks people were not doing two centuries ago, and on those, we're not so good at making machines do them. But we've already basically achieved full automation of most things humans were doing two centuries ago.

Nathan Labenz: (1:06:29) Which, as a very rough shorthand, I would say is routine, repetitive, physical tasks.

Robin Hanson: (1:06:35) Right. We managed to change the environment to make tasks more routine and repetitive. Take a subsistence farmer on a subsistence farm two centuries ago: our automation could not do the job they were doing then. We managed to make the farms different, the factories different, and so on, so that our machines could do them. And now they are producing much more than those people produced. But if you had to try to produce the way they were doing it two centuries ago, our machines today could not do that.

Nathan Labenz: (1:07:03) Yeah. I have a related thought. I actually don't think this is going to be, well, everything here is going to be huge in some sense, but I don't think it's going to be the dominant change that leads to a qualitatively different future. But I do think we will start to see, and are beginning to see, that same process happening with language models. I consult with a few different businesses, and we have processes that we would like to automate. A classic one would be initial resume screening, right? We're not going to have the language model make the hiring decisions at this point. But if we get a lot of garbage resumes, we can definitely get language models to band the resumes into a one-to-five score and spend our time on the fives. It does seem to me that there's a lot of process and environment adaptation that is not that hard to do. I personally have done it successfully across a handful of different things. Yet it seems like your analysis sort of assumes that that's not going to happen at scale this time around with the technology we currently have.
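As an illustration of the resume-banding workflow Nathan describes, a minimal sketch might look like the following. The model name, prompt, and one-digit reply format are assumptions made for the example, not details from the conversation.

```python
# A hedged sketch of the resume-banding workflow described above: ask a chat
# model to score each resume 1-5, then spend human time on the fives.
# The model name and prompt are illustrative assumptions, not recommendations.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def band_resume(resume_text: str, job_description: str) -> int:
    prompt = (
        "Score how well this resume fits the job on a 1-5 scale. "
        "Reply with a single digit.\n\n"
        f"JOB:\n{job_description}\n\nRESUME:\n{resume_text}"
    )
    reply = client.chat.completions.create(
        model="gpt-4o-mini",  # hypothetical model choice
        messages=[{"role": "user", "content": prompt}],
        temperature=0,
    )
    text = reply.choices[0].message.content.strip()
    return int(text[0]) if text[:1].isdigit() else 1  # default to lowest band

# Usage sketch: shortlist = [r for r in resumes if band_resume(r, jd) == 5]
```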

Robin Hanson: (1:08:11) I said, you know, in the last 20 years, from 1999 to 2019, we moved roughly a third of a standard deviation in the distribution of automation. Okay, so what if, in each of the next three 20-year periods, we move another third of a standard deviation? Then over 60 years, we would basically move an entire standard deviation. That could represent a large increase in automation over the next 60 years. And that would mean a lot of things we're doing by hand today will be done by machines then. It would mean our economy is more productive, but it still would mean humans have a huge place in the world. They get paid, and most income probably still goes to pay humans to do work, even though they have much better automation at the time. If that's the situation in 60 years, then unfortunately, that level of increase in automation is just not sufficient to prevent the economy from declining as population declines. And so we won't get much more automation than that. The well of automation will dry up because innovation will stop, and we would then have a several-centuries-long period where our technology does not improve and, in fact, we lose a lot of technologies tied to scale economies. As the world economy shrinks, we'll manage to have less variety, less large-scale production and distribution, and we would then struggle to maintain previous technologies. AI is at risk of being the sort of technology that would be hard to maintain, because at the moment, AI is a really large-scale, concentrated sort of technology. It's not being done by mom-and-pops. It's being done by very large enterprises on very large scales.
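Robin's "third of a standard deviation per 20 years" claim is easy to play with numerically. The toy sketch below assumes automation scores are roughly Gaussian and picks an arbitrary cutoff for "mostly automated"; both assumptions are purely illustrative.

```python
# Illustrative only: if job automation scores are roughly Gaussian and the
# whole distribution drifts up by a third of a standard deviation every
# 20 years, how does the share of jobs above a "mostly automated" cutoff move?
# The cutoff value is an arbitrary assumption, not from the study.
from statistics import NormalDist

cutoff = 1.5                      # cutoff, in standard deviations above today's median
for period in range(4):           # 0, 20, 40, 60 years out
    shift = period / 3.0          # +1/3 standard deviation per 20-year period
    share = 1 - NormalDist().cdf(cutoff - shift)
    print(f"+{20 * period:2d} years: {share:.1%} of jobs above the cutoff")
```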

Nathan Labenz: (1:09:53) I would agree that the supply chain in AI is definitely prone to disruption. No doubt about that. Can you describe in more detail what a standard deviation in automation is, and how I should conceptualize it?

Robin Hanson: (1:10:08) I guess what you'd want to do is see a list of tasks and how automated each task was, and then see roughly where each sits on that score. So basically, if you look on this list at the most and least automated jobs, you'll agree which are which. Nearly the most automated job is airline pilot. Nearly the least automated job is carpet installer. Carpet installers use pretty much no automation to staple in carpets, and airline pilots pretty much always have automation helping what they're doing. And then you can see the scores in the middle and see that we've moved up a modest degree over those 20 years. The way to get an intuition for it is just to see a list of particular jobs and their automation scores and compare that to the amount by which we've moved up.

Nathan Labenz: (1:11:03) How do I reconcile, or how should I understand, the doubling time of the economy today? I think you said it was something like 15 years in the book, which seemed a little fast to me just based on the rule of 70.

Robin Hanson: (1:11:18) Right. I think it's more like 20 or something now.

Nathan Labenz: (1:11:21) But still, it seems like there's a little bit of a disconnect between the notion that over these next 60 years we would double, double, double, essentially 10X-ing the economy, but we'd only move at sort of a linear rate in automation, only a third of a standard deviation in each period.

Robin Hanson: (1:11:43) Let me help you understand that then. People have often said, look, computer technology is increasing exponentially. Therefore, we should expect an exponential impact on the economy, i.e., early on hardly any impact, and then suddenly an accelerating boom such that we get this big explosion and then everything happens. But that's not what we've seen. What we've seen over time is relatively steady effects of automation on the economy, even though the economy is growing exponentially. The way I'd help you understand that is to imagine the distribution of all tasks that you might want automated, and that the degree of computing power, both in hardware and software, required to automate each task is distributed in a log-normal way with a very large variance. That is, there's this very large range of how much computing power it takes to automate a task. As computing power increases exponentially, you're basically moving through that log-normal distribution in a linear manner. And in the middle of the distribution, it's a pretty steady effect. You slowly chip away at tasks as you are able to automate them, because you're slowly acquiring sufficient hardware to do each task. That gives you a simple model in which computing power grows exponentially, and yet you see a relatively steady erosion of tasks through automation.
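Here is a small numerical version of the model Robin describes: exponentially growing compute sweeping linearly through a log-normal distribution of per-task requirements. Every parameter below is invented purely for illustration.

```python
# Toy numerical version of the argument above: if the compute needed to
# automate each task is log-normally distributed with a huge variance, then
# exponentially growing compute sweeps through the distribution only linearly.
# Every number here is made up for illustration.
import numpy as np

rng = np.random.default_rng(0)
ln_required = rng.normal(loc=30.0, scale=15.0, size=100_000)  # ln(compute) per task

years = np.arange(0, 101, 20)
ln_available = 5.0 + 0.25 * years    # exponential compute growth = linear in logs

for t, c in zip(years, ln_available):
    frac = (ln_required <= c).mean()
    print(f"year {t:3d}: {frac:5.1%} of tasks automatable")
# In the middle of the distribution the automatable share creeps up by a
# broadly similar amount each 20-year step, despite exponential input growth.
```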

Nathan Labenz: (1:13:07) It's a low-hanging fruit argument.

Robin Hanson: (1:13:09) Yeah. The low-hanging fruits are hanging really low. This is a log-normal tree, basically, that you're trying to grab things from. Your ladder is growing exponentially into the tree, and every time your ladder gets taller, you get to pick more fruit, but it's a really tall tree. That means you have a long, long way to go.

Nathan Labenz: (1:13:28) How do you think about things like the progress in AI art generation or deepfakes over the last couple of years? This is an area where I feel like if we rewound to just two years ago, really, when I was first starting to see AI art popping up on Twitter, it was not very good for the most part. You'd see the occasional thing where you're like, oh, that's really compelling, and then you'd see a lot of stuff that was like, yeah, it's remarkable that you can do that. It's wow compared to what came before, but I'm not going to be watching feature films based on this technology in the immediate future. I feel like we could have had a very similar discussion then, where you might say, well, yeah, it's progress, but the real human art, the top-notch stuff, that's so far away. And then early last year, my teammates at Waymark made a short film using nothing but DALL-E imagery, DALL-E 2 at that time, and some definite elbow grease. The quality of production they were able to achieve with half a dozen people and DALL-E 2 is on the level that previously would have taken, you know, a crew in Antarctica to go shoot. Again, is that work all done? No. But if you look at the Midjourney outputs today and you look at some of the deepfake technologies that are happening today, it does feel like we've hit certain thresholds: with Midjourney, photorealistic output that's almost indistinguishable from photography, and with the deepfakes, you're not quite there yet, but watch out for 2024 to have a lot of stories of people being scammed by custom text-to-speech in a family member's voice or whatever. My voice is out there. People are going to be calling my parents with my voice. So I guess what I'm trying to get at is, it seems like even just in the last couple of years, we have these examples where we are seeing really rapid progress that is not stopping before critical thresholds.

Robin Hanson: (1:15:37) In the 1960s, there was a US presidential commission to address and study the question of whether most jobs were about to be automated. It reached that level of high-level concern in the country, and major media discussion about it. Ever since then, we continue to have periodic articles about dramatic, exciting progress in AI and what that might mean for society and the economy. And in all those articles through all those years, they don't just talk in the abstract. They usually pick out some particular examples, and they don't pick out random examples from the economy. They pick out the examples where the automation has made the most difference. That, of course, makes sense if you're trying to make an exciting story. And so we've always been able to pick out the things having the most dramatic increase lately that also seem the most salient and interesting, and now you can pick out image generation as one of the main examples of something that's increased a lot lately. And I'm happy to admit it has. It's the sort of thing that somebody writing an article today about exciting AI progress would in fact mention, talking about graphic artists being put out of work by the availability of these things, which probably is happening. The point is just to realize how selective that process is, to pick out the most dramatic impacts, and to realize just how many other jobs there are and how many other tasks there are, and then how far we still have to go. I'm happy to celebrate recent progress. And if I were a graphic artist, I would be especially excited to figure out how to take advantage of these changes, because they are among the biggest changes. If you're, say, a 20-year-old in the world, it makes complete sense to say, where are things most exciting and changing? I want to go there and be part of the new exciting thing happening there. If, of course, you're a 60-year-old who's already invested in a career, then it makes less sense to try to switch your whole career over to a new thing. But a lot of people are at the beginning of their career, and they should look for where the most exciting changes are and try to see if they can go be part of that. Go West, young man, if West is where things are happening, right? But you still have to keep in mind, if there are a few people going out West making exciting things happen, how big a percentage of the world is the West? Yes, it's exciting and there's huge growth in the West. Ten years ago, there was hardly anything; now there's a big town. How great, the West is growing. There are always times and places where, right there, things are growing very fast. Newspaper writers should focus on those to tell stories, and novelists should focus on those to tell stories. They're exciting places where exciting things are happening, and I want to make sure the world keeps having things like that happening, because that's how we can keep growing. But you have to be honest about the fraction of the world that's involved in those exciting frontier stories.

Nathan Labenz: (1:18:31) Yeah. I guess my counterpoint to that would be that the same relatively simple technology, the transformer, or perhaps the attention mechanism is the better thing to pinpoint, is driving this art creation. It's also writing programs today. I would personally say my productivity as a programmer has increased severalfold with GPT-4 assistance, not incrementally but by a multiple. And it's the wide range, right? You could go on: it's also happening in medical diagnosis, it's also happening in novel protein structure generation.

Robin Hanson: (1:19:14) Certainly, from an economic point of view, the biggest category you've mentioned is programming. That's a much larger industry, a larger profession, than the other ones you mentioned.

Nathan Labenz: (1:19:22) Well, watch out for biotech also, I would say, for sure.

Robin Hanson: (1:19:25) Biotech has been shrinking for a while, so that's not a thing you should point to as a growing thing.

Nathan Labenz: (1:19:31) I will predict growth for biotech, definitely. There's also reading brain states. Have you seen these recent results where people can read the brain state?

Robin Hanson: (1:19:40) Among the things you're talking about at the moment, the biggest profession being affected is programming, clearly. I have two sons, and my younger one is a professional programmer, so I've had him look at, and his workplace has looked into, what they can do with large language models to help them write programs. Their evaluation so far is that they'll wait six months and look again. It's not useful now.

Nathan Labenz: (1:20:06) Can I short that stock?

Robin Hanson: (1:20:08) Well, I could tell you after we finish what that is, but basically, I think this is true: most actual professional programmers are not using large language models that much in doing their job. Now, I've got to say that if some people are getting factor-of-two productivity increases, then eventually we should see some effect of that on their wages. Of course, if lots of programmers go out and get these productivity increases, in some sense we're going to increase the supply of programming. And so supply and demand would mean that maybe increasing the supply lowers the price, even if it dramatically increases the quantity. But there's such a large, elastic demand for programming in the world that I actually think that effect would be relatively weak. And so you should be expecting large increases in the wages going to programmers if you are expecting large overall increases in the productivity of programmers. Because again, there's a large, elastic demand for programming in the world. For a long time, a lot of change in the world has been driven by programming and limited by the fact that there are only so many decent programmers out there, only so many people you can get to do programming. So clearly, if we can dramatically expand the supply of programming, we can do a lot more programming in a lot more areas, and there's a lot of money that's willing to go to that. There are a lot of people who would be hiring more programmers if only they were cheaper, and they're about to get cheaper in effect. And so you should be predicting large increases in basically the wages and number of programmers in the world. We haven't seen that yet.

Nathan Labenz: (1:21:51) I do predict large increases in number. I'm not so sure about wages. It feels like...

Robin Hanson: (1:21:55) Why not?

Nathan Labenz: (1:21:56) Well, I've done a couple episodes with folks at a company called Replit, which is, at this point, a very interesting end-to-end software development platform. Their mission is to onboard the next billion developers. And they have a great mobile app. They have kids in India that are 14 years old doing it all on their mobile app. And maybe this reflects the kind of programming your son is doing, but I'd say it's much harder to take the most elite frontier work and accelerate that in a meaningful way than to commoditize the routine application development that the long tail of programmers mostly do.

Robin Hanson: (1:22:42) My son is definitely doing routine application development, not frontier programming at all. But again, I'm saying I don't expect this sudden large increase in programmer wages and quantity, especially wages. I mean, the less the quantity increases, the more wages would have to be increasing to compensate. And I think it'll be hard to get that many more people willing to be programmers, but you could pay them more. And I don't predict this. So this is a concrete thing we could even bet on over the next 5 or 10 years: will there be a big boost in programmer wages? That would be the consequence. It's a very simple supply-and-demand analysis here. This isn't some subtle rocket-science version of economics.

Nathan Labenz: (1:23:29) Well, typically when supply increases, price drops. I'm expecting lots more programmers, and for them to be broadly cheap.

Robin Hanson: (1:23:36) It depends on the elasticity of demand. Think about something there's just a very limited demand for in the world: if piano tuning got a lot cheaper, you wouldn't have a lot more pianos, because piano tuning is not one of the major costs of having a piano. It's the cost of the piano itself, plus the space for it in your living room, and the time it takes to play the piano. So piano tuning is a really small part of the cost of having a piano. That means the elasticity of demand for piano tuners by itself is pretty low. There are basically only so many pianos, and they all need to be tuned. If each piano tuner could tune each piano twice as fast, say, then we'd basically only need half as many piano tuners, because there's just not much elasticity of demand. For kinds of jobs like that, productivity increases will cause a reduction in employment. But even in that case, you might get a doubling of wages and half the number of piano tuners, because each can be twice as productive. For programming, though, it's clear to me that there's an enormously elastic demand. The world out there has far fewer programmers than it wants. People would love, all over the place, to hire more programmers to do more things. There's a big demand in the world for software to do stuff, and there's a huge potential range of things software could be doing that it's not doing now. So that means there's a pretty elastic demand for programming. That means as we increase the quantity of programming, the price doesn't come down that much; there are still people willing to buy this stuff. So that tells me that as productivity increases, basically the supply is expanding and the demand's not coming down much. So we should just see a much larger quantity, but then, because each person is being more productive, each person should get paid more. The elasticity of supply is going to be a combination of two things: each person getting more productive, and more people being willing to join that profession. And I think we've already seen that even as the wages for programming have gone way up in the last decade or so, the number of programmers hasn't gone up as fast. That is, there's just a limited number of people who are decent at programming, and it's hard to get the marginal person to be a programmer. But the people who are programmers, when they're productive, they get paid a lot. I mean, you've probably heard rumors about AI programmers and how much they're being paid lately; it's crazy high because there's just a limited supply. So I've got to say, I expect large increases in wages for programmers if, in fact, large language models are making programmers much more productive. But according to my son, at least, and others I've heard, that's not happening.
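Robin's piano-tuner versus programmer contrast comes down to the elasticity of demand. A toy constant-elasticity model, holding the number of workers fixed and treating the elasticity values as illustrative guesses rather than estimates, shows how the same productivity doubling can either cut or raise per-worker earnings.

```python
# Back-of-envelope version of the elasticity point above. With constant-
# elasticity demand Q = A * p**(-eps) and a fixed pool of workers whose
# productivity is multiplied by m, per-worker earnings scale as m**(1 - 1/eps).
# The elasticity values below are illustrative guesses, not estimates.
def earnings_ratio(m: float, eps: float) -> float:
    return m ** (1.0 - 1.0 / eps)

for label, eps in [("piano tuning (inelastic, eps=0.3)", 0.3),
                   ("programming (elastic, eps=3.0)", 3.0)]:
    print(f"{label}: earnings x{earnings_ratio(2.0, eps):.2f} after a 2x productivity gain")
# Inelastic demand: a productivity doubling crushes per-worker earnings unless
# employment shrinks; elastic demand: earnings rise along with productivity.
```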

Nathan Labenz: (1:26:24) I'm with you up until the very last two points. I would say, I think it is happening. And I would also say, my estimation of the relevant elasticities is that there will be a large growth in people who can be and will choose to be programmers, but that the wages don't go up. They don't fall dramatically either, necessarily, because it has to be an attractive thing for people to want to do. But I think the prevailing wages are quite high compared to what a lot of people would be excited to take if they could easily break in with language model assistance, which I think they will increasingly be able to do. Let me change gears a little bit. We've debated this, and I always appreciate a useful and thoughtful challenge to my world model; you're definitely supplying that. Let's do a couple more speculative things that could be kind of fun. First, a little bit on LLMs. As I was going through the book, there were a number of things where I thought, this is really interesting; how would I think about this a bit differently? So maybe suspend a little bit of your skepticism about how much impact LLMs will make. Let's go to a world where scaling continues to work, context lengths get long, and we start to see not total displacement of humans, but a substantial fraction of tasks being LLM-automatable. One interesting inference that you make is that there won't be that many different base ems: that essentially there will be super selective emifying of really elite, really capable people, that those will become the bases, that they'll essentially turn into kinds of clans where they highly identify with each other and have marginally different specializations, but that there will be these recognizable, almost canonical personalities, not that many of them, that come to dominate the economy. It seems like we're seeing something similar with language models already, where we have GPT-4, we have the new thing from Google, we have Claude, we have a couple open source ones, and then they get a lot of local fine-tuning and adaptation. My read on that was that it's initially a very surprising vision of the future, but it does seem like we see the proto version of it in the development of large language models. Any thoughts?

Robin Hanson: (1:29:03) The question is basically how many different kinds of jobs, how many job tasks, there are, and then along how many dimensions they vary. I mean, there are clearly a lot of different kinds of jobs. Like I told you, the study we did looked at 900 of them. But once you look at 900 different jobs, a lot of jobs are pretty similar to each other, and they take pretty similar mental styles and personalities to do. So when we're looking at humans, at least, it looks like a few hundred humans would be enough to do pretty much all the jobs. That's looking at the variation in humans. Now, the harder part is to say, well, for large language models, is their space of dimensional variation similar to humans' or is it very different? That's much harder to judge. But yeah, I would guess that in this way it's not that different. That is, even in large language models, you first train a basic model, and that's a lot of work. Then you train variations on it, and it does look like the variations are mostly enough to encompass a pretty wide range of tasks. You need a small number of base approaches and then a lot of cheaper variations that are enough to do particular things. So certainly, that's a remarkable fact in some sense about large language models: the range of different tasks they can do starting with the same system. And so they have a degree of generality that way. And humans in some sense have a degree of generality that way, where we are able to learn to do a pretty wide range of things. Now, I don't know if it's going to be just 4 as opposed to 40 or 400. That's harder to say. In some sense, it could be 1 or 2. I mean, even in The Age of Em, I was giving the few hundred as an upper limit. It could turn out to be much lower. It really depends on how much quick, fast, last-minute variation can actually encompass the range of differences. If differences are somewhat shallow and surface-level rather than really fundamental, then yeah, last-minute variation might be enough.

Nathan Labenz: (1:31:16) Another interesting assumption, and this one I think is more of a contrast with language models, is, and we talked about this briefly earlier, that the ems can be easily cloned, but they can't be easily merged. In other words, because we don't have a great sense of how exactly it works inside and what internal states are meaningful, we can't just superimpose them on top of one another. With language models, it seems like we are actually making a lot more progress on that front. It's not a solved problem, but there are techniques for merging, there are techniques for training separately and combining, there are these mixture-of-experts techniques.

Robin Hanson: (1:31:55) People are exploring those, but notice that to make GPT-4, you didn't start with GPT-3 and add more training. You started with a blank network and trained from scratch. And that's consistently what we've seen in AI over decades. Every new model does not start with an old model and train it to be better. You start with a blank representation and you train it from scratch. That's consistently how we've made new systems over time. So that's a substantial degree of not being able to merge. And that's quite different from humans. Often, to get a human to do a new task, you want to take a human who can do lots of previous tasks, because they can more quickly learn how to do the new task. And that's just not what we're seeing. You try to take, I don't know, Claude and GPT-4 and Grok and merge them; I just don't think anybody knows how to do such a merge today. There's no sensible way you could do it. You could take Claude and then do all the training that you would have done on GPT-4, except starting from Claude, but I think people think that would be worse than starting with the blank representation, as they usually do.

Nathan Labenz: (1:33:07) Yeah, I think that's definitely not a solved problem today. I wouldn't claim that you can just drop Claude and GPT-4 on top of each other. But there are enough early results here that it seems much more plausible. Plus, we have the full wiring diagram and the ability to kind of x-ray internal states with perfect fidelity, so it seems like there is a much more likely path. But forget about the plausibility for a second. What do you think it would mean if the AIs could be kind of divergent but also re-mergeable?

Robin Hanson: (1:33:39) I think the fundamental issue here is rot. We see rot in software, especially with large legacy systems. We see rot in the human brain. I think we have to expect rot is happening in large language models too. Rot is the reason why you don't start with old things and modify them; you start from scratch. When you have a large, old, legacy piece of software, you could keep trying to modify it and improve it, but typically, at some point, you just throw it all away and start from scratch again. People get a lot of advantage from being able to start from scratch, and that's because old large things rot. And my best guess is that that will continue to be true for large language models and all the kinds of AIs we develop. We will continue to struggle with rot as a general problem indefinitely. And this is actually a reason why you should doubt the image of the one super AI that lasts forever, because the one super AI that lasts forever will rot. In some sense, to maintain functionality and flexibility, it would have to replace itself with new, fresh versions periodically, which then could be substantially different. That's in some sense how biology has worked too. Biology could have somehow made organisms that lasted forever, but it didn't. It made organisms that rot over time and get replaced by babies that start out fresh and rot again. That's just how biology has figured it, and that's how our economy works. We could have had the same companies as we did a century ago running the economy, just changing and adapting to circumstances, but we don't. Old companies rot and die away and get replaced by new companies. And I predict in the age of ems that ems would in fact rot with time and therefore no longer be productive and have to retire and be replaced by young ems. That's a key part of the age of ems scenario that I think would generalize to the AI world. I think, in fact, rot is such a severe and irredeemable problem that AIs will have to deal with it in roughly the same way everybody else has, i.e., make systems, let them grow, become capable, slowly rot, and get replaced by new systems. And then the challenge will always be, how can the new systems learn from the old ones? How can the old ones teach the new ones what they've learned without passing on the rot? That's a long-term design problem that we're going to face even in large language models. I think in a few years, a company will have had a large language model that they've been building up and training for a while, to talk to customers or something, and then it'll be rotting. And they'll wonder, well, how can we make a new one that inherits all the things we've taught this old one? And they'll struggle with that. They can't just move the system over. They'll have to collect training sets that they can apply to the new system like they did to the old one. That will continue to be a problem in AI, as it has been in all complicated systems so far.

Nathan Labenz: (1:36:33) Yeah, interesting. I think that is a pretty compelling argument for medium and long timescales. And I can even see it already: OpenAI supports, for example, fine-tuning on a previously fine-tuned model, and in practice I don't use it. I'm not sure how many do. What I do think is still a plausibly very interesting kind of fork-and-merge is with these new state space models. One remarkably difficult challenge for a language model is to scan through my email and find what's relevant. It has a hard time doing that for a couple of different reasons: the finite context window, and I just have a lot of email. With the state space models, I do think you could clone, or parallelize, have each copy process a certain amount, and then literally just merge their states back together to understand, in kind of a superposition sort of view, all the things that are relevant, even though they were processed in parallel. So I do think that kind of quick forking and merging could be a really interesting capability. But at some level of divergence, it does seem like it probably just becomes unfeasible, or not even desirable.
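Nathan's fork-and-merge idea is speculative, but its shape can be sketched with a recurrent network standing in for a state space model: process chunks in parallel, then combine the resulting hidden states. Whether a simple average of states preserves anything useful is exactly the open question; the code below only shows the mechanics.

```python
# Speculative sketch of the fork-and-merge idea above, using a GRU as a crude
# stand-in for a state space model: process email chunks in parallel, then
# merge the resulting hidden states by averaging. Whether such a merged state
# preserves anything useful is exactly the open question.
import torch
import torch.nn as nn

d_model = 64
rnn = nn.GRU(input_size=d_model, hidden_size=d_model, batch_first=True)

chunks = torch.randn(8, 128, d_model)   # 8 email chunks, 128 "tokens" each (stand-ins)
with torch.no_grad():
    _, final_states = rnn(chunks)       # one final hidden state per chunk: (1, 8, 64)
merged = final_states.mean(dim=1)       # naive merge by superposition / averaging
print(merged.shape)                     # torch.Size([1, 64])
```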

Robin Hanson: (1:37:55) I mean, a very basic, interesting question about brain design is the scope for parallelism. In your brain, there's a lot of parallelism going on, but when you do high-level tasks, you typically do those sequentially. So there's just an open question in AI: surely you can do some things in parallel at some small enough timescale, but over how long a timescale can you do things in parallel before it becomes hard to merge them?

Nathan Labenz: (1:38:23) Okay. Another, different topic. In the age of ems, the assumption seems to be, from the beginning, that because these things are in some sense one for one with humans, they should get, or people will naturally be inclined to give them, a sort of moral-worth status?

Robin Hanson: (1:38:44) I think it's more the other way around. They would insist on it, just like you would insist that people around you dealing with you give you some substantial moral weight. If the ems are actually running the society, they will similarly insist on that, and humans who want to deal with them will kind of have to go along. Unless the ems are enslaved by humans; but if the ems are free to work with the humans or not, it's just like in general: having a modest degree of respect for your coworkers is a minimum for being a coworker. If your coworkers perceive that you disrespect them enough, then they just won't want you around and you'll have to go somewhere else. So if humans are going to interact and work with ems, they'll have to, on the surface at least, when they're not in private, treat them with modest respect.

Nathan Labenz: (1:39:36) Well, for the record, I always treat my language models with respect as well. I'm very polite to them. I never engage in the emotional manipulation techniques that some have shown to perhaps be effective; it doesn't feel quite right to me. And not because I think they're moral patients, but more because of the habits I want to get into. But I'm still a little confused by this in a couple of ways. One is, first of all, by default it seems like they will be enslaved to humans. The first ems that get created, they get loaded onto a machine, they're in some state, I can turn them on, I can turn them off. They can't decide when they get turned on and turned off. If I boot them up in an eager, ready-to-work sort of state and they're ready to do a task, and they've got these virtual inputs, they're probably not even going to be in the mindset to think, I demand respect. They're just going to be in the mindset they were stored in, of being ready to work. So I'm still a little confused as to where that comes from. Then the flip side of that question would be, under what circumstances, if any, do you think we would start to treat our language models or successor systems as moral patients, even if they're not one to one with us? Are there things they might start to do, or ways they might start to behave, where you think we would feel like that's the right thing to do?

Robin Hanson: (1:40:55) We have substantial understanding of slavery in human history, and where it works and where it doesn't and why. First of all, we know that when land was plentiful and people were scarce, people would have high wages, and then it might be worth owning somebody. But in the vice versa case, where people were plentiful and land was scarce, there really wasn't much point in having slaves, because free workers would cost about the same. Why bother with enslaving? So the situations where slavery made some sense were where wages were high. But then, depending on the kind of task, there are some kinds of tasks where slavery can help and others where it doesn't so much. Say in the US South, out in the field picking cotton, if you just need people to push through their pain, then slavery can force them to do that and make them more productive. But if they need to do complicated things, like being a house slave or a city slave at a shop, those sorts of slaves tended not to be abused and to be treated like a worker would, because they just had so many ways to screw you if they were mad. Their jobs were complicated and you were trusting them to do a lot of things. So as a practical matter, you had to treat those sorts of slaves well. Work has become far more complicated since then, and employers have become far more vulnerable to employee sabotage. There's not that much a cotton picker can do to sabotage the cotton. If they're mad at you, you can just whip them and make them pick the cotton faster. But again, house slaves, shop slaves, city slaves, they just have a lot more discretion, and you need to get them to buy in. And again, the age of ems is a world where wages are near subsistence levels. The kind of work you can get out of a slave is about the same as you can get out of a free worker, because they're both working for subsistence wages. If the free worker is more motivated, enjoys themselves more, feels better owning themselves, and that gives them a sense of pride and devotion and they're less willing to sabotage your workplace, that would be a reason not to have them be slaves. As for large language models, certainly they have been trained on data about human behavior, wherein humans are resentful of being treated as slaves, want to be respected, need to be motivated, need to feel respected to be motivated, and are less likely to sabotage if they feel like they have some freedom. All of those things would continue to be true of large language models to the extent that they were trained on human conversation and behavior, because that's how humans are. So in this vast space of possible AIs, there could be AIs that don't mind at all being enslaved, but large language models aren't going to be those.

Nathan Labenz: (1:43:53) But it does seem like you expect that natural selection, or human-guided selection, of these systems will trend in that direction. The idea that ems or language models will demand leisure seems to be at odds with the other part of the vision, that they will become okay with being turned on and turned off.

Robin Hanson: (1:44:16) The need for leisure does seem to be more just a constraint on the human mind. That is, people are just more productive when they get breaks. That seems to be a very robust feature of human work across a wide range of contexts, even including literal slaves. They need a five-minute break every hour, they need a lunch break, they need an evening break, they need a weekend. This is just what human minds are like; they are more productive when they get periodic breaks. So maybe the breaks aren't leisure exactly. Maybe they don't write a novel in their spare time, but they do need what they see as a break.

Nathan Labenz: (1:44:47) Well, I know we're just about out of time. Maybe my last question is: are there things you are looking for, or things you could imagine happening in the not-too-distant future, where you would change your expectations and begin to feel like maybe we are entering a transition period that will lead to a qualitatively different future, one going in a different direction from this sort of technology stagnation?

Robin Hanson: (1:45:19) The trends I would be tracking are which jobs and tasks actually get automated, and how much is paid for them. If I saw big chunks of the economy where, all of a sudden, automation is doing tasks instead of workers, and that's changing the number of workers, the wages they get, and the number of firms supplying that work, then yeah, I'd start to see a lot of things happening. That's the thing I'm looking for, and that's the thing people haven't seen so much in the past. They tend to focus on demos, or maybe the high-tech companies that get a lot of reputation out of doing AI, and not so much the rest of the economy and who's actually getting paid to do stuff. I mean, if you think about, say, the farming revolution, where tractors went out and replaced farmers, that was really large and really visible and really clear. If you look at, say, trucks replacing horses, you saw a very large, very substantial replacement, with enormous differences in who supplied them and who got paid. We have seen large changes in automation in the past. We don't have to scrape to see subtleties in such things. They're often just quite out in the open, visible, and very obvious. So that's what I'm waiting for: those big, obvious sorts of displacements. And even trucks replacing horses and tractors replacing farmers didn't mean automation took over everything. So even if I saw big changes, I wouldn't necessarily predict we're about to see AI take over everything, but I would at least know what I'm looking at. And that's the sort of thing to try to project forward and think about where it's going to go.

Nathan Labenz: (1:46:56) This has been an awesome conversation. I've been a fan of your work for a long time, and it's been an honor to have you on The Cognitive Revolution. Robin Hanson, thank you for being part of The Cognitive Revolution.

Robin Hanson: (1:47:08) Thanks for having me.

Nathan Labenz: (1:47:09) It is both energizing and enlightening to hear why people listen and learn what they value about the show. So please don't hesitate to reach out via email at tcr@turpentine.co, or you can DM me on the social media platform of your choice.
