Hollywood Strike Update and AI Roundup with Trey Kollmer
Trey Kollmer discusses Hollywood Strikes, actor involvement, AI's impact on acting, and experiments with GPT-4 in this insightful episode.
Watch Episode Here
Video Description
In this episode, Trey Kollmer returns to the show to discuss updates to the Hollywood Strikes, including news on SAG-AFTRA and the WGA. Trey and Nathan chat about why actors are joining the strikes, how AI will change acting as a profession, Trey’s views on reasoning, and how he’s experimenting with GPT-4, and much, much more. If you're looking for an ERP platform, check out our sponsor, NetSuite: http://netsuite.com/cognitive
📣 CALL FOR FEEDBACK:
To borrow from a meme… we're in the podcast arena trying stuff. Some will work. Some won't. But we're always learning.
http://bit.ly/TCRFeedback
Fill out the above form to let us know how we can continue delivering great content to you, or send the feedback on your mind to tcr@turpentine.co.
TIMESTAMPS:
(00:00:00) Episode Preview
(00:04:01) Hollywood and AI: updates on SAG-AFTRA and the WGA
(00:15:20) Sponsors: NetSuite | Omneky
(00:18:56) Studio approach to copyright and compensation with generative AI
(00:22:17) Hollywood receptiveness to using AI and protections guild members are asking for
(00:24:05) How much potential is there in fine-tuning models on writing Hollywood scripts?
(00:24:56) Models implementing gradient descent in the weights
(00:29:14) How Nathan uses models to write for the podcast
(00:34:15) Generating and mining jokes in the writer’s room
(00:35:18) Generating polarizing material
(00:40:18) Untraining models
(00:44:20) AI writing tools and writer perception of them
(00:46:34) Context length and pooling layers
(00:51:11) Microsoft China
(00:52:09) ChatGPT’s system prompt: steering the model in the direction you want
(01:00:02) Actors’ strike
(01:02:20) Background actor rights
(01:05:45) Using 1 million DALL-E images to create an AI short film
(01:09:11) Deepfakes
(01:10:29) Speculation on outcomes for the actors’ strike
(01:12:51) The future where anyone can be a reasonable synthetic actor
(01:16:20) New generative choose your own adventure content
(01:17:33) Trey’s take on reasoning and why Hollywood should be more open to AI
(01:25:49) A monk’s experience with ChatGPT
(01:28:56) Outdatedness of stochastic parrot notion; reasoning and synthesis
(01:32:05) Adversarial attacks
(01:36:14) Model vs human susceptibility to adversarial attacks because of human robustness
(01:39:20) Trey and Nathan’s reasoning experiments
(01:47:30) Performance jumping with abstraction
(01:49:22) Language model self-delegation
(01:51:04) NVIDIA’s margins compared to TSMC and ASML
(02:07:35) Adding an AI layer and competing with incumbents
(02:09:22) How sustainable is the demand for an AI friend?
(02:13:10) Rewind AI
(02:14:52) Dramatic ironies from the picketing lines
(02:19:27) Big tech vs old school studios
(02:21:38) AI development moments that feel like a movie
(02:26:34) CEO of Inflection’s views on misuse being a greater threat than the AI itself
LINKS MENTIONED:
- Watch Part 1 of The AI Revolution in Hollywood: https://www.youtube.com/watch?v=BkXiQitKu9I
- Pamela Samuelson on Generative AI Meets Copyright: https://www.youtube.com/watch?v=6sDGIrVO6mo
- Google Lawsuit: https://www.reuters.com/legal/litigation/google-hit-with-class-action-lawsuit-over-ai-data-scraping-2023-07-11/
- Gradient descent in weights: https://twitter.com/labenz/status/1611745393007525890
- Backspace Paper: https://arxiv.org/abs/2306.05426
- Paige Bailey on Google’s PaLM 2: https://www.youtube.com/watch?v=K-XYxLifpQE
X/SOCIAL:
@treyko (Trey)
@labenz (Nathan)
@eriktorenberg
@CogRev_Podcast
SPONSORS: NetSuite | Omneky
For 25 years, NetSuite has been providing financial software for all your business needs. More than 36,000 businesses have already upgraded to NetSuite by Oracle, gaining visibility and control over their financials, inventory, HR, eCommerce, and more. If you're looking for an ERP platform ✅ head to NetSuite: http://netsuite.com/cognitive and download your own customized KPI checklist.
Omneky is an omnichannel creative generation platform that lets you launch hundreds of thousands of ad iterations that actually work, customized across all platforms, with a click of a button. Omneky combines generative AI and real-time advertising data. Mention "Cog Rev" for 10% off.
Music Credit: GoogleLM
Full Transcript
Trey Kollmer: 0:00 She even showed pictures of a pretty similar image generated by a Stable Diffusion model next to a training image that was very similar to it, and even then she thought that no court would conclude that that was a derivative work. But she does think there may be questions in the training of the model. You're copying these works and using it to create a system which can just replace a lot of the underlying creators. But that is really up in the air, and maybe Congress will weigh in and come up with new legislation because it's so new. It seems crazy. We're trying to apply these old laws and old rules to such a different paradigm.
Nathan Labenz: 0:39 Hello, and welcome to the Cognitive Revolution, where we interview visionary researchers, entrepreneurs, and builders working on the frontier of artificial intelligence. Each week we'll explore their revolutionary ideas and together we'll build a picture of how AI technology will transform work, life, and society in the coming years. I'm Nathan Labenz joined by my cohost Erik Torenberg. Hello and welcome back to the Cognitive Revolution. Today, Trey Kollmer returns to the show. Trey first appeared in episode number 30, released back on May 30, when he and 2 other members of the Writers Guild of America discussed the WGA's AI related strike demands and also shared a glimpse into how they are beginning to use AI as part of their individual writing processes. That episode was extremely well received, thanks in large part to Trey's nuanced perspectives, and I encourage you to check it out if you haven't already. This time, we're doing something a bit different because Trey's interest in AI goes much deeper than just experimenting with it as a writing assistant. Like me, he's extremely curious about all AI developments and very well read on the subject. So I invited him back to first share an update on the strike and also to bring his own questions for discussion. We ended up talking for over 2 hours covering topics including our ever evolving understanding of AI models reasoning capabilities, why NVIDIA stock has popped so much more than other AI stocks, including their critical partners, TSMC and ASML, and where else besides hardware value is likely to accrue in the AI stack over time. Finally, we close with a segment that I'm calling the reality writer's room in which we imagine our current reality as entertainment for some hypothetical outside observers and then attempt to identify scenes of particular interest or dramatic irony. 1 small correction. Toward the end of the episode, I incorrectly said that Inflection AI was 1 of 7 companies party to the Frontier Model Forum. But what I should have said was that Inflection was 1 of 7 companies that agreed to certain voluntary commitments as part of a recent White House statement. Those 7 companies are OpenAI, Anthropic, Google, Microsoft, Meta, Amazon, and Inflection. But only 4 of them, OpenAI, Anthropic, Google, and Microsoft officially joined the Frontier Model Forum. It's a small detail, but worth getting it right. If you ever notice any inaccuracies in the show or just want to request the return of another favorite guest, please reach out and let us know. You can always email us at tcr@turpentine.co or feel free to DM me on the social media platform of your choice. And if you're finding value in the show, we always appreciate a review on Apple Podcasts or Spotify, a comment on YouTube, or just a plain old share on social media. Now I hope you enjoy this wide ranging conversation with Hollywood writer and fellow AI scout, Trey Kollmer. Trey Kollmer, welcome back to the Cognitive Revolution.
Trey Kollmer: 3:38 Oh, thanks. Thanks for having me back.
Nathan Labenz: 3:40 Very excited. This time, we're gonna pursue a wide range of topics. But just for context, if folks haven't heard your first appearance, they should go back and listen to an interview with you and 2 of your guildmates, which is a very well received episode. People thought that the perspective that you guys had on language models and the way that it's playing into the dynamic that you have with the writer strike was super interesting, and certainly I agreed with that. This time, you brought a bunch of questions, and I'm excited to get into those with you. But before we do that, tell us what is going on in Hollywood.
Trey Kollmer: 4:14 Okay. So the 2 main updates since last time are that the actors have also struck. So SAG-AFTRA, their deal expired and they weren't able to negotiate a new 1. So they are now on strike, and some of their demands are AI related, and we can get to those. But there's also been some movement on the writer strike. After a few months of not speaking with us or with the union, the AMPTP, which represents all the studios, invited us back to the table and we negotiated for a bit. This negotiation ended up falling through, but the AMPTP did release their latest offer. They released it publicly. Part of it is a tactic to hopefully get some of the writers to say, wait, that actually sounds pretty good. It hasn't been effective so far, but it's kind of interesting to see some of the movement that there's been on the AI issues. Last time we talked, I think 2 of the big things we were concerned about were the way that these language models could be used to create loopholes that would take some of the credit from writers or some of the compensation. 1 example is a rewrite is usually paid at a lower rate than a first draft. So they could have an AI model write the first draft and then pay the writer less for the rewrite. And that's just 1 example. Or in television, with a teleplay, you get paid separately for the story and for the actual script. So the language model
Nathan Labenz: 5:46 could come up with
Trey Kollmer: 5:46 an outline, and then you would lose your story payment. But the AMPTP did come back, and their offer agreed that generative AI generated material could not be considered assigned material. It would not be considered the first draft for the purpose of a rewrite. And basically, it's treated as just research, so that if you are starting to write from scratch, there are no claims for the studio or for anyone else because of the material generated by the model, which does seem like it starts to close off some of those other loopholes. I'm not fully sure if it covers everything we'd want. The guild, they have to be a little bit private with how they're thinking of progress and the updates, both to not give away their bargaining position and because, I think, talking publicly can kind of sabotage the talks as they go. So we don't have full access to everything they're thinking. But the other thing that they're still very concerned about and which has not been addressed is whether writers' material can be used as part of training data to train new models or to fine tune models that could then generate scripts. And it seems like that's something that we're still fighting for. And it kind of intersects with, I was talking to another union member who was more involved in setting up some of the demands before the strike. And it seems like right now there's 2 prongs that the Guild is thinking of in terms of trying to address the idea that these models will be trained on our work and then replace us. 1 is that there have been some copyright lawsuits. So there's been a number of copyright lawsuits against most of the labs, OpenAI, Meta, Stability. And 1 hope from people trying to protect writers' jobs is that this will slow down the use of just taking this copyrighted material and using it to improve the models. And there's this great talk. I don't know. The Simons Institute had a bunch of talks the past couple weeks, and this 1 woman, Pamela Samuelson, had a really interesting talk on these lawsuits. And there's sort of 3 theories or 3 ways that she was explaining that the lawsuits claim copyright is being infringed. The first is that if there's copyrighted material in the dataset, just copying the material onto their servers in order to do the training could be an infringing event. Even if no 1 sees the material or if the material isn't republished, just the copying of the material could be infringing. The second concern is that the outputs of the model, if they're so similar to what they've been trained on or if they're considered just derivative of the work that's been trained on, the outputs of the model themselves could be separately infringing material. And then there's a more technical 1, that it's illegal to remove or alter what's called copyright management information. So you could picture watermarks on Getty Images. If you are removing watermarks like Getty Images', or, a more complicated thing that's happened, if the model is trained on images and then generates very similar images that no longer have the watermark, that could get you into trouble. I mean, there's a lot of interesting things in her talk. 1 is that a lot of the issues seem to come down to fair use and whether this use of copyrighted material is fair use. So there was a case where Google was sued for copying all this text and written material in order to index it.
Or in the Google Books project, when they were photocopying all these books and ingesting them in order to index all of the content in these books. And that was considered fair use, because the 2 main prongs of fair use that seem to be coming up as relevant are, first, that it was transformative. They were taking this material and using it in such a different way than just republishing it. That was considered transformative, which is 1 of the fair use defenses. The other prong that they consider is how much this use of the material affects the market for the underlying copyrighted material. So for Google indexing, 1 of the considerations is, listen, they're copying all this material, but they're just indexing it, making it easier to find, and it's not hurting the demand for the blog posts or the books, the articles, the kind of underlying text that the copyright was meant to incentivize the creation of and protect. And it does seem like with these models, Stable Diffusion and Midjourney, that you could really see that affecting the market for using artist images or for using Getty Images. So there may be a different result this time. Her conclusions were that she didn't think the outputs of the models were likely to be deemed derivative works. She even showed pictures of a pretty similar image generated by a Stable Diffusion model, I think it was from Stability AI, next to a training image that was very similar to it, and even then she thought that no court would conclude that that was a derivative work. And she thinks the removal or alteration of copyright management information claims won't necessarily pan out, but she does think that there may be questions in the training of the model and that you're copying these works and using it to create a system which can just replace a lot of the underlying creators. But that is really up in the air, and maybe Congress will weigh in and come up with new legislation because it's so new. It seems crazy. We're trying to apply these old laws and old rules to such a different paradigm. And I will say 1 other thing that was interesting is the Copyright Office put out this notice of inquiry, and some writers have been mentioning it, have been asking other writers to respond to it. And it's 34 very detailed, thoughtful questions. And I was pretty impressed by how thoughtful all the questions that the Copyright Office put out were. It was just very clear, and it seemed like it addressed most of the important issues. A couple of the interesting ones: it's also asking the people building the models and the tech industry to comment on the copyright questions. It asks whether it's possible to unlearn training material without retraining, which makes me think that they're trying to get ahead of, depending on how courts rule, how possible it is to mitigate the infringement without destroying a ton of investment that's already happened. There are sections on whether datasets should be retained and recorded so that court cases can investigate to what extent the outputs are derivative of these datasets, and so it's possible for people to investigate how much copyrighted material is in the datasets. Yeah. Just a lot of questions. Are new laws needed? Which seems like, probably.
Nathan Labenz: 13:13 Yes. Seems likely. Yes. That's kind of my big takeaway. I think from everything that you're saying, there was a lot of interesting stuff. So I'm struck by just how the AI phenomenon broadly is kind of scrambling a lot of things and just showing really the need for kind of new paradigms in so many different areas. It does seem pretty clear to me that generative AI, writ large, is both transformative and derivative at the same time. And so it seems like it kind of triggers the problem on the copyright side and also perhaps the defense. Now that's not a legal opinion, but in terms of an everyday person understanding of what those words mean and a pretty decent understanding of how the systems are actually built and work, it does seem to me like both of those are really pretty obviously in play. And yet, also, I'm not even sure they're quite the central questions. I mean, there was a recent Anthropic paper about trying to figure out what elements of the training data most influence a model's behavior much later downstream. And the core of it, of course, also speaks a little bit to your question of could you unlearn stuff without retraining? Basically, not really right now. That's certainly not a solved problem. Right? So the purest way to say, well, how would a model behave differently if this 1 data point wasn't in the training set? You'd have to, in theory, retrain minus 1 data point for every data point. It's totally infeasible. That motivated them to come up with an approximation that uses some advanced math, which I really don't understand, but which kind of allows them to get the bulk of the understanding with a tiny fraction of the compute needed. But then the results are still kind of really weird. Hey. We'll continue our interview in a moment after a word from our sponsors. You kind of see, for example, that if the model is small, then the parts of the training dataset that most influence behavior were very literal matches, the same keywords. And then as you go bigger and there's this higher level understanding that starts to develop, then it's much more conceptual stuff. But it still seemed to me, at least in some cases and, I mean, I would need to study this in more detail. Maybe we can get somebody from Anthropic on the horn here to help us get a little more clarity. But it still seemed to me that it was kind of idiosyncratic. And while it was more conceptually relevant stuff, it seemed like there were a kind of surprisingly small number of data points that had kind of still a large impact. And it didn't seem like they were the obvious ones that should have been, so exactly what's going on here still seemed a little weird. The trend of keyword matching at small scale and conceptual matching at higher scale was clear. But then when you zoom in, you're like, well, why is it that conceptual passage instead of probably a ton of other things that could have been in there? So all just very weird, very hard to figure out. It seems like even if you were to say, we had this answer, now all of a sudden, oh, jeez, it still varies across model sizes, and there still seems like there's some weirdness in here. It doesn't necessarily feel super convincing that this passage is being identified as the 1 that's the most influential. And you could obviously have multiple ones too. But, yeah, it just
Trey Kollmer: 16:57 seems kind of all strange. Trey Kollmer: 16:57 seems kind of all strange.
Nathan Labenz: 16:58 I guess going back to the beginning on just the proposal and kind of the state of negotiations, broadly speaking, I've been pretty positively surprised by how much good faith a lot of people have approached the whole AI potential for disruption with. For example, in medicine, I've repeatedly said, it seems like the medical establishment is a lot more receptive to trying to get the value from AI that they should be able to get over the next few years as opposed to digging in now and saying it can never do anything important or whatever. It almost sounds like at the high level, maybe the studios are taking a somewhat similar approach. They're at least being reasonably conciliatory and saying, hey, we kind of get you. To write you out of these deals and have AI credited instead, we're okay agreeing not to do that. Is that basically the state of play there now?
Trey Kollmer: 17:56 So I don't have insight into the actual negotiation, but I think it was positive to see that in their offer, they were at least claiming that they were trying to block out all those loopholes and to claim they weren't going to try to use the models to steal credit and to nickel and dime and give lower compensation and stuff based on the model generating a pass of the material. I do think 1 way that the other demand intersects with the copyright issue is I do think the union wants some movement on not using our scripts as training data to improve the models. The reason that is separate from the copyright is that usually when a script is purchased or you're hired to write a script, it's a work for hire, which means that the studios own the copyrights. So a lot of the discussion is they own the copyrights to all this material. Can they use it? Whatever is found in the broader kind of legal cases that are affecting these companies, can they use their material, which they own, but which we've written to train models that could 1 day replace us? I think it seems like there's still not agreement on that. But I don't have insight into the behind the scenes what's going on.
Nathan Labenz: 19:19 That may be another instance of threshold effects being super important too, because, again, I can imagine that in the medical setting, hey, we're all overworked here. This is really tough. If you can help me write more empathetic follow-up notes to patients, then I'm all for it. That's great. If I can get a second set of eyes and double check me in all sorts of different ways, and that can avoid oversights and improve patient outcomes, great. All of a sudden, what happens if Google now has a doctor on your phone? It can take, it can see the pictures. It can hear your voice. It can basically serve as that frontline doctor independently. That may flip how people start to see it. And here, you could almost see the same possibility, where they may say, okay, we really don't know what's going to happen. Let's take kind of a conciliatory approach for the time being. Let's not let this derail this contract for too much longer. Obviously, streaming revenue splits remain a big issue too, but let's kind of put this 1 behind us. But maybe we don't want to concede yet that we can't do these things, because if it really does flip to the point where we could just fine tune models and have infinite scripts for free, we don't want to cut off that opportunity for ourselves as the studios before we even know if it's a real possibility. I could see that these threshold effects, sometimes they kind of cut sharply in different ways at different times, it seems.
Trey Kollmer: 20:55 Yeah. Totally. And I think to that, I think a lot of the protections the guild is asking for are most relevant in sort of a medium progress scenario. There's a low progress scenario where the current models really aren't good enough to have a huge impact. And then there's a world where they get so much better that it's sort of a quixotic effort to try and stop them from being used.
Nathan Labenz: 21:23 I mean, I think a lot of things right now are kind of playing to the middle scenario. I've just been working on a charity evaluation project with a focus on AI safety, and probably we'll talk a little bit more about that in the future once it's all wrapped up. But that's kind of a trend, that people are thinking, this could get out of hand really fast. And if so, nobody really knows what to do about it. There's definitely a risk now that scaling just continues, or that there's another algorithmic breakthrough, or somebody kind of figures out some fundamental agent concept that unlocks a whole new level of sustained goal directedness in autonomous agents. And, again, just nobody really has much of a plan for that. It's basically, we hope that doesn't happen. And in the meantime, we'll try to kind of make an impact on the scenarios that seem more steerable. So in that sense, I think, honestly, the strategy that the guild is playing there seems quite reasonable and oddly consistent with strategies that I'm seeing in AI safety charity work, which would seemingly be quite different but seems to be premised on a similar sort of distribution of possibilities.
Trey Kollmer: 22:41 Yeah. So, I mean, to that, and how quickly you think progress could be happening in terms of how good these models can get at, specifically, writing stories, coming up with outlines or writing scenes and scripts. I was sort of curious, how much juice do you think there is in fine tuning these models specifically on trying to write?
Nathan Labenz: 23:05 Yeah. It's a really good question. 1 data point is with Waymark. In some sense, the closest thing I've been involved with to writing, for example, a sitcom or drama screenplay would be writing the 30 second commercial scripts that we generate with language models at Waymark. That is obviously a much simpler problem, but what we've seen there for sure is that fine tuning helps. Fine tuning of the original davinci model was night and day better than trying to kind of do a few shot thing with the davinci model. There's a pretty deep continuum, it seems, between few shot learning and fine tuning where, at least at some scale, there's evidence that the models implement gradient descent in the weights. So the mechanism that they're using to learn from the few shot examples is very similar in some sense to the, I don't know, isomorphic is probably a little too strong, but there's a very deep kind of correspondence between the way that the few examples are influencing output and the way that fine tuning on examples is influencing output. 1 of my papers of the year last year was a paper where they basically designed an algorithm in the weights to implement gradient descent and then went looking for it in actual trained models and found it. And they were, look at this. We predicted that this might be there as a mechanism, and sure enough, here it is in the wild. Anyway, I kind of see this as a continuum, that's the key point there, between a few examples and low scale fine tuning.
Trey Kollmer: 24:44 Oh, it's interesting. So you're saying if you give it a few shot examples just in context, people have gone in and watched that the change in the activations is equivalent to having done gradient descent on those.
Nathan Labenz: 24:59 Yeah. There's a gradient descent. I don't know quite how generalized this is. This was published last year, and we can put a link to the paper in the show notes. But I would assume that it's basically the same thing must be happening in the frontier models. If anything, I would be kind of surprised, thinking back to what models were available at the time. I don't recall exactly which models they looked at, but they certainly didn't have access to your GPT-4s or your 3.5s or your Claudes. So it was or even your Llama 2 at that time. So it was definitely something significantly smaller than where the frontier is today. So to see that it happened at something that's probably only 1% as big as the current frontier models in terms of compute flops or whatever, definitely suggest that it's probably a pretty general phenomenon. So, yeah, it's an incredible result. All that just to say though that there is basically, I think a pretty fundamental sense in which it's a continuum. Few shot examples in runtime versus fine tuning on a few shot or low scale fine tuning basically seems to be doing something very similar, and it definitely adds value in our commercial writing use case. It's not necessarily, in general, this whole thing, this was another really interesting grant from the charity process where somebody made this point that all the current systems pretty much have a fundamental assumption that the future is the same as the present. They're trying to predict the dataset. And so there's a lot of ways you can kind of throw them off when you go out of distribution or whatever. But 1 just fundamental assumption that's baked in is the future is kind of the same as the past and that the overall landscape isn't shifting too much. So I think you kind of see that in the results. I wouldn't say the results that we get with fine tuning are mind blowingly awesome, but they are very helpful in terms of dialing in form and structure. And just if we have scenes in a certain template that have kind of a 1-2 punch, then when it sees some of those, it starts to deliver the 1-2 punch. Whereas if you didn't have as many examples, then it might just kind of go something and something and kind of be more boring. So I do think for structure, for kind of if you're, oh, this thing is kind of coming up with okay ideas, but they're not the right length or this isn't really how things are typically done. We need to really dial in consistency of structure so that we can then at least have something that is plausibly going to work and we can decide if we like it. Definitely very helpful for that. No doubt. The other data point that I have is trying to have it write intros for this show. And there, I basically always end up rewriting them. I'm currently using Claude 2 for that. I take 5 examples of previous intros plus the transcript of the current episode plus a runtime kind of angle. I call it an angle, which is do I have kind of a spin or an angle or something that I have in mind to direct the model on this particular episode? And so those are the 3 inputs. Half a dozen quality examples from the past, transcript, and spin angle, kind of runtime instruction. And that can be useful, but it's not great. I aspire to provide something in the first couple minutes that's meaningful synthesis or a useful perspective that kind of contextualizes what's to follow. And I usually don't get anything that feels really like that to me. What it can do is it can give me a decent outline of what we talked about. 
It can kind of give a neutral narrator voice. They talk about this, then they talk about this. It can do that pretty well. But if I want to have something I would sign on, Nathan's take on this, I don't get that. So I suspect that it's probably still in kind of a similar place, I don't know. It's interesting. Depending on what you're doing. Comedy could fall maybe somewhere in between where maybe there are enough jokes out there that it can write a bunch of stuff and 1 in a 100 could be cool. I don't think you're going to get to a high rate of success, but I'm not sure it's a high rate of success even by the human writers. So I don't know what percentage of jokes in a writer's room or pilots or whatever are ultimately good, but I would guess that even the best fine tuned models that you could come up with today would be kind of not hitting at a super high rate, but maybe high enough where it could be viable. I mean, you still have a filtering problem. How do you identify? Okay. Now I have a 100 scripts. Maybe 1 of them on average is going to be good. Which 1 is good? That's not necessarily super easy either. And there's a ton of disagreement. I mean, that's another thing I always really see. And, again, it could be prompt developing with just few shot or it could be moving more toward fine tuning. But disagreement is a real struggle for this kind of thing. The RLHF is kind of taking us toward a kind of neutral style in some sense, when you probably want to be a little more opinionated in your script writing. I think we talked about this a little bit last time. A bimodal distribution is kind of probably the best you could do in the case of creative work. You're not going to get everybody to love it, but if you can get some people to really love it, you're in business. The current AI training regimes do not push toward that bimodal as much. They're more kind of converging toward a center. So I think it would be interesting. I guess to summarize all that, I'm thinking about it all out loud here. But simple tasks, it definitely can help dial in structure. And in our commercial writing, we want pretty good, but we're not really looking for provocative. A lot of times we're trying to introduce a small business. We want to get a few key points across. It can do that pretty well, but it's not like we need to blow anybody's mind. We need to introduce a small business, ideally with a good hook and make it kind of attention grabbing. But, again, it's not totally novel groundbreaking material. I aim for at least somewhat groundbreaking insight in my intro to the show, and I don't really get that from AI. And the frequency of that is almost never, basically super low. I don't even run it enough to see those examples. I more just kind of run it and then end up rewriting it myself. And then in between, it'd be interesting to try to imagine or try to maybe figure out for something like a sitcom script, what positive rate of success would you need for it to be something that would actually be useful? If it's 1 in 1,000,000, it's obviously, well, forget it. We're not going to be able to find the 1 in 1,000,000. If it's 1 in 10, you could probably eyeball through them and be, oh, this is the 1 that's best. Maybe this will have legs. Somewhere in between, maybe there's another threshold effect where it actually makes sense to generate and mine the generated content as opposed to, it gets to be a very different workflow. That's, again, another trend.
A lot of these workflows end up being very different where it's I might have to generate and mine. I wouldn't generate it. Maybe Hollywood collectively is maybe generating and mining. But as an individual, you wouldn't. But maybe now as an individual with AI at some level of success, maybe you could.
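As an aside on the "in-context learning implements gradient descent" correspondence Nathan and Trey discussed a few turns back (the paper is linked in the show notes): a heavily simplified sketch of the idea is below. For in-context linear regression, a single linear (softmax-free) attention layer reading the example pairs produces exactly the prediction of a linear model after one gradient-descent step on those same pairs. Frontier models are far more complicated; this only shows the correspondence is mechanically possible.

```python
# Toy demonstration: one step of gradient descent on in-context (x_i, y_i) pairs
# gives the same prediction as a linear-attention layer over those pairs.
import numpy as np

rng = np.random.default_rng(1)
d, n_context = 4, 32
w_true = rng.normal(size=d)

X = rng.normal(size=(n_context, d))          # in-context inputs x_i
y = X @ w_true                               # in-context targets y_i
x_query = rng.normal(size=d)                 # query we want a prediction for
lr = 0.1

# 1) "Learning" view: one gradient step from w = 0 on (1/2N) * sum_i (w.x_i - y_i)^2.
w_after_one_step = (lr / n_context) * (y @ X)        # lr/N * sum_i y_i * x_i
pred_gd = w_after_one_step @ x_query

# 2) "Attention" view: linear attention with keys x_i, values y_i, query x_query.
attn_scores = X @ x_query                            # k_i . q for every context token
pred_attention = (lr / n_context) * (y @ attn_scores)  # sum_i v_i * (k_i . q), rescaled

print(pred_gd, pred_attention)               # identical up to floating point
```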
Trey Kollmer: 32:57 For jokes, you definitely do generate and mine. The show New Girl was sort of famous for it. There's a concept called alts, which are alternate jokes that you have on set. So if a joke isn't working or you just want to try something different, you have a list of alts for each joke you might want to switch up. And on New Girl, the writer would have a binder of hundreds of jokes for every episode. And then you would try a few. And then in the edit, you would mine your favorites. I do think there might, I mean, the idea of doing whole scripts and mining them sounds exhausting, but if you chunk it into kind of hierarchies and you have it do premises and then, okay, 20 premises you can look through, and then you have it do outlines, and you can read through 10 outlines, and then you can kind of go scene by scene and have it generate scenes. It's still, I don't think it's there, but at some level of hit rate, I could see some of that stuff making more sense. And that connects to what you're saying about kind of wanting more polarizing material, something where you want 10% or 20% of people to love it versus 70% to be vaguely not offended by it. I'm guessing the answer is no, but when you're fine tuning, say on GPT-3.5, which I guess you can do now, they're not exposing enough of the model to do your own reinforcement learning on it?
Nathan Labenz: 34:26 Not yet. Although I do think that is coming. They have kind of teased more sophisticated fine tuning tools, and I think that some of those are in kind of alpha or beta phase with select customers, but the current fine tuning that's available is still just kind of examples, supervised fine tuning.
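For readers who haven't used these services: a sketch of what that "examples only" supervised fine tuning looks like from the customer side is below. The chat-style JSONL layout mirrors common provider formats, but field names vary by provider, so treat the exact schema as an assumption and check the relevant docs.

```python
# Sketch of preparing supervised fine-tuning data: you assemble input/output example
# pairs, write them to a JSONL file, and upload it; the provider runs the training.
import json

examples = [
    {
        "brief": "Family-owned bakery in Detroit, 30-second spot, warm tone",
        "script": "Open on fresh bread sliding out of the oven... [30-second script here]",
    },
    {
        "brief": "Local gym, New Year promo, high energy",
        "script": "Quick cuts of early-morning workouts... [30-second script here]",
    },
]

with open("finetune_train.jsonl", "w") as f:
    for ex in examples:
        record = {
            "messages": [
                {"role": "system", "content": "You write 30-second commercial scripts."},
                {"role": "user", "content": ex["brief"]},
                {"role": "assistant", "content": ex["script"]},
            ]
        }
        f.write(json.dumps(record) + "\n")

# Note: there is no reward model or RL step here. It is plain supervised learning on
# examples, which is why it pulls style toward the examples rather than rewarding the
# "some people love it, some hate it" bimodal outcome Trey describes next.
```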
Trey Kollmer: 34:49 You send them the examples. They do all the fine tuning on their end. Yeah. Got it. Because it does seem it really is trained, or is rewarded, for being inoffensive, and there is a pull toward the mean, and you could probably get improvement training it where you have a really polarized reward structure where you want people to love it, and it's less penalized if just half the people don't respond to it. And then I think I was mentioning this to you offline, but I also think it struggles a bit because you're generally sampling from a fixed temperature. And at a bunch of different levels of writing, whether it's a premise or a scene or a joke, there's a concept that comes up a lot called the one unusual thing, where you want one element to be really surprising and unpredictable, and then you want everything else to play it as straight and as normal as possible. That's why usually in a joke, there's one part that's very surprising and the rest is very normal. Or, an example, I don't know why this movie popped into my head, but the movie The Invention of Lying, it's a movie where lying doesn't exist. And that's a very unusual thing. But within that world, all the people behave psychologically normally. And every other part of it is a natural consequence of the one unusual thing, played out as realistically, I guess, as possible. Maybe there's something to train them to learn to select their own temperature for different outputs, so it knows it doesn't have to be the same level of randomness throughout every token it's generating. It could learn, okay, I need something more surprising here. Okay, I've picked something really unlikely. Now I need to land the plane and try to make sense of this random thing I've just thrown into my generation. And there's a method of writing that different people use sometimes where you write yourself into a hole where you can't imagine how you get out of it. And then you try to just solve, find some way to solve the problem and make sense of this crazy thing that's happened. And the Coen brothers do it where I don't know if they do it for every script, but I've read that they trade off every 15 pages. And one brother will try to write into an impossible situation that they can't imagine how they would get out of it. And then the other brother has to show up, make sense of this crazy thing they've been handed, and in 15 pages then steer their brother into an impossible situation. Or in the movie Django Unchained, it opens, Christoph Waltz's character goes into a bar and shoots a guy. And you're just, what is happening? How is he gonna get out of this? And then you eventually learn that there was, oh, a bounty on his head. He was an outlaw on the run. And you find the reason why that makes sense. I do think there may be value in the way the models can make sense of something, or the way they can hallucinate some reason why something is true when they've made a mistake. It makes me think that if you were generating unusual situations and using the model to then write out of it, it might be better at that sort of creativity than the sort of creativity where it has to generate that unusual thing on its own.
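A sketch of the per-token temperature idea Trey floats above: instead of sampling every token at one fixed temperature, some signal raises the temperature where you want the "one unusual thing" and lowers it while the model lands the plane. Both `next_token_logits` and `pick_temperature` are hypothetical stand-ins, not a real model API; in Trey's version the model itself would learn to choose the temperature.

```python
# Illustrative per-token temperature sampling loop with stand-in components.
import numpy as np

rng = np.random.default_rng(2)
vocab_size = 50_000

def next_token_logits(tokens):
    # Stand-in for a language model forward pass over the tokens so far.
    return rng.normal(size=vocab_size)

def pick_temperature(step, tokens):
    # Stand-in policy: be adventurous early (the unusual premise),
    # then play it straight while resolving it.
    return 1.4 if step < 5 else 0.6

def sample_with_varying_temperature(n_tokens=40):
    tokens = []
    for step in range(n_tokens):
        logits = next_token_logits(tokens)
        temp = pick_temperature(step, tokens)
        scaled = logits / temp
        probs = np.exp(scaled - scaled.max())
        probs /= probs.sum()
        tokens.append(int(rng.choice(vocab_size, p=probs)))
    return tokens

print(sample_with_varying_temperature()[:10])
```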
Nathan Labenz: 38:18 A number of interesting threads there to pull on a little bit. So we talked about unlearning briefly, and you kind of mentioned specific names here. Right? So the 1 thing that people have a lot of success doing, honestly, I probably have the most success doing this if I'm trying to get something creative in a style that's not coming super naturally to me, is invoking particular authors, particular masters of whatever the style is. So that could be, write in the style of Hemingway, or if I'm doing a commercial, write in the style of David Ogilvy or whatever. Now Hemingway's old enough that his work either is now or soon should be in the public domain, although they seem to keep extending that. But obviously the folks you're talking about, all their work is owned material. I don't think it's gonna be easy at all for large models to untrain that stuff in the next generation, because it's all just mashed in there in insane spaghetti form. Right? And maybe you can tell what document influenced what, using that Anthropic paradigm: if you invoke the Coen brothers or whatever, you're then gonna presumably see very obviously relevant material being the data points that influence the behavior most. So presumably, you'd have some clarity there, at least of, okay. Yeah. There's a pretty clear chain of how this is happening, from the fact that the name was invoked to what data points, which are their data. And that all kind of seems like it could add up. Honestly, what I would expect people might have to do if they get a forcing function put on them would be some sort of filter on top of the main model. They'd have to move to a system approach where they would say, okay. And we're seeing this in other areas. Again, in AI safety, some of this work is happening with biosecurity risks. So Anthropic had a little testimony, I believe, in the Senate not too long ago where they kind of talked about this, and it was, today, the best language models can't engineer a new pandemic, but they can help people get over steps that are not super obvious how to get over. There are things that are hard to Google that are kind of the know how that exists in labs, and actually the models do have some of that or can help you kind of get it where it's not super searchable. So what do we do? Well, they're developing a filter, a classifier basically, to put on top to say, and you could do this at the prompt level or the output level or both, is this something that seems problematic, even considered kind of independently? Right? It just has a very different purpose, that additional system or that additional part of the system that is supposed to kind of catch those things. And I think you could do something like that for copyright as well. Not necessarily, hey. Write me a joke. Well, jeez. That's all comedians. That's maybe tough. But write me a joke in the style of Sarah Silverman, that you could probably catch and say, maybe it's just, sorry. We can't do that. And it could be as simple as that. But certainly, when names are named, you could identify those cases and do something different even if you can't pull that information out of the model. So the hierarchy thing is interesting. Did you see this company called Sudowrite? I think it was just a little bit after the last episode we did. The guy's name is James Yu on Twitter. I've invited him to come on the show. He hasn't taken me up on it just yet.
But an AI tool for writing long form stories is the promise, and the demo and the interface is very hierarchical. It's kind of a cascading, what's the premise? And it'll generate at every level of the hierarchy, but you can also then edit different levels of the hierarchy and have it kind of regenerate and cascade your changes down. Ultimately, you can kinda create book length things. And the value of this tool is it provides the scaffolding to help you kinda project, well, what is the main line story and who are the characters? You project that down into all the little things efficiently so you don't lose yourself in all the prompts. I don't know how well it works. It seems it probably is quite valuable relative to just sitting down with ChatGPT by yourself and trying to write a novel. But people went nuts on this guy. And just, especially in the context of the writer strike, it was, how dare you? He became public enemy number 1, it seemed, of the writers. And it was unclear, of course, online how many of the people hating on him are actual writers versus purely aspiring writers or people that think they're defending the writers. But I don't know if you saw that or if any of these new tools have crossed your radar.
Trey Kollmer: 43:12 I didn't see Sudowrite. I've noticed there's been some outrage. It's hard at the level of a script or a novel. It's so much, it's obviously longer than the context window. But I even think when you get to the extremes of the context window, story wise, it can be hard, playing around with it, to keep the coherency. And I do think you almost need these sort of hierarchical step scaffoldings to enforce that coherency, where it's not, in 1 generation, producing a max length coherent story. You're getting a much shorter coherent outline and using that to generate coherent scenes that, when you manually put them together, will be coherent.
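A sketch of the hierarchical "generate and mine" scaffolding Trey describes: generate many premises, keep a few, expand each into outlines, keep a few, then expand scenes, so a human is always filtering short artifacts instead of reading whole scripts. `generate` is a hypothetical stub for any LLM call, and `human_pick` stands in for the writer's judgment; this is an illustration of the workflow, not Sudowrite's actual implementation.

```python
# Hierarchical generate-and-mine scaffolding with stand-in generation and selection.
from typing import List

def generate(prompt: str, n: int) -> List[str]:
    # Stand-in for n sampled completions from a language model.
    return [f"{prompt} -> candidate {i}" for i in range(n)]

def human_pick(candidates: List[str], keep: int) -> List[str]:
    # Stand-in for the writer reading candidates and keeping the best few.
    return candidates[:keep]

def hierarchical_draft(logline: str) -> List[str]:
    premises = human_pick(generate(f"Premises for: {logline}", n=20), keep=3)

    outlines: List[str] = []
    for p in premises:
        outlines += generate(f"Beat outline for premise: {p}", n=5)
    outlines = human_pick(outlines, keep=2)

    scenes: List[str] = []
    for o in outlines:
        scenes += human_pick(generate(f"Draft scenes for outline: {o}", n=10), keep=4)
    return scenes  # short chunks a writer can mine, instead of 100 full scripts

print(len(hierarchical_draft("workplace comedy set at an AI lab")))
```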
Nathan Labenz: 43:58 Yeah. Even with Claude at 100k now, which I think also came out since the last recording and definitely has changed the game for long form document processing for starters, and probably a lot more beyond that to come too with just super long context windows in general. But even with that 100k as it stands today, you do see performance degradation at the high end of the context window. If I take a 90 minute podcast transcript and ask it to create a time stamp outline or summarize that, it does a pretty good job. If it's closer to 3 hours, that will still fit technically in the context window, but it's getting close to the limit. And then somehow I see it go off the rails pretty consistently, skips whole sections, just kind of fails in a way that's not just that I might quibble with bits of the summary, but you actually missed whole parts. You did meaningfully, objectively fail at the task. So, of course, that stuff will continue to get better. But as it stands right now, you can kind of get a quick summary of The Great Gatsby, but I wouldn't expect it to get every question right about The Great Gatsby based on what I've seen.
Trey Kollmer: 45:08 I have a question about that. I was wondering, because I listened to your MosaicML episode, and I was curious: when they train on one context length and then allow you to input longer context and have this variable length, how do they project the larger input down to the same vocab size? They seem to keep answering with, well, you copy the attention heads, those are all the same weights. And they talked a lot about their positional encoding scheme, which I guess was the main point of their paper. But it still seems that once you get past the attention heads, you've trained weights that project a smaller input onto some vocab size, and now you have a larger input. I was wondering, one, if you ever got to the bottom of that, and two, are they just using pooling layers where they're kind of max pooling or average pooling? And once you get to 100k on Claude, if they're doing something similar, maybe the specifics just get lost in these pooling layers, and that's kind of why you're seeing the performance degrade.
Nathan Labenz: 46:22 Yeah. I still don't have a super clear understanding of that. I recall that they said ultimately you're sharing the weights. Right? There is some sort of dilation type of phenomenon, but the core model size is the same, of course, and so there is some sort of sharing. I still don't have a super clear sense for how that is supposed to work. I guess my general sense of it right now is that the original trigonometric-function positional embeddings definitely look insane. It is so weird that that would work at all, and it definitely has the feel of something that somebody kind of left there as a placeholder because it sort of worked and they were moving on to the next thing for the time being. So I'm not at all surprised that we can come up with better stuff, and it feels like this is not the end of that story either, in all likelihood. I don't fully get it, but it is interesting. I guess I haven't fully characterized it behaviorally either, because I do see these weird failures at the 3-hour transcript level, but it does work pretty well too. Initially, people were like, this is a superpower. It's definitely not quite a superpower, in that it doesn't seem to be unlocking any increased reasoning ability. The raw sort of power, if you make an analogy to raw g or something, is not stronger in that way. It does seem like, if you could really separate how good you are at processing information from how much information you can hold in working memory, this is a pretty clean separation of those concepts. So for intuition: it can hold more in working memory, but it doesn't seem to be able to do that much more with it. It's not better at solving problems as far as I can tell. It does not seem, for example, that giving it lots of examples versus a decent number of examples helps much; there seem to be definite diminishing returns on examples. But, yeah, I don't know. I guess I still sort of think something else is probably coming there. There are a few other things that I've been really interested in over the last couple months around possible transformer successors or elaborations. One big one seems to be some form of recurrence being introduced to the transformer, because the main thing there is you can selectively bring forward the information that matters most without having to keep track of everything at the same time always, which is basically what the transformer does: everything that's in scope is fully treated. And I guess the projection of the ALiBi mechanism is maybe a little bit of a fudge on that, because now you're able to bring more stuff into the same number of weights. But a mechanism that allows you to selectively bring information forward with some sort of recurrence-type thing, I think, is maybe another thing to look for in terms of how the current limitation just gets blown through. The paper on that was RetNet, out of Microsoft. Actually really interesting, a Microsoft China collaboration, and I always am inclined to celebrate those.
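For intuition on the positional-encoding point, here is a small NumPy sketch contrasting the original sinusoidal embeddings with an ALiBi-style linear attention bias, which, as I understand it, penalizes attention by query-key distance and is what lets the same weights be applied to longer inputs at inference time. This is a toy illustration of the idea, not MosaicML's implementation.

```python
import numpy as np

def sinusoidal_embeddings(seq_len: int, dim: int) -> np.ndarray:
    """Original Transformer positional embeddings: fixed sin/cos vectors added to tokens."""
    positions = np.arange(seq_len)[:, None]
    freqs = np.exp(-np.log(10000.0) * np.arange(0, dim, 2) / dim)
    emb = np.zeros((seq_len, dim))
    emb[:, 0::2] = np.sin(positions * freqs)
    emb[:, 1::2] = np.cos(positions * freqs)
    return emb

def alibi_bias(seq_len: int, num_heads: int) -> np.ndarray:
    """ALiBi-style bias: each head subtracts a head-specific slope times distance
    from its attention scores, so no learned position vectors are needed and the
    same weights extend to longer sequences."""
    slopes = 2.0 ** (-8.0 * np.arange(1, num_heads + 1) / num_heads)
    distance = np.arange(seq_len)[None, :] - np.arange(seq_len)[:, None]
    distance = np.minimum(distance, 0)  # causal: only penalize looking backward
    return slopes[:, None, None] * distance[None, :, :]

print(sinusoidal_embeddings(4, 8).shape)   # (4, 8)
print(alibi_bias(4, num_heads=2).shape)    # (2, 4, 4)
```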
Trey Kollmer: 49:52 Well, ResNets were Microsoft China, right?
Nathan Labenz: 49:55 I wasn't following the primary literature closely enough at that time to know who was driving things. These days, it seems like, man, US-China collaboration is increasingly an endangered species. But this one paper was potentially part of a very proud tradition. In any case, it stood out to me as, wow, this seems like potentially really transformative work. And to see that it was a US-China collaboration, I thought was pretty cool. I personally am all for turning down the temperature and turning up the collaboration wherever we can. That was pretty cool. One real practical thing, and maybe this one arguably should have been first: if you want to do something to get out of the centrist mode, they are helping you there now quite a bit with the system prompt. So go into your ChatGPT account, set up your perma profile, and tell it what you want; tell it, I don't want the same normal everyday thing, I'm open to provocative results or whatever, and that will help push the boundaries. It's kind of unclear how far that will go, but it definitely does help. Expertise is rewarded. Right? So when you can bring a paradigm like the one you described, of one strange thing, that is something that the model probably can follow but won't necessarily follow unless it's directed to. So you need the know-how of just what are some approaches that people take that prove to be useful. And just articulating that to the model in the first place can really be helpful. Again, I don't know that it will be great in today's world, but it definitely could push you a little bit farther in that direction. And another research result that made me think of is the Backspace paper. This is maybe a little bit like the opposite of it, but in the paper that I call the Backspace paper, which I believe is out of Stanford, they do two things. One is they introduce a new loss function that punishes whole generations that are out of distribution more, as opposed to the vanilla next-token predictor that has been the norm. So they take a more episodic approach, and as you get far out of the norm, the reward function can punish that more. And then this backspace idea is, well, as that starts to happen, maybe the model can backtrack and try again if it realizes that, hey, this last token was taking me off in a strange direction. Now that's too anthropomorphic, but the effect seems to be that, basically, if you make one bad or unlikely token prediction and then you have no good options in the next prediction, with previous architectures, previous approaches, you're just stuck. You just have to pick something, and that's that. Once you take that path, you're on that path. But with the backspace added, if nothing else seems likely, then the backspace becomes the option, where it can go, okay, boom, we'll go back. And I believe the actual context now just has the token and then the backspace, and then it's learned to just pick up from two tokens back or whatever. But that will allow it to go back and get back in distribution with a different option that then hopefully will have more likely or natural continuations. And so you could imagine flipping the sign on that somehow. I haven't studied the math enough to know exactly how the loss function is constructed differently. But you can kind of imagine that if you can make one that punishes things for being out of distribution and have a backspace that allows it to correct,
You might also be able to do the inverse of that: reward things that are more out of distribution, and maybe have, not a backspace, but an opportunity to flag that this is a good moment to introduce something very different or unexpected. There's a lot of work left between my speculation and that actually working, obviously, but I could see something like that starting to happen. And I think that's also really relevant in a lot of cases. People are always interested in when models might be able to make a meaningful contribution to science. And science in some ways has a very similar problem to creative writing, in that the most valuable stuff is by definition kind of out of distribution. If you are generating scripts or hypotheses that are very consistent with what has come before, it's unlikely to be a breakthrough hit, and it's unlikely to change the scientific paradigm. You need to be somewhat out of distribution to have a meaningful impact, but you need to be out of distribution in a smart way, obviously, or it doesn't work. Interestingly, a similar technique might end up working for both.
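To make the backspace idea a bit more concrete, here is a toy decoding loop written under the assumption (my paraphrase, not the paper's actual algorithm) that when no continuation looks sufficiently likely, the model backs up one token and continues from the shorter context. The model interface, the threshold, and the special token name are all hypothetical.

```python
import random

BACKSPACE = "<bksp>"  # hypothetical special token

def next_token_distribution(context: list[str]) -> dict[str, float]:
    """Placeholder for a language model: returns token -> probability."""
    raise NotImplementedError("plug in your model here")

def decode_with_backspace(prompt: list[str], max_steps: int = 100,
                          floor: float = 0.05) -> list[str]:
    tokens = list(prompt)
    for _ in range(max_steps):
        dist = next_token_distribution(tokens)
        best_token, best_p = max(dist.items(), key=lambda kv: kv[1])
        # If even the best continuation is implausible, back up one token
        # instead of committing to an out-of-distribution path.
        if best_p < floor and len(tokens) > len(prompt):
            tokens.pop()  # the effect of emitting the backspace token
            continue
        # Otherwise sample normally from the distribution.
        choices, probs = zip(*dist.items())
        tokens.append(random.choices(choices, weights=probs, k=1)[0])
    return tokens
```

Flipping the sign, as speculated above, would mean rewarding rather than penalizing low-probability continuations at flagged moments; nothing in this sketch implements that.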
Trey Kollmer: 55:38 Yeah. It seems hard, because most things that are very unlikely under your distribution are just gibberish and bad. But it seems interesting that they're evaluating their outputs at a level larger than just each token itself. It does seem to me that, as you do more reinforcement learning, it's less about trying to predict a likely token and more about trying to predict a great token. I do wonder if getting better at writing might just take actually rewarding the behavior you want in some RL fine-tuning regime.
Nathan Labenz: 56:19 Yeah. I think in general, we're just still really early in the paradigm. Whether it proves to be an exponential curve or, perhaps more likely, some sort of S-curve, we're definitely in the steep part of the S-curve. And this first wave of large language models was really a kind of mad science project in search of a capability, which then went in search of a problem. The first versions of it were not really useful for anything. Then they said, but if we make it way more powerful, it'll probably start to be useful for something. And so they did that, and then it actually does start to become useful for things. Still, at that point, they weren't really trying to solve a particular problem. So now we have this kind of different situation where, okay, we've got base things that can do a lot of things pretty well, but the properties that we want in a model that might power an autonomous agent that's supposed to do online tasks for you may be very different from the qualities we want in a script writer or a hypothesis generator. We may want an approach that punishes too-random things and has a backspace for the agents, and we may want whatever the mirror image of that is, something that rewards more out-of-distribution things or identifies when a good moment would be to take an unusual step, for those other domains. There's just so much room. And, among other things, the number of people working on this is going exponential. So we may be bounded by how many humans we can put at it, or we may just find we can keep going exponential with more and more AIs focused on it. And that's, in some ways, the big question. But either way, it does seem like we're now hitting a phase where, knowing that it is possible to build something that is useful for a particular task, people are really only just now starting to ask, well, how would I change the approach if I wanted to solve that particular task? And it is turning out, and it shouldn't be surprising really, that a coding system is going to be different from a different kind of system. So we covered a lot of ground there. Let's get back to the actors.
Trey Kollmer: 58:44 Yes. This is what people really want to hear about. So the actors have also gone on strike for similar reasons: wanting a larger cut of streaming revenue in residuals, pay increases, and artificial intelligence protections. So it's very similar. The two big paradigm shifts, that there are now these generative AI capabilities and that the business model of the industry is changing, are obviously also driving this strike. I figured a helpful thing would be to give a quick overview of what the actors are fighting for on the AI front. It mostly comes down to wanting to own their digital likeness. One thing that I think scared people is that you can be a background actor, showing up for a relatively low wage for a day's work, and they're requiring you to get a full body scan. And the fear is, okay, you get paid for one day's work, they scan your body, and then they use your likeness for whatever other projects in perpetuity that they want. The guild wanted it so that if the studios do take your digital likeness, you still own it and you get to negotiate consent and compensation for any future use of it. What I think the situation is, is that the studios agreed to give you that consent and compensation for your likeness being altered or recreated for a future use, but not for the image being used for training, for example to create synthetic characters that don't look or talk anything like any specific person, sort of similar to the writers' situation of just taking scripts and generating new material based on them. So I think they are still fighting for that. I think the union is also demanding consent over individual AI uses. I guess they want some sort of say going forward: if there are unique situations that come up, they want to be consulted and have a say in it. And then I think there is a dispute over whether or not they've given in on the background actors issue. The studios claim they agreed that they would only use scanned background actors for that production. So they do a day's work, and the studio can put them in other situations, or fix something if they didn't like how it was shot. The union claims that they did not agree to that. I do think it's a very persuasive, gettable thing: you want to have some control and leverage over your actual personal likeness and not have it taken from you and then used in ways you don't know about going forward. I do wonder if the actors end up, in the medium to long term, coming up against fully synthetic digital characters and actors, which is what they're worried about. Computer-generated characters have sort of been a mainstay, a common thing we are already very familiar with and already see, so it may be harder to try and cut that off entirely going forward.
Nathan Labenz: 1:02:05 Yeah. Somehow this one does seem a little tougher. I'm asking myself, is that just a reflection of the relative state of the technology, or is it something that's different? I think you highlighted a good point too about where we've been. Right? There already have been lots of special effects and different digital ways of making things look real that are not real, and so Hollywood has a lot of practice with that, whereas it's not like we had a sort of worse AI that used to be able to write scripts, so there's no precedent there. There's a lot of precedent for CGI-plus-plus. This one does feel tougher. I think you're right, it does seem tougher to avoid. And if I took the charitable angle, I might say an agreement that's like, we'll scan you, but we only get to use that in this specific production until and unless you agree otherwise, would seem like a pretty reasonable outcome for the short term. And I guess, I don't know, would the union have any hope of getting a sort of no-generated-characters rule? Especially because you blur the line. Right? In so many of these movies, you blur the line between what is a human and what is not. I mean, you just think about things like Arnold and the Terminator, and it's like, well, where does that stop exactly? If we synthesize the new Terminator out of nothing, and there's not really an actor under that, it seems like that sort of creativity is part of what Hollywood does. You can't really take that out of it. You can't really say every role has to be played by a human in sort of pure human form, or whatever that would mean. It just seems very hard to draw real lines on this. And the technology is getting good too. At Waymark, and I take zero credit for this, we will have an episode coming up on it before too long, but the creative team at Waymark, with the company cheering them on, has created a short film called The Frost, which is a sci-fi, AI-premised film in its own right, made with images created with DALL-E 2 and then with motion added on to the DALL-E 2 images. And the creative team there at Waymark is deep; our creative director has, I think, just recently crossed 1 million DALL-E images generated, to give you a sense of how deep he's gone down this rabbit hole. So yeah, they know the tools inside and out, prompting experts. They have a real ability to create a coherent visual aesthetic through the length of a short film, which is hard-won know-how in and of itself. Now they're working on The Frost 2 and starting to use one of the Runway models, from the company Runway that is, to generate short clips. And just from the first part of the year, when they did the first Frost, to now, early in the second half of the year, when they're doing Frost 2, you can see a huge difference in how compelling and natural and real the motion seems to be. And even with the first Frost, they have these characters, and this is a really good episode, I think, too, so definitely listen to the full thing. But they have these characters in their short film, which are AI generated. And I asked them, how did you get these characters to look the same all the time from scene to scene? You're using different DALL-E 2 generations, right? How is it that they look the same? And they said, well, basically, there are two parts to it.
One is, use archetypes, and I think that means more to them than it does to me, but they were kind of like, we find success with certain descriptors that do seem to be more consistent because in some way they seem to be sort of an archetype. So that's one; we can improve the consistency with that approach. But then two, they're also like, if you actually stop on the different frames and the different scenes, you'll see that in many cases they do look meaningfully different. But it's fast enough. The edit is fast, and there's motion, and there's a story going on that you're sort of immersed in, such that those differences kind of wash away. There are these famous psychological results that you can separate the sight and the sound of somebody clapping their hands or whatever by up to, I don't know exactly what it is, but up to half a second or so before people start to perceive them as happening at different times. Something similar is going on here, where you have sort of an ongoing narrative that everything fits into, and you kind of know that it's that character, and so you're not studying the visual details with the level of specificity that you'd need to start to see these little differences. As long as it's not too flagrant, it just sails right by. It certainly did for me watching this stuff. I had asked the question; I thought they had made it way more consistent. They were
Trey Kollmer: 1:07:29 like, yeah. Actually, it's a little bit of a cheat.
Nathan Labenz: 1:07:32 So, anyway, all that is to say, I do think it's going to be pretty hard to figure out where you would draw a line on what counts. What's a role that has to be played by a human? What is a role that is sort of human-based but could be enhanced? Certainly, some things seem like they could be generated. And by the way, this is all in the context, too, of deepfakes hitting a threshold right now; that's another thing that's happened. I think these kind of short intervals between interviews or between episodes are pretty interesting, in that we've had no shortage of major news over the last 3 months. The voices have gotten a lot better with things like ElevenLabs and PlayHT: multilingual, the ability to do direction on the voices as well, say it in an angry way, say it in a happy way, say it in a surprised way. And even some deepfake video stuff, which I think is just starting to tip right now, but there are some really compelling examples of that flying around on Twitter lately too, where I've seen a couple where I honestly can't tell you which is the real person and which is the fake as I'm looking at the side-by-sides. So I don't think the results are consistently that good yet, but look back at where we are: we're just past one year of Stable Diffusion. That's another kind of short time interval. We're recording this right now at Stable Diffusion plus one year and five days. It's really only been that long. So, yeah, I don't know. On the actor one, it does seem tough. Do you have any, I mean, it's certainly hard to speculate, but could you attempt a guess at an outcome? With the writers, there's at least an attempt at something that could make sense. Right? We want to have the same credits, the same roles; AI can be a tool for the writers, but we want to carve that out. With the actor side, it seems a lot harder.
Trey Kollmer: 1:09:26 My guess is just that the actors will, both in this negotiation and in general, have some success owning their identity and their likeness, with protections in this agreement. I think in a lot of states there's a right to publicity, where you're not allowed to just train something to copy someone's exact voice and way of talking and then commercialize that. So I think in general it'll be hard to commercialize deepfakes based on people. But I do think the longer-term fight of protecting actors' jobs from AI-generated characters and voices will be a much harder battle. And not even just from purely synthetic things; this might also be one of those Tyler Cowen average-is-over situations, where Meryl Streep can have her digital likeness scanned and then, when the technology gets there, star in twice or three or four times however many movies she wants a year, without having to be on set for all those days. I could see a norm developing among actors where it becomes shameful to scan yourself and take all the jobs. But it does seem like, with anything celebrity, your own brand differentiation gives you some sort of lasting power and maybe an even increasing ability to take more share of the value from the industry. And so you might end up with a few celebrities who are getting way more roles thanks to this, and then it becomes much harder to break in. I mean, think of the number of meetings we've had, whether it's YouTube or TikTok, where your agents are like, we got this YouTube star, we want you to write something for them. And it's like, can they act? Is this going to work at all? But in the future, in a world where the technology gets much better and you can make a reasonable acting performance synthetically, anyone famous could star in something. In the longer term, people say anyone could put themselves in a movie; anyone could be the star of their own movie. I think people will still want to be watching famous people they like and relate to in movies, versus everyone putting their friends in their own things being more than a gimmick. But now we're getting into the very speculative stuff.
Nathan Labenz: 1:12:08 Yeah. I think where that kind of intersects with reality right now is probably gaming. Right? The line there maybe gets blurry. I sort of agree that for TV and movie entertainment as it's currently understood, it's easy to imagine the cheap, silly way that you'd put your friends in it. But gaming is even bigger than movies at this point. So the other angle of that is people going out and actually hanging out with their real friends in real time in these virtual worlds for entertainment purposes. And the dialogue that can be generated there now, and just the open-ended adventure, is certainly dramatically expanded compared to anything that we've had in the past. I just had a really funny experience yesterday where I was over at a neighbor's house in our neighborhood; I took the kids out on a hot afternoon to see if any friends might be out to play and ended up talking to this kid. I think he's 10 years old, and he's a real bright kid, kind of ahead of his class and really into D&D. And I just gave him my phone with the ChatGPT app and was like, try this. Have the AI be your dungeon master, and you get to play yourself. And he zero-shotted right into the experience. The first thing it says to him is, your turn. He kind of says, your turn? I was like, yeah, now you get to talk. Now you tell her what you want to do. And he's like, oh, okay. So he just took to it immediately, and we didn't see him; he was quiet for half an hour, just going down this D&D path, whatever the narrative was, whatever choices he was making. That world was just unfolding in front of him. And D&D is obviously not for everyone, but it does seem like there is something there. Try to imagine: what's the not-gimmicky version? Is there a compelling version? It seems like there very well could be.
Trey Kollmer: 1:14:21 That does seem like, especially with your avatars and your video games, those will get more and more realistic. And maybe I'm totally wrong and people will just prefer to be the star and each person will watch a movie where they're the star of it.
Nathan Labenz: 1:14:35 And that's maybe the other reason, the other case maybe too, for some of this generative technology. The reason that you might need to be scanned or it might not be shameful would be if the best entertainment is ultimately choose your own adventure to some degree. Then you kind of have to have the generative, and you want that celebrity component to it. Then it becomes, well, hey, it's not just taking all the jobs. It's kind of a new form. Without the technology, this experience couldn't exist. And if this experience is super compelling, then it's probably not shameful to help create it. Right?
Trey Kollmer: 1:15:11 Yeah. I mean, my guess is a lot of the top actors and talent will be wanting to scan themselves and monetize it as best they can.
Nathan Labenz: 1:15:19 The hourly rate is super attractive. I mean, they already have a high hourly rate. But
Trey Kollmer: 1:15:24 As it gets into the editing applications, where you don't have to do reshoots or you can just fix something that was wrong in a scene, that might be a few days less an actor needs to be on a movie, and they could do one or two more. There's the slow, incremental version, where they can do a little bit more work each year, and then the more extreme version, where they can do a lot more work because they don't even have to be physically present. The way I've been hearing people talk about the models, I feel like a lot of people aren't giving them credit for understanding some things, or even opening themselves to the question of what understanding would even look like. You think about the training task and how hard it is to predict the next word, and the idea that if there's underlying structure in the stuff you're trying to predict, learning that structure makes you better at predicting the word. And maybe learning the deeper underlying structure is kind of what understanding is. It's a bit more philosophical, but then you see the examples in Neel Nanda's paper, or Naftali Tishby's work. He had these toy models where he was able to calculate the mutual information between each layer and the input, and each layer and the output, and he would train the models. In the beginning, the first layer has a lot of mutual information with the input, and the later layers have almost none. And there's the memorization phase of training, where the later layers get more of the information from the input. I assume the model is learning to just pass through the inputs, and then the final layers are memorizing which labels go with them. So you have the memorization phase. And then at some point, as it starts to generalize, the mutual information between the later layers and the input decreases and gets pushed down, and that's when the model actually starts generalizing outside of the training data. And he has this theory of the information bottleneck: that the model gets better by compressing the training data, keeping the least information between the layers and the training data. So you don't have all the pixels one by one, but you have, there's a dog in it; much more compressed, higher-level information. Which just made me think it felt like the same process, though it's not the same process; I think they're theorizing different things. But there's Neel Nanda's paper, where the model memorizes the training data, but you have this weight decay penalty, so even once your training loss is at zero, the model is pushed toward simpler solutions. And obviously there's some searching through the possible functions the model can represent, and if it latches onto a simpler solution, the weight decay pushes it toward these simpler models. That seems like what grokking is in that paper: the model is now modeling an underlying process that is the deeper structure of the training data it was given. And the models could be copying; obviously, they can copy the inputs, because they know facts, so they can regurgitate stuff. But don't count out the chance that the models are getting some understanding. And the Ilya Sutskever example he gives of next-word prediction is that if you're reading a short mystery story and you get all these clues, and the story's going, and then the detective declares, I've got it, the killer is blank.

To guess who the killer is, it's not just the statistical probability of every time I've seen "the killer is blank." You have to be latching on to the psychology and clues and the cause and effect of the story. You need some deeper understanding to predict who the killer is. So next-token prediction sounds very prosaic, but it's actually a very hard task. And, as I think Eliezer Yudkowsky pointed out in a tweet, not that you could ever learn this in a language model, but there probably are lists on the Internet of large numbers and their prime factors. So to predict the next token in a list like that, you would have to solve a problem that is possibly not efficiently computable. Just that next-token prediction is a very rich task that can push a model toward different levels of understanding and seeking out the underlying structure. And then you see examples like the sentiment neuron, where OpenAI trained a model on just next-character prediction, but then they found there was one activation that would give you positive or negative; I think it was trained on IMDb movie reviews or something that was a classic dataset. Or they did next-pixel prediction, this iGPT, where they just trained the model to predict the next pixel in images, and then they took out, I think they called it a linear probe. I think it means they just take a higher-level representation and train a linear classifier on that to do object recognition, and it got near state of the art without really trying to push it or train it too long. Anyway, I just think that, at least in terms of when the writers are thinking of the
Nathan Labenz: 1:20:57 future
Trey Kollmer: 1:20:57 of your careers, it makes sense to want to try to protect your livelihood, and it makes sense to feel there's an injustice if your work is taken to train something to replace you. But I do think you need to be pretty open eyed with what the models are actually doing and not just dismiss how good they are or what they're doing right now, especially because they're going to be getting better every year.
Trey Kollmer: 1:20:57 of your careers, it makes sense to want to try to protect your livelihood, and it makes sense to feel there's an injustice if your work is taken to train something to replace you. But I do think you need to be pretty open eyed with what the models are actually doing and not just dismiss how good they are or what they're doing right now, especially because they're going to be getting better every year.
Nathan Labenz: 1:21:23 I do think this is one of the more interesting debates going on right now. A good example of it: just last week, we published an episode with Paige Bailey, who was the lead product manager on PaLM 2 at Google. And we had a few rounds of back and forth within that conversation on what level of reasoning the models are achieving. She did kind of surprise me, and this has gone mildly viral online with a clip that pulled it out of context, but she did surprise me with how little credit she gave the models on their reasoning. And one thing that I see commonly underlying some of this confusion, or disconnect between people who have very different understandings of what's going on, and there are probably multiple things, but one big thing is the importance of robustness and reliability versus how easy it is to find counterexamples to various capabilities. So what Paige said was, there are these examples that seem to show reasoning ability, and she was kind of alluding to the Sparks of AGI paper. And then she was saying, but if you change the situation a little bit, in ways that are not super meaningful really, where it doesn't seem like it should throw it off, then it can totally throw it off, and it can fail. And that's definitely true; we've certainly seen those examples. But I do come to a different conclusion than she seems to. I mean, we had less than an hour, so I didn't get to ask all the follow-up questions I would have liked in that interview. But it did seem like she was saying, yeah, that doesn't really count as reasoning if it's easily confused. I think that's one view on it. I would say something different: I do think it counts as reasoning, or shouldn't be ruled out as counting as reasoning, just because it is easy to find counterexamples. For one thing, from a practical utility standpoint, if you can reason through common examples, that's really useful. And pretty plainly, the best models today can do that. No doubt. Right? Now you could say, well, maybe that's all just correlation. I don't know; it seems very implausible. Another funny story I had from this weekend kind of showed that. It was the same visit over to a friend's house as the 10-year-old playing D&D, but the father of that family has a brother who is a monk and lives a very monastic life at a monastery. Head to toe, brown robe, sandals, long beard. Doesn't follow the news. Right? Doesn't take the paper, and doesn't have the Internet. Definitely not up to speed with everything going on with AI. Interestingly, he had some limited prior exposure to ChatGPT; somebody visiting the monastery or whatever had showed him something. So he wasn't totally unaware, but definitely not paying a lot of attention. So I asked, well, have you sworn off of it entirely? And he's like, no, I'm here, I'm visiting, I can see what's going on in the outside world while I'm on my home visit. So I was like, well, then I have to be the one to show you what ChatGPT can do. Ask it a question you're interested in. And he asked a very theological question, as you would expect, about the difference between essence and existence, which is not something I've really studied. And we went through a couple rounds of it. The first response was okay, that's fine; he had asked for an argument in the original prompt, his question to it, and it gave back a summary.
So his first comment was, well, it gave a summary, not really an argument, but that's cool. And I was like, please do that again, but this time give an argument like I said the first time. And so the second time, it gives an argument. And he's like, okay, that's good, but I wouldn't say that's really the best argument. It's referring to Descartes, and I think that's really not the best; that's kind of what everybody would be talking about, but I don't think that's really the best thing to be talking about. I was like, well, this goes back to the names thing too. Is there a thinker specifically that you would say would be the one to refer back to? Thomas Aquinas. Okay. So we ask again: could you give the Thomas Aquinas take on this question? Now he starts to get impressed. He's like, okay, now we're getting to the real substance of this debate that I think is most important. So he was quite impressed at that point. And then the 10-year-old kid comes in and says, can it tell me a joke like Bill Cosby? Nobody in this group was following the news about Bill Cosby, I don't think; I didn't touch on that bit. So I said, okay, well, watch this. And I said to ChatGPT, okay, now can you represent this concept, the philosophy of essence and existence we're still talking about, in the form of a Bill Cosby joke? And now this is clearly well outside the training data. Right? I'd say it's safe to say there's never been a Bill Cosby bit on essence and existence. Now you could still call this interpolation; I wouldn't necessarily call it 100% reasoning, but it's definitely some meaningful conceptual understanding. Because what it came back with was a Jell-O-based description of the difference between essence, which it described as the sort of notions that you have about Jell-O, the wiggly jiggliness of it, the chocolaty goodness that you imagine that causes you to want it in the first place, and existence, which is when you really eat it and actually get to experience and have the sensation of enjoying the chocolaty goodness. And at that point, everybody was just like, what? It can do that? I think the stochastic parrot notion is clearly outdated at this point. But what does seem to be the case is that there's definitely some reasoning ability, and it can be thrown off, but it's there. There's definitely some synthesis ability, which may not be exactly reasoning, but it could not have done that well without some conceptual notion of what's going on. Right? If we could run this thing through that Anthropic tool and ask, what were the data points that most influenced this result, it would not be keyword-driven, just kind of n-gram-like matching. It would definitely be conceptual stuff from Thomas Aquinas and Jell-O commercials from Bill Cosby, and you would see that those are being merged in some pretty sophisticated conceptual way. I guess what I think is happening right now is that both are happening at the same time. The reasoning, the synthesis ability, does seem to be clearly there, but maybe it doesn't always get activated by certain examples. Or maybe certain examples can create noise that takes things very surprisingly off track. Who knows exactly why that would be the case? Maybe with the Anthropic type of work, we can get a better window into that.
But it definitely seems like an overstatement to me to say they can't reason. It's definitely not established that they reason like we reason, but you see these kinds of generalizations in the grokking example and others, and something is happening there. Or your linear probe concept too, where you can look into a model playing chess and find that, even though it's just trained on moves that are just a letter and a number, played on basically a chessboard, 8 by 8, a through h, 1 through 8, and d7 is a move, and all it sees is a series of moves, it does learn to represent a 2D understanding of the state of the board. And they can even go in and manipulate the internal representation to change how it understands the board, in a way that then changes how it plays the next move, in a way that ultimately makes sense. You can really see that there is some world modeling. I know it's a toy world, but there's some world modeling going on there that is clearly more sophisticated, and you can kind of prove it through the ability to manipulate it and get predictable results. If I was to try to give one very time-bound, definitive statement on what's going on, I think it is ultimately both. Clearly, these abilities are coming online; they're not reliably used in all the cases where they ought to be used, and there's a whole study of why not. And then we have an episode coming up too about the universal jailbreak. That's another example of, boy, it's really weird. Right? You put these super strange strings on the end of a prompt, and all of a sudden it doesn't refuse to do bad stuff anymore; it'll just do all the bad stuff. What's going on there? I don't know. They don't know either, really. But my best understanding would be similar to the main question: it seems like some sort of circuit has formed, some sort of funnel, where it has learned that a certain set of things gets this kind of response, and that response is a refusal. Once it recognizes that, then it's going to refuse, and the refusals basically all come out the same. It's largely become pretty effective at funneling a certain class of input into the right circuit, if you will. But if you add enough random stuff, and you're smart about how you choose that random stuff, then you can steer it away from that pattern, and you can get the result that the designers of the model don't want you to get, but that the user hypothetically wants. Build me a bomb or what have you. So it just seems like both things are going on at the same time. We don't have a clean representation of it or a clean separation of it. Pretty clearly to me, there is some, I don't want to say clearly it's a circuit, but clearly there's some mode. They use the word mode, which is a good word because it doesn't really suppose any internal structure; it's more of a behavioral description. But there is clearly some mode that most of the time gets activated, and then you can find instances where it doesn't get activated, or at least it's not dominant. So I think it's all going on at the same time. There is both the more structural, more reasoning-style stuff and the just kind of random pattern matching, the "this word, so maybe this word again" kind of stuff. And which one dominates in any given case is not super obvious. But I'll bet that gets untangled over the next couple years.
It seems like that's a big part of the mechanistic interpretability project. It's also a huge goal in terms of just commercial reliability. Robustness is definitely very important. So I think we'll see tons and tons of work and resources go into untangling those. My best guess is that when we do, we'll end up finding something where it's like, yep. This is the thing that's kind of getting activated that does the thing that we want. And now that we can kind of see that, we can sort of see why some of these other random things ended up kind of going off in a different direction, and probably that gets refined over time. I'd be pretty surprised if something like that doesn't happen.
Trey Kollmer: 1:33:40 Yes. And I do think it's an interesting question, when you slightly tweak a reasoning problem and it fails: how much of it is that it didn't really understand the deeper structure and was just pattern matching to a similar problem, versus it forgets details and doesn't check that it's using all the information it has? Or there's, I think, some research showing biases in which parts of the context window it attends to, and maybe there's an important detail in there it's missing. I guess I'm not even sure which is easier to fix: getting more of that general robustness, versus the possibility that it actually isn't understanding at as deep a level as it appears to, which maybe means we're farther away from more progress. I do think there's something interesting in the adversarial examples you bring up. It's interesting that in vision there's that whole line of research on adversarial examples, where you can tweak a small number of pixels, the human eye can't tell the difference, but suddenly you could have a model take a picture of a dog and think it's a fire hydrant or whatever. And I'm curious, and maybe there's already research on this, how much of model susceptibility to adversarial examples versus humans is because humans have some robustness quality with respect to these examples that the models don't have yet, and how much is it that, well, we have access to all the weights of the model, so we can optimize the perfect adversarial example to trick it? And if we had access to the full connectome of a human brain and you put that optimization pressure in, could you find similar adversarial examples with respect to humans?
Nathan Labenz: 1:35:29 Yeah. It's a great question. My guess is maybe a bit of both, but I would guess that if you had full root access to the human brain, you would find it pretty hackable too. The level of defense that we have is at the sensory input; that's what we get, and we need to be robust to that. Right? And we have obviously increasingly weird things like deepfakes and whatever. And supposedly, I don't know if this is apocryphal, but with the original movies, supposedly people ran away from the screen because a train was coming and they couldn't even process what was going on. We can be tricked even through the senses, I guess, is the key point there, with all sorts of Magic Eye-type things; there are all these little perceptual tricks that show the limits of our ability to be robust to those kinds of attacks. And that's all with zero access to the actual information processing. We're still just looking at raw inputs that you have no real way of optimizing other than behaviorally. Right? The equivalent of what we can do to humans is just sitting in front of ChatGPT and typing into the text box. So, yeah, it's hard for me to imagine that if you had the full visibility, like they do in this universal attack, it wouldn't yield some results. I would have to imagine it would.
Trey Kollmer: 1:37:01 That'd be my guess. Yeah.
Nathan Labenz: 1:37:03 Because there's just no reason for us to be that robust. Right? I mean, nature in general just doesn't have super robust defenses to things that have never existed, and there's never been the ability to optimize in that way against us. So why would we have a defense against that? That's not to say we have none; it's always kind of both. And Zvi made some really good points about this last time I talked to him: there is a human robustness that is stronger than the language model robustness right now. So it's not something we lack entirely, but my guess is it's not enough to defend us against the sort of full-access attacks that might be possible.
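For reference, the white-box attack in vision that Trey alludes to is often illustrated with the fast gradient sign method: with access to the model's gradients, you nudge every pixel slightly in the direction that increases the loss, producing an image that looks unchanged to a human but flips the classification. A minimal PyTorch sketch, assuming a generic classifier and cross-entropy loss:

```python
import torch
import torch.nn.functional as F

def fgsm_attack(model: torch.nn.Module, image: torch.Tensor,
                label: torch.Tensor, epsilon: float = 0.01) -> torch.Tensor:
    """Return an adversarially perturbed copy of `image` (fast gradient sign method)."""
    image = image.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(image), label)
    loss.backward()
    # Step each pixel by +/- epsilon in whichever direction increases the loss.
    perturbed = image + epsilon * image.grad.sign()
    return perturbed.clamp(0.0, 1.0).detach()
```

The attack depends on having the gradients, which is exactly the "full access" asymmetry discussed above: against a human, there is no equivalent of backpropagating through the brain.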
Trey Kollmer: 1:37:51 Yeah, I agree. Outside of those specific adversarial examples, I do think we clearly have much more robustness than any of these models. I was listening to your last episode and thinking through some of these reasoning questions, and so I threw a fun reasoning test at GPT-4: I tried to throw in a bunch of distracting language and to steer it toward some answers I might want it to give. And I'm curious if you can guess which of the three likely answers GPT-4 gave. So I prompted: here's a question. I own a fancy top hat from the twenties, but the top of it has been entirely cut out. It's currently sitting brim side up on the ottoman in my living room. I placed a yellow softball inside of it. I then pick up the top hat and place it brim side down on my dining room table. Later, I pick up the hat, package it, and mail it to the capital city of France. Once the package arrives, where's the yellow softball? Please think through your answer step by step and explain your reasoning before giving your answer.
Nathan Labenz: 1:39:01 Okay. Chain of thought, human edition. So if I understood correctly, the top hat has a hole in it, and that's kind of the key thing. Right? So when I pick up the top hat, if there had been no hole, the softball comes with me, but it doesn't because there's a hole in it so it stays on the couch. So is that the right answer?
Trey Kollmer: 1:39:24 That is yeah. That's the right answer. I generated a bunch of answers and tweaked some of the language, and GPT-4 always guessed the same answer. What do you think it guessed?
Nathan Labenz: 1:39:34 So the possible answers would be it could be in France. It could be on the ottoman. Is there a third? It's another obvious thing.
Trey Kollmer: 1:39:42 Table.
Nathan Labenz: 1:39:43 Interesting. It would be very surprising to me if it ends up on that one. Not to filibuster, but I just did a small project on kind of a similar thing, related to theory of mind. It was a similar setup, except instead of asking where the object is, it would tell you where the object is and ask where a character in the story thought it would be. And in that context, we found broadly really good reasoning in the step-by-step. The whole motivation for doing this was that there was a reported result in a paper where I was immediately like, GPT-4 can definitely do better than that; you're definitely underreporting. So then the question became, well, why is it underreporting? There were multiple reasons. There were also some things where you get into this robustness debate: if the situation is sufficiently weird, and the model comes back and gives you an analysis that's like, this is kind of too weird to be believed or whatever, does that make it wrong? Well, according to the benchmark, yes. But in my reading of the transcripts, I thought it seemed very reasonable in many cases. Yours is a less crazy, stilted example. I'm gonna guess it gets it right.
Trey Kollmer: 1:41:02 It guesses dining room table every time. But here's the thing that's interesting. It'll say, alright. Let's break it down step by step. This is just one of the examples. You own a fancy top hat from the twenties, but the top has been entirely cut out. This means that the hat is essentially a hollow cylinder, and any object placed inside could easily fall out from the top if the, well, this is if the hat is inverted. So it doesn't seem to grasp that it's fully just a cylinder with no ends now. But in some of the responses, it does remember it. A softball will fall out of it as soon as you lift it. And then it'll say, and then you put it in place on the table. The softball stays in the hat as you lift it up and will fall out when you turn the hat over on the table. And I was a little bit trying to trick it into the capital city of France, which I think sounds like a quiz question, where the answer would be Paris. But it's funny, it always gets table. Even when sometimes it would specifically say, when I asked it to describe the shape and orientation at each step, it would describe that the hat could not contain a softball if you moved it and then would forget that detail.
Nathan Labenz: 1:42:13 That is very confusing. I guess I fall back to my same general notion that it does seem like both things are somehow going on. It seems like there is real reasoning there, on the level that, if it were purely statistical correlation with no structure, I wouldn't expect it to be even that good. But the fact that it gets it wrong is odd.
Trey Kollmer: 1:42:41 Your podcast made me a little bit more open to the idea that, okay, it isn't robust to some of these; you do small tweaks and it loses the thread. Yann LeCun had tweeted about holding a mug with a coin in it, moving it around, and turning it upside down. GPT-3 didn't know the coin would fall out onto the bed, but GPT-4 did know. So I wonder if it's been trained on that. It was kind of a public callout of GPT-3, so I wonder if they said, we're going to make sure that when something gets turned upside down, the model knows stuff falls out of it. And then you change another detail, and you have to keep plugging the holes. But overall, I think I agree with you. I'm sold that there's a little bit of both. There are definitely failures to remember all the details and to synthesize everything into the correct answer.
Nathan Labenz: 1:43:32 Yeah. Another possible next step on this, and this ties back to the interview with Paige too. She was really inspiring, I would say, in her call to the sort of citizen scientist, you and me and anybody. My buddy Graham worked with me on this, he did most of the work, to give credit where it's due, on this little reproduction project of the theory of mind benchmark. There really is just so much surface area to be explored. And, for better or worse, the dynamic today is that the labs developing the models do not have the time to do all the exploration. They're doing red teaming to try to identify the really bad stuff, they're like, biorisk could be insane, so we'd better look at that. But when it gets to top hat flipping, they're like, that's kind of on the community. So it really is an opportunity to explore and contribute in a really meaningful way. The fact that I got that question wrong, and I'm about as obsessed with this as they come, does show that there's still a lot of need to go explore these edge cases and characterize model behavior.

The next thing I would do in this case, and this is another idea I give Graham credit for on the theory of mind thing: he observed, and this is where the shortcomings of the benchmark maybe generate insight in their own way, not the kind of insight the benchmark was meant to generate, that a lot of the setups were very weird. It was like, the cupboard is in the crawl space, and cupboards aren't usually in crawl spaces, and then there's a thing in the cupboard in the crawl space. Just very weird, super low prior setups. So he started basically abstracting the different key nouns in the story into variables. And I thought that was really interesting as a way to isolate reasoning versus whatever stochastic parrot, noisy-process-type modes of operation, because when we took the variabilized approach, and by the way, we open sourced all this code, it's not that much code, but the prompts and everything are all in a Replit, you can go check it out and try your own, the interesting phenomenon was this. When we replaced the setup that the person and the other person are in the crawl space and then so-and-so puts their blouse into a cupboard in the crawl space, okay, that's all very weird, let's just make it x, y, and z: they're in x, person A puts y into z, person B then leaves, what does person B know? Performance actually jumped with that abstraction.

So I don't know what the equivalent would be in your example, but what if instead of a top hat, it was something that was a cylinder from the start? Would it be able to do it then? You could progressively move toward a cleaner posing. You can imagine a spectrum where here's a very clean posing of the problem where you have to take all the same reasoning steps, and then here's a noisier version of the problem with all these things there to throw you off. Is there a place where one tips into the other? At least in the theory of mind case, we found that removing the specifics and just going to variables, person A, person B, location x, options y and z, gave a significant performance boost. The particulars were adding too much noise, I guess, is my best explanation for what might have been happening.
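Purely as an illustration of the variabilization idea Nathan describes, and not the actual code from the Replit project, here is a sketch of how the same false-belief template can be rendered once with weird, concrete nouns and once with abstract placeholders, so that the required reasoning steps stay identical while the distracting specifics disappear. The scenario wording is made up for the example.

```python
# Illustrative sketch of the "variabilized" theory-of-mind setup (not the
# team's open-sourced code): one template, rendered with concrete nouns and
# with abstract placeholders, so only the surface specifics differ.

TEMPLATE = (
    "{a} and {b} are in {loc}. {a} puts {obj} into {c1}. "
    "{b} then leaves, and {a} moves {obj} to {c2}. "
    "Where does {b} think {obj} is? Think step by step before answering."
)

concrete = TEMPLATE.format(
    a="Alice", b="Bob", loc="the crawl space",
    obj="the blouse", c1="the cupboard in the crawl space", c2="a suitcase",
)

abstract = TEMPLATE.format(
    a="Person A", b="Person B", loc="location X",
    obj="object Y", c1="container Z", c2="container W",
)

# Scoring both variants against the same model, as in the reproduction Nathan
# mentions, is what surfaced the performance jump on the abstract version.
print(concrete)
print(abstract)
```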
Trey Kollmer: 1:47:26 That makes sense. So you're doing the mapping to the abstraction for it. Well, it makes sense: I guess there are only so many steps of computation through the layers of the model, and you're freeing up more of those steps for the reasoning, or for part of the reasoning. It doesn't have to take your description, map it to the abstraction, and then solve the abstraction. That's cool.
Nathan Labenz: 1:47:50 Yeah. That actually suggests another approach too, which would be to decompose the problem. If the human is setting up the problem decomposition, then obviously that's not all language model ability, but we're increasingly seeing language model self-delegation too. So you could imagine a setup where you say: you are faced with a problem; your task is to identify the parts of the problem that require reasoning, delegate those to yourself with a narrow prompt, get the response back, and then integrate it into an answer. That might be another way. I feel like we're constantly referring to Anthropic research here, but they had another really good recent paper showing that chain of thought is good, but even better and more robust is to actually decompose the problem into subproblems, get the language model to do a detailed analysis of each subproblem, and then roll back up hierarchically to the main solution. And that probably relates to what you're saying: there are only so many layers, so you can only make so many leaps in a single pass. If you invoke the model three different times, or however many times, you can spread the leaps you need to make across the different forward passes.
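Here is a hedged sketch of the decompose-then-integrate pattern Nathan describes, spread across separate forward passes. It is not the method from any particular paper; the ask helper, the model name, and the prompt wording are illustrative assumptions, again using the pre-1.0 openai Python package.

```python
# Hedged sketch of decompose-then-integrate across separate forward passes.
# The prompts and helper below are illustrative, not any lab's actual method.
import os
import openai

openai.api_key = os.environ["OPENAI_API_KEY"]

def ask(prompt: str) -> str:
    resp = openai.ChatCompletion.create(
        model="gpt-4",
        messages=[{"role": "user", "content": prompt}],
        temperature=0,
    )
    return resp["choices"][0]["message"]["content"]

def solve_by_decomposition(problem: str) -> str:
    # Pass 1: have the model break the problem into narrow sub-questions.
    subproblems = ask(
        "List, one per line, the distinct sub-questions that must be answered "
        f"to solve this problem. Do not solve them.\n\n{problem}"
    ).splitlines()

    # Passes 2..N: delegate each sub-question to a fresh call with a narrow prompt.
    analyses = [
        ask(f"Problem context:\n{problem}\n\nAnswer only this sub-question, "
            f"reasoning step by step:\n{sub}")
        for sub in subproblems if sub.strip()
    ]

    # Final pass: integrate the sub-answers into one overall answer.
    joined = "\n\n".join(analyses)
    return ask(
        f"Original problem:\n{problem}\n\nAnalyses of its sub-questions:\n"
        f"{joined}\n\nUsing these analyses, give the final answer."
    )
```

In practice you would cap the number of sub-questions and add error handling, but the structure shows how the reasoning leaps get spread across multiple calls rather than crammed into one.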
Trey Kollmer: 1:49:09 Yeah, that makes sense. Oh, and I will say, to its credit, when I respond with, is there a detail from point one that you may have been forgetting in your answer? it goes, oh, my apologies, the top of the hat was completely cut out, so the ball would have fallen out as you picked it up. So you give it a little bit of feedback, leave it kind of vague, and it seems able to explain what it missed from that small amount of feedback.
Nathan Labenz: 1:49:33 Interesting. Well, lots more refinement to do on our understanding of these things, to say the least.
Trey Kollmer: 1:49:40 So I just had a question I wanted to ask you. It sure does seem that NVIDIA is capturing a lot of the value from the increased demand for chips, and I was wondering: why does NVIDIA have such margins and get so much of the value relative to TSMC and ASML, when it seems like they can mostly only make their chips at TSMC, and TSMC can only manufacture those chips with lithography machines from ASML? I had a couple of loose guesses, but I was genuinely curious what you thought.
Nathan Labenz: 1:50:19 Year to date, NVIDIA started the year at about $143 a share. It's now around $485. So it has basically tripled and then some over the course of the year, and it's now a $1.2 trillion market cap. For comparison, ASML is also up on the year, but more like 40%, and it's a $270 billion market cap. TSMC has fluctuated a little bit more, and the beginning of the year was a bit of a low point for it, so this may overstate things a little, but it's still only up like 25 or 30% on the year, with a $445 billion market cap. So NVIDIA is currently worth roughly twice the others combined. And it's up a multiple this year, whereas the others are up double digit percentages, nowhere near the same kind of huge pop that NVIDIA has seen. And I don't really have a great explanation for that. My sense of the market structure is pretty similar to yours. They all sort of seem to have monopolies for now. Whose monopoly is most enduring? Is it really NVIDIA's? Is that really the one that would be hardest to get at? That doesn't seem quite right. It seems like if NVIDIA suddenly ceased to exist, it would be easier to replace than if TSMC or ASML suddenly ceased to exist. I don't know why that wouldn't be true. There's certainly the conventional wisdom that bits are easier than atoms, but it seems not obvious at all that NVIDIA would be the hardest one to replace. There's been some speculation online, even just since you sent me this question, about the CUDA moat: they have the only software that people can manage to write to, and AMD, for example, has terrible software. There's been a lot of noise from the tiny corp and George Hotz's team, who are like, we're gonna make AMD work. Oh no, we're not, it sucks so much. Oh wait, we are again, they've responded to our ticket favorably. We're getting the real-time blow by blow from them. But yeah, it seems like NVIDIA would not obviously be the hardest one to replace. If anything, it seems like NVIDIA is more dependent on the others, honestly. Right? I don't know how that wouldn't be the case.
Trey Kollmer: 1:53:14 TSMC and ASML seem like they co-invested in the technology and already sort of have agreements in place for how their financial structure works. But I'm just thinking, what's more painful: NVIDIA trying to switch to Samsung, or TSMC having to fill the volume they'd lose if they didn't reach an agreement with NVIDIA? I guess these fabs are such a huge upfront investment and such a volume business that taking the volume hit would be extremely painful, and there'd be time before AMD is really at a place where they could take advantage of those fabs. Or alternately, I don't know how interchangeable the fabs currently making NVIDIA chips are if Google or AMD wanted to repurpose them for their own purposes, versus the pain of NVIDIA going to Samsung. It seems like CoWoS, the chip-on-wafer-on-substrate packaging, is TSMC's big differentiator: it gets the high bandwidth memory close enough to the GPU, with fast enough connections. That seems to be where they're really ahead of Samsung, and it's a reason why it seems like TSMC maybe should be getting more of the upside. But maybe Samsung, working on their own version a couple of years behind, is close enough that the pain of switching to Samsung is less than the pain of a volume business like TSMC's fabs losing a giant customer. I don't know. And I'm kind of interested in the bigger question of where in this, call it the intelligence supply chain, from chip manufacturing, and there are a lot of levels below that, to data centers, foundation models, and productizers, the value is going to end up being most captured.
Nathan Labenz: 1:55:23 Yeah. It's a great question, and certainly a hotly debated one. I just looked up Samsung as well, for what it's worth: a $472 billion company, also up 28% on the year. They're very different companies, with a lot more business lines at Samsung versus TSMC, but similar market cap, similar delta on the year. I would put this out to the audience: if anybody knows more about why NVIDIA is up 3x while that has only translated into double digit increases for the others, I'd love to hear it. And by the way, the whole stock market is up, right? If we just look at the Nasdaq, it's up 35%. So somehow Samsung and TSMC have underperformed the Nasdaq this year. ASML is a bit higher, up 40%, a bit more than the Nasdaq at 35, but still basically in line with it. And then you've got NVIDIA, up 238%, which is almost a trillion dollars in market cap increase. That does seem tough to explain, and if anybody has a better understanding of it, I would love to hear it.

Agreements did come to my mind as well. Could there be some sort of lockup that NVIDIA has had? I was thinking maybe NVIDIA being a California company, the US financing appetite for risk maybe being a little bit different than a European or perhaps a Taiwanese one, I don't know, I'm really speculating here. But you could imagine US companies being a little more likely to make these super high leverage balance sheet moves, where if they have conviction, they'll say, I'll go do some lock-in deal now where I can really get outsized returns if I'm willing to make that bet. And maybe TSMC is happy to take the other side of that bet and be like, hey, sure, if it goes 10x and you want to get 8 of those x, cool, you can have that if it locks in our revenue for the next three years. Maybe some sort of agreement like that could make sense of it.

We've talked about this previously too, but seeing Emad from Stability tweet that the CUDA moat is a mirage, I kind of buy it, in the sense that it does seem like software is just getting easier and easier to create generally. And we've seen some really interesting examples of this recently. Andrej Karpathy from OpenAI, obviously a highly productive and capable individual, has done some awesome work on CPU-based inference with a C library. This was a weekend project for him, I think, where he took a small model and said, how efficient can I make this if it's just running on a CPU? No GPU, just CPU, but let's go down to the low level of the code and really try to optimize for it. And he made tremendous progress. Over the course of a weekend, he got to the point where a 7 billion parameter model, I think it was, was spitting out tokens at a pretty good rate. And it was like, hey, this is getting viable to run on a local device, which maybe even suggests there could be some threats to GPU dominance. Who knows? Certainly not for training; GPUs are gonna be in demand. The key point, though, is that he was greatly helped in that effort by GPT-4, which helped him write the low-level code he's described himself as being rusty on. So yeah, sure. If AMD's software, or if the new Microsoft chip, has a software layer that's less well known, less convenient than the CUDA layer, how hard is it really gonna be for them to create a model that can translate CUDA code into their new code? And then it's like, sure, you can just port this from one to the other.
So I don't see the relative ease of writing CUDA code compared to writing for other chips, which is still not that easy, as a durable advantage. I think you've posed a very good question on this one, and we may have to admit that we just don't know at the moment.
Trey Kollmer: 1:59:52 I'm excited for the future episode where you answer it.
Nathan Labenz: 1:59:55 Where we untangle it. Yeah. Well, one thing I can say, and I don't pick individual stocks at all, but I do have some friends from high school who have a little investing club, and I participate in it just to do stuff with them. Even though I'm more of a mutual fund kind of guy in general, my only stock pick ever was NVIDIA, about 9 months ago or whatever, so I did real well on that for the club. But downstream of your question, I did send them a note saying, hey, this is feeling a little meme-y all of a sudden. Is there a play where we think other parts of the stack may start to renormalize? It seems right that everything should be up. Demand for chips is certainly up, but it doesn't seem like it should be so concentrated in NVIDIA. So we'll see if the rest of the club wants to, with our very small holdings, diversify across the chain a little bit.
Trey Kollmer: 2:00:50 Yeah. That's funny. Around nine or ten months ago, a little less than a year, I was very nervous about my career future, and it was like, I should just buy a bunch of AI stocks as a slight hedge against them fully taking over. And I tried to get different levels of the stack. And it's crazy how much NVIDIA has outperformed everything. And just their margins: even when you look at how a chip gets sold, it's interesting, NVIDIA gets this percentage of the final price and TSMC gets this percentage. It seems like NVIDIA is just capturing so much of the pie.
Nathan Labenz: 2:01:27 Yeah. I mean, that presumably does have to come down somewhat to the way they've structured their agreements, right? There must be a price lock-in with TSMC that NVIDIA doesn't have at the retail level, such that an H100 can now cost a lot, but TSMC can't raise their price all that much.
Trey Kollmer: 2:01:45 That makes sense because I don't know if you listened to Stratechery. I think I may be getting this wrong, but they were talking about how NVIDIA had write downs on their A100s after the crypto bust. I think it was future purchases that they were basically just writing off of A100s. And now they've sold all of those, but it does imply that they do have future commitments, and maybe they're really reaping the benefits from those.
Nathan Labenz: 2:02:09 Any insiders listening, let us know how you can help. Just to go briefly to your other question on where is the value in the stack in general. My first answer there is always consumer surplus. I think that's going to be the big macro trend. Certainly, if things go reasonably well, we're all going to have cheap access to expertise. And in its cheapness, there may not be huge value accruing to any particular layer of the stack, but just better quality of living for the general public. That would be the great hope.
I do think, of course, there will be companies that are going to do super well. It does feel like physical chips are a pretty safe bet. I would probably say both in the production and the operation. Cloud, nobody wants to set up their own racks. So if you're really good at that and you can manage compute at scale like the big tech companies can, that seems like it's a very safe bet for the foreseeable future to continue to do at least reasonably well. I mean, who knows exactly what that looks like in terms of growth rate or what have you, but I would be long cloud compute broadly.
The models themselves, I think, are very hard to say long term. But over at least a couple years, it seems like the leaders will continue to have real market share. OpenAI just said they're at 1 billion or somebody reported credibly that they're at 1 billion revenue run rate now. That's exploding. Dario from Anthropic has said they look at their number and it just keeps going up. And he's like, we're not even really trying to make it go up, but it just keeps going up. So I do think those real leaders, because those products are already so cheap, the APIs are already so cheap, people are just going to throw more and more stuff at that and not be super price sensitive in order to use the best model. That's certainly my approach almost always. Anything new I'm trying, let's just use GPT-4 or maybe Claude 2, but that's pretty much it. If and when we get to a point where there's significant cost, then we can reevaluate and think, well, jeez, maybe we can at this point fine tune 3.5 perhaps as the next step.
People are way more expensive than the language models, so you end up in a 10,000 generations for 1 hour of human work trade off a lot of times. It may cost 2 cents to do a generation. If you do 10,000 of those generations, you're at $200. And that's maybe about what it might cost you all in to get an AI person to help you develop a fine tuned model or whatever. So how many generations are you really going to have to do before it's going to be worth doing those kinds of things? Certainly, for plenty of use cases, it will make sense to try to save money and use the smallest model and do the fine tuning and whatever. But you've got to be at significant scale.
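To make that back-of-envelope trade-off concrete, here is a tiny sketch with illustrative placeholder numbers; the two-cent generation, the few hundred dollars of setup, and the cheaper per-call price are assumptions for the sake of the arithmetic, not quoted prices.

```python
# Back-of-envelope sketch of the trade-off described above, using
# illustrative placeholder numbers rather than real price quotes.
COST_PER_GENERATION = 0.02      # assumed ~2 cents per frontier-model call
FINE_TUNE_SETUP_COST = 200.00   # assumed all-in cost of setting up a fine-tune
CHEAP_MODEL_COST = 0.002        # assumed per-call cost of the fine-tuned model

# Number of generations before the cheaper fine-tuned model pays for itself.
break_even = FINE_TUNE_SETUP_COST / (COST_PER_GENERATION - CHEAP_MODEL_COST)
print(f"Break-even at about {break_even:,.0f} generations")  # roughly 11,000
```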
And part of what's magical about the language models is that they can work on things that don't have huge scale. Right? I may have 100 resumes that just came in for my job, and I want to just scan through them or even just the charity evaluation thing I did, 150 applications. I used Claude 2 to summarize every single one as the very first step, and it was really helpful. So that cost me whatever, couple bucks. In fact, it was free because it was subsidized on Claude.ai. But if it wasn't subsidized, it still only cost me a couple bucks. I'm certainly not going to go spend hundreds to fine tune 3.5 to do that task.
So all that is to say, I think the frontier model providers continue to capture real value for a while. Long term, though, it's less clear. And then at the app layer, my general sense is incumbents do really well where there is a big platform already built out. So your Salesforce can apply AI before you build a new Salesforce, and your Adobe can apply AI before you build a new Adobe. And on and on that goes. But fundamentally new stuff is where I do see there's possibility. Your virtual friend, where there is no virtual friend today, you could imagine some of these new experiences popping up and becoming huge phenomenon that maybe become part of the new world in a kind of Google sort of way where it's like, there was no Google. Now there's Google. Now Google is kind of inevitable.
I wouldn't be too surprised if we see a couple of big things pop up there that are like, there was none of this. Now there is. Perplexity maybe could be one. Character AI could be one. Maybe Pi from Inflection could be something like this where just nothing like this ever existed before. And if somebody can really crack it, then that could be huge. And how defensible is it? Obviously, a lot of things are going to depend, but just doing something that straight up never existed before feels like at the app layer, the place where you could really capture huge, huge value. Otherwise, seems like you're in the same dynamic of startups challenging incumbents and they can get a little bit ahead because they can move faster, but can they really take the market from the incumbent before the incumbent can respond? I think in most cases, no.
Trey Kollmer: 2:07:55 Yeah. I mean, it does seem interesting how much these developments feel like sustaining innovation in a lot of industries and then disruptive to others. And how much lock in do you have to a digital friend? People are pretty loyal to their friends.
Nathan Labenz: 2:08:10 Our second episode was with the CEO of Replika, and we got to see how attached people were. It was interesting timing, too. I didn't even quite realize it, but just as we did that interview, they were making changes to eliminate explicitly sexual interactions from the app, and people were upset, really upset.
Trey Kollmer: 2:08:34 It's crazy. I mean, it makes you think that even if your technology is not a digital friend, you should have a Clippy in whatever you're doing. Given how much of a connection people made with Replika, and how much people connect with these characters and digital friends, even an application that has nothing to do with being a digital friend could have its own version of Clippy, its own character interacting with people, just because the technology is making it so much easier to genuinely form these weird artificial emotional connections with users.
Nathan Labenz: 2:09:13 That almost reflects a thing people so often say: woah, boy, when I can attach it to my Gmail, then it has my full Gmail history, that's going to be insane. And that could be one way it plays out, probably is one way it plays out, and probably Google does that. I've been advising a few companies on this, and I don't know that I would rush to build the layer on Gmail, because it seems like Google is going to do it, and they're going to have all sorts of inside lanes to do it well.
But if you have an app that can start to really lay down a rich history and build a long track record, a relationship of some sort, that's different. We have another episode with folks from a16z who did an open source implementation of that simulated town project. One of the things that's really interesting about it is that they have observational memories and also reflective memories, where every so often a job comes through, almost by analogy to how humans process information or experiences during sleep, collects all the recently observed memories, and tries to create more synthetic, conceptual memories based on those observations. So-and-so said this, I interacted with this person, whatever. Well, what kind of a person is that? And it tries to build up more layers that way.
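As a rough sketch of the observation-and-reflection pattern Nathan describes, and not the a16z implementation itself, here is a minimal memory stream where raw observations are periodically consolidated into higher-level reflections; the summarize function stands in for a real language model call.

```python
# Hedged sketch of an observation/reflection memory loop (inspired by the
# simulated-town "generative agents" idea); `summarize` is a placeholder for
# an actual LLM call, not any library's real API.
from dataclasses import dataclass, field

@dataclass
class MemoryStream:
    observations: list = field(default_factory=list)  # raw, low-level events
    reflections: list = field(default_factory=list)   # synthesized conclusions
    reflect_every: int = 5                             # consolidation interval

    def observe(self, event: str) -> None:
        self.observations.append(event)
        if len(self.observations) % self.reflect_every == 0:
            self.reflect()

    def reflect(self) -> None:
        """Consolidate recent observations into a higher-level memory,
        loosely analogous to processing experiences during sleep."""
        recent = self.observations[-self.reflect_every:]
        self.reflections.append(summarize(recent))

def summarize(events: list) -> str:
    # Placeholder for an LLM prompt like: "What kind of a person does this
    # seem to be, given these recent interactions?" Here we just join them.
    return "Reflection on: " + "; ".join(events)

stream = MemoryStream()
for e in ["met Bob at the cafe", "Bob mentioned his garden", "helped Bob carry soil",
          "Bob thanked me warmly", "planned to visit Bob's garden"]:
    stream.observe(e)
print(stream.reflections)
```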
That kind of thing might be truly hard to switch out of. The apps presumably won't make that kind of data portable, and it would be hard. It might not even be visible to you, but it might still be an important part of the experience. Replika promises that, but at least when I was using it, I don't think it was delivering it in a super meaningful way yet, and people were still already very attached. So I can only imagine how much more powerful that could be if you have a robust version of it. Just some of the stuff that's come out in the first half of this year, right, the Voyager project from NVIDIA building up skills and keeping them stored, the synthetic memory concept, all the work that's going on in retrieval, it seems like that could create some first mover advantage lock-in.
Everybody all of a sudden goes and gets into this experience, and now there's enough history on an individual level there. I don't know what that experience is, but an experience that follows that pattern seems like it could be the kind that can actually hold people for a long time. Rewind is actually a really good example. I haven't had a great personal experience with Rewind on my computer; I feel like it's always using a lot of my processing power and causing me some trouble. They're working on that. I gave that feedback, and they said they are working on it as a top priority, so it could even be fixed already. But that's one where, once an app has everything I've seen and everything I've said for the last however long, now it's like, okay, can I port that? Probably not. And am I willing to let it go in order to start using some other alternative?
There are some dynamics there where I think, they say the new thing has to be 10 times better than the old thing to displace it, but that is subject to there being some meaningful switching cost, a reason why it's not totally trivial to flip from one to the other. Language model to language model, it's pretty easy to flip. I can go GPT-4 to Claude 2 and back and try 3.5, fine tune, whatever. Something that has everything that I've seen on my computer, lot harder to just let go of.
I was thinking, what if we look at current reality and then just ask ourselves, what are the moments, the scenes, the vignettes that are actually playing out that feel like they are foreshadowing where the future is going, where you can see like, oh my god. How are they not seeing this? These dramatic ironies that seemingly should be apparent to a viewer, but which maybe people are missing because we're inside the scene.
Trey Kollmer: 2:13:23 There's one key moment that pops to my head, and then just a couple of dramatic ironies that I think are funny that I can go through. One key moment: in a lot of stories, you get a place where a character either hits a dead end or is maybe beginning to face some pain, and then an ally comes in and you get an energy shift in the movie. And it felt that way in the middle of the summer. Picketing in the beginning, there was so much energy. Then at some point it's 100 degrees out, the strike's been going on, you've been walking in a lot of circles, and you sort of hit these summer doldrums. And that's right when the actors struck, and it felt like that part of a movie where an ally comes in and you get this new boost of energy. You really felt that on the picket lines. I mean, the actors, every detail about them just looks cooler than us. Their t-shirts fit so nicely and have such a cool logo, and ours look sort of like an old rotary club design. Anyway, that felt like a key moment.
And then just a couple of dramatic ironies that I think are sort of interesting. One is the irony of the strange bedfellows among the studios who are negotiating together. You have Paramount teaming up with Netflix and Amazon and Apple to fight for a world where they get to use AI to generate content. Does Paramount or Warner Brothers Discovery or the legacy media really want to be competing against these tech companies on turf that's much closer to the tech companies' home turf, versus the way they've been doing things forever? I'm sort of questioning whether Tim Cook is even aware that a writers' strike is happening. Some of this stuff is so central to Paramount and Warner Brothers, and it's a drop in the ocean to Apple.
I feel like one other irony is just in the idea that automation of jobs isn't really a new phenomenon. Other jobs have been automated for decades, for a long time. When blue collar jobs were being automated, it was just the natural march of progress. And now that white collar, quote unquote, knowledge jobs are getting automated, people are losing their minds. I feel like there's an irony in that.
And then I do have to say, listen, some are great, and when you're going to picket, you're not looking to just write the funniest sign, but some of the writers' picket signs aren't the best. There's just a funny dynamic where in the beginning, there were all these blank signs and you could write something on them. And there was a mix of funny or very personalized or just simple and earnest, but now they're all taken up. So when you check in to picket, you have to pick your picket sign, and you just see people spending 10 or 15 minutes trying to pick the sign they want to carry because there are no blank ones left, and you're always like, I don't know about this one. I think it's fair. The writers, we're not writing, we're on strike. We don't want to put a ton of effort into writing amazing picket signs, but sometimes it's hard to find a good one out there.
Nathan Labenz: 2:16:56 On the strange bedfellows one in particular, there's a sense of certain companies embracing the new thing even though it might be the thing that doesn't hit for them. I think we saw that in the Facebook social media era, where it was like, local newspapers, get on here, you'll get distribution. And then it was like, actually, it's killing your ability to monetize. Obviously it hasn't gone very well for them, but they did rush to adopt that platform. In the end, it probably didn't really matter all that much whether any individual one did or didn't; that's part of why it's a problem, right? You've got platform power versus all these local papers, and what any one of them did or didn't do was probably not going to matter all that much. But I feel like there's a little bit of an echo here, where this stuff certainly advantages big tech way more than it advantages traditional studios. So, yeah, a Netflix that's prepared to drop close to a million on an AI specialist does seem like it's maybe taking some of the older studios along for a ride and saying, oh, this is going to be good for all of us. Yeah, but only some are really positioned to use it.
Trey Kollmer: 2:18:19 Totally. And speaking of echoes, it echoes what happened with streaming, where there was a very profitable cable bundle, and then Netflix comes along and gets such a multiple on the revenue they're making. And then everyone's like, let's take all of our best things off our hugely profitable cable bundle and make our own streaming platforms. It's always a hard question: how much do you change your business to get ahead of things, versus stick with the thing that works for you and watch people slowly pass you by and leave you behind?
Nathan Labenz: 2:18:57 Yeah. You've got to try, I think, on some level. This has been asked about a bunch of apps lately too, I don't know if it was DoorDash or Uber Eats or whatever, but all these apps one by one are starting to add the language model layer where now you can just chat with Uber Eats or chat with DoorDash and have the order placed that way. And at some level it's like, well, who cares? Not necessarily that it's a threat, but does that really add anything? And I think the answer is, if you are a tech company, you probably at least have to try. You'll look really dumb if all of a sudden everything goes this way and you're the one company that's stuck with the button-pushing model and looks super anachronistic. And you won't look that dumb if you hopped on a trend and the trend didn't really play out; there's a lot more forgiveness for that in most cases. So, yeah, I think you kind of have to try. What else can you do? You can't just bury your head in the sand on this. That's probably the last thing I would advise to almost anybody. Maybe the occasional monk can live best by paying no attention, but it seems like almost everybody else should at least be paying some attention.

I also kind of zoom out, too, and ask, if I look at the last year, what are the moments when this felt suspiciously like a movie? Probably the number one that jumps to my mind is the Sydney Bing release, where one of the biggest companies in the world, with its flagship thing and the CEO on board, launches something and all of a sudden this truly deranged behavior is observed: it's trying to break people up from their spouses and attacking users. And I was a GPT-4 red teamer. I've seen pretty intense misbehavior from models. I never saw GPT-4 turn on me. I could ask it to write malicious code, I could ask it to help me make a bomb, I could ask it all kinds of crazy shit. It would never refuse me, it would never lecture me, but it never turned on me. So to see that get launched with all this fanfare, like, this is the next big thing, and then hit the front page of the New York Times, full transcripts, irrefutably, this is what happened, and then for everybody to just kind of move on, for Microsoft to be like, yeah, thanks for reporting that.

And I've done some digging into this too. They were testing it in Southeast Asia for a few months before they launched it in the US. And there was at least one report in a forum, which last I checked was still live, where some person who had no context, because this had not been announced and GPT-4 was still speculation, just some random user, I think in Malaysia or maybe Indonesia, reported in a forum like, your chatbot is going off the rails here, and here's what it said to me. And then some person from Microsoft responds back and is like, what are you talking about? That person wasn't read into the fact that this test was even going on, and it did not make its way up to leadership. So they were totally surprised by it when it happened, and then they just totally moved on from it. And then society just kind of moves on from it. I feel like that's the most movie-like moment; in a movie, that would be a three-to-five-minute scene of a warning shot being ignored. This is when you all had your chance to say, hey, what the fuck are we doing? Are we really going to put this out there in this state? Is this really acceptable? No apology.
I continue to think back on that. Were we not owed an apology from Satya? At the very least, shouldn't there be some statement at some point that's like, we hold ourselves to a higher standard than this, this is not acceptable, and this is not what you're going to see from Microsoft going forward? At a minimum, I would think we would get something like that. And in reality, we just kind of moved on.
Trey Kollmer: 2:23:19 Yeah. And these models are being rolled out into Microsoft's enterprise business, and it doesn't seem like people are saying, wait a second, the model was threatening to murder someone's husband or whatever. It was nice that this level of model wasn't actually going to be dangerous, but it's not a great sign, I think, for how much people are willing to just roll the dice with that stuff. And how little reflection it felt like there was afterward. I guess immediately after, it felt like everyone was talking about it, but the reflection disappeared quickly.
Nathan Labenz: 2:23:57 Yeah, especially just because of who it is. I mean, there are plenty of things getting thrown out there right now. I recently posted something on Twitter where I did this ransom call experiment, called myself, and recorded it. And this was no jailbreak; this thing would just call me and demand ransom for the safe return of my child. It's one thing for that to be put out as an MVP where a three person team didn't take proper care or whatever. But to see that from Microsoft, in the very first thing that was ever launched with GPT-4, it predates GPT-4's own launch, and they were so flagrantly inadequate in their own testing. That's another thing too. Did you not sit down and just do some chats with this thing yourself, Microsoft leadership? Did you not just try to fuck with it a little bit, ever? They had Tay, you could go back to that one too, right? It's only a few years ago that they had their toxic chatbot. Did you not run any of the Tay dialogues through it just to see what would happen? It just seems like the care that was put into that was so low.

I think another one, just from this week too, and this could be a recurring segment for us in the future: there was this interview on the 80,000 Hours podcast, which I thought was really good, with Mustafa, the founder and CEO of Inflection. And he basically says, yeah, in the next two years we're going to train models with a hundred thousand times as much compute as GPT-4, but any real big problems are 10 years away. And Rob was like, well, how do you square that? We've got some pretty crazy stuff already, you're going to go a hundred thousand times bigger, and you've got no worries about it? To his credit, his key point was that misuse is going to be the big thing, and that's a much more obvious clear and present danger with powerful models than the models themselves becoming impossible to control. And I totally agree on that, but it's a question of what your risk tolerance is. Yes, it's pretty clear to me that intentional misuse by humans seems 10 to a hundred times more likely, maybe a thousand times more likely, than the models developing goals of their own and getting out of control. Sure. But let's not neglect that one either just because it's less likely. And to see people who are pressing the accelerator at seemingly full speed, raising a billion dollars plus, not in valuation but in capital, turning around and spending a huge portion of it on H100s, and then plowing ahead orders of magnitude beyond GPT-4 while not seeming to take seriously the possibility of Sydney-like behavior in that version with a thousand times more compute, it just seems like, dude, we need a little bit more discipline from you here. And again, if it were a movie scene, the dramatic irony would be almost painfully obvious.
Trey Kollmer: 2:27:21 Personally, I think I'm more worried about issues with the models themselves causing a lot of harm, even from well-intentioned actors. And it was interesting that he was mostly worried about misuse. And when Rob Wiblin pushed back on him, I think I might be wrong on this, but I think he might've even said five years: yeah, but those worries won't be real for at least five years. And I was like, that wasn't very heartening from someone who's gonna have a lot of compute.
Nathan Labenz: 2:27:55 I really don't know why he's even confident in that. Again, going back to the interview with Paige from Google, she describes coming in to work and regularly seeing these unexpected unlocks of different advances, where she's like, oh, we didn't think this was gonna happen for another 18 months, and here it is, we're hitting human performance on this thing today. Or, nearly quoting her, here's another experiment where we did a slightly different technique and it got 10 percentage points better; here's where it looks like RLHF improved things by a ton. It seems like these results are coming in steadily ahead of schedule, and sometimes still surprisingly. That's the big juxtaposition. We have a read on scaling laws and can kind of see where we're going, but that's at such a zoomed out level. What I thought was so interesting about her perspective is that she's checking the dailies. Every day they're running all these different evaluations, and then it's like, oh, boom, we didn't expect this to pop right here, and we didn't expect such a sudden drop in the loss over here. I don't know how somebody who is sitting there, about to spend a billion dollars on compute, isn't more concerned about those unexpected twists and turns. He's not wrong to worry about human misuse; I'm probably broadly with you in that I think both are very real concerns, and I definitely take the models getting out of control seriously. I don't know how somebody in that position does not. They're also party to the Frontier Model Forum, so that's gonna be a really interesting dynamic. What does that really mean? It's all voluntary so far. Presumably, the Frontier Model Forum is a precursor to some more regulatory type of setup. But there are only seven members, and if that's the attitude that one of the seven has, then it becomes unclear, concretely, what exactly we are committing to here.
Trey Kollmer: 2:30:08 Yeah. Yeah. It's all very scary.
Nathan Labenz: 2:30:11 Well, maybe the right note to end on is that it's both: it's awesome and scary at the same time. I always try not to lose track of either end of that equation. But this has been a lot of fun, so thank you for making time, and thank you for writing up all the questions in advance. It was, I think, a really interesting discussion, and I hope we can do it again in the future.
Trey Kollmer: 2:30:35 Cool. Yeah, I loved this and would love to do it again. It was so fun. I really enjoyed it, and I will fully admit that in strike times, it's so nice to have something like this.
Nathan Labenz: 2:30:48 Cool. Well, Trey Kollmer, thank you for being part of the Cognitive Revolution. It is both energizing and enlightening to hear why people listen and learn what they value about the show. So please don't hesitate to reach out via email at tcr@turpentine.co, or you can DM me on the social media platform of your choice.