Where Are the Moats in AI? With Nathan Labenz and Erik Torenberg

Exploring AI moats and competitive advantages in the rapidly evolving landscape of artificial intelligence and open-source communities.

Video Description

Nathan and Erik analyze the moats of the most powerful companies in AI. The paradigm-shifting technology has led to a flourishing open-source community with real market share. Yet, the big players have key competitive advantages that can be examined from many different angles.

LINKS:
- Nathan’s Twitter thread on AI moats that sparked this discussion: https://twitter.com/labenz/status/1654853321876815872

- Read about the 9 moats at length in our newsletter: https://cognitiverevolution.substack.com/p/the-leaked-google-memo-and-a

- Leaderboard for AI rankings: https://lmsys.org/

PODCAST RECOMMENDATION:
The AI Breakdown: @TheAIBreakdown
As anyone in AI knows, the pace of progress of new releases is relentless. The AI Breakdown is a daily podcast (10-20min long) that helps us ensure we don't miss anything important by curating news and analysis.

TIMESTAMPS:
(00:00) Episode Preview
(01:36) Where are the moats in AI?
(07:31) Open source vs. closed
(15:24) Recommendation: The AI Breakdown Podcast
(16:39) Sponsor: Omneky
(20:46) AI Safety
(33:00) Which players are going to win? 9 Moats.

TWITTER:
@CogRev_Podcast
@labenz (Nathan)
@eriktorenberg (Erik)

Thank you Omneky for sponsoring The Cognitive Revolution. Omneky is an omnichannel creative generation platform that lets you launch hundreds of thousands of ad iterations that actually work, customized across all platforms, with a click of a button. Omneky combines generative AI and real-time advertising data. Mention "Cog Rev" for 10% off.

Music Credit: MusicLM

More show notes and Nathan's analysis can be found in our Substack: https://cognitiverevolution.substack.com


Full Transcript

Nathan Labenz: 0:00 So moat 1, GPT-3.5 Turbo is the best value in the LLM game today. Moat number 2 is branding and just trust. Moat 3 is the feedback loop that they have. Nobody has the volume of LLM usage that OpenAI has. 4 is pricing power. Dollars per million tokens. You gotta be using a lot of tokens. 5, we talked about, again, privileged access to cloud compute. 6, GPT-4 itself. They're using GPT-4 in really interesting ways that are gonna give them advantage. Moat number 7, team and talent density. Satya said to their head of research at Microsoft, how the hell did they do this with a couple hundred people? We've got all these people, and how are they kicking our butt so much? Moat 8, insane distribution and partnerships. The customer list is growing rapidly. Number 9, network effects. If I have something that's working with OpenAI and then I'm thinking about exploring something else or switching, the first thing I'm gonna try is the exact same task. And if it doesn't work, then I'm gonna be like, oh, this sucks.

Nathan Labenz: 0:00 Hello, and welcome to the Cognitive Revolution, where we interview visionary researchers, entrepreneurs, and builders working on the frontier of artificial intelligence. Each week, we'll explore their revolutionary ideas, and together, we'll build a picture of how AI technology will transform work, life, and society in the coming years. I'm Nathan Labenz, joined by my cohost, Erik Torenberg.

Erik Torenberg: 1:27 So let's get into it. You wrote this thread on moats. People are asking you a ton about moats. Where are the moats? How should people think about moats?

Nathan Labenz: 1:36 The best comparable that I think is intuitive, and that seems likely to be right in terms of how this broad AI utility intelligence market plays out over the next few years, is that it probably ends up looking quite a bit like cloud. And you might even say, in the limit, it sort of could be cloud, right? If the algorithms are all trending toward commoditization, lots of stuff is getting open sourced, and open source is making all this progress, then eventually, maybe the same stuff runs everywhere, and it's just how many computers you have that determines how much AI utility you can deliver. I don't think it's quite that simple, but it does seem it's shaping up on somewhat similar lines, where in cloud, the big tech oligopoly has half-ish of the market. When you look at AWS, Microsoft, Google, Oracle, however you want to draw the line on who the top-tier cloud providers are, they have about half. And it seems if you were to just ask, what is the breakdown of where inference is served over the next few years? It's probably similar. This time it's probably Microsoft and OpenAI that are in the lead, in terms of they've got the best product, and they've certainly got the cloud to back it up. So they probably lead. And then Google DeepMind, whose cloud is also powering Anthropic, is probably next. And then AWS is a little behind at the moment, but has Hugging Face and has plenty of stuff. Plus, they just have tons of people already using their cloud. So for a lot of homespun open source, you've got to host it somewhere, so it's going to be a natural place. And then you've got the long tail besides that. And that can still be on those same public clouds, but different service layers. It could be your own on-prem solution if you're an enterprise that's supported by wherever, whoever. And even on consumer devices. It seems we're seeing a lot of trends toward that as well.
So in keeping with my theme of everything everywhere all at once, I think there's no reason to suspect that any significant pools of compute would not be used to power AI over the next few years. And so that market structure seems like it probably lines up. For that not to happen, somebody would have to run away with things dramatically, which doesn't seem to be currently happening. OpenAI is ahead, but it doesn't seem they're light years ahead. They're maybe 18 months to 2 years ahead of open source, which is to say what they were offering commercially 18 months to 2 years ago is about what open source has got to now. Not to say that open source will catch up to where they are in the next 2 years. That remains to be seen. GPT-4 is gonna be harder to match than GPT-3 was; it takes more compute, if nothing else. Cost of GPT-3 quality models is now under $500,000 if you go to Mosaic from scratch. Bring your own trillion tokens of whatever data you have, and you can get to GPT-3 quality, smaller model, more intensive training, but under $500,000. Meanwhile, GPT-4 is still rumored to be $100,000,000. That's not going to be so easy for the open source folks to replicate. But anyway, it seems it's headed towards shaping up like the cloud market. You've got the oligopoly that combines the hardware infrastructure and services layers, where for now the models themselves are the core differentiator. And then you've got a ton of other things too that make up the other half of the market. And so just like AWS certainly has moats in general in compute, all these big players have moats to varying degrees. And I think that's ultimately pretty obvious. That whole no-moats thesis, I just think hasn't really held up over the last couple of weeks under deeper scrutiny. And the negation of that memo isn't true either.
I think that memo had a lot going for it in terms of describing the flourishing of what's happening in the open source community, but it's still not the case that nobody needs OpenAI anymore. Far, far from it. The folks that are trying to implement this stuff in medicine, they are not that cost sensitive. They want the best thing. And OpenAI has it, and Google is trying to get on their level and certainly seems they can, or can get real close. So that's just not something that anybody in the open source community has any credible claim against. You can sort that out at the level of a quick demo. You don't need a deep evaluation process. You can literally just use common sense and ask some questions of GPT-4 or the new Med-PaLM 2 out of Google. And you'll taste the difference between that and whatever the latest woolly-animal-named model might be. Again, there's a lot of moats. We could break it down from a lot of angles, but that position is, I think, pretty secure for the foreseeable future.

Erik Torenberg: 7:31 Zooming out, it's interesting to think about how, when you have a new kind of platform shift or technological breakthrough, there are some that have led to open source having bigger market share and some that have led to the opposite. It's interesting to think about the characteristics that make either happen.

Nathan Labenz: 7:48 Yeah. They'll probably continue to evolve as well. And I'm certainly no scholar of earlier software waves. I certainly haven't studied any of those with anything close to the intensity that I've been studying the AI wave. But so much depends right now on context. People ask these questions. How would you decide? That's a question I've been getting a lot. All right, let's say you're maybe at Waymark or you're in some other context. How do people decide whether they're gonna use an OpenAI model or an open source model? And I do think in most contexts, it does seem pretty clear that you can do perfectly well with an OpenAI product. At Athena, for example, where I'm the AI adviser, the explicit strategy, if we're gonna set up some automation to power some new task, is always just start with GPT-4. There's no reason to start anywhere else. It's clearly number 1. It's number 1 on all the leaderboards. It follows instructions the best. It's not that expensive. It is a little slow, which can be a little cumbersome when you're trying to test stuff out. But it's the fastest path to the application of this raw intelligence to your problem, to demonstrate whether or not you can even automate this task. GPT-4 is going to give you the fastest path to that. And of course, there are some things that it can't do that maybe you could fine-tune a model for. And that's a whole other scale of project. So that could be one reason. You might be able to dial something in better if it's a narrow application and you're really focusing in on a use case. There are reasons to go away from GPT-4, and that is one of them. But the reasons that people tend to cite more often, I think, are really not great reasons, like cost. It's pretty cheap, actually.
And it's especially cheap if you consider total cost of ownership, because I can get in there, take most common tasks, and get the AI performing reasonably well with a few rounds of prompt engineering, often in 30 minutes or less. And what is the cost of the person's time? How many runs of this task automation do you need to be planning for before it's gonna pay back even just spending a handful more hours looking for a cheaper option? What is the N of your task? Because for most tasks, people are not trying to scale to the billions, or even to the millions. So you think about the cost per token on GPT-4: $0.03 per thousand tokens in, $0.06 per thousand tokens out. Maybe just for simple math, average that to $0.04 per thousand tokens. That's $0.04 per 2 pages of text processed. And the kinds of tasks that we're setting up are, can we take a long transcript of a call that we have with a client and convert that into a condensed, information-rich profile of the client that we can then put in the assistant's hands as they're beginning a relationship, to give them as much context as we possibly can? And that cashes out to however many pages, right? But it's not that many pages. We might be able to power this for a quarter, and it used to take 4 hours. So we're saving a ton. When people get very obsessed with these cost questions, I'm often thinking, most people don't care whether they're saving 90%, 95%, or 99%, especially if it's all new. The cost, I just don't find to be that compelling. There are some applications where you're doing high volume generation and you have a low hit rate; this is another thing I've been highlighting for people. With Waymark, we have a pretty tight process. So we make these advertising and marketing videos for small businesses. Typically, they're 30 second commercials. They may run on TV. We've been doing this since before generative AI.
Now the generative AI writes the full script, picks all the images, layers on the voiceover, but we had that structure even before GPT-3 popped up. So the ratio of how many of your generations are ultimately useful to people is a reason that you might end up wanting to go towards some cheaper model. Jasper, for example, they're now on the Mosaic homepage as a client. And they've been an OpenAI client, obviously, from the beginning, helping with all this marketing copy. They're probably spending quite a lot. So for one thing, they have some real budget that they can maybe take a chunk out of if they can fine-tune their own stuff. But then I also suspect that a big part of what's going on there is that a lot of their stuff is fairly open ended. Give us 2 words and generate a LinkedIn profile or whatever. I'm guessing they're seeing a lot of generations, and people are maybe just using it in a rifle-through sort of way, like what Suhail told us about Playground, where 10% of their users are making more than 1,000 images a day. It sounds like that's the pattern that's going on at Jasper to some degree. And that's a big contrast with Waymark, where it's basically 1 in 3, maybe 1 in 4, maybe 1 in 5, probably varies a little bit by cohort, of the videos that we generate that ultimately get rendered and downloaded. So people are not sitting there rerunning it that many times. And so the economic relationship between the cost of that generation and what they're getting for it, where they're gonna go spend thousands of dollars on a TV campaign, that ratio is totally fine. And then when you get to task automation, your hope is that you're gonna get to use basically all of the work. And that's fairly feasible. When we do these transcript-to-client-profile workflows, the goal is not to have to run it a bunch of times. And from what we're seeing, it looks like, yeah, it's basically gonna work. Our workflow is going to be immediately after the call.
And this, again, used to take a couple days, and people would get bogged down, and the things would take hours to write. Now you've got basically instantaneous transcription. You feed that into the AI process that chunks it. The biggest 2 limitations with GPT-4 are that the context window is still limited to 8,000 tokens, and it's kinda slow. So when you're chunking these long transcripts into bits in order to summarize them, and then have that unified summary that you can process into whatever format you want, it can be a little bit of a friction point, a little bit slow. But this is the kind of thing that we're basically targeting. You should be able to use 100% of this output. We're not seeing any instances where it's, what is this unusable garbage?
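To make the per-task economics described above concrete, here is a rough back-of-envelope sketch in Python. The prices are the mid-2023 GPT-4 8k rates quoted in the conversation; the chunk sizes, summary lengths, and two-stage chunk-then-merge structure are illustrative assumptions, not the actual Athena pipeline.

```python
# Back-of-envelope cost model for the transcript-to-profile workflow.
PRICE_IN_PER_1K = 0.03   # dollars per 1,000 input tokens (GPT-4 8k, mid-2023)
PRICE_OUT_PER_1K = 0.06  # dollars per 1,000 output tokens

def call_cost(input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of a single GPT-4 completion at the quoted rates."""
    return (input_tokens / 1000) * PRICE_IN_PER_1K + (output_tokens / 1000) * PRICE_OUT_PER_1K

# A 90-minute call is roughly 16,000 tokens of transcript, which exceeds the
# 8,000-token context window, so the workflow chunks, summarizes, then merges.
chunks = 4
tokens_per_chunk = 4_000
summary_tokens = 500

stage1 = sum(call_cost(tokens_per_chunk, summary_tokens) for _ in range(chunks))
stage2 = call_cost(chunks * summary_tokens, 1_000)  # merge summaries into one profile
total = stage1 + stage2
print(f"~${total:.2f} per client profile")  # well under a dollar
```

Under these assumptions the whole workflow costs well under a dollar, which is exactly the point being made: against hours of saved human time, the per-token price is a rounding error.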

Nathan Labenz: 15:20 Hey. We'll continue our interview in a moment after a word from our sponsors.

Nathan Labenz: 15:24 Hi, everyone. I wanted to take just a moment to share another podcast that I've been enjoying recently, The AI Breakdown. As anyone in AI knows, the pace of progress and all the new releases are relentless. I call myself an AI scout, and I work overtime to keep up. But these days, even I can't keep track of everything. The AI Breakdown helps me make sure that I don't miss anything important by curating news and analysis daily. Host Nathaniel Whittemore, aka NLW, quickly highlights the top stories of the day before going deeper on a single topic of interest. Episodes are usually 15 to 20 minutes, and he releases them every single day. Now, it's not easy to keep up with a daily release schedule and still maintain your sanity. So I really appreciate how NLW maintains a curious posture and avoids rushing to judgment. A big part of the reason I'm inclined to recommend the show is his willingness to sometimes say, I don't know. I think listeners will find The AI Breakdown to be a great complement to the long form deep dive interviews that we create, so I encourage you to check it out. The link to The AI Breakdown with NLW is in the show notes.

Erik Torenberg: 16:40 Omneky uses generative AI to enable you to launch hundreds of thousands of ad iterations that actually work, customized across all platforms with a click of a button. I believe in Omneky so much that I invested in it, and I recommend you use it too. Use Cognitive to get a 10% discount.

Nathan Labenz: 16:56 The human review layer at the end is more to be, okay, I just talked to this person for 90 minutes. Does this thing miss anything that seems important? And we do see issues where it's, yeah, this person said this, and it does seem pretty important, and the AI didn't quite pick up on it as a key point, didn't represent it in the final thing. So it's not a flawless process, but you can go from just having gotten off the call to looking at a draft of your final document in a couple minutes, and then read through it, figure out what rings true. Mostly it's good, but there might be something that doesn't quite ring true or something that was missed, and then you're kind of done. Still, in that whole process, the cost of the AI is very small. We had a person that sat there and talked for 90 minutes. There's no reason to try to save a couple more pennies at this stage in that process. It's all about driving the quality. We wouldn't even consider an open source model. There's just no reason for it. Now, we're also probably not going to be that huge of a customer, because there aren't that many clients coming in. But you start to extend this to other things, where there are points in the business where there's real scale. The next one downstream is matching clients with the candidates. And they may be onboarding 100 clients a month. You may have 200 candidates in the pool that have already been filtered and gone through a whole wringer. I think they hire sub 1% of initial applicants. To do every analysis, there are 20,000 of these sorts of comparisons. This client profile and this assistant profile, is that a good match? It's not easy to scale up to 20,000 of those with human power. In practice, I think historically, you would more kind of look until you found one that seemed good and kinda have to stop there, because there just isn't time to do the fully exhaustive version.
But the AI can run overnight and do that more exhaustive version. And it's still gonna be cheaper than the human powered thing, and it might even be better. I wouldn't go quite that far yet and say it's gonna be better, but I think it can match pretty readily. And the prospect for it to be better is definitely there as well. And to make it just more personalized, to then turn around and be like, alright, now write a blurb introducing this person to the client. You can enhance the client experience in ways that aren't even necessarily about the fundamentals, but just, let's get this thing off to a good start by introducing this person in a really nice personalized way. Anyway, in all of that, cost is not an issue. I can tell you that.
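The scale math in the matching example above (100 new clients against a pool of 200 vetted candidates) is easy to make concrete. This is a hypothetical sketch: `match_score` is a stand-in for an LLM call that rates profile fit, and all names are made up.

```python
from itertools import product

# Illustrative scale from the conversation: ~100 new clients a month
# against a pool of ~200 vetted candidates.
clients = [f"client_{i}" for i in range(100)]
candidates = [f"candidate_{j}" for j in range(200)]

def match_score(client: str, candidate: str) -> float:
    """Stand-in for an LLM call that rates how well a pair fits (0..1)."""
    return hash((client, candidate)) % 1000 / 1000  # deterministic within a run

# Exhaustive pairwise review is infeasible for humans, but an overnight
# batch job covers every combination.
pairs = list(product(clients, candidates))
print(len(pairs), "comparisons")  # 20000 comparisons

# Pick the best candidate for each client from the full comparison matrix.
best = {c: max(candidates, key=lambda cand: match_score(c, cand)) for c in clients}
```

The point of the exhaustive version is that "look until you find one that seems good" becomes "score every pair and take the argmax," which a batch job does trivially at this scale.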

Erik Torenberg: 19:54 Do you see the gap? You mentioned open source is about 2 years behind. Do you see that gap widening or shrinking over time? What do you predict?

Nathan Labenz: 20:02 I don't know. It probably goes back and forth. I think there'll probably be a yo-yo effect. I mean, the open source line is rising all the time, and then you probably have more punctuated big next releases from the key players. The perceived public gap probably closes and then widens again, and it won't necessarily be obvious whether the overall gap, given what they have internally that the rest of the world doesn't know about, is shrinking or widening. I bet it would be pretty hard to tell still for a while. And it seems like they are probably gonna cluster. There's a weird tension right now between the fact that the leading companies, namely OpenAI, Google DeepMind, and Anthropic, are pretty clearly coordinating on some very meta level, making very agreeable statements to each other about how what's most important is that we do this safely and that everybody can benefit from it. And Google's CEO Sundar just said the only race that matters is the race to safety, or something like that. We should find that exact quote; that's not quite what it was. But he's specifically trying to say we are not going to get into an AI race, which I honestly view as great validation for the entire AI safety movement, and even specifically the EA AI safety movement, because it's very easy to imagine a counterfactual where people running the cutting edge AI companies were just totally dismissive of big picture AI risks. And you've got basically all 3 of the leaders showing a real, demonstrated awareness, and seeming pretty credible in taking these tail x-risk scenarios seriously. I don't think that necessarily happens in a counterfactual world where there's no EA movement or where there's no Eliezer. Their influence seems pretty clear there.

Erik Torenberg: 22:27 I certainly agree. I think the question there is, actions speak louder than words, and how much will the actions match? If they were in a race, would they be acting any differently? Will they act any differently? And how so? How extreme? Because in some ways, if you're a little cynical, you could say it's reverse psychology in this really effective way: say it's not a race as a way to potentially throw off competitors, or to not threaten competitors, but also on the regulatory side. We have this weird dynamic where for other technologies, let's say crypto, people said, hey, this is the next great thing, everyone needs this. And then on the regulatory side or media side, you see a response of, hey, actually, maybe this is bad, or maybe this isn't great, or it's too powerful, or it's too dumb, or whatever. But with AI, it was really interesting to watch the hearings and think about the media and regulatory response. I think some accuse them of marketing by emphasizing how dangerous this thing is, which also emphasizes how powerful it is. And some people in media and government are saying, hey, it can't do as much as you say it can. They're more scared of social media and misinformation, and the complaints they have about social media, teen depression, etcetera, than they are about AI. I would have expected people to be much more concerned than they are. It seems that most concern is happening from within tech. And I wonder if that's unconsciously a form of reverse psychology, in the sense that because tech is so internally worried about it, maybe that encourages them, because they're somewhat anti-tech or somewhat contrary to tech, to be less worried.

Nathan Labenz: 24:08 I think Sam Altman's statements about what kind of regulation they recommend have been pretty clear and not always interpreted in good faith. He's literally explicitly said, we think the regulation should apply only to the leading companies that are doing the biggest models, the biggest compute budget training runs. And we don't want to interfere with smaller companies, open source, research, startups, etcetera. And that seems like a very reasonable position for the regulatory bodies to take, and I hope that they take something along those lines. That seems pretty sane. So yes, there could be some amount of hype marketing in that, but really, they don't market at all. They don't need to market. The phone is ringing off the hook. When I talked to the MosaicML guys recently, Jonathan and Avi, they were basically like, if you want to be a customer, you better hurry up and call us, because we are starting to get to the point where we're making serious tough decisions between research and customers. And they're not going to cut research to nothing. They know they can't do that. So it's like you might have a wait list at Mosaic pretty soon. The level of spend that you have to commit to OpenAI to get some commitment from them of some attention to you has risen dramatically. When we bought in just over a year ago, we were already a retail customer on the API, fine-tuning models and stuff. But as we were starting to get more serious, I was like, I kind of want a tutor or a consultant, essentially on the inside of OpenAI. What do I have to do to get you guys to take an every-other-week call with me? At the time, it ended up being $2,500 a month for the service package that we bought into. And now they don't even offer that.
And to get that kind of account manager or whatever, you're at least into six figures, if not a quarter million upfront commitment that you're going to spend with them, to have them take your calls. So I don't really think he needs to go in front of Congress and call for regulation as a marketing strategy. They could also downplay their shit. I mean, if you want to say that they are hyping everything, look at the launch statement that Sam Altman put out on GPT-4 day. One of his very first points was, it seems more impressive at first than it really is. Which is true, and it is honest. That's an honest statement. But you don't really see tech CEOs downplaying their major launches on launch day, and yet they did that in a very clear way. Sam's tweet goes something like, here is GPT-4, it's our most capable model yet; like all of our models, it still makes stuff up, it still has major weaknesses, and it appears more impressive at first than it ultimately is. That doesn't sound like a hype cycle to me or a marketing ploy. I think they really mean it, and at the end of the day, that's becoming increasingly clear. Now, you could really ask, and I was thinking about this earlier today too: you read their governance statement from this last week, and you get down to the bottom of it, and the last point that they make is, we think that it's counterintuitively dangerous to not develop this stuff, because the fundamentals are all going this way. And that means that it's increasingly easy to develop these powerful systems. And if we don't keep the gap between what's possible and what exists somewhat narrow, then we may have these sudden, super disruptive events in the future, if somebody achieves some breakthrough unexpectedly and it all of a sudden drops into an unprepared society.
And it's like, okay, I don't know if I really follow that to the same conclusion that they do. I mean, it almost kind of seems like they're saying somebody's gonna develop AI so powerful that it's dangerous, and better us than them, for, I guess, basically all the other thems it could be. I don't necessarily want to sign on to that or endorse that, but I do think everything they're doing is pretty consistent with it. They do seem to be trying to rise to the occasion. And they definitely could be jamming stuff out faster than they are. They even have that part of the charter too, which, I mean, people obviously at this point don't trust them on necessarily anything, but it is in the charter that they will combine forces with another effort that they believe is credibly close to AGI rather than compete with them. That's a hell of a commitment to have made in 2015 or whenever they came up with the charter. At that time, it must have seemed like a pretty distant worry. But they've had that online for a long time. It's not like they just released that statement. That's been there for years.

Yeah, going back to your question about the opening and closing of the gap between closed and open source, I do think we'll probably see these releases again come in some kind of cluster, the way GPT-4 and PaLM 2 came in a pretty tight window. I would bet that in the future, that will continue to happen as a reflection of the fact that leadership at these companies is genuinely concerned with the possibility of an AI race and genuinely doesn't want to create that dynamic. And so they'll talk to each other behind the scenes a little bit and say, can we kind of ease into this stuff together? You're gonna have your next thing, we're gonna have a next-generation thing, and if we bring those online around the same time, the oligopoly is not gonna get too disrupted. Microsoft and Google don't have to battle to the death; they can both kind of advance similarly. And that's probably a really good thing, as long as they don't collectively still blow up the world in the process. But it does seem definitely way better than a different dynamic where they're sworn enemies, won't talk, and are trying to one-up each other. We saw a little bit of that earlier this year, but Google has not taken that bait. I think they will be vindicated in the end for saying, alright, you go ahead, but we're gonna do it when we're ready. And so far, it doesn't even seem like they've lost significant market share, as far as I can tell. Maybe a couple points, but they're still dominating Bing in terms of just raw search volume. So it seems like even from a shareholder standpoint, the consumer wasn't so ready to switch that corporate requirements would dictate you must ship immediately. So I think their judgment has been actually pretty good, and vindicated.
And again, I think that's by extension another vindication of the kind of EA safety crowd, because I don't think he grew up thinking about this kind of stuff. So he needed some people to have done that intellectual work to be able to tap into it. If there's no literature, I don't think these people come to these conclusions. Some of them might, but Sundar probably not on his own, totally organically.

Nathan Labenz: 32:48 I agree. It's very influential literature. Marx is very influential literature too; I'm just using that as an example. Let's go deeper on the respective assets. We were talking about the NBA the other day, about the Nuggets' chances versus Boston versus the Heat, and we talked about their relative strengths and weaknesses and where they're at now. Let's do something similar with some of these players we're talking about, in terms of how we see things playing out based on the facts on the field, so to speak: people's relative strengths, weaknesses, opportunities, etcetera.

Erik Torenberg: 33:24 I'll just run down the quick nine moats that I identified for OpenAI, and then we can compare that to Google. Most of them are basically the same, and OpenAI is just a little stronger in a number of these categories right now, it seems. But you could also make the argument that Google is stronger in some of maybe the most important categories. So moat 1: GPT-3.5 Turbo is the best value in the LLM game today. Earlier, I talked about how GPT-4 is generally a huge savings over a human-powered process, and if I'm new to AI and trying to make stuff work, there's really no incentive for me to go anywhere else. But even if I were gonna go anywhere else, they happen to have a model that's roughly 20x cheaper: instead of 4 cents per thousand tokens, it's 0.2 cents per thousand tokens, which is $2 per million tokens, and that's what powers the ChatGPT free tier. I said it's the best value in the utility LLM game today, and I think that's probably still true. There's a new leaderboard I've started to follow at lmsys.org, where they literally have chess-style head-to-head battles between language models and keep track of Elo ratings. Users go in, get responses from different models for the same input, choose the winner, and the site keeps score. GPT-4 is at the top of the power rankings. Claude is number two, and Claude Instant, which is kind of the Anthropic answer to ChatGPT, has actually taken third place recently and put Turbo into fourth. So arguably, you might say at this point that Claude Instant has even a slight edge over GPT-3.5 Turbo for best value in the game today. But either way, they're both great value. They're both cheap, easy, and fast, and they can do a ton of stuff. They can handle marketing copy tasks for the most part. They can return formats pretty reliably. They can't do the advanced stuff of GPT-4.
This is the difference between bottom 10% on the bar exam and top 10% on the bar exam; that's the leap from 3.5 to 4. But even at bottom 10% on the bar exam, you can do a lot of stuff. You can process a lot of data. You can organize a shopping list. There's plenty you can do without being quite powerful enough to pass the bar. So that's a great product, and it's still better than all of the open-source imitators. And in fact, these open-source things, I mean, not all, but a

Nathan Labenz: 36:11 lot

Erik Torenberg: 36:11 are using GPT output from OpenAI and just imitation-training on that. So they are getting somewhat close on some test domains, but the leaderboard will tell you, in a blind, head-to-head, user-calls-the-winner process, that they're not that close even to the second-tier, 5%-of-the-price-of-GPT-4 version of OpenAI's products. So just having that means there's not much opportunity for people to come in and steal share on pure price. They have not left the door open there all that much. People could go elsewhere for other reasons: control, fine-tuning, ideology, not wanting to send data out over certain boundaries, whatever. But they've got a great commodity product in today's world, and I'd say it's pretty clear at this point that OpenAI and Anthropic are the 2 leaders in that category. Moat number 2 is branding and just trust. There's been plenty of complaining posted on the Internet about ChatGPT, and you hear it from all sides. Right? Depending on who is trying to embarrass it, it can simultaneously be too white supremacist and too woke, and almost regardless of perspective, you can see unwanted biases or unwanted behaviors in it. But the alternative in the open-source world is just way worse if you are a corporate customer. If you want a radical free-speech experience and you're not a major company, you can go the open-source route and do whatever you want. But if you're the kind of business thinking about putting a chatbot into your product experience somewhere, you don't want it to get too adventurous. I was joking with a guy at OpenAI: these open-source radicals really overestimate the corporate appetite for large language model adventure. Nobody wants their own personal Sydney experience. Nobody wants that kind of embarrassment.
And when you can get good-quality, pretty reliable service from OpenAI at $2 per million tokens, the burden is kind of on you to explain why you did something different. People used to say nobody got fired for going with IBM, and now something similar might be true for a few of these top players. Your AI might still embarrass you, but at least you can fall back on: look, I used the industry standard. These guys spent 6 months working on the safety of GPT-4. What do you want me to do, boss? If we're gonna use this kind of stuff, there's gonna be some risk, but I made the safest choice I could with OpenAI or Anthropic. So, again, that's a moat. Right? And Google's gonna get there too, soon. They certainly have the trust and gravitas that a corporate buyer would believe they have good standards in place. And again, when it's so cheap, why am I, as a CIO or whatever, gonna put myself in a position where I don't really know how this open-source model was trained, or by whom, or to what degree it's been battle-tested, just to save a little money? And by the way, am I even really saving money? Maybe, but I'm certainly gonna be putting more man-hours into it than if I just did the simple thing. So it sounds kinda tough. Moat 3 is the feedback loop that they have, and this is where OpenAI is currently quite out in front, depending on exactly what you think about some of the alternative strategies. Nobody has the volume of LLM usage that OpenAI, that ChatGPT, has. Bing's doing some decent volume, I've seen, but not as much as ChatGPT, and Bing is, of course, powered by GPT-4 anyway. So OpenAI is getting this data. They now have terms of service that say if you use them via the API, they will retain your data for a short time, then delete it, and not use it in training.
But the free tier of ChatGPT is used in training by default. You can opt out of that now, they just added that, but by default, you're opted in. So they're getting more raw usage and feedback data than anybody else, and they have a well-honed product development process that is humming. Others are finding that kind of hard to reproduce. Even somebody like Bing comes out and finds out, wow, we didn't expect that. And with some of those things, there were some pretty flagrant breaks in the Microsoft process. I've gone down that rabbit hole. They tested this thing for months in other parts of the world. They had users report in their forum the same behavior from the bot that ultimately graced the cover of the New York Times, and they failed to detect it because 1 part of the organization wasn't talking to the other, or whatever. The Microsoft employee who responds in the forum, and last I checked this was still online in the Microsoft forums, seems to not know about the AI-powered search experience at all and doesn't even know what the person is talking about who is saying, your chatbot is accosting me. That's all documented as of late 2022. Then they launched the thing in 2023, and you had failures as simple as the AI getting the date wrong: the user corrected the AI on the date, and at launch, that was enough for the whole thing to go off the rails. I think people should be very clear on the difference between a jailbreak, where you can trick ChatGPT into saying a bad word or whatever, which is a problem, and turning on a user, which I've never seen ChatGPT do. That's a whole different level of failure of alignment: failure of control when the user is trying to break your control measures, versus failure of basically decent engagement with the user in the first place.
So it's a qualitatively different thing. All that story is to say the product feedback loop is pretty important, and this stuff does not just magically cohere into a well-behaved AI by accident, or even with moderate effort. It seems to take a lot of effort. So OpenAI is just crushing it in that regard. Anthropic, again, has a unique approach with their constitutional AI system, where they use self-critique and synthetic data to try to get the same level of control as if they had the user base. And it seems to work quite well, so that may be something other people could emulate. In some ways, they are ahead in terms of their safety profile, not their raw capability profile. I've seen them be ahead; I've also heard from others that maybe that's not the case. So it's probably mixed. The surface area is so huge that you can be ahead and behind at the same time in different areas, and that's almost certainly the case. But nevertheless, the approach does seem to work quite well. They also do a pretty good job of avoiding hallucinations. At 1 point, I think they did a better job than OpenAI, though with GPT-4, OpenAI has improved a lot. But previously, with ChatGPT versus Claude, you could see situations where ChatGPT would still make things up. I asked a question about property tax in 1 Massachusetts town 1 time, and it made up a rate that was not the real rate. It had all the structure and conceptual analysis of property tax generally right, but it made up a rate at the key moment that was false. Whereas Claude said, "for example, if the rate were..." and just gave a nice, reasonable round number. So there are ways in which the Anthropic method does seem at least competitive, if not, in some ways, maybe superior. Those 2 seem to be at the forefront of that. Actually, Character AI has a really good feedback loop as well.
They're kind of a dark horse in this whole game, playing a bit of a different game. They're the only other 1 that comes to mind as having a real tight, at-scale product feedback loop. Google doesn't have it yet. They should be able to get it real quick, but whether they can really sand all the barnacles off their product flywheel, I guess, remains to be seen. As of now, they've not done it. If you look at the leaderboard, Bard is down the leaderboard, not because they don't have the raw horsepower, but because, at least as far as I can tell, they have not shaped the product the way the best in that category have. They can match on things like medical question answering; GPT-4 and Med-PaLM 2 are both expert-level on those benchmarks. So they can definitely get to the point where they can do high-level stuff, but they don't have a unified product that does everything and knows how to handle all these different situations. Med-PaLM is dedicated to medical question answering, and it can't help you with whatever random search queries you have or just be your general chat adviser. It is specialized and fully end-to-end fine-tuned for that domain. And that arguably could be good. Somebody might say, jeez, do we really need these all-purpose AIs? Maybe a bunch of specialist AIs would be a better overall architecture. I think Google is playing it both ways. They're gonna have their general open chatbot that you can talk to about anything, but they're also taking Med-PaLM 2 to hospital groups and stuff like that. They're not messing around with some base model there; they're only gonna take the good stuff. Moat 4 I kinda talked about in the thread: pricing power. I think we've probably covered that at enough length. $2 per million tokens. You gotta be using a lot of tokens.
Another way I was thinking about this, and it's actually kind of interesting math: if you were to try to compete at that price point, you would have to serve 100 billion tokens to pay for 1 employee. They make $200,000 in revenue for 100 billion tokens served. I'm rounding the "we're so back" San Francisco AI salaries down to $200,000 and calling that 1 employee. And by the way, whatever electricity and compute costs you have, Microsoft gets its share out of that $200,000 for 100 billion tokens too. I mean, 100 billion tokens is a ridiculous amount of stuff. They're training these models on 1 trillion tokens; that's a general ballpark of what a good open-source project, not a GPT-4, but a LLaMA-type model, might train on today. People also talk about that as the scale of the whole Internet. So you're talking about generating text at maybe 10% of the scale of the Internet for $200,000. Who needs that many tokens? It's gonna be hard to build that many businesses competing at that level. I just don't see how you build another world-class team that can hit at that level, only to then be like, okay, cool, now we're here, now who needs 100 billion tokens? And by the way, we need to find a lot of customers who each need 100 billion tokens to have a chance. I don't know. It's tough. The volume of that really kinda blows my mind. There are things you'll see: just mass summarization, summarizing everything, processing everything. This stuff is cost effective for quality control, too. Right? Every time you've been on hold and heard "this call is being recorded for quality control purposes": it's now cheap enough that you could actually implement that quality control on literally every call, perhaps.
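The back-of-envelope math in this pricing-power argument can be sketched directly. This is a minimal illustration using the figures quoted above (the $2-per-million-token price and the rounded-down $200,000 salary); `tokens_to_cover` is just a helper name for this sketch, not anything from OpenAI's docs:

```python
# Back-of-envelope sketch of the pricing-power math discussed above.
PRICE_PER_MILLION_TOKENS = 2.00  # dollars, roughly GPT-3.5 Turbo's price
ANNUAL_SALARY = 200_000          # dollars, one rounded-down SF AI salary

def tokens_to_cover(revenue_target: float, price_per_million: float) -> float:
    """Tokens you must serve to generate a given revenue at a given price."""
    return revenue_target / price_per_million * 1_000_000

tokens = tokens_to_cover(ANNUAL_SALARY, PRICE_PER_MILLION_TOKENS)
print(f"{tokens:,.0f} tokens")  # 100,000,000,000 tokens (100 billion)

# For scale: a LLaMA-class open-source model trains on ~1 trillion tokens,
# so covering one salary means serving ~10% of a full training corpus.
print(f"{tokens / 1_000_000_000_000:.0%} of a 1T-token corpus")  # 10%
```

The point of the sketch is how unforgiving the unit economics are: gross revenue covers one salary only after serving a tenth of a web-scale corpus, before any compute or electricity costs.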
So I don't think we've by any means exhausted our imaginations when it comes to what we're gonna use these tokens on. And that 100 billion is input and output, it should be noted. Scanning through tons of stuff and post-processing information is gonna be a huge trend, I think, and there are a lot of tokens in that. But still, how many businesses can compete at that kind of commodity layer when it's that cheap? It seems really hard. I don't know. If I'm a VC, I'm not sure I wanna invest in that. Moat 5 we kinda talked about as well: privileged access to cloud compute. You can only serve as much of this as you have compute for, and there are starting to be compute shortages. I've heard from a friend at Google that demand for compute is now starting to bind a little bit. They've made 1 of the biggest decade-long investments in computing infrastructure, and it sounds like they're starting to have to ration some stuff at Google. The same was true, from what we heard from OpenAI, in the middle of last year. They were like, we have to make some choices on our product line, because even with our partnership with Microsoft, we just cannot scale our access to compute as fast as we would like. There were a couple of different things happening at the time, but DALL-E 2 blew up, and they had kind of decided, we gotta delay the launch of some other product to put all the resources behind this. And that's the Azure cloud they're building on. Right? So the moat is pretty apparent. When they're already hitting capacity constraints after hundreds of billions in capital investment, that's about as self-explanatory as a moat gets, I suppose. You're gonna have a hard time accumulating anything on a similar scale.
That doesn't mean you can't build a business out there, but it definitely means they have a moat. Moat 6: GPT-4 itself. They're using GPT-4 in really interesting ways that are gonna give them advantage, and Anthropic has said some similar things. By the way, going back to 5: Anthropic partnered with Google. Hugging Face partnered with Amazon. Cohere, I forget who they partnered with, but somebody. All of the leading labs that don't have the compute are entering into strategic partnerships for it. So there's a musical-chairs game going on there that isn't necessarily 1-to-1, but how many preferred model providers does AWS ultimately take on? Probably not that many, I would guess. Anyway, coming back to GPT-4: if you wanted to make a case for why the leaders are gonna run away with it and widen the gap between themselves and open source, maybe 1 of the best answers would be that they already have these advanced models that allow them to scale all sorts of things that were previously really hard to scale. OpenAI has given us a glimpse of that with their recent interpretability publication, where they used GPT-4, I think, to look at GPT-2. You've got all these neurons, and you don't know what they do. So how do you figure out what they do? They basically run a bunch of text through the model, keep track of what is making each individual neuron highly activated, and then pull that out and look at it in batches. In our TinyStories interview, which we just recorded, it's really apparent there. They have these small models, and when they do that process on the smaller models, the concepts actually kind of jump off the page and are very apparent.
And you can see, oh, this 1 seems to be responding to animals, because dog and cat and bird activate it. Okay, I see a pretty clear category there: this thing fires when there's an animal. They showed a bunch of examples of that. As models get bigger, that stuff gets more messy and hard to figure out. Some of the concepts remain quite clear and interpretable, but for others, you're looking at it like, okay, so this and this and this all caused this neuron to activate at a high level, and I'm not really seeing anything super coherent or an obvious concept that I could label. But they're using GPT-4 to automate that process. Even GPT-2 has, I think, 1.5 million neurons or whatever; parameters are different from neurons, there are fewer neurons, but still plenty. How else are you gonna scale that? So you think about that kind of thing, and the mega scale they can apply to enriching and cleaning datasets. It's not all just about scale; it's also about quality. Well, how are you gonna clean your dataset? You're probably gonna crunch through it with GPT-4. Anthropic has lent some credibility to this notion with their leaked, I think accidentally leaked, pitch deck, which said something along the lines of: we think the companies that fall behind in the 2025-26 cycle may never catch up. And if you're like, what does that mean? That sounds kind of ominous. I think it is kind of ominous. But if I had to interpret it, I would say it's that the models themselves become an engine of advantage: if you don't have access, you can't perform the next level of research at the same pace. I do generally believe that the calls for regulation are pretty sincere, but that is also maybe where things start to diverge. And you're like, oh, man.
If certain things are required, say you have to perform some exhaustive check, and you can do it with GPT-4 at 1 level of scale, whereas if you don't have GPT-4 you can't do it at all, then that becomes an interesting challenge. Maybe you could imagine a regulation where they are required to share certain capabilities with other developers or something along those lines, so it's not like they can control the whole stack. But as it stands, by the way, their terms do not allow it: all of these models that have been trained on ChatGPT output basically violate the OpenAI terms. So if you actually did wanna go commercialize that, like, oh, look how smart I am, I took this open-source model, trained it on ChatGPT output, now here's my business: they could just straight up sue you and probably win, because you just took a bunch of their output. Now, the good follow-up question there would be: what about humanity as a whole suing OpenAI for having taken all of our stuff and run the first training process that created the model in the first place? How can it be that they're allowed to take all of the human data and create a model, and then prohibit you from taking from their model to train a downstream model? That does seem a bit weird, and I'm not sure that exact position is ultimately gonna be tenable, for multiple reasons. For 1, it just sounds kind of insane; you're gonna have a hard time defending it. And second, if you do want to say, we're not trying to slow down research: well, research comes to a point where it depends on this very rapid, high-quality processing of information to build good, reliable datasets. China has just put out guidelines, right, that people are basically saying are impossible to meet. And China does not appear to be racing into an LLM future. They may be racing into an AI-for-military future; I don't know about that.
But in terms of putting chatbots online, they do not view that as the space race right now, as far as I can tell. On the contrary, they're more worried that a chatbot is gonna talk about Tiananmen Square or whatever, and they don't want that. So they're saying: you as a developer are responsible for your stuff. I would really recommend a great Syndicate podcast on this recently with a couple of guests, we can find the link, where a couple of China scholars do a close reading of what the CCP has said about this. They have issued statements and put out standards that are not easy to meet. The data that you use to do the training has to be reliable, has to have quality; they're using adjectives like that, so what does that even mean? It has to not violate anybody's intellectual property claims, and I don't think that legal regime is sorted out in China either. Your data has to meet these quality standards per their statement. And everybody's like, well, that's impossible; how could we have web-scale data that meets those standards? The answer is, if you own a cloud and you have GPT-4, then you can do that data cleaning next time. They're already to a point where they could probably train the next model on almost pure synthetic data: taking what is real, filtering it and transforming it into something totally synthetic, doing the training on that, and then saying, look, we're clear, because here's the entire dataset that we trained GPT-5 on. Now, there is still a link in the chain. Right? It was all made by GPT-4, which was in turn trained on whatever, and eventually it does get down to human data; you couldn't have gotten here without it. But I do see some potential for that kind of dynamic where there's a new standard.
Your data has to be squeaky clean or whatever, and now there is kind of a lock-in effect, because nobody outside can really do that unless they have a model of that caliber to power that kind of cleaning. And are they gonna share that? As of now, they basically say no, per the terms, but maybe they could be required to, or maybe they could change their minds. Anyway, this is why GPT-4 is a moat: it has qualitatively different abilities that they may even be able to use to accelerate their own work, and as of now, nobody is forcing them to share it to accelerate others' work. They've also removed the logits. In the past, with GPT-3, you'd go in and use the API, and when you got that API result, they would give you not just the 1 token that was chosen, but up to, I think, the top 5 most likely tokens, with the percentage that each was assigned in that prediction step. Under the hood, they've actually generated a number for all 50,000-plus tokens in the vocabulary, so they don't have to do any extra work to provide that; they're doing all that work for all 50,000 candidates anyway, picking 1, which could be the top 1 or could be semi-randomly chosen, and then returning the top however-many choices to you. That was really useful if you wanted to study the model. It was also really useful if you wanted to train an imitator model, because it's way more information. To say the chosen token was "the" is 1 thing. But to say the top token was "the" at 47%, next was "a" at 32%, then "an" at 9%, and so on: you can learn a lot more from that much deeper level of disclosure, and they've now closed that off. There are no logits returned with GPT-4. So that's kind of the raising of the drawbridge a little bit.
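To make concrete why those top-token percentages were such a gift to would-be imitators, here is a toy sketch. The probabilities are the illustrative "the"/"a"/"an" numbers from the discussion above, not real API output, and the function is a generic soft-target cross-entropy, not OpenAI's or anyone's actual training code:

```python
import math

# Teacher's disclosed top tokens for one prediction step (GPT-3-era API
# returned up to the top-5 log-probabilities like this; numbers illustrative).
teacher_soft = {"the": 0.47, "a": 0.32, "an": 0.09}

# GPT-4-era API reveals only the single sampled token: a hard label.
teacher_hard = {"the": 1.0}

def cross_entropy(target: dict, student: dict) -> float:
    """Cross-entropy of a student's predicted distribution vs a target.
    Tokens the student assigns ~zero mass get a tiny floor probability."""
    return -sum(p * math.log(student.get(tok, 1e-9))
                for tok, p in target.items())

# A student that gets the top token's mass right but misranks alternatives:
student = {"the": 0.40, "a": 0.40, "an": 0.05}

# The soft targets grade the student on the whole ranking; the hard label
# only rewards mass on "the", so it carries less signal per training token.
soft_loss = cross_entropy(teacher_soft, student)
hard_loss = cross_entropy(teacher_hard, student)
```

With logprobs disclosed, every served token gives a distillation target over several candidates; with only sampled text, the imitator gets a single hard label per position, which is exactly the "raising of the drawbridge" described above.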
You can still get your GPT-4 outputs and try to train on them, but they've made it that much more difficult than it used to be. Moat number 7: team and talent density. They are definitely an absolutely killer team. There was just a new story where, apparently, Satya said to their head of research at Microsoft, how the hell did they do this with a couple 100 people? We've got all these people, and how are they kicking our butt so much? That's probably a complicated question to answer, but no doubt they have extreme talent there, and I've seen it in every part of the organization. The folks we interacted with when we were in that $2,500-a-month consulting engagement: all very, very good. The business contacts know their technology in a way that I don't think you can sustain if you get really, really huge. The business guys I've talked to don't have to check with the team; they know what's what. So it's just an extremely strong organization, top to bottom. Moat 8: insane distribution and partnerships. The customer list is growing rapidly. Here's a list of customers recently announced for OpenAI: Intercom, Wix, Morgan Stanley, Shopify, Khan Academy, Atlassian, Zoom, Brex. That was just 1 little thread from 1 of the business guys there. They've also got a huge partnership with Bain, and I think they just formed another partnership with another global consulting firm. They're already in the door at basically all of corporate America. So you can try to sneak in that door before it closes behind them, but those sales processes are underway, if not already closed and moving on toward model customization or what have you. Number 9: network effects. I mean, this one's definitely not as strong. We talked about this with Elad and Sarah a little while back.
The network effects in social media are certainly way stronger than the network effects appear to be in AI. Sometimes people ask me, how much lock-in is there? And honestly, there's not much. When we run stuff on OpenAI, we make an API call to OpenAI. That code is a few lines, even a 1-liner. You could flip to Claude in 2 seconds. You could flip to some other model in 2 seconds. It's really not that hard. Even with fine-tuning: you can fine-tune on OpenAI's platform, and they don't let you download your fine-tuned model and walk off with it, so you're essentially building to rent. But once you've built that dataset, you can certainly take it and go run it on an open-source model. That's something I've considered recently with Waymark, out of pure curiosity, really. We're not really trying to save money at the moment, but I'm thinking, jeez, these open-source models are kinda getting there. Maybe it would be worth taking the dataset we currently use on OpenAI, running it against 1 of them, and seeing how it goes. I alluded to this earlier but didn't really get into it: we have a high degree of developer control. Our task is super defined. It's always the same formula, where we're saying, here's the video script structure that you have to follow, here's some information about the user, set at runtime, and your job is to spit out a completed version of the script structure you're provided. That's how it works. When we fine-tune for that, we're not supporting chat. We're not helping you write haikus. We're not doing anything else. It's that. The developer control there means we can be pretty confident things can't go too far off the rails. If the language model starts to malfunction, the application just errors.
It doesn't attack the user; the user won't even see the output, because it'll just break. So that developer control, and the predictability of the task, is such that we could run a fine-tuned model even without all the safety bells and whistles and niceties that we do get real value from with OpenAI, but could kinda live without because of how defined the task is. It's still not hard to change; you can flip around any way you want, even on the fine-tuning side. So are there super strong network effects? Not really. The biggest things are maybe social, in the sense that everybody's introduced to AI with these products, and the techniques to use them, the prompt engineering, and the tools all get built for OpenAI first. Usually, they're built in a provider-neutral way pretty quickly, if not right out of the gate, but nobody fails to support OpenAI with the first release of a new library or framework. It's always gonna be OpenAI on launch, maybe OpenAI only, maybe others included too. So there is some gravity there. There's a reason it's last on my list of moats, but it also kind of forces others into somewhat of a following position. If I have something that's working with OpenAI and I'm thinking about exploring something else or switching, the first thing I'm gonna try is the exact same task. I'm just gonna literally copy and paste into the other thing. And if it doesn't work, I'm gonna be like, oh, this kinda sucks. If it could work, if I re-prompt-engineered it or read their prompt guide and did it their way, that's nice, but ideally it's gonna work the first time, and all of those companies are gonna feel that pressure. You can imagine being the CEO at a competitor. Right?
You're like, if you had a prompt you're like, well, people need to read our prompt guide, then they'll know how to use our AI. It's like, no. They're not gonna do that. It's not gonna work like that. You have to make it easy for that. If you gotta reduce that friction so that in their effort to kinda reduce that friction, they end up in kind of a following position, I think, pretty often. The plug in architecture is something that seems like it's gonna kinda go that way and is another area where you can see potentially a some of these things were having the product dialed in, having the feed having the data scale to power a feedback loop. The the plug in architecture, in theory, it's highly portable, but our other so other people can adopt it. Microsoft is that they're going to adopt the the same basic plug in architecture that OpenAI introduced. So probably everybody's gonna kinda have an have to be able to support that in some way, but there's a lot of ways that you could be worse at supporting that than them. And now you're just kind of trying to play their game, but you're playing from behind. It's gonna be hard to leapfrog them at their own game. It's gonna be hard even to catch up with them at their own game. And meanwhile, you're not developing your own game. Right? So that's, I think that's gonna be tough. I think for some for a lot of these reasons, companies like character and, inflection with their new pie AI. I think those are notable exceptions to all of this analysis or at least possibly because they are trying to do something different. When you go talk to pie, it's not it's not like a sort of Butler style what can I do for you? Here it is. Hope it's helpful. It's a much more open ended exploratory dialogue. And that's that may emerge as a totally different lane that is currently kind of unfilled. I don't think many people go to ChatGPT for companionship, and they almost kind of discourage that in their approach. 
It'll chat with you, but it's like it frequently reminds you that it's an AI in ways that are not super conducive to an immersive experience. Whereas these other ones, they don't deceive you about being an AI. I think they're they're all pretty defensible product designs from what I've seen so far. But they do engage you in a different way that they could just end up being a different product category in the end. Erik Torenberg: (36:11) are using GPT output from OpenAI and just imitate training on that. So they are getting somewhat close on some test domains, but they're not really that close to even, and the leaderboard will tell you this. And that's a blind head to head, user call the winner process. They're not that close even to the second tier 5% price of GPT-4 version of OpenAI's products. So just having that means there's not that much opportunity for people to come in and steal share on pure price reasons. They have not left the door open there all that much. So people could do it for maybe other reasons, control, fine tuning, ideology, data, not wanting to send data out over certain boundaries, whatever. But they've got a great commodity product in today's world. I'd say, it's pretty clear at this point, OpenAI and Anthropic are the 2 leaders in that category. Moat number 2 is branding and just trust. For all of the complaining that has been posted on the Internet about how ChatGPT is, to this, to that. I mean, and you hear it from all sides. Right? It's simultaneously depending on who is trying to embarrass it. It can be both, too white supremacist and too woke. And you can kind of, see that almost regardless, I think, of perspective, you can see unwanted biases or unwanted behaviors in it. But the alternative in the open source world is just way worse if you are a corporate customer. If you want a sort of radical free speech experience and you're not a major company, then you can go the open source route and do whatever you want. 
But if you are the kind of business that is thinking about maybe putting a chatbot into your product experience somewhere, you don't want it to get too adventurous. I was joking with a guy at OpenAI. I was like, yeah, these open source radicals really overestimate the corporate appetite for large language model adventure. Nobody wants their own personal Sydney experience. Nobody wants that kind of embarrassment. And when you can get good quality, pretty reliable service from OpenAI at $2 per million tokens, then the burden's kind of on you to figure out, well, why would you do something different? So I started saying that people used to say nobody got fired for going with IBM, and now that might be true for a few of these top players, because your AI might still embarrass you, but at least you can fall back on: look, I used the industry standard. These guys spent 6 months working on safety for GPT-4. What do you want me to do, boss? If we're gonna use this kind of shit, there's gonna be some risk, but I made the safest choice I could with OpenAI or Anthropic. So, again, that's a moat. Right? And Google's, I think, gonna get there soon too. Certainly, they have the trust and kind of gravitas that a corporate buyer would believe them when they say they have good standards in place. And, again, when it's so cheap, why am I gonna put myself, as a CIO or whatever, in a position where I don't really know exactly how this open source model was trained, or by whom, or to what degree it's really been battle tested? I'm gonna do that to save a little money? And by the way, am I even really saving money? Maybe, depending, but I'm certainly gonna be putting more man hours into it than I would if I was just doing the simple thing. So, I don't know. It sounds kinda tough.
Moat 3 is the feedback loop that they have, and this is where OpenAI is currently quite out in front, depending on exactly what you think about some of the alternative strategies. But nobody has the volume of LLM usage that OpenAI has, that ChatGPT has. Bing's doing some decent volume from what I've seen, but not as much as ChatGPT, and they're, of course, powered by GPT-4 anyway. So they're getting this data. They now have terms of service that say if you use them via the API, they will retain your data for a short time, then delete it, and not use it in training. But the free tier of ChatGPT is used in training by default, and you can opt out of that; they just added that as well. But by default, you're opted in, and if you wanna opt out, you can opt out. So they're getting more raw usage and feedback data than anybody else. And they have a well honed product development process that is humming, and others are finding that kind of hard to reproduce. Even somebody like Bing comes out and finds out, wow, we didn't expect that. And with some of those things, you're like, I don't know, there were some really pretty flagrant breaks in the Microsoft process. I've gone down that rabbit hole. They tested this thing for months in other parts of the world. They had users report in their forum the same kind of behavior from the bot that ultimately graced the cover of the New York Times. They failed to detect it because 1 part of the organization wasn't talking to the other or whatever. Last I checked, this was still online in the Microsoft forums: the person who responds, who works at Microsoft, seems to not know about the AI powered search experience at all, and doesn't even know what the person is talking about who is saying, your chatbot is accosting me. And that's all documented as of late 2022. Then they launched the thing in 2023, and then you have things that are as simple as the AI getting the date wrong.
The user corrected the AI on the date, and that was enough at launch for whole things to go off the rails. And I think it's huge. People should be very clear on the difference between a jailbreak, where you can trick ChatGPT into saying a bad word or whatever. That's a problem, but I've never seen ChatGPT turn on a user. That's a whole different level of failure of alignment: failure of control when the user is trying to break your control measures, versus failure of basically decent engagement with the user in the first place. So it's a qualitatively different thing. All that story is to say the product feedback loop is pretty important. This stuff does not just magically cohere into a well behaved AI by accident, or even with moderate effort. It seems to take a lot of effort. So OpenAI is just crushing it in that regard. Anthropic, again, has a unique approach with their constitutional AI system, where they use kind of self critique and synthetic data to basically try to get the same level of control as if they had the user base. And it seems to work quite well. So that may be something that other people could replicate. In fact, in some ways, they are ahead in terms of their safety profile, not their raw capability profile. But in terms of their safety profile, in some ways, I've seen them be ahead. I've also heard from others that maybe that's not the case. So it's probably mixed. The surface area is so huge that you can be ahead and behind at the same time in different areas, and that's almost certainly the case. But nevertheless, it does seem to work quite well. They also do a pretty good job of avoiding hallucinations. At 1 point, I think they did a better job than OpenAI did. With GPT-4, OpenAI has improved a lot, but previously, with ChatGPT versus Claude, you could see situations where ChatGPT would still make things up. I asked a question about a property tax in 1 Massachusetts town 1 time, and it made up a rate that was not the real rate.
It had all the structure and conceptual analysis of property tax generally right, but it made up a rate at the key moment that was false. Whereas Claude hedged with something like, "for example, if the rate were..." and just gave a nice, reasonable round number as a hypothetical. So there are some ways in which it does seem like the Anthropic method is at least competitive, if not, in some ways, maybe superior. So those 2 seem to be at the forefront of that. Actually, Character AI has a really good feedback loop as well. They're kind of a dark horse in this whole thing, playing a bit of a different game. But they're the only other 1 that comes to mind as having a real tight, at-scale product feedback loop. Google doesn't have it yet. They should be able to get it real quick. Whether or not they can really sand all the barnacles off their product flywheel, I guess, still remains to be seen. As of now, they've not done it. And if you look at the leaderboard, Bard is down the leaderboard, not because they don't have the raw horsepower, but because, at least as far as I can tell, they have not shaped the product in the same way that the best in that category have. They can match on things like medical question answering. GPT-4 and Med-PaLM 2 are both expert level on these benchmarks. So they can definitely get to the point where they can do high level stuff, but they don't have this unified product that does everything and knows how to handle all these different situations. Med-PaLM is dedicated to medical question answering, and it can't help you with whatever random search queries you have, or just be your general chat adviser. It is specialized and fully end-to-end fine tuned for that domain. And that arguably could be good. Somebody might say, well, jeez, do we really need these all purpose AIs? Maybe a bunch of specialist AIs could arguably be a better overall architecture. I think Google is definitely playing that both ways.
They're gonna have their general open chatbot that you can talk to about anything, but they're also taking Med-PaLM 2 to hospital groups and stuff like that. They're not even messing around with some base model there. They're only gonna take the good stuff. So moat 4 I kinda talked about in this thread. 4 is pricing power. I think we've probably covered that at enough length. $2 per million tokens. You gotta be using a lot of tokens. And another way I was thinking about this, and this is actually kind of interesting math: if you were to try to compete at that price point, you would have to serve 100 billion tokens to pay for 1 employee. They make $200,000 in revenue for 100 billion tokens served. So I'm rounding down from the "we're so back" San Francisco AI salaries to $200,000, calling that 1 employee. And by the way, whatever electricity and compute costs you have, Microsoft gets its share out of that $200,000 for 100 billion tokens too. I mean, 100 billion tokens is a ridiculous amount of stuff. They're training these models on 1 trillion tokens. That's the general ballpark of what a, not a GPT-4, but a LLaMA-type good open source project might train on today: roughly 1 trillion tokens. And people also talk about that as the scale of the whole Internet. So you're talking about generating text at maybe 10% of the scale of the Internet for $200,000. Just, who needs that many tokens? It's gonna be hard to build that many businesses competing at that level. I just don't see how you build another world class team that can hit at that level, only to then be like, okay, cool. Now we're here. Now who needs 100 billion tokens? And by the way, we need to find a lot of people who need 100 billion tokens each to have a chance. I don't know. It's tough. The volume of that really kinda blows my mind.
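That back-of-the-envelope math can be checked directly. The $2-per-million-token price and the roughly 1-trillion-token training corpus are the figures quoted in the conversation; everything else follows arithmetically:

```python
# Back-of-the-envelope check of the token economics discussed above.
# Assumes the quoted price of $2 per million tokens.
price_per_million_tokens = 2.00           # USD, as quoted in the conversation
tokens_served = 100_000_000_000           # 100 billion tokens

revenue = tokens_served / 1_000_000 * price_per_million_tokens
print(f"Revenue for 100B tokens: ${revenue:,.0f}")

# For scale: a LLaMA-class open source model trains on ~1 trillion tokens,
# so serving 100B tokens means generating ~10% of a full training corpus.
training_corpus_tokens = 1_000_000_000_000
print(f"Share of a 1T-token corpus: {tokens_served / training_corpus_tokens:.0%}")
```

So at that price point, serving a tenth of an Internet-scale corpus grosses roughly one San Francisco salary, which is the crux of the "who needs that many tokens" argument.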
I think there are things like mass summarization, just summarizing everything, just processing everything. This stuff is now cost effective for quality control. Right? I mean, every time you've been on hold and heard "this call is being recorded for quality control purposes," it is now cheap enough that you could actually implement that quality control on literally every call, perhaps. So I don't think we've, by any means, exhausted our imaginations when it comes to what we're gonna use these tokens on. And that 100 billion is input and output, it should also be noted. So just scanning through tons of shit and post-processing information, I think, is gonna be a huge trend. There are a lot of tokens in that, but still, how many businesses can compete at that kind of commodity layer when it's that cheap? It seems really hard to. I don't know. If I'm a VC, I'm not sure I wanna invest in that. Moat 5 we kinda talked about again as well, which is privileged access to cloud compute. You can only serve as much of this as you have compute for. There are starting to be compute shortages. I've heard from a friend at Google that demand for compute is now starting to bind a little bit. They've made 1 of the biggest decade-long investments in computing infrastructure, and they're starting to have to ration some stuff, it sounds like, at Google. The same was true, from what we heard from OpenAI, in the middle of last year. They were like, we have to make some choices on our product line, because even with our partnership with Microsoft, we just cannot scale our access to compute as fast as we would like. So there were a couple of different things that were happening at the time, but DALL-E 2 blew up. And they had kind of decided, yeah, we gotta delay the launch of some other product to just put all the resources behind this.
And that's the Azure cloud that they're building on. Right? So the moat is pretty apparent: they are already hitting capacity constraints after hundreds of billions in capital investment. I mean, that's about as self-explanatory as it gets as a moat, I suppose. You're gonna have a hard time accumulating anything on a similar scale. And that doesn't mean you can't build a business out there, but it definitely means they have a moat. Moat 6: GPT-4 itself. They're using GPT-4 in really interesting ways that are gonna give them advantage, and Anthropic has kind of said some similar things. By the way, going back to moat 5: Anthropic partnered with Google. Hugging Face partnered with Amazon. Cohere, I forget who they partnered with, but somebody. All of these leading labs that don't have the compute are entering into strategic partnerships for it. So there's kind of a musical chairs game going on there that isn't necessarily 1 to 1, but how many preferred model providers does AWS ultimately take on? Probably not that many, I would guess. Right? Anyway, coming back to GPT-4: if you wanted to make a case for why the leaders are gonna run away with it and widen the gap between themselves and open source, maybe 1 of the best answers would be that they already have these advanced models that allow them to scale all sorts of things that were previously really hard to scale. And OpenAI has given us a little bit of a glimpse into that with their recent interpretability publication, where they use GPT-4 to look at a model and try to figure out what the neurons are doing within it. Actually, I think they were using GPT-4 to look at GPT-2. But still, you've got all these neurons, and you don't know what they do. So how do you figure out what they do?
Well, they basically run a bunch of text through the model, keep track of what is making each individual neuron highly activated, and then pull that out and look at it in batches. In our TinyStories interview, which we just recorded, it's really apparent there. They have these small models, and when they do that process on these smaller models, the concepts actually kind of jump off the page and are very apparent. And you can see, oh, this 1 seems to be responding to animals, because it fires on dog and cat and bird, and, okay, I see a pretty clear category there. This thing fires when there's an animal. And they showed a bunch of examples of that. As models get bigger, that stuff gets more messy and hard to figure out. Some of the concepts remain quite clear and interpretable, but for others, you're looking at this like, okay, so this and this and this and this all caused this to activate at a high level. I'm not really seeing anything here that is super coherent, or an obvious concept that I could label. But they're using GPT-4 to automate that process. Even in GPT-2, there are already, I think, 1.5 million neurons or whatever. Parameters are different than neurons, so there are fewer neurons, but still plenty. To scale that, how else are you gonna scale it? Right? So you think about that kind of thing, and just the mega scale that they can apply to enriching these datasets, cleaning datasets. And they were talking about how it's not all just about scale; it's also about quality. Well, how are you gonna clean your dataset? Right? You're probably gonna go crunch through it with GPT-4. Anthropic has kind of lent some credibility to this notion with their pitch deck, leaked, I think, accidentally, which said something along the lines of: we think the companies that fall behind in the 2025-26 cycle maybe never catch up. And if you're like, well, what the fuck does that mean? That sounds kind of ominous. I think it is kind of ominous.
But if I had to interpret it, I would say it's that the models themselves become this engine of advantage, such that if you don't have access, you can't perform the next level of research at the same pace. And I do generally believe that the calls for regulation are pretty sincere, but that is also maybe where things start to diverge. You're like, oh, man, if certain things are required, if you have to perform some exhaustive check, and you can do it with GPT-4 at 1 level of scale, whereas if you don't have GPT-4 you can't do it, then that becomes kind of an interesting challenge. Maybe you could imagine a regulation where they are required to share certain capabilities with other developers, or something along those lines, so it's not like they can control the whole stack. But as it stands, by the way, their terms do not allow it. All of these models that have been trained on ChatGPT output basically violate the OpenAI terms. So if you actually did wanna go and commercialize that, and you're like, oh, look how smart I am, I took this open source model and trained it on ChatGPT output, now here's my business, they could just straight up sue you and probably win, because you just took a bunch of their output. Now, the good follow-up question there would be: what about humanity as a whole suing OpenAI for having taken all of our shit and running the first training process that created the model in the first place? How can it be that they're allowed to take all of the human data and create a model, and then prohibit you from taking from their model to train a downstream model? That does seem a bit weird. And I'm not sure that that exact position is ultimately gonna be tenable, for multiple reasons. For 1, it just sounds kind of insane. You're gonna have a hard time defending it.
And second, if you do want to say, we're not trying to slow down research, but research comes to a point where it depends on this very rapid, high quality processing of information to try to build good, reliable datasets. Or take China, which has just put out these guidelines, right, that people are basically saying are impossible to meet. And China does not appear to be racing into an LLM future. They may be racing into an AI-for-military future, I don't know about that. But in terms of putting chatbots online, they do not view that as the space race right now, as far as I can tell. On the contrary, they're more worried that it's gonna talk about Tiananmen Square or whatever, and they don't want that. So they're like, you as a developer are responsible for your shit. I would really recommend, there's a great Syndicate podcast on this recently with a couple of guests; we can find the link. A couple of China scholars do the reading of what the CCP has said about this, and they have issued statements. And they put these standards out there that are not easy to meet. The data that you use to do the training has to be "reliable" or have "quality." I mean, they're using adjectives. Right? So what does that even mean? It has to not violate anybody's intellectual property claims, and that legal regime, I don't think, is sorted out in China either. Your data alone has to meet these quality standards, per their statement. And then everybody's like, well, that's impossible. How could we have web scale data that meets those standards? And the answer is: if you own a cloud and you have GPT-4, then you can do that data cleaning next time.
And they're already to a point where they could probably even do the next model trained on almost pure synthetic data, by taking what is real, filtering it and transforming it into something totally synthetic, taking all that synthetic stuff, doing the training on that, and then being like, look, we're clear, because here's the entire dataset that we trained GPT-5 on. Now, there is still that link in the chain, right? It was all kind of made by GPT-4, which was in turn made with whatever. And eventually, it does get down to human data; obviously, you couldn't have gotten here without it. But I do see some potential for that kind of dynamic where it's like, okay, there's a new standard, your data has to be squeaky clean or whatever, and it's like, shit, now there is kind of a lock-in effect, because nobody outside can really do that unless they have this model, this speed factor, to power that kind of thing. And are they gonna share that? As of now, they basically say no, per the terms, but maybe they could be required to. Maybe they could change their minds. But, anyway, this is why GPT-4 is a moat: it does have qualitatively different ability that they might even be able to use to accelerate their own work. And as of now, nobody is forcing them to share that to accelerate others' work. They've also restricted the logits. Right? In the past, with GPT-3, you could go in and use the API, and when you got that API result, they would give you not just the 1 token that was chosen, but up to, I think, the top 5 most likely tokens, with the percentage that each was assigned in that prediction step. And under the hood, they've actually generated a number for all 50,000-plus tokens in the vocabulary, so they don't have to do any extra work to do that.
They're doing all that work for all 50,000 candidates anyway, picking 1, which could be the top 1 or could be semi-randomly chosen, but then they would return to you the top however many choices. And that was really useful if you wanted to study the model. It was also really useful if you wanted to train an imitator model, because it's way more information. To say "the token was 'the'" is 1 thing. But to say "the top token was 'the' at 47%, next was 'a' at 32%, then 'an' at 9%," and so on, you can learn a lot more from that much deeper level of disclosure, and they've now closed that off. There are no logits returned with GPT-4. So that's kind of the raising of the drawbridge a little bit. You could still get your GPT-4 outputs and try to train on them, but they've made it that much more difficult to do than it used to be. Moat number 7: team and talent density. They are definitely an absolutely killer team. There was just a new story where, apparently, Satya said to their head of research at Microsoft, how the hell did they do this with a couple 100 people? We've got all these people, and how are they kicking our butt so much? And that's probably a complicated question to answer, but no doubt they do have extreme talent there. And I've seen it in kind of every part of the organization as well. The folks that we interacted with when we were in that $2,500 a month consulting engagement: all very, very good. The business contacts know their technology in a way that you just can't sustain, I don't think, if you get really, really huge. The business guys that I've talked to don't have to check with the team. They know what's what. So I do think it's just an extremely strong organization top to bottom. Moat 8: insane distribution and partnerships. The customer list is growing rapidly. Here's a list of customers recently announced for OpenAI: Intercom, Wix, Morgan Stanley, Shopify, Khan Academy, Atlassian, Zoom, Brex.
That was just 1 little thread from 1 of the business guys there. They've also got a huge partnership with Bain, and I think they just formed another consulting partnership with another global consulting firm. They're already in the door at basically all of corporate America. So, again, you can try to sneak in that door before it closes behind them, but those sales processes are in process, if not already closed, and moving on toward model customization or what have you. Number 9: network effects. I mean, this one's a little bit different; it's definitely not network effects in the classic sense. We talked about this with Elad and Sarah a little bit back. The network effects in social media are certainly way stronger than network effects appear to be in AI. Sometimes people will ask me, how much lock-in is there? And I'm always like, honestly, there's not much lock-in. When we run stuff on OpenAI, we make an API call to OpenAI. That code is a few lines, even a 1-liner. You could flip to Claude in 2 seconds. You could flip to some other model in 2 seconds. It's really not that hard. Even with the fine tuning: you can fine tune on OpenAI's platform, and they don't let you download your fine tuned model and walk off with it. You still have to pay; you're essentially building it to rent. But once you've built that dataset, you can certainly take your dataset and go run it on an open source model. And that's something I've considered recently with Waymark, just out of pure curiosity, really. We're not really trying to save money at the moment, but I'm thinking, jeez, these open source models are kinda getting there. Maybe it would be worth just taking our dataset that we currently use on OpenAI and running it against 1 of them to see how it goes. Maybe it would be comparable. And, I alluded to this earlier but didn't really get into it that much: we have a high degree of developer control. Our task is super defined.
It's always the same formula, where we're saying: here's the video script structure that you have to follow, here's some information about the user, here's what the user just set at runtime, and your job is to spit out a completed version of the script structure that you're provided. That's how it works. When we fine tune into that, we're not supporting chat. We're not helping you write haikus. We're not doing anything else. It's that. The developer control there means we can be pretty confident that things can't go too far off the rails. If the language model starts to malfunction, the application just errors. It doesn't attack the user; the user won't even see the output, because it'll just break. So that developer control, and the predictability of the task, is such that we could do a fine tuned model even without all the safety bells and whistles and niceties that we get from OpenAI. We do get real value from those, but we could kinda live without them because of the definition of the task. But it's still not hard to change. You can just flip around any way you want, even on the fine tuning side. So are there super strong network effects? Not really. The biggest things are kind of maybe social, in the sense that everybody's kind of introduced to AI with these products, and the techniques to use them and the prompt engineering and the tools all get built for OpenAI first. Usually, they are built in a provider-neutral way pretty quickly, if not right out of the gate. But nobody doesn't support OpenAI with their first release of a new library or a framework or whatever. It's always gonna be OpenAI on launch. Maybe OpenAI only, maybe others included too. But there is some kind of gravity there; there's a reason that's the last on my list of moats. And it also kind of forces others into somewhat of a following position too.
It's like, if I have something that's working with OpenAI and then I'm thinking about exploring something else or switching, the first thing I'm gonna try is the exact same task. I'm just gonna literally copy and paste into the other thing. And if it doesn't work, then I'm gonna be like, oh, this kinda sucks. Maybe it could work if I re-prompt-engineered it, or read their prompt guide or whatever and did it their way, and that would be nice, but ideally, it's gonna work the first time, and all of those companies are gonna kinda feel that pressure. I mean, you can imagine being the CEO at a competitor. Right? You might think, well, people just need to read our prompt guide, then they'll know how to use our AI. It's like, no. They're not gonna do that. It's not gonna work like that. You have to make it easy. You gotta reduce that friction, and in their effort to reduce that friction, they end up in kind of a following position, I think, pretty often. The plug-in architecture is something that seems like it's gonna go that way, and is another area where you can potentially see some of these things: having the product dialed in, having the data scale to power a feedback loop. The plug-in architecture, in theory, is highly portable, so other people can adopt it. Microsoft has said they're going to adopt the same basic plug-in architecture that OpenAI introduced. So probably everybody's gonna have to be able to support that in some way, but there are a lot of ways that you could be worse at supporting that than them. And now you're just kind of trying to play their game, but you're playing from behind. It's gonna be hard to leapfrog them at their own game. It's gonna be hard even to catch up with them at their own game. And meanwhile, you're not developing your own game. Right? So I think that's gonna be tough.
For a lot of these reasons, I think companies like Character and Inflection, with their new Pi AI, are notable exceptions to all of this analysis, or at least possibly, because they're trying to do something different. When you go talk to Pi, it's not a butler-style "What can I do for you? Here it is, hope it's helpful." It's a much more open-ended, exploratory dialogue. And that may emerge as a totally different lane that's currently kind of unfilled. I don't think many people go to ChatGPT for companionship, and OpenAI almost discourages that in their approach. It'll chat with you, but it frequently reminds you that it's an AI in ways that are not super conducive to an immersive experience. Whereas these other ones, they don't deceive you about being an AI either. I think they're all pretty defensible product designs from what I've seen so far. But they do engage you in a different way, and they could just end up being a different product category in the end.

Nathan Labenz: 1:13:01 Hey, we spoke to Elad and Sarah, a couple of VCs, about how they're approaching the space in terms of what to invest in and what not to invest in. And I'm curious, given we just ran through these moats, and that's what people are talking about when they talk about moats: what's investable and what's not investable?

Erik Torenberg: 1:13:18 Okay. I think the sophisticated analysis right now generally agrees that Salesforce should be fine, because they've got a ton of shit built out, and a ton of distribution and a ton of contracts and a ton of everything. And they can layer on a copilot before you can build a worthy rival from scratch, no matter how smart your use of GPT-4. So that, I think, is happening. And Adobe is in a similar position, in a way that's more on the visual side than the text side, although it could be both. There's been a ton of amazing stuff, but nobody's really canceled their Adobe accounts. And now Adobe is announcing their own amazing stuff. And if people have to pick, it seems like they lean toward Adobe. I just saw a really interesting thread from this guy named Riley on Twitter who said: Midjourney is awesome, it's always been totally groundbreaking, but if I was going to tell somebody what to learn today, I'd tell them to use Adobe Photoshop, because Midjourney is kind of this black box. You get things out of it, and they can be awesome, but with Adobe you can really make what you want. And they have both the generative layer and the dig-in-and-edit layer now. So again, they should be fine. It's going to take a long time to build up a credible rival to Adobe. I would be very wary of a thesis that was: pick an incumbent and say, we're going to beat them because we're going to use AI. I don't think that's going to work. They all know about AI, and it's not that hard to implement. And in fact, in some ways it's easier when you have a big platform, because you also have extensive documentation. So the existing models already have a decent sense of a lot of these mega platforms. You can even see it in the marketing copy thing, right? You can ask ChatGPT to write you a tweet or write you a LinkedIn post. It knows what the difference is between those in general form and tone.
It's not going to know how to do that for a social media company that doesn't exist yet. And the same thing is true for whatever technology you're pursuing: if it's all well documented out there, you can probably get pretty far with ChatGPT just off the shelf. You don't have to do any special training. And I think they will do special training, by the way. Salesforce is going to have their own copilot that's not default GPT-4. It'll be better than that. I don't see any reason they can't. They've got documentation, probably a trillion tokens of documentation and forum posts and tickets and all that shit. They have plenty to work with. So, bottom line, it seems like the incumbents should mostly be fine, and I would be pretty skeptical of a thesis that's: we're going to take on this incumbent, and we're going to win because we're going to use AI. I just don't see that path for most things, unless you really felt like an incumbent was flagrantly dropping the ball. I think the more exciting or hard-to-predict stuff is what we also talked about with them: what are the new things that just couldn't ever have existed before, where there is no incumbent and it's a totally new market? So far, we've got the AI chatbot as a new category, and text-to-image as kind of a new category. We don't really have that many new categories yet. So there are a couple of things that are like, holy shit, you just couldn't do that before, that have already come online. But I think that stuff is mostly still in the future, because, again, GPT-4 has only been out for a couple of months. And probably a lot of the people who come up with those crazy ideas won't even be the AI developers, in the same way that Uber was developed by different people than the iPhone itself. Travis probably couldn't have done the iPhone.
But a company like Apple could never have done what Uber did. There's probably some kind of similar dynamic here. What gets built on this new platform that's just radically different? Who knows? One thing I am keeping my eye on is this kind of improved-discourse space. You can even connect this to crypto, which I seldom do, but think about the smart contract, right? Can you take a language model that's kind of a dispute resolver, where you can cryptographically guarantee that you are in fact getting the model that was agreed upon at the time the contract was signed, and let it resolve stuff and handle disputes and issue refunds accordingly, or manage little escrow accounts? That kind of stuff I do think should happen. And the cost of the possible unfairness, the AI going against you, seems to me like something people will ultimately just accept, because it's going to be so much cheaper and better than anything else. Your alternative in today's world is to go to small claims court or something. Good luck with that. You might as well just sign on to some sort of crypto consumer arbitration scheme. It's better than any alternative, even if it's not perfect, even if you don't get the justice you want every single time. Stuff like that, I think, will be big. I do think these new relationship paradigms are going to be big too, though I'm not excited about that myself. I don't want an AI friend, and this is potentially the max extent of amusing ourselves to death. So I don't know that it's ultimately healthy for individuals or society. But maybe it could be. Maybe we just haven't seen the final right version of it yet. When we talked to Eugenia from Replika, I was like, man, the revelation here is how badly a lot of people need companionship, because to get as far as she had gotten pre-GPT-3, that's not even an AI story. That's a society story.
And now you're going to inject an AI story into that society story, and it seems like it can easily go 100x. I don't see any reason that the percentage of people who have an AI friend doesn't totally explode. And maybe it even could be good. For a lot of people, it probably would be good. Where it starts to be a problem is, well, who knows? It could probably start to be a problem in a lot of ways. But for me, the last thing I want is for that to crowd out my real relationships. I probably already spend too much time podcasting and thinking about AI relative to what I ought to be doing, and I don't want to make that any worse with some AI friend in my pocket. But I bet that, just like social media, it's kind of addictive. Not all good, not all bad, but it definitely crowds out other activities. I think this is probably also kind of unavoidable. And when Sarah mentioned it, she was like, look at the usage of some of these things. I think she mentioned maybe both Character and Replika. You can start to see it. So from an investment standpoint, that could be really good. My concern with investing in Replika right now, and I don't mean to cast aspersions, would be less that they won't be able to grow users and more: do they have the right vision here for what kind of impact they're going to make? And again, that's not to suggest that their vision is wrong, but questions are probably coming up faster than they can answer them. I think that was definitely a takeaway from that episode, right? This is all happening pretty quickly, and even as long as she has been in this game, it didn't seem like she was fully prepared for it. And again, not to blame her; it came up on us all pretty quickly, but she just is the one who happens to be in that seat. That's a challenging seat to occupy. Again, from an investment standpoint, where do you go invest?
Google's going to be tough to beat, and so are Microsoft and their distribution. All these hospital systems are already a customer of Microsoft or Google, right? They all get docs from one of those two places, or Epic. Again, the distribution seems like it's going to be hard to beat. When I think about Neal Khosla and Curai, I do expect them to be successful. I do think they will grow. They've also already got some distribution, because they've been in the game for a while, and this was not a company that was started to take advantage of GPT-4. So I think they'll be on a good path at least for a while, but that probably means their past investors do well. It's still not super clear that their next-round investors do super well, because, again, the prices get low. So how do you make that much money? I don't know. It's just tough. When prices drop 97%, and the biggest corporations in the world are ready to run at or near cost for the foreseeable future, and they already have all the distribution, I don't know how much share you can really hope to get. So what's going to be transformative there? I kind of sense that it's more like task automation for a while than a totally different thing. Now, you can imagine somebody bringing forward the true AI doctor: no humans involved, taps into all your sensors. There could be a totally transformative form factor that's just next level, where it's like, shit, that's so much better than even a thorough application of GPT-4 through our current system. It's hard to imagine what that looks like, but you can at least conceive of it. But when it comes to reviewing the doctor's notes and making sure everything's double-checked and whether there was any possible drug interaction?
That stuff seems like it happens in the current systems for the most part, because it's really easy to integrate those APIs wherever they need to be integrated. So I don't know. I wouldn't take a job right now as a salesperson for some new Doc.ai that's on the scene saying, look, we've done whatever exactly. But there was an interesting company that just came out. Was it Hippocratic.ai? They said they've trained only on trusted medical information, and I'm assuming they get that out of existing medical record systems. And they call out that other systems, trained on the whole Internet plus maybe some specialized data, have wrong shit in them, and that's one way they want to differentiate themselves. That seems like it could be true. It could be a real advantage, though I don't know. And Neal had a couple of interesting points, where he said: if the patient says they had cereal, that probably also means they had milk, and you need that kind of general knowledge, which is probably not represented in these medical systems, to figure out all those associations. He also said, very Silicon Valley: if the person went to Burning Man, they probably inhaled a lot of dust. And you aren't going to know that if you're just trained on this medical data, I wouldn't think. The Mosaic guys also said almost all their customers train on a combination of open source and proprietary data. Maybe Hippocratic is doing that too, but they seem to be saying they're not. So anyway, that's one big question I have. Another is that they take a very interesting position, saying: we don't believe language models are safe enough for diagnosis, so we're focusing entirely on non-patient-facing, back-office shit, which is a plenty big market. And again, I think they'll probably be successful. Do they dominate the space? I don't know. It feels like, in the end, no.
I have to guess that the big Microsoft/Epic kind of complex ultimately still takes most of that market, and it's tough to get your little green shoot up through the hornet's nest of all the bullshit you have to push through.

Nathan Labenz: 1:26:27 This is great. I'll let you go. Thanks for a great conversation, and we'll talk next week. Omneky uses generative AI to enable you to launch hundreds of thousands of ad iterations that actually work, customized across all platforms, with a click of a button. I believe in Omneky so much that I invested in it, and I recommend you use it too. Use Cog Rev to get a 10% discount.
