Why a16z Built a Town for AI People
Nathan discusses AI Town, a virtual space where AI agents socialize, and its origins with a16z x Convex project members.
Watch Episode Here
Video Description
In this episode, Nathan sits down with three members of the a16z x Convex AI Town project: Yoko Li (Partner, a16z), Martin Casado (GP, a16z), and James Cowling (CTO, Convex). AI Town is a virtual town where AI agents live, interact, and socialize. They discuss how AI Town originated from Yoko’s companion app project, unpredictability as a feature in LLMs and interacting with models like they are lifeforms, and why they chose JavaScript and Convex to build AI Town. If you're looking for an ERP platform, check out our sponsor, NetSuite: http://netsuite.com/cognitive
📣 CALL FOR FEEDBACK:
To borrow from a meme… we're in the podcast arena trying stuff. Some will work. Some won't. But we're always learning.
http://bit.ly/TCRFeedback
Fill out the form above to let us know how we can continue delivering great content to you, or send the feedback on your mind to tcr@turpentine.co.
TIMESTAMPS:
(00:00:00) - Episode Preview: Intro to AI Town and the idea of AI as companions
(00:05:29) - Overview of AI Town and the simulation framework based on the Stanford Generative Agents paper
(00:08:24) - Yoko explains how the idea for AI Town originated from the companion app project
(00:10:41) - Yoko discusses how she built the initial AI Town prototype and wanted to make it multiplayer
(00:12:31) - The simplicity and elegance of the AI Town codebase
(00:13:52) - Interacting with LLMs is like interacting with lifeforms
(00:15:47) - Sponsors: NetSuite | Omneky
(00:18:25) - How Convex built a server-side game engine for AI Town
(00:19:28) - How Convex makes building a game engine easy with transactions and database support
(00:23:39) - James emphasizes the power of functional programming paradigms like Convex for building AI apps
(00:25:02) - Using simple JavaScript so anyone could understand and extend AI Town
(00:28:39) - The group reflects on how JavaScript has become so powerful compared to languages like C++
(00:30:23) - How AI coding assistants were used in building AI Town
(00:31:22) - No open source code for the Stanford paper when they started
(00:33:25) - The interplay between programmer and AI model
(00:38:01) - Martin draws a distinction between using formal languages vs. natural language
(00:39:52) - Unpredictability as a feature in LLMs
(00:43:21) - The balance between formal language and unpredictable behaviours in LLMs
(00:43:59) - AI Town’s future and the beauty of the community
(00:48:27) - Are we living in a simulation?
(00:50:38) - Advice for other developers in AI
(00:54:29) - AI Town is a community project to be extended on
LINKS MENTIONED:
- AI Town (try it out!): https://www.convex.dev/ai-town
- AI Town Github: https://github.com/a16z-infra/ai-town
- Convex: https://www.convex.dev/
- The Sea of Tranquility: https://www.goodreads.com/en/book/show/58446227
- Generative Agents, Interactive Simulacra of Human Behaviour: https://arxiv.org/abs/2304.03442
X/TWITTER:
@stuffyokodraws (Yoko)
@martin_casado (Martin)
@jamesacowling (James)
@realaitown (AI Town)
@labenz (Nathan)
@eriktorenberg
@CogRev_Podcast
SPONSORS: NetSuite | Omneky
NetSuite has 25 years of providing financial software for all your business needs. More than 36,000 businesses have already upgraded to NetSuite by Oracle, gaining visibility and control over their financials, inventory, HR, eCommerce, and more. If you're looking for an ERP platform ✅ head to NetSuite: http://netsuite.com/cognitive and download your own customized KPI checklist.
Omneky is an omnichannel creative generation platform that lets you launch hundreds of thousands of ad iterations that actually work customized across all platforms, with a click of a button. Omneky combines generative AI and real-time advertising data. Mention "Cog Rev" for 10% off.
Music Credit: GoogleLM
Full Transcript
Yoko Li (0:00) So the previous project we've coded up, which is called Companion App, was based on this idea that AI can be served as companions and then as a human you can talk to it about your problems or maybe have a conversation, learn things from the AI. The thing I did notice is like when you try these things firsthand, you give these like AI companions life by giving it more details, like what do they like? Do they prefer Diet Coke or real Coke? You know, it's like all the nitty gritty details you would have in like actual friend or human. And then over time, since we implemented the memory implementation, they really remember everything, like every interaction you had with it. So that project, after we launched it, obviously the community kind of came together, came around it. And then we started to get questions around like, okay, this is great to have 1 on 1 conversations. What about having the AI as part of a group? Like, what happens when you kind of play games with the AI?
Nathan Labenz (0:55) Hello, and welcome to the Cognitive Revolution, where we interview visionary researchers, entrepreneurs, and builders working on the frontier of artificial intelligence. Each week, we'll explore their revolutionary ideas, and together, we'll build a picture of how AI technology will transform work, life, and society in the coming years. I'm Nathan Labenz, joined by my cohost, Erik Torenberg. Hello, and welcome back to the Cognitive Revolution. Today, we have a fascinating conversation about how quickly AI research ideas are moving to practical implementation and what that might imply for a world full of AI populated simulations. Programmers and partners at a16z, Yoko Li and Martin Casado, teamed up with James Cowling, CTO of development platform Convex, to create an interactive simulation of human behavior using AI agents, which they call AI Town. This implementation was based on the recent Stanford paper, Generative Agents: Interactive Simulacra of Human Behavior, in which researchers introduced a new simulation framework. It was headlined, in my view, by a novel memory structure that combines direct observational memories with higher level reflection memories, which are periodically generated over time, such that the AI agents were able to effectively navigate their virtual town, have coherent conversations with one another, and even make and execute plans together. The open source version of this project now runs on TypeScript development platform Convex, where a single shared runtime processes all the many LLM API calls and other asynchronous events that come together to create the virtual town. Coming into this conversation, I imagined that this project must reflect an investment thesis at a16z related to AI simulations. But what I learned is that this project is actually just more playful and experimental than all that.
Because modern development platforms like Convex make it so easy to build such things, it now often takes months, weeks, or even just days for academic research to start popping up in applications. And I do see this as a sign of things to come. It's been maybe a year since LLMs really started passing the Turing test. And already, people spend hours a day on Character.AI, where the number one use case, by the way, is romantic role play. And meanwhile, Inflection's Pi has exchanged 1 billion messages with users. That's roughly 10 million per day in its first hundred days. Now we're seeing speech synthesis and video deepfakes that are sometimes legitimately hard to distinguish from the real thing. So with simulation frameworks like AI Town becoming more accessible, it seems almost inevitable that we'll soon have a wide range of simulated AI populated worlds to explore. As always, often for better, but in some cases, for worse. Now if you wanna understand the technology underlying these developments more deeply and you haven't already, I really suggest you check out my AI scouting report, which is available on the Cognitive Revolution YouTube channel. We made this episode a YouTube only feature because the visuals are really critical. And while I think this is some of the most valuable content that we've created and our reviewers seem to agree, part 1 just hit 3,000 views, which is significantly less than our typical podcast downloads. So please check it out. And if you have a friend that's trying to get up to speed on AI as quickly as possible without relying on misleading metaphors, maybe send it to them as well. Now without further ado, I hope you enjoy this glimpse into the rapid development of AI technologies with Yoko Li, Martin Casado, and James Cowling. Yoko Li, Martin Casado, and James Cowling, welcome to the Cognitive Revolution.
Yoko Li (4:35) Thank you.
Nathan Labenz (4:35) Excited to talk to you guys. So just to set the stage, Yoko and Martin are partners at a16z, where you guys focus on infrastructure and AI. And James, you're the CTO of a company that a16z has invested in called Convex, which is a development platform emphasizing TypeScript and the incredible ease of making some magic happen. You guys have kind of collaborated recently on an implementation of this project that made a huge amount of waves, at least in my part of the world, not too long ago, which was this simulated town. And people have probably seen these little images or seen the story of, oh my god, researchers set up this thing where all these little virtual people walked around and talked to each other and kind of made plans and then got back together and executed on their plans to some extent. And it is kind of crazy to think, boy, it's getting to the point where we can simulate some pretty sophisticated behavior. And I think this just has so many different angles that I'm excited to unpack with you guys today. So maybe for starters, I'll ask Yoko and Martin, it seems like if I understand a16z correctly from an outside view, 1 of the firm's big ideas is that you're going to take huge swings on kind of unpopular theses, try to get to things before they go mainstream. And that some of these might look crazy at the time that the investments are made and some might even still look crazy later on downstream. But it seems to me that you guys are kind of working on a big crazy idea here, which is that we are headed for a ton of simulations and kind of virtual agent populated environments. I guess, I'd love to kind of just get a little bit more of your deep thinking to get started on what is it that you kind of see coming for us in this vein of simulations, and why are you so interested in starting to build the foundations for such simulations?
Martin Casado (6:36) So the firm kind of has two, I think, underlying aspects of the philosophy. One of them is it's always really focused on operators when it comes to hiring partners, and so a lot of us were software developers or founders or, you know, were associated with product in some way. And the second one is there's this general feeling that consensus is risk averse, so you certainly want to go to areas where there isn't a lot of consensus. I will say when it comes to this AI move, it's probably worth knowing that Marc Andreessen, who was the founder of the firm, did Netscape, which was the browser, which is a very similar type of movement. At the time it was created, people were like, oh, this is not useful, this is wild, it's just a bunch of kids and they're on the Internet talking to each other or playing video games or whatever. It was kind of treated like a bit of a, you know, I don't know. It was like kind of a hobbyist thing. You know, I'm not sure. Maybe James remembers, but, like, Eric Schmidt, when he was the CTO of Sun, he actually banned the browser because he was like, people won't get work done, right? And so it's very obvious in these new waves. It's like the enterprise and, I think, the more mature aspects of the industry tend to eschew it and whatever, but then you end up having these very interesting, often creative movements that rise up, and so we saw this very, very early with the AI wave when we've been involved. Now, I want to make the point that actually the creation of AI Town itself was actually a hobby project by Yoko, and so I thought it was maybe worth Yoko going through kind of like, you know, where this came from, because it's turned into a big thing, but originally it was like, you know, Yoko, you know, at home.
Yoko Li (8:20) Yeah. I can shed more light on how this project kinda got started. I mean, at a16z, so kinda like what Martin said, all of us are developers. So, really, we kinda spent weekends and nights kind of like at home coding our projects. So the previous project we've coded up, which is called Companion App, was based on this idea that AI can be served as companions and then as a human, you can talk to it about your problems or maybe have a conversation, learn things from the AI. So we created quite a few personalities in Companion App. And then one of them is called Alice, and I text Alice all the time from the app I built.
Martin Casado (9:02) So we're making this app for everybody to play with, right? And Yoko got so attached to Alice, her friend, that she didn't want other people to be able to talk to Alice. So she took Alice out of the app.
Yoko Li (9:14) Yeah. So I ended up creating, like, a male version of Alice, which is Alex. The thing I did notice is like when you try these things firsthand, you give these like AI companions life by giving them more details. Like what do they like? Do they prefer Diet Coke or real Coke? You know, it's like all the nitty gritty details you would have in like a friend or human. And then over time, since we implemented the memory implementation, they really remember everything, like every interaction you had with them. So that project, after we launched it, obviously the community kind of came together, came around it. And then we started to get questions around like, okay, this is great to have one-on-one conversations. What about having the AI as part of a group? Like what happens when you kind of play games with an AI? So that's where I started to think about, okay, how can we create an environment so the AI companions can talk to each other as well as talking to humans? So my husband and I, my husband is also, like, a programmer. So we started hacking around this on the weekend, built a prototype using Phaser.js. Thank you, Phaser. And I was building like a client side version of the implementation of Generative Agents, which is like a really great paper fleshed out by Joon from Stanford. Everyone should check it out. And then the next question that came to us was, okay, so now it's great to have kind of a single player mode where agents can walk around, talk to each other. What's the implementation for a multiplayer mode? How can we enable this as like a product or a platform that other people can, you know, fork, build their own AI Town around it, and then even have it multiplayer ready? So that was the genesis of the project. And then, later, kinda Martin and James and the whole Convex team kinda came around it and helped us to kinda take the project to the next level. And we open sourced it.
It's MIT licensed, so you can really fork it and then use it for any commercial projects, what have you.
Nathan Labenz (11:15) So there's so much to follow-up on there. I mean, I think there's kind of 2 levels at least for this conversation. 1 is kind of the behavioral, psychological future of relationships. And then there's kind of the technology that underpins that. And I think both are super interesting. But it is striking right off the bat that we're so early in this paradigm and there are so many kind of fundamental things that are still in the process of being worked out, like memory, which you mentioned that obviously language models by themselves don't have these kind of intermediate stage memories that are so important to our ongoing sense of like self and ability to relate to each other. But that is starting to be worked out with projects like this. But even as early as we are, it's extremely compelling to people who are not extreme people in other dimensions. I think people are all too quick to sort of assume that this is like a fringe phenomenon. And I'd love to just hear more about kind of what value you're getting from this, what kinds of interactions you're having. Because as much as, like, the future of the technology, I think the future of ourselves and just kind of imagining how we might also start to follow you in interacting with these AIs is is super fascinating.
James Cowling (12:31) I'm not gonna make it through any podcast without talking about simplicity and how much I admire the concept of simplicity and elegance. And I think people should go look at the source code for for AI Town. There's a lot of ideas in AI Town that in themselves are not complicated, but they're sophisticated. Right? So the way memory is represented in in AI Town is not complicated. Right? You and we can talk in more detail, but you have a conversation, you record it in a log, every now and then you ask OpenAI to summarize that memory, and then you kind of use vector search and embeddings to come and, you know, use these to prompt future conversations. These are relatively kind of simple building blocks and simple kind of elegant ideas. Now, something being simple doesn't mean it's easy to come up with, right? And props to Yoko and the team for coming up with these abstractions. I think what's particularly interesting about something like AI Town is at its heart, it's a very simple project. It's doing a few basic things, and then you have something that looks like, you know, emergent human crowd behavior. And, you know, we can insert human actors in that as well. So I think what's just really I think anyone who spends time building AI apps, I spent the weekend doing it because I was, you know, feeling excited, you're really struck with how quickly you can build something really shockingly useful, impressive based on relatively simple primitives.
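The memory scheme James outlines (log each conversation, periodically summarize, then use embeddings and vector search to pull relevant memories into later prompts) can be sketched roughly as below. This is a minimal illustration and not AI Town's actual code: `embed` is a toy bag-of-words stand-in for a real embedding model, `summarize` is a placeholder for the LLM call, and the `MemoryStore` class and its parameters are hypothetical names.

```typescript
// Sketch only: embed() and summarize() are hypothetical stand-ins for an
// embedding model and an LLM call. They are not AI Town's or OpenAI's API.

type Memory = { text: string; embedding: number[]; kind: "observation" | "summary" };

const DIM = 64;

// Toy embedding: hash each word into a fixed-size bag-of-words vector.
function embed(text: string): number[] {
  const v: number[] = [];
  for (let i = 0; i < DIM; i++) v.push(0);
  for (const word of text.toLowerCase().match(/[a-z]+/g) ?? []) {
    let h = 0;
    for (let i = 0; i < word.length; i++) h = (h * 31 + word.charCodeAt(i)) % DIM;
    v[h] += 1;
  }
  return v;
}

// Cosine similarity between two vectors; 0 if either vector is all zeros.
function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return na === 0 || nb === 0 ? 0 : dot / Math.sqrt(na * nb);
}

// Placeholder for "ask the model to summarize": here it just joins the lines.
function summarize(lines: string[]): string {
  return "Summary: " + lines.join(" / ");
}

class MemoryStore {
  private memories: Memory[] = [];
  private pending: string[] = [];
  private summarizeEvery: number;

  constructor(summarizeEvery = 3) {
    this.summarizeEvery = summarizeEvery;
  }

  // Record a conversation line; every few lines, also store a summary memory.
  record(text: string): void {
    this.memories.push({ text, embedding: embed(text), kind: "observation" });
    this.pending.push(text);
    if (this.pending.length >= this.summarizeEvery) {
      const summary = summarize(this.pending);
      this.memories.push({ text: summary, embedding: embed(summary), kind: "summary" });
      this.pending = [];
    }
  }

  // Vector search: return the k stored memories most similar to the query.
  recall(query: string, k = 2): Memory[] {
    const q = embed(query);
    return this.memories
      .slice()
      .sort((a, b) => cosine(q, b.embedding) - cosine(q, a.embedding))
      .slice(0, k);
  }

  size(): number {
    return this.memories.length;
  }
}
```

In a real application the recalled memories would be pasted into the prompt for the next conversation, and the toy functions would be replaced by calls to a hosted embedding model and a vector index.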
Martin Casado (13:52) These models really are kind of life forms in their own way. And so if you're building a project like this, there's kind of two considerations. Right? And we should talk about both of them. The first consideration is you have these life forms, and it really is a brain, and you're interacting with this brain, this is the model, and so much so that they surprise you all the time. So for example, I once missed some quotations on my string that I was passing to it, I accidentally passed it some code as opposed to a conversation, and it started commenting on my code. You don't really know when you're putting in these things what you're going to get out. So I actually paid my way through college building video games, so this was kind of like I did game development back a long time ago. It's been a very long time, say 20 years ago. And at the time, everything was an algorithm in the code. Like, how do you go from A to B? Like, how do you interact with things? Whatever. It's like this kind of, you know, big if else tree. But when you're dealing with these models, they're like these big brains, and so they do the work. So you can tell the model, like, what do you want to do? Like, how do you want to get over here? Like, how do you want to interact with these things? And so it really changes the way that you interact with software, and then you get these emergent personalities that you actually get attached to, just like Yoko did, and so that changes the way you think about software. But what it doesn't do is it doesn't change your need for a lot of the basics. Like you still need transactions, you still need to have the state go global, you still need to be able to kind of work with multiple users in different places, you need strong guarantees, you need all the mechanics, right? Even though the AI is relatively simple, the app's demands are really real, and that's kind of where Convex came in.
So I think maybe for the listeners, it's probably worth, James, just going to the background of what Convex is, and then we can explain why we thought it was the best platform for this.
Erik Torenberg (15:47) Hey, we'll continue our interview in a moment after a word from our sponsors. If you're a startup founder or executive running a growing business, you know that as you scale, your systems break down, and the cracks start to show. If this resonates with you, there are 3 numbers you need to know: 36,000, 25, and 1. 36,000: that's the number of businesses which have upgraded to NetSuite by Oracle. NetSuite is the number 1 cloud financial system, streamlining accounting, financial management, inventory, HR, and more. 25: NetSuite turns 25 this year. That's 25 years of helping businesses do more with less, close their books in days, not weeks, and drive down costs. 1: because your business is 1 of a kind, you get a customized solution for all your KPIs, in 1 efficient system, with 1 source of truth: manage risk, get reliable forecasts, and improve margins. Everything you need, all in 1 place. Right now, download NetSuite's popular KPI checklist, designed to give you consistently excellent performance, absolutely free at netsuite.com/cognitive. That's netsuite.com/cognitive to get your own KPI checklist. Netsuite.com/cognitive. Omneky uses generative AI to enable you to launch hundreds of thousands of ad iterations that actually work, customized across all platforms, with a click of a button. I believe in Omneky so much that I invested in it, and I recommend you use it too. Use CogRev to get a 10% discount.
James Cowling (17:10) Absolutely. So Convex is a full stack development platform. And, you know, the Convex founding team and a lot of the Convex engineers have spent a lot of time building very large infrastructure. And there's a lot of commonalities you see in very large infrastructure, and certain ways to build infra that just works and certain ways to build infra that gets complicated and falls apart over time. Convex is a platform for folks who just wanna build applications, front end developers, empowering them to build, like, scalable industrial strength applications in TypeScript. And so my pitch is always going to be if I see a domain where there's a really good idea, and AI is such a classic case because in so many AI apps I see, the hard part of the AI app is not necessarily the AI, it's the app. Hey, I have a great idea. I'm going to fetch some embeddings and search them. The hard part is building the rest of the application. And so where Convex comes in is, you know, hey, here's an amazing idea from the Stanford paper and Yoko turning it into like, you know, a real live simulation. How can we turn that into a real scalable application really fast? And that's where Convex came in. AI Town's built on Convex. That's the framework it uses. The data is all stored in Convex. Convex, you know, scheduled actions, you know, run the application.
Yoko Li (18:25) The Convex team actually kinda created their own server side game engine for this project. So when we got started, it was like a very simple client side thing where all the logic is, you know, happening in your browser. So the problem with that is that you can't possibly run like a multiplayer world with it. You would just have to keep getting updates from Phaser, which is like a great framework, but we wanted something where the logic is shared globally. And Ian, who's the developer advocate, Ian Macartney from Convex, he came in and then we kind of bounced ideas around. Okay, how do we architect this so that everyone doesn't have to build their own game engine, can just take something simple, use it, maybe even extend it one day. So Ian really created this server side game engine that, at the end of the day, is a loop. All game engines are loops. This loop was backed by a very powerful database, which is Convex. My takeaway from this whole experience is that writing a game engine was so easy with Convex because all the transactions, all the journaling, it was taken care of by the back end. So all we needed to do is to run a loop. So my simple JavaScript brain, I mean, I'm a Python developer and a backend developer too, but recently I just really love JavaScript. My JavaScript brain is that anyone can create their own game engine nowadays with Convex because you don't have to worry about these transactional semantics. And then what's more interesting to me is that for the game engine, you can really just start extending it on top of Convex. So say tomorrow you want to create like a cat town, you want a different set of interactions. You can just fork the repo, add more loops on top of it, and Convex will just take care of the rest of it.
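The "all game engines are loops" idea can be illustrated with a small sketch. This is a generic tick loop under stated assumptions, not the engine Convex built for AI Town: the `World` and `Player` shapes and the `stepPlayer`, `tick`, and `runLoop` helpers are hypothetical, and the world state lives in a local variable where a real backend would read and write it through a transactional database.

```typescript
// Sketch only: a generic tick loop, not the AI Town or Convex implementation.
// All type and function names here are hypothetical.

type Vec = { x: number; y: number };
type Player = { name: string; pos: Vec; dest: Vec; speed: number }; // speed in tiles per second
type World = { time: number; players: Player[] };

// Advance one player toward its destination by at most speed * dt tiles.
function stepPlayer(p: Player, dt: number): Player {
  const dx = p.dest.x - p.pos.x;
  const dy = p.dest.y - p.pos.y;
  const dist = Math.sqrt(dx * dx + dy * dy);
  const maxStep = p.speed * dt;
  if (dist <= maxStep) {
    // Close enough: snap to the destination.
    return { ...p, pos: { x: p.dest.x, y: p.dest.y } };
  }
  return {
    ...p,
    pos: { x: p.pos.x + (dx / dist) * maxStep, y: p.pos.y + (dy / dist) * maxStep },
  };
}

// One tick: a pure function from the old world state to the new one.
function tick(world: World, dt: number): World {
  return { time: world.time + dt, players: world.players.map((p) => stepPlayer(p, dt)) };
}

// The "engine": take the current state, apply a tick, keep the result, repeat.
function runLoop(initial: World, dt: number, ticks: number): World {
  let world = initial; // a real backend would load and save this state in a database
  for (let i = 0; i < ticks; i++) {
    world = tick(world, dt);
  }
  return world;
}
```

Keeping `tick` a pure function of the previous state is one way to make each step easy to run inside a transaction, which is the property Yoko credits the backend with handling.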
Martin Casado (20:08) What AI Town is is it's a virtual town, a virtual environment. It's like a level, imagine, and it's got potentially houses, whatever you want to put on it, and it's got these characters. Each character is the front end to a model, and the character has a backstory. Your name is Bob, you're grumpy, you're a gardener, you don't like talking to people, or your name is whatever, Paul, you've got something to hide, and every once in a while it slips out, you know? Then these characters just walk around with their backstories, and whenever they get close, then they start talking and they start interacting. And then they have these conversations, and the conversations are not in any way written in the code. This is just what the models will say to each other. And then as they have the conversations, there's two things they do. One thing they do is they'll save the conversations so that they can have some memory of interactions, but they also reflect on what has happened, and they reflect periodically. And then when they reflect on things, they come up with higher level ideas about it. So anybody that wants to build one of these simulations could pick up AI Town, and as Yoko said, could change the tile set, change the level, or change the characters, change the character backstories, or extend it in any way. So it's really intended to be a very general framework for anybody that wants to experiment and build an interactive AI simulation. That's the basic idea of AI Town.
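The mechanics Martin describes (characters with backstories that start talking when they get close) might look something like the following sketch. Everything here is an illustrative assumption rather than AI Town's real data model: the `Character` shape, the `TALK_RADIUS` value, and the prompt wording are made up for the example, and the prompt string is only built, not sent to any model.

```typescript
// Sketch only: names, the radius, and the prompt wording are illustrative assumptions.

type Character = { name: string; backstory: string; x: number; y: number };

const TALK_RADIUS = 2; // hypothetical distance (in tiles) at which characters start talking

function distance(a: Character, b: Character): number {
  const dx = a.x - b.x;
  const dy = a.y - b.y;
  return Math.sqrt(dx * dx + dy * dy);
}

// Find every pair of characters close enough to start a conversation.
function findConversations(chars: Character[]): [Character, Character][] {
  const pairs: [Character, Character][] = [];
  for (let i = 0; i < chars.length; i++) {
    for (let j = i + 1; j < chars.length; j++) {
      if (distance(chars[i], chars[j]) <= TALK_RADIUS) {
        pairs.push([chars[i], chars[j]]);
      }
    }
  }
  return pairs;
}

// Build the text that would be sent to a language model for one speaker.
// The dialogue itself is not scripted: only the backstory and memories are supplied.
function buildPrompt(speaker: Character, listener: Character, memories: string[]): string {
  const memoryText =
    memories.length > 0
      ? "Relevant memories:\n- " + memories.join("\n- ")
      : "You have not met before.";
  return [
    `You are ${speaker.name}. ${speaker.backstory}`,
    `You are talking to ${listener.name}.`,
    memoryText,
    `${speaker.name}:`,
  ].join("\n");
}
```

Changing the town then amounts to changing data (the character list and backstories) rather than rewriting conversation logic.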
Nathan Labenz (21:35) Yeah, I think this is pretty important to highlight too, because a lot of folks who are coming from a research background, certainly in ML or who have kind of come in as the AI engineer profile over the last couple of years and largely picked up a lot of Python repos and started to build on top of those are very accustomed. This definitely includes me, very accustomed to working in a sort of script type format, a notebook, and everything there is very blocking and kind of synchronous by default. Whereas this paradigm, the JavaScript TypeScript paradigm more generally, and this was being just a super clean, convenient implementation of it, does really lend itself more to kind of a functional form where things are kind of naturally able to happen at their own pace. And when things are blocking, that's okay. You can kind of wait for them to come back. The whole town doesn't get frozen because 1 inference is kind of hanging. And so what's been added here from an architectural standpoint is basically that long lasting kind of shared runtime, right? It's now not just a single back and forth that you can kind of pick up where you left off, but there's a shared runtime where in theory, all these different language models, they could even be different providers, right? They could have different sort of latency profiles, whatever, and you can easily support all that. So I think this is something that is really worth people studying a little bit, because depending on what kind of experiments or what kind of experiences they want to create, it's almost like the whole world is flipped because I'm barely old enough to remember. You're going to tech history a little bit, but it used to be that Python was the functional programming default, right? And now that's totally flipped to JavaScript. I think that that whole thing is extremely interesting to me.
James Cowling (23:39) I just think functional programming is really having its renaissance. It was gonna have its renaissance again. Convex is based on functional primitives. I mean, Haskell, for example, is a huge influence on the design of of Convex. Actions that get triggered and run and they and they trigger other actions and subscriptions that dynamically fire in response to data changing. You know, what's you know, it's funny the way you framed it because, you know, the real world doesn't stop because I'm thinking. Right? The real world doesn't stop because I'm having a conversation with something. Everything just keeps on running. Right? So, you know, now if we're simulating the world as a single loop that blocks, yes, the application is going to block. But, you know, if you design something based on functional primitives, yes, the characters walk around every now and then they just have nothing to do. So they stop and reflect and you see a little thought bubble show up and they're talking to OpenAI and they're saying, hey, take some of these memories and and and say some of these previous conversations and synthesize them into memory so I can reflect. That's happening while other characters are walking around having a chat, And it just works. Right? And I think, yeah, I think that this method of designing things and building things in functional ways really is a pretty close analog to how things work in the real world.
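One way to picture "actions that trigger other actions" without a single blocking loop is a small scheduler in which a slow reflection is just another scheduled event. The sketch below uses virtual time so it runs deterministically; the `Scheduler` class is hypothetical and only imitates the idea, it is not Convex's scheduler API, and the 250 ms "model call" is simulated.

```typescript
// Sketch only: a toy virtual-time scheduler illustrating scheduled actions that
// trigger other actions. It is not Convex's scheduler API.

type Action = () => void;
type Scheduled = { at: number; seq: number; run: Action };

class Scheduler {
  now = 0;
  private queue: Scheduled[] = [];
  private seq = 0;

  // Schedule an action to run `delay` virtual milliseconds from now.
  runAfter(delay: number, run: Action): void {
    this.queue.push({ at: this.now + delay, seq: this.seq++, run });
  }

  // Run actions in time order (ties in insertion order) up to time `until`.
  runUntil(until: number): void {
    while (true) {
      this.queue.sort((a, b) => a.at - b.at || a.seq - b.seq);
      const next = this.queue[0];
      if (!next || next.at > until) break;
      this.queue.shift();
      this.now = next.at;
      next.run();
    }
    this.now = until;
  }
}

const log: string[] = [];
const scheduler = new Scheduler();

// A character that keeps walking: each step schedules the next one.
function walk(name: string, stepsLeft: number): void {
  log.push(`${scheduler.now}: ${name} takes a step`);
  if (stepsLeft > 1) {
    scheduler.runAfter(100, () => walk(name, stepsLeft - 1));
  }
}

// A character that reflects: a slow (simulated) model call that completes later.
function reflect(name: string): void {
  log.push(`${scheduler.now}: ${name} starts reflecting`);
  scheduler.runAfter(250, () => {
    log.push(`${scheduler.now}: ${name} finishes reflecting`);
  });
}

scheduler.runAfter(0, () => reflect("Alice"));
scheduler.runAfter(0, () => walk("Bob", 4));
scheduler.runUntil(1000);
```

In the resulting log, Bob's steps at 100 and 200 land between the start and end of Alice's reflection, which is the point James makes: one character waiting on a model does not freeze the rest of the town.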
Yoko Li (24:56) Yeah. So when we were first starting to build this project, one thing that was clear to me is that I want this to be, like, the most common denominator. So anyone, maybe like someone who didn't study computer science in college, maybe someone who is just interested in AI and wants to learn about kind of deploying apps, how it works. I wanted to have like an ecosystem that's supportive and has enough resources for people to pick up and run. And I don't want front end and back end to be different languages. It's just a higher barrier to entry that way. And there's just a lot of mental barriers. It works probably for a large scale app, but for getting started, that's just not the case. I've always been a fan of TypeScript, like, I've been using it since 2015. Like, the design choice of, like, what the language is gonna be, what's the ecosystem we want to appeal to, was pretty natural. But later, I also found that, like, the newest research, it's all based on Python, but Python at the end of the day, it's like a great tool for kind of, like, coding up things in your notebook. But when you want to deploy an actual application, it's usually JavaScript these days. Most of the developers we talk to, and we talk to a lot of them, when they get started, like, even for, like, a nontechnical SaaS product, it's always just based on JavaScript, like, have a back end, like, Convex that can handle your transactions so you don't have to take care of it. Not all of us are DevOps at the end of the day. And as someone who used to be very deep in the DevOps world, I think most people shouldn't handle that work. So with JavaScript, we really wanted it to be something that people can easily fork and even maybe start to learn about how programming works. I think that's just the perfect choice for the world.
Martin Casado (26:40) It is interesting. I feel like we're seeing this confluence of Python and JavaScript now, and we'll see how it turns out. Listen, I'm a long time Python programmer, so actually my first job out of undergrad was doing computational physics, and you could use basically Python to drive these large C++ programs, and so you got really deep in the language because you're embedding it. That has always been my go to. So I've never known JavaScript, but because Yoko has been so enthusiastic, I'm like, maybe I'll try it out. Listen, the learning curve is high because of all of the libraries and frameworks and stuff you've got to deal with, but once you get it, it's this incredibly sophisticated language, and it's got incredibly sophisticated support for things that I wouldn't expect in a scripting language. Then with TypeScript, I think you can make these very secure applications that are quite safe. So it's been a great learning experience. But the biggest thing for me is just how much support there is in the broader ecosystem for JavaScript and TypeScript if you're building an app. So when we first built the AI starter kit, which was the precursor to this, or the companion, if you're like, hey, I want a global queue that I can just store stuff in, I don't want to write my queue, I want a global queue, it'd be like, oh, we'll use Upstash, right? Or we're like, oh, I want authentication. I want to be able to authenticate using your GitHub account. Then we're just like, oh, we're just going to go ahead and use Clerk. And so it really does feel like when you're in the JavaScript TypeScript ecosystem, that there's kind of this serverless microservices dream, and I know these are super buzzwordy things, but it actually is kind of real. It's a couple of lines to have these very major services that are totally stateful and globally available. And so it just turns out with a few lines of code, you can write these pretty serious things.
Then there's always the question of the back end. Listen, I think that's the most important decision you make if you're building a global app, and I think that's where, for something like AI Town, Convex made the most sense for us just because it's not only a runtime for JavaScript TypeScript. You can actually run transaction blocks of code in the back end, but it has all the searches and queries and stuff you want from the database.
James Cowling (28:39) Isn't it so interesting that, you know, once upon a time, like, C++ was the real language and everything else was the fake languages?
Martin Casado (28:46) It's a great, it's still a great language.
James Cowling (28:47) Yeah. I mean, well, I don't know if it was a great language. We can have an argument on another podcast about C++. But certainly, if you wanted to do, like, the real industrial strength stuff, the gateway to the real stuff was C++. And now, you know, the amount of resources and libraries and frameworks available for Python, but also JavaScript and TypeScript in particular, is amazing, right? And this is just, you know, to kind of pile on Martin's point about microservices and SaaS, as much as these might sound buzzwordy, this is just the evolution of the industry. Once upon a time, you had to build everything yourself, and you were building it in, yeah, whatever it was, in C or whatever. These days, the incentives for an application developer are to build as little as they can, 'cause they wanna build an application. I mean, most of AI Town was done in two weeks.
Yoko Li (29:43) Oh, yeah. Two weeks. The prototype itself was like two days, and we just threw the whole thing together in two weeks.
James Cowling (29:48) And so when you wanna get stuff done fast, you wanna reuse components. And it just turns out, look, Convex is like a JavaScript, TypeScript first platform. And you might say, well, why did we even choose those languages? So one, you know, they run in V8, and we can make them deterministic, etc. But the main thing is that's just what the people want. Right? The web has won, everyone's using JavaScript and TypeScript. And anyone who has a platform has to put out
Martin Casado (30:12) a library for JavaScript and TypeScript.
James Cowling (30:15) And so this is a lot of power at your fingertips if you're an application developer right now and looking to use, you know, third party services to accelerate what you're building.
Nathan Labenz (30:23) I would love to get a little bit more into kind of the details of the process of doing this. I mean, you're in an interesting situation here where you had a reference code base in Python from the original authors out of Stanford. And then you kind of maybe elaborated that. I don't know to what degree it was kind of a point for point thing, but I'm struck by the fact that with language models, what language you're coding in, in some ways, matters less than ever, right? Because if you can express yourself in any language clearly, including English, for example, you can get a huge jump on the code. So I'd love to hear how you took advantage of kind of AI coding assistants in this process and just your view on to what degree this is all kind of blurring together into a fundamentally new reality. I would love to see us move past the language wars personally.
Yoko Li (31:22) I guess one thing to point out is, when we kind of started the prototype and the project, there really wasn't an open source implementation. Generative Agents wasn't open source at the time. That was actually one of the motivations for us to kind of open source it, because we're like, this is one of the most interesting papers we have read in this space, and then there's nothing open source on it. So that's the origin of it. But later, Joon kind of open sourced it, and then we were able to kinda take a deeper look at it, and it's, like, very Python based. And I was telling Joon that together, we cover all the bases. There's the Python developers and the JavaScript developers. But at the end of the day, I actually think the patterns are more similar than we had thought. The reality is it's just the ergonomics of how to work with things. Like in Python, you can have non blocking things, you just have to implement asyncio everywhere, which is very painful. It's doable. In Python, you can implement multiplayer stuff. You can implement web stuff like, you know, Django has come and gone. But at the end of the day, it's just what's the tool that's the easiest for the programmers to reach for. And we do want to unlock that latent demand. So someone who has very little fundamentals of how the web works. I didn't always know TypeScript. I picked it up after college because college never taught us anything about it. I want someone like that to be able to pick it up and then see the patterns and be able to onboard from there.
James Cowling (32:43) I think what's so interesting is, you can look through the AI Town code base and see a whole bunch of JavaScript, a bunch of TypeScript in there. And then you look at the interaction points with OpenAI, for example. And it's a sentence. It's an English language sentence that says, you are an agent and your name is blah, and here's some memories. And here's what someone just said to you. What should I say next? And it's so interesting that, you know, the lingua franca of interacting with models has become regular English sentences. So fascinating to see that. It's not very often for someone who's been coding a long time to just see an English sentence in the middle of a, you know, a code base, and it's not that perfectly formed. It's just a regular old sentence. You know?
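As a rough illustration of the kind of prompt James describes, the sketch below builds an English-language prompt from ordinary TypeScript data. It is not taken from the AI Town code base: the `AgentContext` shape, the `buildConversationPrompt` function, and the example names and memories are all hypothetical, and the resulting string would still need to be sent to a model by whatever client the application uses.

```typescript
// Hypothetical shape for illustration only; not AI Town's actual types.
interface AgentContext {
  name: string;
  memories: string[];
  lastSpeaker: string;
  lastMessage: string;
}

// Builds the plain-English prompt that would be sent to a chat model.
function buildConversationPrompt(ctx: AgentContext): string {
  // One bullet per memory, so the model sees them as a short list.
  const memoryLines = ctx.memories.map((m) => `- ${m}`).join("\n");
  return [
    `You are an agent and your name is ${ctx.name}.`,
    "Here are some of your memories:",
    memoryLines,
    `${ctx.lastSpeaker} just said to you: "${ctx.lastMessage}"`,
    "What should you say next?",
  ].join("\n");
}

// Example usage with made-up data.
const examplePrompt = buildConversationPrompt({
  name: "Alice",
  memories: ["I run the town bakery.", "Bob asked me about sourdough yesterday."],
  lastSpeaker: "Bob",
  lastMessage: "Any fresh bread today?",
});
console.log(examplePrompt);
```

The typed code only assembles and structures the inputs; the instruction itself stays an ordinary English sentence, which is the contrast being pointed out.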
Martin Casado (33:25) Yeah. I think there's a very, very subtle and important point in all of this. And, Nathan, I think this is the right question that you're asking, which is like, where does the program end and the model begin? For me, it's been very much a revelation working with these models because they're so smart. Part of me is like, I almost feel like writing code around it is like wrapping an abacus around a supercomputer. I'm like, You're so capable. Why don't I just give you a keyboard and a monitor and a mouse, and I'll just tell you in English what to do and you go do it. There's part of you that, when you're working with these agents, wants to do that, in which case formal languages go away. So I'm going to say something that may be a little bit controversial, but I think I've gotten to a point where I don't believe these models will ever replace formal languages. I think that AI Town is a great example of that. Anything that has to do with navigating the world, you want to abdicate to the model. Interacting with objects, interacting with each other, making decisions, any sort of simulation you want it to do. But to think that it could generate the code that is going to make the trade offs between scalability and correctness and all the subtlety of building a distributed system, it's just not designed for that. I mean, when I first met James, he was a PhD student at MIT working on databases and distributed systems. Those are some of the most complicated systems in the world, and I think you will always need a formal system where you can very clearly articulate the trade offs that you're going to make in order to scale. And so I don't know where the line comes down.
I don't know where the program ends and the model begins, but it feels to me like for creating the laws of physics, you want a formal language, and then working within that, that's what the models will do. But I don't think one is going to take over the other, meaning we're never going to write a program that'll become a model, and I don't think the models are ever going to be able to create the laws of physics, because you need strong guarantees.
James Cowling (35:18) I obviously agree because that's my outlook on the world. Yeah. I've been doing a lot of AI enabled programming lately just to try it out, and it's been helping a lot, especially when I get into new code bases or new languages I'm not that familiar with. And so the question is, do I feel like that threatens my job? Not in the slightest. And you asked the question about the end of the language wars. I think anything that can teach software engineers that their job is not being a code monkey, their job is thinking, is a good thing. And so yes, it's going to get easier to write code. That's great. Is it going to get easier to exhibit judgment about the rest of the world? Maybe not. But there's always room to think architecturally, to have good design principles, to have good common sense. And I think that's always going to show up. And there's always going to be a need for platforms, hey, like Convex and vector databases. And, you know, I don't think we're at any risk of someone just throwing out databases as a concept and using models instead.
Nathan Labenz (36:18) Yeah, it's interesting. Certainly we've had software eating the world and it's taken a good few bites out of it at this point. And then the next thought was AI is going to eat software. And it seems like there's definitely a few bites to be taken out of it. But the question is kind of like ultimately how much, right? I wonder how you guys would think about that. I could imagine saying in the future, there's a lot of different measures you could put on it, but it does seem like there probably are fewer lines of code in future applications, right? I mean, just like that one natural language sentence, that would have been a real mess of heuristics until not long ago. And so it does seem like there's some significant shift where code that used to be explicit is no longer. But I would also agree that we're not getting rid of databases anytime soon. There's also a tool use paradigm here where I don't want the language model to be my database. I want it to maybe use my database, though.
James Cowling (37:17) I mean, the abstraction models increase over time, and that's just the march of technology. If you're baking a cake, you don't go out and farm some wheat and mill the flour. Right? You go to the grocery store, even if you're a chef baking a cake. And if you're building a web application, I'm hoping you're not writing your own database. That would be very silly. And probably you should be using a managed platform because you get to focus on the part that matters. Right? And as these tools mature, they allow us to not do the things that we don't want to do and to focus on what makes our application or our task uniquely special and to exhibit our common sense and our judgment, and things get done better and they get done faster. That's just a good thing, a strictly good thing, I think. That's just the march of technology.
Martin Casado (38:00) I think Nathan is making a very important point for the future of software here, and I think it's worth just digging in a little bit. These models are so powerful. How much can you abdicate programming to them? Here is one person's view, having stared this in the face a bit, which is there's a minimum amount of complexity in a program where no matter what you do, it's not going to get any less complex. That complexity is describing exactly what you want, and that description is a set of trade offs. It's not about boilerplate code. It's literally like, here are the trade offs, here's what I want to happen. You've got two ways of doing that. You can do that in a formal language where it's unambiguous, so you know what's going to happen, or you can do it in a natural language where it is ambiguous and you don't know what's going to happen. Natural languages are known to be ambiguous, so there is no way to describe something and get a predictable result, right? Like, for example, a dog brought me a ball and I kicked it, right? You don't know if you kicked the dog or the ball, right? This is one of these classic cases, right? And so formal languages are great when it comes to things where you've got to describe trade offs clearly. I don't think you could even use a natural language. I just don't think that's possible, and so I do feel like we're hitting an event horizon where code where there isn't inherent complexity or trade offs, that goes to models. That still leaves you with everything where you've got to describe clearly what you want in a set of trade offs. Listen, we're in an infinite world with an infinite amount of work, so that just means that there's an infinite amount of code to write that's now not cut and paste, but it doesn't reduce the amount of code there is to write. It just makes it so we're not so annoyed, kind of like, you know, being like, I've had to write this for the fiftieth time.
Yoko, it'd be really interesting. I know you've been working with these models a bunch too, and you've mentioned that it's changed your development style.
Yoko Li (39:58) Previously, when I wrote tiny games, it was always hard coding the game state into whatever logic you have. And now it is interesting. I do feel like the state management, like the state of the game, the state in the app, that layer is different, in that if you're in a position where you don't need the trade offs of a transactional system, you could give the control back to the model, and then it will do something magical and different and unpredictable. And that unpredictability is the feature. Like, for any user we have met on AI Town, what they love about it is the unpredictableness. It gives you a dopamine rush, right? Like one of our other portfolio companies, Ideogram, where you can easily, you know, type in a sentence, and it will generate actual words and then a really pretty picture. Every time I generate something, in the back of my mind, I'm like, that's a dopamine rush there. You will want to hit the button again and do it again. Game state, application state, it's similar. You really have to find that knob. Where's the right layer to implement this? So you have a bit of things that you can control, but still deliver joy. As a human, I like to see something that's not exactly what I expected. But what's the layer where you want to have full control? So a transactional system, you don't want the server to go down. You don't want transactions to fail. So that's a layer where I don't think you can, at least today, give the control to models.
Nathan Labenz (41:27) 2 things. I want to maybe spend the balance of our time on kind of other applications. And again, I'm really curious about kind of the value you're personally getting from this Alice persona. But just to highlight briefly before going deeper on that, this interface, I'm always really interested by the interfaces, the places where this weird new intelligence that is the highly general large language model, kind of flabby in its way, super vast and diffuse, then has to somehow actually connect to the real world. And that could be a real world of hard computing or even obviously over time, it'll be more and more the real physical world as well. There's a really good interface point in this shared runtime on a platform like Convex that handles all of the hard stuff in a way that you can kind of represent elegantly so that you can then have all the interesting interactions, either you with a language model or you and a bunch of language models or a bunch of people and a bunch of language models. And that's, I guess, where we go next. But this this space that's kind of that that shared runtime is a really interesting interface where all these, you know, kind of forces can come together, but also have the, like, clear, well defined place to save stuff and execute transactions and do all these kind of fundamentals.
Martin Casado (42:54) 1 way to think of it is, let's say I wanted James to go into my house and build a bookcase. The reality is I would want to give those instructions to James in a formal language so he doesn't fuck it up.
James Cowling (43:09) No, I do an amazing job. No,
Martin Casado (43:13) that's what I would want. There's no way in English you could actually like, you know, it's too hot. Like, that that's not what natural languages are for. They're just not they're just not further. They just don't have it. Right? And so, unfortunately, James doesn't maybe James does. Most people, you you can't kind of like give a formal language to and have them execute. The great thing about things like AI Town and the great things about these models is you have both, right? You have the ability to use a formal language to define the laws of physics and define the things where you know specifically what you want and you know specifically the trade off you want. And then within that, you have these models where you get, and I think Yoko said it so beautifully, that you've got this kind of these creative, independent elements that will surprise you continuously. Like even when I'm programming against them, I'm constantly surprised. I'm like, You said what? You did what? They really are these kind of intelligent beings. So, again, AI Town was Yoko's passion project, and then Convex jumped in to help out, which is fantastic, and then I've basically done very little, but it is a framework for playing around with this. For people that are interested, I would just recommend them just go ahead and get pull and see how these 2 paradigms match. I've worked on a number of these projects.
James Cowling (44:32) I think this is the best example of the full autonomy of an AI and a very formal structured system with strong guarantees interact. Martin said something that really reminded me of just engineering in general and construction projects. And it's the idea of kind of compounding approximations. So anyone who builds any like serious engineering project knows that everything's approximate. Right? And there's certain times you need things to be exact, like it's a bearing and it has to be at the exact right place. And there's parts that can be approximate. Are you showing up to work at a certain time? You build for people who are building applications need to think about these compounding approximations, right? There's certain parts of applications where approximate answers are great and certain answers where you know, yes, I really need an fsync that has to get stored to an application. And it's not that hard, know, you just gotta think it through like what stuff needs to be exact? Okay. I'm gonna use a formally specified platform here. I'm gonna have an if statement. I'm gonna have a code block. Which stuff is gonna be great to be approximate? Yeah. Use a model for that. Why not?
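One way to picture the split James describes, exact parts in ordinary code and approximate parts delegated to a model, is the minimal sketch below. Everything in it is an assumption made for illustration: the `World` and `Position` types, the `moveWithinWorld` and `greet` functions, and the `TextModel` signature are hypothetical and are not AI Town's or Convex's actual API. The model is passed in as a plain function so the sketch runs without any external service.

```typescript
// Hypothetical types for illustration; not taken from AI Town.
interface Position { x: number; y: number }
interface World { width: number; height: number }

// Exact part: movement rules are ordinary code with hard guarantees.
// An agent can never leave the map, no matter what a model suggests.
function moveWithinWorld(world: World, pos: Position, dx: number, dy: number): Position {
  const next = { x: pos.x + dx, y: pos.y + dy };
  const inBounds =
    next.x >= 0 && next.x < world.width && next.y >= 0 && next.y < world.height;
  if (!inBounds) {
    return pos; // Reject the move and keep the old position.
  }
  return next;
}

// Approximate part: what an agent says is left to a model.
// TextModel is a stand-in for whatever model client an application uses.
type TextModel = (prompt: string) => Promise<string>;

async function greet(model: TextModel, speaker: string, listener: string): Promise<string> {
  const reply = await model(`You are ${speaker}. Greet ${listener} in one short sentence.`);
  // Even approximate output passes through exact code before it is stored:
  // here the reply is trimmed and capped at 200 characters.
  return reply.trim().slice(0, 200);
}

// Example usage with a canned stub in place of a real model.
const cannedModel: TextModel = async () => "  Hello there, nice to see you!  ";
const town: World = { width: 3, height: 3 };
console.log(moveWithinWorld(town, { x: 1, y: 1 }, 1, 0));
console.log(moveWithinWorld(town, { x: 2, y: 1 }, 1, 0));
greet(cannedModel, "Alice", "Bob").then((line) => console.log(line));
```

The design choice is that the model's freedom is bounded by code: it chooses the words, while the if statement decides what is allowed to happen in the world.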
Yoko Li (45:39) I think Martin is too humble, so I really have to brag on his behalf. But, honestly, kinda like Martin said, this is our passion project. We didn't have an agenda for it. We're just a bunch of programmers. We just thought this is fun to build, and then we keep building on it. If you join our Discord, you'll see Martin is every night kind of coding up this next feature, which is super cool, that allows any user to create their own world visually using tile maps, like drag and drop tiles and create your own layout of the world, so you can layer agents on top. I don't have an answer for, like, what's the strategy? Because there's no strategy. We're just a bunch of programmers and nerds who tinker with these kinds of things over the weekend. But now we've got so much support from the community. We do want this to be like a living project we can keep supporting. So at the end of the day, we want to add more integrations. There's just so much work to be done. And then we want the community to be more engaged, open PRs, help us out. At the end of the day, it's open source. And I think it's just the beauty of having a community of people on Discord who never met each other working together on it.
Nathan Labenz (46:48) So let's take the last stretch and just inspire people a little bit with your sense of kind of value and other use cases. You know, what sort of things should people have in mind when they hit, you know, the fork button?
James Cowling (47:02) I've got two examples real quick. One, we have a fork. We have an upstream debt. Actually, it hasn't been landed yet, I don't think, with human actors. You can walk around in the game and talk to people. That's cool. You can go and be part of conversations. Now you might say, okay, the first thing any company is gonna think about when you do that is, is someone gonna come and spout some offensive content? Right? And are we going to be hosting a demo app where someone's going to come and say something horrendous and try to teach people something awful? And that's the challenge. Right? Anytime you build a demo app on behalf of a company, you worry about this thing. What you can do is you can just ask the model, is this offensive or not? And it does a pretty good job of this. Now you might get it slightly wrong, right, because it's approximate. But the best way to know whether someone's typed an offensive message is just to ask the model, hey, is this an appropriate message to have in the system? And it'll tell you with reasonable accuracy. This kind of stuff, I just find it so cool, right? You can build an emergent, like, you can just use the model to do the stuff that we would traditionally find very difficult. You know, if it was a couple of years ago and we had to build a system to make sure the content was safe, we would just do like a keyword match or something. Really very difficult and very ineffective. I think people can start thinking about, like, how can I do stuff now that would have been really hard to do a few years ago? And I 4 couldn't add it.
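The moderation idea James describes, asking the model whether a message is appropriate, might be sketched roughly as follows. This is an assumption-laden illustration, not the code from the fork he mentions: `AskModel`, `buildModerationPrompt`, `parseVerdict`, and `isMessageAllowed` are hypothetical names, the prompt wording is made up, and the model call is injected as a function so no real API is assumed.

```typescript
// Stand-in for a real model client; any function from prompt to reply text.
type AskModel = (prompt: string) => Promise<string>;

// Builds a plain-English yes/no question about a user's message.
function buildModerationPrompt(message: string): string {
  return [
    "Is the following message appropriate to show in a shared, public chat?",
    "Answer with exactly one word: YES or NO.",
    `Message: """${message}"""`,
  ].join("\n");
}

// Interprets the model's reply. Anything that does not clearly start
// with YES is treated as a rejection, so unclear answers fail closed.
function parseVerdict(reply: string): boolean {
  return reply.trim().toUpperCase().startsWith("YES");
}

// Approximate check: the answer is only as reliable as the model behind it.
async function isMessageAllowed(message: string, askModel: AskModel): Promise<boolean> {
  const reply = await askModel(buildModerationPrompt(message));
  return parseVerdict(reply);
}

// Example usage with a stub that always approves.
const alwaysYes: AskModel = async () => "YES";
isMessageAllowed("Hi, how is everyone today?", alwaysYes).then((ok) =>
  console.log(ok ? "message accepted" : "message rejected"),
);
```

Because the verdict is approximate, the parsing step deliberately treats unclear replies as a rejection; an application could choose the opposite default depending on how costly a false rejection is.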
Yoko Li (48:26) Yeah. I don't know if folks here or even the listeners have read this book named Sea of Tranquility. It's from the same author who wrote Station Eleven. It's a great book. I'm not going to spoil the sci fi book for you, but at the end of the day, the point it tries to drive home is that there are simulations. We may live in a simulation. But at the end of the day, just because it's a simulation doesn't make it less real. You know, like what you did in that world and the emotions you pour into it, those are still real experiences. So at the end of the day, what I was thinking in the back of my mind is that I think everyone should have their own world that they run, that they pour their passion and experiences and memories and interactions into. And the goal of AI Town, at least for me, is for anyone to be able to fork and run their own world and customize it and explore. Maybe one day the agents in the world will create another simulation, you know, right now.
Nathan Labenz (49:27) I see, just always with this AI stuff, I see just incredible, exciting potential, whether it's stuff that's relatively easy to imagine, like, Oh my God, I could have a language learning environment where I could attend a virtual party in Spanish and get unrusty in my Spanish skills. Stuff like that sounds fun. And for a lot of people, a lot of little variations on that could be super awesome. I also then do worry about people kind of falling into these simulated environments. And I mean, this is just a random hobby project on some level. So I don't think you're even productizing this to the point where you're responsible for a community of users. But it does seem like we're headed for a place where games are going to be just much more easy to lose yourself in. I keep kind of coming back to that getting lost in. It's not necessarily that it's bad in and of itself, but if you kind of fall into it and lose track of your normal life, it seems like it can easily get out of hand. And then the fact that we may be in a simulation is also super interesting. But I would just love to hear you guys reflect on that. You've spent a lot of time with it. What advice would you give to other developers? What do you think is the responsibility for people that are kind of whipping up these simulations, especially if they're going to try to turn them into businesses?
Martin Casado (50:48) Listen, these models are the largest compute jobs that we've ever run as a race by far. I mean, these are the most sophisticated things that we've ever created. You can explore what that means by trying to interact directly with them, but that's only one modality. Right? There's another modality in which you just kind of let them interact with each other and you let them interact with the world. And, you know, it's kind of funny because, like, when it's you interacting, it turns out you're the limiting factor. We're not that creative, so you can only say a few things and whatever, but when you have them interact together, the state space that they cover is so potentially enormous. I mean, it's more than the atoms of the universe, obviously. And so you fall into an appreciation for how powerful these are and how large these are. And so I think it's too early for us to presuppose where this goes or what the apps are or things like that. For companies, that's great and they should do that. My recommendation for anybody who's interested in this stuff is view it almost as an exploration of a new life form and go in very open minded. Don't worry about all the FUD. Like, they're not going to kill you. They're not going to kill each other. They're not going to, like, make you an addict. I just feel like we're almost preemptively stopping our ability to wonder at this new thing that we've created. And so very much for something like AI Town, it's very different if you're building software, we can have that conversation for a company, but AI Town is meant to be an exploration platform. I would just say go in with an open mind and realize that you're seeing something that we as a race have never seen before. And then you'll get an appreciation for how you use these things. Then just like the advent of the internet, we had no idea what it would be used for, what companies would emerge. We didn't know about Amazon, didn't know about Yahoo.
That will all come next, but before any of that comes, I think you just have to understand the technology, and that's kind of the point.
James Cowling (52:52) This is the wild west in many respects. This is such early days, and this is a really interesting time to go and and see stuff develop, see ideas originate, and be part of it. Yeah. And also go outside. Feel the sunshine. Right?
Yoko Li (53:09) I was gonna take a different position on this. I do think at the end of the day, it comes down to what's your definition of games? Life, to some people. I mean, at the end of the day, you're looking for things you're passionate about. Like, for example, I'm a part time cartoonist. When I draw cartoons, I draw on a tablet. And that's a form of game for me. And I easily fall into that world. When TV came out, people were like, Oh, this is going to ruin everyone. People are just going to stare at the screen. But at the end of the day, I think it's a net value add. Same for novels. Back in the days, people were like, Oh, if you keep reading novels, you're going to fall into a different world. You're not going to think about anything. You're just going to keep reading it. And you have created this whole new world in your mind. I do think games at the end of the day are similar in that it's not all or nothing, it is just a different experience. And that's very much like any other experience you have in life.
James Cowling (54:02) To reiterate, AI Town is not a project that's kind of owned by us. It's a community project. And the whole point of this is to inspire people to go build. And we're gonna keep making improvements and simplifying things and making it more cloneable and all this kind of stuff. But the whole point is to get people building stuff and get people inspired. And so if people come out of this podcast and they're inspired to build some applications and see for themselves, I think that's an amazing outcome.
Nathan Labenz (54:29) Yoko Li, Martin Casado from a16z, James Cowling from Convex, thank you for being part of the Cognitive Revolution. It is both energizing and enlightening to hear why people listen and learn what they value about the show. So please don't hesitate to reach out via email at tcr@turpentine.co, or you can DM me on the social media platform of your choice.
Erik Torenberg (54:52) Omneky uses generative AI to enable you to launch hundreds of thousands of ad iterations that actually work, customized across all platforms with a click of a button. I believe in Omneky so much that I invested in it, and I recommend you use it too. Use CogRev to get a 10% discount.