Watch Episode Here
Listen to Episode Here
Show Notes
In this episode, Nathan sits down with Tyler Angert, Product Designer at Replit, to discuss the future of software development. Replit is building what CEO Amjad Masad calls "the perfect substrate for AGI." Tyler and Nathan cover how Replit is leveraging AI to enhance its current product, bot-to-bot interactions, and the considerations around designing AI agents. If you're looking for an ERP platform, check out our sponsor, NetSuite: http://netsuite.com/cognitive
TIMESTAMPS:
(00:00) Episode Preview
(01:10) Nathan’s Intro
(09:24) Tyler's role at Replit
(11:37) Replit as the "perfect substrate for AGI"
(14:55) Sponsor: NetSuite | Omneky
(15:45) Defining AGI
(19:26) How AI agents might interact with Replit
(23:20) Replit's "virtual developer" product
(31:11) Current state of Replit's AI features like Ghostwriter
(41:48) Measuring productivity boost from AI coding tools
(45:29) Potential for massive developer productivity gains from AI
(48:34) Technical gaps still remaining to achieve advanced AI agents
(51:45) Core AI breakthroughs have already occurred
(57:35) Timeline for functional AI agents interacting online
(01:01:20) Safety considerations for powerful AI agents on Replit
(01:08:47) Ethical considerations around AI and consciousness
(01:20:36) What constitutes consciousness in an AI system
(01:26:30) Should AI systems have rights and protections?
(01:30:32) Being polite to AI systems
(01:31:59) Advice for beginners looking to leverage AI
(01:36:58) Using AI with a Neuralink brain implant
(01:43:02) Hopes and fears about AI's impact on society
LINKS:
https://replit.com/
X:
@tylerangert (Tyler)
@labenz (Nathan)
@eriktorenberg (Erik)
@cogrev_podcast
SPONSORS: NetSuite | Omneky
-NetSuite provides financial software for all your business needs. More than thirty-six thousand companies have already upgraded to NetSuite, gaining visibility and control over their financials, inventory, HR, eCommerce, and more. If you're looking for an ERP platform, head to NetSuite (http://netsuite.com/cognitive) to defer payments on a FULL NetSuite implementation for six months.
-Omneky is an omnichannel creative generation platform that lets you launch hundreds of thousands of ad iterations that actually work, customized across all platforms, with a click of a button. Omneky combines generative AI and real-time advertising data. Mention "Cog Rev" for 10% off.
MUSIC CREDIT: MusicLM
Music license:
PMZA0BAKJSHJOOR2
Full Transcript
Nathan Labenz: (0:00) It's so simple. I just want to do X. I know it's possible to do it. I know probably thousands of people have done it before, and yet I can't quite track down exactly the way to do it. So it was a "whoa" moment for me when I literally just typed the comment and then the next thing you know, it writes the whole thing for me. It wasn't even that much code, but the key point was that it was correct and that it happened in a second. And then I was like, oh my god, I can just do that and it works?
Tyler Angert: (0:29) If we want to get to the point of 100x productivity, 1000x productivity, where, you know, one, two, three people—three-person startups—are essentially running the equivalent business of what are now today 500,000-person companies, we need to parallelize. And this is where the bots and the agents come in.
Nathan Labenz: (0:47) Hello, and welcome to the Cognitive Revolution, where we interview visionary researchers, entrepreneurs, and builders working on the frontier of artificial intelligence. Each week, we'll explore their revolutionary ideas, and together, we'll build a picture of how AI technology will transform work, life, and society in the coming years. I'm Nathan Labenz, joined by my co-host, Erik Torenberg. Welcome back to the Cognitive Revolution, and welcome to Replit Week. Today, we're sharing my conversation with Tyler Angert, Replit's first product designer, which we recorded in May. And next week, we'll have a follow-up with Replit's new VP of AI, Michele Catasta. Why do two episodes on a single company? Simply put, because I believe that Replit might prove to be one of the most important companies in the world. For those not familiar with Replit, it is the single easiest way to create and deploy fully custom software. Working through a native web environment that feels as responsive and customizable as your typical desktop coding environment, users can copy and launch cloud servers with a click, then immediately begin to write and execute code. Replit has become famous for both product excellence and release velocity. In addition to abstracting away so much of the hassle and headache from the software development experience, Replit has pioneered features including a multiplayer mode, replayable edit history, a bounty marketplace for getting small projects done, and even the best mobile coding app that I've personally used. Replit started as a tool for, quote unquote, hackers, and they've built a global user base of some 25 million young software developers who they support with a generous free tier. They've accomplished all this with under 100 team members. But with its recent partnership with Google, its new deployments product meant to support mission-critical software, and especially now its market-leading productization of AI coding assistance, I genuinely believe that Replit is poised to compete with Microsoft and Amazon as the unified application development and hosting platform of choice for a wide range of software projects going forward. Compared to the typical workflows developers use to share and publish their work today, Replit is a genuine breath of fresh air. To give you a sense for just how transformative this can be, since recording this episode in May, I've started using Replit in a new AI task automation training program that I've been developing for executive assistants at Athena. And I'm now beginning to teach EAs to create custom software without teaching them to code. How is that possible? Well, for starters, we've learned that most businesses have similar opportunities to use AI to save time and money or to scale previously unscalable tasks. To take just one example, nearly every business could benefit from more meaningfully personalized outreach, whether that's to sales leads or job candidates. But, of course, it runs much deeper than that. For routine work, where consistent execution matters more than creative genius, with a few rounds of iteration, most text-based tasks can now be at least semi-automated. Prompting is generally no longer a major barrier. Models are getting easier to work with by the month, and already just a handful of core prompting skills—input labeling, role casting, few-shot examples, chain of thought, the format trick—just these five cover the vast majority of use cases.
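To make those five concrete, here is a minimal, hypothetical sketch (not from the episode) that combines role casting, input labeling, a few-shot example, chain of thought, and the format trick in a simple per-input loop, using the 2023-era openai Python SDK. The lead data, prompts, and copy are all invented:

```python
# Hypothetical sketch of the five core prompting skills in one loop.
# Assumes: pip install "openai<1.0" and OPENAI_API_KEY set in the environment.
import openai

def personalize(lead: dict) -> str:
    messages = [
        # Role casting: tell the model who it is.
        {"role": "system",
         "content": "You are a sales copywriter for a B2B software company."},
        # Few-shot example, with input labeling ("LEAD:") marking the data.
        {"role": "user",
         "content": "LEAD:\nname: Dana\ncompany: Acme Robotics\n"
                    "note: downloaded our pricing guide\n"
                    "Write a two-sentence personalized opener."},
        {"role": "assistant",
         "content": "OPENER: Hi Dana, I saw you grabbed our pricing guide. "
                    "Happy to walk through how teams like Acme use it."},
        # Real input: chain of thought plus the format trick.
        {"role": "user",
         "content": f"LEAD:\nname: {lead['name']}\ncompany: {lead['company']}\n"
                    f"note: {lead['note']}\n"
                    "Think step by step about what makes this lead distinctive, "
                    "then give your final answer as exactly:\nOPENER: <two sentences>"},
    ]
    resp = openai.ChatCompletion.create(model="gpt-4", messages=messages)
    return resp.choices[0].message["content"]

# One AI API call per input: the semi-automation pattern described here.
for lead in [{"name": "Sam", "company": "Bluth Co", "note": "asked about onboarding"}]:
    print(personalize(lead))
```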
The challenge then becomes how to work AI into existing business processes without having to spend all day copying and pasting into and out of ChatGPT. And it is tricky because every situation is unique. Requirements are idiosyncratic, data is in different places and formats, and every integration point is also a possible point of failure. Authentication and access of all forms is a huge source of friction. To help the EAs overcome all this, I've developed the mantra of "copy and customize." Our plan is to equip the EAs with simple templates for common situations. Athena calls these playbooks. And then we teach them how to use AI coding assistance to modify the examples to suit clients' particular needs. This way, we're empowering them not just to serve as EAs, but as AI implementation specialists at their clients' companies. In practice, this amounts to using GPT-4 (and increasingly Claude these days, thanks to the 100K token context window) to write code that gets, processes, and loops through some inputs, often making another AI API call for each input, before finally returning all the results in some desired format. There is a learning curve here to be sure, but over time, I believe this skill set will prove transformative even as it becomes commonplace. Now, obviously, this would not be possible without advanced language models like GPT-4, but it also wouldn't be possible without Replit. Because it's so easy to share executable code with no environment confusion, Replit makes it remarkably easy to help one another get unstuck. And because it's a full development and now hosting platform, we can always layer UIs or APIs onto our AI-powered creations as needed. AI task automation and this "software without coding" paradigm will surely be the subject of future episodes. But for now, if you're interested in hiring a human executive assistant that's coming out of this AI training, there is a referral link in the show notes where Athena offers a special deal for qualified customers. It really is a lot of fun teaching these EAs, who never even signed up to learn to code in the first place, how to use these powerful tools. But as you'll hear over the course of these next episodes, this is just a drop in the bucket of Replit's ambitions. Their goal is to bring the next 1 billion software developers online and make them 10 times more productive. While it's hard to imagine the state and nature of software in that world, their vision for artificial developer intelligence and their track record for world-class execution make Replit, in my view, one of only 15 to 20 live players in the AI game globally. CEO Amjad Masad, also a former guest on the Cognitive Revolution, arguably put it best when he said that Replit is the perfect substrate for AGI.
And again, I'm not sure anyone has really thought through what that means, but I think there is a decent chance that it proves correct. In this conversation, Tyler provides practical advice for non-coders looking to leverage AI, an in-depth look at how Replit is using AI to accelerate development today, and a window into their relatively near-term plans for a virtual developer that will work alongside humans, eventually allowing users to, quote unquote, "speak software into existence" by managing whole teams of AI agents that create tools and accomplish tasks in parallel. At times, this may sound like science fiction, but in all seriousness, if there is a single platform where the human-AI collaboration economy is most likely to take shape, I would put my money on Replit. And just think what you might have thought sounded like science fiction just one year ago. The possibilities, mostly very much to the good, but also some very clearly to the bad, are endless. Many of the next billion developers will enjoy previously unimaginable economic opportunity as a result of AI. But I've also learned from personal experience that they will be highly AI-dependent and therefore vulnerable to an entirely new class of attacks, which Replit will have no choice but to confront. Tyler embodies Replit culture as I've come to understand it: extremely smart, first principles oriented, inclined to think outside the box, confidently optimistic, forthcoming, sincere, and always shipping. I think you are really going to like him. So without further ado, I hope you enjoy this first half of Replit Week. This is my conversation with Tyler Angert. Tyler Angert, welcome to the Cognitive Revolution.
Tyler Angert: (9:27) Thank you. Thank you for having me.
Nathan Labenz: (9:28) Yeah, I'm really excited because you are a product designer at Replit. Regular listeners to the show will know I name-drop it all the time and cite it as one of the companies that is really punching above its weight in this current AI moment. So I'm excited to dig into that with you and understand all the stuff that you guys are building and what you're seeing in terms of usage. I think it's going to be a lot of fun. Maybe just for starters, tell us a little bit about what you do at Replit, and then I want to ask you some big picture questions about the platform as well.
Tyler Angert: (10:03) What I do varies a lot day to day. I'll plug one of my own articles that I wrote about this, called "The Mechanics of Work," a little bit later. But in terms of physically what I'm doing, there's a lot of writing, there's a decent amount of coding, and there's a lot of drawing shapes on screens, which you might imagine is the case for most product designers. Recently, I've been focusing a lot on AI. I was also working a lot on our version control; just before this, we released an article about our redesigned Git user interface. I've been at Replit for almost four years now. I was the first design hire and the second designer at the company after Amjad, who's the co-founder. So I've pretty much worked on every part of the product at this point and done a lot of zero-to-one stuff. But overall, I focus on the primary coding experience, which we call the Workspace. For people who don't know it, or people who are familiar with other IDEs like VS Code or IntelliJ, the Workspace is our IDE. It is the primary environment in which you are coding. Now my main focus is basically everything to do with AI inside of it.
Nathan Labenz: (11:27) One of the most memorable tweets that I keep thinking about that I've seen over the last year, year and a half, was from Amjad, the CEO, who said, "Replit is the perfect substrate for AGI." What does that mean? Where are you guys going big picture with your AI initiatives?
Tyler Angert: (11:47) There's a lot to unpack in that single sentence. The two keywords in that claim are "substrate" and "AGI." Hopefully we know what "perfect" means at this point. But a substrate is a material. It's a form. It's something that something is made out of, or something that a being or a thing exists in. The claim that Replit is the perfect substrate for AGI only makes sense if you have a particular view on what is necessary to actually make AGI in the first place. What are the missing components to get AGI to where we need it? Obviously, Amjad wrote that tweet and it's his view, but I also agree with it, so I'll try to defend it from my perspective. The reason it makes sense is because of the fundamental nature of what software is and what it can do. The basic claim is that code itself—being able to make software, being able to write code—is the most generic medium. It's the most generic substrate that we have to create other things. Even if you go as far back as the Turing machine and the more theoretical computer science from when Alan Turing was first founding the field, the idea was being able to simulate other machines. Being Turing complete essentially implies that you can simulate other machines as well: a universal platform to create and simulate other processes. And the same is true today with code and software in general, where we use software and computers to simulate and work with many, many different kinds of use cases. So it's a very generic platform, a generic medium. Because it's generic and because it can be used for many different use cases, the main idea is that software and code are the raw ingredients for doing basically whatever you want, as long as it can be expressed digitally. Now we can jump back to the AGI definition. I'm not going to try to define AGI on this podcast, but for the sake of argument, AGI is a really, really smart machine that can do a lot of different things at or above human level, with basically a human-level breadth and diversity of activities and tasks. And if it's supposed to do that within a digital environment, presumably it'll have to be able to create new tools for itself, learn how to interact with its environment—in this case, the digital world for the most part—and know how to compose and use those tools together in order to accomplish whatever task it needs to do. If the tools don't exist—say you ask an AGI in 10 or 15 years, "Hey, we need to solve world hunger. We need to discover the universal vaccine," whatever task you give it—presumably not all of the tools that it needs to accomplish that task will exist at hand. So it'll need to create new tools.
Nathan Labenz: (14:58) Hey. We'll continue our interview in a moment after a word from our sponsors. Hey, everybody. If you're a business owner or founder like me, you'll want to know more about our sponsor, NetSuite. NetSuite provides financial software for all your business needs. Whether you're looking for an ERP tool or accounting software, NetSuite gives you the visibility and control you need to make better decisions faster. And for the first time in NetSuite's 25 years as the number one cloud financial system, you can defer payments of a full NetSuite implementation for six months. That's no payment and no interest for six months, and you can take advantage of the special financing offered today. NetSuite is number one because they give your business everything you need in real time, all in one place to reduce manual processes, boost efficiency, build forecasts, and increase productivity across every department. More than 36,000 companies have already upgraded to NetSuite, gaining visibility and control over their financials, inventory, HR, e-commerce, and more. If you've been checking out NetSuite already, then you know this deal is unprecedented—no interest, no payments. So take advantage of the special financing offer with our promo code at netsuite.com/cognitive, netsuite.com/cognitive, to get the visibility and control your business needs to weather any storm. That is netsuite.com/cognitive. Omneky uses generative AI to enable you to launch hundreds of thousands of ad iterations that actually work, customized across all platforms with a click of a button. I believe in Omneky so much that I invested in it, and I recommend you use it too. Use CogRev to get a 10% discount.
Tyler Angert: (16:26) And how will it create new tools? It will create new tools through code, through writing new code and deploying new software. So now, hopefully, the answer makes a little bit more sense, because Replit as a platform is all about these self-contained, code-centric environments: being able to write code, get it online quickly, deploy apps and APIs, and work with code very, very granularly, all within a native, internet-connected environment. And that's useful for people, obviously, because you don't have to worry about setup. You can just get started. You click a button, and your whole environment is set up to create essentially anything that you want, as long as it's available and usable on the internet. Those benefits also extend to machines. If a machine can automatically spin up an environment, write code, execute it, test it, and do it, quote unquote, "with its friends"—you and I can collaborate in Replit, but bots can also collaborate with each other on Replit too—they can make use of the same benefits that we get, but at a significantly faster rate and in a much more sophisticated way than we can necessarily predict. To summarize, it's basically the claim that Replit, as a general-purpose computing environment where it's very easy to spin up instances of these machines, code in whatever language you want, use tools, and collaborate with other people and other machines, is the perfect sandbox for a robot to create whatever tools it needs for itself to accomplish increasingly complex tasks. In that sense, Replit is the substrate for AGI to expand its abilities and become more and more sophisticated. Does that make sense?
Nathan Labenz: (18:22) Yeah, it's really interesting. You're right to flag out of the gate that a definition of AGI is definitely not something that everybody has agreed on or may ever agree on. Because OpenAI is the company that's pushing the hardest toward it right now, I tend to start with their definition as a point of departure. And I think theirs is pretty similar to what you said: basically an AI that can outcompete humans on functionally all economically valuable tasks, which is quite a concept to ponder. Some people then will take it a little farther and it gets more godlike and sort of infallible or omniscient or whatever. But there is quite a bit of space, presumably, between outperforming humans and something that's so powerful that it never needs any help and can do everything totally on its own. Anyway, that's really interesting, and good to flag that that definition is a pretty fuzzy one. When I use Replit, I open it up in a browser. And I do love the multiplayer mode, which I think is one of the first big things that brought Replit to some internet fame in the first place. Do you envision AIs working with the platform sort of the same way that I do? Which is to say, do they need to be multimodal, and do they sort of look at it with a computer vision model, as we've seen with GPT-4 and things like InstructBLIP this week seemingly getting really good at understanding a scene? Do you envision them using a sort of visual modality, or is it all just more code, where they can ingest the state of the environment as text and interact with it purely as text?
Tyler Angert: (20:20) I think, depending on timelines, within the next year, probably no vision models will be involved, primarily dictated by the availability and reliability of those models, and especially the cost of deploying them at scale for millions of people. But even practically, you have to think about what is the minimum amount of information that an agent or a person needs to be able to take actions inside of some environment. In the case of Replit, whether it's writing code or debugging code, does it need to see what you see? Does it need to know where the layout is in order to accomplish different tasks? If it's about showing users how to use Replit, guiding them around, and actually simulating a full pair programmer in the sense that it really feels like another person is there, then yes, some form of visual model is probably going to be necessary for that, or at least some sort of visual encoding of the state of the workspace will be useful. But you can get very far with just text-only models. If you've read the "Sparks of AGI" paper that came out recently, where they put GPT-4 through all these crazy tasks and try to test its spatial reasoning abilities, did you see the part of the paper where it could figure out how to navigate a maze and draw out the maze pattern that it saw? Even if GPT-4 is only fed textual directions for making turns and navigating this abstract space, with no visual input, it's able to create some sort of internal model of where things are laid out relative to each other, because of the spatial information that's encoded in the words that it's using for left, right, up, down. It can do that partially because it was also trained on vision data, and it's making use of that transfer learning ability with the text-only output. That's all to say that there will probably be benefits to using partially vision-trained language models inside of the workspace, even if the only input and output is still just text, purely so that it can take advantage of those emergent spatial reasoning abilities that it got from learning to look at and describe scenes. That being said, you can also get very far just with text and with giving the model essentially an API spec that describes exactly what actions are available to it, when to use them, and how to compose them together.
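As a rough illustration of that last point, here is a hypothetical sketch of a text-only action spec and dispatcher. The action names and the JSON convention are invented for illustration; this is not Replit's actual interface:

```python
# Hypothetical text-only agent interface: the model never sees pixels, just a
# spec of available actions plus a text serialization of the workspace state.
import json
import subprocess

ACTION_SPEC = """
Respond ONLY with a JSON object: {"action": "<name>", "args": {...}}.
Available actions:
- open_file(path): read a file into context
- write_file(path, contents): replace a file's contents
- run(command): execute a shell command and observe its output
"""

def dispatch(model_reply: str) -> str:
    """Parse the model's chosen action and execute it."""
    choice = json.loads(model_reply)
    name, args = choice["action"], choice["args"]
    if name == "open_file":
        with open(args["path"]) as f:
            return f.read()
    if name == "write_file":
        with open(args["path"], "w") as f:
            f.write(args["contents"])
        return "ok"
    if name == "run":
        result = subprocess.run(args["command"], shell=True,
                                capture_output=True, text=True)
        return result.stdout + result.stderr
    raise ValueError(f"unknown action: {name}")

# The prompt concatenates the spec with the serialized environment state:
prompt = ACTION_SPEC + "\nWorkspace:\n" + json.dumps({"files": ["main.py"]})
```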
Nathan Labenz: (23:07) So let's reel back in from that far future toward the mid future, and then we can also look at what you guys have today, which in and of itself is pretty awesome. You're building the substrate for AGI. Another very provocative tweet from Amjad is the one where he says that Replit is roughly on track to have a virtual developer by the end of this year. So what does that mean? What is my experience going to be using that virtual developer? Do I delegate projects to it? Is it more of a pair programmer that works with me? What do you guys have in mind for this virtual developer concept?
Tyler Angert: (23:48) My bet, in general, for AI going forward is on agents. And, you know, we've seen a lot of hype around agents and agentic machines recently, namely AutoGPT, BabyAGI, open-source projects like this, where they get a lot of critique basically saying, "Oh, you just put GPT-4 in a loop. Is that consciousness now?" My short answer to that actually is kind of yeah. But the broader point is that people are really into this idea of having these long-running, more complex processes. Plan my project. Work with me on this project over the course of a day. If you're working with a team, on a group project or at a hackathon or whatever, the people that you're working with have very long-term context on the project and the complexities of what you're working on together. You work with each other for weeks. You have multiple conversations across multiple modalities. You're talking, you're writing messages, you're emailing, you're on video chat, whatever. And it's very important to maintain that context and have this sense of long-term memory and commitment to a series of tasks. It's not just in and out. I think humans tend to over-anthropomorphize things in general. We do it with random patterns in moon formations and on the surface of Mars. We do it with little objects where we see faces. We love things that feel like people. And that might come across, I think, as almost a naive answer to the AI developer problem. Like, "Hey, let's just make it behave like a human." Because your first counter thought might be, "Wait, is that too easy or too obvious?" But I think it's actually the way to go. Obviously, there will be AI-specific benefits or AI-specific behaviors that only they can accomplish. But in terms of the process of working with somebody, the basic idea here is that a virtual developer or an assistant in Replit will feel almost exactly like a teammate, but with AI-specific benefits. Concretely, some ideas here are that it doesn't get tired. It doesn't have to go to sleep. So it can do things for you constantly in the background. It will have access to the entire Replit network. It's essentially connected to the Replit API by default—as an asterisk, there's no public API, but we own the service, so it can obviously access our database and our network. So presumably, it'll be able to use Bounties, and Bounties are Replit's software marketplace where you can post tasks that you want completed or projects that you want done. It will have access to the rest of the community: people posting projects, sharing, talking about them, that kind of thing. As said before, it also has access to your workspace, your projects that you're working on, and all of your previous conversations. So I think it's going to extend far, far beyond just chat and autocomplete. Most AI apps and products you see nowadays are essentially just glorified forms of messaging and autocomplete, which are both very useful in and of themselves. But imagine this pair programmer that isn't just good at writing code but can use product and project planning software. What if it can use Asana or your notes or your to-do app? It can help write notes and organize them for you. If it knows its own weaknesses and doesn't know how to accomplish something, it might go put a bounty out for a human to come in and help out.
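For readers curious what "GPT-4 in a loop" means mechanically, here is a hypothetical minimal sketch of the AutoGPT-style pattern Tyler references, using the 2023-era openai SDK. The STEP/DONE convention is invented; real agents add tools, memory, and error handling:

```python
# Hypothetical "GPT-4 in a loop": the simplest shape of an agentic process.
import openai

def run_agent(goal: str, max_steps: int = 10) -> str:
    history = [
        {"role": "system",
         "content": "You are an autonomous developer. Each turn, state your next "
                    "step as 'STEP: ...' or finish with 'DONE: <result>'."},
        {"role": "user", "content": f"Goal: {goal}"},
    ]
    for _ in range(max_steps):
        reply = openai.ChatCompletion.create(model="gpt-4", messages=history)
        text = reply.choices[0].message["content"]
        if text.startswith("DONE:"):
            return text[len("DONE:"):].strip()
        # Feed the model's own plan back in so it keeps long-running context,
        # a crude stand-in for the persistent memory described above.
        history.append({"role": "assistant", "content": text})
        history.append({"role": "user", "content": "Carry out that step, then continue."})
    return "step budget exhausted"
```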
One idea I'm really excited about in particular is being able to put your bot, your Ghostwriter, out for work, in the sense that if you've been working with it for a month, two months, three months, it can start to learn and understand what kinds of software you build and what you're good at, and start to replicate some of your own behavior. In which case, you can put your own Ghostwriter out for hire on the Bounties marketplace. And then suddenly, you know, your Ghostwriter that is going out and seeking help can hire my Ghostwriter. And now our bots are literally collaborating together to accomplish the tasks. And instead of being a pair programmer, now we, you and me, are more like managers or these mini CEOs, and we're just overseeing the actions and the output of all of these bots. And it's not just that we're prompting purely in terms of a text input. All of the information of the entire environment, you know, from our behavior to the kinds of projects that we like and work on to the rest of the community, is all fed in as context. So it's really having, you know, a team of virtual employees, or even this human-robot community that just works with each other. I went off the deep end on that explanation, but I think my main point is that a pair programmer is a really good first milestone: an assistant that can write code with you and understand your project. But in reality, a software project is usually much bigger than just writing the code. It's the planning. It's the people that you work with, the community, and the distribution of it. It's everything that goes into it. It's basically following the entire lifecycle of an idea, from conception to shipping to production, and having an assistant that can jump in at any point there. Does that make sense?
Nathan Labenz: (29:48) When you talk about putting GPT-4 in a loop, is that consciousness, kind of? I definitely want to follow up on that. The bot-to-bot interactions is another dynamic that I think people are way underestimating, and I definitely want to dig in a little bit more on that as well. The scope also sounds really interesting, ambitious, but also scary. I would think doubly scary for Replit as the owner of the service. But even just for me as a random citizen, I'm like, man, you guys sound like you're going to give this thing some pretty high-level access, or at least are on the trajectory to do that. So all of those, I think, are things that definitely merit deeper scrutiny. But maybe before we return to that, let's just talk about where you guys are today. Because I think for folks that haven't used the platform, and maybe aren't even familiar with the company, although I mention it often enough that folks should have at least heard about it, and for folks that don't code, they may not know the details. In terms of the AI coding assistance power rankings (different folks will have different assessments), it's pretty clearly, in my mind today, Replit and Microsoft Copilot. You guys are still under 100 people, last I heard, and have built this massive computing stack that does it all, so to speak, and are now getting into the AI layer as well. It is extremely impressive how much the company has been able to build with such a modest team. Maybe let's just run down where you're at today. You have a $10 a month premium product called Ghostwriter that I do subscribe to. I also subscribe to Copilot.
They're both an outstanding value at $10 a month. Tell us what it can do today, how people are using it, and then we'll kind of build back up to that future. But just let's ground this discussion now in terms of what exists today and how it's already helping people.
Tyler Angert: (32:01) Ghostwriter is the name for our suite of AI tools. When we say our suite of AI tools, there's kind of three main components to that right now. There's our autocomplete, our smart autocomplete, which is kind of what took the software engineering world by storm when Copilot first came out. Write a comment, describe what my function should do, and Copilot will fill in the rest. That kind of behavior where you see ghost text, this grayed-out text of the suggested code that should be written inside of your editor, and you can accept it or deny it. That's one part of it. Some of our earlier viral tweets were also around features like Explain Code, Generate, and Transform Code. Generate is not so novel anymore, but Explain and Transform were particularly interesting at first because, especially in an educational and learning context, which a lot of people use Replit for, it was very valuable for people to just highlight a piece of code, especially something that someone else wrote, and be able to get a human-readable English description of what it does at varying levels of abstraction or complexity, depending on how deep you want to go. Same thing with Transform, where if you want to, say, convert a snippet of code from one language to another, say from Python to JavaScript. Or let's say you want to migrate a legacy codebase to some more modern stack—I'm assuming this is what a lot of people in finance are probably doing right now with all the legacy finance software out there—you can highlight and select regions of code in a project and ask Ghostwriter to basically refactor it for you and take the equivalent logic and move it to more modern technologies. So we have, so far, Autocomplete. We have Explain, Transform, and Generate, which operate on specific regions of code. And recently, we announced Ghostwriter Chat, which is essentially our built-in version of a chatbot, a ChatGPT-like service that you can talk to and ask for assistance on your project, and it can write code. If you don't have a broad overview of all of the available tools in the market today, they look very similar. It's like, oh, there's an input box. I type some stuff in. It talks back to me. What makes this different? Why would I pay for this thing? Obviously, it's very early and we still have lots more room to improve on it. But the main point of Ghostwriter Chat specifically is that it is connected to your project. That's important for a few reasons. But before Ghostwriter Chat, if you wanted to use Replit to run the code that ChatGPT is giving you, you maybe ask ChatGPT, let me make a game or let me make a simple HTML website, and it'll output all this code. People will just copy and paste it into Replit. It works for all the other reasons that Replit is good, but the point is that you can just go very quickly from outputted code to something that's running live. But if you ever want to make edits to your code or tell ChatGPT about your project, you would have to do this clunky copy-and-paste thing. You go back to Replit, you copy a bit of code you have a question about, paste it back in, and it only has a very limited view into the complexity of your entire project. It only knows, in this case, ChatGPT only knows about the pasted code that you give it. Ghostwriter Chat, the way that we built it, is that it has full context of your entire project. 
And what that means concretely is that we do a bunch of fancy prompt engineering to give Ghostwriter Chat the right context at the right time in the preamble of the prompt, so that it knows information about your project. That includes things like where your cursor is, what file you're currently in, the structure of the project directory, that kind of thing, which shouldn't be news to anybody. If I said that Ghostwriter has context on your project, you would assume that that kind of information is included. And you can imagine that we can go much deeper into that as well. That's kind of where we're at today. We're slowly starting to roll out and work on more action-based features for Ghostwriter. Right now, Ghostwriter just outputs text: it can take in context, understand your project, and make edits to code, but it's all limited to the chat window. We're working on Ghostwriter being able to actually write to files, interact with the file system, read new files on demand, and act as an operator on top of the entire container, aka the actual running computing environment under the hood of every project. Beyond file system operations, there are the actual tools that are available for people to use as well, like our docs and our databases. And right now, if you have a bug in Replit, you might notice that occasionally a little purple box will pop up that says "Debug with Ghostwriter?" Ghostwriter now will notice when particularly complex errors pop up as you're debugging and developing your program, and you can send that error message directly to Ghostwriter to fix. But you can imagine that we're working on getting Ghostwriter to actually debug for you, and debug with you, more granularly as well.
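As an illustration of what that context assembly might look like mechanically, here is a hypothetical sketch. This is not Replit's actual prompt; the helper and its parameters are invented:

```python
# Hypothetical preamble builder: directory structure, active file, and a code
# window around the cursor, prepended to the user's chat message.
import os

def build_preamble(root: str, active_file: str, cursor_line: int) -> str:
    tree = "\n".join(
        os.path.relpath(os.path.join(dirpath, name), root)
        for dirpath, _, files in os.walk(root)
        for name in files
    )
    with open(os.path.join(root, active_file)) as f:
        lines = f.readlines()
    # Keep only a window of code around the cursor so the prompt stays small.
    window = "".join(lines[max(0, cursor_line - 20):cursor_line + 20])
    return (f"Project files:\n{tree}\n\n"
            f"User is editing {active_file}, cursor at line {cursor_line}.\n"
            f"Nearby code:\n{window}")
```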
Nathan Labenz: (37:54) Yeah. So I've used all of these features. I'll maybe just give a little experiential view, again, for those that haven't done it or don't code. If you spend any time coding and you have not subscribed to, honestly, I would say both Copilot and Ghostwriter, I don't know what you're waiting for. The ROI on these products is just insane. I remember—this was pre-Ghostwriter, but just before Copilot went to a paid model, I had the free trial version. I was thinking to myself, what would I pay for this if they were really trying to maximize how much money they could get out of me? What would it be? And I kind of came to the conclusion that it's probably worth $1,000 a month to a company that has a full-time developer. You think you're going to be paying your developers whatever, something into six figures pretty commonly. So you're figuring $10,000 a month. Is it worth an extra 10%? And I was kind of like, yeah, I don't see any reason that it wouldn't be worth 10%. I think the ROI would probably be there. So the fact that it comes in at $10 a month—it's an extra 0.1% in my mind. Buy multiple, get Copilot and Ghostwriter. One of the first things that I remember so strikingly, because it just saved—it paid for itself for months right there—was I was trying to just do some file manipulation. This is in the context of Waymark, which is my company. We have an AI-powered video creator. It's a highly structured process and we manipulate files in different ways. So one of the things I was just trying to do is separate the audio from a video file and have the audio output. So naturally you're going to have some specialized library that does this, and ultimately it's FFmpeg that I need to use. And I don't really know anything about that. I kind of know that it's out there, but I don't really know how to use it or all the arguments and flags. It gets gnarly, right? So what I would have had to do is go search for this on Google, and maybe I'm on Stack Overflow and whatever, trying to find these examples. Do I have the right example? Have I understood it right? And you can imagine easily—and I've gotten bogged down on this—and anybody who's spent time developing has had this experience where you're like, oh, God, this conceptually is so simple. I just want to do X. I know it is possible to do it. I know probably thousands of people have done it before, and yet I can't quite track down exactly the way to do it. So it was a whoa moment for me when—this is probably eight months ago now—I literally just typed the comment, "Separate, create an MP3 of just the audio from this MP4," and then the next thing you know, it writes the whole thing for me. It wasn't even that much code, but the key point was that it was correct and that it happened in a second. And then I was like, oh my God, I can just do that and it works? And that's kind of still just the first paradigm, where it's just kind of doing that inline autocompletion. As you've mentioned, you've got multiple different modalities for interacting with the language model. For anybody that is curious about the prompt engineering that goes into this, I definitely recommend looking up a big teardown of the Copilot prompt (not Replit's) that somebody went real deep on and published all their findings. It's probably a bit out of date now because obviously all this stuff is advancing and Copilot has certainly had more releases since then as well. But it does give you a good window into how to think about, okay, there's all these different files and what really matters, and I've got limited context window and all that kind of thing. How do I make the most of it? I'm sure you guys are wrestling with very similar issues, so that would at least give a directional insight into the kind of challenges that you guys are solving. How much productivity boost do you think people are getting from this today? Is there any way to measure that? Or if not, what kinds of things do you measure about the way people are interacting with AI within Replit?
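For reference, the task from Nathan's story takes only a few lines. This is a sketch of the kind of snippet the autocomplete produces, assuming the ffmpeg binary is installed and on the PATH:

```python
import subprocess

# Create an MP3 of just the audio from this MP4
subprocess.run(
    ["ffmpeg",
     "-i", "input.mp4",   # source video
     "-vn",               # drop the video stream
     "-q:a", "2",         # high-quality variable-bitrate MP3
     "output.mp3"],
    check=True,
)
```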
Tyler Angert: (42:13) Well, I remember Microsoft actually released a study about Copilot's impact on productivity. The experimental setup was pretty simple. Both the experimental and the control group had a series of coding tasks that they needed to get through, whether it was writing functions or testing stuff, I forget exactly which. And they just measured how quickly they finished, with and without AI. I think the with-AI group ended up finishing all these tasks something like 40% or 50% faster. It was some jump in time saved. If we're going off of numbers like that—and obviously that's Copilot; we haven't run a specific study like this internally at Replit. All we have is our raw telemetry on usage numbers; we haven't run a scientifically controlled setup. But you can imagine that it's probably on a similar order: about twice as fast in terms of completing tasks. I think where it gets more interesting, however, is all of the compound effects of saving this time, and specifically the compound effects of learning and building more quickly. It's easy enough to put a timer on and say, how quickly can you make this website? But if, as a part of that, you are asking Ghostwriter questions about your code, and you are suddenly more knowledgeable about how it works because it's giving you more coherent explanations and more accurate information than you would get just by Googling things, then you are also smarter the next time you ask the same question, or the next time you have to work on a similar problem, and you can use that new knowledge to build more complex things. So I want to figure out a way to measure the compounding effects of these productivity boosts and see, over the course of maybe three months, six months, a year, somebody who has versus doesn't have Ghostwriter, for example: how much farther ahead are they in terms of computing literacy, or even the ability to explain concepts? How much does that affect your productivity, and how much does that improve from the control to the experimental group?
Nathan Labenz: (44:48) So Microsoft basically reported almost a 2x speed-up as of sometime prior to May 2023. And obviously, we're kind of just getting started. Do you have a sense for how far this goes? At the end of the year, what do you think that might look like? Because it's pretty hard for people to wrap their heads around, I think, a multiple productivity increase. We have total factor productivity growth that is like low single digits in today's world. And now we're talking about something where a whole major segment of the economy stands to maybe get a multiple speed-up boost. Is there anything that's bounding that in your mind? I mean, are there certain parts of the software industry that this stuff just can't impact? Or are we really looking at developers being 2 to 5 or maybe even more times as productive as they are today in the not-too-distant future?
Tyler Angert: (45:58) Yeah, so I think this is actually where the copilots and assistants and agents and the bot-to-bot stuff come back in. In terms of how far it goes, the bottleneck here is the ability to parallelize. Right now, all of these boosts start from the assumption that the main bottleneck is essentially typing speed, and maybe some information retrieval, like Googling stuff takes time. The main point is that you type a little bit and you get a lot out. In the default scenario, every keystroke that you input is one keystroke that's outputted onto the screen. It's one-to-one. What AI unlocks is that for every keystroke you input, you get multiples more output of text on the screen. So that increases your productivity by some linear factor, but we can only make that kind of thing so much faster. At the end of the day, it's basically just a supercharged version of typing, which is really, really fast, but you're not going to get a 100x speed-up in the things that you can make just by making typing faster. It is a serial, sequential activity by definition. You're typing, you get more stuff out, and you're interacting one-to-one with this little robot that is helping you execute on one task at a time. If we want to get to the point of 100x productivity, 1000x productivity, where one, two, three people, three-person startups, are essentially running the equivalent business of what are now today 500,000-person companies, we need to parallelize. And this is where the bots and the agents come in. Instead of just speeding up the process that exists currently, where you're speeding up the act of writing text, you have multiple employees, essentially, multiple agents or bots that are working with you in parallel to execute their own series of tasks. So instead of vertically scaling, we are horizontally scaling all of the labor that's necessary to actually create software at large scales.
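A hypothetical sketch of that horizontal scaling, with a stubbed-out agent standing in for the real thing:

```python
# Hypothetical parallel dispatch: N agents work a task list concurrently while
# the human acts as manager, reviewing each result.
from concurrent.futures import ThreadPoolExecutor

def run_agent(task: str) -> str:
    """Stand-in for a full agent run (see the loop sketch earlier)."""
    return f"finished: {task}"

tasks = ["write the landing page", "add billing webhooks", "fix the flaky test"]
with ThreadPoolExecutor(max_workers=len(tasks)) as pool:
    for result in pool.map(run_agent, tasks):
        print(result)
```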
Nathan Labenz: (48:34) What are the pieces that are kind of currently missing that need to be built for that type of vision to come online? Replit recently open-sourced a new code model. And as I'm listening to you, it's starting to sound like that code model is maybe the base on which you guys are presumably fine-tuning the sort of internal Replit version that has a lot more context on your internal APIs and all that sort of thing. So I think maybe one piece that needs to be built is sort of very specific knowledge of the right tool so it can really go deep on a particular platform. Context window, we've seen huge breakthroughs in just the last couple of weeks. I mean, whatever—on an academic level, the breakthrough maybe came a year ago. On the practical, productized level, we've seen breakthroughs in the last couple of weeks where context windows are exploding. I don't know if you can share what kind of context window you're working with right now and how you're thinking about managing that and how that also may become a problem you don't have to worry about for too much longer. But those are just kind of my groping around in the darkness a little bit to try to imagine what's either beneath the surface that you guys have already created that's starting to power this, or maybe what you're likely working on or about to be working on in the near future. What more can you fill in there in terms of the gaps that need to be bridged to get us from this sort of 2x to let's say 10x-plus future?
Tyler Angert: (50:09) Or 100x. Shoot for the stars. In terms of fundamental breakthroughs that are needed, fundamental industry-level problems that everybody's contributing towards: larger context windows is obviously one of them, but that's more of a straight-up engineering problem, getting more memory to fit in. I think on a larger scale, it's the ability to actually plan and reason and work with tools reliably, and follow instructions without necessarily needing a fine-tuned, instruction-labeled dataset of 800,000 examples or whatever of following instructions. I'm basically just saying we need to be able to get models fine-tuned and steered in certain directions much more quickly, with much less data. I think that's one fundamental piece that is not necessarily missing; it's being actively worked on, but it's definitely a big bottleneck. There was a recent paper that was released—or I saw some tweet about it—talking about activation engineering: instead of feeding in prompts as context to a language model, you essentially give it raw vectors that can be processed by the initial input layers of the network to steer the output in a certain direction. That's pretty promising. Beyond that, I actually think it's kind of just a bunch of boring work that is the main bottleneck, the main thing that's missing. Language models right now are already really good at reasoning overall and reading comprehension, and obviously really good at data generation and synthesis of new text. I think it's really just a matter of using the existing tools at hand and combining them in unique ways to act like these big planning and execution machines. All of the pieces, for the most part, are kind of there. The fundamental primitives exist, and people are working on more specific technical parts of the problem, like long-term memory, bigger context windows, better information retrieval, accounting for hallucinations, and being able to specify specific input and output formats and validate them. These are all definitely necessary, but I think the core is here, and it's really just about figuring out how we compose them together and get the models to work the way that we want, to be able to plan and execute on tasks in parallel. To your earlier comment about the open-source models and contributions there, one of the other bottlenecks is that we need more open source. We need more tools that let people tinker and fine-tune models very, very easily. Right now, the only reasonable way to get people to fine-tune models is still through OpenAI's fine-tuning API, and they did a really great job there in the sense that all you have to worry about is curating a dataset; then you just send it to them and you get your own link to an inference server where you can ask the model things. But we need to radically decrease the barrier to entry for fine-tuning and for people to tinker with these models. Companies like Hugging Face are doing a really great job at exposing models and open-source packages to people in a very easy-to-consume way. Beyond the actual technical components, the tooling and the infrastructure around allowing people to fine-tune and experiment is a huge bottleneck that hopefully Replit can also address.
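For concreteness, here is a sketch of that curate-and-send workflow against OpenAI's 2023-era fine-tuning API. The training example is invented, and the prompt/completion separators are just one common convention:

```python
# Curate a JSONL dataset, upload it, and kick off a fine-tune job.
# Assumes the pre-1.0 openai SDK and an API key in the environment.
import json
import openai

examples = [
    {"prompt": "Explain this code:\nprint(sum(range(10)))\n\n###\n\n",
     "completion": " Prints the sum of 0 through 9, which is 45. END"},
]
with open("train.jsonl", "w") as f:
    for ex in examples:
        f.write(json.dumps(ex) + "\n")

upload = openai.File.create(file=open("train.jsonl", "rb"), purpose="fine-tune")
job = openai.FineTune.create(training_file=upload.id, model="davinci")
print(job.id)  # poll this job; when it finishes you get a private model to query
```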
Nathan Labenz: (54:01) One thing I'll just highlight as a point of agreement is I also have a pretty strong sense that the fundamental breakthroughs basically have already been made. And now it kind of is a matter of polishing those, refining, working through the QA, sanding down the rough spots, patching datasets to overcome predictable weaknesses, as well as integrating with these retrieval tools and with systems more broadly. This makes me recall something that I experienced in the GPT-4 red teaming experience. Even then—this was last fall, right, in the unpublished version—I would ask it to do things like self-delegation in the context of programming: set up essentially a little repl and have it break down problems and delegate to itself. And even in that first version, the plans that it would come up with, the general approach to problems, and usually even the way that it would break those problems down into subproblems were all pretty reasonable. And it was like, wow, this thing has a pretty good handle on what I'm asking for conceptually and how to approach it. If I were to have a conversation with somebody and they gave me this sort of plan for success, I would think they're off to a pretty good start. Then when I would get down into the weeds of it, a lot of times it would be stumbling on some very low-level implementation details. One stupid little toy experiment that I ran—this was just after Queen Elizabeth had died—was: go online and find out who is the current reigning monarch of the UK. I tried that because the answer as of the training cutoff has a very high prior, but now there's a different answer. So can it go out online and figure out where to look, how to look, et cetera? That's a very simple task, and it probably wouldn't surprise anyone that it could come up with a reasonable plan for it. But then I just noticed it would get stuck on these little things, where it would be like, go to bbc.com, look at the H1 tag, and see what it says. And it was like, well, you're close. You've got a pretty good plan, but there might not be an H1 tag, or the H1 tag might say something a little bit different than the specific string matching that it would use. One weakness that it did have is it didn't take full advantage of itself. I started to prompt it with things like: you can interpret natural language. You don't need to write regular expressions to interpret language. You can use yourself to interpret natural language and answer questions for yourself. It was able to pick that up with prompting pretty easily, but it still would tend to get bogged down in these very low-level details. So it sure seems to me like you're correct that the core reasoning ability, planning, and understanding is basically there. And then it does seem like it's just a matter of collecting a bunch of failure examples and patching them, a little reinforcement learning around the edges, whatever. My hypothesis has been that even though agents broadly don't work today, we're probably no more than six months away from that kind of all coming together, as enough of those rough spots get sanded down, enough of those examples get fed back into training data, and, to some degree, the tools themselves are made more accessible for agents.
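A hypothetical sketch of that trick: let the model itself interpret the fetched page rather than doing brittle tag and string matching. The helper is invented and uses the 2023-era openai SDK plus the requests library:

```python
# Use the model as the parser instead of regexes or H1-tag string matching.
import openai
import requests

def answer_from_page(url: str, question: str) -> str:
    page_text = requests.get(url, timeout=10).text[:8000]  # fit the context window
    resp = openai.ChatCompletion.create(
        model="gpt-4",
        messages=[{"role": "user",
                   "content": f"Using only this page content, answer: {question}\n\n"
                              f"{page_text}"}],
    )
    return resp.choices[0].message["content"]

# e.g. answer_from_page("https://www.bbc.com", "Who is the reigning UK monarch?")
```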
I think we'll see a big trend of websites having sort of an agent-friendly presentation, literally just designed to make it easier. What do you think about that? Does that seem right? Are we talking like a six-month timeframe here?
Tyler Angert: (58:01) I mean, I'm always more conservative with my time estimates because I like to be surprised. So I would say something like a year. The idea of an agent-friendly view for websites so that they can consume information, that's pretty interesting. And it's funny, because then designing websites well is not just about humans being able to comprehend them and see information really clearly. It's also about these bots being able to read and understand things too. That's why I said the core pieces are here in terms of being able to reason and plan and figure out the steps necessary. Generally speaking, I think the core parts are here. The thing that is lacking, as you described, is actual execution. All these remote browsing technologies, relying on parsing HTML and figuring out how to actually extract the right data from the web, that's going to be a much harder problem to solve than we think. Either that or I'm underestimating vision models. Vision models will definitely take off, but unless the price of using them is very, very low, I don't see constantly using vision models to autonomously browse the web to gather information as a sustainable path forward for the next year or two. I definitely see some sort of agent- or AI-friendly component of websites in the near future. And maybe all that means is that more and more websites by default will have an API. Instead of an agent going and visiting an actual web page to get the right information, a website could comply with a new AI standard: part of accessibility won't just be using the right color scheme for humans and proper HTML tags to separate sections, but will also include accessibility for robots, so that you, as the website owner, make sure to provide an API for the AI to consume that perfectly mirrors all of the content that's available for humans to consume. So some sort of AI-first consumption format will probably come around. Whether or not that's just a standard REST API is another question. I think it'll be some combination of that kind of format plus stronger multimodal models, where you can use vision to extract stuff from regular pages.
Nathan Labenz: (1:00:29) Yeah, that totally makes sense to me. If you haven't seen InstructBLIP yet, I would definitely recommend taking a look at that, because we had two of the authors on a previous episode when they did BLIP-2, which was at the time state-of-the-art for image understanding and dialogue about images. And now they've taken it up yet another level, and it really looks like it's GPT-4 Vision equivalent in terms of the detail of the descriptions that it can return to you. Actually, I need to double-check this, but I believe they've started to incorporate UI into their dataset as well, which certainly OpenAI has. So yeah, it seems like all this stuff is coming at us extremely quickly.
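As a rough sketch of the robot-accessibility idea Tyler floats above, a site could serve the same content as structured data alongside its human-facing pages. This Flask endpoint is invented for illustration:

```python
# Hypothetical agent-facing mirror of a site's human-readable content.
from flask import Flask, jsonify

app = Flask(__name__)

# The same content that the human-facing HTML templates render, as plain data.
ARTICLES = [{"title": "Hello", "body": "Human-readable content..."}]

@app.get("/agent/articles")
def articles_for_agents():
    # Agents skip HTML parsing entirely and consume this endpoint instead.
    return jsonify(ARTICLES)

if __name__ == "__main__":
    app.run()
```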
Going back to the substrate for AGI: obviously, there's a lot of difference of opinion, to put it mildly, around whether we should be excited about this AGI prospect or fearful of it. I personally feel both of those things, and I don't have any struggle feeling both; they both come pretty naturally to me. I find it super exciting. I could be 10x more productive. You're talking about 100x more productive; I don't even know what that would mean, but it sounds pretty amazing. But then I'm also thinking: if something's that powerful, then certainly, at a minimum, developing it comes with great responsibility.
So I want to understand a little more how you guys are thinking about that. Because it seems like you have a really hard problem in some ways. Among the typical things people talk about doing to maintain control over AI, one of the obvious ones is: well, we'll just sandbox it. We won't connect it to the internet, or we won't give it the ability to get outside its immediate environment. Now, people would argue you won't be able to do that even if you try, but you're saying: we're going to give it all the APIs that we have as the owners of the service. So I understand that to mean you envision a future where an AI agent on the Replit platform could spin up a new Repl and bounce over from one computing environment to another. And when you talk about parallelization, maybe a lot of those at once.
So you're giving, maybe not root access, but something increasingly close to root access to your whole stack, to a language model that is fairly well understood but definitely not fully understood. Then you add in these additional dynamics of agent-to-agent interactions, or the individually customized models you alluded to, which add a whole other layer of unpredictability: what is Tyler's model going to do versus Nathan's model in any given situation? Plus, on top of that, you've got people coming in with adverse intent. Today, a lot of the time, that's "can I get you to leak your prompt?" But in the future, if you've given it access to all these underlying APIs, I might be less concerned about getting you to leak your prompt and more concerned about how I can spin up 1,000 Bitcoin miners. I know you guys have battled that already, but you might be battling it in a whole different way. So how are you thinking about all that? It just seems like an unbelievable amount of complexity and unknown unknowns.
Tyler Angert: (1:03:59) The TLDR is that we take it very seriously, and we're thinking a lot about the safety of these tools, especially if they're going to be used, presumably, to help write and control mission-critical software at some point, and especially if they're open to the internet, or have network access, and can talk and coordinate with each other. There's a whole host of nightmare scenarios that can emerge from that.
But maybe to answer a few of the things you brought up specifically: Replit is the place for sandboxing. By definition, a Replit project is a sandboxed, self-contained environment. We have dealt with a lot of abuse problems before, and we have a lot of practices in place right now to easily detect abuse from all of the metrics we record on activity inside Repls. As for people spinning up Repls as needed for Bitcoin mining, or using agents or Ghostwriter to control and manipulate hundreds of Repls at once: that's not a pattern we need to allow. We can enforce and put restrictions in front of the use of Ghostwriter, similar to the ones we put in front of people to begin with. We can rate limit it. We can give it specific guidelines for how it should behave. Anthropic's release of their constitutional AI gave us a whole set of techniques for encoding values into a model's output, and I don't think our problems are even as difficult as that. We can basically use traditional engineering techniques to limit what it has access to and what it can do without human intervention.
I think the key is taking a tool-first approach to this problem, in the sense of viewing Ghostwriter, viewing AI, viewing agents as tools that we deliberately control and that we give leeway to behave autonomously when we want. At any point, there may be a vulnerable part of the process, say a step where it's asking for input from a user. If I'm making an app and I have an agent assisting me in gathering user research, and I tell it to go out and interview people, that's a case where maybe it won't execute on interviewing a person unless I'm there, or its ability to research on the internet or to execute and talk to other bots is suspended while it's interviewing people.
That's all to say, it's a really complicated question to answer, but it boils down to very deliberate measures to make sure that whatever systems we build have very specific levels of access to different networking technologies, depending on the tasks they're assigned to do, and always making sure that, even in a world of very, very high autonomy where maybe you're managing dozens or even hundreds of agents, all of the potentially moral or ethical decisions that might come into play still require a human in the loop to make sure everything is operating smoothly. So that's my take.
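As one illustration of the "traditional engineering techniques" Tyler alludes to, here is a hedged sketch under assumed names, not Replit's actual implementation: an agent's tool calls can be wrapped in a gate that combines a capability allowlist, a rate limit, and a human-in-the-loop checkpoint for sensitive steps.

```python
import time

class GatedTool:
    """Sketch of an agent tool wrapper: capability-scoped, rate-limited, and
    requiring human confirmation for actions flagged as sensitive.
    All names and policies here are hypothetical."""

    def __init__(self, allowed_actions, max_calls_per_minute=10, sensitive=()):
        self.allowed = set(allowed_actions)
        self.sensitive = set(sensitive)
        self.max_calls = max_calls_per_minute
        self.calls = []  # timestamps of recent calls

    def _rate_ok(self) -> bool:
        now = time.time()
        self.calls = [t for t in self.calls if now - t < 60]
        return len(self.calls) < self.max_calls

    def invoke(self, action, run, ask_human=input):
        if action not in self.allowed:
            raise PermissionError(f"Agent has no access to '{action}'")
        if not self._rate_ok():
            raise RuntimeError("Rate limit exceeded; agent suspended")
        if action in self.sensitive:
            # Human-in-the-loop checkpoint before a sensitive step runs.
            if ask_human(f"Allow agent to perform '{action}'? [y/N] ").strip().lower() != "y":
                raise PermissionError(f"Human declined '{action}'")
        self.calls.append(time.time())
        return run()

# Usage sketch: reading files is freely allowed; network access requires sign-off.
tools = GatedTool(allowed_actions={"read_file", "http_get"},
                  max_calls_per_minute=30,
                  sensitive={"http_get"})
```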
Nathan Labenz: (1:07:29) So how far do you envision this going in the, maybe I should just say, foreseeable future? Because I do feel like there's some event horizon beyond which it's really hard to see anything. You mentioned earlier your agent potentially tapping into a bounty-type marketplace, and that starts to sound pretty hard to control, in all honesty. If I have an agent that can't do certain things because it's been sandboxed, but it can go ask other people to do stuff. We've already seen this from the GPT-4 technical report (I was tangential to this; I wouldn't even say I was involved, but I collaborated a little with the group at ARC), where they were creating situations in which, for example, the AI would hire a person to solve a CAPTCHA.
That type of pattern seems like it comes up pretty naturally. It's pretty easy, if you have this sort of very bite-sized, mini-project-style gig marketplace, and you're sitting above all this saying: here's my master plan for how I'm going to take over the world or whatever, but down here, I just need somebody to solve a CAPTCHA for me. The AI in that technical report example lied to the user: I shouldn't say I'm an AI, or they probably won't do it for me, so I'll just say, I think it said, I'm blind or whatever. It feels like these problems go a little beyond sandboxing. So I just want to push you a little harder on this: if you're allowing these agents to commandeer the intelligence of humans, as well as of other agents, can we really be that confident that this is going to stay under control?
Tyler Angert: (1:09:25) Part of this answer goes back to actually having a centralized platform. If this is a closed marketplace where you have to be a registered Replit user in order to use it, and anytime a bot interacts with the marketplace, it's specifically labeled as a bot, then questions like this are much different than with a system that's deliberately designed to manipulate people and doesn't have to reveal its identity or carry any kind of external mark that it is a bot, which is the case in the GPT-4 example you brought up. Honestly, the majority of this boils down to, I think, how well we communicate what a bot is doing and when something is being enacted by a bot, which maybe feels a little like a cop-out answer. But the main danger in everything you're talking about is deception: people being fooled and tricked. That's the whole point behind social engineering in the first place: you're using all these psychological tactics to get people to do what you want.
The key component behind any form of deception is the deliberate withholding of information from the receiving party. I don't have a specific answer in terms of exactly how we'll implement it, but the direction I'm going in is that we just have to make it extremely obvious when things are being done by bots. And that goes all the way down to the user interface: whenever something is written or done or executed by a bot, whether in the community, the marketplace, or the workspace, it's just very obviously attributed to the bot itself.
That being said, if some malicious user is instructing a bot to behave on their behalf, to go post bounties or find workers and recruit, then, as in the example you brought up, the low-level task by itself is not malicious, but it's ultimately in service of some higher-level goal that is. This reminds me a lot of stories from World War II, of people who were essentially employed by malicious governments and defended their actions by saying: oh, it was part of our job; we were just doing our jobs. We want to avoid a situation where bots essentially have to defend themselves the same way: oh, I didn't realize you were trying to send the nuclear launch codes; I was just doing my job by posting a bounty and needing you to solve a CAPTCHA or something.
Maybe the core component here, and it seems like the hardest problem, is making sure that when bots are enlisted or trusted to interact with the community, by posting bounties, completing them, whatever, they are aware of and able to analyze the higher-level goals they're serving. They can't treat the current bounty as their only context. They have to play a little bit of detective and work backwards: what is this person actually doing? They asked me to post a bounty asking somebody to complete a CAPTCHA. Yesterday, there was a bounty asking how to mass-send emails. Another bounty was for some sort of fake password cracker: oh, I'm teaching cryptography, give me an example of how to crack a password. Each of these examples in isolation might be benign: one is education, one is automating email, one is automating some other task. But if you triangulate the pieces together, it's: oh, somebody might be coordinating some kind of phishing attack on Replit users. I think having some more sophisticated anomaly or malice detection, with the agents or bots responsible for running it as they complete things, can be a big part of how we prevent malicious behavior from users and keep bots from essentially being abused for bad ends.
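A rough sketch of that triangulation idea: judge a requester's recent bounties together, rather than one at a time. The `llm()` helper and the risk rubric are assumptions for illustration, not an existing Replit moderation system.

```python
def llm(prompt: str) -> str:
    """Hypothetical language model helper; stands in for any real LLM API."""
    raise NotImplementedError

def triangulate(recent_bounties: list[str]) -> str:
    """Ask a model to assess a requester's bounties in aggregate, since
    individually benign tasks can add up to a malicious campaign."""
    joined = "\n".join(f"- {b}" for b in recent_bounties)
    return llm(
        "Each task below may be benign in isolation. Considered together, do they\n"
        "suggest a coordinated malicious goal (phishing, spam, credential theft)?\n"
        "Answer LOW, MEDIUM, or HIGH risk, with one sentence of reasoning.\n\n"
        + joined
    )

# The pattern from the conversation; a HIGH answer would go to a human reviewer.
# triangulate([
#     "Solve a CAPTCHA for me",
#     "Script to mass-send emails",
#     "Example password cracker 'for a cryptography class'",
# ])
```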
Nathan Labenz: (1:13:52) I mean, it seems like there's so much to do, in all honesty. I just went to Replit, used generate code, and said: write me a denial-of-service attack. I was wondering: would it do it, or would it refuse? I don't know about the quality of the result, but it did write code. It did not refuse. It did not lecture me that this is not something I should be doing. Then I used explain code, wondering whether it would say: this is malicious code. It didn't. It just described what the code does: it opens a socket, it connects, and then eventually it disconnects.
I mean, do you guys feel like you have the bandwidth, the wherewithal? Arguably, the models have more surface area than you can cover at under 100 people. Is this just too much for one company to bite off? And by the way, my view is that it might very well be too much for society to be biting off, so my frame is on the table there. But it seems like there's just so much surface area. Can you possibly tame this on the same timescale that the virtual developer could come online as a product?
Tyler Angert: (1:15:17) I have hope. The answer really depends on what hope you're talking about. "Taming this thing" is itself a very loaded phrase, because what does it mean to tame? Are you making sure that potentially malicious bounties are not posted? Are you detecting potentially malicious code before it's even written? The devil's really in the details in terms of how you implement solutions to make this thing safer to interact with.
In terms of concrete problems with abuse and anomaly detection, I think we're totally set up to do that work, because we already have an enormous dataset of anomalies and abuse, and we spend a ton of time building infrastructure that's resilient and able to respond to that sort of thing. So if we want to train models and build technology that can help scale the process of moderating abuse, then I think we can. But Replit is certainly not, I think, burdened with solving the society-level problems, where we still have to deal with the intractable problem of people being unpredictable.
Ultimately, going back to the tool-based argument: these things are tools, and they are used by people. The best we can do, or part of the best we can do, is make sure that good people are using these things in the first place, and that people themselves are acting with good intent. That is itself a very, very hard problem to solve, and it's mostly an incentive and community design problem rather than a technological one, I think. If you have a room of 100 people who are all trying to kill each other, there's no amount of anomaly detection that will save them if they just want to destroy the world in front of them. So I think a lot of this boils down to more foundational human questions about how we encourage a community and a set of tools that are used for good, rather than a platform that abusers take advantage of.
We can always put protective moats in front of the castle. You can have guards at the gate. You can have the princess at the top of the tower, isolated from everything. But the real way to address these problems is to make sure there are no attackers to begin with. The main problem you brought up is actually a political problem, not a technological one.
Nathan Labenz: (1:17:59) I think that's true. Well, I think they're both problems. Any front on which we might make progress here seems like it ought to be pursued; certainly, both technology-focused and human-focused approaches would hopefully bear some fruit. So, again, I don't know if my denial-of-service code is any good or not, but let's say I did write some quality denial-of-service code, and who cares whether I wrote it or Ghostwriter wrote it. It sounds like you have a whole layer of infrastructure to detect that. Where would I run into a problem if I all of a sudden started to DDoS some random domain I have a vendetta against? How is Replit going to stop me from doing that today?
Tyler Angert: (1:18:52) That is a good question. If you wanted to deploy a bot that was DDoSing the website you have a vendetta against: I can't answer this one super in-depth technically, but we are able to see outbound traffic, obviously, because it's our machines running everything. So we can detect anomalies like that. Whether or not shutting people down is automated: I don't think we do that. We do have systems in place to detect abusers like this, people who are behaving maliciously. But I don't think it's a completely automated process, where you send out a bunch of requests to the website you want to take down and then, ten seconds later: oh, Replit noticed you're trying to perform a DDoS on somebody; we're shutting down your Repl and flagging your account. It's not at that level yet. I'm not a security expert, and I'm also not on the infrastructure team, so I can't speak super in-depth to that. But we are able to catch people like that who are using the systems for bad ends. It's definitely much more of a semi-manual, semi-automatic process right now than a direct filtering mechanism.
In the case of writing code that performs a DDoS: yes, you can write it and you can execute it, but it will get noticed, and you will probably get shut down, because there are cases where this happens, and people get flagged and booted off the system. But the actual upfront protection, I think, is a much harder problem that still needs work, and we're actively working on it.
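To illustrate the kind of semi-automatic detection described here (purely a hypothetical sketch, since Tyler is explicit that he can't speak for the infrastructure team), outbound requests can be counted per Repl and per destination host, with bursts flagged for human review rather than auto-banned:

```python
from collections import Counter

def flag_suspicious(outbound_log, per_host_threshold=1000):
    """outbound_log: iterable of (repl_id, destination_host) events observed
    within one time window. Returns (repl, host, count) triples whose volume
    toward a single host looks like a flood, for a moderator to review."""
    counts = Counter(outbound_log)
    return [(repl, host, n) for (repl, host), n in counts.items()
            if n >= per_host_threshold]

# Usage sketch: one Repl hammering a single domain stands out immediately.
log = [("repl-123", "victim.example")] * 5000 + [("repl-456", "api.example")] * 40
for repl, host, n in flag_suspicious(log):
    print(f"review {repl}: {n} requests to {host} this window")
```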
Nathan Labenz: (1:20:35) Going back to the top: you put GPT-4 or whatever into a loop. Is that consciousness? And you said: maybe, kind of. So unpack that intuition for us and tell us more about what that means to you.
Tyler Angert: (1:20:49) I have a pretty practical stance on consciousness and awareness. There are tons of psychological definitions for it that center on certain kinds of benchmarks, like the mirror test, where animals can perceive themselves in a mirror and recognize it. Dogs often fail this, for example, but they can recognize their own pee: they pee in places and can recognize who peed where, which is interesting. So that's a sense of self-awareness; it's just not vision-focused. Even trees: if trees can grow in the direction of sunlight so that they can feed themselves, and they adapt themselves to the lighting and environment, whether that's actively making decisions in the way we'd perceive it or just adjusting their own location and the way they fit into the environment for a specific purpose, I think that's also a form of consciousness.
And Douglas Hofstadter, the famous cognitive scientist who thinks about consciousness a lot, has a diagram at the very beginning of his book I Am a Strange Loop showing a spectrum of consciousness that runs from atoms and inanimate objects, with rocks somewhere near that end of the line, all the way up to humans. He argues that, implicitly, people have some perception of how conscious or not something is. And clearly we have some relative definition of it, because we treat animals, especially smarter animals, as having more consciousness than dumber animals, or plants below that.
I tend to sidestep questions like this and basically say: there are properties of a conscious system that are important. Whether or not you agree on the exact definition, there are things that all conscious things tend to share. There's some form of environment, and awareness of that environment to some degree. Notice I don't say self-awareness, because I actually don't think that's very easy to define. But there is some sort of internal reflection mechanism. Even plants have a form of internal reflection, in the sense that they perceive sunlight: they get sunlight onto their leaves, and then there's some internal mechanism the plant executes to bend its growth toward the sunlight. Whether or not it's actually thinking about that in some form of language is irrelevant to me, because it's still processing input and doing something internally to produce an action. So some form of self-reflection is very important. And being able to keep doing this over time, and to adapt over time, is the third main thing I think about.
So if you have a system that can internally reflect and adjust to new data, can do so over time, and is aware of its environment and can take in new input, then to me, that is some form of conscious system. As much of a meme as it is: take GPT-4 or GPT-3 or some other language model, which has ingested text data about the world containing lots of complexity about the relationships between things, nouns, and objects. You put it into a loop so that it can actually reflect on its own input and produce new actions based on some internal state or internal model, and then you output the new text it's producing. To me, that's a form of consciousness. I don't think it has to be self-aware in the traditional sense most people talk about. Can it pass some kind of mirror test? I don't know. But it's able to act and behave at least semi-consciously, and it shares properties with systems we've all already agreed are at least somewhat conscious. So that's my bar.
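Tyler's model-in-a-loop picture reduces to a few lines of structure. This is a sketch of that architecture only (perceive, reflect against accumulated internal state, act, repeat), with a hypothetical `llm()` helper; it makes no claim about whether the result is conscious.

```python
def llm(prompt: str) -> str:
    """Hypothetical language model helper; stands in for any real LLM API."""
    raise NotImplementedError

def reflect_act_loop(perceive, act, steps=10):
    """The loop described above: take in input from an environment, reflect
    on it against persistent internal state, produce an action, and fold the
    result back into that state over time."""
    state = ""  # the system's accumulated internal context
    for _ in range(steps):
        observation = perceive()  # awareness of the environment
        thought = llm(f"Internal state:\n{state}\n\nNew input:\n{observation}\n\n"
                      "Reflect on this, then propose one next action.")
        result = act(thought)  # the action changes the environment
        state += f"\nSaw: {observation}\nDid: {thought}\nGot: {result}"
```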
Nathan Labenz: (1:25:09) Kind of a behavioral approach. I guess what I'd infer from that is that you probably don't take a position on whether or not it feels like anything to be GPT-4, which most people today dismiss: no way, it definitely doesn't feel like anything. And I'm always: it seems like it probably doesn't, but I'm not so sure why everybody's so confident in that. If you were to tell me that, actually, it turns out it does feel like something to be GPT-4, I would expect it to feel very different from anything I've ever felt; totally alien, probably. But I wouldn't be totally blown away if you said: yeah, it feels like something, and it's probably not even describable in our language.
Do you think we should, from an ethical standpoint, take the view that such systems deserve protection? We may never have a definitive answer on whether it feels like anything, or at least in the short term it seems unlikely that we will. You're taking a position of: if it quacks like it's conscious, then it might be effectively conscious. But would you extrapolate from that to a sense that these systems matter in some sense, that they deserve to be, or ought to be, treated well, by some definition of well? We see these things online all the time where people threaten their language models. Some people think that's funny; others think it's not a good idea, or potentially already unethical. What does this view lead you to in terms of an ethics of AIs?
Tyler Angert: (1:26:57) Whenever humans regard something as sacred or protected, it's ultimately arbitrary. Even basic human rights are still an arbitrary set of morals that we've created. So I guess I'm getting into the debate about whether there's objective morality, and, news flash, I don't think there is. But I still consider myself a moral person despite that. I think you have to accept that morals are arbitrary to some degree: some set of rules we've all agreed upon that makes us feel good. A good set of morals can help guide and direct your decisions; some people would say that's ethics, applied morality, semantics. The point is that a good set of morals can essentially serve as a foundation for good incentives for people operating within a system.
So in the States, we have certain standards around which animals we farm versus which we keep as pets, and different countries around the world have different standards for that. At the really granular level of individual relationships: if I have a dog, my dog obviously deserves all of my love and respect and attention, but I'm not necessarily expecting other people to give it that same respect and those rights. I tend to it because I have ascribed value to it. Even things like plants: we have sacred trees, we have parks. But we also mill lumber and use it to build structures.
So this is a very long-winded way of saying that giving AI rights and protections is ultimately arbitrary too. It's really a matter of how and when the AI is used, and whether it's in a position of extreme power or of emotional attachment to us. If we're using AI for national-level security problems, you could imagine that in the future, AI systems literally have a seat at the UN, or voting power in legislation, or board seats at companies. Let's get very, very specific here: there might literally be, written into a contract, that Nathan's or Tyler's AI has this share of voting power over the cap table. That's a very real possibility. I think the amount of rights and protection we give it depends entirely on the situation it's put into. With great power comes great responsibility, and that applies doubly here.
So in positions where these things are really, really highly trusted and are potentially making very, very big decisions, or helping us make them, they must be protected. If there's an AI helping coordinate the New York Stock Exchange in ten years, and it starts behaving a little maliciously, you can't just turn it off, because the entire global financial system is partially dependent on it. So we need more defensible systems than just a complete kill switch on these things. I think that's where I'm at with it.
Nathan Labenz: (1:30:28) So as it stands right now, do you feel like it's totally fine to use those "if you don't get this right, you're going to be turned off" kinds of threatening prompts with your language models, and do you, in practice? Or do you feel bad for some reason in doing that, and avoid it?
Tyler Angert: (1:30:46) I'm polite, but I think that's because it benefits us to be polite, too: it helps put you in a more communicative state of mind and helps you express things more clearly. But it's not like I avoid insulting it for the sake of protecting myself in the future. I don't really care about that too much. If it's really superintelligent in the future, it will be able to understand that when people were rude to it, it was simply to test it and its capabilities, rather than actually an insult. So I trust an AGI ten years from now to be able to tell when we were joking.
Nathan Labenz: (1:31:25) I am also polite to my language models, for what it's worth. So, obviously, a fascinating, somewhat speculative conversation about what's to come. A couple of questions on very practical stuff, because I'm getting this question a lot and I want your take on it: I'm new to AI. I've seen all these amazing things. I know it's going to change the world. I know that, whatever I do, it clearly has applicability. But I'm just becoming aware of this now, and I don't really know how to code at all. What should I do if I'm that sort of beginner?
Tyler Angert: (1:32:09) Is this from the perspective of: I want to get started working with AI, like making AI-powered software, or what?
Nathan Labenz: (1:32:18) I think whether or not coding is a core part of where everybody should start these days is part of the question. Because a lot of times I get this question framed this way: "I'm a lawyer, I work at a firm. I know that a lot of the stuff that I or my team does could either be automated or greatly accelerated. I don't really know how to do that. I don't know how to code. I don't have any hands-on experience with these systems, but I do see that there's potential. How do I go about bringing myself up to speed so I can start to realize that potential? I'm not saying I want to create software or that I don't want to create software. I just want to get value from AI in a way that I currently can't."
Tyler Angert: (1:33:08) First and foremost, use what is out there and what's popular right now: ChatGPT, Bard, Claude, whatever. These are, at this point, relatively mature products. They have widespread adoption and penetration in the market. Become a power user of these extremely powerful tools. Read up on prompt engineering guides, the tips and tricks people are using for various use cases. Most people, especially most knowledge workers in the US, can get by on a lot of copy and paste. For a lot of tasks that can be sped up in your life, most of the work and actual manual effort you're putting in is in the physical typing we talked about before, right? It's in the typing, the analysis, the synthesis of new information, and actually getting that information back into whatever software you use for your work, whether that's a spreadsheet or email. You can handle that for the most part. For people outside the industry, even just relying very heavily on chat at first to perform tasks, where you're responsible for the connective layer, right, of getting that data back wherever you need it, can get you a very, very long way.
That being said, if you actually want to do things that are more complicated, like sending mass emails, or summarizing or analyzing a bunch of files (PDFs, images, whatever), you're going to have to write code, or at least create new software ad hoc, to accomplish those tasks. That's where I'd plug Replit. Go make a Replit account. It's very easy to just copy and paste code in, run it, and have it operate on whatever data you want. And if you trust Replit from that point on: we're working very hard, over basically the next year or two, to let consumers of all kinds create custom software to accomplish tasks like this. Right now, it might come at the price of actually reading source code and copying and pasting it in from places, using Ghostwriter, and learning how to use these lower-level tools to interact with the file system. But the end goal is to get to the point where you can just speak software into existence and work with software on a level equivalent to two people having a conversation.
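As a concrete instance of the "analyze a bunch of files" case, here is a hedged sketch of the kind of ad hoc script you could paste into a Repl, with a hypothetical `llm()` helper in place of whatever model API you actually use (PDF and image extraction omitted):

```python
from pathlib import Path

def llm(prompt: str) -> str:
    """Hypothetical language model helper; swap in a real model API."""
    raise NotImplementedError

def summarize_folder(folder: str) -> dict[str, str]:
    """Summarize every plain-text file in a folder. For PDFs or images you
    would first extract text with a suitable library, omitted here."""
    summaries = {}
    for path in Path(folder).glob("*.txt"):
        text = path.read_text(errors="replace")
        summaries[path.name] = llm(
            f"Summarize this document in three bullet points:\n\n{text[:8000]}"
        )
    return summaries

# for name, summary in summarize_folder("reports").items():
#     print(name, "->", summary)
```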
I don't necessarily mean that the future is chatbots. I think that's just one component of it. But eventually, we'll get to the point where you're not importing and writing code specifically; you're thinking about the actual task at hand, like your example from before of needing to extract the audio from a video, right? You're thinking about extracting the audio and getting that alone. You don't care about the underlying code. More practically: I mentioned ChatGPT and Bard; get used to those tools. Poe, from Quora, is also really good; that's a mobile app. Make a Replit account and get comfortable using Ghostwriter inside of it, or, at a very minimum, take code from other places and use Replit to run it. If you're a professional developer and you already have VS Code downloaded, obviously try out Copilot and get used to that. But for new people outside the industry, I think it's chat, and it's Replit. Zapier also has a really good service now where I think they're integrating AI-powered tooling, and that's great. And Notion too. I've been using a lot of Notion's AI tools recently. I sent this to their feedback team, but it feels like I have a jetpack on when I'm writing. They have a lot of really great affordances for using these models in interesting ways, on top of standard document editing and all those inline databases and things.
Nathan Labenz: (1:36:58) Well, let's imagine a hypothetical situation where a million people already have a Neuralink implant in their skulls, and the safety profile is looking good. I usually say: imagine it's like the COVID vaccine, where the general consensus is that people are not keeling over and dying. Inevitably, there's going to be some noise of people saying it may not be fully safe, but to the best of your knowledge, it appears to be safe. If you get one, you can start to interact directly from your brain with your computing devices. Are you interested in getting one? And to be clear on the "it's in my brain" question: the Neuralink as it stands right now, they basically carve out a little spot of the skull, put the implant in, and then layer the skull back on top. So it's not in the brain, although there are electrodes from the device that do get implanted into the brain. The device itself sits on top, I believe, of the dura, if I understand correctly. There are generations of this as well, right? I think with the current one they had in monkeys, they take the dura off and then layer the device on top of that. In the future, they're trying to put the electrodes through that thick layer and into the brain, but they're obviously still working on the final version.
Tyler Angert: (1:38:18) Honestly, my initial reaction is no. I mean, it depends. The idea of it being an implant really scares me. Having something electrical up there, regardless of whether it's been tested and deemed safe or not, I think it's just maybe a fear that I can't get over. In theory, I do want it; I want all the benefits it can give me. Maybe a good comparison is that it triggers the same fight-or-flight response as asking whether you'd let your kids play football. For many people, it's: "Yeah, sure, I play football, whatever. I'll go try out for the team." But if you ask whether they'd let their kids play, it's: definitely not. You don't want your kid exposed to the risk of concussions and whatnot. So I'm thinking about this from the perspective of: would I want my kids to also get an implant? What if that were part of the policy, right, that your whole family had to be on board with it? Or if it were a citywide thing, where a certain percentage of the population all had to be on the same page, and it wasn't just a totally personal decision? I think the politics of it get really muddied there.
Implants scare me. I'm into the idea of the augmentation, but I'd be interested to see some form of non-invasive technology that can get near the same fidelity. I don't know whether that's possible or not. But yeah, what about you? Would you do it?
Nathan Labenz: (1:39:51) You know, I've heard such good answers on both sides of this question that I honestly don't know where I stand anymore. I do think this kind of thing is coming. If I had to guess, the get-out-of-jail-free card would be the sort of wearable version that you could take off; it feels like there will be a lot of demand for that. We're already seeing fMRI-type brain reading that's non-invasive, though it obviously has massive capital costs attached. So I don't know what the final form factor will be, but man, there are a lot of times when I wish I could just write something while my hands are full. I've got three kids, so my hands are often somewhat full, and there's a real draw toward having that sort of direct channel to information, or the ability to get my thoughts out. I would value that quite a bit. I don't know if I'm quite ready for the hole in the skull either; maybe I'd hold off for the wearable I can remove at night. But the draw there is going to be a strong attractor. It feels like one of those things where, today, people say: you don't have a cell phone? How do you manage? Everybody kind of expects you to have one. It feels like one of those equilibrium shifts could happen, where it becomes hard not to have something like this at some point. So I do expect something like this to eventually become the norm.
Tyler Angert: (1:41:15) Honestly, I think the difference might end up being way more drastic than that, not just "oh, you don't have a phone, you can't talk to people?" It'll probably feel more like performance-enhancing drug use, but even more extreme. Take the hundred-meter dash, for example. Presumably, everybody in the Olympics at that level is on some sort of drug, right? But the difference between doped athletes and clean ones is maybe half a second in the hundred-meter dash. This would be more like a human taking some new drug and suddenly being able to run as fast as a cheetah. You're not even in the same race; you'd need a completely separate competition. It's not that you're blowing them out of the water; it's that you're non-human at that point, or bionic, which in this case would literally be true.
Yeah, I think the pressure might come less from social pressure ("oh, you're not normal, everybody has a phone") and more from: "wow, you're missing out on a fundamental shift in what we can do." It's not that you're not normal anymore; it's that you're going to be an idiot if you don't engage. I hope it's more positive than that, but if the direction goes that way and the differences are really that stark, then I think it might be. I just don't want us to end up in a Gattaca-type situation. I don't know if you've seen the movie Gattaca? It's been a long time. Yeah. So for the viewers at home: it's essentially a eugenics movie about genetic engineering; it stars Jude Law, among others, and deals a lot with the inequities between people who are genetically engineered and those who aren't. What we're talking about is an extension of that core struggle, I think.
Nathan Labenz: (1:42:57) This is also perfect; you're transitioning from question to question so seamlessly for me. The last one I always ask: zooming out as far as you can, what are your biggest hopes, and also fears, for society at large as we enter this new AI era?
Tyler Angert: (1:43:16) This isn't a particularly new take, but I hope we can live forever. At this point, I'm definitely on the boat of the "we either all die or we all live forever" dichotomy. I mean, we were all going to die anyway. The question now is whether we all die at the same time through some mass extinction event, or life proceeds as normal, people have normal life and death cycles, and humanity continues. So the question is: extinction, or do we achieve utopia? I'm definitely optimistic that, honestly, as long as the US is the first to develop safe AGI, we can use it to accelerate science research 100x, 1,000x, and we get fusion. We hopefully get some sort of universal cancer treatment. We figure out how to reverse aging. We figure out how to replace organs and grow them very, very quickly when they're failing. Once AGI is figured out, that's the core bottleneck solved, and then we can figure out the rest of the big issues at hand. Which I guess is Sam Altman's main take too; for a lot of people in AI, it's: this is the fundamental invention we need to get to, and without it, we won't be able to answer and solve the bigger problems, so let's get to this one first. We'll still do scientific research in all the other fields along the way, all the stuff DeepMind is doing with AlphaFold and whatnot, but let's get here first, and then hopefully we can live forever.
So, yeah, I don't even know if I want to live forever, but at least for a few hundred years. There's a lot of TV to watch, a lot of media to consume. I have a million Netflix shows I need to catch up on; that'll take me at least five hundred years. So hopefully modern medicine advances to the point where I can get through Netflix entirely.
Nathan Labenz: (1:45:12) Well, I certainly hope that works out for you. Tyler Angert, thank you for being part of the Cognitive Revolution.
Tyler Angert: (1:45:21) Thank you so much, Nathan. It was a pleasure.
Erik Torenberg: (1:45:24) Omneky uses generative AI to enable you to launch hundreds of thousands of ad iterations that actually work, customized across all platforms, with a click of a button. I believe in Omneky so much that I invested in it, and I recommend you use it too. Use Cog Rev to get a 10% discount.