Using ChatGPT As a Copilot For Your Mind
Nathan and Dan Shipper explore prompting techniques and using GPT for creative tasks in the 'How I Use ChatGPT' series.
Video Description
In this video, Nathan chats with Dan Shipper, CEO and Co-founder of @EveryInc, for the series "How I Use ChatGPT". They discuss Nathan's prompting techniques for creative and cognitive labour, and using GPT in copilot mode instead of delegation mode. If you need an ecommerce platform, check out our sponsor Shopify: https://shopify.com/cognitive for a $1/month trial period.
Watch the rest of the series, "How I Use ChatGPT", on @EveryInc 's channel!
SPONSORS:
Shopify is the global commerce platform that helps you sell at every stage of your business. Shopify powers 10% of ALL eCommerce in the US. And Shopify's the global force behind Allbirds, Rothy's, and Brooklinen, and millions of other entrepreneurs across 175 countries. From their all-in-one e-commerce platform, to their in-person POS system – wherever and whatever you're selling, Shopify's got you covered. With free Shopify Magic, sell more with less effort by whipping up captivating content that converts – from blog posts to product descriptions using AI. Sign up for a $1/month trial period: https://shopify.com/cognitive
Omneky is an omnichannel creative generation platform that lets you launch hundreds of thousands of ad iterations that actually work, customized across all platforms, with the click of a button. Omneky combines generative AI and real-time advertising data. Mention "Cog Rev" for 10% off.
NetSuite has been providing financial software for all your business needs for 25 years. More than 36,000 businesses have already upgraded to NetSuite by Oracle, gaining visibility and control over their financials, inventory, HR, eCommerce, and more. If you're looking for an ERP platform ✅ head to NetSuite: http://netsuite.com/cognitive and download your own customized KPI checklist.
X/SOCIAL:
@labenz (Nathan)
@danshipper (Dan)
@CogRev_Podcast (Cognitive Revolution)
@Every (Every)
TIMESTAMPS:
(00:00) - Episode Preview
(00:03:57) - Copilot vs delegation mode
(00:11:06) - ChatGPT for coding
(00:14:29) - Building a prompt coach
(00:15:44) - Sponsor: Shopify
(00:28:22) - Best practices for using ChatGPT
(00:43:55) - The “dance” between you and AI
(00:50:16) - Using GPT as a thought partner
(00:52:07) - Using GPT for diagrams
(01:03:18) - Using Perplexity instead of a search engine
(01:12:00) - What's ahead for AI
#gpt #promptengineering
Full Transcript
S0 (0:00) When I have an article idea, I'll often start with just this really messy document full of quotes and sentences and little things that might go into it. And then I'll be like, I don't even know where to start with this. This is crazy. And then I will just be like, can you put this into an outline? And I'll just paste the entire document into ChatGPT, and it'll often find an outline. And the outlines it comes up with are really basic. But I think one of the things it's really good at is pointing out the obvious solution that you missed because you're too close to the problem.
S1 (0:34) Hello, and welcome to the Cognitive Revolution, where we interview visionary researchers, entrepreneurs, and builders working on the frontier of artificial intelligence. Each week, we'll explore their revolutionary ideas, and together we'll build a picture of how AI technology will transform work, life, and society in the coming years. I'm Nathan Labenz, joined by my cohost, Erik Torenberg. Hello, and welcome back to the Cognitive Revolution. Today, we're sharing an episode of the new podcast "How Do You Use ChatGPT?", hosted by Dan Shipper, founder and CEO of Every, a daily newsletter that promises the best business writing on the Internet. In just his first few episodes, he's had guests on the show including Sahil Lavingia, Nat Eliason, Linus Lee, and, today, yours truly. This conversation is both extremely practical and a real exchange of ideas. Coming into it, I had used ChatGPT mostly for unfamiliar tasks where I really needed help orienting myself and getting started. And, of course, I've gotten great value from a wide range of different use cases. But to be honest, I hadn't found ChatGPT super helpful for my own writing process. So I was really interested to learn more about the methods that Dan has developed to use ChatGPT as a thought partner and a writing assistant. Learning from him inspired me to do more of this for myself. Toward the end of the episode, Dan asks me what I am most excited about next, and I mention the new Mamba architecture and state space models more generally, which I honestly can't stop thinking and talking about. We'll have a big episode on this coming very soon, and I'm glad to report that I did use some of Dan's recommendations to help develop the strategy, the devices, and the overall structure for that episode, in a way that I found legitimately very helpful. One note before we get started: there are a few points in this episode where we each shared our screens to show off content visually. And while I think you will be fine with just the audio version, if you want to see the visuals, you can check out the YouTube version of this episode. Of course, there's always lots more to learn, so if you like this sort of content, I encourage you to check out "How Do You Use ChatGPT?" with Dan Shipper.
S0 (2:47) Welcome to the show.
S1 (2:49) Thank you, Dan. Great to be here. I'm excited for this.
S0 (2:51) I'm excited too. For people who don't know, you are the founder of Waymark, you are the host of the excellent podcast Cognitive Revolution, and you are a GPT-4 red teamer. So you are responsible, or one of the people on a team of people, for trying to figure out how to make GPT-4 do bad stuff before it was released, which you had a really interesting tweet thread about, I don't know, about a week or two ago, something like that. So I'm very excited to have you. I think you'll have a lot of insights that I'm excited to share with everyone. One of the things that stands out in thinking about your work, and about Cognitive Revolution in particular, the podcast that you run, is this idea that one of the values of AI is in helping us to offload cognitive work. Just like the way we offloaded manual, physical labor to machines in the industrial revolution, AI will augment or offload a lot of cognitive labor from humans. And I wanted you to just talk about that. Tell me more about what that means, and then tell me: is that a good thing, and where is it a good thing?
S1 (3:54) Well, that's a big question. I would say I talk about AI doing work and helping us in a couple of different modes. For starters, we will probably spend most of our time today in what I call copilot mode, which is the ChatGPT experience. You are, as a human, going through your life and going through your work and encountering situations where, especially as you get used to it, you realize, oh, AI can help me here. So you make a conscious decision in real time to switch over to interacting with AI for a second or a minute or whatever, to get the help that you need, and then you proceed. But you are the agent, right, in that situation, going around and pursuing your goals. In contrast, the other mode that I think is also really interesting is delegation mode, and that is where you are truly offloading a task. And I always say the goal of delegation mode is to get the output to the point where it is consistent enough that you don't have to review every single output. And if you can get there, then you can start to really shift work to AI in a way where you no longer have to do it. And that can be useful in different kinds of ways. Right? The copilot mode is about helping you be better. That's your classic symbiosis or intelligence augmentation. And then the delegation mode is more like: we can save a ton of time and money on things that used to be a pain in our butts, or we can scale things that are not currently scalable. And there's a lot of that in the world. Right? I think almost everybody has things where, if you just ask the question, is there stuff that you could be doing that would be really valuable to have done, but you just don't have time to do it? There's a lot of that, and it can be quite transformative. What's kind of missing right now is the middle, between copilot mode, where you're getting this kind of real-time help and deciding how to work it into whatever you're doing, and delegation mode on the other end. In between is ad hoc delegation
S0 (6:04) Yeah.
S1 (6:05) Where I'm going along, and ideally, I would like to delegate more and bigger subtasks to the AI on the fly. And that's where we're not quite there yet. The agents probably can't do much in the way of a significant task. So you're still shoehorned into one of two scenarios: either you're engaging with it in real time and getting help, or you're going through the process of doing a setup and a validation, setting up a workflow to where you can truly delegate. And it's that in-between gap that I think probably gets closed over the next year, as agents, quote, unquote, begin to work, and then we can start to delegate bigger chunks of work on the fly. The next question was, is it good? I don't know if I have a great answer to that. I think it's largely good. I think it's good as long as humans stay in control of the overall dynamic. And I'm definitely one who considers everything to be in play for the future, both on the positive side, where I don't think it's crazy to think of a post-scarcity world, and on the negative side, where, to quote Sam Altman, I wouldn't rule out lights out for all of us. I think we are definitely playing with a force here that has the potential to be totally transformative, in good and bad and probably a combination of ways. I'm thrilled by how much more productive I can be, and that's some of the stuff that we'll get into in more detail. I am thrilled by the prospect of having infinite access to expertise, especially for people who have far less means than I do to have that kind of access. I am a pretty privileged person who can go to the doctor without really thinking twice about taking the time off from work or what that's gonna cost me or whatever. Obviously, a lot of people don't have that luxury. I think there is a real way in which AI can cover a lot of those gaps: not fully yet, but already significantly, and obviously more and more over time. I think that kind of stuff is gonna be potentially disruptive, and maybe the source of a lot of political debates and challenges. But, anyway, yeah, there's so much upside, but there is very real risk. And it's very easy to hold those two perspectives at the same time: to be just thrilled by the capability, but also to be always keeping in mind a sort of healthy fear.
S0 (8:27) I love that. I think that's such a rare perspective. As humans, we just tend to collapse onto one: either it's horrible, or it's great. And then we have these camps. And, obviously, the wise perspective is: there's going to be some really amazing stuff about this, and there are dangers. When technology changes society, it'll change our brains; we will adapt to it in the same way that it is adapting to us. That will change things, and we'll need to deal with the dangers that it presents. I think that's a very wise perspective. And I asked that question, is it a good thing that cognitive work will be offloaded, because I think that there's good and bad. The fear scenario is quite dominant for a lot of people. And I think the people who are anti-fear, or presenting a hopeful view, are a little bit too rose-colored-glasses about it. And I think finding real ways and real use cases for how offloading some of this cognitive work actually helps people is just a really important part of creating a world where AI is a force for good, or a force for creativity, rather than a world where it just replaces people, or creates dangers, or, I don't know, all the bad scenarios. And going back to your copilot mode versus delegation mode point, one of the things that I've felt is that AI reveals to me how much drudgery there is even in highly valuable, highly creative knowledge work, and that we sort of lie to ourselves about the amount of drudgery, because that work is so romantic compared to, I don't know, working in a factory, maybe, or just any other kind of job. It's easy to look at a lawyer and be like, well, a lawyer's job is full of drudgery or whatever. But I write, I run a business, I have a YouTube show now, I have a podcast, and there's a lot of stuff in that that's just pure drudgery. And I find it really interesting, because using ChatGPT, and AI tools more broadly, has made me aware of how many repetitive, or just overall kind of brain-dead, things I have to do just to write something smart on the Internet at Every. And once it's visible, I use AI for it, and then I don't have to think about it as much anymore. And I think that's a really cool thing.
S1 (10:56) Totally. For me, coding comes to mind most there, when you talk about the drudgery of high-value and, again, pretty privileged work. I'm not a full-time coder; I have been for a couple of short stretches in life, but more often, I've been somebody who's dipped in and out of it. And it is a real pain in the butt to have to Google everything. Obviously, different people have different strengths and weaknesses. I do not remember syntax super well. Sometimes, if it's been a while, I'm like, wait a second, am I remembering JavaScript, or am I remembering Python? What exactly is going on here? And so to be able to just have the thing type out even relatively simple stuff for me is often a multiple-x speedup in terms of productivity. Often an improvement in strict quality, too, compared to what I would have done on my own. And it makes it so much easier to get into the mode in the first place. There's this... I wouldn't even call it drudgery, but it's gearing up. People talk about, in birdwatching, getting your eyes on: really focusing on what you are seeing and trying to get that detector right. There's a similar thing, at least for me, in terms of getting into code mode. And it also streamlines that tremendously, because next thing you know, it's writing the code, and I'm reading the code. Reading the code is a lot easier than writing the code. So I do find tremendous satisfaction and pleasure in just seeing this stuff outputted for me at superhuman pace, with better-than-me quality. Maybe not superhuman quality, but super-Nathan quality. It's awesome.
S0 (12:43) What you're making me think about is: I think in large part, not all of it, but in large part, what the current class of models, especially text models, are doing is different forms of summarizing, and how much summarizing is involved in creative work, in programming, in writing, in decision making. A lot of it is just summarizing. In programming, you're summarizing what you find on Google. You have to decide what to summarize, and you have to summarize it in the exact right way for your specific use case, but that's a lot of times what you're doing. Same thing for writing. A lot of the stuff in my pieces are summaries of books that I've read, or conversations I've had, or ideas that I found somewhere else, that I'm stringing together in a sort of unique way. And, obviously, I still have to do the overall management task of deciding which summaries go in which order and how they work or whatever, but a lot of it is summary. And I think that's a way that, using these tools, you start to see the world a little bit differently, and you're like, oh, yeah, there's a whole class of things I'm doing that are summaries that I don't have to do anymore. And I really think that's cool.
S1 (13:55) Yeah. One of the areas where I have not adopted AI as much as I probably should have is in repurposing content, making more of what I do with the podcast, because I've put out a lot of episodes. There's a lot of stuff there. And we do use AI in our workflows to, for example, create the timestamp outline, right, of the different discussion topics at different times throughout the show. That's the most classic summarization, where I'm not looking for a lot of color commentary. It's literally just: what was the topic at each time? Get it right. So we've got some stuff like that we go to pretty regularly, but I have not done as much as I probably could or should. Maybe this will be a New Year's resolution, to bring that to all the different platforms. And it's partly a personal quirk, and also, I think, a limitation of the current language models, that I never quite feel like I want them to write as me. I'm very interested to hear your thoughts on how you relate to it in the writing process.
S0 (15:02) Yeah.
S1 (15:02) When I put something out in my own name, I basically don't use ChatGPT at all for it. I find I can use it for, like, the voice of the show, if I want to do that timestamp outline, or just create a quick summary that's in kind of a neutral voice, where it's not signed "Nathan" and isn't supposed to be representing my perspective. But I haven't really had a great synthesis yet to help create stuff that I wanna say in my own voice, in my own name. So if you have tips on that, that would be something I would love to come away with a better plan of attack on, because I'm not quite there. Hey, we'll continue our interview in a moment after a word from our sponsors.
S0 (15:42) I do. I definitely do. I love it. I think it goes back again to when you talk about being a copilot. I think that the failure mode is usually trying to use it when it's a little bit more in delegation mode: just go do this whole thing. That's when it doesn't really work. But as a copilot, it works incredibly well for specific micro tasks in writing. So, first example, as I just brought up: everything is a summary. I often have to explain an idea. I was writing a piece a couple months ago where I had to explain an idea, and I knew what the idea was. I was talking about SBF and FTX's collapse, and utilitarianism and effective altruism, whether or not that philosophy contributed to the collapse. And in order to write that article, I had to summarize the main tenets of utilitarianism. And I studied philosophy in college, and I've read a lot of Peter Singer's work, and I just generally know it. But I hadn't written about it in a while. Ordinarily, I would have had to spend three hours going back through all the different stuff to formulate my three-or-four-sentence summary. But I just asked ChatGPT, and it gave me the summary, in the context that I needed it, in three or four sentences. I didn't use that wholesale, but it gave me basically the thing I needed to tweak it and put it into my voice. And so that's a really simple example, but I think you can use it in all different parts of the writing process, from the very beginning. I'll often just record myself on a walk, spewing ideas and random thoughts, free associating, and then I'll have it transcribe it and summarize it and pull out the main things, and then it'll help me find little article ideas. When I have an article idea, I'll often start with just this really messy document full of quotes and sentences and little things that might go into it. Then I'll be like, I don't even know where to start with this. This is crazy. And then I will just be like, can you put this into an outline? And I'll just paste the entire document into ChatGPT, and it'll often find an outline. And the outlines it comes up with are really basic. But I think one of the things it's really good at is pointing out the obvious solution that you missed because you're too close to the problem. Oh, of course: the outline for this article is, set up the problem, and then talk about the solution to the problem that you came up with, or whatever. That's such a common format for an article. But if you're in your head about it and you're being really precious, it can be hard to accept that for this special article, it's gonna be this basic thing that you've written a thousand times before, the same basic structure. And then one of the other really great things is that it's just incredibly good at helping you figure out what you're trying to express, put into words what you're going for, and then go through the different options of how to express it until you find something that exactly says the thing you want. For example, trying to find exactly the right metaphor. Okay, what kind of metaphor are you trying to find? What's the idea you're trying to express? And then here are 50 different options of ways to express that with a metaphor.
And 49 of them will be trash, and one of them will be amazing, or one of them will push you in the direction of the one that you come up with yourself. And I have zillions of examples of that. So I find that ChatGPT is all over my writing, but none of the stuff that makes it into the writing I publish is wholesale from ChatGPT. It's doing some of those micro tasks for me all the time.
S1 (19:14) Yeah. That's interesting. Some of the stuff that you mentioned there, I have had some luck with. Talking to it on a walk is quite helpful in some cases. I've done a couple of things where I tried to draft a letter and, as you said, talk my way through it. Here's what I wanna say. I'm writing here to this person. Here's a little bit of context. Here's the key points I wanna get across. Can you do a draft? And then iterating verbally on that draft. A lot of times, I'll follow up and be like, okay, that's pretty good. And you can give it pretty detailed feedback, too. The transcription in the app is so good that, again, point of privilege, it understands me extremely well. So I can literally just scroll through its first generation and say, you know, in the first paragraph, I don't really wanna say that, it's more like this; in the second paragraph, more emphasis on this; add this detail; give it, like, eight things. You could wish it would do a little bit better on the revision, but I've had a few moments where, at the end of that process, I have something where, when I get back to the desk, it's not that far of a leap from that to the actual version that I'll use. It's probably still underutilized for me. I should go on more walks, honestly. Get more time away from the screen, get the blood flowing a little bit, and use a different modality. The micro tasks I should do more of, though. I think that's the tip that I'm taking here. And sometimes where I feel like it's hurting me, and this will even now start to happen in Gmail, or anywhere where there's this autocomplete that's popping up: sometimes I'm on the verge of a thought that is really the thought that I'm trying to articulate. And then this autocomplete comes up, and it's, that's not right.
S0 (21:01) Right.
S1 (21:01) But it can derail you at times, where you're like, don't guess for me right now. Let me get the core ideas down first.
S0 (21:07) Yeah.
S1 (21:07) If you don't have those core ideas, then, for me, it's been a real struggle to get anything good. But I think I've probably not done enough experimentation in the writing process of: okay, I do have some core ideas. Can you help me order them, structure them, iterate on them? Interestingly, I also do use it at the other end often. Critique this. Here's an email, here's a whatever, here's an intro to a podcast. Critique it. That can be really useful. Its critiques are usually worthy of consideration, at least, I would say.
S0 (21:42) It truly is good at that. And at Every, we have multiple editors who are highly skilled, and I still use it to be like, what do you think of this intro? Because it's up at 2 AM, the night before a deadline.
S1 (21:53) Yeah. It's real hard to beat the availability. And on responsiveness, it's clearly superhuman.
S0 (21:58) I think the writing sounds really fun. I would love to, if you're ready for it, start just diving into how you actually use ChatGPT. You sent me a doc with a bunch of historical chats, and this is the first one. Give us the setup. What were you doing? At what point were you like, oh, I need to go into ChatGPT? And take us from there.
S1 (22:17) So I am working as the AI advisor at a company called Athena, which was founded by a friend of mine named Jonathan. And
S0 (22:30) Is this the, like, virtual assistant company, from the Thumbtack founder?
S1 (22:33) Yes. He is one of the founders of Thumbtack, and this is a different company, but founded on some of the lessons that he learned in the Thumbtack experience. He legendarily built up a really amazing operation powered by contractors in The Philippines. And that included hiring an assistant for himself, in his role at Thumbtack, who became almost another key partner in his life over a long time. And then Athena was built to essentially try to scale that magic for startup founders and executives in general. They hire executive assistants in The Philippines, they pay a premium wage, and they're really focused on getting super high-quality people. And the idea is to empower the most ambitious and most high-impact people by equipping them with this ability to delegate to their assistant in a transformative way. Okay, now we're working on: what does AI mean for us? Right? How do we bring that into the assistants' work? So one of the things I've done is train the assistants on the use of AI. And that's been a fascinating experience, putting content together, examples, etcetera. Another thing that I've done is work on building a number of kind of prototype demos for what the technology of the future might start to look like. And this chat, which we call Athena Chat, is basically our own custom, in-house ChatGPT. It was built on an open source project, so I didn't have to code every line of it. But it is amazing how quickly you can build things like this today with a bit of know-how. So it's been me and one other person who have built a number of these prototypes. In this case, what we wanted to do is say, can we create a long-lived profile that represents the client, that can assist the EA in all sorts of ways? It's essentially a plug-in, but with plug-ins you have some limitations, whatever, so we're experimenting with this on our own. One of the big things we wanted to enable is adding information to the client profile, and updating information that's already in there. So the hope is that this could be a hub where, over time, client preferences and history and even background context documents all can gradually find their way in. And you have this holistic view where the assistant can go query anything they need, but, again, also update and add to it. It's, in theory, supposed to evolve over time.
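As an aside for readers: the pattern Nathan describes, a chat app whose model can read and update a long-lived client profile, is typically wired up with function calling. Athena's actual schema isn't shown in the episode, so everything named below is a hypothetical sketch of the pattern using OpenAI's Node SDK.

```typescript
import OpenAI from "openai";

const client = new OpenAI();

// Hypothetical tool the model can call to add or update client-profile fields.
const tools: OpenAI.Chat.Completions.ChatCompletionTool[] = [
  {
    type: "function",
    function: {
      name: "update_client_profile",
      description: "Add or update a field in the client's long-lived profile.",
      parameters: {
        type: "object",
        properties: {
          field: { type: "string", description: "Profile field, e.g. 'preferences.travel'" },
          value: { type: "string", description: "New content for the field" },
        },
        required: ["field", "value"],
      },
    },
  },
];

async function main() {
  // The model decides whether the message warrants a profile update.
  const response = await client.chat.completions.create({
    model: "gpt-4o", // any tool-capable chat model
    messages: [{ role: "user", content: "Note that the client prefers morning flights." }],
    tools,
  });
  console.log(response.choices[0].message.tool_calls);
}

main();
```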
S0 (25:05) Right.
S1 (25:06) So we have this ChatGPT-like interface. And one of the things that we've noticed is that, despite our attempts at education, it's not perfect; we still see that assistants sometimes need coaching on how to effectively prompt a language model. So that was my motivation coming into this little thing. I already had this React app, which is, again, just a ChatGPT-like little app. And I wanted to add a module to it. The module I wanted to add was a prompt coach. So I wanted to put in another little layer that would look at what the assistant, the human assistant, put into the chat app, and send that through its own prompt to ask: are you applying all the best practices? Are you telling the AI what role you want it to play? What job you want it to do? Are you specifying a format that you want your response back in? And, often these days it will do it by default, but are you setting it up in such a way where it will do some sort of chain-of-thought, think-out-loud, think-step-by-step reasoning before giving a final answer? That's actually one of the most common things I see people do to shoot themselves in the foot on AI performance: prompting in such a way that prevents what is now the kind of trained-in default behavior of explain, analyze, think about it a little bit before getting to a final answer. So you just have a number of best practices.
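The exact meta-prompt isn't shared in the episode, but a minimal sketch of the coaching layer being described, with illustrative wording and names, might look like:

```typescript
// Illustrative only: wraps the assistant's prompt in a meta-prompt that checks
// it against the best practices Nathan lists and asks for structured feedback.
function buildPromptCoachMetaPrompt(userPrompt: string): string {
  return `You are a prompt-engineering coach. Review the prompt below against these best practices:
1. Does it tell the AI what role to play?
2. Does it say what job to do?
3. Does it specify the format the response should come back in?
4. Does it leave room for step-by-step reasoning before the final answer?

Respond with JSON only: {"suggestions": [{"text": string, "urgency": "high" | "medium" | "low"}]}

Prompt to review:
"""${userPrompt}"""`;
}
```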
S0 (26:36) Let me stop you there real quick. What are people doing that would prevent the model from doing the kind of chain of thought best practice that that makes it reason the best?
S1 (26:45) Anything that sets it up in such a way where it's got to answer immediately, with no ability to scratch its way through the problem, is bad. And I see that very often. It's common; it happens even in academic publications, not infrequently. Often, that's a hangover from the earlier era of multi-shot prompting. And, obviously, this is all changing super quick. Right? But if you go back, the first instruction model that hit the public was OpenAI's text-davinci-002 in January 2022. So we're coming up on two years, but still not even two years, since you could first just tell the AI, write me a haiku, and it would attempt to write you a haiku. At that point, it was not necessarily gonna get the syllables right. With the earlier generations, you would have to say, "A haiku by [author name]:" and then hope that it would continue the pattern. That's the classic completion-style prompting. And with instructions, now you can tell it what you want it to do, and, obviously, that's gotten better and better. But the benchmarking scaffolding, in an academic context, was developed before this instruction change. Typically, you would have question, answer, question, answer, question, answer, question, and the AI's job would be to give you the answer. And so models would be measured on 5-shot prompts or what have you. But all that scaffolding was built before people had even figured out chain of thought. And so now, if you take that exact structure and you bring it to a GPT-4, you're often much better off just giving it the single question with no structure, letting it spell out its reasoning, because now it will do that by default, and then give you an answer. Versus if you set up question, answer, question, answer, question, it will respect the implicit structure that you are establishing, and it will jump straight to an answer. Often these are multiple choice, or it could be a number, or what have you. It will jump to an answer, but the quality of the answer is much reduced compared to the default behavior, where you just let it think itself through it. And I've even seen this in Bard. I think this is hopefully now fixed, but not too long ago, Bard would give you an answer before the explanation by default. And, again, that's just going to be a problem. Sometimes people do that by mistake. They'll say, give an answer, and then explain your reasoning. You're just hurting yourself, right? Because it will only explain its reasoning once the wrong answer is already established. So, in the EA education, it's triple-A for triple-A results: Analysis before Answer, Always.
S0 (29:26) I never heard that before. I like that.
S1 (29:28) Hopefully, they'll remember it coming out of it.
S0 (29:30) No. I like it.
S1 (29:31) Hey. We'll continue our interview in a moment after a word from our sponsors.
S0 (29:35) And just to summarize, I think basically what you're saying is: a previous generation of prompting really encouraged you, in your prompt, to give multiple examples of the kind of question-and-answer exchange that you wanted the model to do, and then set up the last example such that the next thing the model would do is give you a direct response. But what we found over time is that one other really effective thing to do, rather than have the model give a direct response to a question or a problem posed to it, is letting the model quote-unquote think out loud first, by reasoning through the problem just like a human would work a word problem, and then, at the end of its response, give an answer. That improves the quality of the answer and the quality of the result that you get from the model. And what has happened is OpenAI and other model providers have made that more of the default behavior, so that it will pretty much always do that. But using previous prompting techniques, like few-shot or multi-shot prompting where you're giving examples, might lead it to just answer directly, and you should look out for that and try to avoid it.
S1 (30:41) It's a great summary. Yes. If it is jumping directly to an answer, you are for sure leaving performance on the table for all but maybe the most trivial tasks.
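To make the contrast concrete, here is a hedged sketch; the word problems are illustrative examples, not prompts from the episode. The first prompt's implicit Q/A pattern pushes the model to answer immediately, while the second follows the triple-A rule and leaves room for reasoning before the final answer.

```typescript
// Illustrative prompts only; not taken from the episode.

// Few-shot pattern that implicitly forces an immediate answer: the model
// respects the Q/A structure and skips its reasoning step.
const fewShotPrompt = `Q: A bat and a ball cost $1.10 total. The bat costs $1.00 more than the ball. How much is the ball?
A: $0.05
Q: If 3 machines make 3 widgets in 3 minutes, how long would 100 machines take to make 100 widgets?
A:`;

// "Analysis before Answer, Always": let the model reason first, then conclude.
const aaaPrompt = `If 3 machines make 3 widgets in 3 minutes, how long would 100 machines take to make 100 widgets?
Think through the problem step by step, then give your final answer on the last line.`;

console.log(fewShotPrompt, aaaPrompt);
```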
S0 (30:54) Yeah. And just an aside, see how much your creative work is just summarizing?
S1 (30:58) Important, because I tend to give you the long version by default. That's my default behavior.
S0 (31:02) This is one of those micro tasks that I'll just be handing to an AI avatar version of me at some point. Okay. So let's get back to this. You're working on an app, and you wanna add a module to it that explains some prompting techniques. And it looks like the app itself is something that you didn't build from scratch, so you're trying to get the lay of the land so you know what to do.
S1 (31:21) Exactly. Yeah. And the problem is, I know how to code generally.
S0 (31:26) Mhmm.
S1 (31:26) And I've even coded in JavaScript quite a bit, but React is a JavaScript framework that has a sort of hierarchy of best practices that if you know them and you can easily apply them, then you can work quickly with the framework. Right? That's the value of all these frameworks.
S0 (31:41) But if
S1 (31:41) you don't know them and you're coming in cold like I was, then where do I even go? There's all these different folders and the file structure, and where exactly am I supposed to look for the kind of thing that I wanna do, and where do I put a new module? And so that's where this chat really starts. I have a working app. I have the code for the app. But I've never worked with a React app personally, hands-on, before. So I literally just set up the scenario. And I don't really use too much in the way of custom instructions or super elaborate prompts in my copilot-mode work. Certainly in delegation mode you get into a lot more detailed prompts, with if-this-then-that cases, structured formats, etcetera, etcetera. But I often find a pretty naive approach is effective for things like this. And so I just start off by telling it: I'm working on this React app project, and I am a bit lost. Can you explain the structure of the app? I give it a little bit more information, and it starts giving me a tutorial of what it is that I'm looking at. And then you've got React, and you've got Redux, and then you've got these kind of additional toolkits, slices and sagas. And these frameworks, in some cases, take on a life of their own, where there's whole conferences, right, and companies. You can be very deep down this rabbit hole, and whoever built this open source project that I'm trying to modify, they're using a bunch of different things that are not even necessarily standard but are common or whatever. So there are, like, five different things here that I have no idea about. And without this kind of tutorial, I'd be going off to search for: okay, what is this saga JS? What does that even do? It's able to give me that entire rundown extremely quickly. And then this, I thought, was a really interesting moment, because I get a lot of value from things like this, where I feel like it's prompting me. And it wasn't exactly that here, but it gives me this general structure. And then I was like, oh. I find, as a general pattern, that if you can give it something in a format that it natively showed you, that's probably gonna work pretty well. So sometimes, even in kind of the delegation mode, I'll be like, I don't exactly know what structure this should have, but maybe if I have it suggest the structure, then we'll get a structure that it can naturally work well with. Well, in this case, the structure is dictated by the world; it's pretty well known that, okay, this is gonna be your structure for a project in this React framework. Okay, cool. But this got me thinking: I should give it my actual structure. I wanna print this thing out for this project that I'm working on, because I didn't make it, I don't know what it is, and I wanna have it help me interpret that full thing. But then again, I'm like, how do I print something like this? I don't even know how to do that. So then my next question for it is, can you write me the command to print out the file structure? And this is where you're like, okay, this is magic. Because, again, I don't know how to do this. This tree command, I don't know if it was installed for me or not, but, okay, it shows me how to do it. And then, oh, there's another step here of installing some package that needed to be installed. Okay, it was able to help me with that. So I'm just encountering all these... this is the classic developer experience. Conceptually, I have a clear idea of what I wanna do, but now I'm three nested problems down here, right, where I'm like, oh, okay, I need to understand this framework.
S0 (35:11) Right.
S1 (35:12) Oh, okay. I need to print out the structure to better understand the version I'm working with in this framework. Oh, now I need to install something so I can do that print. And this is where time just goes to die. Right?
S0 (35:23) Yeah.
S1 (35:23) It's like, you talk to programmers, and you're like, yeah, you didn't get anything done today? And what happened was, I was on my way to getting my app together, and then I had to install this thing, and then it wouldn't install. But each of these things, it's helping me get over. And now, finally, I'm able to say, okay, here is my app. This is the app that I actually am working with. And now we're really getting into something good, because it can now break that down, and the names of the things are pretty semantic. I notice I haven't even given it any code here. I've just given it the file names, but the file names give an indication of what is what, and it gets a sense, just from that, of what the app actually is. So let's go over to... I think I just got a link to a working version of the app. It's pretty simple. It's a ChatGPT-like environment. We can create these client profiles. We have our chats, we have our history, a couple different models, and there's function calling in the background that connects the chat experience to the client profile. And what I'm trying to add is a module in the lower right-hand corner, which I'm actually not sure this version has. But the point of it is to take my prompts, run them through this meta-prompt, as we discussed, and then show feedback: warnings of, hey, you may or may not be doing this quite right. So, back to the thing. I've given it the file structure, and it's now able to understand it. And now I'm saying, okay, here's what I'm trying to do. I'm trying to create this prompt coach. Let me see exactly how I approached this. This is a different file. Let me see exactly what I'm doing here.
S0 (36:58) Seems like maybe you had some sample code or something you'd written. Or
S1 (37:02) Yeah. I guess I took one stab at it myself, and it didn't work. The human version of this: I was looking at the same file structure, and I'm like, okay, I see that there's this module. There's a sidebar here. Because you see these names, right? You've got sidebar and search, and there's gonna be chat history here somewhere, chat. I'm looking at this, and I'm like, okay, I see all these different elements and all these things. Let me just try to copy one, mess with it a little bit, and hopefully get somewhere. And then I'm not getting anywhere. It's not showing up where I want it to show up. I'm not seeing it. And so that's where I come to it and say, okay, here's what I tried. Why isn't it working?
S0 (37:40) Yeah.
S1 (37:41) And I explained my problem here at the end. The problem I have is that it's being shown in the wrong place.
S0 (37:46) Right. And then it explains the answer.
S1 (37:48) Yeah. Next thing, it's giving me instructions with code: modify this, put it over here. This is pretty cool too. Unfortunately, we can't share the old screenshots, and I don't know exactly what I used, but this was right as vision was being introduced to ChatGPT as well. So I was able to then say: here's my screenshot, here's where it is showing up, and here's where I want it to show up.
S0 (38:12) Right.
S1 (38:12) And can you help me with that as well? So from the screenshots, from the HTML structure basically, we just work through this entire thing. I continue to run into issues. We're only 25% of the way through this whole thing.
S0 (38:28) Oh, wow.
S1 (38:28) This probably took me, I don't know, two to three hours total: to get these suggestions, implement them, see what's going wrong, yada yada yada. It writes all the code, basically. Because, again, I've never written a line of React code in my life, so I don't know any of this syntax. There's a million ways to get it not quite right when you have no idea what you're doing anyway. And so it's writing all the code, and, bit by bit, we're refining the experience. We're refining the interface. Here, we're creating some CSS. We have a particular style pack that's already built into this, so, again, that's just another thing I'm not at all familiar with. This is the syntax for figuring out how to use that style pack; good luck making that up on your own. And on we go. Basically, after a couple of hours, I got to a working module where the prompt coach would intercept your call, do the meta-prompt, and parse the response. I had it giving suggestions, along with the urgency of the suggestions, so we're color coding those suggestions as they come up. If it's serious, you get it in red. And if it's not, you get it in yellow, or just a notice. And I would guess that this would've taken me easily an order of magnitude longer in a pre-ChatGPT era. If this was two to three hours, it's probably two to three days of work to figure out all this stuff, and with a lot more frustration, because I'm not a super patient person. The feeling of: a million people have done something almost exactly like this, there's nothing differentiated or special about what I'm doing, I'm just in this phase of not knowing what I'm doing and getting constantly stuck, constantly stumbling, constantly running into friction. I really don't enjoy that. I think most people don't. This was none of that, or almost none of it. Right? Even just going back to the install, right, or the command to print out the structure. Man, this is so stupid. I know exactly what I want. I know that it is doable. I know that it's been done a million times, in a million places, and yet I don't know how to do it. And then, liberating me from that frustration
S0 (40:35) Yeah.
S1 (40:36) is... and it goes to your drudgery point, right? That was probably 80 to 90% of the time in a world where I was doing this on my own. And now we're down to the two to three hours, where it was really about defining what I want. This could have been one hour if I really knew React, but it taught me the ropes and did the task with, again, probably 80 to 90% time savings compared to the unassisted version.
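The Athena code itself isn't shown in the episode, so the names and markup below are hypothetical, but a minimal React sketch of the color-coded suggestion panel being described might look something like this:

```tsx
import React from "react";

// Hypothetical types and names; a sketch of the module described, not Athena's code.
type Suggestion = { text: string; urgency: "high" | "medium" | "low" };

// Serious issues render red, lesser ones yellow, notices gray.
const urgencyColor: Record<Suggestion["urgency"], string> = {
  high: "red",
  medium: "goldenrod",
  low: "gray",
};

export function PromptCoachPanel({ suggestions }: { suggestions: Suggestion[] }) {
  // Nothing to coach: render nothing.
  if (suggestions.length === 0) return null;
  return (
    <aside className="prompt-coach">
      <h4>Prompt coach</h4>
      <ul>
        {suggestions.map((s, i) => (
          <li key={i} style={{ color: urgencyColor[s.urgency] }}>
            {s.text}
          </li>
        ))}
      </ul>
    </aside>
  );
}
```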
S0 (41:04) I love this. I think this is such a cool example, and I really appreciate you bringing it. Because, one, it's obvious that this kind of thing, if you're not a programmer... as a programmer looking at this, I'm like, yeah, so much of what you do as a programmer, especially if you're working on startup stuff, is this kind of thing. It's doable, it's been achieved before, I just need to do it in my specific context. And it's obvious that this would have taken you days, or taken really anyone days, to do from scratch. But with ChatGPT, it's way quicker, and it takes away a lot of the drudgery. But I think what's really cool and really beautiful, which is weird to say about this stuff, and it's striking me right now, is there's this dance happening in this chat, where at the beginning, obviously, you're asking it to help you. But you are giving it what it needs and filling in the gaps that it has in order to help you, and it is filling in the gaps for you as well. So it is explaining React to you, but you are explaining: here is the project that I have, and here are the specific details that I want done. And then there's this dance back and forth where you're mutually filling in gaps that neither of you can fill in on your own. And I think that is really cool to just watch evolve. At the start, you don't know React, and you don't know where to put your code, and you don't know why it's not working. And at the start, it doesn't know who you are, or what you're trying to accomplish, or what the specifics of your project are. But as you build up this chat, you yourself are starting to understand things more. You didn't ask it, just go do this for me. You asked, how does a React project work, and what is the structure? And so you learned more about React, and it learned more about you. And as your mutual understanding increased, you were both able to accomplish the thing together. And I think that's really cool.
S1 (42:57) Yeah. It's awesome. And we're still only halfway through this scroll, for all the scrolling I've done. I just highlighted this: okay, cool, this is working. Because at this point, I'm starting to get into refinements. Okay, now I want to dial in the styling. At this point, the core problems have been solved, and now, again, it's just going to do the drudgery of making sure that there's padding, and things are centered, and so on and so forth. I try to be polite and encouraging to my AIs wherever I possibly can. But you can envision a future, and I think that future is already starting to become visible through the mist a little bit as more and more stuff gets published on the research side, beyond this sort of episodic relationship, where if I start a new chat, it knows nothing about this. Right? I can continue this chat up to a limit, and, obviously, it has superhuman, expansive background knowledge, but zero contextual knowledge, and we can't retain that from one episode to the next. But I do think that is also coming soon. There's a couple of different ways it could shape up, but I think within a year, certainly not that much longer than that, I can imagine, we'll start to see things where all this history is accumulated, or maybe divided into different threads or whatever, but where it can follow you forward into different tasks in a history-aware way. I think that will be another level of unlock.
S0 (44:28) I think you're totally right. That's what custom instructions is: a step in that direction. Unfortunately, custom instructions is very hard to set up, but if you do set it up, it's really great. It's really nice for it to have context on you. But I do think you're right: ChatGPT will definitely have a memory, where it can reference this stuff, and reference the context of what you need and who you are. And that will make it, even with the same level of intelligence in the model, 10x more useful and 10x faster to get to the right answer.
S1 (44:55) How much do you put into custom instructions? Because for something like this, it might be my profile, my writing sample, maybe, whatever. But I probably wouldn't have put in: by the way, Nathan's a React novice and doesn't know how to install anything. So do you have a vision, or a sort of recommendation, for a custom instruction that would help me with things like this?
S0 (45:17) You're asking the right person. I have a very extensive custom instruction and a lot of opinions about it. If you want, I can share it with you right now, and we can talk about it.
S1 (45:26) Yeah. Let's check it out.
S0 (45:27) Okay. The first part of custom instructions is: what do you want ChatGPT to know about you? And I actually like having it know a little bit about who I am, because there's enough about me on the Internet that it knows my name, and that actually helps. Same thing with Every: there's enough about Every on the Internet that it knows it. And every once in a while, not having to explain who I am, or what the company is that I run, is really useful. For example, I was thinking a couple weeks ago about starting a course, and I was working with ChatGPT to decide how to do the course and whatever. And the first prompt was, I wanna do a course, can you help me think about it? And with custom instructions on, it knows that I'm a writer and entrepreneur. So: cool, I'll help you build a course, here's how to think about it. Because it knows that I'm probably gonna build one. But if I turn custom instructions off, it will be like, what course do you wanna take? And it's those little things that really make a difference for me. But, basically, who the serious relationships in my life are. I have in here my sister, her husband, her son. I have my girlfriend up there, and who people are at Every, because referencing their names, when I'm talking about something, and not having to explain who they are every single time is really helpful. I think another really interesting thing to add into custom instructions is: what are the things about you that you know you're trying to work on? For example, I feel like I have a fear of rejecting people, which causes me to be too agreeable. I'm a little bit too opportunistic, and I would like to be more strategic. Stuff like that is really helpful to put in custom instructions, because it's those little realizations that you have every day, where you're like, wow, yeah, I am a little too opportunistic. I think ChatGPT is great for being the thing that can help you, as you're in the moment day to day, remember to pull back and incorporate some of these insights that you have, that everyone has, about themselves. And same thing for goals: having it know what your goals are, and bring you back to those things all the time as you're using it, is really helpful.
S1 (47:30) Cool. Well, thanks for sharing. I think I use it a lot more for just, like, very unfamiliar topics.
S0 (47:39) Mhmm.
S1 (47:39) I'm just looking at these examples, right, that we had queued up. Okay: there's an app in a framework that I've never touched and know nothing about. Working on a patent application, and creating diagrams for a patent application, where I don't really know how to do that at all. Again, I'm starting with these very basic questions: what's a good syntax that I might use to create a diagram for a patent application? I just come in so cold. But it does suggest that you are doing a lot more kind of thought-partner, brainstorming work about your core stuff, which is interesting. I'm much more on these kind of episodic things, where my history and this don't overlap almost at all in a lot of cases. But it just goes to show how many different ways of using these tools there are
S0 (48:32) Yeah.
S1 (48:32) too. And, yeah, this could be another New Year's resolution: to try to bring it a little bit closer to the core of what I do. It's not to say that it's not at the core of what I do, but not in this copilot way. With things like Waymark, I'm working very closely with language models to make an app work well, and I feel like I have intimate knowledge of the details of how it works in that respect, and it's a big project for me. But, again, it's a different mode than the interactive dance kind of mode that you described. Fascinating.
S0 (49:03) Yeah. No, that makes a lot of sense. I definitely use it for some of this knowledge exploration stuff too, but, yeah, it's totally a sort of thought partner for me. But I'd love to keep looking through some of the other chats you brought.
S1 (49:15) Cool. Here's this next one, on working on diagrams. I'm working on a combination of a provisional patent application and the supporting diagrams for the patent application. This is something that I was doing for Waymark, and we have this ensemble method of creating advertising video for small business. Basically, folks come to the site, and they get to enter a website URL. Typically, people will give the homepage of their small business website. We have some code that goes in and grabs content off of that website, and then we build a profile kind of synthetically, like your custom instructions, so to speak, within the context of our app. Who are you as a user? What's your business? What are you all about? What kind of business? What images? And then, to actually create the video, you give a very specific, although super short, instruction: I want to make a video for my sale this Saturday, or, I'm opening a new location, and here's the address, or whatever. It's this very "this is my purpose in this moment" prompt. And then we've got a pretty complicated machinery that takes all those inputs, works with a language model to write a script, and then has computer vision components that decide which of the images from your library should be used to complement the script at all these different points along the way. And it's a pretty cool experience now, really, compared to... again, you think about pre-AI and now. What we had before was an easy-to-use template library. And what we have now is, really, the AI makes you content. It's a phase change in terms of how easy it is to use, how quick the experience is, how much you can just rifle through ideas. If you don't like the first thing, you just ask it to do another. And it's qualitatively just way more fun. People used to have to sit there and type stuff in, and they were like, oh, okay, what do I say? I'm not sure what to say. And a lot of people are not content creators. But everybody always referred to the Mr. Burns episode from The Simpsons. This is a long time ago, but he goes to an art museum for a reveal of some piece of art, and they reveal it, and he says, I'm no art critic, but I know what I hate. And that's, I feel like, exactly how our users operate. They ask for something, they wait 30 seconds, and they now get to watch a video featuring their business. And if they like it, they can proceed. And if they don't like it, it's very obvious to them, and they can very quickly be like, no, not that, give me another one, and here's an alternate instruction. So, anyway, this is the app that we've built. And now we're like, okay, maybe we should think about filing a provisional patent on that. Like most software companies, we're never going to prosecute our patents, but we just want to make sure nobody can come in and give us a hard time. So, how do I write a patent, and how do I create the diagrams? And I wanna be able to update it. I wanna have something that's not just a total mess. So this was a series of different interactions that ultimately led me to these diagrams. But I provided, initially, basically what I just said to you, which is a rambling sort of instruction on: here is my app, and here's what it does, and here's how it works, and here's some of the parts behind it. The language model writes the script, and there's the code that scrapes from the website, and then the other part with the computer vision that figures out which images to use.
I just literally tell it the whole thing and say, now can you use some syntax to make me a diagram that shows the structure of that app that I just word vomited to you? And there's, like, a bunch of different formats out there, so that's the first part of that conversation as well. You could use the Mermaid syntax, or you could use Graphviz, or you could use a couple other things. But what are the pros and cons of those? And can they represent certain different kinds of structures? We dialed it in on either Mermaid or Graphviz, and it started to make me a thing. And then you can see here too, this is interesting, because I did find in this 1 that at some point, it got confused. I'd given it this thing, and it generated this syntax, and I asked for refinements on the syntax. Because I'm taking the syntax, by the way, and going over to another app. What's cool about the syntax is you drop in this pure text
S0 (53:14) Right.
S1 (53:14) Syntax, and it will render the diagram for you. Right? So you've got things like this Graphviz diagram. What's a digraph? I don't even really know. It says this digraph is G, and it has these elements, and they have these properties, and they're connected in this sort of graph structure, blah blah blah. You load it, and in half a second it renders it. You're like, oh, no, that's not quite right. This point should be connected to this point, and it's skipping 1, so whatever. So you give it these kinds of iterations. It would make progress, but then it would also get confused, it seemed, after a number of rounds, because there's just maybe, like, too much syntax. So at some point, I did say, okay, using the episodic memory to my advantage, or working around its working memory weaknesses by just wiping it and starting over. I'm like, okay, here's the best 1 from that chat that was closest to what I wanted it to represent. Let me just go have another chat. And this time, we're gonna skip all the part about which format do we use, and skip all the word salad. And I can just be like, here's a diagram. I wanna make some changes to it. And now have it do more localized edits for me. And so, again, a lot of little details, a lot of nuance here, but it's happy to do that. We worked through a number of rounds of it, and I believe I attached the thing for you. With what I ended up with after a couple chats, you even get to the point where you're, like, color coding and really starting to make sense of it. The green in this diagram now is the things that the user does. So the user tells us what their business website is, then there's code to go scrape, then there's this fork where we have to grab all the images, and we process them in various ways. 1 of the big challenges is which parts of this can happen in parallel and which parts depend on which parts. This is actually something that we didn't have until I did this, even for the technology team. And I'm not sure how well all the members of the technology team could have even drawn this. So now we actually have a better reference internally also, to be like, hey, what depends on the image aesthetic step? Now we can go look at it and be like, oh, okay, yeah, you can't select best images until you have the aesthetic scores completed. Just having that clarity is also, I think, operationally useful. But this is the sort of thing that you can attach to a provisional patent application and at least begin to protect yourself from future patent trolls coming your way. Now, again, how long would this take? If I had drawn it freehand, I maybe could have drawn it in a somewhat comparable time to the amount of time that I spent on the exchange. But having the syntax, and now having it in that kind of structured language way, also makes this much more maintainable, it can fit in other things, and it can even be more readily used by language models. The vision understanding is getting very good, but I would say it's probably still better at understanding the syntax of the graph than the visual rendering of the graph.
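To make this workflow concrete, here is a rough sketch of how you might emit that kind of color-coded dependency diagram as DOT text, using the Python `graphviz` package (`pip install graphviz`). The node names here are invented for illustration; the real diagram came out of the ChatGPT back-and-forth, not this script. It shows the conventions Nathan describes: green nodes for user actions, a parallel fork after the scrape, and "select best images" blocked on the aesthetic scores.

```python
# Rough reconstruction of the color-coded dependency diagram described
# above, using the Python graphviz package to emit DOT syntax. Node
# names are hypothetical. Printing g.source works without the Graphviz
# system binary installed; rendering to an image would require it.

import graphviz

g = graphviz.Digraph("pipeline")
g.attr("node", shape="box", style="filled", fillcolor="white")

# Green nodes: things the user does.
g.node("enter_url", "User enters website URL", fillcolor="lightgreen")
g.node("purpose", "User gives purpose prompt", fillcolor="lightgreen")

# Downstream processing stages.
g.node("scrape", "Scrape site content")
g.node("profile", "Build business profile")
g.node("fetch_images", "Fetch image library")
g.node("aesthetic", "Score image aesthetics")
g.node("script", "LLM writes script")
g.node("select", "Select best images")
g.node("render", "Render video")

g.edge("enter_url", "scrape")
g.edge("scrape", "profile")
g.edge("scrape", "fetch_images")   # fork: image work runs in parallel
g.edge("fetch_images", "aesthetic")
g.edge("profile", "script")
g.edge("purpose", "script")
g.edge("script", "select")
g.edge("aesthetic", "select")      # can't select until scores exist
g.edge("select", "render")

# Print the plain-text DOT, which any Graphviz renderer will draw.
print(g.source)
```

The point of keeping the diagram in this text form is exactly what the transcript notes: it stays diffable, maintainable, and easy to paste back into a model for localized edits, rather than living only as a picture.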
S0 (56:11) I think that's great. Obviously, ChatGPT has the DALL-E integration, so I'm familiar with that. But I've been thinking a lot about how sometimes I wanna create something that looks like this, like a graph with text and boxes and all that kind of stuff. I didn't even think to have it just write Graphviz markup or something like that and paste it somewhere else. So I think that's a really cool thing to know it can do. And it's also pretty clear, I don't know, in a year it'll probably just render the Graphviz stuff for you, and you'll be able to move it around and do all that kind of stuff without even necessarily having to chat back and forth after the first round or something like that. I think that would be a cool next step for ChatGPT: jump into an edit mode for something like this.
S1 (56:51) The closest thing I've seen to that so far is DiagramGPT. This is a slightly different notation, but basically, you can prompt in natural language.
S0 (57:02) Yeah.
S1 (57:02) It will then generate, in this case, Mermaid syntax for you in response, and then it'll immediately render your image. And you can then edit the syntax. You can't quite
S0 (57:13) Mhmm.
S1 (57:14) Like, drag and drop within the interface itself. But I think this highlights a really interesting question around, like, what things should be in ChatGPT versus what things should have their own distinct experience, even if there's still, like, a very AI assistant component to it. This is 1, actually, that I would expect lives outside of ChatGPT. Who knows? Right? In the fullness of time, maybe you have, like, dynamic UIs getting generated on the fly. We're starting to see that a little bit already. But I don't think OpenAI is going to say, what we need to do is create, like, a UI where people can edit these graph things. It possibly could do that. GPTs don't really give you the ability to create, like, custom
S0 (57:56) Yeah.
S1 (57:57) Editor experiences yet, anyway. So for now, if you wanna have something like that, you have to bring it to a different app. But increasingly, these are out there as well. Right? They just use ChatGPT plus a renderer. So I had the AI doing all the syntax and then the renderer showing me
S0 (58:14) Yeah.
S1 (58:14) What it actually is, and then going back and continuing the dialogue with ChatGPT. I think you're right.
S0 (58:18) Like, I could see a world where they let developers build their own renderers inside of ChatGPT. Not for, like, really serious stuff. But if you just dabble in it, or make a little video or whatever, having something in the interface so that you can do it in there, like a rough thing, is really helpful. But then, yeah, I think you're right. There will have to be other pro tools, not inside ChatGPT, for people who do nothing but make graphs all day.
S1 (58:42) So here's another 1. This is a recent episode in my life where I had to admit defeat after 10 years of swearing that I would not replace my car until the replacement was self-driving.
S0 (58:55) Wow.
S1 (58:56) And we're not quite there. And I've had 3 kids in the meantime. So I finally had to break down and get a minivan. Like many parents of young kids, I'm like, oh, what my kids do is they really depreciate stuff pretty quickly. So I was like, I think I'll get a used
S0 (59:13) Right.
S1 (59:13) Minivan, because if I get a new 1, it's gonna be pretty used pretty quick anyway. So let me just look at what is out there. Now, anybody who's ever shopped for a used car knows that it's a total jungle. Right? The car dealer websites are terrible. What features they have is a huge question. And what you end up encountering very quickly is these trim levels, which, if you're not, like, a car person, you may not even know what that is. But you've got your make, which is the brand of the car, your Chevrolet or your Toyota or whatever. You've got your model, which is the kind of car. The Dodge Caravan, that's the make and model. And then you've got this trim, which is often just, like, a couple letters or whatever. It's like the XRT or the SRT or the Limited or whatever. They just have all these little codes, and these are package levels. Right? What features, what upsells have been included? Does it have a sunroof? Does it have a screen in the back that drops down out of the ceiling for the kids, or whatever? Right? And it's just a jungle to even try to figure out what levels there are and what those things have. So this is Perplexity, which is a great complement to ChatGPT. It is more specifically focused on answering questions. So in this way, it's a more direct rival to a Google search. It's not so much meant to be, like, a brainstorming partner. They really aim for accurate answers to concrete questions, and they do a phenomenal job of it. So here, I had a number of runs of this as well, different kinds of questions or whatever. But, okay, these minivans that are, like, not super old but pretty cheap, what do they have? What do they not have? And this would have taken, I don't even know. If I had really tried, honestly, I wouldn't have done it. Right? This is 1 of those things that I just wouldn't do. Because if you had set out to go collate, okay, here's all the makes and the models and the trims and what they have, you're gonna be in, like, user manuals or something. I don't even know really where that information is stored at ground truth. But just in asking that question, I was able to get the trim levels for all of the different brands for this window of time, and just easily get a handle that I could reference back to. Okay, this 1 on this dealer site, it doesn't have any pictures, it doesn't say anything. But it does say, for example, oh, it's an SXT. Okay, cool. Now I can at least know that is the second of however many trim levels or whatever. So the SE, well, that's your top 1. Your SXT, that's your... you can imagine, right, trying to sort this out on your own. And then you get the AVP slash SE. Who comes up with this stuff? It's ridiculous, but super useful if you're like, I don't wanna drive across Metro Detroit to go look at this minivan if it doesn't have something that I really cared about. And the things that I zeroed in on were fairly basic safety features. I wanted the blind spot detection and the backup camera. So there were other questions too. Like, when did USB charging get introduced into cars in general? I didn't know the answer to that. I'm old enough to remember when you had to plug the thing into the lighter, and I didn't want that. I don't want a car that's so old that I have to use the lighter outlet anymore. I want a car that's at least into the, like, USB charger era. But when did the USB charger era begin for cars?
That was another 1 that Perplexity was able to answer, and it is so good. I think this is about to be a huge trend, if I had to guess, because I've been a big fan of this app for a while. I had the CEO, Aravind, on the Cognitive Revolution twice, and they just ship super fast. They win head to head comparisons for answer accuracy. The product itself is super fast. It's got a great UI with these sources, and it's starting to become more multimodal with images as well, which is relatively new. It's a great experience all the way around. And I see it as, like, setting a new standard for answers. I'm starting to use the term "perplexity" to say, I'm not sure this is necessarily rock solid ground truth. Like, Perplexity is not always right, but it's the most accurate AI tool; it's usually right in my experience. You might be able to find something here that is wrong, but everything I ended up fact checking turned out to be true. And so I think there's this kind of very interesting good enough for practical purposes standard, where I don't necessarily need it to be 100 percent accurate for it to be very useful. And I would make my decisions. Did I trust it enough, for example, to be confident that there was, in fact, gonna be a USB charger in the car that I went to go look at? Yes. And, in fact, it was correct about that. And so I have this kind of Perplexity standard of verification in my mind now, where I'm like, yeah, in many situations, it's, like, good enough to act on. I wouldn't make life and death decisions without more fact checking, but I don't even need to follow these links in most cases now. For something like this, I'll trust it. And it's an emerging standard in the family as well. My wife asks, do we really wanna get a car that's that old? Do they have this? Do they have that? And I was able to ask Perplexity and send her, like, yep, it should have a backup camera per Perplexity. It should have a USB charger. It should have the blind spot detection. And it's an incredible time saver. I think it's a worthy alternative to even something like a Wirecutter, which has been the standard that my wife has used for a long time. But, obviously, that's an editorial approach where you can't just ask any question you wanna ask. Here, you can ask any question you wanna ask, and I think you oftentimes do get something that is, like, a worthy rival even to a much more editorial product.
S0 (1:05:19) No, that makes perfect sense. It reminds me of Wirecutter. It reminds me of all those sites like that, but for this new generation, where previously no 1 had thought to ask this particular question, and it can just gather and answer the question for you immediately. And I think that's so powerful. Like, it's really starting to click for me when and how I might use it. There are so many questions I could use this for. I'm like, I basically wanna get to the best answer for a fact based question, more or less, and I'm so lazy. I really don't wanna do all the research, and ChatGPT will kinda, like, do 1 search and then sort of crib the first article, and this feels a lot better than that.
S1 (1:06:03) Yeah. It's really good. It's faster than ChatGPT on the browsing side, so you're getting to an answer notably faster, and it's marginally more accurate. It's just more of the sort of answer that I want a lot of times. Like, I've had a couple of instances where I tried the same thing with ChatGPT, and I was able to get there, but it was, like, slower on the browse, and it didn't give me the full answer the first time. I was like, no, I need a little more. And then I was able to get over the hump and get there, but this was definitely just a faster, cleaner experience that I do believe is a bit more accurate as well. It goes to show that there are different roles that you want AI to play, and it's interesting, there are forces pushing both ways. Right? What makes the AIs so compelling is that they're extremely general purpose, and it seems like there is, like, a fundamental reality that they get really powerful at scale, and to scale, they have to be general purpose, and so that kind of comes as a package. But here, the scope has been narrowed, and there are a lot of things that ChatGPT does for people that this is not
S0 (1:07:12) not Yeah.
S1 (1:07:13) Trying to do for people. And in its specialization, it does seem to be achieving higher heights in the domain that it really attempts to be best in. So I definitely recommend Perplexity
S0 (1:07:28) Yeah.
S1 (1:07:28) A lot. And I'm just old enough to remember when people were first saying that they were Googling it. And this has a similar vibe to me, where it's a standard that I think people can comfortably socially transact on and feel like they're on pretty solid ground.
S0 (1:07:44) I love this. Like, you're using it to build stuff, but also really using it to fuel your curiosity. And I'm curious, you know, before we wrap up, what are you excited about now? What are you thinking about right now? Like, what's on your radar that you think people should be paying attention to, in ChatGPT maybe specifically, but, like, broadly in AI over the next, you know, couple years?
S1 (1:08:05) Boy, broadly in AI over the next couple years, I think almost anything's possible. I take the leaders of the field pretty much at their word, in terms of being honest reflections of their expectations. And you listen to what Sam Altman thinks might happen over the next couple of years, you listen to what Dario Amodei from Anthropic thinks might happen over the next couple of years, and we are potentially looking at something that is superhuman in very substantial and meaningful ways. I think there's a lot of conflation and talking past 1 another when people try to analyze this. And I do think it's important to say you can be superhuman in very consequential ways without being, like, omnipotent or infallible. And I think there's actually quite a lot of space, right, between, like, human performance and omnipotence or infallibility. And I kind of expect that AI is gonna land there for a lot of different things over the next couple of years. So I think the value of the kinds of things that we will be engaging with it for is only headed up. Just take a recent result from Google DeepMind on using their best language models for differential diagnosis, which was an extremely striking result. This team has been on an absolute tear. It was only maybe, like, a year ago that they first got a language model to, like, hit passing level on medical licensing tests, which, hey, that's crazy. But you could just kind of say, well, it's a test, it's more structured, the real world is messy, and they're only passing. You wouldn't want a doctor that's just merely passing. Okay, guess what? They didn't stop there. Next thing you know, it was hitting expert level performance on the test. Next thing you know, they've added multimodality, and it can now do a pretty good job of reading your x-rays and other tissue slides. And again, is it perfect? No. It would probably be on the lower end of what the actual human radiologist could do. Although even there, it was, like, sixty-forty, I think, like 60% to 40%, that the human radiologist was beating the AI radiologist. So, okay, that's a pretty narrow margin, and obviously, we're not done. The current thing is taking case studies out of medical journals, case studies being, like, extremely hard to figure out cases. Right? When a case gets reported in a medical journal, that's because this case, you know, is thought to be highly instructive. Right? It was a confusing situation. It's an unfamiliar combination of symptoms or what have you. So they don't publish just the routine cold, right, in the journals. So they take these case studies out of journals, and they had a study comparing the AI's effectiveness at differential diagnosis versus a human with access to AI. And the AI alone was the best by, like, a significant margin. The human alone was last. And in their presentation of this, they're very modest, and they take almost, in my view, a too grounded approach, willfully burying the lede at times, it seems. And 1 of the main conclusions of the paper was that we need better interfaces so that doctors can take better advantage of this. To me, yes, that's 1 lesson I would take away from this paper. But the other lesson is that the AI is getting it right, like, twice as often as the human clinician, like 60% to 30%. That's another big lesson that I take away from a lot of these things. We don't often measure human performance.
We've lived in a world for a long time where, like, a human doctor is the standard. Obviously, we know that, like, some are better than others, but we look at that as the standard: there's a human doctor, and they're licensed, and they're supposed to be good. But, like, how often do they get the right diagnosis? It turned out in this particular dataset, it was in the ballpark of 30%. So there's a lot of room for improvement. And you could perhaps say, what does the best doctor in the world do? That best doctor in the world, I'm sure, is a lot better, maybe even better than the 60% that their language model was able to do. But you probably can't access that person. We are apparently headed for a world where you should be able to access that AI doctor. And if it's a 2x better performance on such a challenging task as differential diagnosis, then I think we're headed for a world of radical access to expertise, which I think is going to come at unbelievably low prices, which I think is going to be a transformative force in society. Right? It's going to be 1 of the greatest blows ever struck for equality of opportunity, equality of access, in many ways. It's also going to change a lot of market dynamics and change what wages can be commanded for different kinds of services. I'm excited about that. I also think it probably is gonna be fairly disruptive, and it probably is gonna become more and more political. But the upside of that, I think, is pretty clear and really extremely compelling. So I hope we do get to actually enjoy the fruits of that future. Then 1 other thing I'll say is just, the transformer is not the end of history. ChatGPT is not the end of history. This sort of no memory AI... just this last week or 2, we've seen a flurry of activity around the state space model architecture. And, again, the headlines, if you're on Twitter and seeing this stuff, are, hey, there's a new thing that might even be better than the transformer. It might be a transformer successor. It might be a transformer alternative. It might be a transformer replacement. It has some nice properties that transformers don't have: better long term memory, better scaling, better speed, better throughput. Maybe we all just flip over from 1 to the other, as if the transformer was the old thing and this is the new thing. But I strongly suspect that what we are going to see is a mixture of these architectures
S0 (1:14:04) Mhmm.
S1 (1:14:05) Where, just like in the brain, we obviously don't have just 1 single unit of the brain that gets repeated over and over again. We have a lot of different modules, including some that do get repeated. It seems like we're almost for sure headed for AIs that are, like, composites of different kinds of architectures that bring their own strengths and weaknesses in information processing to the table. So as much as it has been a shocking amount of progress to get from GPT-2 to GPT-4 in just 4 years, I have to say, I think the next few years are going to bring at least as much more change, and it's gonna be a wild ride.
S0 (1:14:42) It's exciting. It's inspiring. I'm excited for the future, and I really appreciate you taking the time to share your thoughts and show us how you use ChatGPT. And I'd love to have you back and see where we are, see what new stuff comes up on the horizon.
S1 (1:14:56) Yeah. Thank you. I appreciate the opportunity, Dan. This has been a lot of fun, and I definitely learned some things and was inspired to go chase down a few more use cases as well. So hopefully next time, I'll have some better custom instructions and a little bit better track record in the brainstorming department. I think it's been a great exchange.
S0 (1:15:11) Great. Yeah. Thanks a lot.
S1 (1:15:13) It is both energizing and enlightening to hear why people listen and learn what they value about the show. So please don't hesitate to reach out via email at tcr@turpentine.co, or you can DM me on the social media platform of your choice.