The Presentation Revolution with Jon Noronha, Co-Founder of Gamma

In this episode, Nathan sits down with Jon Noronha, co-founder of Gamma. Gamma is a new medium for presenting ideas, letting you focus on your content and get beautiful, engaging output without the formatting work. Jon and Nathan discuss the journey of building Gamma, how to coax AI to do things well, and the opportunity for AI A/B testing. If you're looking for an ERP platform, check out our sponsor, NetSuite: http://netsuite.com/cognitive

TIMESTAMPS:
(00:00:00) - Preview
(00:01:23) - Nathan introduces Jon Noronha
(00:06:36) - Intro - Jon Noronha introduces himself and Gamma
(00:09:36) - Gamma's origin story: building the "anti-PowerPoint" pre-GPT-3
(00:13:15) - From using AI for onboarding to generating near-complete presentations
(00:15:11) - Sponsors: NetSuite | Omneky
(00:18:01) - If Notion and Canva had a baby
(00:23:35) - Searching for the right structure for AI, landing on HTML
(00:28:16) - Degrees of freedom - developing a constrained vocabulary of semantic blocks
(00:29:30) - Choice of model(s) - weighing cost, speed, and quality tradeoffs
(00:35:18) - Editing UI - getting AI to edit slides per user command
(00:38:04) - UI for gathering training data - following Midjourney's example
(00:40:00) - AI A/B testing: a big opportunity space
(00:43:22) - Image generation models - Baseten and Gamma's deployment of it
(00:45:37) - Pacing feature development - trying to stay ahead of innovations like 3D, video, voice, etc.
(00:47:45) - Image generation for slide decks and AI involvement
(00:50:29) - Gamma's main building priorities and AI-generated websites
(00:52:34) - Competing with giants like Microsoft and Google
(00:55:04) - Enterprise AI shipping challenges - startups pioneer new patterns; giants will eventually follow
(00:57:45) - Simplifying complexity - will models smooth over the complexity of bloated product suites?
(00:59:22) - Conclusion and final thoughts

X/TWITTER:

@thatsjonsense (Jon)
@labenz (Nathan)
@eriktorenberg
@CogRev_Podcast

LINKS:

Gamma: https://gamma.app/


SPONSORS: NetSuite | Omneky

NetSuite has spent 25 years providing financial software for all your business needs. More than 36,000 businesses have already upgraded to NetSuite by Oracle, gaining visibility and control over their financials, inventory, HR, eCommerce, and more. If you're looking for an ERP platform ✅ head to NetSuite: http://netsuite.com/cognitive and download your own customized KPI checklist.

Omneky is an omnichannel creative generation platform that lets you launch hundreds of thousands of ad iterations that actually work, customized across all platforms, with the click of a button. Omneky combines generative AI and real-time advertising data. Mention "Cog Rev" for 10% off.


Full Transcript

Jon Noronha: 0:00 I don't think we had any inkling of how powerful that idea would be or how well AI could do the job. We thought it would just be an initial rough skeleton that you could use just to sort of show you what our product could do. It would mix in kind of all the different bits. And we started working on that, and the thing that we found was that it could do much better than we expected. It could actually generate a full presentation and find images and create layouts. It became this really powerful way of showcasing all of the building blocks we built up over the previous 2 or 3 years.

But then, what amazed us was that many people considered that a finished product. They would say, oh, you've basically made my whole presentation for me. Yes, I'll finesse it here or there, but you've now taken this task that would have taken me 10 or 20 hours of preparation, and you've done 90% of it. That was probably always the mission we wanted to achieve, if you think back to, oh, your presentation's due tomorrow. How do we solve it? But it never occurred to us that AI could solve so many of the pieces of it along the way.

Nathan Labenz: 0:56 Hello, and welcome to the Cognitive Revolution, where we interview visionary researchers, entrepreneurs, and builders working on the frontier of artificial intelligence. Each week, we'll explore their revolutionary ideas, and together, we'll build a picture of how AI technology will transform work, life, and society in the coming years. I'm Nathan Labenz, joined by my cohost, Erik Torenberg.

Hello, and welcome back to the Cognitive Revolution. Today, my guest is Jon Noronha, cofounder of Gamma, an AI powered presentation and website creation tool that invites users to just start writing, promising beautiful, engaging content with none of the formatting and design work.

Now, as recently as a few years ago, right up until the point when Jon was starting Gamma actually, it was generally agreed that AI would first automate manual labor, then maybe knowledge work, and then maybe last of all, if ever, creativity. But as it turns out, creative technology has actually been one of the very first markets to be affected by generative AI, and I've been thinking a lot about why that is.

Reflecting on this conversation with Jon, a few things definitely jump out. First, the blank page problem is real. People are often at a loss for creative ideas. GPT-3 level LLMs were not super powerful, but they were capable of brainstorming creative concepts at lightning speed. And simply helping people get unstuck and into the creative mode can itself be super valuable.

Second, no matter how hard you work on a software UI, many people find the process of learning to use new software tools so frustrating that they'd rather not even try at all. Yet, many of those same people are happy to simply say what they want in plain language. And if the tool is smart enough to get them and to produce something decent that they actually like, then you have an opportunity to draw them in.

And third, it's okay in a creative context if the AI sometimes gets things wrong. Even today, while AI tools have become more reliable for routine work, they still rarely nail creative tasks on the first try. For example, I generated the first version of this essay with Claude 2, but then I ended up rewriting the entire thing myself. That doesn't mean, though, that it was pointless to use AI. By getting the approximate structure on the page, in the slides, or in the video, as the case may be, AI makes it anywhere from 2 to 10 times faster to accomplish the end goal. And what that means in practice is that AI empowers people to make things that they simply couldn't and wouldn't otherwise make.

As you hear, I invited Jon on the show because I recently did a search for the best AI first slide maker, and I found that Gamma was the very best tool that I tried. And I had a ton of fun talking shop with Jon in this conversation. As founders of creative technology companies in the generative AI era, we've recognized and embraced many of the same opportunities, and we've also faced many similar challenges. We get super practical talking about which models we use to maximize performance, which tools we use to measure and monitor production results, which approaches we use to collect user feedback, and what strategies we're betting on to ride the AI wave more nimbly than our competitors who just happen to be some of the world's largest companies.

Whether you're a builder or just curious about how the AI products you use every day are built, I think you're really going to enjoy this episode. Of course, if you are enjoying the show, we always appreciate it when you share an episode on social media with your friends or leave us a review on Apple Podcasts, Spotify, or a comment on YouTube. And we always welcome your feedback. You can email us anytime at tcr@turpentine.co, or you can DM me on the all-new X.com, where I am still @labenz.

Now here's my conversation with Jon Noronha, co-founder of Gamma. Jon Noronha, welcome to the Cognitive Revolution.

Jon Noronha: 5:14 Great to be here.

Nathan Labenz: 5:15 So, we're here because not too long ago, I was doing this little workshop that I call savvy shopping for AI products with a group of, I'd say, 100 executive assistants that I'm training to use AI tools of all sorts. And one of the things I try to teach them how to do is identify which are the good products and which ones are just thin wrappers, or whatever you want to call them, and just not good.

So in front of this group, I did a search for the best AI slide maker that I could find, starting, as I often do these days, with Perplexity and asking it to tell me what the best, most highly rated AI slide makers are. And then I actually just tested a bunch of them live in front of the group. And there was pretty unanimous agreement across the group that yours, which is Gamma, was the most successful and most likely to be adopted AI slide maker. So, I thought that was pretty cool. And then I figured, well, hey, let's do a podcast and learn more about it.

Jon Noronha: 6:19 That's awesome. Well, thank you, Perplexity, and thank you, the army of EAs, for rating us that way. That's great to hear.

Nathan Labenz: 6:25 It's a cool experience. It's at least getting toward what people sort of imagine the future to be like. So I'm really interested to unpack some of the decisions that you've made and the way you've gone about building the product. I guess for starters, though, just love to contextualize where you are and kind of how the company and AI related to each other at the beginning.

If I understand correctly, just from kind of checking you guys out online and trying to figure out the timeline, it seems like you started building this product before there was clear line of sight to it being a heavily AI driven product. Correct me if I'm wrong, but tell me where you started, what the original motivation was and what role AI played in that, and how things have changed as you've been building.

Jon Noronha: 7:05 Yeah, totally. It was certainly not our original goal or plan, mainly because when we started the company, AI was not all that good yet. So we started the company in 2020, right in the depths of the pandemic, and we've made our own zigzag through the idea maze deciding what our core focus would be. But always, the meat of it was, we wanted to build the anti-PowerPoint.

And what I mean by that is, if I tell you that, hey Nathan, tomorrow you have a huge PowerPoint presentation you need to present, it's high stakes. Get it ready. Do you feel a thrill of excitement, or do you feel a cringe of ick at that thought? And for most people, it's the ick that they feel. And there's any number of reasons for that, not all of them the fault of PowerPoint the software. It's the nervousness of public speaking and presenting. It's having your work judged like a book by its cover rather than by the quality of your ideas. All of those things are sort of part of the core presentation experience.

But there's also a lot that comes from the software, and maybe more than that, the format or the medium of a presentation. It's visual communication, which is something that most of us don't really learn to do very much in school or in the rest of our career. I have to basically make a high stakes, good looking thing using tools I'm not super familiar with, to drag boxes around, basically. The format itself is highly linear, so I have to really plan out and thoughtfully execute a story with a beginning, middle, and an end. And then, basically, I have to fill a bunch of rectangles, and they're all kind of the same size and shape, and so I often have to rework my ideas to fit on that rectangle. That might mean finding clipart to go on one side, or it might mean cutting down my text to fit on 2 slides, or whatever it is.

And so our mission was, let's see how we can just rethink all of that by building both a better way to make presentations and also a new medium that was an alternative to the typical slide deck, so combining elements of the document and the slide together, combining interactive elements like a web page would have into slides. It didn't even really occur to us at the start that AI would have a major role to play here. In fact, I tried using GPT-3 back in 2020 when we started the company to do some pieces of this, and it just couldn't. The technology wasn't there. And I don't think any of us realized how far it would accelerate in the last few years to be able to contribute until now when it does in a major way.

Nathan Labenz: 9:19 It's funny. I've kind of lived a somewhat parallel life, I think, in building my own company and product at Waymark. Listeners know this, but just for your background, I was the founder and CEO there for a long time. And then became totally obsessed with everything AI about 2 years ago, which I'd always been interested in, but hadn't really fully committed my intellectual effort to. And now I'm just all AI. Fortunately, I had a good friend and teammate who was able to take over for me as CEO.

But similar kind of thing where we were like, in our case, it's video. People need to communicate with video. There's all of these different placements. Our typical audience is small business. Sounds like your audience is more kind of general white collar professional. But people have to communicate in this way, and it's not something they really know how to do. And it's hard even for professionals, but it's basically impossible for amateurs in the case of video. Well, TikTok shows that talented amateurs can do it, but that's still a pretty small percentage of people.

So it's hard, and people just basically don't make stuff most of the time because it's just insurmountable. So we had been also working, and I want to hear your kind of take on this, we had been working on interface, a lot of UI, trying to make things intuitive, trying to make it browser accessible. What were the big things that you had kind of prioritized before AI came into the picture that you thought was going to solve this problem? And then how did AI kind of layer on top of that and change it?

Jon Noronha: 10:52 Yeah, it's a great question. So where we had really begun was thinking through what is the sort of format that we're trying to create that you can present but is not quite a presentation. And so, for us, what we really prioritized was building an experience that was a writing based experience for making a presentation.

So, we took a lot of inspiration from tools like Notion, where you could just type on the page in a very freeing way, but then pull in all these different powerful blocks and elements of multimedia and embedding things and all of that. And so we'd actually really been building this rich text editor. We'd also really been building in more mobile responsiveness. We had this idea that a presentation isn't just a thing you present live. It's also a thing that you send around ahead of the meeting for someone to read, or after the meeting for people to debrief on and discuss and comment on.

So we wanted it to work beautifully on a big screen when you present it on the TV, but also be something that you can consume on your phone when you send it around. So a lot of that's sort of responsive reading and viewing. The idea is that what we almost ended up creating was a web page builder for presentations. It had a lot of the elements of WYSIWYG creation and everything, and it was, in many ways, more limited than what a typical slide deck is. It wasn't about dragging rectangles around, because it had to be responsive and reflowing in all these cases. And so we sort of pared back a lot of those elements to make it simple and writing based.

Ironically, and I can't even say intentionally, this turned out to be a brilliant decision in light of AI, because so much of where AI came along is that it's this tool that can turn writing into anything, and it can also write anything. So the fact that we had large language models come along means that we built this interface for a human to make a presentation like they're writing a doc. And then we basically had AI come along that can write a doc about almost anything, and so we sort of put those 2 pieces together.

And what that looked like concretely was we actually originally launched the product just about a year ago, so August 2022, with none of the AI stuff. Our motto at the time was, Write like a doc, present like a deck. We launched on Product Hunt. It actually went better than I expected, to be honest. We did win our first early adopters, people who just really bought the vision and maybe had that grievance against the PowerPoint format along the way. And so we got our core of early adopters, people who really believed and saw the potential of what we were doing.

But at the same time, I wouldn't say that we broke out beyond that initial core. People who really bought our problem space were into it, but it was very hard to take a new person and teach them what a Gamma was. It was this thing that was kind of like a presentation, but it certainly didn't have all the feature parity yet. And it had these other ideas, too, which, if you got it, you got it, but most people didn't. They would sign up for our onboarding, they'd watch a real short video of what our product was all about, and then we dumped them into a blank page where we would say, Good luck. Hope you figure it out. And let's say 2% would and 98% wouldn't.

So the funny thing is that where AI first kind of entered the conversation for us last fall was really trying to solve the problem of onboarding. What we wanted to say was, what if instead of you having to make your first Gamma yourself, we could get AI to spit out a first Gamma for you so that you would actually see the power of what we were building. And I don't think we had any inkling of how powerful that idea would be or how well AI could do the job. We thought it would just be an initial rough skeleton that you could use just to show you what our product could do. It would mix all the different bits.

And we started working on that and working on it more and more, and the thing that we found was that it could do much better than we expected. It could actually generate a full presentation and find images and create layouts. It became this really powerful way of showcasing all of the building blocks we had built up over the previous 2 or 3 years. But then what amazed us was that many people considered that a finished product. They would say, oh, you basically made my whole presentation for me. Yes, I'll finesse it here or there, but you've now taken this task that would have taken me 10 or 20 hours of preparation, and you've done 90% of it.

That is probably always the mission we want to achieve, if you think back to, like, oh, your presentation's due tomorrow. How do we solve it? But it never occurred to us that AI could solve so many of the pieces of it along the way. And now that we realize it has, it's kind of changed the whole way we think about prioritizing and planning our product, because AI could be at the core of almost every bit of it.

Nathan Labenz: 15:07 Hey, we'll continue our interview in a moment after a word from our sponsors.

Nathan Labenz: 15:11 Yeah, it's fascinating. I can relate to that so much. For us, we tried to use templates as kind of the way to get people somewhat past the blank page problem. And in fact, with Waymark for the longest time, there basically was no concept of a blank or kind of empty starting video. You would always start with some template and there would be content in there. And then our idea was like, you use this content as inspiration, and then you make it your own by putting your own content there. And I think that helped relative to showing people nothing, certainly. But it was still a big hurdle that a lot of people were like, Well, I don't really get or know how to project my content onto this template. It's a little tricky still.

Similarly, we had some people that really took to it and loved it. But then there were definitely a lot of people who were like, I'm still not quite getting how I'm supposed to make something with this. And interestingly for us, it was also not really about the interface. I think we would kind of look for interface solutions to some of these problems. And then I think in retrospect, especially now that we have the AI layer, it has become clear to me that it was not so much an interface problem in many cases as just a conceptual problem of, what am I even supposed to do here? It's not that I don't know what the buttons do, but I don't know what to do writ large. Right?

Jon Noronha: 16:38 Yeah. Like, what is this for? What can I make with this? Why should I even be here? This has been a hard thing for me to grapple with as someone who's always been kind of a product manager by training and assumes that the solution to every problem is product. Better product, more product, different product.

People often describe Gamma as like, if Notion and Canva had a baby. And when we look at both those companies, they're companies that saw so much of their success through, yes, great products, but also these incredibly vast communities and marketplaces of templates. Everybody who came to Notion or Canva came there because they saw someone else make something really compelling with it, and they're like, Oh, I get it. I see the value. And you can think of those template libraries as these incredible assets that those companies develop, often quite painstakingly over a period of years.

So I don't know the numbers, but let's just say Canva has like 10,000 or 50,000 templates in it. And that is a huge part of the reason why you sign up for Canva. That's a very hard advantage for a small startup like ours to overcome. We did not have the resources to create thousands of those things, until AI came along. Suddenly, what we found was we started using AI to make our own templates, and then we were like, Wait, why are we even having AI make templates? Let's just have AI make a perfectly tailored first draft for you and cut out the middleman of the template. And in doing so, hopefully, also cut out the advantage that a lot of these established tools have over us in their large template libraries.

Nathan Labenz: 17:56 So I want to dig into that in just one second. But just to give listeners who probably haven't seen the app a little bit more context on specifically what I thought was really compelling about it: the biggest thing for me right off the top, and I tried this, is I have this thing that I created, the AI scouting report. So I went back to my notes, which I had used before making any slides, and it was just kind of as your use case envisions. Right? Write like a doc and present. That's what I was trying to do. I had my doc and all this bullet point, outline type content. Drop it in there and say, hey, here's what I'm trying to do. Make this.

I'd say 2 things really stood out against everything else that I tried. One was that the slides, or whatever word you use for the individual unit if it's not a slide in this broader context, the individual units all just looked really good. Like, the theme was really nice. The colors, the layout. I'm not even super aesthetically sensitive, to be honest. Folks on the Waymark team know a hundred times more about how to make something really look good than I'll ever know. That's honestly a big part of the reason I was kind of interested in building this sort of thing, because I can't do it without tool assistance.

But I definitely, at this point, recognize that the slides just looked beautiful. And they made that, you know, sub-second impression of: does this look pro? Does it look like it's worth my time at all? I think it was really the only one that passed that test, of the ones we demoed anyway.

And then the other thing that it did that I thought was really effective was it took my raw stuff, some of which was kind of notes and some of which was kind of paragraph-y and definitely not slide ready or even particularly slide friendly. And it put that into a text format that, for the most part, captured what I was trying to say without distorting it too much, but also put it into a format that actually looked like something you would present.

And other things that we tried kind of went off the rails in various ways, either changing my story entirely, which was pretty weird to see, or just not changing it enough and kind of printing out my paragraphs. And then it was like, well, I could have copied this myself. That didn't save me much, right, if I was just going to go copy my paragraph over in giant text blob form. So I thought you guys did a very nice job of striking the balance there between transforming my content so that it is more suited to this purpose of presentation, but not distorting my content too much.

So that's my pitch. I don't know if you would add anything to that that you think is like the big reasons to go try the product, but those were the reasons that I found that seem to distinguish it.

Jon Noronha: 20:36 Well, that was a great pitch. Thank you. Yeah, I mean, that's sort of the core thing we focused on. To give a little more context on it, my first real unlock with this was my first week using ChatGPT and seeing what people were doing with ChatGPT. And I remember people were doing all these things like write a story about a grilled cheese sandwich getting stuck in the VCR, but in the style of biblical verse. And what was amazing about it was, first of all, just the creativity of AI and what it could do, far beyond, I think, what any of us thought.

But also what was amazing was this idea of transformation. And this whole term generative AI has maybe obfuscated the way in which AI is so good at transforming things from one shape to another. So this idea that you can take this and now put it in hip hop lyrics or whatever is a neat trick. It's like a cool toy. But for us, the actual powerful application of it is: take my written bullet points or my vague notes and turn them into a structured, thoughtful presentation. And I think we're all blown away by how well AI can do that if you at least coax it in the right way.

And so, yeah, I would say that's a big part of our sort of secret sauce, is transforming your raw notes into something that feels coherent, compelling, and also visual. That's a big part of it. We have kind of a lot of the visual building blocks that we can drop this into. The other part of it, which I'm sure we'll talk about as well, is not just that first generation of spit out a presentation for me and it's done, but helping me edit and refine as I go. We've also put a lot of work into giving you alternatives or letting you try out different variants on top of where you started to iterate your way towards that right kind of visual output.

Nathan Labenz: 22:04 Yeah, I think that's a really compelling paradigm as well. You said AI can do this well if you coax it the right way. I'd love to get a little bit more into how you do that, and I can tell you a little bit about Waymark's side too. Kind of want to just compare notes, because I think you're totally right that transforming one thing into another is, at least with the current quality of models, often where a huge amount of the value is, as opposed to generating totally new stuff. If I had asked it to write my AI scouting report from scratch, that would have been useless. But having notes and being able to transform them, that is useful, and the models can really do a nice job on that.

Ultimately, you're producing something that's highly visual. It's got layout. It's got image content, along with the text content. It's got structure in terms of bullet points. It has all of this kind of structure. How do you think about representing that structure for the AI? Obviously, folks who listen to the show know enough about this that whatever I dropped into the tool is combined with some instructions, maybe some fine tuned model, lots of things there. But you also have to tell the AI this is kind of the space that you can project into. So how do you think about describing that space and finding the happy medium between you want to have a kind of simple notation, doesn't take up too many tokens, hopefully the AI can rock it. But it also can't be too simple because you do have a pretty rich grammar, so to speak, that you ultimately output to. I'd love to hear how you've approached that.

Jon Noronha: 23:42 Yeah. This was, I would say, the hardest problem for us in getting this off the ground from the idea stage to the reality. We tried so many things. We started with just plain text, and true enough, it can generate plain text with bullets in it. And then we said, but it needs to have visuals and layout. And so we went down a whole path of trying to just do everything through JSON. Let's represent this as structured data: I want to have a timeline that's going to have three steps in it, and these are the things in those steps. And we frustratingly got that to the point of working like 85% well. It would work some of the time, and then your curly braces and your semicolons would end up in the wrong place, and it would all fall apart. We cycled through, oh, let's try YAML instead. Let's go through this. Let's go through that. Or let's try just doing text and then generating the formats.

We even explored thinking through more of the PowerPoint model, which is just: draw me this. I need a rectangle here or a rectangle there. And that's where the AI really falls down, because even though the AI seems like it can see, it really can't. It's just getting text input and output. And so generating things on a grid almost never goes well. I remember trying to get it to draw a pyramid shape, like three levels of a pyramid, like you would see on a slide deck. And it could not draw a pyramid of triangles to save its life.

And so actually, the thing we ended up hitting on, which ties to our whole story where we came from, was HTML. I think we realized that we had to work in a format that the model itself knew really well. And these models are trained by scraping the web all over the place. And so they've just seen huge amounts of HTML and stuff that looks like it. And I alluded to, we had already built our Presentation Builder through the lens of website creation, so mobile, responsive, made up of these different blocks that stack on top of each other. And so we realized the big unlock was, could we actually generate the input and output as HTML and then convert that HTML into our format?

And that had a lot of benefits in terms of there's a lot we didn't have to teach the AI at all because it already knew how to make stuff bold or how to make things into a table or how to make bullet points. That all came for free in its training data. And then we could just teach it the specific tags and custom elements that are unique to us. And even more where it's come in handy is when it comes to editing and refining, we can feed back our data in that format and let it riff on it. We basically prompt it and say, you are the world's best web designer. You have a gift for writing clean, structured HTML. Your client gave you this HTML. It doesn't do quite what it's supposed to do. They want you to change this thing about it, and then we can apply tweaks on top of that. And that turned out to be the unlock that really made AI powerful in doing this.
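To make that pattern concrete, here is a minimal sketch of the HTML-as-interchange editing loop Jon describes. Only the general shape comes from the conversation: feed the card's existing HTML back to a chat model along with the user's instruction, and parse the returned HTML into the editor's block format. The persona wording, the custom tag list, and the helper function are illustrative assumptions, not Gamma's actual code.

```python
# Sketch of the HTML-as-interchange editing pattern described above.
# The persona text, custom tags, and function names are hypothetical.
from openai import OpenAI

client = OpenAI()

SYSTEM_PROMPT = (
    "You are the world's best web designer, with a gift for writing clean, "
    "structured HTML. A client will give you the HTML for one card of a "
    "presentation, plus a change they want. Return only the full, updated "
    "HTML. Use standard tags plus these custom elements: "
    "<card>, <columns>, <timeline>, <callout>."  # hypothetical tag vocabulary
)

def edit_card(card_html: str, instruction: str) -> str:
    """Apply one user-requested edit to one card's HTML."""
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": (
                f"Here is the client's HTML:\n{card_html}\n\n"
                f"Requested change: {instruction}\n"
                "Return only the updated HTML."
            )},
        ],
        temperature=0.3,
    )
    return response.choices[0].message.content

# e.g. edit_card(current_html, "translate this card into Spanish");
# the result is parsed back into the editor's native block representation.
```

The design choice worth noting is that the model is asked to speak a dialect it already knows from pretraining, so bold text, tables, and bullets come for free; only the product-specific elements have to be taught in the prompt.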

Nathan Labenz: 26:17 There's certainly some commonalities with Waymark, where I was just kind of in pursuit of the most natural seeming, obviously text only, representation that I could come up with that still kind of deterministically could be rehydrated back into the full video form. Right? So we kind of said, this is the format that the AI has to generate. And if it generates in this format, then, again, we can map that back onto the full visual video space. And for us, it did get pretty simple in the end, partly because our templates are pretty well formed. So you're not in general choosing specific locations or whatever. That's kind of all baked in on some level to the template. So you're really mostly focused on just the content. What's the copy going to be? What are the images going to be? Etcetera, etcetera.

It seems like you have kind of even more degrees of freedom. But as you say, you can't give it infinite degrees of freedom. You presumably are not giving it the level of freedom to specify how many pixels something is from the top or from the left. Right? So do you have a vocabulary that's like a set of classes that it can apply that are kind of semantic, like left side, tall, right upper corner? I imagine a vocabulary here.

Jon Noronha: 27:42 We do, yeah. We have very structural elements, and they all correspond to actual things in our interface. That's kind of what led us to add this AI layer on top: we built all of these kind of semantic building blocks that were things like a side by side layout or a table or columns or a timeline or whatever it was. We had those building blocks, and so then we basically teach the AI, using a lot of our precious tokens in every prompt: here are the building blocks you're allowed to use. Here are examples of how they work. Now, see what you can do with this prompt.
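As a rough illustration of what that might look like mechanically, here is a sketch of a constrained block vocabulary assembled into a few-shot prompt. The block names and examples are invented for illustration; Gamma's actual tags and prompt text are not public.

```python
# Illustrative constrained vocabulary of semantic blocks, spent on every
# prompt as described above. All names and examples here are invented.
ALLOWED_BLOCKS = {
    "columns": "<columns><col>...</col><col>...</col></columns>",
    "timeline": "<timeline><step>...</step><step>...</step></timeline>",
    "side-by-side": "<side-by-side><media/><text>...</text></side-by-side>",
}

FEW_SHOT_EXAMPLES = [
    ("Outline: three phases of a product launch",
     "<timeline><step>Research</step><step>Build</step><step>Launch</step></timeline>"),
]

def build_prompt(user_notes: str) -> str:
    """Assemble the generation prompt: vocabulary, examples, then the task."""
    vocab = "\n".join(f"- {name}: {ex}" for name, ex in ALLOWED_BLOCKS.items())
    shots = "\n\n".join(f"Input: {i}\nOutput: {o}" for i, o in FEW_SHOT_EXAMPLES)
    return (
        "You may only use these building blocks:\n"
        f"{vocab}\n\nExamples:\n{shots}\n\n"
        f"Input: {user_notes}\nOutput:"
    )
```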

Nathan Labenz: 28:12 So, what's the model journey that you've been on? You're talking about prompting with pretty detailed instructions, which suggests maybe not fine tuning. So you're able to use an off the shelf commercial model with a developed prompt?

Jon Noronha: 28:30 Generally, the GPT models from OpenAI. They have so far been the leaders for us, although we are now at a scale and a point where I think we're going to start shifting back towards something like fine tuning, or maybe even training our own models. We haven't really made a decision there. But when we were getting started, we didn't have a huge vein of data to begin with. We were starting from scratch. And so for us, few-shot prompting, with just a few examples, worked a lot better than trying to fine tune.

We also happened to be launching earlier this year right at the time that OpenAI made the jump from GPT-3 to GPT-3.5, which totally changed the interface, also totally changed the cost curve by making everything 10 times cheaper, and there was no fine tuning on it. And so in some ways, it made the decision for us. It's like, is fine tuning really going to be 10 times more effective for us? The answer then was no, but something I'm really keenly watching is the evolution of open source models and of different foundation model providers. And so I think the story is actually about to invert. I think we're about to be able to take now this huge number of things we've sort of generated as input to say, okay, how can we actually train our own custom model and start to use that scale as an advantage?

Nathan Labenz: 29:35 Yeah, this is a really interesting topic. Obviously, everybody's watching the open source developments, and we're now in the Llama 2 phase of history in the open source world. I'd love to hear more about how you think about these trade offs. This kind of line of thinking, I guess, went mainstream with the "no moat" Google memo. I think that article certainly got a lot right about all the great things going on in open source. I'd also say it got some things wrong, in that I still definitely think there are some moats.

But from an individual application developer standpoint, 3.5, as you noted, is cheap, it is fast, and it is scalable. They obviously have very nice parallelizability, and you only get charged for the tokens you actually use. What's the other side of that look like for you? And what would be the reasons to go toward the open source? Where is 3.5 not doing it for you? Let's start there. What would you hope to make better?

Jon Noronha: 30:36 Yeah. So I think there's a couple areas, or a couple dimensions, that you could think of. One of them is just overall intelligence. So GPT-3.5 is remarkable, certainly, but GPT-4 is still better. It's just sort of hugely expensive and slow for what it does. And so the dream is, can you get GPT-4 level performance while still getting all those qualities you named in 3.5? And I don't know if we're there yet with any of the open source models that I've seen, but I think fine tuning is one path for how to get there.

I think the other thing is that GPT-3.5 is really trained to be a chatbot. It really wants to be a helpful chatbot. And many things in life do not fit the schema of being a helpful chatbot, including generating an entire presentation. And so I think this is where the opportunity is to actually take other models in different directions and have full control over that experience. And so for us, that might mean training it for more specialized tasks, but it also might just mean really pushing the use case of document generation and editing beyond what chat type tools are made for. But, yeah, I'm also curious for your experience on this and how you would sort of weigh the different models and think about the open source stuff.

Nathan Labenz: 31:41 I'm not sure. Right now, we are still using a fine tuned OpenAI model, very much considering, is it time to move on to something else? Our approach has historically been not too focused on the cost, because we do have very high value users. I think this is just more a result of how we've gone to market and the fact that we've sold to big companies. So you could set a product like ours up, it's probably similar with yours, right, you could go directly to users. We've done some of that historically, but we've actually found the most business success going to larger organizations that need to scale this sort of video production and selling to them on a licensed basis.

With that, our emphasis has always been maximize quality for them. The AI, in terms of the overall revenue and cost structure, is not that big of a deal. So we've never shied away from kind of paying top dollar for tokens. But I'm now getting to the point where I'm like, yeah, maybe this fine tuned model could soon be eclipsed by something else. And maybe it's not necessarily eclipsed in terms of overall quality of output, but maybe latency, maybe cost, not because we're trying to save cost, but because maybe, if we could generate multiple things at a time, then instead of one, you get to look at three different things or whatever. So it just feels like there's a lot of possibility.

3.5 actually is one of the things that I've been considering. I don't think it would probably work super well if we had a hard coded, single prompt to rule them all. I think we would have a hard time matching our fine tuned model. I haven't proven this yet, but I'm pretty optimistic that if we did a sort of dynamic prompt where we're bringing in more relevant examples for it to borrow from, we might get there. I haven't really considered anything specific to the chat modality. What are you seeing? Is there specific weird behavior, grounded in the chat nature of the thing, that's causing you trouble?

Jon Noronha: (33:53) Yeah, sometimes. What's funny is we had this whole part of our product we didn't talk about as much where, in Gamma, once you generated your presentation, you can say, I want to work on—we call them cards, not slides. That's just us trying to be sort of the anti-PowerPoint and therefore not necessarily borrow all the same language, but you can think of it as a slide.

So, I'm going off and I'm editing slide 3, and I say, oh, I want you to change the layout of this. I want you to make it look more visual or make this more concise, or I want you to translate this into Spanish, or whatever it is. You can basically talk to your card and transform it. That's, in my opinion, the coolest part of our product, although it's also the trickiest one to get right, because you mentioned degrees of freedom. There are so many degrees of freedom of what you can even ask for, what your content looks like, where it can go, and so it really doesn't work 100% of the time.

If I'm being honest, I think it works well about half the time. The other half, it doesn't. There are multiple reasons why it doesn't work well that other half, but one of the reasons is that we've sort of co-opted the chat interface to power this, and sometimes the chat interface just wants to be so helpful it won't actually follow your instructions. It just wants to start chatting with you and being a friend, and so I think that's a place where fine-tuning or just using open source models that aren't pushed in this direction can be really useful for us.

Nathan Labenz: (35:05) Yeah, that's related to the recent "GPT-4 is getting worse" brief news cycle where, as it later became clear, the authors had run this coding benchmark. And what they didn't account for was the fact that with the recent update, the model GPT-4 started returning basically just a markdown wrapper around the code. And they were just executing the benchmark in the programmatic way that benchmarks are often executed at our peril, I would contend. And then they're like, oh, this code doesn't even execute. It's got syntax errors in it. What a terrible regression. When you look at it and you're like, oh, well, it is actually responding pretty sensibly. I mean, it feels like that's something you should be able to get under control, but it sounds like you're seeing more opinionated responses where it's not just that it's a syntax thing, but it's trying to take you in a different direction or suggest something else. What are those behaviors that you're seeing that you haven't been able to corral?
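For anyone who missed that news cycle, the failure mode is simple to reproduce: the model wraps its answer in a markdown code fence, and a harness that executes the raw reply sees a syntax error. A small sketch of the fix (the fence string is built indirectly so the example itself stays readable):

```python
# Strip a markdown code fence before executing a model reply, avoiding
# the false "syntax error" regression described above.
import re

FENCE = "`" * 3  # the triple-backtick delimiter, built indirectly

def extract_code(reply: str) -> str:
    """Return the code inside a fenced block, or the reply unchanged."""
    match = re.search(FENCE + r"(?:\w+)?\n(.*?)" + FENCE, reply, re.DOTALL)
    return match.group(1) if match else reply

reply = FENCE + "python\nprint('hello')\n" + FENCE
exec(extract_code(reply))  # runs; exec(reply) would raise a SyntaxError
```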

Jon Noronha: (36:10) Yeah. One version of this that we've encountered is you'll be working on a slide about rock music. You'll say, I want a picture of a panda for this. And it'll say back to you, I don't think a panda is a good idea. I think you want a picture of a guitar instead. But I want the panda. I'm in charge.

Nathan Labenz: (36:27) Yeah. Well, that's going to be—when the copilot pushes back, it's going to be a major phenomenon across a lot of different experiences in the years to come.

I agree with you, by the way, that I think the edit with AI experience is one of the coolest aspects of the product. Believe it or not, I didn't even get that far in my initial demo. But in preparing for this conversation, I went back and spent a little bit more time with it. And I thought that was really something that we could take some inspiration from also. We have two layers where, first, it's like, tell us what video you want. We'll make it for you. Boom. Then you have your lower level controls where you can go change any of the copy and tinker with colors and swap out images and crop and whatever. A lot of interface there.

But I do think we stand to improve the product still with an intermediate layer as well. I'm not exactly sure what our version of what you have would be. At a minimum, I think just asking the AI to just make changes even if it's just operating on the whole video. I do think there's room for more verbal iteration before you have to get into that nitty gritty of the UI. So I did take inspiration from that. I thought it was really nice.

The other thing I thought was really good, and I'd love to hear what you've learned from this: you said it only works half the time, and that may be an actual quantified number. Because I also thought it was really smart, and I really don't know why more products don't do this. When you do the chat with AI, the product then shows you back original and suggested update. And then you can click back and forth and be like, before, after, before, after, which one do I want? And presumably, that is a really good feedback signal for you, which maybe you're not even really fully able to take advantage of yet. As you have your eye on fine-tuning, I imagine that is a real source of intelligence about what people actually want.

Jon Noronha: (38:25) Totally. This was inspired by Midjourney, which I think is still maybe the most impressive AI product I've ever seen, despite its, in my opinion, enormously frustrating interface. But what strikes me observing their product development is that so much of it seems tuned toward gathering the data that you need to make the system better over time. So their product's all about generating variations, upscaling the ones you want, and even liking different parts of it, all in the interest of creating that flywheel. And it's funny, because the landscape of how we can use that data is not fully formed. We don't quite know how it will be incorporated. But if you don't have that compass in the first place, you just have no idea where you're going and where you can focus your efforts.
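As a sketch of how that kind of signal could be captured, imagine logging each before/after decision as a preference record. The field names and JSONL storage here are hypothetical, but accepted pairs map naturally onto fine-tuning examples, and rejected ones onto negatives for preference-based training.

```python
# Hypothetical sketch of logging before/after choices as preference data.
import json
import time

def log_edit_feedback(log_path: str, instruction: str, original_html: str,
                      suggested_html: str, accepted: bool) -> None:
    """Append one (instruction, before, after, accepted) record as JSONL."""
    record = {
        "timestamp": time.time(),
        "instruction": instruction,
        "original": original_html,
        "suggestion": suggested_html,
        "accepted": accepted,  # True if the user kept the AI's version
    }
    with open(log_path, "a") as f:
        f.write(json.dumps(record) + "\n")
```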

Nathan Labenz: (39:04) Do you use any system right now for prompt testing, A/B testing? I use Human Loop in some of my activity for stuff like that. Obviously, a lot of people have just built their own homegrown stuff. Have you guys homegrown, DIY'd that kind of stuff? Or have you found any tools that help wrangle this problem?

Jon Noronha: (39:24) So far, it's pretty much homegrown, particularly for prompt evaluation. We built our own little studio for editing our prompt and seeing how it works on different examples. But it really feels like we're in the stone ages of this kind of technology. If I consider how high stakes these prompts are to our business, it feels like the equivalent of running a huge software code base with no automated tests in place, and just going based on vibes to know if the stuff you're changing is making any improvements.

And maybe worth mentioning, my background, and a lot of our team's background at Gamma, is from a company called Optimizely, which basically tried to make A/B testing a standard practice across marketing and software development. And so that mindset of gradual rollouts, A/B testing, and measurement is deeply in our DNA. We've looked for ways to incorporate it, but the tech just isn't there yet on the AI side, and I'm looking forward to seeing what does come out in this space over the next couple months to make this process better, because it's ripe for, I wouldn't even say disruption, just first mile innovation.
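In the spirit of the homegrown studio Jon mentions, a minimal prompt regression harness might look like the sketch below: run a candidate prompt over a fixed set of examples and score cheap automated checks instead of going on vibes. The case structure and example checks are illustrative assumptions.

```python
# Illustrative prompt regression harness: fixed eval cases, cheap checks.
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class EvalCase:
    input_text: str
    checks: List[Callable[[str], bool]]  # each check inspects one output

def run_eval(generate: Callable[[str], str], cases: List[EvalCase]) -> float:
    """Return the fraction of checks passed across all eval cases."""
    passed = total = 0
    for case in cases:
        output = generate(case.input_text)
        for check in case.checks:
            passed += bool(check(output))
            total += 1
    return passed / total if total else 0.0

# Example cheap checks: output uses only allowed tags, stays under a budget.
cases = [
    EvalCase("Quarterly sales update, 3 bullet points",
             [lambda out: "<card>" in out, lambda out: len(out) < 4000]),
]
```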

Nathan Labenz: (40:21) I'd definitely check out Human Loop, but we had CEO Raza Habib on the show some episodes back, and I think they are building some pretty good infrastructure that is specifically focused on app developers as the target customer. Not to be doing BD live on the show here, but I think that would be worth a look.

Going back to the fine-tuning thing for a second, on whether you'd go fine-tune an open source model: we're still with OpenAI, and the reason we have continued to use their fine-tuning is that it also has very nice properties around just paying for what you use. You don't have to have dedicated resources. They handle the auto scaling for you, and they seem to do a very nice job of it, which is much appreciated.

When I've looked into what it would take for us to really bring a fine-tuned model to production: actually doing the fine-tuning at this point, I think, would work. I would have said that probably as of the Mosaic 7B release from maybe a month and a half ago or whatever. That seemed to be the moment where it was like, okay, this is probably going to work for our use case now.

Putting that into production, having inference that auto scales: I'm a big believer in the vision of the Hugging Face inference endpoints. I've looked at Replicate, even Mosaic's service. But the auto scale up and scale down matters if you have any sort of bursty workload, which we do. I don't know if that's a problem for you, but we have quite bursty workloads, where it can be even just as simple as a demo: we work with these big companies, now everybody's together, now they're all doing it at one time.

We also have some things where we process images for users, and because there are many images per user, we're often kicking off jobs like, can we process a hundred or a few hundred images at a time? And we want to return that as quickly as possible. So I haven't been able to get over the hump that it would be worth it for us at this point to go with that open source approach, just because it seems like the inference and the auto scaling and all that stuff is not quite there yet either. What's your outlook on all that?

Jon Noronha: (42:38) Yeah, I mean, I think you're generally right. And this is one of the factors that has held us back from going all in on AI image generation. We do have a lot of images in our product, but we've mostly held off from using AI to generate them. I mean, the biggest reason for that is mostly that they look like crap most of the time, although that is rapidly changing and the quality is improving. But for us, it's generally open source models that are the leading ones in image generation, at least among the ones that are available, and so we'd face exactly that serving problem.

I'll do some BD back on you, though, which is: we're using a system called Baseten that I think actually solves a lot of the issues you brought up pretty well. They both integrate fine-tuning into their platform, so you can just upload a lot of input-output examples, and they do pretty fancy auto scaling, where they figure out what kind of systems to run, handle bursty traffic, and bring servers up and down as needed. We're actually about to roll out our own AI image generation serving powered by them, and I think it's going to work pretty well. Hopefully, by the time this podcast is out, people will be able to try it.

Nathan Labenz: (43:34) So that gives me some insight. Because I noticed that, at least as far as my experience seemed to be, I didn't detect any generated images in what came back. It seemed to be all searching through existing libraries. And so I guess the barrier to that, primarily, as you said, is that they just haven't looked that good until recently. Is it now going to be Stable Diffusion XL that's getting you over the hump into moving into product? Yeah, a lot of these things to me come down to thresholds where it's not good enough until it is. And then when it is, it goes super wide, super quick.

Jon Noronha: (44:09) And it passes so fast from being not good enough to good enough. Image generation is interesting, because I feel like we are right on that line. Actually, Midjourney already passed it, I would say, but open source models just crossed that line. It's not clear they'll stop. I think they're going to keep on going and become pretty breathtaking and incredible as well.

Video generation feels like it's in a similar place, and I'm not sure what new modalities are coming after that. But as sort of a developer of an AI enabled product, what it feels like we're doing is just trying to put ourselves in a position to benefit from these sudden sparks of innovation happening all around the ecosystem. Literally, how can I have a hole to put cool images in, so that when cool images are ready, I can just flip it on? And it's hard to even think about the year ahead, what are all those new opportunities that are going to be emerging that we can take advantage of? 3D models, voice, all these things. It's almost staggering to think of all the places AI can plug in. Jon Noronha: 44:09 And it passes so fast from being not good enough to good enough. Image generation is interesting because I feel like we are right on that line. Actually, Midjourney already passed, I would say, but open source models just crossed that line. It's not clear they'll stop. I think they're going to keep on going and become pretty breathtaking and incredible as well. Video generation feels like it's in a similar place, and I'm not sure what new modalities are coming after that.

But as a developer of an AI-enabled product, what it feels like we're doing is just trying to put ourselves in a position to benefit from these sudden sparks of innovation happening all around the ecosystem. Literally, how can I have a hole to put cool images in, so that when cool images are ready, I can just flip it on? And it's hard to even think about the year ahead, what are all those new opportunities that are going to be emerging that we can take advantage of? 3D models, voice, all these things. It's almost staggering to think of all the places AI can plug in.
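One way to read the "leave a hole for cool images" idea is as a pluggable provider behind a feature flag, so a generative backend can be switched on the moment quality crosses the bar. Here's a minimal sketch, with hypothetical names and no claim that this is Gamma's architecture:

```python
from typing import Protocol

class ImageProvider(Protocol):
    def get_image(self, description: str) -> str:
        """Return a URL for an image matching the description."""

class LibraryProvider:
    def get_image(self, description: str) -> str:
        # Placeholder: search a stock library (e.g., by keyword or embedding).
        return f"https://stock.example.com/search?q={description}"

class GenerativeProvider:
    def get_image(self, description: str) -> str:
        # Placeholder: call a hosted diffusion model and return the result URL.
        return f"https://gen.example.com/images?prompt={description}"

GENERATIVE_IMAGES_ENABLED = False  # flip on when quality crosses the bar

def image_provider() -> ImageProvider:
    return GenerativeProvider() if GENERATIVE_IMAGES_ENABLED else LibraryProvider()

print(image_provider().get_image("team celebrating a product launch"))
```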

Nathan Labenz: 45:00 Yeah. There've been some really incredible voice releases just in the last week or so as well. Shout out to the AI Breakdown, which we actually did a feed swap with not too long ago and shared one of their episodes. He just did one where he had a cloned voice from ElevenLabs read an essay in his voice. And it was insanely good. I literally could barely tell. Even with him specifically calling it out and saying, "Now I'm going to go to the AI voice," and then the AI voice taking over, it was like, wait, that's the AI voice? That was the switch? I mean, it's crazy.

And PlayHT just dropped another one too in the last day or two that sounds phenomenal. So yeah, it's wild. Do you think that as you move to AI image generation, is it a complement to the existing libraries? Does it become kind of your primary go-to? Because there's a whole ball of wax too, as you know perfectly well, around how you select images out of a library. I thought you guys did a nice job of that, certainly relative to most things I've seen, but it does remain a challenge to figure out which of, for example, Shutterstock's hundreds of millions of images not just makes some conceptual sense but actually looks the part. Very tough.

Jon Noronha: 46:21 It's a really hard one, and I think we're too early to say. I think we are still going to rely on image libraries early on because we just know there's some level of quality bar on those, and AI images are still uneven. They still feel like a toy much of the time, and our aspiration is not to just be a toy, but to be something much more than that.

But at the same time, there's a level of magic to them and also a level of control. Particularly thinking about a presentation, we want to be tackling problems like, I want to have 12 images throughout my presentation, and they should all be stylistically and color coordinated. That's a problem that AI images will go from being bad at to good at in the space of months. It's just a matter of us trying to stay on board with what they do.
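To illustrate the coordination problem Jon mentions, here's a minimal sketch using the open-source SDXL model via the diffusers library: a shared style suffix plus a fixed per-deck seed keeps a set of generated images visually related. The style string and seed policy are assumptions for illustration, not Gamma's implementation:

```python
import torch
from diffusers import StableDiffusionXLPipeline

pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
).to("cuda")

# Hypothetical deck-wide style: every image shares the same suffix and seed.
DECK_STYLE = "flat vector illustration, muted teal and coral palette, clean lines"
DECK_SEED = 42

slide_subjects = [
    "a team brainstorming around a whiteboard",
    "a rocket launching from a laptop screen",
    "a bar chart growing upward",
]

for i, subject in enumerate(slide_subjects):
    # Reuse the same seed per deck so renders stay visually related.
    generator = torch.Generator("cuda").manual_seed(DECK_SEED)
    image = pipe(
        prompt=f"{subject}, {DECK_STYLE}",
        generator=generator,
        num_inference_steps=30,
    ).images[0]
    image.save(f"slide_{i}.png")
```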

Nathan Labenz: 47:00 We take a somewhat different approach on this as well. Most of the products out there have just embraced the AI image generation come what may, right? If it doesn't look good, whatever, hey, we're in this moment, it's cool, whatever, we'll just do it. That seems to be the prevailing approach. I think for pretty similar reasons, our companies have not rushed in on that as much and have said, if the quality is not there, in our case, a lot of the videos get put on TV as TV commercials, which is, people are not going to want to put this on TV. And for you, people are not going to want to stand up in front of an audience that matters to them and present this, right? So if it doesn't hit that bar, then it maybe doesn't work.

The best approach that we've taken so far has been, well, I guess it's complicated. There's a few. But we like to use our users' own images wherever possible. We've had a lot of success with the BLIP family of models, which is out of Salesforce, to try to figure out which of the images that they have would make the most sense here. And then we've also had a lot of luck with Shutterstock's computer vision API when we want to supplement their assets with other assets. I don't know if you've used that, but it's actually quite good at bringing you back visually similar results.

You'll get some nuances there because visually similar can return things that are not content similar. But we found a good way to deal with that just by also looking at the metadata of the image, which tells you what the content is. So once you apply the computer vision side to get things that look the part, and then filter for things that are actually relevant and appropriate content-wise, you end up in a pretty good spot. It helps that, in our case, we have user images that we can put into the computer vision search to expand what we call their universe of libraries: it starts with their own cluster, and then we try to build that out from there.
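Here's a minimal sketch of the BLIP relevance step Nathan describes, using the Hugging Face transformers image-text matching checkpoint from Salesforce to rank candidate user images against a scene description. The ranking loop and file names are hypothetical, not the actual production pipeline:

```python
import torch
from PIL import Image
from transformers import BlipProcessor, BlipForImageTextRetrieval

processor = BlipProcessor.from_pretrained("Salesforce/blip-itm-base-coco")
model = BlipForImageTextRetrieval.from_pretrained("Salesforce/blip-itm-base-coco")
model.eval()

def match_score(image_path: str, text: str) -> float:
    # Score how well this image matches the text using BLIP's ITM head.
    image = Image.open(image_path).convert("RGB")
    inputs = processor(images=image, text=text, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).itm_score  # shape (1, 2): [no-match, match]
    return torch.softmax(logits, dim=1)[0, 1].item()

candidates = ["user_photo_1.jpg", "user_photo_2.jpg", "user_photo_3.jpg"]
scene_text = "a storefront with customers walking in"

ranked = sorted(candidates, key=lambda p: match_score(p, scene_text), reverse=True)
print(ranked[0])  # best-matching user image for this scene
```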

You kind of said strategically, you're trying to put yourself in position to benefit from ongoing advancements. That's a mantra for me, too. The waterline is always rising. What are your main priorities right now that you're building toward?

Jon Noronha: 49:16 Yeah. So we talked a lot about the AI portions of our priorities. I think the other one really anchors back to our origin story: thinking about the medium and the format itself. So how do we make Gamma the best way to present your ideas across a range of different settings?

Part of that is that to really go up against these titans like PowerPoint, you just have to have a bunch of other features, everything like exporting as a PDF or adding speaker notes. But actually, the most surprising opportunity that we're pursuing, in the sense that I didn't really see it coming when we began, was the number of people that use our product to make a website, or a web page, or something that is their digital face on the internet. I think I mentioned that what we accidentally created was a website builder for slides, and it turns out it's also a website builder for websites, in part because it doesn't have all of the complexity of a big, complicated website builder. Write like a doc, then share like a website: that's kind of the promise.

And so we've actually been pulled by our users in the direction of doing that. We're in the process of letting you publish these Gammas to your own custom domain. And along with that comes, let's set up the rest of the parts of a website: I want this to have a header and a footer and forms on it and all those things. So we are rapidly adding those kinds of capabilities. And it's something that raises a lot of hard strategic questions for us. It's hard to be a great presentation tool and a great website builder and a great document editor all at the same time, although AI makes it easier than you might think. I think it really changes a lot of our assumptions about how a company competes in different spaces.

But for us, what that really means is we are trying to forge our own path between these different media. We want to make Gamma a very unique thing that is unlike what any other company creates, but hopefully well suited to a lot of people's tasks. And so a lot of what we're doing is trying to figure out all the different design details of how to do that.

Nathan Labenz: 51:07 Yeah, that's interesting. So I was going to ask also about how you think about competing against these giants, right? Because, I mean, the products that you're taking on in being the anti-PowerPoint or the anti-presentation, maybe the anti-Google Slides, whatever deficiencies they have in terms of user experience, they certainly have major advantages in terms of familiarity and just general distribution.

So how do you think about that? I mean, it seems like if you were going to go out and try to raise money, I'm sure the number one question would be, what's going to happen when Microsoft rolls out their AI slide maker? Do you think it's just going to continue to not be that awesome? Or you kind of hinted at maybe we want to make different stuff. Presumably they're going to continue to make slides in their way. So how do you think about kind of carving out, is it carving out a space? Is it just being better at the core thing? What's the strategy for taking on these giants?

Jon Noronha: 52:07 It's a good question because it's one that keeps me up at night thinking about where these giants will evolve. Our main strategy has always been, let's carve our own path, even before AI. We don't want to directly compete head-on with these tools, in part because they're very good at what they do. We really like to do something different and use AI to help us do something different.

That said, every month that goes by without Google Slides AI and PowerPoint AI actually being released is another month where I feel the temptation of, well, maybe we should compete head-on now. I don't know. I think the jury is still out on how good these tools will be and also how accessible they'll be to different audiences. I think everyone's going to focus on their core, and for Microsoft, that core is very much the big, enterprise-y work setting. And so I don't know how well, for example, the small business will be served by what they're building, or how well education will be served. I truly don't know, and so I'm eagerly awaiting where they take it.

I think that Microsoft, in particular, is a pretty fierce competitor. They are the big oil tanker that doesn't turn quickly, but when they do, they just relentlessly move. And so our best bet as a startup is just to be nimble and try to carve our own path, and also always stay ahead doing something a little bit different than what those giants are doing.

The website market is a bit different, though, where it certainly has incumbents, but it doesn't have any incumbents on quite that scale. And so far, none of the incumbents in this space have really shown that much interest or savvy in deploying AI. And so it's also a place we're looking at, maybe this is even the bigger opportunity to go after. And I think we'll just have to see.

Nathan Labenz: 53:40 This has been obviously a hot topic, right, of has Google lost its ability to ship or you can never change Gmail? And certainly they're starting to ship stuff. But I would say the writing assistant experience that I have in Gmail is still not actually useful, in part because, mystifyingly to me, they don't give you the freedom to actually direct the AI. I don't know if you've used this particular interface, but you have a few modes of change they allow, like shorten it, elaborate it, whatever. But it never allows you to say, do this to it, which is just a kind of strange gap to me.

You know, enterprises are definitely not ignoring this, and they are shipping stuff. But then other times you see things like that from Gmail, and you're like, that seems pretty half-baked. You know? Do you feel like this is just such a different discipline that it's actually going to be hard for these organizations? Or do you think that they're on the verge of figuring it out? Or what do you make of what we've seen so far from big companies rushing to ship AI products?

Jon Noronha: 54:45 I used to work at Microsoft, and so I think I have some sense of how these companies operate internally. And I think when you're not inside them, it's easy to underestimate the sheer scale of the challenges that they face operating at the scale they do. Many of these products, Gmail and even Google Slides, have hundreds of millions of users in a wide variety of circumstances. All of them have their own different contracts with how they use your product. There are different languages that they speak. There are different data privacy and sensitivity requirements.

And so I don't envy the people who are trying to juggle all of those different requirements and stakeholders when making decisions. The sheer number of humans you have to convince internally to make even a relatively low-risk change is significant, and AI presents a huge number of risks that we're only beginning to understand, from hallucination and misleading people to misuse and abuse. And so I think these companies' hands are just incredibly tied, much more so than startups are.

And so for them, I think even though they feel a lot of urgency internally, in many cases, their best bet is to let startups establish patterns, let smaller companies establish patterns, and then they can follow and incorporate those over time. And they have the distribution advantage, so they won't be doomed by doing that. They will take their time and get it right, and they will do it.

And so I think often to those of us who are in these fast-paced environments, their behavior looks odd, but it's actually perfectly rational and understandable given where they're coming from. And I think what it means is if you want to be on the cutting edge of where AI is, you almost have to be outside those ecosystems, and that will give you a very good sense of what will be in tools like PowerPoint in one or two years.

Nathan Labenz: 56:22 One theory I've been kicking around concerns these super complicated, over-featured, overbuilt, I-can't-find-what-I-want-in-this-system type of systems. And as you mentioned too, with the language complexity, the complexity certainly is vast. You could tell two opposing stories. One would be that all that complexity makes it hard to apply AI, because it just compounds the complexity. The other story would be that maybe AI can smooth over all that complexity by reducing it all back to, hey, now you can just engage with natural language, and the AI will be the one that has to deal with all that complexity. Obviously it could be contextually dependent, but which of those stories do you ultimately see coming to predominate?

Jon Noronha: 57:16 So far, it feels like the tools that have been successful with AI are the ones that have been able to pick off narrow and specific workflows and carefully craft prompts and examples to be able to tackle those. But I would be a fool if I claimed that I could predict where these models are evolving, even over the course of a year. And if anything, the trend in AI has been towards surprising generalization of abilities. It used to be that you had one model for translation and one model for search and one model for summarization. And now we see these sort of god models that can do it all. And I think that may only accelerate. And so I wouldn't even dare predict how this story is going to evolve.

Nathan Labenz: 57:54 Well, your app is Gamma. It's at gamma.app, and it's, for my money, the market-leading AI card presentation maker today. So great work on it. Keep it up. Anything else you want to make sure we cover before we break?

Jon Noronha: 58:11 No, thanks so much. It was great chatting. Big fan of the podcast and appreciate the plug.

Nathan Labenz: 58:15 Thank you very much. Well, again, keep up the good work. Jon Noronha from Gamma, thank you for being part of the Cognitive Revolution.
