Bringing AI to Data: Agent Design, Text-2-SQL, RAG, & more, w/ Snowflake VP of AI Baris Gultekin

Snowflake VP of AI Baris Gultekin discusses how bringing models to data is reshaping enterprise AI, covering text-to-SQL, RAG, unstructured data, security, governance, and the future of autonomous agents at work.




Show Notes

Baris Gultekin, VP of AI at Snowflake, explains how “bringing AI to the data” is reshaping enterprise AI deployment under strict security and governance requirements. He shares the importance of bringing AI directly to governed enterprise data, advances in text-to-SQL and semantic modeling, and why high-quality retrieval is foundational for trustworthy AI agents. Baris also dives into Snowflake’s approach to agentic AI, including Snowflake Intelligence, model choice and cost tradeoffs, and why governance, security, and open standards are essential as AI becomes accessible to every business user.

PSA for AI builders: Interested in alignment, governance, or AI safety? Learn more about the MATS Summer 2026 Fellowship and submit your name to be notified when applications open: https://matsprogram.org/s26-tcr

LINKS:

Sponsors:

MongoDB:

Tired of database limitations and architectures that break when you scale? MongoDB is the database built for developers, by developers—ACID compliant, enterprise-ready, and fluent in AI—so you can start building faster at https://mongodb.com/build

Serval:

Serval uses AI-powered automations to cut IT help desk tickets by more than 50%, freeing your team from repetitive tasks like password resets and onboarding. Book your free pilot and guarantee 50% help desk automation by week four at https://serval.com/cognitive

MATS:

MATS is a fully funded 12-week research program pairing rising talent with top mentors in AI alignment, interpretability, security, and governance. Apply for the next cohort at https://matsprogram.org/s26-tcr

Tasklet:

Tasklet is an AI agent that automates your work 24/7; just describe what you want in plain English and it gets the job done. Try it for free and use code COGREV for 50% off your first month at https://tasklet.ai

CHAPTERS:

(00:00) About the Episode

(03:02) Snowflake 101 and AI

(09:25) Text-to-SQL and semantics

(19:10) RAG, embeddings and models (Part 1)

(19:17) Sponsors: MongoDB | Serval

(21:02) RAG, embeddings and models (Part 2)

(32:23) Bringing models to data (Part 1)

(32:29) Sponsors: MATS | Tasklet

(35:29) Bringing models to data (Part 2)

(51:14) Designing enterprise AI agents

(58:35) Trust, governance and guardrails

(01:07:14) Agents and future work

(01:15:33) Platforms, competition and value

(01:26:04) Enterprise models and outlook

(01:40:00) Outro

PRODUCED BY:

https://aipodcast.ing

SOCIAL LINKS:

Website: https://www.cognitiverevolution.ai

Twitter (Podcast): https://x.com/cogrev_podcast

Twitter (Nathan): https://x.com/labenz

LinkedIn: https://linkedin.com/in/nathanlabenz/

Youtube: https://youtube.com/@CognitiveRevolutionPodcast

Apple: https://podcasts.apple.com/de/podcast/the-cognitive-revolution-ai-builders-researchers-and/id1669813431

Spotify: https://open.spotify.com/show/6yHyok3M3BjqzR0VB5MSyk


Transcript

This transcript is automatically generated; we strive for accuracy, but errors in wording or speaker identification may occur. Please verify key details when needed.


Introduction

Hello, and welcome back to the Cognitive Revolution!

Before we get started today, a quick final reminder: if you dream of a career in AI safety research, the deadline to apply to the MATS Summer 2026 program is January 18th. Listen to my recent episode with MATS executive director Ryan Kidd for all the reasons you should apply, and get started at MATSprogram.org/TCR

Today my guest is Baris Gultekin, Vice President of AI at Snowflake, the cloud-based data platform that now describes itself as the AI Data Cloud.  

Baris came to Snowflake, along with Snowflake's current CEO Sridhar Ramaswamy, as part of Neeva, an AI-powered web & personal knowledge base search engine that Snowflake acquired in May 2023.

Since then, he's been working at the intersection of frontier AI capabilities and hard enterprise realities, deploying these systems in environments where security, governance, and reliability are strict requirements.

As you'll hear, Snowflake's core philosophy is to "bring AI to the data" rather than sending sensitive data out to model providers, and in this episode, we unpack exactly what that looks like in practice.

We cover a ton of ground, including:

  • The massive ongoing unlock of unstructured data, which is making the 80-90% of enterprise information previously trapped in PDFs and other documents queryable for the first time
  • The current state of both text-to-SQL and RAG systems, and why reasoning models have finally made natural language data analysis reliable enough for business users.
  • The trade-offs between using frontier models versus smaller, specialized models, and when to use structured workflows versus letting models choose their own adventure
  • How "data residency" requirements are shaping partnerships between cloud providers, model labs, and platforms like Snowflake
  • How AI coding assistants are changing the discipline of product management by enabling rapid prototyping of working features
  • Where Baris sees value accruing in the AI stack, and his prediction that horizontal application layers will win out over narrower vertical solutions
  • And finally why Baris takes the over on my timeline for autonomous "drop-in knowledge workers" and what he believes will have to happen first.  

If you want to understand how large enterprise companies are deploying AI today, and what's really working as they mature from the early experimentation phase to the ROI at scale phase, taking all the operational complexities and security and governance concerns into account, I think this conversation will be perfect for you. 

And with that, I hope you enjoy this deep dive into enterprise AI adoption and the future of data intelligence, with Baris Gultekin, Vice President of AI @ Snowflake.


Main Episode

[00:01] Nathan Labenz: Baris Gultekin, Vice President of AI at Snowflake. Welcome to the Cognitive Revolution.

[00:06] Baris Gultekin: Thank you, Nathan. Thanks for having me.

[00:09] Nathan Labenz: I'm excited for this conversation. There's going to be a lot to learn. I think people, we have a very diverse audience. The number one profile is AI engineer. And within that profile, people work at a lot of different kinds of organizations, from solo entrepreneurs and consultants to startups to enterprises. So some people will certainly know Snowflake and will work at organizations that are customers of Snowflake. Others probably have heard of it and don't really know too much of the backstory. So maybe for starters, just kind of give us the real quick Snowflake 101, and then I'd love to go into how AGI pilled is Snowflake today.

[00:45] Baris Gultekin: Sure. So Snowflake is a data platform. We call ourselves an AI data cloud. So what that means is our customers bring a lot of their data onto Snowflake so that they can secure it, govern it, and analyze large amounts of data for various insights, dashboards, and the like. And from an AI perspective, because there's a lot of gravity to data, our customers do not want to replicate data in multiple places. Instead, they want to bring AI to run next to data. So that's a very high-level overview. And the AGI pilled is an interesting phrase. For us, we're quite practical. We serve large enterprises, and the goal is to get to high-quality AI agents that create positive ROI for customers quickly. And that can happen today, and it is happening today. Super excited about where things are.

[01:35] Nathan Labenz: I definitely want to come back to the bring-AI-to-data strategy that you guys have in a few minutes, but to just double click a little bit on the before and after, because obviously Snowflake has been around for a while before certainly anything like the AIs that we have now were available. So what were people doing before with Snowflake, and what are the new AI use cases that have been unlocked over, say, the last, I don't know when you would start the clock, right? Do you start the clock at ChatGPT, or was that like not quite strong enough to actually make things work? But yeah, there's a lot of different dimensions. Maybe let's start with before and after.

[02:12] Baris Gultekin: Sure. So I'll start with the before. So a lot of our customers have been using Snowflake mostly for structured data initially. And this is where they'll bring the data in and then they'll run large scale analysis, either to have insights to power BI dashboards, for instance, or to do various analytics to understand their business. The value of the platform is to be able to bring data from all the different places to break the data silos so that you could run analysis across large amounts of data. So what is happening with AI is there is a big unlock, of course, of unstructured data. And then this plays out in two ways. One is if you have, you know, thousands, hundreds of thousands of documents, for instance, you can now extract structure from these documents and then you can analyze them. If you have contracts, for instance, and if you want to say, hey, what are the contracts that talk about this specific thing? How many of them do I have? Or, you know, what are the contracts that are expiring soon in this category? So being able to run these analyses super easily is now very feasible. Then there's, of course, being able to do a lot of, again, kind of large-scale analytics work, but with a lot of ease. So a lot of the pipelines that used to be built around, I'm going to go classify my data, I'll extract information from it, is now very, very simple to do. The way we've done it is we've, again, brought AI to directly work with the engine for analytics. So you can do things like classification, extraction of that data very, very easily. And increasingly, of course, there is a lot of interest in bringing a natural language interface to all of a company's data so that you can just talk to your data, you can democratize access to all that data for the organization.
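
To make the extract-then-analyze pattern above concrete, here is a minimal Python sketch. The llm_complete helper, the prompt, and the contract fields are illustrative assumptions rather than any actual Snowflake API; the point is only the shape: one extraction call per document, then ordinary filtering or SQL over the structured rows.

```python
# Minimal sketch of "extract structure from documents, then analyze it".
# llm_complete() is a placeholder for whatever inference API you use; the
# field names are illustrative, not a real schema.
import json
from dataclasses import dataclass

@dataclass
class ContractRecord:
    doc_id: str
    category: str         # e.g. "vendor", "employment", "NDA"
    counterparty: str
    expiration_date: str  # ISO date string, e.g. "2026-03-31"

EXTRACTION_PROMPT = (
    "Return a JSON object with keys category, counterparty, expiration_date.\n"
    "Contract text:\n{text}"
)

def llm_complete(prompt: str) -> str:
    # Placeholder: swap in your model provider's client call here.
    return '{"category": "vendor", "counterparty": "Acme Corp", "expiration_date": "2026-03-31"}'

def extract(doc_id: str, text: str) -> ContractRecord:
    raw = llm_complete(EXTRACTION_PROMPT.format(text=text[:8000]))  # truncate very long docs
    return ContractRecord(doc_id=doc_id, **json.loads(raw))

def expiring_before(records: list[ContractRecord], cutoff: str, category: str) -> list[ContractRecord]:
    # Once the documents are structured rows, this is just ordinary analytics.
    return [r for r in records if r.category == category and r.expiration_date <= cutoff]

if __name__ == "__main__":
    records = [extract("doc-1", "sample contract text")]
    print(expiring_before(records, cutoff="2026-12-31", category="vendor"))
```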

[04:08] Nathan Labenz: Could you give a sense for sort of the balance of structured, unstructured, or then I guess you're also saying that like unstructured data is getting structured through the process of basically AI retro-annotation. My sense is that I think of, oh gosh, there's so many vector databases. The exact name of the one where the founder told me, this is slipping my mind. But one of the interesting things that I've understood to be happening in general with business data is that structured data was kind of like the tip of the iceberg in many organizations where it was the most usable kind, but it was actually like a relatively small amount of the data. It was Anton from ChromaDB who said that most of the data that was going into ChromaDB had never been in a database before at all. It was just lying around in various places. So have you seen a sort of great unlocking of people dumping more and more data into Snowflake because now they have ways to make it useful where it just previously wasn't even worth it?

[05:09] Baris Gultekin: Yeah, we're absolutely seeing this. 80% to 90% of all data is unstructured data. And because there weren't a lot of easy ways to process this, it was not necessarily seen as the most usable data, and it is now very usable, both from a kind of extract and then bring structure to it perspective, as well as just talk to all of that data, find the right information using vector DBs, for instance, and then build agents and chat experiences off of it. So we're seeing this, and it is, again, playing out in both ways. More and more data is getting structured so that you could run analytics on it down the road, as well as you can just use all of this data in conjunction with the structured data that you have. So for instance, if you want to build anything, let's say a wealth management agent, so you still need to be able to look up what the stocks are doing in a structured way, but you also have all of the equities research that's in PDFs that you'd like to be able to use. So being able to combine both structured and unstructured is incredibly important for real-world use cases.

[06:24] Nathan Labenz: Where would you say we are on text-to-SQL today? It's been a while since I've done a show on text-to-SQL. There have been probably two episodes on this theme, historically. And... I guess last I checked, there was a range of opinions where some people were like, yeah, it's just not really there, and other people said it's there, but you have to do a lot of work to make sure that you have a good semantic understanding, because a lot of the SQL databases that come in are like, there's multiple columns and there's tribal knowledge on teams that we don't use that column anymore. We haven't deleted it, but we don't use it anymore and it's superseded by this. There's all these little nuances that live in people's heads, if they have enough of that kind of context. What does the process look like today, and how good does it get if new customers want to start to enable this talk-to-my-data experience with a sort of text-to-SQL kind of strategy? What does that look like now?

[07:20] Baris Gultekin: Yeah, I mean, you called that out right. It's been traditionally very difficult for models to get text-to-SQL right. And there are various reasons for it. First of all, if you ask, what's my revenue, there's only one answer. So the margin of error is very low, and the expectations of quality are incredibly high. And the reason it's been really difficult for these models is because you need a lot of semantics to be able to figure out where to get the data from. So first of all, what is the definition of revenue? What is the definition of profit? It can change. And then how it's modeled on the data side can be tricky; it can change. When we're talking about real-world scenarios, we're talking about thousands and thousands of tables that have hundreds of thousands of columns in them to be able to go and reason about. So it's been traditionally very difficult. What I'll say has happened in the last six months to a year is, with the reasoning models getting substantially better, and with it increasingly being possible to bring the semantics relatively easily onto the platform, we've had pretty substantial gains in quality. Our product to do this, for instance, is Snowflake Intelligence, and we're seeing tremendous demand for doing text-to-SQL. And the quality is at a place where you can now deploy them very broadly. So very high quality and very useful, because the structured data is quite useful.
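
As a rough illustration of why the semantic layer matters for a question like "what's my revenue," here is a small sketch. The semantic model contents, table names, and the generate_sql stub are hypothetical, not Snowflake's actual format; the idea is that the model is handed a pinned-down metric definition and told which columns are deprecated, instead of guessing among thousands of columns.

```python
# Hypothetical semantic model: business terms pinned to exact SQL definitions,
# plus the "tribal knowledge" (deprecated columns) made explicit. Names are made up.
SEMANTIC_MODEL = {
    "metrics": {
        "revenue": {
            "description": "Recognized revenue, net of refunds, in USD",
            "expression": "SUM(fct_orders.amount_usd) - SUM(fct_refunds.amount_usd)",
            "tables": ["fct_orders", "fct_refunds"],
        }
    },
    "deprecated_columns": ["fct_orders.legacy_amount"],
}

def build_prompt(question: str) -> str:
    # The generator never has to guess which revenue-like column is canonical;
    # the semantic model supplies the definition and rules out stale columns.
    return (
        f"Semantic model:\n{SEMANTIC_MODEL}\n\n"
        f"Question: {question}\n"
        "Write one SQL query. Use only tables and expressions from the semantic model, "
        "and never reference deprecated columns."
    )

def generate_sql(question: str) -> str:
    prompt = build_prompt(question)
    # Placeholder: send `prompt` to a reasoning model and return its SQL.
    raise NotImplementedError

# Example: print(build_prompt("What was revenue last quarter?"))
```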

[08:42] Nathan Labenz: And when you say deploy broadly, you mean to users who are not data analysts?

[08:48] Baris Gultekin: That's right. That's right. So for us, for instance, this product, I mentioned Snowflake Intelligence, is our agent platform. And it's being used by business users. And it is the fastest growing product that we have on Snowflake because we're now making large amounts of data easily accessible to business users to ask questions and get insight very quickly. In the past, they would have to go to an analyst who's familiar with the data for them to kind of build some analysis and then get back to them a week later. Now they can just directly ask questions.

[09:19] Nathan Labenz: What does that process of bringing semantics onto the platform look like? I can imagine a sort of big setup one time where you go out and interview the people that... have set these things up and should know. I can imagine at runtime, you might have to come back with questions and say, hey, I've got multiple columns that are ambiguous here. I did one episode also with a company you may know called Illumex, where they had a really interesting strategy that was around basically building what they considered to be the sort of canonical, like abstract, ideal form of an enterprise in each major vertical that they served. And then they built their query engine off of that ideal. And then the mapping process was like, okay, now how does your actual real world enterprise deviate from this ideal? Let's map all that out. But then we know that once we've done that mapping, the logic on top of it will be trusted. What mix of strategies are you using? What would you find to be effective?

[10:19] Baris Gultekin: So first of all, again, reasoning models are now at a point where AI can help substantially in building out a semantic model. And the inputs to that are both, of course, the data that's in the system, all of the metadata that is in the system, the names of the tables and columns, as well as the data underneath them. But we have built a series of connectors to things like BI dashboards that also have a lot of semantics in them that's super useful in building out semantic understanding for the organization. Also, things like queries people have been running in the past, these are all kind of hints for these agents to go help our customers build semantic models. Snowflake has recently also announced what we're calling Open Semantic Interchange, which is an effort to create an open standard for sharing that semantic model across the different platforms so that we can more easily create these common semantic understandings for AI to act on.
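
One way to picture the "hints" mentioned here: past queries already encode which tables and columns analysts actually rely on. A toy sketch follows, using a naive regex instead of a real SQL parser and made-up query logs, purely to show the kind of signal a semantic-model builder could mine.

```python
# Toy sketch: mine the query history for which tables and columns people actually
# use, as raw material for a semantic model. A real system would use a proper SQL
# parser plus BI-dashboard metadata; the regexes and logs here are illustrative.
import re
from collections import Counter

QUERY_HISTORY = [
    "SELECT SUM(amount_usd) FROM fct_orders WHERE order_date >= '2025-01-01'",
    "SELECT region, SUM(amount_usd) FROM fct_orders GROUP BY region",
    "SELECT COUNT(*) FROM fct_refunds",
]

def table_usage(queries: list[str]) -> Counter:
    tables = Counter()
    for q in queries:
        tables.update(re.findall(r"\bFROM\s+(\w+)", q, flags=re.IGNORECASE))
    return tables

def column_usage(queries: list[str]) -> Counter:
    cols = Counter()
    for q in queries:
        cols.update(re.findall(r"\b(?:SUM|AVG|COUNT)\(([\w*]+)\)", q, flags=re.IGNORECASE))
    return cols

if __name__ == "__main__":
    print(table_usage(QUERY_HISTORY))   # Counter({'fct_orders': 2, 'fct_refunds': 1})
    print(column_usage(QUERY_HISTORY))  # Counter({'amount_usd': 2, '*': 1})
```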

[11:29] Nathan Labenz: Okay, that's interesting. How does that, can you unpack that for me a little bit? Like how does that work? What do I do as an enterprise if I want to adopt that standard?

[11:39] Baris Gultekin: Yeah, so this is, it's still early. So we're working with companies like Tableau, Omni, other BI platforms, as well as other technology providers to create an exchange format so that if you create a semantic model in one platform, you could just use it in a different platform. So there's active development now with all the parties in the Open Semantic Interchange to define what that interface is so that we could support an open interchange of the semantic model, essentially. So what that would look like is a customer can go to Snowflake, for instance, they can go build out their semantic model, and then they could reuse that semantic model in another place that supports the open interchange.

[12:32] Nathan Labenz: For one thing, I wish that would come to electronic medical records sooner rather than later. I feel like, yikes, it's been my recent experience there. Going back to the first question about how AGI pilled various organizations are, I wonder how you see that from a competitive dynamics standpoint. Because one of the places where, you know, the dice are in the air, so to speak, it feels to me right now, is the software market. It sure seems like the pace of software development is increasing dramatically. I think mostly it's outliers or just plain bluster at this point to say that people are deleting systems of record and rolling their own in house, but you've at least got that talk out there. And then presumably everybody is like, geez, if I can maybe already see a major acceleration in my software development, or if I can project a year or two into the future and see a clear path to one. And I vibe coded three AI apps for family members for Christmas presents this year, so the acceleration at that level is certainly very real. Then it seems like everybody is going to be incentivized to try to go take some conceptual territory from companies that maybe used to be partners, used to be complements. It seems like it's headed more toward competition. So all these companies that you named, right? Historically, you specialize in one thing. They did a little bit different thing. They work nicely together. You've got a lot of customers in common. Great. But if I'm them, or maybe if I'm you, I might start to worry at this point: geez, if I'm Tableau, should I be afraid of Snowflake? Are they going to come after me with something that sort of replaces what we do? And do I want to be partnering with them on these standards? Or do I have to fear that whoever has their hooks deepest into the customer can box out and colonize these additional niches? Now, that's a pretty AGI pilled point of view. Maybe you think I'm getting ahead of myself there, but what do you think?

[14:24] Baris Gultekin: Not at all, not at all. I actually really love what's happening. And what's happening is the silos are coming down. This is all great for customers, for consumers, right? With all these open standards, essentially, the beneficiaries are our customers. There is no lock-in anymore. And I think that's great. That's great for competition. That's great for innovation. That's great for customers. And we're seeing this play out across the board. Anywhere in AI, the differentiation is coming down. That means everyone is doing more and more things to create more and more value, which ultimately is great for the industry and great for consumers and customers. So I'm loving what's happening. As you called out, the walls are coming down. The lock-in is no longer there. And that makes product development really important. That makes the speed of execution really important. And ultimately, it's all about creating more and more value.

[15:20] Nathan Labenz: But does that take us to a place where companies that used to be friends are trending toward frenemies? Because it seems like there's only so many ideas. It just seems so obvious to me in so many places that a big platform like Snowflake would be like, Sure, we can do what Tableau does, especially now that we can get so much more stuff shipped on a quarterly basis.

[15:41] Baris Gultekin: I would say the pie is growing. I don't think it's a fixed pie that people are trying to protect. The types of things you can do are growing, and that is super exciting. I also don't think that everyone can do everything. Ultimately, where each company focuses is closer to their area of expertise, to their differentiation. I don't necessarily see that everyone's going to do everything, but I also do believe there is a lot of competition, but there is also a growing pie, which is exciting.

[16:09] Nathan Labenz: Yeah. Okay. Let's go back more toward the technical side for a minute. We kind of went deep on text-to-SQL. Let's do the same thing for RAG. So we've got all these vast amounts of unstructured data out there. They're getting loaded into platforms. They're getting metadata synthetically created by AIs coming through and just processing them suite by suite. How well is that working and what is actually key to making it work? We've been through eras of chunking strategy is really important, or I've done episodes on graph databases and entity recognition and figuring out various ways to traverse the entity graph, and obviously it's going to be quite distinct for each enterprise with all the different entities that they're going to have that nobody else has. What is really driving results in that RAG paradigm today?

[17:02] Baris Gultekin: Yeah, at Snowflake, we've been actually very fortunate. We acquired a company that I came with called Neeva, which was a web-scale search engine. I was a user. All right, awesome. So we brought that technology into Snowflake to build out our search and RAG solutions. And there, basically what determines quality is the quality of the embedding model that you're using. Of course, there are more and more sophisticated chunking strategies of what you're indexing. And then there are other layers like the hybrid search and the re-ranker that you build on top of it and so forth. Increasingly, a core part of it is also to be able to understand complex documents. PDFs are messy, you have images, you have tables, you have multiple columns in a page and so forth. So being able to handle all of this and extract information really accurately, figuring out which embedding model to use, whether you should use a multimodal embedding model or a text embedding model and so forth, all of those are incredibly important. But increasingly, we're getting to a point where you can automate many of these things and then reduce the complexity so that a lot of what we used to require practitioners to do can be relatively automated at this point. And now you're getting to a point where more interesting opportunities get unlocked. So for a company like Snowflake, for instance, being able to do what we're calling agentic document analytics is something that is possible to do. So what I mean by that is, let's say that you have thousands of PDFs and there is information in them. Let's say you have quarterly results over the last 10 years. So being able to say, what's the average revenue over the last 10 years? And if that is in multiple different documents, being able to extract all of that and then do analytics on it is now possible. So overall, I think RAG is both getting increasingly higher in quality and also simpler to build and increasingly more and more powerful to handle some of the new agentic use cases.
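
For readers who want the layers named here laid out end to end, the sketch below shows the shape of that stack: chunking, an embedding model, hybrid (keyword plus vector) retrieval, and a re-ranker on top. The hashing "embedding" and the naive scoring are toy stand-ins for real models, kept only so the example runs; none of this is Snowflake's implementation.

```python
# Toy end-to-end retrieval stack: chunk -> embed -> hybrid search -> re-rank.
import math
from collections import Counter

def chunk(text: str, size: int = 800, overlap: int = 100) -> list[str]:
    step = size - overlap
    return [text[i:i + size] for i in range(0, max(len(text), 1), step)]

def embed(text: str, dim: int = 64) -> list[float]:
    vec = [0.0] * dim
    for word in text.lower().split():
        vec[hash(word) % dim] += 1.0  # toy stand-in for a real embedding model
    return vec

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def keyword_score(query: str, doc: str) -> float:
    q, d = Counter(query.lower().split()), Counter(doc.lower().split())
    return float(sum(min(q[w], d[w]) for w in q))

def hybrid_search(query: str, chunks: list[str], k: int = 20) -> list[str]:
    qv = embed(query)
    scored = sorted(
        ((0.5 * cosine(qv, embed(c)) + 0.5 * keyword_score(query, c), i) for i, c in enumerate(chunks)),
        reverse=True,
    )
    return [chunks[i] for _, i in scored[:k]]

def rerank(query: str, candidates: list[str], top_n: int = 5) -> list[str]:
    # Placeholder for a cross-encoder or LLM re-ranker over the candidate set.
    return candidates[:top_n]

if __name__ == "__main__":
    docs = chunk("Quarterly revenue grew 12% year over year. " * 50)
    print(rerank("revenue growth", hybrid_search("revenue growth", docs))[:1])
```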

[19:21] Nathan Labenz: Would it be a fair distillation of what you've said there that you're trending more toward more powerful models? Like a project that I've been involved with recently is built around understanding forms, often scanned on like a physical scanner, that are associated with the sale of a car from either a dealer to a person or person to person. These things, of course, have to get filed with the state and reviewed, and they're super messy and whatever. So I'm working a little bit with a company that's using AI to automate that. In that context, I've really seen a pretty substantial simplification, where 18 months ago it was like, you might need your specialist embedding model here and your kind of table extractor model there, and all this kind of deep specialization, often not super large models, but really dialed in on these use cases. And now I would say today, Claude 4.5 Opus or Gemini 3 mostly just solve the problem off the shelf in terms of understanding those documents, at a higher cost, certainly inference-wise, but definitely a lot lower cost in terms of AI engineering time. Am I right to say you're seeing the same trend toward less specialized models?

[20:38] Baris Gultekin: There are different use cases. If you are going to process, in some cases, hundreds of millions of documents, you're not going to use Claude to do that. Instead, you want to use a specific embedding model to embed certain aspects. You want to extract the information so that you could reuse it later and so forth. But if you're talking about one or two documents, of course, these large language models can handle them really well right now. So I still do believe there are different use cases, and then those use cases call for different tactics, different models.

[21:06] Nathan Labenz: In terms of why you wouldn't send millions of documents through Claude, is it just about inference cost, or is there some other--?

[21:13] Baris Gultekin: It's cost and throughput. Oh, so like-- Yeah, exactly. So how long would it take for you to process that many documents is a challenge. So at Snowflake, we have a document extraction model that we've built and fine-tuned. It is multiple orders of magnitude smaller than these large language models. That means it's substantially cheaper and much faster to go and process. If the task is specific, I'm going to go extract information and extract these specific fields, it is faster and cheaper to do versus kind of using these very large models, which are super capable, but again, will be limited in terms of how fast they can do this. And of course, the cost is another issue.
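
To make "cost and throughput" concrete, here is back-of-envelope arithmetic for processing a very large corpus with a frontier model versus a small fine-tuned extractor. Every number is an illustrative placeholder, not a quoted price or an actual rate limit.

```python
# Back-of-envelope math; substitute your own per-token prices and sustained throughput.
DOCS = 100_000_000                      # documents to process
TOKENS_PER_DOC = 2_000                  # average input tokens per document
TOTAL_TOKENS = DOCS * TOKENS_PER_DOC    # 2e11 tokens

def cost_usd(price_per_million_tokens: float) -> float:
    return TOTAL_TOKENS / 1e6 * price_per_million_tokens

def days_to_finish(tokens_per_minute: float) -> float:
    return TOTAL_TOKENS / tokens_per_minute / (60 * 24)

# Frontier model (illustrative: $3 per 1M input tokens, 2M tokens/min sustained)
print(f"frontier:    ${cost_usd(3.0):,.0f}, {days_to_finish(2_000_000):.0f} days")
# Small specialized extractor (illustrative: $0.05 per 1M tokens, 20M tokens/min)
print(f"specialized: ${cost_usd(0.05):,.0f}, {days_to_finish(20_000_000):.0f} days")
# -> roughly $600,000 and ~69 days vs. $10,000 and ~7 days under these assumptions
```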

[21:59] Nathan Labenz: So that's interesting. We're sketching out a little bit of a Pareto frontier, so to speak, here, where on the simplest but most expensive at inference time end, with potentially also rate limit issues, we have our Claudes and other kind of frontier models. You're in the middle with a Snowflake specialist model that is much smaller, does just what it does, but it's still something that is amortized over a whole bunch of enterprise customers that you have. What are you seeing in terms of the other end of that spectrum? Is there still value in an individual enterprise trying to create its own super specialized model for some of these tasks? Or does that curve stop at the Snowflake-scale model?

[22:52] Baris Gultekin: Yeah, it's a good question. So first of all, we partner very closely with all the large language model labs out there. And they have incredibly capable models that we use every day. There are some cases where our customers would want something very specific. And this is the case when a customer has large amounts of data, and the use case is something that the model has not seen before, and then they have strict either throughput requirements or cost requirements. Those are the cases where a custom model, usually based on some of the other large language models out there, makes sense. So we work with these customers to build custom models for them. But in most cases, a well-tuned RAG solution or text-to-SQL solution, with the data that they already have, with a large language model that's a frontier model, is usually the go-to scenario.

[23:53] Nathan Labenz: I'm halfway through doing an AMA episode. One of the questions I got was, is fine-tuning really dead? What do you think? So it sounds like you're saying it's not quite dead, but it seems like it's specialized. It's on the decline in your analysis.

[24:07] Baris Gultekin: I wouldn't say it's on the decline. I think it is really well-suited for certain types of things. And maybe the best example is actually what Cursor recently did. At their scale, it does make a lot of sense for them to have a custom model that is doing their autocomplete for instance. So being able to figure out in which situations you need a custom model versus not is something that is evolving. Starting with the large language models makes a lot of sense. And then over time, as you have more and more data and if you have specific needs, if you have either specific needs because of data or because of cost or throughput, that's when specialized models come into picture.

[24:54] Nathan Labenz: You mentioned these partnerships that you have with the frontier companies. Before getting to that, would you like to shout out or highlight any particular open source models that are your go-tos? We hear a lot about how, obviously, the Chinese ecosystem is continuing to open source a lot more than the American ecosystem at this point. I don't know if you guys feel like, comfortable using Chinese models in your stack? I get very different answers on that when I ask that question. But what are the models that you guys go to today when you're like, okay, we're going to explore some new custom direction, either for all of our customers or even just for one customer. What are the handful of models that you go to as starting points to begin that journey?

[25:38] Baris Gultekin: Yes, so for Snowflake, we have a platform where we offer a series of models and our customers choose which model they'd like to use. And then there are certain products where the model is just part of the product and not necessarily a specific choice. For the models that we offer, there's, of course, all the Frontier models, OpenAI models, Anthropic, Gemini, as well as models from Meta, Mistral, and others. Some of these models are open source, others are proprietary. We also have DeepSeek as a model that we provide for customers. In certain arrangements where our customers are looking to build custom models, some of them are open to using model weights from these models from China, others aren't. But it really depends on the customer.

[26:21] Nathan Labenz: Does that break down along industry lines, or is it more of just an idiosyncratic gut feel on the part of the customer as to what they're comfortable with?

[26:30] Baris Gultekin: I think it's the latter, actually. It's not necessarily an industry-specific thing. We have customers who are in technology, for instance, who say yes sometimes, or who will absolutely not touch some of these models for other customers.

[26:47] Nathan Labenz: Do you have a sense of how much they are leaving on the table? Are they leaving much on the table by cutting off the Chinese model option?

[26:58] Baris Gultekin: Models like Qwen are incredibly powerful, and if they'd like to start with models like that and then fine-tune it, you can get very capable models. But you also have other alternatives. So it really depends on the internal policies of these customers to decide which route to go. I'd say it's such a competitive space that I don't think there is one model that dominates it all, whether that is in the proprietary world or the open source world. So there are a lot of choices out there.

[27:32] Nathan Labenz: Gotcha. Okay.

[27:33] Nathan Labenz: So on these partnerships, we've got announcements recently of partnerships with Anthropic and also with Google for the Gemini models. I believe there's also one, although I think it was not so recently announced, with OpenAI. I didn't catch anything with respect to xAI and Grok. Providing all the latest and greatest stuff to customers is at the heart of that strategy. But tell me more about kind of some of the nuances of the partnerships. Is there an xAI relationship? If not, why not? And does it have anything to do with them putting women in bikinis all over the place? And then I definitely want to get into how you are bringing these models to data, because that is a bit of a narrative violation relative to what you typically hear, which is, we can't use that because we'd have to send the data to them and we're not comfortable with that. So I'm very interested in unpacking how you are reversing that and bringing the models to the data on the Snowflake platform.

[28:24] Baris Gultekin: Yeah, absolutely. Actually, let me start there because that's incredibly important for us. When we started the journey two and a half years ago or so, we heard loud and clear that our customers do not want to move their data out of the Snowflake security boundary. Instead, AI needs to come next to data, and that gives them a lot of advantages. You can just respect all of the security that you've established. You respect a lot of the governance on the data so that you're not replicating this data. The attack vectors shrink in terms of securing all of this information. So what we have done, thanks to our relationships, is bring inference, essentially, to run inside the Snowflake security boundary. So that's accomplished through these partnerships, through the connections, as well as a lot of the legal guarantees around the data. So essentially these models become sub-processors. There is no state that's saved in any of these models. So that's super helpful for our customers who are very sensitive, many of them in regulated industries. So when they're using any of these models, they know that the data still stays inside the Snowflake security boundary.

[29:31] Nathan Labenz: So does that mean then that the model weights have to come inside that boundary? And how is that happening?

[29:38] Baris Gultekin: Because obviously the IP-- Yeah, the IP still belongs to-- yes, absolutely. And the IP is owned by the model providers. The inference is run by the cloud providers in their stack. The difference is we have a series of guarantees to ensure data residency, to ensure there is no state that's left. So all of those are through the relationships and the deals that we're doing with these model providers as well as the cloud providers.

[30:10] Nathan Labenz: So the cloud providers are key in this because they are-- Certainly. --able to provide the inference. And they're also providing the underlying physical infrastructure that Snowflake is built on top of. And so it's because both of those things are true that we can draw the right dotted line around both of these things. Is there more that I should understand about this? Because one thing that has, of course, I don't know what I don't know, but it seems like the more you move weights around to different clouds and stuff, the more risk you take on as a frontier model developer. Here I'm thinking OpenAI, Anthropic. Google obviously runs their own cloud to a very large extent, although everything's showing up everywhere. One of the fascinating things about this whole moment has been how many alliances, or at least partnerships, we've seen between big tech companies that previously were very much at odds with each other. So with all the models showing up everywhere, I'm like, how has it been that none of these have really leaked? It seems like there's so many people that work at these platform companies that if access isn't really well figured out, I don't know, it just seems like something would leak at some point. But we haven't really seen that. We haven't seen the weights of a frontier model leak at all, as far as I know. And then people will speculate: maybe some state actor might have stolen them and not told anybody about it. But we haven't seen fundamental breakdowns. So how should we understand how that is happening to seemingly such a high degree? What role do trusted execution environments play? What role do other kinds of measures play? My general working heuristic is like everybody's hacked, everybody's pwned, like nothing is secure. And yet at the same time, we seem to not be having catastrophic leaks. So how can you help a simple person like me understand how we're achieving that?

[31:57] Baris Gultekin: Yeah, I mean, as you called out, these are very sensitive, important IPs that belong to the model providers and are then secured by the cloud providers. They have a very strict series of requirements and a setup to ensure that access is limited. Because they are the ones that are running the inference and setting up the environment, they've set it up in a way that is airtight. I don't have a lot to say beyond it. I think they absolutely take security very seriously. We work with them, we understand how important it is. You've talked about all the different risks that are out there that they need to protect against. So this is something that both cloud providers and model providers, as far as I can see, are taking very seriously. And as we work closely with them, for Snowflake, of course, security is at the heart of what we do. So we set up our own environment in a way that has all of the security considerations in mind. I'll just say this is an incredibly important area, and there's definitely focus in this area across all the parties.

[32:58] Nathan Labenz: How much would you say of this kind of security has gone to the level of provable guarantees or cryptographically secured as opposed to more roles and access controls and things where there's still a more fundamentally like human element. I just did an episode not long ago with a couple experts in formal methods. including a guy who's a VP at Amazon who's pioneered a lot of their use of formal methods to derive a lot of these security guarantees. But I'm not clear on how much of this is resting on that kind of like we have proven that this is secure versus how much is like we have a process that we feel good about and we want you to trust?

[33:43] Baris Gultekin: So this is not my area of expertise, so I don't have a lot of depth, but purely from talking to both the cloud providers and the model providers, when you start looking at what are all of the attack vectors and what is possible, it doesn't seem like the human factor is the issue. The way the systems are set up is inherently very secure. That said, in security, of course, you can never say, hey, this is completely airtight and it can never be penetrated. As I said, security is taken very seriously, and I don't think it's a human factor necessarily. The way the systems are set up is such that the execution environment doesn't have access to, you cannot do a lot with it other than just run inference through it. So by design, access to the weights is limited.

[34:32] Nathan Labenz: Yeah. Okay, cool. Thank you. I'm always trying to get a little bit better read on that particular corner of the world, and it's not one that is discussed as freely and openly as some of us curious minds might like. Going back to the models themselves, though, what's your read right now? This is another thing where I think people have very different intuitions. Are the models going to be commoditized, or are they going to be sufficiently differentiated as to maintain pricing power as we continue to go into the future? I guess there's that. Then there's, how do you help customers decide which model to use for a given case? Do you have an evals platform built in, or do you help them do evals? How do you help them think about keeping agile so they can switch? Obviously, new models are coming out all the time, so there could be something better or faster or cheaper that you could upgrade to, but you have to know with some confidence that you're going to be upgrading with good reason. There's a whole ball of wax there. Take your time in melting it, but...

[35:31] Baris Gultekin: Yeah, super interesting. I mean, as you mentioned, the differences between these models are not large. Each model keeps getting better, and then there is great healthy competition out there between the model providers, which again benefits companies like ours, our customers, and so forth. For us, because we're providing the choice to customers, the second part of the question is also really important, which is how do we help our customers choose which model is the best fit for their needs? There's a couple of considerations. One is, for many customers, they do not want to leave, again, because of data residency requirements. If they are, for instance, an Amazon shop, and today OpenAI is available through Azure or directly through OpenAI, that becomes a consideration. So some customers are okay with their data leaving that Amazon cloud boundary, others aren't. So that's one decision point. The second one is, of course, from a quality perspective. Many customers will go run evals side by side to decide which model is best suited for their needs. What's interesting is some of these models are cheaper, faster, but when you add reasoning on top of it, the equation changes, right? So, you know, certain models are very good at certain things. Again, just to call out, Claude is incredibly good at coding and continues to be a great model for that. So we help our customers in assessing which models to use for their needs.
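
A minimal sketch of what "run evals side by side" can look like: the same golden question set against each candidate model, scored by a crude grader. The run_model stub and the example rows are placeholders for your own inference client and labeled data; real setups typically use tolerance ranges, SQL-result comparison, or an LLM judge.

```python
# Minimal side-by-side eval harness: score each candidate on the same golden set.
GOLDEN_SET = [
    {"question": "What was total revenue in Q3 FY24?", "expected": "412000000"},
    {"question": "How many active customers do we have?", "expected": "12840"},
]

def run_model(model: str, question: str) -> str:
    raise NotImplementedError("call your model provider / platform here")

def grade(answer: str, expected: str) -> float:
    # Crude exact-match grader; swap in a tolerance check or an LLM judge as needed.
    return 1.0 if expected in answer.replace(",", "") else 0.0

def evaluate(model: str) -> float:
    scores = [grade(run_model(model, row["question"]), row["expected"]) for row in GOLDEN_SET]
    return sum(scores) / len(scores)

# for candidate in ["model-a", "model-b"]:
#     print(candidate, evaluate(candidate))
```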

[37:09] Nathan Labenz: Does that extend to like how much of that is a service, a consultancy type relationship, and how much is productized at this point, or do you want to do more productization?

[37:21] Baris Gultekin: Yeah, it is more productized than a service. On our product, you can easily choose models. You can easily do side-by-side comparisons. You can run evals. And for many customers, actually, it's not necessarily for the first reason. And not all the models are available in their environment anyway, because as a company, they've decided that only these models are approved, or only this environment is approved for me.

[37:42] Nathan Labenz: How often do you see people switching? This is even true in my own case. So I started a company which I used to be the CEO of; I'm no longer. And we're only 40 people. We're doing like $10 million a year in revenue. So we're not an enterprise. We can fly a little faster and looser than that, and we're also not in a regulated industry. We basically do video content creation for local and increasingly like mid-sized businesses. And I feel like we should be changing models more often than we do. Honestly, I feel like the leapfrog effect is just happening so often. And if I were to grade our own performance, I'd be like, eh, B, like we're definitely better than most, but I wish we were even a little more on top of eking out the latest and greatest performance from the latest and greatest models. But it's hard. It is hard to resolve sometimes. You've got even just human inter-rater disagreement, which is tough to overcome. And how does that play out at the larger scale? Do you see people like, oh, we got a new one? When Claude 4.6 hits, how many people move to it in a week, in a month, in a quarter?

[38:42] Baris Gultekin: Yeah, I think it really depends on the use case. In most cases, we don't really see a lot of switching happening because the prompts get optimized for a certain model, and you get high quality because you've optimized it for a certain model, and it's not as easy without further optimization to switch. And because the deltas between the models aren't much and they keep improving on a regular basis, the need for switching is also not that much. As you were describing this, I was actually thinking about Google versus Bing. At some point, Bing got to a good enough quality, but there was the habit of continuing to use Google, the familiarity of the interface and so forth, that switching wasn't as necessary. I don't think we're there yet, necessarily, for models. There's still a lot happening, a lot of innovation happening. And also for certain use cases, like if this is a one-off, I'm going to go run something and then do side-by-side comparison, then you go pick the model that works best for you. But if you've already been investing in an application and you have thousands of lines of systems and prompts that you've built, then there's a cost to switching, in which case the gains have to be large enough to justify that cost.

[39:57] Nathan Labenz: So I'm no Ben Thompson, but it seems like from your perspective, you would want to commoditize your complements and would want to do everything you could to reduce those switching costs, right? One of the virtues of being on the Snowflake platform would be that you've got all the things, but not only do you ideally have all the things, the interface is presumably also a lot more unified than it would be if you were going directly to the model providers. And I imagine there's a bunch of different things you could do over time. You've got things like DSPy out there, where you can say, sure, this is my one prompt with this model, but maybe if I throw it into DSPy, I can auto-evolve my prompt to be more optimized for some other model, what have you. Is this a goal? Do you think of it as a success metric that you would help people be very fluid in switching from model to model as you go into the future?

[40:49] Baris Gultekin: I mean, that's not how we think about it. For us, it really does boil down to, how do we bring the most value for customers quickly? So choice is an important factor there. So we'd like to offer a choice. And customers make model choices for a variety of reasons. As I said, some of them have only approved a certain model. They have their own AI governance boards where they decide which model to use and so forth. But for us, we start with the data at the core. So ultimately, anything that you do is as good as the data that you provide to it. So a lot of the optimizations for us are, can we do a phenomenal job at the retrieval layer? And then can we make sure that all of these models are optimized to the fullest extent so that any customer that's choosing one or the other, for the variety of reasons I called, get the best quality data agent, if you will, that they're building with us.

[41:48] Nathan Labenz: Okay, that's really interesting. What do you think that implies for the competitive dynamics between model providers? One takeaway you might have from that is whoever has the best model at any given time wins. Of course, there are these other constraints, but leaving those aside for the moment, if I'm a customer that has no binding constraints and I can pick whatever frontier model I want, it seems like whoever has the best model at any given moment in time wins that business and then actually stands to keep that business. Even if that business might not be the whole enterprise's business but just that particular use case, you're saying it's stickier than you might think. Switching costs are higher than they intuitively seem. And so having the best performance at the time it's initially evaluated is actually pretty important.

[42:37] Baris Gultekin: So I think the model quality is incredibly important, but increasingly we're moving up the stack so that the product also becomes incredibly important. So when you look at it from a consumer perspective, ChatGPT as a product starts having its own kind of stickiness because you start using it and you get accustomed to using it. Similarly, on the coding side, Claude Code has its own, again, benefits; you'll start writing your instructions, your prompts, to optimize for that workflow. So I think we're just moving up the stack. The model quality is absolutely central, but as the quality kind of keeps up across model providers, the next level of differentiation happens at the application layer.

[43:18] Nathan Labenz: So that's a perfect transition to talk about agents and what you guys are doing with agents. The way I structure my own thinking about agents is on a spectrum. On the one end, your Claude Code style, choose-your-own-adventure: I just give you the goal, essentially, and you, the agent, break it down and search around, grep around, and figure out how to get there. And then on the other extreme is a potentially totally linear, structured workflow where we're going to run a series of prompts one after another. Claude Code is an undeniably awesome interface, but I often feel like people are a little bit too drawn to that, and I sometimes say that's a don't-try-this-at-home sort of project. By all means, go use Claude Code, but don't think that at your business you should be spinning up a Claude Code, choose-your-own-adventure thing. Probably for most cases, I advise people, even still today, more structured is probably going to get you more of what you want faster, in a way that everybody feels good about at the end of the project. Yeah. What distribution are you seeing across that spectrum?

[44:24] Baris Gultekin: Yeah, I think it really depends on the persona who is using these tools. So I'm a huge fan of Claude Code, and coding assistants make a big difference, unlock great capabilities, and are clearly very helpful for AI developers, builders. If you are a business user who is just asking questions like, what was the usage of this product over the last week? As a product manager, for instance, I want a structured way to do this. I want an agent that is already optimized for that use case, that has access to the underlying data. I do not want a Claude Code interface for this. I want something that I know will be high quality and that's optimized. So that's how I think about it. It really depends on the persona that you're building for.

[45:05] Nathan Labenz: So for the talk-to-data product surfaces that you guys expose, where would you say you tend to fall on that spectrum? Is it a "you're going to use these tools in this order," or is the model kind of choosing which tools to use at any given time?

[45:20] Baris Gultekin: So we have a product that we built for business users. So this is Snowflake Intelligence, where you can build a series of assistants. For instance, for the whole company, we built a sales assistant and we've deployed it to 5,000 sellers. Think about that product as a ChatGPT-like interface on top of all of the company's data, so that you can ask questions like, what are my upcoming renewals? How is my book of business doing? And so forth. And you can get answers for it. So for that, clearly you want a highly optimized set of agents for those sets of use cases. And then these are business users using it, and they need to trust the answers that they're getting. Then we have a set of products that we're building for data engineers and for analysts to build data pipelines, to analyze data. That is more of a coding assistant, if you will. So we have our own coding agent that's integrated in that platform, where they're just either analyzing data or writing code. So that, of course, is a lot more flexible, and it's also not tuned for a very specific set of use cases.

[46:27] Nathan Labenz: How do you think about the question of one big agent that might be long running, versus the other kind of big pattern, your sort of initial agent that then routes tasks to sub-agents? Back when OpenAI came out with their Agents SDK, they had this notion of the handoff as a really central idea. And I was never quite clear on, were they doing that because they thought that was the best way to maximize performance? Or was it more a nod to, we think at these enterprises that are going to use this thing, there's going to be different teams responsible for different areas, and we want to be able to modularize the work for human reasons as opposed to AI performance reasons? With Claude, on the other hand (and we have a sponsor of the podcast called Tasklet, which is a maximalist when it comes to "just let Claude cook," basically, as their philosophy): give it everything it needs, let it make all the choices, let it run for as long as it can run, give it feedback, but it's one agent that kind of does it all in one long session. Of course, I'm sure you could say different use cases deserve different paradigms, but what do you see working the most in practice today?

[47:41] Baris Gultekin: Yeah, even in the Claude Code case, you have skills that are being developed, right? So you're still modularizing the different kinds of things you want Claude to do and then giving instructions for Claude to do these things. I think the way you called it out is what I'm seeing, which is, especially in large enterprises, you have different teams building different agents. You also have different agent platforms that are being used. So for instance, if I'm using Salesforce to manage all of my CRM, maybe I'm going to go build my sales-related experiences with an agent there, but I still want that agent to talk to this other agent I'm building for something else. So being able to do that agent handoff and coordination is emerging. I wouldn't say this is necessarily super top of mind for everyone. I think customers are still focused on, let me get this one agent right and working well before I start thinking about multiple agents coordinating with one another. But that's starting to become increasingly important. For customers, one of the biggest considerations is they do not want to be locked into a certain platform. So they still want to make sure that, again, open standards are supported, so that agents can talk to one another, agents can use the tools that you built for one agent from another agent, and so forth. So MCP, A2A, these are important protocols that our customers expect to be supported.

[49:12] Nathan Labenz: I was just going to ask about A2A. Are you seeing traction with that?

[49:17] Baris Gultekin: Still early. We don't yet support it. We're starting to hear about it, but our customers don't necessarily ask for A2A specifically. They do ask for ensuring that some kind of agent-to-agent communication is possible to do.

[49:31] Nathan Labenz: Is there any standard or protocol or platform that is bridging those, the Salesforce continent and the various other continents of agents?

[49:39] Baris Gultekin: Today, we mostly see a bit of a hack where these different solutions are used as tools through MCP. So the orchestrator still uses them as tools and then manages them. From an agent handoff perspective, other than A2A, I haven't really seen anything else.
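
As a rough picture of the "used as tools through MCP" pattern described here: from the orchestrator's point of view, another platform's agent is just one more tool it can call. The call_mcp_tool helper, server URLs, tool names, and routing rule below are hypothetical placeholders, not a specific MCP SDK.

```python
# Sketch of "other agents used as tools": the orchestrator treats a remote agent
# like any other tool. call_mcp_tool() stands in for whatever MCP client you use.
from typing import Callable

def call_mcp_tool(server_url: str, tool: str, arguments: dict) -> str:
    # Placeholder: issue a tools/call request to the MCP server and return the result text.
    raise NotImplementedError

TOOLS: dict[str, Callable[[str], str]] = {
    "crm_agent": lambda q: call_mcp_tool("https://crm.example.com/mcp", "ask_agent", {"question": q}),
    "data_agent": lambda q: call_mcp_tool("https://data.example.com/mcp", "ask_agent", {"question": q}),
}

def orchestrate(question: str) -> str:
    # In practice the orchestrating model decides which tool to call; this keyword
    # rule is only a toy stand-in for that decision.
    tool = "crm_agent" if "customer" in question.lower() else "data_agent"
    return TOOLS[tool](question)
```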

[49:53] Nathan Labenz: Yeah, that's interesting. One of the things I find very funny about this whole thing is that it seems to me like a fundamental property of intelligence. One of my refrains, and this is an overstatement, I don't mean it literally, is that everything is isomorphic to everything else, meaning you can always squish and rearrange and play hide-the-intelligence, and you can have a smart MCP that's actually an agent. How you actually classify these things seems to be much more of a choice and much less a requirement imposed on us by nature, because the nature of intelligence itself is so flexible, fungible, subdividable, whatever.

[50:37] Baris Gultekin: Exactly. No, couldn't agree more.

[50:39] Nathan Labenz: One of the big things, a huge theme of the communications I've seen from Snowflake in preparing for this, is the importance of trust. So I'd love to hear your thoughts: what are the dimensions, and what are the levels we have to hit on each, for an enterprise to trust an AI process?

[51:02] Baris Gultekin: Yeah, just to reiterate, for us, trust is incredibly important. If I were to call out two important tenets, one is ease of use: how easy it is to build out these solutions and to use them. And of course, trust is at the core of everything. Trust spans multiple dimensions, right? There's trust from a security perspective, then from a governance perspective, then you have the quality layer on top of it, and then there are evaluations and monitoring and so forth. So it's a full stack. The way we think about this is that by running AI next to the data, a lot of the core governance that is put on the data is by design respected in our system. What that means is, let's say you have sensitive data that's only visible to the HR team. If you go build an agent, the person who asks a question can only get the answer they're eligible to see and nothing else. This is super obvious and important, but because we have these very granular access controls from the ground up as part of the core data platform, building agents that respect them becomes much easier to do. Then you have governance at various layers. And the next level is evaluations of these. A lot of the trust comes down to: are you able to build high-quality retrieval of context to pass to the agent? Is the agent orchestrator doing a great job figuring out which tool to use, which trajectory to take to answer the question? So evaluation is a core part of the platform, and then ongoing monitoring, getting feedback, and then that cycle of improving quality. From a user perspective, the way that trust manifests is that when a user asks a question, we have UI elements that say, hey, this question has an answer that was verified by an owner. So again, bringing that trust element into the user experience is another tenet of our philosophy.

[53:07] Nathan Labenz: So did I catch correctly at the data governance level that the shorthand rule is like the agent can only access the same data that the user can access? And so in theory, that could mean like multiple users could come to the same agent and have different experiences because the agent has different data access based on the user that's using it at the time?

[53:33] Baris Gultekin: Exactly, and this is exactly what our customers are asking for, and that's relatively easy to build on our platform. So for instance, in the example I gave with our own sales assistant, if a salesperson comes in and says, what is my book of business, summarize it, they should get an answer covering only the customers assigned to them, not another salesperson's. Or as a manager using an HR bot, if I ask, what's the salary of this person, I should only be able to see the salaries of people I have access to see, not somebody else's. And underneath, it is the same agent; it's the access controls that govern what I'm able to see.
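A minimal sketch of that pattern, assuming the Snowflake Python connector; the connection details are placeholders and `generate_sql` is a hypothetical stand-in for the text-to-SQL step. The key point is that the query runs in a session scoped to the asking user, so the access policies defined on the data itself decide what comes back:

```python
# The agent never queries with broad service credentials; it executes the
# generated SQL in a session tied to the requesting user's role, so the
# platform's access controls filter the results automatically.
import snowflake.connector


def generate_sql(question: str) -> str:
    # Hypothetical text-to-SQL step (e.g. a model call); stubbed for the sketch.
    raise NotImplementedError


def answer_as_user(question: str, user: str, role: str) -> list[tuple]:
    sql = generate_sql(question)
    conn = snowflake.connector.connect(
        account="my_account",              # placeholder
        user=user,                         # the human asking, not a service account
        authenticator="externalbrowser",   # placeholder auth method
        role=role,                         # e.g. a sales-rep role vs. a manager role
    )
    try:
        cur = conn.cursor()
        cur.execute(sql)
        return cur.fetchall()              # rows already filtered by access policies
    finally:
        conn.close()
```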

[54:20] Nathan Labenz: Yeah, interesting. On the performance and reliability side, my experience has often been that expectations run high. And sometimes that's for good reason. Certainly in the self-driving car realm, there's a certain logic to saying, we don't just want these things to be roughly human level; we want them to be clearly a step up before we adopt them society-wide. Good news: it seems like we're getting there. What do people have in mind as the intuitive standard of performance? Is it that they want these agents to be perfect? Is it that they want them to be at the level of the human that used to do the job? Is there some heuristic in the middle that you think people often land on?

[55:00] Baris Gultekin: Yeah, super interesting concept, right? The more natural the interface is, the more human-like intelligence we intuitively expect. If I'm talking to the agent versus typing, I think talking carries much higher expectations; if I'm just typing, I know I'm typing to a computer, so the expectations are a little lower. I don't think adoption of this technology requires human-like intelligence, because even the specific things these models and applications already do well are such high value that we're seeing huge adoption of AI, as you know. And it keeps getting better and better at a super rapid pace. Yeah, I'm very excited about where the technology is.

[55:53] Nathan Labenz: Before going into your expectations for the year ahead, what are you seeing in terms of guardrails? One big pattern that I think is very natural for you guys is sourcing answers back to the document or the authoritative place they came from. Beyond that, though, we've got this whole constellation of different patterns, right? You can filter inputs for appropriateness, you can filter outputs, you can log things and post-process the logs. AWS has, I think, a really interesting new service called Automated Reasoning checks, where you can put a policy in, they use a language model to convert your natural language policy into a set of rules and values, and then at runtime they do literal formal methods to ensure that whatever the system gave you back actually passes the formal reasoning checks derived from that natural language policy. That's pretty interesting and pretty cutting edge from what I've seen. But in most places, my sense is the frontier model companies are doing a ton of this stuff. Anthropic has pushed this probably further than anyone when it comes to preventing you from using Claude to do certain things in the bio domain. But are people at the enterprise level actually doing much of it? Or are they just saying, this thing seems to work, we've got an eval set, it passes, and we'll go with that?

[57:20] Baris Gultekin: I think the sophistication is increasing. Usually companies start with products that are more internally focused; while guardrails are still important there, the bar is a little lower than for something externally facing. At Snowflake, we offer guardrail products that check for things like hate speech, violence, and other violations, so you can detect and flag them and not have the model respond. But of course we also benefit, as you called out, from all the great work companies like Anthropic do on their own models as a baseline. The other thing that's super interesting is that as the models keep getting better, their adherence to instructions keeps getting better too. So some of these guardrails also get codified as instructions to the agent: not only do this and that, but also here is the policy you need to comply with. And that tends to work quite well.
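A generic sketch of that layering: screen the input, codify the policy as instructions, then screen the output. `moderate` and `call_model` are hypothetical stand-ins for whatever moderation classifier and LLM endpoint a given platform provides:

```python
# Guardrails as a wrapper: filter inputs, pass the policy as instructions,
# filter outputs. Illustrative only; both helpers are stubs.
POLICY = (
    "Answer only questions about company HR policy. "
    "Never reveal an individual's salary to someone outside their management chain."
)


def moderate(text: str) -> bool:
    # Hypothetical: return True if the text contains hate speech, violence,
    # or other violations.
    raise NotImplementedError


def call_model(system: str, user: str) -> str:
    # Hypothetical LLM call.
    raise NotImplementedError


def guarded_answer(user_input: str) -> str:
    if moderate(user_input):
        return "Sorry, I can't help with that request."
    answer = call_model(system=POLICY, user=user_input)  # policy codified as instructions
    if moderate(answer):
        return "Sorry, I can't share that."
    return answer
```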

[58:15] Nathan Labenz: Have you seen anything in the interpretability realm being used for practical guardrail or monitoring purposes in enterprises so far?

[58:29] Baris Gultekin: We're seeing evaluations become really important, right? A lot of what companies tend to do is create their own eval sets, but also use LLMs as judges across various dimensions to score what's happening and then continue to monitor it on an ongoing basis. And as agents become more and more complex, it's a pretty new area: is the agent taking the right route? Should I be optimizing it? Understanding where things go wrong becomes really interesting. So that's what I see, not necessarily in the core consumer experience, but in the developer experience, where you're seeing what the model is doing, what the agent is doing, through evaluations and monitoring.
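A minimal sketch of that LLM-as-judge loop; `call_model` is a hypothetical LLM endpoint, and the dimensions and rubric are illustrative rather than any particular platform's:

```python
# Score each answer in an eval set along a few dimensions with an LLM judge,
# then aggregate so quality can be tracked over time.
import json

DIMENSIONS = ["groundedness", "relevance", "completeness"]


def call_model(prompt: str) -> str:
    # Hypothetical LLM call expected to return a JSON string.
    raise NotImplementedError


def judge(question: str, context: str, answer: str) -> dict[str, int]:
    prompt = (
        "Score the ANSWER from 1 to 5 on each dimension, given the QUESTION and CONTEXT.\n"
        f"Dimensions: {', '.join(DIMENSIONS)}.\n"
        'Respond as JSON, e.g. {"groundedness": 4, "relevance": 5, "completeness": 3}.\n\n'
        f"QUESTION: {question}\nCONTEXT: {context}\nANSWER: {answer}"
    )
    return json.loads(call_model(prompt))


def run_eval(eval_set: list[dict]) -> dict[str, float]:
    scores = [judge(ex["question"], ex["context"], ex["answer"]) for ex in eval_set]
    return {d: sum(s[d] for s in scores) / len(scores) for d in DIMENSIONS}
```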

[59:18] Nathan Labenz: Yeah. Okay. So you mentioned being excited about the year ahead. You also mentioned voice experiences. It seems like we're at a moment, literally right now, where, maybe because people had extra time over the holiday break to get into Claude Code for the first time, the discourse has really shifted and expectations have really shifted. In just the last 30 days, people have said, oh my God, the coding experience now is not just vibe coding and eventually hitting a wall and giving up; you can actually really make this work. And the next big thing people are saying over and over again is that the same thing that's happened to coding over the last however many months is coming to a great many domains of knowledge work over the next year. So do you buy that hype? And what do you think that looks like? Are we all going to be agent managers, or are we all going to be talking verbally to agents while getting lots more exercise than we used to? What is the 2026-plus vision for success?

[1:00:24] Baris Gultekin: Yeah, I'd love to see the world where I don't need to do anything and I can just go get more exercise. I actually see the opposite: we're able to do a lot more, and we end up doing a lot more, especially in AI, where everyone is sprinting. There is more work and more productivity out there. So what I'm seeing is that agents are absolutely getting more and more capable. Coding agents, as you called out, have passed this threshold where, because they're a lot more capable, a lot more people are using them. I think it changes how products are developed. It changes jobs like product management, for instance. I've been in product management for 20 years, and the way we build products has to change given these coding assistants. How you deploy quickly, how you test things quickly is changing because of how capable they are. And that is a combination of different things, right? The agent can do a lot of things and can code well, but the reasoning capability is also increasing, and the tool use becomes incredibly powerful. So you can apply that to other domains: figuring out which tool to use, using it effectively, and then reasoning about the next steps lets you build very capable agents across the board. This is not a 2026 projection or anything, but I'm absolutely seeing clearly increasing capabilities with agents and also increasing use of them for production work.

[1:01:48] Nathan Labenz: Do you have a sense of how the progress in AI coding assistants or agents has changed how work is happening at Snowflake? Are you instrumenting or measuring that? Obviously lines of code would be too primitive, but features shipped, or burn-down points per cycle? Is there a way you can begin to quantify the impact?

[1:02:14] Baris Gultekin: There's the impact piece, but there's also the philosophy that is changing. How we build products is changing, and that requires a change of behavior. Usually my go-to is, let's say there's a feature to be built: I'll think about a UI, I'll go build this UI and make it happen. Whereas with a coding assistant, if my users are also living in coding assistants, maybe it's as simple as, let me just go build a skill for this thing and quickly test it out. I can write the skill in a day, put it in front of my customers, have them use it, and get their feedback. And only when I know exactly what the shape of the product is do I go and solidify it into a more complete consumer experience. So I think, again, product management is changing, product building philosophy is changing because of these coding assistants.

[1:03:06] Nathan Labenz: Yeah. The working prototype is the coin of the realm these days, for sure.

[1:03:10] Baris Gultekin: Yep. Yep.

Nathan Labenz: One of the big predictions that I have heard a lot recently, sometimes with a remarkable level of specificity, including from some Anthropic people, is that we should see the first drop-in knowledge worker products offered this year. Specifically, folks have said Q2 of this year. And what that means to them is basically a new employee that ultimately, at heart, is an AI, but has a very similar surface to a remote worker on your team. It'll have a name, and it'll have all the same accounts, or at least you'll be able to give it all the same accounts you can give a human employee, which means it'll be on Slack, accessible via email, all over the place, and can probably join calls. And the expectation is that this will be good enough in Q2 of this year that people will start to get value from it and it will be a new product category. I guess first, do you buy that that can happen that soon? And second, how many of your customers do you think will be eager to try something like that out when it drops?

[1:04:18] Baris Gultekin: I do see it as a natural progression. Today, the agents being built are either automating certain processes from a productivity perspective, or they are more like copilots that I can ask questions of and get responses from, versus these autonomous, intelligent entities, if you will. When exactly that will happen really depends on how tightly you can scope them. I don't think we're at a point where you can just create another colleague that can do anything and everything. But if you can very easily scope the task, then I think it absolutely is possible. So I don't know; I do see it happening, whether in Q2 or not, I'm not so sure yet. And again-- I'm trying to find out. Yeah, but also, as a data platform company, I will call out that ultimately all of these capabilities come down to the fact that, for any given company, the differentiation is their data. Then there's the access to that data, being able to figure out and retrieve the right data to answer a question, and increasingly the tools that are given to these agents to take action. I think that changes industries. It changes how we think of data, how we think about making the data AI-ready, as well as making the tools AI-ready, so that more and more capable agents can be built.

[1:05:55] Nathan Labenz: Do you see changes to data itself? One that we've talked about already is retroactively going back and applying structure to unstructured data, creating metadata, and so on. In the wild, one big change we're seeing is that the web itself is increasingly composed of AI-generated data, which is a weird feedback loop we accidentally created. Are there any other perhaps surprising patterns in data within enterprises that you're seeing as a result of AI coming onto the scene? Or is it maybe just still too early for something like that to show up?

[1:06:36] Baris Gultekin: I am seeing two things. One is that access is getting a lot easier, so that democratization of access to data and to insights is a big shift. The other thing I'm seeing is that the value our customers get from data is increasing, because you're able to very easily glean insights by just describing what you want in natural language and then getting it. So the value you get from data is increasing, and that opens up new and new opportunities. You start using the data in ways you hadn't thought of before. One interesting study: one of our customers, S&P, analyzed earnings calls to understand whether CEOs respond to analyst questions directly or indirectly, or whether a question was already answered in the opening remarks, and used that as alpha to determine which stocks to buy. Stuff like that becomes very easy to build, and then new use cases open up.
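A rough sketch of how a use case like that could be wired up; `call_model` is a hypothetical LLM endpoint, and the labels and prompt wording are illustrative, not what S&P actually built:

```python
# For each analyst question on an earnings call, classify whether the executive
# answered directly, indirectly, covered it in the opening remarks, or dodged it.
LABELS = ["direct", "indirect", "covered_in_opening_remarks", "not_answered"]


def call_model(prompt: str) -> str:
    # Hypothetical LLM call.
    raise NotImplementedError


def classify_response(opening_remarks: str, question: str, response: str) -> str:
    prompt = (
        f"Opening remarks:\n{opening_remarks}\n\n"
        f"Analyst question:\n{question}\n\n"
        f"Executive response:\n{response}\n\n"
        f"Classify the response as one of {LABELS}. Reply with the label only."
    )
    label = call_model(prompt).strip()
    return label if label in LABELS else "not_answered"
```

Aggregated per company and quarter, the label distribution becomes the signal, rather than something an analyst reads call by call.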

[1:07:37] Nathan Labenz: Yeah. That's an interesting metric. I've seen a bunch of stuff recently, even just in the last week; it was really the Venezuela moment where all of a sudden people were bragging about how they had created AIs that monitor the prediction market platforms, looking for early signal and trying to capitalize on it. That's going to be a really interesting phenomenon. Actually, let me go a different direction and do a little lightning round to close us out. I have this sense that right now we're in a kind of expansionary phase. I'm no astronomer, but my experience with platforms in the past, and definitely with the Meta platform, formerly known as Facebook, was that they came on the scene and opened up a ton of stuff. Everybody could tap into all this data and all these social connections, and for a moment there was an incredible flourishing of a ton of different ideas. And then after that supernova came the black hole, and it was like, actually, we're going to close all this stuff back down. A lot of the value that entrepreneurs created on the edges, experimenting with different ideas, the things that really mattered, mostly ended up getting sucked back into the platform, and there wasn't nearly as much value created on the margins as it seemed like there was going to be. I might just be over-indexing on this experience of having lived through that pattern once before, but I feel like the AI moment is set up for that to happen to a lot of people again. For example, just this week ChatGPT launched a medical version, which is great; I think that's going to be awesome for a ton of people. The fact that I can now just connect my ChatGPT into an EMR, instead of having to laboriously copy and paste or find some third-party thing to do that connector work for me, the consumer surplus of that is going to be amazing. That's my strongest belief: we will see high consumer surplus. But for the businesses, it seems like it creates a very tricky balance, where you're like, I want to go do a bunch of cool stuff, but how do I know which of these things will be durable over time? How do I know I'm not just doing R&D for the next generation of the mega platforms that are ultimately going to eat my lunch? How do you guys think about where you want to place your bets, and what is going to get absorbed into the models versus what only you can do over a longer period of time?

[1:09:59] Baris Gultekin: Yeah, I think we are in a fortunate place, because ultimately data is an incredibly important asset for all companies. That is what defines and differentiates them, and that is not getting commoditized anytime soon. So as a data platform, we sit in that layer in between the application and the model, if you will. The way we think about this is: how do we help our customers build very high quality products catered to their use cases, all powered by their data? As for whether any of that can be subsumed by these other platforms, I am sure the shapes of products will continue to evolve. I think we're still in the very early innings of a huge transformation. But I also believe intelligence is a... Once these models are out there and there is enough competition, and we're seeing enough competition, the dynamics seem to be playing out in a way that favors customers and consumers over these mega platforms. So I do believe the competition will keep things in check, and the opportunity is so massive that that growing pie will also create lots of new opportunities.

[1:11:11] Nathan Labenz: Obviously, people have to have some place to keep their data. Tell me what's wrong with this theory, if I were to be a skeptic or take the perspective of the Snowflake bear for a second. A recent experience I had: my company has been a customer of Intercom for a number of years, and I was trying to do some basic analysis of recent tickets. They didn't have the dashboard to do what I wanted, so I went to their docs, and it was 100 pages of docs. It's a full-featured platform at this point, so a lot of docs. The first thing I did was tell a web agent, hey, go compile all these docs, and it literally went page by page and copied them all into a Google Doc in a browser. I ended up with some 600 pages of text. Then I took that to Gemini and said, okay, there's a lot of repetition in here, but can you streamline that down to what I really need to know? And it did; it fit in the million tokens. So now I've got my consolidated single view of the docs. Then I go to a coding agent and say, what I really want to do is export all my data from Intercom. And that also ended up taking just one prompt to work. It exported all my data from Intercom, and I was able to do the analysis I wanted to do. But then the eureka moment was: wow, it's never been easier to unplug from Intercom if I want to take my data somewhere else. They didn't really anticipate it being that easy when they created all these APIs. So what prevents the data platforms of today from running into trouble there? In the past, presumably, if somebody were to say, hey, I'm not happy with you, or I want a better price, you had some leverage of just: what are you going to do, pull out all your data? And I'm not saying that would be easy today, because I know you guys handle vastly more data than I have in Intercom. But it does strike me that it's become a lot easier to move things around, and a lot easier to understand what the data is, especially once you've done all this metadata layering. So what are the moats? What are the sticking points? Has it changed, or will it change?

[1:13:10] Baris Gultekin: It is going in the direction you're calling out, which I think is great, again, for customers and consumers. Today, Snowflake supports open file formats for storing data. We support Iceberg, which is an open table format. What that means is you do not have to have your data locked in somewhere. You can keep it in a place that's managed by you or by us, and then use Snowflake as an engine to process your data. So we are absolutely embracing and supporting the ability for our customers to use these open formats and not feel locked into one platform. And I think that's great for our customers and great for innovation. Ultimately, customers will end up using the product that gives them the best performance and best cost for the things they'd like to do, and we're absolutely embracing that.
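A minimal sketch of what an open table format buys you, assuming the pyiceberg package; the catalog settings and table name are placeholders for whatever catalog actually manages the data:

```python
# An Iceberg table one engine writes can be read by any other engine that
# speaks Iceberg, so the data never has to be exported from a proprietary
# store to be used elsewhere.
from pyiceberg.catalog import load_catalog

catalog = load_catalog(
    "default",
    **{
        "type": "rest",
        "uri": "https://my-catalog.example.com",  # placeholder REST catalog
    },
)

table = catalog.load_table("analytics.orders")  # placeholder namespace.table
rows = table.scan(
    row_filter="order_date >= '2025-01-01'",    # pushed-down filter
    selected_fields=("order_id", "customer_id", "total"),
).to_arrow()

print(rows.num_rows)
```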

[1:14:14] Nathan Labenz: Does that translate to increased pressure on you and your team? It would seem like maybe one way to think about that would be like, in the past, if somebody wasn't happy for a year, maybe they would start to think about a switch, where now it might be, if they're not happy for a quarter, does it shrink the timeline where you have to deliver?

[1:14:33] Baris Gultekin: I love it. It's great for product teams, right? Ultimately, we're all driven by creating value for customers and building a great product, and we want to do that as fast as possible. Competition is a great incentive in the system to keep things in check and make you deliver. So I don't think things change for my team. We already feel pressure, not necessarily because of competition, but because of the opportunity. The opportunity is massive. And there's never been a greater time to be a product manager, right? You're able to build awesome products very, very quickly, and you're sprinting. It's really satisfying to build these great products, and then you also reap the benefits by seeing how they're getting used in the market. So I love the competition, and I also love the pace of innovation in the industry.

[1:15:29] Nathan Labenz: Does that lead you to a big-picture point of view on a classic question? And again, it's striking to me how very informed and technically sophisticated people have very different answers. How do you think about the breakdown of where value accrues? Obviously we've got infrastructure, whether that's chip creators or owners, models, and the application layer on top. If you had to assign those three layers relative value capture from the AI opportunity, how do you think that breaks down?

[1:16:01] Baris Gultekin: Yeah, maybe the way I think about it is that the middle will erode and the sides will continue expanding. So far we've been seeing a lot of value accruing to chip makers, NVIDIA, as well as the model providers, and then increasingly to application developers who are able to very quickly build unique businesses on top of these capabilities. Cursor comes to mind as an example. So I absolutely do see value continuing to accrue at the infrastructure layer as well as at the application layer. And traditionally, there's always been this middle layer that facilitates and connects those two things. Because of the capabilities of these models, that middle layer may not be as valuable or as important anymore.

[1:16:54] Nathan Labenz: And that middle layer is the models. Is that right?

[1:16:57] Baris Gultekin: No, no, no, I do believe-- the middle layer is all the companies that are creating custom business logic for certain applications. As you called out, for instance, if you want to build your own extractor, you can just vibe-code it over a weekend versus paying a company that goes and builds it for you. So that layer isn't as important. Models, in my opinion, will continue to accrue a ton of value.

[1:17:29] Nathan Labenz: Yeah, okay. So to try to play that back to you, it sounds like you think all three of the layers that I described will do fine, but at the application layer- Traditional businesses will change.

[1:17:40] Baris Gultekin: Yeah.

Nathan Labenz: You're going more horizontal platform and are relatively less excited about vertical, because so many SaaS applications essentially exist to encode business logic or best practices or whatever, and we probably don't need dedicated teams building out those kinds of things when we can just have agents do it on the fly as needed.

[1:18:07] Baris Gultekin: That's what I'm guessing over time. Yeah.

[1:18:08] Nathan Labenz: Yeah. Okay. Cool. One other big question that I've asked a lot of people a lot of times, and I think you're the perfect person to touch on it. You, of course, know that Databricks acquired this company called MosaicML not too long ago, maybe two years ago now. And what Mosaic was doing I thought was really interesting: starting with open source models and working with particular customers to do continued pre-training on data sets, which I assume were very often internal data sets, the sort of data sets that might sit in a Snowflake. I was really surprised. I spoke to Ali Ghodsi, the CEO of Databricks, at an event not too long ago, and he said, we killed that product. So they basically turned Mosaic into an in-house research unit. But that product of offering continued pre-training to try to create a model that really knows your business inside and out, they basically don't offer it anymore. I was very surprised by this, because I think, man, if I'm GE or 3M or any number of hundred-year-old companies with millions of employees over the generations, companies that have this incredible history and so much accrued data that nobody at the company really understands these days, if I could have a model that had command of that information that only exists in my company, that nobody outside has ever had access to, the way today's foundation models generally have command of world knowledge, I would think that would be insanely valuable for a lot of enterprises. And yet we don't seem to be seeing, to my knowledge, many instances of, whatever, 3M GPT or GE GPT or Pfizer GPT. Why don't we see that?

[1:19:59] Baris Gultekin: Do you have a point of view? I do. This is kind of similar to... how up until recently, when you'd ask a question on ChatGPT, it would say, hey, my information cutoff is whatever, a year ago, and I can only answer questions up to that point. Then web search came in as a tool, and now all of these platforms use web search to give you the most up-to-date information so that their world knowledge can stay current. It is more about the intelligence to figure out when to use the tool to retrieve the information, make sense of it, and give it back to you, versus having been pre-trained with all that information upfront. To me, that pattern is exactly what's playing out. In the enterprise world, you have a lot of information, and your text-to-SQL and RAG solutions can bring that information in for the agent, for the platform, to reason with and give you an answer. The nice thing about that is it's substantially cheaper, it keeps getting better as the underlying premier model keeps improving, and it's relatively easy to tune; you can update it, change things, and so forth. So that means, for me, the majority of businesses will continue to benefit from this architecture rather than codifying all of that information in the weights of the model. They'll just use tools to retrieve the parts of the information that are relevant. The exception is what we discussed earlier: if there are certain tasks that require high throughput and low cost, or if you have a lot of data in an area the model has not seen before, then it might make sense to create custom models for those specific tasks. So I do believe there is going to be an increasing need for task-specific small models in large corporations when that need arises. But still, the majority of use cases will be more retrieval-oriented.
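A toy sketch of the architecture being argued for: keep the premier model frozen and route questions to retrieval tools rather than baking enterprise knowledge into the weights. `call_model`, `search_documents`, and `run_sql` are hypothetical stand-ins for a model endpoint, a RAG index, and a text-to-SQL path:

```python
# Route each question to the right retrieval tool, then answer from the
# retrieved context. Illustrative stubs only.
def call_model(prompt: str) -> str:
    raise NotImplementedError  # hypothetical LLM call


def search_documents(query: str) -> str:
    raise NotImplementedError  # hypothetical RAG retrieval over unstructured data


def run_sql(question: str) -> str:
    raise NotImplementedError  # hypothetical text-to-SQL plus execution


def answer(question: str) -> str:
    route = call_model(
        f"Reply with DOCS, SQL, or NONE: which source best answers '{question}'?"
    ).strip().upper()
    retrievers = {"DOCS": search_documents, "SQL": run_sql}
    context = retrievers.get(route, lambda q: "")(question)
    return call_model(
        f"Context:\n{context}\n\nQuestion: {question}\nAnswer using only the context."
    )
```

Swapping in a better premier model improves every step here without retraining anything, which is the cost and upgrade advantage described above.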

[1:22:13] Nathan Labenz: I think that's a great first-pass answer. If I think, though, even just about my own ability to search through my own stuff, my own Gmail, my own Google Docs, one of the intuitions I hold pretty strongly is that if I were to give you full access to my Gmail and my Google Docs, you couldn't search through them nearly as well as I can, and that's despite the fact that you're clearly smarter than me. So there seems to be something about the fact that I have had this pre-training on this corpus that allows me to search through it a lot better. If nothing else, I know when I've found what I'm looking for. You might not know; you could do 100 searches in my Google Drive and never be quite confident you got the absolute best document for whatever the question is. Whereas if it's my Google Drive and I created all those documents, when I hit that document, it's: yes, I remember this now, this is the one. So I have that confidence that I've gotten to the answer, if nothing else, which strikes me as really hard to replicate. And I've seen this when I give Claude access to search my drive; it also struggles in that way. It doesn't know how many times to search, or whether it's found the right thing, or it's sometimes satisfied too easily, whatever. So I still feel like there's something there, where you could expect that a model that really had a more in-the-weights familiarity with the data could do a better job of navigating it. And then maybe it just comes down to upgrade cycles being terrible for this kind of thing, and, as you said, you want to keep taking advantage of better and better models. Dwarkesh has obviously influenced the discourse recently with his focus on continual learning, so maybe you need either a new architecture or some new training paradigm that's more suited to that. I guess one way to phrase this is: if that were to flip, if you imagine a world a year from now where it's no longer the case that the best approach is to pick the best models, leave them as they are, and feed them context through search as you described, and it instead becomes one of these things where they actually do have this deeper familiarity with all the enterprise's data, what would have changed to flip us from one paradigm to the other?

[1:24:35] Baris Gultekin: I was trying to think about how humans do this. We go into an environment we don't know, we do a bunch of searches, and we read to build up some kind of knowledge. And as you build out that knowledge, intuition comes with it, so you don't need to keep referring back to it; somehow it turns into intuition. Right now, I think what these models are doing is the first part: go find the information. As the context windows of these models keep getting better, I can stuff more and more information into them and get an answer. How intuition works is not really understood, so I don't really know how that changes the dynamic. What would change if a model is trained with your data? You clearly need much less data to steer it in a certain direction, and you'll have much more consistency in the responses. I don't think you can ever get away from feeding it up-to-date information and so forth. But what I would imagine happens is, first of all, that the model you want for certain tasks doesn't have to be able to write a poem in French. So you'll benefit from using the weights more efficiently for the tasks you want to do, and therefore, perhaps, you may not need as large a model, so you get benefits from more optimization to reduce cost, increase speed, and so forth.

[1:25:54] Nathan Labenz: Yeah, I think that many-small-models paradigm is also one I'm pretty bullish on for quite a few reasons, one being that I think we stand a lot better chance of staying in control of the meta if we have a lot of narrow AIs doing their jobs, as opposed to a relatively small number of giant AIs running things for us. The pull of that is obviously pretty strong, but I do worry that we're racing into having such general AIs that can do sort of anything before we've really thought through what the ultimate consequences of that are going to be. Safety through narrowness and maintaining control through narrowness, I think, is an underdeveloped paradigm.

[1:26:35] Baris Gultekin: I fully agree. That's again, just going back to human analogy. That's how we operate as well. There are certain parts of the brain that are specialized to do certain tasks. And yeah, so I can absolutely see that.

[1:26:46] Nathan Labenz: This has been amazing. A couple of quick closing questions. What are you watching right now in terms of horizon scanning for surprises? Is there a capability threshold that's on your mind? Obviously nobody can keep up with the AI news these days, right? So everybody has to pick and choose. What are the areas that you're watching? And maybe another dimension of that: what are the metrics you're watching? Are you looking at ARC-AGI scores? Are you looking at GDPval? Are you looking at the METR task-length chart? Are there others? What do you trust to give you the highest signal on what is actually important in the latest things coming out?

[1:27:23] Baris Gultekin: I actually don't watch the public benchmarks as closely. We do have an internal series of benchmarks that we watch very closely in terms of quality and latency for the tasks we're optimizing for, which is, of course, built on top of the models we get from the model providers. Whenever a new model is about to be released, we'll go run our tests and figure out what's improving and what the gaps are, and I watch that very closely. One thing that is maybe unique to Snowflake is that we have a lot of tabular data, and that technology so far has been all about text-to-SQL and semantic models. So I watch that space quite carefully, and there are some new trends happening there. These tabular foundation models are interesting; being able to quickly build forecasting models and so forth is now possible. Those are other trends I watch as well.

[1:28:14] Nathan Labenz: Yeah, that's an interesting one. There are a few public forecasting benchmarks and competitions; I think those are really interesting too. At the point where the models are better able to predict the future than our best superforecasters, or even aggregations of superforecasters, that will feel, I think, like a very meaningful shift in what's going on in the world. Any contrarian takes? Anything you want to correct that you think the audience at large might be misled about or misconceiving right now?

[1:28:44] Baris Gultekin: I mean, we touched upon the one that's very top of mind for me right now. I think the way we build products is changing. I don't think it's contrarian, but I don't think it's happening fast enough. We're at a point where how we build products needs to radically change. And that means change of behavior, because we've been trained to build products one way. So to me, that's the biggest one. In a world where these coding agents are such kind of capable platforms, how do you build new products? And in my mind, it is all about kind of starting with that first and then validating things quickly before you build a product in the first place.

[1:29:27] Nathan Labenz: Yeah, I think that's-- I've been doing that with my mom. I made her a custom travel planning app for the holidays, and it has been-- yeah, it's inverting that process, right? I made a version, she has it, and I sat down with her this morning over coffee and asked, what do you want this thing to do that it can't do? And she said, I don't want to ask you to do more on this. I'm like, Mom, it's honestly so easy at this point. If you can articulate what you're missing, there's a pretty good chance we can get Claude Code to just make it, and you can have it from one session to the next. My last question, then I'll give you the final word: what advice do you have for enterprise leaders in general? Obviously, you guys are much closer to the executives and product owners at the companies that you serve; what do you think they should better appreciate, or what can they learn from your experience?

[1:30:14] Baris Gultekin: There are some enterprises that are still quite careful about adopting AI. And at this point, the technology is so powerful that there is a race: the faster enterprises adopt AI, the more benefit they're going to get, and the more intuition they're going to build that changes the trajectories of these businesses. So to me, it's incredibly important to intuitively understand and natively use AI, because it is going to change industries. Underlying that, from our perspective, many of the hesitations tend to be about getting the data ready for AI. That means investing in the core foundation to get the data AI-ready: breaking down silos and making the data accessible rather than locked in for certain use cases. That becomes the core enabler to build on top of.

[1:31:18] Nathan Labenz: Makes sense. We've covered a lot of ground. I really appreciate your time and jumping through all these topics with me. Anything else that we didn't touch on that you would want to leave people with?

[1:31:25] Baris Gultekin: No, maybe the thing to call out is we talked about a lot of great capabilities as well as trends. One thing that is sometimes not necessarily appreciated is how easy it is to use AI and how easy it needs to be to use AI for adoption. So that's an area for Snowflake that's super core. So as we build products, making it very easy to deploy high quality AI at scale is something that we strive towards. To me, from a design principle perspective, that is key as well.

[1:31:55] Nathan Labenz: Yeah, couldn't agree more. Baris Gultekin, Vice President of AI at Snowflake, thank you for being part of the Cognitive Revolution.

[1:32:02] Baris Gultekin: Thank you, Nathan. Thanks for having me.

