Milliseconds to Match: Criteo's AdTech AI & the Future of Commerce w/ Diarmuid Gill & Liva Ralaivola

Criteo’s Diarmuid Gill and Liva Ralaivola explain modern ad tech, including millisecond recommendation systems, real-time bidding, embeddings, privacy choices, and OpenAI’s role in product discovery. They also discuss European AI talent and generative creative tools.


Show Notes

Diarmuid Gill and Liva Ralaivola of Criteo join Nathan Labenz to unpack how modern ad tech works, from millisecond-speed recommendation systems and realtime bidding to the role of deep learning, embeddings, and foundation models. They discuss why personalized advertising helps fund the open internet, how privacy and opt-out choices fit in, and what Criteo’s new partnership with OpenAI could mean for product discovery. The conversation also covers European AI talent, research publishing, and the future of generative creative in advertising.

LINKS:

Sponsors:

Sequence:

Sequence handles the full revenue workflow for complex pricing, from quoting and metering to invoicing, revenue recognition, and collections. Book a public demo at https://sequencehq.com and use code COGNISM in the source field to save 20% off year one

Claude:

Claude by Anthropic is an AI collaborator that understands your workflow and helps you tackle research, writing, coding, and organization with deep context. Get started with Claude and explore Claude Pro at https://claude.ai/tcr

AvePoint:

AvePoint is building the control layer for AI agents so you can securely govern, audit, and recover every action at scale. Design trusted agentic outcomes from day one at https://avpt.co/tcr

CHAPTERS:

(00:00) About the Episode

(03:32) Advertising data basics

(07:55) Cookies and targeting

(13:39) LLM commerce discovery (Part 1)

(20:10) Sponsors: Sequence | Claude

(23:09) LLM commerce discovery (Part 2)

(23:09) Real-time ad models

(33:07) Foundation model embeddings (Part 1)

(33:13) Sponsor: AvePoint

(34:20) Foundation model embeddings (Part 2)

(44:43) Conversational ad signals

(49:49) European AI culture

(01:01:58) Generative creative tools

(01:06:51) Personalization boundaries

(01:12:18) Human creative oversight

(01:17:26) Agentic advertising future

(01:23:51) Episode Outro

(01:27:17) Outro

PRODUCED BY:

https://aipodcast.ing

SOCIAL LINKS:

Website: https://www.cognitiverevolution.ai

Twitter (Podcast): https://x.com/cogrev_podcast

Twitter (Nathan): https://x.com/labenz

LinkedIn: https://linkedin.com/in/nathanlabenz/

Youtube: https://youtube.com/@CognitiveRevolutionPodcast

Apple: https://podcasts.apple.com/de/podcast/the-cognitive-revolution-ai-builders-researchers-and/id1669813431

Spotify: https://open.spotify.com/show/6yHyok3M3BjqzR0VB5MSyk


Transcript

This transcript is automatically generated; we strive for accuracy, but errors in wording or speaker identification may occur. Please verify key details when needed.


Introduction

[00:00] Hello, and welcome back to the Cognitive Revolution!

Today my guests are Diarmuid Gill, CTO, and Liva Ralaivola, VP of Research and Head of the AI Lab, at Criteo, the advertising technology company that powers much of the personalized advertising we experience on the open internet.

I'm also joined by long-time friend and teammate Alex Persky-Stern, who took over for me as CEO of Waymark 3 years ago, and has since formed a partnership with Criteo that brings Waymark's AI-powered commercial creation product to Criteo advertisers.

We begin with an explanation of how modern digital advertising works and the value it creates for society.  

Personally, I tend to emphasize that without the commercial recommendation systems that allow businesses to affordably reach their target customers, a lot of the long-tail small businesses and niche products that we enjoy today simply wouldn't be viable at all.  

Diarmuid and Liva, for their part, focus on how ad tech delivers more relevant, engaging experiences and supports free access to information, as well as how easy it is for individuals to opt out of personalization systems.

From there, we dive into how it all works.  Criteo has been in business for more than 20 years, and while their AI techniques have naturally evolved with the field - most fundamentally, from the earlier era of hand-crafted features to the modern era of deep learning - the one constant has been their need for incredible speed.

From the time your browser requests a web page, Criteo has just milliseconds to locate your profile among the billion or so in their system, and, in light of what you’re doing right now, decide which one - out of many millions of products - to recommend, and how much to bid in a realtime auction.

It’s a challenging problem that requires lots of pre-computing, but the upshot is that they’ve developed a highly modular system, powered by multiple foundation models, that supports prolific experimentation with cached user and product embeddings.

Beyond the core tech, we also discuss Criteo’s new partnership with OpenAI, which, though still in its infancy, they expect will complement ChatGPT's broad world knowledge with accurate, real-time product inventory information.

They tell the story of the company’s European roots, and share their commitment to privacy, their sense that European compliance burdens are overstated, their decision to use the same Euro-compliant tech stack globally, and their passionate belief in the European AI ecosystem and talent pool.

They also explain why they’re confident enough in their moats to publish a lot of their research, and how this helps them attract and retain talent well enough that they’re still comfortable publishing the AI lab’s full 50-person roster to their website.

We trade ideas regarding the role that generative AI will play in expansion of the advertising market and evolution of personalized creative, and they share their admittedly speculative thoughts about how the fundamental value exchange of advertising might change as human time becomes more valuable and AI agents take on more product discovery and research work.  

Overall, I think this episode is both an informative look at how modern AI techniques are being used to make high-value commercial recommendations under extreme constraints, and a useful corrective for those who deny the ways in which cutting-edge advertising enriches modern life.  

With that, I hope you enjoy my conversation with Criteo's Diarmuid Gill and Liva Ralaivola.

Main Episode

[03:32] Nathan Labenz: Diarmuid Gill and Liva Ralaivola, CTO and VP of Research and Head of the AI Lab at Criteo. Welcome to The Cognitive Revolution.

[03:41] Diarmuid Gill: Thank you, Nathan. Pleasure to be here. Hi.

[03:44] Nathan Labenz: Thank you. Excited for this conversation. Also excited to welcome my longtime friend and teammate, Alex Persky-Stern, who's the CEO of Waymark, which we had been building together for a number of years before he took over for me as CEO a few years back. Longtime listeners have heard many asides about Waymark, and Alex is the guy running the show there now. Happy to have Alex here today because Criteo is obviously in the advertising business and bringing a lot of AI to the advertising business in various ways. And Waymark, under Alex's leadership, has partnered with Criteo to provide some creative solutions as part of that whole package and go to market together. Lots to get into. I wanted to start with something that I saw recently that kind of caught my attention, and I want to get your take on it. I'm sure you've seen this: Bernie Sanders sat down across the table from, I believe, Claude, and had a voice conversation. And it was a bit of a strange tone; it felt a little dated to me in some ways. But the subject of the conversation was Bernie asking the AI: what do you think Americans need to know about how their data is being collected, how companies are profiling them, and how that's all being used? And it had a kind of ominous overtone to the whole thing. I think there's probably still a lot of misconceptions or misunderstandings out there about this. But this is a significant part of the business that Criteo is in. So I would love to hear from you guys, as folks who have built it and are doing it today: what do you think Americans need to know about how their data is being collected, how it's being understood, and how it's being used? And what's the upside of that - and maybe the downside as well - but what's the upside of that to businesses and consumers?

[05:31] Diarmuid Gill: Yeah, it's a great question. And it's something that we in the industry need to do probably a better job of explaining and demystifying. For me, I think it all boils down to transparency: explaining to users what data is collected. So, for example, at Criteo, we don't collect any personal information. It's really a random, anonymous ID, and then some things around what products people are interested in - kind of what they've seen, what they like, what they don't like, and so on. And it's all about, for me, a value exchange, right? So relevancy. A system that knows nothing about you is going to show you random stuff that's irrelevant, and the brain has a really great way of filtering out irrelevant stuff. Whereas something that's truly interesting for you is way more engaging, way more resonant. And for a user, that creates a better experience. Also, I think one of the great things is that advertising is very much the lubricant that keeps the internet open and free, right? It allows service providers the ability to keep their services out from behind paywalls. So there's great utility in that, and advertising is what keeps that going - it's the revenue that those content and service providers get that allows them to provide those great services to end users. And by providing transparency, users can actually see what's happening and have the ability to opt out. That's also something that's very important. Criteo was a pioneer with the AdChoices icon, so someone can click and see why they saw this ad, and it gives them the ability to opt out. Once you do that, then I think it provides great value to all the participants.

[07:18] Liva Ralaivola: Try to be like mentalists. We have very few information, very few cues, and we try to detect what is going to be the most relevant for each consumer and each end user so that in this value exchange, everyone's going to be happy down the road. So as a middle party, we have down the road the ones who have the least data and the most challenging tasks in terms of AI. And that's why actually on my part, I'm you know, because the challenge in terms of machine learning AI is really a big one and it's the most interesting one.

[07:55] Nathan Labenz: Could we do kind of a double click on a couple aspects of that? One being like, if I were to open up my file, I'm not even sure if that's quite the right way to think about it. I'd be interested to know like what's in there. And I've occasionally clicked the sort of ad choices thing and seen, oh, you're seeing this ad because you're interested in skin care. And I'm like, OK, my wife got me this one. But I don't really know-- that's kind of a high-level sort of summary statement of why I'm seeing it. I don't know exactly what is under the hood. I'm also kind of confused about-- of course, we go to websites all the time, and we get this pop-up that says, accept cookies, don't accept cookies, what's going on with those cookies? I know there was a big change to the industry, and I think it was kind of driven by Apple a few years ago. Maybe it was driven by other parties as well, where the way in which information is gathered and the nature of that information was kind of changed. I think there were some winners and losers from that. I'm not quite sure how that really shook out or if we're back to essentially the status quo ante before those changes were made effectively. I'd love to hear just a little bit more concrete description of what the data is, and then that obviously feeds into what is the machine learning layer that sits on top of that data look like to make sense of it. Obviously, that data becomes the inputs, but I always like to get down to very brass tacks on what are the inputs and outputs of the models so we can really understand what it is that the AIs are doing for us.

[09:27] Diarmuid Gill: Sure. There's a couple of different ways that I think it works. First of all, as I mentioned, when you arrive on, say, a retailer or a brand website, then, using technology like Criteo's, they can create a record on your computer called a cookie, with a random ID. And the ID doesn't have any personally identifiable information. Then afterwards, as you continue browsing - so if you look at a product, then when you leave that website and go browsing the web, they can know that you've shown an interest in that product, right? And they can show you ads for that product, specifically the same one you've seen. Alternatively, if, for example, you took a look at a mobile phone, that could have you assigned to a group of people who are tech enthusiasts, right? And the type of phone that you look at could be interesting too - iPhone users have a different profile from Android users, and so on. So you could be part of a wider audience that could be seen as Apple enthusiasts or tech enthusiasts. And then when you're browsing the web and you're looking at web.com, internet.com, whatever, an opportunity comes for advertisers to bid - to pay the website owner to show an advertisement in front of you. And based on the information they've got about your previous browsing history, they can then decide whether they want to take this opportunity and show you the same product or equivalent products. And then maybe I'll hand over to Liva to say how you actually do that bid - how you decide whether to show an ad or not.

[11:11] Liva Ralaivola: Yeah, precisely. One of the very important things is being capable of valuing the expected revenue of a placement - knowing that if you are going to place an ad in that placement, there is a high probability of it being clicked on or not. And you have to evaluate that. In order to do that, you're going to use machine learning and AI models that are trained precisely to evaluate whether a placement, and a given product that we can put in it, is going to bring revenue. And for that, we collect all the data that Diarmuid talked about, and there is a huge machinery that we put in place in order to learn from that data. If I had to summarize the type of problem - even though it's a bit more complicated - it's: should we bid or should we not bid on that placement? And we learn a classifier for that. And of course, there's the question you started with - when you click and want to see why you saw an ad, or what information is held about you - this question about utility and explainability. If you want to be very precise in evaluating the value of a placement, then you have to use very sophisticated models - you've probably heard about deep learning models - and the more sophisticated the models are, the less easy it is to understand what they have computed. So there is this trade-off. Now, across the industry, and in particular at Criteo, we use those deep learning models to assess whether a placement is good and whether a product is going to be relevant for you. It means that, in a way, what we have gained in terms of precision and relevancy, we have to make up for in terms of explainability.
And just so you know - because we talked a bit before we started - that's a big topic in scientific AI research: providing explainability for those models that are doing crazy stuff. It's one of the things we are looking at as well, but it's not easy to have both high utility and high explainability.
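To make the "bid or not bid as binary classification" framing concrete, here is a minimal toy sketch - a hand-rolled logistic model, not Criteo's production system, with made-up features, weights, and threshold:

```python
import math

# Illustrative only: frame "should we bid on this placement?" as binary
# classification. A logistic model maps placement/user features to a click
# probability, and we bid only if it clears a threshold.

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def click_probability(features, weights, bias):
    """Predicted probability that showing an ad here yields a click."""
    return sigmoid(sum(f * w for f, w in zip(features, weights)) + bias)

def should_bid(features, weights, bias, threshold=0.02):
    """Bid only when the predicted click probability clears the threshold."""
    return click_probability(features, weights, bias) >= threshold

# Toy example: three hypothetical features (recency, product affinity, context fit).
weights = [1.2, 0.8, 0.5]
bias = -4.0
engaged_user = [1.0, 1.0, 1.0]  # recently browsed similar products
cold_user = [0.0, 0.0, 0.0]     # no signal at all

print(should_bid(engaged_user, weights, bias))  # True
print(should_bid(cold_user, weights, bias))     # False
```

The utility/explainability trade-off Liva mentions starts exactly here: a linear model like this one is easy to inspect (each weight has a meaning), while the deep models that replaced it are more accurate but far harder to interpret.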

[13:39] Alex Persky-Stern: And jump in, I actually have a question I'm interested in here. The idea of the user profile and what this person might be interested in is obviously super, super core. One thing that I think is really interesting is you guys have this OpenAI partnership, which is super cool and I know very new, so some of these answers might not exist yet. But one thing that people talk a lot about, of course, is that the queries are much richer in the context of AI and chat. But something that I haven't heard people talking about is whether the profiles are meaningfully different. Claude or an agent, like it knows a lot about me. Is that starting to change what how we understand the user and where do you see that going?

[14:21] Liva Ralaivola: So many things. The very first thing you said - there are still a lot of things to unpack, to uncover. We are precisely at this stage, because of course there are many questions about the privacy of the data. So far, there's no answer yet. But one thing that I can answer is that those conversational agents provide a new surface. Before, you just had essentially the websites or some apps that you could use, and now you have that. One thing that is very important is that those LLMs are very, very good at general reasoning, so they can do some recommendation, and in some ways you can think they're going to be good at it. If you ask, okay, I would like to buy shoes, they are going to propose shoes that are relevant. But one of the things that we have is commerce data, which tells us exactly what people are interested in. And the big challenge that we have today is precisely to bring both together - the LLM models that are behind all those conversational agents, and all the models that we have at Criteo that are capable of providing very accurate commerce information. The technical challenge is precisely to merge the two. From the LLM side, they're going to have some information that is going to be encoded, but the point is not necessarily to have that information - it's more to see how we can enhance, or those models can enhance, the commerce models that we have built for years. And that's where we sit as of today. Over to you, Diarmuid.

[16:14] Diarmuid Gill: Yeah, I think that's exactly right. So the thing about the LLMs - and it's amazing technology; we're super, super impressed by the power of all of these, I think everyone is - is that when those companies train their model, it is true and accurate at that moment in time. Commerce data is actually way, way more dynamic, right? So, for example, they would not be able to know that there is flash pricing - around Black Friday and so on, things change very rapidly. They also wouldn't know about things like stock-outs. The way they gather their information is by doing this massive crawling of the internet, and then, from the point in time when they've updated their model, it very quickly starts becoming stale, at least from the product point of view. Criteo has this massive network of 17,000 retailers. We ingest their product data on a daily basis, sometimes multiple times a day, and it means that we always have access to fresh data. So like Liva said, we have this hybrid architecture where an LLM, in partnership with technology provided by Criteo, can ensure that when a user asks for a product, they not only get all the richness that an LLM can provide, but it's also up to date and accurate. Because from a user point of view, it's a very bad experience when you search for a product, it comes back with something, you click through, and the product is either a different price, or out of stock, or not what you were thinking about. So that's why that hybrid architecture makes so much sense.

[17:46] Alex Persky-Stern: Yeah, so today, just to make sure I'm getting it, the process is ultimately pretty similar to what you'd have on the open web when you're in the chat interface, maybe with a richer query. AI is being inserted in a whole bunch of other different places in the stack, but that applies really across all surfaces.

[18:01] Diarmuid Gill: So I would say yes and no. In fact, I think where tools like the LLMs have an ability to elevate the whole experience is in the area of product discovery. If you're in the market for a new product and you're in that mode of trying to dig out a lot more, then I think, for the first time ever, we have the ability to provide end users the same experience you get when you go into a store and you've got this really excellent sales assistant who only cares about giving you a good experience, who can answer your questions, who knows the full catalog, who's able to tell you the good and the bad of each product - and who leaves you, when you walk out of the store, feeling like that person has really answered what you were looking for. The LLMs, in coordination with accurate product information, can deliver the same experience: you can actually query, ask extra questions, drill deep down. It doesn't get tired, it doesn't get bored, and it's always giving you real-time, accurate information.

[19:02] Liva Ralaivola: And maybe there is something - it's a bit technical, but it has been changing - which is how you're going to connect the tools that we provide to those LLMs. You've probably heard of the agentic era, the fact that we have MCPs - those protocols that make the use of already-built tools almost transparent. That's something that makes it easier to combine those LLMs with what we provide. It's technical, and maybe nobody cares about that, but in terms of deploying something, it has been made a lot easier - because before, you had to adapt to each surface, to each website, et cetera. Now, with those protocols that are coming up, it has been made easier, so it's just our duty to make sure that we are compliant with those protocols. And that's actually what we do. It allows us to surface all the tools that we've built over the years and make them available to those agents.
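The tool-protocol idea Liva alludes to can be sketched in miniature. This is not the real MCP SDK - just a hypothetical registry (all names and the catalog are made up) showing the shape of the pattern: a provider describes its tools once, and any agent can discover and call them by name:

```python
import json

TOOLS = {}

def tool(name, description):
    """Register a function as a discoverable tool (hypothetical protocol)."""
    def wrap(fn):
        TOOLS[name] = {"description": description, "fn": fn}
        return fn
    return wrap

# Hypothetical fresh product catalog, refreshed from retailer feeds.
CATALOG = {"sku-123": {"name": "Trail Running Shoe", "price": 89.0, "in_stock": True}}

@tool("lookup_product", "Return current price and stock for a SKU.")
def lookup_product(sku):
    return CATALOG.get(sku, {"error": "unknown sku"})

def list_tools():
    """What an agent sees when it asks the server what it can do."""
    return [{"name": n, "description": t["description"]} for n, t in TOOLS.items()]

def call_tool(name, **kwargs):
    """How an agent invokes a tool by name."""
    return TOOLS[name]["fn"](**kwargs)

print(json.dumps(list_tools()))
print(call_tool("lookup_product", sku="sku-123"))
```

The point of a standard protocol is that the provider writes the registry once and every agent surface speaks to it the same way, instead of bespoke integrations per website or app.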

[20:04] Alex Persky-Stern: Yeah, great point.

[20:10] Sequence: Sequence handles the full revenue workflow for complex pricing, from quoting and metering to invoicing, revenue recognition, and collections. Book a public demo at https://sequencehq.com and use code COGNISM in the source field to save 20% off year one.

[21:18] Claude: Claude by Anthropic is an AI collaborator that understands your workflow and helps you tackle research, writing, coding, and organization with deep context. Get started with Claude and explore Claude Pro at https://claude.ai/tcr

Main Episode

[23:10] Nathan Labenz: Can I dig in a little bit more on the core models that you guys are using to make predictions? I'd love to understand the architecture of this better. For calibration, anybody who's listening to this feed is going to have at least a conversational familiarity with how large language models work. We know they're generating a token at a time, we know the inputs get embedded, we know the mechanics of the forward pass and all that stuff, right? And we know it's autoregressive, blah, blah, blah. This strikes me as a very different world, and I don't have nearly as much intuition for what the models are that are driving these things. I do know that they have to be a lot faster, because the ad's got to show up really quickly on the page. And I know also that there's a pretty challenging matching problem in there somewhere, because society collectively has got millions of these profiles of individuals, and then also, as you said, tens of thousands of advertisers. I don't know how much pre-computing is done, but it has to happen pretty quickly on the load of a page. So could we break down: how big are these models? What do the inputs look like? You could imagine something very large with a very sparse set of inputs - like, here's all the websites, and here's which ones this user visited - but that doesn't seem like it works. So there's got to be some sort of tokenization or something that is bringing the user profile into a manageable size so that it can be used as an input. I'm not even sure if I'm quite asking the right questions here. Tell me what this looks like under the hood.

[25:03] Diarmuid Gill: Yes, maybe I can take a quick stab at it and then Liva can take it down into more detail. Liva actually referenced this earlier: every single time we get an opportunity to show an ad, that opportunity actually goes to multiple different ad tech providers, who are all acting as kind of delegates on behalf of the actual advertisers themselves, whether brands or retailers. And the amount that we bid is based on how valuable that opportunity is to the advertiser - effectively, how likely the user is to click on that ad, go back to the website, and buy the product. The way we evaluate that is, through the mechanism we talked about earlier, we see what products the users are interested in - what they've looked at, what they've clicked through, what they've seen, what they buy, what they don't buy, and so on. As the display opportunity comes up, we see the ID we mentioned in the cookie, and then we take a look at all of the different products that person has seen, or whatever audience segments they belong to. And based on all the different features we put into the model - the products, the previous purchase history, the context of the website, the device they're on, a couple of other things that come in; it's probably, I'm not sure, something like 150 different features - each of those goes into this massive calculation, which will tell us the likelihood that person is to click, to click through to the website, and eventually to do a purchase. All of that comes out to a value which we bid. If we win the opportunity, then we have to decide which products to show and how to do all of that kind of stuff. That whole process gets done in milliseconds, because we use a lot of caching: we've trained the models offline, and then the inference happens in real time at really low latency.
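The expected-value logic Diarmuid describes can be sketched as follows. This is a toy illustration, not Criteo's actual formula or margins - the bid reflects how likely the user is to click, then convert, and what a conversion is worth to the advertiser:

```python
def expected_value_bid(p_click, p_conversion_given_click, conversion_value,
                       margin=0.85):
    """Bid a fraction (margin) of the expected advertiser value of an impression."""
    expected_value = p_click * p_conversion_given_click * conversion_value
    return expected_value * margin

# Toy numbers: 2% click probability, 5% conversion after click, $120 order value.
bid = expected_value_bid(0.02, 0.05, 120.0)
print(round(bid, 3))  # 0.102 - about ten cents for this impression
```

In a real system, the probabilities come from models trained offline and evaluated at low latency per impression; the arithmetic itself is the cheap part.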

[27:02] Liva Ralaivola: So one thing that is important, regarding all the data that we have - the visited websites, the products that were shown, clicks, et cetera - and then the model: the question, as I said before, can be reduced to a binary classification problem. One of the main tasks, for anyone who has tried to do some machine learning, is how you're going to encode and represent the data. So I'm going to do that in two steps: first the legacy models that we used to have, and then where we are now and where we've been for a couple of years. Before, there was this question of all the products and websites that you visited: you have to encode them, and encode them so that the vector you use to represent all that past information still carries meaning. If you encode them in a silly way, you lose a lot of information. And before, because of speed of computation, and because you didn't have crazy intuition about the type of model, the choice was a sparse representation: a very huge vector, with 2 to the 12 inputs of 1, 0, 1, 0, 1, 0 - because we can do very fast computations on those sparse vectors. That was the way we used to represent the data. And from that vector we just learned what is called a logistic regression model. It's a linear model - you can think of just one neuron with a lot of inputs coming in, if you have the deep learning analogy in mind. We used to have that, and it was very fast, even though it was sparse - there are many libraries for sparse matrices and sparse vectors. But one thing was very manual: building the features beforehand.
And of course, one of the reasons the Criteo AI Lab was created was to say, okay, maybe it's not sustainable to have to craft new features and to think about how we're going to represent data each time - because the cookies can change, the information that we have can change, and, for instance with LLMs, it is going to change. So how can we proceed with more modern techniques? That was the intent and the goal of the Criteo AI Lab: to bring deep learning. It was created in 2018, and the objective was precisely to go to the next level - not have handcrafted features, but rather have them computed from the data. So where before we had 2-to-the-12, or 2-to-the-20, depending on the encoding, sparse vectors, now, essentially, we have something like between 200 and 1,000 features that are automatically computed by one of our proprietary algorithms, which is called DeepKNN - it computes deep learning features, on top of which we learn other models that do those classification tasks. Maybe the essential thing to understand is that we went from 2-to-the-12 sparse vectors to a couple of hundred dense features. And now we are at the next level again: we're going with transformers, all the things that are available with these models that you can download, trying to be even more adaptable to the data that we process. That's happening as we speak - we're refining all the models that we have. So those are the types of models that we have. And of course, the thing that is very important - and it's a journey - is that everything happens in milliseconds. So the challenge we have is not only to be accurate, but also to be fast. That's a nice challenge to have.
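The two eras Liva contrasts can be illustrated with a toy (this is not Criteo's code, and DeepKNN itself is not shown): the legacy approach hashes categorical events into a huge sparse 0/1 vector for a linear model, while the modern approach replaces it with a few hundred learned dense features. Event strings and weights here are made up:

```python
import math

SPARSE_DIM = 2 ** 12   # the "2 to the 12" hashed feature space
DENSE_DIM = 256        # "a couple hundred" learned dense features, for contrast

def sparse_encode(events):
    """Legacy encoding: return the indices of the 1s in a huge 0/1 vector."""
    return sorted({hash(e) % SPARSE_DIM for e in events})

def sparse_logistic_score(active_indices, weights, bias):
    """Linear (logistic) model on a sparse vector: only active weights matter."""
    z = bias + sum(weights.get(i, 0.0) for i in active_indices)
    return 1.0 / (1.0 + math.exp(-z))

events = ["viewed:phone-x", "viewed:usb-c-cable", "site:retailer-a"]
active = sparse_encode(events)

# Give one active feature a learned weight; everything else stays at 0.
score = sparse_logistic_score(active, {active[0]: 0.9}, bias=-2.0)

print(len(active), "of", SPARSE_DIM, "dimensions are non-zero")
print(round(score, 2))  # 0.25
```

The manual part Liva flags is exactly the `events` strings: someone had to decide which crosses and buckets to hash in. A learned embedding of dimension `DENSE_DIM` removes that hand-crafting by computing the representation from data.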

[31:12] Diarmuid Gill: To be accurate, to be fast, and then also to do it billions of times a day, right? At huge levels of stability and reliability. And the other great challenge within our system is Black Friday and Cyber Monday, the busiest period of the year. Imagine a world where you have to run your piece of technology at 300% of what you normally run. Imagine taking your car and running it for one week at 300 miles an hour, and then back to normal for the rest of the year - and it's the same car, same machinery, and it has to perform exactly the same way. That sounds like a challenge. It's fun - a great challenge - and my hat goes off to the engineers behind it for the work they do.

[32:01] Liva Ralaivola: One thing for the people listening, and for you, is that it's important for us to share how we do machine learning. You can access blogs that explain the DeepKNN methodology and that explain relevance for the retail media business, and we want to go even deeper. We have scientific papers too, because we have researchers doing AI science and publishing in conferences, and everything ties together. If you want to know more, if you want to dig deeper - the sizes of the models, how we train them, what losses we use - we have a bunch of articles online that you can download and read for more information.

[32:54] Alex Persky-Stern: Yeah, I was, of course, reading some of that ahead of this call, and it's super cool to see how you guys have made that both public and pretty easy to understand. You know, I'm not a truly technical person, and it was totally easy to follow.

[33:13] AvePoint: AvePoint is building the control layer for AI agents so you can securely govern, audit, and recover every action at scale. Design trusted agentic outcomes from day one at https://avpt.co/tcr

Main Episode

[34:20] Nathan Labenz: Could we talk a little bit more about, and this probably is to some degree in the papers you've put out, the architecture of the models and what gets pre-computed? I'm thinking back to an earlier episode I did with the woman who leads AI at Stripe. They had a pretty interesting strategy that I imagine you might have some similarities to, where they trained a foundation model for payments. It's a huge model trained on, I don't know, a trillion payments or something, a massive data set that they have. It strikes me that they can be pretty open about it because they have the data; everything's flowing through them, and that's not about to change. So they can afford to be fairly open about techniques. And you guys might be in a similar spot, where having the network is the moat, and so you can afford to tell more of the techniques than maybe other companies could. But one thing I thought was really interesting was that instead of trying to use that foundation model for all the different tasks they have within the company, and there are many, they use the embeddings of a given payment as the input to other models, which may have other inputs as well. In modularizing things this way, they were able both to amortize the cost of this model across a ton of different use cases, without necessarily having to anticipate what those use cases would be, and to give the developers across the company a lot of freedom to say, okay, now I know I have this really rich signal that I can treat as a black box. When they bring that into whatever machine learning task they're working on, they're just seeing dramatically better results across the board, because that signal is so rich.
I hadn't really heard that in too many other places, but I wonder if you guys have a similar structure, where there are models whose outputs, or whose encodings, embeddings, whatever, feed into a bunch of other models.

[36:35] Liva Ralaivola: So many things. First, before, I talked about DeepKNN, and it's precisely one way to embed the data that we process, because embedding is very, very important. For people listening, embedding is a way to transform the data that we process, a product or user timelines or a website, into vectors on which you're going to be able to do computation. And as you say, Nathan, the thing that is important is for those vectors to capture a lot of signal. That's very important. And the use that you mentioned, the fact that you have a way to encode all the data so that it can serve as input to other tasks, is precisely one of the keys to making the most of the data. To answer your question about foundation models: we do have a program at Criteo working on building foundation models. And we are very open about that, because first, it's not easy to build a foundation model, so not every company can say they are doing it. We have the data. And then there is the question of the roadmap, how you're going to do that, whether from the get-go you target big models, or you want models that are going to talk to one another. Our way to approach it is precisely to have several foundation models, not that many, three or four, that compute embeddings on products or user timelines, et cetera. Then we make those embeddings, computed by those foundation models, available to everyone in the company. And it's just starting to land today, because people are using them as a warm start from which to train models. And just so you know, one month ago we had a hackathon, which is always a way for us to try things.

And we made those foundation-model embeddings available, and they were used by many other teams, just to say, okay, now I'm going to learn something, and I do not want to start from scratch. I know that somewhere in the company there are gems that can be used, those vectors that contain a lot of information, and I'm not going to start from scratch. And getting back to what I said before about manually computed features: it's totally automated now, and we do provide that. For the Criteo AI Lab it's a huge project, because the question is being able to refresh those embeddings, being able to version them like software, so that a new version doesn't break everything that is using them. That's a big project. And we know it's the cornerstone that is going to feed all the AI models that we build.
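The warm-start pattern discussed here can be sketched as a downstream model that treats a precomputed embedding as black-box input features. Everything below is an illustrative assumption, not Criteo's or Stripe's code: the embedding lookup is a stand-in, the data is synthetic, and the downstream model is a minimal logistic regression.

```python
import numpy as np

rng = np.random.default_rng(1)

def foundation_embedding(entity_id, dim=64):
    """Stand-in for looking up a precomputed, versioned foundation embedding."""
    return np.random.default_rng(entity_id).normal(size=dim)

# Toy click-prediction data: each row = [frozen embedding | task-specific features].
X = np.stack([np.concatenate([foundation_embedding(i), rng.normal(size=4)])
              for i in range(200)])
true_w = rng.normal(size=X.shape[1])
y = (X @ true_w > 0).astype(float)   # synthetic labels, linearly separable

# Downstream task model: logistic regression trained on top of the embeddings,
# never touching the (frozen) foundation model itself.
w = np.zeros(X.shape[1])
for _ in range(500):
    p = 1 / (1 + np.exp(-X @ w))
    w -= 0.1 * X.T @ (p - y) / len(y)

acc = ((1 / (1 + np.exp(-X @ w)) > 0.5) == y).mean()
print(round(acc, 2))
```

The design point is the interface: the downstream team never retrains the foundation model, it only consumes its vectors, which is what lets one expensive model amortize across many tasks.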

[39:39] Diarmuid Gill: Yeah, maybe just to build on that. As part of the technology group, with Liva and his team in the AI Lab, one of the things we empower is that feature and product innovation across all of Criteo. Historically, Criteo made its first success in what we call retargeting, so lower funnel. But we're expanding our feature set. We're building new products in what we call the mid-funnel, so customer acquisition, product discovery, multi-channel, so on the open web, on social, CTV, and even in the LLMs. And the ability to go from zero to performance, to deliver high return on ad spend, or ROAS as the advertisers call it, anything we can do to accelerate that process matters. So the warm start, as Liva said: with those foundation models, instead of starting from zero and building everything from scratch, you're already halfway there from a performance point of view. That's why these models are so exciting for us.

[40:42] Liva Ralaivola: And maybe one thing, quickly, on how we can use them, and this is really the basics. Imagine you have products and you embed them with those foundation models. They carry a lot of signal: you know they were bought, et cetera. You do not know exactly how it was processed, but you get a representation, a vector. Then, for instance, if you want to recommend another product, and the embeddings are well-defined, it's very easy to find a similar product to recommend, just by looking at the similarity of the different vectors. But another thing you can do: say you have information about users, each of whom has seen this product, this other product, this other product, and you have that for many users. Given that you have a representation for one product, and you also have representations for those users, you can look at all the users closest to that product, and that builds you an audience, the people you can target, just from similarities between the vectors computed for products and the ones computed for users. And conversely, you have a user, encoded, let's say, by all the products he or she saw, and you can look at all the products around that user and recommend those. So that's recommendation. It's very powerful to have those representations, as long as they are semantically meaningful. And that's what we work at.
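Both uses described here, item-to-item recommendation and audience building, reduce to nearest-neighbor search once products and users share an embedding space. A minimal sketch with random vectors (shapes and names are illustrative assumptions, not Criteo's system):

```python
import numpy as np

rng = np.random.default_rng(42)
product_emb = rng.normal(size=(1000, 128))   # 1,000 products, 128-dim embeddings
user_emb = rng.normal(size=(5000, 128))      # 5,000 user-timeline embeddings

def normalize(m):
    """L2-normalize rows so dot product equals cosine similarity."""
    return m / np.linalg.norm(m, axis=1, keepdims=True)

P, U = normalize(product_emb), normalize(user_emb)

def similar_products(product_idx, k=5):
    """Item-to-item: the k products closest to a given product."""
    sims = P @ P[product_idx]
    return np.argsort(-sims)[1:k + 1]        # skip index 0, the product itself

def build_audience(product_idx, k=100):
    """Audience building: the k users whose vectors sit closest to the product."""
    sims = U @ P[product_idx]
    return np.argsort(-sims)[:k]

recs = similar_products(7)
audience = build_audience(7)
print(len(recs), len(audience))
```

At production scale the brute-force matrix products here would be replaced by an approximate nearest-neighbor index, but the geometry of the idea is the same.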

[42:32] Nathan Labenz: So everything for the most part, it sounds like is kind of pre-computed, I mean, there's an interesting moment where that most recent action of the user has got to be critical. So presumably, there's some marginal compute there that has to happen. But if I'm understanding correctly, you have sort of a base user profile and base product encodings. And those are aligned such that at runtime, it's an inner product of those two vectors, so that basically everything is done except that comparison. But I guess there is probably some last second update of the user profile as well based on where they are right now.

[43:11] Liva Ralaivola: Yeah, yeah, totally, totally. And that's actually a challenge: how you recompute those embeddings live, at runtime. It's an engineering and technical problem to do that in a very precise way. It means having other versions of the big models that can compute much faster, without hindering the way the similarities are going to be computed. That's precisely where we put a lot of effort. You just nailed it. That's the question. Making this small adjustment, where you have to take into account the latest information, where you cannot rely only on offline computation but have to do it online, is the key to what we do.
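One simple way to picture the offline/online split just described: heavy model inference stays offline, and the request-time path is only an inner product plus a cheap update of the user vector from the latest event. This is a sketch under assumptions (the exponential-moving-average update and all names are illustrative, not Criteo's actual method):

```python
import numpy as np

DIM = 64
rng = np.random.default_rng(3)
product_vectors = rng.normal(size=(100, DIM))   # computed offline in batch

class OnlineUserProfile:
    def __init__(self, base_vector, decay=0.3):
        self.vec = base_vector.copy()   # offline-computed base profile
        self.decay = decay              # low decay = recent events dominate

    def observe(self, event_vector):
        """Millisecond-cheap online update: moving average, no model call."""
        self.vec = self.decay * self.vec + (1 - self.decay) * event_vector

    def score_all(self):
        """Request-time scoring is a single matrix-vector product."""
        return product_vectors @ self.vec

user = OnlineUserProfile(rng.normal(size=DIM))
user.observe(product_vectors[42])   # the user just viewed product 42
best = int(np.argmax(user.score_all()))
```

After observing the view, the freshly seen product dominates the score, which is the "last-second adjustment" effect: almost everything was precomputed, yet the latest action still steers the result.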

[43:57] Diarmuid Gill: And we've done a lot of experimentation with that over the years, trying to find the right balance. If you have the perfect amount of time, you can come up with the perfect result. But sometimes the fast result is good enough in terms of the time constraints you have, especially when you talk about the constraints of real-time bidding, where you have to answer in milliseconds. So it's the best possible answer within the time constraints that you have.

[44:20] Nathan Labenz: Yeah, that's really interesting. I mean, the architecture of this is-- It does have a lot in common with the Stripe system. I guess it makes sense because they have a lot of similar constraints in terms of they've got to respond to should you approve this transaction or not in an incredibly short time.

[44:40] Diarmuid Gill: Yeah, it's a great analogy.

[44:43] Nathan Labenz: How about on the agentic side? I don't know if agentic is even the right word yet, but in an OpenAI context, or any sort of chat context, we've got this additional signal of the context of the conversation the user is having. Does that also get treated essentially the same way? Do you get raw text and embed it yourselves, or do they send you over an embedded form of the text, so it becomes another one of these already-embedded inputs that goes into the decision-making process?

[45:17] Liva Ralaivola: Okay, I can take it. So essentially, at Criteo we started to look at those conversational agents two or three years ago; we were interested in knowing what was happening. I'm not going to talk about the partnerships, because it depends and it's still building, et cetera. But in the way we think, and that's my job in the Criteo AI Lab, we are envisioning all the different scenarios. One scenario is that we have the conversation and can work on that. For instance, two years ago there was a person at Criteo who built a kind of conversational agent on Slack, using the messages that were there about how to use account strategies, to try to answer questions from clients. He built from scratch an agent and a model trying to understand and recommend answers to troubleshooting problems. So it was not recommending a product, it was recommending a solution to a problem, but it's the same thing. And we also tried something where we had some kind of summary of the conversation, just a vector, to see whether it was a signal that would help us build another model and whether we could learn from that. It's, of course, less powerful than having the conversation, but that's something we're trying. And in terms of what I do in the Criteo AI Lab, we're trying to de-risk what we envision is going to come up. So far, in terms of the partnerships, who we should work with, and I'm going to let Diarmuid talk about that, we're still investigating the right way and the right data exchange.

[47:13] Diarmuid Gill: So it's still early days, right? We're very, very happy with the partnership we have with OpenAI. They're a really great team to work with. We're both very much privacy-driven, very much about user consent, and so on. It's a core principle of both companies, and we're very respectful of that, and OpenAI are exactly like that too. So we only want the information we need to be able to show an advertisement at a given time. And it's not going to be all the time, because it doesn't make sense in every context. That's something we'll continue to work on as we build out this network together.

[47:57] Liva Ralaivola: And maybe just a last word on that. We have a team, a bunch of people, working on what is called trustworthy machine learning. And I think it has never been as important as it is today, precisely because the type and volume of information that people are ready to share is massive. And that's not just a research question anymore; it's very practical. Trustworthiness is fighting against hallucination, being sure that what we do is privacy-safe, being sure that we only recommend when it's appropriate for the people we're talking with. So that's something we're looking at. Maybe we are a bit more on the upstream side of things, but we know that at some point something very strong is coming, either in terms of regulation or otherwise. We prefer to be on the safe side, and we prepare our tools to be able to answer at the right time.

[49:11] Diarmuid Gill: And I totally agree. The fact that we're based in Europe, where with things like GDPR there's a lot of sensitivity around these issues, means we've always had that as a core principle as we build out these products. For advertising to be truly useful, it has to be trustworthy. We don't want to do anything that's creepy, because if it's creepy, it won't work. It has to provide real utility: the user sees it and goes, okay, that's interesting, that's engaging. If that's the case, they're more likely to click on it, and if they're more likely to click on it, it means we're a better value provider for the advertisers.

[49:49] Nathan Labenz: Can you tell us a little bit more about that? I think a lot of people are probably surprised, almost an hour into the conversation, to hear that the company has roots in Europe and is headquartered in Europe. I tend to think of ad tech as a mostly American phenomenon. But how different is it really, trying to operate in Europe versus the United States? Are there things that you actually do differently across jurisdictions, based on restrictions that may exist in Europe, or is it the same approach globally? I think a lot of people's first-order summary would be that you can't do AI in Europe. So if you think that's wrong, disabuse us of that.

[50:36] Diarmuid Gill: So I would say that is fundamentally wrong. It's false, because, so, I'm from Ireland and I moved over to France eleven and a half years ago. And the team here is just amazing. The AI Lab that Liva runs, the quality of the data scientists we have here, is just off the charts. They're so, so talented, so dedicated to what they do. Where being European-born definitely helped us was growing up in an environment where we had to be very, very careful; it was a first principle for us in terms of how we handled user data. We've got this implied contract with the end users. In terms of how we operate, the U.S. obviously is one of our largest markets, and we operate in many territories. Everywhere we work, we're very respectful of local regulations around data and what you can and can't do. And we always go the extra mile; we're very careful to make sure that we're fully compliant, and then some, with respect to regulations, because it's super, super important. And we're a very global company. We have offices in Ann Arbor, in Michigan, in Toronto, and across other locations in Europe. So from a development and AI point of view, we can really take in talent from all across the globe. Liva, maybe you could talk a bit about AI in Europe as a French person.

[52:07] Liva Ralaivola: Yeah, one thing I can say is that here in France there is a school of mathematics and computer science where people are very talented. They are really well trained, and not just in AI, in general. They are very good people. And having very good, well-trained people is kind of enough to innovate; there's no question about France not innovating. You hear about Criteo, but you've also heard about Mistral, about many companies coming from France. And one of the things that is very important in Europe, well, maybe I'll just talk about France. For part of my life I was a professor at a university, so I know the students as well. The students are very good; they learn with enthusiasm. And maybe it's not necessarily seen as the best thing, but in France, mathematics and computer science, the things that are very formal, are very important. People tend to forget that, but the roots of AI, the reason you can craft and build models meant to answer specific questions, rely on the ability to formalize, to model, and to find the right technical tools, whether in computer science, in engineering, or in the more mathematical side of things. And that's just what is important for innovation. We have that in France. I can't speak for the rest of Europe, but I do know that in France we have that.

[53:56] Diarmuid Gill: One thing that struck me when I moved here: if you think about it, a lot of ancient mathematics came from Greece, but modern mathematics, when you study it, and I did engineering in university, you've got all these names: Laplace, Lagrange, Fermat, Galois, which underpin the whole of modern mathematics, which is the basis for machine learning and AI. So France has really great standing in that area, which companies like Criteo have massively benefited from.

[54:31] Nathan Labenz: I was quite impressed, and surprised actually, in looking up the Criteo AI Lab, first of all just by how many people are on the team. I think I saw like 50 faces on the website. And then specifically, I was very surprised to see all the faces on the website, because I was thinking, geez, I don't see too many AI companies in the US doing that. I think they're all afraid that if they put their names and faces on the website, Zuckerberg is going to come calling and the whole thing goes sideways because they all get offers they can't refuse. Not that I want to complicate your lives, but how are you thinking about that? How are you able to build and retain a team like this in the age of the Zuckerberg blank check?

[55:19] Diarmuid Gill: So I'll go first, and Liva, you can layer on top. I'm obviously biased, but the culture within Criteo is amazing. I travel over an hour every day to come into the office, because there are so many cool people who are so good at what they do, and there's something really engaging about that. If you can give these people really interesting problems to work on, in an environment where they're surrounded by like-minded people, that's super engaging. And the fact that a lot of the people here have a long tenure, have been working here for quite a while, creates an ecosystem that is very engaging. So even when we do have competition from the others, we often welcome it, because it keeps us on our toes and makes sure that we in the leadership continue to make Criteo a really great place to build your career. We've been doing a great job of it for over 20 years now, and that's what we want to continue doing long into the future. Liva, maybe you can add to that.

[56:20] Liva Ralaivola: We have research scientists, and one of the things our AI research scientists do is publish. And one of the things that is true across all research scientists, I think, even in other companies, is that they have to make their research reproducible; it has to be open, et cetera. That's how research and science work. So you have to have a presence on the internet if you are a research scientist. On the academic side of things, the people who are called scientists usually have their own websites, with their faces, and that's what we want them to have here at Criteo as well: to do research as other scientists do, from universities and from other companies. So that's the reason why. And then there is the thing that Diarmuid shared: it's up to us to find challenging problems, challenging topics to work on, so they can say, okay, it's good to be here trying to do science at Criteo, because the problems are not easy, and it allows them to connect something that might be very upstream to something that can also be deployed, maybe not today, but in one or two years. And that's very important.

[57:58] Diarmuid Gill: And maybe one other thing, just to build on that: Liva's team also works very closely with a lot of the academic institutions in France and across Europe. We sponsor PhD students; they come in, they work alongside the existing research team, and they work on real-world projects. So it gives them real concrete experience. And many of those PhD students actually become full-time employees. Not all do, but that's okay too. They publish their research, which is good for their careers. This is a great way to keep that pipeline going, and it keeps us very well connected with the wider AI ecosystem as well.

[58:35] Nathan Labenz: Certainly the ability to publish one's work in today's world is a differentiated part of the offer. I've got one episode that was recorded three-plus months ago now, with somebody at one of the, let's say, frontier companies, that we just cannot get approved. And it's great stuff, brilliant work. We'll see if it ever sees the light of day. Maybe just one more beat on the regulatory environments. And I would separate here following the rules, which you've clearly stated you're committed to doing, from advising on what the rules should be. First of all, do you think there is a meaningful difference between the environments in Europe and the United States when it comes to an individual's rights? That's one way to think about it. But what I really am trying to get at is: who has it better? Do people in Europe actually have meaningfully better protection that I should envy? Or do I get meaningfully better ads that people in Europe should envy? Or is this all much ado about nothing? Who should be changing their rules, on what margins, to better serve their citizens? Do you have a point of view on that?

[59:52] Diarmuid Gill: So we actually engage quite a lot with the authorities in the EU, with the different data protection offices, the DPOs, in different parts of Europe, as well as in the US. And we're very much advocating on behalf of the end user: it's about transparency, it's about user consent, ensuring that users have a way to opt in and opt out. That's why we had the cookie consent message long before it was even a regulatory requirement. And we also push to ensure that we live in a system where there's a fair value exchange, so that users feel there's proper use of their data and that they get value back through free content, free services, and so on. In a world where all of that disappears, you take away all the value for the advertisers. An advertiser who's spending money to try to increase their sales isn't getting that value back; if their products are being shown to people who have no interest, that does nothing for their business, and they're not going to spend. And if they don't spend, then the people who provide that rich stream of services, websites, and content have to monetize some other way, so they put up paywalls. That's not in the interest of the end user. So as long as there's fair value exchange, and as long as we're transparent about what we do, I believe that's really important. Back to the question about the US versus the EU: to a large degree, a lot of that has been equalized. In California, there have been a lot of changes through the CCPA and CPRA and so on, and the two systems are very much inspired by each other. We try to build a global solution.
So what we've done in Europe, we use the same approach globally. It's not like we try to be looser elsewhere. We really believe that by having that principle, and by being born in Europe, we have a solution that can work pretty much everywhere.

[1:01:58] Nathan Labenz: Cool. That's really interesting. Let's talk about creative a little bit. I'll invite you to help lead this part of the conversation, but it is interesting that we've made it this far and not really talked about creative. Obviously, there are a lot of different formats. There are a couple of ways, at least, I would want to come at this. One is: I think the number I saw was that Criteo has 17,000 advertisers. In our experience at Waymark, we've often seen lack of creative as one of the biggest barriers to new advertisers signing up with a platform like Criteo, or for that matter Meta or Google or what have you. I wonder if you see it similarly: is that a core barrier to market expansion? And then, obviously, we're in this moment where the cost of creative of some quality is dropping precipitously. And I don't know if there is a dynamic layer to the creative. When we're doing these kinds of matching, auction, and prediction things today, is creative an input to that decision-making? Do advertisers have multiple creatives that you're scoring or embedding and using to drive outcomes? Or is that still a frontier, perhaps because they don't have enough options, or for some other reason? It seems like in the future we imagine everything's going to be highly personalized; the ads are going to be much more talking to us as individuals. It seems like you have the infrastructure to do that, but maybe the creative is just not there. What do you think is the future of creative?

[1:03:43] Diarmuid Gill: Yeah, I think that's one of the most exciting areas in the whole generative AI space. The things you can create with these next-generation models are just insane. And I think they have the possibility to create even more engaging advertisements. You talked, Nathan, about the hyper-personalized; I think that's super powerful. People will see it and go, wow, okay, that's exactly the product I'm looking for. It makes it much more interesting and engaging for them. The other part, and you touched on the point there, is that with this technology we have the ability to democratize content creation. Before, the mid to long tail would have been very much cut out of the picture, because creating that high-quality content was out of their reach and out of their means; that is now becoming more and more accessible. Criteo recently launched a self-service product called Criteo Gold, which really makes it easier and easier for those advertisers, who are often much smaller, to create very engaging campaigns with really great creatives that help drive and grow their business. That's super important. And it's really enabled by platforms like Waymark, who's a great partner of ours.

[1:05:04] Alex Persky-Stern: Thanks for the shout out. We love it.

[1:05:07] Liva Ralaivola: Yeah, on creative, two things. First, there's something we have been doing for years at Criteo: dynamic creative, with templates and visual assets that we compose online. It's like you have Legos, and the system crafts the ad for you from them. In terms of generating those ads with generative AI, we're not at the level, in terms of speed. There is no way, unless you're ready to wait five seconds for the page to load. We're not there. So again, we have to find a way to balance things between what is done offline and what is done online. The Legos I talked about, the visual assets, can be generated, either by us or by our partners, with generative AI, and then it's up to us to arrange them at runtime, using an engine that is capable of composing them online. Maybe at some point, one day, it's going to be very quick, and maybe what's delivered is not going to be computed by us but by your TV or your mobile phone, online, very quickly. But we're not there yet. You've probably tried to use some of those tools to build an image; it's far from instantaneous. At some point, maybe in two or three years, we're going to be able to have that.
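The "Legos" split described here, assets generated offline, composition done at request time, can be sketched in a few lines. The asset names, template fields, and URLs below are purely illustrative assumptions, not Criteo's actual dynamic creative system:

```python
# Offline step (slow, possibly generative-AI-assisted): build an asset library.
ASSET_LIBRARY = {
    "headline": {"sale": "Summer Sale", "new": "Just Arrived"},
    "cta": {"shop": "Shop now", "learn": "Learn more"},
}

def assemble_creative(product, variant):
    """Online step: millisecond-cheap assembly of pre-built assets, no generation."""
    return {
        "image_url": product["image_url"],                   # pre-rendered offline
        "headline": ASSET_LIBRARY["headline"][variant["headline"]],
        "cta": ASSET_LIBRARY["cta"][variant["cta"]],
        "price": f"${product['price']:.2f}",
    }

ad = assemble_creative(
    {"image_url": "https://example.com/shoe.png", "price": 59.9},
    {"headline": "sale", "cta": "shop"},
)
print(ad["headline"], "-", ad["cta"])
```

The runtime path only does lookups and string formatting, which is what keeps it compatible with page-load latency; the expensive generative work happens ahead of time, per asset rather than per impression.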

[1:06:51] Alex Persky-Stern: Yeah. Do you have a perspective? I'm interested in the level of personalization that you see the infrastructure supporting. There's of course the level of the user experience that you've talked about a lot, which is super important; you don't want to be super creepy. But when you think about the level of personalization that's at least possible infrastructurally, do you think it's going to be literally at the individual level, where we could tell you a story that is specific to you? Or do you imagine it being more at the audience and context level? Where do you see that living?

[1:07:22] Diarmuid Gill: For me, I think you're right. I think it'll be mostly at the audience level, because if you go way too hyper-localized, I'm not sure there's huge utility for the end user, or for the brand either. Brands and advertisers will still want to retain levels of control over the look and feel and how their brand shows up. So within those parameters, I think it's mostly on an audience basis; I can see it working at that level. Going all the way down to hyper-personalized, where every single person sees a different thing, I'm not sure there's huge utility there from the point of view of driving more sales or getting more product in front of end users.

[1:08:09] Liva Ralaivola: Something I would actually like to see at some point is these very efficient models at sizes that allow them to be embedded in devices, in glasses. If we arrive at that level, then maybe the computation gets shared. We would provide something at our level, and then the personal device, with data that is never seen by us, would tweak things at the very end. The device could say: this was proposed by, I don't know, them, but I know you; I have something on this device that is really personalized, and for privacy's sake it stays on the device. We would record something, but just on the device. So that's my dream.

[1:09:04] Alex Persky-Stern: Yeah, I love that. Super interesting.

[1:09:06] Liva Ralaivola: Maybe there's going to be a split, something shared in terms of computation, and maybe the personalization is not going to happen on our side, but on the end user's side.
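The split Liva imagines, coarse audience-level candidates from the server with final personalization on the device, might look something like this toy sketch. All names, data, and scores are hypothetical; nothing here reflects Criteo's real system:

```python
# Server side ("us"): returns a coarse, audience-level candidate list.
# No individual-level data is used at this stage.
def server_candidates(audience: str) -> list[dict]:
    catalog = {
        "sports": [
            {"id": "ad1", "tags": {"running"}, "score": 0.6},
            {"id": "ad2", "tags": {"cycling"}, "score": 0.5},
        ],
    }
    return catalog.get(audience, [])

# Device side: local interaction history never leaves the device; it only
# re-ranks what the server already proposed.
def rerank_on_device(candidates: list[dict], local_tag_counts: dict) -> list[dict]:
    def affinity(ad: dict) -> int:
        # How often this user interacted locally with the ad's topics.
        return sum(local_tag_counts.get(t, 0) for t in ad["tags"])
    # Local affinity dominates; the server's coarse score breaks ties.
    return sorted(candidates, key=lambda ad: (affinity(ad), ad["score"]), reverse=True)
```

For example, a user whose device has privately logged five cycling interactions would see the cycling ad promoted, while the server only ever observed the audience-level request.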

[1:09:19] Diarmuid Gill: Yeah, I would agree with that. Where I could see this going is where the action is actually triggered by the end user, the consumer. A good use case: okay, I get invited to a wedding in the South of France in the summer, and I want to know what looks good. So I can ask the AI to recommend some outfits, and then see a virtual try-on, stuff like that. Another usual example is wanting to see how something would look in your home. That's very much engaged by the user; they're kicking it off themselves, and because they've done that, it's not creepy. It's their action that initiated this. And that, for me, is where hyper-personalization really comes in and where it really makes sense. Going back to what I talked about earlier, it's the utility: it's trustworthy, it's useful, and it's providing value. It could give a far better experience for all involved.

[1:10:17] Alex Persky-Stern: Yeah, that's super interesting. And it seems like it fits with what we were talking about earlier too: in a chat or agent context, we're actually getting way more comfortable with what would be creepy in any other context. Letting it do that personalization for you is super, super interesting.

[1:10:31] Diarmuid Gill: Once there's a user who's asking for it, then I think that's perfectly cool.

[1:10:35] Alex Persky-Stern: Yeah, yeah. Cool, super interesting. One other thing I'm interested in, along similar lines: not a different angle on personalization so much as contextualization. I know CTV isn't your historical bread and butter, but it's a growing concern for you. And I think one of the best-documented performance gains is when the creative of the advertising matches the creative of the context. I'm sure that's also true in some other contexts, but it's really true on TV. That's another place where you can imagine lots and lots of different variations that match the tone and style of not only the genre or the movie, but the very specific moment in the movie. Is that something you've started to think about and bake into the way you run your models?

[1:11:21] Diarmuid Gill: Yeah, I think for me, what's super important about advertising is that it should not be intrusive to the user experience. It has to feel seamless, not like something that's getting in the way of the user's experience of the content. And that applies to websites as much as it does to connected TV. We've all seen those demos where advertisements get inserted into the content, whether it's sports or your favorite sitcom, and I don't know if we're quite ready for something like that yet. You have to have something where the user feels it's relevant and it's not interfering with their experience of the content. With video and connected TV, there's pre-roll, mid-roll, and post-roll, which are different ways of experiencing it. As long as it's not intrusive, I think users will be open. That's the guiding principle there.

[1:12:19] Alex Persky-Stern: Oh, makes sense. I did have one other question I'm interested in. I think this is a really formative moment right now: for the first time ever, creative can really be technically powered, where it has always had a very human-in-the-loop requirement. How are you thinking about, as a technical organization, what your role in creative is, not only right now, but over the next few years?

[1:12:45] Liva Ralaivola: I think that, as of today, the way we approach those creative models, to be sure of the quality of what we provide, is to not build our own, because they're very costly, very expensive to train. But of course it's going to be key. It's not my own turf, but we have a team dedicated to that, on a program, not a one-year program but a two- or three-year program, to integrate those creative capabilities. They used to be in the Creative Lab, but they're not anymore. So maybe I'll stop on that one.

[1:13:37] Diarmuid Gill: Yeah. So for me, ultimately there are a few different constraints. For example, brand guidelines: the brands themselves, the original advertiser or the retailer, have their own look and feel and the way they want these things to show up. Ultimately, we want to build advertising that works, advertising that brings value for the advertisers so they feel their money is well spent with us. Within those parameters, we try many different things. Liva talked about the way we did dynamic creative optimization before, which was really one of the great ingredients of our initial success. You layer the generative AI capabilities on top of that, and the possibilities are endless. The potential is just amazing.

[1:14:24] Liva Ralaivola: Yeah, maybe just a word on a phrase you used, Alex, which is true for everything going on with AI today: human in the loop. That's the key, actually, not just for creative but for everything we've been talking about. The big challenge is making sure you put the human at the right point. That's very, very important. You should not strip the user of their right to decide. If you automate everything, it's not going to work; it becomes a problem in terms of liability, responsibility, et cetera, and at the same time you no longer feel, as a human being, allowed to make any decisions. So the question we ask every day when we build a model is: where is the human? How do we train the model so that the human still has a place? We have to spot the right place, and missing it could be very detrimental to the project. That's true for creative, for the bidding models, for the agent stuff, et cetera. It's the big thing that matters, not just for us but for everyone building new systems: it's the human. That's why those conversational agents work so well; they still rely heavily on human interaction. So that's very important.

[1:15:50] Nathan Labenz: Well, here's a small one, but a curious one, and it's kind of relevant to our business, so it's top of mind for that reason, but also in general. We've been talking about all these different touch points. Where are we today on cross-device understanding of who somebody is? I could imagine that being very algorithmic and deterministic, but it occurs to me that maybe that's another AI question in today's world: who is one person and who is a different person, when some devices are shared, some are on mobile networks, and all the complication there? That was supposed to be the short one, by the way, so I'm just prefacing it that way. I can run a minute long, but I don't want to keep you guys long if you have something next.

[1:16:34] Diarmuid Gill: Yeah, so there are a few different aspects to it. Again, it goes back to user consent: the user needs to be okay with this, because there's shared Wi-Fi at home and all that kind of stuff, and you want to make sure you're showing the advertisement to the right household member. So there are different signals: there's the Wi-Fi network you're connected to, and there's being logged in on multiple devices, for example on social platforms. Those can be used deterministically to say, yes, it is the same person on multiple devices. The probabilistic approach can be used for certain things, but only in a very, very untargeted way, because if somebody has opted out of advertising, you don't want to re-show them advertising through a different route. That's something we always keep front of mind.
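Diarmuid's point, deterministic linking plus opt-out suppression across every linked device, can be illustrated with a small union-find over devices. This is a hypothetical sketch, not how any real identity graph is implemented:

```python
class IdentityGraph:
    """Toy deterministic cross-device graph. Devices are linked only by
    explicit signals (e.g. the same login seen on both), and an opt-out on
    any linked device suppresses targeting on all of them."""

    def __init__(self):
        self.parent: dict[str, str] = {}  # union-find parent pointers
        self.opted_out: set[str] = set()

    def _find(self, d: str) -> str:
        self.parent.setdefault(d, d)
        while self.parent[d] != d:
            # Path halving keeps lookups near-constant time.
            self.parent[d] = self.parent[self.parent[d]]
            d = self.parent[d]
        return d

    def link(self, a: str, b: str) -> None:
        """Record a deterministic signal tying devices a and b to one person."""
        self.parent[self._find(a)] = self._find(b)

    def opt_out(self, device: str) -> None:
        """The user opted out of targeted advertising on this device."""
        self.opted_out.add(device)

    def can_target(self, device: str) -> bool:
        """False if this device, or any device linked to it, has opted out."""
        root = self._find(device)
        return all(self._find(d) != root for d in self.opted_out)
```

The key property is the last method: once two devices are deterministically tied to one person, an opt-out recorded on either one blocks targeting on both, which is exactly the "don't re-show them advertising through a different route" constraint.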

[1:17:26] Nathan Labenz: Cool, all right, last big one for me, zooming out as far as possible. In general, with this whole AI wave, there's dramatic uncertainty about what's going to happen: how powerful the AIs are going to get, how disruptive it's going to be, whether we need a whole new social contract, et cetera. Specifically in the advertising space, I feel like I see two trends that counter each other. One is that if all goes well, we should all be a lot richer and the value of time should go up. That would be really good for the advertising market, in the sense that you're fundamentally competing for some little share of people's time, and the richer people are, the more valuable their time is and the more that's going to cost. So that's an advantage, presumably. But the other thing people see is that search costs and matching costs could really drop, if we all have our AIs going out and vetting for us much broader portions of the world than we previously could. Today, I can only evaluate so many shoes, but maybe in the AI era I could have an AI that goes out and comprehensively evaluates every possible shoe. Then my product decision-making is less about who was willing to bid on my time and more about how much time I was willing to have my AI invest on my behalf to figure out what to do. That's more of a prompt, I guess, than a question. What do you think of those trends? Are there other big trends you see as huge factors? And where do you think advertising is in, say, five years' time, or the singularity, whichever comes first?

[1:19:09] Diarmuid Gill: I'm not sure about the singularity. I have my doubts about that one. For me, where I see this new generation of AI really helping to increase the value to the end user, and you mentioned time, Nathan, so maybe value of time as well, is for that product you're looking for. If you have a very specific thing in mind and you don't know where to get it, you're able to query these LLMs, or whatever interface LLMs become, paired with really rich commerce data. You say: here are my criteria, I don't care about the price, I just want the best product possible, and you explain what it is you want. Connecting all of these systems has the possibility of giving a really great result, and when you get that great result, you're quite happy to buy it. That's the kind of utility: helping people discover not only the product they're looking for, but what goes with it, how they can enhance it, and so on. When you see the way these systems work today, where they're going, and how fast they're improving, I think that's incredibly exciting, and it can provide a really great consumer experience. For the retailers, being able to engage in that allows them to get their products in front of end users in a way they couldn't before. A lot of advertising before was guesswork; now it becomes really focused, and you're making sure the end user gets all the information they need to buy that product, whether it's the shoes you're looking for or that new piece of tech you want to invest in. It really gives a very enhanced way for end users to discover products.

[1:20:56] Liva Ralaivola: Yeah, first on the prompt, because it's very deep. One of the things today is that people are exposed to advertising. You might make the choice not to be exposed, but in a way you are. I think that with the advances in AI, with those companions or assistants you're going to have on all these LLM platforms, maybe at some point you're going to want advertising. The exchange is going to change a bit, because advertising is also about accessing or discovering the right product. Maybe things will happen behind the scenes. You turn a knob and say to your assistant: okay, I want to look at shoes; I want you, as my assistant, to look at 10 or 100 different shoes and select six of them; I want to be exposed to those six, and to their advertising, and I want to choose from that. So there might be a new actor, a new intermediary, in the advertising business, and people might want to be exposed, again so they have the choice as a human, because having the choice of different shoes means having advertising in front of you. Maybe they'll say: I want a trip to Greece; show me five different trips, not more, not less, and then I'll choose. So maybe that changes the way advertising is experienced, just because of the quality of the system and its ability to filter and select products. You said five years; maybe in five years we'll be in that situation where we say, okay, show me some advertising.

[1:23:26] Nathan Labenz: Yeah.

[1:23:26] Alex Persky-Stern: It's like the collapse of search and advertising. If the advertising is good enough, it might be better than search.

[1:23:32] Liva Ralaivola: Yeah, yeah, exactly.

[1:23:35] Alex Persky-Stern: Interesting.

[1:23:36] Nathan Labenz: Cool. Well, that's a great note to end on. Alex Persky-Stern, Diarmuid Gill, and Liva Ralaivola, thank you for being part of the Cognitive Revolution.

[1:23:45] Alex Persky-Stern: Great to be here. Thanks, Nathan. Bye.

[1:23:48] Diarmuid Gill: Thank you, Nathan. Bye.

Episode Outro

[1:23:51] Paris in the pipes, signal in the static. Weak use, big math, we move automatic.

[1:24:07] Mentalists in the middle, reading smoke and mirrors / No name, no face, just a cookie getting clearer / 17,000 shelves, fresh catalog, a river / LLM knows the world but the price we deliver / Laplace, Lagrange, Fermat and Galois / Modern math on modern drums, ooh la la, voilà / Sparse, two to the twelfth / Now or then it's hot start, depending and begin / Know the shape of your car.

[1:24:38] Billions of bits a day, milliseconds on the clock / Three hundred miles an hour, same car, never stop / We the mentalists in the middle.

[1:24:48] Through the riddles, best answer in the time we got / Accurate, fast, billions a shot / Cognitive revolution on the wire / French touch on the flames, stoke the fire / If it's creepy, it won't work, keep it clean / Human in the loop in between / Ask the agent for a wedding fit, South of France in June / Virtual try-on spinning like a Daft Punk tune / Not hyper-local, audience, but sharper than before / Mid-funnel, open web, CTV, and more.

[1:25:17] Trustworthy machinery, privacy on deck / GDPR in the DNA, not a line we checked / Ann Arbor to Toronto, Paris to the Bay / All the charts, the scientists, that's how we play / Show me four different trips, let the agent decide / Behind the scenes on your device, personalized inside / We the mentalists in the middle, reading signal through the riddles / Best answer in the time we got, accurate, fast, billions a shot / Cognitive revolution on the wire / French touch on the flame, stoke the fire / If it's creepy, it won't work, keep it clean / Human in the loop in between / Where is the human? Where is the hand? / Spot the right place where the people still stand / Collapse of search and advertising, merging in the glow / Asked to be shown, that's the future we know / Show me five, show me six, let these essentials / Turn it off, turn it up, we ain't here no / We the mentalists in the middle, reading signal through the riddles / Best answer in the time we got, accurate, fast, billions a shot / Cognitive revolution on the wire / French touch on the flame, stoke the fire / If it's creepy, it won't work, keep it clean / Human in the loop in between / Paris in the pipes, signal in the static / Mentalists in the middle, automatic.

Outro

[1:27:18] If you're finding value in the show, we'd appreciate it if you'd take a moment to share it with friends, post online, write a review on Apple Podcasts or Spotify, or just leave us a comment on YouTube. Of course, we always welcome your feedback, guest and topic suggestions, and sponsorship inquiries, either via our website, cognitiverevolution.ai, or by DMing me on your favorite social network. The Cognitive Revolution is part of the Turpentine Network, a network of podcasts, which is now part of A16Z, where experts talk technology, business, economics, geopolitics, culture, and more. We're produced by AI Podcasting. If you're looking for podcast production help for everything from the moment you stop recording to the moment your audience starts listening, check them out and see my endorsement at aipodcast.ing. And thank you to everyone who listens for being part of the Cognitive Revolution.

