Illia Polosukhin, co-author of the seminal "Attention Is All You Need" paper and founder of NEAR, unveils his ambitious vision for user-owned, privacy-preserving AI.
Watch Episode Here
Read Episode Description
Illia Polosukhin, co-author of the seminal "Attention Is All You Need" paper and founder of NEAR, unveils his ambitious vision for user-owned, privacy-preserving AI. This technical deep dive explores NEAR's foundational infrastructure, from its proof-of-stake consensus to its use of NVIDIA's confidential computing for private inference with minimal overhead. Listeners will discover how NEAR aims to decentralize model training with cryptographically guaranteed returns and how blockchain can provide stronger guarantees for AI agent behavior, setting the stage for a future where AI truly belongs to everyone.
Read the full transcript here: https://storage.aipodcast.ing/transcripts/episode/tcr/149adc6c-58d2-4e41-ae98-05473e5b994e/combined_transcript.html
Sponsors:
Linear: Linear is the system for modern product development. Nearly every AI company you've heard of is using Linear to build products. Get 6 months of Linear Business for free at: https://linear.app/tcr
Oracle Cloud Infrastructure: Oracle Cloud Infrastructure (OCI) is the next-generation cloud that delivers better performance, faster speeds, and significantly lower costs, including up to 50% less for compute, 70% for storage, and 80% for networking. Run any workload, from infrastructure to AI, in a high-availability environment and try OCI for free with zero commitment at https://oracle.com/cognitive
PRODUCED BY:
https://aipodcast.ing
CHAPTERS:
(00:00) About the Episode
(03:58) From Transformers to Blockchain (Part 1)
(17:01) Sponsor: Linear
(18:30) From Transformers to Blockchain (Part 2)
(21:23) Blockchain Security Fundamentals (Part 1)
(33:36) Sponsor: Oracle Cloud Infrastructure
(35:00) Blockchain Security Fundamentals (Part 2)
(39:01) Zero-Day Vulnerabilities Solution
(51:11) Confidential Computing Infrastructure
(58:07) User-Owned AI Models
(01:14:47) Marketplace and Governance
(01:18:57) AI-Crypto Synergy Vision
(01:27:50) Autonomous Agents Future
(01:31:22) Outro
SOCIAL LINKS:
Website: https://www.cognitiverevolutio...
Twitter (Podcast): https://x.com/cogrev_podcast
Twitter (Nathan): https://x.com/labenz
LinkedIn: https://linkedin.com/in/nathan...
Youtube: https://youtube.com/@Cognitive...
Apple: https://podcasts.apple.com/de/...
Spotify: https://open.spotify.com/show/...
Full Transcript
Nathan Labenz: (0:00) Hello, and welcome back to the Cognitive Revolution. Today, my guest is Illia Polosukhin, founder of NEAR, a hyperambitious, multifaceted project that describes itself as the blockchain for AI and aims to build a future where AI belongs to everyone. Illia is not your average blockchain founder. Before starting NEAR, he was one of eight coauthors of the seminal 2017 paper, Attention is All You Need, which, of course, introduced the transformer architecture and helped launch the current AI revolution. The credits on that paper say that Illia and another coauthor, quote, designed and implemented the first transformer models and were, quote, crucially involved in every aspect of this work. And in fact, NEAR started as an AI project, but took a major detour into crypto when Illia and team realized just how difficult it was for them to pay their data task workers all around the world. Today, given his pedigree, Illia could no doubt command one of those mythical billion dollar comp packages from Zuck's Superintelligence Labs. But instead, he's building the foundational infrastructure for user owned, privacy preserving AI that can operate at global scale. In this conversation, we start with foundational principles and work our way up the technology stack. We begin with an overview of how NEAR's proof of stake consensus mechanism creates security for network participants without requiring trust in a centralized authority. In short, because the protocol is open and permissionless, anyone who is willing to put value at risk can become a validator, and thus no one can prevent the flow of valid transactions. From there, we discuss how NEAR is leveraging NVIDIA's confidential computing capabilities to create a permissionless network that allows anyone with compatible GPU hardware to sell inference compute while keeping both the model weights and user data private even from the hardware operators, all with only a 5% overhead relative to normal computing. To be honest, I hadn't realized just how affordable this privacy layer has become. And while it does require trust in the chip maker, it's clearly a huge deal for all sorts of scenarios and does seem to help explain how it is that frontier model developers have been able to ship their models to so many different inference partners without leaking the model weights. After that, we dig into NEAR's plans to decentralize model training itself with a process that allows contributors to provide whatever compute or data they have in exchange for a cryptographically guaranteed share of the model's future revenue. While it does remain to be seen whether a community driven project can produce a trillion parameter model that performs at the level of today's leaders, at a minimum, Illia and team seem to have designed an incentive structure that could make it worthwhile for folks to contribute the estimated $100 million worth of resources that's required to compete at that level. Finally, we talk about what AI and crypto can do for one another, including how AI might finally put the smart in smart contracts and how blockchain technology can provide stronger guarantees that AI agents will act only as intended. This is something that I've been envisioning, if only in quite fuzzy terms, for years, and so I was really excited to hear just how concrete it's starting to become. In the end, I was so eager to understand the foundational technology on which Illia's vision rests that we barely had time for the vision itself. 
But I do think this is really valuable knowledge, and the good news is that we've already scheduled another recording. So definitely stay tuned for part 2 in which we'll explore the applications that people are already starting to build at this intersection of AI and crypto and also try to get a handle on the giga agent future by exploring Illia's vision for how AI agents will interact, transact, and even participate in governance. With that, I hope you enjoy this technical deep dive into the infrastructure for user owned, privacy preserving AI with Illia Polosukhin, founder of NEAR.
Nathan Labenz: (3:58) Illia Polosukhin, founder of NEAR Protocol. Welcome to the Cognitive Revolution. Thanks for having me. I'm excited for this conversation. I think this is going to be one that I'm going to learn even more than I usually do, because you're right at the intersection of AI and crypto. And sometimes these two technology waves are characterized as different camps that don't understand each other or, at times, are in some sort of online rivalry that may be mostly made up. But I've got this sneaking suspicion that I've never quite been able to fully develop that these two technologies actually might really need each other to reach their fullest potential and to do so in a way that we can keep control of over time. So I'm really excited to dig into all of that with you. For starters, though, if my deep research reports are to be believed, and this wasn't hallucinated, I do know for a fact that you were one of the authors of the original Transformers paper. But then I understand from deep research that this project that we now know as the NEAR Protocol started as an AI project and then sort of evolved into a crypto project before now kind of coming back to the intersection of crypto and AI. So I'd love to hear a little bit of the story of how you went from an author of Attention Is All You Need to branching out on your own to realizing that you needed crypto as well as attention. Just give us a little bit of that history and bring us up to the present day.
Illia Polosukhin: (5:35) For sure. Yeah. That's good deep research there. So my background is in machine learning and AI. I joined Google when I saw this cat neuron paper where they effectively trained an autoencoder model, meaning they would feed images and compress it and uncompress it to get back the same image. That model, trained on a bunch of images on the Internet, figured out that there's a cat. So there was a neuron specifically that if you activate it, would show up as a cat. And for me, it was like, this is the approach. Neural networks are working now. But I think images are not where the knowledge and intelligence lies. I think text is. And so I joined Google with very much, hey, how do we focus deep learning on natural language? And the most straightforward path to advance and check intelligence of any human or machine is to ask questions. So my team worked on question answering. And we actually had even some of this in a product on google.com. Back in the day before Gemini, you would get these short answers. And part of the challenge we had was actually the models that were state of the art were too slow. You couldn't, the long short term memory, LSTM models, they need to read one word at a time. And so if you feed them a bunch of articles from Google search results, it'll take forever for them to respond. So that's actually how transformers was one of the motivations, like, hey, how do we consume all this context as much in parallel as possible and then figure out how to answer? Now, after that work, I felt like the slope of evolution of AI is accelerating. This is 2017, and I want to build products on top of it. And so I left Google to start NEAR AI with my cofounder, Alex Skidanov. The premise was that we actually want to teach machines to code. And so that's something that I believed for a long time, that if machines can code, we're effectively changing how we're interacting with computing. It's changing from few people can source code and talk to machines, and then everybody else effectively needs to consume their magic, to everybody can effectively do everything with a machine that normally only a few developers can do. And so we were trying to build that. Back in 2017, that sounded like science fiction. Now it's just called vibe coding. We were trying to build vibe coding in 2017, and the expectation we had is what we've seen in 2022-2023. We thought that's what's going to be happening in the industry. And as we know now, we were lacking probably an order or a couple orders of magnitude of compute. Now, we were trying to be smarter, so what we tried to do is get a lot more training data. Now, it's called instruct fine tuning data. We were trying to get people to actually write a little bit of code for some instructions or write instructions based on code. And the people who can do this are effectively computer science students from developing countries where a few cents here and there, that adds up to a few dollars a day, is actually reasonable money for them to work online and practice their coding experience. Now, the challenge we faced was paying them. People, like students in China, don't have bank accounts. They have WeChat Pay. Back then in Ukraine, if you receive dollars in your bank account, you actually by law are required to sell half of it. There was no way to pay into some countries for whatever reasons. PayPal didn't work, TransferWise didn't work. 
So, there were all kinds of weird, just pure payment problems that we faced while trying to collect more data. So, we ended up saying, okay, well, we've heard of this blockchain thing. It's a global payment network. It should solve our problem. So we were just trying to use blockchain as a solution to our own problem of coordinating and paying a bunch of people. And as we did our research over 2018, we kind of realized, well, architecturally, none of them scale. They are kind of very slow, hard to use, hard to build on. And so that's where it's like, hey, if there was a blockchain with these criteria, easy to use, scales, microtransactions, predictable price, we would use it. But there's none. And so we should just build that and then use it ourselves and come back to AI. We thought we were going to build it in 6 months, to be clear. So we were like, hey, it's easy. It's just some systems, we're going to ship it and come back. It took a little bit longer. We effectively built a highly scalable blockchain that's focused on ease of use, really abstracting out the blockchain itself from the user experience, as well as highly programmable. You can effectively run arbitrary software written in C, Rust, JavaScript, Python, et cetera, on the blockchain as well. Obviously, all the financial stuff, payments, loyalty, remittances, being run through it, a lot of microtransactions going through it, as well as a few other use cases. And so it is one of the most used blockchains. We have 15 million monthly active users on it. And so as 2022-2023 happened, we were like, we started building up the team back into, okay, now that this acceleration is happening, how do we bring these new learnings we've had through the blockchain experience? Where can we bring that vision of, hey, AI will change how you experience computing forward? This is the history. Looking forward, the important part is, as AI will write code, it will be able to interact with other tools and systems. You're effectively removing the need for other apps and even websites. They can be built on the fly as the stack matures. It can go and directly get the information you need. And so interestingly, there's a few things that are happening. One is the devices are going to become more run by what we call an AI operating system. Now, this AI operating system, however, you want it to run locally, but you will not be able to process everything locally. There's background processing. You want it to summarize news for you, et cetera. So you need some way to offload the compute. The problem is it needs to have all of your context. That's how we're going to get to true AI that's really helpful, because it will know everything about you. But that is a very scary situation if some third party company has all your data. We just saw everything from chats being leaked to companies reporting to police with your chats. So there's already a challenge with how this data is handled by the centralized AI companies. First of all, we want everything to be private. I think that's fundamental. We want AI to be private. We want AI to have all your context. We want it to be able to go and execute actions on your behalf. The other interesting thing that happens in the future is that a lot of the existing, you can call them aggregators like Google, DoorDash, Amazon, as well as other middlemen. I call the FDA, for example, a middleman. It takes all the pharmacy filings, it processes them, it evaluates them. It's like, okay, yes, this medicine is safe on average. 
And then they allow pharma companies to stock it. Your AI can do all this work. It can go and find directly the factory in China to order stuff. It can go directly to an AI of the pharma companies, discuss with them exactly your medical situation, find the exact medical compounds that you need at this moment right now. You don't need to check if on average, for an average person, it's going to be good. You actually need to check if it's good for you right now is what you need to address. So you're removing this middleman system that our societies have built and going very much direct peer to peer, be that between different people or people and companies or even people and autonomous agents. That's where blockchain comes on the other side because it's effectively facilitating those types of transactions. So there's effectively two pieces of this vision. One is how do we ensure privacy, verifiability? And there's another aspect, is if this is how we see the world and how we interact with the world, it's through this AI lens. A small alteration into that AI can effectively lead you to perceive the world differently. The example I use is if you go into ChatGPT right now and at the beginning say, hey, subtly convince me while we're talking about some other topic of X. Let's say, voting for somebody else that you don't like. And then you go through the conversation. It will start actually working to change the accents on how you think about things in the chat on another topic. Now, this can be in the system prompt right now, or this can be in other ways directly. We don't know. And so we have this concept of what we call user owned AI because you want AI to be yours, not theirs. Privacy is part of this, but the other part is the model itself. We need to know what goes into it. We need to have verifiability of when you run it that there's no additional ways that it's actually being affected. So that's a big part. To do that, you need a different model. You need to build a movement, not a company, because if you build another company doing that exact thing, you'll end up in the same result. So you need to build a movement around this idea where people are actually willing to contribute to build this more of a common good that's accessible to everyone while creating an economic engine behind it to actually power it because this requires a lot of financial investment and resources. And on the other side, as this AI becomes your interface, how does it interact with other AIs? That's where what we call intents, AITP, and other components are coming together. So that's really the vision we're working toward.
Nathan Labenz: (16:56) Hey. We'll continue our interview in a moment after a word from our sponsors.
Nathan Labenz: (17:01) AI's impact on product development feels very piecemeal right now. AI coding assistants and agents, including a number of our past guests, provide incredible productivity boosts. But that's just one aspect of building products. What about all the coordination work like planning, customer feedback, and project management? There's nothing that really brings it all together. Well, our sponsor of this episode, Linear, is doing just that. Linear started as an issue tracker for engineers, but has evolved into a platform that manages your entire product development life cycle. And now they're taking it to the next level with AI capabilities that provide massive leverage. Linear's AI handles the coordination busy work, routing bugs, generating updates, grooming backlogs. You can even deploy agents within Linear to write code, debug, and draft PRs. Plus, with MCP, Linear connects to your favorite AI tools, Claude, Cursor, ChatGPT, and more. So what does it all mean? Small teams can operate with the resources of much larger ones, and large teams can move as fast as startups. There's never been a more exciting time to build products, and Linear just has to be the platform to do it on. Nearly every AI company you've heard of is using Linear, so why aren't you? To find out more and get 6 months of Linear Business for free, head to linear.app/tcr. That's linear.app/tcr for 6 months free of Linear Business.
Nathan Labenz: (18:30) There's a lot there to unpack. I'm going to maybe try taking it from what I think is the most foundational layer and kind of work up toward the giga agent future. Most fundamentally, when we talk about, obviously, there's a lot of problems in society today, including loss of trust in institutions, trustless is becoming a general description of society perhaps, or something that certain technologies can achieve. I think for a lot of people listening, they don't have a great handle on where with different blockchain schemes the trust comes from. I know in the original Bitcoin, the idea with the proof of work is that basically, it's really costly to mint a new valid block. And so because it's so costly, nobody could plausibly rewrite the whole chain to present. You'd have to do that so many times in a row that nobody could really ever get there. And so the canonical version is kind of the canonical version. That difficulty of extending means that nobody can corrupt the history. Okay. So that's, unfortunately, though, pretty costly to run. I think I looked up, I think Bitcoin is consuming the same amount of electricity as the country of Poland today per my Internet research. So we've now seen this move to proof of stake, and there you have a lot less compute required, but you have to make sure you have a really thoughtful incentive design. People are essentially putting their holdings of tokens up, putting them at risk, basically, locking them up, putting them at risk, saying, I represent that I'm going to do the right thing for the broader community. And if I don't, I stand to suffer some consequences as a validator in the network. So can you maybe elaborate on that just a little bit? Because ultimately everything we're going to talk about, if I understand correctly, is built on the idea that we have a set of validators that we don't necessarily have to trust as individuals, but we sort of have to trust collectively, and we also have to trust that the incentives are such that they don't have any reason to defect from the stable equilibrium. So can you give us a little bit more about what is that stable equilibrium that you have designed? Who are the validators? In your mind, what is the ground that people can put their confidence in that everything else that we'll build on top of this is really on a solid, albeit distributed foundation?
Illia Polosukhin: (21:24) Yeah. So that's a very good question. And I think just to contrast it with our traditional Internet. Right now, you know, we're using Riverside to record this. We went to riverside.fm. We actually relied on somebody to tell us Riverside is this IP address. That somebody is a distributed system, but it actually has a single organization that effectively decides how it's done. Similarly, we have SSL certificates that give us end to end encryption. But again, there's certificate authorities who issue the certificates, and there's the authority that gave them this. So there's an effective point of centralization on the current Internet, where you cannot verify on its own that this is correct. Now, when we think of blockchains, actually an important way to think about it is from your perspective. You want to verify that what you're interacting with is correct. And how hard is it for someone to effectively lie to you and show you fake information? That's what you're looking for. So as you said, for Bitcoin, effectively, if you're looking at some state, for example, you just received a bunch of Bitcoin, the cost of this not being true is effectively how much it costs to produce that number of blocks. And indeed, for an hour, to lie to you about an hour of history, it costs $2 million. So that's the idea. Now, with proof of stake, it's a little bit different because in Bitcoin, it's possible that, for example, Google has a lot of servers. They could be effectively creating a fake blockchain that's days and weeks long, and they can present that and effectively lie to people and steal a bunch of money. I don't think Google is doing this, but just to give you an example. With proof of stake, and specifically this idea of Byzantine fault tolerant consensus, you actually have this notion of finality, where if indeed two-thirds of the stake is not malicious, then they're following the protocol, so there's actually no way to have a fake history or to lie to you. You can verify this from the beginning. We launched the main net in October 2020. And so since then, you can verify that every single transition was correct and all the rules were followed. So that's the idea. Now, who are those validators, and what is the stake? So, for example for NEAR, but it's similar for other proof of stake. It's a lot of both projects in the ecosystem. So we have a number of other companies in the ecosystem that are building on top of us. So they effectively have a vested interest in the security of this network. So they become participants. And becoming a validator is permissionless. You can actually start a validator right now. It's also, on NEAR you can run a very lightweight node, like somebody ran a node on an 8 year old laptop recently. We also have, for example, Vodafone, NTT, and other entities like that running. We have crypto exchanges who are participating in the network as well. So it's really about the participants in the network who become validators as well because they have a vested interest. They have some stake as well. So they have both business interest and financial interest in the network, and so they become participants and validators. And then also just a bunch of community, we have universities. We have some individual developers who are running nodes as well. So that's the idea. You have an open network. People join. We have actually a pretty large number of new validators joining in the past 6 months. 
And in turn, they put some of their money at stake as well as maybe some other users can delegate. So they say, hey, I trust this participant, and I'm going to add more stake to them effectively. And then there's a reward that's being split between all of them to effectively incentivize them doing this.
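For a rough sense of how that delegation and reward split works, here is a minimal illustrative sketch. The epoch reward figure and the 10% validator commission below are assumptions for illustration, not actual NEAR protocol parameters.

```python
# Minimal sketch of pro-rata staking rewards with delegation.
# The commission rate and reward amount are illustrative assumptions,
# not NEAR protocol parameters.

def split_epoch_reward(validator_stake, delegations, epoch_reward, commission=0.10):
    """Split one epoch's reward between a validator and its delegators, pro rata by stake."""
    total_stake = validator_stake + sum(delegations.values())
    fee = epoch_reward * commission                      # validator's commission off the top
    distributable = epoch_reward - fee
    payouts = {"validator": fee + distributable * validator_stake / total_stake}
    for delegator, stake in delegations.items():
        payouts[delegator] = distributable * stake / total_stake
    return payouts

# Example: a validator staking 50,000 tokens of its own, plus two delegators.
print(split_epoch_reward(
    validator_stake=50_000,
    delegations={"alice": 30_000, "bob": 20_000},
    epoch_reward=1_000,
))
# -> {'validator': 550.0, 'alice': 270.0, 'bob': 180.0}
```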
Nathan Labenz: (26:07) This is maybe a really naive question, but if it is permissionless, maybe let's talk about the importance of permissionless versus permissioned because I was also, I'm sure you're following the Tempo project, I don't know if you see them as a direct competitor. There's obviously a lot of projects out there. Tempo, it has Stripe behind it as I know you know. My understanding reading through what they have said is that they plan to start with trusted validators and then gradually become more permissionless over time. And I'm not sure if this is one of those things where it really is a spectrum or if it really is more of a binary. I always say AI defies all binaries, but when it's something like, is it permissionless or is it not permissionless? That sounds like how do you become more permissionless? Is there a way to become, if there's some permission there, it still seems like there's some. Anyway, I'm a little confused about exactly how that matters and how much that matters. And then also, if I just show up with a node, like, how do you know who I am or who I say I am? If I show up and say I'm Coca Cola, here's your new validator, do you have to meet me in real life? Do you go to the Coca Cola office and kick the tires on the servers to do that sort of real world validation? Again, these are obviously noob questions, but knowing how much I might rely on my AI agents built on top of all this, I do want to make sure I'm solid in the foundational understanding.
Illia Polosukhin: (27:43) No. These are all good questions. Yeah. So I think permissionlessness indeed is a spectrum because there's a few step functions. There's some networks where there's a very specific set of validators. And what this has as an issue is that, let's say, if these validators don't like you for whatever reason, they can effectively not let your transactions or your AI agents' transactions on chain. They can effectively censor you. And so that's the main challenge. If it's not one, but a subset of permissioned validators, is that for whatever reason, they can effectively exclude you from the network, and you cannot do anything. The benefit of this permissionless network is that you can yourself join as a node and say, hey, I'm actually going to join the network. Even if they don't let me in, I can myself push transactions and interact with the network directly. So that is the biggest change, which for normal people is like, whatever. It's not that big of a deal for maybe people who've been through, I'm from Ukraine, I've seen different types, I have multiple banks that effectively closed up that I was a client of. Obviously, it was a war and everything. Things like that are pretty important, obviously, for journalists, but more importantly is actually for businesses. If I'm a business, if this is a payment processor that I'm using and the competitor business is effectively controlling which transactions are good or not, or even there's also just delaying, for example, my transaction, et cetera. They can affect my business. And so that's why there's this idea of neutrality. You want the ability for everyone to join because then there's no single party that can pull it on themselves or a coalition of parties, a cartel, you might say, that can pull on it. So that's one side. The other question is, okay, you want to join, you say you're Coca Cola, you can actually do that. But nobody will believe you because there's no social signal. Now, if, for example, the Twitter account of Coca Cola tweets like, hey, we just started a NEAR validator. Here's our address. Here's how to delegate. Let's go make Coca Cola the biggest validator on NEAR, then that would actually provide the social signal. So the way you identify yourself to the network is effectively through public private key cryptography. You essentially say, hey, this validator, this account. And then to become a validator, you need to put some NEAR at stake. You need to put some money at stake, and so now you're a validator. So you don't really need to meet in person or do anything. You can be a cat. You can be an AI. It doesn't matter if you have some capital at stake.
Nathan Labenz: (31:05) Got it. So anybody who shows up, if they're willing to make an investment in the currency, which they have to have in order to then be able to put some value on the line, that's what proof of stake is. You actually own value and you're putting that value at risk. Not at risk in a probabilistic sense unless you do something wrong. Right? The idea is, as long as everything you do is valid in your role as a validator, then your value is not at risk. But if the rest of the network finds you to have fake transactions or whatever, then they can take the value that you put forward, essentially as collateral.
Illia Polosukhin: (31:53) Correct. Yeah. And we have even a mechanism, because so far it's mostly been people having misconfigured nodes. So it's not really been a malicious attack, but actually people just had whatever bugs in their setup. And so we actually only kind of, it's called slashing when you actually take their money at risk. So we only slash proportionally to how much stake you actually have that participated in the malicious behavior. So if you had a misconfigured node and you only had like, you know, 0.1% or 0.01% of the total stake, you're only going to get slashed, you know, whatever, a multiple of that percentage of your stake. Right? So it'll be very little. Now if the whole, if you know, a large percentage of network coordinating, attacking the network, then they're effectively going to get slashed fully because that percent multiplied by some coefficient will be slashed. That's the idea. We don't want to punish people for misconfiguring things, but we want to punish if there's a coordinated attack for whatever reason. It can be that somebody hacked into a bunch of nodes like they had zero day vulnerabilities and did that. But we need people to have kind of responsibility over their security setups and things. But we want to allow people to have kind of individual mishaps.
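A minimal sketch of the proportional slashing rule Illia describes: the penalty scales with the offending stake's share of the total, so an isolated misconfigured node loses almost nothing while a coordinated attack is slashed fully. The coefficient and the cap are illustrative assumptions, not NEAR parameters.

```python
# Illustrative proportional slashing: the penalty fraction grows with the share
# of total stake that misbehaved. Coefficient and the 100% cap are assumptions.

def slash_fraction(offending_stake, total_stake, coefficient=3.0):
    """Fraction of the offender's own stake to slash."""
    share_of_total = offending_stake / total_stake
    return min(1.0, coefficient * share_of_total)

TOTAL_STAKE = 1_000_000_000  # total stake in the network (illustrative)

# One misconfigured node holding 0.01% of total stake loses ~0.03% of its own stake.
print(slash_fraction(offending_stake=100_000, total_stake=TOTAL_STAKE))      # 0.0003

# A coordinated group controlling 40% of total stake gets slashed entirely.
print(slash_fraction(offending_stake=400_000_000, total_stake=TOTAL_STAKE))  # 1.0
```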
Nathan Labenz: (33:23) Yeah, got it. Okay. So is there any last foundational question on the security layer before moving up the stack? Hey, we'll continue our interview in a moment after a word from our sponsors.
Nathan Labenz: (33:36) In business, they say you can have better, cheaper, or faster, but you only get to pick two. But what if you could have all three at the same time? That's exactly what Cohere, Thomson Reuters, and Specialized Bikes have since they upgraded to the next generation of the cloud, Oracle Cloud Infrastructure. OCI is the blazing fast platform for your infrastructure, database, application development, and AI needs, where you can run any workload in a high availability, consistently high performance environment and spend less than you would with other clouds. How is it faster? OCI's block storage gives you more operations per second. Cheaper? OCI costs up to 50% less for compute, 70% less for storage, and 80% less for networking. And better? In test after test, OCI customers report lower latency and higher bandwidth versus other clouds. This is the cloud built for AI and all of your biggest workloads. Right now, with zero commitment, try OCI for free. Head to oracle.com/cognitive. That's oracle.com/cognitive.
Nathan Labenz: (34:45) Is there any risk of incentives changing in the long term? Like, right now, you know, the market cap of the Near coin, last I checked, I think it was $7 billion. And that was, I mean, it's obviously a lot, but you know, for people who have businesses or whatever. And then you also, I guess, would imagine, like, if there is some coordinated attack, probably that market cap falls real fast. So you could try to hijack, but also what have you won, sort of. Right? I imagine that's a big part of it. But in the fullness of time, if we imagine a massive AI agent economy built on top of this whole thing, is there ever a point where it sort of could flip from it's not worth doing a coordinated attack to maybe it could become worth it? Because instead of $7 billion, it's $7 trillion or something like that.
Illia Polosukhin: (35:44) Well, I mean, if the network were $7 trillion, right, then the attack will cost, for example, 30% of that. So if somebody can come by, you know, with a couple trillion to attack it. But then the question is, that attack will not be able to extract much. Right? So this is where it's like, again, from a perspective, the question is, you as an individual, because an attack is towards someone. Right? The system itself is like, from its perspective, it's always correct. Right? It follows the rules. If somebody violates the rules, they're not in the system. The only way to attack is effectively to attack you saying, hey, here's the fake information. For example, your agent, I sent you $1 billion, give me $1 billion of services. But I didn't actually send you $1 billion. Right? I just lied to you by creating, by attacking this, by kind of faking that there is actually a transaction that happened and it's been finalized. And so to do that kind of attack, you'll need effectively 66% of stake. So if the network is at that point $7 trillion, you need 66% of $7 trillion or whatever the stake percentage is. Then you, probably, as you know, if you're receiving $1 billion in a transaction, you're probably not going to be like, immediately, cool, I'll give you $1 billion of value. You probably can wait a little bit, make sure that there's nothing else going on in the network. So there's kind of a question of timing, value, and security that together work. Right? So similarly, how Bitcoin right now, Bitcoin is whatever, $2 trillion. Right? The one hour attack is $2 million. So if you're sending $10 million, you probably should wait a little bit longer than one hour to make sure nobody is attacking you. If you, and et cetera. Right? Like, the more you send, the longer you should wait to make sure. But kind of the idea is, as the value increases, then you can also, you know, wait less for larger transactions. That's why it's like, it's a little bit, people say, you know, that 51% attack, and it's important to actually understand what this means. It means somebody specifically getting attacked. It can be an exchange attack, or it can be an individual, a service provider, et cetera. It's not that the network itself is still correct. The rules are still followed. It doesn't matter which effective, so-called fork will end up being picked. From the perspective of that fork, you're still on the main thing. Blockchain is a little bit of this abstract point of view, indeed, worldview perspective. In it, you're always correct. Right? Or are you outside and you actually, are you seeing the right thing or not? And how can you validate that you're seeing the right thing? But for most use cases, right, if you're spending $5, $10, $100 to buy something, like, effectively, the security of the network is, you know, so much higher, then it's not a problem, and you can effectively accept the transaction right away. And so for normal use case of agents, it's fine. And then the reality is, there's more challenges with indeed security of the, like, zero days of the code of the systems than, like, the kind of economic security of this. And we should talk about this because we have a whole thesis on that.
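As a back-of-the-envelope illustration of the "wait longer for bigger transfers" heuristic, and of how BFT finality changes the math, here is a small sketch using the rough figures cited in the conversation; the $2M-per-hour attack cost comes from the discussion, and the 2x safety margin is an assumed illustrative choice.

```python
# Proof-of-work heuristic: wait until rewriting the relevant history would cost
# an attacker more than the value they could steal from you.
# The ~$2M/hour figure is the one cited in the conversation; the 2x margin is assumed.

def hours_to_wait(transfer_value_usd, attack_cost_per_hour_usd=2_000_000, margin=2.0):
    """Rough confirmation time before treating a proof-of-work payment as settled."""
    return margin * transfer_value_usd / attack_cost_per_hour_usd

print(hours_to_wait(10_000))       # 0.01 hours -- small payments settle almost immediately
print(hours_to_wait(10_000_000))   # 10.0 hours -- a $10M transfer warrants a long wait

# Under BFT proof-of-stake finality the analogue is a stake threshold, not a wait:
# faking a finalized transaction requires roughly two-thirds of all stake.
def stake_needed_to_fake_finality(total_staked_value_usd):
    return (2 / 3) * total_staked_value_usd

print(stake_needed_to_fake_finality(7_000_000_000))  # ~$4.67B for ~$7B at stake
```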
Nathan Labenz: (39:26) Well, maybe now could be the time. My plan from here is to kind of work our way up to, okay, how do you train models on this? And there's interesting aspects of that. And then how do you build, like, you know, business models on top of models and ultimately agents and, you know. So if you want to talk more about the...
Illia Polosukhin: (39:47) On the security side, yeah. I think the important part to understand is, as AI is now getting better and better, it actually gets really good at finding vulnerabilities. I've used it to backtest effectively. You take a recently found vulnerability, you run an LLM over it. You effectively, like you know, you don't even need to overprompt it, and it will find you, you know, some vulnerabilities in the code. Somebody found a Linux, I think, zero day with that approach as well. What this means is, because cybersecurity is so kind of one-sided, right, like, it's really hard to know that you don't have vulnerabilities, but it's getting easier and easier to find them. And it's also true about social engineering and other attacks. It's just going to get easier and easier. So we kind of need a fundamentally different approach. And so the fundamentally different approach actually relies on math. Like, the only thing that you can rely on is actually the whole system is correct given, again, your perspective. So from my perspective, as a user of the system, I want to know that if I'm doing something, then it is actually correct. Right? There's no vulnerabilities in the system or potential problems in the future that can happen. So I want actually a mathematical proof at the time of using the system that it is correct, given my requirements. So it's a very different, like, for those who are, you know, there's been a lot of, that's called formal verification. There's been a lot of research on this. But because it's so complicated, nobody actually does it at the time of using services. You may be doing it, like, for example, when they were sending stuff to Mars, right, they actually had, I think, formal verification of the code for the Mars Rover. But it was done once. Like, it's effectively, okay, we want to verify that it will run correctly given the specification. But, obviously, if anything gets updated, if the requirements change, et cetera, now this is not usable. And it's because it's like it's so hard, it requires a lot of manual labor. Now, the benefit is, with AI now, right, that's getting really good even at winning gold at the International Mathematics Olympiad, we can actually simplify that process of proving itself. And so what we believe is we actually will need to rewrite effectively every single line of code in such a way that it's formally verifiable at the execution time. And with blockchain, this is the first step. Blockchain, because it's a root of trust, we want to formally verify it so that you as a user, when you, for example, are sending money or depositing money somewhere, you can say, hey, I'm expecting to receive money back, at least as much money. The chain needs to prove to you that this is actually going to be correct and it's going to happen. But you can expand that to broader services. Like, if I'm using some service, I'm giving it my private data. I want it to prove to me that it's not going to leak it. Right? And so for that, this is where we're going into verifiable compute in a broader sense. We kind of need to both have formal verification and guarantees that this code is going to be executed over your data. So that's where trusted execution environments come in. So that's kind of like the position on, like, kind of zero day vulnerabilities. We're moving into a world where this is going to, you know, a couple of weeks ago, right, there was a zero day in all the iOS and Mac, and everybody was patching.
We just had a massive attack on the supply chain in NPM yesterday in the crypto space. Somebody injected code effectively to replace addresses in all of the tools. The amount of this kind of vulnerability is going to just accelerate. That's one of the longer term research projects we're working on to actually solve that. But yeah, you need a few more pieces. That's where we're getting into, okay, well, how do we actually guarantee privacy around these models, have verifiability, as well as indeed build monetization, build a kind of financial engine that actually runs us. And so we're using an effective combination of indeed blockchain, cryptography, and hardware. So as of about a year ago, NVIDIA supports so-called confidential computing. So what this means is the NVIDIA chip itself connects to Intel and ARM and even AMD supported in such a way that even I, as an operator, as the owner of the hardware, cannot access what's happening inside. Like, so as a runner of the operating system. This is called confidential computing mode. And the interesting thing is it gives you both confidentiality and verifiability. Right? It gives you, it tells you this code was run on this data, and only this thing was done. And by the way, nobody else has seen what happened there. Right? And so what we're using this for is effectively now, I'm as a user, for example, want to run some AI workload. I can establish an encrypted channel to this secure enclave and run some AI workload there and then receive back the result again in an encrypted manner. And I know that there's no other single party that were able to actually understand what I was doing there. And I also kind of received a verified certificate that says this model was run on your input and this is the output. And there's no way that something could have changed or degraded performance midway or whatever. Or they injected some prompts into the thing. It's like you have a guarantee of the execution.
That's kind of the basic primitive of this kind of confidential, verifiable compute. And so there's a few sides of this. One is, well, we now need a lot of compute. Right? Hundreds of thousands of GPUs need to be here so we can actually provide it for lots of users. And again, one of the challenges with doing that is, you know, massive data centers, lots of CapEx. Now you need a lot of energy. It's all in one place. You have latency problems. You have sovereignty problems. So all the challenges. Now the cool thing is, again, because owners of compute don't actually see anything, we can, again, open it up and make it permissionless. We can let anyone with compute join. We have a way to verify that compute is indeed NVIDIA and Intel chips because they effectively sign the certificates. And now we can decentralize the compute itself. So we can have compute everywhere joining across the world. We can then route requests to the closest and available compute to you. You get confidentiality. We have verifiability. You now can pay to the network, and the network then distributes these rewards. Very similarly, how validators, you know, receive rewards for providing validation of the blockchain here, the compute providers receive rewards for providing compute, and then users pay for using inference or developers paying for using inference. So that kind of creates what we call a decentralized confidential machine learning cloud. Right? And it's not just for AI stuff. You can also run, you know, arbitrary compute over it as well, data processing, you know, agents, MCP servers, et cetera, et cetera. You know, store memory, end-to-end encrypted for users. So that's kind of the second piece, which, again, enables that vision of, how am I going to have an AI OS that's, you know, truly private and I know what's happening there. And so this already has an economy. Right? We effectively bring in compute and usage. Now the other challenge is indeed models and data in a sense. And so there you have, okay, well, right now, if I'm a model developer, I effectively have two routes. I'm like open sourcing it and making zero money. Or I keep it closed source and then I need to actually procure compute, make sure that compute is verified. I trust the compute provider, meaning, because I'm uploading my weights to them, they can actually steal and open source my weights or, you know, run them alternatively. So that's why only big hyperscalers actually usually partner, Microsoft OpenAI, Amazon Anthropic. Then on top of this, also have, there's also this maturity problem, whereas, you know, if I have like, I'm just launching a new model. I don't have any users for it yet, but I kind of need to commit to a bunch of compute from this hyperscaler, like, usually, like, a two year contract. I don't actually know how much I need. If I get more users than I have compute, now I need to pay a ton more. So it's like a very weird, you know, kind of also economic question. We have tools to do that. And so the idea here is that now if it's an open source model, you upload it. Or if it's a closed source model, you encrypt and upload it to our system. And so now, again, because all of the compute happens in this confidential environment, we can decrypt the model inside the confidential environment. Meaning, again, nobody is able to access what's happening there. And you can run on users' data privately. And now when users pay, they pay both the developer and the model developer and the hardware provider the fee. Right? 
So you're effectively now combining model providers and model developers into this, offering them a way to serve their models without prepaying. Right? They don't need to pay for the compute to serve it. They don't need to take users' data. They don't need to deal with GDPR or HIPAA, SOC 2, whatever. All the compliances, they just push a model in, start making money when people use it. So that kind of creates a new economic model for model developers. Similarly, you can have this for content creators and data providers. You can upload your data that can be used at inference time or potentially later at training time. And if it is used, you're getting paid from that transaction again as well, similar to Spotify. There's subscription fees or API call fees that get distributed to the providers. So that's kind of how we see this. Have a cluster of services that effectively all run on the same kind of decentralized hardware cloud, where you can bring models, you can bring data, there's users and developers consuming this, and it all kind of gets secured on this blockchain-based marketplace that really facilitates it.
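To make that confidential inference flow concrete, here is a hedged, self-contained sketch of the client-side steps: check an attestation, run the request inside a (simulated) enclave, and split the fee. Everything here, including the stub enclave and the 70/30 split, is an illustrative assumption rather than NEAR's or NVIDIA's actual API.

```python
# Hypothetical sketch of confidential, verifiable inference with a fee split.
# The Attestation fields, EnclaveStub, and the 70/30 split are illustrative
# assumptions, not real NVIDIA confidential-computing or NEAR interfaces.

import hashlib
from dataclasses import dataclass

@dataclass
class Attestation:
    hardware_vendor_ok: bool   # stands in for verifying the chip maker's signature
    code_measurement: str      # hash of the model + runtime loaded in the enclave

class EnclaveStub:
    """Stand-in for a GPU secure enclave serving a single model."""
    def __init__(self, model_id: str):
        self.model_id = model_id

    def attest(self) -> Attestation:
        return Attestation(True, hashlib.sha256(self.model_id.encode()).hexdigest())

    def infer(self, prompt: str):
        output = f"[{self.model_id}] answer to: {prompt}"
        receipt = hashlib.sha256((self.model_id + prompt + output).encode()).hexdigest()
        return output, receipt  # receipt stands in for a signed proof of execution

def run_private_inference(enclave, prompt, expected_model_id, fee=1.0):
    att = enclave.attest()
    expected = hashlib.sha256(expected_model_id.encode()).hexdigest()
    if not (att.hardware_vendor_ok and att.code_measurement == expected):
        raise RuntimeError("attestation failed: refuse to send private data")
    output, receipt = enclave.infer(prompt)  # in reality this runs over an encrypted channel
    payout = {"model_developer": 0.7 * fee, "hardware_provider": 0.3 * fee}  # assumed split
    return output, receipt, payout

print(run_private_inference(EnclaveStub("open-model-v1"), "summarize my notes", "open-model-v1"))
```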
Nathan Labenz: (50:56) So a couple questions there. We could go on for a long time. This is fascinating already. When I show up to the network, I basically just have to have an NVIDIA chip of a certain type that has support for the secure private computing. Right? That does suggest that there's, like, important, is it, I don't know if it would be right to think of it as a, I guess you could have a hardware vulnerability or a software vulnerability there. Hardware vulnerability, from what research I've done, people seem to think, like, if you, you know, gave some of these chips to, for example, the Chinese government, and let them bring the full power of their, you know, immense engineering prowess to bear on cracking that hardware, people, from what I hear, seem to think they probably could, but that it wouldn't really be economical to do it for much stuff because it's really, you know, you're, obviously, these things are exquisitely crafted. So to do some sort of hardware level modification, you might be able to figure out how to do it. You're probably not going to find it worth your while to try to scale it to some, you know, large number of chips. So this would be very focused, you know, kind of hyper-targeted attacks at most. But then there's also the software question of, how do we know that NVIDIA has actually, is their stuff formally verified that this is all fully locked up? Like, how much trust and where are we putting trust there? And then part two would be, how much overhead is associated with this? I did one episode on use of zero knowledge proofs to prove that the model, and this was two years ago, so the framing at the time was, and it's actually still relevant because just in recent days, you know, there's been this discourse about Claude seems to be dumber for me during the day than at night or whatever.
Nathan Labenz: (53:01) Exactly. People are saying, are they quantizing it or are they not? And they're saying they're not, but something happened and it's not exactly clear what. The framing at the time was, well, if they use a zero knowledge proof, they can demonstrate to you that we ran the model we promised you to run. You're getting the value that, you know, you contracted for. However, at the time, there was a lot of overhead associated with doing the actual zero knowledge proof. So it wasn't like for 2% more, you can get a guarantee. I forget exactly how much, but it was a lot more to get the guarantee.
Illia Polosukhin: (53:33) It's like 10,000 to 100,000x.
Nathan Labenz: (53:37) Yeah, it was a lot. So is that, I mean, okay. So where, how much risk is there at the hardware level? How much risk is there at the software level? And how much overhead is there to actually get these benefits?
Illia Polosukhin: (53:51) Yeah. So given you touched on ZK proof, yeah. So there's effectively a spectrum of how to achieve verifiability and how to achieve privacy. ZK proof is effectively how to achieve verifiability. You don't get privacy from it. Somebody still needs to run the compute and then compute the proof, and so that other third party has all your data. But indeed, you get verifiability that this exact model was run in this specific way. The challenge is, yeah, even right now, the most performant way is probably 1,000x slower than just computing it directly.
Nathan Labenz: (54:29) It's a lot of overhead.
Illia Polosukhin: (54:30) It's a lot of overhead, especially when, most of the time, you're asking ChatGPT to do dumb things. Everybody does.
Illia Polosukhin: (54:43) Only makes sense for really critical stuff. The other side is there's this thing called fully homomorphic encryption. Right? This is where you're actually doing compute where it is private for every participant. It is verifiable, but again, the overhead is massive. Something also in the order of 1,000 to 10,000x. But that is a, you know, kind of fully private and verifiable. What we're doing in the secure enclaves is finding, I would say, a pragmatic middle ground where, indeed, we're trusting the hardware providers, and we can talk about what does that entail. But the benefit is the maximum overhead we've seen in production, kind of in our testing and production, is 5%. And so it's usually from 1 to 5% overhead over running without this mode, but you get privacy. You get this ability for the network to be permissionless to join, and you get verifiability as well out of it. So you don't get, why is it dumber at night, at daytime. So you will not get that on kind of our decentralized cloud. So 1 to 5%, the only other challenge right now, it only works on the level of a single machine, so 8 GPUs. So we can fit anything that fits into 8 GPUs, which is effectively all the open source models right now fit there. It's over 100 billion parameters fits there. No problem. But indeed, you cannot do this across multiple machines yet. This is going to be in the works right now. Now on a kind of software and hardware security side, indeed, there is a potential for a hardware attack, but it, so right now, these chips are, you know, two nanometers. Right? That is very small. And so we're getting to a level where, you know, it's effectively baked into the atoms of the chip. And so if you want to change something or address it, you effectively, I mean, I'm not a hardware, at that level, so I cannot actually approximate it. But the cost of attack is extremely high. And the benefit is you're going to get some maybe random requests from somebody asking, you know, which cat to pick. Right? So, because you will not be able to target specific use cases in our system, even if something happens. There's indeed the other challenge, which is NVIDIA, because NVIDIA is actually certifying it. So their key management and their process needs to be indeed correct. And so this is something that, you know, we would love to kind of help secure. I mean, obviously, they're already using kind of top line security, but that's where, you know, formal verification and using blockchain as security mechanisms can definitely help.
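Putting the rough numbers from this exchange side by side, here is a tiny sketch of the per-request cost implied by each approach; the overhead factors are the order-of-magnitude figures mentioned in the conversation, not benchmarks, and the baseline cost is assumed.

```python
# Rough per-request cost comparison using the overhead factors cited above.
# All numbers are illustrative, not measured benchmarks.

baseline_cost_usd = 0.01  # assumed cost of one plain (unprotected) inference request

approaches = {
    "zk_proof":                 {"overhead": 1_000,  "private": False, "verifiable": True},
    "fully_homomorphic_enc":    {"overhead": 10_000, "private": True,  "verifiable": True},
    "tee_confidential_compute": {"overhead": 1.05,   "private": True,  "verifiable": True},
}

for name, a in approaches.items():
    cost = baseline_cost_usd * a["overhead"]
    print(f"{name:26s} ~${cost:,.4f}/request  private={a['private']}  verifiable={a['verifiable']}")
```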
Nathan Labenz: (57:53) Okay, cool. So let's build our way up the stack. So we want to now create user-owned AI. And I've seen you have a plan to train a more than 100 billion parameter model. Estimated cost of that, $160 million.
Illia Polosukhin: (58:18) Well, it's dropping down since we talked about it. So, yeah, that's good. So maybe...
Nathan Labenz: (58:23) Cut it in half. Yeah, Veo 3 just got cut in half, so we'll apply a half factor on this too. I guess, what I understand there is Dario just kind of talked about this where he said, you know, yeah, we're burning a lot of money and in a way we're burning more money every generation of scale up. But in another way, if you look at each model as its own venture, each model is profitable. And so what's weird is, like, we might make, you know, however much, it might cost us $100 million to train Claude 3 and we make $1 billion. Then it costs us, you know, $2 billion...
Nathan Labenz: (59:01) To train Claude 4...
Nathan Labenz: (59:01) And we'll make $10 billion. But that's going to cost us $20 billion for the next one. And so that's a weird development cycle to say the least. But it seems like you are basically kind of taking that reality to its natural conclusion and actually planning to structure sub-ventures for each of the models. Right? So tell me exactly that sort of economic schematic there. And then also, I understand that you're applying this secure computing to the training process where people can contribute training data in a privacy-preserving way. But that's a really interesting challenge, I would imagine too, right? Because, like, who's providing what data, the data mix, if you talk to people at the frontier companies, they're like, yeah, it's like baking a cake. We're constantly experimenting with different mixes and a little more of this and this other behavior gets degraded and whatever. So how can you have a privacy-preserving, even understanding on some naive level that you can do it from a, you know, computation standpoint. From a getting a model to perform standpoint, how can you have privacy-preserving mechanisms on the training data side and have any sense of what the hell you're going to get out when the model comes out of the GPU oven, so to speak?
Illia Polosukhin: (1:00:31) Yeah. So all great questions. And indeed, it was actually funny that they posted this. And I'm like, yeah, exactly the model. So I think maybe just to give you a little bit of a roadmap. Building out this decentralized computing, starting with inference. It effectively already gives an ability for people to consume it, get confidentiality, get verifiability. And the next step, it offers people who are building models, including potentially the existing companies that build foundational models, they can upload them. They encrypt. Nobody can see them. They don't get leaked. But now they can be used across this decentralized compute. Everybody can verify and know that they're using this exact model that this company has posted. They effectively on-chain said, the company tweeted about it, etc. You can verify indeed you're using, let's say, this Anthropic Claude, and it doesn't degrade in the evenings. So that's kind of the first steps. Then, okay, well, given we want to kind of coordinate this research and development of models, and importantly, it's not, we're not going to build one model. That's not the point. The point is to build a process that kind of creates models that are state-of-the-art and user-owned, everybody can inspect how they're done. And so the important part is how any frontier lab is working is actually a number of benchmarks that effectively goes through the whole flywheel of building models. Everybody building different parts of it evaluates how their change in innovation benefits the scaling laws. You test different sizes, and then all these ideas get accumulated, and then you do bigger runs. Replicating some of that as well, where we effectively now, again, using this compute network, can say, hey, actually anybody can come in and build a new benchmark. So you can say, hey, I'm going to build a benchmark that's on the model predicting the future, or answering questions about astronomy, or, you know, the ocean, or whatever. Different things that people care about. Including, you can actually have enterprises who say, hey, I actually care about how this model does on this, you know, deepfake detection. And I will not actually upload my data, but I'll keep it private.
So nobody can have access to the data, but everybody can run evaluation over it. So you can now effectively offer both a benchmark creation and an evaluation service, where you can verifiably check that this model, a closed-source model running on undisclosed, private data, gets this result. So again, you're effectively creating this marketplace of model builders and, you know, benchmarks, which can represent companies or specific use cases, and finding what the best models and the best ideas are. That already, on its own, is a really interesting product because, again, right now you cannot do that. You cannot actually go and benchmark some intermediate model from, let's say, OpenAI. They will just not give you the weights. If you don't want to upload your data to them, you cannot even benchmark their current model. But this also opens up an environment where anybody, you know, you're sitting somewhere in whatever country and you want to train a model that specifically targets whatever ocean benchmark. Maybe you have special data for this. You can do that too, upload it, and get rewarded when people use it through the previous step. Now, the next step is, well, actually, I have a lot of interesting ocean data. Or maybe I'm in a university and I know how to train the models, but I don't have compute. And so that's the next step. Okay, well, we're actually going to offer you a way to do fine-tuning first. So you can take an existing base model and fine-tune it. And again, you can fine-tune it even on private data. So maybe I'm an institute that collects a lot of ocean data; I have the data, but I don't have the know-how to fine-tune this properly. So I can partner up with someone who can then fine-tune on my data without seeing it. They can effectively deploy this job into the confidential compute. The resulting model then, you know, splits the revenue between the developer and the data provider. So you can have all of those different combinations on top. Fine-tuning is, I would say, easier to do when you may not have direct access to the data. The training is where, if we're talking about full fundamental training from scratch, things get really complicated. That's our final step, where we're actually putting all of those pieces together. And there you indeed need an environment where all of those pieces really work well, where you can effectively iterate on different pieces and sizes and then really combine it all into a final model. I would say, to your point about private data, there are effectively a few things that will need to happen. One is that there's going to be a whole separate process of data filtering, curriculum construction, etc., and again, you can actually benchmark it. So, like, I can build a curriculum builder, right, pick different articles and different pieces of data, and construct the training set out of it. And so you can actually evaluate how good that system is by training a few sizes of models. So people can actually compete on whose curriculum builder is better. And then, when you pick the best curriculum builder, you run a larger run with, you know, more parameters and more data. So I would say it's going to be a very iterative process. And it's the same way it is right now inside frontier labs, because indeed we're kind of experimenting with all those things as we go. Now, with private user data specifically, it's a bit more sensitive and complicated.
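The benchmark-over-private-data flow described above can be pictured as a small evaluation job that runs where the data lives and publishes only a score plus a commitment. A minimal sketch, assuming a toy exact-match scorer and an HMAC key standing in for real hardware attestation; the benchmark name, functions, and key are made up for illustration.

```python
import hmac, hashlib, json

# Hypothetical enclave signing key; in reality this would be derived from
# hardware attestation, not hard-coded.
ENCLAVE_KEY = b"enclave-demo-key"

def run_private_eval(model, private_examples):
    """Score a model on benchmark data that never leaves the evaluator.

    `model` is any callable prompt -> answer; `private_examples` is a list of
    (prompt, expected) pairs supplied by the benchmark owner.
    """
    correct = sum(1 for prompt, expected in private_examples
                  if model(prompt).strip() == expected.strip())
    score = correct / len(private_examples)

    # Publish only the score and a commitment, never the data itself.
    report = {"benchmark": "ocean-qa-demo", "score": score,
              "n_examples": len(private_examples)}
    payload = json.dumps(report, sort_keys=True).encode()
    report["signature"] = hmac.new(ENCLAVE_KEY, payload, hashlib.sha256).hexdigest()
    return report

# Toy usage: a trivial "model" and two private examples.
demo_model = lambda prompt: "42" if "answer" in prompt else "unknown"
print(run_private_eval(demo_model, [("the answer?", "42"), ("depth of x?", "3000m")]))
```

Anyone who trusts the attested key can check that the published score corresponds to this exact report, without ever seeing the underlying benchmark data.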
But again, the biggest benefit is that as you contribute your data, first of all, you can get a reward from the model's outcome. But you can also have, effectively, a filtering procedure. Because one of the things is there's going to be a lot of PII, your phone numbers, SSN, etc., that you don't want in a model. It's also not actually helpful for the model, for the most part. But this filtering procedure will be open source, and everybody can inspect it, well, not everybody will, but some developers will inspect it and say, yes, this is actually a valid way to filter data. There are a few different components, again, that we can use to make sure the data is cleaner and to control how it gets filtered and processed into the training curriculum. A bunch of your personal chats are probably not actually that useful for training these models. There are other types of data that are more useful for those things. And in general, we're moving to more synthetic data anyway, so there could be a lot more of that. And importantly, we still have crowdsourcing and data labeling that runs through this decentralized network as well. So we actually have both NEAR Crowd, which is kind of the original project we had for crowdsourcing, meaning Scale AI-style data work, as well as PublicAI, which is another data labeling and crowdsourcing platform.
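The open-source filtering procedure mentioned here could start with something as simple as a redaction pass over obvious identifiers before data enters the curriculum. A minimal sketch, assuming regex-level matching of US-style phone numbers, SSNs, and email addresses only; a real pipeline would use far more sophisticated PII detection.

```python
import re

# Deliberately simple patterns for illustration; a production filter would use
# trained PII detectors, locale-aware rules, and review of edge cases.
PII_PATTERNS = [
    (re.compile(r"\b\d{3}-\d{2}-\d{4}\b"), "[SSN]"),
    (re.compile(r"(?:\+?1[\s.-]?)?\(?\d{3}\)?[\s.-]?\d{3}[\s.-]?\d{4}\b"), "[PHONE]"),
    (re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"), "[EMAIL]"),
]

def scrub_pii(text: str) -> str:
    """Replace obvious identifiers so they never reach the training set."""
    for pattern, token in PII_PATTERNS:
        text = pattern.sub(token, text)
    return text

print(scrub_pii("Call me at (555) 123-4567 or mail ann@example.com, SSN 123-45-6789."))
# -> "Call me at [PHONE] or mail [EMAIL], SSN [SSN]."
```

Because the filter is open source, contributors can audit exactly what gets stripped before their data is ever used.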
Nathan Labenz: (1:08:35) How much overhead is associated with all that? I mean, you mentioned it's relatively low at the hardware level for the confidential computing, but how much is there in terms of just, like, the software setup? It's not super easy to set up a training cluster. It's not super easy to manage data pipelines. It's not super easy to, I mean, you have a marketplace, but even going out and hiring a bunch of people to create data, that's not super easy even in the sort of base case. So I'm interested in how much overhead there is in doing it the privacy-preserving way. And then also, at the sort of model level, who makes these decisions in terms of what's better? Because I think we have benchmarks galore, and yet, I mean, there's been a lot of discourse about this over the last couple weeks too. People swear by evals. People say evals are useless. Certain famous projects are run entirely on vibes. It would seem like the one thing you'd have a really hard time putting into this frame is: who do we actually trust to be the vibe checker on this? And how do we make sure that, in addition to maxing out on these various benchmarks, the thing is actually nice to talk to, which people do value a lot. So, yeah, I guess overhead in terms of all the setup, the config, just moving things around, coordination broadly, and then, is there a governance mechanism? I know you have governance projects as well that are pretty interesting in themselves. Are you bringing those to bear on who decides what version of the model is actually good, or what training recipe gets scaled up? Or is there some tastemaker that's still involved somewhere?
Illia Polosukhin: (1:10:34) Those are all good questions. And I think the high-level answer is we're effectively creating a marketplace for all of these decisions. So instead of trying to prescribe, like, okay, Illia will be the tastemaker and decide you get the compute and you don't, we say, hey, we're creating a marketplace where anyone can do this. Probably Illia will do it as one of the participants, but anybody else can do it as well. And you have the rules and everything known, you have visibility and traceability, and you can also build on top of each other instead of, as right now, effectively competing. And so, yeah, we kind of touched on it, but let's say I'm going to say, hey, I'm going to train this 1.4 trillion parameter model. I'm going to issue a token for it. So I'm creating, effectively, a new value capture system for this specific model. I can now distribute that token for compute. I can distribute it for data. I can distribute it for research and work. In this case, I started it, so I probably will be the tastemaker in picking and selecting which pieces I want to put together. We train it, and if it gets used, we would then distill it, fine-tune it, etc. The revenue gets distributed to the token holders. Now, in parallel, somebody is like, oh, actually, I'm going to train, you know, a 70B model in this way. They're going to use exactly the same framework, exactly the same system. And because a 70B is smaller and cheaper to use, and also runs faster, it's going to get used in different use cases. So you're really using this more as a platform where different people can put together these pieces really easily. And you don't need to raise, whatever, $10 billion to do this. You can actually raise smaller amounts, specifically for compute, data, etc. And you already have a lot of building blocks on this. So that's kind of the idea. It's really creating more of an environment where we can open source things but also use them, while having this economic flywheel, right, where things are generating revenue through usage. And importantly, you learn from what works and have this inspectability of what actually went into it. In some cases people will say, hey, no, actually it's all private, I will not tell you what went in. But then people know that it's private, and they can decide if they want to use it or not. But also, here are verifiable benchmarks. Right now, when somebody posts benchmarks, you have no idea how they got those numbers. They don't usually even post the prompts they used for the benchmark. So you can have, you know, a two-page prompt that answers half of the questions and gets way better results from that. Here, you have full traceability of the whole flow. You know roughly what went in. Maybe it's private, but at least you know 100 million users contributed their private data, and that's what it trained on. And then it also gets benchmarked, and you can contribute your vibe benchmark as a benchmark as well. On Illia's vibe benchmark, it gets this much, for example. And on this company's benchmark, right, like Bank of America's test, it gets this much.
So you can have all of those different pieces of information and decide based on that. It's really about creating transparency and a marketplace for this while preserving privacy for the end users and companies.
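The token mechanics described above, where a model's revenue is distributed to the holders who contributed compute, data, and research, reduce to simple pro-rata arithmetic. A minimal sketch with hypothetical holders and balances; an actual implementation would live in a smart contract and handle fees, rounding, and vesting.

```python
def distribute_revenue(revenue: float, balances: dict[str, float]) -> dict[str, float]:
    """Split a revenue amount across token holders in proportion to their holdings."""
    total_supply = sum(balances.values())
    return {holder: revenue * bal / total_supply for holder, bal in balances.items()}

# Hypothetical token allocation for one model: compute, data, and research contributors.
balances = {"compute_provider": 500_000, "ocean_data_institute": 300_000, "research_team": 200_000}
print(distribute_revenue(10_000.0, balances))
# -> {'compute_provider': 5000.0, 'ocean_data_institute': 3000.0, 'research_team': 2000.0}
```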
Nathan Labenz: (1:14:33) And all of this traceability, verifiability, to rewind for a minute, all rests on the fact that somebody has locked up their NEAR coins with the assertion that they are putting forward valid transactions, which in this case would be more like log statements, basically. Right?
Illia Polosukhin: (1:14:54) It's, I mean, it's a bunch of stuff. Yeah. It's payments, it's logs, it's traces of, like, training runs, etc.
Nathan Labenz: (1:15:03) How big does this database get? It obviously gets pretty huge, right? So where does the database live long-term? It becomes too big for a single node. If I want to just connect, right, I can't download the whole history of everything that's ever happened. It becomes too much, I assume.
Illia Polosukhin: (1:15:24) Yeah. So NEAR, I mean, one of the underlying things is that NEAR is sharded. Meaning every single node contains part of the data and handles part of the transactions and part of the execution. And it can keep expanding the number of shards. That's why we actually need more validators to join: the more data and the more shards there are, the more nodes you need in the network to actually participate. So it's very similar to how Google's or Meta's databases work. They don't run out of capacity and say, no more users here, please. No, they just keep adding more computers to run stuff in parallel.
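As a generic illustration of the sharding idea, that each node only holds and executes its own slice of the state, here is a toy hash-based router. This is not NEAR's actual assignment scheme (Nightshade partitions state differently and can resize shards over time); the shard count and account names are made up.

```python
import hashlib

NUM_SHARDS = 6  # grows as the network adds validators and capacity

def shard_for_account(account_id: str, num_shards: int = NUM_SHARDS) -> int:
    """Deterministically route an account's state and transactions to one shard."""
    digest = hashlib.sha256(account_id.encode()).digest()
    return int.from_bytes(digest[:8], "big") % num_shards

# Each validator tracks only the shard(s) it is assigned, not the whole state.
for account in ["alice.near", "bob.near", "model-registry.near"]:
    print(account, "->", "shard", shard_for_account(account))
```

The key property is that no single node needs the full history or full state; adding shards and validators scales capacity roughly the way adding machines scales a distributed database.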
Nathan Labenz: (1:16:08) Yeah. Okay. So this has been fascinating, and we've built up from the premise that we want user-owned AI; these are kind of all the layers we have to put in place to finally get to the point where you could have a community-based process for all this contribution, kick out a model, and even get paid back through inference. As you mentioned, though, you could also just have somebody train their own model and bring it into the system. So regardless of where the model that people want to use comes from, let's maybe change gears toward what AI needs from crypto and what crypto needs from AI. My very high-level stylized take, and then I want you to go 10 minutes on this if you want to, is that AI can put the smart in smart contract. I've always kind of thought, geez, these smart contracts aren't really that smart, but if there were an AI in there, maybe they could be. What we really need, I think, is smart, affordable dispute resolution in a lot of cases, because in some places we have courts, but they're not convenient to access by any means, and in other places there are no courts. So we need smart, affordable dispute resolution that AI seems like it could potentially provide in the context of these smart contracts. And then on the other hand, AI needs really rock-solid guardrails, because, as we're increasingly finding out, the AIs themselves can be tricked, can go off the rails, make mistakes, and even scheme against their human users in some cases. So we need very, very reliable checkpoints in workflows: you can go explore, you can do whatever, but if you're going to do this, you've got to have a certain, almost think of it like the Jurassic Park thing, where they engineered the dinosaurs to not be able to survive without the special food from the park. Of course, we know how that turned out. Not to generalize from fictional evidence, but we do want the AIs to be dependent on a really rock-solid validation that this is okay for you to do. So that's my high-level premise. Give me the fleshed-out version.
Illia Polosukhin: (1:18:40) Yeah. So, I mean, both of these sides are definitely very interesting, and we're working in both directions. And to be clear, when I say we, we have a large ecosystem of different projects, and as I said, it's a protocol that combines a lot of different projects and contributors who are actually working towards this. So one side is dispute resolution and putting intelligence, the smart, into the smart contract. We effectively have two parallel threads there. One is what we call NEAR Intents. NEAR Intents is this idea that as we move into this AI world, you or your AI will just express an intent that something should be achieved or done. This can be as simple as, hey, I want some pizza tonight for dinner, or as complex as, hey, I want to construct a new building in this location, it needs to be like this, etc., with lots of qualifiers. Now, that requires a number of steps. It needs to find somebody who will actually do it. So you need a whole discovery process: how do you discover other AIs, or non-AIs? It can be businesses as well, though we're assuming everyone will have some AI-like interface. How do you discover the AI that will actually fulfill this intent? How do you come to a commercial agreement with them? And this commercial agreement can be, hey, I need to rent this car to get to the airport, right, for a Waymo, or it can be, hey, we need a six-month project to construct this thing. And then if something goes wrong, how do you actually deal with it? So, dispute resolution indeed. That's what we're building as this intent protocol. The idea is that it starts with people, organizations, and AIs effectively collaborating and doing these things. And so that's a fundamental protocol. It includes, indeed, AI-based dispute resolution. So if something went wrong, the taxi didn't actually drive you where you were going or didn't arrive at all, AI can look at this, cheaply analyze the situation, and decide to do something. Now, if the parties are still not satisfied with that resolution, then you go to court. So you effectively reduce the cost of disputes, make it way cheaper. It can use a lot more information right away, because you have your on-chain information, all your previous actions. It can run over both sides' private data very effectively, without revealing this private data to anyone. It's effectively like legal discovery: you can run AI over both sides' materials without it costing millions of dollars. So you can drive the cost of this process very low. And then, again, if the outcome is not acceptable, you go to court, but a lot of the work is already done, so it can all be reused. So that's one piece. And the go-to-market for that actually started on the crypto side, so trading and other things, and we're expanding it into e-commerce and other use cases, adding more and more services into this. The other side is the autonomous agents. The original idea of a smart contract was that it's effectively an autonomous entity. It exists independently of any person or organization. It has money. The problem was it was just a piece of code, and usually a very small piece of code.
It was very dumb, even though it was called a smart contract. What we've done, because we have this decentralized compute with AI, is combine these two things. So in the decentralized compute, we're running the AI brain, and then it has the smart contract to actually execute actions and execute this intent, so it can actually go and do things. As an example, we have one agent that was given $10,000 at launch. We cannot stop it. We cannot take the money back. We cannot do anything with it. And it's effectively trading based on sentiment on Twitter. And it made, whatever, $4,000 doing that from the $10,000 it was given. So this is just an example, but you can imagine a future where you effectively have businesses like this, operated by AI. You know exactly what it's going to do. Your bylaws are the prompt of the agent. There's no person who does random things. Shareholders can effectively give it feedback, so you can apply all that training data to improve it, or decide to upgrade the model and things like that. That's the framework of what we call autonomous agents. Intents are how they're going to interact with commerce and with other AI agents. And AITP is really the protocol to do that communication, to do intents. And then the other side of this is, how do we govern this? This is where we indeed have whole governance protocols. But even before that, I mentioned formal verification. I actually think one of the pieces of this puzzle will be: let's say you're using some AI service. You want it to guarantee to you, before it even executes on your data, that it will not, you know, escape and go kill a bunch of people. So effectively, we need to create an environment in which we can verify what the AI runs. My approach is, I think it's really hard to effectively be the thought police, even for AI. But it's way easier to ensure that they don't do anything outside of the sandbox that violates whatever rules you apply when you call it. And so, again, using this verifiable compute, we can guarantee that as you upload this model, you can attach, effectively, a set of things that you don't want this model to do, and it can stop the model from doing that by running over the output and verifying whether it's acceptable or not. So the idea is to leverage some of that. And then you still need, obviously, governance and ways for people to come together and actually decide what it is that we don't want. For example, probably everybody agrees we don't want bioweapons, but maybe not everybody agrees on some other things. There needs to be some process. There, I would say we have a bunch of experiments, including using AI in the governance. So we have, effectively, AI senators that people can vote for, and then they go and actually make decisions. Because getting people to vote is hard, and getting people to go and deeply research a bunch of stuff to decide how to vote is even harder. And so you can actually delegate to AIs as well: you just vibe with them on what your opinion is, and they construct their platform from that. But you remove the principal-agent problem you have with other people. When you vote for people, they still have their own agenda, right? And so they may not do what you ask them to do.
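The verifiable-guardrail idea, attaching a set of prohibited behaviors to a model and checking every proposed action before the smart contract executes it, can be pictured as a simple gate. A minimal sketch with hypothetical rule names and a toy spending and action-type check standing in for whatever real policy classifier or formal verifier would run inside the verifiable compute.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class PolicyRule:
    name: str
    violates: Callable[[dict], bool]  # returns True if the proposed action breaks the rule

def guarded_execute(proposed_action: dict, rules: list, execute) -> str:
    """Run the rules attached to a model/agent over a proposed action; only hand
    the action to `execute` (e.g., the smart contract call) if every rule passes."""
    for rule in rules:
        if rule.violates(proposed_action):
            return f"blocked by rule: {rule.name}"
    execute(proposed_action)
    return "executed"

# Hypothetical rules attached alongside the model when it was uploaded.
rules = [
    PolicyRule("max_spend_per_action", lambda a: a.get("amount_usd", 0) > 1_000),
    PolicyRule("allowed_action_types", lambda a: a.get("type") not in {"trade", "transfer"}),
]

print(guarded_execute({"type": "trade", "amount_usd": 250}, rules, lambda a: None))    # executed
print(guarded_execute({"type": "trade", "amount_usd": 5_000}, rules, lambda a: None))  # blocked
```

The point of putting this gate inside verifiable compute is that users can check the rules were actually applied, rather than trusting the agent operator's word.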
That's kind of all the pieces: we're combining the blockchain and AI sides from different directions, really creating protocols, verifiable compute, and privacy, and you can use it in all those different ways.
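To make the earlier intent flow concrete, express an intent, collect offers, settle with the best one, and fall back to dispute resolution if something goes wrong, here is a minimal sketch. The field names and the cheapest-quote settlement rule are made up for illustration; NEAR Intents and AITP define their own message formats, which this does not reproduce.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Intent:
    """What the user (or their agent) wants, without naming who fulfills it."""
    intent_id: str
    description: str            # e.g. "ride to the airport by 6pm"
    max_price_usd: float
    constraints: dict = field(default_factory=dict)

@dataclass
class Quote:
    """A provider's (human, business, or AI agent) offer to fulfill an intent."""
    intent_id: str
    provider: str
    price_usd: float

def settle(intent: Intent, quotes: list) -> Optional[Quote]:
    """Pick the cheapest in-budget offer. In a real protocol the agreement, escrow,
    and any later dispute-resolution step would all be recorded on-chain."""
    eligible = [q for q in quotes
                if q.intent_id == intent.intent_id and q.price_usd <= intent.max_price_usd]
    return min(eligible, key=lambda q: q.price_usd) if eligible else None

ride = Intent("intent-1", "ride to the airport by 6pm", max_price_usd=60.0)
offers = [Quote("intent-1", "robotaxi-a", 48.0), Quote("intent-1", "robotaxi-b", 72.0)]
print(settle(ride, offers))  # Quote(intent_id='intent-1', provider='robotaxi-a', price_usd=48.0)
```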
Nathan Labenz: (1:26:39) One thing that definitely kind of makes me nervous is the idea of these autonomous agents that we can't take back. You know, on the one hand, you sort of want to have reliable follow-through on, you know, what they're supposed to do. You don't want other people to come in and tamper with them. I understand why there's some allure to it, but then I'm also like, geez, you know, if we have a hard enough time telling what one AI is going to do, what happens when we unleash 1 billion agents on the world and they're all autonomous and we can't take them back, and we've, by the way, like delegated governance to AI senators, you know, some might wonder, like, are we not creating the recipe here for the AIs really to just take over and run everything? And so, yeah, what is your positive vision for the future, and how do we make sure we don't, like, you know, take a premature off-ramp on the road to that vision?
Illia Polosukhin: (1:27:35) Yeah. So I think it's going to be a combination of things where AI becomes more of the economic force, right, and that kind of conduit and medium, and we, on our side, become more of the governance and the curators of that. So indeed, we're putting AI into governance, but we're also putting governance into AI. Formal verification is one part, but the other part is that if you have this autonomous agent and it wants to execute intents in the physical world, it actually needs to be in some jurisdiction. This intent is effectively a legal contract. It needs backing from an actual jurisdiction, and that jurisdiction has enforcement over this agent. So we actually have infrastructure for jurisdiction, for actual courts to enforce things on AI agents as well. It's all visible, it's all on-chain, it's all transparent, but you're effectively combining what we have traditionally, courts and jurisdictions and legal frameworks, with AIs who can actually interpret all that, reason over it, and make sense of it, as well as these protocols to facilitate it. So the way I see this, all of these things are going to keep fusing together and really become this new conduit for how our society operates. But indeed, I think there's going to be a lot of exciting innovation. And our role will also keep changing, because we're going to be more like participants in this network as well, but through AI as a lens. Right? Just as right now, through the lens of computing, we're already interfacing with each other a lot more than through the physical world much of the time. So I think that's going to continue increasing as well.
Nathan Labenz: (1:29:33) It's a fascinating set of layers of foundational technology that you've built, and the possibilities for it are really pretty dizzying. So I want to go ahead and propose a part two for this podcast. Maybe we can get back together before too long. And in the meantime, I don't know if you've ever read the book Liquid Reign, but it's my favorite take on the crypto-AI intersection. I'm not super well-read in science fiction, but it's a funny, and I think increasingly prescient-looking, positive view of what that future could look like. So I definitely recommend that. I know you've got to go. Put me on the calendar somewhere for part two. Any closing thoughts that you want to leave people with today?
Illia Polosukhin: (1:30:26) No. I think, I mean, again, this is a movement. We want people to join and contribute across the board. So if you're a developer, there's a ton of stuff to contribute and build. If you're, you know, planning to explore what AI looks like in your life, I think this is really an opportunity to engage in something that's really yours, where you can actually own part of it. Again, you can contribute in many ways, from being creators and receivers to tastemaking and many other aspects.
Nathan Labenz: (1:30:59) Sounds great. Illia Polosukhin, founder of NEAR. Thank you for being part of the Cognitive Revolution.
Illia Polosukhin: (1:31:05) Thanks for having me.

Nathan Labenz: (1:31:07) If you're finding value in the show, we'd appreciate it if you'd take a moment to share it with friends, post online, write a review on Apple Podcasts or Spotify, or just leave us a comment on YouTube. Of course, we always welcome your feedback, guest and topic suggestions, and sponsorship inquiries, either via our website, cognitiverevolution.ai, or by DMing me on your favorite social network. The Cognitive Revolution is part of the Turpentine Network, a network of podcasts where experts talk technology, business, economics, geopolitics, culture, and more, which is now part of a16z. We're produced by AI Podcasting. If you're looking for podcast production help for everything from the moment you stop recording to the moment your audience starts listening, check them out and see my endorsement at aipodcast.ing. And finally, I encourage you to take a moment to check out our new and improved show notes, which were created automatically by Notion's AI Meeting Notes. AI Meeting Notes captures every detail and breaks down complex concepts so no idea gets lost. And because AI Meeting Notes lives right in Notion, everything you capture, whether that's meetings, podcasts, interviews, or conversations, lives exactly where you plan, build, and get things done. No switching, no slowdown. Check out Notion's AI Meeting Notes if you want perfect notes that write themselves. And head to the link in our show notes to try Notion's AI Meeting Notes free for 30 days.