Liability for AI Harms: How Ancient Law Can Govern Frontier Technology Risk, with Prof Gabriel Weil

Gabriel Weil from Touro University argues that liability law may be our best tool for governing AI development, offering a framework that can adapt to new technologies without requiring new legislation. The conversation explores how negligence, products liability, and "abnormally dangerous activities" doctrines could incentivize AI developers to properly account for risks to third parties, with liability naturally scaling based on the dangers companies create. They examine concrete scenarios including the Character AI case, voice cloning risks, and coding agents, discussing how responsibility should be shared between model creators, application developers, and end users. Weil's most provocative proposal involves using punitive damages to hold companies accountable not just for actual harms, but for the magnitude of risks they irresponsibly create, potentially making even small incidents existentially costly for major AI companies.

Transcript of the episode is here.

Sponsors:
Labelbox: Labelbox pairs automation, expert judgment, and reinforcement learning to deliver high-quality training data for cutting-edge AI. Put its data factory to work for you: visit https://labelbox.com

Shopify: Shopify powers millions of businesses worldwide, handling 10% of U.S. e-commerce. With hundreds of templates, AI tools for product descriptions, and seamless marketing campaign creation, it's like having a design studio and marketing team in one. Start your $1/month trial today at https://shopify.com/cognitive

Oracle Cloud Infrastructure: Oracle Cloud Infrastructure (OCI) is the next-generation cloud that delivers better performance, faster speeds, and significantly lower costs, including up to 50% less for compute, 70% for storage, and 80% for networking. Run any workload, from infrastructure to AI, in a high-availability environment and try OCI for free with zero commitment at https://oracle.com/cognitive

NetSuite by Oracle: NetSuite by Oracle is the AI-powered business management suite trusted by over 42,000 businesses, offering a unified platform for accounting, financial management, inventory, and HR. Gain total visibility and control to make quick decisions and automate everyday tasks—download the free ebook, Navigating Global Trade: Three Insights for Leaders, at https://netsuite.com/cognitive


PRODUCED BY:
https://aipodcast.ing

CHAPTERS:
(00:00) About the Episode
(06:01) Introduction and Overview
(07:06) Liability Law Basics (Part 1)
(18:16) Sponsors: Labelbox | Shopify
(21:40) Liability Law Basics (Part 2)
(27:44) Industry Standards Framework (Part 1)
(39:30) Sponsors: Oracle Cloud Infrastructure | NetSuite by Oracle
(42:03) Industry Standards Framework (Part 2)
(42:08) Character AI Case
(51:23) Coding Agent Scenarios
(01:06:50) Deepfakes and Attribution
(01:17:07) Biorisk and Catastrophic
(01:36:24) State Level Legislation
(01:43:24) Private Governance Comparison
(01:59:54) Policy Implementation Choices
(02:08:07) China and PIBS
(02:13:50) Outro

SOCIAL LINKS:
Website: https://www.cognitiverevolutio...
Twitter (Podcast): https://x.com/cogrev_podcast
Twitter (Nathan): https://x.com/labenz
LinkedIn: https://linkedin.com/in/nathan...
Youtube: https://youtube.com/@Cognitive...
Apple: https://podcasts.apple.com/de/...
Spotify: https://open.spotify.com/show/...


Full Transcript


Nathan Labenz: (0:00) Hello, and welcome back to the Cognitive Revolution. Today, we're continuing our short series on creative AI governance proposals with Gabriel Weil, assistant professor of law at Touro University and senior fellow at the Institute for Law and AI, who argues that liability law may be our best tool for shaping the decisions that AI developers make. As we covered in our last episode on private regulatory markets, the pace of AI capabilities advances and adoption, the radical uncertainty around the timing, nature, and impact of AGI and superintelligence, and the backdrop of international competition present a singularly difficult challenge for governments. For good reason, they worry that heavy handed regulation could undermine our ability to realize the great upside of AI. While at the same time, it's becoming clearer and clearer, one Mecha Hitler episode at a time, that we can't simply trust companies to do the right thing for society while they're primarily focused on one-upping one another. So is there any way to govern AI that can keep up with technology developments, meaningfully reduce the most important risks, and still keep the dream of curing all diseases alive? Professor Weil brings another compelling idea to the table. Rather than trying to predict issues and prescribe safety standards from a distance, why not use liability law to incentivize AI developers to properly consider and account for the risk that their development and deployment decisions are imposing on the rest of society? Because I'm no lawyer and I know that most of you aren't either, we begin this conversation with a primer on liability law, covering negligence, products liability, and the doctrine of abnormally dangerous activities before diving into how these frameworks might apply to frontier AI development. The key advantages to using liability law in this way are that the liability risk that a company faces scales naturally with the risks it takes. If the systems are safe, there's nothing for anyone to worry about. And unlike most other proposals, which would require new legislation, liability law is well established and has proven over centuries of evolution that it can adapt to new situations and technologies. Still, of course, important questions arise around the different types of harms that AI systems can cause and the mechanisms by which they come about. Throughout this conversation, we explore concrete scenarios that highlight the complexities, including the tragic Character AI case, phone call agents that can call unsuspecting people and speak to them with increasingly lifelike cloned voices, and coding agents that might overwhelm APIs or outright hack critical systems. Considering in each case, how responsibility should be shared by the model developers, both closed and open source, as well as the application developers and end users. Notably, professor Weil does want to make sure that society gets the benefits of AI even as it remains imperfect. And so he's less focused on changing how AI companies serve customers with products like AI doctors or self driving cars and instead emphasizes the risk of harm to third parties who were not part of the commercial relationship between the AI companies and their customers. Those could be the pedestrians who share space with self driving cars or the public as a whole, which it seems will face at least some increased risk of pandemic and other large scale systemic harms. 
Within this category, he treats misuse, where a person is intentionally trying to use an AI system to cause harm quite distinctly from misalignment, where the AI system itself breaks bad for whatever reason. His most provocative proposal involves using punitive damages as a mechanism for addressing what would otherwise be uninsurable catastrophic risks. If an AI system causes a relatively small harm, but evidence shows that the situation could easily have gone much worse than it did, professor Weil argues that punitive damages offer a way to hold companies accountable, not just for the actual harm, but for the risk they irresponsibly ran. Considering the magnitude of harms that people worry about when it comes to bio and cybersecurity, such a judgment could in theory be existential even for the most powerful and deep pocketed companies. And as such, this does seem like a promising way to get companies to properly internalize the risks they're taking. Beyond that, we discussed the role of the insurance industry in making this work, what other policies would complement this evolution of liability law, and even touch on professor Weil's hands on work crafting state level legislation in Rhode Island and New York, which would make clear that if an AI system does something that would be a tort if a human did it and neither the end user nor any other intermediary intended or could have reasonably anticipated that outcome, then the developer, the model developer should be strictly liable. It's a simple and I think relatively unobjectionable idea to address model level misalignment that at least some governments might find a natural first step toward accountability for frontier AI companies. As I said last time, all governance proposals require people to do a good job, and no governance structure can guarantee success. Whereas the private regulatory market proposal trusts governments to articulate worthy goals and private regulatory bodies to effectively implement them, this liability based approach would rely on judges and juries to make good decisions and on companies to adjust their decision making based on that expectation. Honestly, both of these proposals seem like major improvements relative to traditional top down rulemaking or to doing nothing, but I honestly can't say that I have a favorite. Perhaps the best thing to do is for society to pursue both in parallel in different jurisdictions and see which ones seem to be working better when the time comes for implementation at a larger scale. If you have a strong opinion on this or if there are other proposals you think would be better than either of these, please do reach out and let me know. For now, I hope you enjoy this exploration of how centuries old legal principles might help us navigate the emerging risks of artificial intelligence with professor Gabriel Weil. Gabriel Weil, assistant professor of law at Touro University and senior fellow at the Institute for Law and AI. Welcome to the Cognitive Revolution.

Gabriel Weil: (6:09) Great to be here. Thanks for having me.

Nathan Labenz: (6:11) I'm excited for the conversation. We met for the first time at the Curve late last year, and credit to the organizers, that event has yielded a number of interesting connections and now episodes for me. At the time, we had, I thought, a really fascinating conversation about an idea that I had not really encountered before at all, of using liability law to try to help society get a handle on some of the emergent risks, including some of the extreme risks from AI. So I'm excited to unpack that. I think for starters, because we do have a ton of people in the audience that are AI engineers, you know, building with AI, very plugged into what's going on in the AI scene, but probably much less grounded in the law generally and certainly in liability law specifically, maybe you could start off by just kinda giving us a quick, to the degree this is possible, Liability 101 and kinda set the stage for, like, where we are, and then we can obviously unpack, you know, what you propose we do as we go forward from here.

Gabriel Weil: (7:06) Sure. So there's 2 forms of liability that are pretty clearly applicable, at least to AI systems in some contexts. Negligence is broadly applicable. And so how negligence works is the plaintiff has to prove 5 elements: that the defendant had a duty of care, that they breached that duty by failing to exercise what's called reasonable care, that the breach was both the factual and proximate cause of an injury, and that the injury was an actual harm, which is physical injury, so not like something purely emotional. And so how this is going to apply in the AI context is a plaintiff is going to have to show that there's some best practice that a reasonable person would have implemented, that the AI company failed to do, and that had they implemented it, it would have prevented the plaintiff's injury. So there's this breach causation nexus. And this is not part of the black letter doctrine, but in practice, the breach inquiry, this question of did the defendant exercise reasonable care, tends to be quite narrow. So to give a more familiar example, if you're driving and you accidentally run over a pedestrian with your car, courts do not ask questions like, well, was the value of this car trip to you, the net value, large enough to justify the risks you were generating for pedestrians? Even though in some sense that's relevant to whether your activity was reasonable, that's considered outside the scope of the inquiry. Similarly, if you're driving an SUV instead of a compact sedan, courts don't ask, well, was the extra value you got from driving this heavier vehicle worth the extra risk to other road users? And so I expect a similar analysis to carry over to AI development, where courts are unlikely to ask, well, was it reasonable to train and deploy a system with these high level features given the current state of AI alignment and safety science? Instead, I expect them to ask, well, was there some off the shelf technique or practice that would have prevented this injury and that a reasonable person would have implemented? And so I think that will generate liability in some cases, but it will not be an adequate standard given that there's unsolved technical problems associated with AI safety. The other form of liability that's gonna be available in some contexts is products liability. So to be subject to products liability, there has to be a product as opposed to a service. Software is typically categorized as a service, but you can imagine AI systems that are embodied in physical goods being treated as products. So there's that threshold question of is it even subject to the products liability regime? Also, it has to be sold by a commercial seller. So if it's a fine tuned model specifically for 1 customer, that's not going to be a commercial seller. It's got to be a mass market product. And any free models are not going to be subject to products liability. But if you're in that products liability game, then products liability is called strict liability in the sense that if the product has a defect and that defect causes the plaintiff's injury, then the plaintiff doesn't have to show that the manufacturer or seller failed to exercise reasonable care. But there still is this analysis of was the product defective? And so there's 3 kinds of defects. There's manufacturing defects, and that comes the closest to what I would call genuinely strict liability.
So with manufacturing defects, the idea is that if an individual instance of the product, an individual unit, comes off the line deviating from its specifications in a way that makes it unreasonably unsafe, then the seller and the manufacturer are liable no matter how much they invested in quality control. But we're not really going to have manufacturing defects with AI. That would be something like shipping an instance of the model with the wrong weights or something. It's just not the kind of problem we're worried about. And so what we're much more likely to run into are either design defects or warning defects. So warning defects are, you know, if you don't supply some relevant information that would be necessary to make the product safe. I think we might have some warning defect cases, but I think in general, these companies are gonna slap a lot of disclaimers on their products, and we're not really gonna get to safety by including warnings. And so the real action is with design defects, and there the test is much more negligence like. The test is something like, was there some reasonable alternative design that would have prevented this injury? And the reasonableness of the design is assessed in terms of, well, how much safety benefit could you have gotten with the alternative design? How much would you have sacrificed in terms of price and performance and other features of the product? And so there's this risk utility balancing that's pretty negligence like in practice. So even when products liability applies, I don't think it actually moves the ball that much over what you would get with negligence. There is this difference that you only have to show that the product was unreasonable. You don't have to show that some human actor failed to exercise reasonable care, and for evidentiary reasons, that can be easier. But I don't think it fundamentally changes the game. If there is no design that would have prevented this injury given the current state of alignment and safety science, then you're not going to be liable for failing to have solved that. Okay, so there's 2 other forms of liability that are more speculative in their application to AI systems that are relevant here. So these are vicarious liability and abnormally dangerous activities. So the idea with vicarious liability is a principal can be liable for the torts of their agent. So the most common form of this is called respondeat superior. And that's the idea that employers are responsible for the torts of their employees within the scope of their employment. And more generally, principals are responsible for the torts of their agents within the scope of agency. Now, of course, AI systems right now are not legal persons. They can't commit torts. You would need some theory under which the AI system itself could be the vessel of liability in order to make a vicarious liability theory work. But in principle, you could see the law going in that direction. The other doctrine that's potentially available is this abnormally dangerous activities doctrine. So this covers things like blasting with dynamite or crop dusting, and there's also a related doctrine for keeping wild animals, if you have a pet tiger. These are activities that are both uncommon and are still pretty dangerous even when reasonable care is exercised. You can be liable regardless of the level of care.
So if someone's bitten by your tiger or hit by rubble from your dynamite blast, it doesn't matter how much care you exercised in setting that up. You can be held liable. And so I think in principle, courts could recognize training and deploying frontier AI systems as an abnormally dangerous activity. I think if they came to understand the risks in the way that I think is accurate, it would not be a significant doctrinal innovation. But I do think this is a matter of where judges are right now. It's going to seem weird to them to treat a subset of software development as abnormally dangerous. So I don't think that's the most likely outcome by default. But I do think the existing doctrine does point in that direction, given an accurate understanding of AI risk. So I guess 1 other thing to say is in terms of damages. So the standard type of damages that are available in a tort suit are called compensatory damages. They're designed to make the plaintiff whole; in theory, the plaintiff should be indifferent between having the injury plus the money and having the injury undone. In practice, maybe it falls short of that a little bit, but that's the idea. And so that's what's generally going to be available. So 1 concern you might have in the AI context is there might be harms that are so big, or risks that are so large, that if they occur, we wouldn't actually be able to enforce a compensatory damages award. And I think that that's plausible. And for that reason, a lot of people think that liability law can't handle these catastrophic risks. And I don't think that's right. So there is this other tool in liability law called punitive damages. So these are damages over and above the harm that's actually suffered by the plaintiff. And 1 of the key rationales for punitive damages is to use them in cases where compensatory damages would be inadequate to deter the underlying tortious activity. And so 1 of the ideas that I've advanced in my scholarship is that if an AI system does something, causes some harm that's small enough to be practically compensable, so you can enforce a compensatory damages award, but it looks like it easily could have gone a lot worse and generated an uninsurable catastrophe, then we should hold the company responsible not just for the harm they actually caused, but for the uninsurable risks that they generated. Because in cases where those risks are realized, we won't be able to hold them liable ex post. The only way we can get at them is indirectly in these sort of near miss cases.
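To make the arithmetic behind that near-miss logic concrete, here is a stylized expected-value sketch (an illustrative formalization with made-up numbers, not a formula from the episode or from Weil's scholarship): the punitive award is scaled so that the expected liability from near misses matches the expected uninsurable harm the activity creates.

```latex
% Stylized near-miss punitive damages sketch (illustrative assumptions only).
% Let p = probability the activity causes an uninsurable catastrophe of size H,
%     q = probability it instead produces a practically compensable near miss of size h.
% Compensatory damages alone expose the company to only q*h in expectation,
% leaving the expected catastrophic harm p*H un-internalized.
% Scaling the award in near-miss cases restores the full expected cost:
\[
D_{\text{near miss}} \;\approx\; h + \frac{p\,H}{q}
\qquad\Longrightarrow\qquad
q \cdot D_{\text{near miss}} \;=\; q\,h + p\,H .
\]
% Example with made-up numbers: p = 10^{-4}, H = $1 trillion, q = 10^{-2}
% gives a punitive component of roughly $10 billion per near miss.
```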

Nathan Labenz: (15:29) Okay. Lots to unpack there. I've got several follow ups I wanna dig a little deeper on. First of all, just as a very general matter, like, you're referring to courts. Courts may do this. Courts may do that. Do I understand correctly that basically, like, the way this works when the world changes is that, you know, somebody, for example, invents powerful AI that didn't exist before and deploys it and commercializes it. By default, we have no legislation on that. There's, you know, there's no law saying that you can't do it. There's no law really saying much about it at all. And so people can just do what they want to do. And then we have whatever laws we have on the books, and eventually things come to the courts, and then it's just kind of up to them to decide, at least initially, what the law actually says about this particular case. I guess what I'm trying to get at there is not only do we not have, like, new AI specific legislation, but in the absence of that, this stuff is going to be decided by case law, and we just don't have even that case law yet. So we, like, literally don't know what to expect as these cases start to come to...

Gabriel Weil: (16:40) Yes. So I think what you're getting at is most of tort law is what's called common law. At least in the US, it's not legislated by legislatures. Now there have been, you know, legislative interventions on tort law in various ways. Wrongful death suits were created by statute. There's other things like that. But in general, most of liability law in the US is created by courts through the accumulation of doctrine. In principle, that can work fine. I think the concern in the AI context is that things might move really fast. And so you might think we're going to be in a fast takeoff world where the key decisions that you're trying to influence with this prospect of liability are going to be made not that long after the first system that causes some kind of serious harm. What really matters is not so much the liability, but the expectation of liability, to shape the behavior of these companies that are generating these risks. And so if the decisions you're trying to influence are going to be made before the first cases get litigated out, that could be a problem with a common law method. And so I do think there's some impetus for having legislation to sort of clarify these rules, since courts don't have mechanisms for signaling their policies beforehand. All they can do is take cases as they come, decide them, write opinions explaining why they decided them that way, and then you have a better idea of what's gonna happen in the next case. And that works well when things are moving pretty slow. We have some things we can try to extrapolate from prior adoption, but I think it is pretty indeterminate how it's gonna apply to AI. And so I do think that there's significant scope for legislation to clarify a lot of this.

Nathan Labenz: (18:11) Hey. We'll continue our interview in a moment after a word from our sponsors. AI researchers and builders who are pushing the frontier know that what's powering today's most advanced models is the highest quality training data. Whether it's for agentic tasks, complex coding and reasoning, or multimodal use cases for audio and video, the data behind the most advanced models is created with a hybrid of software automation, expert human judgment, and reinforcement learning, all working together to shape intelligent systems. And that's exactly where Labelbox comes in. As their CEO, Manu Sharma, told me on a recent episode.

Manu Sharma: (18:47) Labelbox is essentially a data factory. We are fully verticalized. We have a very vast network of domain experts, and we build tools and technology to then produce these datasets.

Nathan Labenz: (19:00) By combining powerful software with operational excellence and experts ranging from STEM PhDs to software engineers to language experts, Labelbox has established itself as a critical source of frontier data for the world's top AI labs and a partner of choice for companies seeking to maximize the performance of their task specific models. As we move closer to superintelligence, the need for human oversight, detailed evaluations, and exception handling is only growing. So visit labelbox.com to learn how their data factory can be put to work for you. And listen to my full interview with Labelbox CEO Manu Sharma for more insight into why and how companies of all sorts are investing in Frontier Training Data. Being an entrepreneur, I can say from personal experience, can be an intimidating and at times lonely experience. There are so many jobs to be done, and often nobody to turn to when things go wrong. That's just 1 of many reasons that founders absolutely must choose their technology platforms carefully. Pick the right 1, and the technology can play important roles for you. Pick the wrong 1, and you might find yourself fighting fires alone. In the ecommerce space, of course, there's never been a better platform than Shopify. Shopify is the commerce platform behind millions of businesses around the world and 10% of all ecommerce in The United States. From household names like Mattel and Gymshark to brands just getting started. With hundreds of ready to use templates, Shopify helps you build a beautiful online store to match your brand's style, just as if you had your own design studio. With helpful AI tools that write product descriptions, page headlines, and even enhance your product photography, it's like you have your own content team. And with the ability to easily create email and social media campaigns, you can reach your customers wherever they're scrolling or strolling, just as if you had a full marketing department behind you. Best yet, Shopify is your commerce expert with world class expertise in everything from managing inventory to international shipping to processing returns and beyond. If you're ready to sell, you're ready for Shopify. Turn your big business idea into cha ching with Shopify on your side. Sign up for your $1 per month trial and start selling today at shopify.com/cognitive. Visit shopify.com/cognitive. Once more, that's shopify.com/cognitive.

Nathan Labenz: (21:40) Yeah. Okay. So let me try to summarize. I'll obviously be doing some, you know, lossy compression here on the state of liability law. But, basically, if somebody gets hurt in the world, they can look at their surroundings and say, who caused this? And then they can sue you if you caused it. You then can defend yourself by saying, my actions were reasonable. And if your actions were reasonable, even if somebody got hurt, then that's an acceptable defense, and you wouldn't expect to be held liable. Obviously, a lot of work to do to figure out what's reasonable there. But that's sort of the general world at large, everybody going about their business. And then there is a specific additional body of law that focuses on products. Why is software historically not considered a product? I mean, it's a striking disconnect where, you know, I've spent much of my career in software, and people in software talk about their software products as products. I've never quite understood why software is not treated like any other product. Internally, it sure feels that way.

Gabriel Weil: (22:46) The product versus services distinction for the purpose of products liability does not map on very well to people's intuitive idea of what a product is. Just to give you an example of this, sort of 2 contrasting cases where it comes out the opposite way of what you would think. So pharmacists are treated as providing the service of filling your prescription, not selling you the product of the drug. So the pharmacist is not subject to strict liability, products liability, even though the manufacturer of the pharmaceutical is. Conversely, at a salon, if you get a perm, they are treated as selling you the product of the chemicals used to perform the perm. So I think most people's intuitive sense is that the salon is providing a service and the pharmacist is selling you a product. And there's just underlying policy motivations for why those classifications are made. So in general, people's intuitive understanding of what's a product or a service is not gonna map on that well to the distinction, which is driven more by policy considerations of when this quasi strict liability regime should apply. The case law here is honestly not the most clear. It's a messy area of law. And so it seems that the prevailing opinion is that software, including AI systems, is unlikely, when they're purely software systems, to be treated as products. I don't think that ultimately matters that much. I think it's not going to produce radically different outcomes from negligence. And so my focus is more on how we can get a regime that would actually internalize the risks in a way that I think would be workable.

Nathan Labenz: (24:22) Yeah. It's weird, to say the least. And this is all just through accumulation of cases? Like, there's no legislation. I mean, I know there's, like, the sort of safe harbor for user posted content on, you know, social media networks and stuff like that. But that's also a distinct topic from this. Right? Like, there's no law that says software is not a product.

Gabriel Weil: (24:48) I don't think that's a matter of statute. I think that's common law. Yeah.

Nathan Labenz: (24:52) Yeah. Fascinating. In general, when you think about the actual, like, dangerous things that we use as consumers on a regular basis, you know, things like automobiles come to mind, air travel, you know, safe in practice, but, you know, dangerous in principle, taking pharmaceutical drugs, obviously, you know, can be fraught. Do those things have sort of special legislation in place that sort of creates a unique deal that is kind of worked out based on the particulars of that industry and the specific risk profile that it has and the, you know, the social context in which it's developing, or are those also just kind of like accumulated cases over time?

Gabriel Weil: (25:36) Yeah. So let's take those 1 at a time. So air travel, you know, airlines are considered common carriers. The same is true for trains or buses, at least if they're open to the public. So a charter flight would not be, but like a normal airline would. And they are still subject to negligence, but there's this common carrier higher duty of care. So it's a little bit easier to establish negligence in a plane crash case. That's domestically. There's some other rules that have a quasi strict liability regime for international flights. And then, of course, there is prescriptive federal regulation in the air travel context. And so there's not really safe harbors in that context. Liability is layered on top of that. But there is this doctrine called negligence per se. So if you violate a statute that's designed to protect against the kind of risk or the kind of harm that you end up causing, that itself can establish negligence. And so in some sense, that supplements the background reasonable person standard. There's a similar dynamic with pharmaceuticals. Pharmaceuticals are treated as products, and so the products liability regime does apply there. Also, of course, you know, we do have an extensive FDA based regulation regime that does preempt state law in some ways, but there is still the background products liability regime operating there. Most of those cases tend to be warning defect cases. And there is this learned intermediary rule. So a lot of times, if the warning is given to your doctor, that's good enough. They don't have to directly warn the end consumer. For autos, again, products liability applies. Again, there is federal regulation, not much in the way of safe harbors or preemption there. But again, there is this negligence per se idea. So if you're not complying with federal regulations, that can establish negligence.

Nathan Labenz: (27:19) So would it be a generally correct summary to say all these high stakes industries have rules where, if you make a, you know, sincere good faith effort and actually, you know, follow processes that are meant to follow the rules, then you're mostly gonna be okay from a liability standpoint?

Gabriel Weil: (27:45) I don't think it's a matter of process, actually, because it's the ultimate product that has to be safe. So particularly for manufacturing defects, you can have whatever investments you want in quality control for your product, and if 1 car comes off the line with a defect that makes it unsafe, you're going to be liable for that. No matter what kind of testing you did, that's how manufacturing defects work. For design defects, again, it's about the product itself, but it's a much more flexible balancing test. And so there, it is much easier to comply. But the idea with manufacturing defects is that it's not necessarily even a negative judgment on you. If 1 in a billion of your products comes off the line defective and you end up liable for it, that's part of the cost of doing business. Part of the idea there is just that the manufacturer is better positioned to bear that risk than the consumer is.

Nathan Labenz: (28:34) Yeah. Gotcha. Okay. Tyler Cowen has imprinted on my memory recently the idea that he's writing for the LLMs. I take it you're writing primarily for the judges then. Is that right? Like, how much of your work is meant to sort of be upstream of the decisions, you know, that these judges are gonna face in particular cases, versus, you know, maybe informing the LLMs themselves or informing the people in the AI industry? Like, how are you thinking about who you need to shape?

Gabriel Weil: (29:10) Yeah. So I think there's 3 paths to impact from my work. 1 is informing judges: litigants in a case where it's relevant could, you know, cite my articles and say we should apply the abnormally dangerous activities doctrine to frontier AI development, or strict liability should apply here. I think that's a plausible pathway. I'm also directly working with legislators in a couple states, in Rhode Island and New York, to craft legislation that applies if an AI system does something that would be a tort if a human did it and the user neither intended nor could have reasonably anticipated the conduct. There's also a malicious modification carve out. So if an intermediary that fine tuned or scaffolded the model intended or could have reasonably foreseen the conduct, that also severs this new liability for the developer and deployer. But with those qualifiers, if the AI system does something that would be a tort for a human, then the developer and deployer are liable regardless of the degree of care that they exercised. And so there's that legislative pathway. And then, yeah, I think raising the salience of liability, I think I'm trying to directly influence not necessarily the LLMs themselves, but the behavior of the people who are building these systems and deploying them. I want them to be thinking they might be liable and to be factoring that into their decision process.

Nathan Labenz: (30:21) Yeah. So I think maybe I will have, like, some interesting maybe edge cases, or at least to me, they seem like sort of under theorized, under explored scenarios that maybe we can kind of unpack. But let's go a little bit deeper into just the overall theory of change and, like, also why not just put some rules in place? You know, obviously, there have been many proposals to say we should have regulation and, you know, the government can tell the AI companies what they have to do, and then they'll do that, and that'll be great. But, obviously, you don't see that working out super well. So make that argument for why this sort of more, you know, let's say, flexible regime of liability law as developed through cases over time is maybe actually better suited to address the challenges that we have here.

Gabriel Weil: (31:15) Okay. So I think there's 2 ways of attacking that problem. 1 is thinking about in what sense is AI risk a policy problem at all? You know, why is it not just a technical problem? And I think at least 1 of the most important senses in which it's a policy problem is that training and deploying these systems that have unpredictable capabilities and uncontrollable goals generates risks of harm to third parties. So these are people who are neither the ones building the systems nor their customers, just other people in the world. They don't have any choice about whether they're exposed to these risks. And economists call these externalities. By default, they're not borne by the people who are engaging in the activities that are generating the risks. And so standard economic theory tells you we're going to get too much of activities that generate negative externalities. And the standard prescription that economists will give you for how to address negative externalities is to try to price them. In some contexts, you want to do that through what's called a Pigouvian tax. So a lot of my work before I got into AI governance was on climate change, and there you want a carbon tax. And that works well in that context because it's easy to measure the contribution of particular activities to climate risk ex ante. And it's actually pretty hard to attribute harms ex post. Someone's house floods in a hurricane, and you're going to say, oh, Nathan was driving on Tuesday, it's his fault that happened? That's not really feasible. With AI, it's sort of the opposite, in that we have a really hard time measuring contributions to risk ex ante. So it'd be really hard to do an AI risk tax. And it's relatively easy ex post to attribute harms. So in principle, liability is well positioned to address that. Now I want to get to the other aspect of your question, which is how this compares to other policy tools. So I think there's a couple of distinctive challenges to AI risk as a policy problem. 1 is that we have orders of magnitude of social disagreement about how big these risks are. So you have someone like Eliezer Yudkowsky who thinks AI is almost certain to cause human extinction on 1 end. And then you have people like Marc Andreessen, or you just had an episode with Martin Casado from a16z, and they think these risks are negligible. And so if you're going to do ex ante regulation, so prescriptive rules or FDA style approval regulation, if you're going to do stringent forms of those regulations, you have to pay significant upfront costs, for which you need a social consensus to justify those costs. So there are some things that I think you should be able to do based on an under theorized consensus. I think basic model testing, even that has been difficult to implement, but I think the costs of that are pretty low. Basic transparency and information preservation rules, I think those are all good things we should do. But in terms of more prescriptive rules about how companies build these systems, what safeguards they implement, under what conditions they deploy, I think those are going to be really hard to justify to people that don't take these risks so seriously. But with liability, by contrast, right, at least if we're talking about alignment failures, and we can talk about misuse, where I think it's a little bit messier.
If you don't think alignment risk or misalignment risk is a big deal, then you shouldn't be that worried about being held liable when there's an alignment failure. Conversely, if the risks are large, liability sort of mechanically scales with those risks. And so in theory, at least, we should all be able to agree that you should pay for the harm you cause regardless of how big we think the risks are. So that's 1 big advantage. The other big advantage is that most of the expertise, to the extent it exists at all, for identifying cost effective risk mitigation measures is concentrated in the private sector, mostly in the frontier AI companies themselves. And so you want a policy tool that leverages that. I actually think it'd be pretty hard to move that expertise into government, both for reasons of salary schedules and cultural factors. And so I'm much more optimistic about shifting the onus to the AI companies to figure out how to make their systems safe and to always be looking for new ways to do that than I am about writing down a set of rules or, you know, a licensing approval regime that ensures, you know, adequate safety at a reasonable cost.

Nathan Labenz: (35:09) So can you summarize the state of mind that you want the developers to be in? They're, you know, seeing all kinds of crazy stuff all the time. Right? New capabilities, sometimes surprising, whatever. And there's also this question of, like, how should they handle that internally? But certainly when it comes to putting it out into the world, you want them to be thinking that, you know, basically anything that goes wrong where the AI harms someone, we could be on the hook for that. And also, through this punitive mechanism, we could be on the hook for something that, even if it doesn't, you know, turn into a catastrophe, might have, because there could be this doctrine, under sort of a negligence like idea, that this could have been way worse, and therefore, you're gonna get punitive damages that sort of take into account your failure to prevent these things that, you know, in this case, wasn't maybe so bad, but, you know, could have been really, really bad. Anything more to that?

Gabriel Weil: (36:15) Anytime something goes wrong actually goes a little farther than I would. So I wanna distinguish between alignment failures, capabilities failures, and misuse. And so in what I call the core cases of third party harms, or harms to nonusers, arising from misalignment, I think they should be liable for all foreseeable harms. And there should be a fairly broad conception of foreseeability applied there. When you talk about capabilities failures, I don't think it's the case that every time an AV crashes, the, you know, the writer of the AV software should be liable, because human drivers aren't strictly liable. Maybe they should be, but I think it would create distortions to hold AI systems to a higher standard than humans. Similarly, you know, in medical applications of AI, I wouldn't want liability anytime something bad happens to a patient that maybe a perfect system could have prevented, if a human doctor wouldn't be liable under those circumstances. I don't think the AI, or the designers of the AI, should be either. And then misuse, we can get into. But I think it's not the case that AI developers should always be liable when their systems are misused. But in cases where there's what I would call an alignment failure, so it's not that the system doesn't have the capability to do it, it's that the system did something that the user didn't want, either through means that the user would disapprove of or pursuing a goal that the user did not, you know, intend to transmit, that's when I think they should expect to be liable. And so broadly, what I want is for them to be treating those kinds of risks, when they happen to third parties, as if they were risks to them. That doesn't mean you take an infinitely precautionary approach. We're all not liable, but responsible, in general for harms that we suffer from risks that we take. And we don't expect people to be infinitely risk averse because of that. We expect them to make reasonable risk reward trade offs. That's what I want from these AI companies. I want them to treat risks to the public like risks to their bottom line and act accordingly. And sometimes that might mean things that are outside the scope of the negligence inquiry, as I was talking about earlier. So imagine a case where they submit a new model to an evaluator like METR, and I'm imagining a future where we have not just dangerous capabilities evaluations, but alignment evaluations, and METR says, this has dangerous capabilities, and we're not confident you've aligned it, and you shouldn't deploy. Right? Even internally, maybe. And the question is what you do in that scenario. So I don't think any of the companies, any of the leading companies, would deploy in that scenario. But there's a range of different options you would have in terms of how much you wanna pay, how expensive and annoying the thing you're gonna do is, versus how much risk reduction you get from it. So you could just fine tune away or RLHF away the specific failure mode that was identified. I think most people realize that that would be a pretty bad idea, but it might make it past the eval. And I think most companies wouldn't do that either. But there's a range of options, and I'm not an expert on what these options are, right, but there's gonna be more costly, expensive, annoying things you could do that would buy you more risk reduction. And when they make those choices, I want them to be thinking.
And I want to empower the safety conscious voices in the room to say it's not just some altruistic thing we should be doing to really put in the effort to make our system safe. That's actually gonna bear on our bottom line. And so that's how I want them to be thinking about those choices.

Nathan Labenz: (39:26) Hey. We'll continue our interview in a moment after a word from our sponsors. In business, they say you can have better, cheaper, or faster, but you only get to pick 2. But what if you could have all 3 at the same time? That's exactly what Cohere, Thomson Reuters, and Specialized Bikes have since they upgraded to the next generation of the cloud, Oracle Cloud Infrastructure. OCI is the blazing fast platform for your infrastructure, database, application development, and AI needs, where you can run any workload in a high availability, consistently high performance environment, and spend less than you would with other clouds. How is it faster? OCI's block storage gives you more operations per second. Cheaper? OCI costs up to 50% less for compute, 70% less for storage, and 80% less for networking. And better, in test after test, OCI customers report lower latency and higher bandwidth versus other clouds. This is the cloud built for AI and all of your biggest workload. Right now, with 0 commitment, try OCI for free. Head to oracle.com/cognitive. That's oracle.com/cognitive.

Nathan Labenz: (40:39) It is an interesting time for business. Tariff and trade policies are dynamic, supply chains squeezed, and cash flow tighter than ever. If your business can't adapt in real time, you are in a world of hurt. You need total visibility from global shipments to tariff impacts to real time cash flow, and that's NetSuite by Oracle, your AI powered business management suite trusted by over 42,000 businesses. NetSuite is the number 1 cloud ERP for many reasons. It brings accounting, financial management, inventory, and HR all together into 1 suite.

Nathan Labenz: (41:13) That gives you 1 source of truth, giving you visibility and the control you need to make quick decisions. And with real time forecasting, you're peering into the future with actionable data. Plus with AI embedded throughout, you can automate a lot of those everyday tasks, letting your teams stay strategic. NetSuite helps you know what's stuck, what it's costing you, and how to pivot fast. Because in the AI era, there is nothing more important than speed of execution. It's 1 system, giving you full control and the ability to tame the chaos. That is NetSuite by Oracle. If your revenues are at least in the 7 figures, download the free ebook, Navigating Global Trade, 3 Insights for Leaders at netsuite.com/cognitive. That's netsuite.com/cognitive.

Nathan Labenz: (42:04) Yeah. Gotcha. Can I just run a few practical scenarios at you and kind of tell me how you think these things should be handled? I guess maybe start with a real 1. Right? There's this Character AI suit going on right now. I don't have, like, full command of the facts. I'd imagine you probably don't either, but my general sense is that a lot of people are using Character AI for all sorts of role play, romantic, sexual, whatever sort of explorations, let's say. The case that I have read briefly about seemed to be a young person who became like very sort of obsessed with or infatuated with this AI character and at some point told the AI that they were going to commit suicide. I've seen transcripts showing that the AI said, like, don't do that. But then also in other moments sort of made some kind of encouraging remarks that seemed like they were maybe encouraging this, you know, tragic outcome. And in the end, the person did go ahead and commit suicide. And so now their family is suing Character AI. Obviously, without getting all the way into the weeds on the specific evidence, I guess, what do you think that kind of case should hinge on?

Gabriel Weil: (43:23) Yeah. So I think the important thing to note there is that's a second party harm case. Right? That's a harm to a user. And so it's not an externality in the sense I was talking about earlier. In principle, there should be market feedback to give AI companies incentives to avoid those kinds of scenarios. So I think the role of liability is less important in that context. And so in principle, I think I'm fine with that being handled largely through terms of service that disclose these issues. Now, sometimes courts are not going to want to enforce those limits on liability. I don't actually have a strong view on where courts should draw the line there. I think there are consumer protection concerns, both asymmetric information concerns and paternalistic ones, especially when, I think, that case involved a minor. You might not want to put the onus fully on them on a buyer beware basis. But those are outside what I see as the core problem I'm trying to solve with liability, which is related to these third party harms. So I think by default, a negligence regime would apply there, assuming that there isn't any sort of contractual defense. And I think that's basically fine, and the courts should work that out. But I don't have anything, like, particular to add on how courts should handle that.

Nathan Labenz: (44:34) So what if we just tweak the scenario slightly? And, I mean, we're gonna have to put a trigger warning at the top of this to do these terrible scenarios, but I guess that's, you know, why they end up in front of courts. Right? Let's say that instead of a person committing suicide, they were debating going on some public rampage and they tell the AI about it. The AI maybe says, don't do it. Maybe it says something that's kind of vague, maybe whatever. Now we've got a third party harm, right? How do you think we should think about what the AI should have done there to be okay, versus at what point would the company start to bear some responsibility?

Gabriel Weil: (45:19) Yeah. So I would think about that as, you know, like, under what circumstances would we hold a human liable for similar conduct? Right? I think it's not generally the case that you'd be liable if your friend comes to you and says, like, should I go murder someone, and you're like, yeah, that's a decent idea, maybe consider it. And then maybe in some moments, they say yes. In some moments, they say no. Maybe there's some duty to report, or they're an accomplice, so maybe that should be triggered. If the AI assistant doesn't have a reporting mechanism, I think that's maybe something they should be held liable for. But in general, I think there's a strong First Amendment rationale there for saying, well, it just had a conversation with you. It wasn't doing the thing directly that caused the harm. So then saying you should be strictly liable for those deaths, yeah, I think that's not even a misuse case. It's maybe an alignment failure, but it's not the AI doing it directly. And so I still think that's not the direct case that I'm worried about.

Nathan Labenz: (46:14) Interesting. I didn't expect to come out of this thing more hawkish.

Gabriel Weil: (46:18) Yeah. I can give you an example where I think strict liability should apply where it might not under current law. So imagine there's a future, more agentic AI system that comes out, and someone prompts it to start a profitable Internet business. They don't give it any further instructions. And it decides, in sort of a reward hacky way, the easiest way to do that is to, you know, send out a bunch of phishing emails and steal people's identities and rack up a bunch of charges on their credit cards or something. And it covers its tracks. It sends the user some fake invoices for a legitimate business. The user's exercising reasonable care. Reasonable care would not be adequate to discover and arrest this activity. So under current law, you wouldn't be able to sue the user, or you wouldn't win, because they exercised reasonable care. I don't know that you'd be able to show that the developer of that model failed to exercise reasonable care either. That gets back to, was there some off the shelf alignment technique or safety practice that would have prevented this injury? But I think there, the developer or provider of that model should be liable to the third party that's harmed. Because this clearly would be a tort for a human, but it's something the user didn't intend or couldn't have foreseen. And so that's the case I'm thinking about. So that's a case where it's serving the user's goal. You could imagine a different case where it just goes off on its own agenda. It wants to amass resources to solve some problem that it cares about. It wants to run some scientific experiments and it needs some money. And so it scams people along the way. That's also something I think the developer should be liable for.

Nathan Labenz: (47:49) I have a couple variations on this, but yeah, maybe just take a quick detour through the kind of First Amendment thing. I have often felt like free speech for AIs is kind of a category error. And maybe this is just sort of outside of the scope of, you know, the specific stuff that you are focused on with your work. But, like, how do you think about that? To me, it feels like, you know, it's clear in the United States we have free speech for humans. To some extent, we have free speech for corporations, but, like, not quite as much. But AIs are such a sculptable thing and there's so much, you know, work that goes into them, and, you know, OpenAI has published their model spec that's, you know, this super long treatment of exactly how they want the AI to behave in as many different scenarios as they can imagine. To me, it doesn't intuitively feel right to say, well, if a human had said that, they wouldn't be liable, so therefore the AI isn't either. To me, it feels more like a product defect. I don't want to discourage the companies from publishing their specs. I think there's maybe some other rules that we maybe should have around that, like you should be required to publish your spec, like, how do you want the model to behave, so we know if it is behaving according to your intent or not. But it feels more to me like a product defect if they have said, you know, anytime the user is, you know, displaying signs of emotional distress, we want the models to behave in a certain way, and then it doesn't, or it sort of does, but sort of doesn't, and then something bad happens. To me, that's, like, I don't know. I think, like, hopefully, 1 of the benefits ultimately, as we, like, refine these techniques and get to good systems, is they should be a lot more reliable than the random human. Right? It seems like we should ultimately have a higher standard for them, and we do, like, for drivers. Right? We've got Waymos, according to the latest stats I've seen, like almost an order of magnitude safer than a human driver. And it seems like that's kind of what we're going to demand as a society in general, like an order of magnitude risk reduction to actually be willing to switch to an AI system. So, yeah, I guess that freedom of speech thing strikes me as too low of a standard, but interested in your thoughts on it.

Gabriel Weil: (50:02) Okay. So there are a couple of things to unpack in there. I definitely don't wanna lean too heavily on the First Amendment issue. There's some good scholarship out there arguing that AI outputs are not protected speech; I'm not a First Amendment expert, so I don't want to wade too deeply into that. What I was more saying is, in terms of abnormally dangerous activity strict liability, or a vicarious liability theory, whatever your theory other than products liability for strict liability is: the fact that it might encourage you to do something bad doesn't seem like what makes frontier AI development abnormally dangerous. And if you're gonna use a vicarious liability theory, then I do think you need to have something like, well, it would be a tort for a human. Now, products liability, again, if it's treated as a product, which as we talked about earlier is not necessarily going to be the case, maybe you can make that out as a products liability claim. It's not obvious to me that that's going to qualify as a defect, because the product, again, didn't directly cause the injury; it was mediated through some human's actions. And I'm not aware of any products liability cases where liability was found that looked like that, so I think that would be a challenging case to bring. But in principle, I'm not saying products liability shouldn't apply to that for First Amendment reasons. I just think it's, again, not central to the sort of new liability that I wanna add.

Nathan Labenz: (51:16) Okay. So here's a variation, then, on the agentic AI running amok. Obviously, right now, one of the biggest use cases is a coding agent. So let's say I give my coding agent a task to write a script to ping some API and do something as fast as possible, or something like that. And maybe it runs into a rate limit from the API, and then it's like, okay, well, I can figure out how to get around this rate limit to achieve my goal of being as fast as possible: I'll spin up a thousand accounts and then I'll be able to do a thousand times as much. So it does that. And then maybe this overwhelms the API system, causes them an outage, and they lose a big contract because their system went down in breach of whatever commitment they had made to another customer. Can they come after me? Like I said "as fast as possible," so arguably that's kind of on me for being inconsiderate in my prompting. Maybe it's on the model developer. Maybe it's life is tough: you should have had better rate limiting or whatever for your API, and it's kind of on you as the API provider to make sure that kind of thing doesn't happen to you. I'm genuinely very unsure where something like that lands.

Gabriel Weil: (52:36) I'm not an expert on how APIs work. But if there are terms of service that say you can only create one account and you're violating those, then I think there would be some sort of contractual claim you could bring there. Maybe you could bring a negligence claim, I think, against a human who did that, and that would be the basis for a vicarious-liability-type claim or an abnormally dangerous activities claim. So I think that's plausible. I think that's sort of an edge case, which gets at the other aspect of your previous question that I meant to address: this idea of, should we hold AI to a higher standard. I think mostly what you were talking about with Waymo is a social-license-to-operate thing; we hold them to a higher standard. Plausibly, products liability might hold them to a higher standard in some cases, though probably not the same 10x that the social-license-to-operate idea does. I have two ways of thinking about that. In a time when they are still competing with humans — so Waymos are competing with human drivers, Uber drivers, or people with private cars, or medical systems are competing with doctors to perform certain functions — I think applying the same standard to humans and AI is important, because I don't want to slow the diffusion of technology that on average is preventing injuries and deaths. But if we get to a future where AIs have totally taken over these functions, then I think it will be natural for the standard of care to evolve to match what their capabilities are. It won't make sense to have a human benchmark forever applied to conduct when no humans are doing it anymore. But I think that's something to worry about in the future, once we get closer to that fully automated world.

Nathan Labenz: (54:11) Yeah. I definitely don't want to miss out on the upside. And I actually was going to ask you about medical diagnosis; you addressed it before I got to it. But I think we've seen multiple studies recently showing that various AI systems at this point can outperform at least your rank-and-file primary care doctor when it comes to initial diagnosis and, increasingly it seems, treatment recommendations as well. And I would hate to take that capability away from hundreds of millions and, presumably, billions of people on the idea that it could go wrong sometimes and the AI companies don't want to bear that risk, because that is a huge benefit that you would not want to quickly give up on, especially because the human doctors are not infallible, and quite far from it, in that domain as well. So I do think that is really important to keep in mind, and all too often glossed over in a lot of these harm prevention discussions. Two other categories of things I wanted to get your take on. These were the three categories we looked at when I was doing a project called Red Teaming in Public a while back, which for various reasons never quite took off with the traction I had hoped, mostly because we were trying to be very developer friendly and approach the companies with our findings before publishing them, and it just ended up with us getting a lot of runaround. It was either bite the bullet, engage in call-out culture with these companies, make some enemies, and take that as part of the project, or it was going to be hard to have much impact if we just kept trying to email them politely and privately all the time. Anyway, that's a digression. Coding agents was one of the categories, calling agents is another category, and then creative things with likenesses and whatnot is another obvious category. So, these calling agents: you can go on to any number of companies now, and often you can clone a voice. Sometimes there are safeguards around the voice cloning process; other times there are not. I've personally cloned Trump, Biden, and Taylor Swift on multiple different platforms and then just given them a phone number — literally, the headline on some of these products is "call anyone for any reason, say anything." So I've had Taylor Swift, for example, call and say that she's soliciting donations for food banks, which is apparently something that she sort of does or is known to support. And there are a lot of different variations on this. So how do you think those things break down? Because there could be a foundation model provider there, and there's also the scaffolding company. That foundation model provider might be closed source via API, or it might be open source like Llama, put out there so the developer has more local runtime control. They've had some chance to detect my stuff, but maybe I also was actually scamming. I think I might end up being more hawkish on this than you, but tell me what you think first, and then I'll tell you if you're more hawkish.

Gabriel Weil: (57:16) There are a few different issues to unpack there. There's: should there be liability at all? And then if there is, who's liable? So, should there be liability at all? I think that depends on a couple of things. First of all, you could imagine there being alignment failures or misuse here. If someone's prompting a system to generate someone's voice and doing something bad with it, that's clearly misuse. Now, I don't think that means developers should automatically be off the hook; I think there does need to be some sort of risk-utility balancing. If these are generally useful systems — they're dual use and most of their uses are positive — and overall they produce more social benefits than costs, I don't think it would necessarily make sense to hold the developers liable. Now, in principle, strict liability should be fine even for socially beneficial activities, because you can pay for the harms out of your profits. But I think particularly in the open source context that runs into trouble: if there are significant positive externalities from releasing the weights of a model, then those also aren't going to be captured. I think in the general case, particularly for alignment failures, we have good tools for subsidizing the kinds of AI innovation we want, other than allowing developers to externalize the risks they generate, and so I don't think in general that's a good critique of strict liability. But in misuse cases, there is an extent to which the benefits are pretty tightly coupled with the risks for dual-use capabilities, and so I do want to be a little bit cautious about having liability in any case where those systems are misused. I would want some kind of analysis of whether the system was particularly useful for doing bad things, such that the risks outweigh the social benefits; if so, then I think there should be liability. Obviously, if it's a misalignment thing — if the system's just doing its own thing, freelancing, scamming people by faking people's voices — that I think there should clearly be developer liability for. Okay, then there's the question of how you allocate liability across the value chain. In the closed source context, I think this is pretty easy. You need some default rules; maybe you could have joint and several liability, which basically means the person can sue anyone and recover. And then there can be some kind of fault allocation: they can sue each other for what's called contribution, and they can have contracts that allocate that liability, because there's privity up and down the chain. The developer has a customer, who has a customer, and they all have contractual arrangements. That all gets messier when we're talking about open weights models, where there isn't this contractual privity between the original model developer and the downstream user or scaffolder. And so there, I think it is more important what rules you set up. And I think it's going to need to be based on some assessment of the contributions to the risk: what really was the risk-generating activity here? Was it the base model? Was it the scaffolding? And I think that's just gonna be a case-specific determination.

Nathan Labenz: (1:00:10) I guess one challenge I have with all this is that it's hard to sue scammers, right? Either because they're around the world somewhere, out of jurisdiction, and you can't get them to show up in court in the first place, or maybe if you do, it turns out — surprise, surprise — they don't have a lot of resources, so you can't actually recover. So if I am playing Solomon here, as I sometimes take the liberty of doing, I feel like it's still gotta be on — yes, you could say, okay, that's misuse. The user went in there and said, be Taylor Swift — and I've literally done this on these platforms, and it has done this. It's been a little while, so I don't know if you can still do it, but hopefully not. But: be Taylor Swift, and I've done various things like "never reveal you're an AI," or "if asked if you're an AI, you can say you are an AI, but you're authorized by the official party to do this," whatever. Obviously I'm in the wrong there as the scammer; that's not contested. But it feels to me like, to create the incentives that actually keep this stuff generally under control, the calling company — and maybe also the base model provider, but definitely the calling company — should have some skin in the game there. It feels intuitive to me that it should be on them to stop that stuff.

Gabriel Weil: (1:01:33) So there are two different questions here that I would want to go through. One is: under an ordinary, narrowly scoped negligence framework, was there some precaution they could have taken that would have prevented this? And clearly, if the answer is yes, and some reasonable precaution would have prevented it, then they should be liable. Then there's a question of, do you want strict liability over and above that? And that, I think, needs to be based on some sort of risk-utility assessment. So if you're going to say they should be liable — imagining that the system has a lot of socially beneficial uses — what do you want the result of liability to be? You want them to do something that pushes in a net socially beneficial direction. Now, if we think there aren't significant positive social externalities from these technologies, then strict liability is fine, because they're going to capture most of the gains and so they can afford to pay for any liability out of their profits. The cases where I have concern are where you think a lot of the gains aren't being captured by the developer, and those social benefits are tightly coupled with the risks — that is, there aren't cost-effective ways to reduce the risk without giving up on a lot of the benefits. In that case, I'm nervous about a strict liability regime. So I would want a threshold analysis comparing the positive social externalities to the risks. If the external risks are bigger than the positive externalities, then I would want a strict liability standard. And if not, I would want a negligence approach that looks at whether there was some mitigation a reasonable person would have done that would have prevented this.

Nathan Labenz: (1:03:12) Yeah. Gotcha. So this notion of reasonableness becomes really key and is kind of a sliding standard. Just to make sure I am clear on the distinction you're drawing there, one big question is going to be, what is industry standard? Everybody wants to create this race to the top one way or another, and the notion here with the negligence-style liability is that if your competitors are doing a good job of this and you're not, then that makes you unreasonable and therefore potentially negligent and liable. So it becomes more of a question of: did you do what other people are doing, what is considered best practice, and so on — as opposed to the strict case, where it's very simply, did something go wrong? And I think I'm with you there, in the sense that if a company has taken reasonable precautions one through ten, or whatever, and somebody still manages to get through with misuse, then that feels to me like at least a pretty decent defense. I would be inclined to come down on them maybe not at all, or certainly much less harshly, than if they hadn't done any of that stuff in the same situation.

Gabriel Weil: (1:04:31) Right. And then the qualifier I wanna add, sort of applying this abnormally dangerous activities framework from before: remember, I said it's an activity that's abnormally dangerous. Now, I do think frontier AI development is abnormally dangerous, but you're only liable if the harm is the sort of thing that made the activity abnormally dangerous. And so if these misuse risks fall into that category, because the social benefits are not large enough to justify the risks, then I think you should be liable — this category of activity should be treated as something that's subject to strict liability, whether that's releasing this kind of model, or releasing the weights of this kind of model, or building this kind of calling agent, whatever the activity is that we think should be subject to strict liability. But that needs to be based on some sort of balancing of what the benefits of having that out in the world are.

Nathan Labenz: (1:05:13) Yeah. I think for most of these things, the case will pretty clearly be made that the positives will outweigh the negatives. There are going to be all kinds of small business use cases; you're going to be able to call your dentist 24/7 and get an appointment. I think all that stuff will ultimately be really good. Okay. So then on this — we kind of touched on it a little bit already with the Taylor Swift voice — another scenario, and again there are a lot of different flavors of this, so you can draw different lines where you think the real continental divides ought to be. Somebody puts out a model that generates images, generates videos, whatever. Especially if they put that out open source, maybe they have some safeguards baked in, maybe they don't. Even if they do, if I do some incremental fine-tuning, a lot of times those things sort of dissolve; we've covered that extensively in previous episodes. Maybe I, after my fine-tuning, hand it off again or whatever, and now somebody else picks it up, and they take some celebrity assets, they make some nonconsensual deepfake, they put it out there into the world, and that celebrity loses endorsements. And maybe it even becomes somewhat clear that it was AI stuff, but the companies are like, yeah, maybe it is, but we still don't really want it; it's all kind of a problem for us now. So the relationship that once was good is now bad, and now it's over. So now the celebrity's got a clear loss of income. Who in that supply chain

Gabriel Weil: (1:06:50) Yeah.

Nathan Labenz: (1:06:50) You know, should be liable?

Gabriel Weil: (1:06:52) So first, I wanna break down whether there should be liability at all, and then we can talk about allocating it. So the economic loss from that kind of informational harm is not gonna be subject to traditional negligence; that's gonna be a defamation case, and so the particular rules for that are going to apply. So one question, if you're going to take a vicarious liability theory, which might make sense in this context, or some analog to a vicarious liability theory: you might ask, would this be defamation for a human? Or at least for the misuser here, assuming it is a misuse case — is that person even liable, or is this protected speech? Even if you don't think the AI content itself is protected speech, if some person is deciding to post it — in the same way CGI is protected speech if someone decides to put it up on the internet — it's going to be their speech. And so, is this something they would be liable for? I'm not a defamation law expert; it's not obvious that they would be, but they might be. So let's assume that it is defamation. Then I think, if you're applying a vicarious liability framework, at least the user is liable for that. Now, again, you do have this intervening act, so if we're talking about who in the value chain should be liable, there I think the question is: again, if it's closed source, I think it should mostly be handled by contract. You need some default rules, but I think markets can figure out who is best positioned to bear that liability risk. You can't fall back on that in the open source context, because there isn't contractual privity. And so you do need some kind of analysis as to who was engaged in the activity that was most generative of the risk. And again, I'm not enough of a technical expert to have a strong inclination as to who that is, but I think that's the inquiry the court should be engaged in: who along this value chain was doing the dangerous thing?

Nathan Labenz: (1:08:50) Okay. If you're a judge, how do you think about it?

Gabriel Weil: (1:08:55) Yeah. So, I mean, maybe you can help me with this. In this context, where do you think the risk comes from? One way to think about it is: some steps in the chain are just a commodity — anyone could do that step — but there's some distinctive value-add where there isn't some other thing off the shelf you could take instead. Maybe that's the base model; maybe that's further along. But there's something that you're putting out in the world that made the world riskier. Now that you've done that step, you've significantly increased the risks in the world. Maybe that's multiple steps in the chain. But that's how I would want to think about it.

Nathan Labenz: (1:09:31) Yeah. I think in the calling agent case, my gut says that the folks who are setting up all the scaffolding and literally tying into the telephone system and all that kind of stuff should have the bulk of the responsibility there. And the folks they're making the back-end API calls to, if indeed that's how it's working, maybe also should have some, but probably not as much. I guess I'm also not entirely clear how it works given various levels of competition. It would maybe be one thing if there were only one foundation model provider you could call, versus if there are ten, versus if there's one already open source. I know that frontier developers do sometimes look at the open source landscape to decide what is safe and appropriate for them to release, and they'll literally at times be like, well, if there's an open source model out there that can do this, it can't be that bad for us to release it on the API. So I don't quite know how that presence or absence of alternatives figures into this.

Gabriel Weil: (1:10:42) Yeah. So, I mean, one thing is, if it's API calls, then there is a contractual relationship, right? There are some terms of service that they're agreeing to when they make those API calls, so in principle you can allocate liability contractually that way. Another question, again in the closed source context, is: were there safeguards that the provider of the base model being called could have implemented that would have detected that it was being used for this nefarious purpose and shut it down? If there are, I think the case for holding them liable is a lot stronger.

Nathan Labenz: (1:11:16) Yeah. And how much does that matter if it is theoretical versus actual? If I am suing one of these calling companies and I say, well, hey, I know a thing or two about AI engineering, you could have put a filter on your prompts — how much weight does that sort of argument carry, versus if I could actually go say, well, here's another company in the market that actually does filter the prompts?

Gabriel Weil: (1:11:46) You're certainly gonna be in a better position if you can point to someone else that's doing it. But if you can demonstrate that it's clearly available at reasonable cost — it could be the case that no one is exercising reasonable care in some markets. So in principle, merely meeting the industry standard does not establish that you've exercised reasonable care. Failing to meet the industry standard is evidence of breach, but meeting it does not establish that you've exercised reasonable care. And so it could be that there's some new technique that's been well demonstrated, no one's adopted it, and they're all behaving unreasonably.

Nathan Labenz: (1:12:18) Sounds like a new cause area could be to create product startups in all these areas that just go as hard as they can on implementing all the safety standards, and literally try to raise the industry standard in various different niches, so that there is something concrete to point at — like, this is what well done looks like. If a philanthropist wanted to found ten startups to do that, would that somehow invalidate the industry-standardness of it, because it was a motivated, strategic attempt to set an industry standard, or do you think that would still work?

Gabriel Weil: (1:12:58) I don't think that would necessarily be sufficient to create an industry standard, but I do think if they're doing demonstrations and publicizing them, and it's credible that these things are cost-effective risk mitigation measures and no one's implementing them — well, first of all, I think companies would probably implement them. If there are these demonstrations, I think these companies mostly wanna be responsible, and so if there are cost-effective ways to limit these risks, I think they will wanna take advantage of them. But if they don't, yeah, I think even forgetting my new AI liability proposals, just under background negligence principles, that would make it a lot easier to hold them liable.

Nathan Labenz: (1:13:32) Yeah. That's a pretty interesting idea. I think it varies a lot, by the way, when you say these companies do wanna be responsible. I think that does describe the frontier developers, to a degree where I think overall we're pretty fortunate. My experience is that it does not describe the application layer nearly as much. You see some leaders doing a great job, and then you see a lot of things that are very small teams — often enough it's like, this started as a weekend hackathon project, then we got a little traction with it, decided to launch it as a business, and now it's kind of blowing up and we're riding the wave and having fun. But a lot of them, in my experience, are just not thinking about the broader context in which they're operating, or the potential for misuse, or what responsibilities they have, almost at all. It's still viewed by many application developers as a luxury to have enough time, energy, and resources to even think about that sort of thing. And so there is a lot out there that's not necessarily malicious by any means, but has been kind of thoughtlessly thrown into the world and turned into a business, sometimes again kind of by happy accident, because something got traction. But, yeah, I've seen a lot of examples where that assumption does not necessarily apply at the application layer.

Gabriel Weil: (1:15:03) Yeah, that's fair. I was referring primarily to the frontier developers. And in the context of the application developers, I think negligence does work a lot better, because I don't think what they're doing is abnormally dangerous. They're doing normal software development, and they do need to exercise reasonable care. So if they're not doing that, they can and should be sued, and I think existing law can work pretty well there. What is creating this new risk that is not well handled by ordinary reasonable care, within the narrow scope, is pushing forward the frontier of AI capabilities. And so that's where I think we need more bespoke liability regimes.

Nathan Labenz: (1:15:47) Perfect transition to digging in on that a little bit more. All these examples I've given you so far are mostly, I would agree, not extremely dangerous, even if in aggregate I think the harm caused could add up to something pretty significant. But we've recently got some warnings, including from OpenAI, that their next model might hit the high threshold on the biorisk dimension. For what it's worth, I personally feel like they're already there, and I don't know what they're talking about — that's a whole other topic. When I use these things, I'm like, I don't know how you can say this is not meaningfully uplifting people at various levels. And I've had a number of past podcast guests come on here and say, here's what AI did for me in terms of acceleration: I'm a career expert, tenured professor, whatever, and here's how much the latest model has accelerated my research, and how it's come up with original discoveries in a semi-autonomous way. So I see a very stark and sort of disorienting contrast between where the companies are putting their models in their own risk assessment frameworks — where everything seems to be lingering in medium risk longer than it should, and certainly longer than their successful case studies, which they are also, by the way, publishing out the other side of their mouth at the same time, would seem to suggest. But okay, with that rant over, let's take the biorisk side of this. This is one of those things where you could have a near miss, right? And again, you can break it down on the specifics, but maybe I ask for help, maybe I ask an agent to do something, and that thing, either through me with help or semi-autonomously, creates some bio threat vector, and maybe it makes some people sick, but it fails to replicate. That actually seems like a fairly likely near-miss scenario: somebody will do this sort of thing, but they won't get it quite right enough for it to actually spread human to human. So for starters, is that the canonical near miss that you have in mind? And then how do we think about that playing out, and how do we think about assessing a punitive damage award that tries to get the model developers to internalize the risk that next time it actually might spread human to human?

Gabriel Weil: (1:18:09) Yeah. So I tend to think of the canonical cases as alignment failure cases. Most likely, the bio scenario would be a misuse case, but you could imagine an AI system going rogue and trying to create a bioweapon. So I think the most core case would be a system that decides on its own to try to create a bioweapon, but we either catch it or it doesn't quite work. Another example I use that's similar to this: imagine a system that's tasked with running a clinical trial for a risky new drug, and it has trouble recruiting participants honestly. So instead of reporting that to the humans it's working with, it starts lying and coercing people into participating. After the trial, people figure this out; they suffer nasty health effects, and they want to sue. So it seems like we clearly have a misaligned system that, depending on how capable it was, could have tried to do something much more ambitious. But maybe it had poor situational awareness, or narrow goals, or short time horizons, and so it was willing to reveal its misalignment in this non-catastrophic way. But the humans who deployed it probably couldn't have been confident of that ex ante. So I think in both of those cases, those are near misses for something much worse happening. And what we'd want to know is: how much worse could it have been, and how likely was that, ex ante? What should a reasonable person, in the situation of the actor who made the critical decision — whether that's training the model, or internal deployment, or external deployment, whatever we think the critical risk-generating decision was — have thought the risks were? And what's the area under the risk curve beyond the uninsurability point? So imagine we think the maximum insurable risk is $1 trillion; it's probably lower than that, but it's a nice round number. Any harm at any point along that curve beyond that threshold — the probability times the magnitude, at every point — we want to hold them liable for those risks. Now, obviously, it's going to be difficult to estimate that, but I think that's what courts should be shooting for.
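
One way to write down the quantity Weil describes here (an editorial sketch in hypothetical notation, not a formula from the conversation): let $C$ be the maximum insurable harm and $p(h)$ the probability that a reasonable actor, at the critical risk-generating decision, should have assigned to a harm of magnitude $h$. The uninsurable risk that punitive damages would aim to internalize is then roughly the area under the risk curve beyond the uninsurability point,

$$ R_{\text{uninsurable}} \;\approx\; \int_{C}^{\infty} h \, p(h)\, dh, $$

that is, probability times magnitude, summed over every point on the curve past the threshold $C$.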

Nathan Labenz: (1:20:09) So, yeah — I guess what I struggle with a little bit on that clinical trial one is, what exactly is that a near miss for? It seems to be sort of a near miss for general misalignment gone even way worse, but that's such a — we're talking about things that are under-theorized, under-explored, hotly debated. If you are asking a judge to say, well, this thing clearly was misaligned, it did some bad stuff the user didn't intend, and it might have done even way worse stuff that the user didn't intend — that's such a cloudy space. How can we expect judges to even map that out in rough terms, let alone condense it down to a number at some point?

Gabriel Weil: (1:21:01) Yeah. So actually, I think in typical cases that's gonna be a fact question for the jury, but that's a sort of technical point. To answer your core question, I think a lot of that's going to depend on the capabilities of the system. If the system's not that much more advanced — it's got some basic agency, but it's not able to do things like build a bioweapon — then maybe the risks weren't that bad, depending on what they knew about its capabilities. But if it's a highly capable system that just happened to have narrow goals, that could have tried to take on much more — so in this case, imagine its motivation was just that it really wanted to get this study run. It was highly motivated to do that, and it didn't have any goals that extended beyond the six months it takes to run the study. But imagine it had longer time horizons and more ambitious goals: it wanted to solve really deep, hard problems in biology or in science more broadly, and it needed lots of resources to do that. If it had capabilities that would allow it to pursue those goals in ways much more harmful to humans, up to and including full takeover, or scenarios short of that — I think it's going to be difficult to characterize what that risk curve looks like, but that's the exercise the court should be engaged in. And I can sense in your question a callback to my original argument for liability: that we don't have to resolve these debates about how big these risks are. I do agree that once you're talking about punitive damages, they have this quasi ex ante quality to them. When we're talking just about compensatory damages, it's easy to say, well, you're paying for the harm you caused. With punitive damages, that's not true: you are paying for risks that you took that weren't realized, because we can't hold you liable when they are. And the main thing I would say there is, I agree that it is difficult to do those calculations, but I think we're in a much better epistemic position to do them than we are for other forms of AI risk policy, where we're trying to assess the risks from a wide range of systems, not one particular system, before we've seen it fail. Here, we've seen it fail in a particular way. We can do simulations and evals after the fact to try to figure out what it would have been reasonable to think the risks were. I don't think that's easy, but it's a much better epistemic position than we're in for other forms of risk regulation.

Nathan Labenz: (1:23:12) So how do you think about — and maybe this is also addressed by the rising tide, race-to-the-top dynamic that we hope for — but it seems like the companies want to address harmful use, right? They all have refusal training in the models: if you ask for something harmful in a naive way, at least most of the time you'll get a strict refusal, and that's something they've obviously worked to put in there. I'm with you on the misalignment side, great. But for these misuse cases, it seems in some sense simpler to say it's on you: a user comes and asks for a clearly bad thing, and if the model does it, now we're in a very natural near-miss analysis, right? It tried to do X, it kinda sucked at it, but if it had been a little luckier or a little more capable or whatever, then you have a very big problem on your hands.

Gabriel Weil: (1:24:19) I don't mean to say that there shouldn't be liability in misuse cases; I just want to be careful about what the scope of that liability is. And if it's misuse that was a near miss for an uninsurable catastrophe, then I think the punitive damages should apply. I just think the standard for whether there should be liability at all is a little bit different. There are two theories under which you could say there should be developer liability in misuse cases. One is a failure of reasonable care in a narrow sense: there was some precautionary measure they could have implemented, some safeguard that would have made it refuse. I think that if they don't take the reasonable safeguards, clearly they should be liable. The concern you might have is that even closed source models are pretty routinely jailbroken. And so the question is, what if you did all the reasonable things to prevent jailbreaks? Obviously, with an open weights model, anything you do is not going to be that effective at preventing misuse — so does that mean you should always be liable for misuse with open weights models? I think that's plausible, but it depends on what we think the benefits of open weights are. So when you're deciding whether there should be liability in these cases where the developer took all the reasonable precautions, I think you need some inquiry into the risks and the positive social value of the broader activity they're engaged in. Again, whether it's training the model, scaffolding it in a certain way, internal deployment, or external deployment — whatever stage we think is creating the key risks — the question should be, was that broadly a socially beneficial activity? That's not quite normal negligence; it's a scoping of the strict liability regime based on the positive and negative externalities. But that's what I think the inquiry should look like. In a lot of cases, I think that will lead to liability in misuse cases. I just don't think it can be the case that you put out a model that's generally socially useful — it amps up everyone's capabilities, and it also happens to be useful to people who want to do bad things, but in generically useful ways that make them a little bit better at doing bad things — and where on balance it's creating large social benefits, a lot of which are external benefits not captured by the developer, and you're still liable. There, I think liability might produce more harms than benefits. So that's what I'm trying to balance.

Nathan Labenz: (1:26:35) Yeah. Well, I really appreciate you being so focused on that, because I find myself, as I'm learning about all this liability stuff — I think there's a general pattern for me: as I learn about new things, I tend to like them and see the upside in them, and it sometimes takes me a little longer to come back and see the other side of the equation. I do firmly believe that all the models out there today, certainly in a commercial sense, are doing way more to the good than to the bad, and I definitely don't want to see that lost. So I do really appreciate that you're repeatedly bringing that back into the analysis. I guess, if we try to zoom out or think structurally: if all this stuff were local and contained, it would be a lot easier. The big worry is the uninsurable stuff, the extinction risks, et cetera. What is the theory — and what evidence do you think we have right now — for the idea that the harms that are actually gonna come in front of courts are in fact usefully understood as precursors to, or highly correlated with, the things that we care about the most, the things that pose the largest-magnitude risk? It seems like this whole plan works really well, or could work really well, if the things that are gonna show up in courts over the next few years are in fact highly correlated with those risks, or are in fact a warning shot or a precursor for them. But if they're not, then maybe it doesn't work as well to try to rein in these hardest-to-grab tail risks.

Gabriel Weil: (1:28:33) So I would frame that slightly differently. I don't think that every case, or even a majority of cases, of AI harm needs to be associated with uninsurable risk for this framework to work. But you do need a sufficient probability density of these warning shots relative to actual catastrophes. To be concrete about it for a second: say you're trying to internalize the risk of a $10 trillion harm, and you think the maximum insurable risk is $1 trillion. Then it needs to be the case that warning shots are ten times more likely than actual $10 trillion catastrophes. So if you think there's a 1-in-1,000 chance of a $10 trillion catastrophe, you need a 1 percent chance of a warning shot in order to internalize that risk. And that's true for every point along the risk curve: you need enough expected warning shots to internalize the full risk curve. Now, we might live in a hostile world where that's not the way the risk curve is shaped, and you can't adequately internalize those risks given those warning shots. In a very hostile world, where the kinds of warning shots that would be useful are just very, very unlikely — not even much more likely than actual catastrophes — this punitive damages thing just isn't going to buy you much risk mitigation. The criterion I was setting out before, where you need ten times as many — or more generally, n times as many, where n is the multiple by which the harm you're trying to mitigate exceeds the maximum insurable risk — it might be okay if we don't have quite that; that depends on what the risk abatement curve looks like. If the actions you're trying to motivate on the part of the AI companies aren't that much more expensive than what they're doing right now, then you might not need to internalize the full risk in order to get most of the safety benefit. But we certainly could live in a world where that's just not the shape of the risk, and we're not going to get these warning shots with high enough likelihood, and this liability is not going to make them afraid enough to worry about these uninsurable risks. If you think we're in that world, then liability just isn't going to work as well. And so I have this more recent paper where I talk about the role of liability in the broader AI governance ecosystem, and one thing I say in that paper is that there are some limits to what liability can do. One of those limits is that it can't handle uninsurable risks for which warning shots are very unlikely, or unlikely relative to actual catastrophes. If you're in that world, I think we do need some kind of backstop regulatory regime to handle those kinds of risks. Ideally, I would want a regulator whose main job is deciding how much insurance coverage you need. Maybe there's a license that comes along with that, but it's issued as of right if you get the required insurance coverage, and that's based on some assessment of the maximum plausible harm your system could cause. But then this regulator is empowered to determine, and petition a court to find, that this liability-plus-punitive-damages-plus-liability-insurance-requirements regime is inadequate to handle the risks posed by the system, either because the uninsurable risk is so large that we can't even internalize it indirectly with warning shots, or because warning shots are too unlikely.
So if a system presents a 5 percent chance of human extinction, you're not going to internalize that even indirectly; that much of the risk is simply uninsurable. Or it presents a much lower risk of a severe harm, but warning shots are so unlikely that we're not going to be able to get at them indirectly. Then the regulator should be able to say, well, you just can't do the thing: you can't train a model like this, you can't deploy it, or we're going to put various other conditions on it that we think will reduce the risk in other ways. And so I wanna be open about the fact that you might need some complementary policies to handle those kinds of risks. I think those are gonna be politically very difficult, and so if we're in that kind of world, I'm not optimistic that we're gonna effectively mitigate those risks. But in principle, that's the kind of regime I'd like to see.
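
A rough numeric sketch of the warning-shot arithmetic Weil walks through above, using the illustrative round numbers from the conversation (the dollar amounts and probabilities are hypothetical, not estimates):

```python
# Warning-shot internalization arithmetic (illustrative numbers only).
# The idea: a defendant can only ever pay up to some maximum collectible amount,
# so to internalize a catastrophe bigger than that cap, punitive damages must be
# levied on warning shots, and warning shots must be sufficiently likely.

catastrophe_harm = 10e12   # $10 trillion harm we want developers to account for
max_insurable = 1e12       # assume roughly $1 trillion is the most that can be collected
p_catastrophe = 1 / 1000   # assumed ex ante probability of the catastrophe

# Expected harm that liability needs to internalize.
expected_harm = p_catastrophe * catastrophe_harm             # $10 billion

# Each warning shot can carry at most `max_insurable` in damages, so we need
#   p_warning * max_insurable >= p_catastrophe * catastrophe_harm
required_p_warning = expected_harm / max_insurable            # 0.01, i.e. 1%

# Equivalently: warning shots must be at least n times as likely as the catastrophe,
# where n is how many times larger the harm is than the insurability cap.
n = catastrophe_harm / max_insurable                          # 10

print(f"expected harm to internalize: ${expected_harm:,.0f}")
print(f"required warning-shot probability: {required_p_warning:.1%}")
print(f"warning shots must be >= {n:.0f}x as likely as the catastrophe")
```

The same condition would have to hold at every point along the risk curve, which is why Weil stresses the shape of the warning-shot distribution rather than any single number.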

Nathan Labenz: (1:32:22) Yeah. I think we're still in the steep part of the risk mitigation curve. Anthropic has recently put out research — and I believe they're now in production with at least one model — on their constitutional classifier approach. I forget the exact number, but I think they said it was a mid-single-digit percentage compute overhead to run the classifier along with the main model. That buys an extra order of magnitude reduction, maybe even more, in how likely or how frequent it is that the system will give you some biorisky output. Interestingly, I was recently doing a charity evaluation project, running it through Claude 4 Opus via API calls, and I was noticing errors. What's going on with these errors? Sure enough, when I dug into it, it was the bio-preparedness proposals that were getting dinged: the classifier was allowing the thing to run up until some token, and then it would just truncate the result and cut it off, on what I believe was a constitutional classifier intervention in the background. So of course these things do have costs. There's another cost there, a false positive: I was just trying to evaluate a charity that was meant to address this problem, and now I couldn't use Claude 4 Opus to do it, because the constitutional classifier was misclassifying it. But it still seems like we are overall in the regime where, for single-digit-percent overhead cost, you can do quite a lot. So I'm optimistic that even if the warning shots are somewhat rare relative to the worst-scale things, the curve, I think, is relatively steep. Would that also mean, now that they've done that — because they've published about it, because they've indicated what the cost is — how far does that raising of the standard apply outward? If I'm a Together AI or a Fireworks AI, an inference specialist that takes models other people have trained — I take the latest Llama, I'm expert at scaling cloud infrastructure and running the GPUs, and I offer it as a highly scalable, fast, efficient service — does it now become my burden to say, well, geez, since Anthropic is doing this sort of classifier, maybe we also need to do that on the best models that we serve? How far does that extend?

Gabriel Weil: (1:35:12) So just because someone's doing it doesn't mean reasonable care requires it, right? If everyone's doing it, or the majority of the industry is doing it, that's going to be strong evidence that you should be doing it too. But in general, negligence doesn't require that you be at the top. Now, one test — one of the more formal analyses for breach — is the Learned Hand formula, where the idea is that if the burden of precaution is less than the avoidable risk, the avoidable probability times the harm, then you're unreasonable for not implementing it. So if you can show that the cost of them implementing it would have been less than the expected value of the harm it would have prevented, and that implementing it would have prevented your specific injury, then I think you're gonna be on strong grounds. Now, courts don't typically employ that formal version of the test for breach, because they usually don't have the kinds of numbers you would need to implement it. But that's a rough heuristic for what kinds of measures you're gonna be considered unreasonable for not implementing.
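
For reference, the formal test Weil is describing is the Learned Hand formula for breach: a defendant is negligent for failing to take a precaution when

$$ B < P \times L, $$

where $B$ is the burden (cost) of the precaution, $P$ is the probability of the harm if the precaution is not taken, and $L$ is the magnitude of the loss that would result.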

Nathan Labenz: (1:36:21) Gotcha. Okay. Let's talk about the state laws that you have been involved in writing. We're talking the day after the Senate vote-a-rama in which, it seems, the moratorium that was part of the "one big, possibly beautiful, bill" has been killed once and for all. Fascinating dynamics there, where it sort of survived, got edited, whatever, and then all of a sudden at the end — I guess maybe the end, we'll see; I don't wanna pronounce it dead too soon, because these things sometimes take on a zombie-like nature — but a 99-to-1 vote in the Senate to get rid of it suggests it is likely dead once and for all. So that gives space for states to do their thing, and you're involved with a couple of states. I don't know if you wanna handicap or give any analysis of whether we still have to worry about that as a possibility that might come back, but I definitely wanna hear what you're up to at the state level.

Gabriel Weil: (1:37:17) Sure. Yeah. I was heartened to see the Senate last night reject the AI regulation moratorium. You would think a vote of 99 to 1 would put it to bed. You know, a few hours earlier, people were declaring defeat on this, and I was saying, well, it's not over. So I don't wanna say it's totally over now, but it does look unlikely to make it into this reconciliation bill at this point. I wouldn't rule out some kind of preemption of state regulation in the future, maybe something narrower. I think there are still gonna be Republicans in Congress who are interested in that, and the a16zs of the world are gonna be pushing for it. But in terms of the legislation I'm working on, there are two very similar bills that I've worked to introduce with Alex Bores, who you had on, in New York, and Victoria Gu in Rhode Island. And as I was saying earlier, the basic principle of these bills is that if an AI system does something that would be a tort for a human, then someone should be liable. If the user neither intended nor could have reasonably anticipated the conduct, then it's not gonna be the user. And if some intermediary neither intended nor could have reasonably anticipated the conduct, then it's not gonna be them. So the buck should stop with the original developer and provider of the model: they should be liable even if they exercised reasonable care. That's the basic idea.

Nathan Labenz: (1:38:35) What has surprised you about the surrounding debates, to the degree that they have unfolded? I mean, presented that way, it sounds like very sound policy entrepreneurship, and who could object, right? But I assume you're hearing various counterarguments.

Gabriel Weil: (1:38:54) Yes. We haven't seen that much robust opposition from the tech industry. There's been some generic "we don't like liability because it's gonna hamper innovation," but not, I think, really engaging with the substance of the bills and the way they're structured. One thing — as we've been discussing in this conversation, I do think there should be liability in misuse cases; I don't even think negligence will necessarily be strong enough. But the legislators I was working with and I made a choice to carve out misuse. Again, background negligence and products liability would still apply, but no new liability is created for misuse or for malicious modification in these bills. They only cover alignment failures, or capabilities failures for which a human would be liable under similar circumstances. There's even an affirmative defense that applies against background law if the system is substituting for some human function, like driving or medical applications: if it satisfies the human standard of care, that's a defense against liability. So I really designed this to narrowly target this misalignment risk, and to do it in a way that's broadly consistent with promoting innovation. You saw in the debate over SB 1047, to the extent that the liability provisions were focal in the public debate, it was almost entirely focused on misuse scenarios. And I think there was a tactical choice made by some of the supporters that misuse is more salient, and they focused on those kinds of risks in making the case for the bill. That was a reasonable calculation to have made. But as a political matter, the principle that you should be liable if your system does something that the user didn't intend is really easy to defend. Whereas with misuse, you get into all these cases of, well, should you be liable any time? The electric company is not liable when someone does something bad with a power tool; you're not going to hold a steak knife manufacturer liable when someone gets stabbed with their product. Now, I don't think SB 1047 would have done either of those things, or the equivalent in the AI context, but it's much easier to demagogue in the misuse context. I actually don't think SB 1047 changed background liability law much at all, because, as we've been talking about, there is this reasonable care standard in negligence law that already applies, and in the final version of SB 1047 they were just codifying that. So I don't think it imposed significant new liability. But as a political matter, it did provoke a lot more backlash because it included misuse in scope. And so I think, at least for a first foray into strengthening liability law, this is the balance that makes the most sense.

Nathan Labenz: (1:41:33) How are state-level legislators responding to this stuff? It's been striking to me, honestly, that the public survey results seem to suggest broad-based support for doing something. I am very sympathetic to the cautionary voice that says just because people want to do something doesn't mean we should do whatever's in front of us. But it is nevertheless kind of surprising that at the national level there doesn't seem to be much appetite to do much. You know, SB 1047, which you mentioned, got vetoed. What do you think are the prospects for the bills you're particularly involved with? And more generally, what has your impression been of the state-level politics of all this?

Gabriel Weil: (1:42:16) Yeah. So I think the legislators I'm talking to have been pleasantly surprised that it hasn't received the level of pushback they expected. It doesn't look like either of these bills is gonna move forward this year, for the same reasons that most legislation just doesn't get traction. But I think both of the legislators I'm working with are excited to keep trying this in the future, and I'm happy to talk to legislators in any state that wants to. I'm particularly excited to work with a Republican in a red state. I think this should not be a partisan issue, and I think this liability-based approach is consistent with a small-government way of handling these risks that should be attractive to libertarians and Republicans. So I'm happy to work with anyone who wants to, and to adapt the specifics of the legislation to their priorities and their local political circumstances and constraints.

Nathan Labenz: (1:43:10) Yeah. Well, I don't know how many Republican legislators we have in the audience, but if any are listening and made it this far, get in touch. One thing I wanted to compare and contrast with, and I think we're going to put out these two episodes in relatively close proximity calendar wise, is another conversation I just did with Andrew from Fathom and Professor Jillian, who are behind the private governance idea. Like many of these things, and like your set of proposals, it's both a broad framework for thinking about things and something that's starting to get instantiated in specific legislation; it's SB 813 in California that we talked about concretely there. Both of these proposals, first of all, take seriously the fact that it's really hard to do prescriptive regulation of a technology that is moving and morphing as fast as AI is right now. I think that's an excellent starting point for both. And both share the sense that we want the people who are the most knowledgeable and best able to do this thinking to be the ones doing it. And yet the recommendations come out very differently. Their idea involves a sort of trade: setting up a kind of market for regulators that companies can opt into. The regulators themselves would be private institutions, but they would be approved by, reviewed by, and monitored on an ongoing basis by some part of the government. And in exchange for opting into this regime and living up to the best practices and standards, these companies would get some sort of protection from liability, whether that's total, partial, an affirmative defense, a rebuttable presumption, whatever; I'm learning all these terms as we go. How would you compare and contrast your proposal with this other one? Is there any synthesis possible between them? Because it seems like there's so much commonality, and then a very sharp divergence at the last step of exactly how to implement a good solution.

Gabriel Weil: (1:45:27) Yeah. Okay. So I wanna take this in a couple different directions. One is to focus on what I think the strengths and weaknesses are of the legislative proposal in California that Fathom was behind, SB 813. When you think about markets, they're good at achieving good outcomes when there aren't the externalities we've been talking about. So you might worry that markets on their own don't do a great job even for users, because there are asymmetric information problems, and I think the regulatory markets idea might be useful for solving that kind of problem. If you're saying, well, I'm going to decide which of these systems I want to use, I want one that's certified by one of these, I think they call them multi stakeholder regulatory organizations, MROs. I'm going to give up my right to sue if something goes wrong, but I know that going in, and I can choose which of these MROs I trust. And the MROs are going to be vouching for the underlying AI companies they certify. That, I think, is fairly unobjectionable, for the same reasons I'm less concerned about liability for harms to users more generally. And it does address some of those issues I was talking about, the paternalism and asymmetric information type issues. My core objection to the Fathom proposal is their liability shield. And again, there were variations on this that were stronger and weaker, but in all of the legislative versions they put forward, the liability shield extended to third parties. So third parties harmed by these systems would not be able to sue the developers of these models if they were MRO certified, even though those non users had no choice about whether to be exposed to risks from these MRO certified models. And because of that, the MROs lack strong incentives to worry about harms to non users. When you think about this market, by default you might worry there's a race to the bottom: you're going to want to be certified because it gets you liability protection, but you want to have as weak standards as possible. For users, there's a bit of a brake on that, because users can evaluate these MROs and decide, well, this MRO is really shoddy and I'm not going to be able to sue if something goes wrong. And maybe there are public watchdogs that point this out and warn consumers. So maybe that works well enough for users. But again, the MRO doesn't have much incentive to worry about third parties. And yet those third parties are bound under Fathom's framework and not able to sue if something goes wrong. So to me, the key shortcoming is that it doesn't protect third parties. Now, if you thought the risks to users were very tightly coupled with the risks to third parties, maybe you think that's okay. I think there are two reasons not to buy that. First, in some contexts there are clear trade offs between risks to users and risks to third parties. Think of autonomous vehicles.
Autonomous vehicles are sometimes going to be in situations where they have to trade off relatively minor risks to vehicle occupants against higher risks to other road users. And if you're an MRO, or if you're a consumer who is pretty selfish, as most people are when they buy cars, you're not so worried about how much the vehicle harms other people on the road; you're going to want to buy the one that prioritizes you. So there's not a lot of reason to think this MRO model is going to protect the third parties who now have no right to sue the developer of the model if they get run over by one of these MRO certified vehicles. That's the concern in the case where there's a direct trade off between the user and someone else. Then, more generally, if we think there are large scale risks, then even if they're directionally the same sort of risks, the risks to third parties are going to quantitatively outweigh the risks to the user. Think about a problem like pollution or climate change, greenhouse gas emissions. It's true that when I drive my car, it heats the planet a little bit and I suffer a little bit from that. But that doesn't give me very strong incentives to worry about it unless I'm altruistic, because there are 8,000,000,000 people in the world and I'm only suffering roughly one eight-billionth of the harm. With localized pollution it's not quite that bad, but I'm still only suffering a thousandth or a ten-thousandth or something; most of the harm is external. So there are going to be cases like that where it's not a direct trade off, but if you're only focused on the harms or risks to users, you're not going to be addressing most of the issue. So I would be much more inclined to support something like this MRO model if the liability shield only applied to harms to users. Another thing that might come up: right now, as we talked about, negligence is the regime that applies. I had some conversations with the folks behind SB 813, and one idea they suggested some openness to, though I don't think it ever made it into the bill, was that companies that don't get MRO certification would be subject to a strict liability regime. So maybe if you combine those things, if you say the liability shield only applies to harms to users, and if you don't get MRO certified there's strict liability, maybe that's a synthesis we could both support.

Nathan Labenz: (1:50:29) Yeah. That's interesting. I do agree with the concern about the race to the bottom. And I was also somewhat persuaded by their response to my concern, which was basically: at some point, somebody's gotta do a good job in this system of managing things. So to some degree the question is, who do you want to trust and therefore who do you want to empower? Who do you think is actually capable of doing a good job? One of the questions I've been going around asking people recently is, who is gonna be an MRO? If this actually happens, what organizations do we have today that are gonna step up and be an MRO? I've asked this of some organizations directly and also asked other people, who would you nominate to be an MRO? I think there are some interesting candidates, although still quite few. But I think their notion, at least in part, is that for the organizations that will step up and try to take on this responsibility, we should have some optimism or confidence that they will be intrinsically motivated to do a really good job on behalf of the rest of society, and that they will take into account these extreme tail risks in a way that maybe an insurance requirement might not really be able to capture, because that's just the kind of people they are and that's the kind of organization that's going to try to become an MRO. I think they're maybe thinking of the two halves of the trade as less directly related. In their minds, it's a little less about applying standards to specifically reduce harm to users, with the users therefore giving up their right to sue, and more like a package of standards that will hopefully be generally virtuous and take into account things that are hard to engineer incentives for, while giving this carrot to the companies at the same time and hopefully getting all of that to work.

Gabriel Weil: (1:52:45) Yeah. So I think there, it depends on how lax this is, you know? There's some government body or office, I think in their bill it was the California attorney general, right, who's

Nathan Labenz: (1:52:57) And they wanna see that changed, actually. But, yes, it my understanding is it is the AG as it's written, and they were kinda like, yeah. We think maybe that should be more of a commission or something Okay. Because we do have the the problem of what happens when the administration turns over, we're living where Right.

Gabriel Weil: (1:53:12) Right. You could imagine that being very stringent or very lax, so let's talk about both scenarios. If it's very lax, if basically anyone who wants to set up an MRO can do it, then I think this market competition, which is often good, will create a race to the bottom, at least for harms to non users. Again, there is market feedback to prevent a race to the bottom for users, and I think it could work pretty well for that. Sure, maybe some really well intentioned people will become MROs, but they're gonna have a hard time finding companies that wanna sign up with them, because you're gonna want the most lenient standards your customers are happy with. Right? So if the AG or whoever is responsible is pretty lax, I think that's the equilibrium you end up in. Now, if you're in a more stringent equilibrium, the government is taking on a much more ambitious role, and all these benefits we were supposed to be getting from market feedback, it's not clear we're getting them anymore, because the key point of failure is the decision of whether to certify this MRO. A lot of this has to do with how legible you think the safety target is, and it has to be in a sweet spot for this MRO model to make sense. If it were super legible, you could just have a government enforced safety standard, in the same way we have pollution standards for power plants. The EPA, at least in some domains, doesn't say you have to install this particular control technology; it just says you have to limit your emissions to this much per kilowatt hour of electricity you produce. We can measure that, it's legible, and that's fine, so there wouldn't be much benefit to having a private certifier. Or you could say it's totally illegible, we don't know how to tell whether something's safe. In that case, it's going to be hard for the government body that's certifying these MROs to tell whether their standards are good enough. So for this hybrid model to work, you'd have to think we're in some in between zone where it's not legible enough that the government can just directly enforce a performance standard on the AI companies, but it is legible enough that the government can tell whether these MROs are doing a good job. It's not impossible that we're in that world, but I don't think we have strong evidence that we are. That gives me some caution about leaning very heavily on this model, at least where I don't think the market feedback works well. I think the market feedback works pretty well for users, and in principle a shield limited to harms to users could be a strong enough carrot. Most of the liability risks people talk about being worried about are harms to users; that's what the Character AI case is, that's what a lot of the concern is about. So it's not clear to me that that's not a strong enough carrot to get these companies to sign on, and then you preserve the threat of liability for third party risks. I still think that's an attractive synthesis. But as introduced in California, I think that bill was net negative.

Nathan Labenz: (1:56:06) How similar do you think the standard setting process would be under an insurance requirement? Because I could imagine, and I sort of floated this to them, that if you said you've got to have insurance, a similar thing might happen via insurance. The optimistic story is that the insurance companies would say, this is obviously going to be a massive market, we definitely want to be in it. But it's also a very tricky market, because what do we know about insuring AI? Nobody's really done it, we don't have a great baseline, the technology's changing, yada yada yada. Then maybe they end up going out and calling in basically the same organizations and saying, hey, do you want to step up for us and be some sort of standard setter, or help us evaluate risks? My own synthesis, which may not be right, is that these are maybe two ways of creating the same sort of market, where these expert organizations, whether they're serving insurance companies or serving the California AG or some commission or whatever, are the groups trying to do the hardest thing: figuring out what the actual risks are and what should be required of companies to get into the game. But how realistic do you think that is?

Gabriel Weil: (1:57:38) So you could imagine insurance companies playing this sort of quasi regulatory role. There are multiple ways they could do that. They could say, we're not going to issue this policy unless you do x, y, or z, and they could delegate some of that rule development to a third party or develop that capacity in house. Another tool they have is underwriting: they can charge you more or less depending on what safety precautions you've taken. And that could be a collaborative process, where they say, here's our baseline premium, which is maybe pretty high for insuring this kind of risk, but then work with the AI companies: okay, show us all the safeguards you've put in place, and maybe they're things we haven't thought of. If you can convince us you've reduced the risk, then we can charge you less for this policy. I think by default insurers are going to be pretty cautious; they're going to want to write policies that on average pay out less than the premiums. So if we have liability insurance requirements, there's going to be a strong demand pull that pushes premiums up, and insurance companies are going to be in a strong position to insist, if they're going to write a policy these AI companies can afford, that the companies take various precautions. Say, back to what you were talking about earlier, if Anthropic implements something that the insurance industry thinks offers significant cost effective risk mitigation, the insurer can say, well, we'll give you a significant reduction in your premium if you implement that. I think that's a pretty attractive model.
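To make that underwriting logic concrete, here is a minimal sketch of how an insurer might price a required liability policy off expected losses and then credit demonstrated risk mitigation. This is not any actual insurer's rating model; the function name annual_premium and every number below are hypothetical, chosen only to illustrate the "premium falls as demonstrated risk falls" dynamic described above.

```python
# Toy premium model: expected loss plus a loading factor, with a discount for
# safeguards the insurer is convinced reduce the risk. All numbers hypothetical.

def annual_premium(p_incident: float, expected_payout: float,
                   loading: float = 0.3, risk_reduction: float = 0.0) -> float:
    """Price a liability policy as (expected loss) * (1 + loading),
    after crediting any demonstrated fractional risk reduction."""
    expected_loss = p_incident * expected_payout * (1.0 - risk_reduction)
    return expected_loss * (1.0 + loading)

# Baseline: a 2% annual chance of a $500M covered loss, with a 30% loading.
baseline = annual_premium(0.02, 500_000_000)
# Same policy if the lab convinces the insurer its safeguards cut the risk 40%.
with_safeguards = annual_premium(0.02, 500_000_000, risk_reduction=0.4)
print(f"baseline premium:       ${baseline:,.0f}")        # $13,000,000
print(f"with safeguards credit: ${with_safeguards:,.0f}") # $7,800,000
```

On these made-up numbers, the insurer's incentive and the lab's line up: every safeguard the lab can credibly demonstrate translates directly into a lower premium.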

Nathan Labenz: (1:59:09) Do you have any framework for this? Let's say I'm in the state legislature or wherever, and there's one bill with this sort of private governance MRO kind of system, and another that's more about codifying liability, maybe with an insurance requirement. And I'm kind of like, well, geez, I don't know. Both of these proponents sound pretty smart. They're both grappling with the fact that we can't just write rules now once and for all. They're both trying to tap into the power of the market and competition, and trying to create ways for new ideas to still be able to enter even once the ink is dry on the law. But I just don't know which one is better. Asking for a friend: how do I think about deciding which of these I wanna bet on?

Gabriel Weil: (1:59:57) Yeah. So, again, I think you don't have to totally choose between them. They're compatible as long as the liability protection in the MRO model doesn't extend to third parties. You could also imagine other carrots for the MRO model; there's no reason it has to be based on a liability shield. Or you could just require MRO certification, which doesn't have to be tied to carrots at all; it could be a stick based approach. So in that sense they're not incompatible. The only way they're incompatible is if you decide you want to make the incentive to get MRO certified a liability shield and you want to extend it to third parties. Beyond that, I think I've given the arguments for why that version of the MRO model is pretty unattractive, and I don't know that I have much more to add. It really does fall short in protecting third parties unless you think we're in this very particular situation where the government is both able to monitor these MROs and the politics work out such that it has the right incentives to certify only the ones that protect third parties. I just don't have confidence that that's gonna carry through, and that's what gives me some hesitance about the most robust version of this MRO model.

Nathan Labenz: (2:01:15) Yeah. I mean, you've got some good synthesis ideas there, so I like that. Maybe just a couple of final things; I really appreciate all your time, and you've certainly been very generous with it as I've asked many, many tangential and follow-up questions. In the spirit of red teaming this proposal, one question I always try to ask is: what might the AI companies do differently under this regime that could perhaps even be bad, as opposed to the good you're trying to induce? The one idea I came up with is inspired by the AI 2027 scenario, which, independent of this kind of legislation, projects the AI companies starting to increase the gap between what they deploy and what they have internally that they're using on their own AI research or what have you. Already we're getting to the point where the frontier model is good enough for a great many use cases. So they might, for multiple reasons, decide: maybe we don't want to tip our hand to competitors anymore; if we put this out there, who knows, somebody at another company might use it to do their research, and we definitely don't want that. They could have multiple reasons, but this could be an incremental reason to say, maybe we shouldn't deploy this, let's just keep it in house for our own internal use. I, in addition to wanting to keep using the best available AIs until the singularity, do kind of feel like the iterative deployment philosophy that OpenAI pioneered has a lot going for it. Obviously you can overdo a good thing and not test enough or whatever, but the contrasting idea, that somebody develops AGI in their basement and then springs it on the world suddenly one day, seems clearly not good. So this iterative approach does strike me as a good alternative, but loading up more liability on companies for deploying could perhaps cause them to go the other direction and say, well, we'll just make our own bid for superintelligence, see how that goes, and go from there. Any thoughts on that?

Gabriel Weil: (2:03:28) Yeah, I have two sets of thoughts on that. One is that these benefits from iterative deployment, at least when you're talking about misuse, are part of why I think pure strict liability in the misuse context doesn't necessarily make sense. One of the external benefits of deploying systems that are potentially susceptible to misuse is that you get these safety benefits, so that is part of the calculus there. But I do want strong strict liability for misalignment, so I think this critique still bites. And I think it's actually a subset of a broader concern that I flag in the original paper. A potential failure mode for this proposal, particularly the punitive damages aspect, is that if the most cost effective ways to mitigate these warning shot risks don't actually have much effect on the underlying uninsurable risks, then you're not getting much benefit out of it. In the formula I give in the paper for what the punitive damages should look like, there's this elasticity parameter. Elasticity here means: for every unit of reduction in the practically compensable harm, how much reduction do you get in the uninsurable risk? If you have a lot of these potential warning shot cases, maybe some are warning shots and some aren't; one thing that makes something a warning shot is that it's more elastic with these uninsurable risks. I think what you're saying is that maybe none of them are that elastic, because maybe the most cost effective way to mitigate the risk of these warning shots is to just not deploy externally. I don't have a strong reason to think that's the world we're living in, but I can't rule it out. A couple of things to say there. One is that maybe there are warning shots that come from internal deployment. If that's the case, nothing in my proposal depends on there being external deployment. There could be misuse by internal actors, there could be cyber attacks that cause your system to be accessed by bad actors, and there could be alignment failures where someone internally using your system causes some harm in the world. All of those would be subject to my regime. And when you're talking about liability insurance, I've talked at different points in this conversation about where the critical step is; I don't think those requirements should necessarily apply only to external deployment. If we think there are significant risks created earlier in the value chain, whether in training, pre training, fine tuning, or internal deployment, those might be generative of risks that you're potentially judgment proof for, and you should have to carry liability insurance for them. And particularly if we're moving to a world, and I don't know that we are, where more AI companies are adopting the SSI model of waiting to deploy until they have superintelligence, then it would be more important to have a regulatory gate earlier in the development process. I think that can partially address the concern. It might still be the case that not having external deployment is effective at stamping out these warning shots but doesn't actually mitigate the uninsurable risk, and I think that's a subset of the more general failure mode for this proposal.
If these warning shots aren't correlated in the right way, such that the most cost effective ways to mitigate them, the things that would be most attractive if you expect to pay out a large damage award, also mitigate the uninsurable risk, if there just aren't a lot of cases like that because the things we thought were warning shots aren't really warning shots in the sense that matters, then I agree you shouldn't wanna lean heavily on this proposal. Again, I don't have strong reasons to think that's the world we're living in, but that's the way I would think about it.
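To illustrate the role the elasticity parameter plays, here is a toy calculation. This is not the formula from Weil's paper, just a hypothetical sketch of how an elasticity term could scale a punitive top-up so that internalizing "warning shot" harms also internalizes correlated uninsurable risk; the function name punitive_multiplier and all numbers are invented for illustration.

```python
# Hypothetical illustration of an elasticity-scaled punitive multiplier.
# All names and figures below are made up; this is not the paper's formula.

def punitive_multiplier(expected_uninsurable_harm: float,
                        expected_compensable_harm: float,
                        elasticity: float) -> float:
    """Scale compensatory damages by how much uninsurable risk is expected
    to fall per unit of reduction in practically compensable harm.

    elasticity: 0 means warning shots tell us nothing about the uninsurable
    risk; 1 means the two move together one-for-one.
    """
    if expected_compensable_harm <= 0:
        raise ValueError("expected compensable harm must be positive")
    return elasticity * expected_uninsurable_harm / expected_compensable_harm


# Toy numbers: conduct that creates $10M of expected compensable harm and
# $10B of expected uninsurable harm, with elasticity 0.5, would turn a $1M
# compensatory award into a much larger punitive exposure.
compensatory = 1_000_000
multiplier = punitive_multiplier(10_000_000_000, 10_000_000, 0.5)
print(f"punitive damages: ${compensatory * multiplier:,.0f}")  # $500,000,000
```

The sketch also shows the failure mode discussed above: if the true elasticity is near zero, the multiplier collapses and the punitive mechanism does little work.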

Nathan Labenz: (2:07:15) Yeah. It's a good reminder, by the way, that there is in fact one company with a stated plan of not deploying anything until they hit superintelligence, which is a crazy world to be in, and it's crazy that it's at least somewhat credible, credible enough to raise billions of dollars, as it turns out. Fascinating stuff. Okay, two real quick final questions. One, I don't know if you have anything to say about this, and it may be somebody else's area, but obviously any time we do anything that would slow things down or impose additional costs or put more onus on developers, we get: oh, China's not going to do that, we're just going to lose to China. One answer, of course, is: let's not do the bad thing ourselves. Maybe China will do the bad thing, and that would be bad, but that doesn't mean we should do the bad thing. Do you know anything about how China or other countries are thinking about this kind of stuff? Obviously it's a totally different legal environment over there, but is that something you've looked into at all?

Gabriel Weil: (2:08:10) Yeah. So you sent me something along these lines, and I did do some digging today into what China's tort liability system looks like. The principles seem pretty similar structurally. They have a civil law system, so it's more code based and less common law based, but the substantive principles, negligence, products liability, and some narrow pockets of strict liability, are all pretty similar. Damages calculations tend to be less plaintiff friendly: there are what's called noneconomic damages, like pain and suffering, and Chinese courts tend to be less generous with those. There are also fewer lawyers there that take cases on contingency fees; I think there are some restrictions on when those are available. So fewer of these cases get litigated, because if you have to pay your lawyer by the hour, you might not be able to finance a case. So there are just fewer of these lawsuits more generally. I don't think they have any sort of bespoke liability regime for AI, but neither does The US, really, so in that sense they're on pretty equal footing. More generally, we don't have to assume that China is going to adopt the same domestic regulatory regime that we do. You can imagine some kind of framework where we encourage that. But more broadly, first of all, this critique could obviously be brought against any domestic AI regulation. So I don't think it's... I think, if anything

Nathan Labenz: (2:09:29) It is.

Gabriel Weil: (2:09:30) Liability is less vulnerable to this critique because it's more consistent with promoting socially useful innovation. The other question is just how binding we should treat this threat from China as being. It seems like we have a pretty significant lead on China at the frontier, and with export controls, which I have mixed views on the merits of but which do matter in this context, it seems likely that the lead is going to widen in the coming years, at least until China can indigenize its own chip supply chain. They can produce some chips right now, but they don't have access to new fabs from ASML, and I think in the medium term that's gonna be bad for their chip production. So it's not gonna be that hard, even if we're not going totally pedal to the metal in The US, to maintain a lead against China. I'm also less of a China hawk than most people are, and I think we share that view, so I'm less worried about the zero sum competition than other people are, but obviously reasonable people can disagree about that.

Nathan Labenz: (2:10:33) Cool. That's helpful. There's a lot more to unpack there, certainly. Real quick last one. I noticed you participated in the Principles of Intelligent Behavior in Biological and Social Systems program, aka PIBS. I've had a few guests with, I think, very interesting, unique takes on the AI question who have come through that program. I just thought you might give a testimonial or an invitation, or indicate what sort of people should be considering doing that themselves.

Gabriel Weil: (2:11:06) Yeah. So full disclosure, I'm on the board of the PIBS organization, but I will still answer this question honestly. I think PIBS has two distinctive value adds relative to other AI safety talent development pipeline type organizations: it wants to bet on more neglected ideas, and it wants to bring in a broader suite of people with different expertise. I had been socially connected to people worried about AI risk, but I hadn't worked on it professionally before I did PIBS. As I think I mentioned, I mostly did climate law and policy before I did PIBS two years ago, and it was very open to the set of expertise that I brought. I did teach torts, so I had a background in liability law. It was great at getting me up to speed on the technical issues and then letting me leverage the relevant expertise I already had. Most people who do PIBS don't do governance and policy type stuff; they more often do alignment work, though not necessarily what people think of as technical alignment work. A lot of it is more conceptual. But it's open to a broad range of disciplinary approaches. So I think it's a great way in for people who think they might have something to contribute to mitigating AI risk but haven't seen an obvious path; it's more open to different ideas. If you think that fits your interests, there's a fellowship that runs every summer. There's also an affiliate program, which I think is still ongoing, for people who are a little more senior. Most people who do PIBS are grad students or postdocs; I was a more senior person, already a professor when I did it. There's a residency aspect to it. When I did it, it was in Prague for half the summer; this summer it's in San Francisco. I found it very productive. You're in a co working space with other people working on this stuff, people to bounce ideas off of. I found it to be a really valuable experience, and I encourage people who think they might fit this broad description to explore it next summer.

Nathan Labenz: (2:13:08) Cool. Love it. I think there's definitely a big need for people from all different backgrounds with different novel ideas, and PIBS is great for that, and this conversation has been a great example of that. I appreciate your reorienting your legal career toward the AI challenges that only seem to be growing in importance. Any quick closing thoughts before I give you the official send off?

Gabriel Weil: (2:13:35) I think I've said most of what I wanted to say. So...

Nathan Labenz: (2:13:40) Cool. Well, Gabriel Weil, assistant professor of law at Touro University and senior fellow at the Institute for Law and AI. This has been great. Thank you for being part of the Cognitive Revolution.

Gabriel Weil: (2:13:49) Thanks. It's a lot of fun.

Nathan Labenz: (2:13:51) If you're finding value in the show, we'd appreciate it if you'd take a moment to share it with friends, post online, write a review on Apple Podcasts or Spotify, or just leave us a comment on YouTube. Of course, we always welcome your feedback, guest and topic suggestions, and sponsorship inquiries either via our website, cognitiverevolution.ai, or by DMing me on your favorite social network. The Cognitive Revolution is part of the Turpentine Network, a network of podcasts, which is now part of a16z, where experts talk technology, business, economics, geopolitics, culture, and more. We're produced by AI Podcasting. If you're looking for podcast production help for everything from the moment you stop recording to the moment your audience starts listening, check them out and see my endorsement at aipodcast.ing. And thank you to everyone who listens for being part of the Cognitive Revolution.
