Intelligence on the Edge: Liquid AI's Ramin Hasani on the Search for Device-Native Foundation Models

Hello, and welcome back to The Cognitive Revolution. Happy Fourth of July to everyone in the United States. Today, my guest is Ramin Hasani, CEO of Liquid AI, a company founded by MIT researchers that's developing device-native foundation models. I'll say upfront that just before recording, I encouraged Ramin to go deep into the weeds on the technical details of Liquid's work. As you'll hear, he did a truly excellent job, demonstrating a mix of technical sophistication, differentiated vision, and contagious passion that, in my humble opinion, makes this episode an instant classic.

We start with an overview of the research into tiny, biologically inspired, differential-equation-based neural networks that Ramin and his team developed at MIT, and which inspired them to start the company. Some of the capabilities they demonstrated, such as parking a car with a control module that consisted of just 12 liquid neurons, still sound a bit like science fiction today. While those systems haven't scaled up to today's capabilities frontier, the company has maintained the Liquid philosophy, which today means taking a neutral, empirical approach to designing and optimizing neural networks to perform under all sorts of exotic constraints, most commonly the need to run on edge devices with limited memory and processing power.

Consider three facts. First, the global smartphone and laptop market is roughly eight hundred billion dollars per year, a number that the global AI data center build-out is only now surpassing. Second, the demand for inference threatens to price much of the world out of the frontier model market. Third, many enterprises and individuals value privacy and the ability to control their own information. Taken together, these make this an absolutely massive market opportunity unto itself.
Liquid also has serious proof points, including the number five spot on the Hugging Face United States total downloads leaderboard and notable partnerships with companies such as Shopify and Mercedes-Benz. Plus, anyone can quickly download and try Liquid's Apollo app. It shows that even a one-billion-parameter model, which combines a small number of attention layers with a very simple gated learned convolution, can run fast enough on an iPhone to be a real option for basic use cases such as privately searching through and classifying one's own local documents, even though it is admittedly far from the frontier.

Perhaps most interesting is the network architecture search process that Liquid uses to develop networks for particular use cases and runtime environments. Having found that proxy metrics too often lead the process astray, they now evaluate models on real downstream tasks, on the actual target hardware that their customers intend to use. Ramin shares a lot more detail on their findings, but in short: attention-based architectures continue to generalize better than any known alternative and therefore continue to dominate the frontier, yet the more specific your use case and the more limited your available compute, the more likely their search process is to land on an exotic architecture. This, for now, is where architectures like Mamba and other sub-quadratic innovations really shine.

Toward the end, Ramin teases a platform that Liquid will soon introduce to let customers fine-tune small models for their own use cases on a self-serve basis. For multiple reasons, including the potential to ease demand for frontier models and improve access to AI globally, I, for one, will be very excited to see that come online.

And so, without further ado, I hope you enjoy this high-energy look at how Liquid AI is squeezing as much intelligence as possible out of any given computational resource, with co-founder and CEO Ramin Hasani.

Thank you for being part of The Cognitive Revolution,
Nathan Labenz