Mechanistic Interpretability: Philosophy, Practice & Progress with Goodfire's Daniel & Tom


In this episode, Daniel Balsam and Tom McGrath of Goodfire discuss the future of mechanistic interpretability in AI models. They explore fundamental inputs such as models, compute, and algorithms, and emphasize the importance of a rich empirical approach to understanding how models work. They share insights into ongoing projects and breakthroughs, particularly in scientific domains and creative applications, as they aim to push the frontiers of AI interpretability. They also discuss the company's recent funding and their goal of advancing interpretability as a critical area of AI research.

SPONSORS:
Box AI: AI is delivering truly measurable productivity, and strategic companies are already gaining a 37% productivity edge. Discover how in Box’s new 2025 State of AI in the Enterprise Report. Read the full report here: https://bit.ly/43uVP52

Oracle Cloud Infrastructure (OCI): Oracle Cloud Infrastructure offers next-generation cloud solutions that cut costs and boost performance. With OCI, you can run AI projects and applications faster and more securely for less. New U.S. customers can save 50% on compute, 70% on storage, and 80% on networking by switching to OCI before May 31, 2024. See if you qualify at https://oracle.com/cognitive

ElevenLabs: ElevenLabs gives your app a natural voice. Pick from 5,000+ voices in 31 languages, or clone your own, and launch lifelike agents for support, scheduling, learning, and games. Full server and client SDKs, dynamic tools, and monitoring keep you in control. Start free at https://elevenlabs.io/cognitiv...

NetSuite: Over 41,000 businesses trust NetSuite by Oracle, the #1 cloud ERP, to future-proof their operations. With a unified platform for accounting, financial management, inventory, and HR, NetSuite provides real-time insights and forecasting to help you make quick, informed decisions. Whether you're earning millions or hundreds of millions, NetSuite empowers you to tackle challenges and seize opportunities. Download the free CFO's guide to AI and machine learning at https://netsuite.com/cognitive

Shopify: Shopify powers millions of businesses worldwide, handling 10% of U.S. e-commerce. With hundreds of templates, AI tools for product descriptions, and seamless marketing campaign creation, it's like having a design studio and marketing team in one. Start your $1/month trial today at https://shopify.com/cognitive


PRODUCED BY:
https://aipodcast.ing

CHAPTERS:
(00:00) About the Episode
(04:44) Introduction and Welcome
(05:22) Framing the Field of Machine Learning
(06:11) Empirical Data and Interpretability
(09:51) Challenges in Model Experimentation
(10:28) Unsupervised Learning and Interpretability
(12:12) The Role of Compute and Algorithms
(14:48) Analogies in Interpretability (Part 1)
(16:22) Sponsors: Box AI | Oracle Cloud Infrastructure (OCI)
(19:13) Analogies in Interpretability (Part 2)
(19:40) Philosophical Questions in Interpretability
(23:19) Current State and Future Directions
(32:20) The Paradigm of Interpretability (Part 1)
(34:54) Sponsors: ElevenLabs | NetSuite | Shopify
(39:32) The Paradigm of Interpretability (Part 2)
(41:43) Competing Approaches and Techniques
(48:14) Machine Learning Techniques for Better Decomposition
(57:21) Minimum Description Length and Interpretability
(59:27) Understanding Minimum Description Length
(59:56) Sparse Autoencoders and Optimization Targets
(01:03:35) Challenges in Model Reconstruction
(01:05:02) Dark Matter in Scaling Analysis
(01:06:43) Exploring Features and Interpretability
(01:19:21) Scientific Discovery and Interpretability
(01:43:52) Applications of Interpretability Techniques
(01:50:43) Goodfire's Mission and Future Directions
(01:53:46) Outro

SOCIAL LINKS:
Website: https://www.cognitiverevolutio...
Twitter (Podcast): https://x.com/cogrev_podcast
Twitter (Nathan): https://x.com/labenz
LinkedIn: https://linkedin.com/in/nathan...
Youtube: https://youtube.com/@Cognitive...
Apple: https://podcasts.apple.com/de/...
Spotify: https://open.spotify.com/show/...
