Embryology of AI: How Training Data Shapes AI Development w/ Timaeus' Jesse Hoogland & Daniel Murfet

Jesse Hoogland and Daniel Murfet, founders of Timaeus, introduce their mathematically rigorous approach to AI safety through "developmental interpretability" based on Singular Learning Theory. They explain how neural network loss landscapes are actually complex, jagged surfaces full of "singularities" where models can change internally without affecting external behavior—potentially masking dangerous misalignment. Using their Local Learning Coefficient measure, they've demonstrated the ability to identify critical phase changes during training in models up to 7 billion parameters, offering a complementary approach to mechanistic interpretability. This work aims to move beyond trial-and-error neural network training toward a more principled engineering discipline that could catch safety issues during training rather than after deployment.
Sponsors:
Oracle Cloud Infrastructure: Oracle Cloud Infrastructure (OCI) is the next-generation cloud that delivers better performance, faster speeds, and significantly lower costs, including up to 50% less for compute, 70% for storage, and 80% for networking. Run any workload, from infrastructure to AI, in a high-availability environment and try OCI for free with zero commitment at https://oracle.com/cognitive
The AGNTCY: The AGNTCY is an open-source collective dedicated to building the Internet of Agents, enabling AI agents to communicate and collaborate seamlessly across frameworks. Join a community of engineers focused on high-quality multi-agent software and support the initiative at https://agntcy.org/?utmcampaig...
NetSuite by Oracle: NetSuite by Oracle is the AI-powered business management suite trusted by over 41,000 businesses, offering a unified platform for accounting, financial management, inventory, and HR. Gain total visibility and control to make quick decisions and automate everyday tasks—download the free ebook, Navigating Global Trade: Three Insights for Leaders, at https://netsuite.com/cognitive
PRODUCED BY:
https://aipodcast.ing
CHAPTERS:
(00:00) Teaser
(04:44) About the Episode
(09:28) Introduction and Background
(11:01) Timaeus Origins and Philosophy
(14:18) Mathematical Foundations and SLT
(17:11) Developmental Interpretability Approach (Part 1)
(20:53) Sponsors: Oracle Cloud Infrastructure | The AGNTCY
(22:53) Developmental Interpretability Approach (Part 2)
(24:08) Proto-Paradigm and SAEs
(29:21) Generalization Theory Deep Dive
(34:59) Central Dogma Framework (Part 1)
(36:57) Sponsor: NetSuite by Oracle
(38:21) Central Dogma Framework (Part 2)
(39:19) Loss Landscape Geometry
(45:25) Degeneracies and Singularities
(52:09) Structure and Generalization
(01:00:20) Essential Dynamics Research
(01:05:04) Grokking vs Typical Learning
(01:12:03) Double Descent Discussion
(01:14:39) Interpretability and Alignment Applications
(01:22:01) Reward Hacking and Overgeneralization
(01:30:03) Future Training Vision
(01:36:20) Scaling and Compute Requirements
(01:38:19) Future Research Directions
(01:41:27) Outro
SOCIAL LINKS:
Website: https://www.cognitiverevolutio...
Twitter (Podcast): https://x.com/cogrev_podcast
Twitter (Nathan): https://x.com/labenz
LinkedIn: https://linkedin.com/in/nathan...
YouTube: https://youtube.com/@Cognitive...
Apple: https://podcasts.apple.com/de/...
Spotify: https://open.spotify.com/show/...