Demystifying LLMs with Mechanistic Interpretability Researcher Arthur Conmy

Explore the frontier of AI interpretability with Arthur Conmy's ACDC approach, automating the discovery of sub-circuits within transformers. Sponsored by NetSuite.


Watch Episode Here

Video Description

Join Arthur Conmy and Nathan Labenz for an accessible deep dive into interpretability research. They discuss how researchers have isolated sub-circuits within transformers that are responsible for different aspects of AI capability. Arthur introduces ACDC, a method he co-authored that automates the most time-consuming parts of this research. If you’re looking for an ERP platform, check out our sponsor, NetSuite.

(00:00) Episode Preview
(04:40) What attracted Arthur to mechanistic interpretability?
(07:49) LLM information processing: General Understanding vs Stochastic Parrot Paradigm
(14:00) ACDC paper: Towards Automated Circuit Discovery for Mechanistic Interpretability
(14:45) Sponsors: NetSuite | Omneky
(24:30) Putting together data sets
(32:39) How to intervene in an LLM's network activity
(36:00) Setting metrics to evaluate the production of correct completions
(44:20) The future of mechanistic interpretability research
(50:00) Extracting upstream activations in the ACDC project and evaluating impact on downstream components
(56:00) Anthropic research findings
(01:08:00) 3-step process of the ACDC approach
(01:22:00) Setting a threshold and validation
(01:27:00) Goal of the approach
(01:32:00) Compute requirements
*Correction at 1:33:00: Arthur meant to say "quadratic in nodes"
(01:35:30) Scaling laws for mechanistic interpretability
(01:40:00) Accessibility of this research for casual enthusiasts
(01:46:00) Emergence discourse
(01:56:00) Path to AI safety

Towards Automated Circuit Discovery for Mechanistic Interpretability

@labenz (Nathan)
@arthurconmy (Arthur)

SPONSORS: NetSuite | Omneky

-NetSuite provides financial software for all your business needs. More than thirty-six thousand companies have already upgraded to NetSuite, gaining visibility and control over their financials, inventory, HR, eCommerce, and more. If you’re looking for an ERP platform, check out NetSuite and defer payments on a full NetSuite implementation for six months.

-Omneky is an omnichannel creative generation platform that lets you launch hundreds of thousands of ad iterations that *actually work*, customized across all platforms, with the click of a button. Omneky combines generative AI and real-time advertising data. Mention "Cog Rev" for 10% off.
