Titans: Neural Long-Term Memory for LLMs, with author Ali Behrouz

In this episode of The Cognitive Revolution, Ali Behrouz, a PhD student at Cornell University, discusses his research on improving memory mechanisms in large language models, presented in his latest paper, Titans. Behrouz describes the limitations of current models in maintaining long-term coherence and introduces the idea of using a neural network as a memory module. He explains how architectures such as memory as context and memory as gate can improve long-term memory retention in AI systems. The discussion also covers challenges such as catastrophic forgetting and the need for more effective models in reinforcement learning and decision-making tasks, and closes with future directions and potential applications of advanced memory mechanisms in AI.


Upcoming Major AI Events Featuring Nathan Labenz as a Keynote Speaker
https://www.imagineai.live/
https://adapta.org/adapta-summ...
https://itrevolution.com/produ...


SPONSORS:
ElevenLabs: ElevenLabs gives your app a natural voice. Pick from 5,000+ voices in 31 languages, or clone your own, and launch lifelike agents for support, scheduling, learning, and games. Full server and client SDKs, dynamic tools, and monitoring keep you in control. Start free at https://elevenlabs.io/cognitiv...

Oracle Cloud Infrastructure (OCI): Oracle Cloud Infrastructure offers next-generation cloud solutions that cut costs and boost performance. With OCI, you can run AI projects and applications faster and more securely for less. New U.S. customers can save 50% on compute, 70% on storage, and 80% on networking by switching to OCI before May 31, 2024. See if you qualify at https://oracle.com/cognitive

The AGNTCY: The AGNTCY is an open-source collective dedicated to building the Internet of Agents, enabling AI agents to communicate and collaborate seamlessly across frameworks. Join a community of engineers focused on high-quality multi-agent software and support the initiative at https://agntcy.org/?utm_campai...

Shopify: Shopify powers millions of businesses worldwide, handling 10% of U.S. e-commerce. With hundreds of templates, AI tools for product descriptions, and seamless marketing campaign creation, it's like having a design studio and marketing team in one. Start your $1/month trial today at https://shopify.com/cognitive

NetSuite: Over 41,000 businesses trust NetSuite by Oracle, the #1 cloud ERP, to future-proof their operations. With a unified platform for accounting, financial management, inventory, and HR, NetSuite provides real-time insights and forecasting to help you make quick, informed decisions. Whether you're earning millions or hundreds of millions, NetSuite empowers you to tackle challenges and seize opportunities. Download the free CFO's guide to AI and machine learning at https://netsuite.com/cognitive


PRODUCED BY:
https://aipodcast.ing

CHAPTERS:
(00:00) About the Episode
(07:09) Introduction to the Cognitive Revolution
(07:33) Exploring Memory in Large Language Models
(09:10) Ali Behrouz's Research Journey (Part 1)
(13:47) Sponsors: ElevenLabs | Oracle Cloud Infrastructure (OCI)
(16:15) Ali Behrouz's Research Journey (Part 2)
(18:37) Understanding RNNs and Linear Attention
(20:39) Human Memory and AI Architectures (Part 1)
(27:54) Sponsors: The AGNTCY | Shopify | NetSuite
(32:16) Human Memory and AI Architectures (Part 2)
(32:23) Designing Effective Memory Modules
(44:15) Persistent Memory and Attention Mechanisms
(52:00) Queries, Keys, and Values in Attention
(01:12:41) Understanding Context and Surprise in Language Models
(01:14:19) Introducing the Momentum Concept
(01:14:28) Defining the Surprise Metric
(01:14:53) Momentary and Past Surprise
(01:15:37) Decay Mechanism in Surprise Metrics
(01:16:08) Optimizers and Test Time Training
(01:17:52) Memory Module and Runtime Queries
(01:24:01) Scalability and Efficiency in Training
(01:29:39) Strategies for Memory Integration
(01:37:23) Hybrid Approaches and Their Benefits
(01:39:08) Micro Skills and Task Performance
(01:50:59) Long Context Modeling and Titans' Advantages
(02:08:22) Future Directions and Applications
(02:11:39) Outro

SOCIAL LINKS:
Website: https://www.cognitiverevolutio...
Twitter (Podcast): https://x.com/cogrev_podcast
Twitter (Nathan): https://x.com/labenz
LinkedIn: https://linkedin.com/in/nathan...
YouTube: https://youtube.com/@Cognitive...
Apple: https://podcasts.apple.com/de/...
Spotify: https://open.spotify.com/show/...
