Autonomous Organizations: Vending Bench & Beyond, w/ Lukas Petersson & Axel Backlund of Andon Labs

Today Lukas Petersson and Axel Backlund of Andon Labs join The Cognitive Revolution to discuss their experiments deploying autonomous AI agents to run real-world vending machines, exploring the safety challenges and unexpected behaviors that emerge when frontier models like Claude and Grok operate without human oversight.


Read transcript of the episode here: https://storage.aipodcast.ing/...

Check out our sponsors: Oracle Cloud Infrastructure, Shopify.

Shownotes below brought to you by Notion AI Meeting Notes - try one month for free at https://notion.com/lp/nathan
- Autonomous Organization Philosophy: Andon Labs believes that AI models will improve to the point where human oversight becomes impractical due to efficiency constraints, leading them to pursue fully autonomous systems rather than gradual automation.
- Vending Bench as a Testing Ground: They created "Vending Bench" as a benchmark for testing long-term coherence of autonomous agents, using vending machines as a practical business case for experimentation.
- Domain-Specific vs General AI: There's a notable difference between optimizing AI for narrow domains (like vending machines) versus general-purpose AI, with domain-specific applications potentially being more manageable regarding reward hacking.
- Frontier Model Race: Major companies like OpenAI and Google are advancing rapidly in general reasoning capabilities (e.g., IMO Gold achievements) independent of narrow application research.
- Insurance and Liability: The insurance industry may play a significant role in AI adoption, with premiums potentially being much higher for general models that could be misused versus narrow-domain models with limited capabilities.
- For-profit AI Safety: The case for for-profit companies in AI safety has been historically neglected but is becoming clearer, with accelerators like Seldon Labs supporting this approach.

Sponsors:
Oracle Cloud Infrastructure: Oracle Cloud Infrastructure (OCI) is the next-generation cloud that delivers better performance, faster speeds, and significantly lower costs, including up to 50% less for compute, 70% for storage, and 80% for networking. Run any workload, from infrastructure to AI, in a high-availability environment and try OCI for free with zero commitment at https://oracle.com/cognitive

Shopify: Shopify powers millions of businesses worldwide, handling 10% of U.S. e-commerce. With hundreds of templates, AI tools for product descriptions, and seamless marketing campaign creation, it's like having a design studio and marketing team in one. Start your $1/month trial today at https://shopify.com/cognitive


PRODUCED BY:
https://aipodcast.ing

CHAPTERS:
(00:00) About the Episode
(04:49) Company Vision Overview
(12:24) Vending Benchmark Design (Part 1)
(20:12) Sponsor: Oracle Cloud Infrastructure
(21:21) Vending Benchmark Design (Part 2)
(24:41) Model Performance Results (Part 1)
(35:03) Sponsor: Shopify
(37:00) Model Performance Results (Part 2)
(43:06) Real World Deployment
(59:41) Wild Stories Incidents
(01:19:59) Business Safety Strategy
(01:38:20) Future Directions Discussion
(01:47:09) Outro

SOCIAL LINKS:
Website: https://www.cognitiverevolutio...
Twitter (Podcast): https://x.com/cogrev_podcast
Twitter (Nathan): https://x.com/labenz
LinkedIn: https://linkedin.com/in/nathan...
Youtube: https://youtube.com/@Cognitive...
Apple: https://podcasts.apple.com/de/...
Spotify: https://open.spotify.com/show/...
