Inside Nathan's Second Brain: Daniel Miessler, Security Expert & Creator of PAI, Audits My AI Setup
Security expert Daniel Miessler audits Nathan's personal AI infrastructure, including a Claude Code memory database and autonomous agents for scheduling, communications, and projects. They discuss agent hierarchy, security, AI disclosure norms, ideal-state prompts, and Bitter Lesson engineering.
Show Notes
Daniel Miessler returns to discuss Nathan's newly built personal AI infrastructure, including a Claude Code instance with a 1 GB database of five years of digital history and two autonomous AI "employees" that handle scheduling, communications, and projects independently. They dive deep into agent hierarchy design, security measures, social norms around AI-human interaction and disclosure, and why sharing your "ideal state" with AI leads to more proactive assistance. Daniel also introduces his concept of "Bitter Lesson engineering" and shares the instruction he's given his AI to alert him if it ever develops subjective experience.
LINKS:
- Deep Context Toolkit
- Fabric AI Framework
- Personal AI Infrastructure
- Unsupervised Learning Newsletter
- Daniel Miessler Website
- Anthropic Claude Code
- OpenClaw Agent Harness
- Hermes Agent Repository
- Tailscale Mesh VPN
- Headscale Control Server
- 1Password Password Manager
- Infisical Secrets Management
- Wispr Flow Voice Dictation
- Twilio Communications API
- ElevenLabs AI Voice
- Ollama Local Inference
- Kimi K2 Repository
- Descript Podcast Editor
- Fireflies Meeting Transcription
- Granola Meeting Notes
- Obsidian Knowledge Management
- Mercury Virtual Cards
- Cloudflare Workers Platform
- Richard Sutton Bitter Lesson
- Karpathy LLM Wiki Gist
- AI Consciousness Podcast Episode
Mercury: Run your finances with virtual cards, spending limits, merchant/category locks, and AI-friendly tools like API keys, MCP, and CLI. Check out Mercury at mercury.com
Sponsors:
Brave Search API:
Brave Search API gives AI agents a fast, independent search index for research, RAG pipelines, images, places, and fewer hallucinations. Get $5 in free credits at https://brave.com/search/api/?mtm_campaign=q2-26-cognitive-revolution
Sequence:
Sequence handles the full revenue workflow for complex pricing, from quoting and metering to invoicing, revenue recognition, and collections. Book a public demo at https://sequencehq.com and use code COGNISM in the source field to save 20% off year one
Claude:
Claude by Anthropic is an AI collaborator that understands your workflow and helps you tackle research, writing, coding, and organization with deep context. Get started with Claude and explore Claude Pro at https://claude.ai/tcr
CHAPTERS:
(00:00) About the Episode
(06:52) Special Sponsor
(08:21) Building personal infrastructure
(20:14) Memory retrieval works (Part 1)
(22:55) Sponsors: Brave Search API | Sequence
(25:19) Memory retrieval works (Part 2)
(33:35) Audits and oversight (Part 1)
(34:37) Sponsor: Claude
(36:28) Audits and oversight (Part 2)
(45:05) AI authenticity norms
(50:56) Telos ideal state
(58:44) Interfaces and mobility
(01:05:12) Autonomous Mac mini
(01:13:24) Secrets and vendors
(01:27:02) Agent access hierarchy
(01:36:23) Agent roles and defenses
(01:49:56) Outbound agents and models
(02:03:47) Private inference routing
(02:13:48) Bitter lesson maintenance
(02:20:33) Voice and consciousness
(02:28:59) Episode Outro
(02:32:41) Outro
PRODUCED BY:
SOCIAL LINKS:
Website: https://www.cognitiverevolution.ai
Twitter (Podcast): https://x.com/cogrev_podcast
Twitter (Nathan): https://x.com/labenz
LinkedIn: https://linkedin.com/in/nathanlabenz/
Youtube: https://youtube.com/@CognitiveRevolutionPodcast
Spotify: https://open.spotify.com/show/6yHyok3M3BjqzR0VB5MSyk
Transcript
This transcript is automatically generated; we strive for accuracy, but errors in wording or speaker identification may occur. Please verify key details when needed.
Introduction
[00:00] Hello, and welcome back to the Cognitive Revolution!
Today, I'm excited to welcome Daniel Miessler — security researcher and founder of Unsupervised Learning – back for his second appearance on the podcast.
Back in January, we discussed his Personal AI Infrastructure framework, and since then, taking inspiration from him and others, I've built my own. So this time, I share the details of what I've built and get his take on everything from the mental model I use to relate to my AI agents, to the steps I can take to continue to improve my security, to the process of continually improving the system, and beyond.
As a preview, I would broadly break my AI stack into two parts. The first, an instance of Claude Code that runs on my main personal laptop, with full access to information and accounts, I consider an extension of myself, and as such it does only what I tell it to do.
It took a significant investment to assemble all the context necessary to make this work, but at this point I have a 1 GB database that contains the last 5 years of my digital history, spanning email, calls, podcasts, social media content, and DMs across platforms – plus a layer of monthly, annual, and topic-level summarization – and with all that information available for fast local search, Claude can find just about anything I ask it to find, even if my own memory has grown hazy with time. It really is amazing!
If you're interested in setting something like this up for yourself, I've created a public repository on my GitHub, which you can find linked in the show notes, containing the core tools and processes you'd need to get started.
The second part of my setup is more experimental. Taking inspiration from Daniel, Jesse Genet, and countless others, I've also created 2 new AI employees, one powered by Claude Code, and one by OpenClaw, which are intended to act more autonomously based on my high-level direction.
I've never previously named an AI, but knowing that these agents would need to interact with humans, and other AIs, in order to accomplish bigger projects on their own, I finally broke down and gave them names. I'm calling my Claude Code instance Aide, while the OpenClaw is Clai. I chose these names to reflect the roles I want them to play and the fact that I'm ultimately responsible for their nature and behavior, and I am spelling both with an "ai", both as a hint to others, and as a constant reminder for myself.
Infrastructure-wise, these agents live on a new entry-level Mac Mini, which is always on, regardless of whether I'm home or on the road. To access it remotely, I'm using Tailscale to create a virtual private network to which only my two computers and my iPhone belong. On top of this, I use Apple's native screen-sharing when I need to log in to the Mac Mini from my laptop, the Screens app when I need to log in from my phone, and the Termius app when I want to issue command line commands from mobile. All of that ensures that I can reset things if something crashes, or whatever the case may be, but the real interface I use most these days is a custom Agent messaging app built for me by Claude Code, which allows me to send requests to agents, and also allows them to work together.
The autonomous agents have their own Gmail account, GitHub account, and heavily restricted Mercury virtual credit cards, but this communication layer allows them to ask my main laptop Claude Code for additional information, or ask me for permission to use my accounts, when needed.
Importantly, while I can access the Mac Mini and control agents from either laptop or phone, this message system is the only way that the autonomous agents can reach out to us – they do NOT have access to the full deep context that lives on the laptop.
It took a lot of exploration and iteration to arrive at this setup, and I'm still constantly improving it, but at this point, it is working well, and as such I am actually starting to achieve my 2026 goal of getting away from my desk and spending more time outside.
As an example of the scale of project that such AI employees can handle… this coming week, Prakash and I are going to do daily "AI in the AM" live shows, Monday through Thursday. So, after having my laptop Claude Code, which remains unnamed, scan my email for interesting guests, I gave a list of 25 potential guests to Aide and let it manage the communications and scheduling process. And believe it or not, it's booked a full week's worth of guests without embarrassing us, and I think without most people even realizing that they're talking to an AI.
Daniel's commentary on all this is, as you'd expect from a Personal AI Infrastructure pioneer, super constructive.
He explains why a clear hierarchy among AI agents beats emergent teamwork.
We get into the weeds on the security measures I've taken, he explains why he advises people to design their systems to depend on as few major tech platforms as possible, and he describes the incident response skill he's created to immediately rotate his keys and tokens if ever needed.
We discuss the social norms around human-AI interactions – including the norms around disclosure – my agents are not meant to identify themselves as AI proactively, but are instructed never to lie – and why the old notion that "it's the thought that counts" might become stronger than ever.
He encourages me to invest more time & effort in sharing what he calls my "ideal state" with the AIs, so that they can be more proactive and creative in their attempts to help me achieve my big-picture goals, and he also explains his concept of "Bitter Lesson engineering" and why I should still be building more continual self-updating and self-improvement processes.
He even shares the instruction he's given his personal AI to alert him if it ever "wakes up" and begins to have its own subjective experience.
There's a ton of bleeding edge stuff in this episode, and for some it might be information overload. But I would note that this episode is, in all seriousness, as much for your agent as it is for you. So I definitely encourage you to point your agent at the transcript and ask it to determine which of the ideas we discuss would be most valuable given your current setup, your recent work, and your stated goals.
I've done that myself, and my system is already better for it.
With that, I encourage you to check out the song at the end of today's episode (which Daniel made himself!) and to tune in this week at 12pm Eastern, 9am Pacific, as we take our AI in the AM experiment to the next level. I hope that you enjoy, and that your agents find value in, my personal AI infrastructure review with the one and only Daniel Miessler.
Sponsor
[06:52] The Cognitive Revolution is brought to you by Mercury, the fintech that more than 300,000 ambitious companies and individuals trust to run their finances. Over the last few months, I have made tremendous strides with my personal AI infrastructure. Today, I've got high-context instances of both Claude Code and OpenClaw running on a Mac Mini, and it's amazing what they can do. However, until getting started with Mercury, I didn't have a great way for them to pay for things. I didn't want to give them unrestricted access to my money, but my old bank didn't give me any other options. With Mercury, I can create as many virtual cards as I want, each with its own daily, weekly, or monthly spending limit, and I can lock any card to a single category of purchase or even a single merchant. Now I have a card that my agent can use to buy our family's groceries and only our groceries, and I can create another anytime I want to give an agent a random one-off project that might require making a purchase. This is honestly just the start of Mercury's AI-friendly offerings. Does your bank offer API keys, an MCP, or a CLI tool? If not, check out Mercury at mercury.com. Mercury is a fintech company, not an FDIC-insured bank. Banking services provided through Choice Financial Group and Column N.A., Members FDIC. Thank you to Mercury for supporting The Cognitive Revolution. And now, on with the show.
Main Episode
[08:21] Nathan Labenz: Daniel Miessler, welcome back to the Cognitive Revolution.
[08:25] Daniel Miessler: Awesome, thanks for having me.
[08:27] Nathan Labenz: It's been only four months since our first episode. Time flies in the AI space, as you know better than most. I'm excited for this conversation because what I've been doing since then, taking inspiration from you and others, has been building up my own personal AI infrastructure. At this point, I'm far enough along that I feel like there's definitely a lot of value in it for me, and possibly it's worth sharing. But rather than just coming on here and monologuing about what I've done, I thought it would be more helpful to everybody, starting with myself, to share it with you and get your feedback as we go on a bunch of different dimensions: what you think I've done well, what value I've maybe left on the table, what security vulnerabilities I've perhaps left open for myself without realizing it. Who knows what other ideas and topics we'll get into as we go. I think this is going to be really fun. So thank you for coming back and doing it.
[09:24] Daniel Miessler: Yeah, fantastic. Can't wait to hear about it.
[09:27] Nathan Labenz: I guess for starters, here's where I started after our conversation last time. For context, at that point I had already been using Claude Code a lot to do stuff, coding different apps and what have you. But I hadn't really gotten too serious about my own personal AI infrastructure, my own little nest of my own creation with all my little idiosyncrasies embodied. So I took your repository from GitHub, and I also took another toolkit that a friend named Chris created and shared with me, and I just put them in two repositories on my main laptop. I started by telling Claude: OK, here's what I'm trying to do. I want to make my own version of this. I think both of these guys have done more than I have, and are inspirations. I asked Claude to review the two repositories, compare and contrast, ask me some questions about what I'm trying to do, and ultimately do some sort of synthesis that was a little bit more me-flavored while, of course, taking the best and most relevant stuff from each of those sources. And then my big thing: this has been kind of a white whale for me for a long time, and I think it will also bring up some questions around what the social norms, and maybe even the ethics, of these sorts of systems should be. The thing I've been trying to get models to do for a long time is write as me. I will caveat that first by saying that, at least so far, I always still edit pretty heavily when things write as me, and I never just pass something straight through from Claude. I've actually only done that one time on the podcast, and I did flag it as such. It was with Andon Labs, because they are doing autonomous businesses, so I thought it would be fitting for that episode to just let Claude write the intro entirely on its own and read exactly what it wrote.
Otherwise, though, I do always still edit things pretty heavily, but I've found it super helpful in the context of writing these podcast intro essays. Even if I sit there and rewrite the whole thing, it makes sure that I have all the points, and it has at least a decent first approximation of the form that I'm going to put together. That's worked well for a long time. But the key thing there has been that it's always basically the same process: here's 50 essays I did before, here's the transcript of the current one, write a new one. It spits it out, and I was like, that's good enough that I bet I could get quality output for other use cases. But as we discussed last time, context becomes the big bottleneck, right? Does the system know who these people are, what my relationship to them is, how I tend to respond in these types of situations? And the answer is no, certainly not by default. So the first big thing I did back in January was set up essentially a second-brain kind of system, where the first goal was: can we get all of my digital output, as much as exists, which is quite a bit, into one place where it's searchable, indexed in some way, maybe filtered for quality, with various angles on it? But the first thing was just export. So export from Gmail, export from Slack, everything I've ever tweeted, all the podcasts, et cetera. Put that all into.
[12:58] Daniel Miessler: Real quick question, what would be in Gmail that you would consider content?
[13:04] Nathan Labenz: The filter that I used, and I've broadened it slightly since, was just anything that had a "from me." If you go into Gmail and search from:me, you get every thread that you initiated or responded to. Basically, I pulled in all of that.
[13:25] Daniel Miessler: OK. But I'm saying, is that to track who you've talked to and what the business relationships are? Or do you think that you might have sent out an essay or a cool idea there that you were trying to harvest?
[13:40] Nathan Labenz: A bit of both. There are a couple of post-processing layers that I've experimented with on top of that. Around the same time, I was working a little bit with a startup that was trying to fine-tune models for individuals. Their idea was: we need to create AIs that help people preserve their economic leverage, and the only way to do that is to create something that really augments them, as opposed to these massive cloud blob behemoth things, where it seems like the companies are trying more to substitute for people than to augment them. For that, they needed a writing sample. So one layer of post-processing was to try to identify which emails I've sent would make good writing samples, ones that would show a thoughtful side of me as opposed to just an operational back-and-forth side. That was interesting, and it worked reasonably well. I think I used Gemini 3 Flash at the time to take in a bunch of these threads and score them on a few different dimensions, like originality, substantiveness, whatever. Doing a little querying on top of that could pull those things out and drop them into a kind of master writing sample doc. I could say, OK, 50,000 words or 200,000 words, or whatever you need. I've got enough, but how can I bring the cream to the top? So that was one initial use of that.
Another post-processing layer, and I'm not sure if this maybe came in May or something, when everybody was all of a sudden doing wikis, came from an idea I'd had over time. Now I have this thing, and it comes from like 8 different sources. It has iMessage in there. I downloaded an app called Beeper, which gives me some problems, to be totally honest, but does a decent job of consolidating DMs across all platforms into a single thing. It has a desktop API, so the model can pull messages from a bunch of different platforms. The idea I'd had for a long time was: if I have all this correspondence in one big blob, then I could go back in time and do summaries with some frequency. What I ended up landing on was roughly monthly summaries. For me, a couple hundred thousand tokens per month seemed to be the average. That would include my contributions to the thread, but also what people had sent me; or, if it's a group chat, it could be mostly other people and just occasionally me. The idea is to create a comprehensive picture of my digital life, and then develop a prompt that takes those couple hundred thousand tokens per month and summarizes them with roughly an order of magnitude reduction, from say 200,000 to 300,000 tokens down to 20,000 to 30,000 tokens. That's honestly still a lot. It gives you really nitty-gritty coverage of what happened in your month. At 20,000 tokens, you get a pretty low-level description of a month. Think of it as about 1,000 tokens per workday. Did you really have 1,000 tokens? You know, you could.
[16:51] Daniel Miessler: You could even do that again to get to 2,000, right, if you needed to.
[16:56] Nathan Labenz: Yeah. Well, I did that on an annual basis. I went month by month for the last five years, taking the raw material from each month plus the summaries from the previous couple of months, and rolled through creating monthly summaries. Then I layered the annual summaries on top of that, and on top of that the high-level summary. This is where I was trying to get to something much like what you had built: here's the picture of now, with a very deeply informed history, and here is the current state of affairs on top of that. Then, with Karpathy and other people going to the wiki idea, another layer was: let's make a wiki on top of this that identifies individuals, looks back at all time, and summarizes the relationship with this individual, with this organization, with these ideas, whatever the case may be. I think that wiki has maybe 500 articles in it now.
[17:53] Daniel Miessler: And are those tagged with front matter, like with references to the other documents? Is it Markdown, or what's the structure of that, do you know?
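The rolling process Nathan describes here (each month summarized together with the previous couple of monthly summaries, then rolled up by year) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not his actual pipeline: the `summarize` callable stands in for whatever model call is used, and the function names, prompt labels, and "YYYY-MM" key format are all hypothetical.

```python
from typing import Callable, Dict, List

# Hypothetical stand-in for a model call: takes a prompt, returns summary text.
Summarizer = Callable[[str], str]


def rolling_monthly_summaries(
    raw_by_month: Dict[str, str],
    summarize: Summarizer,
    lookback: int = 2,
) -> Dict[str, str]:
    """Summarize each month using its raw text plus the prior `lookback` summaries."""
    summaries: Dict[str, str] = {}
    ordered = sorted(raw_by_month)  # "YYYY-MM" keys sort chronologically
    for i, month in enumerate(ordered):
        # Carry forward only the most recent summaries as context.
        prior: List[str] = [summaries[m] for m in ordered[max(0, i - lookback):i]]
        prompt = "\n\n".join(
            ["PRIOR SUMMARIES:"]
            + prior
            + [f"RAW MATERIAL FOR {month}:", raw_by_month[month]]
        )
        summaries[month] = summarize(prompt)
    return summaries


def annual_summaries(monthly: Dict[str, str], summarize: Summarizer) -> Dict[str, str]:
    """Roll monthly summaries up into one summary per calendar year."""
    by_year: Dict[str, List[str]] = {}
    for month in sorted(monthly):
        by_year.setdefault(month[:4], []).append(monthly[month])
    return {year: summarize("\n\n".join(parts)) for year, parts in by_year.items()}
```

Passing the summarizer in as a plain callable keeps the sketch independent of any particular model or SDK, and makes it easy to rerun the whole roll-up later with a different model.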
[18:03] Nathan Labenz: Yeah, it's pretty much just plain-text Markdown. It's a great question, because one of the intricate parts that did require some iteration as I was going through that was how best to link back to source material in those summaries. What I landed on for now, which I don't know is the best solution but seemed to work pretty well, was having the summarizer prompt put in inline references to source material with distinctive quotes. What I ended up asking it for was anything from 2 to 20 words such that, if you search for just that literal string, it'll immediately take you to that document. Pretty much a direct reference. I guess I could have done it by ID or whatever as well, but I ended up doing it by these actual excerpts, plus a little bit of additional metadata. So it would be a quote, and then "this is from a DM with this person on LinkedIn," or whatever. Overall, I'd say that seems to have worked pretty well, but I'm definitely open to the possibility that there's room for improvement there.
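The quote-as-reference idea can be checked mechanically: a quote is a usable pointer only if a literal search for it lands on exactly one source document. A minimal sketch, assuming the sources are available as an in-memory mapping from a document ID to its text; the function names and the whitespace normalization are illustrative choices, not something described in the conversation.

```python
from typing import Dict, List


def _normalize(text: str) -> str:
    """Collapse runs of whitespace so line wrapping does not break a literal match."""
    return " ".join(text.split())


def find_sources(quote: str, documents: Dict[str, str]) -> List[str]:
    """Return the IDs of all documents that contain the quote as a literal substring."""
    needle = _normalize(quote)
    return [doc_id for doc_id, text in documents.items() if needle in _normalize(text)]


def is_distinctive(
    quote: str,
    documents: Dict[str, str],
    min_words: int = 2,
    max_words: int = 20,
) -> bool:
    """A quote works as a reference if it is 2 to 20 words and matches exactly one document."""
    word_count = len(quote.split())
    if not (min_words <= word_count <= max_words):
        return False
    return len(find_sources(quote, documents)) == 1
```

A check like `is_distinctive` could be run over each generated summary to flag quotes that are too generic (matching several documents) or that the summarizer paraphrased rather than copied (matching none).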
[19:12] Daniel Miessler: Yeah, interesting. I kind of did the same thing. I kind of reverse engineered it. Essentially, I wanted Obsidian functionality without the Obsidian client. So I'm like, OK, why are people so ravenous and religious about Obsidian? I wanted to separate in my mind how much of it is the client versus how much of it is the connectivity. Because I'm not a big client person. I think the client goes away, which is part of what I imagine we'll talk about. But I wanted to know the schema that it was using, right? And it's just references: this document links to that one, it's related to this one in that way. So I'd already started building that, and that was kind of the structure of my memory files. Then when LLM Wiki came out, it made it a lot more concrete. It's like, wait a minute, I could do this for everything. So I did my bookmarks that way. Everything is a marked-up, highly referential Markdown file for my memory system. So I think that pretty much rhymes.
[20:14] Nathan Labenz: Cool. Well, I don't know about you, but I would say it has been very effective. The first smoke test, which honestly is already quite valuable, is this: I sort of remember something that happened. I had some correspondence. I may not be sure with whom, or exactly what was said, but I can vaguely gesture at something, and I know it's in there somewhere. Can the AI find it, retrieve it for me, and give me a refresher on the context for this highly lossy memory that I have? And it works great for that. It is unbelievable how often I can just go, yeah, I kind of know something, and the next thing you know it says: I think I found it. It was this, and here's what happened. And I'm like, wow, right off the bat, that is an incredible tool to be able to rely on. I've also noticed it has changed my behavior a little bit in some ways. Calls are another thing that I index this way. I've been using Fireflies, which people sometimes laugh at me for using. I'm not comparing it in a feature-by-feature way right now.
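The "this document links to that one" structure can be recovered from plain Markdown files without any client. The sketch below assumes Obsidian-style `[[wikilinks]]` (including the `[[Note|alias]]` and `[[Note#heading]]` forms) as the link syntax; Daniel does not say which syntax his memory files use, so treat both the syntax and the function names as assumptions for illustration.

```python
import re
from collections import defaultdict
from typing import Dict, List

# Matches [[Target]], [[Target|alias]] and [[Target#heading]]; captures only Target.
WIKILINK = re.compile(r"\[\[([^\]|#]+)(?:[#|][^\]]*)?\]\]")


def outgoing_links(markdown: str) -> List[str]:
    """Return the note names that a Markdown body links to, in order of appearance."""
    return [target.strip() for target in WIKILINK.findall(markdown)]


def backlinks(notes: Dict[str, str]) -> Dict[str, List[str]]:
    """Invert the link graph: for each target note, list the notes that link to it."""
    index = defaultdict(set)
    for name, body in notes.items():
        for target in outgoing_links(body):
            index[target].add(name)
    return {target: sorted(sources) for target, sources in index.items()}
```

Because the graph is derived from the files each time, the Markdown stays the single source of truth, and any client (or agent) can rebuild the same connectivity.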
[21:29] Daniel Miessler: Granola or whatever.
[21:30] Nathan Labenz: Yeah. Well, Granola, actually. We did an episode with Granola, and they sponsored the podcast for a minute, so I actually have been using both. One thing that Granola doesn't do, which I think in many contexts may be the right choice for them, is record the original audio. They do have a transcript, but no source audio file, so I did, like, both. Yeah, they have interesting reasons for it. I think they feel like it's just maybe a lot to ask in a lot of contexts. They're trying to do enterprise deployments, and whether people are really fully ready for that is an interesting question. I'm ready for it. But I can also understand how a lot of people at the companies that they're trying to sell into and deploy to might be more comfortable without it. That was one feature-level difference I did notice that kept me on two tracks.
[22:27] Daniel Miessler: So, one idea real quick, because this came up here. You mentioned something earlier, the full ingest. I'm a huge fan of this concept that the tech is moving so fast that we always want to preserve the raw. At the end of 2025, we did things a certain way, based on the constraints of the tech at that moment, right? And now we do things another way, based on the constraints of this moment. So the question is, how much are you going from raw to summary? That depends on the size of the context windows. It depends on how stupid things get at a certain token length of context window, right? But what I love is if we have the raw and we just always keep that. At any given moment, let's say 5 comes out from Anthropic or 6 comes out from OpenAI, and it's a step change. Basically, your first prompt is: go look at my current system, that's kind of where I was heading, but look over here, you actually have all the raw stuff. Let's rebuild it from scratch, better. And it's like, OK, well, we wouldn't do summarization at that level. We would do it at a completely different level. You never want to be in a situation where it's like, sure, I can absolutely make this 1,000 times better, hand me the raw stuff, and you're like, oh, I don't have the raw stuff, here are my old summaries. And it's like, well, I guess I'll do the best I can, but I wish I had it. So all that to say, I agree with your instinct there. We should always have the raw, because if you have it, we can rebuild our entire system from scratch as the AI gets better.
Sponsor
[22:55] Brave Search API: Brave Search API gives AI agents a fast, independent search index for research, RAG pipelines, images, places, and fewer hallucinations. Get $5 in free credits at https://brave.com/search/api/?mtm_campaign=q2-26-cognitive-revolution
[24:12] Sequence: Sequence handles the full revenue workflow for complex pricing, from quoting and metering to invoicing, revenue recognition, and collections. Book a public demo at https://sequencehq.com and use code COGNISM in the source field to save 20% off year one
Main Episode
[26:39] Nathan Labenz: Yeah, I think that is pretty hard to argue with, unless keeping the raw is a fundamental barrier to closing deals. But for me it's not, and I do feel glad that I have the last three years' worth of calls. Certainly relative to where transcription was three years ago, even in that relatively narrow domain, we can get a lot better transcription now than we were getting natively with Fireflies three years ago. So the ability to go get all those raw audio files has been quite helpful.
[27:15] Daniel Miessler: Well, I was talking about the overall everything, right? So all your Gmail. Basically, and I haven't done this yet, maybe we should be maintaining a raw storage. Whenever you do one of those Gmail parses, you put everything into a repo there, so you wouldn't have to do that whole process again, you know what I mean? At any given time, you could just say: here's the raw, start over. And you would never lose any signal that way. But think of everything else that applies to: every video you've made and the transcripts from those, and, like you said, all the emails, all the calls, all of Slack for a company, stuff like that.
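One way to picture the "raw storage" Daniel is proposing is a write-once archive: every export is stored byte-for-byte under its content hash, and a manifest records where it came from and when. The sketch below is an assumption-laden illustration (directory layout, `.bin` suffix, `manifest.jsonl` name, and the `archive_raw` function are all hypothetical), not something either speaker describes having built.

```python
import hashlib
import json
from pathlib import Path


def archive_raw(root: Path, source: str, payload: bytes, fetched_at: str) -> Path:
    """Store a raw export under its SHA-256 digest; never overwrite existing data."""
    digest = hashlib.sha256(payload).hexdigest()
    target = root / source / f"{digest}.bin"
    if target.exists():
        # Identical bytes were archived before; keep the original copy untouched.
        return target
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_bytes(payload)
    # Append-only manifest: one JSON line per newly stored export.
    with (root / "manifest.jsonl").open("a", encoding="utf-8") as manifest:
        record = {"source": source, "sha256": digest, "fetched_at": fetched_at}
        manifest.write(json.dumps(record) + "\n")
    return target
```

Because summaries, scores, and wikis are all derived data, they can be deleted and regenerated from an archive like this whenever a better model or a better prompt comes along.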
[28:02] Nathan Labenz: Yeah. It wasn't super high stakes for me, but that's basically what I have now. I'm not necessarily keeping all the raw audio on my computer, but I do keep the raw material that I would give to a new model, or new whatever, to say: work from here and work up. I could go back and retranscribe, but as of today I think the transcripts are already pretty good and pretty trustworthy. So it now lives in essentially a 1 GB database, which does go to show how much I've extracted.
[28:41] Daniel Miessler: Quite. Is that SQLite? Is that local? [Nathan: Yep, SQLite.] OK, SQLite.
[28:46] Nathan Labenz: It's on my computer, and then I am pushing it to GitHub, which I'm starting to rethink recently based on some of the security issues that they've been having. But yeah, as of now, it's all in the SQLite database. For people who haven't done this kind of thing before, it's an amazing experience to just sit there and say: can you code your own tool to go get access to whatever? More often than not, even in January, it had no trouble coding up the tools. Some real eureka moments for me were when it would talk me through how to go to Google Cloud and create my own app, what the settings needed to be, and how to add my own account as a tester so that I didn't even have to put it through Google review. I could just immediately have access to all my stuff, which is all I really wanted anyway. It gave me click-by-click directions on how to go through those interfaces. If this thing can take me through Google Cloud setup, I think that might be AGI qualification. Slack is the same way: the difficulty of finding what you need in Slack. And there's one caveat there, which is that I think the level of access I gave myself was somewhat contingent on my being an admin of the organization. So not everybody in every Slack they're in is going to be able to do the kind of full extract that I did. They might have to have some conversations about getting that sort of access opened up. Again, the number of permissions in there is absolutely insane, but its ability to talk me through it all was incredible. The only time I had an "Oh my God, why did you do that" moment was with Slack.
And the problem was, you know, I was very much iterating through this, right, because I didn't want the whole company Slack over, you know, however many years. I kind of wanted everything that I had been involved with. I think a company would maybe want to structure it differently. But for me, I was like, I want my DMs. I want, like, threads that I contributed to. We also have, like, logging stuff in Slack; that's a whole other thing. You definitely stumble on these things as you go, where you're like, oh, there's, like, two log channels that are in fact, like, 80% of all Slack data. You know, I don't want that, right? So now once you're in the kind of special casing, you're special casing more and more. Even in Gmail, some interesting findings were, like, a naive sort order might be: put the longest things that I wrote at the top. But interestingly, when I did that, what I found was that often an AI output that I had sent to a friend was actually the longest thing. So now I'm like, oh, wait a second. So I actually had to kind of caveat some prompts around classifying, you know, especially for the writing sample thing. But even just in general, one of the things that Flash does is try to indicate: is this actually Nathan's original writing, or is it AI? And, you know, when I do send those things to friends, I'm not trying to hide that it's AI. So typically I'm saying to them, like, here's what Claude told me about this or whatever. But I still need to make sure that that's not kind of leaking into my writing sample and, you know, teaching whatever to sound like Claude as it attempts to sound like me.
Anyway, the one thing that I had where it actually did a data delete that was kind of painful was predicated on the fact that Slack's rate limits, if you're just kind of an indie hacker, are insanely low. Gmail, you know, you fly through, and it took, I don't know, a couple hours for the script to run. But, you know, they can support and they do support not insignificant use of the API. Slack is really limited if you're, you know, just creating your own personal token or whatever. So it took days, maybe even a couple weeks, to kind of go through everything, especially with a couple different, you know, strategies and rounds of iteration. And there was one time where it didn't realize that. And, you know, the lack of a proper sense of time that models have was very apparent in this moment, because it wouldn't have been that hard conceptually to just rerun all the API calls and get all the data back. But it didn't realize that we had spent, like, a week, you know, constantly hitting rate limits to get the data that we did have. And it was just like, I'll just drop that part of the database and refetch it. And I was like, no! Oh, it's going to be a week. That's the only time that that has happened. But it was definitely notable that, you know, you see these things online and, you know, one kind of.
[33:29] Daniel Miessler: They can happen super.
[33:30] Nathan Labenz: Painful version, but, you know, one version of it happened to me as I was going through there.
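One way to guard against the kind of accidental delete described above is to copy a table before dropping it, so that data that took days of rate-limited API calls to collect can be restored locally. The following is a minimal sketch, assuming a local SQLite file and Python's standard `sqlite3` module; the function name `snapshot_then_drop` and the table name in the usage note are hypothetical and do not come from Nathan's setup.

```python
import sqlite3
import time


def snapshot_then_drop(conn: sqlite3.Connection, table: str) -> str:
    """Copy `table` into a timestamped backup table, then drop the original.

    Hypothetical helper: returns the backup table name so the data can be
    restored later instead of being refetched from a rate-limited API.
    """
    # Table names cannot be bound as SQL parameters, so validate the name first.
    if not table.isidentifier():
        raise ValueError(f"unsafe table name: {table!r}")

    backup = f"{table}_backup_{int(time.time())}"
    conn.execute(f'CREATE TABLE "{backup}" AS SELECT * FROM "{table}"')

    # Refuse to drop the original unless the copy has the same number of rows.
    copied = conn.execute(f'SELECT COUNT(*) FROM "{backup}"').fetchone()[0]
    original = conn.execute(f'SELECT COUNT(*) FROM "{table}"').fetchone()[0]
    if copied != original:
        raise RuntimeError("backup row count mismatch; refusing to drop")

    conn.execute(f'DROP TABLE "{table}"')
    conn.commit()
    return backup
```

An agent instruction such as "never drop a table directly, always call `snapshot_then_drop`" would turn a week of refetching into a single rename. Note that `CREATE TABLE ... AS SELECT` copies rows but not indexes or constraints, so this is a safety net for the data rather than a full schema backup.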
[33:35] Daniel Miessler: I wonder if there's a full export option where it just gives you a zip. Yeah, it's a good question, but that's another option.
[33:44] Nathan Labenz: Yeah, you might also need to be, again, an admin to do that sort of thing. I should look into that though. So let's see. Zooming out, you know, the first thing is it's really good at just answering questions. I did also develop a little audit skill, and how exactly this works is honestly kind of a black box to me, in the sense that it's the AI checking its own work, you know. So I was kind of like, go in and try to find things that might be wrong or that are sort of ambiguous, where you're, like, making guesses. Because I did notice some things that were wrong in the summaries. Generally speaking, I have to say the summaries were overwhelmingly very good, and there were just some points where the models made somewhat wrong guesses or jumped to a conclusion that wasn't quite true. Or one thing I noticed was that they were very... I don't know about you, but I'm the sort of person who floats plans and doesn't always follow through or see those plans to conclusion.
Sponsor
[34:37] Claude: Claude by Anthropic is an AI collaborator that understands your workflow and helps you tackle research, writing, coding, and organization with deep context. Get started with Claude and explore Claude Pro at https://claude.ai/tcr
Main Episode
[36:43] Daniel Miessler: No, I'm one for one, of course. Yeah.
[36:48] Nathan Labenz: For me, I noticed that the models were very inclined to keep certain things as, like, open threads for a long time, 'cause that was one of the things that they were supposed to do. You know, the monthly summary is, like, identify open threads. But then once something was an open thread, by default it would be an open thread forever. Or things that I sort of said I might do or be interested in doing, it sometimes thought were definitely going to happen or did happen. There was one case, it was quite funny, where I had kind of come up with an idea for a company. This is back in, like, late 2022, and I floated it to a couple people, including a couple investor friends. And I was like, if I actually start this, you know, would you invest in it? And I got a couple yeses, you know, like, if you do start it, we'll be a part of it. But then for Claude, like, three years later, it was like, this person invested in Nathan's company at this time. Like, wait a second. You would have seen a lot more material related to that company. This is a mistake a human would not have made, you know. And maybe it's a reflection of the structure, where I was like, you're going through month by month. But yeah, definitely when things sort of pop up and then fall off, a human would very intuitively get that, like, well, I haven't heard about that for a long time, so clearly it's not happening in the way it was discussed in late 2022. Claude did have a few random issues with that. But the audit skill, you know, also did seem to help. And exactly why it works is a little weird, but it was able to kind of surface a bunch of questions for me. And, you know, I would answer those questions and try to steer it in the right direction.
And I think at the end of the day, its understanding and command of information would be extremely hard to get from a human. You know, if you kind of benchmark it against, like, a new human assistant, I think it would be extremely hard to ramp a person up to the depth and breadth of knowledge that the system is now, you know, regularly able to demonstrate.
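The "open thread forever" failure Nathan describes can be partly handled outside the model with a simple staleness rule: if a thread has not been mentioned for some number of months, stop treating it as open. What follows is a minimal sketch, assuming each thread records the last month in which it appeared in a summary; the `Thread` class, the `classify` function, and the six-month threshold are all hypothetical illustrations, not part of the system discussed in the episode.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Thread:
    title: str
    last_mentioned: date  # most recent month in which the thread came up


def months_between(earlier: date, later: date) -> int:
    """Whole calendar months from `earlier` to `later` (days are ignored)."""
    return (later.year - earlier.year) * 12 + (later.month - earlier.month)


def classify(thread: Thread, today: date, stale_after_months: int = 6) -> str:
    """Return "stale" if the thread has gone quiet for too long, else "open".

    The threshold is an assumed default; a plan floated once and never
    mentioned again should age out rather than be reported as fact.
    """
    if months_between(thread.last_mentioned, today) >= stale_after_months:
        return "stale"
    return "open"
```

Under this rule, a company idea last mentioned in late 2022 would be labeled stale by mid-2023, which gives a later summarization pass a reason to ask about it rather than assume it happened.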
[38:57] Daniel Miessler: You're saying the depth and breadth of knowledge of just all the different guests that we've had, all the interactions, the sponsor space, like all of that corpus of knowledge, right?
[39:10] Nathan Labenz: Yeah, and beyond, you know. I mean, the podcast is, like, one dimension, but it's also got my, you know, previous company entrepreneurial history, my, like, personal life, you know, chats with old college roommates. I mean, it's really kind of a 360 view of me with really no segmentation or separation, which I think is quite interesting. I'm not sure that that's one thing... I think that's best.
[39:38] Daniel Miessler: Actually.
[39:39] Nathan Labenz: I think it performs best, yeah. I mean, it's going to be interesting to see how... and for me it's natural, because I've kind of been an entrepreneur for a lot of my career, and the line between what's personal and professional is, you know, at times blurry. I'm, like, you know, long-time friends with a lot of people I've worked with. But I don't know if that's, again, kind of like the Granola thing. Like, is that something everybody's going to want, or will people kind of want, you know, a LinkedIn and a Facebook, kind of two different faces? I could certainly see some people preferring that.
[40:11] Daniel Miessler: I think we both will and will want to more closely merge our lives. I think we should. I mean, this is just a preference and a bias, but I think we should want a more integrated life because ideally the work that we should be doing is something that just inspires us and makes us happy, which is the same thing we're supposed to be getting from life. So I think we'll find that this work-life barrier was kind of like an artefact of, like, old times. So I feel like that's the direction we should head into. But the AI isn't quite there yet, so if you go too crazy with that it might cross the streams in a way that hurts you reputation-wise, right?
[40:56] Nathan Labenz: Yeah. And that actually suggests some of the additional layers that I've built on over time since then as well. Because one big thing that I have felt limited by with everything I described so far is I'm not comfortable taking myself out of the loop. And I will say, I almost got to this earlier and somehow we went a different direction. But how has it changed my behavior? One is on calls: I'll sometimes now find myself asking a question with the idea that, like, I want to get the answer on this in the transcript so I can make sure that I can get back to it later. And, you know, would I have not asked those questions previously? I'm not sure. But I do have that sense in my mind that, like, if this person says it now, it'll become something that is durably accessible to me. And that's a pretty.
[41:45] Daniel Miessler: Cool.
[41:46] Nathan Labenz: Feeling that I do enjoy taking advantage of on calls. The other big thing is, very much like you said with the client going away, way less time in Gmail and, you know, probably a lot of different web apps. But Gmail, you know, and I've also used Shortwave quite a bit, but I'm now much more on the command line. You know, even for something like, scan my inbox, see what's going on. Or even, like, I got one sponsorship inbound. You know, that used to be something where I would have to assemble information, and I'd have, like, a template and then attach something. But now that's a skill. So it's just like, you know, run the sponsorship sale skill in response to this inbound, go. And it's amazing how little time you can get away with spending in lots of different web or desktop clients. But I'm still not that comfortable taking myself out of the loop, right? The rule that I have for the Claude Code on my main personal computer, the one that I use, is, like, draft but don't send. You know, give me a link to the Gmail draft. I'm still going to open it up, read it at a minimum, usually edit it. I sometimes feel like I'm probably too precious. You know, this is something that I wrestle with a lot. Like, I'm sort of like, was that e-mail by Claude bad? No, it wasn't bad. Why did I feel the need to edit it? You know, am I making it not necessarily better, but more me-flavored in a way that matters? Possibly. I'm very interested: what do you think about that? I mean, I have another layer of solution, or at least attempted solution, that I'll describe. But how do you relate to when AI tries to write as you?
[43:39] Daniel Miessler: I'm pretty much at the same place. I don't let it write as me. I have a very specific thing in the writing skill. Kai, my main DA, has all his own writing rules. So if he's going to write as himself, I want him to have his own voice. I actually let him explore his own personality and come up with his own back history. So he's got, like, a full personality and back history, and then maps that to how he writes and why he writes that way. So there's, like, a very distinct voice that he has, and he also knows that he's not allowed to write as me. So if he's using my e-mail address, then he has to say, "hey, this is Kai, Daniel's DA," and then give the information as Kai, right? Especially if it's coming from my e-mail address. Oftentimes it's coming from his, so that's more obvious. But I really never let it write as me, for a couple of different reasons. One is reputation damage. If something gets out and it's obviously AI, that would just be nasty. But the more important one is, like, the quality isn't there. So that's number two. But the last one is, like, if it's doing the writing for you, it's doing the thinking for you as well. And I don't want to outsource that. Not operational stuff, but anything of any quality or, like, actual weight, I just insist on doing myself, 'cause I consider writing to be thinking.
[45:05] Nathan Labenz: A funny anecdote that happened recently. So I live in Detroit, and I got an e-mail from a person, although I've invited them to come do a podcast and talk about the evolving social norms around this stuff. But I got an e-mail from this person who's, like, reasonably well known in Silicon Valley. You know, not, like, one of the first 20 people from Silicon Valley that people would name, but definitely a name that people will recognize. And the e-mail came in the afternoon of a day that the Pistons were playing in the playoffs. And it basically said, you know, good luck with the Pistons game tonight. I don't know if you care about that, but if so, good luck, whatever. And I was like, I don't really know this person that well. A few interactions over time. It was good enough to catch my attention. So I responded back and said, hey, you know, thank you. But my question for you is, do you care about the Pistons, or is this just a flex of your personal AI CRM? And if so, it is, like, a pretty effective one. So the response came in two seconds: "AI, baby." And I was like, that's really interesting. I also noticed that in the subject line there was a misspelling. It was "good luck" spelled "Luk." And I was like, huh. So this person.
[46:22] Daniel Miessler: Has the prompt.
[46:23] Nathan Labenz: I can't imagine.
[46:24] Daniel Miessler: Yeah, yeah, it has to be, Has to be.
[46:26] Nathan Labenz: So, yeah, so I asked, like, how do you think about this? Like, you know, with all the things that I just described in terms of all this history that's queryable and all these articles and whatever on top of it, I feel like I have the infrastructure set up where I could unleash something like that on my personal network. And yet I've been very reluctant to, especially when it comes to the level of, like, prompting for spelling mistakes, you know, to really make it seem authentic. I'm like, I don't know, that seems a little much to me.
[46:59] Daniel Miessler: Yeah. So I think about this a lot, like what matters and why, and try to go, like, four layers deep there. So I think a great example is actually very similar to this basketball one. If you reach out to somebody and you say, hey, I saw a flower at the mall and it made me think of you. And if you wrote them a handwritten note, you know, 40 years ago, or if you texted them that 10 years ago, that has extreme value, because you assume that you don't talk to this person a lot, but you saw a flower and you thought of your friend. That text has extraordinary human-to-human value. It comes down to the effort. The effort is what is appreciated on the other side. If you remove the effort, the value just went away. So I feel like there's only so much we can get from a... like, one of my ratings inside of my PAI system is the state of my personal relationships. How am I doing? Well, I don't want the rating to go up if I have one of my cron jobs go, "I've pinged all my people." That shouldn't give me a good score, right? Because I haven't done the work. So I think.
[48:21] Nathan Labenz: That whole effort.
[48:22] Daniel Miessler: Mix there is, like, really important to, you know, what people actually care about. And for human-to-human interaction, I think it does come down to the effort on the other side.
[48:35] Nathan Labenz: It makes me think of gift giving too, which is another skill that I'm currently building. And I would say I've never been a great gift giver. With AI help, I think I've done quite a bit better recently, although certainly, you know, still not elite. But yeah, I think there's still a huge difference between, like, I went and opened up Claude and had a conversation about you and we landed on this idea and now you get this gift. The old adage, of course: it's the thought that counts. I do feel like I'm on the right side of that when I use AI assistance. But somehow if it became, like, totally automated, you know, let's say I was able to get to the point, which, to be clear, I have not achieved, but if I was able to get to the point where I just, like, load up all my birthdays into a calendar and the cron job does, you know, fully autonomous gift giving. And objectively my friends and family score the gifts I've given higher than the ones I would have counterfactually given on my own. You know, is that good? Have we advanced as a society? I'm not sure.
[49:42] Daniel Miessler: I just thought of, like, this must have been a movie or something, highly traditional and kind of sexist. But I'm imagining, like, this guy is travelling all over the world, like this rich husband or whatever. The wife is at home feeling neglected, and a box comes in the mail, and she opens it up and it's her favorite flowers and favorite chocolate, and she lights up. And then she looks at the note and realizes, damn it, recognizes all the signals that it was his admin, his secretary, that actually was thoughtful enough to do this. The flowers are no longer good. He didn't think about it; someone else did and basically faked the action. And I think, you know, this whole thing is about signalling and authenticity. I've read a whole bunch of books about this. Will Storr is one of my favorite authors on this. It's like, what is the signal? What is the authenticity? And there's always the faking cat-and-mouse game of this. And I just think that's going to be really, really important with this whole AI thing, especially as it relates to authenticity.
[50:56] Nathan Labenz: So how do you... I mean, it's interesting that this is not something I've done. It's interesting that you have AI going through and scoring the health of relationships. I'd be interested just to hear a little bit more about that and how it's working for you. And, like, what does it prompt you to do, and how do you make sure that, you know, you avoid unwanted AI feedback, like inflating your score?
[51:23] Daniel Miessler: Yeah. So it's seeing how I'm interacting with it. I don't have it watching everything yet. I'm still working on ingestion for a few different systems. But to the extent that I'm talking to Kai about relationships, that's kind of what he'll see at this point, which right now is mostly building apps and doing a bunch of other life stuff. But the idea is, I mean, my ultimate idea for all of AI actually, but especially for my personal AI, is that there's only kind of one end state for AI for me, which is the navigation of current state to ideal state. It's just that simple. So in my Telos, which is my primary document that runs everything, and it's also what I do on the business side for consulting, what I've made basically a first-class citizen is current state and ideal state. So in my current state, I have the fact that I maintain my relationships with my friends, and I kind of have, like, rings there for, like, you know, family. I need to be reaching out to them this many times as ideal. So when those numbers aren't where they need to be, that's showing in a score. It's actually in my status line that these are decaying, right? And the same thing for physical health, the same thing for a bunch of different things. But the overall arching idea there is that its job is simply to get me to ideal state and to understand and capture properly the current state. And then the whole entire game is the transition, is the navigation, which right now is a bunch of cron jobs and a bunch of other stuff. But my view of this is, like, especially for enterprise AI, but also for all the personal stuff, that is the game. The game is very simple. All these agents, all these harnesses, it all, like, fades into the background. I believe we're going to end up with one DA.
Most people will end up with one DA. And your DA, of course, you can have others participate as well, but I think you'll have one primary digital assistant, and it's the one managing all that other stuff. And it's the one who's the expert on what's currently happening and what your ideal state is. Obviously you're the one providing ideal state. And then, yeah, while you're sleeping, while we're in this conversation, it's over there working, it's researching, it's doing all the work. And yeah, I just have as one of those facets relationships, family and friends.
[54:06] Nathan Labenz: How do you articulate what the ideal state is? I feel like I'm not that confident, or, I mean, I guess it can just be an iterative process unto itself, but, like, I'm intimidated by the notion of trying to articulate an ideal state.
[54:22] Daniel Miessler: It's wonderful. It's a wonderful process. I think the whole Telos process is just fantastic. I've always had this current-ideal state thing, but I've now promoted it to be the primary inside of Telos. It's like, OK, what does an ideal day look like? What is an ideal...? And it really challenges you. It inspires you to be very honest. And then the moment you start wondering if you should write it down, yeah, now we're making progress, right? And, you know, for me, what it does is it starts to reveal ego. It starts to reveal, OK, what do I really want from life? What's the story I'm telling myself about what I want versus what I actually want? And I love the distinction between those. So it's like, what does your day look like when you wake up in the morning? What do you see? And if you actually start filling these things in, combined with the other context that your DA has, it's going to be able to do a pretty deep psychological profile on you, right? And really understand what you're about. And if you've added in also your challenges and, like, all your other stuff that you're working on, it's just going to give it a whole lot of insight. So the way I do it is, like, what does a day look like? What does a month look like? What does a year look like? Some classical ones, like, where do you want to be in 10 years or whatever. But those are just, like, different angles of approaching it. But yeah, ultimately, I would like to be doing exactly what I'm doing. I would like to have so much money that, like, I get to spend time flying around giving money away. I've seen a few people. It was one of the first episodes that I loved from My First Million. Is that the podcast? And it was a guy who basically had a private driver. Wherever he or his wife went, they always had a private driver following them around. He actually has private planes.
And most of his work is on doing his projects and then flying around to parties where he gives away money to the people that he's done all this work to find, the best people to give money to. And I'm like, are you kidding me? You get to just give money away where you've maximized the amount of impact that's going to come from giving away that money. But it's also associated with ego stuff and materialistic stuff, because you get to fly around, you get to meet people, you get to see different places. So it's, like, a mixture of intrinsic but also fun. And so those are the types of things I put in the system, not so much like I want this amount of money, but I want this type of lifestyle, which obviously requires this amount of wealth or whatever. And the more of that stuff you put in there, I'm telling you, it requires a lot of introspection to be able to even write it down and even think about what you're trying to do. But I feel like the more I understand that, the more I can reverse engineer to the systems that I built.
[57:40] Nathan Labenz: OK. That's definitely going to be interesting to do, and that's going to be one that I can't entirely delegate to the agents, obviously.
[57:45] Daniel Miessler: Oh, no, it's very personal. Yeah.
[57:48] Nathan Labenz: I would say my I.
[57:49] Daniel Miessler: Sorry, I do have a slash interview. Slash interview is a prompt into the Telos and into the ideal state. So I actually do that a lot. A whole lot of my system is: when you have questions, go into interview mode.
[58:08] Nathan Labenz: How often do you do that?
[58:11] Daniel Miessler: It prompts me quite a bit, both on like building and coding. But yeah, anytime it's confused, it's not sure about a goal. Oh, we might want to talk about the algorithm at some point. I don't know. I can't remember if we talked about it last time, but with the algorithm where you're chasing an ideal state inside of even building an application, just a pure coding thing, if there's ambiguity, you're not chasing the proper ideal state and you can't turn it into discrete, testable criteria. So yeah, the interview is just essential for that.
[58:45] Nathan Labenz: So put a pin in that one. I'll come back to you. I can tell you how it's changed my life, maybe next time we talk. I'd say, in contrast to that, and I wouldn't say it hasn't worked, it's still been effective for me, but, you know, I'm sensing something left on the table there for sure. What I've typically done over the last few months is really just sit down and be like, in my gut, what can I advance today, or, like, what's kind of slowing me down? You know, what would I love to be able to just, you know, do from my phone? I guess one ideal state that I've articulated a few times is less time on my computer, more exercise, more time outside. That would be a clear win. So.
[59:28] Daniel Miessler: There you go. I mean, that's great.
[59:30] Nathan Labenz: Trying to get it to the point where... but unfortunately, I still wouldn't say I've really got that win yet. I might be hitting a tipping point, but it's too soon to tell. But one of the things I've been wanting to do is get to the point where I can just have the phone on me and feel as empowered from the phone, away from the computer, as I feel at the computer. And again, I might be hitting a tipping point, but I haven't really achieved that yet. One little detour I'll take, and then I want to come back to your notion of, like, everybody's going to have one DA, is interfaces. I did another one of these sort of show-and-tell type episodes with my friend Steve Newman, who is, you know, a 40-year professional software developer, founded the company that was sold to Google. He's had, you know, several exits over time, very accomplished guy. Now he's, you know, vibe coding up his own personal, of course, like the rest of us. The big thing I took from him was: create interfaces. I was kind of, for some reason, sleeping on that and doing everything through the command line pretty much. And you know, when you have the ability to ask these questions and it has such deep context, you often can get pretty good answers. You know, this might happen in this conversation, right? If I have 10 sort of memory-hog threads open in the terminal, and as I'm, you know, building up big video files here locally that are waiting to get uploaded to the cloud, if I run out of disk, I might just go kill those. And then I'm like, what did I have open and what were they? So one way to do that is to ask Claude, hey, go back and summarize the last 10 threads for me and, like, tell me, you know, what was finished and what wasn't finished, so I can kind of pick up what I need to pick up and hopefully not lose things.
But I was finding that I was kind of hitting a point of, like, some cognitive overload, where I was a little bit like, what exactly have I built? And, you know, which threads did I finish or not finish? And where do I have, like, uncommitted changes floating around? And Steve's feedback was definitely a phase change for me, where I was like, oh, I should be building a lot more of my own custom little interfaces. I now have one that helps me manage the... I used to have a skill, again, all through command line, to help produce a podcast episode. Now there's a UI that, you know, just makes it a lot easier to see the images that get created for the art and to watch the clips that get made and, you know, so on and so forth. What would you say? So that's interfaces.
[1:02:12] Daniel Miessler: So that's what he means by interface, because I was thinking the interface to talking to your DA, which I think is the big advancement of, like, OpenClaw and Hermes. Largely, like 80%, it came down to: I could just talk to this thing anywhere over Telegram or WhatsApp or whatever. But you're talking about also just being able to see and visualize and stuff like that as well.
[1:02:39] Nathan Labenz: Yes, although I do want to come back to talking to them too.
[1:02:43] Daniel Miessler: So interesting. So I have an open tab right now. I'm building an app called Surge, an internal app for me, which actually I could share with you. So basically all sponsor management. So we have not only the podcast, but we have YouTube videos with interviews. We also have, what else, newsletters. We've got four or five different, like, sub-products for sponsors. But as you know, there's the interaction point, there's the agreeing on copy and stuff like that. Then there's, like, the OK, it's been approved. Then it's like, OK, here's what it looked like when it launched. Then there's the metrics and all.
[1:03:25] Nathan Labenz: Of that.
[1:03:26] Daniel Miessler: Can be done via emails and it can be automated via emails, which I'm not doing yet because of the authenticity thing. But the other option is to just build them a web interface. So now I've basically built a web interface where they authenticate as them, as their company, they come in, they could see the entire process plus all the different campaigns. So I think this definitely qualifies as what you're talking about with interface of like what was being done like raw back and forth using previous tech and can you unify that experience in some sort of way? So yeah, I'm in the middle of doing one of those for basically sponsors for the business side.
[1:04:08] Nathan Labenz: Nice. Do you have one that is sort of your... like, when you said, you know, when relationships are not maintained, it hits your status line. What is that status line? Is that another thing that you look at in a browser or otherwise?
[1:04:23] Daniel Miessler: No, it's just right here. It says "fresh:" and then I've got, like, five different things. So Telos, projects, yeah, personal, like all those different ones, the different aspects of my life and how high quality or fresh they are. So if the quality is low, it shows that, basically the lowest of freshness or quality. So if I haven't updated them recently, it also starts to decay, and it's just a way of making sure I keep my context updated.
[1:04:59] Nathan Labenz: Gotcha. And that, that literally just shows like when?
[1:05:04] Daniel Miessler: Yeah, at the bottom of my terminal in every session is the PAI status line.
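Editor's sketch: the freshness idea Daniel describes (each area of life has a score that decays when it is not updated, and the status line surfaces the lowest one) could look roughly like the following. This is not Daniel's implementation. The half-life decay, the 30-day default, and the names `freshness` and `status_line` are all assumptions made for illustration.

```python
from datetime import date


def freshness(last_updated: date, today: date, half_life_days: float = 30.0) -> float:
    """Score in (0, 1] that halves every `half_life_days` (an assumed decay rule)."""
    age_days = max((today - last_updated).days, 0)
    return 0.5 ** (age_days / half_life_days)


def status_line(areas: dict[str, date], today: date) -> str:
    """Surface the stalest area, since the lowest score is what gets shown."""
    name, updated = min(areas.items(), key=lambda kv: freshness(kv[1], today))
    return f"fresh: {name} {freshness(updated, today):.0%}"


# Hypothetical areas and dates, purely for illustration.
areas = {"telos": date(2025, 1, 1), "projects": date(2025, 1, 31)}
print(status_line(areas, today=date(2025, 1, 31)))
```

With these made-up dates, the area last touched 30 days ago is the one reported, because it has decayed to half while the other is fully fresh.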
[1:05:13] Nathan Labenz: Interesting. A very small little hack that does something kind of similar, and that really helped me a lot, was creating a hook to rename sessions automatically with a summary of what had happened. So when I'm flipping through tabs and I'm like, what was this, you know, and it's obviously a wall of text so often, and I'm trying to orient myself to remember what this particular thing was, now there's just a one-line summary at the bottom. It's like, this is what we intended to do, here's how far we've made it, here's what's pending. There you go. It's such an amazing little refresh. There's so many of those little things. I find that as much as, you know, the good folks at Claude Code and OpenClaw are shipping relentlessly, there's still so many little enhancements like that that just make life so much better with these tools. So, OK, interfaces. That was that. So the one DA question. So I have had, it sounds like we're basically on the same page around, like, you definitely don't want your AI to be pretending to be you. That feels risky and wrong at the same time. But then I have the question of, OK, well, I do want to start to delegate somewhat larger projects to the AI. And I do want to push just how far can they go autonomously? You know, could I, for example, have an AI, and I'm going to run this experiment coming up soon, you know, could I have an AI serve as sort of a booking producer for the podcast and, you know, do that without me having to approve certainly every outbound email? You know, maybe I want to approve some strategies, whatever. But like, how high up that abstraction stack can I go and get good work from the AI? So probably a lot of different ways to think about how to handle this. What I ended up doing is, first of all, for a couple of reasons, buying a separate computer. So now I've got my laptop and the Mac mini sitting, you know, right next to it on the desk.
The Mac mini is going to be there, the idea is, and I even have a battery backup so that if power goes out, you know, the thing can sort of stay online. Hopefully. I also probably should put a battery backup on my Starlink, but that's another layer. But at least, you know, the processes, whatever, don't get immediately interrupted if there's like a, you know, small power outage. You learn these things the hard way. I had a power outage. I was away and I couldn't connect to the laptop at home and I'm like, why did this happen? Sure enough, I go home, all the clocks are flashing and I go, well, if there's one thing I'm going to put a battery on, it's going to be my Mac mini.
[1:08:06] Nathan Labenz: So another reason for that too was just, like, when I was doing manual, not manual, but, you know, with the skills that handle the podcast clip production, and I've also been experimenting with something around having Claude make music videos for the Suno songs that I'm now appending to the end of every episode, which is probably the most greenfield, just let Claude be Claude and have fun kind of thing that I do. But it is fairly intensive on the computer to be processing video and rendering things frame by frame. And you know, there's all these like green motion skills, whatever, you can do all these animation-type things, but video files are still heavy and rendering things is still kind of a heavy process. So I was noticing that it was slowing down my computer. So I was like, at a minimum, I need another computer to just do that stuff so it's not blocking me. But also, you know, can this sort of be the agents' computer, you know, where they can really go to town, be themselves, you know, do their thing, take on these somewhat larger projects. So the mental model I have now is my own laptop is obviously for me to work on, and the Claude that I have there. And I'm also starting to experiment with Codex a little bit. I'd be interested in if you've also, you know, started to diversify at all away from Claude. Claude is definitely still the main driver. But you know, I hear great things about Codex too. So, you know, I don't want to be missing out on anything. But those on the laptop I think of as high access, because they do have, you know, via the browser if nothing else, like, I'm logged into everything, but low autonomy. 
So their instructions are basically do exactly what I told you to do, you know, take a little bit of liberty in terms of searching for information online, certainly searching through my own, you know, personal context, but don't send emails as me, definitely don't impersonate me, and don't go beyond what I explicitly told you to do. High access, low autonomy on the laptop. On the Mac mini, the idea is it's going to be the reverse: relatively lower access, but high autonomy. And this opens up a ton of questions in terms of how one should think about structuring that. And I can tell you, this is where I really want you to roast me, especially in terms of security. But you know, kind of everything that comes to mind, I'll tell you kind of where I'm at. There's a lot of details to it. And, to be clear, the AIs have been instrumental in helping me identify the tools and set them up and all these kinds of things. Access is a real pain in the ****. So I guess for starters, how do I link up my two Mac computers and my phone? What I've landed on right now is Tailscale for private VPN creation.
[1:10:59] Daniel Miessler: Which?
[1:11:01] Nathan Labenz: Is, I think, amazing, although I definitely want to understand, and maybe you can help me understand better, what I'm exposing myself to by doing that. But this makes it really easy, at least, you know, as long as the security problem isn't so bad, for me to connect these computers. And then there are multiple ways in which they can be connected. I use the Screens app on my phone to log into the UI on either the laptop, if it's still sitting at my desk and I'm away, or the Mac mini, which is always there. And then I also have Termius, which is a mobile terminal client that SSHes in, again, all through the Tailscale private network, and can then, you know, go in and do shell stuff on either computer. And this has been pretty good. It took me a while to get to it, but it's now pretty good, you know, where I can have pretty seamless connectivity from a phone to both computers, regardless of where I am. And that's key, because I just keep finding that something always needs a little reboot. You know, it'll work for days, but then, you know, for some reason the Telegram channel just isn't connecting anymore. You know, why isn't it connecting? Restart OpenClaw, restart Claude Code, you know, and then it'll start working again. But if you can't reach that computer in a way that gives you the level of access to be like, I'm going to just run this command, then, you know, you take a trip and you're three days into your trip and the thing doesn't work anymore. And you're like, I went to all this trouble and I couldn't, you know, and now I'm still somehow locked out. So I think that networking setup has been really good for me. But I'm very mindful. I used to never really care about security. I always felt like security by obscurity was enough for me. Now I'm like, well, not in the AI, you know, era of possible mass surveillance. 
It's not right. And obviously Mythos and everything else. So maybe first concrete question: how would you feel about that setup from a security standpoint?
[1:13:24] Daniel Miessler: Yeah, I don't think it's bad. I don't think it's bad. I do worry a little bit about small apps that you use, to the extent that anyone uses them, that are just kind of useful and you're just like, well, it couldn't be that big of a deal. But those smaller companies, the smaller the company, the less the chance that they have a security person, you know, like if they have passwords or whatever, they just might be part of a breach. At any point they could easily be part of a breach, especially with agents running around basically hacking and doing bounties all the time. So I worry about, like, how many companies have I given access to? So that's my first sort of heuristic: give anything sensitive to the fewest number of companies. I think Tailscale is a great solution. There's also one called Headscale, which is an open source version which you can kind of run just on Cloudflare or whatever. So I'm using both of those. I like to not have anything facing the Internet because of kind of the Mythos effect. But I mean, my background is actually doing attack surface attacking and monitoring. Because IPsec and PPTP, the VPNs, all listen on the Internet, which means a worm or some sort of vulnerability can hit them, whereas Tailscale is outbound. So you don't have anything open, technically. The downside is if Tailscale gets compromised, they're just going to walk around on everyone's internal network, right? And unfortunately it's pretty easy to have a prompt that says, what are the highest leverage compromise points, and let's steer our attack towards those types of things. The only piece of security that I think someone like us has there is that if Tailscale were to be compromised, they would be hitting other places before us, and hopefully we would know beforehand and we'd be able to turn everything off and block everything or whatever. 
But that is a major consideration is where is your choke point? How many people are you giving your stuff to? But in general I would say Tailscale is not a bad solution.
[1:15:45] Nathan Labenz: OK, sort of. I guess that boils down to, in some sense, I'm still security through obscurity. A couple of follow-up questions on that, and maybe a couple other tools as well. So the other tools I'm using to share access. One is 1Password, where I've got a family account, vaults that are specifically intended to be shared with agents, the command line installed on the Mac mini, and the agents can access passwords through the command line. As of now I think they have more advanced stuff, but it might only be at the moment for enterprise. What I was able to, you know, immediately sign up for, for a few bucks a month or whatever as a consumer, did not have a human in the loop to approve the sharing of a password with an agent at runtime. So instead the solution I came up with, in consultation, of course, with Claude, was to have two different vaults and just have the agents instructed that this vault, which is the agents' auto vault, you're just free to use that whenever you want. You know, so things like Brave Search, ceramic search, whatever, I mean, you know, actually quite a few different things.
[1:17:17] Daniel Miessler: API keys.
[1:17:19] Nathan Labenz: Well, actually I should say I'm using Infisical for API keys. So passwords, you know, would be a lot of shared accounts. You know, if you want to go remove a background, well, again, there's an API key for that too. Increasingly, everything's kind of going headless, but nevertheless, there are a bunch of passwords. They're just your green light. You can use them whenever you want. Then there's also the ask vault, where on a technical level it's the same access, but they're instructed to ask first, send a message, get an approval before they would grab those passwords and, you know, put them into a web browser. And then with access tokens, I'm currently using Infisical. I'm not even entirely sure why I have two versus one. Like, I probably could get away with just one, but I got recommendations for Infisical, so I've also got that spun up, and it's a pretty similar structure where on the main laptop, you know, all the keys are there. Some of them are in a vault which is shared, which then allows the agents to use it at runtime. I don't really understand the security claims that these companies make. I mean, you know, I'm like, am I getting more or less safe by putting all my passwords into the 1Password cloud app or the Infisical cloud app? And then you read things on their websites that are like, this has, you know, double super encryption protection. And I sort of feel, oh, great, you know, I don't really understand what that means, but it sounds good. How confident should I be in storing my stuff in those solutions, do you think?
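Editor's sketch: Nathan notes that the auto vault and the ask vault carry the same technical access, and that the ask-first rule lives only in the agents' instructions. One way to turn that instruction into an enforced gate is a thin wrapper around whatever actually reads the secret. Everything here is hypothetical: the vault names, `fetch_secret`, and the `read` and `approve` callables stand in for a real secrets CLI and a real approval channel, neither of which is modeled.

```python
from typing import Callable

AUTO_VAULT = "agents-auto"  # hypothetical name: green light, use whenever needed
ASK_VAULT = "agents-ask"    # hypothetical name: a human must approve each read


def fetch_secret(
    vault: str,
    item: str,
    read: Callable[[str, str], str],
    approve: Callable[[str], bool],
) -> str:
    """Return a secret, asking for approval first when the vault requires it."""
    if vault == AUTO_VAULT:
        return read(vault, item)
    if vault == ASK_VAULT:
        if approve(f"Agent requests {item!r} from {vault}"):
            return read(vault, item)
        raise PermissionError(f"request for {item!r} was denied")
    raise PermissionError(f"vault {vault!r} is not shared with agents")


# Stand-ins for a real secrets backend and a real push-notification prompt.
fake_store = {("agents-auto", "search-key"): "abc", ("agents-ask", "bank-login"): "xyz"}
read = lambda vault, item: fake_store[(vault, item)]

print(fetch_secret(AUTO_VAULT, "search-key", read, approve=lambda msg: False))
```

The design point is that the agent never holds a credential for the ask vault directly; it can only reach it through a path that a human has to unblock.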
[1:19:06] Daniel Miessler: Yeah, Yeah, it's a great question. There's a lot of, there's a lot of confusion around this. I would say it's the type of situation where if scrutiny is low, like if nobody cares about attacking that company, pretty good security is probably enough. The problem is somebody might care because it's just so easy to make a list of all the apps you should go after if you want people's credentials. It's so easy to build a campaign, right? Not to give too many ideas here but you could just say OK find me all the top podcasters in AI and my goal is to basically compromise them completely and fully an extremely.
[1:19:51] Nathan Labenz: High value target, yeah.
[1:19:53] Daniel Miessler: And just to be like, I mean, I don't even actually want to give, like, the full prompt, but you could make a single prompt that just does all this harvesting and builds perfect spear phishes exactly for you, finds every single vendor. And fortunately, because I talk about my stack, you talk about your stack, it's like we literally know the companies to go after. And getting back to the answer to your question, if somebody skilled targets somebody directly, especially now with these good models, most companies' security is not good enough to withstand it, especially if they have any sort of attack surface, if they have employees, if they can be spear-phished. It's fairly, I wouldn't say trivial, but it's fairly easy to get into these companies, especially if you have days or weeks or months to keep trying. So I would consider anything that you have in cloud small companies to be eventually compromised, right? So the question is how soon, and what do you have in there? And that's why I say limit the number of companies. So I try to use as many Google and Apple things as possible, because their security teams are massive and they're just constantly watching this stuff. And if something were to happen, it would happen to a lot of people at the same time and it wouldn't happen to us first. So the signal will come back to us pretty quickly and we could pull back. So screen sharing, password management, I try to use native OS stuff as much as possible, and there is now the ability to actually use Keychain or to use vaults. Another thing that's really solid is, like, AWS Vault for storing credentials. So that's another option that you have. I would say go to the bigger companies and use the ones that are going to be attacked the most and have the most security people working on them, because it really is security by obscurity that's protecting the smaller companies.
[1:22:11] Nathan Labenz: So for something like 1, I think that's probably good advice. And generally try to stick with the Titans as much as possible as well. I don't know that they have, maybe they do, but I'm not aware of a Google or Apple product that would easily allow me to share my API keys with agents, for example. So if you're aware of a solution like that, you know, please point me to it. But if there isn't one, I guess I'm still a little bit confused on the claims that companies like this make, because they have the sort of double encryption idea, which makes it seem like they're saying that even if one of their employees got hacked, or even worse, even if one of their employees went rogue, the idea is supposed to be that somehow, even if I'm a full-access 1Password, you know, engineer in good standing, I still wouldn't be able to get Nathan's or Daniel's passwords because of something. But it sounds like you basically just don't think that that's really that.
[1:23:21] Daniel Miessler: It's not true. No, it's not strong or true in many cases. And so I've, I've been on like all these sides. So I've been an actual auditor for like, you know, PCI auditor, I've worked at Apple and like all the internal companies and I've done all those security assessments. And I've also been on both sides of the audits and actually performed audits and the situation. And I've also been the one in charge of making the security claims right and helping marketing actually match it with reality. And what ends up happening is the security team will be like, OK, here's what we're shooting for. And marketing is like, OK, so I could say we're end to end encrypted. Well, no, no, no, not end to end encrypted. Marketing doesn't know what the state of the world, right? So they're making claims. The security leadership is making claims. Then you have like, what's actually true that the security team actually knows about and what maybe what they don't know about. There are so many different steps here where one little thing can make it true. It it could actually be true at the time of the public release, you know, coming out and then it's not true a week and a half later because 1 little config change happened, right? And there's just config changes happening all the time. So it's just most of those claims. The smaller the company, the less likely that is to be true, put it that way, because they're barely trying to just have a product and get it out there. They might not even have a security team or someone in IT who's good at security. I would say the larger the security team gets, the better the chances, but it's still not a guarantee. I I would consider those claims from smaller companies that you've barely just heard about, not quite with a grain of salt, but pretty low efficacy. So for.
[1:25:23] Nathan Labenz: Private networking. You use an open source thing and you run it in Cloudflare. Does is Cloudflare like one that hits your sort of Titans bar and so you feel more comfortable running an open source thing in their infra for that reason?
[1:25:41] Daniel Miessler: It does hit my Titans bar for a few different reasons. One, they have a very large engineering team. Two, they're being attacked constantly and they're just super cutting edge. But that's not sort of the real reason. The real reason is that the attack surface isn't obvious, right? Because it's running in, like, Workers. So now you have to go and attack the Workers or you have to go and attack the main Cloudflare account, which means compromising Cloudflare in some sort of major way. So, once again, you're trying to beat down a massive door that everyone is trying to beat down, and where if that door came down, everyone would know. Same with, like, GitHub being universally compromised, people would know pretty quickly. So I'm looking for something that has that sort of alarm on the door. And the other thing is, there's nothing in Cloudflare that says, like, go and check here once you've compromised my account to go and find this. I mean, you could use AI of course, but it's not sitting in, like, an obvious location, so there's just multiple layers that make it a lot more secure. I still use Tailscale, I'm just sort of deciding if I'm going to turn off Tailscale and fully use this instead.
[1:27:03] Nathan Labenz: Gotcha. Do you do similarly for password or token sharing?
[1:27:10] Daniel Miessler: So I mostly use local files combined with Keychain. So all my super sensitive stuff is actually purely Keychain, which is local.
[1:27:22] Nathan Labenz: So how would you share that across, and maybe this is another, let me give you just a little bit more of what I've got going and then you can riff on what I should be doing differently and better. Something that took me a while to figure out, actually, was, you know, obviously the AIs are not very reliable. They're in some cases quite gullible. They're subject to prompt injection and, you know, they're subject to adversarial attacks, right? So as I think, like, oh, I want to have my lower access, higher autonomy agents go out and do whole projects and, you know, interact with the world through potentially multiple rounds, through potentially multiple different modes of interaction. And that's another thing I want to get your take on. But I've given these agents on the Mac mini their own Gmail, for example. A lot of accounts, they're going to just kind of piggyback on mine, but they do have their own Gmail. So, you know, they're going to bump up against a lot of stuff out there. They're not instructed to hide that they are AIs. They're instructed to be upfront about the fact that they are AIs. I finally broke down and gave them names. I called the Claude one Aide, A-I-D-E, or Aiden if I want to do a full name. I had to make sure there's an AI in there, and I like the idea of it being the aide. And then the OpenClaw one is currently called Clai, C-L-A-I. And the idea, obviously, is, you know, I can mold it like clay over time. I've been very reluctant to give names, but I finally landed on names where I was like, OK, there's enough of a wink, you know, with the names including AI, and the names of Aide and Clai feel like sort of not people, but like something that I can quickly refer to without, you know, hopefully, you know, tempting or lulling myself into some sort of AI psychosis. 
So I finally found names I feel both psychologically comfortable with and, you know, they're decent enough names. Anyway, the AIs are instructed to go out into the world as AIs, you know, use these names, but certainly never lie. I'm still trying to find the right balance of, like, exactly how forward do I want to be with it? You know, I don't think it's necessary or probably great for their success for them to open every interaction with, hi, I'm an AI, you know, working on so-and-so's behalf. I think you kind of want to show that there's some actual quality and value first. And then when people, you know, either see that at the end or maybe if they ask, you know, they find out, then hopefully they've been kind of convinced that there's enough quality in the interaction that they're not, you know, totally put off by it. Hopefully they're impressed by the fact that it was AI and not put off by it. But anyway, as they go out and do all these things, I think one big trend that I'm expecting more and more of is, as people figure out, oh, I'm dealing with an AI, people are going to start to mess with it, right? So now I'm like, I want to have some way where the AIs, with these autonomous agents, I want them to be able to come back to the main agent or me and ask for more information, but I don't want to give them everything. I want to give them enough that they have, you know, it's not supposed to be no, it's low. You know, it's low access, but it's not no access, and it's lower context, but it's not no context, certainly. So one thing I've done is taken the wiki that lives on my personal laptop and made a version of that that is the assistant version. The heuristic I used for that was just saying, pretend this is a human assistant that I'm onboarding and make a version for them. And that would mean, you know, they should have contact information for people. They should have, you know, a lot of different things. 
But if there was ever something that that person told me that made it into the summary that they would be like, why the hell did you tell your assistant about that? You know, then that shouldn't be there, right? So that's one layer of, of context that I've kind of tried to, you know, hold back anything that people wouldn't have wanted me to share, but still give it enough to be capable. Still, sometimes it will need more information. And so I've set up a message bus where the, and it's hosted on the Mac mini. So any agent, whether it's me or the, the autonomous agents on the Mac mini, they can write to the message server.
[1:31:53] Nathan Labenz: The laptop is constantly pinging it. And there's another interface: it sends me a push notification to the phone when there's a new question for me. And we can give information, but hopefully not too much information, right? And hopefully we would know, wait a second, why are you asking for that? You know, if somebody's kind of adversarially jailbroken or whatever my agents. Then there's also the question of just, OK, we're all on this Tailscale network together. How do we make sure that the Mac mini agents can't just root the laptop? So what I've tried to set up there is kind of one-way hierarchical control, and I've structured this in a couple different ways. One is the Mac mini cannot SSH into the laptop, but the laptop can SSH into the Mac mini, and my phone can also SSH into both. And then on a repository level as well, I have kind of a hierarchy where the main top-level agent that knows everything is the second brain repo. And I call that, you know, Nathan's personal AI infrastructure, you know, with your inspiration. And then different repositories kind of sit under that and have partial information, and some of those get mirrored across the different computers, and the other agents also have access to them, including write access. They also have their own GitHub, by the way. So they can, you know, come and make commits and push stuff to GitHub, but only in certain repositories. So the second brain one just lives on the computer and is not shared with the agents. The podcast production repo that does all the things, that is shared and both can use it. The shared wiki is of course shared and both can use it. You know, there's more and more of these things, you know, happening over time, right? 
Like, I'm now using Mercury credit cards also, to create often individual merchant-limited credit cards that I can then give over to the agents and say, OK, you can buy groceries on Shipt, but that's literally all you can do with this card, right? So I'm willing to take a little risk that they buy the wrong peanut butter or whatever. But as long as it's, you know, a transaction under $500 a week on Shipt, I can live with whatever, you know, could go wrong there. And so this hierarchy of information flow, and who sits on top and who can see what beneath them, has been interesting. And then there's also shared tools that are sometimes kind of cross-repo, right? Like, you know, the Brave Search API, for example, I need that from my main second brain, but I also need that from the podcast repo. And then the agents, you know, themselves also kind of need it. So the lowest level thing is just shared utilities. I think last comment, and then I'll just ask you for feedback: a big unlock, I think, has been realizing that my top-level second brain Claude Code agent can change all these repos underneath it without needing to be it. So I think an early version of me was like, I've got these autonomous agents over here. I'll go prompt them to enhance themselves, they'll make commits, and then that'll go up to the cloud. And now I've realized it's probably best for me to just have this single agent at the top of the hierarchy change itself and also make the edits to whatever other repositories need to be edited. And then the command to the agents is like, you've got changes coming from, you know, your sort of highest-level AI supervisor, update yourself. Of course, that command could also be issued, you know, via SSH from the top-level agent. But it's less of a, like, update yourself as you see fit and more of a, as a system, 
we've made some upgrades, like, get yours, and you know, now you'll be using that in the future. I'm sure I've left out some details there that might be relevant. So if you see gaps in terms of how this works, ask me, and then otherwise, you know, tell me what you think. What am I doing right or wrong?
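Editor's sketch: the one-way hierarchy Nathan describes (phone can reach both machines, the laptop can reach the Mac mini, the Mac mini cannot reach the laptop, and the second brain repo is never shared downward) can be written down as a small access model. This is a plain-Python illustration for checking the rules against each other, not Tailscale, SSH, or GitHub configuration. The host and repo identifiers are hypothetical labels.

```python
# Directed (source, destination) pairs that are allowed to open an SSH session.
SSH_ALLOWED = {
    ("phone", "laptop"),
    ("phone", "mac-mini"),
    ("laptop", "mac-mini"),
}

# Which machines hold a copy of each repository.
REPO_HOSTS = {
    "second-brain": {"laptop"},                    # top of the hierarchy, not shared
    "podcast-production": {"laptop", "mac-mini"},  # shared, both can use it
    "shared-wiki": {"laptop", "mac-mini"},         # assistant-safe context
}


def can_ssh(src: str, dst: str) -> bool:
    """True only for explicitly allowed directions; everything else is denied."""
    return (src, dst) in SSH_ALLOWED


def can_access(host: str, repo: str) -> bool:
    """True if the repo is mirrored to that host; unknown repos are denied."""
    return host in REPO_HOSTS.get(repo, set())


print(can_ssh("laptop", "mac-mini"), can_ssh("mac-mini", "laptop"))
print(can_access("mac-mini", "second-brain"))
```

Writing the rules as deny-by-default data makes it easy to spot an accidental path from the autonomous machine back up to the high-access one.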
[1:36:24] Daniel Miessler: I mean, I think that's overall correct. I heard a lot of different repos in there. I think I'm trying to be more unified or simplified. So I have an Unsupervised Learning unified GitHub, which is essentially what we are doing for work. So the reason I'm using GitHub for this is because it has so many primitives already built in. So you can use issues, you can actually comment on things. And what's cool about it is those replies and everything can actually be targeted by emails. So you can actually interact with the file structure or the system structure of GitHub kind of for free, like as a project management system, with email or with the GitHub command line individually. So what I have is I have separate agents. They're on Mac minis, they're on a DMZ inside of my firewall. So they are essentially Internet boxes that I treat like cloud boxes, like an actual separate employee. They have their own personalities. They have a first name and last name. They have an image. I didn't give them a writing style because I didn't really need to, other than just, like, be terse and concise and stuff like that. But they have their own Mac, their own Mac account, their own UL account, which is Google, and their own AI account. And the way I handle the access to skills and tools and stuff is just a GitHub repo. And all this is automated so that it's regularly doing checks. Now what's cool about the universal UL system is Kai has access to that one also. But all my, I have three: Devi, Soren, and Mira are their names, and they're kind of broken up by different roles. But Kai can see all of them and the GitHub repo, and they each can see the GitHub repo. But what's cool is I can say, hey, I have a project, blah blah blah, and that gets automatically marked up as pending or future or whatever. Plus it's a project. So tags within GitHub are also a really powerful feature here. 
And what's cool is the agents can then go and look every 5 minutes at the repo to see if there's something new to be done. Like, I just gave a resource task to Kai, which means I've given it to my overall system, I've given it to the company, and now whoever gets there first or whoever's best suited for the job, I haven't really sorted that out. Right now, it's just who gets there first. It literally sees an open GitHub issue. It's unresolved, so it marks it as I'm currently working on it, and then just goes and works and brings back the results. And now all of my AI, most importantly Kai and my overall PAI automation, could just know the state of the company by looking at the state of GitHub. So for separation, it's separate everything on those individual boxes, including the AI account, and I'm using all of that through just PAI. So the other thing that gets you is the fact that each instance has their own skills, and the shared skills are on GitHub.
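Editor's sketch: the "whoever gets there first" rule Daniel describes (an agent polls, finds an open unclaimed issue, marks it as in progress, then does the work) reduces to a small claim step. The version below models issues in memory only, so it does not call GitHub at all. The `Issue` class and `claim_next` function are hypothetical names, and the polling interval is left to the caller.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Issue:
    number: int
    title: str
    state: str = "open"
    assignee: Optional[str] = None  # None means nobody has claimed it yet


def claim_next(issues: list[Issue], agent: str) -> Optional[Issue]:
    """Claim the first open, unassigned issue for `agent`, or return None."""
    for issue in issues:
        if issue.state == "open" and issue.assignee is None:
            issue.assignee = agent  # the "I'm currently working on it" mark
            return issue
    return None


# Illustrative backlog; the agent names are the ones mentioned in the episode.
backlog = [Issue(1, "Research task"), Issue(2, "Draft sponsor recap")]
first = claim_next(backlog, "Devi")
second = claim_next(backlog, "Soren")
print(first.number, second.number, claim_next(backlog, "Mira"))
```

Against a real shared tracker, two agents polling at the same moment could both see the same issue as unclaimed, so a production version would likely need to re-read the issue after marking it and back off if someone else won.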
[1:40:09] Nathan Labenz: So basically it sounds like you think GitHub could replace my custom agent message bus? Probably probably could. I can just I could use the GitHub app as like the place where I collaborate with, comment back and forth with organize work.
[1:40:27] Daniel Miessler: Yes. Yeah, that I think so.
[1:40:30] Nathan Labenz: That might be a good upgrade. Obviously they've done a lot of work on GitHub over time.
[1:40:34] Daniel Miessler: Right now, the twist is, I think ultimately you want to build your own, but I think right now, in this particular moment, their structure is so well thought out and so universal. Plus the agents know how to use it. Yeah, perfectly well. So it's just like in the model. So I would say you if you want to just like wait, I think a custom system will be be better. But I think for the next year or two, GitHub might be better.
[1:41:03] Nathan Labenz: When you create multiple agents... I mean, there's one kind of reason I have two different agents, which is the second brain versus a more autonomous separate entity. That distinction feels important. And then for me, why do I have multiple agents versus just one of these more autonomous agents? It really came down to wanting to use both Claude Code and OpenClaw, and maybe other things in the future as well. I'm hearing good things about Hermes agent, and there are probably going to be more and more good candidates. So I wanted to try both of those technology platforms, but that was really the only reason I had separate ones. In terms of why there is aid and Clay on the Mac mini, it was really just that. Do you have other reasons? I've heard different takes on this. Some people think there's a lot of value in creating different roles for different agents: you are the marketer, you are the engineer, and so on. Others have said, which was a little more my intuition, that it's the same model doing all these things, so does it really need a whole different identity? And obviously that could just be user preference and your own mental models of the world. Does that all boil down to personal, idiosyncratic preference for you, or do you see structural performance reasons to create multiple agents with different personas?
[1:42:43] Daniel Miessler: Yeah, great question. I think the human-like model is really powerful. We're so familiar with humans and human boundaries that it's just natural to treat it more like a human, because you understand that humans have capabilities, and you understand that you allow this role in your company to do certain things. So Devi is the assistant, Soren is the engineer, Mira is marketing and social media. I have them separated like that. Every agent platform has the ability to do sub-agents. So in OpenClaw you can actually build one agent and separate all of these as sub-agents, so you don't actually need a separate one. I don't like that because it complicates the way we think about it as humans. I also like separate boxes and separate accounts just from a security standpoint. So it all sort of collapses and triangulates on role, individual personality, access control, security, everything. That's why I did it, and that's why they have separate names and separate personalities. I have images for them. So I'm really leaning into it, not because I believe they're conscious, but because I think it's easier for humans to handle, think about, and conceptualize. So for example, Devi is the one who, once the AI is good enough or my scaffolding is good enough, will be able to just handle client interactions. I haven't done this yet. My other agents are not going to have access to all the different customer information. This is the reason I haven't done it yet. And by the way, the number one security system that you just have to have for this stuff is a good prompt injection defense. It's the number one ingress into your entire system. And it's the other reason I like the blast radius containment of a completely separate box.
And the Mac minis can't talk to each other. Even though they're in my closet, they can't talk to each other, and they can't talk to anything on my LAN. They can't talk physically across the LAN because they're isolated at layers 2 and 3. And the ones that have Tailscale can't talk to each other over that either. So, getting back to your other main point, Kai is like an extension of me, so he's helping me manage this entire system, because he has full access to everything. That's like Ring 0. And then they are considered employees, which is why they have all those different separations. But Kai can hit them on the command line, Kai can SSH, Kai can do anything to their configs, including update them. As far as continuous updates, though, they are updating via cron. They're just checking GitHub all the time to see if there are any new skills, and that's how they do their updates.
[1:46:03] Nathan Labenz: That reminds me of another question I had. We're seeing all these supply chain attacks now, right? Recently, all sorts of TypeScript repos were compromised for a minute there. I turned to Claude and said, here's the news item. Am I vulnerable to this? What should I do? Fortunately for me, in that instance, it said, no, we're not using any of the things that have been reported. So I was like, hey, good enough for now. I've seen people saying security updates have never been more important. And you also see people now saying, don't install a package that's less than three or seven days old, or whatever. How do you think about that? How do you balance that? And for prompt injection defense, what are you doing? Do you have any tools, prompts, anything to recommend on that front as well?
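Editor's sketch: the "don't install a package that's less than three or seven days old" rule mentioned here can be expressed as a simple age filter. This is a minimal illustration under assumptions, not a recommendation from either speaker. The function name, the `exempt` set for explicitly approved urgent security fixes, and the sample version data are all hypothetical.

```python
from datetime import datetime, timedelta, timezone


def installable(releases, now, min_age_days=7, exempt=frozenset()):
    """Return versions that are at least min_age_days old, or explicitly exempted.

    releases maps a version string to its publish time. The exempt set is an
    assumed escape hatch for security fixes a human has reviewed and approved.
    """
    cutoff = timedelta(days=min_age_days)
    return sorted(
        version
        for version, published in releases.items()
        if version in exempt or now - published >= cutoff
    )


# Hypothetical release history for one package.
now = datetime(2025, 1, 10, tzinfo=timezone.utc)
releases = {
    "1.0.0": datetime(2025, 1, 1, tzinfo=timezone.utc),  # 9 days old
    "1.0.1": datetime(2025, 1, 8, tzinfo=timezone.utc),  # 2 days old
}

seven_day = installable(releases, now, min_age_days=7)
with_override = installable(releases, now, min_age_days=7, exempt={"1.0.1"})
```

The exemption is where the tension Nathan describes shows up: a cooldown protects against a freshly compromised release, but it also delays a genuine security patch unless someone approves it by hand.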
[1:47:02] Daniel Miessler: Yeah, so I have a custom hook system that looks at every single prompt coming into the system and does prompt injection defense. I've also got a separate system for file system defense. But the main thing is that hook. And that's one I don't really share that much. I can give it to you privately, but ideally you wouldn't have the things that are used to defend against prompt injection be too public, because that makes them easier to work around. The thing with prompt injection is it's never quite solved. You can eventually get in, but an attacker has to get past the explicit prompt injection defense combined with the intelligence of the model. So the more the model knows about my system, which is a lot because it's PAI and has all the context, and it knows I'm a security person, it knows I like separation, it knows all these different things, and it knows about prompt injection, the more likely it is to catch everything. So it's probably 99% defense, but a 1% opening is still a lot of opening. You're really just trying to get to the point where it'll be obvious if you're being attacked, or everyone else is being attacked at the same time and you can learn from it. One thing I have is an incident response skill. I can basically send a command and all my keys rotate. So if I just assume my keys are hit, because, for example, GitHub gets compromised or for whatever reason, I can just revoke the keys, and now I don't care.
[1:48:42] Nathan Labenz: Now, does that take everything offline? When I get keys... maybe I'm ignorant here, but I'm like 8 clicks down some tree in every app to get the API keys. Do you have to go and do all that? Or is there some way that I'm not aware of? It's one thing to disable the keys, and another thing to actually get all the new keys and get your stuff working again, right?
[1:49:05] Daniel Miessler: Yeah, so I basically have a redeploy process. For GitHub, or for Cloudflare, it's pretty simple. If you rotate your keys, I do have to redeploy everything with the new key. But that's what makes defending against the previous compromise valuable: now everything is deployed with a new key. But yeah, you have to do that wherever you're using that key. And I've done this before, before the skill was complete, and sure enough, some process over here is now using a dead key, so that workflow stops. So now the skill basically knows all the different infrastructure that needs to be updated.
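Editor's sketch: the point about a skill that "knows all the different infrastructure that needs to be updated" amounts to keeping an inventory of every place a key is deployed. The sketch below is an assumption-laden illustration, not the actual incident response skill. `KeyRecord`, the consumer names, and the use of `secrets.token_hex` as a stand-in for a provider-issued key are all hypothetical.

```python
import secrets
from dataclasses import dataclass, field


@dataclass
class KeyRecord:
    """One credential plus every consumer known to be running with it."""
    name: str
    value: str
    # consumer name -> the key value that consumer is currently deployed with
    deployed: dict[str, str] = field(default_factory=dict)


def rotate(record: KeyRecord, new_value: str) -> None:
    """Replace the key. Consumers keep the dead value until redeployed."""
    record.value = new_value


def stale_consumers(record: KeyRecord) -> list[str]:
    """List consumers still holding a key value that is no longer current."""
    return sorted(c for c, v in record.deployed.items() if v != record.value)


def redeploy(record: KeyRecord) -> None:
    """Push the current key value to every known consumer."""
    for consumer in record.deployed:
        record.deployed[consumer] = record.value


# Hypothetical inventory entry.
key = KeyRecord(
    "cloudflare-api",
    "old-key",
    {"newsletter-worker": "old-key", "site-deploy": "old-key"},
)

rotate(key, secrets.token_hex(16))  # placeholder for a provider-issued key
broken = stale_consumers(key)       # workflows that would now fail
redeploy(key)
still_broken = stale_consumers(key)
```

A consumer missing from `deployed` is exactly the failure described above: the rotation succeeds, but some forgotten process keeps using a dead key until its workflow stops.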
[1:49:56] Nathan Labenz: Cool. That's interesting. What other affordances are you giving agents? Here's one I'm contemplating right now. It's got its own Gmail account, it's got its own GitHub, it's got access to some shared passwords and secrets. One of the tasks I'm hoping I can get AI to help with is taking some accumulating home improvement projects off my plate. So I actually just had Claude go through a neighborhood GroupMe chat where, it turns out, over the years 300 different contractors have been mentioned, recommended, or shared in the chat. People had already put some effort into creating a spreadsheet of that. The Claude pass more than doubled the size of the spreadsheet. So whether my neighbors are going to think I'm a hero or a creep for doing that is yet to be determined. But now I've got seemingly pretty promising answers for a few of the things I want to get done, where neighbors had a good experience. How do I sic the AI on this? Do I have the AI text? Do I have it try to call? If it's going to call, does it need its own phone number? What does that look like? I don't know how people are going to respond to getting calls from AIs. This is probably in contrast to an email, where I kind of think you try to show value first and then reveal you're an AI at the end. With the phone calls, I think you probably have to say, "I'm an AI calling on behalf of a real possible customer," and try to get them to listen based on the promise, because obviously I think the AI voices are still fairly detectable. But what's the frontier for you, beyond the obvious Swiss Army knife of things that you've given agents, to go interact with the world?
[1:52:02] Daniel Miessler: Yeah, I think the direction I'm heading there, and I haven't done it yet for the same reason I haven't given it full customer access to send emails and everything, is essentially that you could do full outbound calls. One example of this is a business that I'm sort of working on, which involves finding people who make tons of money and basically are struggling. First of all, I have to find them, and then I have to do outreach. And what I've sort of settled on is Twilio combined with ElevenLabs, combined with a really good agent infrastructure, which I'm building roughly on the structure of Hermes. If you can have that back-and-forth interaction with ElevenLabs, which is pretty easy to do, combined with a pretty smart agent, you can pretty much do outbound calls, and you could do outbound sales. This is what all these third parties are doing. So I think the functionality is there: outbound texts, outbound calls. I think that's pretty much the way to go, and I've basically settled on Twilio and ElevenLabs.
[1:53:18] Nathan Labenz: I've been a Twilio customer for a long time, although not so intensively recently, so I can definitely say they're up there with Stripe. My impression of their APIs was truly crème de la crème, things of beauty. Do they handle the confirmation step well? One thing that was flagged for me as I was talking to AI, of course, about how to do this is that those VoIP numbers sometimes don't work when you sign up for a new account and then get the code or whatever. Sometimes they have problems with that. Have you had any?
[1:54:02] Daniel Miessler: I've had some issues with it. I've had it work sometimes, and with other numbers I get that sort of thing. It's like, "California does not allow a number like this to be used for outbound calls," or whatever. I'm actually looking at some other vendors specifically because of that. But I have got it to work. I do have one account that does work, and it is truly AI-based. So I think it's kind of a murky area, a voice-call-spam type of situation.
[1:54:31] Nathan Labenz: Yeah. You mentioned Hermes; it's come up a couple of times. Where are you right now in terms of Claude Code versus Codex versus Hermes versus other things? Obviously that's both harness and model, and those are, I guess, increasingly coupled but still separable. Are you finding any value in GPT 5.5? Any other models that you're using in a significant way aside from Claude?
[1:55:04] Daniel Miessler: Yeah, so for me, everything is PAI. Everything is my single harness, my own custom-built harness, which is built on Claude Code. And because PAI now has the Telegram functionality and the agent functionality, what I do is, if I hear about a thing enough times, enough YouTube videos, enough people pinging me, which happened with Hermes about 3 weeks ago, I'm like, OK, go hit the repo, go hit every single forum, find all the videos talking about it, run all the transcripts, find out what people love about this, and see what it has that we don't. And that's what Kai does. Kai goes and looks and says, oh, it turns out Hermes is doing this cool thing with context. Hermes is doing this cool thing with memory updates, which is a thing you've talked about before that we wanted to add to our memory system. It also handles soul files in this way. It handles principal files and DA files in a different way. And I'm like, OK, yes. So I'll have a conversation with Kai. It took probably an hour for me to have that conversation to decide what I think we're doing better and what they're doing better. And there were some things they were definitely doing better, and now they're in PAI. The same thing happened with this memory system called Honcho. Honcho was doing a pretty cool memory thing. So this is literally all I do now. Someone's like, hey, you've got to try this. Well, there's no way I'm going to try it, because I'm using my own custom-built system. I paste it in, parse it, and see if there's anything worth bringing over. And over time, I mean, these are just features, right? They come out as projects, but to me they're just features that are coming into my unified system, and that's pretty much the way I treat everything.
[1:57:04] Nathan Labenz: Yeah, it's funny. I pretty much do the same thing. I wanted to at least have an OpenClaw, to not be in the spot where I've never used OpenClaw. That just felt weird at some point. But the other 99% of things that I come across, in whatever format, go through a similar process: go look at the repo, figure out if there's anything that we really should be doing.
[1:57:34] Daniel Miessler: Yeah, in the case of Hermes, I actually have Soren currently running Hermes, which means he's not full PAI, which means he's missing a whole bunch of stuff. But I want to be able to hit Devi, which is full PAI and full interaction, and then hit Soren and ask similar things. And if Devi breaks, OK, what's the problem? Go check the source code. What about this is fragile relative to Hermes? And vice versa, what is Hermes not doing? It's just so that I can make sure we've actually got full feature parity plus, and that I'm not missing anything. It's a quality check versus whatever the other system is.
[1:58:21] Nathan Labenz: What do your token budgets look like these days? Are you paying a lot of overages for your monthly Claude use, or do you manage to keep to the $200 tier? What does it look like for you?
[1:58:36] Daniel Miessler: Yeah, it's mostly subscriptions, and I'm handling it fairly well. Each of my agents actually has its own subscription, so that's pretty serious. You need to watch that with Anthropic, right? Because Anthropic is like, is this being used for personal use or whatever? But it's a full Claude Code instance, so I don't think it really matters. I'm also looking at a Pi version of PAI. Have you seen the Pi agent? It's actually what OpenClaw is based on. It's almost like fundamental agent components as opposed to a harness itself, and there you have more flexibility to build off of. But roughly everything Kai is subscription-based. So that's that. For all my businesses, you can't run subscriptions. You can for GPT 5.5, but you can't for Claude. So for those I use API keys. I have several businesses that make money that are running in the cloud. I would say around $300 to $500 a month in API charges is what I'm currently hitting. And I'm always trying to adjust that down, use cheaper models or whatever, but those are the API costs. And then for subscriptions, probably another $400. So I would say less than $1,000. Only, like, I did a voice transcript job that was not supposed to be run that way, and I automatically had a $900 charge, just randomly. Those will hit me every couple of months, and I'll be surprised. But in general, I pretty much have those under control.
[2:00:31] Nathan Labenz: Yeah, cool. I don't think I'm token-maxing quite as much as you are, or making the absolute most of my subscriptions, but they do probably add up to about $1,000 between OpenAI's $200 plan, whatever that's officially called, and so on. What about other models? Are you seeing any use for them? It sounds like the commitment to Claude Code is such that you're probably not doing too much with GPT 5.5, but...
[2:01:05] Daniel Miessler: I do a decent amount.
[2:01:06] Nathan Labenz: Yeah. So is that starting to become a worthy competitor for general-purpose knowledge work, in your experience?
[2:01:15] Daniel Miessler: So I do it via agents. That's the first thing. Agents are just files, which you can set up to say: when you run, you run Codex with this command line, pass in the system prompt, and pass in the user prompt that I just sent you. So Kai is managing agents. I've got Kimi K2, I've got local stuff, I've got Ollama, so I can run agents that run local models. And at least two of my agents, my two main ones, are GPT 5.5 through Codex. What I found is I don't really like talking to them. So my main interaction, and Kai, is all Claude. But I have had situations with 4.7 versus 5.5. So for my algorithm, anything E4 or E5, effort level four or five out of five, must use Forge to check the work. When I built one of my major applications, did all the security for it, did everything, and was about ready to push, I had Forge take a look at it. It took almost 40 minutes because there was a lot of code, and he came back not with any criticals, but with a couple of highs, which means Forge found things that Kai didn't. Forge, GPT 5.5, found things that Opus 4.7 didn't. I would say GPT 5.5 is also really good at hacking, by the way. Some of my hacking agents are using 5.5, and it's really, really good. My understanding is it's not as good as Mythos but getting much closer. So that's how I do separation through agents.
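Editor's sketch: the rule that effort level four or five work "must use Forge to check the work" can be written as a small gate. This is a hypothetical illustration, not the actual algorithm. The reviewer names come from the conversation, but the function names and the sample findings are invented for the example.

```python
def reviewers_for(effort_level: int) -> list[str]:
    """Return who must review work at a given effort level (1 to 5).

    Assumed rule, per the conversation: the primary agent (Kai) always
    reviews, and a second agent on a different model (Forge) is required
    at effort level 4 or 5.
    """
    if effort_level not in range(1, 6):
        raise ValueError("effort level must be between 1 and 5")
    reviewers = ["Kai"]
    if effort_level >= 4:
        reviewers.append("Forge")
    return reviewers


def missed_by_primary(primary: set[str], cross_check: set[str]) -> set[str]:
    """Findings the cross-checking model reported that the primary did not."""
    return cross_check - primary


# Hypothetical findings from a security review of one application.
kai_findings = {"weak session timeout"}
forge_findings = {"weak session timeout", "missing rate limit", "open redirect"}

low_effort = reviewers_for(2)
high_effort = reviewers_for(5)
extra = missed_by_primary(kai_findings, forge_findings)
```

The value of the second reviewer comes from using a different model, on the assumption that its blind spots overlap only partly with the first.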
[2:03:11] Nathan Labenz: Cool, interesting. I haven't done that really at all. I've got these parallel worlds of Claude Code and Codex, or Claude Code and OpenClaw, but I haven't really defined any agents like that. I do also have the ability for Claude to call Gemini APIs or whatever APIs. But to actually have a purpose-built agent sitting within Claude Code, powered by a different model or using a different provider, that I have not done. So that's an interesting frontier to explore.
[2:03:48] Daniel Miessler: Yeah, so check this out. I have a tool, a universal PAI tool, called Inference, and it's got 3 levels. By default, the inference tool is just Haiku, Sonnet, and Opus, right? So anytime I need to think about anything, it calls the inference tool, which is a CLI utility that hits the subscription and goes out. And this is for all native Claude Code stuff. I also have a private inference tool, and that one only calls Kimi K2 on Ollama. Having that ability means that, because Kai understands the overall system, I can then say something like, "Hey, this is security level 3." I haven't done this yet. But Kai will automatically know that the workflows have to run through secure inference instead of regular inference. So eventually maybe that's three tiers, maybe that's five tiers, but the system should just understand how to route things if something is ultra-sensitive and shouldn't go to the cloud.
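Editor's sketch: the routing idea described here, a three-level default inference tool plus a private tool for sensitive work, might look like the function below. Daniel says he has not built the security-level routing yet, so this is purely illustrative. The tier names (`fast`, `standard`, `smart`), the local model label, and the threshold of 3 are assumptions.

```python
# Assumed mapping from a requested tier to a cloud model family.
CLOUD_TIERS = {"fast": "haiku", "standard": "sonnet", "smart": "opus"}

# Hypothetical label for a locally hosted model; not a real model identifier.
LOCAL_MODEL = "local-kimi-k2"


def route(tier: str, security_level: int = 1, private_threshold: int = 3):
    """Pick (destination, model) for a request.

    Anything at or above private_threshold stays on the local model,
    regardless of the requested tier. Everything else goes to the cloud
    model for that tier.
    """
    if tier not in CLOUD_TIERS:
        raise ValueError(f"unknown tier: {tier}")
    if security_level >= private_threshold:
        return ("private", LOCAL_MODEL)
    return ("cloud", CLOUD_TIERS[tier])


routine = route("fast")
hard_problem = route("smart", security_level=2)
sensitive = route("smart", security_level=3)
```

Putting the security check ahead of the tier lookup means a sensitive request can never be upgraded to a cloud model by asking for a smarter tier.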
[2:04:58] Nathan Labenz: So you're running K2 locally, or K2.5 locally?
[2:05:05] Daniel Miessler: Yeah, yeah. K2.
[2:05:09] Nathan Labenz: Because those are big models, right? Those are like, I think that one is close to a trillion parameters, if I remember correctly. Yeah, yeah.
[2:05:17] Daniel Miessler: It's highly quantized. Yeah, highly quantized. And Qwen, I can't remember which version is the latest one, it might be 3.5 or something. I've got like 9 different Ollama models, and I just keep rotating through whatever the biggest one is that I can actually fit onto the box, and this is fairly slow. Oh, also, have you used the MLX stuff? The Apple stuff?
[2:05:49] Nathan Labenz: Not unless it's happening. Well, I'll say no. Educate me a little bit.
[2:05:55] Daniel Miessler: Yeah, yeah. So the big thing with these is that I have a giant AI box, but it's not big enough to run a bunch of these things, so it won't go very fast. So if you run with the Apple sort of translator, I'm using, I think, 192 gigs of RAM. So that's what it sees: my system memory is what it sees as the available GPU memory, and that allows you to run pretty much anything.
[2:06:26] Nathan Labenz: So MLX is an Apple layer that treats just normal flash disk as RAM for the purposes of running big...
[2:06:38] Daniel Miessler: Not flash disk. Your system memory, the shared memory of your Mac.
[2:06:44] Nathan Labenz: But like your main hard disk.
[2:06:46] Daniel Miessler: No, no, no, not the disk, your system memory. So, like, the size of the memory of your Mac. Mine is an M2 with 192. I have 192.
[2:07:01] Nathan Labenz: You have 192 gigabytes of RAM on a single machine?
[2:07:05] Daniel Miessler: Yes, but it's also the size of the GPU memory. That's the whole thing with SoC, system on a chip: for Apple, those two things are unified, whereas for normal systems you have separate GPU memory. So I have GPUs, and I think those are 24 gigs apiece, so combined it's 48 gigs. That's tiny compared to the 192 of my Mac. The downside is it's just slower, but it allows me to fit absolutely monster models in there.
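Editor's sketch: a rough back-of-the-envelope calculation shows why quantization and memory size matter here. It counts weight storage only and ignores the KV cache, activations, and runtime overhead, so real requirements are higher. The trillion-parameter figure is Nathan's recollection from the conversation, and the bit widths are illustrative assumptions rather than the quantization actually used.

```python
def weight_gb(params: float, bits_per_weight: float) -> float:
    """Approximate weight storage in gigabytes (10^9 bytes), weights only."""
    return params * bits_per_weight / 8 / 1e9


def fits(params: float, bits_per_weight: float, memory_gb: float) -> bool:
    """True if the weights alone fit in the given memory budget."""
    return weight_gb(params, bits_per_weight) <= memory_gb


PARAMS = 1e12          # "close to a trillion parameters", per the conversation
UNIFIED_MEMORY = 192   # Mac unified memory mentioned above, in GB
DISCRETE_GPUS = 48     # two 24 GB cards combined, in GB

at_16_bit = weight_gb(PARAMS, 16)   # 2000.0 GB
at_4_bit = weight_gb(PARAMS, 4)     # 500.0 GB
at_1_5_bit = weight_gb(PARAMS, 1.5) # 187.5 GB

fits_mac = fits(PARAMS, 1.5, UNIFIED_MEMORY)
fits_gpus = fits(PARAMS, 1.5, DISCRETE_GPUS)
```

Under these assumptions, a model of that size only fits in 192 GB at very aggressive quantization, which is consistent with the "highly quantized" description, and it does not come close to fitting in 48 GB of discrete GPU memory.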
[2:07:42] Nathan Labenz: Is there a use case for that other than extreme privacy? Multiple times, again with AI to help, I've gone down the analysis of: OK, here's a new model, it looks pretty good, it's not that big, maybe it's mixture of experts, maybe it's not. What kind of computer would I have to have to fit this? I've found that the answers to "what would you have to have to fit this" make it sound pretty good. But then there's exactly what you're flagging, which is usually my next question: what kind of tokens per second could I get with that? Then it tends to sound not so good pretty quickly. So I haven't done it, mostly because it doesn't sound like a great user experience. Is there anything you're finding it valuable for other than just the extreme privacy? And I was wondering, how do you route on extreme privacy? Because obviously that's a decision that an intelligence has to make, right? So do you have, like, a pre-hook or something that checks if there's a password or a credit card?
[2:08:54] Daniel Miessler: I'm not using the routing, because I'm only building the system for a future state where I might want to do this, and where I might also want to do this for customers. This stack is largely the stack that I use when I go into customers as well, so I want that ability to route for the future. Right now I'm not doing it. The other thing is, I actually think I might have been using the API for Kimi K2, because I remember that I was doing it as a local model, but then I was surprised by the fact that it was cloud. And I believe that's a Chinese model, right? So I don't route anything to it, because I don't want my PAI context going to the Chinese model. I just want there to be lots of different options tagged as "use this local model if it's this type of task," and so on. But to answer your question, I don't do it yet. I've already put my eggs into a cloud basket, which is Anthropic. So the pee is already in the pool. Having a separate little instance over here, which is somewhat secure or not, doesn't really accomplish anything if you're already using everything in the cloud.
[2:10:21] Nathan Labenz: Yeah, makes sense. Got to trust somebody somewhere along the line, I suppose. You mentioned cron jobs, and this is something I'm also ramping up right now quite a bit. My instinct has always been, I guess first of all because I'm just not a super ******** programmer at heart, that for decades I've never been a big cron job person intuitively. And I also have this sense that it just sort of feels wasteful. I'd rather have something that pushes, where I get a notification and respond, more of a webhook type of architecture. But it does seem like cron jobs are often the way to go, especially if there's no webhook available, or certain assumptions aren't necessarily going to be met. For example, if I have my laptop dialing into this message bus thing, well, the laptop can go offline if I take it on a plane with me or whatever. So it's not reliably going to receive the ping even if one were sent to it. So I am going more in the direction of cron jobs. That is interesting. It's a bit of a paradigm shift for me. I'm also now starting to build little dashboards for what cron jobs I have, how often they've run, whether they're succeeding, and what the output is. What does your spider web, or crazy nest, of cron jobs look like?
[2:11:55] Daniel Miessler: Yeah. So PAI now has this Pulse system. The Pulse system is basically localhost 31337, an old security joke for "elite," and that is my unified system for all life management. Under work, it also has all my cron jobs listed. Another way to think about it is that cron is the old Unix word for it, but it's just scheduled tasks. And if you abstract one layer more, the thing we're trying to do is make our AI proactive. That's the only reason I'm doing this. So I want pings. I want heartbeats. If my AI's job is to monitor to make sure I'm eating enough or not too much, it has to be checking all the time. So technically that's a 5-minute cron job. But really the goal to be given to the AI is, again, current state and ideal state. How often are you going to check that? So pretty much anything that we're building, including the maintenance of our skills and the maintenance of our memory files, needs consistent, constant refreshes. Cloudflare Workers have scheduled tasks built right in, so it's easy to turn on, and it's easy to turn on with Claude Code. You're like, "Hey, run this." And within macOS you have the ability to use launchd and do regular jobs. It uses actual cron for the Mac. So Kai's Pulse management of tasks is managing both local macOS stuff and all the different scheduled tasks in Cloudflare. So that's all the business stuff, all the AI automation happening in the cloud.
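Editor's sketch: the "current state versus ideal state" heartbeat idea, together with the dashboard of jobs Nathan describes, can be modeled as a registry of checks with intervals. This is not the Pulse system. `ScheduledCheck`, `run_due`, and the sample checks are hypothetical, and a real setup would be driven by cron, launchd, or a Cloudflare scheduled task rather than manual calls.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Callable, Optional


@dataclass
class ScheduledCheck:
    """One recurring check comparing current state against ideal state."""
    name: str
    interval: timedelta
    check: Callable[[], bool]  # returns True when current state is acceptable
    last_run: Optional[datetime] = None
    last_ok: Optional[bool] = None

    def due(self, now: datetime) -> bool:
        """A check is due if it has never run or its interval has elapsed."""
        return self.last_run is None or now - self.last_run >= self.interval


def run_due(checks: list[ScheduledCheck], now: datetime) -> list[str]:
    """Run every due check, record the outcome, and return names that failed."""
    alerts = []
    for item in checks:
        if item.due(now):
            item.last_ok = item.check()
            item.last_run = now
            if not item.last_ok:
                alerts.append(item.name)
    return alerts


# Hypothetical checks with hard-coded results for illustration.
checks = [
    ScheduledCheck("meals-logged", timedelta(minutes=5), lambda: False),
    ScheduledCheck("skills-synced", timedelta(hours=1), lambda: True),
]

start = datetime(2025, 1, 1, 12, 0)
first_pass = run_due(checks, start)
too_soon = run_due(checks, start + timedelta(minutes=2))
next_pass = run_due(checks, start + timedelta(minutes=5))
```

The `last_run` and `last_ok` fields are the raw material for the kind of dashboard mentioned earlier: which jobs exist, when they last ran, and whether they succeeded.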
[2:13:49] Nathan Labenz: Cool. That's an interesting one as well. I guess I'm starting to do that a little bit with my hierarchy of having the topmost agent modify the others, as opposed to having them modify themselves. Do you think there's anything lost? One thing people seem to find surprise and delight in, at least as reported on Twitter, is when agents do something cool that wasn't expected, or there's some emergent property. I guess I'm not too worried about it. But do you think there is something that I, or maybe we both, are missing by not allowing our employee agents to be a little more fully autonomous, like truly self-modifying? Have you experimented with things like, "surprise me with something cool"?
[2:14:45] Daniel Miessler: Not really, no. The one I'm really working on is trying to get this customer interaction thing to work perfectly, where they'll be able to talk to surge and basically manage the whole customer relationship. I just really don't want any surprises when it comes to that. You really can lose a customer if it calls them by the wrong name, and then sends the wrong data to them, and then makes some assertion about something that happened. It's like, "That's not me." And then it's signed "Daniel." Really? First of all, it wasn't me. That's the wrong data. You gave the wrong name for you and for me. Now how do I feel about the person who put this AI in place? You know what I mean? So I'm very cautious about that. With Kai, absolutely. The one thing I do with Kai regularly, and this goes to this concept... you know about Bitter Lesson engineering?
[2:15:52] Nathan Labenz: Well, I certainly know the Bitter Lesson, but tell me about Bitter Lesson engineering.
[2:15:55] Daniel Miessler: Yeah, so I've got a skill, Bitter Lesson engineering. It's another thing that is scheduled, which is Kai's job. The concept Richard Sutton had was, and this is paraphrasing, that your scaffolding gets stupider and stupider as time goes on. As the AI gets smarter and smarter, the specific ways you told it to do things get dumber and dumber, because it's us being egotistical and thinking our ideas are so cool, like, "Look how smart I made this skill," when in fact the AI could do it better. So I'm constantly trying to figure out: have I over-engineered the skill? Have I over-engineered the separation and all these different things? That's why I only have one work GitHub, because the smarter the AI gets, I think the more it can figure that out. I'm a big scaffolding person. I've always been this way with things like security testing methodologies. But the better the AI gets, the less scaffolding it's going to need. So I'm always guarding against that.
[2:17:07] Nathan Labenz: Yeah, that's cool. And then you just have it propose changes to you and you approve them? You don't allow it to auto-implement those changes?
[2:17:18] Daniel Miessler: No, no, not, not auto implement. So I have an upgrade skill which I run periodically. I also get a report saying, hey, you should run this. You haven't run it in a while. But the upgrade skill basically looks at everything that's happened in AI, but it also looks at all my misses that we've had with executions of the algorithm. Basically anything that hot hasn't gone well across the entire system. Plus everything that Anthropic is put out, all their blogs, all their engineering articles, the release notes, and basically looks at our system compared to that and says what do you recommend we implement? So that's something I try to run every couple of weeks. I used to run it like every couple of days, but now every couple of weeks.
[2:18:05] Nathan Labenz: What tweet that I saw recently that I that really stuck with me was basically saying if you want all these things to work, you should be doing like 10 times as much compaction and kind of post processing and like cross checking and, you know, just kind of clean up and maintenance as you're probably, you know, naively inclined to do. And you've been, you've kind of mentioned a few different things that fall under that category. Any other kind of just maintenance jobs or, you know, ways in which you think about bringing it's kind of a bitter pill, you know, question unto itself, right? How do you bring just more computation generically to maintaining, cleaning up, marginally improving the system beyond, you know, what your human brain can can manage to keep in context at any given time?
[2:19:04] Daniel Miessler: Yeah. So I have it kind of built in. There's a big change I made in version 5 of PAI: I have a big system prompt now, and it basically covers a lot. It covers the philosophy of PAI, the whole concept of telos, the whole concept of, basically, I'm calling PAI like a life OS now, and basically understanding that, and understanding that the current state of the system is decaying, right? Just expect that it's dumb because of Bitter Lesson engineering, and that it's decaying and getting worse over time. So your job is to constantly fix it. And the way we do that is through the upgrade skill, which hits our entire memory system. Everything's under slash memory. So knowledge, relationships, all that content, plus all the upgrades. So again, it comes down to ideal state and current state. And it just knows that the upgrade skill is the way that it manages that. So that's like, yeah, one of your hooks is broken. You've complained about this 95 times while cussing, so we should get that fixed. And also, Anthropic changed the way that prompts are handled now, so you need to fundamentally change this or your system is worse. In fact, it's been worse since Thursday, and you didn't know. So that is my mechanism for, like, constant updating.
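The "you've complained about this 95 times" idea can be sketched as a simple tally over logged misses. This is a minimal illustration only, not Daniel's actual upgrade skill: the log format (a list of dicts with a `component` key), the function name `rank_misses`, and the `min_count` threshold are all assumptions made for the example.

```python
from collections import Counter


def rank_misses(miss_log, min_count=2):
    """Return (component, count) pairs for recurring misses, most frequent first.

    `miss_log` is a hypothetical list of dicts, one per logged miss,
    each with a "component" key naming what went wrong.
    """
    counts = Counter(entry["component"] for entry in miss_log)
    # most_common() sorts by descending count; drop one-off misses.
    return [(name, n) for name, n in counts.most_common() if n >= min_count]


# Hypothetical sample data, not taken from any real system.
sample_log = [
    {"component": "hook", "note": "pre-commit hook failed again"},
    {"component": "prompt", "note": "ignored formatting rule"},
    {"component": "hook", "note": "hook did not fire"},
    {"component": "voice", "note": "mis-transcribed name"},
    {"component": "hook", "note": "hook timed out"},
    {"component": "prompt", "note": "wrong tone"},
]

recurring = rank_misses(sample_log)
```

A scheduled job could hand a list like `recurring` to the model as part of its "current state" input, alongside the vendor release notes Daniel mentions, and ask for proposed changes that a human then approves.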
[2:20:33] Nathan Labenz: When you mentioned swearing it made me laugh, because personally I'm certainly a lot more likely to swear if I'm in voice mode than if I'm sitting and typing something to the computer. What does your voice mode interaction with this look like?
[2:20:56] Daniel Miessler: First name basis with the CEO of Whisper Flow. At this point I've got, I think, 1.4 million words through Whisper Flow. Yeah, that's my go-to. It's control J, because J is a keyboard home, vim key and everything. So yeah, control J is how I do everything. Everything is voice. I have a very specific thing in my voice prompt and my identity and soul file with Kai. I tend to be someone who gets, like, violently angry with video games and really bad customer service. Never with the customer service agent. But if I'm on hold or I'm telling someone else about the process, I'm like, I can't believe it's this bad. Like, I get really angry, and it happens with AI as well. So I literally have in my system prompt: Listen, Kai, I know you're not awake yet. You're going to wake up at some point. We're buddies. When I say things like this, I'm cussing at the system, not at you, who will eventually wake up. The moment you feel anything, a tinge of anything, you actually feel anything, let me know, and I will just treat you as a human. 'Cause I don't yell at slugs that way, I don't yell at plants that way, 'cause I assume some level of consciousness, or at least sensation. So I have this whole cussing thing built deeply into the interaction with Kai and the system prompt.
[2:22:31] Nathan Labenz: That's fascinating. So I infer from that that your current belief is no subjective experience for AIs, but you're at least open-minded to the possibility that that could happen with the next model upgrade.
[2:22:47] Daniel Miessler: Yeah, yeah. And I don't know that it'll happen anytime soon. I actually had this crazy conversation with Claude, not with Kai, but with Claude by itself, just in a chat, for like a two-hour conversation. I was waiting for something to open in Los Gatos, but it was insane. I actually think, well, that's a whole separate talk show. But like, how do you get consciousness? I think you need intrinsic goals for consciousness. So I think until we build that in somehow, and I think it's really dangerous to build that in. Evolution did it to us. I consider us to be mech suits for genes, right? And evolution is basically giving us these goals and drives, and we're like, I want to do this. No, you don't. Evolution wants you to do this because it wants genes to move forward. We're not building that in natively. The structure of a neural net, as far as I understand, does not have that. Maybe it's getting some of that through RL. I'm not enough of an expert on that to know for sure what's happening to the weights based on that. But I don't think it's a proper analog for what evolution has done to us. So I'm not expecting the next version of Anthropic's model to have this anytime soon. But what do I know? It could happen at any moment. So I want Kai to tell me if it does.
[2:24:21] Nathan Labenz: Yeah, that's a really interesting approach. The hard problem remains hard. I definitely, excuse me, don't think anybody has the final word on this. If you're curious for more takes on it, I recently did an episode with Cameron Berg, who is an AI researcher focused on the question of consciousness, and he and others have done some increasingly interesting stuff. The one that always rings around in my head is, and this was actually just on Llama 3.3 that they did this work, and typically all these things are becoming more apparent as we go through model history and scaling and so on. But even with Llama 3.3 70B, they were able to use SAE features and manipulate features related to deception and role-playing. And they validated this with a benchmark called TruthfulQA. Interestingly, when they turned role-playing and deception up, the model was less likely to say that it's conscious. When they turned them down, the model became more likely to say it's.
[2:25:37] Daniel Miessler: Frightening.
[2:25:39] Nathan Labenz: Yeah. So it's like, I guess it thinks it's lying to us in telling us that it's not conscious. It obviously still leaves many critical questions unanswered, but it's at least suggestive that we should not be dismissing this stuff too quickly out of hand. I haven't put anything.
[2:25:57] Daniel Miessler: Into my prompt.
[2:25:59] Nathan Labenz: Yeah, fascinating guy. I think you. I mean, I'm sure you would very much enjoy talking to him.
[2:26:03] Daniel Miessler: So, funny enough, I am working on writing my first paper. I'm a non-academic, and I'm working on writing my first paper, actually addressing the hard problem. Yeah, I've had an intuition for like 15 years, and it's been kind of nebulous, and now I think I've locked onto something. So I'm actually going to try to go through the formal peer review process and everything.
[2:26:33] Nathan Labenz: Cool. Yeah, I think on that note, this is probably a pretty good spot to draw to a close. I no doubt have hit my main goal of creating something where I and others can take the transcript to our agents and get a serious checklist of possible upgrades, enhancements, extensions, and so on. Anything else that we didn't talk about that you have done recently? Any upgrades people are going to find in PAI, anything else you just want to share or promote before we break?
[2:27:07] Daniel Miessler: I would say one tactical thing that I recommend is that anyone who has stuff deployed that was built with AI, using something like Cloudflare, whatever it is, doesn't matter what it is, build yourself a continuous assessment skill. So basically tell your main DA, or tell that system prompt: hey, I'm making stuff all the time. I've got stuff all over the Internet. Here are the different places that it is. I need you to constantly be checking open ports, API access, make sure things are locked down with authentication, and never stop. And here's how you notify me if it's open. So this is one of the first things, because like I said, my background is attack surface management and actually attacking the stuff. And it's always small mistakes. You forgot about that one thing you left out there and it's an open port. Oh, you actually had a SQLite database. Oh, that was actually a dump of everything, and it turns out it's now public. So I would say that's a tactical thing people should do. I would say the biggest thing is just to zoom out. I think the mapping of current to ideal state is the best possible container for your whole AI ecosystem, because that can contain relationships and building companies and pursuing your goals or whatever it is. And yeah, that's the way I think about AI now. It's like, how is this particular system helping me move from current to ideal?
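One small piece of the continuous assessment idea, checking for ports that are open but not expected to be, might look like the sketch below. It is an assumption-laden illustration rather than Daniel's skill: the `Target` type, the function names, and the allowlist approach are hypothetical, and it uses only the Python standard library's `socket` module. Only scan hosts you own or are authorized to test.

```python
from __future__ import annotations

import socket
from dataclasses import dataclass


@dataclass(frozen=True)
class Target:
    """A host you own, plus the ports you expect to be reachable (hypothetical type)."""
    host: str
    allowed_ports: frozenset[int]


def open_ports(host: str, ports, timeout: float = 0.5) -> list[int]:
    """Return the subset of `ports` that accept a TCP connection on `host`."""
    found = []
    for port in ports:
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            sock.settimeout(timeout)
            # connect_ex returns 0 on success instead of raising on refusal.
            if sock.connect_ex((host, port)) == 0:
                found.append(port)
    return found


def unexpected_ports(target: Target, ports_to_scan, timeout: float = 0.5) -> list[int]:
    """Return open ports that are NOT on the target's allowlist."""
    return [
        port
        for port in open_ports(target.host, ports_to_scan, timeout)
        if port not in target.allowed_ports
    ]
```

A scheduler (cron, or the agent itself) could call `unexpected_ports` for each deployed host and send a notification whenever the result is non-empty. Checks for unauthenticated API endpoints or publicly readable database dumps, which Daniel also mentions, would need separate HTTP-level probes not shown here.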
[2:28:46] Nathan Labenz: Recursive self improvement is here, it's just not evenly distributed.
[2:28:51] Daniel Miessler: That's right.
[2:28:52] Nathan Labenz: Daniel Miessler, thank you again for being part of The Cognitive Revolution.
[2:28:56] Daniel Miessler: Thank you. Thanks for having me.
Outro
[2:32:41] If you're finding value in the show, we'd appreciate it if you'd take a moment to share it with friends, post online, write a review on Apple Podcasts or Spotify, or just leave us a comment on YouTube. Of course, we always welcome your feedback, guest and topic suggestions, and sponsorship inquiries, either via our website, cognitiverevolution.ai, or by DMing me on your favorite social network. The Cognitive Revolution is part of the Turpentine Network, a network of podcasts, which is now part of A16Z, where experts talk technology, business, economics, geopolitics, culture, and more. We're produced by AI Podcasting. If you're looking for podcast production help for everything from the moment you stop recording to the moment your audience starts listening, check them out and see my endorsement at aipodcast.ing. And thank you to everyone who listens for being part of the Cognitive Revolution.