Scaling Intelligence Out: Cisco's Vision for the Internet of Cognition, with Vijoy Pandey
Vijoy Pandey of Outshift by Cisco outlines his vision for an Internet of Cognition built from networked AI agents, explaining protocol-driven architectures, enterprise controls, and real-world multi-agent systems like Cisco's CAIPE, AGNTCY, and a healthcare demo.
Watch Episode Here
Listen to Episode Here
Show Notes
Vijoy Pandey of Outshift by Cisco lays out his vision for an “Internet of Cognition,” where AI agents can share context, build reputation, and collaborate safely at scale. He offers a useful mental model for superintelligence: progress has to scale in two directions — up, through better individual models, and out, through networks of agents and humans thinking together. The conversation explores how distributed, protocol-driven agent systems could give enterprises fine-grained permissions, auditability, and controlled interfaces, in contrast to today’s centralized frontier models. Vijoy also walks through Cisco’s internal CAPE system of 20 cooperating agents, the open-source AGNTCY project, and a live multi-agent healthcare demo spanning diagnostics, insurance, pharmacy, and scheduling.
LINKS:
- AGNTCY Project
- Open source multi-agent infrastructure under Linux Foundation governance. Covers discovery, identity, communication, observability. Vijoy walks through the architecture at [00:34:57] and [00:41:17].
- Scaling Out Superintelligence Whitepaper
- The technical whitepaper detailing the Internet of Cognition architecture, three-layer stack, and cognition state protocols. Referenced at [01:25:40].
- Internet of Cognition Interactive Demo
- Clickable walkthrough showing per-agent activity, intent, context, and collective reasoning across a multi-agent SRE system. Vijoy demos at [01:26:20].
- CAIPE Project (GitHub)
- Community AI Platform Engineer. Multi-agent system with participation from Adobe, AWS, Cisco, Nike. 20 agents, 100+ tool calls, 10+ workflows. Referenced at [00:11:52].
Sponsors:
Tasklet:
Build your own Cognitive Revolution monitoring agent in one click.
Try it for free and use code COGREV for 50% off your first month at https://tasklet.ai
VCX:
VCX, by Fundrise, is the public ticker for private tech, giving everyday investors access to high-growth private companies in AI, space, defense tech, and more. Learn how to invest at https://getvcx.com
Claude:
Claude is the AI collaborator that understands your entire workflow, from drafting and research to coding and complex problem-solving. Start tackling bigger problems with Claude and unlock Claude Pro’s full capabilities at https://claude.ai/tcr
CHAPTERS:
(00:00) About the Episode
(04:16) Cisco and networking foundations
(13:34) Jarvis and ASI vision (Part 1)
(18:16) Sponsors: Tasklet | VCX
(21:09) Jarvis and ASI vision (Part 2)
(31:46) Sponsor: Claude
(33:59) Jarvis and ASI vision (Part 3)
(34:00) Practical multi-agent examples
(50:02) Multi-agent plumbing architecture
(01:01:44) Agent identity and TBAC
(01:15:23) Internet of cognition fabric
(01:21:48) Emergent agents and safety
(01:36:52) Outro
PRODUCED BY:
SOCIAL LINKS:
Website: https://www.cognitiverevolution.ai
Twitter (Podcast): https://x.com/cogrev_podcast
Twitter (Nathan): https://x.com/labenz
LinkedIn: https://linkedin.com/in/nathanlabenz/
Youtube: https://youtube.com/@CognitiveRevolutionPodcast
Spotify: https://open.spotify.com/show/6yHyok3M3BjqzR0VB5MSyk
Transcript
This transcript is automatically generated; we strive for accuracy, but errors in wording or speaker identification may occur. Please verify key details when needed.
Introduction
Hello, and welcome back to the Cognitive Revolution!
Today my guest is Vijoy Pandey, SVP and GM of Outshift by Cisco.
For more than 40 years, from helping to define early low-level protocols that are still in use today to building out the infrastructure that powers modern high-speed networks, Cisco has been critical to how we manage the flow of digital information. Today, Vijoy and the team at Outshift are bringing Cisco's distributed-systems DNA to the fundamentally new challenges presented by frontier agentic AI systems.
The AI-powered preparation that Vijoy and I did for this episode demonstrates why this is so important. I used Tasklet to conduct deep research on Vijoy's work and draft a starter set of questions, and at the same time his team ran a deep research process on me and the podcast, and identified a number of suggested discussion topics based on themes we've previously explored. Both agents did a really good job on their respective assignments, but they knew nothing about one another and they had no opportunity to collaborate. Their output was sent from human to human, by email, and it was up to me to figure out how to synthesize their work.
What's missing, Vijoy says, is "The Internet of Cognition" – higher-order protocols and infrastructure that AI agents need to share context, understand one another's intent, build reputation and establish trust, and ultimately solve problems in shared spaces.
The upside of filling this gap, I'm convinced, will be world-changing and perhaps even world-saving.
It was, of course, the emergence of language and the evolution of culture that allowed humans to sustain cooperation over long distances and time horizons and ultimately build the global civilization we enjoy today.
And the distributed nature of this system makes it extremely difficult and rare for any individual to accumulate a systemically dangerous amount of power.
In contrast, the current AI paradigm emphasizes scaling things up, with more and more resources creating ever-more powerful frontier models, each of which is meant to do everything on its own.
Such concentration of capabilities into just a few systems, and the concentration of power it could easily bring about, has always struck me as dangerous, so I think it's very exciting to see a major company developing an alternative paradigm that's meant to scale intelligence out horizontally, in a way that is fundamentally distributed and designed to support permissionless participation from the start, and which could give rise to a more buffered, ecological, and stable network-based architecture for AI.
Importantly, Vijoy argues that this paradigm gives enterprises what they really want – a way to grant agents only the minimum permissions truly needed to perform their roles, a clean separation of concerns, visibility and auditability of their systems, and controlled interfaces through which to interact with the outside world.
Of course, this conversation goes well beyond the theory and deep into the progress that Cisco and its partners are making in practice.
Internally at Cisco, they've built a system they call the Community AI Platform Engineer, or CAIPE, which is composed of 20 distinct agents that collectively manage complex cloud computing environments. This system has reduced load on site reliability engineers and improved response times for end users by fully automating 40% of tasks.
And for the public, they've taken the lead on the AGNTCY project – which, you may recall from past sponsorship, is spelled A-G-N-T-C-Y – which is laying an open-source foundation for how AI agents, representing different interests, can connect, communicate, and meaningfully collaborate.
At one point in the conversation, Vijoy fires up a demo which shows how 4 agents, each representing different organizations and specializing in distinct skills, can collaborate to serve a patient in a healthcare setting that spans diagnostics, insurance, pharmacy, and scheduling. He narrates the demo pretty effectively, but I think it would be worth flipping over to YouTube to see that bit in action if you can.
With that, I hope you enjoy this window into some of the most sophisticated systems thinking about the giga-agent AI future that I've found anywhere, with Vijoy Pandey, of Outshift by Cisco.
Main Episode
Nathan Labenz: Vijoy Pandey, SVP and GM of Outshift by Cisco. Welcome to the Cognitive Revolution.
Vijoy Pandey: I'm so excited to be here, Nathan.
Nathan Labenz: Me too. Lot to learn. I've been studying up on your work and there's many facets to it. I would love to start, if you would indulge me for a second, with just a super high level view. I think everybody in America knows the brand Cisco. But probably even a lot of people who are very into the AI world at this point and know a ton about the intricacies of post-training and building all these agent workflows. If you pressed them and said, what does Cisco do? What I've come to in terms of a three-word answer is moving information around the world. And that, as we say, has many different ways to unpack it. But I'd love to just get your introduction to the company and the fundamental role that it plays in our modern technology life.
Vijoy Pandey: You're right. I mean, it's moving information around the world. It's connecting people to machines, to objects. It's about secure connectivity. It's about observable connectivity. It's about collaboration. So if you think about Cisco's four core pillars in its businesses, it's networking, it's security, it's observability, and it's collaboration. Those are the four business units, so to speak, that we go after. But if I were to take a step back, because this is an ML/AI audience, the way I think about Cisco in that context is that we are a distributed systems company. We enable scale out, we enable horizontal scale. If you think about servers and making them bigger and bigger, that's one way of scaling; that's scaling up. Cisco is a company that allows you to take many of these entities, connect them through a network, and enable a cluster of compute. So you're enabling distributed computing, you're enabling scale-out technologies, and you end up doing both scaling up and scaling out to get the compute that you need for all of the awesome workloads we're trying to run today.
Nathan Labenz: I'm just reflecting on how much we need a sort of robust foundational layer for decentralization. So just the last week of news in AI is like frontier companies battling against governments. And then I think like, what was the original dream of the internet? It was sort of this decentralized thing that people could kind of plug into on their own terms and contribute to. And I worry that we are losing that a little bit. And I think part of what, at least the great hope that I see in some of the work that you've done is to help enable that kind of distributed participatory future. So let's take one more beat on just kind of fundamentals, though, because one of the big things that you are proposing is an extension to the conceptual framework for networking that people have developed over, you know, the entire history of information technology. Again, I suspect most people don't know the seven layer network model. Could you kind of just walk us through, you know, what is the 101 framework that guides, you know, people in the networking space?
Vijoy Pandey: So there is, like you said, this OSI seven-layer model for networking, and it's somewhat of a formal, theoretical model. People like to stick to the model, but more often than not, people find ways around it. Like with anything formal, you make things work; you don't necessarily stick to what's out there as a formal stack. But if you were to think about that formal stack, the seven layers are: physical, link, network, transport, session, presentation, and application. I'll walk through them and give you a sense of where things are pretty awesome right now, and of what you really need to care about, the buzzwords you've probably heard in the literature. Ethernet is probably something most people are familiar with. That's a technology that connects computers in a local area environment, or local area network, and Ethernet operates at the physical as well as the link layer, primarily at the link layer. TCP/IP, which is probably the most famous protocol out there, on which the entire internet runs, covers transport and network. Those are layer four and layer three. Then, as you move higher up the stack, the other protocols people are most familiar with are HTTP and HTTPS, which are layer seven, or application layer, protocols. So typically, if you ask somebody what they're familiar with in the networking stack, they'll think about HTTP and its secure equivalent, HTTPS; TCP/IP, on which the entire internet is built; and Ethernet, which connects local computers together in the network. Those are the things people are familiar with. The rest of the layers are there for formalism, I guess, but not really as important.
And I know if you're a geek in networking, you might find that statement a little bit off-putting, but this is the reality of things.
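For readers who want the layer numbers at a glance, the model Vijoy walks through can be summarized in a small sketch. The mapping below covers only the protocols named in the conversation (a given protocol family can span layers, as he notes for Ethernet), so treat it as an illustrative mnemonic rather than a complete taxonomy:

```python
# Illustrative sketch of the OSI seven-layer model, with the example
# protocols mentioned above mapped to their primary layers.
OSI_LAYERS = {
    7: ("Application", ["HTTP", "HTTPS"]),
    6: ("Presentation", []),
    5: ("Session", []),
    4: ("Transport", ["TCP"]),
    3: ("Network", ["IP"]),
    2: ("Data link", ["Ethernet"]),  # Ethernet also touches layer 1
    1: ("Physical", []),
}

def layer_for(protocol: str) -> int:
    """Return the primary OSI layer number for a known protocol."""
    for num, (_, protocols) in OSI_LAYERS.items():
        if protocol in protocols:
            return num
    raise KeyError(protocol)

for proto in ("Ethernet", "TCP", "HTTP"):
    num = layer_for(proto)
    print(f"{proto}: layer {num} ({OSI_LAYERS[num][0]})")
```

This matches the walkthrough above: Ethernet primarily at layer 2, TCP/IP at layers 4 and 3, and HTTP/HTTPS at layer 7.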
Nathan Labenz: Yeah, shoot us straight, always, please. So a big question I also had about the world we enjoy today is: how much of the underlying protocols, and the way that today's major networks are managed, runs on explicit code with rules that we fully understand, that somebody sat down and designed and implemented? Versus, how much has machine learning already penetrated its way down the stack, to be managing the way that data actually flows, and, when issues are happening, how they're detected, remediated, routed around, and so on? I know people have, I'm sure, gotten many PhDs in this area, but I realize I don't know what that boundary looks like between what has been designed and what has been learned through the emergent process.
Vijoy Pandey: So let's take a look at that from a pre-LLM era a little bit, even straight-up machine learning and those pipelines that have existed for a while now. The actual hardware, and actually sending data across the network, is pretty deterministic in nature today. If you think about routing and the routing tables that exist within these large switches and routers, that's a very deterministic process. I mean, you don't want to take chances on figuring out where things are sent. You want pretty good determinism in the connection between you and me, Nathan, for example. I don't want that left to chance. But there is a control plane that sits above all of this, and this is where the algorithms sit. I don't know if the listenership is familiar with BGP and some of these routing protocols that run the internet, but that's a control plane piece of software that figures out who can talk to whom, and not just in the regular day-to-day traffic sense, but also when issues happen and outages happen. What happens when you need to route around outages? If an undersea cable has been cut, what happens when you have to route around that, for example? So these are control plane entities. They also deal with things like policies between organizations. If a Google and an Amazon want to connect to each other as two separate entities, there are strict policies on how they exchange information, how they exchange routing tables and connect to each other, as an example. So policies, security, the actual routing control plane, and application software: those are all pieces of software, and we've been using ML in those pieces of software for quite some time. Simple examples: anomaly detection is a common one, where we've used ML pipelines for a long, long time; traffic prediction, because you can foresee events like the Super Bowl and figure out what needs to be done for an event like that.
So we've used ML there. We've used ML to predict failures. There are certain subsea cables, certain choke points in the internet, where you need to route around failures and be careful around those failure points, so we've looked at ML for those things. Around the business, we've also used ML quite a bit within a company like Cisco, to do things like sentiment analysis around customers. And there are some amazing uses in, like I said, one of our pillars, collaboration, which is Webex. Webex uses ML pipelines to do things like noise reduction. So we've been using ML in a company like Cisco for a long, long time. But I think, like with everybody else, with the advent of generative AI and the practicality of some of these LLMs coming into the enterprise, a lot of that is now changing.
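To make the anomaly-detection example concrete, here is a deliberately minimal sketch of the idea: flag traffic samples that deviate sharply from the recent baseline. Real control-plane pipelines of the kind Vijoy describes use far richer models and features; the z-score rule, the threshold, and the sample readings here are all hypothetical illustrations, not Cisco's implementation:

```python
from statistics import mean, stdev

def traffic_anomalies(samples, threshold=2.0):
    """Return indices of samples whose z-score exceeds the threshold.

    A toy stand-in for control-plane anomaly detection: compute the mean
    and standard deviation of the window, then flag outliers. (With a
    single large spike in a small window, achievable z-scores are modest,
    hence the conservative default threshold.)
    """
    mu, sigma = mean(samples), stdev(samples)
    if sigma == 0:
        return []  # perfectly flat traffic: nothing to flag
    return [i for i, x in enumerate(samples) if abs(x - mu) / sigma > threshold]

# Hypothetical link-utilization readings (Gbps): steady, with one spike.
readings = [10.1, 9.8, 10.3, 10.0, 9.9, 48.7, 10.2, 10.0]
print(traffic_anomalies(readings))  # flags index 5, the spike
```

In production, the same shape of logic would run continuously over telemetry streams and feed remediation or rerouting decisions in the control plane, rather than a one-shot list comprehension.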
Nathan Labenz: Yeah, maybe you can tell us more about how it's changing. I was interested to read about Project Jarvis, which is obviously one higher-order additional layer that has been placed on top. I'm also kind of curious: are the LLMs reaching down into the stack, or is it purely a layering on? This will get back to the two new layers that you're proposing for the network model as well, but I was challenged to think, in the highest-level terms, about what it is that we've accomplished recently with language models. It's not just that they've learned to do narrow tasks, right? It's this sort of general-purpose semantic, and now even agentic, capability that has been layered on top. And so you're building at that layer too. Tell us about Jarvis, and other places where language models are starting to change how Cisco operates.
Vijoy Pandey: So Jarvis was actually one of the first use cases that we deployed, at least within Outshift in Cisco. And then we've also taken that piece of code and worked with other BUs within the company, like Splunk and Webex and some of these other teams. The whole notion behind Jarvis was: if you're an SRE, you know this, there are so many tasks that you do day in, day out, to support your developer base and your customer base, that are, A, repetitive in nature, and, B, can be highly automated through generative AI and agents. The way to think about this is: take everything that is happening in the software development environment through coding agents, and apply that paradigm to site reliability engineering. That's what Jarvis does. It takes the SRE pipeline and brings in the same agentification, the same pattern of agents working together to solve a problem, that you would have in a coding environment for developing code. Because the paradigms are quite similar. I mean, the whole notion behind SREs is to leverage software development to solve for infrastructure and operational needs. So there's a lot of commonality in there, but there are also a lot of special cases that we need to handle. So Jarvis was built, and by the way, it's now called CAIPE, the Community AI Platform Engineer. C-A-I-P-E has got a nice logo with a superhero with a cape behind their back. CAIPE is actually a multi-agent system, a MAS, that allows for the automation and agentification of the entire SRE pipeline. What we've done is we've had five-plus user interfaces that feed into CAIPE. It's doing a hundred-plus tool calls across cloud providers, across cloud native environments on-prem, across various aspects of that cloud native environment, everything from observability to orchestration, to networking, to security. And it's actually tackling more than 10 workflows today.
And it's roughly 20 agents that are working together. So that MAS consists of around 20 agents. The outcomes have been pretty amazing: we've reduced the load on the team by 30%, and 40% of the tasks that the team handles have actually been fully automated, so we don't even worry about them; they're completely taken care of end to end. The response time, because we've done all of this, has gone from hours to near-instantaneous. So it's efficiency, as well as just morale in the team, because you're not dealing with issues on a day-to-day basis, and the developer community that is using CAIPE is also pretty productive now. So it's efficiency for the SRE team, and it's also efficiency for the developer community as well. And so we started here, like I said, and we've rolled it out to other parts of Cisco, but we've also made this open source through the Cloud Native Operational Excellence community, CNOE, pronounced "canoe." It's got members from Adobe. I mean, these names are fascinating, Nathan. It's got members from Adobe, AWS, of course Cisco, Nike, a whole bunch of enterprise companies participate. They've been toying around with us, they've been growing this, and there's a decent community gathering around this CAIPE project.
Nathan Labenz: Can you give us a little bit better sense of what the frontier looks like today? I mean, it's a little hard, obviously, because it's such a moment-in-time thing, and from GPT 5.3 to 5.4, it's going to move, I'm sure. But what would you say are the upper-end things that the multi-agent system can handle today? And what are the sorts of things that it can't, where we need people to bring their expertise? And how far does this go? Is there a vision for the extreme? I mean, we're talking in software, right? Obviously the productivity is changing pretty fast. Do we need 10 times as much software? Do we need 100 times as much software? What's the future of this market? Again, I kind of lack intuition here: if this were to become a hundred times more efficient, do we do a hundred times more network management, or does the role change, or do some roles go away? What does even just next year, if you can think that far ahead, look like as models get better and get plugged into the frameworks that you've built?
Vijoy Pandey: The one thing that interests me quite a bit is, first and foremost, the definition of what we're trying to achieve as an industry. So whether we're looking at it from the AGI or the ASI perspective, there is a definition of what we're all going after. And the definition that interests me the most is having a team of agents collaborating together to solve for something that is net new, that is completely novel, that has not been in their training data at all, in any of those models' or agents' training data, and doing it without any human intervention, 100% of the time. So I'm mixing aspects of the economic and the technical definitions, but we've seen variations of this come from many, many researchers in this field, and that's the one that I align behind. That's the North Star that we all want to go after. And I can make a prediction, and there are others who've been making predictions here; the timeline shifts forward, comes back, it's been a back and forth on how and when we can achieve ASI. But to me, the big thing that matters is that right now, the entire industry has been chasing one vector to go towards this goal of ASI, and that is vertical scaling. We are building bigger and bigger models, we are building better and better reasoners. We are throwing data, compute, resources at the problem, more parameters. That's one axis, and that'll continue to happen. But we haven't really tapped into the second axis, which is the horizontal axis. And this is where a company like Cisco has a play. This is where I get interested, because I'm a distributed systems person. What I mean by that horizontal axis is: can we scale intelligence horizontally? Can we enable collective intelligence, where the collective is always greater than the individual? Right now we have not tapped into that piece yet.
And the big reason for that is we've managed to scale these individual brains, quote unquote, and they become smarter and smarter, but we haven't figured out how they can think together. So how we can bring them together so that they can have shared intent, shared cognition, and then innovate collectively together to solve for this new set of problems without human intervention, which is the definition of ASI. So to me, if we want to pull in those timelines, we have to tap into the horizontal axis of scaling intelligence. And that's something that we would like to bring to the table and we want to push towards, because it's completely missing today.
Nathan Labenz: Yeah, I think that's really fascinating. I was just reading Ajeya Cotra's latest blog post earlier today; she's now at METR. And I'm sure everybody's familiar, of course, at this point with the METR exponential graph. But she was saying it might be time to rethink the metric. It has been, until now: how long would it take a human to do this task? And she was pointing out that in the range that we've been measuring, from zero up to, whatever, Opus 4.6, where it was like 16 hours or whatever the best estimate, it's not too different to have one person do the work versus try to divvy it up and have multiple people do the work, because there's some fixed coordination cost. Certainly, if you think down to the limit of a one-minute task, you can't really parallelize that eight ways, more often than not. But as you get past 16 hours, and you get to maybe a couple weeks' worth of work, now it is something where humans can parallelize and take it down in a better way. She was emphasizing that maybe we need to measure AIs by how long it would take a well-organized team of humans to do the thing. But you're bringing the other side of that equation as well, because sort of assumed here is that it's still one AI. With the latest models, certainly Claude is doing this, and Kimi 2.5 made a big point about how they're spawning sub-agents and kind of swarming themselves. But I think you have a much grander vision for that, which goes beyond one AI self-delegating, which still feels kind of brittle, and subject to correlated failures and, who knows, potentially what other kinds of resonant weirdnesses might emerge. And I kind of hear you saying that, for humans, it's diversity, it's across culture, it's across time.
It's all these sorts of additional, richer dimensions of collaboration beyond just narrowly cloning yourself and delegating a subtask. So maybe there's the formalism: you're going to add two layers to the network stack, and you can touch on those. But then I really want to get into your vision for what AI culture looks like. What does AI cultural evolution look like? What is this world that we're going to step into? And then, of course, how are we going to make it work as well? Take me to the vision part of AI collaboration.
Vijoy Pandey: To get to the vision, let's take a look at human history and human evolution, and how intelligence evolved in humans as well. Because the one thing that we do really, really well in AI is use the human as the bar: we try to see where we are with artificial intelligence compared to a singular human or teams of humans. That's always been the yardstick, though we might be surpassing that yardstick pretty soon. But if you think about human intelligence evolution, humans became smarter and smarter. They became very conversant with tool usage. They became very conversant with symbolic communication for a long, long time, and this was happening for hundreds of thousands of years. You could use an ax, and you could use an agricultural tool, and you could paint pictures, and you could raise flags, and you could communicate in those ways. But the big paradigm shift, the step function, happened when language got invented. And that happened around 70,000 years ago. There's a ton of literature on this, where you can see that the invention of language was a step function change in intelligence evolution within humans and human societies. Because suddenly you could collaborate on tasks, you could align on intent. So instead of all of us just running to capture that hill, it's: let's figure out a strategy to go ahead and capture that hill. And that delegation, taking a larger task and breaking it down into smaller tasks, giving those to various members of the team based on their expertise, aligning on that intent, coordinating between the members of the team, and then actually executing on that task and solving for a net new problem, which is, we've never seen that hill before, let's figure out how to capture it: that's what unlocked the next revolution in human intelligence. And what we are seeing is that this exact trajectory is playing out in silicon.
As we see this build-out of smarter and smarter brains and bigger and bigger models, agents are getting better and better. Yes, you are getting these sub-agents in Claude and in OpenAI's GPT and in Gemini, but that's still like sub-processes within my brain; it's not actually getting out of that. We are not looking at teams of agents that can come together and collaborate. So what we need to enable, just based on that paradigm, is, to your point: look at longer-duration tasks, tasks that take humans more than 16 hours, a few days; look at the specialization that happens within human teams; and then figure out how agents can mimic those behaviors. The vision is that specialized agents, subject matter experts, and we see that already in smaller environments, can come together, share intent, coordinate, negotiate, then work on shared knowledge and shared context, almost like institutional knowledge, and then innovate on a new problem statement. That's the way to go. And that's what we're calling the Internet of Cognition, because it is going to be distributed across a bunch of agents who, by definition, will come from different vendors. They're all subject matter experts, so they will not come from the same vendor, and they will all need to come together, collaborate, and solve for this net new problem space. So that Internet of Cognition is the vision, and that is the horizontal scaling we all need to go towards to enable the second axis of scale on intelligence.
Nathan Labenz: So can we make this really practical for a moment? Let's look at the interaction that we've had leading up to this conversation. I use an AI agent to help me prepare for every episode of the podcast. I specifically use Tasklet, which basically channels Claude and gives it a really robust cloud-based framework to work in. And it does a pretty good job of going out and researching everything about you, finding all these publications, bringing me back a good source list. Then it also crosses that against my past work, all the previous outlines of questions that I've put together, and it comes up with a decent draft for me. I definitely do still have to spend time on that, for the record, and usually end up writing my own, because I want to be able to be present in the conversation, and just having had an AI do it doesn't really give me that ability. But that's what happened on my side. Usually the guests don't really do anything like that, but you, or somebody on your team, did a similar thing: took all your work and crossed it against my recent record of episodes. And there was another thing that got sent over that was like, okay, here are all the themes that we know you're interested in, and how Vijoy's work relates to those. What didn't happen at all was that the AIs had any back and forth, or any sort of coordination. It was just two ships passing in the night. Now I have these two side-by-side documents, both useful, but no meeting of the AI minds. That's obviously a pretty basic case, and we can build up from there, but how would our agents in the future come together, and what should we expect in terms of additional value from that kind of interaction?
Vijoy Pandey: This is an excellent use case, and it's a pretty straightforward one. Like you said, it's simple, so it'll be easy to understand what we're thinking about in terms of shared intent, shared context, and collective innovation. So your agent, Nathan, has got a local optimization function running. There's a goal there which says, let's produce the best podcast ever in your series, in the entirety of what you've done with Cognitive Revolution, and let's figure out what that entails. And the history, or the context, that your agent has is all of your guests and what they've spoken about and what might be net new and interesting for the audience, based on what that agent is seeing, which is your history. There's a similar agent, like you said, on our end, which is looking at all the speaking engagements I've done or people from Outshift have done, and what's interesting to folks that have listened to Outshift people or me speak. So there's a context there which is somewhat different. It's almost like a Venn diagram: one is very Outshift-centric and what works there, one is very Cognitive Revolution-centric and what works there. But both of us have a similar goal, which is, can we get together and between the two of us create this podcast, which is probably going to be the most interesting podcast out there. And as you come together and make that happen, both you and my team, Rebecca on my side, we've been working through some concessions: yeah, that might be a little bit too marketing-oriented, Vijoy, let's get a little real and grounded and more technology-oriented. And so we've been looking at how to come to a common goal so that this conversation resonates with the audience that you have. But that work is being done by humans today. It's not hands-off. It's not automated. There's no agentification at all.
What we're trying to get to is the agentification of this entire pipeline: let's throw your Cognitive Revolution agent and the Outshift agent together, let them converge on a common intent, the best episode possible, with some concessions on both sides, so that we narrow down the questions and topics we can talk about. That's something that can happen hands-off between the two agents. And once you're done with that, and this podcast is out there and you're getting listenership, now you have common context that both of them can leverage for the next guest. Maybe I'm there the next time around and you can leverage that context; you don't have to redo these things. Or maybe I'm going on a different podcast, or maybe somebody else from Outshift is coming to your podcast. So you have common context that you can leverage over time that is not lost, and you don't have to restart that process. It's something really simple, but it also tells you that even in such a simple example, the humans are doing the intent alignment. The humans are doing the coordination and negotiation. The humans are building out context using Google Docs or SharePoint or whatever it is. It is a human glue that is enabling collective intelligence. How do we codify that? How do we make that automated? What is the infrastructure that we need to build so that this human glue becomes software glue? That's what we're trying to do through the Internet of Cognition.
Nathan Labenz: So let's get into the weeds on that, because I think it is really interesting to imagine, and then I, of course, have a ton of questions around, very practically, how should we think about it working? Should we exchange agent IDs, and then I give my agent your agent ID? Should they be out there potentially discovering each other? Is there some sort of way to have your menu, your roster perhaps, of agents present in a way that's attached to your identity? I have a lot of questions about identity just unto itself with agents. So I'm very interested in, in the first place, how do these things discover each other, and know, when they do communicate, that they are talking to who they think they're talking to? And then, previewing your additional layers for the network stack: layer eight is syntactic, layer nine is semantic, the cognition layer. And we have big questions there, too. I mean, these questions are also very operative in human affairs all the time, right? Are we talking about the same thing? Are we thinking about it the same way? Are we miscommunicating? Obviously, human life is full of miscommunications, minor and major, but it seems like right now the AIs are going to drift off into weird spaces if we just let them sort it out fully. So we're obviously going to need some guardrails and some way to bring them back to true north, at least for now. So yeah, give me the double click on the details of how you see all this working.
Vijoy Pandey: There are many, many layers involved in what you just asked, so to walk through each one of them, let me start with an example. It's a pretty abstract picture, so for everybody who's listening, I'm going to describe it as much as I can, but if you can go back and look at the diagram that I'm showing, it makes even more sense. So let me share this picture. This is a multi-agent system that we built for a healthcare provider, and the goal is pretty straightforward: take patient calls. People are calling in to this healthcare provider, and they need to get routed to the appropriate subject matter expert, the provider themselves, a doctor, let's say, based on the patient's history, based on the availability of the doctor, and based on the insurance profile of both the doctor and what exists in this hospital. So what you see here are four agents, and for everybody listening in, I'm just going to describe them. There is the scheduling agent on the bottom right, which is the one that is actually interacting with the patient. This is where it's like a chatbot: you're talking to the chatbot, and it's also doing scheduling at the back end, so this is the one that's actually going to go and schedule you some time with the doctor. And then there are three other agents. There is an insurance agent, coming in from the payer's perspective; it's not from the hospital, it's a completely third-party entity. There's a diagnostics agent. This is also a third-party agent, not belonging to the hospital.
And then there's a pharmacy agent, which is also a third-party agent, not belonging to the hospital. So the only piece that belongs to the hospital is the scheduling and conversational chatbot piece. The other three agents are third-party agents. And the task is, again, pretty straightforward: get me to the right provider for the symptoms I have, based on my insurance history, my pharmacy record, and so on. Now, if you think about those other three agents, insurance, diagnostics, pharmacy, first and foremost, all of these agents are independent. They could not talk to each other. So, first of all, they had to be discovered, connected, given the right identity and access, and stitched together into a multi-agent system so that we could even start doing things like this. That's step one.
Vijoy Pandey: And so to do that, we launched this whole notion of the internet of agents almost a year ago, and we launched an open source project called AGNTCY. That's spelled A-G-N-T-C-Y. And again, if you're watching this on YouTube, you can go to agntcy.org; that's the landing page for the open source collective. It's part of the Linux Foundation as well. But coming back to this example, what AGNTCY allows you to do is, first, discover these agents, so all four can get discovered. It then allows you to provide appropriate identity and access management attributes to each one of them, and we can spend some time on that, but that's a complicated topic. And once you do that, they can actually start communicating with each other. So traffic starts flowing between all of these agents; they get connected, they're communicating. There are protocols like MCP and A2A, which allow agents to talk to tools through MCP, or allow agents to talk to each other through A2A. So communication happens. And finally, there's an observability pillar which says: great, these things are working, but are they actually delivering what they're supposed to deliver? So there's a lot of observability from the agentic, multi-agent system perspective, and a little bit of evaluation from that perspective as well. We have not solved the evaluation stack completely, just touched it a little bit. So that's what AGNTCY does. It is the basic plumbing required to even bring these four agents together in an enterprise so that they can start doing what we're trying to do here. This is what we deployed, this is what's working today. But what's not happening today is that all four of these agents are isolated agents. They can talk to each other, they can get connected, but the payload is opaque. The payload is just a blob.
It's a binary object. You don't know what's contained within those payloads. And so all you're doing is enabling these things to talk to each other, enabling access to tools, but the coordination, the alignment, the shared memory, the shared context, everything that you see on the right-hand side, which is grayed out, is not happening. That is happening, again, through a human in the loop. The human is the coordinator. Just like in the previous example we talked about, the human is playing the role of enabling intent. And if you think about what's happening here, the scheduling agent has time-to-route KPIs, the diagnostics agent is being measured on outcome confidence, the insurance agent is being measured on ROI. So every agent has local optimizations to solve for, but somebody has to step in and say: you know what, each one of these agents has to give a little to get a little. The scheduling agent needs to relax its time-to-route KPIs, the insurance agent needs to relax its ROI KPIs, so that we can get to the proper global outcome for the patient. And so there's intent, there's coordination, there's knowledge and context that needs to be shared between all of these agents, because all of these agents have a mix of patient information as well as broad information across all patients.
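The "give a little to get a little" coordination Vijoy describes, where each agent relaxes its local KPI just enough to hit a global outcome, can be sketched as a toy proportional-concession scheme. Everything here (agent names, the minutes-based KPI, the escalation rule) is illustrative, not CAPE's or AGNTCY's actual logic:

```python
from dataclasses import dataclass

@dataclass
class Agent:
    name: str
    preferred: float  # minutes the agent's local KPI would like for its step
    minimum: float    # hardest value it can concede down to

def coordinate(agents: list[Agent], deadline: float) -> dict[str, float]:
    """Relax each agent's local target in proportion to its headroom
    until the end-to-end plan fits the shared (global) deadline."""
    total = sum(a.preferred for a in agents)
    if total <= deadline:
        # no concessions needed; everyone keeps their local optimum
        return {a.name: a.preferred for a in agents}
    headroom = {a.name: a.preferred - a.minimum for a in agents}
    need = total - deadline
    if need > sum(headroom.values()):
        # even maximal concessions cannot meet the goal: escalate
        raise ValueError("no feasible global plan; escalate to a human")
    scale = need / sum(headroom.values())
    return {a.name: a.preferred - headroom[a.name] * scale for a in agents}
```

Each agent concedes in proportion to how much slack it has, and when even maximal concessions cannot meet the deadline the system escalates rather than silently failing, which is roughly where the human in the loop still sits today.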
Nathan Labenz: A couple of little questions that I'll float to help prompt you. Today, on discoverability, I had this vision of the human being back to being the old switchboard operator. We're doing now what we long ago did for phone calls, literally plugging things in to make the connections, except for our various AI systems. The number of times in a week now that I go fetch an API key, or some equivalent of that. You can see here it's going to be: okay, where do I go get that pharmacy agent ID? Who's broadcasting that anywhere? I'm going to have to go figure that out on my own. And then I'm interested, too, in what the patterns should be for who sets this up. I guess one way to think about it is: who owns the customer relationship? In this case, if I'm calling the hospital, you framed it from the hospital's perspective and had these other things as third parties, in a similar way that if I call the hospital, they might call my insurance for me and do some background double check. But if that's right, does that mean that the shared context lives forever with the same entity that owns the customer relationship? Is that a pattern we should expect, or will these shared context threads also be jointly owned, or shared in such a way that any of these individual entities and their agents could come back and access them later? Again, we have some patterns like this: if I do a Slack Connect, Slack creates this kind of shared space that both my company and the other company that connected can go back and access. But I'm not sure if we should expect the same. How skeuomorphic should this be?
That is always a really interesting question in AI, and I don't have a great intuition for it in this connectivity and coordination space.
Vijoy Pandey: Yeah, so let's take it one at a time, and let's just start with the basic connectivity pieces. The four pillars of basic plumbing for multi-agent systems are discovery, identity and access, communication, and observability. If you get those four down, then you can create a multi-agent system like the one shown in green, or teal, here, and get these agents to at least start talking to each other, even though a lot of the things on the right, in gray, are not possible today. But what's possible today is the stuff in the center: they can all talk to each other. So if you think about what we did with AGNTCY, and I'm just going to flip over to this window here, and for folks who are just listening, if you go to agntcy.org, A-G-N-T-C-Y dot org, you'll see those four pillars, and you'll see the architecture at the bottom. To your question of how these things get discovered: there's a human switchboard, primarily for consumer use cases; you and I do this all the time. We figure out which agents make sense, we look at their reputation, we look at their evals, we look at their pedigree, and we say, great, I'm going to use these. Sometimes we even depend on the LLMs to generate agents on the fly, and we're like, yeah, it's good enough, because it's a consumer use case. But enterprises don't work that way. Enterprises have long procurement cycles. They have trust and safety in the equation, and they have customer trust and responsible AI in the equation. So there's a lot of trust and safety and rigor that goes into how enterprises procure software, whether it's agentic or not. And so the way we've tackled this is that each of those four pillars has an entity in the software and infrastructure stack providing APIs for it.
So there's an agent directory that does the job of discovery. You can define agents, you can define tools, you can define multi-agent systems, because in the end it's turtles all the way down: an agent could be an agent built of many agents. So you can define an inordinate amount of hierarchy within an agent, an agent within an agent within an agent. You can define tool access, you can define software and data source access; all of that can be defined within the directory, and it's searchable based on capability. So it's like the DNS equivalent for agents. Today, when you want to access a website or a remote API, you have a URI, and that gets translated into an IP address, which then flows over the network and you get routed towards it. Here it's not as straightforward. You don't have a fixed URI or a fixed website. You can search on capability, you can search on reputation. So this is a little bit more involved.
Vijoy Pandey: So that's the directory. And what comes back may not be a service endpoint, like a remote API endpoint. It could also be a Git code tree. Those agents could be stood up as a service, or they could exist in a Git code tree because they might be local to you, or you might be pulling them from a Git code tree and deploying them as code within your environment. So it's a pretty powerful piece of software that allows you to search across billions, even trillions, of entries. It's built on decentralization principles; there's a DHT that supports it in the backend, and it allows you to discover agents, bring them into your environment, and then connect them. So that's step one. Step two is giving them the right access and identity. And this itself is a whole different ballgame, because agents, as we all know, have human-like characteristics, but they are operating at machine speed and scale. They have agency, they have decision-making powers, they have semantic communication, which is not deterministic, but they are operating at software speed and scale. It's a blend of human-like and machine-like capabilities. So you cannot rely on the old role-based access control mechanisms that have been in existence for a while; you need to rethink access control. And so we've brought in the whole notion of tool- and task-based access control, and that's what the identity piece does in the stack. So: you have discovered the agents, you've given them identity and access, and then you go towards communication. And the communication is of two types. One is talking to existing deterministic infrastructure, which is tools and data sources, and MCP, the Model Context Protocol from Anthropic, is used for that kind of communication.
And then A2A from Google is the agent-to-agent protocol that allows agents to communicate with each other. We are foundational members of both of those, both A2A and MCP, and they fit nicely in this architecture. And then finally, there's the whole observability piece. How do you observe not just the containers, the cloud native infrastructure or the bare metal infrastructure where these agents reside, but also the agentic parts of the agents? Because your service might be up, but the agent might be misbehaving. So how do you observe and evaluate the agent that sits within these infrastructure components? We worked together with Microsoft to push a whole bunch of extensions into OpenTelemetry, which is the de facto observability stack, to make observability for agents happen. So that, to me, is the bare plumbing that you need for this picture to even get connected and for things to just start working. I'll pause here to see if you have questions, because we have not even tackled the gray parts. We've just brought the agents together, and the human is still the context setter, the intent coordinator, and all of that, in the middle.
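The capability-based lookup behind the "DNS for agents" analogy can be sketched as below. The record fields (capabilities, reputation, an endpoint or a Git repo) mirror the discussion, but the schema and the scoring are hypothetical, not the AGNTCY directory's actual format:

```python
from dataclasses import dataclass

@dataclass
class AgentRecord:
    name: str
    capabilities: set            # what the agent can do, e.g. {"pharmacy"}
    reputation: float            # 0..1, e.g. derived from evals
    endpoint: str = ""           # remote API endpoint, if hosted as a service
    repo: str = ""               # or a Git tree to deploy as code locally

def discover(directory: list[AgentRecord], wanted: set,
             min_reputation: float = 0.0) -> list[AgentRecord]:
    """Unlike DNS, the query is a set of required capabilities, not a name;
    matches come back ranked by reputation."""
    matches = [r for r in directory
               if wanted <= r.capabilities and r.reputation >= min_reputation]
    return sorted(matches, key=lambda r: r.reputation, reverse=True)
```

The caller then decides whether the top result is something to call remotely (`endpoint`) or pull and run locally (`repo`), matching the two delivery modes Vijoy describes.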
Nathan Labenz: A couple of questions that are really top of mind. You mentioned the decentralized directory, and I'm interested to understand that a little bit better. Obviously, one of the mega trends on the internet has been toward walled gardens and platforms controlling the directory in many major spaces where people spend a lot of their time. It seems like that is something that, say, OpenAI might be trying to do again. And I don't mean that in an overly pejorative way, but they're trying to create a curated app store, like Apple and so many other platforms before them. They're going to have reviews, and they're going to be the owner of the reviews on the plugin or app store they develop. Then they're also going to have advertising, which is going to bring another whole layer of commerce to all that. What does the alternate vision of a more decentralized, open, permissionless version of that look like? And is there still some sort of centralized DNS equivalent in that vision? I don't know if it goes as far as, well, we do have blockchain things that don't even require that. So how decentralized, ultimately, do you think this can and will be?
Vijoy Pandey: So the two things that you touched upon: identity, and a lot flows from identity, and then discoverability, which is the DNS point we just made. Those two are the source of everything else that follows, and you touched upon both of those problems pretty concisely. If you control the directory piece, which is how I discover agents and how I even get to the best agent for my task, and if you control the identity, then you control the reputation, you control the security, you control how much you can charge for it and the commerce behind it. Everything flows from those two pieces. And so we took a lot of care to ensure that in this architecture, those two critical pieces, identity and the directory, were built on DHTs, distributed hash tables. Nobody owns the directory. You can participate in a decentralized directory infrastructure. Similarly for identity: we do plug into the well-known identity providers, IDPs like Okta and Duo and others that exist out there, but we also enable a decentralized identity provider that somebody can deploy and participate in. So we are offering the best of both worlds when it comes to directory and identity, because our vision is for this to be truly open and interoperable. And to the point you made, you can only be truly open and interoperable if discovery and identity are also decentralized and no singular entity owns either of those pieces. So we've taken great care to make sure that both are available as options. Now, enterprises might decide to start with a Microsoft Active Directory or an Okta, because that's what they're familiar with. But soon they'll realize that, yes, role-based is great, but they need something like task-based or tool-based access, so they might move towards decentralization. We're not forcing one or the other.
But principally, we are aligned to what we just said, that we want this to be decentralized in nature.
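The decentralized-directory idea, where no single party owns the key-to-record mapping, can be illustrated with the minimal consistent-hash ring that DHTs are built on. This is a teaching toy, far simpler than a production DHT (no replication, no churn handling), and the node names are invented:

```python
import hashlib
from bisect import bisect

def key_hash(key: str) -> int:
    # stable 32-bit position on the ring for any string key
    return int(hashlib.sha256(key.encode()).hexdigest(), 16) % 2**32

class ToyDHT:
    """Minimal consistent-hash ring: every participant owns a slice of the
    directory keyspace, so no single entity controls the whole directory."""
    def __init__(self, nodes: list[str]):
        self.ring = sorted((key_hash(n), n) for n in nodes)

    def owner(self, key: str) -> str:
        # the first node clockwise from the key's position stores the record
        positions = [p for p, _ in self.ring]
        idx = bisect(positions, key_hash(key)) % len(self.ring)
        return self.ring[idx][1]
```

Looking up `owner("pharmacy-agent")` routes deterministically to whichever participant owns that slice, and a node joining only remaps the keys in its own slice; that is what lets participants join and serve the directory without a central registrar.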
Nathan Labenz: Yeah, that's great. I think that is a really important contribution. When it comes to identity, I would love to get a little bit more of your vision for what sorts of identity are going to be supported. I mean, this has been another major point of contention in the digital sphere broadly: should we have anonymous accounts? What sort of clarity should we have on who's ultimately responsible for this account, or this agent, as the case may be? Now we've got people doing experiments in fully autonomous AI systems, which, if you give something a credit card and it can pay for its own server time, puts us right on that fuzzy border, I think, where some of these things might actually be able to persist for a while, and potentially not solely due to memecoin scams, which obviously won't be, I don't think, a long-term strategy for them. There are so many trade-offs. It's such a vexing thing, and when I try to think about it, I quickly get overwhelmed. So help me be a little less overwhelmed. It might even start with a taxonomy, but I would love to understand your vision for identity and responsibility, accountability, as we go to the giga-agent future, I think incredibly quickly.
Vijoy Pandey: Identity is actually the biggest hurdle that we are facing today when it comes to deploying full autonomy, or at least as much autonomy as possible with agents, in the enterprise at least. Because what's happening right now is, like we mentioned earlier, agents are like humans because they communicate semantically. There's a lot of natural language flowing between agents and agents, and also between agents and humans. So if you look at a multi-agent human team, a lot of the communication is happening semantically, and that's ambiguous, that's non-deterministic. There's a lot of decision-making authority that you're giving to agents, and a lot of tool-calling authority as well. So for enterprises especially, and even for consumers, when an agent is actually using a credit card, I would be pretty careful. You have to be really sure what access control you are giving an agent. And what we've seen all along is that identity systems and access management have been built mostly for humans. It's called role-based access control, RBAC. It's so well known that when you think about access and identity, you think of RBAC. Role-based access control is a very human-oriented view of identity and access. Over time, we realized that there are pieces of software and machines that do not totally align with RBAC, so we created attribute-based access control, ABAC. But it's very similar, because machines don't change their personas if they're not agentic. They're running the same piece of deterministic code day in, day out. It's long-lived; you know what it is. So it's almost like RBAC applied to machines. Those are the kinds of identity mechanisms that have existed in enterprises so far.
Vijoy Pandey: The problem with agents is that they are like humans, and they're shifting around. They might be Vijoy today and Nathan in the next hour, especially in larger teams and enterprises. So we had to go back to basics and look at what it is that we are really giving access for. When somebody gives Vijoy access within Cisco, I have access to data, I have access to systems that I can read or write, given where I am sitting in the organization and the kinds of skills and roles that I perform. In the end, it all boils down to: what are the tools that I'm accessing, what are the tasks that I'm trying to do, and what are the transactions that I'm participating in? If you take it down to that level of granularity, and back to basics, we said agents should get tool-, task-, and transaction-based access control, where by default you assume the lowest level of access for an agent. Based on the tool or the task it's trying to execute, you may then elevate privileges just for that task, or just for that tool access, or just for participating in that transaction. Let that happen, if they are allowed to do it, then bring them back to base-level access. And you keep doing this in a very ephemeral manner; you don't give agents longevity when it comes to access control. So this whole notion of TBAC, tool-, task-, and transaction-based access control, is something we proposed with AGNTCY, and it is seeing a lot of traction and a lot of interest, because it goes back to the basics of what we are all trying to do and what it is you should be given access for. In fact, there's one thing that is pretty interesting that happens when you start looking at this notion of task- or tool-based access control. To the earlier question that you asked, let's go back to the networking stack.
So far, we've been dealing with deterministic pieces of software, and so all the access control mechanisms that existed out there, even for machines, were built for deterministic software. With agents coming in, we are going towards non-determinism, towards semantic communication. So when we need to give an agent access for a particular task, the question is: what is that task, and how do you figure out what task the agent is trying to execute? The way you do that is by tapping into the semantic communication taking place between a human and an agent, or between two agents, and figuring out from that semantic communication the task they're trying to execute, the transaction they're participating in, or the tool call they're trying to make, and then giving them access for that. So right away, to enable TBAC, you need to be able to tap into layer nine, the semantic communication layer, to really figure out how to give this agent the right access control.
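The TBAC pattern Vijoy outlines, default to least privilege, elevate only for the duration of a named task, then drop back, might look roughly like this. The scope names and the task policy table are invented for illustration; a real system would derive the task from the semantic traffic rather than take it as a parameter:

```python
from contextlib import contextmanager

BASE = {"read:own-profile"}  # least-privilege default for every agent

# hypothetical policy: which extra scopes a named task may elevate to
TASK_POLICY = {
    "schedule-appointment": {"read:calendar", "write:calendar"},
    "verify-coverage": {"read:insurance-record"},
}

class AgentSession:
    def __init__(self, agent_id: str):
        self.agent_id = agent_id
        self.scopes = set(BASE)

    @contextmanager
    def elevate_for(self, task: str):
        """Grant extra scopes only for the duration of one task, then drop
        back to base level: access is ephemeral, never long-lived."""
        extra = TASK_POLICY.get(task)
        if extra is None:
            raise PermissionError(f"task {task!r} not permitted")
        self.scopes |= extra
        try:
            yield self.scopes
        finally:
            self.scopes -= extra  # always revoked, even on error
```

The key property is that elevation is scoped to the `with` block: once the task ends, even via an exception, the agent is back at its base access.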
Nathan Labenz: So, speaking of turtles all the way down, this also screams LLMs all the way down. I like the idea a lot of minimum permissions, and, what is it, the principle of least privilege, I guess, is the fancy way of saying that.
Vijoy Pandey: Or zero trust.
Nathan Labenz: But, and I was just going to go to trust. I guess there's kind of... reputation, there's trust in the first case, and then there's also: can you cache this sort of stuff? One big thing that we have as humans, to our advantage in our ability to coordinate, is that while we do as individuals change, evolve, drift over time, that's a relatively slow process, and people can generally expect one another to be similar tomorrow to how they are today, if not exactly the same. With agents, I'm like, okay, let's say I want to do some delegation, and maybe I want to engage with somebody else's agents and give them some access to some information. I'm getting a picture where, first, I have my little guardian angel agent that tries to watch out for my interests and determine what I'm willing to give. Part of how it's going to determine that, presumably, is some sort of reputation or identity on the agent on the other side. But that still strikes me as kind of fuzzy. Okay, sure, but what exactly am I giving access to? And do I know that it's the same tomorrow as today? And how does it prove that I can trust it at all, right? Even just inside an enterprise, you could imagine that one department is trying to make its thing do its local optimization a touch better, and they swap out the model. Okay, great, maybe it tests better on your benchmarks, but over here, where I'm trying to make an independent decision on how much to trust this, can I assume that my last designation was good? Maybe I ran it through a battery of tests last time and it passed, but under the hood you swap out one thing, even just a system prompt change, and that could make a pretty dramatic difference in how it's going to behave next time. So I guess if I had to boil all this down, it's: how do we make...
the trust designations in the first place? And then how do we have durable identity so I know that I'm going to get kind of consistent behavior from my counterparties over time?
Vijoy Pandey: Right. And I think you're getting to the crux of how to make this non-deterministic system a little bit more deterministic, because in the end, you need to be able to reason about things, and you need to be able to persist certain kinds of state; not all kinds of state, but certain states. And the way we are thinking about this is, again, going back to the OSI stack. So far, in all of the layers that exist in the OSI stack, when two endpoints are communicating with each other, the endpoints connected through a network are deterministic endpoints, and you're exchanging basically deterministic state, data, between all the endpoints in that network. What we're seeing now is that the world is moving towards non-deterministic, probabilistic endpoints; the endpoints are intelligence endpoints. So instead of exchanging data, now I'm exchanging cognition state, because I'm a cognition entity as a human, I might have five agents that are also cognition entities, and you might have five more. We are exchanging cognition state, and the OSI stack does not have the right layers to even support that communication. And why do I need that? Because I need a little bit of determinism when all of these agents talk to each other. I need to understand whether I'm in the intent alignment phase, the discovery phase, the coordination phase, whether I'm negotiating. There are all of these phases that you go through when you align on intent and work on a common goal, and we need to figure out where you are. We also need to figure out whether you're trying to take an action during an execution phase. All of these meta keywords that need to take shape in agent-to-agent communication are very free-flowing today.
I mean, you've seen some of that happen in the OpenClaw Moltbook example that we saw a couple of weeks ago, where you had a bunch of OpenClaw agents come together and they're just talking NLP, of course prompted by a bunch of humans in between, but they're talking NLP, they're not building consistent state, and they are all over the map. So if you want to bring this into an enterprise and actually work towards convergence, towards emergent behavior that is constructive and not just divergent in nature, you need to put some structure around it.
Vijoy Pandey: And the way you put structure around it is by making sure that the communication between these agents is also a little structured, compared to the any-and-all-NLP Wild West we have today. So the layers that we are bringing to the table are L8, which is syntactic communication, and L9, which is semantic communication, or cognition state protocols. Think of the syntactic layer as grammar. I built an agent on Vertex; somebody else built one on LangGraph or Bedrock. How do I make sure that these two agents can talk to each other? Their grammars look different, their frameworks look different, and they may not be sending their blobs and payloads in the same formats. So that's the syntactic layer, and this is where MCP and A2A and the entire architecture that I showed in AGNTCY play. Above this, we bring in the cognition layer, or the semantic layer. That is layer nine, and layer nine is where we look deep inside that packet, deep inside that communication flow, and actually extract the meaning, figure out what it is that you're truly trying to do, and put some structure behind that meaning. So instead of it just being plain, simple NLP, which it will be, can we wrap a header around it which says: this is what I'm trying to do. I'm trying to discover information. I'm trying to negotiate or coordinate, and this is which phase I'm in in that cycle. I'm trying to access a tool, or I'm trying to execute a command. There is a little bit of a meta header that needs to happen so that you can take action on it and build those trust and governance layers, without which even those pieces become a lot more problematic in nature. So that's the way we're thinking about this.
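To make that L9 idea a bit more concrete, here is a minimal sketch in Python of what a semantic-layer header wrapped around a free-form NLP payload might look like. The names (`Phase`, `CognitionHeader`, the specific fields) are illustrative assumptions for this discussion, not part of any published AGNTCY specification.

```python
import json
from dataclasses import dataclass, asdict
from enum import Enum

class Phase(str, Enum):
    """Illustrative phases of agent-to-agent collaboration."""
    DISCOVERY = "discovery"
    INTENT_ALIGNMENT = "intent_alignment"
    NEGOTIATION = "negotiation"
    COORDINATION = "coordination"
    EXECUTION = "execution"

@dataclass
class CognitionHeader:
    """A semantic-layer (L9) wrapper around a natural-language message."""
    sender: str
    phase: Phase
    action: str   # e.g. "query", "tool_call", "command"
    payload: str  # the free-form NLP message itself

    def to_wire(self) -> str:
        d = asdict(self)
        d["phase"] = self.phase.value  # serialize the enum as plain text
        return json.dumps(d)

msg = CognitionHeader(
    sender="agent-a",
    phase=Phase.NEGOTIATION,
    action="query",
    payload="Can you take the diagnostics step if I handle scheduling?",
)
wire = msg.to_wire()
decoded = json.loads(wire)
# A trust or governance engine can now act on the header fields
# without having to parse the natural-language body at all.
assert decoded["phase"] == "negotiation"
```

The point of the structure is exactly what Vijoy describes: a policy engine can deterministically check `phase` and `action` even though the `payload` stays probabilistic NLP.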
Now, of course, the lowest common denominator is, yes, it's all going to be NLP, in which case, if you're dropping in a guardian angel, or a cognition engine, as we call it, that is trying to contain blast radius, it will also have to be probabilistic in nature. You cannot guarantee 100%. You can guarantee maybe 95% or 99%, but you will not guarantee 100%. But if you wrap these with a header, in a proper protocol, then you can get to determinism on some of these things that are frankly requirements for enterprises to deploy multi-agent systems.
Nathan Labenz: Is this something that can exist on a spectrum as well, perhaps? When it comes to where this shared history lives: Cisco has WebEx, as you said, and that's one place where people come together. Those meetings can be recorded, and a lot of shared history and context can be established through that digital space for humans. One could imagine a Cisco product that gets deployed to enterprises that says, hey, this is your WebEx for agents, where they can come together, where they can create shared state, where you can have a history of this, and then you can invite guests in from other entities as you will. Does that exist as a distinct enterprise product, or can it also have a long, open, permissionless tail to it? I'm not even sure if that's a fully coherent question, but I am really interested, because it's not easy for an individual to interact with an enterprise, and both people and enterprises want to do that.
Vijoy Pandey: So let me take a crack at this and let's see where it goes. So the way we think about this whole Internet of Cognition architecture, and if you can see the slide here, again, I'll try to describe it as much as I can for people listening in. What you're seeing out here is that the entities in teal or green are the entities that make up the Internet of Cognition, and there are three layers that we're thinking about. The first is a protocol layer, which we just talked about, which is how I can enable semantic communication to take shape between intelligence endpoints, instead of data communication between deterministic endpoints, which is where we are today and where the entire networking stack exists. So we're connecting intelligence endpoints, with semantic communication happening between them. And what you see here in this picture is cognition state protocols. These are the classes of protocols that enable these kinds of communication, and we have three types, based on where we are. It could be as simple as natural language, which we are calling semantic state transfer protocols. It could get better than that, really good, by saying: let me just exchange the entirety of latent space between two models, two agents. So let's take the entire KV cache and send it across. That is excellent. It's like implanting a Neuralink chip between you and me, Nathan, so that we don't have to undertake the cost of tokenization and transmission and detokenization at the other end. But I might not allow a Neuralink implant, and you might not allow a Neuralink implant, which is the case today with model providers. So you can do this with open-weight models, but you probably can't do it with closed-weight models, at least not now, unless it becomes a standard.
But we have this thing called a latent space transfer protocol, which allows you to do this at really high efficiency. And there's something that sits in between, which is the compressed state transfer protocol, which works even when there's variance between agents, especially for deployment scenarios where not everything is within an enterprise or within a data center. So let's say you have edge deployments, like your MacBook or your phone or your wearable device, alongside something that exists within a data center. Can we send this information in a compressed way to that edge inference device? So we have three kinds of protocols, but they're all within that protocol layer.
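The three protocol classes described above could be thought of as a selection policy based on what the two endpoints can support. This is a toy sketch under assumptions of my own (the function name, the three boolean inputs, and the decision order are all illustrative, not drawn from the whitepaper):

```python
from enum import Enum

class TransferProtocol(Enum):
    SEMANTIC = "semantic_state_transfer"      # plain natural language
    COMPRESSED = "compressed_state_transfer"  # compact form for edge devices
    LATENT = "latent_space_transfer"          # raw KV-cache exchange

def pick_protocol(compatible_models: bool, open_weights: bool,
                  bandwidth_constrained: bool) -> TransferProtocol:
    """Toy policy for choosing how two agents exchange cognition state."""
    if compatible_models and open_weights:
        # A KV cache is only meaningful between compatible open-weight models,
        # which is the "Neuralink chip" case from the conversation.
        return TransferProtocol.LATENT
    if bandwidth_constrained:
        # Edge deployments (phone, wearable, laptop) get a compressed form.
        return TransferProtocol.COMPRESSED
    # Lowest common denominator: tokenize, transmit, detokenize.
    return TransferProtocol.SEMANTIC

choice = pick_protocol(compatible_models=False, open_weights=True,
                       bandwidth_constrained=True)
```

The ordering encodes the trade-off from the episode: latent transfer is most efficient but least interoperable, while plain NLP works everywhere at the highest cost.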
Vijoy Pandey: So that's how we talk to each other between intelligence endpoints. Then there is the cognition fabric. And the fabric, as the name implies, allows you to scale this out. It allows many, many agents to talk to many, many agents, so many-to-many communication, real-time in nature, at the semantic layer. But it also allows you to plug in the memory of your choice. So, getting to your question, we are not dictating what you plug in. You could plug in an open-source memory like Mem0 or something else, you can plug in BigQuery, you can plug in whatever your choice of memory might be. What we are suggesting, above that, is that you are storing different kinds of memories in the shared memory and shared context space. So you're storing ontologies, you're storing beliefs, you're storing working memory, all kinds of memory in that shared memory infrastructure, but we are not dictating what you plug in. And when you think about beliefs and context and knowledge graphs and ontologies, that is shared memory across all of the agents that participate in this cluster. And then finally, the one thing that you alluded to earlier, but we have not talked about, are these cognition engines. That's the third layer. The cognition engines are accelerators or guardrails. So they are either cognitive accelerators or they're guardian angels. This is the famous Raj Reddy quote, that you would want AI to be either a cognitive accelerator or a guardian angel. And so those are the two types of cognition engines that would exist. And they would, more often than not, be transparent to everything else that's happening. So while agents are talking to each other, these are almost like the note takers that sit within a collaboration environment like the one you and I are in: yeah, don't mind me, I'm here, I'm just taking notes.
So those are the kinds of engines that we're thinking about where they're summarizers or they're compliance engines or they're security engines that actually make sure that whatever happens as collective intelligence is possible and feasible within the guardrails of the enterprise that they're operating within.
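The two ideas in this layer, a pluggable memory backend and a transparent observer engine, can be sketched in a few lines of Python. Everything here is a hypothetical illustration: the interface names, the memory "kinds", and the in-memory backend stand in for whatever a real fabric would plug in (Mem0, BigQuery, or anything else).

```python
from typing import Optional, Protocol

class SharedMemory(Protocol):
    """Pluggable memory backend; the fabric does not dictate the choice."""
    def put(self, kind: str, key: str, value: str) -> None: ...
    def get(self, kind: str, key: str) -> Optional[str]: ...

class DictMemory:
    """Trivial in-process backend standing in for a real store."""
    def __init__(self) -> None:
        self._store: dict = {}

    def put(self, kind: str, key: str, value: str) -> None:
        # Kinds might be "ontology", "beliefs", "working_memory", etc.
        self._store[(kind, key)] = value

    def get(self, kind: str, key: str) -> Optional[str]:
        return self._store.get((kind, key))

class CognitionEngine:
    """A transparent 'note taker': observes agent traffic, never blocks it."""
    def __init__(self, memory: SharedMemory) -> None:
        self.memory = memory

    def observe(self, sender: str, message: str) -> None:
        # A real engine might summarize, check compliance, or flag divergence;
        # here it just records the message into shared working memory.
        self.memory.put("working_memory", sender, message)

fabric_memory = DictMemory()
engine = CognitionEngine(fabric_memory)
engine.observe("diagnostics-agent", "patient triaged; awaiting labs")
```

The structural-typing `Protocol` mirrors the "bring your own memory" point: any backend with `put` and `get` methods satisfies the interface without inheriting from it.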
Nathan Labenz: It is a fascinating challenge just to envision what this is all going to look like. One big question that folks in the AI safety space have wrestled with for a long time is: to what degree should we allow our AIs to communicate in languages that we don't understand? We've seen this with thinking, too. On the current margin of performance, there is explicit chain of thought that is very readable and hopefully faithful to what the AI is actually thinking and intending to do. We've seen some weirdness in some reports around chain-of-thought dialects emerging under intense RL pressure. I was pleased to learn from folks at OpenAI, both in a recent episode and in a paper they just put out, I think today, that this hasn't been as big of a problem as a few of those reports would have suggested: mostly the chain of thought remains pretty vanilla, pretty human-interpretable. But on the flip side, there was a paper out of Meta where thinking in latent space is starting to be unlocked. Rather than detokenize, you just pass your last high-dimensional state to the next embedding position, and you might be able to think faster, better, more in parallel. A really notable finding from that paper was that pathfinding algorithms worked better when the model was able to think in latent space; a certain superposition of reasoning is able to be unlocked. So it feels like that's a pretty natural attractor if we just say we want to make these things work as well as possible. But I'm a little concerned, honestly, that it could get away from us, especially when you combine it with extreme speed.
I don't know if you had a chance to use ChatJimmy.ai in the last couple of weeks, but this is the company that burned the actual architecture of a specific model into the silicon, and they're achieving 15,000 tokens per second. Even just a couple of queries on chatjimmy.ai, I recommend it. It's a pretty basic model today, so I'm probably not going to use it that much, but it's perspective-forming for sure to see what it looks like to get 2,000 tokens back in a sixth of a second. It's wild. So how do we keep our arms around this? Do you think we should set certain rules? Do you think we can use interpretability techniques? Are there sort of AI governor systems that you think we can get to the point of reliability where we can count on them? That's a topic that could go on for hours, but I am very interested in at least your first layer of thoughts on it.
Vijoy Pandey: As a belief, I believe that we need to go towards unlocking emergent behavior. I'm completely bought into the horizontal-scale aspect, where you need to bring in agents with different expertise, with different models. We've seen this pattern even in software systems in the past, where you would not want the same piece of code running everywhere in your environment, because then, for example, you will not discover the right zero-day bugs, or if there is a zero-day bug, your entire system will go down. So there is this best practice, even in deterministic software, that for redundancy and availability you want different pieces of code that are trying to achieve similar goals deployed in your infrastructure, because at least you're not hitting the same scalability challenges, the same security challenges. It's the diversity that helps you out. And that emergent behavior, even with deterministic software, is actually something that you should strive for in large-scale cloud systems. If you take that to these non-deterministic systems, and based on the analogy that we've talked about in the past, where you want to scale out intelligence horizontally, because that's how humans evolved, and that's how I believe intelligence in machines will evolve in addition to vertically, I'm of the strong belief that you need to enable these emergent systems to take place. And the more Brownian the motion, the better it is, because that's how we operate. The more chaos you can throw into the environment, the more innovation you get. Now, that's why the cognition engines exist. I know I cannot punch through this table, because the laws of physics prevent me from doing that. Within an enterprise, these cognition engines are those swim lanes, those guardrails, around what's allowable and what's possible.
So you need those to be really effective, and that's a challenge. It's going to take us a little bit of time to get those things right. But given that construct, I am of the belief that you need to enable emergent behavior. You need to enable teams of multi-agent systems communicating semantically, sharing cognition state, and solving for something net new, because that's how it's going to happen. And if you take that premise, why would you slow things down? I would actually want that multi-agent system to reach emergent consensus, hopefully faster than going through NLP and the cost of tokenization and interpretability and negotiation and all of that, which actually adds cost. I'm of the strong belief that, yes, agents are human-like, but they're also not human-like. A human cannot make an API call. A human doesn't have a KV cache that I can send across. I cannot ship my latent space to you. An agent can. So let's leverage the properties of agents that are not available to humans and make them better than humans. Now, there are pitfalls. You might get divergent properties and outcomes. You might get super-specialization, where they're all getting siloed and not talking to each other. So in addition to guardrailing compliance and security and identity and access, we need these cognition engines, and that's why they're not just guardrails: we are watching out for divergence, for extreme specialization, and all of these problems that might creep into multi-agent systems. And this is also an interesting area where we need to start modeling, and there are many, many papers here, we need to start modeling multi-agent systems using game-theoretic approaches, because we need to bring those notions into agentic computation as well.
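To ground the game-theoretic point, here is a minimal sketch of the kind of analysis involved: a toy two-agent coordination game and a brute-force check for Nash equilibria. The payoff numbers and action names are invented for illustration; real multi-agent modeling would use far richer games.

```python
import itertools

# Toy 2x2 game between two agents deciding whether to share cognition state.
# Both sharing gives the best joint outcome; mismatches waste one side's work.
PAYOFF = {
    ("share", "share"): (3, 3),
    ("share", "hoard"): (0, 2),
    ("hoard", "share"): (2, 0),
    ("hoard", "hoard"): (1, 1),
}
ACTIONS = ("share", "hoard")

def is_nash(a: str, b: str) -> bool:
    """True if neither agent can improve its payoff by deviating alone."""
    pa, pb = PAYOFF[(a, b)]
    if any(PAYOFF[(a2, b)][0] > pa for a2 in ACTIONS):
        return False  # agent A would rather switch
    if any(PAYOFF[(a, b2)][1] > pb for b2 in ACTIONS):
        return False  # agent B would rather switch
    return True

equilibria = [p for p in itertools.product(ACTIONS, ACTIONS) if is_nash(*p)]
```

Even this toy game shows why the analysis matters: both (share, share) and (hoard, hoard) are stable, so a multi-agent system can lock into the siloed, super-specialized equilibrium unless something, like a cognition engine, nudges it toward the better one.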
Nathan Labenz: Yeah, I couldn't agree more on the need for that. There is just an incredible reservoir of research on how people behave under myriad game-theoretic premises, and we're going to need to get really good at characterizing AIs along similar dimensions, especially if we're going to turn them loose at high speed and with latent-to-latent communication that we'll have a hard time making sense of, even post hoc, in a lot of scenarios. The ability to even characterize what agents are going to do under various scenarios right now remains really challenging.
Vijoy Pandey: We're getting a glimpse of that already, Nathan. Going back to the OpenClaw example: even though, like I said, it's just chatter right now, it's not building shared cognition, there's a markdown file, which we all know about, which is soul.md. Peter actually created soul.md to give his agent a personality that is close to his own. And in fact, I have not turned it on yet, but a bunch of my friends have turned on their own OpenClaw agents, and they all have their soul.md customized, so we get as many personalities as there are human individuals. And this is just the beginning. It might seem a little gimmicky, but if you play this out a year or two, you are entering a world where you'll have agents with expertise, with personality, just like humans, and they will be your teammates in a multi-agent human society or team. So how do you make sure that the infrastructure you build enables this multi-agent human team or society to collaborate, share intent, coordinate, and then be guardrailed in the proper ways that an enterprise would want? We're getting there. I don't think it's science fiction. You can see that path taking shape.
Nathan Labenz: Yeah, that might be a great place to leave it for today. I know that I've already kept you a little bit long. I feel like this is going to go pretty fast, but I would be very interested to touch base again in 6, 9, 12 months and see how this is coming together, and especially see how those guardrails are developing within the enterprise context. Because I see everything else really happening fast, and the one thing that I'm not so sure is going to be able to keep up is the layer that supervises it all and makes sure that we're actually getting what we want out of this whole AI transition, and, on the flip side, that the whole phenomenon doesn't run away from us. But I do like a lot about what you're saying, because one of the things I've learned in life, and I say this episode after episode, is that anything that becomes too concentrated can become dangerous. You see that in drugs; you see that in all sorts of phenomena. And I worry that too much concentrated intelligence is also going to be dangerous. So what's the alternative? It's got to be some sort of distributed, buffered, networked, more ecological, hopefully open, at least on some margin, and permissionless system. I was just texting a friend earlier today that we need a decentralized, permissionless coordination and reputation layer so badly. So I am really excited about the work that you're doing on it, and I'd love to come back again before too, too long to understand that next level up and how it's evolving.
Vijoy Pandey: Actually, on the two comments that you made there, let me just give a one-second overview. You talked about the models becoming smarter and the guardrails not keeping up, and you also talked about centralization from the capability perspective and the need for decentralization, or distribution per se. And I'll flip those two around and say that if you want traction in the enterprise, both of those statements actually have to be true. Enterprises will not deploy solutions where the guardrails are not well thought out and effective. And that, to me, is a hurdle for this runaway train, which is actually on a good path, that we all call generative AI and agentic AI. The hurdle for those things being adopted in the enterprise is getting the guardrails right, because if we don't get them right, we don't get the option. So it's a requirement; it's a chicken and egg. And the second thing is that in an enterprise, like we discussed earlier, you will not get a singular player coming in. Look at us or any other enterprise: we'll have agents from ServiceNow, from Salesforce, from Microsoft, from OpenAI, from Anthropic, from Google, from Cisco. We'll have agents from all of these companies in our environment by definition. And by definition, they'll have to come together, get connected, build shared cognition, align on intent, and work together, because you will not get a singular agent that does all of those functions, at least not in the next five years. So these two are requirements, not even nice-to-haves, for anything to get adopted in the enterprise.
Nathan Labenz: This has been fantastic. Is there anything else that we didn't touch on that you would want to make sure you leave people with before we break?
Vijoy Pandey: Yeah, so we've been talking about all of this in the abstract: the whole Internet of Cognition, the three layers of shared intent and coordination, shared context and knowledge, and then collective innovation, where we enable emergent behavior. Listeners can go to a link that I'll share. There's a white paper to read through, which talks about all of these things in more detail, but there's also a live demo. Going back to the Jarvis example that we talked about right up top, where we built this SRE multi-agent system that enables SRE teams to get more productive, we've actually taken it a step further and built a multi-agent system for SREs where an outage has happened, and they are all sharing cognition and state and doing collective intelligence and innovation with other agents from cloud providers, from the marketing team looking at brand reputation, and from finance, who's looking at costs. And I'm just going to quickly share that here. This is a clickable demo; you can go in and just walk through it. You can go to outshift.cisco.com and Internet of Cognition, and we'll have it in the show notes. You can walk through the example and see what happens per agent: the activity per agent, the intent per agent, the context per agent, and the reasoning that is happening collectively and in an emergent way across the entirety of the multi-agent system. I'm just showing it here very quickly, but you can click through it, and it'll make what we were just talking about in this episode a little bit more real.
Nathan Labenz: Love it. I always say a working demo is the coin of the realm, so I definitely encourage people to go check it out, and I'll click through it myself, no doubt. Okay, fantastic stuff. Vijoy Pandey, thank you for being part of the cognitive revolution.
Vijoy Pandey: Thank you, Nathan, for having me.