A 5-minute primer on what agentic networks are, why they matter, how they work — and the algorithmic questions my group works on.
The next wave of intelligent systems will not be a single, monolithic model behind an API. It will be many AI agents — LLMs, planners, tool-users, sensors, controllers — talking to each other to act in the world. This page is a quick tour of where the field is heading, and where my recent research fits in.
An agentic network is a system of AI agents that communicate, coordinate, and learn together to accomplish tasks that no single agent can solve alone. Each agent gathers its own observations, acts in its own slice of the world, and exchanges messages with peers to share what it has learned. The network is the unit of intelligence — not any individual agent.
The basic primitive is not “make a prediction” but “share what you learned, decide what to do next.” Almost every interesting question about agentic networks reduces to: how should agents exchange information so the group learns and acts as efficiently as possible — under bandwidth, privacy, and heterogeneity constraints?
Three flagship application areas drive the field — and they map onto three big technological frontiers.
A single agent learns from a single stream of data. A population of agents can pool experience and converge dramatically faster — the AI analogue of how distributed training scales by interconnect, not by model size. This is the most plausible path to learning agents that adapt to new environments in minutes rather than days.
Real deployments mix agents with different observations, action sets, and rewards: a recommendation system with users who like different things; a sensor network whose nodes see overlapping but distinct slices of the environment; a team of LLM agents specialized for different tools. Coordinating heterogeneous agents is fundamentally harder than coordinating identical ones — and it is the regime that matters in practice.
As LLM agents proliferate, which response should be served, and which agent’s feedback should be trusted? Agentic-network methods let a population of LLMs jointly evaluate, rank, and align their outputs to a particular user, turning what would otherwise be a fragmented model zoo into a coherent assistant.
The headline obstacle is the cost of communication: naive “tell everyone everything every round” scales quadratically in agents and saturates any realistic link. Worse, agents see partial, noisy feedback — and even what they observe is often correlated with the messages they receive, breaking the independence assumptions that make single-agent learning tractable.
The fix is a small set of design primitives that recur across cooperative learning:
Stitching these primitives together yields cooperative algorithms whose per-agent regret matches single-agent optimal, while total communication grows only logarithmically (or better) with the horizon — the substrate of a scalable agentic network.
The vision above hides a tower of unsolved algorithmic questions. Agents have bounded bandwidth. They disagree on what they see. They run on different clocks. Some are adversarial. Some are LLMs whose feedback is human and noisy. My recent work tackles four of these head-on.
When can a group of cooperating learners match the regret of a single learner with all the data — without flooding the network? We design protocols whose communication cost grows much slower than the regret it saves, and analyze the fundamental tradeoff between the two.
Real teams are not identical. Agents may have different reward distributions, different observation models, or only partial knowledge of each other’s existence. We characterize when heterogeneity helps (exploration for free) and design algorithms that exploit it.
In large agentic systems, multiple agents often want to use the same resource (an arm, a server, a tool). We design algorithms for shareable arms and resource-aware cooperation, where the right unit of optimization is the resource, not the agent.
When the agents are LLMs and the feedback is human, the problem becomes: which response, from which model, should the user see? We cast this as an online learning problem over a conversational multi-agent system that jointly evaluates and selects user-aligned responses.
Together, these threads contribute pieces of an algorithmic foundation for networked AI agents: how to learn, communicate, and align in a world where intelligence is collective and feedback is noisy. If any of this excites you, come build it with me.