p6: Orchestrator and delegation

In p5 one agent called plain tools. Here the tools are other agents. A concierge orchestrator hands the weather question to a weather analyst and the flight question to a flight advisor, each a full agent with its own brief, then combines their answers into a plan.

In a hurry? These three steps are the whole challenge. Everything below is the why and the how.

  1. Run make p6 and watch for the single [delegate] line. Only the weather analyst is consulted, so any flight detail in the plan is guessed.
  2. Edit start/agent.py and do the TODO: write the consult_flights tool, decorated with @concierge.tool, that runs the flight_advisor sub-agent the way consult_weather runs the weather analyst.
  3. Done when two [delegate] lines run and the plan combines real weather and flight answers.

The trick is small. A delegation tool’s body runs another agent and returns its answer. The orchestrator cannot tell the difference between that and a plain function, which is what makes this compose. And because each sub-agent has its own instructions, the concierge stays focused on planning instead of trying to be a weather expert and a flight expert at once.

This is close to p3, where you split work across several agents too. The difference is who decides the split. In p3 you fixed the subtasks in code: always cost, weather, safety. Here the orchestrator decides at runtime which specialists to consult, based on the request. The subtasks are not hard-coded; the model chooses them.
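
The contrast between the two styles can be sketched without any agent library. Everything here is a hypothetical stand-in: the four coroutines replace real agents, and the keyword match in choose_specialists only imitates the choice that the orchestrator's model makes at runtime.

```python
import asyncio


# Hypothetical stand-ins: in the workshop each of these is a real agent.
async def cost(request: str) -> str:
    return f"cost notes for {request!r}"


async def weather(request: str) -> str:
    return f"weather notes for {request!r}"


async def safety(request: str) -> str:
    return f"safety notes for {request!r}"


async def flights(request: str) -> str:
    return f"flight notes for {request!r}"


async def fixed_fan_out(request: str) -> list[str]:
    """p3 style: the subtasks are fixed in code and always run."""
    return list(await asyncio.gather(cost(request), weather(request), safety(request)))


# Keyword -> specialist. The keyword match below is an illustration only;
# in p6 the model makes this choice, not a string test.
SPECIALISTS = {"weather": weather, "flight": flights}


def choose_specialists(request: str) -> list[str]:
    return [key for key in SPECIALISTS if key in request.lower()]


async def delegate(request: str) -> list[str]:
    """p6 style: only the specialists chosen for this request run."""
    return [await SPECIALISTS[key](request) for key in choose_specialists(request)]


if __name__ == "__main__":
    print(asyncio.run(fixed_fan_out("a weekend in Lisbon")))
    print(asyncio.run(delegate("What is the weather in Lisbon?")))
```

fixed_fan_out always produces three replies, whatever the request says. delegate produces one reply per specialist the request actually calls for, which may be none.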

orchestrator agent
   |-- tool "ask the weather agent"  ->  a whole sub-agent runs
   |-- tool "ask the flights agent"  ->  a whole sub-agent runs
   '-- composes their replies  ->  answer

A delegation tool’s body runs another agent and returns its answer, so the orchestrator’s tools are other agents.

Forget travel. Say a support concierge delegates the refund question to a refund specialist that is itself a full agent. The delegation tool is an ordinary @concierge.tool; the only twist is what its body does:

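from pydantic_ai import Agent, RunContext

# Assumed setup, not part of the pattern itself: `model` is configured
# earlier in the file, and the concierge is the orchestrator agent.
concierge = Agent(model, instructions="You plan the customer's next steps...")
refund_specialist = Agent(model, instructions="You judge refund eligibility...")


@concierge.tool
async def consult_refunds(ctx: RunContext[None], charge: str) -> str:
    """Ask the refund specialist whether a charge is refundable."""
    result = await refund_specialist.run(charge, usage=ctx.usage)  # run a whole agent
    return result.output                                           # hand its answer back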

That is the whole pattern. The tool’s body runs another agent and returns its answer, so the orchestrator’s “tools” are agents. The orchestrator cannot tell this from a plain function, which is exactly what makes it compose. Because the specialist carries its own brief, the concierge stays focused on planning instead of being a refund expert too. Passing usage=ctx.usage into the sub-run adds the specialist’s usage to the parent run’s tally, so one total covers the whole tree. Below you write TripMate’s version for flights.
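
To see why threading the usage object matters, here is a minimal library-free sketch. Usage and ToyAgent are hypothetical stand-ins, not pydantic-ai classes, and the toy counts exactly one request per run, which is simpler than what a real agent run does. The point is only the sharing: one tally object passed down, or a fresh one created by default.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class Usage:
    """Toy tally standing in for the library's usage object."""
    requests: int = 0


class ToyAgent:
    """Hypothetical agent: one counted request per run, then it calls its tools."""

    def __init__(self, name, tools=()):
        self.name = name
        self.tools = list(tools)

    async def run(self, prompt, usage=None):
        usage = usage if usage is not None else Usage()  # fresh tally unless one is passed in
        usage.requests += 1
        replies = [await tool(prompt, usage) for tool in self.tools]
        return " | ".join([f"{self.name} saw {prompt!r}", *replies]), usage


specialist = ToyAgent("refund_specialist")


async def consult_threaded(prompt, usage):
    output, _ = await specialist.run(prompt, usage=usage)  # shares the parent's tally
    return output


async def consult_dropped(prompt, usage):
    output, _ = await specialist.run(prompt)  # separate tally, invisible to the parent
    return output


async def main():
    _, threaded = await ToyAgent("concierge", [consult_threaded]).run("charge 42")
    _, dropped = await ToyAgent("concierge", [consult_dropped]).run("charge 42")
    return threaded.requests, dropped.requests


if __name__ == "__main__":
    print(asyncio.run(main()))  # (2, 1)
```

With the tally threaded, the parent sees both its own request and the specialist's. With it dropped, the specialist's request is counted on an object nobody reads.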

Open start/agent.py. The two sub-agents (weather_analyst, flight_advisor) and the consult_weather delegation tool are provided as your template. What is missing is consult_flights: an @concierge.tool taking (ctx: RunContext[None], route: str), whose body runs flight_advisor with usage=ctx.usage and returns its .output. That is the TODO.

make p6

The concierge consults the weather analyst, so you will see one [delegate] weather_analyst line. There is no flight tool wired, so any flight detail in the plan is guessed. One delegation works; the other is missing, and you are about to write it.

  1. Run it and see the one delegation. Run make p6 and watch for the [delegate] line. There is one: weather_analyst. Read the plan and notice the flight advice has nothing behind it.

  2. Write the consult_flights delegation tool (TODO). consult_weather above is your template: an ordinary @concierge.tool whose body runs a sub-agent (weather_analyst.run(...)) and returns its .output instead of doing the work itself. The flight_advisor sub-agent is already provided. Write consult_flights in the same shape: decorate it with @concierge.tool, take (ctx: RunContext[None], route: str), give it a docstring (that is its description), and in the body run flight_advisor on the route with usage=ctx.usage and return its .output. The TODO comment in start/agent.py marks the spot; build it from the template rather than copying it from here.

    The @concierge.tool decorator is the wiring; the orchestrator can only delegate through tools it holds. Run again. Now both specialists are consulted ([delegate] weather_analyst, then [delegate] flight_advisor), and the plan combines real answers from both. The orchestrator chose to call each; you only supplied the option.

  3. Find the nesting in the trace (poke). Scroll up to the console trace. Each sub-agent run sits inside the orchestrator’s tool call, which sits inside the orchestrator’s own run. That nesting is delegation made visible: an agent run within a tool call within an agent run. result.usage().requests counts the whole tree because you threaded ctx.usage.

  4. Check you’ve got it. You should see two [delegate] lines, a plan that combines both answers, and the nested sub-agent runs in the trace. You should be able to say how delegation differs from the fixed fan-out in p3.

Stuck? finish/agent.py is the canonical version. Read it after you’ve had a real go.

  • Recursion with no floor. A delegating agent can delegate to an agent that delegates again. Keep the depth shallow and cap runs with UsageLimits(request_limit=...), or a run can fan out far more than you meant.
  • Naming an unwired tool in the instructions. If the orchestrator’s brief names a tool you have not added yet, pydantic-ai raises when the model tries to call it. Wire the tool and name it together; the starter softens the brief in step 1 for exactly that reason.
  • Dropping usage=ctx.usage. Without it the sub-agent’s tokens are not counted on the parent run, so a parent UsageLimits will not see them. Thread it through.
  • Delegating what a plain function should do. A sub-agent is the right call when the subtask needs judgement or language. For a lookup or a calculation, a plain tool is cheaper and more reliable.
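
The first pitfall is easy to picture with a toy budget. This sketch is not how pydantic-ai implements UsageLimits; Budget and RequestLimitExceeded are hypothetical names invented here. It only shows why one shared cap stops a delegation chain that has no floor of its own.

```python
import asyncio
from dataclasses import dataclass


class RequestLimitExceeded(RuntimeError):
    """Hypothetical error for this sketch, raised when the shared budget runs out."""


@dataclass
class Budget:
    request_limit: int
    requests: int = 0

    def spend(self) -> None:
        if self.requests >= self.request_limit:
            raise RequestLimitExceeded(f"request limit of {self.request_limit} reached")
        self.requests += 1


async def run_agent(depth: int, budget: Budget) -> int:
    """Toy agent that always delegates one level deeper: recursion with no floor."""
    budget.spend()  # every run draws on the same shared budget
    return await run_agent(depth + 1, budget)


async def capped_run(limit: int) -> tuple[bool, int]:
    budget = Budget(request_limit=limit)
    try:
        await run_agent(0, budget)
    except RequestLimitExceeded:
        return True, budget.requests
    return False, budget.requests


if __name__ == "__main__":
    print(asyncio.run(capped_run(5)))  # (True, 5)
```

The cap only works because every level spends from the same Budget. That is the same reason the real tool threads usage=ctx.usage into each sub-run.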

Why does the orchestrator stay simpler than one mega-agent?

You could give one agent every instruction and every tool. It works until the brief grows long enough that the model loses the thread.

Delegation keeps each agent’s context small and its job clear: the concierge plans, the analysts analyse. Smaller context per agent gives more reliable behaviour, and you can test and tune each specialist on its own.

When should a tool be a sub-agent rather than a plain function?

Reach for a sub-agent when the subtask needs judgement, language, or its own multi-step reasoning, like “advise how to fly this route within a budget.” Reach for a plain tool when the subtask is a lookup or a calculation with one right answer, like reading a row from a table.

The orchestrator treats both the same way, so the choice is yours to make on cost and reliability, not on what the model can call.
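
A small sketch of that uniformity, with hypothetical names and data throughout: one tool is a table lookup, the other hands off to a stand-in for a sub-agent, and the caller reaches both through the same interface.

```python
import asyncio

# Hypothetical data and names; none of this comes from the workshop code.
REFUND_WINDOW_DAYS = {"subscription": 14, "gift_card": 0}


async def lookup_refund_window(product: str) -> str:
    """Plain tool: a table lookup with one right answer."""
    return f"{product}: {REFUND_WINDOW_DAYS[product]} days"


async def toy_refund_specialist(question: str) -> str:
    # Stand-in for a sub-agent run; a real one would call a model here.
    return f"judgement on {question!r}"


async def consult_refunds(question: str) -> str:
    """Delegation tool: the body hands the question to a specialist."""
    return await toy_refund_specialist(question)


TOOLS = {tool.__name__: tool for tool in (lookup_refund_window, consult_refunds)}


async def call_tool(name: str, argument: str) -> str:
    """The caller's view of any tool: a name, an argument, a string back."""
    return await TOOLS[name](argument)


if __name__ == "__main__":
    print(asyncio.run(call_tool("lookup_refund_window", "subscription")))
    print(asyncio.run(call_tool("consult_refunds", "late fee on a cancelled order")))
```

Nothing in call_tool knows which tool is backed by a table and which by a specialist, so swapping one for the other later changes cost and reliability, not the calling code.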

Next up is p7, where the agent remembers across turns and streams its replies, so it feels like a chat rather than a one-shot call.