
p3: Parallelization

Routing (p2) picks one path. Parallelization runs several at once. When subtasks are independent, you do not finish one before starting the next: fire them together and combine the results. Three reviewers judge the same itinerary, one concern each (cost, weather, safety), and an aggregator merges their verdicts into one.

One concern per call beats one call for everything: a reviewer focused only on cost does better on cost than a prompt juggling all three. So you write one of the reviewers, then fan them out.

In a hurry? These three steps are the whole challenge. Everything below is the why and the how.

  1. Run npm run p3: the three reviews run one after another, and safetyReviewer has no prompt yet, so its review is junk.
  2. Edit start/agent.ts: write safetyReviewer’s single-concern prompt (TODO 1), replace the three sequential awaits with one Promise.all (TODO 2), then aggregate the reviews with the synthesiser (TODO 3).
  3. Done when all three reviews are useful, the reviewer spans overlap in the trace, and one synthesiser verdict prints after they resolve.

       .-->  reviewer A  --.
input  --->  reviewer B  -->  aggregate  -->  result
       '-->  reviewer C  --'

Independent reviewers run at once with Promise.all, then one call combines them.

Forget travel. Say you want three quick takes on a pull request, one concern each:

import { ToolLoopAgent, Output } from "ai";

// Assumed to be defined elsewhere: `model` (the language model), `review`
// (the schema for one rating plus comment), and `diff` (the PR diff as a string).
const styleReviewer = new ToolLoopAgent({
  model,
  output: Output.object({ schema: review }),
  instructions: `
Review a code diff for STYLE only: naming, formatting, clarity.
Rate it 1-5 and give one terse comment.
`.trim(),
});
// ...a securityReviewer and a perfReviewer, each with its own single concern.

// They do not read each other, so fire them together:
const [style, security, perf] = await Promise.all([
  styleReviewer.generate({ prompt: diff }),
  securityReviewer.generate({ prompt: diff }),
  perfReviewer.generate({ prompt: diff }),
]);

Two moves. Each reviewer’s prompt names one concern, so it judges that concern well; a prompt juggling style, security, and perf at once does all three poorly. And because the calls are independent, Promise.all fires them together instead of one after another. Below you write one of TripMate’s reviewers, then fan all three out.

Open start/agent.ts. budgetReviewer and weatherReviewer are written for you, as the shape to copy; each has Output.object so its rating is typed. safetyReviewer’s instructions are blank: that single-concern prompt is TODO 1. In main() the three reviews run sequentially, the “before” you replace in TODO 2.

Parallel calls finish in about the time of the slowest, not the sum, provided the provider actually runs them concurrently (a hosted model typically does). A single local GPU often serves generations one at a time, so against Ollama you may see little wall-clock drop. That does not change the lesson: the shape is fan-out-then-aggregate, and it pays off the moment you move to a hosted provider or add calls. The printed milliseconds are real, so compare them before and after.
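
To see "the slowest, not the sum" without a model in the loop, here is a minimal sketch in which timers stand in for reviewer calls. fakeReview, the delays, and the function names are made up for illustration; real numbers depend on your provider.

```typescript
// Hypothetical stand-in for a reviewer call: resolves after `ms` milliseconds.
function fakeReview(name: string, ms: number): Promise<string> {
  return new Promise((resolve) => setTimeout(() => resolve(`${name} done`), ms));
}

// Before: each await blocks the next call, so the waits add up.
async function runSequential(): Promise<number> {
  const start = Date.now();
  await fakeReview("budget", 100);
  await fakeReview("weather", 150);
  await fakeReview("safety", 200);
  return Date.now() - start; // roughly 100 + 150 + 200
}

// After: all three start immediately, so the total is about the slowest one.
async function runParallel(): Promise<number> {
  const start = Date.now();
  await Promise.all([
    fakeReview("budget", 100),
    fakeReview("weather", 150),
    fakeReview("safety", 200),
  ]);
  return Date.now() - start; // roughly 200
}

// Run both and print the two wall-clock times side by side.
async function compare(): Promise<{ sequentialMs: number; parallelMs: number }> {
  const sequentialMs = await runSequential();
  const parallelMs = await runParallel();
  console.log({ sequentialMs, parallelMs });
  return { sequentialMs, parallelMs };
}
```

Calling compare() should print a sequential time near the sum of the delays and a parallel time near the longest one. If a backend serves requests one at a time, the two numbers converge, which is the Ollama case described above.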

  1. Run it and read the gap. Run npm run p3. Budget and weather review fine, but the safety review is junk (no prompt), and the three run one after another for no reason. The printed time is mostly spent waiting.
  2. Write the safety reviewer (TODO 1). Fill safetyReviewer’s instructions with one concern only: safety and practicality (timing, transport, crowds). Match the shape of the two provided reviewers. Re-run; the safety review should now be sharp.
  3. Replace the awaits with Promise.all (TODO 2). The heart of the pattern. The three calls do not depend on each other, so replace the sequential block with one Promise.all(...) and destructure as [budget, weather, safety]. The reviews array below keeps working unchanged: same calls, different timing.
  4. Aggregate (TODO 3). Three raw reviews are not an answer. Pass the collected reviews as JSON to the synthesiser and print its two-sentence verdict.
  5. Add a reviewer (poke it). Add a fourth reviewer to the batch (food or accessibility, say). One more entry in the Promise.all, one more review for the synthesiser; the shape does not change.
  6. Check you’ve got it. Say why these calls are safe to parallelize (they are independent), and name the two flavours (sectioning and voting). The trace shows the three reviewer spans overlapping, then the synthesiser after.
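
Steps 3 to 5 can be sketched without a model. Everything here is a hypothetical stand-in, not the workshop's code: the Review shape, the stub reviewers, and the synthesise function only imitate what the agents in start/agent.ts would return.

```typescript
// Hypothetical shape of one typed review; the workshop's real schema may differ.
type Review = { concern: string; rating: number; comment: string };
type Reviewer = (itinerary: string) => Promise<Review>;

// Stub reviewers standing in for model calls. Each looks at one concern only.
const budgetReviewer: Reviewer = async (itinerary) => ({
  concern: "budget",
  rating: itinerary.includes("hostel") ? 4 : 2,
  comment: "Lodging drives the cost.",
});
const weatherReviewer: Reviewer = async () => ({
  concern: "weather",
  rating: 3,
  comment: "Pack for rain.",
});
const safetyReviewer: Reviewer = async () => ({
  concern: "safety",
  rating: 5,
  comment: "Transfers leave enough slack.",
});

// Stand-in synthesiser: takes the reviews as JSON and returns one verdict.
// Assumes at least one review is passed in.
async function synthesise(reviewsJson: string): Promise<string> {
  const reviews = JSON.parse(reviewsJson) as Review[];
  const weakest = reviews.reduce((a, b) => (b.rating < a.rating ? b : a));
  return `${reviews.length} reviews in. Weakest point: ${weakest.concern}.`;
}

// Fan out over a list of reviewers, then aggregate once they have all resolved.
async function reviewItinerary(itinerary: string, reviewers: Reviewer[]): Promise<string> {
  const reviews = await Promise.all(reviewers.map((r) => r(itinerary)));
  return synthesise(JSON.stringify(reviews));
}
```

Because the reviewers live in an array, adding a fourth is one more entry in the list passed to reviewItinerary; the fan-out and the aggregation code stay the same.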

Stuck? finish/agent.ts is one version. Read it after you’ve had a real go; your safety prompt will read differently.

  • Parallelizing dependent work. Promise.all is safe only when the calls do not need each other’s output. If B needs A’s result, that is a chain (p1), not a fan-out.
  • Forgetting to aggregate. Three raw reviews are not an answer; the aggregation step turns N opinions into one decision.
  • One overloaded prompt. Asking a single call to rate cost, weather, and safety at once blurs all three. Separate, single-concern calls keep each focused.
  • Unbounded fan-out. Firing hundreds of calls at once can hit rate limits. Batch them if the list is large.
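
One simple way to bound a large fan-out is to run it in fixed-size batches. This is a sketch under that assumption: mapInBatches is a hypothetical helper, and a real rate limit may call for a proper concurrency limiter or retry with backoff instead.

```typescript
// Run `worker` over `items` in batches of `batchSize`. Each batch is one
// Promise.all, and the next batch starts only after the previous one resolves.
async function mapInBatches<T, R>(
  items: T[],
  batchSize: number,
  worker: (item: T) => Promise<R>,
): Promise<R[]> {
  if (batchSize < 1) throw new Error("batchSize must be at least 1");
  const results: R[] = [];
  for (let i = 0; i < items.length; i += batchSize) {
    const batch = items.slice(i, i + batchSize);
    // At most `batchSize` calls are in flight at this point.
    const batchResults = await Promise.all(batch.map((item) => worker(item)));
    results.push(...batchResults);
  }
  return results; // same order as `items`
}
```

The trade-off is that each batch waits for its slowest call before the next one starts, so this is bounded rather than maximally fast.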

Sectioning vs voting

This challenge is sectioning: split a task into different subtasks (cost, weather, safety) and run each once. The other flavour is voting: run the same check several times and take a majority, raising confidence on a judgement call. Same Promise.all shape, different purpose: sectioning covers more ground, voting buys certainty.
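
Voting is not part of this challenge, so here is a minimal sketch of it for contrast. Check and majorityVote are hypothetical names, and the yes/no check stands in for a model call that can answer differently from run to run.

```typescript
// Hypothetical yes/no check standing in for a model call.
type Check = (input: string) => Promise<boolean>;

// Voting: run the SAME check `runs` times in parallel, then take the majority.
async function majorityVote(check: Check, input: string, runs: number): Promise<boolean> {
  const votes = await Promise.all(Array.from({ length: runs }, () => check(input)));
  const yes = votes.filter((v) => v).length;
  return yes > runs / 2; // strict majority; a tie counts as "no"
}
```

An odd number of runs avoids ties. The fan-out is the same Promise.all as in sectioning; only the aggregation differs, a count instead of a synthesis.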

Next up is p4, where one agent’s output is scored by another in a loop until it is good enough.