Troubleshooting
Setup and the notebooks
A notebook will not open or run
Foundations (f1–f6), Patterns (p1–p7), and the resilience self-serve track are
Jupyter notebooks. Install the notebook tooling with uv sync --extra notebook, then open
them with make lab (Jupyter), or in VS Code, Cursor, or Colab. Pick the project’s Python
environment (.venv) as the kernel so the dependencies are on the path.
ModuleNotFoundError inside a notebook
Each notebook is self-contained and should not import from shared. If you see this,
re-run the setup cell at the top (it builds the model and imports everything the rest of
the notebook needs). Run the cells in order, top to bottom.
make verify or the scripts: No module named 'shared'
Run from the project root (pydantic-ai/) and make sure uv sync has run (it installs
the project in editable mode, which puts shared and app on the path). This applies to
the script-based parts (the app, mcp, make verify/make smoke).
In Cursor / VS Code, use the repo’s .vscode/settings.json: the Python interpreter
points at pydantic-ai/.venv, and Code Runner uses uv run python with cwd
pydantic-ai. Reload the window after uv sync if the play button still picks the wrong
interpreter. Or run make f3 / make solution-f3 from a terminal in pydantic-ai/.
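To narrow down which of these causes applies, a short diagnostic can help. The sketch below is not part of the workshop repo; diagnose is a hypothetical helper that reports which interpreter is running, the current working directory, and whether shared and app can be located.

```python
import importlib.util
import os
import sys


def diagnose(modules=("shared", "app")):
    """Report the running interpreter, the cwd, and which modules resolve."""
    return {
        "interpreter": sys.executable,
        "cwd": os.getcwd(),
        # find_spec returns None when a top-level module cannot be located.
        "importable": {
            name: importlib.util.find_spec(name) is not None for name in modules
        },
    }


if __name__ == "__main__":
    report = diagnose()
    print("interpreter:", report["interpreter"])
    print("cwd:        ", report["cwd"])
    for name, found in report["importable"].items():
        print(f"{name}: {'found' if found else 'NOT found'}")
```

If the interpreter path is not inside pydantic-ai/.venv, the wrong environment is selected. If it is but shared is still not found, uv sync has probably not run yet.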
uv sync is slow the first time
It builds the venv and downloads pydantic-ai, logfire, fastapi, Jupyter, and friends. Later runs are cached.
Model / Ollama
Connection refused to localhost:11434
Ollama is not running. Start it, then run ollama pull granite4.1:3b.
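A quick way to confirm whether anything is listening on that port is a plain TCP connect. This is a minimal sketch using only the standard library; port_is_open is a hypothetical helper, and an open port only shows that something is listening, not that it is Ollama.

```python
import socket


def port_is_open(host="localhost", port=11434, timeout=1.0):
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name resolution failures.
        return False


if __name__ == "__main__":
    if port_is_open():
        print("Something is listening on 11434; Ollama is probably up.")
    else:
        print("Nothing on 11434; start Ollama, then pull the model.")
```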
Answers are slow on the first call
The model warms up on first use; later calls are faster.
Using Gemini instead
Set GOOGLE_GENERATIVE_AI_API_KEY in your environment. The notebooks’ setup cell and
shared/model.py both switch automatically; make verify skips the Ollama checks.
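The switch can be pictured as a check on that environment variable. The sketch below is an illustration only, not the contents of shared/model.py: pick_provider is a hypothetical helper, and treating an empty value as "not set" is an assumption.

```python
import os


def pick_provider(env=None):
    """Hypothetical helper: 'gemini' when the key is set, otherwise 'ollama'."""
    env = os.environ if env is None else env
    # Assumption: an empty or whitespace-only value counts as not set.
    key = (env.get("GOOGLE_GENERATIVE_AI_API_KEY") or "").strip()
    return "gemini" if key else "ollama"


if __name__ == "__main__":
    print("chat provider:", pick_provider())
```

Running it in the same shell you launch the notebooks from shows whether the key is visible to Python at all.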
Observability (the trace)
Reading a run
From f2 on, tracing is on. In the notebooks the run prints as a span tree right in the cell
output. The standalone make fN script runs also stream to autotel-devtools, a local
browser viewer: run make devtools in one terminal (it listens on http://127.0.0.1:4446),
make f2 in another, then open that URL. f1 deliberately has no trace.
Nothing reaches the browser viewer
The make fN runs export to the viewer only when autotel-devtools is genuinely what is on
http://127.0.0.1:4446, so start make devtools first. If another tool already holds that
port (some IDEs run their own telemetry collector there), the run detects it is not
autotel-devtools, skips the export, and stays on the console trace instead of printing an
export error. Free the port, or point WORKSHOP_OTLP_ENDPOINT at a viewer on another port.
A different viewer or a real backend
logfire is OTLP-compliant. Set OTEL_EXPORTER_OTLP_ENDPOINT (or WORKSHOP_OTLP_ENDPOINT)
to any OTLP/HTTP receiver (otel-tui, Jaeger, Grafana Tempo) and the spans render there
instead of autotel-devtools.
The model never calls my tool
Either the tool is not registered (missing @agent.tool_plain / @agent.tool), or its docstring is
too vague for the model to know when to use it (the whole lesson of f6).
A tool that needs context
Use @agent.tool with a first parameter ctx: RunContext[YourDeps] and pass deps=...
to agent.run(...). @agent.tool_plain is for tools that need no context.
The chain skips a tool (p1, p5)
Small local models drift. The instructions say “you MUST call all the tools” and the prompt asks for the flight price by name to keep granite on track. Re-run the cell if it still skips one; a stronger model chains more reliably.
MCP (mcp, a script, not a notebook)
The agent crashes when it tries to use the hotel tool
Start the server first: make mcp-server in a separate terminal, leave it running,
then make mcp. The client needs the server up before the run.
Port 4321 in use
Change port= in mcp/mcp_server.py and the URL in the client to match. (Note the
docs site also defaults to :4321; stop it if it clashes.)
No find_hotel at first
Expected: there is no hotel tool until you wire the server in. Confirm via the “tools called this run” line.
Resilience (resilience, a notebook)
The “raise” cell prints a traceback
That is the lesson: a raised tool exception is uncaught and would take a real run down. The
notebook wraps that cell in try/except so it prints the error instead of stopping the run.
The next cell returns the error as data so the model recovers. Do not “fix” the first
cell by making the tool swallow errors silently.
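"Returning the error as data" can be as simple as a structured return value. The sketch below is plain Python, not the notebook's code: find_flight and its price table are made up for illustration.

```python
def find_flight(destination):
    """Hypothetical tool: look up a price and report failure as data."""
    prices = {"Lisbon": 120, "Tokyo": 840}  # made-up sample data
    if destination not in prices:
        # Returned, not raised: the caller can read this and try something else.
        return {"ok": False, "error": f"no flights to {destination}"}
    return {"ok": True, "price": prices[destination]}


if __name__ == "__main__":
    print(find_flight("Lisbon"))
    print(find_flight("Atlantis"))
```

The failure is still visible in the output, which is the difference from swallowing it silently.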
UnexpectedModelBehavior when I use ModelRetry
ModelRetry retries the same call. For a permanent failure (Atlantis), the model retries
until the retry budget is exhausted, then the run raises UnexpectedModelBehavior. Return the
error as data instead; reach for ModelRetry only when another attempt could succeed.
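The difference between a transient and a permanent failure can be seen in a small simulation. This is a plain-Python sketch, not pydantic-ai's implementation: RetryRequested and BudgetExhausted are stand-ins for ModelRetry and UnexpectedModelBehavior, and the default budget of two retries is an arbitrary choice.

```python
class RetryRequested(Exception):
    """Stand-in for a tool asking to be called again."""


class BudgetExhausted(Exception):
    """Stand-in for the error raised once the retries run out."""


def call_with_budget(tool, arg, max_retries=2):
    """Call tool(arg), allowing up to max_retries extra attempts."""
    attempts = 0
    while True:
        attempts += 1
        try:
            return tool(arg), attempts
        except RetryRequested as exc:
            if attempts > max_retries:
                raise BudgetExhausted(f"gave up after {attempts} attempts") from exc


def permanent_failure(destination):
    # The same input fails the same way every time, so retrying cannot help.
    raise RetryRequested(f"no flights to {destination}")


def make_flaky(fail_times):
    """Build a tool that fails fail_times times, then succeeds."""
    state = {"left": fail_times}

    def tool(destination):
        if state["left"] > 0:
            state["left"] -= 1
            raise RetryRequested("temporary glitch")
        return f"flight to {destination} found"

    return tool
```

Here call_with_budget(make_flaky(1), "Lisbon") succeeds on the second attempt, while call_with_budget(permanent_failure, "Atlantis") uses up every attempt and then raises.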
RAG track (r1–r2)
Only needed if you chose the RAG track after Foundations. Foundations, Patterns, and Full-Stack do not use embeddings.
embeddinggemma not found
make verify checks this when Ollama is up. The check is labelled “RAG track only”; you
can ignore it if you are not doing r1–r2.
First make r1 or make r2 pauses for several seconds
Ollama is loading the embedding model and batch-embedding the documents. It is not hung. Later queries are faster.
Chat runs on Gemini but RAG fails
Retrieval always uses a local Ollama embedder (embeddinggemma), separate from the
chat model in shared/model.py. ollama serve must be running and the model pulled,
even when chat uses Gemini.
r1: scores are all 0.000 and the Swiss Alps appears for “warm beach”
Expected in the starter: search ignores the query until you complete the TODOs (embed
the query, score by cosine similarity, sort and take top k). Compare with
make solution-r1.
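The scoring and ranking steps named in the TODOs can be sketched without an embedder. The example below is not the r1 solution: it uses hand-written three-dimensional vectors in place of real embeddings, and the function signatures and sample documents other than the Swiss Alps are assumptions made for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(query_vec, docs, k=2):
    """Return the k (score, title) pairs most similar to query_vec, best first."""
    scored = [(cosine(query_vec, vec), title) for title, vec in docs]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


# Toy "embeddings", hand-written for illustration only.
DOCS = [
    ("Swiss Alps", [0.0, 1.0, 0.1]),
    ("Algarve coast", [1.0, 0.0, 0.2]),
    ("Bali", [0.9, 0.1, 0.3]),
]
QUERY = [1.0, 0.0, 0.25]  # pretend this is the embedding of "warm beach"

if __name__ == "__main__":
    for score, title in search(QUERY, DOCS):
        print(f"{score:.3f}  {title}")
```

With these toy vectors the Swiss Alps scores close to zero and drops out of the top two, which is the behaviour the starter is missing while it ignores the query.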
r2: food passages score around 0.25 and snippets show opening paragraphs
Expected in the starter: chunk() returns the whole guide as one vector. Split on blank
lines (see README TODO 1). Compare with make solution-r2.
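Splitting on blank lines might look like the sketch below. The real chunk() signature in r2 is not shown here, so taking a single string and returning a list of paragraphs is an assumption, and the sample guide text is made up.

```python
import re


def chunk(text):
    """Split text into paragraphs on blank lines, dropping empty pieces."""
    # A "blank line" is a newline, optional whitespace, then another newline.
    parts = re.split(r"\n\s*\n", text)
    return [part.strip() for part in parts if part.strip()]


if __name__ == "__main__":
    guide = "Intro line one.\nIntro line two.\n\nFood: try the fondue.\n\n\nGetting around."
    for i, paragraph in enumerate(chunk(guide)):
        print(i, repr(paragraph))
```

Each paragraph then gets its own embedding, so a food query can match the food paragraph rather than the guide as a whole.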
Full-stack track (a cloned template)
The full-stack track is a separate template, not part of this repo’s scripts. Clone it with
npx @jagreehal/ai-workshop fullstack-pydantic, then follow that folder’s own README.md
for setup (npm install, uv sync --extra dev, npm run dev) and troubleshooting.