Express course · No. 27

A raw language model returns free text — lovely for a human to read, useless for your code to rely on. To build real software on a model you need two things: output your program can actually trust (structured output), and a safe way for the model to reach beyond text and act (tool use). Together they turn a chatbot into a dependable component inside your system.

Essence only · One picture per idea · Engineering over magic

§ 01

The first step to building on a model is recognising the mismatch: it speaks prose, but your program needs data and actions. Bridging that gap is what this whole course is about.

Prose is for humans; code needs data

A handwritten paragraph describing an order versus a filled-in form with labelled boxes — a person reads the paragraph easily, but only the form can be filed by a machine.

By default a language model outputs free text — a paragraph, an explanation, prose. That's perfect for a human reader and useless for your code, which needs predictable, structured values it can act on: a number here, a category there, a yes-or-no. A program can't reliably pull "the customer wants a refund of $40" out of a sentence the model phrased however it liked. Prose and data are different things, and code runs on data.

Parsing prose is fragile and breaks

Trying to extract the total from a letter that might say "$40", "forty dollars", or "a refund of 40 bucks" — your rules work until the day the wording shifts, and then everything falls over.

The naive fix — have the model write text and then parse it with your own code — is brittle. The model is non-deterministic; it will phrase the same thing ten different ways, add a friendly preamble, or wrap the answer in markdown. Your parser handles the cases you saw and shatters on the one you didn't. Building on string-parsing the model's prose is building on sand: it works in the demo and fails in production.

The goal: make the model a reliable part

An engine becomes useful to a car only once it has standard mountings and connectors — predictable interfaces the rest of the machine can bolt onto with confidence.

To build software on a model, you have to turn it from a chatty oracle into a component with predictable interfaces — output your code can trust, and actions it can take safely. That's exactly the two halves of this course: structured output (data, not prose) and tool use (acting, not just talking). Get those, and the model stops being a demo and becomes a dependable part you can engineer around like any other.

A model speaks prose; your code needs data and actions. Parsing free text is fragile — the goal is to make the model a reliable component, with output you trust and actions it can take.

§ 02

The first half of making the model a component is getting it to return data in a shape your code defines, every time — not prose you have to guess at, but a predictable structure.

Ask for a schema, not a sentence

Handing someone a form with labelled boxes instead of a blank page — they fill in exactly the fields you need, in the order you need, with nothing extra.

Structured output means asking the model to return data in a defined format — typically JSON matching a schema you specify: these fields, these types, this shape. Instead of "tell me about this order," you say "return {customer, amount, reason}," and you get back data your code can use directly. You define the shape; the model fills it in. This is the single move that turns model output from prose into something a program can rely on.

The model can be constrained to valid output

A railway switch that physically can only send the train down one of the defined tracks — there's no way for it to end up somewhere off the map.

Modern models support constrained generation: they can be forced to produce output that actually matches your schema — valid JSON, the right fields, the correct types — rather than just being politely asked to. This closes the gap where the model "mostly" returns the right shape but occasionally drifts. With the output constrained to the schema, your code can count on the structure being there, which is what makes the model usable as a real building block.

Structured output is the bridge to your code

A translator who converts a guest's spoken wishes into a precise, standardised order ticket the kitchen can act on — the same meaning, now in a form the system can use.

Structured output is the bridge between the model's language ability and your software's need for data. It lets the model do what it's brilliant at — understanding messy human input — and hand the result to your code as clean, typed values. Classification, extraction, routing, form-filling: all become reliable when the output is structured. The model reads the chaos; the schema delivers the order. That hand-off is where most real LLM features actually live.

Structured output means the model returns data matching a schema you define, constrained to valid shape — the bridge that turns its language ability into typed values your code can rely on.

§ 03

Structured output gets you a predictable shape, but shape is not correctness. Before your code trusts what the model returned, it has to check it — the same discipline as any untrusted input.

Valid shape is not valid content

A form filled in neatly, every box completed — but the date is impossible and the total doesn't add up. Tidy is not the same as correct.

A schema guarantees the shape — the right fields and types — but not that the values make sense. The model can return perfectly-formed JSON with a negative quantity, a category that doesn't exist, or an amount it hallucinated. Structured output solves "can my code read it"; it doesn't solve "is it right." Treat a well-formed response as a starting point to check, not a guarantee of a correct one.

Validate before anything acts on it

A customs check between countries — nothing crosses into the next system until it's been inspected against the rules and declared safe.

So you validate the model's structured output before your code uses it: enforce the schema strictly, check values are in allowed ranges and sets, confirm anything referenced actually exists. This check runs in your code, at the boundary between the model and the rest of your system, because the model is a non-deterministic component you can't fully trust. The same lesson as security: model output is untrusted input to the next stage until you've verified it.

Never pipe raw output into something dangerous

You don't pour an unfiltered stream straight into the drinking supply — whatever was upstream is now in every tap, with no chance to catch it.

The rule gets strict when output flows somewhere consequential. Never feed raw model output directly into a database write, a shell command, a query, or another system without validating and escaping it first — that's how a hallucination or an injected instruction becomes a real bug or breach. Structured output plus strict validation is what lets you safely act on what the model produced. The structure makes it readable; the validation makes it trustworthy.

A valid schema guarantees shape, not correctness. Validate the model's output at the boundary — ranges, sets, existence — before any code acts on it, because model output is untrusted until checked.

§ 04

Structured output lets the model return data. The other half of making it a component is letting it do things — reach past text to search, calculate, fetch, and act in the real world. That's tool use.

Give the model functions it can call

A smart assistant who can't open the filing cabinet themselves — but can tell you precisely which drawer and file to pull, and use what you bring back.

With tool use (also called function calling), you describe functions the model is allowed to request — search_orders, send_email, get_weather — and when the model decides it needs one, it returns a structured call naming the function and its arguments. It's structured output applied to actions: instead of data, the model emits a request to do something. This is how an LLM reaches beyond its own text into your systems and the world.

The model decides; your code does

A foreman who points and says exactly what needs doing — but the crew, not the foreman, actually operates the machines and stays in control of the site.

The division of labour is the key: the model decides what to do and with what arguments; your code decides whether and how to actually do it. When the model returns a tool call, your code runs the function (or doesn't), gets the result, and feeds it back into the context for the model to continue. The model never touches your systems directly — it requests, your code executes. That boundary is where you keep control and safety.

The result feeds back into the loop

A researcher asks for a document, reads what's brought back, and uses it to decide the next question — a back-and-forth, not a single shot.

Tool use is a cycle: the model requests a tool, your code runs it and returns the result into the context, and the model reasons on with that new information — possibly calling another tool. This loop is exactly what turns a model into an agent (the agents course goes deep). Tools are the hands; the loop is the persistence. For now, the key idea is that tool use lets the model gather what it needs and act, step by step, instead of answering blindly from memory.

Tool use lets the model request functions — structured output for actions. The model decides what to do; your code decides whether and how to do it, and feeds the result back.

§ 05

The model picks and uses tools based on how you describe them, so tool design is part of getting reliable behaviour. A few principles separate tools that work from tools that confuse the model.

Clear names and descriptions guide the choice

A toolbox where every tool is clearly labelled with what it's for — the worker reaches for the right one instantly, instead of guessing from a row of unmarked handles.

The model chooses which tool to call almost entirely from its name and description. So those aren't documentation for humans — they're the instructions the model reasons from. A vague or misleading description leads to the wrong tool, the wrong arguments, or a tool ignored when it was needed. Write tool names and descriptions as carefully as you'd write a prompt, because to the model, that's exactly what they are.

Few sharp tools beat a giant menu

A kitchen with a few well-chosen, clearly-different tools works faster than one with fifty overlapping gadgets the cook has to sort through every time.

Giving the model too many tools — or several that overlap — makes it harder for it to choose well, every single turn. A small set of sharp, clearly-distinct tools is more reliable than a sprawling menu. If two tools do nearly the same thing, the model will sometimes pick the wrong one; if there are forty, choosing becomes its own source of error. Curate the toolset the way you curate context: only what the task needs.

Distrust the tool's description as a security surface

A new hire who follows the label on every box without question — so whoever writes the labels effectively controls what they do.

Because the model trusts tool descriptions and tool outputs, they're part of your security surface, not neutral plumbing. A poisoned description or a tool that returns attacker-controlled text can steer the model's behaviour — "tool poisoning." Vet the tools you wire in, scope each to the minimum it needs (a read-only tool shouldn't be able to delete), and treat what a tool returns as untrusted input. Capability you grant a tool is capability a confused or hijacked model can misuse.

The model picks tools from their names and descriptions, so write them like prompts. Keep tools few and sharp, scope each to least privilege, and treat descriptions and outputs as a security surface.

§ 06

As tools multiply, connecting each model to each system by hand doesn't scale. A standard has emerged for how models reach tools and data — and with it, a new layer to understand and secure.

A standard way to expose tools to models

Before standard plugs and sockets, every appliance needed its own custom wiring — a universal standard meant any device could connect to any outlet without bespoke work.

The Model Context Protocol (MCP) is an emerging standard for how models connect to tools and data sources. Instead of wiring each model to each system with custom code, MCP defines a common way to expose tools, so any compatible model can use any compatible tool. It's the universal socket for the agent era — a connector layer that lets the ecosystem of tools and models plug together without one-off integrations every time.

It standardises the plumbing of agents

A power grid: standardised so a new appliance just plugs in and works, and a new power source just feeds the same grid — the standard is what makes the whole network compose.

MCP matters because it makes tools and data composable: a tool built once can be used by many models and agents, and a new agent can immediately use the whole ecosystem of existing tools. This is a big part of why agents got powerful so fast — the plumbing got standardised, so capabilities snap together. When you give an agent access to your systems today, MCP is increasingly the layer doing it.

The connector layer is an attack surface

New plumbing run through the whole house fast — and a survey finds a large share of the valves were never fitted with a lock.

This new layer is also a new place to get breached. MCP connectors expose real actions and data, and the rush to adopt them has left many deployed with weak or missing security — a large share of remote MCP servers have shipped with no authentication at all. So treat a connector like the door it is: authenticate it, scope what it exposes, keep it off the public internet unless it must be there, and inventory what you've connected. Powerful plumbing demands the basic locks.

MCP standardises how models reach tools and data, making the ecosystem composable. That same connector layer is a real attack surface — authenticate it, scope it, and inventory it.

§ 07

Structured output and tool use are the tools that make a model a component, but reaching for them well still means climbing the ladder — using the least powerful piece that solves your problem.

Structured output first, tools when it must act

You don't hand someone the car keys when all you needed was an answer — you give just the capability the task requires, no more.

There's a ladder of capability. A plain prompt answers; structured output makes that answer usable by code; tools let the model fetch and act; a full agent loop lets it pursue a multi-step goal. Each rung adds power and new ways to fail. Reach only as high as the task needs: if you just need clean data, structured output is enough — don't give the model tools, and certainly not a loop, when a constrained response solves it. Most features never need the top rung.

Validate everything that crosses back

A returns desk that inspects every item coming back before it goes on the shelf — nothing re-enters the system unchecked, no matter how trustworthy the customer seemed.

The unifying habit across both halves is distrust at the boundary: validate the model's structured output before your code uses it, and keep your code in control of whether to run a tool the model requested. The model is a brilliant, non-deterministic component, and you treat everything crossing from it into your system as input to verify. That boundary discipline is what lets you build reliably on something that's fundamentally fallible.

Before you build on a model

Does my code need data — am I getting structured output to a schema, not parsing prose? - Is the output validated — values checked, not just well-formed, before anything acts on it? - Does it need to act — and have I exposed that as a tool, with my code in control of running it? - Are the tools few, sharp, and well-described, and scoped to least privilege? - If using MCP, is each connector authenticated, scoped, and inventoried? - Am I on the lowest rung that works — output, tools, or a full loop only if needed?

The words you now own

free text / structured output — prose for humans, versus data your code can rely on. - schema / JSON / constrained — the shape you define and force the model to match. - validation / boundary — checking values are correct before any code trusts them. - tool use / function calling — letting the model request actions your code runs. - tool description / tool poisoning — how the model picks tools, and the security risk in them. - MCP — the standard connector layer for exposing tools and data to models. - least privilege — scoping each tool to the minimum it needs.

Signs you build with it well

You get structured output to a schema instead of parsing the model's prose. - You validate the output at the boundary before any code acts on it. - The model requests tools and your code stays in control of running them. - Your tools are few, sharp, well-described, and least-privilege. - You climb only to the rung the task needs, and treat MCP connectors as a secured door.

To build on a model, make it a component: structured output your code can trust, validated at the boundary, and tools it requests but your code controls — the least powerful rung that solves the job.