Where departments make their services ready for agents. Author, audit, and measure — one platform for service legibility.
For an AI agent to act on a citizen's behalf, it needs more than an API. It needs to understand what a service does, who is eligible, what steps are involved, and how data is shared. The Legibility Studio is where departments publish that understanding as structured, machine-readable service descriptions — and where they monitor how agents are using them.
The Legibility Studio has three capabilities, each addressing a different question that department teams need answered.
Create and edit the four dimensions of a service description — identity, eligibility, journey, and data sharing — through a structured editor with LLM-assisted generation for teams starting from scratch.
Trace every agent interaction back to the policy rule, state transition, and consent grant that governed it. Browse sessions, replay journeys, and inspect every decision the agent made on a citizen's behalf.
Track service description coverage across your department. See which services are agent-ready, which have gaps, and where to focus next. Operational metrics show how services perform once live.
The studio is a Next.js application that reads service definitions from local stores and fetches operational data (traces, case ledger) from the citizen app API via HTTP. It never accesses the evidence database directly — this architectural boundary keeps native module dependencies cleanly separated.
The table columns — Id, El, Jo, Da — show at a glance which of the four dimensions each service has: identity, eligibility, journey, and data sharing. A service with all four ticks is agent-ready. Anything less shows exactly what is missing.
The catalogue holds every government service the system knows about. Services arrive from two sources and are classified into eight interaction typologies.
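The distinction between the two sources matters because they represent very different levels of readiness: a full service has authored dimensions, while a catalogue-only entry has little more than a name and a department.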
Every service is classified by interaction pattern. The typology breakdown helps designers identify under-served categories and ensures the system handles every kind of government service.
A benefit service pays money to a citizen (Universal Credit, Carer's Allowance). An obligation requires the citizen to do something (file a tax return, register a death). An application is a one-off request with a decision (apply for a passport, request planning permission). Typologies affect how the agent frames its conversation and which state machine patterns apply.
The services table supports filtering by source (full / catalogue / all), typology, department, life event, and free-text search. Services are split into two sections: promoted agent services (highlighted at the top) and the full catalogue table below. This lets department teams focus on the services they are actively working on while keeping the full landscape visible.
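The filtering described above can be sketched roughly as follows. This is an illustration only: the ServiceRow and ServiceFilter shapes and the filterServices function are hypothetical names, not the studio's actual code, and free-text search is assumed to match on the service name.

```typescript
// Hypothetical shape of a catalogue row; all field names are assumptions.
type Source = "full" | "catalogue";

interface ServiceRow {
  id: string;
  name: string;
  department: string;
  source: Source;
  typology: string;
  lifeEvents: string[];
}

interface ServiceFilter {
  source?: Source | "all";
  typology?: string;
  department?: string;
  lifeEvent?: string;
  search?: string;
}

// Returns the rows that satisfy every filter that has been set.
// An unset filter (or source "all") places no restriction on the result.
function filterServices(rows: ServiceRow[], filter: ServiceFilter): ServiceRow[] {
  const needle = (filter.search || "").trim().toLowerCase();
  return rows.filter((row) => {
    if (filter.source && filter.source !== "all" && row.source !== filter.source) return false;
    if (filter.typology && row.typology !== filter.typology) return false;
    if (filter.department && row.department !== filter.department) return false;
    if (filter.lifeEvent && row.lifeEvents.indexOf(filter.lifeEvent) === -1) return false;
    if (needle && row.name.toLowerCase().indexOf(needle) === -1) return false;
    return true;
  });
}
```

Combining filters narrows the result, so a team could pass { source: "catalogue", department: "DVLA" } to list only the entries it has not yet described.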
The editor is where departments author the four dimensions of a service description. It guides teams through each dimension with structured forms, validation, and contextual help.
Every transactional government service publishes a description with four dimensions. Together they form a complete contract between a department and any authorised agent.
What the service does, what data it needs, what it produces, and who to contact when something goes wrong. This is the service's public contract — its name, department, description, input and output fields, SLA, fees, availability, and redress routes.
Structured rules that determine who can use the service. Each rule operates on a single data field using a simple operator. If a rule fails, the citizen sees a plain-English explanation and, where possible, a redirect to an alternative service.
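As a minimal sketch of how such a rule might be represented and evaluated, assuming a hypothetical EligibilityRule shape with four illustrative operators (the real rule schema and operator set are not given here):

```typescript
// Hypothetical rule shape: one field, one simple operator, one comparison value.
type Operator = "eq" | "neq" | "gte" | "lte";
type FieldValue = string | number | boolean;

interface EligibilityRule {
  field: string;
  operator: Operator;
  value: FieldValue;
  failureMessage: string;       // plain-English explanation shown to the citizen
  alternativeService?: string;  // optional redirect when the rule fails
}

interface RuleResult {
  passed: boolean;
  message?: string;
  redirectTo?: string;
}

// Evaluates a single rule against the citizen's data.
// A missing field is treated as a failure (an assumption of this sketch).
function evaluateRule(rule: EligibilityRule, data: Record<string, FieldValue>): RuleResult {
  const actual = data[rule.field];
  let passed = false;
  if (actual !== undefined) {
    switch (rule.operator) {
      case "eq":
        passed = actual === rule.value;
        break;
      case "neq":
        passed = actual !== rule.value;
        break;
      case "gte":
        passed = typeof actual === "number" && typeof rule.value === "number" && actual >= rule.value;
        break;
      case "lte":
        passed = typeof actual === "number" && typeof rule.value === "number" && actual <= rule.value;
        break;
    }
  }
  return passed
    ? { passed: true }
    : { passed: false, message: rule.failureMessage, redirectTo: rule.alternativeService };
}
```

Because each rule is plain data with no free-form logic, the same input always produces the same result, which is what makes the evaluation auditable.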
A state machine defining the lifecycle of a service interaction: states, transitions with triggers, and terminal conditions. The agent follows this path deterministically — it cannot skip steps or invent new ones.
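One way to picture the deterministic behaviour is a lookup that refuses anything the journey does not define. The Journey and Transition shapes and the nextState function below are assumptions made for illustration, not the studio's schema.

```typescript
// Hypothetical journey shape: named states, triggered transitions, terminal states.
interface Transition {
  from: string;
  trigger: string;
  to: string;
}

interface Journey {
  initial: string;
  terminal: string[];
  transitions: Transition[];
}

// Returns the next state for a trigger, or throws if the journey does not
// define that move. The agent therefore cannot skip steps or invent new ones.
function nextState(journey: Journey, current: string, trigger: string): string {
  if (journey.terminal.indexOf(current) !== -1) {
    throw new Error(`"${current}" is terminal; no further transitions are allowed`);
  }
  const matches = journey.transitions.filter(
    (t) => t.from === current && t.trigger === trigger
  );
  if (matches.length === 0) {
    throw new Error(`No transition from "${current}" on trigger "${trigger}"`);
  }
  return matches[0].to;
}
```

With a journey that defines identify, then check-eligibility, then consent, a request to jump straight from identify to a payment state throws rather than succeeding.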
Every data-sharing grant the service requires: which fields are shared, where they come from, for what purpose, how long the grant lasts, and whether it is required or optional. Nothing is shared without the citizen's explicit consent.
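A data-sharing grant could be modelled along these lines. The ConsentGrant shape, the duration values, and both helper functions are hypothetical; they only illustrate the rule that nothing is shared without consent.

```typescript
// Hypothetical grant shape mirroring the properties listed above.
interface ConsentGrant {
  id: string;
  fields: string[];                     // which fields are shared
  source: string;                       // where they come from
  purpose: string;                      // why they are shared
  duration: "session" | "persistent";   // assumed duration values
  required: boolean;
}

// Required grants the citizen has not yet consented to.
// The service cannot proceed while this list is non-empty.
function missingRequiredGrants(grants: ConsentGrant[], consentedIds: string[]): ConsentGrant[] {
  return grants.filter((g) => g.required && consentedIds.indexOf(g.id) === -1);
}

// Fields that may be shared: only those covered by a grant the citizen accepted.
function shareableFields(grants: ConsentGrant[], consentedIds: string[]): string[] {
  const fields: string[] = [];
  for (const grant of grants) {
    if (consentedIds.indexOf(grant.id) === -1) continue;
    for (const field of grant.fields) {
      if (fields.indexOf(field) === -1) fields.push(field);
    }
  }
  return fields;
}
```

Starting from an empty list and adding only consented fields keeps the default at "share nothing", so a missing or refused grant can never leak data by omission.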
The editor presents each dimension as a structured form section. Product teams work through:
The identity dimension: service name, department, description, and input and output field definitions with types.
Eligibility rules with conditions, SLA, fees, availability, complaint URL, appeal process, ombudsman, escalation phone and hours.
State definitions with transitions, consent grants, delegation scope, card definitions for the citizen chat UI.
The editor also includes a card definitions section for mapping service states to visual cards displayed in the citizen chat, and an audit section for data retention period, data controller, and lawful basis.
The editor guides, but the decisions belong to the team. For each dimension:
Identity: What is the citizen-facing name? What data does the service need as input? What does it produce? What are the SLA commitments? Who handles complaints? These are facts your department already knows.
Eligibility: What are the hard rules? Who is not eligible and why? What happens when someone fails a rule: is there an alternative service, or should the agent hand off to a human? Write the failure messages in plain English.
Journey: Start with the happy path. Then ask: what if they are not eligible? What if there is a medical complication? What if they refuse consent? Each of these becomes a branch in the state model.
Data sharing: What data is shared, with whom, and for how long? Which grants are required and which are optional? What is the lawful basis? Can the citizen revoke consent after the fact?
For catalogue-only services that lack descriptions, a generate action triggers the LLM to scaffold eligibility rules, a journey model, and data sharing grants from the service's basic metadata. Generated descriptions are timestamped and marked as machine-generated for human review. This gets teams to a first draft in seconds rather than days — the real work is in refining the rules and edge cases.
The gap analysis view answers a single question for each department: how ready are your services for agent delivery?
The catalogue draws from the full GOV.UK services directory, so the gap is deliberately large. It represents the scale of the challenge, not a failure to deliver. The point is to make the gap visible and give departments a clear path to closing it.
Each department gets a summary card with a colour-coded progress bar showing the ratio of fully-described services to total services. Departments are colour-coded consistently across the platform.
Services in the gap table are sorted by priority. Demo-critical services appear first — those used in live department demonstrations. Then transactional services (applications, registrations, claims). Then reference material. Within each tier, services with complete descriptions appear above those with gaps.
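The ordering above amounts to a two-level sort: tier first, then completeness within the tier. A sketch under assumed names (GapRow, Tier, sortByPriority) follows; the final tie-break by name is an addition of this example, not something the studio is documented to do.

```typescript
// Hypothetical row shape for the gap table.
type Tier = "demo-critical" | "transactional" | "reference";

interface GapRow {
  name: string;
  tier: Tier;
  complete: boolean; // true when all four dimensions are described
}

const TIER_ORDER: Record<Tier, number> = {
  "demo-critical": 0,
  transactional: 1,
  reference: 2,
};

// Returns a new array sorted by tier, then complete-before-gaps, then name.
function sortByPriority(rows: GapRow[]): GapRow[] {
  return rows.slice().sort((a, b) => {
    const byTier = TIER_ORDER[a.tier] - TIER_ORDER[b.tier];
    if (byTier !== 0) return byTier;
    const byComplete = Number(b.complete) - Number(a.complete);
    if (byComplete !== 0) return byComplete;
    return a.name.localeCompare(b.name); // tie-break is an assumption
  });
}
```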
A service is agent-ready when all four dimensions are present and populated. The agent can read the identity to understand the service, evaluate eligibility rules deterministically, follow the journey state machine step by step, and request consent grants as specified in the data sharing model. If any one dimension is missing, the agent cannot do the corresponding job: without identity it cannot find the service, without eligibility it cannot determine who qualifies, without a journey it cannot guide the citizen through the service, and without data sharing it cannot share data lawfully.
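The readiness check can be expressed as a small function over the four dimensions. The sketch below assumes each dimension counts equally toward coverage and that "populated" means a non-empty object; the type and function names are hypothetical.

```typescript
// The four dimensions of a service description.
type Dimension = "identity" | "eligibility" | "journey" | "dataSharing";

const DIMENSIONS: Dimension[] = ["identity", "eligibility", "journey", "dataSharing"];

// Hypothetical description shape: each dimension is optional until authored.
type ServiceDescription = Partial<Record<Dimension, object>>;

// A dimension counts only if it exists and has at least one entry.
function isPopulated(value: object | undefined): boolean {
  return value !== undefined && Object.keys(value).length > 0;
}

// Dimensions that are absent or empty, in canonical order.
function missingDimensions(description: ServiceDescription): Dimension[] {
  return DIMENSIONS.filter((d) => !isPopulated(description[d]));
}

// Coverage as a percentage, assuming equal weight per dimension.
function coveragePercent(description: ServiceDescription): number {
  const present = DIMENSIONS.length - missingDimensions(description).length;
  return (present / DIMENSIONS.length) * 100;
}

function isAgentReady(description: ServiceDescription): boolean {
  return missingDimensions(description).length === 0;
}
```

Under these assumptions a description with only identity filled in scores 25%, and missingDimensions names exactly which three remain to be authored.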
Every agent interaction produces a trace — an append-only sequence of events stored in the evidence package. The evidence explorer lets department teams inspect exactly what happened during any citizen session.
The left panel lists trace sessions by ID and timestamp with event counts. Selecting a trace opens its events in the right panel, which offers two views.
Traces capture every decision point in the agent pipeline. Each event type is colour-coded for visual scanning.
Every agent action — from the initial LLM request through eligibility evaluation to consent grants and service completions — is captured as an immutable event. This is the evidential backbone that makes agentic government services auditable. No event can be deleted or modified after it is written.
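The append-only property can be illustrated in memory with a class that exposes no way to edit or remove events. This is a sketch only: the Trace class, the TraceEvent shape, and the event type names used in the usage note are hypothetical, and a real store would enforce immutability at the storage layer rather than with Object.freeze, which is shallow.

```typescript
// Hypothetical event shape; every property is read-only.
interface TraceEvent {
  readonly seq: number;
  readonly type: string;
  readonly timestamp: string;
  readonly payload: Readonly<Record<string, unknown>>;
}

class Trace {
  private readonly events: TraceEvent[] = [];

  // The only write operation: add a new frozen event to the end.
  append(type: string, payload: Record<string, unknown>, now: Date = new Date()): TraceEvent {
    const event: TraceEvent = Object.freeze({
      seq: this.events.length + 1,
      type,
      timestamp: now.toISOString(),
      payload: Object.freeze({ ...payload }), // shallow copy, then freeze
    });
    this.events.push(event);
    return event;
  }

  // Returns a copy, so callers cannot reorder or truncate the stored log.
  list(): ReadonlyArray<TraceEvent> {
    return this.events.slice();
  }

  // Events of the given types, for example errors and handoff requests.
  ofType(...types: string[]): TraceEvent[] {
    return this.events.filter((e) => types.indexOf(e.type) !== -1);
  }
}
```

A service owner looking for systemic issues could call something like trace.ofType("error", "handoff_requested") across sessions, assuming those are the event type names in use.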
This matters for complaints and oversight. If a citizen disputes what they were told, the trace shows exactly what the agent said, which policy rules were evaluated, what consent was granted, and what data was shared. If a service owner suspects a pattern of failures, they can filter traces by error events or handoff requests to identify systemic issues.
A DVLA product team wants to make driving licence renewal available through the agent. Here is how they go from a blank service to a published, agent-ready description.
The team opens the Legibility Studio and searches for "renew driving licence". The service already exists as a catalogue-only entry imported from the GOV.UK service graph — it has a name and department but no service description.
One click promotes the catalogue entry to a full service. This creates the directory structure in data/services/dvla-renew-driving-licence/ and opens the service editor. The coverage score shows 0% — no dimensions have been authored yet.
The team fills in the basics: service name as citizens should see it, responsible department (DVLA), input fields (driving licence number, address, photo), output fields (new licence reference, expected delivery date), SLA (three weeks), fee (£14 online), and redress routes (complaint URL, DVLA contact centre). Coverage rises to 25%.
Rather than starting from scratch on the remaining three dimensions, the team clicks generate. The LLM reads the identity metadata and scaffolds eligibility rules (age check, licence not revoked, not disqualified), a journey model (identify → check eligibility → consent → collect details → submit photo → payment → submitted → completed), and data sharing grants. Each generated item is marked as machine-generated.
The team reviews the generated rules. They add an over-70 edge case (different medical requirements), a revoked-licence handoff that routes to DVLA's licensing team, and plain-English failure messages for each rule. They remove a rule the LLM hallucinated about vehicle type.
The generated happy path is close, but the team adds branching: a medical-review state for over-70 applicants, a handed-off state for revoked licences, and a consent-refused terminal state. They verify that every branch has a clear terminal condition — no citizen can get stuck in a loop.
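The check the team performs by hand here could also be automated. The sketch below, with hypothetical names, reports every state from which no terminal state can be reached; such a state is either a dead end or part of a closed loop. A loop that has an exit toward a terminal state is not reported.

```typescript
// Only the endpoints matter for reachability, so triggers are omitted.
interface Transition {
  from: string;
  to: string;
}

// Returns the states that can never reach a terminal state.
// Works backwards from the terminal states until nothing new is marked.
function statesThatCannotFinish(
  states: string[],
  terminal: string[],
  transitions: Transition[]
): string[] {
  const canFinish: Record<string, boolean> = Object.create(null);
  for (const t of terminal) canFinish[t] = true;

  let changed = true;
  while (changed) {
    changed = false;
    for (const tr of transitions) {
      if (canFinish[tr.to] && !canFinish[tr.from]) {
        canFinish[tr.from] = true;
        changed = true;
      }
    }
  }
  return states.filter((s) => !canFinish[s]);
}
```

An empty result means every state has some path to a terminal state, which is the property the team is verifying for the medical-review, handed-off, and consent-refused branches.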
The team confirms that the photo retrieval grant (from HMPO) is correctly marked as required, and adds an optional grant for sharing the renewal outcome with DVLA's vehicle registration system. Duration is set to "session only" for the photo, "persistent" for the registration update.
Coverage now shows 100%. The gap analysis panel is empty. The team saves, and the service description is written to the local store. The agent can now find this service, evaluate eligibility, guide citizens through the journey, and request consent — all deterministically, all auditable.
Once published, the service appears in the gap analysis view with full coverage. The evidence explorer will begin capturing traces as citizens interact with the service through the agent. Operational metrics — completion rate, handoff rate, bottleneck states — appear in the service's ledger dashboard, giving the DVLA team a live view of how their service performs in practice.
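To make the three metrics concrete, here is one possible way to derive them from case records. The CaseRecord shape is hypothetical, and the definition of a bottleneck state as the state holding the most in-progress cases is an assumption of this sketch rather than the dashboard's documented formula.

```typescript
// Hypothetical case record as it might be read from the case ledger.
interface CaseRecord {
  id: string;
  state: string; // current journey state
  status: "completed" | "handed-off" | "in-progress";
}

interface ServiceMetrics {
  completionRate: number;         // completed cases / all cases
  handoffRate: number;            // handed-off cases / all cases
  bottleneckState: string | null; // state with the most in-progress cases
}

function summarise(cases: CaseRecord[]): ServiceMetrics {
  let completed = 0;
  let handedOff = 0;
  const waiting: Record<string, number> = Object.create(null);

  for (const c of cases) {
    if (c.status === "completed") completed++;
    else if (c.status === "handed-off") handedOff++;
    else waiting[c.state] = (waiting[c.state] || 0) + 1;
  }

  let bottleneckState: string | null = null;
  let max = 0;
  for (const state of Object.keys(waiting)) {
    if (waiting[state] > max) {
      max = waiting[state];
      bottleneckState = state;
    }
  }

  const total = cases.length;
  return {
    completionRate: total ? completed / total : 0,
    handoffRate: total ? handedOff / total : 0,
    bottleneckState,
  };
}
```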
For the full field-level specification of each dimension, see the technical architecture. For how these descriptions are consumed by the citizen app, see the citizen app deep dive.