Deep dive

The Legibility Studio

Where departments make their services ready for agents. Author, audit, and measure — one platform for service legibility.

For an AI agent to act on a citizen's behalf, it needs more than an API. It needs to understand what a service does, who is eligible, what steps are involved, and how data is shared. The Legibility Studio is where departments publish that understanding as structured, machine-readable service descriptions — and where they monitor how agents are using them.

01

How the studio works

The Legibility Studio has three capabilities, each addressing a different question that department teams need answered.

Author

Create and edit the four dimensions of a service description — identity, eligibility, journey, and data sharing — through a structured editor with LLM-assisted generation for teams starting from scratch.

Audit

Trace every agent interaction back to the policy rule, state transition, and consent grant that governed it. Browse sessions, replay journeys, inspect every decision the agent made on a citizen's behalf.

Measure

Track service description coverage across your department. See which services are agent-ready, which have gaps, and where to focus next. Operational metrics show how services perform once live.

The studio is a Next.js application that reads service definitions from local stores and fetches operational data (traces, case ledger) from the citizen app API via HTTP. It never accesses the evidence database directly — this architectural boundary keeps native module dependencies cleanly separated.

Legibility Studio: services view

Navigation: Dashboard, Services, Evidence, Gap analysis, Personas

Summary: 1,653 services, 113 described, 7% coverage, 8 departments

Table columns: Service, Dept, Id, El, Jo, Da, Source. Sample rows (service, department, source):

  • Tell HMRC about a death, HMRC, Full
  • Apply for Carer's Allowance, DWP, Full
  • Renew driving licence, DVLA, Catalogue
  • Register a death, HMCTS, Full

The table columns — Id, El, Jo, Da — show at a glance which of the four dimensions each service has: identity, eligibility, journey, and data sharing. A service with all four ticks is agent-ready. Anything less shows exactly what is missing.

02

Service catalogue and typologies

The catalogue holds every government service the system knows about. Services arrive from two sources and are classified into eight interaction typologies.

Two source types

The distinction matters because the two types represent very different levels of readiness.

Full services

  • Complete service descriptions with all four dimensions authored
  • Stored in the service registry and published to all channels
  • Used in agent orchestration and audit
  • Editable through the studio's structured form editor
  • Actions: detail view, edit, view ledger, delete

Catalogue-only services

  • Basic metadata only — name, department, life event mapping
  • Imported from the GOV.UK service graph
  • Can be promoted to full services with one click
  • LLM generation can scaffold service descriptions automatically
  • Represents the long tail of 1,500+ government services

Eight service typologies

Every service is classified by interaction pattern. The typology breakdown helps designers identify under-served categories and ensures the system handles every kind of government service.

Benefit, Entitlement, Obligation, Registration, Application, Document, Legal process, Grant

A benefit service pays money to a citizen (Universal Credit, Carer's Allowance). An obligation requires the citizen to do something (file a tax return, register a death). An application is a one-off request with a decision (apply for a passport, request planning permission). Typologies affect how the agent frames its conversation and which state machine patterns apply.
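One way to produce a typology breakdown is to tally catalogue entries against the eight categories listed above. The sketch below is illustrative only: the `CatalogueEntry` shape, the lowercase typology identifiers, and the `typologyBreakdown` helper are assumptions, not the studio's actual schema.

```typescript
// The eight typologies named in this section (identifier spelling is assumed).
const TYPOLOGIES = [
  "benefit", "entitlement", "obligation", "registration",
  "application", "document", "legal-process", "grant",
] as const;
type Typology = (typeof TYPOLOGIES)[number];

// Hypothetical minimal catalogue entry.
interface CatalogueEntry {
  serviceName: string;
  typology: Typology;
}

// Count services per typology. Every typology starts at zero so that
// categories with no services still appear in the result.
function typologyBreakdown(entries: CatalogueEntry[]): Record<Typology, number> {
  const counts = {} as Record<Typology, number>;
  for (const t of TYPOLOGIES) {
    counts[t] = 0;
  }
  for (const entry of entries) {
    counts[entry.typology] += 1;
  }
  return counts;
}

const sampleEntries: CatalogueEntry[] = [
  { serviceName: "Apply for Carer's Allowance", typology: "benefit" },
  { serviceName: "Register a death", typology: "obligation" },
  { serviceName: "Apply for a passport", typology: "application" },
];
const typologyCounts = typologyBreakdown(sampleEntries);
```

Keeping zero counts in the result is a deliberate choice here: an empty category is exactly the signal a designer looking for under-served typologies would want to see.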

Filtering and search

The services table supports filtering by source (full / catalogue / all), typology, department, life event, and free-text search. Services are split into two sections: promoted agent services (highlighted at the top) and the full catalogue table below. This lets department teams focus on the services they are actively working on while keeping the full landscape visible.
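The filter combination described above can be sketched as a single predicate in which every unset filter matches everything. The `ServiceRow` and `ServiceFilter` shapes and the `filterServices` function are hypothetical, and the typology values in the sample rows are illustrative guesses; free-text search is assumed to match the service name only.

```typescript
type SourceType = "full" | "catalogue";

// Hypothetical row shape for the services table.
interface ServiceRow {
  serviceName: string;
  department: string;
  source: SourceType;
  typology: string;
  lifeEvent?: string;
}

// Every filter is optional; an unset filter places no constraint on the row.
interface ServiceFilter {
  source?: SourceType | "all";
  typology?: string;
  department?: string;
  lifeEvent?: string;
  query?: string;
}

function filterServices(rows: ServiceRow[], f: ServiceFilter): ServiceRow[] {
  const q = (f.query ?? "").trim().toLowerCase();
  return rows.filter((row) =>
    (f.source === undefined || f.source === "all" || row.source === f.source) &&
    (f.typology === undefined || row.typology === f.typology) &&
    (f.department === undefined || row.department === f.department) &&
    (f.lifeEvent === undefined || row.lifeEvent === f.lifeEvent) &&
    // Case-insensitive substring match on the service name.
    (q === "" || row.serviceName.toLowerCase().indexOf(q) !== -1)
  );
}

// Sample rows taken from the services view; typology values are illustrative.
const sampleRows: ServiceRow[] = [
  { serviceName: "Tell HMRC about a death", department: "HMRC", source: "full", typology: "obligation" },
  { serviceName: "Apply for Carer's Allowance", department: "DWP", source: "full", typology: "benefit" },
  { serviceName: "Renew driving licence", department: "DVLA", source: "catalogue", typology: "application" },
  { serviceName: "Register a death", department: "HMCTS", source: "full", typology: "obligation" },
];

const fullDeathServices = filterServices(sampleRows, { source: "full", query: "death" });
```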

03

The service editor

The editor is where departments author the four dimensions of a service description. It guides teams through each dimension with structured forms, validation, and contextual help.

Four dimensions of a service description

A complete description of a transactional government service has four dimensions. Together they form a complete contract between a department and any authorised agent.

Identity

What the service does, what data it needs, what it produces, and who to contact when something goes wrong. This is the service's public contract — its name, department, description, input and output fields, SLA, fees, availability, and redress routes.

Includes: service name, department, description, input fields with types, output fields, SLA, fees, availability hours, complaint URL, appeal process, ombudsman details

Eligibility

Structured rules that determine who can use the service. Each rule operates on a single data field using a simple operator. If a rule fails, the citizen sees a plain-English explanation and, where possible, a redirect to an alternative service.

Includes: eligibility rules with conditions, operators, failure reasons, alternative services, edge cases, handoff triggers
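A rule that "operates on a single data field using a simple operator" might look like the sketch below. The operator names, the `EligibilityRule` shape, the field names, and the age threshold are all assumptions made for illustration; they are not the studio's rule schema or DVLA policy.

```typescript
type Operator = "eq" | "neq" | "gte" | "lte" | "in";
type FieldValue = string | number | boolean;

// Hypothetical rule shape: one field, one operator, one expected value.
interface EligibilityRule {
  id: string;
  field: string;
  operator: Operator;
  value: FieldValue | FieldValue[];
  failureReason: string;       // plain-English message shown to the citizen
  alternativeService?: string; // optional redirect when the rule fails
}

interface RuleFailure {
  ruleId: string;
  reason: string;
  alternativeService: string | null;
}

function passes(rule: EligibilityRule, actual: FieldValue | undefined): boolean {
  if (actual === undefined) return false; // missing data never passes a rule
  const expected = rule.value;
  switch (rule.operator) {
    case "eq": return actual === expected;
    case "neq": return actual !== expected;
    case "gte": return typeof actual === "number" && typeof expected === "number" && actual >= expected;
    case "lte": return typeof actual === "number" && typeof expected === "number" && actual <= expected;
    case "in": return Array.isArray(expected) && expected.indexOf(actual) !== -1;
    default: return false;
  }
}

// Evaluate every rule and collect the failures with their explanations.
function evaluateEligibility(
  rules: EligibilityRule[],
  data: Record<string, FieldValue | undefined>,
): RuleFailure[] {
  return rules
    .filter((rule) => !passes(rule, data[rule.field]))
    .map((rule) => ({
      ruleId: rule.id,
      reason: rule.failureReason,
      alternativeService: rule.alternativeService ?? null,
    }));
}

// Illustrative rules only.
const renewalRules: EligibilityRule[] = [
  { id: "min-age", field: "age", operator: "gte", value: 16, failureReason: "You are too young to hold this licence." },
  { id: "not-revoked", field: "licenceRevoked", operator: "eq", value: false, failureReason: "Your licence has been revoked, so it cannot be renewed online." },
  { id: "not-disqualified", field: "disqualified", operator: "eq", value: false, failureReason: "You are currently disqualified from driving." },
];

const renewalFailures = evaluateEligibility(renewalRules, { age: 45, licenceRevoked: true, disqualified: false });
```

Because each rule is plain data and the evaluator has no other inputs, the same citizen data always produces the same list of failures, which is what makes the outcome explainable after the fact.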

Journey

A state machine defining the lifecycle of a service interaction: states, transitions with triggers, and terminal conditions. The agent follows this path deterministically — it cannot skip steps or invent new ones.

Includes: states, transitions, triggers, terminal states, branching logic for eligibility failures, consent refusals, and edge cases
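To illustrate what following the path deterministically could mean in practice, the sketch below looks up exactly one transition for a given state and trigger and rejects anything else. The `JourneyModel` shape, the state names, and the trigger names are hypothetical.

```typescript
interface JourneyTransition {
  from: string;
  trigger: string;
  to: string;
}

// Hypothetical journey model: an initial state, terminal states, transitions.
interface JourneyModel {
  initial: string;
  terminal: string[];
  transitions: JourneyTransition[];
}

// Return the next state, or throw if the move is not defined in the model.
// Exactly one matching transition is required, so the agent can neither
// skip a step nor invent a new one.
function nextState(model: JourneyModel, current: string, trigger: string): string {
  if (model.terminal.indexOf(current) !== -1) {
    throw new Error(`State "${current}" is terminal; no further transitions are allowed`);
  }
  const matches = model.transitions.filter((t) => t.from === current && t.trigger === trigger);
  const only = matches[0];
  if (matches.length !== 1 || only === undefined) {
    throw new Error(`No unique transition from "${current}" on "${trigger}"`);
  }
  return only.to;
}

const sampleJourney: JourneyModel = {
  initial: "identify",
  terminal: ["completed", "consent-refused", "handed-off"],
  transitions: [
    { from: "identify", trigger: "identified", to: "check-eligibility" },
    { from: "check-eligibility", trigger: "eligible", to: "consent" },
    { from: "check-eligibility", trigger: "ineligible", to: "handed-off" },
    { from: "consent", trigger: "granted", to: "submitted" },
    { from: "consent", trigger: "refused", to: "consent-refused" },
    { from: "submitted", trigger: "processed", to: "completed" },
  ],
};

const afterRefusal = nextState(sampleJourney, "consent", "refused");
```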

Data sharing

Every data-sharing grant the service requires: which fields are shared, where they come from, for what purpose, how long the grant lasts, and whether it is required or optional. Nothing is shared without the citizen's explicit consent.

Includes: grant ID, description, fields shared, source, purpose, duration, required/optional flag, delegation scope, revocability
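As a rough illustration of the grant model, the sketch below represents grants as data and derives two things from the citizen's consent decisions: which required grants are still outstanding, and which fields may be shared. The `ConsentGrant` shape, the grant IDs, and both helper functions are assumptions rather than the studio's schema.

```typescript
// Hypothetical grant shape based on the fields listed above.
interface ConsentGrant {
  id: string;
  description: string;
  fields: string[];
  source: string;
  purpose: string;
  duration: "session only" | "persistent";
  required: boolean;
  revocable: boolean;
}

// Required grants the citizen has not yet consented to.
function missingRequiredGrants(grants: ConsentGrant[], consentedIds: string[]): string[] {
  return grants
    .filter((g) => g.required && consentedIds.indexOf(g.id) === -1)
    .map((g) => g.id);
}

// Only fields covered by a consented grant may be shared.
function shareableFields(grants: ConsentGrant[], consentedIds: string[]): string[] {
  const fields: string[] = [];
  for (const g of grants) {
    if (consentedIds.indexOf(g.id) === -1) continue;
    for (const field of g.fields) {
      if (fields.indexOf(field) === -1) fields.push(field);
    }
  }
  return fields;
}

// Illustrative grants for a licence renewal.
const renewalGrants: ConsentGrant[] = [
  {
    id: "photo-retrieval", description: "Retrieve the existing passport photo",
    fields: ["photo"], source: "HMPO", purpose: "Reuse photo on the new licence",
    duration: "session only", required: true, revocable: true,
  },
  {
    id: "registration-update", description: "Share the renewal outcome",
    fields: ["renewalOutcome"], source: "DVLA", purpose: "Update vehicle registration records",
    duration: "persistent", required: false, revocable: true,
  },
];

const outstanding = missingRequiredGrants(renewalGrants, []);
const sharedAfterPhotoConsent = shareableFields(renewalGrants, ["photo-retrieval"]);
```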

What the editor covers

The editor presents each dimension as a structured form section. Product teams work through:

Core and schema

Service name, department, description, input and output field definitions with types. The identity dimension.

Rules and redress

Eligibility rules with conditions, SLA, fees, availability, complaint URL, appeal process, ombudsman, escalation phone and hours.

Journey and consent

State definitions with transitions, consent grants, delegation scope, card definitions for the citizen chat UI.

Card definitions map service states to the visual cards displayed in the citizen chat. A separate audit section records the data retention period, data controller, and lawful basis.

What product teams need to decide

The editor guides, but the decisions belong to the team. For each dimension:

1

Identity

What is the citizen-facing name? What data does the service need as input? What does it produce? What are the SLA commitments? Who handles complaints? These are facts your department already knows.

2

Eligibility

What are the hard rules? Who is not eligible and why? What happens when someone fails a rule — is there an alternative service, or should the agent hand off to a human? Write the failure messages in plain English.

3

Journey

Start with the happy path. Then ask: what if they are not eligible? What if there is a medical complication? What if they refuse consent? Each of these becomes a branch in the state model.

4

Data sharing

What data is shared, with whom, and for how long? Which grants are required and which are optional? What is the lawful basis? Can the citizen revoke consent after the fact?

LLM-assisted generation

For catalogue-only services that lack descriptions, a generate action triggers the LLM to scaffold eligibility rules, a journey model, and data sharing grants from the service's basic metadata. Generated descriptions are timestamped and marked as machine-generated for human review. This gets teams to a first draft in seconds rather than days — the real work is in refining the rules and edge cases.

Completeness scoring. The service detail page shows a percentage score based on which dimensions are present and how thoroughly they are populated. A gap-analysis panel lists exactly which dimensions or fields are missing, so teams always know what remains to be done.
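A simplified version of the score can be sketched by counting which dimensions are present. This is an assumption-laden sketch: it gives each of the four dimensions equal weight and ignores how thoroughly each one is populated, which the real score is said to take into account. The `ServiceDescription` type and `completeness` function are hypothetical.

```typescript
const DIMENSIONS = ["identity", "eligibility", "journey", "dataSharing"] as const;
type Dimension = (typeof DIMENSIONS)[number];

// Hypothetical description shape: each dimension is either present or absent.
type ServiceDescription = Partial<Record<Dimension, unknown>>;

interface CompletenessReport {
  score: number;        // percentage, 0 to 100
  missing: Dimension[]; // dimensions still to be authored
}

// Equal-weight score: each present dimension contributes 25%.
function completeness(description: ServiceDescription): CompletenessReport {
  const missing = DIMENSIONS.filter(
    (d) => description[d] === undefined || description[d] === null,
  );
  const present = DIMENSIONS.length - missing.length;
  return { score: Math.round((present / DIMENSIONS.length) * 100), missing };
}

const identityOnly = completeness({ identity: { serviceName: "Renew driving licence" } });
```

Under these assumptions a newly promoted service scores 0%, a service with only its identity authored scores 25%, and a service with all four dimensions scores 100% with an empty gap list.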

04

Gap analysis and coverage

The gap analysis view answers a single question for each department: how ready are your services for agent delivery?

1,653
Services catalogued
113
With full descriptions
7%
Agent coverage
1,540
Need descriptions

The catalogue draws from the full GOV.UK services directory, so the gap is deliberately large. It represents the scale of the challenge, not a failure to deliver. The point is to make the gap visible and give departments a clear path to closing it.
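The headline figures follow from simple arithmetic on the two counts. The `coverageSummary` helper below is a hypothetical name used only to show the calculation; rounding to the nearest whole percent is assumed.

```typescript
interface CoverageSummary {
  percent: number;   // described services as a whole-number percentage
  remaining: number; // services that still need descriptions
}

function coverageSummary(total: number, described: number): CoverageSummary {
  if (total <= 0) return { percent: 0, remaining: 0 };
  return {
    percent: Math.round((described / total) * 100),
    remaining: total - described,
  };
}

// 113 of 1,653 services is about 6.8%, which rounds to 7%.
const catalogueCoverage = coverageSummary(1653, 113);
```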

Department coverage cards

Each department gets a summary card with a colour-coded progress bar showing the ratio of fully-described services to total services. Departments are colour-coded consistently across the platform.

  • HMRC: 27 services, fully described
  • DWP: 21 services, 78% coverage
  • Home Office: 6 services, early progress
  • MoJ: 12 services, 45% coverage
  • DVLA: 16 services, 60% coverage
  • Cabinet Office: 9 services, 35% coverage

Priority sorting

Services in the gap table are sorted by priority tier. Demo-critical services, those used in live department demonstrations, appear first. Transactional services (applications, registrations, claims) come next, followed by reference material. Within each tier, services with complete descriptions appear above those with gaps.
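The ordering can be expressed as a two-key comparator: tier first, then completeness. The `GapRow` shape, the tier identifiers, and the tier assigned to each sample service are assumptions for illustration.

```typescript
type PriorityTier = "demo-critical" | "transactional" | "reference";

// Hypothetical row shape for the gap table.
interface GapRow {
  serviceName: string;
  tier: PriorityTier;
  complete: boolean;
}

const TIER_ORDER: Record<PriorityTier, number> = {
  "demo-critical": 0,
  "transactional": 1,
  "reference": 2,
};

// Sort a copy: lower tier number first, then complete before incomplete.
// Remaining ties keep their input order on engines with a stable sort.
function sortByPriority(rows: GapRow[]): GapRow[] {
  return rows.slice().sort(
    (a, b) =>
      TIER_ORDER[a.tier] - TIER_ORDER[b.tier] ||
      Number(b.complete) - Number(a.complete),
  );
}

// Tier assignments below are illustrative.
const sortedGapRows = sortByPriority([
  { serviceName: "Reference guide", tier: "reference", complete: true },
  { serviceName: "Apply for Carer's Allowance", tier: "transactional", complete: false },
  { serviceName: "Renew driving licence", tier: "demo-critical", complete: false },
  { serviceName: "Register a death", tier: "demo-critical", complete: true },
  { serviceName: "Tell HMRC about a death", tier: "transactional", complete: true },
]);
```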

What "agent-ready" means

A service is agent-ready when all four dimensions are present and populated. The agent can read the identity to understand the service, evaluate eligibility rules deterministically, follow the journey state machine step by step, and request consent grants as specified in the data sharing model. Without any one of these, the agent either cannot find the service, cannot determine eligibility, cannot guide the citizen through it, or cannot share data lawfully.

The gap is the opportunity. 93% of government services are not yet described for agents. The studio makes this visible by design — not as criticism, but as a roadmap for departments to follow.

05

Evidence explorer and replay

Every agent interaction produces a trace — an append-only sequence of events stored in the evidence package. The evidence explorer lets department teams inspect exactly what happened during any citizen session.

Two-panel layout

The left panel lists trace sessions by ID and timestamp with event counts. Selecting a trace opens its events in the right panel, which offers two views.

Explorer

  • Filter events by type with colour-coded badges
  • Expand any event to inspect the full JSON payload
  • Summary bar shows event count, receipt count, and type breakdown
  • Search across all traces for specific services, users, or error types

Replay

  • Step through events in the order they occurred
  • See the full chain of agent reasoning, decision by decision
  • Identify exactly where a journey diverged from the expected path
  • Useful for debugging service flows and consent sequences
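Finding where a journey diverged from the expected path amounts to comparing two ordered lists of states. The sketch below assumes the visited states have already been extracted from a trace's state.transition events; the `firstDivergence` helper and the hyphenated state names are hypothetical.

```typescript
// Return the index of the first step where the observed path differs from
// the expected one, or -1 if the two paths are identical. A path that simply
// stops early counts as diverging at the point where it stops.
function firstDivergence(expected: string[], observed: string[]): number {
  const shared = Math.min(expected.length, observed.length);
  for (let i = 0; i < shared; i++) {
    if (expected[i] !== observed[i]) return i;
  }
  return expected.length === observed.length ? -1 : shared;
}

// Happy path for a licence renewal, using illustrative state names.
const happyPath = [
  "identify", "check-eligibility", "consent", "collect-details",
  "submit-photo", "payment", "submitted", "completed",
];

// A session in which the citizen refused consent.
const observedPath = ["identify", "check-eligibility", "consent", "consent-refused"];

const divergedAt = firstDivergence(happyPath, observedPath);
```

Here `divergedAt` is 3: the session followed the happy path through the consent step and then branched to `consent-refused` instead of `collect-details`.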

Event types

Traces capture every decision point in the agent pipeline. Each event type is colour-coded for visual scanning.

llm.request, capability.invoked, policy.evaluated, consent.requested, consent.granted, credential.verified, receipt.issued, state.transition, handoff.requested, error

The evidence plane

Every agent action — from the initial LLM request through eligibility evaluation to consent grants and service completions — is captured as an immutable event. This is the evidential backbone that makes agentic government services auditable. No event can be deleted or modified after it is written.

This matters for complaints and oversight. If a citizen disputes what they were told, the trace shows exactly what the agent said, which policy rules were evaluated, what consent was granted, and what data was shared. If a service owner suspects a pattern of failures, they can filter traces by error events or handoff requests to identify systemic issues.
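The sketch below shows one way an append-only trace with type filtering could be modelled in memory. It is not the evidence store's implementation: the `TraceEvent` fields and the `TraceLog` class are assumptions, freezing is shallow, and cryptographic receipts are out of scope here.

```typescript
type TraceEventType =
  | "llm.request" | "capability.invoked" | "policy.evaluated"
  | "consent.requested" | "consent.granted" | "credential.verified"
  | "receipt.issued" | "state.transition" | "handoff.requested" | "error";

// Hypothetical event shape.
interface TraceEvent {
  traceId: string;
  type: TraceEventType;
  timestamp: string; // ISO 8601
  payload: Record<string, unknown>;
}

// Append-only log: events can be added and read, never updated or removed.
class TraceLog {
  private readonly events: TraceEvent[] = [];

  append(event: TraceEvent): void {
    // Store a frozen copy so later changes to the caller's object
    // cannot alter what was logged (shallow freeze only).
    this.events.push(Object.freeze({ ...event }));
  }

  // Events matching any of the given types, in the order they were written.
  ofTypes(...types: TraceEventType[]): TraceEvent[] {
    return this.events.filter((e) => types.indexOf(e.type) !== -1);
  }

  get length(): number {
    return this.events.length;
  }
}

const sampleLog = new TraceLog();
sampleLog.append({ traceId: "t-1", type: "llm.request", timestamp: "2025-01-01T10:00:00Z", payload: {} });
sampleLog.append({ traceId: "t-1", type: "policy.evaluated", timestamp: "2025-01-01T10:00:01Z", payload: { ruleId: "not-revoked", passed: false } });
sampleLog.append({ traceId: "t-1", type: "handoff.requested", timestamp: "2025-01-01T10:00:02Z", payload: { reason: "revoked licence" } });
sampleLog.append({ traceId: "t-2", type: "error", timestamp: "2025-01-01T11:00:00Z", payload: { message: "upstream timeout" } });

// The kind of query a service owner might run to look for systemic issues.
const problemEvents = sampleLog.ofTypes("error", "handoff.requested");
```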

Traceability principle. Every action the agent takes on a citizen's behalf is logged with a cryptographic receipt. The citizen, the department, and any oversight body can independently verify what happened, when, and why. See the technical architecture for the evidence schema and receipt format.

06

Worked example: publishing a service

A DVLA product team wants to make driving licence renewal available through the agent. Here is how they go from a catalogue-only entry to a published, agent-ready description.

1

Find the service in the catalogue

The team opens the Legibility Studio and searches for "renew driving licence". The service already exists as a catalogue-only entry imported from the GOV.UK service graph — it has a name and department but no service description.

2

Promote to a full service

One click promotes the catalogue entry to a full service. This creates the directory structure in data/services/dvla-renew-driving-licence/ and opens the service editor. The coverage score shows 0% — no dimensions have been authored yet.

3

Author the identity dimension

The team fills in the basics: service name as citizens should see it, responsible department (DVLA), input fields (driving licence number, address, photo), output fields (new licence reference, expected delivery date), SLA (three weeks), fee (£14 online), and redress routes (complaint URL, DVLA contact centre). Coverage rises to 25%.

4

Generate a first draft with LLM

Rather than starting from scratch on the remaining three dimensions, the team clicks generate. The LLM reads the identity metadata and scaffolds eligibility rules (age check, licence not revoked, not disqualified), a journey model (identify → check eligibility → consent → collect details → submit photo → payment → submitted → completed), and data sharing grants. Each generated item is marked as machine-generated.

5

Refine the eligibility rules

The team reviews the generated rules. They add an over-70 edge case (different medical requirements), a revoked-licence handoff that routes to DVLA's licensing team, and plain-English failure messages for each rule. They remove a rule the LLM hallucinated about vehicle type.

6

Refine the journey model

The generated happy path is close, but the team adds branching: a medical-review state for over-70 applicants, a handed-off state for revoked licences, and a consent-refused terminal state. They verify that every branch has a clear terminal condition — no citizen can get stuck in a loop.
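The check that no citizen can get stuck can be approximated mechanically: every state should have at least one path to a terminal state. The sketch below works backwards from the terminal states and reports any state that cannot reach one. The `JourneyEdge` shape, the function name, and the state names are hypothetical.

```typescript
interface JourneyEdge {
  from: string;
  to: string;
}

// Backward reachability: start from the terminal states and repeatedly mark
// any state that has a transition into an already marked state. States left
// unmarked have no route to completion.
function statesThatCannotTerminate(
  states: string[],
  edges: JourneyEdge[],
  terminal: string[],
): string[] {
  const canFinish: Record<string, boolean> = {};
  for (const t of terminal) canFinish[t] = true;

  let changed = true;
  while (changed) {
    changed = false;
    for (const edge of edges) {
      if (canFinish[edge.to] && !canFinish[edge.from]) {
        canFinish[edge.from] = true;
        changed = true;
      }
    }
  }
  return states.filter((s) => !canFinish[s]);
}

const renewalStates = ["identify", "check-eligibility", "medical-review", "consent", "submitted", "completed", "consent-refused", "handed-off"];
const renewalTerminal = ["completed", "consent-refused", "handed-off"];

// A draft in which medical-review only loops back to itself.
const draftEdges: JourneyEdge[] = [
  { from: "identify", to: "check-eligibility" },
  { from: "check-eligibility", to: "consent" },
  { from: "check-eligibility", to: "medical-review" },
  { from: "check-eligibility", to: "handed-off" },
  { from: "medical-review", to: "medical-review" },
  { from: "consent", to: "submitted" },
  { from: "consent", to: "consent-refused" },
  { from: "submitted", to: "completed" },
];

const stuckStates = statesThatCannotTerminate(renewalStates, draftEdges, renewalTerminal);
```

In this draft `stuckStates` contains only `medical-review`; adding a transition from `medical-review` to `consent` would clear it. Note that this check guarantees an exit exists from every state, not that cycles are absent.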

7

Refine data sharing

The team confirms that the photo retrieval grant (from HMPO) is correctly marked as required, and adds an optional grant for sharing the renewal outcome with DVLA's vehicle registration system. Duration is set to "session only" for the photo, "persistent" for the registration update.

8

Review and publish

Coverage now shows 100%. The gap analysis panel is empty. The team saves, and the service description is written to the local store. The agent can now find this service, evaluate eligibility, guide citizens through the journey, and request consent — all deterministically, all auditable.

Time to publish. The entire process — from finding the service in the catalogue to a fully-described, agent-ready service — takes a product team approximately two hours. Most of that time is spent on edge cases and failure messages, not on the happy path. The LLM-assisted generation handles the scaffolding; the team's expertise goes into the decisions that matter.

Once published, the service appears in the gap analysis view with full coverage. The evidence explorer will begin capturing traces as citizens interact with the service through the agent. Operational metrics — completion rate, handoff rate, bottleneck states — appear in the service's ledger dashboard, giving the DVLA team a live view of how their service performs in practice.

For the full field-level specification of each dimension, see the technical architecture. For how these descriptions are consumed by the citizen app, see the citizen app deep dive.