Deep dive

Technical architecture

System architecture, runtime pipeline, service description schemas, and integration with existing government infrastructure.

01

System architecture

The stack is organised as a set of focused packages with strict dependency boundaries. Every package owns one concern; cross-cutting behaviour is mediated through well-typed interfaces.

Three hard architectural rules

These rules are non-negotiable.

  1. LLM integration is confined to the LLM Adapter Layer only. Zero direct LLM SDK usage elsewhere in the system. All LLM calls route through the adapter interface.
  2. Model Context Protocol (MCP) usage is split in two: client usage lives in the LLM Adapter Layer, and server usage lives in the MCP Server. No other component touches MCP directly.
  3. The Legibility Studio fetches evidence via HTTP. It does not import the Evidence Plane directly. This enforces a clean separation between the department-facing and citizen-facing surfaces.

Package dependency graph

The dependency tree is rooted at the shared schema layer, which has zero internal dependencies. All other components declare their dependencies explicitly. External SDK usage is confined to two components.

Shared Schema Layer (leaf, no internal dependencies)
LLM Adapter Layer
  + LLM provider SDK
  + MCP client
Evidence Plane
Identity Layer
Service Graph
Legibility Engine
  + Service Store, Evidence Plane
Personal Data Layer
  + Identity Layer, Evidence Plane
Service Store
  + Service Graph, Evidence Plane
Orchestration Layer
  + Legibility Engine
MCP Server
  + Legibility Engine
  + MCP server SDK

Two applications

Citizen app

The citizen-facing experience. Hosts the conversational agent, task cards, consent flows, wallet, and receipt history. Exposes an evidence API that the Legibility Studio reads from.

localhost:3106

Legibility Studio

The department-facing admin tool. Displays service coverage, gap analysis, trace inspection, and service description authoring. Consumes evidence over HTTP — never imports the Evidence Plane directly.

localhost:3101

The two applications depend on shared packages but never import each other's code. The citizen app owns the evidence store; the studio reads from it over HTTP. This boundary is enforced at the dependency level: adding @als/evidence to the studio causes native module crashes by design.

Component responsibility summary

Component | Owns | Must never
schemas | Shared types, schema validation, type guards | Import any other component
adapters | LLM calls, MCP client, tool dispatch | Make policy or state decisions
evidence | Trace events, receipts, case ledger, replay | Import LLM integration or make API calls
legibility | PolicyEvaluator, StateMachine, ConsentManager, FieldCollector | Call the LLM or make network requests
runtime | Orchestrator, pipeline, strategy dispatch, CapabilityInvoker | Import LLM integration directly (must go through adapter)
personal-data | Citizen data model, field sourcing, consent preferences | Bypass consent checks
mcp-server | MCP server, tool generation, resource exposure | Use MCP client (that belongs to the adapter)
service-store | Service storage, tiered resolution, gap analysis | Make eligibility decisions (that belongs to legibility)

02

The runtime pipeline

Every citizen interaction passes through this 14-step orchestrator pipeline. Deterministic code runs before and after the language model; the platform always has the final word.

1. Policy evaluation (deterministic)
   Evaluates the service's eligibility rules against citizen data via PolicyEvaluator. Returns eligible/ineligible with reasons. Rules are pure data; no code is executed.
2. State machine setup (deterministic)
   Creates StateMachine from the service's journey definition. Restores current state from client session. Computes allowed transitions.
3. Field collector (deterministic)
   Seeds FieldCollector from the identity schema and citizen data. Tracks required versus collected fields and identifies what is still missing.
4. Agent selection (deterministic)
   Selects triage mode (no service context, need identification) or journey mode (service in progress, service-description-driven). The choice determines prompt composition and available tools.
5. Strategy context and system prompt (deterministic)
   Calls ServiceStrategy.buildServiceContext(). Assembles the system prompt from layered fragments: personality, persona style, scenario, service descriptions, guardrails, and output format.
6. Build tools (deterministic)
   Calls ServiceStrategy.buildTools(). The deterministic strategy returns an empty array; the tool-based strategy returns service-specific check and advance tools.
7. Agentic loop (AI)
   Up to 5 iterations: call the language model, dispatch any tool calls via the adapter, accumulate results. This is the only step where the LLM generates content. The loop terminates when no tool calls remain or the iteration limit is reached.
8. Parse structured output (deterministic)
   Extracts the JSON block from the LLM response. Parses conversation title, tasks, proposed state transitions, and extracted facts. Malformed output is caught and logged.
9. Build tasks (deterministic)
   Transforms parsed task descriptions into typed TaskEntry objects with structured fields: type, title, description, options, metadata.
10. State transitions (deterministic)
   Validates the LLM's proposed transition against the StateMachine. Rejects illegal transitions. Applies forced and auto transitions defined in the journey.
11. Deterministic task injection (deterministic)
   Overrides LLM-generated tasks with platform-generated cards at specific states. Consent cards, payment forms, and document upload prompts are always rendered by the platform, never by the LLM.
12. Consent requests (deterministic)
   Surfaces consent grant cards at the eligibility-checked state. Required grants must be accepted to proceed; optional grants can be declined without blocking progress.
13. Handoff detection (deterministic)
   Evaluates safeguarding triggers, edge-case conditions, and escalation rules. When a handoff is triggered, the conversation is routed to a human adviser with full context preserved.
14. Version metadata and trace (platform)
   Hashes the system prompt, stamps ruleset and journey versions, builds the full pipeline trace with per-step timing. The trace is written to the evidence store for audit and replay.

The LLM runs at step 7 out of 14. Everything before it is deterministic setup; everything after it is validation and override. The language model proposes; the platform disposes. This is the central architectural guarantee.
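
The bounded loop at step 7 can be sketched as follows. This is a minimal illustration, not the actual orchestrator: the ToolCall and ModelTurn shapes and the callModel and dispatchTool callbacks are assumptions standing in for the adapter interface, and the calls are kept synchronous for brevity where a real adapter would be asynchronous.

```typescript
// Hypothetical shapes; the document does not define the adapter's types.
interface ToolCall { name: string; input: unknown }
interface ModelTurn { text: string; toolCalls: ToolCall[] }

// Stand-ins for the adapter interface (assumed, synchronous for brevity).
type CallModel = (messages: string[]) => ModelTurn;
type DispatchTool = (call: ToolCall) => string;

const MAX_ITERATIONS = 5; // iteration limit stated for step 7

function runAgenticLoop(
  callModel: CallModel,
  dispatchTool: DispatchTool,
  history: string[]
): { finalText: string; toolsUsed: string[]; iterations: number } {
  const messages = history.slice(); // never mutate the caller's history
  const toolsUsed: string[] = [];
  let finalText = "";
  let iterations = 0;

  while (iterations < MAX_ITERATIONS) {
    iterations += 1;
    const turn = callModel(messages);
    finalText = turn.text;
    messages.push(turn.text);

    // Terminate as soon as the model stops asking for tools.
    if (turn.toolCalls.length === 0) break;

    for (const call of turn.toolCalls) {
      toolsUsed.push(call.name);
      messages.push(dispatchTool(call)); // accumulate tool results
    }
  }
  return { finalText, toolsUsed, iterations };
}
```

Whatever the loop returns is still only a proposal: steps 8 to 13 parse, validate, and override it before anything reaches the citizen.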

Service strategy pattern

The orchestrator does not know how a service is implemented. It delegates through a pluggable ServiceStrategy interface. Three modes exist today.

Interface ServiceStrategy
interface ServiceStrategy {
  buildTools(ctx: ServiceStrategyContext): ToolDefinition[];
  buildServiceContext(ctx: ServiceStrategyContext): string | Promise<string>;
  dispatchToolCall(name: string, input: unknown): Promise<string>;
  extractStateTransitions(messages: unknown[]): StateTransitionResult[];
}

JsonServiceStrategy (deterministic, inline) — no tools are given to the LLM. Service context is built entirely from structured service descriptions and the field collector. All state transitions are proposed by the LLM in its structured output and validated by the state machine.

McpServiceStrategy (tool-based) — the LLM receives check_eligibility and advance_state tools. It calls these during the agentic loop. Tool results are parsed for state transitions.

Demo mode — scripted responses bypass the orchestrator entirely. Each persona has a pre-authored script that drives the conversation deterministically, with no LLM calls. Used for conference demonstrations and stakeholder walkthroughs.

All three modes produce identical output contracts. The citizen application cannot distinguish which strategy was used.
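
The difference between the two orchestrated strategies at step 6 can be illustrated with a reduced sketch that covers only tool building. The ToolDefinition shape and the function names are assumptions for illustration; the tool names check_eligibility and advance_state come from the description above.

```typescript
// Hypothetical shape; the document names ToolDefinition but does not define it.
interface ToolDefinition { name: string; description: string }

type BuildTools = (serviceId: string) => ToolDefinition[];

// Deterministic, inline strategy: the LLM receives no tools at all.
const jsonStrategyTools: BuildTools = () => [];

// Tool-based strategy: the LLM receives service-specific check and advance tools.
const mcpStrategyTools: BuildTools = (serviceId) => [
  { name: "check_eligibility", description: "Check eligibility for " + serviceId },
  { name: "advance_state", description: "Advance the journey for " + serviceId },
];
```

Because the orchestrator only ever sees a ToolDefinition array, it needs no branch on which strategy produced it.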

OrchestratorOutput

Every pipeline run produces a single typed output. This is the contract between the runtime and the citizen application.

Interface OrchestratorOutput
interface OrchestratorOutput {
  response: string;
  reasoning: string;
  toolsUsed: string[];
  conversationTitle: string | null;
  tasks: TaskEntry[];
  policyResult?: { eligible: boolean; explanation: string };
  handoff?: { triggered: boolean; reason: string };
  serviceState?: { currentState: string; stateHistory: string[] };
  consentRequests?: ConsentRequest[];
  extractedFields?: FieldExtraction[];
  serviceCompletions?: Array<{ serviceId: string; status: string }>;
  versionMetadata: {
    promptHash: string;
    rulesetVersion: string;
    stateModelVersion: string;
  };
  pipelineTrace: {
    traceId: string;
    steps: PipelineStep[];
    totalDurationMs: number;
  };
}
03

Service description schemas

Every government service is described across four dimensions. Together, they form the complete machine-readable contract that an agent uses to deliver the service on a citizen's behalf.

Identity

The service's identity card. Who runs it, what it does, what data it needs, what it produces, what it costs, how long it takes, and where to complain.

manifest.json

Eligibility

Machine-evaluable rules that determine whether a citizen can use the service. Each rule is a condition on a data field with a pass/fail outcome and a human-readable reason.

policy.json

Journey

The finite-state machine defining the valid sequence of states and transitions. Defines the happy path and all branches: rejection, edge cases, handoff to a human adviser.

state-model.json

Data sharing

Named data-sharing grants with purpose, source, duration, and required/optional status. The citizen sees exactly what they are agreeing to before any data moves.

consent.json

Service descriptions are the integration point for departments. A department does not need to build an API or write code. It publishes structured service descriptions, and the platform handles orchestration, consent, evidence, and agent behaviour. The Legibility Studio provides a guided authoring experience for department teams.

Identity — the manifest

The manifest is where you start. It declares the service name (as citizens should see it), the responsible department, jurisdictions, costs, SLA commitments, complaint and appeal routes, data controller, and lawful basis.

Identity — DVLA driving licence renewal
{
  "id": "dvla-renew-driving-licence",
  "name": "Renew a driving licence",
  "department": "Driver and Vehicle Licensing Agency",
  "description": "Renew a photocard driving licence that is expiring or has expired.",
  "version": "1.0.0",
  "jurisdiction": "England, Wales, Scotland",
  "input_schema": {
    "required": [
      "full_name", "date_of_birth",
      "driving_licence_number", "national_insurance_number",
      "address", "photo"
    ],
    "optional": ["email", "phone"]
  },
  "output_schema": {
    "produces": ["application_reference", "expected_delivery_date"]
  },
  "constraints": {
    "sla": "10 working days",
    "fee": { "amount": 14, "currency": "GBP" },
    "availability": "24/7 online"
  },
  "redress": {
    "complaint_url": "https://www.gov.uk/complain-about-dvla",
    "appeal_process": "Contact DVLA directly",
    "ombudsman": "Parliamentary and Health Service Ombudsman"
  },
  "audit_requirements": {
    "retention_period": "7 years",
    "data_controller": "DVLA",
    "lawful_basis": "Public task"
  }
}

The input_schema.required array drives the field collector. When the orchestrator loads a service, it seeds the FieldCollector from this array. Fields already present in the citizen's profile are marked as collected; the remainder become the "missing fields" list that the agent uses to ask targeted questions.

Eligibility — the policy ruleset

Each rule operates on a single data field using a typed operator. If a rule fails, the agent shows the citizen the reason_if_failed message — in plain English, written by the department team. Two additional fields make policies more than gatekeepers: alternative_service redirects the citizen, and triggers_handoff routes them to a human.

Eligibility — DVLA driving licence renewal
{
  "service_id": "dvla-renew-driving-licence",
  "version": "1.0.0",
  "rules": [
    {
      "id": "age-minimum",
      "description": "Applicant must be at least 16",
      "condition": { "field": "age", "operator": ">=", "value": 16 },
      "reason_if_failed": "You must be at least 16 to hold a driving licence"
    },
    {
      "id": "has-licence",
      "condition": { "field": "driving_licence_number", "operator": "exists" },
      "reason_if_failed": "You need an existing licence to renew.",
      "alternative_service": "dvla.apply-provisional-licence"
    },
    {
      "id": "not-revoked",
      "condition": { "field": "licence_status", "operator": "!=", "value": "revoked" },
      "reason_if_failed": "Your licence has been revoked.",
      "triggers_handoff": true
    }
  ],
  "edge_cases": [
    {
      "id": "medical-condition",
      "detection": "medical_conditions",
      "action": "Route to medical assessment. DVLA form C1 required."
    },
    {
      "id": "over-70",
      "detection": "over_70",
      "action": "Over-70 renewal is free but requires medical self-declaration."
    }
  ]
}

Journey — the state model

The state model defines a directed graph with a single entry point (not-started) and one or more terminal states (completed, rejected, handed-off). Every transition has a from state, a to state, a trigger, and an optional guard condition.

not-started
identity-verified
eligibility-checked
consent-given
details-confirmed
photo-submitted
payment-made
application-submitted
completed
rejected
handed-off

The state model is enforced deterministically. The agent cannot advance to "payment-made" unless "photo-submitted" has been reached. Every transition is logged as a trace event.
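
A minimal sketch of that enforcement, assuming a transition list keyed by from, to, and trigger. The class name, the definition shape, and the trigger names submit_photo and make_payment are hypothetical; only the state names come from the journey above.

```typescript
interface Transition { from: string; to: string; trigger: string }
interface StateModelExcerpt { initial: string; terminal: string[]; transitions: Transition[] }
interface TransitionOutcome { ok: boolean; state: string; error?: string }

class StateMachineSketch {
  private current: string;

  constructor(private readonly definition: StateModelExcerpt) {
    this.current = definition.initial;
  }

  getState(): string { return this.current; }

  // Only transitions that start at the current state are legal.
  allowedTransitions(): Transition[] {
    return this.definition.transitions.filter((t) => t.from === this.current);
  }

  transition(trigger: string): TransitionOutcome {
    const match = this.allowedTransitions().filter((t) => t.trigger === trigger)[0];
    if (!match) {
      // Illegal proposals are rejected and the machine stays where it is.
      return { ok: false, state: this.current, error: `"${trigger}" is not allowed from ${this.current}` };
    }
    this.current = match.to;
    return { ok: true, state: this.current };
  }

  isTerminal(): boolean {
    return this.definition.terminal.indexOf(this.current) !== -1;
  }
}

// Excerpt of the renewal journey; trigger names are assumed for illustration.
const renewalExcerpt: StateModelExcerpt = {
  initial: "details-confirmed",
  terminal: ["completed", "rejected", "handed-off"],
  transitions: [
    { from: "details-confirmed", to: "photo-submitted", trigger: "submit_photo" },
    { from: "photo-submitted", to: "payment-made", trigger: "make_payment" },
  ],
};
```

With this excerpt, a make_payment proposal made at details-confirmed is rejected, and the same proposal succeeds once submit_photo has moved the machine to photo-submitted.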

Data sharing — the consent model

Each grant declares an ID, a description shown to the citizen, the specific fields being shared, the source of those fields, the purpose, the duration, and whether the grant is required or optional.

Data sharing — DVLA driving licence renewal
{
  "grants": [
    {
      "id": "identity-verification",
      "description": "Verify your identity using GOV.UK One Login",
      "data_shared": ["full_name", "date_of_birth", "national_insurance_number"],
      "source": "one-login",
      "purpose": "To confirm you are who you say you are",
      "duration": "session",
      "required": true
    },
    {
      "id": "photo-sharing",
      "description": "Share your passport photo with DVLA",
      "data_shared": ["passport_photo"],
      "source": "hmpo-passport-office",
      "purpose": "DVLA will use your most recent passport photo for the new licence",
      "required": true
    }
  ],
  "revocation": {
    "mechanism": "Contact DVLA or revoke through your GOV.UK account",
    "effect": "Application will be cancelled if consent is revoked before completion"
  }
}
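
The required/optional distinction can be checked with a few lines of deterministic code. This is a sketch under assumptions: the Grant shape is reduced to the two fields the check needs, the function names are hypothetical, and the optional delivery-updates grant is invented for illustration because both grants in the example above are required.

```typescript
interface Grant { id: string; required: boolean }

// IDs of required grants the citizen has not yet accepted.
function missingRequiredGrants(grants: Grant[], acceptedIds: string[]): string[] {
  return grants
    .filter((g) => g.required && acceptedIds.indexOf(g.id) === -1)
    .map((g) => g.id);
}

// Optional grants never block progress; only required ones do.
function canProceed(grants: Grant[], acceptedIds: string[]): boolean {
  return missingRequiredGrants(grants, acceptedIds).length === 0;
}

const renewalGrants: Grant[] = [
  { id: "identity-verification", required: true },
  { id: "photo-sharing", required: true },
  { id: "delivery-updates", required: false }, // hypothetical optional grant
];
```

Declining delivery-updates leaves canProceed true, while declining photo-sharing blocks the journey until the citizen accepts it or abandons the application.
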
04

Data model

A unified data model for citizen information with field-level provenance, tiered trust levels, and deduplication across services.

Three tiers of trust

Every data field carries a trust tier that determines how it can be used, whether it requires consent to share, and whether the citizen can edit it.

Tier 1: Verified

Immutable, government-sourced data confirmed by a department system of record. The citizen cannot edit these fields; only the source department can update them. Examples: NI number, driving licence number, passport details.

Tier 2: Submitted

Citizen-entered data that has not been verified against a department source. The citizen can edit these fields at any time. Examples: contact preferences, correspondence address, accessibility needs.

Tier 3: Inferred

Agent-derived data, flagged as such. Extracted from conversation context or computed from other fields. Always shown with an "inferred" label so the citizen knows its provenance.

Field source attribution

Every field in the citizen data model carries source metadata. This enables the platform to show exactly where each piece of data came from and which consent grant authorised its use.

Interface FieldSource
interface FieldSource {
  source: string;     // "HMRC", "DVLA", "Home Office"
  tier: "verified" | "submitted" | "inferred";
  topic: string;     // "identity", "finance", "employment"
}

Canonical alias map for deduplication

When a life event triggers multiple services, the field merger deduplicates data fields using a canonical alias map. Different services may use different names for the same piece of data — "ni_number", "national_insurance_number", and "nino" all resolve to the same canonical field. The agent asks once, and the answer fans out to all services that need it.

Canonical alias map (excerpt)
{
  "national_insurance_number": ["ni_number", "nino", "niNumber"],
  "full_name": ["fullName", "name", "legal_name"],
  "date_of_birth": ["dateOfBirth", "dob", "birth_date"],
  "address": ["postal_address", "home_address", "correspondence_address"]
}

Pre-fill logic

When a service is loaded, the FieldCollector seeds from the input_schema.required array in the service's identity manifest and checks each field against the citizen's profile using the alias map. Fields already present are marked as collected with their source attribution. The remaining fields become the "missing" list that the prompt layer injects into the system prompt, enabling the LLM to ask targeted questions rather than repeating information the citizen has already provided.
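
A minimal sketch of alias-aware pre-fill, assuming the profile is a flat key/value record. The function names canonicalName and prefill are hypothetical, source attribution is omitted, and the alias map is the excerpt shown above.

```typescript
const ALIASES: Record<string, string[]> = {
  national_insurance_number: ["ni_number", "nino", "niNumber"],
  full_name: ["fullName", "name", "legal_name"],
  date_of_birth: ["dateOfBirth", "dob", "birth_date"],
  address: ["postal_address", "home_address", "correspondence_address"],
};

// Resolve any known alias to its canonical field name; unknown names pass through.
function canonicalName(field: string): string {
  for (const canonical of Object.keys(ALIASES)) {
    if (canonical === field || ALIASES[canonical].indexOf(field) !== -1) return canonical;
  }
  return field;
}

// Split the required fields into those the profile already holds and those still missing.
function prefill(
  required: string[],
  profile: Record<string, unknown>
): { collected: string[]; missing: string[] } {
  const present: Record<string, boolean> = {};
  for (const key of Object.keys(profile)) {
    const value = profile[key];
    if (value !== undefined && value !== null) present[canonicalName(key)] = true;
  }
  const collected: string[] = [];
  const missing: string[] = [];
  for (const field of required) {
    (present[canonicalName(field)] === true ? collected : missing).push(field);
  }
  return { collected, missing };
}
```

For the driving licence renewal manifest, a profile holding fullName, dob, and nino would leave driving_licence_number, address, and photo on the missing list.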

Base fields (all citizens)

Field | Tier | Source
fullName | Verified | Multiple sources
dateOfBirth | Verified | Multiple sources
niNumber | Verified | DWP / HMRC
address | Verified | Multiple sources
postcode | Verified | HMRC / DVLA
email | Verified | User-provided
phone | Verified | User-provided

Persona-specific fields (examples)

Field | Tier | Source
bereavement.deceasedName | Submitted | General Register Office
bereavement.dateOfDeath | Verified | General Register Office
bereavement.estateValue | Submitted | HMRC
bereavement.tellUsOnceRef | Verified | GDS / Tell Us Once
immigration.status | Verified | Home Office
immigration.brpNumber | Verified | Home Office
immigration.rightToWork | Verified | Home Office
justiceHistory.releaseDate | Verified | HMPPS
justiceHistory.licenceConditions | Verified | HMPPS
childcare.thirtyHourCode | Verified | HMRC
childcare.childBenefitRef | Verified | HMRC
transport.drivingLicenceNumber | Verified | DVLA
transport.licenceStatus | Verified | DVLA

Cross-department data flow is always explicit. When a citizen consents to share data across departments, the platform mediates the exchange. The citizen sees which fields will be shared, which department will receive them, and for what purpose. An immutable receipt records every data-sharing event.

05

Policy evaluator and state machine

The core of the deterministic layer. These two components enforce eligibility and journey progression without any LLM involvement.

PolicyEvaluator

Evaluates a service's eligibility rules against a citizen's data. The evaluator iterates each rule, evaluates conditions using typed operators, and returns a structured result. No fuzzy logic, no LLM interpretation — rules either pass or fail.

Class PolicyEvaluator
// Signature only; the method body is omitted.
declare class PolicyEvaluator {
  evaluate(
    ruleset: PolicyRuleset,
    context: Record<string, unknown>
  ): PolicyResult;
}

interface PolicyResult {
  eligible: boolean;
  passed: PolicyRule[];
  failed: PolicyRule[];
  edgeCases: PolicyRule[];
  explanation: string;
}

Policy operator reference

The PolicyEvaluator supports seven operators. Each has strict type semantics — the evaluator does not perform type coercion.

Operator | Semantics | Example
>= | Greater than or equal. Numeric comparison only. | age >= 16
<= | Less than or equal. Numeric comparison only. | age <= 70
== | Strict equality. String or numeric. | jurisdiction == "England"
!= | Not equal. | status != "disqualified"
exists | Field is present and non-null. Value parameter is ignored. | driving_licence_number exists
not-exists | Field is absent or null. Value parameter is ignored. | ban_end_date not-exists
in | Field value is one of the specified array values. | licence_type in ["full", "provisional"]

Edge cases are distinct from failures. A rule with edge_case: true does not cause outright ineligibility. Instead, it flags the citizen for potential handoff to a human adviser. This handles ambiguous situations where the citizen might be eligible but needs human judgement.
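
The operator semantics can be sketched as a small evaluator. This is an illustration under assumptions, not the PolicyEvaluator itself: the types are reduced to what the sketch needs, edge cases are left out, and treating an absent field as "not equal" under != is a guess the document does not confirm.

```typescript
type Operator = ">=" | "<=" | "==" | "!=" | "exists" | "not-exists" | "in";
interface Condition { field: string; operator: Operator; value?: unknown }
interface Rule { id: string; condition: Condition; reason_if_failed: string }

// Evaluates one condition without type coercion: "34" never satisfies >= 16.
function checkCondition(c: Condition, context: Record<string, unknown>): boolean {
  const actual = context[c.field];
  const expected = c.value;
  const present = actual !== undefined && actual !== null;
  switch (c.operator) {
    case "exists": return present;
    case "not-exists": return !present;
    case ">=": return typeof actual === "number" && typeof expected === "number" && actual >= expected;
    case "<=": return typeof actual === "number" && typeof expected === "number" && actual <= expected;
    case "==": return present && actual === expected;
    case "!=": return actual !== expected; // assumption: an absent field counts as not equal
    case "in": return Array.isArray(expected) && expected.indexOf(actual) !== -1;
    default: return false;
  }
}

function evaluateRules(rules: Rule[], context: Record<string, unknown>) {
  const passed: Rule[] = [];
  const failed: Rule[] = [];
  for (const rule of rules) {
    (checkCondition(rule.condition, context) ? passed : failed).push(rule);
  }
  // The explanation is assembled from department-authored reasons, never generated.
  const explanation = failed.map((r) => r.reason_if_failed).join(" ");
  return { eligible: failed.length === 0, passed, failed, explanation };
}
```

Because the numeric operators require both sides to be numbers, a string such as "34" fails age >= 16 instead of being silently converted.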

StateMachine

A deterministic finite-state machine constructed from a service's journey definition. It enforces the legal transitions for a service journey, prevents the LLM from skipping steps, and identifies terminal and receipt-emitting states.

Class StateMachine
// Signatures only; method bodies are omitted.
declare class StateMachine {
  constructor(definition: StateModelDefinition);
  getState(): string;
  allowedTransitions(): Array<{ to: string; trigger?: string }>;
  transition(trigger: string): TransitionResult;
  isTerminal(): boolean;
  setState(stateId: string): void;
}

Transition guards

Transitions can carry guard conditions that must be satisfied before the transition is allowed. Guards prevent premature advancement — a citizen cannot reach "payment-made" without first passing through "details-confirmed".

Transition with guard
{
  "from": "eligibility-checked",
  "to": "consent-given",
  "trigger": "grant_consent",
  "guard": {
    "condition": "policy_result.eligible == true",
    "message": "Cannot proceed: citizen is not eligible for this service."
  }
}

The state machine evaluates guards synchronously. If a guard fails, the transition is rejected and the machine remains at its current state. The guard's message is available to the agent for explaining why the journey cannot proceed.
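
Guard evaluation might look like the sketch below. The document does not specify the grammar of the condition string, so this assumes the single form "path == literal", with a dotted path into a context object and a JSON literal on the right; checkGuard and resolvePath are hypothetical names.

```typescript
interface Guard { condition: string; message: string }
interface GuardResult { allowed: boolean; message?: string }

// Walks a dotted path such as "policy_result.eligible" through nested objects.
function resolvePath(context: Record<string, unknown>, path: string): unknown {
  let node: unknown = context;
  for (const key of path.split(".")) {
    if (typeof node !== "object" || node === null) return undefined;
    node = (node as Record<string, unknown>)[key];
  }
  return node;
}

// Synchronous check; any condition outside the assumed grammar is refused.
function checkGuard(guard: Guard, context: Record<string, unknown>): GuardResult {
  const parts = guard.condition.split("==");
  if (parts.length !== 2) {
    return { allowed: false, message: "Unsupported guard: " + guard.condition };
  }
  let expected: unknown;
  try {
    expected = JSON.parse(parts[1].trim());
  } catch (err) {
    return { allowed: false, message: "Unsupported guard: " + guard.condition };
  }
  const actual = resolvePath(context, parts[0].trim());
  return actual === expected ? { allowed: true } : { allowed: false, message: guard.message };
}
```

Refusing anything the sketch cannot parse keeps the failure mode conservative: an unreadable guard blocks the transition instead of letting it through.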

07

Evidence plane

Every significant action produces an immutable trace event. The evidence plane enables audit, replay, and accountability across all citizen interactions.

TraceEvent structure

Interface TraceEvent
interface TraceEvent {
  id: string;
  traceId: string;
  spanId: string;
  parentSpanId?: string;
  timestamp: string;
  type: TraceEventType;
  payload: Record<string, unknown>;
  metadata: {
    userId?: string;
    sessionId: string;
    capabilityId?: string;
  };
}

Event types

Type | Description | Updates ledger
state.transition | State machine moved to a new state | Yes
consent.granted | Citizen approved data sharing | Yes
consent.denied | Citizen declined data sharing | Yes
policy.evaluated | Eligibility rules checked | Yes
handoff.initiated | Escalation to human adviser triggered | Yes
capability.invoked | Department service called | Yes
llm.request | Language model call initiated | Yes
llm.response | Language model response received | Yes
credential.presented | Verifiable proof submitted | Yes
receipt.issued | Outcome documented with immutable receipt | Yes
field.extracted | Fact parsed from conversation | No
error.occurred | System error logged | No

Receipt generation

Receipts are the citizen-facing evidence that an action was taken. They include a reference to the service, the action performed, the outcome, and which data was shared. Receipts are generated from trace events at receipt-emitting states defined in the journey.

Interface Receipt
interface Receipt {
  id: string;
  capabilityId: string;
  action: string;
  outcome: string;
  timestamp: string;
  dataShared?: string[];
}

Replay engine

The replay engine reconstructs full conversation state from trace events. Given a trace ID, it steps through events in chronological order, rebuilding state machine position, consent decisions, collected fields, and the complete message history.

Replay supports three use cases. (1) Audit — regulators or complaints teams step through a citizen's journey to verify that rules were followed. (2) Debugging — engineers identify where a journey diverged from expected behaviour. (3) Dispute resolution — when a citizen contests an outcome, the replay provides incontrovertible evidence of what happened and why.
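
Replay is essentially a fold over the ordered events. The sketch below assumes a reduced event shape, ISO 8601 UTC timestamps that sort correctly as strings, and payload keys (to, grantId, field, value) that the document does not specify; message history is left out.

```typescript
interface ReplayEvent { timestamp: string; type: string; payload: Record<string, unknown> }
interface ReplayedState {
  currentState: string;
  consents: Record<string, boolean>;
  fields: Record<string, unknown>;
}

function replay(events: ReplayEvent[]): ReplayedState {
  // Sort a copy chronologically; the stored events are never touched.
  const ordered = events.slice().sort((a, b) =>
    a.timestamp < b.timestamp ? -1 : a.timestamp > b.timestamp ? 1 : 0
  );
  const state: ReplayedState = { currentState: "not-started", consents: {}, fields: {} };

  for (const event of ordered) {
    const payload = event.payload;
    if (event.type === "state.transition") {
      const to = payload["to"]; // assumed payload key
      if (typeof to === "string") state.currentState = to;
    } else if (event.type === "consent.granted" || event.type === "consent.denied") {
      const grantId = payload["grantId"]; // assumed payload key
      if (typeof grantId === "string") state.consents[grantId] = event.type === "consent.granted";
    } else if (event.type === "field.extracted") {
      const field = payload["field"]; // assumed payload key
      if (typeof field === "string") state.fields[field] = payload["value"];
    }
  }
  return state;
}
```

Because the function reads only the events it is given, replaying the same trace always reconstructs the same state, which is the property audit and dispute resolution depend on.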

Trace lifecycle

A typical pipeline run produces the following sequence of trace events.

span.start
Pipeline span opened with trace ID, session ID, and capability ID.
policy.evaluated
Eligibility rules checked. Payload contains passed/failed rules and overall result.
llm.request
Language model called. Payload contains prompt hash and token counts.
llm.response
Language model responded. Payload contains model ID, token usage, and tool calls made.
state.transition
State machine moved from eligibility-checked to consent-given. Transition validated.
consent.granted
Citizen accepted data sharing. Grant ID and scope recorded.
field.extracted
New facts parsed from conversation. Fields and sources recorded.
receipt.issued
Immutable receipt generated for the completed action.
span.end
Pipeline span closed. Duration and step count recorded.

The evidence store is append-only. Events cannot be modified or deleted after creation. If the case store is lost, it can be reconstructed by replaying events from the evidence store. This makes the evidence store the single source of truth.

08

FLEX and UDP integration

The Agentic Legibility Stack sits on top of the FLEX platform and Unified Data Platform. FLEX provides the secure pipes; ALS adds the intelligence that turns those pipes into a joined-up citizen experience.

Two layers, complementary

FLEX + UDP

Infrastructure layer (being built by GDS)
  • Three-tier gateway: public front door, private business rules, and a future service connector
  • Departments build "domain modules" that slot into each tier
  • UDP — centralised store of citizen data linked to One Login identity
  • Authenticates, rate-limits, and routes — the private gateway never touches the public internet

Agentic Legibility Stack

Intelligence layer
  • AI orchestrator for conversation, triage, and life event matching
  • Deterministic policy evaluation and state machine orchestration
  • Granular consent management with standing preferences
  • CapabilityInvoker as the single auditable choke point to FLEX
  • Immutable evidence store with receipts and full replay

CapabilityInvoker as the bridge

Every department service call routes through the CapabilityInvoker, which calls the FLEX Public Gateway. The invoker logs the request, checks consent, calls the gateway, logs the response, and emits a trace event. No service call ever bypasses this path.
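
The five steps in that path can be sketched as a single function with its collaborators injected. Everything here is an assumption for illustration: the Gateway signature, the InvokerDeps shape, and the field-level consent check are hypothetical, and the call is synchronous for brevity where a real gateway call would be asynchronous.

```typescript
interface TraceRecord { type: string; payload: Record<string, unknown> }
type Gateway = (capabilityId: string, data: Record<string, unknown>) => Record<string, unknown>;

interface InvokerDeps {
  gateway: Gateway;              // stand-in for the FLEX Public Gateway call
  consentedFields: string[];     // fields covered by accepted grants
  log: (message: string) => void;
  emit: (event: TraceRecord) => void;
}

function invokeCapability(
  capabilityId: string,
  data: Record<string, unknown>,
  deps: InvokerDeps
): Record<string, unknown> {
  const fields = Object.keys(data);

  // 1. Log the request.
  deps.log("request " + capabilityId + " [" + fields.join(", ") + "]");

  // 2. Check consent before any data leaves the platform.
  const blocked = fields.filter((f) => deps.consentedFields.indexOf(f) === -1);
  if (blocked.length > 0) {
    throw new Error("No consent to share: " + blocked.join(", "));
  }

  // 3. Call the gateway.
  const startedAt = Date.now();
  const result = deps.gateway(capabilityId, data);
  const durationMs = Date.now() - startedAt;

  // 4. Log the response.
  deps.log("response " + capabilityId + " in " + durationMs + "ms");

  // 5. Emit the trace event that feeds the evidence store.
  deps.emit({
    type: "capability.invoked",
    payload: { capabilityId: capabilityId, dataShared: fields, durationMs: durationMs },
  });
  return result;
}
```

Keeping the consent check ahead of the gateway call means a missing grant stops the request before any data is sent.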

ALS: Life event triage
Matches "my husband died" across 16 life events. Identifies Tell Us Once, DWP bereavement, probate, council tax.
ALS: Consent and field deduplication
Checks standing consent preferences. Merges data fields across 4 services — asks once, fans out to all.
CapabilityInvoker calls FLEX
Each service call routes through the single auditable choke point. Every call emits a trace event: who asked, what data was shared, what came back, duration, receipt ID.
FLEX: Secure delivery
Authenticates the request, applies business rules, routes to the correct department. Returns structured data.
ALS: Outcome and next steps
Issues outcome cards, updates the plan, earns credentials for the wallet. Guides the citizen to the next service.

What ALS adds that FLEX does not

FLEX is the plumbing — how data moves securely between departments and the app. ALS is the intelligent front-of-house — how an AI agent uses those pipes on behalf of a citizen. Six capabilities sit above FLEX:

Capability | What it does
Service discovery | "My husband died" → 4 services identified across 3 departments
Eligibility checking | Deterministic rules evaluated before the citizen invests time
Consent management | Standing preferences, scoped grants, revocation, all tracked with receipts
Field collection | Canonical alias map deduplicates fields: ask once, fan out to all services
State orchestration | Finite-state machine enforces the journey. The LLM cannot skip steps.
Evidence and audit | Immutable trace for every action. Full replay. Receipts for every data exchange.

FLEX and ALS are complementary, not competing. FLEX handles real authentication against One Login, enforces rate limits, manages TLS termination, and routes across the private network. ALS handles service discovery, eligibility checking, consent management, field collection, state machine orchestration, and the immutable evidence trail. Today ALS simulates what FLEX will provide for real.