10 — Agent runtime architecture.

The Reasoning Plane is the execution surface for agents. It is a multi-tenant, sandboxed, queue-driven runtime that turns typed inputs into typed Decisions and proposed Actions under permission, budget and time bounds. The runtime is structured as an Orchestrator Model coordinating a population of Specialist Worker Models, with every step recorded in a typed Reasoning Trail. This section specifies its overall topology and the run lifecycle. §11 covers Planner / Executor / Verifier / Memory in detail.

10.0 Reference — orchestration runtime

From a CTO’s vantage, the Reasoning Plane is best understood as one orchestrator choosing among many specialist workers, all consuming typed inputs from the Ontology and Knowledge Base, and emitting Decisions and proposed Actions under governance.

Fig. 10.0a — Orchestration runtime. One Orchestrator Model plans the work; five Specialist Worker Model classes execute it; the Reasoning Trail captures every step; the Verifier (§11) rules on the candidate Decision; the Action Plane commits.

10.1 Runtime topology

The runtime is composed of seven services that communicate over typed contracts. There is no shared mutable state across runs; everything that survives a run is persisted in the Ontology, the audit chain, or telemetry.

Fig. 10.1 — Runtime topology. Each box has a typed contract; nothing crosses a boundary except as a contract message.

Components

Run queue — per-tenant, partitioned by agent class. Persistent. Backed by a durable broker.
Dispatcher — pulls items, validates run preconditions (permissions, budget, agent enabled), and hands off.
Scheduler — allocates a sandbox worker, applies tenant capacity policy, sets seeds.
Sandbox — isolated worker process / container with no ambient network or filesystem; only typed contract calls allowed.
Agent runtime — the planner / executor / verifier / memory loop (§11).
Tool registry & PEP — the policy enforcement point that proxies every tool call through the PDP.
Model gateway — vendor-neutral routing of model calls (§12).

10.2 Agent definition

An agent is a versioned configuration artefact. It declares its identity, scope, planner, model lanes, tool slice, and verifier rule set. Agents are not code; they are typed configuration that the runtime instantiates.

id:           agent.<namespace>.<name>
version:      <semver>
ontologyPin:  2026.05
role:         underwriting.review | claims.intake | claims.adjustment | service.<…>
input:        Ref<Submission|Claim|Endorsement|Quote|…>
output:       Decision[]
scopes:
  read:       [ … ]
  invoke:     [ tool.* ]
modelLanes:
  reasoning:  reasoning.long
  embedding:  embedding.text
  vlm:        vlm.document
tools:
  - tool.<name>@<semver>
verifier:
  rulePacks:  [ rules.intake.v3, rules.policy.v2 ]
  thresholds: { confidence_min: 0.78, evidence_min: 1 }
memory:
  scope:      run | object-pinned
  ttl:        run-only
budget:
  tokens:     200000
  wallclock:  PT2M
  cost:       USD 0.50
handoff:
  on:
    - exception: ambiguous_evidence
    - exception: low_confidence
    - exception: authority_breach
    - state: verified_block
  to:         queue.<role>.review
release:
  state:      enabled | shadow | disabled
  evalGate:   pack.agent.<name>.<semver>
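
The definition above references tools as tool.<name>@<semver>. As a minimal sketch of how a loader might check such a reference before instantiating an agent, the Python below parses one tool reference into a name and a version tuple. The function name `parse_tool_ref`, the `ToolRef` type, and the accepted character set are assumptions made for illustration; the pattern also accepts only plain MAJOR.MINOR.PATCH versions, which is narrower than full semver.

```python
import re
from typing import NamedTuple, Tuple

# Assumption: tool names are lowercase, dot-separated, and prefixed with "tool.".
# Simplified semver: MAJOR.MINOR.PATCH only (no pre-release or build metadata).
_TOOL_REF = re.compile(r"^(tool\.[a-z0-9_.]+)@(\d+)\.(\d+)\.(\d+)$")


class ToolRef(NamedTuple):
    name: str
    version: Tuple[int, int, int]


def parse_tool_ref(ref: str) -> ToolRef:
    """Split 'tool.<name>@<semver>' into a name and a numeric version tuple."""
    match = _TOOL_REF.match(ref)
    if match is None:
        raise ValueError(f"not a valid tool reference: {ref!r}")
    major, minor, patch = (int(part) for part in match.group(2, 3, 4))
    return ToolRef(match.group(1), (major, minor, patch))
```

Keeping the version as a tuple of integers lets a loader compare versions numerically rather than as strings.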

10.3 Run lifecycle

The run lifecycle is a finite state machine. Every transition is audited.

Fig. 10.2 — Run state machine. verifying is a distinct state; nothing leaves the run without passing the verifier.
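
As an illustration of an audited finite state machine, the Python sketch below rejects undeclared transitions and records every accepted one. Only the `verifying` and `failed` states are named in this section; the other state names (`queued`, `running`, `completed`, `handed_off`), the transition table, and the in-memory audit list are hypothetical stand-ins, not the runtime's actual state set.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set

# Hypothetical transition table. Only "verifying" and "failed" come from the
# text; every other state name is an assumption for illustration.
TRANSITIONS: Dict[str, Set[str]] = {
    "queued": {"running", "failed"},
    "running": {"verifying", "handed_off", "failed"},
    "verifying": {"completed", "handed_off", "failed"},
    "completed": set(),
    "handed_off": set(),
    "failed": set(),
}


class IllegalTransition(Exception):
    """Raised when a transition is not declared in TRANSITIONS."""


@dataclass
class Run:
    run_id: str
    state: str = "queued"
    audit: List[dict] = field(default_factory=list)

    def transition(self, target: str, cause: str = "") -> None:
        if target not in TRANSITIONS[self.state]:
            raise IllegalTransition(f"{self.state} -> {target}")
        # Every accepted transition is appended to the audit record.
        self.audit.append(
            {"runId": self.run_id, "from": self.state, "to": target, "cause": cause}
        )
        self.state = target
```

Because `completed` is reachable only from `verifying` in this table, a run cannot produce a result without passing through the verifier state.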

10.4 Generic run sequence

Fig. 10.3 — Generic run sequence. No specific workflow assumed.

10.5 Bounds & budgets

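Every run has hard bounds. Breaching a token, wall-clock, cost, or plan-length bound transitions the run to failed with a typed cause. Tool-concurrency and per-tool retry limits are enforced without failing the run outright: excess tool calls are queued, and exhausted retries emit a typed Exception.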

Bound	Default	Behaviour on breach
Token budget	per-agent setting	fail · cause `budget.tokens`
Wall-clock	per-agent setting	fail · cause `budget.wallclock`
Cost ceiling	per-agent + per-tenant	fail · cause `budget.cost`
Tool concurrency	4 per run	queue · cause `concurrency.bound`
Retry limit per tool	3	typed Exception emitted
Plan length	30 steps	fail · cause `plan.length`
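
The failing bounds in the table can be sketched as a single check that maps current usage to a typed cause. The Python below is a minimal illustration: the `Budget` and `Usage` types, the `breach_cause` function, and the order in which bounds are checked are assumptions, and `parse_wallclock` handles only the PTnHnMnS subset of ISO 8601 durations, which covers the PT2M value used in §10.2.

```python
import re
from dataclasses import dataclass
from datetime import timedelta
from decimal import Decimal
from typing import Optional


def parse_wallclock(value: str) -> timedelta:
    """Parse a small subset of ISO 8601 durations (PTnHnMnS), e.g. 'PT2M'."""
    match = re.fullmatch(r"PT(?:(\d+)H)?(?:(\d+)M)?(?:(\d+)S)?", value)
    if match is None or not any(match.groups()):
        raise ValueError(f"unsupported duration: {value!r}")
    hours, minutes, seconds = (int(g) if g else 0 for g in match.groups())
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)


@dataclass(frozen=True)
class Budget:
    tokens: int
    wallclock: timedelta
    cost_usd: Decimal
    plan_steps: int = 30  # default plan length from the table


@dataclass
class Usage:
    tokens: int = 0
    elapsed: timedelta = timedelta(0)
    cost_usd: Decimal = Decimal("0")
    plan_steps: int = 0


def breach_cause(budget: Budget, usage: Usage) -> Optional[str]:
    """Return the typed cause of the first breached bound, or None."""
    # The check order is an assumption; the source does not specify one.
    if usage.tokens > budget.tokens:
        return "budget.tokens"
    if usage.elapsed > budget.wallclock:
        return "budget.wallclock"
    if usage.cost_usd > budget.cost_usd:
        return "budget.cost"
    if usage.plan_steps > budget.plan_steps:
        return "plan.length"
    return None
```

Cost is held as a `Decimal` so that comparisons against a ceiling such as USD 0.50 are not affected by binary floating-point rounding.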

10.6 Sandbox properties

No ambient network. The only egress is the Tool PEP.
No persistent filesystem. Only run-scoped scratch.
No stdin / stdout to the outside world. All emissions are typed events on the audit / telemetry channels.
Strict CPU / memory caps. Container exit on breach.
Process-per-run by default; pooled workers permitted only with mandatory state-zeroing between runs.

10.7 Multi-agent coordination

Agents do not call other agents directly. Coordination is via Decisions and Actions on the Ontology: an agent that needs another agent’s verdict raises an Exception or stages a Task targeted at the other agent’s queue. This makes coordination auditable and bounded.

10.8 Handoff manager

A handoff is a typed transition: {from: AgentRun|Principal, to: Queue|Principal, subject: Ref, reason: ExceptionCode|Decision, evidence: EvidenceSpan[]}. Apart from failing on a hard bound (§10.5), handoff is the only way an agent run terminates without a typed Decision. The handoff manager enforces queue routing, SLA assignment, and ordering.
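
The handoff shape can be expressed as an immutable record. The Python below is a sketch only: the `Handoff` and `EvidenceSpan` field types are simplified to strings, `make_handoff` is a hypothetical helper, and the trigger set is copied from the handoff.on block in §10.2 rather than loaded from a real agent definition.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass(frozen=True)
class EvidenceSpan:
    span: str  # evidence span id
    page: int


@dataclass(frozen=True)
class Handoff:
    from_: str                          # AgentRun | Principal
    to: str                             # Queue | Principal
    subject: str                        # Ref
    reason: str                         # ExceptionCode | Decision
    evidence: Tuple[EvidenceSpan, ...]  # EvidenceSpan[]


# Triggers taken from the handoff.on block in section 10.2; "verified_block"
# is listed there as a state rather than an exception.
HANDOFF_ON = frozenset(
    {"ambiguous_evidence", "low_confidence", "authority_breach", "verified_block"}
)


def make_handoff(run_id: str, subject: str, reason: str,
                 evidence: Iterable[EvidenceSpan], queue: str) -> Handoff:
    """Build a handoff only for a declared trigger and a queue target."""
    if reason not in HANDOFF_ON:
        raise ValueError(f"reason {reason!r} is not a declared handoff trigger")
    if not queue.startswith("queue."):
        raise ValueError(f"target {queue!r} is not a queue")
    return Handoff(from_=run_id, to=queue, subject=subject,
                   reason=reason, evidence=tuple(evidence))
```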

10.9 Orchestrator & Specialist Worker Models

The runtime separates two model roles. The Orchestrator Model plans the run and decides which Specialist Worker to invoke next, with which input, under which tool budget. Specialist Worker Models do bounded, well-typed sub-tasks and return structured outputs. This separation lets each role be sized independently (capability lane, latency target, cost), and makes both roles individually replaceable through the Model Gateway (§12).

Orchestrator Model

Reads the typed input and the agent’s allowed tools and policies.
Produces a structured plan (DAG of sub-tasks) using the planner grammar (§11.2).
Routes each sub-task to the right Specialist Worker class via the Model Gateway.
Reviews each Specialist’s output and decides whether to branch, retry, or proceed.
Pinned per agent definition; never invoked outside the runtime.
Subject to its own eval suite (§13) at promotion time.
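
To make the "DAG of sub-tasks" concrete, the Python sketch below orders a small plan with the standard library's `graphlib.TopologicalSorter`. The plan structure, the sub-task ids, and the `execution_order` function are hypothetical; the actual planner grammar is defined in §11.2 and is not reproduced here. The 30-step limit is the default plan length from §10.5.

```python
from graphlib import CycleError, TopologicalSorter
from typing import Dict, List, Set, Tuple

# Hypothetical plan: sub-task id -> (Specialist Worker class, prerequisite ids).
Plan = Dict[str, Tuple[str, Set[str]]]

plan: Plan = {
    "segment": ("parse_and_segment", set()),
    "extract": ("extract_and_match", {"segment"}),
    "analyse": ("reasoning_and_analysis", {"extract"}),
    "summarise": ("synthesis_and_export", {"analyse"}),
}


def execution_order(plan: Plan, max_steps: int = 30) -> List[str]:
    """Return sub-task ids in dependency order; reject oversized or cyclic plans."""
    if len(plan) > max_steps:
        raise ValueError("plan.length")
    sorter = TopologicalSorter({task: deps for task, (_, deps) in plan.items()})
    # static_order() raises CycleError if the plan is not a DAG.
    return list(sorter.static_order())
```

For the linear plan above, `execution_order(plan)` yields the sub-tasks in the order segment, extract, analyse, summarise.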

Specialist Worker Model classes

Five classes of Specialist Worker cover the operating surface of insurance work. Each class is a capability lane in the Model Gateway (§12.2.1); each has its own approved registry, eval gates, drift sigma, and cost budget.

Parse & Segment

SW.1 · Parse and segment — Document layout analysis, page splitting, table region detection, email threading, transcript segmentation, multi-document deduplication. Often invoked before Extract & Match.

Extract & Match

SW.2 · Extract and match — Field extraction (typed properties), code-list matching (ICD-10, CPT, peril codes), entity matching to existing Ontology objects, address / identifier normalisation. Pairs with §8.9 / §8.10.

Reasoning & Analysis

SW.3 · Reasoning and analysis — Multi-step reasoning over typed inputs, rule-pack-aware judgment, conflicting evidence weighing, drafting Decisions with rationale. The most-used Specialist in any decision-heavy workflow.

Voice & Vision

SW.4 · Voice and vision — Voice transcription & diarisation, vision-language extraction from photos and diagrams, signature / stamp detection, damage assessment from imagery. Pairs with §7.10 (Agentic OCR) and §7.11 (Voice Transcription).

Synthesis & Export

SW.5 · Synthesis and export — Producing operator-readable summaries, drafting outbound communications, formatting exports (CSV / XML / regulator templates), composing typed Action payloads for the Action Plane. The boundary between agent reasoning and human consumption.

Why split this way

Each class has different latency / cost / accuracy tradeoffs — routing per class, not per workflow, lets the gateway serve them all efficiently.
Each class is independently evaluable. A regression in Extract & Match does not require revalidating Reasoning & Analysis.
Specialists do not own state; they take typed inputs and return typed outputs. The Orchestrator owns the loop and the Reasoning Trail.

10.10 Reasoning Trail

The Reasoning Trail is the substrate’s first-class record of how a Decision was reached: which Specialist Workers were called in which order, what each returned, what tools were invoked, what the verifier ruled, and which evidence spans grounded each step. It is the audit-grade reasoning record, not a free-text “thought log.”

Trail entries

Each entry in the trail is a typed ReasoningStep:

{
  "stepId":         "step_01HF…",
  "runId":          "run_01HF…",
  "ord":            7,
  "role":           "orchestrator | specialist | verifier | tool",
  "subject":        { "kind": "specialist.extract_and_match.v3", "lane": "extract.text" },
  "input":          { "ref": "obj_…", "props": ["claim.lossDate","claim.amount"] },
  "output":         { "values": [...], "confidence": 0.92 },
  "tools":          [ { "tool": "tool.lookup.icd10", "version": "1.4.2", "argsHash": "0x…", "resultHash": "0x…" } ],
  "evidence":       [ { "span": "es_01HF…", "page": 3, "bbox": [102,448,612,520] } ],
  "modelLineage":   { "model": "gw/lane/extract.text/v3", "promptRev": "p_2c8f", "retrievalSnap": "rs_19a4", "seed": 7 },
  "latencyMs":      812,
  "cost":           { "tokensIn": 1488, "tokensOut": 220, "usd": 0.018 },
  "verifier":       { "applied": ["coverage_in_force","amount_in_currency"], "verdict": "pass" },
  "audit":          { "auditEventId": "ae_…" }
}

Properties of the trail

Typed: every step has a known shape; no free-text “thoughts.”
Append-only: steps are written once, hash-chained, never rewritten.
Exportable: included in replay bundles (§8.7) and decision-lineage exports (§17.5).
Reviewable: rendered in the operator console as a step-by-step graph with the cited evidence inline.
Replayable: re-running with the same model lineage, retrieval snapshot, and seed reproduces the trail to the bounds the underlying models support.
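
The append-only, hash-chained property can be illustrated with a short Python sketch. The hashing scheme shown here (SHA-256 over the previous hash plus a canonical JSON encoding of the step) and the `ReasoningTrail` class are assumptions for illustration; the source does not specify how the chain is computed or stored.

```python
import copy
import hashlib
import json
from typing import List, Tuple

GENESIS = "0" * 64  # assumed starting value for the first link


def _digest(prev_hash: str, step: dict) -> str:
    """Hash the previous link together with a canonical JSON form of the step."""
    payload = json.dumps(step, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(prev_hash.encode("ascii") + payload).hexdigest()


class ReasoningTrail:
    def __init__(self) -> None:
        self._entries: List[Tuple[dict, str]] = []

    def append(self, step: dict) -> str:
        """Write a step once, linked to the hash of the previous entry."""
        prev = self._entries[-1][1] if self._entries else GENESIS
        link = _digest(prev, step)
        self._entries.append((copy.deepcopy(step), link))
        return link

    def verify(self) -> bool:
        """Recompute the chain; any rewritten step breaks every later link."""
        prev = GENESIS
        for step, link in self._entries:
            if _digest(prev, step) != link:
                return False
            prev = link
        return True
```

Because each link depends on the one before it, altering any stored step causes `verify()` to return False for the whole trail.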

From a CTO’s standpoint, the Reasoning Trail is what makes agentic AI auditable and able to withstand due diligence. It is the artefact that an internal audit team, a regulator, or a court reads when asking “why did the system decide that?”

10.11 Multi-workflow runtime

One runtime serves every workflow domain. The same Orchestrator / Specialist Worker / Reasoning Trail / Verifier / Tool Registry pipeline runs underwriting agents, claims agents, servicing agents, billing-exception agents, compliance-investigation agents, distribution-audit agents, and treaty-cession agents. Domain differences manifest as configurations on the same runtime, not as separate platforms.

What changes per workflow

The agent definition (allowed tools, permission scopes, policies, ontology pin).
The Specialist Worker selection (e.g. Voice & Vision for inbound calls; Synthesis & Export for regulator filings).
The verifier rule packs (e.g. coverage-applies for claims; appetite-fit for underwriting; treaty-eligibility for reinsurance).
The action contracts available (e.g. policy-admin writes, billing adjustments, treaty cession statements).

What does not change

Identity, RBAC / ABAC, markings, PDP (§15–16).
Idempotency, audit chain, lineage, replay (§17, §8).
Telemetry, alerts, SLOs, error budgets (§18, §20).
Deployment topology, region pinning, tenant isolation (§19).

This is the property a global CTO must see: adding a new workflow is a configuration change, not a platform change. New workflows reuse the runtime, the Specialist Worker capabilities, the Reasoning Trail, the audit chain, the SoR adapters, the deployment topology — everything except the workflow-specific agent definition and tools. The marginal cost of the second, third, and N-th workflow is therefore much smaller than the first.
