Founding RFC
High-Assurance Asynchronous Intelligence Compilation
The founding architecture RFC that established contrastive causal bounding, the Claim Ledger hierarchy of trust, and the epistemological limits of runtime AI governance.
Status: Founding RFC — Approved
Context: Architecture design for the Kenshiki inference engine and governance layer.
Objective: Establish an asynchronous intelligence compilation pipeline utilizing out-of-band extraction and contrastive causal bounding to generate forensically verifiable, reproducible intelligence artifacts.
1. Abstract
Current generative AI architectures rely on probabilistic self-governance and single-pass streaming, prioritizing low latency over epistemic rigor. This RFC proposes an asynchronous, Tri-Pass architecture that treats the LLM purely as a sandboxed synthesis engine. By externalizing the truth boundary to a Kura evidence boundary and utilizing out-of-band extraction (spaCy), the pipeline produces externally constrained and reproducibly bounded intelligence artifacts. The output is accompanied by an immutable Claim Ledger, providing cryptographic proof of positive causal attribution for all generated factual claims.
2. Problem Statement
- The Circular Governance Trap: Utilizing an LLM to evaluate its own outputs or define its own schemas via JSON introduces unresolvable circular dependencies.
- Geometric Fragility: Standard cosine similarity thresholds (τ) and basic standard-deviation bounds fail in high-dimensional embedding spaces due to severe anisotropy and non-Gaussian clustering.
- The “Fast/Cheap” Constraint: Synchronous streaming prohibits multi-pass causal verification.
- Attention vs. Causality: Relying on cross-attention weights as proof of reasoning is epistemologically flawed. It conflates a routing diagnostic with causal proof, while operationally degrading optimized inference serving frameworks.
3. The Target Architecture: The Tri-Pass Pipeline
The system operates as an asynchronous AI-native ETL pipeline, abandoning conversational UI in favor of forensic artifact generation.
- Ephemeral Ground Truth (Ingestion): A dynamic corpus is vectorized into a pgvector-backed Kura index, establishing the absolute boundary of reality for a single run.
- Pass 1: The Generator (Synthesis): The inference engine executes the full prompt utilizing continuous batching and FlashAttention to generate the raw text brief at maximum bare-metal efficiency.
- Pass 2: The Extractor (Deterministic Decomposition): An out-of-band, Cython-backed NLP pipeline (spaCy) parses the raw text. Utilizing strict dependency trees and rule-based matchers, it isolates factual spans and returns exact integer token coordinates.
- Pass 3: The Evaluator (Causal Verification): The inference engine executes a highly targeted secondary forward pass over the exact token coordinates identified in Pass 2 to calculate the contrastive log-probability delta.
- The Signed Envelope (Delivery): The verified brief is sealed with its cryptographic Claim Ledger, surfacing the coordinates, statistical bounds, and contrastive deltas via OpenTelemetry traces.
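The four stages above can be sketched as a composition of injected stage callables. This is a minimal structural sketch, not the production orchestrator; every name here (FactSpan, VerifiedClaim, run_tri_pass, and the toy stage lambdas) is hypothetical:

```python
from dataclasses import dataclass

# Hypothetical span record returned by Pass 2: exact token coordinates
# into the raw brief, as described for the Extractor.
@dataclass
class FactSpan:
    start_token: int
    end_token: int
    text: str

# Hypothetical ledger entry pairing a span with its Pass-3 result.
@dataclass
class VerifiedClaim:
    span: FactSpan
    contrastive_delta: float

def run_tri_pass(prompt, context, generate, extract, score):
    """Compose the three passes; generate/extract/score stand in for the
    inference engine (Pass 1 and 3) and the spaCy pipeline (Pass 2)."""
    brief = generate(prompt, context)   # Pass 1: raw synthesis
    spans = extract(brief)              # Pass 2: deterministic decomposition
    ledger = [VerifiedClaim(s, score(brief, s, context)) for s in spans]  # Pass 3
    return brief, ledger

# Toy stage implementations, just to exercise the skeleton.
brief, ledger = run_tri_pass(
    "q", "ctx",
    generate=lambda p, c: "the penalty is 500",
    extract=lambda b: [FactSpan(3, 4, "500")],
    score=lambda b, s, c: 2.1,
)
```

The point of the shape is that Pass 2 never touches model internals and Pass 3 never re-parses text: each stage consumes only the previous stage's artifact.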
4. Hierarchy of Trust (The Claim Ledger)
The system rejects probabilistic heuristics in favor of a rigid hierarchy of evidentiary signals. Geometric bounding is used as a supporting statistical control on claim fit relative to the Kura corpus, while causal attribution remains grounded primarily in contrastive log-probability deltas over extracted factual spans.
Layer 1: Out-of-Band Entity Extraction (Absolute Determinism)
The extraction layer utilizes custom EntityRuler and DependencyMatcher rules to deterministically capture structural facts for compliance workflows:
- Regulatory Citations: Exact token matchers for atomic government clauses (e.g., treating “DFARS 252.204-7012” as a single verifiable entity).
- Liability & Capital Bounding: Syntactic dependency trees that map financial integers to their controlling nouns (preventing the model from semantically swapping a “valuation” for a “penalty”).
- Relative Temporal Deadlines: Custom rules to capture contract triggers (“Net 30”, “T+45”).
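To make the rule shapes concrete, here is a minimal sketch of spaCy-style EntityRuler patterns for the citation and deadline cases, expressed as plain dictionaries (loading them would use nlp.add_pipe("entity_ruler").add_patterns([...])). The labels and regexes are illustrative assumptions, not the production ruleset:

```python
# spaCy-style EntityRuler patterns as plain data; no pipeline is constructed here.
regulatory_citation = {
    "label": "REG_CITATION",
    # Treats "DFARS 252.204-7012" as one atomic entity rather than loose tokens.
    "pattern": [{"TEXT": "DFARS"}, {"TEXT": {"REGEX": r"^\d{3}\.\d{3}-\d{4}$"}}],
}

net_deadline = {
    "label": "DEADLINE",
    # Captures relative triggers like "Net 30".
    "pattern": [{"LOWER": "net"}, {"IS_DIGIT": True}],
}

t_plus_deadline = {
    "label": "DEADLINE",
    # Captures "T+45"-style triggers; treating "T+45" as a single token is an
    # assumption about the tokenizer that a real rule would verify.
    "pattern": [{"TEXT": {"REGEX": r"^T\+\d+$"}}],
}

patterns = [regulatory_citation, net_deadline, t_plus_deadline]
```

The liability-bounding case would use a DependencyMatcher rather than an EntityRuler, since it constrains the syntactic head of a number rather than its surface form.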
Layer 2: Contrastive Information Gain (The Primary Causal Proof)
The foundational mathematical proof of attribution relies on the causal shift in model certainty, calculated during Pass 3. For each extracted factual span, the Evaluator computes a contrastive log-probability delta:

Δ = log P(span | prompt + injected context) − log P(span | prompt alone)

If Δ is significantly positive, the ledger mathematically proves that the injected context exerted a direct causal influence on the generation of the specific factual token, successfully separating grounded claims from the model’s pre-training priors.
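A minimal numeric sketch of this delta, with illustrative per-token log-probabilities (the values are made up; a real pipeline would read them from the instrumented Pass-3 forward pass):

```python
def contrastive_delta(logp_with_context, logp_without_context):
    """Sum per-token log-probs over the extracted span under each condition.
    A positive result means the injected context raised the model's certainty
    on the span; a near-zero result suggests the claim came from priors."""
    return sum(logp_with_context) - sum(logp_without_context)

# Toy per-token log-probs for a 3-token factual span.
with_ctx = [-0.2, -0.1, -0.3]     # span is likely given the injected context
without_ctx = [-2.5, -1.8, -2.1]  # span is unlikely from the bare prompt
delta = contrastive_delta(with_ctx, without_ctx)  # (-0.6) - (-6.4) ≈ 5.8
```

Because both terms are evaluated over the exact token coordinates from Pass 2, the delta is attributable to a specific claim rather than to the brief as a whole.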
Layer 3: Distribution-Aware Geometric Bounding (The Sanity Boundary)
To account for the anisotropic nature of embedding spaces and the instability of small ephemeral payloads, geometric bounding utilizes Mahalanobis distance scaled by Ledoit-Wolf shrinkage.
- Execution: During ingestion, the system computes the centroid and a shrunk, invertible covariance matrix of the Kura corpus.
- Control Limit: An ellipsoidal boundary is established using the chi-squared (χ²) distribution of the squared Mahalanobis distance. This acts as a robust gating mechanism that resists collapsing under narrow or highly redundant payloads, serving as a geometric plausibility check rather than semantic proof.
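A compact sketch of the shrunk-covariance distance computation. For brevity this uses a fixed shrinkage intensity toward a scaled identity target, which is a simplified stand-in for Ledoit-Wolf (Ledoit-Wolf estimates the intensity from the data itself); the dimensions and threshold handling are illustrative:

```python
import numpy as np

def shrunk_mahalanobis(x, corpus, lam=0.1):
    """Mahalanobis distance of x from the corpus centroid, with the sample
    covariance shrunk toward a scaled identity so the matrix stays invertible
    even for narrow or highly redundant payloads."""
    mu = corpus.mean(axis=0)
    centered = corpus - mu
    s = centered.T @ centered / len(corpus)        # sample covariance
    d = s.shape[0]
    target = np.trace(s) / d * np.eye(d)           # scaled-identity target
    s_shrunk = (1 - lam) * s + lam * target        # fixed-intensity shrinkage
    diff = x - mu
    return float(np.sqrt(diff @ np.linalg.solve(s_shrunk, diff)))

rng = np.random.default_rng(0)
corpus = rng.normal(size=(50, 4))                  # toy 4-d embedding payload
at_centroid = shrunk_mahalanobis(corpus.mean(axis=0), corpus)  # distance 0
far_away = shrunk_mahalanobis(np.full(4, 10.0), corpus)        # large distance
```

In the full design the gate would compare the squared distance against a χ² quantile for the embedding dimensionality rather than a hand-picked cutoff.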
Evolution into the current Claim Ledger
The three founding layers above evolved into the production Claim Ledger’s L1–L4 taxonomy:
- L1 (Calibrated Confidence) incorporates the deterministic extraction principles from Layer 1, extended with token-level logprob calibration.
- L2 (Source Entailment) subsumes Layer 2’s contrastive attribution into a broader evidence-checking framework using embedding similarity and NLI.
- L3 (Stability) adds multi-draw regeneration analysis — available in all tiers where deterministic sampling is supported, with full control in Refinery and Clean Room.
- L4 (Representation Uncertainty) adds hidden-state probes for internal volatility detection — available only in self-hosted environments (Refinery, Clean Room) where model internals are accessible.
Layer 3’s geometric bounding (Mahalanobis/Ledoit-Wolf) continues to operate as a supporting statistical control within the Kura evidence boundary, not as a numbered Claim Ledger layer.
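One way L3's multi-draw regeneration analysis might be scored is as the recurrence rate of the modal set of extracted spans across draws. This is a hypothetical illustration of the idea, not the production metric:

```python
from collections import Counter

def stability_score(extracted_spans_per_draw):
    """Fraction of regeneration draws whose extracted factual spans exactly
    match the most common span set; 1.0 means every draw agreed."""
    keys = [tuple(sorted(spans)) for spans in extracted_spans_per_draw]
    modal_count = Counter(keys).most_common(1)[0][1]
    return modal_count / len(keys)

draws = [
    {"Net 30", "DFARS 252.204-7012"},
    {"Net 30", "DFARS 252.204-7012"},
    {"Net 30"},  # one regeneration dropped the citation
]
score = stability_score(draws)  # 2 of 3 draws agree -> 2/3
```

Scoring agreement over Pass-2 span sets, rather than raw text, keeps the signal insensitive to harmless surface rewording between draws.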
5. Epistemological Limits (The Exclusion Caveat)
This architecture provides traceable positive attribution. It proves what external data influenced a specific token within the bounds of the governed evidence and the model boundary available in the deployment tier.
It does not provide complete exclusion proofs of pre-training priors. The system cannot cryptographically guarantee that a parameter deep within the weights did not subtly shape the grammatical connective tissue surrounding the verified claims. The Claim Ledger bounds the extracted facts, not the latent reasoning that connected them.
Degraded boundaries. When the ingestion pipeline excludes a document from the Kura evidence boundary (e.g., a corrupt file quarantined by the DLQ), the standard output states (AUTHORIZED, PARTIAL, REQUIRES_SPEC, NARRATIVE_ONLY, BLOCKED) still apply, but the response carries a DEGRADED_BOUNDARY annotation indicating that the evidence boundary was incomplete. This annotation does not change the output state — it adds provenance metadata so reviewers know the scope of the evidence was narrower than intended.
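The state-plus-annotation behavior can be sketched as a small envelope record; the class and field names here are hypothetical, chosen only to show that the annotation is additive provenance rather than a state transition:

```python
from dataclasses import dataclass, field

# The standard output states named above.
OUTPUT_STATES = {"AUTHORIZED", "PARTIAL", "REQUIRES_SPEC", "NARRATIVE_ONLY", "BLOCKED"}

@dataclass
class Envelope:
    state: str
    annotations: list = field(default_factory=list)

    def mark_degraded(self, excluded_doc_id: str):
        """Attach provenance metadata about an excluded document without
        changing the output state."""
        self.annotations.append(("DEGRADED_BOUNDARY", excluded_doc_id))

env = Envelope(state="AUTHORIZED")
env.mark_degraded("doc-17")  # "doc-17" is a hypothetical quarantined file id
# env.state is still "AUTHORIZED"; reviewers see the narrowed evidence scope.
```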
6. Infrastructure & Telemetry Primitives
- Inference Serving: vLLM with PagedAttention handles Pass 1. Pass 3 utilizes an instrumented forward pass (or an isolated shadow KV cache) to calculate log-probs without requiring destructive kernel-level toggles to FlashAttention.
- Policy-as-Code: Boundary Gate evaluating the Mahalanobis thresholds and causal deltas, strictly decoupled from application logic.
- Observability: OpenTelemetry (OTel) tracing the exact extraction coordinates, Ledoit-Wolf bounds, and contrastive log-probs into the final immutable ledger.
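As a sketch of what a Pass-3 trace might carry, here is an attribute payload shown as a plain dictionary (with the OpenTelemetry SDK these would be attached via span.set_attribute calls). The attribute names and values are assumptions, not a fixed Kenshiki schema:

```python
# Illustrative per-claim span attributes; all values are OTel-compatible
# primitives so they can be attached to a trace span directly.
ledger_attributes = {
    "kenshiki.claim.token_start": 3,          # Pass-2 extraction coordinates
    "kenshiki.claim.token_end": 4,
    "kenshiki.claim.contrastive_delta": 5.8,  # Pass-3 causal signal
    "kenshiki.claim.mahalanobis_distance": 1.4,   # geometric sanity boundary
    "kenshiki.claim.shrinkage_intensity": 0.12,   # covariance conditioning
}
```

Keeping every ledger field on the trace means the immutable record and the operational telemetry are the same artifact, not two systems to reconcile.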
© 2026 Kenshiki Labs · kenshikilabs.com · All rights reserved.
This document may be shared for evaluation purposes. Redistribution requires written permission.
https://kenshikilabs.com/articles/bounded-synthesis
Further reading
Current Architecture
Platform Architecture
How the ideas in this RFC evolved into the Kura/Kadai contract and the bounded-synthesis pipeline.
Technical Article
AI Neurosurgery
The current inference-time observability system built on the principles established in this RFC.
Founding RFC
The Ingestion Pipeline
Phase 0 — how raw documents become the Kura evidence boundary that this architecture consumes.
Tool
Claim Ledger
The verification engine that implements the hierarchy of trust described in this document.