Founding RFC
Kenshiki Governed Intelligence Architecture
The canonical end-to-end specification for governed RAG: how identity, ingestion, retrieval, prompting, inference, and audit compose into a single deterministic pipeline that proves what the evidence caused.
Status: Founding RFC — Approved
1. Abstract
Standard RAG systems retrieve whatever is nearest in embedding space, paste it into a prompt, stream a response, and hope for the best. There is no authority boundary on evidence, no structural control over what the model sees, no causal proof that evidence influenced the output, and no deterministic audit trail connecting a claim back to a specific governed source.
Kenshiki replaces this with a six-stage deterministic pipeline: identity defines what each source is and what it must never answer; ingestion converts documents into SIRE-tagged, geometrically bounded evidence; retrieval enforces exclusion and authorization gates before evidence reaches the model; prompt compilation assembles a governed prompt contract that positions evidence where attention mechanisms will weight it; Tri-Pass inference generates, decomposes, and causally verifies every claim; and the Claim Ledger produces an immutable audit record proving what the evidence caused.
The model is treated as an untrusted synthesizer operating within a deterministic evidence boundary. Governance is structural, not aspirational.
2. Motivation
Five problems make standard generative AI architectures unsuitable for consequential operations:
- No authority boundary. Vector similarity cannot distinguish regulatory jurisdictions. A SOC 2 question retrieves HIPAA evidence because the embeddings are close, even though the legal obligations differ.
- Circular governance. Using the model to evaluate its own output, define its own schemas, or enforce its own constraints introduces unresolvable circular dependencies.
- Attention decay buries authority. In unstructured prompts, critical grounding constraints placed early are weakened by mid-context degradation. The model has forgotten the rules by the time it generates.
- Post-generation is too late. If the prompt allows improvisation, no amount of post-hoc scoring can recover a fundamentally ungoverned generation.
- Correlation is not causation. Cross-attention weights and embedding proximity demonstrate co-occurrence, not that evidence caused a specific claim. Regulatory audit requires causal proof.
3. System Overview
The pipeline is a strict sequence of six subsystems. Each subsystem’s output is the next subsystem’s input. No subsystem can be bypassed.
| # | Subsystem | Contract | Key property |
|---|---|---|---|
| 1 | Identity | Algorithm 1: SIRE(d^v) = (S, I, R, E) | Human approval: PROPOSED → APPROVED → ACTIVE |
| 2 | Ingestion | Ingest(D, P) → (K, B, M) | SIRE tags stamped, geometric boundary locked |
| 3 | Retrieval | Algorithm 2: C → exclusion gate → ReBAC gate → C′ | Operates on SIRE-tagged chunks + boundary B = (μ, Σ) |
| 4 | Prompt compilation | Compile(q, C′, G, T) → P | Zone assignment by ingestion metadata |
| 5 | Inference | Pass 1: generate from P · Pass 2: deterministic claim extraction · Pass 3: contrastive causal verification | Tri-Pass; the model never self-evaluates |
| 6 | Claim Ledger + Boundary Gate | Per-claim verification record → emit or block | Immutable audit record |
Dependency chain: SIRE identity feeds ingestion. Ingestion feeds Kura. Kura feeds retrieval. Retrieval feeds the Prompt Compiler. The Prompt Compiler feeds inference. Inference feeds the Claim Ledger. The Claim Ledger feeds the Boundary Gate. The Boundary Gate emits or blocks.
4. Core Primitives
4.1 SIRE Tuple
The identity of a versioned document d^v:
| Field | Type | Cardinality | Role |
|---|---|---|---|
| S (Subject) | Normalized identifier | Exactly 1 | Primary key for evidence grouping. Derived from oracle_id. |
| I (Included) | Ordered term set | Max 24 | Terminology the source covers. Informs retrieval ranking. |
| R (Relevant) | Ordered term set | Max 12 | Cross-domain references. Enables relationship discovery. |
| E (Excluded) | Ordered term set | Max 8 | Hard boundary. Only enforcing field. Gates retrieval. |
State lifecycle: PROPOSED (algorithmic extraction) → APPROVED (human review) → ACTIVE (written to frontmatter). Immutable per document version. Updating SIRE requires a new version d^(v+1).
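As a minimal sketch, the tuple and its invariants (the cardinality caps from the table, immutability via a frozen type, and the no-self-exclusion rule from Algorithm 1) might look like this in Python; the field names and validation style are illustrative, not normative:

```python
from dataclasses import dataclass

@dataclass(frozen=True)  # frozen: SIRE is immutable per document version
class SireTuple:
    subject: str                 # S — exactly one normalized identifier
    included: tuple = ()         # I — max 24 ordered terms (informs ranking)
    relevant: tuple = ()         # R — max 12 ordered terms (cross-domain)
    excluded: tuple = ()         # E — max 8 terms; the only enforcing field
    state: str = "PROPOSED"      # PROPOSED → APPROVED → ACTIVE

    def __post_init__(self):
        if len(self.included) > 24 or len(self.relevant) > 12 or len(self.excluded) > 8:
            raise ValueError("SIRE cardinality caps violated")
        # A source's Excluded list must never contain its own identity terms
        # (tokens(e) ⊆ self is the self-referential case from Algorithm 1, step 6).
        subj_tokens = set(self.subject.replace("_", " ").split())
        for e in self.excluded:
            if set(e.split()) <= subj_tokens:
                raise ValueError(f"self-referential exclusion: {e}")

sire = SireTuple(subject="soc_2_trust_services_criteria",
                 included=("access control", "change management"),
                 excluded=("hipaa", "gdpr", "pci dss"))
```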
4.2 Ingestion Function
| Symbol | Definition |
|---|---|
| D | Set of raw source documents (PDF, DOCX, JSON, Markdown, YAML, CSV) |
| P | Pipeline configuration (chunk size, overlap, embedding model, SIRE config) |
| K | Set of SIRE-tagged, embedded chunks in Kura |
| B | Geometric boundary B = (μ, Σ): centroid μ and Ledoit-Wolf shrunk covariance Σ |
| M | Pipeline metadata: DLQ quarantine list, telemetry, DEGRADED_BOUNDARY flag |
4.3 Evidence Sets
| Symbol | Definition |
|---|---|
| C | Candidate chunks from hybrid search (pgvector semantic + tsvector lexical) |
| C′ | Governed evidence set after SIRE exclusion gate and ReBAC authorization gate |
4.4 CFPO Zones
The runtime prompt assembly contract, in mandatory order:
- Content — identity, mission, domain framing (exploits primacy effect)
- Format — schemas, reference structures, taxonomies
- Policy — behavioral constraints and compliance requirements (authority re-stated near generation boundary)
- Output — response schema and exact emission format (exploits recency effect)
4.5 Compiled Prompt
| Symbol | Definition |
|---|---|
| q | The query |
| C′ | Governed evidence set from retrieval |
| G | Governance profile |
| T | Versioned template skeleton (CFPO zones, model ID, temperature, change summary) |
| P | The compiled, zone-ordered, authority-scoped prompt |
4.6 Claim Ledger Record
For each generated response, the Ledger produces:
- Per-claim extraction coordinates (exact token spans from deterministic decomposition)
- L1 calibrated confidence scores (token-level logprob distributions)
- L2 source entailment scores (embedding similarity + NLI against C′)
- L3 stability scores (cross-draw reproducibility, where tier permits)
- L4 representation uncertainty (hidden-state probes, self-hosted only)
- Contrastive attribution delta per claim: Δ_c = log p(c | P) − log p(c | P ∖ C′), the log-probability shift when the evidence is ablated
- Composite verification status per claim: VERIFIED, PARTIALLY_VERIFIED, UNVERIFIED, REFUSED
- Output state: AUTHORIZED, PARTIAL, REQUIRES_SPEC, NARRATIVE_ONLY, BLOCKED
4.7 Geometric Boundary
B = (μ, Σ), where μ is the corpus centroid and Σ is the Ledoit-Wolf shrunk covariance matrix. The boundary establishes an ellipsoidal control limit via the chi-squared distribution. Mahalanobis distance is the gating metric: D_M(x) = √((x − μ)ᵀ Σ⁻¹ (x − μ)), with a point in-boundary when D_M(x)² falls below the chosen chi-squared quantile.
This is a geometric plausibility check — a sanity boundary, not semantic proof. It supports the Claim Ledger but does not replace causal attribution.
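A self-contained sketch of the boundary mechanics, in 2-D so the linear algebra stays readable. Real ingestion operates on full embedding dimensions and computes the Ledoit-Wolf shrinkage intensity in closed form (e.g. via sklearn.covariance.LedoitWolf); the fixed delta below is an illustrative stand-in:

```python
import random

# Stand-in corpus: 500 two-dimensional "embeddings".
random.seed(0)
X = [(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(500)]
n = len(X)

# Centroid mu of the corpus.
mu = (sum(x for x, _ in X) / n, sum(y for _, y in X) / n)

# Sample covariance entries (2x2 symmetric: sxx, sxy, syy).
sxx = sum((x - mu[0]) ** 2 for x, _ in X) / (n - 1)
syy = sum((y - mu[1]) ** 2 for _, y in X) / (n - 1)
sxy = sum((x - mu[0]) * (y - mu[1]) for x, y in X) / (n - 1)

# Shrinkage toward a scaled identity target. Real Ledoit-Wolf chooses
# delta in closed form; 0.1 is fixed here purely for illustration.
delta = 0.1
t = (sxx + syy) / 2
sxx, sxy, syy = ((1 - delta) * sxx + delta * t,
                 (1 - delta) * sxy,
                 (1 - delta) * syy + delta * t)

def mahalanobis_sq(p):
    # Quadratic form d^T Sigma^-1 d via the closed-form 2x2 inverse.
    det = sxx * syy - sxy * sxy
    dx, dy = p[0] - mu[0], p[1] - mu[1]
    return (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det

LIMIT = 9.21  # ≈ chi-squared 0.99 quantile for 2 dims (illustrative choice)

def in_boundary(p):
    return mahalanobis_sq(p) <= LIMIT
```

The ellipsoidal control limit falls out directly: squared Mahalanobis distance of in-distribution points follows (approximately) a chi-squared law, so the gate is a single comparison.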
5. Lifecycle: From Document to Claim
A single end-to-end trace through the pipeline.
A corpus engineer submits 40 vendor contracts for a SOC 2 assessment.
Stage 1 — SIRE Identity. For each document, Algorithm 1 extracts SIRE proposals from frontmatter and body text. The SOC 2 contracts receive Subject soc_2_trust_services_criteria, Included terms like access control, change management, availability, and Excluded terms hipaa, gdpr, pci dss (self-referential soc 2 is automatically removed). The corpus engineer reviews and approves each proposal. State transitions to ACTIVE.
Stage 2 — Air-Gapped Ingestion. The ingestion DAG processes all 40 documents through five stages: air-gapped parsing (GPU-accelerated layout analysis, no external API calls), deterministic chunking (section-aware, 50-token overlap, normative language detection, clause ID extraction), streaming embedding, bulk COPY into a run-specific pgvector table (no indexes), and geometric boundary calculation (Ledoit-Wolf). Document 12 is a corrupted scan — after 3 retries, it is quarantined in the DLQ. The pipeline continues. The boundary is computed over 39 documents. Metadata M carries DEGRADED_BOUNDARY listing the excluded file.
Stage 3 — Query Arrives. An analyst asks: “What are our obligations under TSC CC6.1 for logical access controls?”
Stage 4 — SIRE + ReBAC Retrieval. Hybrid search (cosine + lexical) produces candidate set C. The exclusion gate runs: any chunk containing a term in E (e.g., a chunk mentioning HIPAA physical safeguards that surfaced via embedding proximity) is purged. The ReBAC gate runs: any chunk the analyst is not authorized to access is purged. The intersection C′ contains only SOC 2 evidence the analyst is permitted to see, ranked by hybrid score within subject groups.
Stage 5 — CFPO Prompt Compilation. The Compiler receives (q, C′, G, T). It classifies each chunk into a CFPO zone by ingestion metadata: normative mandates (SHALL/MUST, clause IDs) go to Policy, structural definitions go to Format, advisory narrative goes to Content, response templates go to Output. It assembles the prompt in strict CFPO order, runs five deterministic rewrite passes (context placement, instruction reinforcement, authority zone isolation, mechanism competition handling, format-pressure resolution), and emits compiled prompt P. The model never sees the raw query or raw evidence — only the compiled, zone-ordered, authority-scoped prompt. P is logged for audit.
Stage 6 — Tri-Pass Inference + Claim Ledger. Pass 1: the model generates response R from P. Pass 2: an out-of-band deterministic extractor (spaCy dependency parsing, entity recognition, rule-based matchers) decomposes R into atomic claim spans with exact token coordinates. Pass 3: for each claim, the evaluator computes the contrastive attribution delta Δ. L1 scores calibrated confidence. L2 checks source entailment against C′. L3 tests stability via multi-draw regeneration (tier permitting). L4 probes hidden-state volatility (self-hosted only). The composite verification maps to an output state. The Boundary Gate emits AUTHORIZED if all claims verify, PARTIAL if mixed, BLOCKED if contradicted.
The Claim Ledger records everything: query, SIRE tuples for all sources, chunks before and after each gate, exclusion and ReBAC decisions, compiled prompt version, per-claim extraction coordinates, per-claim layer scores, contrastive deltas, and final output state. The DEGRADED_BOUNDARY annotation from Document 12’s quarantine propagates into the Ledger — reviewers see that the evidence scope was narrower than intended.
6. Subsystem Specifications
6.1 SIRE Identity and Governed Retrieval
Algorithm 1: SIRE Identity Inference
Input: F (frontmatter: oracle_id, title, frameworks),
T (document body text),
P (config: domain phrases, thresholds, caps, default exclusions)
Output: (S, I, R, E) in PROPOSED state
1. S ← normalize(F.oracle_id) // lowercase, underscores
2. terms ← union(
phrase_extract(T, P.domain_phrases), // curated dictionary scan
acronym_extract(T), // [A-Z]{2,8}, count ≥ 2
significant_word_extract(T), // 4+ chars, count ≥ 4
capitalized_term_extract(T) // proper nouns, count ≥ 2
)
3. I ← deduplicate(terms, priority: phrases > acronyms > words) [:24]
4. R ← (F.frameworks ∪ capitalized_terms) \ I [:12]
5. self ← normalize(F.oracle_id ∪ S ∪ F.title ∪ F.frameworks)
6. E ← {e ∈ P.default_excludes : tokens(e) ⊄ self} [:8]
7. return (S, I, R, E) with state=PROPOSED
Determinism: No randomness, no model inference, no external calls. Identical (F, T, P) always produces identical output.
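A runnable sketch of the enforcing steps of Algorithm 1 (normalization, self-vocabulary construction in step 5, exclusion filtering in step 6) plus one of the extractors. The helper names mirror the pseudocode; the extractor details are deliberately simplified:

```python
import re
from collections import Counter

def normalize(s):
    # Lowercase, collapse non-word runs to underscores (step 1).
    return re.sub(r"\W+", "_", s.strip().lower()).strip("_")

def acronym_extract(text, min_count=2):
    # [A-Z]{2,8} tokens appearing at least min_count times.
    counts = Counter(re.findall(r"\b[A-Z]{2,8}\b", text))
    return [a for a, n in counts.items() if n >= min_count]

def propose_exclusions(oracle_id, title, frameworks, default_excludes, cap=8):
    # Step 5: the document's own identity vocabulary.
    self_tokens = set()
    for src in [oracle_id, title, *frameworks]:
        self_tokens |= set(normalize(src).split("_"))
    # Step 6: keep a default exclusion only if its tokens are NOT
    # entirely self-referential (tokens(e) ⊄ self).
    E = [e for e in default_excludes
         if not set(normalize(e).split("_")) <= self_tokens]
    return E[:cap]

E = propose_exclusions(
    "SOC_2_Trust_Services_Criteria", "SOC 2 Vendor Contract", ["SOC 2"],
    default_excludes=["hipaa", "gdpr", "pci dss", "soc 2"])
# "soc 2" is dropped as self-referential; the rest survive.
```

Because every step is a pure function of its inputs, re-running the proposal over the same frontmatter and text reproduces the same tuple, which is what makes human review of the PROPOSED state meaningful.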
Algorithm 2: SIRE-Governed Retrieval
Input: q (query), C (candidate chunks from hybrid search),
SIRE(d_c) for each chunk c, caller K, ReBAC graph G
Output: Ordered evidence set C′
1. for each c ∈ C:
h(c) ← α · cosine(c, q) + (1-α) · lexical(c, q) // hybrid score
2. groups ← group_by(C, subject) // subject grouping
rank groups by mean(h) descending
3. for each c from document d: // exclusion gate
for each e ∈ E_d:
if word_boundary_match(e, c.text, case_insensitive):
C ← C \ {c}; log(exclusion, c.id, e, S_d)
4. for each c from document d: // ReBAC gate
if ¬authorized(K, d, G):
C ← C \ {c}; log(authz_denial, c.id, K, d)
5. C′ ← rerank(C, h) within groups // final ordering
return C′
Two-layer policy composition: SIRE defines regulatory eligibility (property of evidence). ReBAC defines caller authorization (property of the relationship). The system takes the intersection. Neither gate overrides the other. They are independently managed, audited, and tested.
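The two gates can be sketched as independent filters whose surviving intersection is C′. The chunk shape and the authorized callback below are illustrative assumptions, not the spec's data model:

```python
import re

def word_boundary_match(term, text):
    # Case-insensitive whole-word match, as in Algorithm 2, step 3.
    return re.search(rf"\b{re.escape(term)}\b", text, re.IGNORECASE) is not None

def govern(chunks, excluded_by_doc, authorized, caller):
    audit, survivors = [], []
    for c in chunks:
        # Gate 1 — SIRE exclusion: a property of the evidence.
        hit = next((e for e in excluded_by_doc.get(c["doc"], [])
                    if word_boundary_match(e, c["text"])), None)
        if hit:
            audit.append(("exclusion", c["id"], hit))
            continue
        # Gate 2 — ReBAC: a property of the caller/document relationship.
        if not authorized(caller, c["doc"]):
            audit.append(("authz_denial", c["id"], caller))
            continue
        survivors.append(c)
    return survivors, audit   # C′ is the intersection of both gates

chunks = [
    {"id": 1, "doc": "d1", "text": "Logical access controls under CC6.1"},
    {"id": 2, "doc": "d1", "text": "HIPAA physical safeguards overlap"},
    {"id": 3, "doc": "d2", "text": "Change management procedure"},
]
c_prime, audit = govern(
    chunks,
    excluded_by_doc={"d1": ["hipaa"], "d2": []},
    authorized=lambda caller, doc: doc != "d2",   # caller lacks access to d2
    caller="analyst",
)
```

Note that neither gate consults the other: swapping the ReBAC backend cannot weaken exclusion, and editing an Excluded list cannot grant access.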
Boundary compliance scoring: For each query, the system computes a compliance score: 1.0 (fully compliant — all evidence from expected subjects), 0.5–0.99 (degraded — cross-subject evidence via Relevant graph), below 0.5 (non-compliant — insufficient coverage, may refuse).
6.2 Air-Gapped Ingestion and Geometric Boundary
The ingestion DAG is a five-stage pipeline. Each stage has typed inputs and outputs.
| Stage | Name | Input | Output |
|---|---|---|---|
| 1 | Extract | Raw documents D | Structured Markdown M_d (or DLQ quarantine) |
| 2 | Transform | Markdown M_d | Chunks C_d with SIRE tags, clause IDs, normative flags |
| 3 | Embed | Chunks C_d | Embedded chunks E_d with SHA-256 hash + HMAC watermark |
| 4 | Load | Embedded chunks E_d | Rows in run-specific pgvector table (no indexes) |
| 5 | Lock | All embedded chunks | Geometric boundary via Ledoit-Wolf |
Air-gap invariant: Stage 1 blocks all network calls to external APIs at the infrastructure level. The parser operates in complete isolation.
No-index rule: Ephemeral tables are never indexed. Corpora rarely exceed 10,000 chunks; a sequential exact KNN scan is faster than HNSW construction and guarantees exact recall. Cosine distance ranks candidates; Mahalanobis distance against B = (μ, Σ) is the downstream gating metric.
Fault tolerance (Poison Pill DLQ): A corrupt document is quarantined after 3 retries. The DAG continues. The boundary is computed over successful documents. The Claim Ledger receives a DEGRADED_BOUNDARY annotation listing excluded files. This annotation adds provenance metadata — it does not change output states.
Determinism: Given identical (D, P, embedding model version), the pipeline produces identical (K, B). Chunking is purely positional. Ledoit-Wolf shrinkage is a closed-form estimator. The only non-determinism is GPU floating-point rounding, bounded to machine epsilon.
Compute isolation: The ingestion pipeline must be physically or logically isolated from the inference engine. Shared GPU memory fragments vLLM’s continuous batching.
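Stage 3's integrity stamping (SHA-256 hash plus HMAC watermark) might be sketched as follows; the key handling and record field names are assumptions, not part of the spec:

```python
import hashlib
import hmac

SECRET = b"per-tenant-watermark-key"   # hypothetical key; real key management is out of scope

def stamp(chunk_text: str, chunk_id: str) -> dict:
    payload = chunk_text.encode("utf-8")
    digest = hashlib.sha256(payload).hexdigest()   # integrity: any mutation changes this
    # Keyed watermark binds the chunk id to its content hash.
    mark = hmac.new(SECRET, f"{chunk_id}:{digest}".encode(), hashlib.sha256).hexdigest()
    return {"chunk_id": chunk_id, "sha256": digest, "hmac": mark}

def verify(chunk_text: str, record: dict) -> bool:
    fresh = stamp(chunk_text, record["chunk_id"])
    # Constant-time comparison to avoid timing side channels.
    return hmac.compare_digest(fresh["hmac"], record["hmac"])

rec = stamp("Access reviews SHALL occur quarterly.", "d1-c42")
```

Because both the hash and the watermark are deterministic functions of the chunk, re-stamping during an audit either reproduces the record exactly or proves tampering.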
6.3 Prompt Governance (CFPO Prompt Compiler)
Why CFPO Order
- Primacy effect makes early framing sticky → Content first.
- Recency effect improves output-shape compliance → Output last.
- Mid-context degradation is well-established → authority constraints re-stated in late Policy zone.
Prompt Compilation Algorithm
Input: q (query), C′ (governed evidence), G (governance profile),
T (versioned template skeleton with CFPO zones)
Output: Compiled prompt P
1. Zone classification — for each chunk c ∈ C′:
if c has normative markers (SHALL/MUST, clause IDs) → Policy
if c has structural definitions, schemas → Format
if c has advisory narrative, domain context → Content
if c has response templates, output schemas → Output
else → Content (reduced authority weight)
2. Template assembly — load T, inject chunks in CFPO order:
Content → Format → Policy → Output
3. Context placement — position high-authority evidence where
attention mechanisms weight it (primacy + recency)
4. Instruction reinforcement — duplicate critical constraints
near the generation boundary (late Policy zone)
5. Authority zone isolation — separate evidence, instructions,
and user input with validated delimiters
6. Mechanism competition — strengthen retrieval authority signals
where parametric knowledge likely conflicts with evidence
7. Format-pressure resolution — enforce grounding constraints
over format completion pressure when evidence is missing
return P
Zone assignment is driven by ingestion metadata: SIRE tags, clause IDs, normative language markers, and source tier — all stamped during Stage 2 of ingestion. The Compiler does not evaluate evidence content. It positions evidence structurally.
Determinism: Given identical (q, C′, G, T), the Compiler always produces identical P. Zone classification is a deterministic function of chunk metadata. All seven steps are deterministic transformations.
Audit: The compiled prompt P is logged with query, evidence set identifiers, template version, and zone assignments. The Claim Ledger can reconstruct which evidence was placed in which zone for any historical request.
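A minimal sketch of deterministic zone classification and CFPO-ordered assembly. The chunk metadata fields and section delimiters are assumptions; only the zone order and metadata-driven classification come from the spec:

```python
CFPO_ORDER = ["content", "format", "policy", "output"]

def classify(chunk):
    # Pure function of ingestion metadata — never of chunk semantics.
    if chunk.get("normative") or chunk.get("clause_id"):
        return "policy"
    if chunk.get("kind") == "schema":
        return "format"
    if chunk.get("kind") == "response_template":
        return "output"
    return "content"   # default zone, reduced authority weight

def assemble(chunks):
    zones = {z: [] for z in CFPO_ORDER}
    for c in chunks:
        zones[classify(c)].append(c["text"])
    # Strict CFPO order: Content → Format → Policy → Output.
    return "\n\n".join(
        f"## {z.upper()}\n" + "\n".join(zones[z])
        for z in CFPO_ORDER if zones[z])

prompt = assemble([
    {"text": "Access SHALL be reviewed quarterly.", "normative": True},
    {"text": "Vendor risk overview.", "kind": "narrative"},
    {"text": "Respond as JSON: {...}", "kind": "response_template"},
])
```

Given identical chunks and metadata, the assembled prompt is byte-identical, which is what lets the Claim Ledger reconstruct zone placement for any historical request.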
6.4 Tri-Pass Inference and Claim Ledger
The Tri-Pass Pipeline
Pass 1 — Generator. The model executes the compiled prompt P using continuous batching. The model sees only the compiled, zone-ordered, authority-scoped prompt — never the raw query or raw evidence.
Pass 2 — Extractor. An out-of-band, deterministic NLP pipeline decomposes the response into atomic claim spans. Extraction uses dependency parsing, entity recognition, and rule-based matchers to produce exact token coordinates. This layer captures regulatory citations as atomic entities, maps financial integers to controlling nouns, and extracts temporal triggers. No model inference is involved.
Pass 3 — Evaluator. For each extracted claim span, the evaluator computes contrastive attribution: Δ = log p(claim | P) − log p(claim | P ∖ C′), the log-probability of the claim's tokens under the full compiled prompt minus their log-probability with the evidence ablated.
When Δ is significantly positive, the Ledger mathematically proves that the evidence exerted a direct causal influence on the generation of that specific token. This separates grounded claims from pre-training priors.
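Assuming the serving layer can score a claim's token span under both the full prompt and an evidence-ablated prompt (how those logprobs are obtained is deployment-specific), the delta itself is a simple log-probability difference:

```python
import math

def contrastive_delta(logp_with_evidence, logp_ablated):
    # Sum per-token logprobs over the claim span under each condition:
    # Delta = log p(claim | P) - log p(claim | P without C').
    return sum(logp_with_evidence) - sum(logp_ablated)

# Toy numbers: evidence makes the claim's tokens far more likely.
with_ev = [math.log(0.9), math.log(0.8)]
without_ev = [math.log(0.2), math.log(0.1)]
delta = contrastive_delta(with_ev, without_ev)
# delta > 0 → evidence raised the claim's likelihood
```

The threshold for "significantly positive" is left to implementation (see §12); the sign and magnitude of the delta are what the Ledger persists per claim.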
Claim Ledger Layers
| Layer | Signal | What It Proves | Availability |
|---|---|---|---|
| L1 | Calibrated confidence | Where the model is certain vs. guessing | All tiers |
| L2 | Source entailment | Whether evidence entails the claim (not just proximity) | All tiers |
| L3 | Stability | Whether the claim reproduces across draws | Where deterministic sampling is available |
| L4 | Representation uncertainty | Whether internal state is stable (surface confidence can mask instability) | Self-hosted only (Refinery, Clean Room) |
Each layer produces an independent signal. No layer depends on another layer’s output. The composite verification function maps available layer scores plus contrastive delta to a per-claim status (VERIFIED, PARTIALLY_VERIFIED, UNVERIFIED, REFUSED), then aggregates to an output state.
Output States
| State | Meaning |
|---|---|
| AUTHORIZED | All claims verified against governed evidence |
| PARTIAL | Mixed verification — some claims supported, some not |
| REQUIRES_SPEC | Evidence gaps prevent full verification |
| NARRATIVE_ONLY | Advisory response, not authoritative |
| BLOCKED | Contradicted by evidence or failed verification |
Output states are assigned by the Boundary Gate based on Claim Ledger output, not by heuristic or model self-assessment.
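A hedged sketch of how per-claim statuses might compose into an output state. The thresholds and precedence rules here are illustrative assumptions; the spec deliberately leaves the exact score-to-state mapping to the Boundary Gate implementation (see §12):

```python
def claim_status(delta, l2_entailment, contradicted=False):
    # Illustrative composite: contrastive delta plus L2 entailment.
    if contradicted:
        return "REFUSED"
    if delta > 0 and l2_entailment >= 0.8:
        return "VERIFIED"
    if delta > 0 or l2_entailment >= 0.5:
        return "PARTIALLY_VERIFIED"
    return "UNVERIFIED"

def output_state(statuses):
    # Boundary Gate aggregation: deterministic, external to the model.
    if not statuses or "REFUSED" in statuses:
        return "BLOCKED"
    if all(s == "VERIFIED" for s in statuses):
        return "AUTHORIZED"
    if any(s == "VERIFIED" for s in statuses):
        return "PARTIAL"
    return "REQUIRES_SPEC"   # evidence gaps prevent full verification

state = output_state([claim_status(2.1, 0.93), claim_status(0.4, 0.6)])
# mixed verification → PARTIAL
```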
Epistemic Limits
This architecture provides positive attribution: it proves what external evidence influenced a specific claim. It does not provide complete exclusion proofs of pre-training priors. The Claim Ledger bounds the extracted facts, not the latent reasoning that connected them. This is a deliberate design boundary stated in every audit record.
7. Global Invariants
Identity
- Every source in Kura must have all four SIRE fields (S, I, R, E) before entering the evidence boundary.
- Excluded is the only field that enforces. Subject, Included, and Relevant inform but never gate.
- SIRE proposals require human approval before application. State transitions: PROPOSED → APPROVED → ACTIVE.
- SIRE fields are immutable per document version. Updating SIRE requires a new version d^(v+1).
- A source’s Excluded list must never contain terms matching its own identity.
Ingestion
- The ingestion pipeline operates in complete air-gap. No external API calls during parsing.
- Ephemeral tables are never indexed. Retrieval uses exact KNN sequential scans.
- The geometric boundary is computed once per ingestion run and is immutable for that run.
- A quarantined document triggers DEGRADED_BOUNDARY but does not halt the pipeline.
- The embedding model version is recorded with every chunk. Version changes require re-ingestion.
Retrieval
- SIRE exclusion and ReBAC authorization are independent gates. The system takes the intersection.
- The exclusion gate runs at retrieval time, not ingestion time — changes take effect without re-indexing.
- Chunks from different versions of the same document are never mixed in a single retrieval.
Prompting
- CFPO order is mandatory for every compiled prompt. No exception.
- Output schema is always the closest block to the generation boundary.
- Authority constraints must be re-stated in the late Policy zone to survive mid-context degradation.
- The model never sees the raw query or raw evidence — only the compiled prompt.
- Zone classification is a function of ingestion metadata, not model behavior or query content.
Inference and Audit
- The model never evaluates its own output. The truth boundary is external.
- Claim decomposition is deterministic. Identical response text produces identical claim spans.
- Each Claim Ledger layer (L1–L4) produces an independent signal.
- Contrastive attribution proves what evidence caused a claim. It does not prove absence of pre-training influence.
- The Boundary Gate emits or blocks before the response reaches the caller. Unsupported claims are stopped, not explained after failure.
End-to-End
- Determinism chain. Given identical (documents, pipeline config, SIRE state, query, corpus version, caller identity, ReBAC graph, template version, model state), the pipeline produces identical output with identical Claim Ledger records.
- Audit sufficiency. The Claim Ledger stores sufficient state to reproduce any historical decision — retrieval, compilation, and verification — for regulatory review.
- No circular dependency. At no point does the model participate in evaluating, constraining, or governing its own output. Every governance decision is made by an external, deterministic system.
What the System Does Not Guarantee
- Exclusion of pre-training priors. The system proves what evidence caused a claim. It cannot prove that no model parameter influenced the connective tissue between verified facts.
- Semantic correctness of evidence. The system guarantees that evidence is properly identified, retrieved within authority boundaries, and causally attributed. It does not guarantee that the underlying source documents are factually correct.
- Perfect recall. Exact KNN on ephemeral tables guarantees no indexing loss, but embedding coverage is bounded by the corpus submitted. Evidence not in Kura cannot be retrieved.
8. Runtime Service Contract
8.1 API Surface
The pipeline is exposed through three endpoint groups. All requests carry tenant context (org_id, tenant_id) and caller identity (bearer token resolved to a principal for ReBAC).
Ingestion
| Method | Path | Purpose |
|---|---|---|
| POST | /v1/ingest | Submit documents D with pipeline config P. Returns run_id. |
| GET | /v1/ingest/{run_id} | Poll run status: RUNNING, COMPLETED, DEGRADED, FAILED. |
| GET | /v1/ingest/{run_id}/boundary | Retrieve geometric boundary B and DLQ metadata M. |
Request body for /v1/ingest includes: document payloads (multipart or S3 references), SIRE configuration overrides, embedding model selection, and chunk parameters. The response returns a run_id immediately; ingestion is asynchronous.
Query
| Method | Path | Purpose |
|---|---|---|
| POST | /v1/query | Submit governed query. Returns response R with Claim Ledger. |
| POST | /v1/query/stream | SSE stream: tokens, then final Claim Ledger envelope. |
Request body includes: query text, inference_profile (reliability or deep_reasoning), model_role (authoritative or advisory), response_mode (governed or narrative), and optional SIRE subject scope override. The caller’s ReBAC authorization is resolved from the bearer token — it is never client-supplied.
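A hypothetical request body for /v1/query, assembled from the fields named above; the exact payload schema is not normative here, and the subject-scope field name is an assumption:

```python
# Illustrative /v1/query payload. ReBAC identity comes from the bearer
# token on the HTTP request, never from this body.
payload = {
    "query": "What are our obligations under TSC CC6.1 for logical access controls?",
    "inference_profile": "reliability",     # or "deep_reasoning"
    "model_role": "authoritative",          # or "advisory"
    "response_mode": "governed",            # or "narrative"
    # Optional SIRE subject scope override (field name assumed):
    "sire_subject_scope": ["soc_2_trust_services_criteria"],
}
```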
Audit
| Method | Path | Purpose |
|---|---|---|
| GET | /v1/ledger/{request_id} | Retrieve full Claim Ledger record for a historical query. |
| GET | /v1/ledger/{request_id}/claims | Per-claim breakdown with layer scores and contrastive deltas. |
| POST | /v1/ledger/{request_id}/verify | Re-execute verification against logged state. Returns pass/fail. |
8.2 Streaming Semantics
For /v1/query/stream, the SSE contract is:
- Token events arrive as data: {"type": "token", "text": "..."} during generation.
- Think-tag content (model reasoning traces) is stripped before emission. The client never sees raw think blocks.
- After generation completes, the Claim Ledger evaluation runs server-side.
- A final envelope event data: {"type": "envelope", "output_state": "...", "ledger_summary": {...}} delivers the verification result.
- If the Boundary Gate returns BLOCKED, the token stream is replaced with a governed refusal. Partial tokens already streamed are invalidated by the envelope.
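A minimal client-side sketch of this contract. The "data: ..." framing follows standard SSE; the invalidation rule for BLOCKED is applied client-side here purely for illustration:

```python
import json

def consume(sse_lines):
    tokens, envelope = [], None
    for line in sse_lines:
        if not line.startswith("data: "):
            continue   # ignore comments, blank keep-alives, other fields
        event = json.loads(line[len("data: "):])
        if event["type"] == "token":
            tokens.append(event["text"])
        elif event["type"] == "envelope":
            envelope = event
            # BLOCKED invalidates everything already streamed.
            if event["output_state"] == "BLOCKED":
                tokens = []
    return "".join(tokens), envelope

text, env = consume([
    'data: {"type": "token", "text": "Access reviews "}',
    'data: {"type": "token", "text": "occur quarterly."}',
    'data: {"type": "envelope", "output_state": "AUTHORIZED", "ledger_summary": {}}',
])
```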
8.3 Multi-Tenant Isolation
- Every Kura table is tenant-scoped. Row-level security (RLS) enforces tenant provenance at the database level.
- SIRE state is per-tenant. One tenant’s exclusion list changes do not affect another tenant’s retrieval.
- ReBAC policy graphs are tenant-scoped. Cross-tenant authorization is structurally impossible.
- Claim Ledger records carry tenant_id. Audit queries are tenant-filtered at the query layer.
- Ingestion runs are tenant-isolated. Geometric boundaries are per-tenant, per-run.
9. Model Roles and Deployment Tiers
9.1 Two-Tier Model Roles
The architecture supports two model roles with different authority levels:
| Role | Purpose | SIRE Scope | CFPO Behavior | Claim Ledger |
|---|---|---|---|---|
| Authoritative | System-of-record answers for governed domains | Full SIRE-governed retrieval with exclusion enforcement | Full CFPO compilation with all rewrite passes | Full L1-L4 evaluation, contrastive attribution required |
| Advisory | Helper responses, exploratory analysis, narrative | Retrieval scoped to Relevant graph only (no primary Subject authority) | CFPO compilation with advisory-only Policy zone (no SHALL/MUST enforcement) | L1-L2 only, output state capped at NARRATIVE_ONLY |
Role selection is per-request, specified in the query body. The Prompt Compiler adjusts zone content based on role: authoritative requests receive normative mandates in the Policy zone; advisory requests receive them as informational context in the Content zone.
Role cannot escalate authority. An advisory response cannot produce AUTHORIZED output state regardless of evidence quality. The Boundary Gate enforces the ceiling.
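The authority ceiling can be sketched as a clamp applied by the Boundary Gate. The total ordering of output states used here is an illustrative assumption; only the ceiling values themselves come from the role table:

```python
# Per-role ceilings from §9.1; RANK imposes an assumed total order on states.
CEILING = {"authoritative": "AUTHORIZED", "advisory": "NARRATIVE_ONLY"}
RANK = ["BLOCKED", "REQUIRES_SPEC", "NARRATIVE_ONLY", "PARTIAL", "AUTHORIZED"]

def apply_ceiling(role, proposed_state):
    cap = CEILING[role]
    if RANK.index(proposed_state) > RANK.index(cap):
        return cap   # the Boundary Gate clamps, regardless of evidence quality
    return proposed_state

# An advisory request with perfect evidence is still capped:
state = apply_ceiling("advisory", "AUTHORIZED")   # → "NARRATIVE_ONLY"
```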
9.2 Deployment Tiers
| Tier | Infrastructure | L3 Availability | L4 Availability | Contrastive Attribution |
|---|---|---|---|---|
| Workshop | External model API (governed overlay) | Where API supports deterministic sampling | Not available | Where API supports logprobs |
| Refinery | Self-hosted inference (managed) | Full control | Available | Full logprob access |
| Clean Room | Self-hosted inference (air-gapped) | Full control | Available | Full logprob + hardware attestation |
Each tier increases proof depth. The Claim Ledger records which layers were available and evaluated. An answer verified at L1-L2 in Workshop is less deeply proven than the same answer verified at L1-L4 in Clean Room — but both are governed within their tier’s capabilities.
Tier does not affect identity, ingestion, retrieval, or prompting. SIRE, the ingestion DAG, SIRE+ReBAC retrieval, and CFPO compilation operate identically across all tiers. Only the inference-time observability depth varies.
10. Operational Guarantees and Failure Modes
10.1 Failure Semantics
The system is fail-closed by default. When a governance component is unavailable, the pipeline refuses rather than degrades silently.
| Failure | Behavior | Rationale |
|---|---|---|
| ReBAC backend unavailable | Fail closed. All retrieval returns empty C′. Query receives BLOCKED. | An unauthorized retrieval is worse than no retrieval. |
| Claim Ledger write fails | Response withheld. The Boundary Gate does not emit without a persisted Ledger record. | An unaudited response violates the audit sufficiency invariant. |
| Geometric boundary computation fails | Ingestion run marked FAILED. Chunks are not promoted to the active evidence boundary. | An unbounded corpus cannot be governed. |
| Embedding service unavailable | Ingestion stage 3 blocks. DLQ does not apply — embedding failure is systemic, not per-document. | Partial embedding would produce an inconsistent boundary. |
| SIRE state missing for a source | Source excluded from retrieval. Chunks without SIRE cannot pass the exclusion gate. | Ungoverned evidence must not reach the model. |
10.2 DEGRADED_BOUNDARY Propagation
DEGRADED_BOUNDARY is an annotation, not a failure state. It propagates through the pipeline:
- Ingestion sets it when documents are quarantined (DLQ).
- Retrieval passes it through to the Claim Ledger context.
- The Claim Ledger records it on every claim derived from the degraded corpus.
- The Boundary Gate does not change output states based on DEGRADED_BOUNDARY — it adds provenance metadata so reviewers know the evidence scope was narrower than intended.
10.3 Staleness Bounds
| Component | Staleness expectation |
|---|---|
| SIRE state | Propagates on next retrieval after ACTIVE transition. No re-indexing required. |
| Exclusion list changes | Immediate effect — the exclusion gate evaluates at retrieval time. |
| ReBAC policy changes | Immediate effect — the authorization gate queries the live policy graph. |
| Template skeleton updates | Take effect on next compilation. Version is logged in every Claim Ledger record. |
| Corpus changes | Require re-ingestion. The geometric boundary is per-run and immutable. |
11. Governance of Configuration
11.1 Change Authority
| Configuration | Who can change it | Approval process |
|---|---|---|
| Default SIRE exclusion list | Corpus engineer | PR review + Ledger tagging of affected subject groups |
| Per-source SIRE tuples | Corpus engineer | Algorithm 1 proposal → human review → APPROVED → ACTIVE |
| CFPO template skeletons | Prompt engineer | Versioned registry with model ID, temperature, and change summary |
| ReBAC policies | Security engineer | Policy-as-code PR, tested against recorded retrieval decisions |
| Inference profiles | Platform engineer | Manifest update with decode parameters, reviewed for SLO impact |
| Boundary Gate thresholds | Governance lead | Change requires re-evaluation against historical Claim Ledger data |
11.2 Change Safety
All configuration changes are:
- Versioned. Every change produces a new version with timestamp and author.
- Auditable. The Claim Ledger records which version of every configuration was active at query time.
- Testable. The verification endpoint /v1/ledger/{request_id}/verify can re-execute historical queries against new configuration to preview impact before rollout.
- Canary-safe. New template versions or SIRE changes can be deployed to a percentage of traffic with Ledger comparison between old and new.
11.3 Forbidden Operations
No automated system — including agentic coding agents — may:
- Bypass or weaken the SIRE exclusion gate (e.g., empty Excluded sets, wildcard matches)
- Change Excluded from an enforcing field to an informational field
- Introduce non-deterministic steps into the ingestion or compilation pipeline
- Allow the model to participate in evaluating its own output
- Emit responses without a persisted Claim Ledger record
- Merge ReBAC and SIRE into a single gate (they must remain independent)
- Remove the air-gap from the ingestion parser
These are architectural invariants, not implementation preferences. Violating them invalidates the governance guarantees of the entire system.
12. Implementation Notes
Mandatory Protocol vs. Implementation Detail
| Mandatory | Implementation detail (substitutable) |
|---|---|
| SIRE four-field identity with Excluded enforcement | Specific NLP extractors in Algorithm 1 (regex, spaCy, etc.) |
| Air-gapped parsing with no external API calls | Choice of parser (Docling, Unstructured, etc.) |
| No vector indexes on ephemeral tables | Choice of embedding model and dimensionality |
| CFPO zone ordering and deterministic compilation | Specific delimiter format for authority zone isolation |
| Tri-Pass architecture with out-of-band extraction | Choice of extraction framework (spaCy, stanza, etc.) |
| Contrastive causal attribution (Δ log-prob) | Specific thresholds for “significantly positive” |
| Immutable Claim Ledger with per-claim records | Storage backend for Ledger records |
| Boundary Gate emission control | Specific mapping from composite scores to output states |
| ReBAC authorization as independent gate | Choice of policy engine (OpenFGA, Cedar, etc.) |
| Fail-closed on governance component failure | Specific retry/backoff strategy |
| Two-tier model roles with authority ceiling | Specific model selection per role |
Extension Points
Future RFCs or services should extend the architecture at these boundaries:
- Continuous ingestion. Incremental re-ingestion, boundary recomputation on corpus changes, SIRE version migration across document updates.
- Feedback loops. Claim Ledger analytics feeding corpus curation (which sources produce the most UNVERIFIED claims), retrieval tuning (which exclusion rules trigger most frequently), and prompt template refinement.
- Hardware attestation. TPM-backed signing for Claim Ledger records in Clean Room deployments, extending the cryptographic chain from evidence through generation to delivery.
- Multi-model orchestration. Routing between authoritative and advisory models within a single session, with role transitions governed by SIRE subject coverage.
- Agent integration. Structured tool-use contracts for agentic systems that need to invoke governed queries, inspect Claim Ledger results, and make decisions based on output states.
How to Extend This System Safely
For engineers and agents adding capabilities:
Allowed extensions:
- New SIRE subjects (add to corpus, run Algorithm 1, get human approval)
- New ingestion enrichers (add to Stage 2, must be deterministic, must stamp metadata)
- New CFPO template skeletons (add to versioned registry with model ID and change summary)
- New Claim Ledger fields (additive only — never remove or rename existing fields)
- New output states (must map from composite verification, must be documented in the Boundary Gate)
- New ReBAC relationship types (must be independent of SIRE, must not weaken exclusion)
Forbidden modifications:
- Any change that makes the exclusion gate non-enforcing
- Any change that allows the model to see raw queries or uncompiled evidence
- Any change that introduces non-determinism into ingestion, compilation, or claim extraction
- Any change that allows responses without Claim Ledger records
- Any change that merges SIRE and ReBAC into a single evaluation
- Any change that makes a governance component fail-open
© 2026 Kenshiki Labs · kenshikilabs.com · All rights reserved.
This document may be shared for evaluation purposes. Redistribution requires written permission.
https://kenshikilabs.com/articles/governed-intelligence-architecture
Further reading
- The SIRE Identity System (Founding RFC): the deterministic identity system that controls what evidence enters the retrieval boundary.
- The Ingestion Pipeline (Founding RFC): how raw documents become governed evidence in Kura.
- Prompt Governance (Founding RFC): the CFPO prompt compilation contract.
- Platform Architecture (Current Architecture): where this specification is realized in the production system.
- Deterministic Admissibility Gating (Founding RFC): the pre-retrieval obligation gate that verifies human-approved evidence exists before the vector store is queried.