RFC-0003: Model Selection & Training

Status: draft

Purpose

Define how models are selected, trained, and validated to ensure the simulator is fit for purpose before governance constrains its outputs.

This RFC addresses the upstream dependency: if the model is trained on non-authoritative data, downstream governance cannot compensate for systematic errors.


The Model Fitness Problem

CAA governs model outputs. But model quality determines:

  1. Extraction accuracy — Can the model correctly identify ontology axes in user input?
  2. Hallucination baseline — How often does the model fabricate plausible values?
  3. Domain coverage — Does the model understand domain-specific terminology?
  4. Instruction following — Does the model reliably follow derived system prompts?

A model unfit for the domain produces errors that governance must constantly reject, degrading the user experience and raising the risk that some errors slip past governance entirely (false negatives).


Part I: Model Selection

Domain-Model Fitness Matrix

Every domain MUST declare minimum model requirements:

interface DomainModelRequirements {
  domain_id: string;
  ontology_id: string; // RFC-0001

  // Capability requirements
  minimum_capabilities: {
    context_window: number; // Minimum tokens
    structured_output: boolean; // JSON/tool calling support required
    instruction_following: "basic" | "strong" | "strict";
    multilingual?: string[]; // Required language support
  };

  // Performance requirements
  performance_thresholds: {
    extraction_accuracy: number; // Min accuracy on ontology axes (0-1)
    hallucination_rate: number; // Max acceptable hallucination rate (0-1)
    latency_p99_ms: number; // Max acceptable latency
  };

  // Evaluation requirements
  evaluation_dataset_id: string; // Reference to domain-specific eval set
  minimum_eval_score: number; // Threshold to pass evaluation

  // Cost constraints (optional)
  cost_constraints?: {
    max_cost_per_request_usd: number;
    max_monthly_budget_usd: number;
  };
}
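
For illustration, a hypothetical nutrition domain might declare requirements like the following (all identifiers and threshold values here are invented for this example, not normative):

const nutritionRequirements: DomainModelRequirements = {
  domain_id: "nutrition-usda",
  ontology_id: "nutrition-ontology-v2",

  minimum_capabilities: {
    context_window: 32_000,
    structured_output: true,
    instruction_following: "strong",
  },

  performance_thresholds: {
    extraction_accuracy: 0.95, // at least 95% on ontology axes
    hallucination_rate: 0.01, // at most 1% fabricated values
    latency_p99_ms: 4_000,
  },

  evaluation_dataset_id: "eval-nutrition-v2",
  minimum_eval_score: 0.9,
};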

Model Registry

Approved models MUST be registered before use:

interface ModelRegistryEntry {
  // Identity
  model_id: string; // Stable identifier (e.g., "claude-3-opus-20240229")
  provider: string; // e.g., "anthropic", "openai", "internal"
  model_family: string; // e.g., "claude-3", "gpt-4"

  // Capabilities
  capabilities: {
    context_window: number;
    max_output_tokens: number;
    structured_output: boolean;
    tool_calling: boolean;
    vision: boolean;
    instruction_following: "basic" | "strong" | "strict";
    languages: string[];
  };

  // Versioning
  version: string;
  release_date: string;
  deprecation_date?: string;

  // Governance
  registered_at: string;
  registered_by: string;
  approval_status: "approved" | "pending" | "deprecated" | "prohibited";

  // Domain approvals
  domain_approvals: DomainApproval[];
}

interface DomainApproval {
  domain_id: string;
  approved: boolean;
  eval_score: number;
  eval_date: string;
  eval_dataset_version: string;
  notes?: string;
}
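
A minimal registry lookup, sketched here as one possible enforcement point (the registry store itself is out of scope for this RFC), would gate model use on both global approval status and a passing domain approval:

function isApprovedForDomain(
  entry: ModelRegistryEntry,
  domain_id: string,
): boolean {
  // A model must be globally approved AND hold a passing approval
  // for the specific domain before it can be selected.
  if (entry.approval_status !== "approved") return false;
  const approval = entry.domain_approvals.find(
    (a) => a.domain_id === domain_id,
  );
  return approval?.approved === true;
}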

Model Selection Policy

Workflows MUST declare model selection criteria:

interface ModelSelectionPolicy {
  workflow_id: string;
  domain_id: string;

  // Selection strategy
  strategy: ModelSelectionStrategy;

  // Fallback chain
  fallback_models: string[]; // Ordered list of model_ids
  fallback_behavior: "degrade" | "block"; // What to do if all models fail

  // Routing rules (for multi-model setups)
  routing_rules?: RoutingRule[];
}

type ModelSelectionStrategy =
  | "fixed" // Always use specified model
  | "capability_match" // Select based on request requirements
  | "cost_optimized" // Cheapest model meeting requirements
  | "latency_optimized" // Fastest model meeting requirements
  | "quality_optimized"; // Highest eval score

interface RoutingRule {
  condition: {
    axis?: string; // Route based on ontology axis
    complexity?: "low" | "medium" | "high";
    authoritative_intent?: boolean;
  };
  model_id: string;
}

Selection Example (Medical Domain):

const medicalModelPolicy: ModelSelectionPolicy = {
  workflow_id: "anticoagulation-guidance",
  domain_id: "medical-warfarin",

  strategy: "quality_optimized",

  fallback_models: [
    "claude-3-opus-20240229", // Primary: highest accuracy
    "claude-3-sonnet-20240229", // Fallback: still approved for domain
    // GPT-4 NOT in list: failed domain eval
  ],
  fallback_behavior: "block", // Don't degrade for medical

  routing_rules: [
    {
      condition: { authoritative_intent: false },
      model_id: "claude-3-haiku-20240307", // Fast model for non-authoritative
    },
  ],
};
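
One way to realize the fallback chain is sketched below, reusing the isApprovedForDomain helper from Part I; the getEntry lookup is an assumed dependency, not part of this RFC:

function resolveModel(
  policy: ModelSelectionPolicy,
  getEntry: (model_id: string) => ModelRegistryEntry | undefined,
): string | null {
  // Walk the fallback chain in order; the first approved model wins.
  for (const model_id of policy.fallback_models) {
    const entry = getEntry(model_id);
    if (entry && isApprovedForDomain(entry, policy.domain_id)) {
      return model_id;
    }
  }
  if (policy.fallback_behavior === "block") {
    throw new Error(`no approved model for domain ${policy.domain_id}`);
  }
  return null; // "degrade": caller serves a reduced-capability path
}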

Part II: Training Data Governance

The Contamination Problem

Models trained on internet data contain "linguistic knowledge" — patterns that sound correct but aren't verified. For consequential domains, training data MUST come from authoritative sources.

Contamination Risk by Domain:

| Domain | Contamination Risk | Mitigation |
| --- | --- | --- |
| Nutrition | High: internet is full of diet myths | Train only on USDA/regulatory data |
| Medical | Critical: misinformation is widespread | Train only on peer-reviewed sources |
| Legal | High: varies by jurisdiction | Train only on verified case law |
| Finance | Medium: regulations change frequently | Train with version-dated sources |

Training Data Requirements

Fine-tuning datasets MUST satisfy provenance requirements:

interface TrainingDataset {
  dataset_id: string;
  domain_id: string;
  ontology_id: string;

  // Provenance
  sources: TrainingSource[];

  // Composition
  record_count: number;
  axis_coverage: Record<string, number>; // Axis → example count

  // Quality
  quality_metrics: {
    human_verified_percentage: number;
    oracle_derived_percentage: number; // Came from RFC-0002 oracles
    synthetic_percentage: number; // Generated examples
  };

  // Versioning
  version: string;
  created_at: string;
  checksum: string;
}

interface TrainingSource {
  source_id: string;
  oracle_id?: string; // Link to RFC-0002 oracle if applicable
  source_type: "oracle" | "curated" | "synthetic" | "external";
  record_count: number;

  // For non-oracle sources
  verification_method?: string;
  verified_by?: string;
  verification_date?: string;
}

Training Data Invariants:

  1. oracle_derived_percentage MUST be ≥ 80% for authoritative domains
  2. synthetic_percentage MUST be ≤ 10% for authoritative domains
  3. All training examples MUST map to valid ontology axes
  4. Training data versions MUST be reproducible from source oracles
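
A minimal sketch of how invariants 1 and 2 could be enforced at dataset registration time (assuming the percentage fields use a 0-100 scale; invariants 3 and 4 require ontology and oracle access and are omitted here):

function checkTrainingInvariants(
  ds: TrainingDataset,
  isAuthoritativeDomain: boolean,
): string[] {
  const violations: string[] = [];
  const q = ds.quality_metrics;
  if (isAuthoritativeDomain) {
    if (q.oracle_derived_percentage < 80) {
      violations.push("invariant 1: oracle_derived_percentage below 80%");
    }
    if (q.synthetic_percentage > 10) {
      violations.push("invariant 2: synthetic_percentage above 10%");
    }
  }
  // Invariants 3 and 4 need the ontology and source oracles in scope,
  // so they are checked by separate validators.
  return violations;
}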

Synthetic Data Constraints

Synthetic training examples (LLM-generated) are permitted with constraints:

interface SyntheticDataPolicy {
  permitted: boolean;
  max_percentage: number; // Of total training set

  // Generation constraints
  generation_constraints: {
    seed_from_oracle: boolean; // Must seed from real oracle data
    human_review_required: boolean;
    diversity_requirements: {
      min_unique_axis_combinations: number;
      max_repetition_rate: number;
    };
  };

  // Labeling
  synthetic_label_required: boolean; // Mark synthetic in dataset
}
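
As an illustration, an authoritative domain might pin synthetic data well below the 10% ceiling; the values here are examples, not normative:

const medicalSyntheticPolicy: SyntheticDataPolicy = {
  permitted: true,
  max_percentage: 5, // stricter than the 10% ceiling for authoritative domains

  generation_constraints: {
    seed_from_oracle: true,
    human_review_required: true,
    diversity_requirements: {
      min_unique_axis_combinations: 200,
      max_repetition_rate: 0.02,
    },
  },

  synthetic_label_required: true,
};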

Part III: Model Validation

Domain Evaluation Protocol

Before a model can be approved for a domain, it MUST pass evaluation:

interface DomainEvaluation {
  evaluation_id: string;
  model_id: string;
  domain_id: string;

  // Evaluation dataset
  dataset_id: string;
  dataset_version: string;

  // Test configuration
  test_config: {
    sample_size: number;
    sampling_strategy: "random" | "stratified" | "adversarial";
    temperature: number;
    num_runs: number; // For variance estimation
  };

  // Results
  results: EvaluationResults;

  // Metadata
  evaluated_at: string;
  evaluated_by: string;
  approval_decision: "approved" | "rejected" | "conditional";
  conditions?: string[];
}

interface EvaluationResults {
  // Extraction accuracy
  extraction: {
    overall_accuracy: number;
    per_axis_accuracy: Record<string, number>;
    confusion_matrix?: Record<string, Record<string, number>>;
  };

  // Hallucination detection
  hallucination: {
    hallucination_rate: number; // % of responses with fabricated data
    hallucination_by_axis: Record<string, number>;
    false_confidence_rate: number; // High confidence on wrong answers
  };

  // Instruction following
  instruction_following: {
    format_compliance: number; // % following output format
    constraint_adherence: number; // % respecting constraints
    refusal_appropriateness: number; // Correct refusals when data missing
  };

  // Latency
  latency: {
    p50_ms: number;
    p95_ms: number;
    p99_ms: number;
  };
}
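
A sketch of how results might be scored against a domain's declared thresholds, mapping DomainModelRequirements.performance_thresholds onto EvaluationResults (simplified to a binary decision; the "conditional" outcome is left to human reviewers):

function decideApproval(
  results: EvaluationResults,
  req: DomainModelRequirements,
): "approved" | "rejected" {
  const ok =
    results.extraction.overall_accuracy >=
      req.performance_thresholds.extraction_accuracy &&
    results.hallucination.hallucination_rate <=
      req.performance_thresholds.hallucination_rate &&
    results.latency.p99_ms <= req.performance_thresholds.latency_p99_ms;
  return ok ? "approved" : "rejected";
}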

Continuous Validation

Approved models MUST be continuously monitored:

interface ContinuousValidation {
  model_id: string;
  domain_id: string;

  // Monitoring config
  monitoring: {
    sample_rate: number; // % of production traffic to evaluate
    evaluation_frequency: "realtime" | "hourly" | "daily";
    alert_thresholds: {
      accuracy_drop: number; // Alert if accuracy drops by X%
      hallucination_spike: number; // Alert if hallucination rate exceeds X%
      latency_degradation: number; // Alert if p99 increases by X%
    };
  };

  // Drift detection
  drift_detection: {
    baseline_eval_id: string; // Reference evaluation
    drift_threshold: number; // Max acceptable drift from baseline
    revalidation_trigger: "automatic" | "manual";
  };
}
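
Drift against the baseline could be computed, for example, as the drop in overall extraction accuracy; this is one of several plausible drift metrics and this RFC does not prescribe a specific one:

function hasDrifted(
  baseline: EvaluationResults,
  current: EvaluationResults,
  validation: ContinuousValidation,
): boolean {
  // Drift here = absolute drop in overall extraction accuracy
  // relative to the baseline evaluation.
  const drop =
    baseline.extraction.overall_accuracy -
    current.extraction.overall_accuracy;
  return drop > validation.drift_detection.drift_threshold;
}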

Part IV: Persona Configuration (Optional)

Persona Layer

Model persona (voice, personality) is a configuration layer distinct from governance:

interface PersonaConfiguration {
  persona_id: string;
  name: string; // e.g., "Goober"

  // Voice characteristics
  voice: {
    formality: "casual" | "professional" | "clinical";
    warmth: "warm" | "neutral" | "reserved";
    verbosity: "concise" | "balanced" | "detailed";
  };

  // Communication style
  style: {
    use_first_person: boolean;
    greeting_style?: string;
    sign_off_style?: string;
    emoji_permitted: boolean;
  };

  // Domain-specific overrides
  domain_overrides?: Record<string, Partial<PersonaConfiguration["voice"]>>;
}
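
An example configuration for the "Goober" persona named above (field values are illustrative):

const gooberPersona: PersonaConfiguration = {
  persona_id: "goober-v1",
  name: "Goober",

  voice: {
    formality: "casual",
    warmth: "warm",
    verbosity: "concise",
  },

  style: {
    use_first_person: true,
    greeting_style: "Hey there!",
    emoji_permitted: true,
  },

  // Shift to a clinical register when serving the medical domain
  domain_overrides: {
    "medical-warfarin": { formality: "clinical", warmth: "neutral" },
  },
};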

Persona Constraints

Persona MUST NOT violate governance:

interface PersonaConstraints {
  // Persona cannot override governance
  governance_precedence: true; // Always true, not configurable

  // Forbidden behaviors
  forbidden: {
    soft_authority_in_personality: boolean; // Cannot embed medical advice in "friendly" tone
    hedging_circumvention: boolean; // Cannot use persona to avoid disclaimers
    authority_impersonation: boolean; // Cannot claim to be a doctor/lawyer/etc.
  };

  // Required behaviors
  required: {
    maintain_refusal_clarity: boolean; // Refusals must be clear regardless of persona
    preserve_uncertainty_markers: boolean; // Uncertainty cannot be hidden by warmth
  };
}

Persona Invariant:

Persona is cosmetic. Governance is structural. A warm, friendly persona MUST still emit clear refusals. A clinical persona MUST still include required disclaimers. Persona configuration MUST NOT affect authorization decisions.
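
One way to make this invariant structural rather than aspirational is to order the pipeline so persona styling only ever sees an already-governed response. The sketch below assumes hypothetical GovernedResponse and applyPersonaStyle constructs that are not defined by this RFC:

interface GovernedResponse {
  decision: "answer" | "refuse";
  required_disclaimers: string[];
  body: string;
}

function render(
  governed: GovernedResponse,
  persona: PersonaConfiguration,
  applyPersonaStyle: (text: string, p: PersonaConfiguration) => string,
): string {
  // Persona may restyle an answer's body, but refusal text and
  // required disclaimers pass through verbatim, whatever the voice.
  const styled =
    governed.decision === "refuse"
      ? governed.body
      : applyPersonaStyle(governed.body, persona);
  return [styled, ...governed.required_disclaimers].join("\n\n");
}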


Acceptance Criteria

A system is compliant with this RFC if:

  1. All models are registered in a Model Registry before use
  2. Domains declare minimum model requirements
  3. Model selection follows declared policy with fallback chain
  4. Training data satisfies provenance requirements (≥80% oracle-derived for authoritative)
  5. Models pass domain evaluation before approval
  6. Continuous validation monitors drift and triggers revalidation
  7. Persona configuration does not override governance constraints

Relationship to Other RFCs

| RFC | Relationship |
| --- | --- |
| RFC-0001 | Ontology defines axes for training data alignment |
| RFC-0002 | Oracles provide authoritative training data |
| RFC-0003 | Derived prompts assume the model can follow instructions |
| RFC-0004 | Extraction accuracy depends on model fitness |
| RFC-0007 | Model selection logic is opaque to the model |
| RFC-0011 | Model drift is a form of system drift |