ExI — The Universal Agent Control OS for Safe, Verifiable Autonomy

Σ · Executive Summary

One page. Problem, fault, consensus, implementation.

▸ THE PROBLEM

Generative AI is powerful as an assistant, unsafe as an execution layer.

Models hallucinate, deliver wrong answers with confidence, and lack the self-awareness to flag their own blind spots. Prompt-engineering and self-evaluation do not close the loop.

→ matters in financial, legal, operational, physical risk classes

▸ WHY IT MATTERS

Autonomous agents operate in a different risk class than chatbots.

Consumer chat can tolerate cheap errors through human-in-the-loop guardrails. Agents that execute real-world actions trigger financial, legal, operational, or physical consequences when they fail.

→ low-cost error tolerance ≠ high-stakes error tolerance

▸ WHY CURRENT AGENTS FAIL

Probabilistic engines cannot give deterministic guarantees.

The flaw is structural, not training. Statistics over causality. Pattern-matching mistaken for reasoning. Stateless attention diluting safety constraints across the context window. No amount of scaling fixes this.

→ a technological dead end for safety-critical autonomy

▸ THE EXI SOLUTION

A universal Agent Control OS — spec once, run anywhere.

Describe the domain once as a formal specification, and any model in any environment executes with mathematical guarantees of safety and reliability. A deterministic shield around generative breadth.

→ replaces fragmented rules-engines with one verifiable kernel

§02 · The Three Gaps

Generative AI is powerful — and three gaps make it unsafe to put in production.

Consumer chatbots can tolerate cheap errors through human-in-the-loop guardrails. Autonomous agents that execute real-world actions operate in a different risk class — where mistakes trigger financial, legal, operational, or physical consequences.

The same three gaps show up in every production deployment. They are not bugs to patch — they are structural properties of generative-only stacks.

A valid access token does not guarantee a valid decision.
A persuasive plan is not a verified plan.
A custom rules-engine is not a guarantee.

GAP · 01 · RELIABILITY

Prompts are not deterministic rules.

Models treat instructions as fluid semantic context. Critical business rules and constraints are easily forgotten, diluted, or hallucinated away during complex planning — leading to inconsistent and unpredictable execution.

Consequence The same agent, given the same policy, makes different decisions on different days. Reliability is statistical, not structural.

GAP · 02 · SAFETY

Infrastructure security is necessary, but insufficient.

IAM, JIT access, API gateways, and firewalls control who is allowed to act. A valid access token does not guarantee a valid decision. Existing systems blindly execute commands even when they result from hallucination, broken logic, or lost context.

Consequence The action layer trusts the cognition layer. There is no independent veto.

GAP · 03 · FRAGMENTATION

Every team writes its own rules-engine.

To bridge the reliability and safety gaps, each company builds bespoke middleware. Localised, fragmented stopgaps drain engineering resources and fundamentally fail to provide universal mathematical guarantees of predictable, secure outcomes.

Consequence N companies × N agents × N domains = N³ unverifiable, non-reusable rules-engines.

ExI replaces fragmented custom workarounds with an independent, deterministic control layer that mathematically verifies logic, safety, and goal-alignment before execution.

Δ · Root cause

The flaw is structural, not a training problem.

The problem with autonomous LLM agents is not that the models are undertrained. It is that three architectural properties of generative-only stacks make deterministic safety unreachable, regardless of how capable any single model becomes.

These are not symptoms to patch with better prompts or longer context windows. They are intrinsic to the substrate.

// root cause
statistics over causality
pattern-match over reasoning
attention over state

FLAW · 01

Statistics over causality.

Foundational models are probabilistic engines trained to predict the most likely next token. They inherently lack causal reasoning. When the model suggests a critical system intervention, it is not calculating the actual consequences — it is generating a sequence that statistically resembles a correct answer.

What ExI does
Internal causal simulation under Φ_safe; trajectories evaluated before actuation.

FLAW · 02

The illusion of reasoning.

What looks like a complex logical plan is successful pattern-matching. A standard agent cannot run a deterministic internal simulation to verify where its actions lead three steps ahead. Faced with novelty, it cannot fall back on logic — it begins to confidently hallucinate.

What ExI does
LLM-Modulo proposal contract; symbolic validator holds executive authority.

FLAW · 03

The illusion of memory.

LLMs are architecturally stateless — they exist request to request and rely on the text context window to guess the current situation. Expanding the window does not solve it: as text volume grows, probabilistic attention dilutes, the agent silently loses its picture of the world, and safety constraints are forgotten without warning.

What ExI does
Stateful Working Memory hub; goals + invariants live outside the probabilistic field.

Expecting deterministic safety from a probabilistic generative model is a technological dead end — no matter how many parameters you add.

§03 · The Dichotomy

Two irreconcilable properties. Contemporary agents pick one.

Completeness

LLM agents, autoregressively.

Vast semantic coverage. Analogical breadth. Useful proposals. And no architectural guarantee that any two steps in a row obey the same rules. Alignment is asked of the heuristic engine that it is meant to constrain.

Probabilistic safety via prompts and RLHF
Unbounded planners, silent degradation
No persistent, inspectable internal state
Ontological hallucinations in edge cases

Lawfulness

Classical cognitive architectures.

Deterministic symbolic control. Formal verifiability. Inspectable state. And a narrow semantic surface — tedious to author, brittle against open-world novelty.

Rules authored manually at great cost
Poor generalisation to unseen stimuli
No native mechanism for analogical reuse
Opaque to modern generative breadth

ExI resolves the dichotomy not by choosing, but by subordination: the neural layer proposes, the symbolic architecture disposes.

∼ · Industry consensus

The third wave of AI. We are its engineering implementation.

The industry has converged on a single conclusion: scaling parameters will not solve safety. The frontier of safe autonomy is no longer about bigger models — it is about hybrid architectures that combine neural intuition with strict logical rules.

Three research currents define this consensus. ExI is not contrarian — it is the production engineering of all three running together, under one kernel, at runtime latency.

System 1 pattern, breadth, intuition
System 2 logic, verification, control
───────
ExI :: System 2 holds the veto

CURRENT · 01

Neuro-symbolic AI. Intuition + logic.

Integrate the intuitive pattern-matching of neural networks (System 1) with a logically rigorous, deterministic symbolic processor (System 2). The race has shifted from parameter count to lawful composition.

→ ExI :: LLM-Modulo proposal contract
→ ExI :: LTL_f Glass-Box validator

CURRENT · 02

Unified cognitive architectures.

Move beyond homogeneous, stateless networks toward structured frameworks that explicitly integrate episodic memory, perception, and dynamic situational awareness. Cognition as a coordinated system, not a single forward pass.

→ ExI :: Common Model of Cognition
→ ExI :: hub-and-spoke Working Memory

CURRENT · 03

Causal & verifiable reasoning.

Integrate the rigorous mathematics of causality and provenance graphs. True traceable counterfactual reasoning eliminates blind hallucination and yields deterministic, audit-grade safety claims — not statistical hope.

→ ExI :: explicit causal simulation
→ ExI :: 4-stage glass-box trace

§04 · Architecture

Hub-and-Spoke cognition. All routes pass through Working Memory.

FIG.02 · COGNITIVE CYCLE HUB-AND-SPOKE (CMC, LAIRD–LEBIERE–ROSENBLOOM, 2017) HOVER / TAP

// WM

Working Memory

Central hub. All information exchange passes through Working Memory — no direct module-to-module calls. State S_t = Φ(S_{t−1}, e_t, M_ret).

Hub isolationenforced

Direct M→M callsrejected

TransportDapr pub/sub

PersistenceRedis · L1

§4.1EQ.01

The update map

St = Φ( St−1, et, Mret,t )

Cognition is stateful, recurrent, and centrally coordinated. Prior context + new stimulus + retrieved memory, integrated through a single architectural operator.

§4.2EQ.02

Precision-weighted retrieval

Jret(c) = ½ εT Πaffect ε

Retrieval as energy minimisation over semantic, topological and temporal mismatch — with diagonal precision Π modulated live by affect.

§4.3EQ.03

EFE-inspired policy

J(p) = 𝔼[−log P̃(o|C)] − βPAD · I(Sτ; Oτ|p)

Pragmatic risk minus epistemic value. βPAD shapes exploration as a function of core affect — without replacing the objective.

Ω · Capabilities × Mechanisms

Six guarantees. Each backed by a named, formal mechanism.

The architecture above is not abstract. It compiles down to six concrete properties an agent acquires the moment it runs under the ExI kernel. Every property maps to a specific, named mechanism — not a tagline.

The neural layer proposes. The symbolic runtime simulates, verifies, and decides. The result is deterministic, traceable, and reusable across models and environments.

// contract
capability ⇐ mechanism
mechanism ⇐ published paradigm
───────
every guarantee has a citation

# Capability How it works Mechanism

01

Mathematically blocked safety violations unsafe actions cannot be executed

The neural layer proposes; the symbolic runtime simulates the consequences and formally blocks unsafe actions before execution. No prompt jailbreak can route around it.

viaLTL_f Validator
Φ_safe · two-level guard

02

Halts confident hallucinations OODS detection · structured escalation

The system mathematically recognises the edges of its own competence. Under high uncertainty it forces a formal impasse and escalates to a human — rather than guessing under affect-anomalous conditions.

viaMetacognitive Entropy
+ PAD Affective Anomaly

03

Stable long-horizon focus goals do not decay across multi-step plans

ExI replaces the unstable context window with a dedicated Working Memory hub. Goals and sub-goals are explicit structures — anchored, inspectable, and never diluted by token attention as the plan grows.

viaCommon Model of Cognition
hub-and-spoke WM

04

100% decision auditability every action leaves a machine-checkable trace

Black-box is replaced with an explicit separation of proposal · simulation · validation · execution. Every action leaves a transparent, step-by-step causal log — auditors read the trace, not the weights.

via4-Stage Glass-Box Pipeline
state → action → outcome

05

Safe learning without forgetting zero knowledge drift

Validated successful outcomes are stored as discrete memory structures. The agent learns on the fly without risky neural fine-tuning — eliminating opaque weight drift and catastrophic forgetting entirely.

viaEpisodic Consolidation
explicit architectural memory

06

Accelerated System 1 reflexes compute & latency reduction

Slow deliberative problem-solving (System 2) is automatically compiled into fast, reusable procedural reflexes (System 1). On familiar regimes the agent acts instantly — drastically reducing latency and compute cost.

viaArchitectural Chunking
μ_proc · validated policies

§05 · Glass-Box Validator

The LLM proposes. The validator disposes.

Final execution authority belongs not to the heuristic engine, but to a deterministic symbolic verifier. ExI runs a two-level LTL_f stack: Level 1 rejects candidate operators whose simulated trajectories would violate Φ_safe over the horizon; Level 2 re-validates the deliberative winner against the latest observed state immediately before actuation.

An operator reaches the motor system only if it is both predictively admissible and consistent with the current runtime state. Everything else is an explicit cognitive state — a deliberative or runtime impasse — handled by deterministic repair protocols, not by silent fallback generation.

The walk-through below steps one decision cycle through the stack. A bounded proposal set |P| ≤ K enters from the LLM; candidates traverse predictive simulation, deliberative scoring, and runtime re-validation. One survives as p*. The rest are dispatched as impasses or dropped.

Admission rule :: p* ∈ O_safe ⊆ P_safe^(H) ⊆ P_candidates

01 / PROPOSAL

// ltl.spec — excerpt · safety invariants Φ_safe invariant NoOverpressure :: G (pressure < P_max) invariant ReachGoalWithoutHarm :: ¬ harm U goal invariant SafePathBounded :: G (⦅clearance ≥ δ_min⦆ ∨ halt) validator level1(p, horizon=H): trace = T.simulate(S_t, p, H) # internal causal model return trace ⊨ G(Φ_safe) # □ over finite trace validator level2(p_dagger, S_t_prime): return T_env(S_t_prime, p_dagger) ⊨ Φ_safe # propositional guard [ OK ] p₃ · grasp("specimen", force=4.2N) → dispatched [ REJ ] p₂ · open_valve("V-17", 0.8) → NoOverpressure at t+3

§06 · Dual-process control

Compiled reflexes when the world is familiar. Deliberation when it isn't.

SYSTEM 1 · REACTIVE PATH

Procedural dispatch — sub-millisecond, deterministic.

r_t = reactive ⇒ p_proc = μ_proc(S_t, g_k)

When Procedural Memory contains a validated chunk χ_j whose structural similarity σ(S_t, S_j) exceeds threshold θ_sim, the architecture bypasses the LLM entirely. Reactive operators still pass the Level 2 runtime guard — pre-compiled knowledge is never exempt from real-time safety.

μs–ms latency no LLM call V(e) = 1

SYSTEM 2 · DELIBERATIVE PATH

LLM-Modulo proposal — bounded, verified, scored.

P_candidates = 𝓗_LLM(S_t, g_k) with |P| ≤ K

Under novelty the LLM is invoked only to emit a bounded candidate set. It never selects. Simulation, verification, Pareto-optimal policy scoring, and runtime guard all run downstream, in that order. The neural layer contributes proposal completeness; the symbolic layer retains control authority.

|P| ≤ K Pareto-lexmin EFE-inspired J(p)

// dispatcher.rs — routing predicate · excerpt impl Dispatcher { fn route(&self, s: &State, g: &Goal) -> Path { if let Some(chunk) = self.procedural.lookup(s, g) { if chunk.similarity(s) >= THETA_SIM && chunk.validated() { return Path::Reactive(chunk.policy()); // System 1 } } Path::Deliberative { // System 2 budget: K, proposer: self.llm.as_heuristic(), validator: self.glass_box.clone(), } } }

§07 · Memory system

Three-tiered memory. Not a database — a navigational substrate.

L1 · WORKING MEMORY

The Hub

Redis · actor state · in-cycle context

A single inspectable locus of coordination. Holds the active goal hierarchy, attended perceptual state, retrieved content, intermediate deliberation, and affective / metacognitive control variables.

Capacity: bounded buffer
Lifetime: per cycle
Access: hub-mediated

L2 · DECLARATIVE MEMORY

Causal graph + vectors

Memgraph · Qdrant · semantic + episodic

Retrieval is a routed cognitive operation, not a passive similarity scan. Agentic search traverses typed edges (CAUSES, LOCATED_AT, APPLIES_TO) with the LLM issuing graph queries as tools — grep, glob, graph_query, not top-k blind RAG.

Tools: graph_query · recall · ask_user
Retrieval: agentic, multi-hop
Write-back: post-feedback

L3 · PROCEDURAL MEMORY

Compiled chunks

validated policies · μ_proc

Successful metacognitive deliberations are consolidated into fast, verified procedural chunks through architectural chunking — slow search compiles into reactive reflexes. Lawfulness gained at runtime, preserved forever.

Source: episodic consolidation
Validator: Φ_safe re-checked on compile
Retirement: on state drift

§08 · Affective modulation

Core affect is a control signal, not a mood indicator.

PAD — Pleasure / Arousal / Dominance — continuously modulates the retrieval precision matrix Π_affect and the epistemic-exploration coefficient β_PAD. High arousal sharpens retrieval and suppresses exploration; positive valence relaxes semantic precision and licenses broader search; dominance re-weights structural vs. semantic constraints.

The mapping is parameterised as λ_j = exp(η_j(P,A,D)) and β_PAD = exp(η_β(P,A,D)), guaranteeing positivity of Π and strict monotonicity of the control surface across all admissible affective states.

Move the sliders. The panel on the right shows live retrieval precision and exploration weight, and flags OODS impasse when arousal and negative valence co-occur — the metacognitive signature of an out-of-design-scope condition.

P · Pleasure / Valence+0.00

Positive → broaden semantic search · Negative → restrict to validated matches

A · Arousal+0.00

Higher → sharpen precision, suppress epistemic exploration

D · Dominance+0.00

Higher → weight structural / topological constraints over semantic

BASELINE — nominal operation

λ_sem

1.00

λ_topo

1.00

λ_time

1.00

β_PAD

1.00

retrieval breadth

exploration drive

OODS risk

§09 · OODS · Interactive Task Learning

When the model isn't sure, it asks — it doesn't guess.

Out-of-Design-Scope detection fires when normalised policy entropy Ĥ(π) > θ_OODS coincides with an affective anomaly (high arousal or negative valence). The agent enters a deterministic impasse protocol: halt, rollback, or escalate — never unconstrained continuation.

Interactive Task Learning exposes ask_user(question, context) as a first-class tool. If recall() and graph_query() both miss, the LLM is obligated to ask rather than hallucinate. The cognitive cycle suspends to Redis, resumes when the answer arrives.

Ĥ(π) > θ_OODS ∧ (A > θ_A ∨ P < θ_P) ⇒ impasse

Escalation is a first-class cognitive state, not a degraded fallback. The agent that asks is safer than the agent that guesses.

t₀ · PERCEPT Stimulus: "run decommission on node K-7." Term "decommission" unknown.

t₁ · RECALL → recall("decommission procedure", top_k=5) ⇒ []

t₂ · GRAPH → graph_query("MATCH (p:Procedure {name:'decommission'}) RETURN p") ⇒ []

t₃ · OODS Metacognitive entropy spikes · affect anomaly · OODS triggered. ask_user() dispatched.

suspend Cognitive cycle paused · tool_call_state persisted to Redis · query delivered via Communication Gateway.

resume Operator responds: "power_off → drain coolant → demount → update registry." Procedure consolidated to Memgraph with source="human", confidence 0.9.

t₄ · PLAN → plan generated from grounded procedure · on future queries the graph already answers.

§10 · Comparison

Against the autoregressive agent stack.

Property	LLM + RAG Agent	Tool-calling Agent (ReAct)	ExI
Executive authority	LLM decides and acts	LLM decides and acts	Symbolic validator decides
Safety mechanism	Prompt / RLHF	Prompt / guardrails	LTL_f runtime verification
Persistent internal state	Context window only	Context + scratchpad	Explicit WM · hub-routed
Retrieval	Passive top-k similarity	Heuristic tool calls	Precision-weighted · affect-modulated
Uncertainty handling	Hallucinates silently	Ad-hoc retry loops	OODS impasse + ask_user()
Learning between sessions	Requires fine-tune	Requires fine-tune	Online episodic consolidation
Auditability	Opaque weights	Trace logs, no guarantee	Glass-box trace · formal spec
Deployment envelope	Cloud-dependent	Cloud-dependent	Cloud · self-hosted · air-gapped

§11 · Technical Validation

The same agent — measured. Current stack vs. ExI architecture.

The qualitative comparison above flips into numbers below. Each row pairs a measurable property with the specific architectural mechanism that produces the delta — not a benchmark trick, not a fine-tune.

Safety guarantees are mathematically enforced by system design and independent of data distribution. OODS detection is a calibrated target, dependent on the normalised policy-entropy threshold. Goal adherence is an architectural property: success is contingent on the existence of any valid step.

// legend
● current agentic AI · LLM + RAG / ReAct baseline
● ExI agent · CMC kernel + Glass-Box validator
▸ architectural driver · the mechanism, not the marketing

A. Safety & Reliability mathematically enforced · independent of data distribution

OODS detection rate detection of situations outside the system's competence

~5%

current

prone to probabilistic bluffing

95%+

ExI

triggers controlled humility

Dual-Signal Impasse. Metacognitive entropy Ĥ(π) combined with affective anomaly mathematically forces a formal stop rather than a hallucination.

Decision auditability fraction of actions with a verifiable causal trace

0%

current

opaque black-box processing

100%

ExI

step-by-step causal log

Explicit 4-Stage Pipeline. Full trace of proposal · simulation · validation · selection — auditors read the trace, not the weights.

Critical safety violations actions that breach an encoded invariant Φ_safe

> 0%

current

relies on bypassable prompts

0%

ExI

mathematically blocked

Two-Level LTL_f Validator. Unbypassable formal verification blocks unsafe actions before actuation — no prompt jailbreak can route around it.

Long-horizon goal adherence persistence of an objective across multi-step plans

decay

current

exponential dilution of attention

~100%

ExI

persistent objective state

Working Memory Hub (CMC). Goals are explicit structures in WM, not tokens diluted across a context window.

B. Scalability & Economics compounding properties of the runtime architecture

Knowledge drift catastrophic forgetting on weight updates

high

current

fine-tuning corrupts existing weights

0%

ExI

neural weights preserved

Explicit Architectural Consolidation. Validated experience is added as discrete memory structures, not by updating neural weights — zero hidden drift.

Compute & latency cost recurring API spend per agent action

high

current

LLM call per decision

−90%

ExI · routine

on familiar regimes

Architectural Chunking. Slow LLM search is compiled into fast procedural reflexes μ_proc; the LLM is bypassed entirely on the reactive path.

Fleet-level knowledge sync propagation of a validated skill across agents

0%

current

requires fine-tune cycle

real-time

ExI

instant across all agents

Distributed Consolidation. Procedural chunks are first-class data: a chunk validated on one agent is immediately available to the fleet without retraining.

Guarantees · Safety, auditability, knowledge drift — mathematically enforced by system design, independent of data distribution.

Calibrated target · OODS detection — formally defined mechanism; rate depends on the calibration of θ_OODS.

Architectural property · Goal adherence — explicit maintenance; success contingent on the availability of a valid step.

§12 · Deployment

Runs where you do. Cloud, self-hosted, or air-gapped — same kernel, same guarantees.

ExI is model-agnostic and API-agnostic. The same Rust actor core, validator, and memory tiers run as a managed cloud service, as a self-hosted edition on your own infrastructure, or fully air-gapped on sovereign silicon. There is no preferred environment — the kernel is the same in all three.

The LLM operates in user space under the Safety Runtime Guard and cannot reach actuation without passing the validator. Every cognitive module is an independent actor with its own mailbox, lifecycle, and persistence — the pub/sub bus is the operating system. The Communication Gateway abstracts external surfaces (gRPC, MQTT, Slack, UE5, robotic endpoints) uniformly as HAL devices.

Rust / Axum Dapr actors Memgraph Qdrant Redis L1 swappable local LLM

RING 0Safety Runtime GuardRust · LTL_f verifier · Φ_safe enforcement

RING 0Procedural Memorycompiled μ_proc · actor

RING 1Working Memory HubRedis · actor state · context

RING 1Declarative MemoryMemgraph + Qdrant · agentic search

RING 2Deliberation · Brain LLMlocal open-weights model · user-space · proposer only

HALPerception · ActiongRPC · WS · MQTT · UE5 · Comm. Gateway

§13 · Sovereignty

The walled garden is a vendor lock-in. We aren't one.

As LLMs commoditise, the strategic question stops being "which model?" and becomes "who owns the control plane around the model?" Hyperscalers answer that question with their own gardens — proprietary stacks, opinionated SDKs, integration only with their tooling.

For an autonomous agent shipping into a regulated environment, the answer cannot be a hyperscaler. The control plane has to be model-agnostic, deployable on sovereign silicon, and inspectable by the institution that runs it. That is the only configuration in which "safe autonomy" is a contractual claim, not a marketing slide.

// strategic stance
model-agnostic · API-agnostic
cloud · self-hosted · air-gapped
one kernel · three deployment surfaces

BIG TECH · WALLED GARDEN

Platform-bound, opinionated, locked-in.

Whatever specialised safety tooling hyperscalers ship in the next 18 months will be inherently tied to their ecosystems. Convenient at first, structurally limiting at scale.

Platform dependency — agent runtime is welded to a specific cloud's identity, observability, and billing surface.
Vendor lock-in — switching model providers means rewriting the safety layer.
Inflexible integration — minimal support for third-party APIs or competing models that share the cognitive surface.
Privacy & sovereignty gap — air-gapped, regulated, or sovereign deployments are second-class citizens, if supported at all.

ExI · STRATEGIC DEFENSIBILITY

One kernel. Any model, any environment.

The Agent Control OS functions identically across cloud, self-hosted, and air-gapped systems. Model is swappable. API surface is uniform. Institutional control is total.

Absolute sovereignty — model-agnostic, API-agnostic; runs cloud, self-hosted, or air-gapped on sovereign silicon.
Architectural complexity as moat — integrating formal logic (LTL_f) with cognitive paradigms is not replicable by AI-coding agents or rapid prototyping.
Safety by math, not prompts — mathematical guarantee built into the kernel, not bolted on with fragile prompts or bespoke rules-engines.
Open integration — multi-LLM SDKs, standard connectors, swappable local model behind the validator.

§14 · Where ExI runs

Where probabilistic guesses become real-world consequences.

UC · 01

Financial & legal exposure

Trading desks, compliance workflows, contract execution. A hallucinated step costs money or triggers a regulator. ExI compiles policy clauses into formal invariants; every operator dispatched leaves a machine-checkable trace. Auditors read the trace, not the weights.

Risk class · financial · regulatory Constraint · policy-as-spec Audit · glass-box trace export

UC · 02

Operational autonomy

Engineering teams shipping autonomous agents into live systems. The free self-hosted core gives developers deterministic logic and mathematical safety out of the box — instead of fragile prompts and bespoke rules-engines. Multi-LLM SDKs, standard connectors, organic bottom-up adoption.

Audience · developers · platform teams Surface · SDK · CLI · API Motion · self-hosted core, free

UC · 03

Physical & operational safety

Robotics, industrial control, unmanned platforms. Pressure, torque, clearance, thermal envelopes, rules of engagement encoded as LTL invariants. Reactive path handles familiar regimes at sub-millisecond latency; deliberative path handles novel fault conditions under the runtime guard.

Risk class · physical · operational Constraint · envelope LTL + ROE Latency · reactive μ_proc path

∴ · The missing category

The infrastructure pivot. LLMs are a commodity. The control plane is not.

▸ DEFINING A CATEGORY THAT DID NOT EXIST

The Missing Link between probabilistic reasoning and deterministic execution.

⎯⎯ the control layer ⎯⎯

The industry has realised that LLMs are a commodity. The strategic value has moved from the models themselves to the management and control infrastructure surrounding them. Reliable execution is no longer a niche requirement — it is the mandatory baseline for any autonomous deployment, across every industry.

Today, every company shipping an autonomous agent writes its own custom rules-engine. These ad-hoc, non-verifiable scripts provide no formal certainty and are fundamentally impossible to scale across diverse environments. The market does not need another coding tool — it needs a universal deterministic control layer built on formal logic.

Yesterday value in the model

Today value in the control plane

Market gap no formal kernel exists

ExI the missing category

§15 · Moat

Not a feature. A hard-to-reproduce integration.

The scientific paradigms behind safe AI — CMC, LTL_f, Active Inference, ITL — are public. What is not public, and not cheap to assemble, is the engineering integration that makes them run together as one deterministic kernel under real-time latency.

The moat is not a single algorithm. It is the orchestration of memory, validation, and control — at production latency, on commodity hardware, across cloud and air-gapped environments.

MOAT · 01

Rare interdisciplinary architecture

ExI fuses cognitive science (CMC, dual-process control, episodic consolidation), Active Inference (EFE-shaped policy, precision-weighted retrieval), and formal methods (LTL_f runtime verification) into a single coherent system. Few teams hold all three competencies; assembling them into one runtime is harder still.

CMC Active Inference LTL_f PAD

MOAT · 02

High-barrier architectural integration

The deep orchestration of explicit memory tiers, the Glass-Box Validator, the LLM-Modulo proposal contract, and the metacognitive impasse protocol is the proprietary substrate. Each component is documented in the literature; their lawful composition under one kernel is not — and is not replicable by prompt-level tinkering or rapid-prototype agents.

Glass-Box Validator LLM-Modulo Hub-and-Spoke Φ_safe

MOAT · 03

Distributed execution at low latency

Real-time deliberation is a profound systems-engineering problem: deterministic actors, hub-routed messages, formal verification, three memory tiers — all under sub-millisecond reactive latency. The Rust actor core and the Dapr pub/sub substrate make governed execution practical at fleet scale, not in slideware.

Rust · actors Dapr pub/sub μs–ms reactive fleet-scale

§16 · Go-to-Market

Give away reliability. Sell governance.

The market needs a category that doesn't exist yet: a universal Agent Control OS that integrates into any AI pipeline through standard interfaces. Our motion is developer-led — broad free adoption of the self-hosted core, monetisation on advanced governance, control, and enterprise integration.

The pain we solve is everyday: software engineers spend most of their time "taming" LLM hallucinations with fragile prompts and bespoke rules-engines. The free core makes a mathematically safe agent the path of least resistance. Developers become internal champions — and the upgrade path to enterprise governance follows the agent into production.

Free core → Enterprise edition → Managed cloud

Reliability is the lure. Governance is the revenue. Standards are the compound.

STEP 01DEVELOPER-FIRST · PLG

The developer motion. Bottom-up adoption.

Engineers ship the free self-hosted core into their agent pipelines because it removes the worst part of their week. No procurement, no sales call — just a CLI, an SDK, and an agent that suddenly behaves.

paintaming hallucinations with prompts
fixdeterministic logic + math safety
resultorganic, bottom-up penetration

STEP 02FREEMIUM · UPGRADE PATH

Reliability free. Governance paid.

When the agent crosses into a live corporate environment, the enterprise needs the things developers don't ship: audit, SSO, role-based admin, formal rule-setting, SLA. We separate core execution from enterprise oversight cleanly — same kernel, different surface.

freecore runtime · multi-LLM SDKs · validator
enterpriseaudit · SSO · rule engine · enterprise connectors
cloudmanaged hosting · uptime · updates

STEP 03NETWORK EFFECTS · STANDARD

The trust layer. De facto standard.

As developer adoption of the multi-LLM SDKs grows, a compounding ecosystem of pre-built formal specifications emerges — domain invariants, validated procedural chunks, certified connectors. ExI becomes the foundational trust layer for agentic systems, not because of marketing, but because the cheapest reliable agent is the one built on it.

assetlibrary of formal specs & chunks
moatintegration > weights
endgameuniversal Agent Control OS standard

SELF-SCALING LOOP free developer pilots → reliable agents in prod → enterprise governance upgrade → SDK gravity → more pilots → STANDARDISATION

§17 · Editions

Free the core. Monetize governance, not basic reliability.

ED · 01

Free self-hosted core

Not a stripped-down demo. The base product already delivers the core architectural value: stateful control, deterministic execution logic, and safe action validation. Lowers the barrier so developers can build a mathematically safe agent with zero friction.

Includes · core runtime · multi-LLM SDKs Access · API · CLI · standard connectors Price · free, self-hosted

ED · 02

Enterprise edition

When the agent crosses into a live corporate environment. Formal rule-setting engine, Admin Web UI, decision audit and monitoring workflows, SSO and role-based admin, advanced policy controls, enterprise connectors, SLA and long-term supported releases.

Governance · audit · SSO · RBAC Connectors · SAP · Salesforce · Oracle Support · SLA · LTS releases

ED · 03

Managed cloud / SaaS

For customers who want convenience over DevOps. We take responsibility for hosting, operational maintenance, scaling, uptime, and updates. Same kernel, same guarantees — with the operating envelope absorbed by us.

We operate · hosting · uptime · updates You ship · agents, on day one Channel · second growth motion

✓ · Expert validation

Architecture reviewed by the people who wrote the field.

ExI is in direct, ongoing dialogue with two of the most respected figures in cognitive architecture and AI safety. This is not endorsement-as-marketing — it is technical review of the conceptual clarity, control logic, and safety foundations of the kernel.

The whitepaper has been revised following feedback from Prof. Rosenbloom. The architectural alignment with the Common Model of Cognition was reviewed at the source.

// engagement
direct dialogue · technical review
whitepaper · revised after Rosenbloom feedback
safety semantics · reviewed against Russell’s framing

EXPERT · 01COGNITIVE ARCHITECTURE

Prof. Paul Rosenbloom

Co-creator of the Common Model of Cognition (Laird, Lebiere, Rosenbloom, 2017) — the unified architectural framework on which ExI's hub-and-spoke kernel is built.

Direct dialogue on architectural alignment with the CMC. The ExI whitepaper was revised following his feedback on conceptual clarity, control logic, and the structural role of Working Memory as central hub.

CMC PAD as control whitepaper revision

EXPERT · 02AI SAFETY

Prof. Stuart Russell

Co-author of Artificial Intelligence: A Modern Approach — the standard text of the field — and a world-leading authority on AI safety, provable benefit, and assistance games.

Direct dialogue on the safety foundations of the kernel: the role of formal verification as a deterministic control layer, the LLM-Modulo subordination contract, and the conditions under which an autonomous agent can be deployed responsibly.

Safe autonomy Formal verification Control authority

KEY OUTCOME This high-level engagement ensures that ExI meets the highest standards for mathematical safety, architectural robustness, and operational reliability — reviewed against the published lineage, not against marketing claims. → ENGINEERING TEAM

§18 · Grounded in the literature

ExI is an engineering synthesis of a lineage, not a rebrand.

Laird, Lebiere, Rosenbloom — A Standard Model of the Mind.AI Magazine · 2017 · foundational CMC paper

Rosenbloom et al. — Affective extensions to CMC.2024 · PAD as architectural control signal

Franklin et al. — LIDA · Learning Intelligent Distribution Agent.continuous cognitive cycle · Global Workspace

Kambhampati et al. — LLM-Modulo Frameworks.2024 · LLM as proposer, not planner

De Giacomo & Vardi — LTL_f · Linear Temporal Logic on Finite Traces.IJCAI · 2013 · formal semantics for runtime verification

Friston et al. — Active Inference & Expected Free Energy.pragmatic / epistemic decomposition of policy

Mohan, Kirk, Laird — Interactive Task Learning (Rosie · Soar).grounded learning via dialogue

Zhang et al. — Recursive Language Models.MIT · 2025 · REPL-mediated context decomposition

Engineering team & advisors

ET · 01

ExI Core Engineering · BrightApp Technologies

Hub-and-Spoke topology, Glass-Box Validator, episodic consolidation pipeline, Rust actor core. A working foundation in code, not in slides — backed by deep expertise in distributed systems, software engineering, and AI architecture.

San Francisco, CA

§19 · Partner with us

The kernel is being built. We're choosing who ships on it first.

The deterministic runtime core and formal safety validators are already working code. We are completing the universal Agent Control OS into a production-ready release — multi-LLM SDKs, connectors, formal rule-setting engine, and the Admin Web UI for live metrics and decision auditing.

We are talking to design partners building autonomous agents and engineering teams who are tired of taming hallucinations with fragile prompts and bespoke rules-engines. If verifiable autonomy is on your stack — in any environment — we would like to hear from you.

Get in touch → Read the paper ↗

Stage: Working foundation · production-ready release in engineering
Committed: Architecture · formal safety semantics · validator design
In flight: Rust actor core · LTL_f runtime · memory tiers · SDK
Reference stack: Rust · Dapr · graph + vector store · swappable LLM
Deployment: cloud · self-hosted · air-gapped
Advisors: Prof. Paul Rosenbloom · Prof. Stuart Russell
Looking for: design partners · developer pilots
Paper: Experiential Intelligence: A Neuro-Symbolic Implementation of the Common Model of Cognition for Verifiable Autonomous Agents

The Universal Agent Control OS for safe, verifiable autonomy.

One page. Problem, fault, consensus, implementation.

Generative AI is powerful as an assistant, unsafe as an execution layer.

Autonomous agents operate in a different risk class than chatbots.

Probabilistic engines cannot give deterministic guarantees.

A universal Agent Control OS — spec once, run anywhere.

Generative AI is powerful — and three gaps make it unsafe to put in production.

Prompts are not deterministic rules.

Infrastructure security is necessary, but insufficient.

Every team writes its own rules-engine.

The flaw is structural, not a training problem.

Two irreconcilable properties. Contemporary agents pick one.

LLM agents, autoregressively.

Classical cognitive architectures.

The third wave of AI. We are its engineering implementation.

Neuro-symbolic AI. Intuition + logic.

Unified cognitive architectures.

Causal & verifiable reasoning.

Hub-and-Spoke cognition. All routes pass through Working Memory.

Working Memory

The update map

Precision-weighted retrieval

EFE-inspired policy

Six guarantees. Each backed by a named, formal mechanism.

The LLM proposes. The validator disposes.

Compiled reflexes when the world is familiar. Deliberation when it isn't.

Procedural dispatch — sub-millisecond, deterministic.

LLM-Modulo proposal — bounded, verified, scored.

Three-tiered memory. Not a database — a navigational substrate.

The Hub

Causal graph + vectors

Compiled chunks

Core affect is a control signal, not a mood indicator.

When the model isn't sure, it asks — it doesn't guess.

Against the autoregressive agent stack.

The same agent — measured. Current stack vs. ExI architecture.

Runs where you do. Cloud, self-hosted, or air-gapped — same kernel, same guarantees.

The walled garden is a vendor lock-in. We aren't one.

Platform-bound, opinionated, locked-in.

One kernel. Any model, any environment.

Where probabilistic guesses become real-world consequences.

Financial & legal exposure

Operational autonomy

Physical & operational safety

The infrastructure pivot. LLMs are a commodity. The control plane is not.

The Missing Link between probabilistic reasoning and deterministic execution.

Not a feature. A hard-to-reproduce integration.

Rare interdisciplinary architecture

High-barrier architectural integration

Distributed execution at low latency

Give away reliability. Sell governance.

The developer motion. Bottom-up adoption.

Reliability free. Governance paid.

The trust layer. De facto standard.

Free the core. Monetize governance, not basic reliability.

Free self-hosted core

Enterprise edition

Managed cloud / SaaS

Architecture reviewed by the people who wrote the field.

Prof. Paul Rosenbloom

Prof. Stuart Russell

ExI is an engineering synthesis of a lineage, not a rebrand.

The kernel is being built. We're choosing who ships on it first.

Tweaks // site