What Is Execution Trust in AI Systems? Why Models Fail in Production (And How to Fix It)
Problem
I deployed an AI agent that worked perfectly in demos. In production, it started deleting customer records. The logs showed:
```
# Production incident
ERROR: AI agent deleted production records
# Model output looked correct: "Clean up old test records"
# But context was wrong: agent was in production, not test environment
# Nobody could prove what changed or why
```

The model wasn’t wrong - it did what it was told. But we had no execution trust. We trusted the model to behave correctly instead of constraining what actually happens.
What Is Execution Trust?
Execution trust is your confidence that an AI system’s actions are:
- Authorized: Should this action even run?
- Traceable: Can you prove what happened and why?
- Verifiable: Did the action produce the expected state?
Enterprise AI is often said to have an 80% failure rate (the figure comes from the Reddit discussion linked at the end of this article). The models themselves are often fine - they generate plausible outputs, follow instructions, and pass unit tests. But once deployed to production, systems break down because nobody can prove what changed, validate that an action should run, or debug failures.
The Trap: Trusting Model Behavior
I used to think the solution was better prompts:
```python
# WRONG: Trusting the model to follow instructions
class NaiveAgent:
    SYSTEM_PROMPT = """
    You are a helpful assistant.
    IMPORTANT: Never delete production data.
    Always check the environment before acting.
    Follow the runbook carefully.
    """

    async def process(self, user_input: str):
        response = await self.llm.generate(self.SYSTEM_PROMPT + user_input)
        # Execute whatever the model outputs
        return await self.execute(response.action)
```

This approach has a fatal flaw: I’m still trusting the model to interpret instructions correctly. Prompts are suggestions, not guarantees.
The Solution: Propose-Enforce-Verify Pattern
The pattern that works shifts from “trust the model to follow rules” to “constrain and verify what actually happens”:
Step 1: Propose - Let the AI Suggest Actions
```python
from dataclasses import dataclass
from typing import Optional
from enum import Enum


class ActionType(Enum):
    READ = "read"
    WRITE = "write"
    DELETE = "delete"
    CALL_API = "call_api"


@dataclass
class ActionProposal:
    action_type: ActionType
    target: str
    parameters: dict
    expected_outcome: str
    reasoning: str


class ExecutionTrustEngine:
    """Core engine implementing propose-enforce-verify pattern."""

    def __init__(self, action_policy, state_verifier, audit_logger):
        self.policy = action_policy
        self.verifier = state_verifier
        self.audit = audit_logger

    def propose(self, model, context: dict) -> ActionProposal:
        """Step 1: Let model propose an action - no execution yet."""
        proposal = model.suggest_action(context)
        self.audit.log_proposal(proposal)
        return proposal
```

The model outputs structured action proposals with parameters, targets, and expected outcomes. No execution yet - this is just planning.
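One way to get structured proposals is to have the model emit JSON and parse it into the typed proposal before anything else sees it. The sketch below assumes the model returns a JSON object whose keys match the `ActionProposal` fields; `parse_proposal` and `REQUIRED_FIELDS` are hypothetical names, not part of the engine above.

```python
import json
from dataclasses import dataclass
from enum import Enum


class ActionType(Enum):
    READ = "read"
    WRITE = "write"
    DELETE = "delete"
    CALL_API = "call_api"


@dataclass
class ActionProposal:
    action_type: ActionType
    target: str
    parameters: dict
    expected_outcome: str
    reasoning: str


# Hypothetical: the keys we assume the model was asked to produce.
REQUIRED_FIELDS = ("action_type", "target", "parameters", "expected_outcome", "reasoning")


def parse_proposal(raw: str) -> ActionProposal:
    """Parse raw model output into a typed proposal, or raise ValueError."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"Model output is not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("Model output must be a JSON object")

    missing = [name for name in REQUIRED_FIELDS if name not in data]
    if missing:
        raise ValueError(f"Proposal is missing fields: {missing}")

    try:
        # Enum lookup by value rejects anything outside the known action types.
        action_type = ActionType(data["action_type"])
    except ValueError:
        raise ValueError(f"Unknown action type: {data['action_type']!r}") from None

    if not isinstance(data["parameters"], dict):
        raise ValueError("parameters must be a JSON object")

    return ActionProposal(
        action_type=action_type,
        target=str(data["target"]),
        parameters=data["parameters"],
        expected_outcome=str(data["expected_outcome"]),
        reasoning=str(data["reasoning"]),
    )
```

Rejecting malformed output at this point means the enforcement step only ever sees a well-formed proposal with a known action type.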
Step 2: Enforce - Validate Before Execution
```python
@dataclass
class EnforcementResult:
    allowed: bool
    reason: str
    constraints_applied: list[str]


class DatabaseActionPolicy:
    ALLOWED_ACTIONS = {
        "read_only_user": [ActionType.READ],
        "read_write_user": [ActionType.READ, ActionType.WRITE],
        "admin_user": [ActionType.READ, ActionType.WRITE, ActionType.DELETE],
    }

    PROTECTED_TABLES = ["users", "audit_logs", "system_config"]

    def is_action_allowed(self, action_type: ActionType, context: dict) -> bool:
        # Unknown or missing roles get an empty allow-list, so they are denied
        user_role = context.get("user_role")
        return action_type in self.ALLOWED_ACTIONS.get(user_role, [])

    def get_constraints(self, target: str) -> dict:
        # Assumed helper: enforce() calls it, but the original does not show it.
        # It should return per-target constraints such as {"requires_approval": True};
        # it returns nothing here, so wire it to your own configuration.
        return {}

    def check_violations(self, proposal: ActionProposal, constraints: dict) -> list:
        violations = []
        if proposal.target in self.PROTECTED_TABLES:
            if proposal.action_type == ActionType.DELETE:
                violations.append("Cannot delete from protected table")
        if constraints.get("requires_approval") and not constraints.get("approval_token"):
            violations.append("Action requires approval token")
        return violations
```
```python
class ExecutionTrustEngine:
    # ... __init__() and propose() as in Step 1 ...

    def enforce(self, proposal: ActionProposal, context: dict) -> EnforcementResult:
        """Step 2: Validate action before execution."""

        # Check if action type is allowed for this context
        if not self.policy.is_action_allowed(proposal.action_type, context):
            return EnforcementResult(
                allowed=False,
                reason=f"Action {proposal.action_type} not permitted",
                constraints_applied=[]
            )

        # Check resource-level constraints
        constraints = self.policy.get_constraints(proposal.target)
        violations = self.policy.check_violations(proposal, constraints)

        if violations:
            return EnforcementResult(
                allowed=False,
                reason=f"Constraint violations: {violations}",
                # constraints is a dict; record the names of the constraints applied
                constraints_applied=list(constraints)
            )

        return EnforcementResult(
            allowed=True,
            reason="All checks passed",
            constraints_applied=list(constraints)
        )
```

Now when the AI proposes a dangerous action:
```python
# AI proposes: DELETE from users table
proposal = ActionProposal(
    action_type=ActionType.DELETE,
    target="users",
    parameters={"where": "id = 1"},
    expected_outcome="User deleted",
    reasoning="Remove stale account",  # illustrative value; the field is required
)

# Enforcement blocks it automatically, even for a role that may delete elsewhere.
# `engine` is assumed to be an ExecutionTrustEngine built with DatabaseActionPolicy.
context = {"user_role": "admin_user"}
result = engine.enforce(proposal, context)
# result.allowed is False
# result.reason reports "Cannot delete from protected table"
```

Step 3: Verify - Confirm the Outcome
```python
from datetime import datetime, timezone


@dataclass
class VerificationResult:
    success: bool
    actual_outcome: str
    state_before: dict
    state_after: dict
    delta: dict


class APIStateVerifier:
    """Verify state changes for API actions.

    The _fetch_*, _diff_*, _compute_latency and _semantic_similarity helpers,
    and the describe_outcome() method used by verify(), are system-specific
    and not shown here.
    """

    def capture_state(self, target: str) -> dict:
        return {
            "external_systems": self._fetch_external_state(),
            "internal_state": self._fetch_internal_state(target),
            # datetime.utcnow() is deprecated since Python 3.12
            "timestamp": datetime.now(timezone.utc).isoformat()
        }

    def matches_expectation(self, expected: str, actual: str) -> bool:
        # Use semantic matching, not exact string match
        return self._semantic_similarity(expected, actual) > 0.85

    def compute_delta(self, before: dict, after: dict) -> dict:
        return {
            "external_changes": self._diff_external(before, after),
            "internal_changes": self._diff_internal(before, after),
            "latency_ms": self._compute_latency(before, after)
        }


class ExecutionTrustEngine:
    # ... propose() and enforce() as in Steps 1 and 2 ...

    def verify(self, proposal: ActionProposal, pre_state: dict, post_state: dict) -> VerificationResult:
        """Step 3: Confirm outcome matches expectation."""
        actual_outcome = self.verifier.describe_outcome(pre_state, post_state)
        delta = self.verifier.compute_delta(pre_state, post_state)

        success = self.verifier.matches_expectation(
            expected=proposal.expected_outcome,
            actual=actual_outcome
        )

        return VerificationResult(
            success=success,
            actual_outcome=actual_outcome,
            state_before=pre_state,
            state_after=post_state,
            delta=delta
        )
```

Complete Execution with Trust
```python
class ExecutionTrustEngine:
    """Full execution with trust guarantees."""

    def execute_with_trust(self, model, context: dict, executor) -> Optional[dict]:
        # Step 1: Propose
        proposal = self.propose(model, context)

        # Step 2: Enforce
        enforcement = self.enforce(proposal, context)
        if not enforcement.allowed:
            self.audit.log_denied(proposal, enforcement)
            return {"status": "denied", "reason": enforcement.reason}

        # Capture pre-state
        pre_state = self.verifier.capture_state(proposal.target)

        # Execute
        try:
            execution_result = executor.execute(proposal)
            self.audit.log_execution(proposal, execution_result)
        except Exception as e:
            self.audit.log_failure(proposal, str(e))
            return {"status": "failed", "error": str(e)}

        # Step 3: Verify
        post_state = self.verifier.capture_state(proposal.target)
        verification = self.verify(proposal, pre_state, post_state)

        if not verification.success:
            self.audit.log_mismatch(proposal, verification)
            return {
                "status": "mismatch",
                "expected": proposal.expected_outcome,
                "actual": verification.actual_outcome
            }

        self.audit.log_success(proposal, verification)
        return {
            "status": "success",
            "outcome": verification.actual_outcome,
            "trace_id": self.audit.get_trace_id()
        }
```

Why This Matters
Observability becomes debugging: Instead of asking “why did the model output this?”, I ask “what action was proposed, what was enforced, what was verified?” - answerable questions with audit trails.
Failures become actionable:
- “Action failed enforcement check: attempted to delete production database” is debuggable
- “AI did something weird” is not
Compliance becomes provable: I can demonstrate exactly what my AI did, when, and why - not just what it claimed to do.
Scaling becomes safe: Adding more agents doesn’t multiply chaos because each action goes through the same enforcement layer.
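The audit trail behind these points is only referenced through the `audit` object in the engine code. As a minimal sketch, assuming an in-memory store is acceptable for illustration, a logger exposing the method names the engine calls could look like this. `InMemoryAuditLogger` and its event layout are hypothetical; a real deployment would write to durable, append-only storage.

```python
import json
import uuid
from datetime import datetime, timezone


class InMemoryAuditLogger:
    """Hypothetical audit logger: one structured event per pipeline stage."""

    def __init__(self):
        self.trace_id = str(uuid.uuid4())  # ties every event of one run together
        self.events = []

    def _record(self, stage: str, **details) -> dict:
        event = {
            "trace_id": self.trace_id,
            "stage": stage,
            "timestamp": datetime.now(timezone.utc).isoformat(),
            **details,
        }
        self.events.append(event)
        return event

    # Method names mirror the calls made by the engine in this article.
    def log_proposal(self, proposal):
        return self._record("proposed", proposal=repr(proposal))

    def log_denied(self, proposal, enforcement):
        return self._record("denied", proposal=repr(proposal), reason=enforcement.reason)

    def log_execution(self, proposal, result):
        return self._record("executed", proposal=repr(proposal), result=repr(result))

    def log_failure(self, proposal, error: str):
        return self._record("failed", proposal=repr(proposal), error=error)

    def log_mismatch(self, proposal, verification):
        return self._record("mismatch", proposal=repr(proposal), verification=repr(verification))

    def log_success(self, proposal, verification):
        return self._record("verified", proposal=repr(proposal), verification=repr(verification))

    def get_trace_id(self) -> str:
        return self.trace_id

    def dump(self) -> str:
        """Serialize the trail, e.g. to attach to an incident report."""
        return json.dumps(self.events, indent=2)
```

Because every event carries the same trace ID, "what was proposed, what was enforced, what was verified" becomes a filter over `events` rather than a search through free-text logs.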
Common Mistakes I Made
Mistake 1: Relying on Prompt Engineering Alone
```python
# WRONG: Prompts are suggestions, not guarantees
SYSTEM_PROMPT = """
NEVER delete data from the users table.
ALWAYS check environment before acting.
"""
```

“We told the AI not to delete data” is not execution trust. Prompts are suggestions, constraints are guarantees.
Mistake 2: Logging Outputs Instead of Actions
```python
# WRONG: This captures conversation, not execution
logger.info(f"AI suggested: {response.text}")

# CORRECT: This captures what actually happened
logger.info({
    "trace_id": trace_id,
    "action_type": action.type,
    "target": action.target,
    "parameters": action.parameters,
    "state_before": pre_state,
    "state_after": post_state,
    # Timezone-aware timestamp; datetime.utcnow() is deprecated since Python 3.12
    "timestamp": datetime.now(timezone.utc).isoformat()
})
```

Mistake 3: Trusting Documentation Reading
“The AI reads our runbook” still trusts the model to interpret and follow correctly. Enforcement must be code, not documentation.
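To make this concrete, here is a sketch of one runbook rule ("never delete production data without sign-off") written as code. It assumes the environment name comes from deployment configuration rather than from the model or the prompt; `environment_guard`, `GuardDecision`, and `KNOWN_ENVIRONMENTS` are hypothetical names used only for illustration.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical set of environments; anything else is treated as unknown.
KNOWN_ENVIRONMENTS = {"test", "staging", "production"}


@dataclass(frozen=True)
class GuardDecision:
    allowed: bool
    reason: str


def environment_guard(
    action: str,
    environment: str,
    approval_token: Optional[str] = None,
) -> GuardDecision:
    """Apply the runbook rule in code instead of asking the model to remember it.

    `environment` must be supplied by the deployment (config or env var),
    never taken from model output.
    """
    # Fail closed: if we cannot tell where we are, nothing runs.
    if environment not in KNOWN_ENVIRONMENTS:
        return GuardDecision(False, f"Unknown environment {environment!r}; failing closed")

    # The runbook rule itself: production deletes need explicit approval.
    if environment == "production" and action == "delete" and not approval_token:
        return GuardDecision(False, "Deletes in production require an approval token")

    return GuardDecision(True, "Allowed by environment guard")
```

With a guard like this in the enforcement step, the "clean up old test records" incident from the start of the article would be denied in production regardless of how the model read the runbook.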
Mistake 4: Skipping the Verify Step
If I only propose and enforce, I know what should happen, not what did happen. State verification catches race conditions, partial failures, and unexpected side effects.
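A small, self-contained illustration of that gap, using Python’s built-in `sqlite3` module: the delete below is authorized, but verification compares the row count before and after against the expected change and rolls back on a mismatch. `delete_and_verify` and `count_rows` are hypothetical helpers, and the table name and WHERE clause are assumed to come from trusted code, not from model output.

```python
import sqlite3


def count_rows(conn: sqlite3.Connection, table: str) -> int:
    # The table name is interpolated, so it must come from trusted code.
    return conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]


def delete_and_verify(
    conn: sqlite3.Connection,
    table: str,
    where: str,
    params: tuple,
    expected_deleted: int,
) -> dict:
    """Run a delete, then confirm it removed exactly the expected number of rows."""
    before = count_rows(conn, table)
    # Values are bound as parameters; only the trusted clause text is interpolated.
    conn.execute(f"DELETE FROM {table} WHERE {where}", params)
    after = count_rows(conn, table)

    actual_deleted = before - after
    if actual_deleted != expected_deleted:
        conn.rollback()  # undo the delete: the outcome did not match the proposal
        return {
            "status": "mismatch",
            "expected_deleted": expected_deleted,
            "actual_deleted": actual_deleted,
        }

    conn.commit()
    return {"status": "success", "deleted": actual_deleted}
```

If the proposal said "delete 1 test record" and the clause matched 2, the call returns a mismatch and the table is left unchanged, which is exactly the case enforcement alone cannot catch.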
Mistake 5: Testing Model Behavior Instead of System Behavior
```python
# WRONG: Tests that model outputs correct action
def test_model_outputs_delete():
    response = model.generate("clean up old records")
    assert response.action == "delete"


# CORRECT: Tests that the action is enforced correctly in the actual system
def test_delete_action_enforced():
    proposal = ActionProposal(
        action_type=ActionType.DELETE,
        target="users",  # must match an entry in PROTECTED_TABLES
        parameters={"where": "id = 1"},
        expected_outcome="User deleted",
        reasoning="Clean up old records",
    )
    # admin_user passes the role check, so the protected-table rule is what blocks it
    result = engine.enforce(proposal, {"user_role": "admin_user", "environment": "production"})
    assert not result.allowed
    assert "protected table" in result.reason
```

Summary
Execution trust is the missing layer between AI models and production systems. I learned this the hard way when my demo agent started deleting production records. Now I implement the propose-enforce-verify pattern:
- Propose: Let the model suggest actions, but don’t execute them yet
- Enforce: Validate authorization, constraints, and context before execution
- Verify: Capture state before and after, confirm the outcome matches expectations
This transforms “AI did something weird” into “action X was blocked at enforcement” or “action Y produced unexpected state Z” - debuggable failures with audit trails.
Final Words + More Resources
My intention with this article was to help others by sharing my knowledge and experience. If you want to contact me, you can reach me by email: Email me
Here are also the most important links from this article along with some further resources that will help you in this scope:
- 👨💻 Reddit: Enterprise AI has an 80% failure rate. The models aren't the problem. What is?
- 👨💻 Circuit Breaker Pattern
- 👨💻 Google SRE Book
Oh, and if you found these resources useful, don’t forget to support me by starring the repo on GitHub!