How to Build AI Agents with Governance and Modular Architecture

Feb 9, 2026

Purpose

This post shows how to keep AI agents on track in production by adding governance through a modular architecture.

Problem

I’ve been working on AI agents for production use, and I noticed a problem in the community. Many “agent” frameworks are just automation wrapped in AI buzzwords. They run tasks or handle webhooks but don’t have actual governance. When these agents make decisions in production, there’s no oversight.

Here’s a basic agent without governance:

class BasicAgent {
  async onMessage(input: string) {
    // Direct action without validation or oversight
    const response = await this.llm.generate(input);
    await this.executeAction(response.action);
    // No logging, no constraints, no oversight
  }
}

This works in demos, but in production it is dangerous. The agent can:

Take harmful actions
Exceed cost limits
Make decisions that violate business rules
Fail without any record of what happened

Environment

TypeScript 5.3
Node.js 20
PostgreSQL for audit logs
Redis for circuit breaker state

Solution

I implemented a modular governance architecture with four layers:

Policy Engine - Validates actions before execution
State Management - Immutable audit trail
Constraint System - Resource limits and safety checks
Monitoring & Control - Circuit breakers and manual overrides
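The snippets in this post reference Action and PolicyResult without defining them. Here is a minimal sketch of what they might look like, inferred only from how the fields are used in the code (action.type, action.estimatedCost, policyCheck.allowed, policyCheck.reason). The params field and the example values are assumptions added for illustration.

```typescript
// Hypothetical shapes inferred from field usage in this post;
// not the author's actual definitions.
interface Action {
  type: string;                     // checked against the allowed-actions list
  estimatedCost: number;            // compared with the per-action cost limit
  params?: Record<string, unknown>; // assumed: action-specific parameters
}

interface PolicyResult {
  allowed: boolean;
  reason?: string; // set when the action is blocked
}

// Example values (illustrative only)
const refund: Action = {
  type: "issue_refund",
  estimatedCost: 0.02,
  params: { orderId: "A-1" },
};
const blockedResult: PolicyResult = {
  allowed: false,
  reason: "Action type not allowed",
};
```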

Here’s the governed agent:

interface GovernancePolicy {
  validate(action: Action): Promise<PolicyResult>;
}

interface AuditLog {
  logDecision(decision: Decision): Promise<void>;
}

class GovernedAgent {
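  constructor(
    private llm: LLMClient,                 // assumed type: produces candidate actions
    private policyEngine: GovernancePolicy,
    private auditLog: AuditLog,
    private circuitBreaker: CircuitBreaker,
    private actionExecutor: ActionExecutor  // assumed type: performs the action
  ) {}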

  async onMessage(input: string) {
    // Step 1: Generate candidate action
    const candidate = await this.llm.generate(input);

    // Step 2: Policy validation (before execution)
    const policyCheck = await this.policyEngine.validate(candidate);
    if (!policyCheck.allowed) {
      // Blocked decisions carry the same required fields as allowed ones
      await this.auditLog.logDecision({
        action: candidate,
        allowed: false,
        blocked: true,
        reason: policyCheck.reason,
        timestamp: new Date()
      });
      return this.fallbackResponse();
    }

    // Step 3: Constraint check
    if (!this.circuitBreaker.canExecute()) {
      throw new Error("Circuit breaker open - too many failures");
    }

    // Step 4: Execute with monitoring
    const result = await this.executeAction(candidate);

    // Step 5: Audit trail
    await this.auditLog.logDecision({
      action: candidate,
      allowed: true,
      result,
      timestamp: new Date()
    });

    return result;
  }

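  private async executeAction(action: Action) {
    try {
      const result = await this.actionExecutor.execute(action);
      // Reset the breaker on success so a HALF_OPEN probe can close it again
      this.circuitBreaker.recordSuccess();
      return result;
    } catch (error) {
      this.circuitBreaker.recordFailure();
      throw error;
    }
  }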
}

Policy Engine

The policy engine checks every action before execution:

class RuleBasedPolicy implements GovernancePolicy {
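  // Declare the fields that validate() reads; SafetyValidator is an assumed type
  constructor(
    private maxCostPerAction: number,
    private allowedActions: string[],
    private safetyValidator: SafetyValidator
  ) {}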

  async validate(action: Action): Promise<PolicyResult> {
    // Check resource limits
    if (action.estimatedCost > this.maxCostPerAction) {
      return { allowed: false, reason: "Cost limit exceeded" };
    }

    // Check allowed actions
    if (!this.allowedActions.includes(action.type)) {
      return { allowed: false, reason: "Action type not allowed" };
    }

    // Check safety constraints
    const safetyCheck = await this.safetyValidator.validate(action);
    if (!safetyCheck.safe) {
      return { allowed: false, reason: safetyCheck.reason };
    }

    return { allowed: true };
  }
}

The key parts are:

Cost limits - Prevent runaway API bills
Allowed actions - Whitelist of actions the agent can take
Safety checks - Validate parameters against business rules
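The checks above can also be expressed as independent rule functions that are evaluated in order, which is one way to organize a list of policy rules. The following is a minimal sketch under that assumption: the PolicyRule signature, the helper names (maxCost, allowTypes, evaluateRules), and the sample limits are illustrative and not taken from the original implementation.

```typescript
interface Action { type: string; estimatedCost: number; }
interface PolicyResult { allowed: boolean; reason?: string; }

// Assumed shape: a rule inspects an action and either allows or blocks it.
type PolicyRule = (action: Action) => PolicyResult;

// Rule factory: block actions whose estimated cost exceeds the limit.
const maxCost = (limit: number): PolicyRule => (action) =>
  action.estimatedCost > limit
    ? { allowed: false, reason: "Cost limit exceeded" }
    : { allowed: true };

// Rule factory: block action types that are not on the allowlist.
const allowTypes = (types: string[]): PolicyRule => (action) =>
  types.includes(action.type)
    ? { allowed: true }
    : { allowed: false, reason: "Action type not allowed" };

// Evaluate rules in order and stop at the first one that blocks.
function evaluateRules(action: Action, rules: PolicyRule[]): PolicyResult {
  for (const rule of rules) {
    const verdict = rule(action);
    if (!verdict.allowed) return verdict;
  }
  return { allowed: true };
}

// Illustrative configuration: 1.0 cost units per action, two permitted types.
const rules: PolicyRule[] = [maxCost(1.0), allowTypes(["send_email", "create_ticket"])];

console.log(evaluateRules({ type: "send_email", estimatedCost: 0.05 }, rules));
console.log(evaluateRules({ type: "delete_database", estimatedCost: 0.05 }, rules));
console.log(evaluateRules({ type: "send_email", estimatedCost: 5 }, rules));
```

Keeping each rule as a small function means a new constraint can be added to the list without touching the evaluation loop, and each rule can be unit tested on its own.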

Audit Log

Every decision gets logged to an immutable audit trail:

interface Decision {
  action: Action;
  allowed: boolean;
  blocked?: boolean;
  reason?: string;
  result?: any;
  timestamp: Date;
}

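class PostgresAuditLog implements AuditLog {
  // Assumption: `db` is a query-builder client that supports the insert/select calls below
  constructor(private db: Database, private agentId: string) {}

  async logDecision(decision: Decision): Promise<void> {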
    await this.db.insert('agent_decisions', {
      agent_id: this.agentId,
      action_type: decision.action.type,
      allowed: decision.allowed,
      blocked: decision.blocked,
      reason: decision.reason,
      result: decision.result ? JSON.stringify(decision.result) : null,
      timestamp: decision.timestamp
    });
  }

  async getHistory(agentId: string, limit: number): Promise<Decision[]> {
    return await this.db
      .select('*')
      .from('agent_decisions')
      .where('agent_id', agentId)
      .orderBy('timestamp', 'desc')
      .limit(limit);
  }
}

This audit trail is crucial for debugging. When an agent does something unexpected, I can query the history to understand why.
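For unit tests it can help to swap the PostgreSQL-backed log for an in-memory one with the same logDecision contract. Below is a minimal sketch, assuming simplified Action and Decision shapes. InMemoryAuditLog is a hypothetical name, and Object.freeze is used here only to approximate the append-only behaviour expected of the audit table.

```typescript
interface Action { type: string; }
interface Decision {
  action: Action;
  allowed: boolean;
  reason?: string;
  timestamp: Date;
}

// Hypothetical in-memory stand-in for the PostgreSQL-backed audit log.
class InMemoryAuditLog {
  private entries: Readonly<Decision>[] = [];

  async logDecision(decision: Decision): Promise<void> {
    // Store a (shallowly) frozen copy so a recorded decision cannot be reassigned later.
    this.entries.push(Object.freeze({ ...decision }));
  }

  // Newest first, mirroring the descending timestamp order used by getHistory.
  getHistory(limit: number): Readonly<Decision>[] {
    return [...this.entries]
      .sort((a, b) => b.timestamp.getTime() - a.timestamp.getTime())
      .slice(0, limit);
  }
}

const auditLog = new InMemoryAuditLog();
auditLog.logDecision({
  action: { type: "send_email" },
  allowed: true,
  timestamp: new Date(1000),
});
auditLog.logDecision({
  action: { type: "delete_database" },
  allowed: false,
  reason: "Action type not allowed",
  timestamp: new Date(2000),
});

// The most recent decision is the blocked one.
const recent = auditLog.getHistory(1);
console.log(recent[0].action.type, recent[0].reason);
```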

Circuit Breaker

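The circuit breaker prevents cascading failures. The class below keeps its counters in memory to stay readable; a multi-instance deployment would persist them in a shared store such as the Redis instance listed under Environment: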

class CircuitBreaker {
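  private failureCount = 0;
  private lastFailureTime: Date | null = null;
  private state: 'CLOSED' | 'OPEN' | 'HALF_OPEN' = 'CLOSED';

  // Declare the settings used below; AlertSystem is an assumed type
  constructor(
    private failureThreshold: number,
    private cooldownMs: number,
    private alertSystem: AlertSystem
  ) {}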

  canExecute(): boolean {
    if (this.state === 'OPEN') {
      // Check if cooldown period passed
      if (Date.now() - this.lastFailureTime!.getTime() > this.cooldownMs) {
        this.state = 'HALF_OPEN';
        return true;
      }
      return false;
    }
    return true;
  }

  recordFailure(): void {
    this.failureCount++;
    this.lastFailureTime = new Date();

    if (this.failureCount >= this.failureThreshold) {
      this.state = 'OPEN';
      this.notifyAlerting();
    }
  }

  recordSuccess(): void {
    this.failureCount = 0;
    this.state = 'CLOSED';
  }

  private notifyAlerting(): void {
    // Send alert to monitoring system
    this.alertSystem.send({
      severity: 'CRITICAL',
      message: `Agent circuit breaker opened: ${this.failureCount} failures`
    });
  }
}

Here is what happens when I test this with intentional failures:

# Test circuit breaker
curl -X POST http://localhost:3000/agent/message \
  -H "Content-Type: application/json" \
  -d '{"message": "test"}'

# Simulate failures
# After 5 failures, circuit breaker opens
{"status": "blocked", "reason": "Circuit breaker open - too many failures"}

# Wait for cooldown, then it retries
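The state transitions are easier to verify without HTTP calls or real waiting if the breaker takes its clock as a parameter. The sketch below is a simplified variant of the class above under that assumption: the injected now function, the SketchBreaker name, the getState accessor, and the 30-second cooldown are illustrative, and alerting is omitted.

```typescript
type BreakerState = "CLOSED" | "OPEN" | "HALF_OPEN";

class SketchBreaker {
  private failureCount = 0;
  private lastFailureAt = 0;
  private state: BreakerState = "CLOSED";

  constructor(
    private failureThreshold: number,
    private cooldownMs: number,
    // Injectable clock (an assumption made for testability)
    private now: () => number = () => Date.now()
  ) {}

  getState(): BreakerState {
    return this.state;
  }

  canExecute(): boolean {
    if (this.state === "OPEN") {
      if (this.now() - this.lastFailureAt > this.cooldownMs) {
        this.state = "HALF_OPEN"; // cooldown elapsed: let a probe request through
        return true;
      }
      return false;
    }
    return true;
  }

  recordFailure(): void {
    this.failureCount++;
    this.lastFailureAt = this.now();
    if (this.failureCount >= this.failureThreshold) this.state = "OPEN";
  }

  recordSuccess(): void {
    this.failureCount = 0;
    this.state = "CLOSED";
  }
}

// Walk through the lifecycle with a fake clock.
let fakeTime = 0;
const breaker = new SketchBreaker(5, 30000, () => fakeTime);

for (let i = 0; i < 5; i++) breaker.recordFailure();
console.log(breaker.getState(), breaker.canExecute()); // OPEN false

fakeTime = 30001; // just past the cooldown
console.log(breaker.canExecute(), breaker.getState()); // true HALF_OPEN

breaker.recordSuccess(); // the probe succeeded
console.log(breaker.getState()); // CLOSED
```

Because the failure count is only reset by recordSuccess, a failed probe in the HALF_OPEN state reopens the breaker immediately.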

Why This Matters

Without governance, agents are “a car with no brakes.” They might work in demos, but in production they can:

Violate compliance requirements (no audit trail)
Cause cost overruns (no resource limits)
Make harmful decisions (no policy validation)
Fail catastrophically (no circuit breakers)

The modular architecture separates concerns:

Policy layer - What actions are allowed?
Audit layer - What did the agent do and why?
Control layer - How do we stop failures?

Each layer can be tested independently and swapped out as needed.
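As an illustration of swapping layers, the sketch below wires a reduced agent to stub implementations and shows that a blocked action never reaches the executor. All names here (MiniAgent, Executor, denyAll, allowAll, recordingExecutor) are hypothetical, and the interfaces are cut-down versions of the ones in this post.

```typescript
interface Action { type: string; }
interface PolicyResult { allowed: boolean; reason?: string; }
interface GovernancePolicy { validate(action: Action): Promise<PolicyResult>; }
interface Executor { execute(action: Action): Promise<string>; }

// Reduced agent: policy check first, execution only if allowed.
class MiniAgent {
  constructor(private policy: GovernancePolicy, private executor: Executor) {}

  async handle(action: Action): Promise<string> {
    const check = await this.policy.validate(action);
    if (!check.allowed) return `blocked: ${check.reason}`;
    return this.executor.execute(action);
  }
}

// Stub layers: an executor that records what it ran, and two trivial policies.
const executed: string[] = [];
const recordingExecutor: Executor = {
  async execute(action) {
    executed.push(action.type);
    return "done";
  },
};
const denyAll: GovernancePolicy = {
  async validate() {
    return { allowed: false, reason: "test policy" };
  },
};
const allowAll: GovernancePolicy = {
  async validate() {
    return { allowed: true };
  },
};

async function demo(): Promise<string[]> {
  const denied = await new MiniAgent(denyAll, recordingExecutor).handle({ type: "send_email" });
  const allowed = await new MiniAgent(allowAll, recordingExecutor).handle({ type: "send_email" });
  return [denied, allowed];
}

// Only the allowed call should show up in `executed`.
demo().then((outcomes) => console.log(outcomes, executed));
```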

Summary

In this post, I showed how to build AI agents with governance using modular architecture. The key point is that production-ready agents need policy validation, audit trails, and constraint systems. The complexity is worth it for real-world deployments where trust, safety, and compliance matter.

Final Words + More Resources

My intention with this article was to help others by sharing my knowledge and experience. If you want to get in touch, you can contact me by email.

Here are the most important links from this article, along with some further resources on this topic:

👨‍💻 Reddit: Why architecture matters more than hype
👨‍💻 Open Policy Agent
👨‍💻 Circuit Breaker Pattern

Oh, and if you found these resources useful, don’t forget to support me by starring the repo on GitHub!