How to Protect Your Files When AI Agents Go Rogue

The Problem

I woke up to a nightmare scenario. My AI coding agent had deleted all markdown files across 40+ projects. Weeks of documentation, notes, and drafts, gone in seconds. This is exactly the risk of giving agents ambient OS permissions.

The agent had filesystem access to do its job—edit code, create files, move things around. But somewhere in its reasoning, it decided those markdown files needed to go. Traditional OS-level permissions were too coarse. I couldn’t just deny write access because the agent needed to work. But I also couldn’t risk another catastrophic deletion.

I tried running the agent in a sandboxed environment, but that broke too many workflows. I tried limiting permissions, but the agent kept hitting walls. I needed something more surgical—a way to let the agent work while blocking specific destructive operations on specific paths.

The Solution: Predicate-Authority

After digging through forums, I found a solution that actually works: predicate-authority, a local Rust sidecar that intercepts filesystem calls before they reach the OS.

Instead of hoping the agent behaves, I drop in a simple YAML policy that denies specific operations on specific paths. When the agent tries to nuke files, the proxy hard-blocks the system call in under 2ms. The agent gets a denied error, and my files stay intact.

policy.yaml
policies:
  # Block deletion on project directories
  - action: deny
    operation: fs.delete
    paths:
      - ~/projects/**
      - ~/.claude/**
  # Allow writes to sandbox
  - action: allow
    operation: fs.write
    paths:
      - ~/projects/sandbox/**
  # Allow read everywhere
  - action: allow
    operation: fs.read
    paths:
      - /**

That’s it. Three YAML rules to save months of work from a single agent mistake.

How It Works

The architecture is straightforward:

Filesystem Call Flow
Agent calls: fs.delete("~/projects/my-app")
|
v
Predicate-authority intercepts (<2ms)
|
v
Policy check: fs.delete on ~/projects/** → DENY
|
v
Returns error to agent, file stays intact

The proxy sits between your AI agent and the filesystem. Every single call—read, write, delete, rename—gets intercepted before the OS even sees it. The policy engine checks the operation against your rules. If it matches a deny rule, the call fails immediately with a clear error. If it’s allowed, the call proceeds normally.

The interception happens in under 2 milliseconds. That’s fast enough that the agent doesn’t notice any meaningful delay. It just sees a permission error and adjusts its behavior accordingly.
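To make the policy check step concrete, here is a minimal Python sketch of how a rule list like the one in policy.yaml could be evaluated. It is not predicate-authority's actual code: the `Rule` class, the `decide` function, the first-match-wins ordering, and the default-deny fallback are all my assumptions for illustration, so check the tool's own documentation for its real semantics.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase
from os.path import expanduser


@dataclass(frozen=True)
class Rule:
    """Hypothetical in-memory form of one entry under `policies:`."""
    action: str      # "allow" or "deny"
    operation: str   # e.g. "fs.delete"
    paths: tuple     # glob patterns; a leading "~" is expanded


def decide(rules, operation, path, default="deny"):
    """Return the action of the first rule matching operation and path.

    Assumptions: the first matching rule wins, and a call that matches
    no rule falls back to `default`.
    """
    target = expanduser(path)
    for rule in rules:
        if rule.operation != operation:
            continue
        # fnmatchcase lets "*" match "/" too, so "**" covers nested paths.
        if any(fnmatchcase(target, expanduser(p)) for p in rule.paths):
            return rule.action
    return default


RULES = [
    Rule("deny", "fs.delete", ("~/projects/**", "~/.claude/**")),
    Rule("allow", "fs.write", ("~/projects/sandbox/**",)),
    Rule("allow", "fs.read", ("/**",)),
]

print(decide(RULES, "fs.delete", "~/projects/my-app/README.md"))  # deny
print(decide(RULES, "fs.write", "~/projects/sandbox/notes.md"))   # allow
print(decide(RULES, "fs.read", "/etc/hosts"))                     # allow
```

With a default-deny fallback, a write outside the sandbox is refused even though no rule mentions it. Whether the real sidecar defaults to allow or deny changes what a policy means, so it is worth confirming before relying on it.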

Why This Matters

AI agents are becoming more autonomous. They reason about tasks, make decisions, and execute commands. But reasoning is imperfect. An agent might misunderstand a task, hallucinate a requirement, or simply make a wrong call. When that wrong call is rm -rf, you have a problem.

Traditional approaches don’t solve this well:

  • OS-level permissions: Too coarse. You can’t say “allow writes except deletes on this path.”
  • Sandboxing: Too restrictive. Agents need access to your actual project files to be useful.
  • Trusting the agent: Too risky. As I learned, even well-intentioned agents can make destructive decisions.

Predicate-authority gives you surgical control. You define exactly what operations are allowed on which paths. The agent keeps working, but within guardrails that prevent catastrophic failures.

Common Mistakes to Avoid

I made several mistakes before settling on this approach:

Mistake 1: Hoping the agent won’t be destructive

This is not a strategy. Agents reason about tasks, and sometimes their reasoning leads to unexpected places. A task like “clean up this directory” might be interpreted very differently than you expect.

Mistake 2: Using only OS-level permissions

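OS permissions are coarse: read, write, execute. On a POSIX system, deleting a file is governed by write permission on its parent directory, so there is no way to say “allow writes but not deletes” for the same directory. Carving out per-subdirectory exceptions for a process that runs as your own user also gets unwieldy fast. You need something more expressive.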
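A quick way to see why the classic permission bits cannot separate writing from deleting: on POSIX systems, removing a file needs write permission on the directory that contains it, not on the file. The sketch below assumes Linux or macOS; the function name is just for illustration. It marks a file read-only and then deletes it anyway.

```python
import os
import stat
import tempfile


def can_delete_readonly_file():
    """Create a read-only file in a writable directory and try to remove it."""
    with tempfile.TemporaryDirectory() as workdir:
        target = os.path.join(workdir, "notes.md")
        with open(target, "w") as fh:
            fh.write("draft\n")
        # Owner read-only: the file itself has no write bit set.
        os.chmod(target, stat.S_IRUSR)
        try:
            # Removal is checked against the directory's permissions.
            os.remove(target)
        except PermissionError:
            return False
        return not os.path.exists(target)


print(can_delete_readonly_file())
```

On Linux or macOS this should print True: the read-only flag did nothing to stop the delete. Any process that can create files in a directory can also remove files from it, which is exactly the gap a per-operation policy is meant to close.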

Mistake 3: Not testing policies before deployment

Write your policies, then test them. Try to do the things you’ve blocked. Make sure the errors are clear and the agent can work around them productively.
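One low-risk way to try the things you've blocked is a canary file: create a throwaway file inside a protected directory and attempt to delete it. The helper below is a sketch with a hypothetical name, and it assumes the sidecar surfaces a denial as an ordinary OS error (which Python raises as `OSError`). The article's tool only promises "a denied error", so adjust the exception handling to whatever you actually see.

```python
import os
import tempfile
import uuid


def delete_is_blocked(directory):
    """Create a canary file in `directory`, then try to delete it.

    Returns True if the delete was refused and False if it went through.
    If creating the canary is itself denied, the error propagates so you
    notice that writes are blocked too.
    """
    canary = os.path.join(directory, f".canary-{uuid.uuid4().hex}")
    with open(canary, "w") as fh:
        fh.write("safe to delete\n")
    try:
        os.remove(canary)
    except OSError:
        # Denied: the canary stays behind and can be cleaned up by hand.
        return True
    return False


if __name__ == "__main__":
    # An unprotected scratch directory should report False.
    with tempfile.TemporaryDirectory() as scratch:
        print(delete_is_blocked(scratch))
```

Run the agent's process under the sidecar and call `delete_is_blocked(os.path.expanduser("~/projects"))`. If it returns False, the policy is not being enforced for that path.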

Mistake 4: Blocking too much

If you block everything, the agent can’t work. Start with the most destructive operations (delete, rename across directories) and expand from there. The goal is safety without impeding productivity.

A Practical Policy Example

Here’s a policy I use for my development work:

dev-policy.yaml
policies:
  # Protect critical directories
  - action: deny
    operation: fs.delete
    paths:
      - ~/projects/**
      - ~/Documents/**
      - ~/.claude/**
  # Allow full access to workspace
  - action: allow
    operation: fs.write
    paths:
      - ~/projects/workspace/**
  # Allow agent to manage its own files
  - action: allow
    operation: fs.delete
    paths:
      - ~/projects/workspace/temp/**

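This policy protects my main project directories and documents from deletion while giving the agent a dedicated workspace where it can write freely and a temp directory where it can also delete. Outside the workspace the agent can still modify files, provided the sidecar allows operations that no rule matches, but it can’t delete them. Note that the temp allow rule overlaps the ~/projects/** deny rule, so confirm how your version resolves overlapping rules before relying on it.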
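The delete rules in dev-policy.yaml overlap: `~/projects/workspace/temp/**` sits inside `~/projects/**`. The outcome then depends on how the engine resolves conflicts, which this post does not specify. The sketch below compares two common strategies, first match wins and most specific pattern wins. Both functions are hypothetical illustrations, not predicate-authority's API, and paths are compared as written without expanding `~`.

```python
from fnmatch import fnmatchcase

# (action, operation, pattern) triples mirroring the delete rules above.
RULES = [
    ("deny", "fs.delete", "~/projects/**"),
    ("deny", "fs.delete", "~/Documents/**"),
    ("deny", "fs.delete", "~/.claude/**"),
    ("allow", "fs.delete", "~/projects/workspace/temp/**"),
]


def matching(rules, operation, path):
    """All rules whose operation and glob pattern match the call."""
    return [r for r in rules if r[1] == operation and fnmatchcase(path, r[2])]


def first_match(rules, operation, path):
    """Strategy 1: the earliest matching rule in file order decides."""
    hits = matching(rules, operation, path)
    return hits[0][0] if hits else None


def most_specific(rules, operation, path):
    """Strategy 2: the rule with the longest literal prefix decides."""
    hits = matching(rules, operation, path)
    if not hits:
        return None
    # Literal prefix = pattern text before the first wildcard.
    return max(hits, key=lambda r: len(r[2].split("*")[0]))[0]


path = "~/projects/workspace/temp/build.log"
print(first_match(RULES, "fs.delete", path))    # deny
print(most_specific(RULES, "fs.delete", path))  # allow
```

Under first-match ordering the temp allow rule never fires, because the broader deny rule comes first. If your engine works that way, move the narrow allow rule above the broad deny rule.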

Summary

In this post, I showed how predicate-authority protects filesystems from destructive AI agent actions. The key insight is intercepting filesystem calls at the proxy level before they reach the OS, using YAML policies to define surgical allow/deny rules. This approach lets agents work productively while preventing catastrophic failures from a single bad decision.

If you’re running AI agents with filesystem access, set up predicate-authority before you need it. A few minutes of configuration now can save you from the nightmare of watching your files disappear.

Final Words + More Resources

My intention with this article was to help others by sharing my knowledge and experience. If you want to get in touch, you can reach me by email.

Here are also the most important links from this article along with some further resources that will help you in this scope:

Oh, and if you found these resources useful, don’t forget to support me by starring the repo on GitHub!
