7 Common Mistakes When Working with AI Coding Agents (And How to Avoid Them)

Mar 28, 2026

I spent weeks frustrated with my AI coding assistant. One day it would produce brilliant code, the next day it would completely miss the mark on a similar task. I blamed the model—I thought it was just “unstable” or “inconsistent.”

Then I watched a colleague get consistently great results from the same model. The difference wasn’t the AI. It was how we were using it.

After analyzing my workflow and comparing it to successful patterns, I found I was making nearly every mistake in the book. Here’s what I was doing wrong, and how fixing these issues transformed my AI coding workflow.

Mistake 1: Stuffing Rules Into Prompts Instead of AGENTS.md

Every session, I’d paste the same wall of text: “Use TypeScript strict mode, never use any, prefer functional components, run npm test before committing…”

This approach has three problems:

Token waste - I’m paying for the same instructions over and over
Context pollution - The agent’s attention is diluted by repeated rules
Inconsistency - Sometimes I’d forget rules, leading to different behavior

## WRONG: Repeating rules every session

"Always use TypeScript strict mode, never use any, prefer
functional components, run npm test before committing,
use React Testing Library, max 50 lines per function..."

The fix is to persist long-term rules in AGENTS.md, a Markdown file at the repository root that supporting agents load automatically, and to use prompts only for task-specific details:

# Coding Standards
- TypeScript strict mode enabled
- No `any` types
- Functional components preferred
- Run `npm test` before marking complete
- Max 50 lines per function

Then my prompts become concise: “Add a login form following AGENTS.md standards.”

This separation means the agent always has access to persistent rules, and my task-specific prompts stay focused.
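One way to catch myself slipping back into the old habit is to check a prompt against the rules file before sending it. What follows is a minimal sketch, not part of any agent's tooling: `parse_rules` and `repeated_rules` are hypothetical helpers, and the matching is a deliberately crude normalized substring test, so paraphrased rules will slip through.

```python
import re

# The rules from the AGENTS.md example above.
AGENTS_MD = """\
# Coding Standards
- TypeScript strict mode enabled
- No `any` types
- Functional components preferred
- Run `npm test` before marking complete
- Max 50 lines per function
"""


def normalize(text: str) -> str:
    """Lowercase and collapse punctuation so comparisons ignore formatting."""
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()


def parse_rules(agents_md: str) -> list[str]:
    """Return the text of every "- " bullet in an AGENTS.md-style document."""
    return [line[2:].strip() for line in agents_md.splitlines() if line.startswith("- ")]


def repeated_rules(prompt: str, agents_md: str) -> list[str]:
    """List the persistent rules that the prompt restates word for word."""
    haystack = normalize(prompt)
    return [rule for rule in parse_rules(agents_md) if normalize(rule) in haystack]
```

An empty result means the prompt carries only task-specific detail, which is the goal.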

Mistake 2: Not Showing Agents How to Verify Their Work

I’d ask the agent to “fix the failing test in auth.test.js” without telling it how to run tests. The agent would make changes, but couldn’t verify if the fix actually worked.

## WRONG:
"Fix the failing test in auth.test.js"
# Agent fixes code but can't verify if fix worked

This created a broken feedback loop. The agent was essentially working blind.

The fix is to explicitly tell the agent how to verify:

## RIGHT:
"Fix the failing test in auth.test.js.
Run `npm test auth.test.js` to verify.
Run `npm run build` to check for type errors."
# Agent can now iterate until tests pass

Now the agent can run tests, see failures, adjust, and iterate until everything passes. This simple change dramatically improved reliability because the agent could actually see whether its changes worked.
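The loop the agent runs can be pictured roughly like this. It is a sketch under my own assumptions: `run_checks` is a hypothetical helper, not something the agent exposes, and the demo commands use the Python interpreter so the example runs without a Node project.

```python
import subprocess
import sys


def run_checks(commands: list[list[str]]) -> list[tuple[str, bool, str]]:
    """Run each command in order and stop at the first failure.

    Returns (command, passed, combined output) for every command that ran,
    so the failing output can be fed back into the next attempt.
    """
    results = []
    for argv in commands:
        proc = subprocess.run(argv, capture_output=True, text=True)
        passed = proc.returncode == 0
        results.append((" ".join(argv), passed, proc.stdout + proc.stderr))
        if not passed:
            break  # later checks are meaningless until this one passes
    return results


# For the prompt above, the checks would be:
#   [["npm", "test", "auth.test.js"], ["npm", "run", "build"]]
# The stand-in commands below only simulate a pass followed by a failure.
DEMO_CHECKS = [
    [sys.executable, "-c", "print('tests pass')"],
    [sys.executable, "-c", "import sys; sys.exit('type error')"],
    [sys.executable, "-c", "print('not reached')"],
]
```

Calling `run_checks(DEMO_CHECKS)` returns two entries, the second marked as failed with its error text attached. That captured output is exactly what an agent needs in order to iterate.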

Mistake 3: Skipping Planning for Complex Tasks

When I wanted to add authentication to my app, I just said “implement authentication” and let the agent run wild. The result was a mess—inconsistent patterns, half-finished features, and a lot of back-and-forth.

Plan mode exists for a reason. For multi-step tasks that touch multiple files, the agent needs to gather context before it writes any code.

┌─────────────────────────────────────────────────────────────┐
│                    Without Planning                         │
├─────────────────────────────────────────────────────────────┤
│  Agent starts immediately                                    │
│  ↓                                                          │
│  Discovers dependencies mid-implementation                   │
│  ↓                                                          │
│  Makes inconsistent decisions                                │
│  ↓                                                          │
│  Requires multiple corrections                               │
│  ↓                                                          │
│  Final result: fragmented, inconsistent code                 │
└─────────────────────────────────────────────────────────────┘

┌─────────────────────────────────────────────────────────────┐
│                    With Planning                             │
├─────────────────────────────────────────────────────────────┤
│  Agent gathers context first                                 │
│  ↓                                                          │
│  Identifies all affected files                              │
│  ↓                                                          │
│  Creates step-by-step plan                                  │
│  ↓                                                          │
│  Executes systematically                                     │
│  ↓                                                          │
│  Final result: cohesive, well-integrated code                │
└─────────────────────────────────────────────────────────────┘

For complex tasks, I now always use plan mode. The agent reads relevant files, understands the architecture, and proposes a plan before writing any code. This upfront investment saves hours of corrections later.
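To make "step-by-step plan" concrete, here is a small sketch that treats a plan as steps with dependencies and derives a safe execution order. The step names are invented for illustration, and real plan-mode output is prose written by the agent rather than a data structure like this one.

```python
from graphlib import TopologicalSorter

# Hypothetical plan for "implement authentication": each step maps to the
# steps that must be finished before it can start.
PLAN = {
    "user model": set(),
    "password hashing": set(),
    "login route": {"user model", "password hashing"},
    "session middleware": {"login route"},
    "login form": {"login route"},
}


def execution_order(plan: dict[str, set[str]]) -> list[str]:
    """Return the steps in an order that respects every dependency.

    Raises graphlib.CycleError if the plan contains a circular dependency.
    """
    return list(TopologicalSorter(plan).static_order())
```

The point is that dependencies are known before the first line of code is written, instead of being discovered halfway through.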

Mistake 4: Giving Excessive Permissions Too Early

In my excitement to let the AI “just handle it,” I granted full file system access from day one. Bad idea.

The agent would sometimes delete the wrong files, or modify configuration files I didn’t want touched. I’d spend more time undoing changes than I saved.

The right approach is progressive trust:

Level 1: Sandboxed Access (New Project/Agent)
├── Read: Only specified directories
├── Write: Only files explicitly mentioned
└── Execute: Only safe, read-only commands

Level 2: Standard Access (After Proven Reliability)
├── Read: Project root and subdirectories
├── Write: Source files, not config
└── Execute: Build/test commands

Level 3: Full Access (After Extensive Trust)
├── Read: Full project
├── Write: All project files
└── Execute: All commands (with confirmation for destructive ops)

Start with sandboxed access. Expand permissions only after the agent proves reliable. This prevented catastrophic mistakes and made debugging easier when things went wrong.
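The levels above can be sketched as data plus two checks. Everything here is an assumption for illustration: the `Policy` class, the glob patterns, and the command lists are hypothetical, and read permissions are left out for brevity. Real enforcement belongs in the agent's own sandbox settings, not in a script like this.

```python
import posixpath
import shlex
from dataclasses import dataclass
from fnmatch import fnmatchcase


@dataclass(frozen=True)
class Policy:
    """What an agent may touch; write patterns are shell-style globs."""
    write: tuple[str, ...]    # paths the agent may modify
    execute: tuple[str, ...]  # command prefixes the agent may run


# Hypothetical policies for the first two levels.
SANDBOXED = Policy(
    write=("src/auth/login.ts",),
    execute=("git status", "git diff"),
)
STANDARD = Policy(
    write=("src/*",),
    execute=("git status", "git diff", "npm test", "npm run build"),
)


def may_write(policy: Policy, path: str) -> bool:
    """True if the normalized path matches one of the policy's write globs."""
    clean = posixpath.normpath(path)  # collapses "src/../package.json"
    return any(fnmatchcase(clean, pattern) for pattern in policy.write)


def may_run(policy: Policy, command: str) -> bool:
    """True if the command starts with an allowed prefix and has no shell operators."""
    if any(ch in command for ch in ";&|<>`$"):
        return False  # refuse chaining and redirection rather than parse them
    try:
        words = shlex.split(command)
    except ValueError:
        return False  # unbalanced quotes
    for allowed in policy.execute:
        prefix = allowed.split()
        if words[: len(prefix)] == prefix:
            return True
    return False
```

Moving an agent from `SANDBOXED` to `STANDARD` is then a one-line change that I make deliberately, after it has earned it.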

Mistake 5: Automating Unstable Processes

I was so excited about automation that I’d create scripts and workflows for processes that still failed regularly. Then I’d spend more time debugging the automation than doing the work manually.

The pattern was clear: I was automating before stabilizing.

## Before Automating Any Process

1. Has this process been done manually 3+ times?
2. Does it produce correct results 95%+ of the time?
3. Is error handling in place for failures?
4. Have I documented the expected behavior?

If you answered "no" to any question, don't automate yet.
Stabilize first, then automate.
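The checklist translates directly into a small gate. This is a sketch: `automation_blockers` is a hypothetical helper, and the thresholds are simply the ones from the checklist above.

```python
def automation_blockers(
    manual_runs: int,
    correct_runs: int,
    has_error_handling: bool,
    is_documented: bool,
) -> list[str]:
    """Return the checklist items that still fail; an empty list means go."""
    blockers = []
    if manual_runs < 3:
        blockers.append("done manually fewer than 3 times")
    # Integer comparison avoids float rounding at exactly 95%.
    if manual_runs == 0 or correct_runs * 100 < 95 * manual_runs:
        blockers.append("correct less than 95% of the time")
    if not has_error_handling:
        blockers.append("no error handling for failures")
    if not is_documented:
        blockers.append("expected behavior not documented")
    return blockers
```

For example, a process that succeeded 9 times out of 10 and has no error handling returns two blockers, so it stays manual for now.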

The correct workflow is:

Manual phase - Do the task manually until it’s smooth
Stabilization phase - Document the exact steps, identify edge cases
Skill phase - Create a reusable skill with proper error handling
Automation phase - Only automate after the skill is proven reliable

This prevented me from building automation on shaky foundations.

Mistake 6: Using One Endless Session

I used the same thread for weeks. Feature A, bug B, refactor C, feature D—all in one session. The context grew to 50K+ tokens of noise, and the agent increasingly lost track of what was relevant to the current task.

## WRONG: One Thread for Everything
Same session for 3 weeks:
- Added feature A
- Fixed bug B
- Refactored C
- Added feature D
- Context now has 50K tokens of noise
- Agent loses track of current task

Each task deserves its own focused context:

## RIGHT: One Thread Per Coherent Task
- Thread 1: Feature A (closed when done)
- Thread 2: Bug B fix (closed when done)
- Thread 3: Refactor C (closed when done)
- Thread 4: Feature D (current)

# Use fork/compact/subagent for context management
# Each thread has focused, relevant context

This dramatically improved agent focus and reduced hallucinations caused by irrelevant context from previous tasks.
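The rule of thumb I follow can be written down as a tiny decision function. It is only a sketch: the `Thread` class is hypothetical, and the 50,000-token budget is borrowed from the figure above as a placeholder, not a recommended limit.

```python
from dataclasses import dataclass


@dataclass
class Thread:
    task: str
    context_tokens: int = 0
    closed: bool = False


def thread_for(task: str, threads: list[Thread], budget: int = 50_000) -> Thread:
    """Reuse the open thread for this task, or start a fresh one.

    A thread is reused only while it is open, about the same task, and
    under the token budget; otherwise a new thread is created.
    """
    for thread in threads:
        if thread.task == task and not thread.closed and thread.context_tokens < budget:
            return thread
    fresh = Thread(task)
    threads.append(fresh)
    return fresh
```

A new topic, a closed thread, or a bloated context all lead to the same answer: start clean.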

Mistake 7: Not Using Worktrees for Parallel Changes

I wanted to work on two features simultaneously, both touching the same files. I tried running two agents in parallel, and chaos ensued—overwrites, conflicts, lost work.

Git worktrees solved this problem completely:

# Create isolated worktrees for parallel work
# (-b creates the branch; drop it if the branch already exists)
git worktree add -b feature-a ../project-feature-a
git worktree add -b feature-b ../project-feature-b

Now each agent works in complete isolation:

┌──────────────────────────┐     ┌──────────────────────────┐
│   ../project-feature-a   │     │   ../project-feature-b   │
│                          │     │                          │
│   Agent 1 works here     │     │   Agent 2 works here     │
│   ✓ Isolated files       │     │   ✓ Isolated files       │
│   ✓ No conflicts         │     │   ✓ No conflicts         │
│   ✓ Clean context        │     │   ✓ Clean context        │
└──────────────────────────┘     └──────────────────────────┘
              │                              │
              └──────────┬───────────────────┘
                         │
                    git worktree

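No overwrites, no lost work, no chaos. Each agent has its own working directory and branch and can make changes independently. If both features edit the same lines, that surfaces later as an ordinary merge conflict when the branches are merged, where Git flags it and I can resolve it deliberately.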
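When I spin up several agents at once, I script the setup. The sketch below assumes the branches do not exist yet (hence `-b`) and that worktrees should sit next to the main checkout; `worktree_command` and `create_worktrees` are hypothetical helper names.

```python
import subprocess
from pathlib import PurePath


def worktree_command(repo: PurePath, branch: str) -> list[str]:
    """Build the git command that creates a sibling worktree on a new branch.

    For repo ".../project" and branch "feature-a" the worktree lands in
    ".../project-feature-a", matching the layout used above.
    """
    folder = f"{repo.name}-{branch.replace('/', '-')}"
    target = repo.parent / folder
    # -C runs git inside the repo; -b creates the branch (assumed not to exist yet).
    return ["git", "-C", str(repo), "worktree", "add", "-b", branch, str(target)]


def create_worktrees(repo: PurePath, branches: list[str]) -> None:
    """Create one worktree per branch; raises CalledProcessError if git fails."""
    for branch in branches:
        subprocess.run(worktree_command(repo, branch), check=True)
```

Keeping command construction separate from execution lets me print and inspect the commands before anything touches the repository.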

The Deeper Pattern

Looking back at these mistakes, I see a common thread: I was treating AI agents as magic code generators rather than team members that need proper context, clear constraints, verification loops, and appropriate configuration.

Here’s what I now understand: AI coding agent reliability is half about the model, and half about the workflow design.

Many “model unstable” problems are actually “environment unstable” problems:

Working directory unclear
Permissions undefined
Default model unsuitable
MCP poorly connected
Sandbox and app path not strict
No verification loop
Noisy context from endless sessions
Conflicting parallel changes

Fix the environment and workflow, and watch “unstable” outputs become consistent, useful results.

Checklist: Am I Making These Mistakes?

I now run through this checklist regularly:

## Before Each Session
- [ ] Is AGENTS.md up to date with team standards?
- [ ] Did I specify how to verify the task (tests, builds)?
- [ ] Is this task complex enough to need planning?
- [ ] Are permissions appropriate for task scope?

## Before Automating
- [ ] Has this process been done manually 3+ times?
- [ ] Does it produce correct results 95%+ of the time?
- [ ] Is error handling in place for failures?
- [ ] Have I documented the expected behavior?

## Session Hygiene
- [ ] Is this thread focused on one coherent task?
- [ ] Should I fork into a new thread for a new topic?
- [ ] Am I working with multiple agents on same files?
- [ ] Should I use worktree for isolation?

In this post, I shared the seven common mistakes I was making with AI coding agents and how each one was sabotaging my results. The fix wasn’t switching to a better model—it was fixing my workflow. By persisting rules in AGENTS.md, giving agents verification commands, planning complex tasks, starting with limited permissions, stabilizing before automating, using focused sessions, and isolating parallel work with worktrees, I transformed unreliable AI outputs into consistent, high-quality code.

Final Words + More Resources

My intention with this article was to help others by sharing my knowledge and experience. If you want to get in touch, you can reach me by email.
