
How to Make an AI Coding Agent Run for Hours Autonomously

The Problem: Agents Stop After Minutes

I tried running Claude Code and Codex for extended coding sessions. Most of the time, they stopped after 10-15 minutes. The agent would pause, ask for clarification, or hit a permission barrier and wait for my input.

This frustrated me. I wanted to give the agent a complex task, go to lunch, and come back to completed work. But the agent kept stopping.

After experimenting with different approaches, I found the problem wasn’t the agent’s capability. It was how I structured the work.

Why Agents Stop Prematurely

Agents stop early for four main reasons:

  1. Vague prompts: “Build a web app” doesn’t give enough direction
  2. No success criteria: Agent doesn’t know when it’s done
  3. Permission barriers: Agent hits a wall and asks for access mid-task
  4. Single massive task: Agent tries to do everything at once and gets stuck

When I gave the agent vague instructions like “refactor this codebase,” it would start, then stop to ask questions. Or it would make changes, then stop because it wasn’t sure if it was on the right track.

The key insight: Agents need structure to run autonomously. They need to know exactly what to do, when they’re done with each step, and what comes next.

The Solution: Structured Multi-Task Plans

To make an agent run for hours, I structure work as numbered sequential tasks with clear completion criteria. Here’s the pattern that works:

Component 1: Numbered Sequential Tasks

Instead of one big task, I break it into numbered steps:

task-structure.txt
Task 1: Set up project structure
- Create src/ directory
- Initialize package.json
- Test: npm install succeeds
Task 2: Implement database layer
- Create db/connection.ts
- Add connection test
- Test: npm run test:db passes
Task 3: Build API endpoints
- Create routes/users.ts
- Add CRUD operations
- Test: npm run test:api passes

Each task has:

  • Clear scope
  • Specific deliverables
  • A test or checkpoint to verify completion
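One way to keep every task honest about those three properties is to hold the plan as data and generate the text from it. The following is a minimal sketch, not part of the original workflow: the `Task` class and `render_plan` function are hypothetical names, and the output simply mirrors the shape of task-structure.txt.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    """One step of the plan: a title, its deliverables, and a completion check."""
    title: str
    deliverables: list[str] = field(default_factory=list)
    test: str = ""  # command or condition that proves the task is done


def render_plan(tasks: list[Task]) -> str:
    """Render tasks as numbered text in the same shape as task-structure.txt."""
    lines = []
    for number, task in enumerate(tasks, start=1):
        # Refuse to render a task that has no completion check.
        if not task.test:
            raise ValueError(f"Task {number} ({task.title!r}) has no test")
        lines.append(f"Task {number}: {task.title}")
        lines.extend(f"- {item}" for item in task.deliverables)
        lines.append(f"- Test: {task.test}")
    return "\n".join(lines)


plan_text = render_plan([
    Task("Set up project structure",
         ["Create src/ directory", "Initialize package.json"],
         "npm install succeeds"),
    Task("Implement database layer",
         ["Create db/connection.ts", "Add connection test"],
         "npm run test:db passes"),
])
print(plan_text)
```

Raising on a missing test is a deliberate choice here: a task without a checkpoint is exactly the kind of step where an agent tends to stop and ask.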

Component 2: Subagent Specialization

I tell the agent to use different “roles” for different phases:

subagent-roles.txt
- Planner: Analyzes requirements, creates implementation plan
- Builder: Writes code, implements features
- QA: Runs tests, verifies functionality

When I tell the agent to “use Planner, Builder, QA subagent one at a time,” it naturally cycles through these roles without stopping. It plans, builds, tests, and iterates.

Component 3: Test-Driven Checkpoints

Tests give the agent clear stopping criteria. Instead of “implement authentication,” I say:

test-checkpoint.txt
Task 5: Implement user authentication
- Create auth/login.ts
- Add JWT validation
- Test: npm run test:auth passes
- Test: Scripted login check succeeds (run by the agent, not by hand)

When the test fails, the agent tries again. When the test passes, it moves to the next task. No need to ask me “is this right?”
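The retry-until-green behaviour can be pictured as a small loop. This is only a sketch of the idea, assuming a test command that signals success with exit code 0; `run_checkpoint` and its `on_failure` hook are hypothetical names, and the hook stands in for whatever the agent does to fix the code between attempts.

```python
import subprocess
import sys


def run_checkpoint(command, max_attempts=3, on_failure=None):
    """Run a test command until it exits with 0 or the attempts run out.

    on_failure is a hypothetical hook standing in for "the agent fixes the
    code"; it receives the attempt number and the captured output.
    """
    for attempt in range(1, max_attempts + 1):
        result = subprocess.run(command, capture_output=True, text=True)
        if result.returncode == 0:
            return True  # checkpoint passed, safe to move to the next task
        if on_failure is not None:
            on_failure(attempt, result.stdout + result.stderr)
    return False  # still failing: report it instead of moving on


# Stand-in commands so the sketch runs anywhere; in the setup described above
# the command would be something like ["npm", "run", "test:auth"].
passing = [sys.executable, "-c", "raise SystemExit(0)"]
failing = [sys.executable, "-c", "raise SystemExit(1)"]

attempts_seen = []
print(run_checkpoint(passing))  # prints True
print(run_checkpoint(failing, max_attempts=2,
                     on_failure=lambda n, out: attempts_seen.append(n)))  # prints False
```

Capping the number of attempts is an assumption of this sketch, added so that a test that can never pass does not loop forever.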

Component 4: Upfront Access Permissions

I grant all necessary permissions at the start. If the agent needs to:

  • Read files
  • Write files
  • Run commands
  • Access APIs

I provide all of this upfront. This prevents the agent from stopping mid-task to ask for permission.
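How permissions are actually granted depends on the agent tool and is not shown here. What can be checked independently is whether the environment itself would block the agent. The sketch below is an assumption-laden illustration: `preflight` is a hypothetical helper that only verifies that required command-line tools are on PATH and that working directories are writable before a long run starts.

```python
import os
import shutil
import tempfile


def preflight(required_tools, writable_dirs):
    """Return a list of problems that would make the agent stop mid-task."""
    problems = []
    for tool in required_tools:
        # shutil.which returns None when the executable is not on PATH.
        if shutil.which(tool) is None:
            problems.append(f"missing tool on PATH: {tool}")
    for directory in writable_dirs:
        if not os.path.isdir(directory):
            problems.append(f"missing directory: {directory}")
        elif not os.access(directory, os.W_OK):
            problems.append(f"directory not writable: {directory}")
    return problems


# Demo with a temporary directory and a deliberately nonexistent tool name.
workdir = tempfile.mkdtemp()
issues = preflight(
    required_tools=["definitely-not-a-real-tool-xyz"],
    writable_dirs=[workdir],
)
print(issues)  # prints ['missing tool on PATH: definitely-not-a-real-tool-xyz']
```

An empty list means the run can start; anything else is cheaper to fix before lunch than to discover afterwards.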

Component 5: External Plan Persistence

I store the plan in a file like PLAN.md. The agent can:

  • Read the plan to see what’s next
  • Mark tasks as complete
  • Track progress across sessions

This mitigates the context window problem. Even if the agent’s context fills up and earlier messages are dropped or summarized, it can re-read the plan and continue.

Example: A Structured Prompt for Hours of Work

Here’s a prompt I used to make an agent build a complete application:

structured-prompt.txt
Build this web application by completing all 20 sequential tasks.
Use Planner, Builder, and QA subagents one at a time.
Do not stop orchestrating until all 20 tasks are completed.
Task 1: PLAN - Project setup
- Create directory structure
- Initialize git repo
- Test: git status shows clean repo
Task 2: PLAN - Dependencies
- Add Express, TypeScript, Jest
- Test: npm install succeeds
Task 3: BUILD - Database connection
- Create src/db/connection.ts
- Add environment config
- Test: npm run test:db passes
Task 4: BUILD - User model
- Create src/models/User.ts
- Add validation
- Test: User model tests pass
Task 5: QA - Database integration
- Run integration tests
- Fix any failures
- Test: All database tests pass
Task 6: BUILD - Auth endpoints
- Create src/routes/auth.ts
- Implement login, register, logout
- Test: Auth endpoint tests pass
Task 7: BUILD - User endpoints
- Create src/routes/users.ts
- Implement CRUD operations
- Test: User endpoint tests pass
Task 8: QA - API integration
- Run full API test suite
- Fix any failures
- Test: All API tests pass
[... continue with Tasks 9-20 ...]
Start with Task 1. After completing each task, mark it complete in PLAN.md
and proceed to the next task. If tests fail, fix issues before moving on.

This prompt kept the agent running for 3+ hours. It cycled through Plan, Build, QA phases without stopping to ask me questions.
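A prompt with 20 tasks is easy to get subtly wrong, for example by skipping a number or forgetting a test line. As a hedged illustration, a small lint pass could catch both before the prompt is sent. `lint_plan` is a hypothetical helper, and it assumes the exact "Task N:" and "- Test:" conventions used in structured-prompt.txt.

```python
import re

TASK_HEADER = re.compile(r"^Task (\d+):")


def lint_plan(prompt):
    """Check that tasks are numbered 1..N in order and each has a Test line."""
    problems = []
    expected = 1
    current = None   # number of the task block being read
    has_test = True  # whether the current block contains a "- Test:" line
    for line in prompt.splitlines():
        stripped = line.strip()
        match = TASK_HEADER.match(stripped)
        if match:
            # Close the previous block before starting a new one.
            if current is not None and not has_test:
                problems.append(f"Task {current} has no test")
            current = int(match.group(1))
            if current != expected:
                problems.append(f"expected Task {expected}, found Task {current}")
            expected = current + 1
            has_test = False
        elif stripped.startswith("- Test:"):
            has_test = True
    if current is not None and not has_test:
        problems.append(f"Task {current} has no test")
    return problems


sample = """Task 1: PLAN - Project setup
- Create directory structure
- Test: git status shows clean repo
Task 3: BUILD - Database connection
- Create src/db/connection.ts
"""
print(lint_plan(sample))
# prints ['expected Task 2, found Task 3', 'Task 3 has no test']
```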

Persistent Plan File

I create a PLAN.md file that the agent updates as it works:

PLAN.md
# Project Build Plan
## Progress
- [x] Task 1: Project setup
- [x] Task 2: Dependencies
- [x] Task 3: Database connection
- [ ] Task 4: User model
- [ ] Task 5: Database integration
- [ ] Task 6: Auth endpoints
- [ ] Task 7: User endpoints
- [ ] Task 8: API integration
...
## Current Task
Task 4: User model
Status: In Progress
Builder agent implementing User.ts
## Next Task
Task 5: Database integration
QA agent will run integration tests

The agent reads this file to know where it is, updates it after each task, and uses it to track overall progress.
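The read-and-update cycle on PLAN.md is simple enough to sketch. The code below is illustrative only and assumes the checkbox format shown in the example file; `next_task` and `mark_complete` are hypothetical helper names, not something the agent exposes.

```python
import re
import tempfile
from pathlib import Path

CHECKBOX = re.compile(r"^- \[( |x)\] (Task \d+: .+)$")


def next_task(plan_path):
    """Return the label of the first unchecked task, or None if all are done."""
    for line in plan_path.read_text(encoding="utf-8").splitlines():
        match = CHECKBOX.match(line)
        if match and match.group(1) == " ":
            return match.group(2)
    return None


def mark_complete(plan_path, label):
    """Flip '- [ ] <label>' to '- [x] <label>' and write the file back."""
    lines = plan_path.read_text(encoding="utf-8").splitlines()
    updated = [f"- [x] {label}" if line == f"- [ ] {label}" else line
               for line in lines]
    plan_path.write_text("\n".join(updated) + "\n", encoding="utf-8")


# Demo on a throwaway copy of the plan.
plan = Path(tempfile.mkdtemp()) / "PLAN.md"
plan.write_text(
    "# Project Build Plan\n## Progress\n"
    "- [x] Task 3: Database connection\n"
    "- [ ] Task 4: User model\n"
    "- [ ] Task 5: Database integration\n",
    encoding="utf-8",
)
current = next_task(plan)
print(current)          # prints Task 4: User model
mark_complete(plan, current)
print(next_task(plan))  # prints Task 5: Database integration
```

Because the state lives in the file rather than in the conversation, the same two calls work after a restart or a fresh session.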

Common Mistakes That Stop Agents

mistakes-table.txt
+---------------------------+--------------------------------+--------------------------------+
| Mistake                   | Why It Stops the Agent         | Fix                            |
+---------------------------+--------------------------------+--------------------------------+
| Vague prompt              | Agent doesn't know what to do  | Break into specific tasks      |
|                           |                                | with clear deliverables        |
+---------------------------+--------------------------------+--------------------------------+
| No tests or checkpoints   | Agent asks "is this right?"    | Add test commands after        |
|                           |                                | each task                      |
+---------------------------+--------------------------------+--------------------------------+
| Requesting permission     | Agent stops mid-task           | Grant all permissions upfront  |
| mid-task                  |                                |                                |
+---------------------------+--------------------------------+--------------------------------+
| Single massive task       | Agent gets overwhelmed,        | Break into numbered tasks      |
|                           | loses focus                    |                                |
+---------------------------+--------------------------------+--------------------------------+
| No persistence strategy   | Context fills up, agent        | Store plan in external file    |
|                           | forgets what to do next        |                                |
+---------------------------+--------------------------------+--------------------------------+

I made all these mistakes before finding the right structure.

How I Discovered This Pattern

I started with simple prompts: “Build a REST API.” The agent asked clarifying questions every few minutes. What framework? What database? What endpoints?

Then I tried being more specific: “Build a REST API with Express, TypeScript, and PostgreSQL.” Better, but the agent still stopped to ask about structure, authentication, error handling.

I realized I was under-specifying the work. So I tried a different approach: I wrote out 15 specific tasks in order. Setup, database, models, routes, tests. Each task had a test command.

The result? The agent ran for 2 hours without stopping. It completed all 15 tasks, running tests after each one, fixing failures, moving to the next task.

The pattern worked because:

  • Clear tasks meant no ambiguity
  • Tests gave immediate feedback
  • No permission barriers meant no stops
  • The plan file meant the agent always knew what to do next

When This Works Best

This approach works best for:

  1. Feature implementation: Building new features from scratch
  2. Refactoring: Systematic codebase cleanup with test checkpoints
  3. Migration projects: Moving from one system to another with verification steps
  4. Bug fixing batches: Fixing multiple related bugs with test confirmation

It works less well for:

  • Open-ended research tasks
  • Tasks requiring frequent human judgment calls
  • Work that changes scope mid-stream

Practical Tips

Tip 1: Start with a plan phase

Before giving tasks to the agent, have it create the plan. This ensures the agent understands the scope and can sequence tasks correctly.

Tip 2: Keep tasks small

Each task should take 5-15 minutes. Smaller tasks mean more checkpoints, which means fewer chances for the agent to get stuck.

Tip 3: Use automation for repetitive tasks

If the agent needs to run the same tests repeatedly, provide a single command like npm test instead of listing each test file.

Tip 4: Handle failures gracefully

Tell the agent what to do when tests fail: “If a test fails, fix the issue and re-run it. Only move to the next task when all tests pass.”

Tip 5: Plan for context limits

If the project is large, plan for context window resets. Store critical decisions and progress in files the agent can re-read.
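Progress already lives in PLAN.md; decisions can be persisted the same way. As a minimal sketch, assuming a separate append-only file (the name DECISIONS.md and both helper functions are hypothetical), each decision is written as one bullet that can be re-read after a context reset.

```python
import tempfile
from pathlib import Path


def record_decision(log_path, task, decision):
    """Append one decision as a bullet so it survives a context reset."""
    with log_path.open("a", encoding="utf-8") as log:
        log.write(f"- {task}: {decision}\n")


def recent_decisions(log_path, limit=5):
    """Return the last few recorded decisions, oldest first."""
    if not log_path.exists():
        return []
    lines = log_path.read_text(encoding="utf-8").splitlines()
    return lines[-limit:]


# Demo with made-up decisions in a throwaway directory.
log = Path(tempfile.mkdtemp()) / "DECISIONS.md"
record_decision(log, "Task 3", "Read database settings from environment config")
record_decision(log, "Task 6", "Validate JWTs on every auth endpoint")
print(recent_decisions(log, limit=1))
# prints ['- Task 6: Validate JWTs on every auth endpoint']
```

Appending instead of rewriting keeps the file cheap to update and preserves the order in which decisions were made.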

Summary

In this post, I showed how to make AI coding agents run autonomously for hours. The key points are:

  1. Agents stop early due to vague prompts, no success criteria, permission barriers, or overwhelming single tasks
  2. Structure work as numbered sequential tasks with clear deliverables
  3. Use subagent roles (Planner, Builder, QA) to create natural work cycles
  4. Add test checkpoints so the agent knows when tasks are complete
  5. Grant all permissions upfront to prevent mid-task stops
  6. Store plans in external files to handle context window limits

The difference between an agent that runs for 10 minutes and one that runs for 3 hours isn’t capability. It’s structure. Give the agent clear tasks, tests to verify completion, and all necessary access, and it will work autonomously until the job is done.

Final Words + More Resources

My intention with this article was to share my knowledge and experience to help others. If you want to contact me, you can reach me by email.

