How Does Specs-Driven Development Work with AI Coding Assistants?

Mar 24, 2026

Problem

I was frustrated with AI coding assistants. I’d ask Cursor to implement a feature, and it would generate code that sort of worked but missed edge cases. I’d ask Claude Code to add authentication, and it would create a basic implementation that didn’t match my existing patterns. Every time, I ended up in an endless back-and-forth correcting assumptions.

Here’s what a typical session looked like:

Me: Add a login endpoint to the API

Claude: [generates 50 lines of code]

Me: That's wrong, we use bcrypt not argon2

Claude: [regenerates with bcrypt]

Me: Also, we need rate limiting

Claude: [adds rate limiting]

Me: And the response format should match our standard

Claude: [fixes response format]

Me: Wait, you missed the account lockout logic

Claude: [adds lockout logic]

... 10 more corrections later ...

I realized the problem: I was giving AI vague instructions, and it was making assumptions. Every assumption was a potential wrong turn.

What I Discovered

I found a Reddit thread where someone described exactly what I was experiencing:

“With autonomous agents… if you want to keep agents from running tasks, you need to be precise about what you want. And specification driven development is the way now.”

Another comment hit home:

“I almost feel like the barrier has been lifted since we can describe what the specs should look like to the AI, get those set up, and have the AI iterate until the minimum changes made to solve the problem.”

The insight was clear: AI needs precise specifications to work effectively. Without specs, it guesses. With specs, it executes.

The SDD + AI Workflow

I started using Specs-Driven Development with AI assistants, and the difference was immediate. Here’s the workflow that works:

+-------------------------------------------------------------+
|                    SDD + AI WORKFLOW                        |
+-------------------------------------------------------------+
|                                                             |
|  1. INITIAL SPEC DRAFT                                      |
|     +--------------+                                        |
|     | Human inputs |--> AI drafts initial spec              |
|     | requirements |    (structure + questions)             |
|     +--------------+                                        |
|            |                                                |
|            v                                                |
|  2. SPEC REFINEMENT                                         |
|     +------------------------------------------+            |
|     | Human reviews for:                       |            |
|     | - Specificity (avoid generic)            |            |
|     | - Edge cases                             |            |
|     | - Business logic accuracy                |            |
|     +------------------------------------------+            |
|            |                                                |
|            v                                                |
|  3. CODE GENERATION                                         |
|     +------------------------------------------+            |
|     | AI generates code from refined spec      |            |
|     | Human reviews output                     |            |
|     +------------------------------------------+            |
|            |                                                |
|            v                                                |
|  4. ITERATION LOOP                                          |
|     +------------------------------------------+            |
|     | Specs inform tests                       |            |
|     | Tests validate code                      |            |
|     | Failures inform spec updates             |            |
|     +------------------------------------------+            |
|                                                             |
+-------------------------------------------------------------+

Let me walk through each step with a real example.

Step 1: AI Drafts Initial Spec

Instead of asking AI to code directly, I now ask it to draft a specification first:

Generate a detailed specification for: User Authentication API

Context:
- Tech stack: Node.js, Express, PostgreSQL
- Existing patterns: Repository pattern with dependency injection
- Testing framework: Jest + Supertest

Requirements:
1. Email/password login endpoint
2. JWT token generation
3. Rate limiting

Output format: JSON specification with:
- Feature name and context
- Detailed requirements with acceptance criteria
- Edge cases and error scenarios
- Constraints (performance, security)
- Non-goals (what this feature will NOT do)

After generating, ask me 3-5 clarifying questions about
business logic that would make this spec more specific.

The AI generates:

{
  "feature": "User Authentication API",
  "context": {
    "tech_stack": ["Node.js", "Express", "PostgreSQL"],
    "existing_patterns": "Repository pattern with dependency injection",
    "testing_framework": "Jest + Supertest"
  },
  "requirements": [
    {
      "id": "AUTH-001",
      "description": "Email/password login endpoint",
      "acceptance_criteria": [
        "Returns 200 with JWT token on success",
        "Returns 401 with 'Invalid credentials' message on failure",
        "Returns 400 with validation errors for malformed input"
      ],
      "edge_cases": [
        "Email with whitespace should be trimmed",
        "Passwords must be 8-64 characters"
      ],
      "constraints": {
        "response_time": "< 200ms p95",
        "password_hash": "bcrypt"
      }
    }
  ],
  "non_goals": [
    "OAuth/Social login (future feature)"
  ]
}

Step 2: Human Refines the Spec

Here’s where I almost made a mistake. The AI-generated spec looked good, but it was too generic. A Reddit comment had warned about exactly this:

“The only caveat is that LLM generated specs are many times too generic or too average based on what they’ve been trained on.”

I created a review checklist:

## Spec Review Checklist

### Specificity
- [ ] No generic placeholder values (e.g., "appropriate value")
- [ ] Concrete numbers for limits, timeouts, thresholds
- [ ] Explicit error messages defined
- [ ] Clear success/failure criteria

### Edge Cases
- [ ] Empty inputs addressed
- [ ] Maximum values defined
- [ ] Concurrent access considered
- [ ] Network failures handled

### Context
- [ ] Tech stack explicitly stated
- [ ] Existing patterns referenced
- [ ] Dependencies identified
- [ ] Non-goals listed

### Testability
- [ ] Each requirement has testable criteria
- [ ] Mock data examples provided
- [ ] Integration test scenarios defined

Applying the checklist to my spec, I found issues:

ISSUE: "bcrypt" is too vague
FIX: bcrypt with cost factor 12 (matching our existing auth)

ISSUE: Missing rate limit specifics
FIX: Rate limited to 5 attempts per minute per IP

ISSUE: No account lockout behavior
FIX: Account locked after 5 failed attempts, unlocked after 15 min

ISSUE: No explicit error messages
FIX: Define exact messages like "Invalid credentials" not "Login failed"
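
Part of this review can be automated. The sketch below is one hypothetical way to do it, not a standard tool: `lintSpec`, the list of vague phrases, and the "every constraint needs a number" rule are all assumptions made for illustration. It only catches mechanical problems; missing business rules such as the lockout behavior still need a human reviewer.

```javascript
// Heuristic spec linter covering a few items of the review checklist.
// The phrase list and the rules are illustrative assumptions.
const VAGUE = /\b(appropriate|reasonable|as needed|tbd)\b/i;

function lintSpec(spec) {
  const issues = [];
  for (const req of spec.requirements || []) {
    const statements = [
      ...(req.acceptance_criteria || []),
      ...(req.edge_cases || []),
    ];
    // Specificity: no generic placeholder wording.
    for (const text of statements) {
      if (VAGUE.test(text)) {
        issues.push(`${req.id}: vague wording in "${text}"`);
      }
    }
    // Specificity: constraints should carry a concrete number.
    for (const [name, value] of Object.entries(req.constraints || {})) {
      if (!/\d/.test(String(value))) {
        issues.push(`${req.id}: constraint "${name}" has no concrete number`);
      }
    }
    // Specificity: explicit error messages must be defined.
    if (!req.error_messages || Object.keys(req.error_messages).length === 0) {
      issues.push(`${req.id}: no explicit error messages defined`);
    }
  }
  // Context: non-goals must be listed.
  if (!Array.isArray(spec.non_goals) || spec.non_goals.length === 0) {
    issues.push('spec: no non-goals listed');
  }
  return issues;
}

// A trimmed-down copy of the AI-drafted spec.
const draft = {
  requirements: [{
    id: 'AUTH-001',
    acceptance_criteria: ['Returns 200 with JWT token on success'],
    edge_cases: ['Passwords must be 8-64 characters'],
    constraints: { response_time: '< 200ms p95', password_hash: 'bcrypt' },
  }],
  non_goals: ['OAuth/Social login (future feature)'],
};

console.log(lintSpec(draft));
```

On this draft the linter reports the bare "bcrypt" constraint and the missing error messages, which are two of the four issues found by hand above.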

My refined spec:

{
  "feature": "User Authentication API",
  "requirements": [
    {
      "id": "AUTH-001",
      "description": "Email/password login endpoint",
      "acceptance_criteria": [
        "Returns 200 with JWT token on success",
        "Returns 401 with 'Invalid credentials' message on failure",
        "Returns 400 with validation errors for malformed input",
        "Rate limited to 5 attempts per minute per IP"
      ],
      "edge_cases": [
        "Email with whitespace should be trimmed",
        "Passwords must be 8-64 characters",
        "Account locked after 5 failed attempts, unlocked after 15 min"
      ],
      "constraints": {
        "response_time": "< 200ms p95",
        "password_hash": "bcrypt with cost factor 12"
      },
      "error_messages": {
        "invalid_credentials": "Invalid credentials",
        "account_locked": "Account temporarily locked. Try again in 15 minutes.",
        "rate_limited": "Too many attempts. Please wait before trying again."
      }
    }
  ]
}
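
The lockout rule in this spec ("Account locked after 5 failed attempts, unlocked after 15 min") can be read as a small state transition. Here is a minimal sketch, assuming an account record with `failedAttempts` and `lockedUntil` fields; the function names are hypothetical and nothing here touches a real database.

```javascript
// Values taken from the refined spec.
const MAX_FAILED_ATTEMPTS = 5;
const LOCKOUT_MS = 15 * 60 * 1000;

// True while the account is still inside its lockout window.
function isLocked(account, now) {
  return account.lockedUntil !== null && now < account.lockedUntil;
}

// Account state after one failed login at time `now` (ms since epoch).
function registerFailure(account, now) {
  const failedAttempts = account.failedAttempts + 1;
  const lockedUntil =
    failedAttempts >= MAX_FAILED_ATTEMPTS ? now + LOCKOUT_MS : account.lockedUntil;
  return { ...account, failedAttempts, lockedUntil };
}

// Account state after a successful login.
function registerSuccess(account) {
  return { ...account, failedAttempts: 0, lockedUntil: null };
}

let account = { email: 'user@example.com', failedAttempts: 0, lockedUntil: null };
const t0 = Date.UTC(2026, 0, 1);
for (let i = 0; i < 5; i++) {
  account = registerFailure(account, t0);
}
console.log(isLocked(account, t0 + 14 * 60 * 1000)); // true: inside the 15 min window
console.log(isLocked(account, t0 + 15 * 60 * 1000)); // false: the lockout has expired
```

Writing it out exposes a question the spec leaves open: should the failure counter reset when the lock expires? In this sketch it does not, so a single failure after expiry locks the account again. That is the kind of gap worth feeding back into the spec.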

Step 3: AI Generates Code from Spec

Now when I ask AI to implement:

Implement the authentication endpoint according to the spec in refined-spec.json.
Follow the existing repository pattern in our codebase.
Create tests for each acceptance criterion.

The result:

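// AI generated this from the spec - no guessing needed
const bcrypt = require('bcrypt');

class AuthController {
  constructor(userRepository, tokenService) {
    this.users = userRepository;
    this.tokens = tokenService;
    this.loginAttempts = new Map(); // In production, use Redis
  }

  // Helpers (getAttempts, recordAttempt, incrementFailedAttempts,
  // lockAccount, resetFailedAttempts) are omitted for brevity.

  async login(req, res) {
    const { email, password } = req.body || {};

    // Reject malformed input (from spec: 400 with validation errors).
    // The spec does not fix this message, so the wording is a placeholder.
    if (typeof email !== 'string' || typeof password !== 'string') {
      return res.status(400).json({
        error: "Email and password are required"
      });
    }

    // Trim email (from spec edge case)
    const normalizedEmail = email.trim().toLowerCase();

    // Check rate limit (from spec: 5 attempts per minute per IP)
    const clientIp = req.ip;
    const attempts = this.getAttempts(clientIp);
    if (attempts >= 5) {
      return res.status(429).json({
        error: "Too many attempts. Please wait before trying again."
      });
    }

    // Unknown email: answer exactly like a wrong password.
    // Assumes findByEmail resolves to null when no user matches.
    const user = await this.users.findByEmail(normalizedEmail);
    if (!user) {
      this.recordAttempt(clientIp);
      return res.status(401).json({
        error: "Invalid credentials"
      });
    }

    // Check account lockout (from spec)
    if (user.lockedUntil && new Date() < user.lockedUntil) {
      return res.status(401).json({
        error: "Account temporarily locked. Try again in 15 minutes."
      });
    }

    // Validate credentials. The cost factor 12 from the spec applies when the
    // password is hashed; compare() reads the cost from the stored hash.
    const valid = await bcrypt.compare(password, user.passwordHash);
    if (!valid) {
      this.recordAttempt(clientIp);
      await this.incrementFailedAttempts(user);

      // Assumes incrementFailedAttempts does not mutate `user`
      if (user.failedAttempts + 1 >= 5) {
        await this.lockAccount(user, 15 * 60 * 1000); // 15 min
      }

      return res.status(401).json({
        error: "Invalid credentials"
      });
    }

    // Reset on success
    await this.resetFailedAttempts(user);

    const token = this.tokens.generate(user);
    return res.status(200).json({ token });
  }
}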

No back-and-forth corrections. The spec made the requirements explicit.
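
The controller calls `getAttempts` and `recordAttempt` without showing them. Here is a minimal sketch of what they could look like for the "5 attempts per minute per IP" rule, using an in-memory sliding window. The `AttemptLimiter` class, its options, and the injectable clock are assumptions made for illustration; with several server instances, a shared store such as Redis would be needed instead of a `Map`.

```javascript
// In-memory sliding-window limiter: at most `limit` recorded attempts
// per `windowMs` for each key (here, the client IP).
class AttemptLimiter {
  constructor({ limit = 5, windowMs = 60 * 1000, now = () => Date.now() } = {}) {
    this.limit = limit;
    this.windowMs = windowMs;
    this.now = now; // injectable clock, handy for tests
    this.attempts = new Map(); // key -> array of attempt timestamps
  }

  // Number of attempts recorded for `key` inside the current window.
  getAttempts(key) {
    const cutoff = this.now() - this.windowMs;
    const recent = (this.attempts.get(key) || []).filter((t) => t > cutoff);
    if (recent.length > 0) {
      this.attempts.set(key, recent);
    } else {
      this.attempts.delete(key); // drop idle keys so the map stays small
    }
    return recent.length;
  }

  // Records one attempt for `key` at the current time.
  recordAttempt(key) {
    const timestamps = this.attempts.get(key) || [];
    timestamps.push(this.now());
    this.attempts.set(key, timestamps);
  }

  // True once the key has used up its attempts for the current window.
  isLimited(key) {
    return this.getAttempts(key) >= this.limit;
  }
}

// Usage with a fake clock so the window can be exercised deterministically.
let clock = 0;
const limiter = new AttemptLimiter({ now: () => clock });
for (let i = 0; i < 5; i++) limiter.recordAttempt('203.0.113.7');
console.log(limiter.isLimited('203.0.113.7')); // true
clock += 60 * 1000;
console.log(limiter.isLimited('203.0.113.7')); // false
```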

Step 4: Iteration Loop

The spec also drives testing:

const request = require('supertest');
const app = require('../app'); // assumed path to the Express app

describe('AuthController', () => {
  // Test each acceptance criterion from the spec.
  // Each test assumes fresh state: reset the rate limiter and the users'
  // failed-attempt counters in a beforeEach hook (not shown).

  it('returns 200 with JWT token on success', async () => {
    const res = await request(app).post('/auth/login').send({
      email: 'user@example.com',
      password: 'validpassword123'
    });
    expect(res.status).toBe(200);
    expect(res.body.token).toBeDefined();
  });

  it('returns 401 with "Invalid credentials" on failure', async () => {
    const res = await request(app).post('/auth/login').send({
      email: 'user@example.com',
      password: 'wrongpassword'
    });
    expect(res.status).toBe(401);
    expect(res.body.error).toBe('Invalid credentials');
  });

  it('rate limits to 5 attempts per minute per IP', async () => {
    for (let i = 0; i < 5; i++) {
      await request(app).post('/auth/login').send({
        email: 'user@example.com',
        password: 'wrong'
      });
    }
    const res = await request(app).post('/auth/login').send({
      email: 'user@example.com',
      password: 'wrong'
    });
    expect(res.status).toBe(429);
  });

  it('locks account after 5 failed attempts', async () => {
    // Create 5 failed attempts, each from a different IP so the per-IP
    // rate limit does not answer 429 first. Assumes the app enables
    // Express's 'trust proxy' setting so req.ip follows X-Forwarded-For.
    for (let i = 0; i < 5; i++) {
      await request(app).post('/auth/login')
        .set('X-Forwarded-For', `203.0.113.${i + 1}`)
        .send({
          email: 'user@example.com',
          password: 'wrong'
        });
    }
    // Try with correct password
    const res = await request(app).post('/auth/login')
      .set('X-Forwarded-For', '203.0.113.10')
      .send({
        email: 'user@example.com',
        password: 'correctpassword'
      });
    expect(res.status).toBe(401);
    expect(res.body.error).toContain('locked');
  });
});

When tests fail, they reveal gaps in the spec. The loop continues.
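
To keep the spec and the test suite in step, the test skeleton itself can be derived from the spec. The sketch below is a hypothetical helper, `testStubs`, that turns every acceptance criterion and edge case into a Jest `it.todo` entry, so a criterion without a test shows up as a pending item in the test report. It assumes the spec shape used in this post.

```javascript
// Builds a Jest test skeleton with one `it.todo` per acceptance
// criterion and edge case in the spec.
function testStubs(spec) {
  const lines = [];
  for (const req of spec.requirements) {
    const title = JSON.stringify(`${req.id}: ${req.description}`);
    lines.push(`describe(${title}, () => {`);
    const cases = [...req.acceptance_criteria, ...(req.edge_cases || [])];
    for (const text of cases) {
      // JSON.stringify yields a safely quoted string literal.
      lines.push(`  it.todo(${JSON.stringify(text)});`);
    }
    lines.push('});');
  }
  return lines.join('\n');
}

const spec = {
  requirements: [{
    id: 'AUTH-001',
    description: 'Email/password login endpoint',
    acceptance_criteria: [
      'Returns 200 with JWT token on success',
      "Returns 401 with 'Invalid credentials' message on failure",
    ],
    edge_cases: ['Email with whitespace should be trimmed'],
  }],
};

console.log(testStubs(spec));
```

When a failing test leads to a new rule in the spec, regenerating the skeleton adds the matching pending test.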

Why This Works

The SDD + AI workflow solves two fundamental problems:

Problem 1: AI needs context

Without specs, AI makes assumptions. With specs, it has explicit direction.

WITHOUT SPEC:
"Add login" --> AI guesses --> Wrong assumptions --> Back-and-forth

WITH SPEC:
"Add login" --> Spec defines everything --> AI executes --> Correct code

Problem 2: Specs take time

AI can draft specs quickly. Humans add business context. Together, they’re faster than either alone.

TRADITIONAL: Write spec manually (2 hours) --> Code (1 hour) = 3 hours
AI-ASSISTED: AI draft (5 min) --> Human refine (15 min) --> Code (1 hour) = 1.3 hours

Common Mistakes to Avoid

I made these mistakes when starting:

Mistake 1: Accepting AI specs without review

The AI-generated spec had “bcrypt” without the cost factor. Our existing code uses cost 12. Without review, the code would have inconsistent security levels.
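
This kind of inconsistency can be checked mechanically, because a bcrypt hash string records its own cost factor right after the version prefix (for example `$2b$12$...`). Here is a minimal sketch, with hypothetical function names, that reads the cost from a stored hash and compares it with the value from the spec. The sample strings are placeholders with the right shape, not real password hashes.

```javascript
const EXPECTED_COST = 12; // from the refined spec

// Extracts the cost factor from a bcrypt hash string such as "$2b$12$...".
// Returns null when the string does not look like a bcrypt hash.
function bcryptCost(hash) {
  const match = /^\$2[abxy]?\$(\d{2})\$[./A-Za-z0-9]{53}$/.exec(hash);
  return match ? Number(match[1]) : null;
}

// True when the hash was created with the cost factor the spec requires.
function hasExpectedCost(hash) {
  return bcryptCost(hash) === EXPECTED_COST;
}

// Placeholder strings: correct format, but not real password hashes.
const cost12 = '$2b$12$' + 'a'.repeat(53);
const cost10 = '$2b$10$' + 'a'.repeat(53);

console.log(hasExpectedCost(cost12)); // true
console.log(hasExpectedCost(cost10)); // false
```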

Mistake 2: Specs that are too broad

# BAD
"Implement authentication securely"

# GOOD
"Implement email/password authentication with bcrypt cost 12,
rate limiting of 5 attempts per minute per IP, and a 15-minute
account lockout after 5 failed attempts"

Mistake 3: Skipping the non-goals section

Non-goals prevent scope creep. When I didn’t specify “no OAuth,” the AI tried to add social login scaffolding.

Tools That Support This Workflow

I’ve used these tools with the SDD approach:

Tool	How I Use It
Claude Code	Generates specs and code from specs
Cursor	Autonomous coding with spec context
Windsurf	Similar to Cursor, good for iteration

The key is having the spec visible in context. In Cursor, I keep the spec file open. In Claude Code, I reference it explicitly.
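
One way to make sure the spec is always in context is to build the implementation prompt from the spec file itself instead of typing it by hand. The sketch below is an assumption about how that could look; `buildImplementationPrompt` and its options are hypothetical and not part of any of the tools above.

```javascript
// Builds an implementation prompt that embeds the full spec, so the
// assistant does not have to guess which file or version is meant.
function buildImplementationPrompt(spec, { pattern, testFramework }) {
  const ids = spec.requirements.map((r) => r.id).join(', ');
  return [
    `Implement the feature "${spec.feature}" according to the spec below.`,
    `Follow the existing ${pattern} in our codebase.`,
    `Create ${testFramework} tests for each acceptance criterion (${ids}).`,
    'Do not implement anything listed under non_goals.',
    '',
    'Spec:',
    JSON.stringify(spec, null, 2),
  ].join('\n');
}

const authSpec = {
  feature: 'User Authentication API',
  requirements: [{ id: 'AUTH-001', description: 'Email/password login endpoint' }],
  non_goals: ['OAuth/Social login (future feature)'],
};

console.log(
  buildImplementationPrompt(authSpec, {
    pattern: 'repository pattern',
    testFramework: 'Jest + Supertest',
  })
);
```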

Summary

In this post, I showed how Specs-Driven Development and AI coding assistants form a powerful feedback loop. The key point is: specs give AI direction, AI accelerates spec creation, and human oversight ensures specificity.

The workflow is:

1. Human intent
2. AI spec draft
3. Human refinement
4. AI code generation
5. Human verification

Without specs, AI guesses. With specs, AI executes. The difference is hours of back-and-forth corrections that never have to happen.

Final Words + More Resources

My intention with this article was to help others by sharing my knowledge and experience. If you want to contact me, you can reach me by email.

Here are also the most important links from this article along with some further resources that will help you in this scope:

👨‍💻 Reddit: SDD Discussion
👨‍💻 Claude Code Documentation
👨‍💻 Cursor AI
👨‍💻 Windsurf IDE

Oh, and if you found these resources useful, don’t forget to support me by starring the repo on GitHub!