How Does Specs-Driven Development Work with AI Coding Assistants?
Problem
I was frustrated with AI coding assistants. I’d ask Cursor to implement a feature, and it would generate code that sort of worked but missed edge cases. I’d ask Claude Code to add authentication, and it would create a basic implementation that didn’t match my existing patterns. Every time, I ended up in an endless back-and-forth correcting assumptions.
Here’s what a typical session looked like:
Me: Add a login endpoint to the API
Claude: [generates 50 lines of code]
Me: That's wrong, we use bcrypt not argon2
Claude: [regenerates with bcrypt]
Me: Also, we need rate limiting
Claude: [adds rate limiting]
Me: And the response format should match our standard
Claude: [fixes response format]
Me: Wait, you missed the account lockout logic
Claude: [adds lockout logic]
... 10 more corrections later ...
I realized the problem: I was giving the AI vague instructions, and it was filling the gaps with assumptions. Every assumption was a potential wrong turn.
What I Discovered
I found a Reddit thread where someone described exactly what I was experiencing:
“With autonomous agents… if you want to keep agents from running tasks, you need to be precise about what you want. And specification driven development is the way now.”
Another comment hit home:
“I almost feel like the barrier has been lifted since we can describe what the specs should look like to the AI, get those set up, and have the AI iterate until the minimum changes made to solve the problem.”
The insight was clear: AI needs precise specifications to work effectively. Without specs, it guesses. With specs, it executes.
The SDD + AI Workflow
I started using Specs-Driven Development with AI assistants, and the difference was immediate. Here’s the workflow that works:
    +--------------------------------------------------+
    |                SDD + AI WORKFLOW                 |
    +--------------------------------------------------+
    |                                                  |
    |  1. INITIAL SPEC DRAFT                           |
    |  +--------------+                                |
    |  | Human inputs |--> AI drafts initial spec      |
    |  | requirements |    (structure + questions)     |
    |  +--------------+                                |
    |         |                                        |
    |         v                                        |
    |  2. SPEC REFINEMENT                              |
    |  +------------------------------------------+    |
    |  | Human reviews for:                       |    |
    |  | - Specificity (avoid generic)            |    |
    |  | - Edge cases                             |    |
    |  | - Business logic accuracy                |    |
    |  +------------------------------------------+    |
    |         |                                        |
    |         v                                        |
    |  3. CODE GENERATION                              |
    |  +------------------------------------------+    |
    |  | AI generates code from refined spec      |    |
    |  | Human reviews output                     |    |
    |  +------------------------------------------+    |
    |         |                                        |
    |         v                                        |
    |  4. ITERATION LOOP                               |
    |  +------------------------------------------+    |
    |  | Specs inform tests                       |    |
    |  | Tests validate code                      |    |
    |  | Failures inform spec updates             |    |
    |  +------------------------------------------+    |
    |                                                  |
    +--------------------------------------------------+

Let me walk through each step with a real example.
Step 1: AI Drafts Initial Spec
Instead of asking AI to code directly, I now ask it to draft a specification first:
    Generate a detailed specification for: User Authentication API

    Context:
    - Tech stack: Node.js, Express, PostgreSQL
    - Existing patterns: Repository pattern with dependency injection
    - Testing framework: Jest + Supertest

    Requirements:
    1. Email/password login endpoint
    2. JWT token generation
    3. Rate limiting

    Output format: JSON specification with:
    - Feature name and context
    - Detailed requirements with acceptance criteria
    - Edge cases and error scenarios
    - Constraints (performance, security)
    - Non-goals (what this feature will NOT do)

    After generating, ask me 3-5 clarifying questions about
    business logic that would make this spec more specific.

The AI generates:
    {
      "feature": "User Authentication API",
      "context": {
        "tech_stack": ["Node.js", "Express", "PostgreSQL"],
        "existing_patterns": "Repository pattern with dependency injection",
        "testing_framework": "Jest + Supertest"
      },
      "requirements": [
        {
          "id": "AUTH-001",
          "description": "Email/password login endpoint",
          "acceptance_criteria": [
            "Returns 200 with JWT token on success",
            "Returns 401 with 'Invalid credentials' message on failure",
            "Returns 400 with validation errors for malformed input"
          ],
          "edge_cases": [
            "Email with whitespace should be trimmed",
            "Passwords must be 8-64 characters"
          ],
          "constraints": {
            "response_time": "< 200ms p95",
            "password_hash": "bcrypt"
          }
        }
      ],
      "non_goals": [
        "OAuth/Social login (future feature)"
      ]
    }

Step 2: Human Refinement (Critical!)
Here’s where I almost made a mistake. The AI-generated spec looked good, but it was too generic. A Reddit comment warned me:
“The only caveat is that LLM generated specs are many times too generic or too average based on what they’ve been trained on.”
I created a review checklist:
    ## Spec Review Checklist

    ### Specificity
    - [ ] No generic placeholder values (e.g., "appropriate value")
    - [ ] Concrete numbers for limits, timeouts, thresholds
    - [ ] Explicit error messages defined
    - [ ] Clear success/failure criteria

    ### Edge Cases
    - [ ] Empty inputs addressed
    - [ ] Maximum values defined
    - [ ] Concurrent access considered
    - [ ] Network failures handled

    ### Context
    - [ ] Tech stack explicitly stated
    - [ ] Existing patterns referenced
    - [ ] Dependencies identified
    - [ ] Non-goals listed

    ### Testability
    - [ ] Each requirement has testable criteria
    - [ ] Mock data examples provided
    - [ ] Integration test scenarios defined

Applying the checklist to my spec, I found issues:
    ISSUE: "bcrypt" is too vague
    FIX: bcrypt with cost factor 12 (matching our existing auth)

    ISSUE: Missing rate limit specifics
    FIX: Rate limited to 5 attempts per minute per IP

    ISSUE: No account lockout behavior
    FIX: Account locked after 5 failed attempts, unlocked after 15 min

    ISSUE: No explicit error messages
    FIX: Define exact messages like "Invalid credentials" not "Login failed"

My refined spec:
    {
      "feature": "User Authentication API",
      "requirements": [
        {
          "id": "AUTH-001",
          "description": "Email/password login endpoint",
          "acceptance_criteria": [
            "Returns 200 with JWT token on success",
            "Returns 401 with 'Invalid credentials' message on failure",
            "Returns 400 with validation errors for malformed input",
            "Rate limited to 5 attempts per minute per IP"
          ],
          "edge_cases": [
            "Email with whitespace should be trimmed",
            "Passwords must be 8-64 characters",
            "Account locked after 5 failed attempts, unlocked after 15 min"
          ],
          "constraints": {
            "response_time": "< 200ms p95",
            "password_hash": "bcrypt with cost factor 12"
          },
          "error_messages": {
            "invalid_credentials": "Invalid credentials",
            "account_locked": "Account temporarily locked. Try again in 15 minutes.",
            "rate_limited": "Too many attempts. Please wait before trying again."
          }
        }
      ]
    }

Step 3: AI Generates Code from Spec
Now when I ask AI to implement:
    Implement the authentication endpoint according to the spec in refined-spec.json.
    Follow the existing repository pattern in our codebase.
    Create tests for each acceptance criteria.

The result:
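    // AI generated this from the spec - no guessing needed
    const bcrypt = require('bcrypt');

    class AuthController {
      constructor(userRepository, tokenService) {
        this.users = userRepository;
        this.tokens = tokenService;
        this.loginAttempts = new Map(); // In production, use Redis
      }

      // getAttempts, recordAttempt, incrementFailedAttempts, lockAccount,
      // and resetFailedAttempts are helper methods not shown here.

      async login(req, res) {
        const { email, password } = req.body || {};

        // Reject malformed input (spec: 400 for validation errors).
        // The message wording is a placeholder; the spec does not define it.
        if (typeof email !== 'string' || typeof password !== 'string') {
          return res.status(400).json({ error: "Email and password are required" });
        }

        // Trim email (from spec edge case)
        const normalizedEmail = email.trim().toLowerCase();

        // Check rate limit (from spec: 5 attempts per minute per IP)
        const clientIp = req.ip;
        const attempts = this.getAttempts(clientIp);
        if (attempts >= 5) {
          return res.status(429).json({
            error: "Too many attempts. Please wait before trying again."
          });
        }

        const user = await this.users.findByEmail(normalizedEmail);
        if (!user) {
          // Unknown email: respond exactly like a wrong password
          this.recordAttempt(clientIp);
          return res.status(401).json({ error: "Invalid credentials" });
        }

        // Check account lockout (from spec)
        if (user.lockedUntil && new Date() < user.lockedUntil) {
          return res.status(401).json({
            error: "Account temporarily locked. Try again in 15 minutes."
          });
        }

        // Validate credentials against the stored bcrypt hash
        // (the cost factor of 12 from the spec applies when the hash is created)
        const valid = await bcrypt.compare(password, user.passwordHash);
        if (!valid) {
          this.recordAttempt(clientIp);
          await this.incrementFailedAttempts(user);

          // Assumes incrementFailedAttempts does not update user.failedAttempts in memory
          if (user.failedAttempts + 1 >= 5) {
            await this.lockAccount(user, 15 * 60 * 1000); // 15 min
          }

          return res.status(401).json({ error: "Invalid credentials" });
        }

        // Reset on success
        await this.resetFailedAttempts(user);

        const token = this.tokens.generate(user);
        return res.status(200).json({ token });
      }
    }

No back-and-forth corrections. The spec made the requirements explicit.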
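The controller calls `getAttempts` and `recordAttempt` without showing them. As a minimal sketch of what they might look like, here is an in-memory sliding-window counter. The `AttemptTracker` class, its `windowMs` parameter, and the injectable clock are my own assumptions for illustration, not part of the generated code; a production setup would more likely keep these counters in Redis, as the controller comment suggests.

```javascript
// Hypothetical helper: in-memory sliding-window attempt counter.
// windowMs and the injectable clock (now) are assumptions for illustration.
class AttemptTracker {
  constructor(windowMs = 60 * 1000, now = () => Date.now()) {
    this.windowMs = windowMs;
    this.now = now;
    this.attempts = new Map(); // key (e.g. client IP) -> array of timestamps
  }

  // Drop timestamps older than the window and return the remaining ones.
  prune(key) {
    const cutoff = this.now() - this.windowMs;
    const recent = (this.attempts.get(key) || []).filter((t) => t > cutoff);
    if (recent.length > 0) {
      this.attempts.set(key, recent);
    } else {
      this.attempts.delete(key);
    }
    return recent;
  }

  // Number of attempts recorded for this key within the window.
  getAttempts(key) {
    return this.prune(key).length;
  }

  // Record one attempt for this key at the current time.
  recordAttempt(key) {
    const recent = this.prune(key);
    recent.push(this.now());
    this.attempts.set(key, recent);
  }
}
```

With a 60-second window, the spec's "5 attempts per minute per IP" becomes a check of `tracker.getAttempts(req.ip) >= 5` before each login attempt.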
Step 4: Iteration Loop
The spec also drives testing:
    const request = require('supertest');
    const app = require('../app'); // path to the Express app is an assumption

    describe('AuthController', () => {
      // Test each acceptance criterion from the spec.
      // The email is a placeholder; a matching test user is assumed to be seeded.

      it('returns 200 with JWT token on success', async () => {
        const res = await request(app).post('/auth/login')
          .send({ email: 'user@example.com', password: 'validpassword123' });
        expect(res.status).toBe(200);
        expect(res.body.token).toBeDefined();
      });

      it('returns 401 with "Invalid credentials" on failure', async () => {
        const res = await request(app).post('/auth/login')
          .send({ email: 'user@example.com', password: 'wrongpassword' });
        expect(res.status).toBe(401);
        expect(res.body.error).toBe('Invalid credentials');
      });

      it('rate limits to 5 attempts per minute per IP', async () => {
        for (let i = 0; i < 5; i++) {
          await request(app).post('/auth/login')
            .send({ email: 'user@example.com', password: 'wrong' });
        }
        const res = await request(app).post('/auth/login')
          .send({ email: 'user@example.com', password: 'wrong' });
        expect(res.status).toBe(429);
      });

      it('locks account after 5 failed attempts', async () => {
        // Create 5 failed attempts.
        // Note: these requests must not trip the per-IP rate limit first
        // (e.g. reset the limiter between tests), or the sixth request
        // returns 429 instead of 401.
        for (let i = 0; i < 5; i++) {
          await request(app).post('/auth/login')
            .send({ email: 'user@example.com', password: 'wrong' });
        }
        // Try with correct password
        const res = await request(app).post('/auth/login')
          .send({ email: 'user@example.com', password: 'correctpassword' });
        expect(res.status).toBe(401);
        expect(res.body.error).toContain('locked');
      });
    });

When tests fail, they reveal gaps in the spec. The loop continues.
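One way to make the "specs inform tests" part of the loop visible is to check which acceptance criteria have no test yet. The sketch below is hypothetical: `findUncoveredCriteria` is a name I made up, and the matching is a naive case-insensitive substring comparison, which only works if test titles echo the spec wording.

```javascript
// Hypothetical helper: list acceptance criteria that no test title mentions.
// Matching is a naive case-insensitive substring check (an assumption).
function findUncoveredCriteria(spec, testTitles) {
  const titles = testTitles.map((t) => t.toLowerCase());
  const uncovered = [];
  for (const req of spec.requirements) {
    for (const criterion of req.acceptance_criteria) {
      const needle = criterion.toLowerCase();
      if (!titles.some((title) => title.includes(needle))) {
        uncovered.push({ id: req.id, criterion });
      }
    }
  }
  return uncovered;
}

// Trimmed-down version of the refined spec, for illustration only.
const coverageSpec = {
  requirements: [
    {
      id: 'AUTH-001',
      acceptance_criteria: [
        'Returns 200 with JWT token on success',
        'Rate limited to 5 attempts per minute per IP',
      ],
    },
  ],
};

const coverageGaps = findUncoveredCriteria(coverageSpec, [
  'returns 200 with JWT token on success',
]);
console.log(coverageGaps);
// → [ { id: 'AUTH-001', criterion: 'Rate limited to 5 attempts per minute per IP' } ]
```

An uncovered criterion means either a test is missing or the spec and the test titles have drifted apart. Both are worth knowing before the next round of code generation.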
Why This Works
The SDD + AI workflow solves two fundamental problems:
Problem 1: AI needs context
Without specs, AI makes assumptions. With specs, it has explicit direction.
    WITHOUT SPEC:
    "Add login" --> AI guesses --> Wrong assumptions --> Back-and-forth

    WITH SPEC:
    "Add login" --> Spec defines everything --> AI executes --> Correct code

Problem 2: Specs take time
AI can draft specs quickly. Humans add business context. Together, they’re faster than either alone.
    TRADITIONAL: Write spec manually (2 hours) --> Code (1 hour) = 3 hours
    AI-ASSISTED: AI draft (5 min) --> Human refine (15 min) --> Code (1 hour) = 1.3 hours

Common Mistakes to Avoid
I made these mistakes when starting:
Mistake 1: Accepting AI specs without review
The AI-generated spec had “bcrypt” without the cost factor. Our existing code uses cost 12. Without review, the code would have inconsistent security levels.
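Part of this review can be automated. The following is a rough sketch of a spec linter, not a tool I am claiming exists: `lintSpec`, the list of vague phrases, and the rule that `password_hash` must contain a number are all illustrative assumptions. The number check is deliberately crude (for example, it would let "argon2" through) and does not replace a human reading the spec.

```javascript
// Hypothetical spec linter. The vague-phrase list and the "password_hash
// must contain a number" rule are illustrative assumptions.
const VAGUE_PHRASES = ['appropriate', 'reasonable', 'as needed', 'tbd'];

function lintSpec(spec) {
  const issues = [];
  for (const req of spec.requirements || []) {
    const constraints = req.constraints || {};
    for (const [key, value] of Object.entries(constraints)) {
      const text = String(value).toLowerCase();
      if (VAGUE_PHRASES.some((phrase) => text.includes(phrase))) {
        issues.push(`${req.id}: constraint "${key}" is vague: "${value}"`);
      }
    }
    // Crude heuristic: a hash setting without any number has no work factor.
    if (constraints.password_hash && !/\d/.test(constraints.password_hash)) {
      issues.push(`${req.id}: "password_hash" has no cost factor`);
    }
    if (!req.acceptance_criteria || req.acceptance_criteria.length === 0) {
      issues.push(`${req.id}: no acceptance criteria`);
    }
  }
  if (!spec.non_goals || spec.non_goals.length === 0) {
    issues.push('spec: no non-goals listed');
  }
  return issues;
}

// Trimmed-down version of the AI-drafted spec, for illustration only.
const draftSpec = {
  requirements: [
    {
      id: 'AUTH-001',
      acceptance_criteria: ['Returns 200 with JWT token on success'],
      constraints: { response_time: '< 200ms p95', password_hash: 'bcrypt' },
    },
  ],
  non_goals: ['OAuth/Social login (future feature)'],
};

const draftIssues = lintSpec(draftSpec);
console.log(draftIssues);
// → [ 'AUTH-001: "password_hash" has no cost factor' ]
```

Changing `password_hash` to "bcrypt with cost factor 12" clears the warning, which mirrors the fix I made by hand.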
Mistake 2: Specs that are too broad
    # BAD
    "Implement authentication securely"

    # GOOD
    "Implement email/password authentication with bcrypt cost 12,
    5-attempt rate limiting per IP, and 15-minute account lockout"

Mistake 3: Skipping the non-goals section
Non-goals prevent scope creep. When I didn’t specify “no OAuth,” the AI tried to add social login scaffolding.
Tools That Support This Workflow
I’ve used these tools with the SDD approach:
| Tool | How I Use It |
|---|---|
| Claude Code | Generates specs and code from specs |
| Cursor | Autonomous coding with spec context |
| Windsurf | Similar to Cursor, good for iteration |
The key is having the spec visible in context. In Cursor, I keep the spec file open. In Claude Code, I reference it explicitly.
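If you would rather paste the spec into the prompt than rely on an open file, a small helper can assemble it. This is a sketch under my own assumptions: `buildImplementationPrompt` is a made-up name, and the prompt wording is adapted from the implementation prompt earlier in this post, not a format any of these tools requires.

```javascript
// Hypothetical prompt builder. The prompt wording is an assumption,
// adapted from the implementation prompt used earlier in this post.
function buildImplementationPrompt(spec) {
  const nonGoals = (spec.non_goals || []).map((goal) => `- ${goal}`).join('\n');
  return [
    `Implement "${spec.feature}" according to the spec below.`,
    'Follow the existing repository pattern in our codebase.',
    'Create tests for each acceptance criterion.',
    // Repeat non-goals outside the JSON so they are hard to miss.
    nonGoals ? `Do NOT implement these non-goals:\n${nonGoals}` : '',
    'Spec:',
    JSON.stringify(spec, null, 2),
  ]
    .filter(Boolean)
    .join('\n\n');
}

const implementationPrompt = buildImplementationPrompt({
  feature: 'User Authentication API',
  requirements: [],
  non_goals: ['OAuth/Social login (future feature)'],
});
console.log(implementationPrompt);
```

Calling out the non-goals separately is a design choice, not a requirement: it puts the "what not to build" list next to the instructions as well as inside the JSON.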
Summary
In this post, I showed how Specs-Driven Development and AI coding assistants form a powerful feedback loop. The key point is: specs give AI direction, AI accelerates spec creation, and human oversight ensures specificity.
The workflow is:
- Human intent
- AI spec draft
- Human refinement
- AI code generation
- Human verification
Without specs, AI guesses. With specs, AI executes. That difference eliminates hours of back-and-forth corrections.
Final Words + More Resources
My intention with this article was to help others by sharing my knowledge and experience. If you want to contact me, you can reach me by email.
Here are also the most important links from this article along with some further resources that will help you in this scope:
- Reddit: SDD Discussion
- Claude Code Documentation
- Cursor AI
- Windsurf IDE
Oh, and if you found these resources useful, don’t forget to support me by starring the repo on GitHub!