What Is Vibe Coding? When Natural Language Programming Works (And When It Fails)

I spent three hours debugging an authentication system that I had “vibe coded” into existence. The AI had silently decided to store passwords in plain text, use sequential user IDs instead of UUIDs, and implement session management that would break the moment I deployed to production.

The worst part? The system worked perfectly in my local tests. It “felt right” when I ran it. And that’s exactly the trap of vibe coding.

The “It Just Works” Trap

Here’s what happened. I typed this prompt:

my naive request
Add user login to my app

The AI responded with 200 lines of code, database migrations, and a login form. I pasted it in, ran the server, and logged in successfully. It worked. I moved on.

what I saw
✓ Login successful
✓ Session created
✓ User redirected to dashboard

But here’s what the AI had actually decided on my behalf:

AI's silent decisions
1. Password storage: plain text (I didn't specify hashing)
2. User IDs: auto-incrementing integers (I didn't specify UUIDs)
3. Sessions: in-memory storage (I didn't specify Redis/database)
4. Validation: basic email format only (I didn't specify strength rules)
5. Rate limiting: none (I didn't ask for it)

Each decision was a defensible shortcut for a throwaway demo. Together, in an app headed for production, they created a security disaster.
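
To make the first of those decisions concrete, here is a minimal sketch of the difference between keeping the raw password and keeping a salted hash. It is not the code the AI generated. It uses PBKDF2 from Python's standard library only so that it runs without dependencies, and the function names, the `salt$hash` storage format, and the iteration count are all assumptions for illustration.

```python
import hashlib
import hmac
import secrets

# Assumed work factor for illustration; tune it for your own hardware.
ITERATIONS = 200_000


def hash_password(password: str) -> str:
    """Return 'salt$hash' (both hex-encoded) instead of the raw password."""
    salt = secrets.token_bytes(16)  # fresh random salt per password
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return f"{salt.hex()}${digest.hex()}"


def verify_password(password: str, stored: str) -> bool:
    """Recompute the hash with the stored salt and compare in constant time."""
    salt_hex, digest_hex = stored.split("$")
    candidate = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), bytes.fromhex(salt_hex), ITERATIONS
    )
    return hmac.compare_digest(candidate, bytes.fromhex(digest_hex))
```

With something like this in place, the database column holds the output of `hash_password(...)`, and a leaked table no longer hands every user's password to the attacker directly. A login that "works" looks identical from the outside in both versions, which is why the plain-text choice went unnoticed.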

What Vibe Coding Actually Is

Andrej Karpathy coined “vibe coding” in February 2025. The definition is simple: you describe what you want in natural language, AI generates the code, and you accept it if the output “feels right,” without necessarily understanding the implementation.

vibe coding flow
Traditional: Design → Code → Test → Debug → Deploy
Vibe Coding: Describe → AI Generates → Test (does it work?) → Done
The key difference: you judge by output, not by code.

The paradigm works because modern AI has enough context to make reasonable assumptions. When you say “add a dark mode toggle,” the AI knows to:

  • Add a button
  • Store the preference
  • Apply CSS classes
  • Default to system preference

You don’t specify any of this. The AI “vibes” your intent.

When Vibe Coding Shines

Last month, I needed a quick tool to scrape product prices from an e-commerce site. I typed:

quick script request
Write a Python script that:
- Takes a product URL as input
- Extracts the price
- Saves to CSV with timestamp
- Handles errors gracefully

Five minutes later, I had a working script. I tested it, it worked, I used it for three days, then deleted it. Perfect vibe coding use case:

  • One-off task
  • Low stakes (just price data, no security concerns)
  • Fast validation (I could verify the output immediately)
  • Disposable code

I didn’t care how the script parsed HTML, how it handled network retries, or what library it used. I cared that the CSV file had the right prices. The “vibe” was correct.
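
For a sense of scale, a script answering that prompt could be as small as the sketch below. This is not the script the AI produced for me. The regex-based price extraction, the function names, the User-Agent string, and the CSV column order are assumptions, and a real site would likely need a proper HTML parser.

```python
import csv
import re
import sys
import urllib.request
from datetime import datetime, timezone
from typing import Optional

# Assumption: the price appears as a currency symbol followed by digits, e.g. "$1,299.99".
PRICE_RE = re.compile(r"[$€£]\s?(\d[\d,]*(?:\.\d{1,2})?)")


def extract_price(html: str) -> Optional[float]:
    """Return the first price-looking number in the page, or None if there is none."""
    match = PRICE_RE.search(html)
    if match is None:
        return None
    return float(match.group(1).replace(",", ""))


def fetch(url: str, timeout: float = 10.0) -> str:
    """Download the page body as text."""
    request = urllib.request.Request(url, headers={"User-Agent": "price-check/0.1"})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


def append_row(csv_path: str, url: str, price: float) -> None:
    """Append one timestamped observation to the CSV file."""
    with open(csv_path, "a", newline="", encoding="utf-8") as handle:
        csv.writer(handle).writerow([datetime.now(timezone.utc).isoformat(), url, price])


def main(url: str, csv_path: str = "prices.csv") -> int:
    """Fetch, extract, save. Returns 0 on success and 1 on any handled failure."""
    try:
        price = extract_price(fetch(url))
    except (OSError, ValueError) as error:  # network errors, timeouts, malformed URLs
        print(f"fetch failed: {error}", file=sys.stderr)
        return 1
    if price is None:
        print("no price found; the page layout may have changed", file=sys.stderr)
        return 1
    append_row(csv_path, url, price)
    return 0
```

Calling `main(url)` with a product URL appends one row to `prices.csv`. Everything about this sketch is a silent decision of the kind discussed in this article (which currency symbols count, which price on the page wins), and for a three-day throwaway that is acceptable because the CSV makes mistakes visible immediately.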

The Silent Decision Problem

Here’s where vibe coding breaks down. Every prompt contains ambiguity, and the AI must resolve that ambiguity somehow.

Consider this request:

deceptively simple request
Add a user management feature

The AI must answer questions you never asked:

hidden complexity
- What is a "user"? (Admin? End-user? Both?)
- What does "manage" include? (CRUD? Permissions? Audit logs?)
- How should passwords work? (Hashing algorithm? Reset flow?)
- What's the session strategy? (JWT? Session cookies? OAuth?)
- How do we handle deletion? (Soft delete? Hard delete? Cascade?)
- What about search? (Filtering? Sorting? Pagination?)
- Should there be roles? (Admin? Moderator? User?)

For a prototype? Who cares. The AI makes reasonable defaults, you get something working, you show stakeholders.

For production? Every silent decision is a potential security vulnerability, a scalability bottleneck, or a maintenance nightmare.
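
The deletion question alone shows how far two "reasonable" readings can diverge. The sketch below uses an in-memory SQLite database with a hypothetical `users` table (the table, column, and function names are assumptions) to contrast a hard delete, which removes the row, with a soft delete, which only marks it.

```python
import sqlite3
from datetime import datetime, timezone

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT NOT NULL, deleted_at TEXT)"
)
conn.executemany(
    "INSERT INTO users (email) VALUES (?)",
    [("a@example.com",), ("b@example.com",)],
)


def hard_delete(user_id: int) -> None:
    """Reading 1: the row is gone, along with any chance of recovery or audit."""
    conn.execute("DELETE FROM users WHERE id = ?", (user_id,))


def soft_delete(user_id: int) -> None:
    """Reading 2: the row stays but is flagged, so every query must filter on the flag."""
    now = datetime.now(timezone.utc).isoformat()
    conn.execute("UPDATE users SET deleted_at = ? WHERE id = ?", (now, user_id))


def active_emails() -> list[str]:
    """Emails of users that have not been soft-deleted."""
    rows = conn.execute("SELECT email FROM users WHERE deleted_at IS NULL ORDER BY id")
    return [email for (email,) in rows]


hard_delete(1)
soft_delete(2)
print(active_emails())                                           # []
print(conn.execute("SELECT COUNT(*) FROM users").fetchone()[0])  # 1
```

Both users disappear from the application's point of view, but only one of them is actually gone from the database. Which behavior you got is invisible in a demo and decisive in a data-retention audit.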

The Spectrum of Vibe Tolerance

I’ve learned to categorize coding tasks by their “vibe tolerance”—how much ambiguity they can safely absorb.

vibe tolerance spectrum
HIGH TOLERANCE (vibe freely):
├── Prototypes and demos
├── One-off scripts
├── Hackathon projects
├── Personal tools
└── Proof-of-concepts
MEDIUM TOLERANCE (vibe carefully):
├── Internal tools
├── Non-critical features
├── Test utilities
└── Development scripts
LOW TOLERANCE (specify everything):
├── Authentication systems
├── Payment processing
├── Data validation
├── Security-critical features
└── Production APIs

The pattern is clear: vibe tolerance is inversely proportional to consequence. If the code fails, what happens?

failure impact analysis
Prototype fails → "Interesting, let's pivot"
Internal tool fails → "Someone fix it tomorrow"
Auth system fails → "We're in the news tomorrow"
Payment fails → "We're in court tomorrow"

My Vibe Coding Heuristic

After the authentication disaster, I developed a simple rule:

my heuristic
Can I explain in 30 seconds how this feature handles failure?
Yes → Vibe coding acceptable
No → Write specifications first

For the login system, I couldn’t answer:

  • How are sessions invalidated? “Um, the AI decided…”
  • What’s the password reset flow? “I think it sends emails…”
  • How are brute force attacks prevented? “I didn’t ask for that…”

For the price scraper, I could answer:

  • What if the site is down? “Script logs error, retries later”
  • What if the HTML changes? “Script fails, I fix the prompt”
  • What if prices are wrong? “I notice and adjust”

The difference isn’t the complexity of the code—it’s my understanding of the failure modes.

The Specification Alternative

When vibe coding fails, the alternative isn’t “write everything manually.” It’s “specify before generating.”

Here’s how I approach authentication now:

specified request
Add user authentication with these requirements:
1. Password handling:
   - Use bcrypt with cost factor 12
   - Minimum 12 characters, require mixed case + numbers + symbols
   - Reset via email token (24-hour expiry)
2. Session management:
   - JWT stored in httpOnly cookie
   - 15-minute access tokens, 7-day refresh tokens
   - Refresh token rotation on each use
3. Security:
   - Rate limit: 5 login attempts per 15 minutes per IP
   - Account lockout after 10 failed attempts
   - Log all authentication events
4. Database:
   - Users table with UUID primary key
   - Soft delete (deleted_at column)
   - Created_at and updated_at timestamps

This isn’t traditional coding. It’s still AI-assisted. But now I’m making the decisions, not delegating them to the AI’s “reasonable defaults.”
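
One benefit of a specification this precise is that individual lines translate almost directly into code you can check. As an illustration, here is a minimal sliding-window sketch of the "5 login attempts per 15 minutes per IP" rule. The class name, the injectable clock, and the in-memory dictionary are assumptions for the example; a multi-process deployment would need shared storage instead.

```python
import time
from collections import defaultdict, deque
from typing import Callable, Deque, Dict


class LoginRateLimiter:
    """Allow at most `max_attempts` per `window_seconds` for each client IP."""

    def __init__(
        self,
        max_attempts: int = 5,
        window_seconds: float = 15 * 60,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.max_attempts = max_attempts
        self.window_seconds = window_seconds
        self._clock = clock  # injectable so tests do not have to sleep
        self._attempts: Dict[str, Deque[float]] = defaultdict(deque)

    def allow(self, ip: str) -> bool:
        """Return True and record the attempt if it is within the limit, else False."""
        now = self._clock()
        attempts = self._attempts[ip]
        # Drop attempts that have aged out of the window.
        while attempts and now - attempts[0] >= self.window_seconds:
            attempts.popleft()
        if len(attempts) >= self.max_attempts:
            return False
        attempts.append(now)
        return True


# Usage with a fake clock, so the 15-minute window can be crossed instantly.
fake_now = [0.0]
limiter = LoginRateLimiter(clock=lambda: fake_now[0])
print([limiter.allow("203.0.113.7") for _ in range(6)])  # five True, then False
fake_now[0] += 15 * 60
print(limiter.allow("203.0.113.7"))                      # True again
```

Note that even this sketch had to make a choice the specification leaves open: rejected attempts are not counted toward the window. Reading generated code for exactly that kind of detail is the point of specifying first.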

The Vibe Coding Workflow That Works

I’ve found a middle ground that captures most of vibe coding’s speed while avoiding its pitfalls:

practical workflow
1. VIBE CODE THE FIRST PASS
   Describe what you want, let AI generate
2. READ THE CODE (yes, actually read it)
   Identify silent decisions
3. SPECIFY WHAT MATTERS
   Add constraints for security, scalability, correctness
4. REGENERATE WITH SPECS
   Let AI incorporate your decisions
5. TEST EDGE CASES
   Not just "does it work" but "how does it fail"

This takes longer than pure vibe coding. But it takes much less time than debugging a production outage caused by “reasonable defaults.”
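
Step 5 is the one most easily skipped, so here is a sketch of what "how does it fail" can look like in practice. It takes the password rule from the specification (at least 12 characters, mixed case, numbers, symbols) and tests the inputs that should be rejected, not only the one that should pass. The function name and the choice to treat any non-alphanumeric character as a symbol are assumptions.

```python
import unittest


def password_problems(password: str) -> list[str]:
    """Return the rules the password violates (an empty list means acceptable)."""
    problems = []
    if len(password) < 12:
        problems.append("shorter than 12 characters")
    if not any(c.islower() for c in password):
        problems.append("no lowercase letter")
    if not any(c.isupper() for c in password):
        problems.append("no uppercase letter")
    if not any(c.isdigit() for c in password):
        problems.append("no digit")
    # Assumption: any character that is not a letter or digit counts as a symbol.
    if not any(not c.isalnum() for c in password):
        problems.append("no symbol")
    return problems


class PasswordPolicyEdgeCases(unittest.TestCase):
    def test_happy_path(self):
        self.assertEqual(password_problems("Tr1cky-Passphrase"), [])

    def test_empty_string_reports_every_rule(self):
        self.assertEqual(len(password_problems("")), 5)

    def test_eleven_characters_is_too_short(self):
        self.assertIn("shorter than 12 characters", password_problems("Aa1!aaaaaaa"))

    def test_exactly_twelve_characters_is_accepted(self):
        self.assertEqual(password_problems("Aa1!aaaaaaaa"), [])

    def test_long_but_all_lowercase_is_rejected(self):
        self.assertNotEqual(password_problems("a" * 40), [])
```

Saved to a file, this runs with `python -m unittest`. Only the first test is a "does it work" check; the other four probe boundaries and rejections, which is where the silent decisions tend to hide.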

When to Trust the Vibe

There’s still a place for pure vibe coding—where you accept output without deep inspection. My criteria:

trust checklist
✓ Will this code be deleted within a week?
✓ If it fails, is the worst case "I lose 30 minutes of work"?
✓ Can I verify correctness in under 5 minutes?
✓ Does it handle zero sensitive data?
✓ Is it not connected to production systems?
All yes → Vibe away
Any no → Specify first

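The price scraper passed this test. The authentication system failed all five questions: it was meant to last, a failure could cost far more than 30 minutes, its correctness could not be verified in five minutes, it handled passwords, and it was headed for production.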

The Future: Vibe Coding as Sketching

I’ve come to see vibe coding as the “sketching” phase of development. It’s how you explore ideas quickly, test assumptions, and validate concepts. But sketches aren’t blueprints.

development phases
Sketch (Vibe Code) → Blueprint (Specify) → Build (AI-Assisted) → Ship
The trap: treating sketches as blueprints
The opportunity: using each phase for what it's designed for

When I need a landing page for a hackathon project tomorrow, I vibe code it. When I need an authentication system for a production app next month, I specify first and generate second.

The skill isn’t choosing between vibe coding and traditional development. The skill is knowing when each approach is appropriate—and understanding that “it works” is a very different statement than “it’s safe.”

Final Words + More Resources

My intention with this article was to share my knowledge and experience with others. If you want to get in touch, you can reach me by email.
