Skip to content

Why AI Generates SQL Injection Vulnerable Code and How to Fix It

I asked an AI assistant to write a simple user lookup function. It gave me this:

userController.js
async function getUserById(req, res) {
const query = `SELECT * FROM users WHERE id = ${req.params.id}`;
const result = await db.query(query);
res.json(result);
}

It worked perfectly in testing. Pass in 123, get back user 123. Clean, readable, done.

Then someone passed in 1 OR 1=1-- as the ID parameter.

Suddenly my “simple” query returned every user in the database.

Why AI Generates Vulnerable SQL

I started noticing this pattern everywhere. Ask ChatGPT or Claude for database code, and they almost always start with string concatenation or template literals.

The reason is simple: LLMs are trained on millions of code examples, and a lot of that training data contains vulnerable code. Old tutorials, Stack Overflow answers from 2012, blog posts from the early days of Node.js - they all used string concatenation because it was easy to understand.

The AI sees these patterns repeatedly:

  • It produces readable code
  • It looks clean and concise
  • It works in simple test cases
  • It matches common tutorial patterns

So the AI reproduces what it learned. The code looks correct because it mirrors the “correct” examples in its training data.

The problem is those examples were wrong from a security standpoint.

The “It Works” Trap

Here’s why this is so dangerous: the vulnerable code actually works.

vulnerable.js
// This works fine with normal input
const userId = '123';
const query = `SELECT * FROM users WHERE id = ${userId}`;
// Result: SELECT * FROM users WHERE id = 123
// Test it manually - looks good!
// Deploy it - tests pass!
// Ship it - nobody complains!

Everything seems fine until someone decides to test your input handling.

Anatomy of the Attack

Let me show you exactly what happens when an attacker exploits this vulnerability.

The original query:

sql
SELECT * FROM users WHERE id = 123

An attacker sends this request:

HTTP Request
GET /api/users/1%20OR%201=1--

URL decoded, that’s 1 OR 1=1--. The resulting query becomes:

sql
SELECT * FROM users WHERE id = 1 OR 1=1--

The -- comments out anything that follows. The condition 1=1 is always true. So this returns every row in the users table.

But data exfiltration is just the beginning. An attacker can do much worse:

Attack Payload
1; DROP TABLE users;--
sql
SELECT * FROM users WHERE id = 1; DROP TABLE users;--

There goes your users table.

Or how about extracting sensitive data through UNION attacks:

Attack Payload
1 UNION SELECT username, password, email, NULL FROM admin_users--

This appends results from another table to your original query.

The Fix: Parameterized Queries

The solution is parameterized queries (also called prepared statements). This technique separates SQL code from user data.

Here’s the corrected version:

userController.js
async function getUserById(req, res) {
const query = 'SELECT * FROM users WHERE id = ?';
const result = await db.query(query, [req.params.id]);
res.json(result);
}

The ? is a placeholder. The database driver treats the parameter [req.params.id] as data, never as SQL code. Even if someone passes 1 OR 1=1--, the database looks for a user whose ID literally equals that string.

I tried this attack again with the parameterized version:

test.js
const maliciousInput = "1 OR 1=1--";
const query = 'SELECT * FROM users WHERE id = ?';
const result = await db.query(query, [maliciousInput]);
// Result: Empty set (no user with that ID)

The attack failed completely. The database searched for a user with ID equal to the literal string 1 OR 1=1--. No such user exists, so it returned nothing.

Implementation Examples for Different Databases

MySQL with mysql2

mysql-example.js
import mysql from 'mysql2/promise';
const pool = mysql.createPool({
host: 'localhost',
user: 'root',
database: 'myapp'
});
// WRONG - Vulnerable
async function getUserBad(id) {
const [rows] = await pool.query(
`SELECT * FROM users WHERE id = ${id}`
);
return rows;
}
// CORRECT - Parameterized
async function getUserGood(id) {
const [rows] = await pool.query(
'SELECT * FROM users WHERE id = ?',
[id]
);
return rows;
}

PostgreSQL with pg

postgres-example.js
import { Pool } from 'pg';
const pool = new Pool({
host: 'localhost',
database: 'myapp'
});
// PostgreSQL uses $1, $2, etc. for placeholders
async function getUserById(id) {
const result = await pool.query(
'SELECT * FROM users WHERE id = $1',
[id]
);
return result.rows;
}
// Multiple parameters
async function searchUsers(name, minAge) {
const result = await pool.query(
'SELECT * FROM users WHERE name LIKE $1 AND age > $2',
[`%${name}%`, minAge]
);
return result.rows;
}

Using ORMs (Sequelize, Prisma, TypeORM)

ORMs handle parameterization automatically:

sequelize-example.js
import { User } from './models';
// Sequelize automatically parameterizes
const user = await User.findOne({
where: { id: req.params.id }
});
// Custom WHERE conditions - use Sequelize operators
const users = await User.findAll({
where: {
name: { [Op.like]: `%${searchTerm}%` }
}
});
prisma-example.ts
import { PrismaClient } from '@prisma/client';
const prisma = new PrismaClient();
// Prisma handles parameterization
const user = await prisma.user.findUnique({
where: { id: parseInt(req.params.id) }
});
// Raw queries with Prisma
const users = await prisma.$queryRaw`
SELECT * FROM users WHERE name LIKE ${`%${searchTerm}%`}
`;

Note the backticks with ${} inside Prisma’s $queryRaw - this is NOT string concatenation. Prisma treats template literal parameters as separate data.

Prompt Engineering for Secure AI Code

I’ve learned to be explicit in my prompts when asking AI for database code:

Bad prompt:

prompt-bad.txt
Write a function to search users by name

Better prompt:

prompt-better.txt
Write a function to search users by name using parameterized queries.
Use prepared statements to prevent SQL injection.
Target: Node.js with PostgreSQL pg library.

Even better, include a security requirement:

prompt-best.txt
Write a secure user search function that:
1. Uses parameterized queries ($1, $2 placeholders)
2. Validates input is a non-empty string
3. Returns empty array for no results
4. Target: Node.js + PostgreSQL

Defense in Depth

Parameterized queries are your primary defense, but I recommend multiple layers:

Input Validation

validation.js
function validateUserId(id) {
const parsed = parseInt(id, 10);
if (isNaN(parsed) || parsed <= 0) {
throw new Error('Invalid user ID');
}
return parsed;
}
async function getUserById(req, res) {
const id = validateUserId(req.params.id);
const result = await db.query('SELECT * FROM users WHERE id = $1', [id]);
res.json(result.rows);
}

Principle of Least Privilege

Your application database user should only have permissions it needs:

db-setup.sql
-- Create application user with limited permissions
CREATE USER app_readonly WITH PASSWORD 'secure_password';
GRANT SELECT ON users TO app_readonly;
-- For write operations, use a different user
CREATE USER app_readwrite WITH PASSWORD 'different_password';
GRANT SELECT, INSERT, UPDATE ON users TO app_readwrite;
-- No DROP, ALTER, or CREATE permissions

Error Handling Without Leaks

error-handling.js
async function getUserById(req, res) {
try {
const result = await db.query(
'SELECT * FROM users WHERE id = $1',
[req.params.id]
);
if (result.rows.length === 0) {
return res.status(404).json({ error: 'User not found' });
}
res.json(result.rows[0]);
} catch (error) {
// Log the actual error internally
console.error('Database error:', error);
// Return generic message to client
res.status(500).json({ error: 'Internal server error' });
}
}

Never expose database errors to users - they can reveal schema information.

Quick Reference: Do’s and Don’ts

dos-and-donts.js
// DON'T: String concatenation
const query = "SELECT * FROM users WHERE name = '" + name + "'";
// DON'T: Template literals with user input
const query = `SELECT * FROM users WHERE id = ${id}`;
// DON'T: Trusting "sanitization" functions
const query = `SELECT * FROM users WHERE name = '${escape(name)}'`;
// DO: Parameterized queries
const query = 'SELECT * FROM users WHERE name = ?';
// DO: ORM methods
const user = await User.findOne({ where: { name } });
// DO: Input validation before any database operation
if (!isValidId(id)) throw new Error('Invalid ID');

Summary

AI generates vulnerable SQL because its training data contains millions of examples of string concatenation. The code looks clean and works in testing, making it an easy trap.

The fix is always parameterized queries:

  • MySQL: Use ? placeholders
  • PostgreSQL: Use $1, $2 placeholders
  • ORMs: Use their built-in methods

Add input validation and proper error handling as defense layers. When prompting AI for database code, explicitly request parameterized queries and specify your database driver.

The vulnerable code looks simpler, but parameterized queries aren’t much harder. They’re just different. Once you make them a habit, you’ll write secure code by default.

Final Words + More Resources

My intention with this article was to help others share my knowledge and experience. If you want to contact me, you can contact by email: Email me

Here are also the most important links from this article along with some further resources that will help you in this scope:

Oh, and if you found these resources useful, don’t forget to support me by starring the repo on GitHub!

Comments