Why AI Generates SQL Injection Vulnerable Code and How to Fix It
I asked an AI assistant to write a simple user lookup function. It gave me this:
async function getUserById(req, res) { const query = `SELECT * FROM users WHERE id = ${req.params.id}`; const result = await db.query(query); res.json(result);}It worked perfectly in testing. Pass in 123, get back user 123. Clean, readable, done.
Then someone passed in 1 OR 1=1-- as the ID parameter.
Suddenly my “simple” query returned every user in the database.
Why AI Generates Vulnerable SQL
I started noticing this pattern everywhere. Ask ChatGPT or Claude for database code, and they almost always start with string concatenation or template literals.
The reason is simple: LLMs are trained on millions of code examples, and a lot of that training data contains vulnerable code. Old tutorials, Stack Overflow answers from 2012, blog posts from the early days of Node.js - they all used string concatenation because it was easy to understand.
The AI sees these patterns repeatedly:
- It produces readable code
- It looks clean and concise
- It works in simple test cases
- It matches common tutorial patterns
So the AI reproduces what it learned. The code looks correct because it mirrors the “correct” examples in its training data.
The problem is those examples were wrong from a security standpoint.
The “It Works” Trap
Here’s why this is so dangerous: the vulnerable code actually works.
// This works fine with normal inputconst userId = '123';const query = `SELECT * FROM users WHERE id = ${userId}`;// Result: SELECT * FROM users WHERE id = 123
// Test it manually - looks good!// Deploy it - tests pass!// Ship it - nobody complains!Everything seems fine until someone decides to test your input handling.
Anatomy of the Attack
Let me show you exactly what happens when an attacker exploits this vulnerability.
The original query:
SELECT * FROM users WHERE id = 123An attacker sends this request:
GET /api/users/1%20OR%201=1--URL decoded, that’s 1 OR 1=1--. The resulting query becomes:
SELECT * FROM users WHERE id = 1 OR 1=1--The -- comments out anything that follows. The condition 1=1 is always true. So this returns every row in the users table.
But data exfiltration is just the beginning. An attacker can do much worse:
1; DROP TABLE users;--SELECT * FROM users WHERE id = 1; DROP TABLE users;--There goes your users table.
Or how about extracting sensitive data through UNION attacks:
1 UNION SELECT username, password, email, NULL FROM admin_users--This appends results from another table to your original query.
The Fix: Parameterized Queries
The solution is parameterized queries (also called prepared statements). This technique separates SQL code from user data.
Here’s the corrected version:
async function getUserById(req, res) { const query = 'SELECT * FROM users WHERE id = ?'; const result = await db.query(query, [req.params.id]); res.json(result);}The ? is a placeholder. The database driver treats the parameter [req.params.id] as data, never as SQL code. Even if someone passes 1 OR 1=1--, the database looks for a user whose ID literally equals that string.
I tried this attack again with the parameterized version:
const maliciousInput = "1 OR 1=1--";const query = 'SELECT * FROM users WHERE id = ?';const result = await db.query(query, [maliciousInput]);// Result: Empty set (no user with that ID)The attack failed completely. The database searched for a user with ID equal to the literal string 1 OR 1=1--. No such user exists, so it returned nothing.
Implementation Examples for Different Databases
MySQL with mysql2
import mysql from 'mysql2/promise';
const pool = mysql.createPool({ host: 'localhost', user: 'root', database: 'myapp'});
// WRONG - Vulnerableasync function getUserBad(id) { const [rows] = await pool.query( `SELECT * FROM users WHERE id = ${id}` ); return rows;}
// CORRECT - Parameterizedasync function getUserGood(id) { const [rows] = await pool.query( 'SELECT * FROM users WHERE id = ?', [id] ); return rows;}PostgreSQL with pg
import { Pool } from 'pg';
const pool = new Pool({ host: 'localhost', database: 'myapp'});
// PostgreSQL uses $1, $2, etc. for placeholdersasync function getUserById(id) { const result = await pool.query( 'SELECT * FROM users WHERE id = $1', [id] ); return result.rows;}
// Multiple parametersasync function searchUsers(name, minAge) { const result = await pool.query( 'SELECT * FROM users WHERE name LIKE $1 AND age > $2', [`%${name}%`, minAge] ); return result.rows;}Using ORMs (Sequelize, Prisma, TypeORM)
ORMs handle parameterization automatically:
import { User } from './models';
// Sequelize automatically parameterizesconst user = await User.findOne({ where: { id: req.params.id }});
// Custom WHERE conditions - use Sequelize operatorsconst users = await User.findAll({ where: { name: { [Op.like]: `%${searchTerm}%` } }});import { PrismaClient } from '@prisma/client';
const prisma = new PrismaClient();
// Prisma handles parameterizationconst user = await prisma.user.findUnique({ where: { id: parseInt(req.params.id) }});
// Raw queries with Prismaconst users = await prisma.$queryRaw` SELECT * FROM users WHERE name LIKE ${`%${searchTerm}%`}`;Note the backticks with ${} inside Prisma’s $queryRaw - this is NOT string concatenation. Prisma treats template literal parameters as separate data.
Prompt Engineering for Secure AI Code
I’ve learned to be explicit in my prompts when asking AI for database code:
Bad prompt:
Write a function to search users by nameBetter prompt:
Write a function to search users by name using parameterized queries.Use prepared statements to prevent SQL injection.Target: Node.js with PostgreSQL pg library.Even better, include a security requirement:
Write a secure user search function that:1. Uses parameterized queries ($1, $2 placeholders)2. Validates input is a non-empty string3. Returns empty array for no results4. Target: Node.js + PostgreSQLDefense in Depth
Parameterized queries are your primary defense, but I recommend multiple layers:
Input Validation
function validateUserId(id) { const parsed = parseInt(id, 10); if (isNaN(parsed) || parsed <= 0) { throw new Error('Invalid user ID'); } return parsed;}
async function getUserById(req, res) { const id = validateUserId(req.params.id); const result = await db.query('SELECT * FROM users WHERE id = $1', [id]); res.json(result.rows);}Principle of Least Privilege
Your application database user should only have permissions it needs:
-- Create application user with limited permissionsCREATE USER app_readonly WITH PASSWORD 'secure_password';GRANT SELECT ON users TO app_readonly;
-- For write operations, use a different userCREATE USER app_readwrite WITH PASSWORD 'different_password';GRANT SELECT, INSERT, UPDATE ON users TO app_readwrite;-- No DROP, ALTER, or CREATE permissionsError Handling Without Leaks
async function getUserById(req, res) { try { const result = await db.query( 'SELECT * FROM users WHERE id = $1', [req.params.id] );
if (result.rows.length === 0) { return res.status(404).json({ error: 'User not found' }); }
res.json(result.rows[0]); } catch (error) { // Log the actual error internally console.error('Database error:', error);
// Return generic message to client res.status(500).json({ error: 'Internal server error' }); }}Never expose database errors to users - they can reveal schema information.
Quick Reference: Do’s and Don’ts
// DON'T: String concatenationconst query = "SELECT * FROM users WHERE name = '" + name + "'";
// DON'T: Template literals with user inputconst query = `SELECT * FROM users WHERE id = ${id}`;
// DON'T: Trusting "sanitization" functionsconst query = `SELECT * FROM users WHERE name = '${escape(name)}'`;
// DO: Parameterized queriesconst query = 'SELECT * FROM users WHERE name = ?';
// DO: ORM methodsconst user = await User.findOne({ where: { name } });
// DO: Input validation before any database operationif (!isValidId(id)) throw new Error('Invalid ID');Summary
AI generates vulnerable SQL because its training data contains millions of examples of string concatenation. The code looks clean and works in testing, making it an easy trap.
The fix is always parameterized queries:
- MySQL: Use
?placeholders - PostgreSQL: Use
$1, $2placeholders - ORMs: Use their built-in methods
Add input validation and proper error handling as defense layers. When prompting AI for database code, explicitly request parameterized queries and specify your database driver.
The vulnerable code looks simpler, but parameterized queries aren’t much harder. They’re just different. Once you make them a habit, you’ll write secure code by default.
Final Words + More Resources
My intention with this article was to help others share my knowledge and experience. If you want to contact me, you can contact by email: Email me
Here are also the most important links from this article along with some further resources that will help you in this scope:
Oh, and if you found these resources useful, don’t forget to support me by starring the repo on GitHub!
Comments