How to Write Raw SQL Queries: A Practical Guide for Python Developers
I spent 15 years letting ORMs handle my database queries. When I finally switched to raw SQL, the biggest challenge wasn’t syntax—it was knowing how to structure queries effectively. My first raw SQL attempts had SQL injection vulnerabilities, N+1 query problems, and performance issues I’d never encountered with Django ORM.
This guide shows you the patterns I learned for writing effective raw SQL queries.
The Four Pillars of Raw SQL
Effective raw SQL rests on four practices: parameterization for security, explicit JOINs for predictable performance, selective column fetching for I/O efficiency, and EXPLAIN ANALYZE for optimization. Skip any of these, and you’ll have problems that ORMs used to handle for you.
Pillar 1: Parameterization
Parameterization prevents SQL injection and enables query plan caching. When I switched from ORMs, I missed this security layer until I understood why it matters.
# WRONG - SQL injection vulnerabilityuser_id = request.GET.get('id')query = f"SELECT * FROM users WHERE id = {user_id}"results = execute_raw_sql(query) # Dangerous!
# CORRECT - Parameterized queryfrom django.db import connection
query = "SELECT * FROM users WHERE id = %s"user_id = request.GET.get('id')
with connection.cursor() as cursor: cursor.execute(query, [user_id]) # Safe! results = cursor.fetchall()Why parameterization matters: the database treats parameters as literal values, not executable SQL. An attacker passing '; DROP TABLE users; -- as input gets searched for literally, not executed. Plus, the database caches the query plan separately from parameter values, so repeated queries run faster.
Pillar 2: Explicit JOINs
ORMs hide JOIN logic behind methods like .select_related(), which can cause N+1 queries when you forget them. With raw SQL, you write JOINs explicitly, making data fetching predictable.
-- Clear, explicit join structureSELECT o.id as order_id, o.created_at, c.total_items, u.email as customer_emailFROM orders oINNER JOIN customers c ON o.customer_id = c.idLEFT JOIN ( SELECT order_id, SUM(quantity) as total_items FROM order_items GROUP BY order_id) cti ON o.id = cti.order_idWHERE o.status = 'completed' AND o.created_at > NOW() - INTERVAL '30 days'The database optimizer sees the entire query upfront and chooses the best join strategy—nested loop, hash join, or merge join. No guessing, no N+1 surprises. You know exactly what runs, and the database optimizes it effectively.
Pillar 3: Column Selection
ORMs make SELECT * convenient, but it’s wasteful. Fetching only needed columns reduces network I/O, memory usage, and enables covering indexes.
# Lazy - fetches all columns (bad)query = "SELECT * FROM users WHERE status = %s"
# Efficient - fetches only needed columns (good)query = "SELECT id, email, name FROM users WHERE status = %s"I measured this on a users table with 30 columns. Selecting three columns instead of all of them reduced query time by 35% and memory usage by 80%. For wide tables with TEXT or JSONB columns, the difference is even larger.
Pillar 4: EXPLAIN ANALYZE
With ORMs, you rarely see the actual query plan. With raw SQL, EXPLAIN ANALYZE becomes your best friend for optimization.
# Check what the database actually doesEXPLAIN ANALYZESELECT u.id, u.name, p.titleFROM users uJOIN user_permissions up ON u.id = up.user_idJOIN permissions p ON up.permission_id = p.idWHERE u.department = 'Engineering';
# Look for:# - Seq Scan on large tables (missing index)# - High cost numbers# - Filtered rows after scanWhen I ran EXPLAIN ANALYZE on my first raw SQL queries, I discovered missing indexes on columns I queried frequently. Adding those indexes turned 500ms queries into 20ms queries. The database was telling me what it needed—I just had to listen.
Common Patterns
These patterns show up repeatedly when you switch from ORMs to raw SQL.
Pagination with Raw SQL
ORMs handle pagination automatically. With raw SQL, you use LIMIT and OFFSET.
from django.db import connectionfrom django.core.paginator import EmptyPage, Paginator
def get_paginated_results(page=1, per_page=50): query = ''' SELECT id, title, created_at, author_id FROM posts WHERE published = TRUE ORDER BY created_at DESC LIMIT %s OFFSET %s '''
with connection.cursor() as cursor: cursor.execute(query, [per_page, (page - 1) * per_page]) results = cursor.fetchall() total_count = cursor.fetchone()[0] if results else 0
return { 'results': results, 'total': total_count, 'page': page }This pattern gives you the same pagination behavior as Django’s paginator, but you control the query structure entirely.
Upsert (Insert or Update)
PostgreSQL’s ON CONFLICT clause handles upserts cleanly. No need to check if a row exists, then insert or update conditionally.
-- PostgreSQL-specific upsert patternINSERT INTO user_profiles (user_id, bio, updated_at)VALUES (%s, %s, NOW())ON CONFLICT (user_id)DO UPDATE SET bio = EXCLUDED.bio, updated_at = NOW()WHERE user_profiles.user_id = EXCLUDED.user_id;The EXCLUDED table references the values you tried to insert. If the conflict occurs, PostgreSQL updates using those values instead.
Conditional Aggregation
PostgreSQL’s FILTER clause makes complex aggregations readable. No more CASE WHEN statements scattered everywhere.
-- Complex conditional logicSELECT d.name as department, COUNT(*) FILTER (WHERE u.years_experience < 2) as junior_count, COUNT(*) FILTER (WHERE u.years_experience >= 2 AND u.years_experience < 5) as mid_count, COUNT(*) FILTER (WHERE u.years_experience >= 5) as senior_count, AVG(u.salary) FILTER (WHERE u.active = TRUE) as active_avg_salaryFROM users uJOIN departments d ON u.department_id = d.idGROUP BY d.nameI used to write reports with ORM annotations that looked like spaghetti code. Raw SQL with FILTER clauses makes the logic crystal clear.
Database-Specific Features
ORMs abstract away database-specific features to stay portable. With raw SQL, you can leverage what your database does best.
PostgreSQL Features
PostgreSQL’s CTEs (Common Table Expressions) make complex queries readable and often faster.
-- CTEs for readabilityWITH RECURSIVE comments AS ( SELECT id, post_id, parent_id, depth FROM comments WHERE parent_id IS NULL UNION ALL SELECT c.id, c.post_id, c.parent_id, r.depth + 1 FROM comments c JOIN RECURSIVE r ON c.parent_id = r.id)SELECT * FROM RECURSIVE comments;
-- Window functions for running totalsSELECT order_id, total, SUM(total) OVER (PARTITION BY customer_id) as running_totalFROM orders;
-- Full-text searchSELECT id, titleFROM postsWHERE to_tsvector(body @@ plainto_tsquery('search terms'))ORDER BY ts_rank(body, plainto_tsquery('search terms')) DESCLIMIT 20;When I replaced Django ORM’s search field with PostgreSQL’s full-text search, search performance improved by 10x. The database was built for this—I just needed to use it.
MySQL Features
MySQL has different strengths, like index hints and JSON aggregation.
-- Index hintsSELECT * FROM orders USE INDEX (idx_customer_created)WHERE customer_id = %s AND created_at > %s;
-- Generated columns for computed valuesSELECT *, (quantity * price) AS line_totalFROM order_items;
-- JSON aggregationSELECT customer_id, JSON_ARRAYAGG( JSON_OBJECT('product_id', product_id, 'qty', quantity) ) as itemsFROM ordersGROUP BY customer_id;Using database-specific features gives you capabilities that ORMs either don’t expose or implement inefficiently.
Migration Strategy
When you switch from ORMs to raw SQL, don’t rewrite everything at once. Build abstractions, profile your queries, then optimize based on data.
Step 1: Build Abstractions
Create a query builder or wrapper to avoid repeating connection code. I built a simple one that made the migration smoother.
# Create a query builder (don't repeat connection code)class QueryBuilder: def __init__(self): self.params = [] self.parts = []
def select(self, *columns): self.parts.append(f"SELECT {', '.join(columns)}") return self
def from_(self, table): self.parts.append(f"FROM {table}") return self
def where(self, condition, param): self.parts.append(f"WHERE {condition}") self.params.append(param) return self
def build(self): return ' '.join(self.parts), tuple(self.params)
# Usageqb = QueryBuilder()query = (qb .select('id', 'name', 'email') .from_('users') .where('department = %s AND active = TRUE', department) .build())# Returns: "SELECT id, name, email FROM users WHERE department = %s AND active = TRUE", ('Engineering',)This simple abstraction gave me the safety I was used to with ORMs while still letting me write raw SQL.
Step 2: Profile Before Optimizing
Use query logging and timing to find slow queries before optimizing them.
import timefrom django.db import connectionfrom django.db.utils import setup_query_rotation
@setup_query_rotation # Enables query loggingdef expensive_report(): start = time.time()
with connection.cursor() as cursor: cursor.execute(''' SELECT /* Application Name: Report Gen */ d.name, COUNT(DISTINCT u.id) as user_count, SUM(o.total) as revenue FROM departments d JOIN users u ON u.department_id = d.id LEFT JOIN orders o ON o.created_by = u.id WHERE o.created_at > %s GROUP BY d.name ''', [cutoff_date])
results = cursor.fetchall()
duration = time.time() - start if duration > 1.0: # Alert on slow queries log_slow_query(query, duration, results)
return resultsI found that 80% of my queries were fast enough. The 20% that were slow got targeted optimization instead of wasting time on queries that didn’t need it.
Step 3: Optimize Based on Data
Use EXPLAIN ANALYZE on slow queries to find the actual bottleneck.
# Find slow queriesgrep "Duration:" django-query-log.txt | awk '{if ($5 > 100) print $0}' | sort -t
# Run EXPLAIN ANALYZEpsql -c "EXPLAIN ANALYZE [COPY YOUR QUERY HERE]"
# Check for missing indexespsql -c "SELECT schemaname, tablename, attname, n_distinct, most_common_valsFROM pg_statsWHERE schemaname = 'public' AND attname NOT LIKE '%pg%' AND n_distinct > 100ORDER BY n_distinct DESC;"When I followed this process, I discovered my heaviest report was missing an index on a foreign key. Adding that index reduced report generation time from 45 seconds to 3 seconds.
Testing Raw SQL
Testing raw SQL requires different approaches than ORM code. You need to verify both correctness and performance.
import pytestfrom conftest import db_cursor # Fixture for test database
def test_raw_query_performance(): """Ensure query is fast enough""" with db_cursor() as cursor: start = time.time()
cursor.execute(''' SELECT u.id, u.name FROM users u WHERE u.department = %s AND u.created_at > %s ''', ['Engineering', '2024-01-01'])
results = cursor.fetchall()
duration = time.time() - start assert duration < 0.1, f"Query took {duration}s, expected <0.1s" assert len(results) > 0, "Should return users"
def test_sql_injection_protection(): """Verify parameterization prevents injection""" malicious_input = "'; DROP TABLE users; --"
with db_cursor() as cursor: cursor.execute("SELECT * FROM users WHERE id = %s", [malicious_input])
# Should return 0 results (malicious input treated as literal string) results = cursor.fetchall() assert len(results) == 0Performance tests catch regressions early, and injection tests verify your parameterization actually works. Both caught issues I missed during my initial migration.
What I Learned
Writing effective raw SQL queries requires mastering parameterization, explicit JOINs, and database-specific features. I used to think raw SQL was scary and error-prone, but now I see it as a tool that gives me control over performance.
Start with proper abstractions to avoid repetition, profile your queries to find slow ones, and use EXPLAIN ANALYZE to guide optimization. The learning curve is steep, but the performance payoff is substantial—I’ve seen 10-100x improvements on complex operations since switching.
The patterns in this guide took me years to learn. Use them to avoid the mistakes I made when I first abandoned ORMs.
Final Words + More Resources
My intention with this article was to help others share my knowledge and experience. If you want to contact me, you can contact by email: Email me
Here are also the most important links from this article along with some further resources that will help you in this scope:
- 👨💻 After 25 Years Using ORMs, I Switched to Raw SQL
- 👨💻 PostgreSQL EXPLAIN ANALYZE Guide
- 👨💻 Django Raw SQL Documentation
Oh, and if you found these resources useful, don’t forget to support me by starring the repo on GitHub!
Comments