How to Configure PostgreSQL Autovacuum to Prevent Table Bloat
1. The Problem: My Queries Were Getting Slower
I noticed my PostgreSQL queries were slowing down over time. Tables that once returned results in milliseconds now took seconds. The database storage kept growing even though I was deleting old records regularly.
I checked my tables and found they were bloated - containing many “dead tuples” that weren’t being cleaned up. The culprit? Autovacuum wasn’t keeping up with my workload.
2. Understanding What Autovacuum Does
When you UPDATE or DELETE rows in PostgreSQL, the old row versions aren’t immediately removed. They become “dead tuples” that:
- Waste disk space - Tables grow larger than necessary (“bloat”)
- Slow scans - Sequential and index scans must skip dead tuples
- Risk transaction ID wraparound - Can eventually force database shutdown
Autovacuum runs VACUUM and ANALYZE automatically to reclaim dead tuples and update statistics. I initially thought about disabling it because it was “causing performance issues.” That was the wrong approach.
3. The Investigation
3.1 Checking Dead Tuple Ratio
First, I checked which tables had the worst bloat:
SELECT relname, n_dead_tup, n_live_tup, n_dead_tup::float / NULLIF(n_live_tup, 0) AS dead_ratioFROM pg_stat_user_tablesWHERE n_live_tup > 0ORDER BY dead_ratio DESC;My busiest table showed a dead_ratio of 0.5 - half the tuples were dead. That explained the slow scans.
3.2 Checking Current Autovacuum Settings
I examined my current configuration:
SHOW autovacuum_vacuum_cost_limit;SHOW autovacuum_max_workers;SHOW autovacuum_naptime;SHOW log_autovacuum_min_duration;The results:
autovacuum_vacuum_cost_limit = -1autovacuum_max_workers = 3autovacuum_naptime = 1minlog_autovacuum_min_duration = -1The -1 for autovacuum_vacuum_cost_limit means it inherits from vacuum_cost_limit, which defaults to 200. This is far too conservative for modern SSDs.
3.3 Checking Autovacuum Activity
I wanted to see when autovacuum last ran:
SELECT relname, last_autovacuum, autovacuum_count, n_dead_tupFROM pg_stat_user_tablesWHERE autovacuum_count > 0ORDER BY last_autovacuum DESC;My large tables showed last_autovacuum dates from weeks ago. Autovacuum wasn’t triggering frequently enough.
4. Why Autovacuum Wasn’t Keeping Up
4.1 The Cost Limit Problem
PostgreSQL throttles autovacuum using a “cost budget” mechanism. Each read/write operation depletes this budget. When exhausted, autovacuum sleeps before continuing.
With autovacuum_vacuum_cost_limit = 200 (via the default vacuum_cost_limit), autovacuum was being throttled too aggressively. On modern SSDs, this conservative limit causes unnecessarily slow vacuuming.
4.2 The Threshold Formula Problem
Autovacuum triggers based on this formula:
vacuum_threshold = autovacuum_vacuum_threshold + (autovacuum_vacuum_scale_factor * table_size)For my 1 billion row table with default autovacuum_vacuum_scale_factor = 0.2, autovacuum wouldn’t start until 200 million rows were modified. That’s way too much accumulated bloat before cleanup starts.
4.3 The Worker Limit Problem
With only 3 workers, only 3 tables could be vacuumed simultaneously. My database had dozens of active tables - many had to wait in queue.
5. The Solution
5.1 Global Configuration Changes
I updated postgresql.conf with these settings:
autovacuum_vacuum_cost_limit = 3000autovacuum_max_workers = 5log_autovacuum_min_duration = 0Why these values:
autovacuum_vacuum_cost_limit = 3000: Allows much faster vacuuming on SSD storageautovacuum_max_workers = 5: More parallel vacuum workers for databases with many active tableslog_autovacuum_min_duration = 0: Logs all autovacuum activity for monitoring
After applying and restarting PostgreSQL:
ALTER SYSTEM SET autovacuum_vacuum_cost_limit = 3000;ALTER SYSTEM SET autovacuum_max_workers = 5;ALTER SYSTEM SET log_autovacuum_min_duration = 0;SELECT pg_reload_conf();5.2 Per-Table Tuning for Large Tables
For my busiest tables, I set custom thresholds:
-- For a table with frequent updatesALTER TABLE orders SET ( autovacuum_vacuum_scale_factor = 0.05, autovacuum_analyze_scale_factor = 0.02);
-- For a very large table needing aggressive vacuumingALTER TABLE audit_log SET ( autovacuum_vacuum_scale_factor = 0.01, autovacuum_vacuum_threshold = 1000, autovacuum_vacuum_cost_delay = 2);With autovacuum_vacuum_scale_factor = 0.01, vacuuming triggers after just 1% of rows change - much more responsive for large tables.
5.3 Measuring Bloat with pgstattuple
I installed the pgstattuple extension to measure actual bloat:
CREATE EXTENSION pgstattuple;
SELECT * FROM pgstattuple('orders');The output showed:
table_len | 1073741824tuple_count | 1000000tuple_len | 100000000dead_tuple_count | 500000dead_tuple_len | 50000000free_space | 10000000With 500,000 dead tuples, I ran a manual vacuum to clear the backlog:
VACUUM VERBOSE orders;5.4 Monitoring After Changes
I set up regular monitoring:
SELECT relname, n_dead_tup, n_live_tup, last_autovacuum, autovacuum_countFROM pg_stat_user_tablesWHERE n_dead_tup > 10000ORDER BY n_dead_tup DESC;With log_autovacuum_min_duration = 0, I also check PostgreSQL logs to see autovacuum activity patterns.
6. When Autovacuum Still Can’t Keep Up
For extremely busy tables, even aggressive tuning wasn’t enough. I considered partitioning:
- Each partition has its own autovacuum worker- Smaller tables = faster vacuum cycles- Can drop old partitions instead of vacuuming themFor time-series data, I partition by month. When the retention policy requires deletion, I detach and drop the partition - no vacuum needed at all.
7. Common Mistakes I Avoided
Mistake 1: Disabling Autovacuum
I considered autovacuum = off to “stop the performance problem.” This would have led to:
- Accumulating table bloat
- Slowing queries progressively
- Eventually hitting transaction ID wraparound (database shutdown for emergency vacuum)
The right approach was tuning, not disabling.
Mistake 2: Setting Scale Factor Too Low Globally
Setting autovacuum_vacuum_scale_factor = 0.01 globally would cause constant vacuuming on small tables. Per-table settings are the right approach for large, frequently-updated tables.
Mistake 3: Ignoring the Logs
Without log_autovacuum_min_duration = 0, I couldn’t see which tables autovacuum was visiting or how long it took. Visibility is essential for tuning.
8. Summary
PostgreSQL autovacuum is essential for table health - it prevents bloat and transaction wraparound. I fixed my performance issues by:
- Raising
autovacuum_vacuum_cost_limitto 3000 for faster vacuuming - Increasing
autovacuum_max_workersto 5 for parallel maintenance - Setting
log_autovacuum_min_duration = 0for monitoring - Applying per-table thresholds for large, busy tables
For tables where even aggressive tuning couldn’t keep up, partitioning distributes the vacuum workload across smaller tables.
Final Words + More Resources
My intention with this article was to help others share my knowledge and experience. If you want to contact me, you can contact by email: Email me
Here are also the most important links from this article along with some further resources that will help you in this scope:
- 👨💻 EnterpriseDB Autovacuum Tuning Guide
- 👨💻 PostgreSQL Autovacuum Documentation
- 👨💻 PostgreSQL Transaction ID Wraparound
- 👨💻 pgstattuple Extension
Oh, and if you found these resources useful, don’t forget to support me by starring the repo on GitHub!
Comments