Skip to content

How to Configure PostgreSQL Autovacuum to Prevent Table Bloat

1. The Problem: My Queries Were Getting Slower

I noticed my PostgreSQL queries were slowing down over time. Tables that once returned results in milliseconds now took seconds. The database storage kept growing even though I was deleting old records regularly.

I checked my tables and found they were bloated - containing many “dead tuples” that weren’t being cleaned up. The culprit? Autovacuum wasn’t keeping up with my workload.

2. Understanding What Autovacuum Does

When you UPDATE or DELETE rows in PostgreSQL, the old row versions aren’t immediately removed. They become “dead tuples” that:

  1. Waste disk space - Tables grow larger than necessary (“bloat”)
  2. Slow scans - Sequential and index scans must skip dead tuples
  3. Risk transaction ID wraparound - Can eventually force database shutdown

Autovacuum runs VACUUM and ANALYZE automatically to reclaim dead tuples and update statistics. I initially thought about disabling it because it was “causing performance issues.” That was the wrong approach.

3. The Investigation

3.1 Checking Dead Tuple Ratio

First, I checked which tables had the worst bloat:

check dead tuple ratio
SELECT relname, n_dead_tup, n_live_tup,
n_dead_tup::float / NULLIF(n_live_tup, 0) AS dead_ratio
FROM pg_stat_user_tables
WHERE n_live_tup > 0
ORDER BY dead_ratio DESC;

My busiest table showed a dead_ratio of 0.5 - half the tuples were dead. That explained the slow scans.

3.2 Checking Current Autovacuum Settings

I examined my current configuration:

view autovacuum settings
SHOW autovacuum_vacuum_cost_limit;
SHOW autovacuum_max_workers;
SHOW autovacuum_naptime;
SHOW log_autovacuum_min_duration;

The results:

default settings output
autovacuum_vacuum_cost_limit = -1
autovacuum_max_workers = 3
autovacuum_naptime = 1min
log_autovacuum_min_duration = -1

The -1 for autovacuum_vacuum_cost_limit means it inherits from vacuum_cost_limit, which defaults to 200. This is far too conservative for modern SSDs.

3.3 Checking Autovacuum Activity

I wanted to see when autovacuum last ran:

monitor autovacuum activity
SELECT relname,
last_autovacuum,
autovacuum_count,
n_dead_tup
FROM pg_stat_user_tables
WHERE autovacuum_count > 0
ORDER BY last_autovacuum DESC;

My large tables showed last_autovacuum dates from weeks ago. Autovacuum wasn’t triggering frequently enough.

4. Why Autovacuum Wasn’t Keeping Up

4.1 The Cost Limit Problem

PostgreSQL throttles autovacuum using a “cost budget” mechanism. Each read/write operation depletes this budget. When exhausted, autovacuum sleeps before continuing.

With autovacuum_vacuum_cost_limit = 200 (via the default vacuum_cost_limit), autovacuum was being throttled too aggressively. On modern SSDs, this conservative limit causes unnecessarily slow vacuuming.

4.2 The Threshold Formula Problem

Autovacuum triggers based on this formula:

vacuum trigger formula
vacuum_threshold = autovacuum_vacuum_threshold +
(autovacuum_vacuum_scale_factor * table_size)

For my 1 billion row table with default autovacuum_vacuum_scale_factor = 0.2, autovacuum wouldn’t start until 200 million rows were modified. That’s way too much accumulated bloat before cleanup starts.

4.3 The Worker Limit Problem

With only 3 workers, only 3 tables could be vacuumed simultaneously. My database had dozens of active tables - many had to wait in queue.

5. The Solution

5.1 Global Configuration Changes

I updated postgresql.conf with these settings:

recommended postgresql.conf settings
autovacuum_vacuum_cost_limit = 3000
autovacuum_max_workers = 5
log_autovacuum_min_duration = 0

Why these values:

  • autovacuum_vacuum_cost_limit = 3000: Allows much faster vacuuming on SSD storage
  • autovacuum_max_workers = 5: More parallel vacuum workers for databases with many active tables
  • log_autovacuum_min_duration = 0: Logs all autovacuum activity for monitoring

After applying and restarting PostgreSQL:

verify new settings
ALTER SYSTEM SET autovacuum_vacuum_cost_limit = 3000;
ALTER SYSTEM SET autovacuum_max_workers = 5;
ALTER SYSTEM SET log_autovacuum_min_duration = 0;
SELECT pg_reload_conf();

5.2 Per-Table Tuning for Large Tables

For my busiest tables, I set custom thresholds:

per-table autovacuum settings
-- For a table with frequent updates
ALTER TABLE orders SET (
autovacuum_vacuum_scale_factor = 0.05,
autovacuum_analyze_scale_factor = 0.02
);
-- For a very large table needing aggressive vacuuming
ALTER TABLE audit_log SET (
autovacuum_vacuum_scale_factor = 0.01,
autovacuum_vacuum_threshold = 1000,
autovacuum_vacuum_cost_delay = 2
);

With autovacuum_vacuum_scale_factor = 0.01, vacuuming triggers after just 1% of rows change - much more responsive for large tables.

5.3 Measuring Bloat with pgstattuple

I installed the pgstattuple extension to measure actual bloat:

measure table bloat
CREATE EXTENSION pgstattuple;
SELECT * FROM pgstattuple('orders');

The output showed:

pgstattuple output
table_len | 1073741824
tuple_count | 1000000
tuple_len | 100000000
dead_tuple_count | 500000
dead_tuple_len | 50000000
free_space | 10000000

With 500,000 dead tuples, I ran a manual vacuum to clear the backlog:

manual vacuum
VACUUM VERBOSE orders;

5.4 Monitoring After Changes

I set up regular monitoring:

ongoing monitoring query
SELECT relname,
n_dead_tup,
n_live_tup,
last_autovacuum,
autovacuum_count
FROM pg_stat_user_tables
WHERE n_dead_tup > 10000
ORDER BY n_dead_tup DESC;

With log_autovacuum_min_duration = 0, I also check PostgreSQL logs to see autovacuum activity patterns.

6. When Autovacuum Still Can’t Keep Up

For extremely busy tables, even aggressive tuning wasn’t enough. I considered partitioning:

partitioning benefits for autovacuum
- Each partition has its own autovacuum worker
- Smaller tables = faster vacuum cycles
- Can drop old partitions instead of vacuuming them

For time-series data, I partition by month. When the retention policy requires deletion, I detach and drop the partition - no vacuum needed at all.

7. Common Mistakes I Avoided

Mistake 1: Disabling Autovacuum

I considered autovacuum = off to “stop the performance problem.” This would have led to:

  • Accumulating table bloat
  • Slowing queries progressively
  • Eventually hitting transaction ID wraparound (database shutdown for emergency vacuum)

The right approach was tuning, not disabling.

Mistake 2: Setting Scale Factor Too Low Globally

Setting autovacuum_vacuum_scale_factor = 0.01 globally would cause constant vacuuming on small tables. Per-table settings are the right approach for large, frequently-updated tables.

Mistake 3: Ignoring the Logs

Without log_autovacuum_min_duration = 0, I couldn’t see which tables autovacuum was visiting or how long it took. Visibility is essential for tuning.

8. Summary

PostgreSQL autovacuum is essential for table health - it prevents bloat and transaction wraparound. I fixed my performance issues by:

  1. Raising autovacuum_vacuum_cost_limit to 3000 for faster vacuuming
  2. Increasing autovacuum_max_workers to 5 for parallel maintenance
  3. Setting log_autovacuum_min_duration = 0 for monitoring
  4. Applying per-table thresholds for large, busy tables

For tables where even aggressive tuning couldn’t keep up, partitioning distributes the vacuum workload across smaller tables.

Final Words + More Resources

My intention with this article was to help others share my knowledge and experience. If you want to contact me, you can contact by email: Email me

Here are also the most important links from this article along with some further resources that will help you in this scope:

Oh, and if you found these resources useful, don’t forget to support me by starring the repo on GitHub!

Comments