Skip to content

PostgreSQL Index Types: B-tree vs BRIN vs GIN - When to Use Each

The Problem: Wrong Index Type = Wasted Resources

I’ve seen this happen too many times. A developer adds an index because “queries are slow,” but months later, the query is still slow, writes have degraded, and the index is taking up gigabytes of space.

Adding indexes without understanding types leads to:

  • Huge index sizes that slow writes
  • Indexes planner won’t use
  • Maintenance overhead for rarely queried data
  • Still-slow queries despite having indexes

The right index type matters as much as having an index at all.

Index Types Decision Matrix

Here’s my quick reference for choosing the right index type:

Index TypeBest ForData StructureSort SupportSizeWrite Cost
B-treeRange queries, equality, sorted resultsBalanced treeYesMediumMedium
BRINTime series, append-only, correlated dataBlock range summaryPartialTiny (~1% of table)Very low
GINJSONB, arrays, full-text, multiple values per rowInverted indexNoLargeHigh
GiSTGeospatial, ranges, custom operatorsR-tree variantPartialLargeHigh
HashOnly equality checksHash tableNoSmallLow
BloomMulti-column ANY combination queriesBloom filterNoSmallMedium

Let me walk through when to use each one with real examples.

B-tree: The Default Choice

B-tree is PostgreSQL’s default index type, and for good reason. It handles most common query patterns well.

When to use:

  • WHERE clause with =, <, >, BETWEEN, IN
  • ORDER BY requirements
  • Multi-column combinations with clear selectivity order

Here’s a composite index I created for a production orders table:

Composite B-tree index with covering columns
-- High selectivity first, avoid sort
CREATE INDEX idx_orders_comp
ON orders (status, category_id, order_date DESC)
INCLUDE (customer_id, total_amount); -- Covering columns

The key insight: column order matters. I check selectivity before deciding the order:

Query pg_stats for selectivity
SELECT n_distinct, correlation
FROM pg_stats
WHERE tablename = 'orders' AND attname IN ('status','category_id');

Results from my production database:

Columnn_distinctCorrelationSelectivity
status50.0120% (low)
category_id2000.850.5% (high)
order_date18250.980.05% (very high)

Order decision: Put high-selectivity columns first to filter more rows early. But I also consider query patterns—if I always filter by status first, that column should be early even with lower selectivity.

BRIN: Massive Time Series Data

BRIN (Block Range INdex) was a game-changer for my time series workloads. I had a 200M row orders table where traditional B-tree indexes were consuming 60GB of space.

When to use:

  • Ordered/append-only data (logs, timestamps, sequences)
  • Table size > 100GB
  • Range queries on correlated column

Key advantage: Index size ~1% of table size

BRIN index for time series
CREATE INDEX idx_orders_date_brin
ON orders USING BRIN (order_date)
WITH (pages_per_range = 64, autosummarize = on);

Results from my production case:

  • 200M row orders table: Index size 1.2MB (vs 60GB table)
  • Query time: 1250ms → 245ms (80% reduction)

BRIN limitation: Not for random-access queries; only efficient when data is physically ordered (correlated) with the index column. If your data is randomly inserted, BRIN won’t help.

GIN: JSONB, Arrays, Full-Text

GIN (Generalized Inverted iNdex) shines when you have multiple values per row.

When to use:

  • JSONB key/value queries: WHERE data @&gt; '{"key": "value"}'
  • Array containment: WHERE tags @&gt; ARRAY['tag1','tag2']
  • Full-text search: WHERE text_vector @@ to_tsquery('term')

Here are three GIN index patterns I use regularly:

GIN indexes for different data types
-- JSONB index
CREATE INDEX idx_products_attrs
ON products USING GIN (attributes jsonb_path_ops);
-- Array index
CREATE INDEX idx_items_tags
ON items USING GIN (tags);
-- Full-text
CREATE INDEX idx_articles_search
ON articles USING GIN (to_tsvector('english', content));

GIN cost: Higher write overhead; each INSERT updates multiple index entries. For bulk loads, I use fastupdate option or create the index after loading data.

GiST: Geospatial and Ranges

GiST (Generalized Search Tree) handles data types where equality isn’t the primary operation.

When to use:

  • PostGIS geometry queries
  • Range types: WHERE range && '[a,b]'
  • Custom operator classes
GiST spatial and range indexes
-- Spatial index
CREATE INDEX idx_locations_geo
ON locations USING GiST (coordinates);
-- Range overlap search
CREATE INDEX idx_events_period
ON events USING GiST (time_range);

GiST is flexible but requires understanding your query operators. The && (overlaps) operator is the most common use case I see.

Partial and Covering Indexes

Two techniques I use to make indexes more efficient:

Partial and covering indexes
-- Only index active records
CREATE INDEX idx_orders_active
ON orders (customer_id, order_date)
WHERE status != 'cancelled';
-- Cover columns to avoid table lookup
CREATE INDEX idx_orders_covering
ON orders (status, order_date)
INCLUDE (customer_id, total_amount);

Partial indexes reduce index size by only indexing relevant rows. I use them when queries have a consistent WHERE clause filter.

Covering indexes include columns needed by SELECT to avoid heap lookups. The INCLUDE clause in PostgreSQL 11+ is perfect for this.

Index Selection Validation

After creating any index, I always validate it’s being used:

Check index usage
EXPLAIN (ANALYZE) SELECT ... WHERE your_condition;
-- Look for "Index Scan" vs "Seq Scan"

And I regularly check for unused indexes:

Find unused indexes
SELECT relname, indexrelname, idx_scan
FROM pg_stat_all_indexes
WHERE schemaname NOT LIKE 'pg_%'
ORDER BY idx_scan ASC;

Indexes with idx_scan = 0 or very low usage = candidates for removal. I’ve cleaned up dozens of unused indexes this way.

Before and After Index Strategy

Let me show you a real example. I had a 3450ms query on 200M rows:

Problem query
-- Problem: 3450ms query on 200M rows
SELECT * FROM orders
WHERE status = 'shipped'
AND order_date >= '2023-06-01'
AND category_id = 42
ORDER BY order_date DESC
LIMIT 100;

Solution: Composite covering index

Solution index
CREATE INDEX idx_orders_status_category_date
ON orders (status, category_id, order_date DESC)
INCLUDE (customer_id, total_amount)
WHERE status != 'cancelled';
-- Result: 45ms (77x faster), index scan only 102 rows

The key changes:

  1. Multi-column index matching query filter order
  2. DESC on order_date to avoid sort
  3. INCLUDE columns to avoid heap lookup
  4. Partial index to exclude cancelled orders

I also applied BRIN for massive time series queries:

BRIN case study
-- Problem: 1250ms range query on 200M rows
SELECT * FROM orders
WHERE order_date BETWEEN '2023-01-01' AND '2023-01-31';
-- Solution: BRIN index (1.2MB size)
CREATE INDEX idx_orders_date_brin
ON orders USING BRIN (order_date)
WITH (pages_per_range = 64);
-- Result: 245ms (5x faster, tiny index)

Summary

In this post, I showed how to match PostgreSQL index types to query patterns: B-tree for general range/equality, BRIN for ordered large datasets, GIN for JSONB/arrays/full-text, GiST for spatial/ranges. The key point is to always validate with EXPLAIN and check pg_stat_all_indexes for actual usage.

Final Words + More Resources

My intention with this article was to help others share my knowledge and experience. If you want to contact me, you can contact by email: Email me

Here are also the most important links from this article along with some further resources that will help you in this scope:

Oh, and if you found these resources useful, don’t forget to support me by starring the repo on GitHub!

Comments