PostgreSQL Index Types: B-tree vs BRIN vs GIN - When to Use Each
The Problem: Wrong Index Type = Wasted Resources
I’ve seen this happen too many times. A developer adds an index because “queries are slow,” but months later, the query is still slow, writes have degraded, and the index is taking up gigabytes of space.
Adding indexes without understanding types leads to:
- Huge index sizes that slow writes
- Indexes planner won’t use
- Maintenance overhead for rarely queried data
- Still-slow queries despite having indexes
The right index type matters as much as having an index at all.
Index Types Decision Matrix
Here’s my quick reference for choosing the right index type:
| Index Type | Best For | Data Structure | Sort Support | Size | Write Cost |
|---|---|---|---|---|---|
| B-tree | Range queries, equality, sorted results | Balanced tree | Yes | Medium | Medium |
| BRIN | Time series, append-only, correlated data | Block range summary | Partial | Tiny (~1% of table) | Very low |
| GIN | JSONB, arrays, full-text, multiple values per row | Inverted index | No | Large | High |
| GiST | Geospatial, ranges, custom operators | R-tree variant | Partial | Large | High |
| Hash | Only equality checks | Hash table | No | Small | Low |
| Bloom | Multi-column ANY combination queries | Bloom filter | No | Small | Medium |
Let me walk through when to use each one with real examples.
B-tree: The Default Choice
B-tree is PostgreSQL’s default index type, and for good reason. It handles most common query patterns well.
When to use:
- WHERE clause with =, <, >, BETWEEN, IN
- ORDER BY requirements
- Multi-column combinations with clear selectivity order
Here’s a composite index I created for a production orders table:
-- High selectivity first, avoid sortCREATE INDEX idx_orders_compON orders (status, category_id, order_date DESC)INCLUDE (customer_id, total_amount); -- Covering columnsThe key insight: column order matters. I check selectivity before deciding the order:
SELECT n_distinct, correlationFROM pg_statsWHERE tablename = 'orders' AND attname IN ('status','category_id');Results from my production database:
| Column | n_distinct | Correlation | Selectivity |
|---|---|---|---|
| status | 5 | 0.01 | 20% (low) |
| category_id | 200 | 0.85 | 0.5% (high) |
| order_date | 1825 | 0.98 | 0.05% (very high) |
Order decision: Put high-selectivity columns first to filter more rows early. But I also consider query patterns—if I always filter by status first, that column should be early even with lower selectivity.
BRIN: Massive Time Series Data
BRIN (Block Range INdex) was a game-changer for my time series workloads. I had a 200M row orders table where traditional B-tree indexes were consuming 60GB of space.
When to use:
- Ordered/append-only data (logs, timestamps, sequences)
- Table size > 100GB
- Range queries on correlated column
Key advantage: Index size ~1% of table size
CREATE INDEX idx_orders_date_brinON orders USING BRIN (order_date)WITH (pages_per_range = 64, autosummarize = on);Results from my production case:
- 200M row orders table: Index size 1.2MB (vs 60GB table)
- Query time: 1250ms → 245ms (80% reduction)
BRIN limitation: Not for random-access queries; only efficient when data is physically ordered (correlated) with the index column. If your data is randomly inserted, BRIN won’t help.
GIN: JSONB, Arrays, Full-Text
GIN (Generalized Inverted iNdex) shines when you have multiple values per row.
When to use:
- JSONB key/value queries:
WHERE data @> '{"key": "value"}' - Array containment:
WHERE tags @> ARRAY['tag1','tag2'] - Full-text search:
WHERE text_vector @@ to_tsquery('term')
Here are three GIN index patterns I use regularly:
-- JSONB indexCREATE INDEX idx_products_attrsON products USING GIN (attributes jsonb_path_ops);
-- Array indexCREATE INDEX idx_items_tagsON items USING GIN (tags);
-- Full-textCREATE INDEX idx_articles_searchON articles USING GIN (to_tsvector('english', content));GIN cost: Higher write overhead; each INSERT updates multiple index entries. For bulk loads, I use fastupdate option or create the index after loading data.
GiST: Geospatial and Ranges
GiST (Generalized Search Tree) handles data types where equality isn’t the primary operation.
When to use:
- PostGIS geometry queries
- Range types:
WHERE range && '[a,b]' - Custom operator classes
-- Spatial indexCREATE INDEX idx_locations_geoON locations USING GiST (coordinates);
-- Range overlap searchCREATE INDEX idx_events_periodON events USING GiST (time_range);GiST is flexible but requires understanding your query operators. The && (overlaps) operator is the most common use case I see.
Partial and Covering Indexes
Two techniques I use to make indexes more efficient:
-- Only index active recordsCREATE INDEX idx_orders_activeON orders (customer_id, order_date)WHERE status != 'cancelled';
-- Cover columns to avoid table lookupCREATE INDEX idx_orders_coveringON orders (status, order_date)INCLUDE (customer_id, total_amount);Partial indexes reduce index size by only indexing relevant rows. I use them when queries have a consistent WHERE clause filter.
Covering indexes include columns needed by SELECT to avoid heap lookups. The INCLUDE clause in PostgreSQL 11+ is perfect for this.
Index Selection Validation
After creating any index, I always validate it’s being used:
EXPLAIN (ANALYZE) SELECT ... WHERE your_condition;-- Look for "Index Scan" vs "Seq Scan"And I regularly check for unused indexes:
SELECT relname, indexrelname, idx_scanFROM pg_stat_all_indexesWHERE schemaname NOT LIKE 'pg_%'ORDER BY idx_scan ASC;Indexes with idx_scan = 0 or very low usage = candidates for removal. I’ve cleaned up dozens of unused indexes this way.
Before and After Index Strategy
Let me show you a real example. I had a 3450ms query on 200M rows:
-- Problem: 3450ms query on 200M rowsSELECT * FROM ordersWHERE status = 'shipped' AND order_date >= '2023-06-01' AND category_id = 42ORDER BY order_date DESCLIMIT 100;Solution: Composite covering index
CREATE INDEX idx_orders_status_category_dateON orders (status, category_id, order_date DESC)INCLUDE (customer_id, total_amount)WHERE status != 'cancelled';
-- Result: 45ms (77x faster), index scan only 102 rowsThe key changes:
- Multi-column index matching query filter order
- DESC on order_date to avoid sort
- INCLUDE columns to avoid heap lookup
- Partial index to exclude cancelled orders
I also applied BRIN for massive time series queries:
-- Problem: 1250ms range query on 200M rowsSELECT * FROM ordersWHERE order_date BETWEEN '2023-01-01' AND '2023-01-31';
-- Solution: BRIN index (1.2MB size)CREATE INDEX idx_orders_date_brinON orders USING BRIN (order_date)WITH (pages_per_range = 64);
-- Result: 245ms (5x faster, tiny index)Summary
In this post, I showed how to match PostgreSQL index types to query patterns: B-tree for general range/equality, BRIN for ordered large datasets, GIN for JSONB/arrays/full-text, GiST for spatial/ranges. The key point is to always validate with EXPLAIN and check pg_stat_all_indexes for actual usage.
Final Words + More Resources
My intention with this article was to help others share my knowledge and experience. If you want to contact me, you can contact by email: Email me
Here are also the most important links from this article along with some further resources that will help you in this scope:
Oh, and if you found these resources useful, don’t forget to support me by starring the repo on GitHub!
Comments