Postgres database indexing is the single highest-leverage knob in a read-heavy application, and it is also the one most teams get subtly wrong. PostgreSQL ships six distinct index types, each tuned for a different access pattern, and choosing the wrong one is a silent throughput killer most developers never diagnose — the query still returns, just slower, and no code review catches it.
The stakes rise as your tables grow. An index that was invisible on a ten-thousand-row table becomes the difference between a 1 ms lookup and a sequential scan once you cross a few million rows. And the fix is rarely "add another index" — over-indexing taxes every write, and a single mis-specified covering index can quietly halve your update throughput.
This reference covers the six index types and when each one fits, why the "most selective column first" rule for composite indexes is a documented myth, the conditions under which BRIN beats B-tree by three orders of magnitude (and when it is useless), the real cost of over-indexing, and how to use EXPLAIN ANALYZE as the final arbiter on every decision. Every claim is sourced to primary documentation or a named benchmark, with vendor-specific figures attributed as such.
- 01Match the index type to the access pattern, not the data.B-tree handles equality and range on any sortable type. Hash is exact-match only. GiST suits geometric and nearest-neighbor queries; GIN suits arrays, JSONB, and full-text; BRIN suits append-only data physically correlated with a column.
- 02The 'most selective column first' rule is a myth.Per Markus Winand (Use The Index, Luke), composite index column order should maximize how often the index is usable — driven by query frequency and the left-prefix rule, not raw selectivity. Equality predicates should precede range predicates.
- 03BRIN wins only under strong physical correlation.On a correlated timestamp column, Crunchy Data measured a 24 KB BRIN versus a ~21 MB B-tree on the same 42 MB table. On randomly-ordered data BRIN is effectively useless — point-lookup latency runs hundreds of milliseconds versus single-digit for B-tree.
- 04Over-indexing is as damaging as under-indexing.Percona benchmarked scaling from 7 to 39 indexes on one schema: write TPS fell roughly 57% (about 1,400 to 600) with cache hit ratio above 99.7%, isolating index overhead as the cause. Each index also has to be maintained on every write.
- 05EXPLAIN ANALYZE is the final arbiter.It executes the query and returns planner estimates beside real measured timings. Large estimate-versus-actual gaps signal stale statistics — run ANALYZE. Remember cost units are arbitrary, not milliseconds; read 'actual time' for real latency.
01 — The Six TypesSix index types, each tuned for a different access pattern.
PostgreSQL exposes six index access methods, and the first job of any read-heavy schema is matching each frequently-run query to the right one. The defaults are forgiving — almost every primary key and most secondary indexes are B-tree — but the specialized types exist precisely because B-tree degrades or fails entirely on the access patterns they were built for.
B-tree — the default workhorse
B-tree is the Postgres default and the right answer for the vast majority of indexes. It handles equality and range queries on any sortable data type, supporting <, <=, =, >=, >, BETWEEN, IN, IS NULL / IS NOT NULL, and even anchored LIKE 'foo%' prefix patterns. Because it keeps entries ordered, it can also satisfy ORDER BY without a separate sort.
Hash, GiST, GIN, BRIN, SP-GiST — the specialists
Hash stores a 32-bit hash code derived from the indexed value and supports only = equality — no ranges, no sorting. Reach for it only when every query on a column is an exact match. GiST (Generalized Search Tree) is not a single index but an infrastructure within which many strategies are implemented; it is the correct choice for geometric and spatial data, range types, and nearest-neighbor queries like ORDER BY location <-> point '(101,456)'.
GIN (Generalized Inverted Index) is built for columns that hold multiple component values — arrays, JSONB, and tsvector. It creates a separate index entry per component and is the preferred type for full-text search. BRIN (Block Range Index) stores only min/max summaries per block range and shines on append-only data physically correlated with the indexed column. SP-GiST rounds out the set for non-balanced, space-partitioned structures such as quadtrees and radix trees.
B-tree
The right default for almost every index. Supports =, ranges, BETWEEN, IN, anchored LIKE prefixes, and ordered ORDER BY scans. Also the only common type that can satisfy sorting for free.
GIN
Inverted index with one entry per component value. The preferred type for full-text search and JSONB containment. Powerful, but write-side behavior needs care (see fastupdate, below).
BRIN
Tiny on disk when column values track physical row order — an auto-increment ID or created_at on an append-only table. Beats B-tree only under that correlation; otherwise effectively useless.
02 — Decision MatrixA single scannable index-type decision matrix.
Most indexing guides describe each type in prose and leave you to assemble the trade-offs yourself. The matrix below co-locates the primary use case, supported operators, approximate size relative to B-tree, write overhead, and — the column most posts omit — when to avoideach type. Operator support is from the PostgreSQL documentation; the BRIN size ratio is from Crunchy Data's controlled test on correlated data; write-overhead notes draw on pganalyze and vendor analysis.
B-treeHashGiSTGINBRINSP-GiST| Index type | Best for · operators | Size & write notes |
|---|---|---|
B-tree | Equality + range on sortable types. =, <, <=, >=, >, BETWEEN, IN, IS NULL, anchored LIKE. Satisfies ORDER BY. | Baseline size and write cost — the reference everything else is measured against. Avoid only when a specialist clearly fits (multi-value, spatial, append-only). |
Hash | Equality only (=). 32-bit hash code; no ranges, no sorting. | Comparable footprint to B-tree, narrow benefit. Avoid unless every query on the column is an exact-match lookup; a B-tree usually serves the same need plus more. |
GiST | Geometric/spatial data, range types, nearest-neighbor (ORDER BY <-> ). Infrastructure, not one algorithm. Supports INCLUDE. | Sized per operator class. Avoid for plain scalar equality/range work — that is B-tree's job. Use when the data is genuinely spatial or KNN. |
GIN | Arrays, JSONB, tsvector. One entry per component value. Preferred for full-text search and JSONB containment (@>). | Larger than B-tree on rich documents. fastupdate defers writes to a pending list that flushes in bursts — plan for spiky write latency on heavily-written tables. |
BRIN | Range scans on data physically correlated with the column (append-only created_at, auto-increment ID). Min/max per block range. | Orders of magnitude smaller than B-tree under strong correlation (Crunchy Data: 24 KB vs ~21 MB). Avoid for point lookups or randomly-ordered columns — effectively useless there. |
SP-GiST | Space-partitioned, non-balanced structures: quadtrees, k-d trees, radix trees, some text/IP types. Supports INCLUDE. | Niche. Reach for it only when your data maps naturally onto a partitioning structure GiST handles poorly. Most apps never need it. |
03 — Column OrderThe "most selective column first" rule is a myth.
Almost every indexing tutorial repeats the same advice: put the most selective column first in a composite index. Markus Winand, author of SQL Performance Explained and the reference site Use The Index, Luke, explicitly catalogs this as a myth. What actually governs column order is how often the index can be used — a function of your query patterns and the left-prefix rule, not raw selectivity.
The left-prefix rule is the mechanism. An index on (col_a, col_b) can accelerate queries filtering on col_a alone, or on (col_a, col_b) together — but it cannot efficiently serve a query that filters only on col_b. Put the column your queries always filter on first, even if it is less selective, so the index stays usable across the widest set of queries.
The most important consideration when defining a concatenated index is how to choose the column order so it can be used as often as possible.— Markus Winand, Use The Index, Luke
There is a second rule that often matters more than selectivity: equality predicates should precede range predicates. A range condition on an intermediate column ends the index traversal — every column to its right becomes a filter predicate rather than an access predicate, which means Postgres fetches and discards rows instead of skipping straight to them. Order your composite index so all the equality columns come first and the single range column comes last.
The nuance worth preserving: selectivity is not irrelevant. Winand is clear that for independent range conditions, selectivity does help. The correction is not "selectivity never matters" — it is "usage frequency and the left-prefix and equality-before-range rules dominate, and selectivity only breaks ties." Over-indexing on selectivity alone is how teams build composite indexes their actual queries can never use.
Filter on a value, range on a date
WHERE status = 'active' AND created_at > $1. Put the equality column first, the range column last: (status, created_at). The range ends traversal, so nothing useful can sit to its right.
Filter then ORDER BY
WHERE customer_id = $1 ORDER BY created_at DESC. Index (customer_id, created_at) lets Postgres both filter and return rows pre-sorted, skipping a sort node entirely.
Many queries, one common filter
If nearly every query filters on tenant_id, lead with it even when it is low-selectivity. The left-prefix rule means one index then serves the whole family of queries.
No leading equality
If queries filter only on a range column with nothing to its left, a composite index buys little. Index that column alone, and let selectivity guide whether it earns its keep at all.
04 — Covering & PartialCovering and partial indexes — and the HOT-update trap.
Two of the most powerful read-side tools are covering indexes and partial indexes. A covering index adds non-key payload columns via the INCLUDE clause — CREATE INDEX tab_x_y ON tab(x) INCLUDE (y) — so the planner can satisfy a query entirely from the index without touching the heap. That is an index-only scan, and on the right query it is dramatically faster. Only B-tree, GiST, and SP-GiST support INCLUDE, and the feature arrived in PostgreSQL 11.
A partial index is built over a subset of the table defined by a WHERE predicate — CREATE INDEX ON orders(customer_id) WHERE status = 'active'. It avoids indexing common values the planner would ignore anyway, shrinks the index, and speeds both reads and the writes that don't touch the predicate. For a table where most rows are inactive but queries only ever hit active ones, a partial index can be a fraction of the size of the full one.
There is also a documentation caveat worth internalizing before you reach for covering indexes everywhere. Index-only scans depend on the visibility map: Postgres must consult the heap for rows on pages not yet marked all-visible by VACUUM. In practice, covering indexes pay off most on slowly-changing tables with frequent VACUUM runs — exactly the opposite of the high-churn tables where the HOT-update trap bites hardest.
There is little point in including payload columns in an index unless the table changes slowly enough that an index-only scan is likely to not need to access the heap.— PostgreSQL official documentation
One more historical nuance for anyone reading older benchmarks: before PostgreSQL 13, covering indexes were larger than the equivalent multicolumn index because B-tree deduplication did not apply to INCLUDE columns — in a 2021 postgres.ai test, 30 MiB versus 21 MiB. Deduplication in PG 13 narrowed that gap, so don't carry the old size penalty into version-13-and-later planning. If you are wiring up indexes through an ORM, our guide to running Prisma ORM in production covers how generated queries map to the indexes they actually need.
vs 21 MiB multicolumn
Before PG 13, INCLUDE columns were not deduplicated, so covering indexes ran larger than the equivalent multicolumn index. Deduplication in PG 13 closed much of the gap. (2021 postgres.ai benchmark.)
single-col index on the changing field
When the frequently-updated column was only in a plain single-column index, ~97% of updates stayed HOT. Move that same column into a covering index's INCLUDE and the HOT rate collapsed to ~0%. (2021 postgres.ai.)
index the subset you query
A predicate like WHERE status = 'active' skips indexing the common dead values, shrinking the index and speeding reads plus the writes that don't touch the predicate. Just keep volatile columns out of the predicate.
05 — BRINWhen BRIN beats B-tree by 1,000× — and when it's useless.
BRIN is the most misunderstood index type, and the misunderstanding cuts both ways — teams either ignore it entirely or deploy it on columns where it does nothing. The headline is real but conditional: in a controlled Crunchy Data test, a 42 MB table produced roughly 21 MB of B-tree index but only 24 KB of BRIN index on the same timestamp column — roughly a 1,000-to-1 size advantage. That is not magic; it is the direct result of BRIN storing only min/max summaries per block range instead of an entry per row.
The condition is everything. That advantage only materializes when the indexed column is strongly correlated with the physical row order of the table — an auto-incrementing ID, or a created_attimestamp on an append-only table where new rows land at the end. Break that correlation — update rows randomly, or index a high-cardinality column with no physical ordering — and BRIN stops helping. Crunchy Data's benchmark is blunt about it: on the small result sets that point lookups produce, BRIN random-access latency ran 207–211 msagainst B-tree's 0.6–5 ms. BRIN only pulls ahead on very large result sets (100K+ rows) over sequentially-inserted data.
That makes BRIN a natural fit for the write-heavy, append-only tables at the edges of an event-driven system — the same tables that show up in our reference on webhook reliability and event-driven architecture. A webhook-event or audit log inserted in time order, queried by date range and almost never point-looked-up, is close to the ideal BRIN workload: tiny index, fast range scans, negligible write overhead.
06 — The Write TaxOver-indexing taxes every write.
Reads get all the attention, so the failure mode that actually bites most read-heavy apps is the opposite of what people expect: too many indexes. Every index has to be maintained on every insert, update, and delete that touches its columns. Add enough of them and the write path slows measurably even when reads look fine.
The clearest independent evidence is a Percona benchmark. Scaling a schema from 7 indexes to 39 indexes cut write throughput by roughly 57% — TPS fell from about 1,400 to about 600, and transaction latency rose from 11 ms to 26 ms. Crucially, the PostgreSQL cache hit ratio stayed above 99.7% throughout, which isolates index maintenance — not I/O or cache misses — as the cause.
Write throughput collapses as index count climbs
Source: Percona — Benchmarking PostgreSQL: The Hidden Cost of Over-Indexing (2024)There is a second, harder-to-measure cost: write amplification. A single logical row write fans out into heap pages, every relevant index, and the write-ahead log. One vendor analysis (Tiger Data) puts steady-state Postgres write amplification in the order of a few times the raw payload, with each additional index adding incrementally to that ratio. Treat those specific multipliers as a single-vendor, order-of-magnitude illustration rather than a precise benchmark — but the direction is corroborated by the independent Percona TPS data above: indexes are not free, and the marginal cost compounds.
The operational discipline that follows is to prune, not just add. PostgreSQL tracks index usage in pg_stat_user_indexes; the idx_scan column counts how many times each index has been used. On a busy table with sufficient uptime, an index sitting at zero or near-zero scans is a strong candidate to drop. Auditing for unused indexes is one of the highest-ROI database chores most teams never schedule.
You do not eliminate write amplification in high throughput databases; you decide where it shows up.— GodOfGeeks, Write Amplification in Databases (dev.to)
fastupdate defers index writes to a pending list (4 MB by default). The list flushes in bursts — periodically or at query time — so write latency looks fine on average and then spikes. pganalyze, reporting on GitLab's production workload, documented GIN pending-list flushes ranging from a few hundred milliseconds to several seconds during peak hours on a heavily-written table. If you run GIN on a hot table, monitor for flush spikes rather than trusting average write latency.07 — The ArbiterEXPLAIN ANALYZE is the final arbiter.
Every decision above is a hypothesis until you measure it on your own data, and the measurement tool is EXPLAIN ANALYZE. Plain EXPLAINshows only the planner's estimates without running the query; EXPLAIN ANALYZE actually executes it and returns estimates and real measured results side by side. The output is a tree, and each node reports both (cost=startup..total rows=N width=bytes) and (actual time=Xms rows=N loops=N).
Two reading habits prevent most misinterpretation. First, cost units are arbitrary, not milliseconds. seq_page_cost is conventionally 1.0 and every other cost parameter is relative to it, so a cost of 5.04 says nothing about wall-clock time — read actual time for real latency. Second, watch the gap between estimated and actual rows. A large discrepancy means the planner is working from stale statistics; run ANALYZE on the table to refresh them. Postgres auto-analyzes by default, but it can lag on fast-growing tables, and a bad row estimate leads directly to a bad plan — a sequential scan where an index scan belonged, or the wrong join strategy.
EXPLAIN results should not be extrapolated to situations much different from the one you are actually testing; for example, results on a toy-sized table cannot be assumed to apply to large tables.— PostgreSQL official documentation
That warning is the single most important sentence in the EXPLAIN documentation for read-heavy work. The index that looks pointless on a development table of ten thousand rows is often the one that saves you in production at ten million — and vice versa. Benchmark against production-scale data, or at least production-shaped data, and re-check after meaningful growth. When you add an index to a live table, use CREATE INDEX CONCURRENTLYso the build doesn't hold a write lock; it takes longer and can't run inside a transaction block, but it keeps the table writable throughout.
Indexing is the first layer of a read-performance stack, not the whole of it. Once the right indexes are in place and EXPLAIN ANALYZE confirms index scans on your hot paths, the next lever is moving the hottest reads off the database entirely — see our guide to Redis caching strategies in production for the "index first, cache second" sequencing.
08 — ChecklistA read-heavy indexing checklist.
Pulling the reference together into the order you'd actually work through it on a real schema. Indexing is iterative — you measure, add or prune one thing, and measure again — so treat this as a loop, not a one-time setup.
- Start from the query, not the table. List the actual
WHERE,JOIN, andORDER BYclauses that run frequently. Index the access patterns you have, not the ones you imagine. - Default to B-tree. Only reach for GIN (arrays / JSONB / full-text), GiST (spatial / KNN), or BRIN (append-only, correlated) when the access pattern genuinely calls for it.
- Order composite indexes by usage and predicate type. Lead with the column queries always filter on, put equality columns before the single range column, and let selectivity break ties — not decide order.
- Use covering and partial indexes carefully. They are read wins, but keep frequently-updated columns out of INCLUDE clauses and partial-index predicates to preserve HOT updates.
- Count your indexes per table. Each one taxes every write. Audit
pg_stat_user_indexesforidx_scan = 0and drop what nothing uses. - Verify with EXPLAIN ANALYZE on production-scale data. Read
actual time, watch estimate-versus-actual row gaps, and runANALYZEwhen they diverge. Add indexes withCREATE INDEX CONCURRENTLY.
If your application sits on Supabase, the same principles apply directly to its managed Postgres — our complete guide to Supabase and Next.js production setup shows where index management fits into a full deployment, and teams that want this whole loop run for them can engage our web development team to audit and tune a production schema end to end.
09 — ConclusionIndex the queries you have, then measure.
Indexing is a measurement discipline, not a checklist of types.
The six index types are the easy part. The hard part — and the part that separates a fast read-heavy app from a slow one — is matching each type to a real access pattern, ordering composite indexes by how your queries actually run rather than by a selectivity myth, and resisting the urge to solve every slow query by adding yet another index.
The numbers in this reference point the same direction. BRIN's three-orders-of-magnitude size advantage is real, but only on physically-correlated data. A covering index is a read win that can silently halve your write throughput if it captures a volatile column. And Percona's benchmark — 7 indexes to 39 indexes, roughly 57% of your write throughput gone — is the clearest reminder that indexing is a trade-off you tune, not a feature you maximize.
So the discipline is the takeaway: start from the queries you have, default to B-tree, reach for specialists deliberately, prune unused indexes as aggressively as you add useful ones, and let EXPLAIN ANALYZE on production-scale data settle every disagreement. Do that on a loop and the read performance of a Postgres app stops being a mystery and becomes something you can engineer.