DevelopmentIndustry Guide12 min readPublished May 30, 2026

6 index types · the selectivity myth · over-indexing can cut write throughput by ~57%

Database Indexing for Read-Heavy Apps: A 2026 Reference

PostgreSQL ships six index types, and choosing the wrong one is a silent throughput killer most developers never diagnose. This is a reference for read-heavy apps: which index fits which query, why the "most selective column first" rule is a myth, when BRIN wins, and how over-indexing quietly taxes every write.

DA
Digital Applied Team
Senior engineers · Published May 30, 2026
PublishedMay 30, 2026
Read time12 min
Sources10 primary references
Postgres index types
6
B-tree, Hash, GiST, GIN, BRIN, SP-GiST
Write TPS drop, 7→39 indexes
~57%
Percona benchmark
1,400 → 600 TPS
BRIN vs B-tree size
24KB
vs ~21 MB B-tree (correlated data)
Crunchy Data
HOT-update throughput cost
~40%
broken HOT update (2021 benchmark)

Postgres database indexing is the single highest-leverage knob in a read-heavy application, and it is also the one most teams get subtly wrong. PostgreSQL ships six distinct index types, each tuned for a different access pattern, and choosing the wrong one is a silent throughput killer most developers never diagnose — the query still returns, just slower, and no code review catches it.

The stakes rise as your tables grow. An index that was invisible on a ten-thousand-row table becomes the difference between a 1 ms lookup and a sequential scan once you cross a few million rows. And the fix is rarely "add another index" — over-indexing taxes every write, and a single mis-specified covering index can quietly halve your update throughput.

This reference covers the six index types and when each one fits, why the "most selective column first" rule for composite indexes is a documented myth, the conditions under which BRIN beats B-tree by three orders of magnitude (and when it is useless), the real cost of over-indexing, and how to use EXPLAIN ANALYZE as the final arbiter on every decision. Every claim is sourced to primary documentation or a named benchmark, with vendor-specific figures attributed as such.

Key takeaways
  1. 01
    Match the index type to the access pattern, not the data.B-tree handles equality and range on any sortable type. Hash is exact-match only. GiST suits geometric and nearest-neighbor queries; GIN suits arrays, JSONB, and full-text; BRIN suits append-only data physically correlated with a column.
  2. 02
    The 'most selective column first' rule is a myth.Per Markus Winand (Use The Index, Luke), composite index column order should maximize how often the index is usable — driven by query frequency and the left-prefix rule, not raw selectivity. Equality predicates should precede range predicates.
  3. 03
    BRIN wins only under strong physical correlation.On a correlated timestamp column, Crunchy Data measured a 24 KB BRIN versus a ~21 MB B-tree on the same 42 MB table. On randomly-ordered data BRIN is effectively useless — point-lookup latency runs hundreds of milliseconds versus single-digit for B-tree.
  4. 04
    Over-indexing is as damaging as under-indexing.Percona benchmarked scaling from 7 to 39 indexes on one schema: write TPS fell roughly 57% (about 1,400 to 600) with cache hit ratio above 99.7%, isolating index overhead as the cause. Each index also has to be maintained on every write.
  5. 05
    EXPLAIN ANALYZE is the final arbiter.It executes the query and returns planner estimates beside real measured timings. Large estimate-versus-actual gaps signal stale statistics — run ANALYZE. Remember cost units are arbitrary, not milliseconds; read 'actual time' for real latency.

01The Six TypesSix index types, each tuned for a different access pattern.

PostgreSQL exposes six index access methods, and the first job of any read-heavy schema is matching each frequently-run query to the right one. The defaults are forgiving — almost every primary key and most secondary indexes are B-tree — but the specialized types exist precisely because B-tree degrades or fails entirely on the access patterns they were built for.

B-tree — the default workhorse

B-tree is the Postgres default and the right answer for the vast majority of indexes. It handles equality and range queries on any sortable data type, supporting <, <=, =, >=, >, BETWEEN, IN, IS NULL / IS NOT NULL, and even anchored LIKE 'foo%' prefix patterns. Because it keeps entries ordered, it can also satisfy ORDER BY without a separate sort.

Hash, GiST, GIN, BRIN, SP-GiST — the specialists

Hash stores a 32-bit hash code derived from the indexed value and supports only = equality — no ranges, no sorting. Reach for it only when every query on a column is an exact match. GiST (Generalized Search Tree) is not a single index but an infrastructure within which many strategies are implemented; it is the correct choice for geometric and spatial data, range types, and nearest-neighbor queries like ORDER BY location <-> point '(101,456)'.

GIN (Generalized Inverted Index) is built for columns that hold multiple component values — arrays, JSONB, and tsvector. It creates a separate index entry per component and is the preferred type for full-text search. BRIN (Block Range Index) stores only min/max summaries per block range and shines on append-only data physically correlated with the indexed column. SP-GiST rounds out the set for non-balanced, space-partitioned structures such as quadtrees and radix trees.

Default
B-tree
equality + range · any sortable type

The right default for almost every index. Supports =, ranges, BETWEEN, IN, anchored LIKE prefixes, and ordered ORDER BY scans. Also the only common type that can satisfy sorting for free.

Use unless you have a reason not to
Multi-value
GIN
arrays · JSONB · tsvector

Inverted index with one entry per component value. The preferred type for full-text search and JSONB containment. Powerful, but write-side behavior needs care (see fastupdate, below).

Full-text + JSONB
Append-only
BRIN
min/max per block range

Tiny on disk when column values track physical row order — an auto-increment ID or created_at on an append-only table. Beats B-tree only under that correlation; otherwise effectively useless.

Correlation required

02Decision MatrixA single scannable index-type decision matrix.

Most indexing guides describe each type in prose and leave you to assemble the trade-offs yourself. The matrix below co-locates the primary use case, supported operators, approximate size relative to B-tree, write overhead, and — the column most posts omit — when to avoideach type. Operator support is from the PostgreSQL documentation; the BRIN size ratio is from Crunchy Data's controlled test on correlated data; write-overhead notes draw on pganalyze and vendor analysis.

Index type
B-tree
Best for · operators
Equality + range on sortable types. =, <, <=, >=, >, BETWEEN, IN, IS NULL, anchored LIKE. Satisfies ORDER BY.
Size & write notes
Baseline size and write cost — the reference everything else is measured against. Avoid only when a specialist clearly fits (multi-value, spatial, append-only).
Index type
Hash
Best for · operators
Equality only (=). 32-bit hash code; no ranges, no sorting.
Size & write notes
Comparable footprint to B-tree, narrow benefit. Avoid unless every query on the column is an exact-match lookup; a B-tree usually serves the same need plus more.
Index type
GiST
Best for · operators
Geometric/spatial data, range types, nearest-neighbor (ORDER BY <-> ). Infrastructure, not one algorithm. Supports INCLUDE.
Size & write notes
Sized per operator class. Avoid for plain scalar equality/range work — that is B-tree's job. Use when the data is genuinely spatial or KNN.
Index type
GIN
Best for · operators
Arrays, JSONB, tsvector. One entry per component value. Preferred for full-text search and JSONB containment (@>).
Size & write notes
Larger than B-tree on rich documents. fastupdate defers writes to a pending list that flushes in bursts — plan for spiky write latency on heavily-written tables.
Index type
BRIN
Best for · operators
Range scans on data physically correlated with the column (append-only created_at, auto-increment ID). Min/max per block range.
Size & write notes
Orders of magnitude smaller than B-tree under strong correlation (Crunchy Data: 24 KB vs ~21 MB). Avoid for point lookups or randomly-ordered columns — effectively useless there.
Index type
SP-GiST
Best for · operators
Space-partitioned, non-balanced structures: quadtrees, k-d trees, radix trees, some text/IP types. Supports INCLUDE.
Size & write notes
Niche. Reach for it only when your data maps naturally onto a partitioning structure GiST handles poorly. Most apps never need it.
Read this column first
The most valuable column in any index matrix is avoid when. Adding the right index helps one query; adding the wrong index — or a redundant one — taxes every write to the table forever. Default to B-tree, and only reach for a specialist when the access pattern genuinely demands it.

03Column OrderThe "most selective column first" rule is a myth.

Almost every indexing tutorial repeats the same advice: put the most selective column first in a composite index. Markus Winand, author of SQL Performance Explained and the reference site Use The Index, Luke, explicitly catalogs this as a myth. What actually governs column order is how often the index can be used — a function of your query patterns and the left-prefix rule, not raw selectivity.

The left-prefix rule is the mechanism. An index on (col_a, col_b) can accelerate queries filtering on col_a alone, or on (col_a, col_b) together — but it cannot efficiently serve a query that filters only on col_b. Put the column your queries always filter on first, even if it is less selective, so the index stays usable across the widest set of queries.

The most important consideration when defining a concatenated index is how to choose the column order so it can be used as often as possible.— Markus Winand, Use The Index, Luke

There is a second rule that often matters more than selectivity: equality predicates should precede range predicates. A range condition on an intermediate column ends the index traversal — every column to its right becomes a filter predicate rather than an access predicate, which means Postgres fetches and discards rows instead of skipping straight to them. Order your composite index so all the equality columns come first and the single range column comes last.

The nuance worth preserving: selectivity is not irrelevant. Winand is clear that for independent range conditions, selectivity does help. The correction is not "selectivity never matters" — it is "usage frequency and the left-prefix and equality-before-range rules dominate, and selectivity only breaks ties." Over-indexing on selectivity alone is how teams build composite indexes their actual queries can never use.

Equality + range
Filter on a value, range on a date

WHERE status = 'active' AND created_at > $1. Put the equality column first, the range column last: (status, created_at). The range ends traversal, so nothing useful can sit to its right.

(equality, range)
Equality + sort
Filter then ORDER BY

WHERE customer_id = $1 ORDER BY created_at DESC. Index (customer_id, created_at) lets Postgres both filter and return rows pre-sorted, skipping a sort node entirely.

(filter, sort key)
Shared leading column
Many queries, one common filter

If nearly every query filters on tenant_id, lead with it even when it is low-selectivity. The left-prefix rule means one index then serves the whole family of queries.

Lead with the common column
Range-only
No leading equality

If queries filter only on a range column with nothing to its left, a composite index buys little. Index that column alone, and let selectivity guide whether it earns its keep at all.

Single-column index

04Covering & PartialCovering and partial indexes — and the HOT-update trap.

Two of the most powerful read-side tools are covering indexes and partial indexes. A covering index adds non-key payload columns via the INCLUDE clause — CREATE INDEX tab_x_y ON tab(x) INCLUDE (y) — so the planner can satisfy a query entirely from the index without touching the heap. That is an index-only scan, and on the right query it is dramatically faster. Only B-tree, GiST, and SP-GiST support INCLUDE, and the feature arrived in PostgreSQL 11.

A partial index is built over a subset of the table defined by a WHERE predicate — CREATE INDEX ON orders(customer_id) WHERE status = 'active'. It avoids indexing common values the planner would ignore anyway, shrinks the index, and speeds both reads and the writes that don't touch the predicate. For a table where most rows are inactive but queries only ever hit active ones, a partial index can be a fraction of the size of the full one.

The trap most guides skip
Covering and partial indexes have a hidden write-side cost: they can break HOT (Heap-Only Tuple) updates. If you put a frequently-updated column inside an INCLUDE clause — or inside a partial index's WHERE predicate — every update to that column becomes a non-HOT update that must also touch the index. In a 2021 postgres.ai benchmark, including a frequently-changed column in a covering index dropped UPDATE throughput by roughly 40% (about 48k TPS down to the high-20k/low-30k range). The HOT mechanism itself has not changed since; treat this as structural behavior, not a stale figure.

There is also a documentation caveat worth internalizing before you reach for covering indexes everywhere. Index-only scans depend on the visibility map: Postgres must consult the heap for rows on pages not yet marked all-visible by VACUUM. In practice, covering indexes pay off most on slowly-changing tables with frequent VACUUM runs — exactly the opposite of the high-churn tables where the HOT-update trap bites hardest.

There is little point in including payload columns in an index unless the table changes slowly enough that an index-only scan is likely to not need to access the heap.— PostgreSQL official documentation

One more historical nuance for anyone reading older benchmarks: before PostgreSQL 13, covering indexes were larger than the equivalent multicolumn index because B-tree deduplication did not apply to INCLUDE columns — in a 2021 postgres.ai test, 30 MiB versus 21 MiB. Deduplication in PG 13 narrowed that gap, so don't carry the old size penalty into version-13-and-later planning. If you are wiring up indexes through an ORM, our guide to running Prisma ORM in production covers how generated queries map to the indexes they actually need.

Covering index size (pre-PG13)
vs 21 MiB multicolumn
30MiB

Before PG 13, INCLUDE columns were not deduplicated, so covering indexes ran larger than the equivalent multicolumn index. Deduplication in PG 13 closed much of the gap. (2021 postgres.ai benchmark.)

Version-dependent
HOT update rate
single-col index on the changing field
~97%

When the frequently-updated column was only in a plain single-column index, ~97% of updates stayed HOT. Move that same column into a covering index's INCLUDE and the HOT rate collapsed to ~0%. (2021 postgres.ai.)

Drops to ~0% if covered
Partial index payoff
index the subset you query
WHERE

A predicate like WHERE status = 'active' skips indexing the common dead values, shrinking the index and speeding reads plus the writes that don't touch the predicate. Just keep volatile columns out of the predicate.

Smaller + faster

05BRINWhen BRIN beats B-tree by 1,000× — and when it's useless.

BRIN is the most misunderstood index type, and the misunderstanding cuts both ways — teams either ignore it entirely or deploy it on columns where it does nothing. The headline is real but conditional: in a controlled Crunchy Data test, a 42 MB table produced roughly 21 MB of B-tree index but only 24 KB of BRIN index on the same timestamp column — roughly a 1,000-to-1 size advantage. That is not magic; it is the direct result of BRIN storing only min/max summaries per block range instead of an entry per row.

The condition is everything. That advantage only materializes when the indexed column is strongly correlated with the physical row order of the table — an auto-incrementing ID, or a created_attimestamp on an append-only table where new rows land at the end. Break that correlation — update rows randomly, or index a high-cardinality column with no physical ordering — and BRIN stops helping. Crunchy Data's benchmark is blunt about it: on the small result sets that point lookups produce, BRIN random-access latency ran 207–211 msagainst B-tree's 0.6–5 ms. BRIN only pulls ahead on very large result sets (100K+ rows) over sequentially-inserted data.

State the condition out loud
BRIN's size and scan advantages are not a general property of the index — they are a property of data physically ordered by the indexed column. On a randomly-updated column, BRIN is effectively useless. Treat it as a specialist for append-only, time-ordered tables (event logs, metrics, audit trails), never as a drop-in B-tree replacement.

That makes BRIN a natural fit for the write-heavy, append-only tables at the edges of an event-driven system — the same tables that show up in our reference on webhook reliability and event-driven architecture. A webhook-event or audit log inserted in time order, queried by date range and almost never point-looked-up, is close to the ideal BRIN workload: tiny index, fast range scans, negligible write overhead.

06The Write TaxOver-indexing taxes every write.

Reads get all the attention, so the failure mode that actually bites most read-heavy apps is the opposite of what people expect: too many indexes. Every index has to be maintained on every insert, update, and delete that touches its columns. Add enough of them and the write path slows measurably even when reads look fine.

The clearest independent evidence is a Percona benchmark. Scaling a schema from 7 indexes to 39 indexes cut write throughput by roughly 57% — TPS fell from about 1,400 to about 600, and transaction latency rose from 11 ms to 26 ms. Crucially, the PostgreSQL cache hit ratio stayed above 99.7% throughout, which isolates index maintenance — not I/O or cache misses — as the cause.

Write throughput collapses as index count climbs

Source: Percona — Benchmarking PostgreSQL: The Hidden Cost of Over-Indexing (2024)
Write TPS · 7 indexestransaction latency ~11 ms
~1,400
Write TPS · 39 indexestransaction latency ~26 ms · ~57% drop
~600

There is a second, harder-to-measure cost: write amplification. A single logical row write fans out into heap pages, every relevant index, and the write-ahead log. One vendor analysis (Tiger Data) puts steady-state Postgres write amplification in the order of a few times the raw payload, with each additional index adding incrementally to that ratio. Treat those specific multipliers as a single-vendor, order-of-magnitude illustration rather than a precise benchmark — but the direction is corroborated by the independent Percona TPS data above: indexes are not free, and the marginal cost compounds.

The operational discipline that follows is to prune, not just add. PostgreSQL tracks index usage in pg_stat_user_indexes; the idx_scan column counts how many times each index has been used. On a busy table with sufficient uptime, an index sitting at zero or near-zero scans is a strong candidate to drop. Auditing for unused indexes is one of the highest-ROI database chores most teams never schedule.

You do not eliminate write amplification in high throughput databases; you decide where it shows up.— GodOfGeeks, Write Amplification in Databases (dev.to)
GIN writes are spiky, not slow
One specific over-indexing footgun: GIN's fastupdate defers index writes to a pending list (4 MB by default). The list flushes in bursts — periodically or at query time — so write latency looks fine on average and then spikes. pganalyze, reporting on GitLab's production workload, documented GIN pending-list flushes ranging from a few hundred milliseconds to several seconds during peak hours on a heavily-written table. If you run GIN on a hot table, monitor for flush spikes rather than trusting average write latency.

07The ArbiterEXPLAIN ANALYZE is the final arbiter.

Every decision above is a hypothesis until you measure it on your own data, and the measurement tool is EXPLAIN ANALYZE. Plain EXPLAINshows only the planner's estimates without running the query; EXPLAIN ANALYZE actually executes it and returns estimates and real measured results side by side. The output is a tree, and each node reports both (cost=startup..total rows=N width=bytes) and (actual time=Xms rows=N loops=N).

Two reading habits prevent most misinterpretation. First, cost units are arbitrary, not milliseconds. seq_page_cost is conventionally 1.0 and every other cost parameter is relative to it, so a cost of 5.04 says nothing about wall-clock time — read actual time for real latency. Second, watch the gap between estimated and actual rows. A large discrepancy means the planner is working from stale statistics; run ANALYZE on the table to refresh them. Postgres auto-analyzes by default, but it can lag on fast-growing tables, and a bad row estimate leads directly to a bad plan — a sequential scan where an index scan belonged, or the wrong join strategy.

EXPLAIN results should not be extrapolated to situations much different from the one you are actually testing; for example, results on a toy-sized table cannot be assumed to apply to large tables.— PostgreSQL official documentation

That warning is the single most important sentence in the EXPLAIN documentation for read-heavy work. The index that looks pointless on a development table of ten thousand rows is often the one that saves you in production at ten million — and vice versa. Benchmark against production-scale data, or at least production-shaped data, and re-check after meaningful growth. When you add an index to a live table, use CREATE INDEX CONCURRENTLYso the build doesn't hold a write lock; it takes longer and can't run inside a transaction block, but it keeps the table writable throughout.

Indexing is the first layer of a read-performance stack, not the whole of it. Once the right indexes are in place and EXPLAIN ANALYZE confirms index scans on your hot paths, the next lever is moving the hottest reads off the database entirely — see our guide to Redis caching strategies in production for the "index first, cache second" sequencing.

08ChecklistA read-heavy indexing checklist.

Pulling the reference together into the order you'd actually work through it on a real schema. Indexing is iterative — you measure, add or prune one thing, and measure again — so treat this as a loop, not a one-time setup.

  • Start from the query, not the table. List the actual WHERE, JOIN, and ORDER BY clauses that run frequently. Index the access patterns you have, not the ones you imagine.
  • Default to B-tree. Only reach for GIN (arrays / JSONB / full-text), GiST (spatial / KNN), or BRIN (append-only, correlated) when the access pattern genuinely calls for it.
  • Order composite indexes by usage and predicate type. Lead with the column queries always filter on, put equality columns before the single range column, and let selectivity break ties — not decide order.
  • Use covering and partial indexes carefully. They are read wins, but keep frequently-updated columns out of INCLUDE clauses and partial-index predicates to preserve HOT updates.
  • Count your indexes per table. Each one taxes every write. Audit pg_stat_user_indexes for idx_scan = 0 and drop what nothing uses.
  • Verify with EXPLAIN ANALYZE on production-scale data. Read actual time, watch estimate-versus-actual row gaps, and run ANALYZE when they diverge. Add indexes with CREATE INDEX CONCURRENTLY.

If your application sits on Supabase, the same principles apply directly to its managed Postgres — our complete guide to Supabase and Next.js production setup shows where index management fits into a full deployment, and teams that want this whole loop run for them can engage our web development team to audit and tune a production schema end to end.

09ConclusionIndex the queries you have, then measure.

The shape of read-heavy indexing, 2026

Indexing is a measurement discipline, not a checklist of types.

The six index types are the easy part. The hard part — and the part that separates a fast read-heavy app from a slow one — is matching each type to a real access pattern, ordering composite indexes by how your queries actually run rather than by a selectivity myth, and resisting the urge to solve every slow query by adding yet another index.

The numbers in this reference point the same direction. BRIN's three-orders-of-magnitude size advantage is real, but only on physically-correlated data. A covering index is a read win that can silently halve your write throughput if it captures a volatile column. And Percona's benchmark — 7 indexes to 39 indexes, roughly 57% of your write throughput gone — is the clearest reminder that indexing is a trade-off you tune, not a feature you maximize.

So the discipline is the takeaway: start from the queries you have, default to B-tree, reach for specialists deliberately, prune unused indexes as aggressively as you add useful ones, and let EXPLAIN ANALYZE on production-scale data settle every disagreement. Do that on a loop and the read performance of a Postgres app stops being a mystery and becomes something you can engineer.

Tune your database for production read load

Slow reads are usually the wrong index, not a bigger server.

Our engineering team audits Postgres schemas for read-heavy production apps — index selection, composite ordering, over-indexing cleanup, and EXPLAIN ANALYZE tuning — delivered as concrete, measured changes, not generic advice.

Free consultationExpert guidanceTailored solutions
What we work on

Database performance engagements

  • Index audits — selection, composite order, redundancy
  • Over-indexing cleanup via pg_stat_user_indexes
  • EXPLAIN ANALYZE tuning on production-scale data
  • BRIN / GIN / covering-index strategy for the right tables
  • Supabase & Prisma query-to-index alignment
FAQ · Postgres indexing

The questions we get every week.

B-tree, in almost every case. It is the PostgreSQL default and handles both equality and range queries on any sortable data type, supports operators like =, <, >, BETWEEN, IN, IS NULL, and anchored LIKE 'foo%' prefixes, and can satisfy ORDER BY without a separate sort. Reach for a specialist type only when the access pattern genuinely requires it: GIN for arrays, JSONB, and full-text search; GiST for geometric, spatial, range, and nearest-neighbor queries; BRIN for append-only tables where the column is physically correlated with row order; Hash only when every query on a column is an exact-match equality lookup. Defaulting to B-tree and adding specialists deliberately is the safest path for a read-heavy schema.