What an index is
An index is a separate data structure that maps column values to the physical location of matching rows. Without an index, the database reads every row to find matches — a sequential scan. With an index it jumps directly to the relevant rows — an index scan — which is orders of magnitude faster on large tables.
The tradeoff: indexes consume disk space and slow down writes (every INSERT, UPDATE, and DELETE must also update every index on the table).
B-tree indexes — the default
The default index type in every major database is a B-tree (balanced tree). It keeps values in sorted order, enabling fast equality lookups, range scans, and sorts.
-- Creating a B-tree index (explicitly, and the default shorthand)
CREATE INDEX idx_orders_customer_id ON orders (customer_id);
-- The database can now use this index for:
SELECT * FROM orders WHERE customer_id = 1001; -- equality
SELECT * FROM orders WHERE customer_id BETWEEN 100 AND 200; -- range
SELECT * FROM orders ORDER BY customer_id; -- sort (avoids sort step)
Primary keys and UNIQUE constraints create B-tree indexes automatically.
Foreign key columns on the referencing side do not get automatic indexes
— create them manually.
-- Foreign key columns that need manual indexes
CREATE INDEX idx_orders_customer_id ON orders (customer_id);
CREATE INDEX idx_order_items_order_id ON order_items (order_id);
CREATE INDEX idx_order_items_product_id ON order_items (product_id);
Composite indexes — column order matters
A composite index on (col_a, col_b) can be used for queries on:
col_aalonecol_aANDcol_btogether
It cannot be used for col_b alone (the leftmost prefix rule).
-- Good composite index: supports both filtering by customer and sorting by date
CREATE INDEX idx_orders_customer_created
ON orders (customer_id, created_at DESC);
-- Supported queries:
SELECT * FROM orders WHERE customer_id = 1001;
SELECT * FROM orders WHERE customer_id = 1001 ORDER BY created_at DESC;
SELECT * FROM orders WHERE customer_id = 1001 AND created_at > '2026-01-01';
-- Not supported (no customer_id prefix):
SELECT * FROM orders WHERE created_at > '2026-01-01';
-- ^ requires a separate index on (created_at)
Put the equality columns first, range columns last, and the ORDER BY column last (if it matches the range direction).
Partial indexes — index a subset of rows
A partial index only indexes rows that satisfy a WHERE condition. It is
smaller, faster to maintain, and more selective than a full-column index.
-- Index only pending orders (the ones that need the most lookups)
CREATE INDEX idx_orders_pending
ON orders (customer_id, created_at)
WHERE status = 'pending';
-- Queries that can use this index must include the same condition
SELECT * FROM orders WHERE customer_id = 1001 AND status = 'pending';
-- Partial unique index: only one active subscription per customer
CREATE UNIQUE INDEX idx_subscriptions_active_customer
ON subscriptions (customer_id)
WHERE status = 'active';
Covering indexes — eliminate heap fetches
A covering index includes all columns a query needs, so the database never has to read the main table (heap) at all — an Index Only Scan.
-- Query: SELECT email FROM users WHERE created_at > '2026-01-01'
-- A covering index using INCLUDE (Postgres 11+):
CREATE INDEX idx_users_created_covering
ON users (created_at DESC)
INCLUDE (email);
-- EXPLAIN shows: Index Only Scan (Heap Fetches: 0)
EXPLAIN (ANALYZE, BUFFERS)
SELECT email FROM users WHERE created_at > '2026-01-01';
The INCLUDE columns are stored in the leaf nodes of the index but are not
part of the sort order. The index is still B-tree sorted on created_at.
What prevents index use
-- 1. Function wrapped around the indexed column — index on email is unused
SELECT * FROM users WHERE LOWER(email) = 'alice@example.com';
-- Fix: create a functional index
CREATE INDEX idx_users_email_lower ON users (LOWER(email));
-- 2. Leading wildcard in LIKE — cannot scan B-tree from the front
SELECT * FROM products WHERE product_name LIKE '%keyboard%';
-- Fix: use full-text search or a TRIGRAM index (pg_trgm extension in Postgres)
-- 3. Type mismatch — implicit cast prevents index use
SELECT * FROM orders WHERE customer_id = '1001'; -- '1001' is text, column is INT
-- Fix: match the literal type: customer_id = 1001
-- 4. OR conditions that span different indexed columns — usually a full scan
SELECT * FROM users WHERE email = 'a@b.com' OR phone = '555-1234';
-- Fix: UNION instead of OR, or a multi-column index that covers both
Identifying missing indexes
-- Postgres: find sequential scans on large tables (candidates for indexing)
SELECT relname AS table, seq_scan, seq_tup_read, idx_scan
FROM pg_stat_user_tables
WHERE seq_scan > 0
ORDER BY seq_tup_read DESC
LIMIT 20;
-- Check if a specific query uses an index
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM orders WHERE customer_id = 1001 AND status = 'pending';
-- Look for: Index Scan or Index Only Scan (good) vs Seq Scan (investigate)
Recap
Create B-tree indexes on every foreign key column (the referencing side) and on
every column that appears in WHERE, JOIN ON, or ORDER BY for high-traffic
queries. For composite indexes, equality columns go first, range columns last.
Use partial indexes to index only the active/pending subset of hot-path queries.
Use INCLUDE to build covering indexes that eliminate heap fetches. Avoid
wrapping indexed columns in functions — create functional indexes instead.