Indexes: the basics

An index is the single biggest lever you have on query speed. This lesson builds the mental model — what an index is, when Postgres uses one, and when it correctly ignores the one you built.

The seed is an events table with twenty-five million rows: an id, a high-cardinality user_id, a lopsided status, a created_at, and an amount. Big enough that the difference between reading every row and jumping straight to a few isn't just visible in the plan — you can watch it in the query's execution time.

```sql
SELECT count(*) FROM events;
```
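For orientation, here is a minimal sketch of a table with the shape described above. The lesson does not show its DDL, so the table name events_sketch, the column types, and the status values 'ok' and 'failed' are all assumptions. The seed is deliberately small (100,000 rows) so it runs quickly.

```sql
-- Hypothetical schema: column types are assumptions, not the lesson's real DDL.
CREATE TABLE events_sketch (
    id         bigint,
    user_id    integer,        -- high cardinality: many distinct users
    status     text,           -- lopsided: one value dominates
    created_at timestamptz,
    amount     numeric(10, 2)
);

-- Seed a small sample; the lesson's table holds twenty-five million rows.
INSERT INTO events_sketch (id, user_id, status, created_at, amount)
SELECT g,
       (random() * 50000)::int,                                -- roughly 50,000 distinct users
       CASE WHEN random() < 0.95 THEN 'ok' ELSE 'failed' END,  -- about 95% share one status
       now() - g * interval '1 second',
       round((random() * 500)::numeric, 2)
FROM generate_series(1, 100000) AS g;
```

Raising the upper bound of generate_series scales the sample up if you want to see the timing differences more clearly.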

The problem: a Seq Scan reads everything

Ask for one user's events. There's no index on user_id yet, so watch how Postgres plans it:

```sql
EXPLAIN SELECT * FROM events WHERE user_id = 42;
```

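The top line says Seq Scan on events, a sequential scan. To find the handful of rows for user 42, Postgres reads all twenty-five million rows and throws away the ones that don't match. Note that the rows= figure on that line is the planner's estimate of how many rows survive the filter, not how many it reads; the full pass over the table shows up in the cost= numbers.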

That's the default when there's no better path. It's correct, just linear: double the table, double the work. Every filtered query without a useful index pays this cost.
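To get a feel for how much "everything" is, you can ask the catalog how large the table is. This is a small sketch using the built-in pg_class catalog and the pg_relation_size function. Treat relpages and reltuples as approximations: they are the planner's stored figures and are only refreshed by commands such as VACUUM and ANALYZE.

```sql
-- How much data does a sequential scan of events have to walk through?
SELECT pg_size_pretty(pg_relation_size('events')) AS heap_size,  -- on-disk size of the table itself
       relpages,                                                 -- stored page count (approximate)
       reltuples                                                 -- stored row-count estimate (approximate)
FROM pg_class
WHERE oid = 'events'::regclass;
```

A sequential scan has to visit every one of those pages, which is why its cost grows in step with the table.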

Think of the table as a book with no index at the back. To find every mention of "user 42" you read the book cover to cover.

EXPLAIN only estimated that plan — it never ran the query. Add ANALYZE and Postgres actually executes it and prints the real Execution Time on the last line:

```sql
EXPLAIN ANALYZE SELECT * FROM events WHERE user_id = 42;
```

On twenty-five million rows that's a few hundred milliseconds — the price of touching every row to return a few dozen. Hold onto that number; we're about to crush it. (Confusingly, the ANALYZE inside EXPLAIN ANALYZE means "run it and time it" — it is not the standalone ANALYZE command we use next, which gathers statistics. Reading these plans in full is the next lesson; for now just watch the Execution Time.)
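Because the three spellings are easy to mix up, here they are side by side as a quick reference. The last statement uses the BUFFERS option of EXPLAIN, which the lesson itself does not use; it is included only as an optional way to see how many pages the scan touched.

```sql
-- Plans only: nothing is executed, all numbers are estimates.
EXPLAIN SELECT * FROM events WHERE user_id = 42;

-- Plans, executes the query, and reports actual row counts and timings.
EXPLAIN ANALYZE SELECT * FROM events WHERE user_id = 42;

-- A different command entirely: samples the table and refreshes planner statistics.
ANALYZE events;

-- Optional: also report the buffers (pages) read while executing.
EXPLAIN (ANALYZE, BUFFERS) SELECT * FROM events WHERE user_id = 42;
```

Since EXPLAIN ANALYZE really runs the statement, be careful using it on anything that modifies data; wrapping it in a transaction that you roll back is the usual precaution.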

Give the planner stats first

The fix: CREATE INDEX

What a B-tree covers

Selectivity: an index only pays off for few rows

Multicolumn indexes and the left-prefix rule

UNIQUE indexes come with constraints

Indexes are not free

What you learned