Table of Contents
- Real life example
- What is an index?
- The problem without indexes
- Creating an index
- Indexes on multiple columns
- Unique indexes
- Primary keys are automatically indexed
- When to create indexes
- When NOT to create indexes
- Viewing existing indexes
- Dropping an index
- How indexes work under the hood
- Covering indexes
- Analyzing query performance
- Best practices
- Summary
Written with a little help from AI, reviewed and edited by a human.
When you have a small table with 100 rows, queries are fast. But when you have a table with 1 million rows, finding a specific row can be slow. Indexes make searches much faster by creating a “lookup table” that tells the database exactly where to find data.
Real life example
Imagine you have a massive phone book with 100,000 names in random order. Someone asks: “What’s the phone number for John Smith?”
Without an index, you’d have to flip through every single page until you find John Smith. That could take hours.
But if the phone book is sorted alphabetically (which it usually is), you can open it to the middle, see you’re at the M section, flip to the S section, and quickly find John Smith. That’s what an index does — it organizes the data so you can find things quickly.
What is an index?
An index is a separate data structure that stores a sorted copy of one or more columns along with pointers to where the full rows are stored. When you query that column, the database uses the index to find the rows much faster.
Think of it like the index at the back of a textbook — instead of reading every page to find a topic, you look it up in the index which tells you “Chapter 7, Page 142”.
The problem without indexes
Let’s say you have a users table with 1 million rows:
SELECT * FROM users WHERE email = 'alice@example.com';
Without an index, the database has to check every single row (1 million rows) to see if the email matches. This is called a full table scan and it’s slow.
With an index on the email column, the database can jump straight to the row with that email — checking maybe 10-20 rows instead of 1 million. That’s a huge speed improvement.
Creating an index
You create an index on a column like this:
CREATE INDEX idx_users_email ON users(email);
idx_users_emailis the name of the index (you can name it anything, but a convention isidx_tablename_columnname)ON users(email)means “create an index on the email column of the users table”
Now queries that filter by email will be much faster:
SELECT * FROM users WHERE email = 'alice@example.com'; -- Fast!
Indexes on multiple columns
You can create an index on multiple columns:
CREATE INDEX idx_users_city_age ON users(city, age);
This is called a composite index or multi-column index. It’s useful when you often query both columns together:
SELECT * FROM users WHERE city = 'New York' AND age > 25; -- Fast!
The order matters. This index helps with:
WHERE city = 'New York'✅WHERE city = 'New York' AND age > 25✅WHERE city = 'New York' AND age = 30✅
But not with:
WHERE age > 25❌ (age is the second column, so it can’t be used alone)
Composite indexes work left-to-right. You can use the first column alone, but not the second column without the first.
Unique indexes
A unique index ensures that no two rows have the same value in that column:
CREATE UNIQUE INDEX idx_users_email ON users(email);
Now if you try to insert a duplicate email, you’ll get an error. This is similar to adding a UNIQUE constraint when creating the table, but you can add it later with an index.
Primary keys are automatically indexed
When you create a PRIMARY KEY, the database automatically creates an index on that column:
CREATE TABLE users (
id INTEGER PRIMARY KEY, -- Automatically indexed
name VARCHAR(100)
);
That’s why queries like WHERE id = 5 are always fast — the id column is indexed by default.
When to create indexes
Create indexes on columns that you frequently search, filter, or sort by:
- Columns used in
WHEREclauses - Columns used in
JOINconditions - Columns used in
ORDER BY - Foreign key columns
For example, if you often search users by email:
SELECT * FROM users WHERE email = 'alice@example.com';
Create an index on email:
CREATE INDEX idx_users_email ON users(email);
When NOT to create indexes
Indexes have costs:
- They take up disk space — Each index is a separate data structure stored on disk.
- They slow down writes — When you
INSERT,UPDATE, orDELETEa row, the database also has to update all the indexes on that table.
So don’t create indexes on:
- Columns you rarely query
- Very small tables (under 1000 rows — full table scans are already fast)
- Columns with very few unique values (like
true/falseorgender— indexes don’t help much)
Viewing existing indexes
To see what indexes exist on a table:
SQLite:
.indexes users
PostgreSQL:
\d users
MySQL:
SHOW INDEXES FROM users;
Dropping an index
If an index is no longer needed or is slowing down writes, you can drop it:
DROP INDEX idx_users_email;
This doesn’t affect the data in the table — it just removes the index structure.
How indexes work under the hood
Most databases use a data structure called a B-tree (or B+ tree) for indexes. It’s like a binary search tree, but optimized for disk storage.
When you create an index on email, the database creates a sorted tree structure:
"j@example.com"
/ \
"a@example.com" "m@example.com"
/ \ / \
alice bob john mary
Each node points to where the actual row is stored. When you search for alice@example.com, the database traverses the tree (a few steps) instead of scanning the whole table (millions of steps).
You don’t need to understand B-trees deeply, but it’s good to know that indexes are fast for lookups but have overhead for updates.
Covering indexes
A covering index includes all the columns you need in the query, so the database doesn’t have to go back to the table at all:
CREATE INDEX idx_users_email_name ON users(email, name);
SELECT name FROM users WHERE email = 'alice@example.com';
This query can be answered entirely from the index because both email and name are in the index. The database doesn’t need to load the full row. This is even faster.
Analyzing query performance
Most databases have tools to see if a query is using an index:
SQLite:
EXPLAIN QUERY PLAN SELECT * FROM users WHERE email = 'alice@example.com';
PostgreSQL:
EXPLAIN SELECT * FROM users WHERE email = 'alice@example.com';
MySQL:
EXPLAIN SELECT * FROM users WHERE email = 'alice@example.com';
This shows you the “query plan” — how the database will execute the query. Look for terms like:
- “USING INDEX” or “Index Scan” ✅ Good, using an index
- “SCAN TABLE” or “Seq Scan” ❌ Bad, full table scan (slow)
Best practices
Start simple: Don’t add indexes prematurely. Add them when you notice slow queries.
Index foreign keys: Always index columns used in JOINs (usually foreign keys).
Index WHERE columns: If you frequently filter by a column, index it.
Index ORDER BY columns: If you sort by a column often, index it.
Don’t over-index: More indexes = slower writes. Only index what you actually need.
Monitor query performance: Use EXPLAIN to see if your queries are using indexes.
Summary
- Indexes make searches faster by creating a sorted lookup structure
- Create indexes on columns you frequently search, filter, or join on
- Primary keys are automatically indexed
- Indexes speed up reads but slow down writes
- Use
CREATE INDEXto add an index,DROP INDEXto remove it - Use
EXPLAINto see if your queries are using indexes - Don’t over-index — only index what you need