Table of Contents
Written with a little help from AI, reviewed and edited by a human.
A subquery is a query inside another query. It’s like asking a question, but to answer it, you first need to ask a smaller question. Subqueries let you break complex problems into smaller steps and write more powerful SQL.
Real life example
Imagine you’re a teacher and you want to find all students who scored higher than the class average. To do this, you need to:
- First, calculate the class average
- Then, find all students whose score is above that average
You can’t do both in one step — you need the average first before you can compare it. In SQL, a subquery lets you get the average in an “inner query” and then use it in the “outer query”.
A simple subquery
Let’s say you have a table of students with their scores:
| id | name | score |
|---|---|---|
| 1 | Alice | 85 |
| 2 | Bob | 92 |
| 3 | Charlie | 78 |
| 4 | Diana | 88 |
| 5 | Eve | 95 |
The average score is (85 + 92 + 78 + 88 + 95) / 5 = 87.6
Now you want to find students who scored above 87.6. You could calculate the average yourself and then query:
SELECT * FROM students
WHERE score > 87.6;
But what if the data changes? You’d have to recalculate the average every time. Instead, use a subquery:
SELECT * FROM students
WHERE score > (SELECT AVG(score) FROM students);
Result:
| id | name | score |
|---|---|---|
| 2 | Bob | 92 |
| 4 | Diana | 88 |
| 5 | Eve | 95 |
Here’s what happened:
- The subquery
(SELECT AVG(score) FROM students)runs first and returns87.6 - The outer query then becomes
WHERE score > 87.6
The subquery is wrapped in parentheses and runs before the outer query.
Subqueries in WHERE
The most common use of subqueries is in the WHERE clause to filter based on calculated values.
Example: Find users who have written posts
Let’s say you have two tables:
users
| id | name |
|---|---|
| 1 | Alice |
| 2 | Bob |
| 3 | Charlie |
posts
| id | title | user_id |
|---|---|---|
| 1 | First post | 1 |
| 2 | Hello world | 1 |
| 3 | SQL is fun | 2 |
You want to find all users who have written at least one post. You can use a subquery with IN:
SELECT * FROM users
WHERE id IN (SELECT user_id FROM posts);
The subquery SELECT user_id FROM posts returns: 1, 1, 2 (the user IDs of all posts).
The outer query then becomes: WHERE id IN (1, 1, 2), which matches Alice and Bob.
Result:
| id | name |
|---|---|
| 1 | Alice |
| 2 | Bob |
Charlie is not in the result because he hasn’t written any posts.
Subqueries with NOT IN
You can flip it to find users who haven’t written posts:
SELECT * FROM users
WHERE id NOT IN (SELECT user_id FROM posts);
Result:
| id | name |
|---|---|
| 3 | Charlie |
Subqueries in SELECT
You can also use a subquery in the SELECT clause to add calculated columns.
For example, show each student along with how much higher or lower their score is compared to the average:
SELECT
name,
score,
score - (SELECT AVG(score) FROM students) AS difference_from_avg
FROM students;
Result:
| name | score | difference_from_avg |
|---|---|---|
| Alice | 85 | -2.6 |
| Bob | 92 | 4.4 |
| Charlie | 78 | -9.6 |
| Diana | 88 | 0.4 |
| Eve | 95 | 7.4 |
The subquery (SELECT AVG(score) FROM students) runs once and returns 87.6, then it’s subtracted from each student’s score.
Subqueries in FROM
You can use a subquery in the FROM clause to create a “temporary table” that you then query from.
For example, find the average score of only students who scored above 80:
SELECT AVG(score) AS high_scorer_avg
FROM (
SELECT score FROM students WHERE score > 80
);
The inner query SELECT score FROM students WHERE score > 80 returns:
| score |
|---|
| 85 |
| 92 |
| 88 |
| 95 |
The outer query then calculates the average of those four scores: 90.
Correlated subqueries
A correlated subquery references columns from the outer query. It runs once for each row in the outer query.
For example, find students whose score is above their city’s average (assuming we have a city column):
SELECT name, city, score
FROM students s1
WHERE score > (
SELECT AVG(score)
FROM students s2
WHERE s2.city = s1.city
);
For each student in the outer query, the subquery calculates the average score for their city and compares their score to it.
This is more advanced and slower because the subquery runs many times.
EXISTS
Sometimes you just want to check if any rows exist that match a condition. Use EXISTS:
SELECT * FROM users
WHERE EXISTS (
SELECT 1 FROM posts WHERE posts.user_id = users.id
);
This finds all users who have at least one post. The subquery doesn’t return the posts — it just checks if any exist. If yes, the user is included.
EXISTS is often faster than IN for large datasets because it stops searching as soon as it finds one match.
NOT EXISTS
Find users who have no posts:
SELECT * FROM users
WHERE NOT EXISTS (
SELECT 1 FROM posts WHERE posts.user_id = users.id
);
Result:
| id | name |
|---|---|
| 3 | Charlie |
When to use subqueries vs JOINs
Often you can solve the same problem with either a subquery or a JOIN. For example:
With subquery:
SELECT * FROM users
WHERE id IN (SELECT user_id FROM posts);
With JOIN:
SELECT DISTINCT users.*
FROM users
INNER JOIN posts ON users.id = posts.user_id;
Both work, but:
- Subqueries are often easier to read for simple cases
- JOINs are usually faster for large datasets
- Subqueries are better when you need to filter based on aggregate functions (like “above average”)
Summary
- A subquery is a query inside another query
- Use subqueries in
WHERE,SELECT, orFROMclauses - Subqueries with
INandNOT INare great for filtering based on another table - Use
EXISTSandNOT EXISTSfor checking if related rows exist - Subqueries can often be replaced with JOINs, but sometimes they’re clearer