You can now read, join, group, and write. This is the first lesson of the advanced module, and it's about composing queries: using the result of one query inside another. It's what lets you express things like "orders worth more than this customer's average" without two round trips.
A subquery: a query used as a value
A subquery is just a SELECT wrapped in parentheses and used inside another query. The simplest place is in a WHERE: the inner query produces a list, and the outer query filters against it.
Press Run to execute your query.
Read it inside-out: the inner SELECT customer_id FROM orders produces the set of customers who have ordered, and the outer query keeps only customers whose id is IN that set. (You could also write this as a JOIN; subqueries and joins often express the same idea, and you pick whichever reads more clearly.)
Correlated subqueries: one that runs per row
A correlated subquery refers to the outer query's current row, so it runs once for each outer row. Here, for every customer, a little subquery counts that customer's orders.
Press Run to execute your query.
Notice the inner query mentions c.id from the outer query. That's the "correlation." It's expressive, but be aware it conceptually runs per row, so on large tables a join or window function is often faster. Great for clarity, worth watching for performance.
CTEs: name a result with WITH
As queries grow, nesting subqueries inside subqueries becomes unreadable. A CTE (Common Table Expression), written with WITH, lets you name a query's result and then use that name like a table. It turns one tangled query into a series of clear steps.
Press Run to execute your query.
Read it top-down: first order_totals computes each order's value, then the main query joins it to customers and filters. Same result a giant nested query would give, but you can follow it. You can chain several CTEs (WITH a AS (...), b AS (...)) to build a pipeline of named steps.
CTE or subquery?
They often do the same job. Reach for a CTE when naming a step makes a complex query readable, or when you need to use the same intermediate result more than once. Reach for a plain subquery for something small and local, like a single value in a WHERE. Readability is the deciding factor more often than performance.
Recursive CTEs: a query that builds on itself
A WITH RECURSIVE CTE can refer to itself, building up rows step by step. It has two parts joined by UNION ALL: a seed (the starting row, also called the anchor member) and a recursive step (the recursive member, which feeds on the rows produced so far) until a condition stops it.
Press Run to execute your query.
This one counts down from 5. The seed is SELECT 5; the recursive step takes each n and produces n - 1 until n > 1 is false. The same shape walks hierarchies: a category with sub-categories with sub-sub-categories, or an org chart of manager → reports → their reports. Anywhere you'd otherwise need an unknown number of joins, a recursive CTE handles it in one query.
Seed
Produce the starting row(s). Here, the single row n = 5.
Recursive step
Run the second query against the rows just produced, generating more (n − 1).
Stop
When the recursive step produces no new rows (the WHERE fails), recursion ends and all rows are returned.
- …then back to the top, forever.
Always give recursion a way to stop
A recursive CTE with no stopping condition runs forever (or until the database kills it). The WHERE n > 1 here is what ends it. When walking a hierarchy, the recursion naturally stops when it reaches rows with no children, but if your data has a cycle (A reports to B reports to A), you need to guard against looping. Make sure every recursive query has a clear end.
What to take away
A subquery is a query used as a value inside another; a correlated subquery references the outer row and runs per row; a CTE (WITH) names a result so big queries read as clear steps; and a recursive CTE builds on its own output to walk series and hierarchies. Together they let you express genuinely complex logic in one readable query instead of many round trips.
Next: the most-loved advanced SQL topic in interviews, and the one that unlocks rankings and running totals without losing your individual rows — window functions.
Test yourself
Questions· say the answer out loud before you open it. If you can't, the chapter isn't done.
QWhat is a subquery?+
A SELECT wrapped in parentheses and used inside another query, for example to produce a list of ids that the outer query filters against in a WHERE ... IN (...). It lets one query depend on the result of another, and often expresses the same thing a JOIN would.
QWhat makes a subquery 'correlated', and why care?+
It references a column from the outer query's current row, so it runs once per outer row instead of once overall. That's expressive and readable, but because it conceptually executes per row, on large tables a JOIN or window function is often faster.
QWhen do you use a CTE instead of a subquery?+
When naming an intermediate step makes a complex query readable, or when you need to reuse the same intermediate result more than once. A CTE (WITH name AS (...)) turns nested logic into a top-down pipeline of named steps. For a small local value, a plain subquery is fine.
QWhat are the two parts of a recursive CTE?+
A seed (the starting rows) and a recursive step, joined by UNION ALL. The recursive step runs against the rows produced so far, generating more, until it produces none. This lets one query generate a series or walk a hierarchy of unknown depth.
QWhat's the danger with recursive CTEs?+
They can run forever if there's no stopping condition, or if the data has a cycle (A → B → A) that keeps producing new rows. Every recursive CTE needs a clear end: a WHERE that eventually fails, or, for cyclic data, a guard that detects and stops loops.
Comments
Loading comments…