Querying data
SELECT is the statement you will write more than any other. It reads rows from one or more tables without changing them, and it belongs to DML (Data Manipulation Language), one of SQL's sublanguages. This lesson covers reading from a single table; joins add more tables later.
Every query box below is live - edit it and press Run (or Ctrl+Enter).
The anatomy of a query
A SELECT reads like a sentence: select these columns, from this table, where this is true, sorted this way, and stop after so many rows.
Five clauses do the work, and you almost always write them in this order:
SELECT- which columns (or computed values) to return.FROM- which table to read.WHERE- which rows to keep.ORDER BY- how to sort the result.LIMIT- how many rows to return.
Only SELECT and FROM are required. The rest are optional filters and shaping on top.
Choosing what comes back
The column list after SELECT is not limited to stored columns. You can compute values and rename them with AS (an alias):
SELECT * returns every column. It is handy when exploring, but in real queries prefer naming the columns you need - it is clearer and avoids hauling back data you will not use.
Use DISTINCT to collapse duplicate rows. "Which countries do we have customers in?" should list each country once:
Filtering rows with WHERE
WHERE keeps only the rows whose condition is true. The basic operators are comparisons - =, <> (not equal), <, >, <=, >= - combined with AND, OR, and NOT.
Beyond those, four constructs do most of the day-to-day filtering.
Ranges, lists, and patterns
BETWEEN a AND b- an inclusive range.WHERE total BETWEEN 20 AND 50.IN (...)- matches any value in a list;NOT INexcludes them. Clearer than a chain ofORs.LIKE- matches a text pattern, where%is "any run of characters" and_is "exactly one."WHERE name LIKE 'A%'finds names starting with A.
Working with NULL
NULL means unknown - no value recorded. It is not zero and not an empty string, and it has a surprising rule: any comparison with NULL returns neither true nor false, but unknown. So WHERE country = NULL matches nothing, even for rows that have no country.
To test for missing values, use IS NULL and IS NOT NULL:
This three-valued logic (true / false / unknown) is the single most common source of "why didn't that row come back?" - when a filter behaves oddly, suspect a NULL.
Conditional logic with CASE
A CASE expression computes a value per row by testing conditions in order - SQL's if/else. It belongs in the SELECT list (or ORDER BY), returning the result of the first WHEN that is true, or the ELSE if none match.
Bucket each order into 'big' or 'small' by its total:
Chain several WHENs for more than two buckets. The first match wins, so order them from most to least specific:
COALESCE and NULLIF: handling NULL in output
Two small functions tidy up NULL in the SELECT list, where CASE would be clumsy.
COALESCE(a, b, ...)returns the first non-NULL argument. Use it to supply a default: show a country, or'unknown'when it is missing.NULLIF(a, b)returnsNULLwhena = b, elsea. Use it to turn a sentinel value into a realNULL, or to dodge divide-by-zero.
Eve has no country. COALESCE substitutes a label:
A LEFT JOIN fills unmatched columns with NULL. Wrapping the total in COALESCE(total, 0) turns "no orders" into a clean 0 instead of a blank - you will use this in joins.
Sorting and limiting
ORDER BY sorts the result - ASC (ascending, the default) or DESC. Give several keys to break ties: sort by country, then by name within each country. LIMIT caps the row count, and OFFSET skips rows first (useful for paging).
The order SQL really runs in
Here is the subtlety that trips up nearly everyone. You write the clauses in one order, but the database evaluates them in another. That is why you can filter on a column in WHERE yet cannot use an alias defined in SELECT - when WHERE runs, the SELECT list has not been computed yet.
Read it top to bottom: the database starts with the whole table, drops rows that fail WHERE, optionally groups what survives, then computes the SELECT list, sorts, and trims. Carry this picture with you - it explains most "why didn't that work" moments, and it is the backbone of the next lesson on aggregation.
Exercise
Write your answers in the sandbox - the third one checks your result automatically.
1. Worked. Return the names of customers in Ireland, alphabetically (the query above).
2. Finish it. Return the id and total of every order of at least 30.00, largest first. Fill in the blanks, then run - the box checks your result.
3. Write it yourself. Return only the three most recent orders: their id and created_at, newest first. Run it - the box checks your result against the expected answer.
4. Finish it (CASE + COALESCE). For each customer show their name, their country (or 'unknown' when missing), and a region of 'home' when the country is 'IE' else 'away'. Fill the blanks, then run - the box checks your result.
Show answers
2. WHERE total >= 30.00 ORDER BY total DESC;
3.
SELECT id, created_at
FROM orders
ORDER BY created_at DESC
LIMIT 3;
4.
SELECT name,
COALESCE(country, 'unknown') AS country,
CASE WHEN country = 'IE' THEN 'home' ELSE 'away' END AS region
FROM customers;
Eve's country is NULL, so country = 'IE' is unknown (not true) and her region is 'away'.