Ranking functions assign a position number to each row within its
partition, based on the window's ORDER BY. The main ones are ROW_NUMBER,
RANK, DENSE_RANK, and NTILE, plus the distribution functions PERCENT_RANK
and CUME_DIST.
SELECT name, salary,
ROW_NUMBER() OVER (ORDER BY salary DESC) AS rn,
RANK() OVER (ORDER BY salary DESC) AS rnk,
DENSE_RANK() OVER (ORDER BY salary DESC) AS dense_rnk
FROM employees;
All ranking functions require an ORDER BY in the OVER clause — without an
order, "rank" has no meaning.
Rule of thumb: ranking functions number rows by an ordering; they always need
ORDER BY in OVER.
ROW_NUMBER() assigns a unique, sequential integer to each row within the
partition, in the window's ORDER BY order — 1, 2, 3, ... with no ties and no
gaps. Even rows with equal ordering values get distinct numbers (the tie-break is
arbitrary unless you add more ORDER BY columns).
SELECT name, dept_id,
ROW_NUMBER() OVER (PARTITION BY dept_id ORDER BY salary DESC) AS rn
FROM employees;
It's the go-to for pagination, deduplication, and top-N-per-group.
Rule of thumb: ROW_NUMBER() = a unique 1,2,3 sequence per partition, no ties —
use it when you need exactly one row per position.
Both give tied rows the same rank, but they differ in what comes next:
RANK()leaves gaps after ties — if two rows tie at 1, the next is 3.DENSE_RANK()leaves no gaps — after a tie at 1, the next is 2.
-- salaries: 100, 100, 90
-- RANK(): 1, 1, 3
-- DENSE_RANK(): 1, 1, 2
SELECT name, salary,
RANK() OVER (ORDER BY salary DESC) AS rnk,
DENSE_RANK() OVER (ORDER BY salary DESC) AS dense_rnk
FROM employees;
Rule of thumb: RANK skips numbers after ties (like Olympic ranking);
DENSE_RANK keeps them consecutive.
The key difference is how ties are handled:
ROW_NUMBER()— always unique; tied rows get different numbers (arbitrary order among the tie).RANK()— tied rows get the same rank, then a gap.DENSE_RANK()— tied rows get the same rank, no gap.
-- values 100, 100, 90:
-- ROW_NUMBER: 1, 2, 3
-- RANK: 1, 1, 3
-- DENSE_RANK: 1, 1, 2
Rule of thumb: choose ROW_NUMBER for unique positions, RANK/DENSE_RANK when
ties should share a position (gaps vs no gaps).
NTILE(n) divides the ordered rows of a partition into n roughly equal buckets
and labels each row with its bucket number 1..n. It's used for quartiles,
deciles, percentile bands, and bucketing.
-- split employees into 4 salary quartiles
SELECT name, salary,
NTILE(4) OVER (ORDER BY salary DESC) AS quartile
FROM employees;
If the row count doesn't divide evenly, the earlier buckets get one extra row.
Rule of thumb: NTILE(n) splits ordered rows into n balanced groups — use it for
quartiles/deciles and even distribution.
PERCENT_RANK() returns the relative rank of a row as a value between 0 and
1: (rank - 1) / (total_rows - 1). The first row is always 0; the last is 1.
It tells you what fraction of rows rank below the current one.
SELECT name, salary,
PERCENT_RANK() OVER (ORDER BY salary) AS pct_rank
FROM employees;
It's useful for percentile-style comparisons (e.g. "this salary is higher than 80% of others").
Rule of thumb: PERCENT_RANK() = where a row sits on a 0–1 scale relative to the
rest; first row 0, last row 1.
CUME_DIST() (cumulative distribution) returns the fraction of rows with a value
less than or equal to the current row: rows_<=_current / total_rows. It ranges
in (0, 1].
The difference from PERCENT_RANK():
CUME_DIST= count of rows ≤ current / total (includes the current row).PERCENT_RANK=(rank - 1) / (n - 1)(excludes current; first row is 0).
SELECT name, salary,
CUME_DIST() OVER (ORDER BY salary) AS cume,
PERCENT_RANK() OVER (ORDER BY salary) AS pct_rank
FROM employees;
Rule of thumb: CUME_DIST answers "what proportion are at or below me?";
PERCENT_RANK answers "what's my relative rank position 0–1?".
Ranking is meaningless without a defined order — the function needs to know by
what to rank. So ROW_NUMBER, RANK, DENSE_RANK, NTILE, etc. all require
ORDER BY inside OVER; omitting it is an error in most databases.
-- ERROR: no ordering to rank by
ROW_NUMBER() OVER (PARTITION BY dept_id)
-- correct
ROW_NUMBER() OVER (PARTITION BY dept_id ORDER BY salary DESC)
(Aggregate windows like SUM OVER () don't need ORDER BY, but ranking functions
do.)
Rule of thumb: every ranking function needs ORDER BY in OVER to define the
ranking criterion.
The classic pattern: number rows within each partition with a ranking function in a
CTE/subquery, then filter on that number in the outer query (window functions
can't go in WHERE).
-- top 3 highest-paid employees per department
WITH ranked AS (
SELECT name, dept_id, salary,
DENSE_RANK() OVER (PARTITION BY dept_id
ORDER BY salary DESC) AS rnk
FROM employees
)
SELECT name, dept_id, salary
FROM ranked
WHERE rnk <= 3;
Use ROW_NUMBER for "exactly N rows" or RANK/DENSE_RANK to include ties at
the cutoff.
Rule of thumb: rank in a CTE, filter rnk <= N — pick ROW_NUMBER for an exact N,
DENSE_RANK to keep ties.
Rank the rows descending, then filter for the Nth rank. Use DENSE_RANK when you
want the Nth distinct value (so duplicate values count once).
-- the 3rd highest distinct salary
WITH ranked AS (
SELECT salary, DENSE_RANK() OVER (ORDER BY salary DESC) AS rnk
FROM employees
)
SELECT DISTINCT salary FROM ranked WHERE rnk = 3;
With ROW_NUMBER you'd get the 3rd row, not the 3rd distinct value; with RANK
you'd risk gaps. DENSE_RANK is the safe choice for "Nth highest distinct."
Rule of thumb: for the Nth highest distinct value, DENSE_RANK() ... = N.
Partition by the columns that define a duplicate, number the rows, and keep only
rn = 1 (deleting or excluding the rest).
-- keep the most recent record per email, drop older duplicates
WITH ranked AS (
SELECT id, ROW_NUMBER() OVER (PARTITION BY email
ORDER BY created_at DESC) AS rn
FROM users
)
DELETE FROM users
WHERE id IN (SELECT id FROM ranked WHERE rn > 1);
The ORDER BY decides which duplicate is the "keeper" (e.g. newest).
Rule of thumb: ROW_NUMBER() partitioned by the dup key, keep rn = 1, remove
rn > 1 — the standard dedup pattern.
Number the ordered rows, then select the slice for a page. This was the classic
pagination method before OFFSET/FETCH and is still used in SQL Server pre-2012.
-- page 2, 10 rows per page (rows 11–20)
WITH ordered AS (
SELECT *, ROW_NUMBER() OVER (ORDER BY created_at DESC) AS rn
FROM products
)
SELECT * FROM ordered WHERE rn BETWEEN 11 AND 20;
Modern engines often prefer ORDER BY ... LIMIT 10 OFFSET 10, but ROW_NUMBER
pagination works everywhere and pairs well with deterministic ordering. Note both
get slow at deep offsets — keyset pagination scales better.
Rule of thumb: ROW_NUMBER + BETWEEN slices pages; for large datasets prefer
keyset pagination over deep offsets.
ROW_NUMBER() always produces unique numbers, but when the ORDER BY values tie,
the assignment among tied rows is arbitrary and can change between runs. Add a
tie-breaker column (ideally a unique key) to the ORDER BY to make it
deterministic.
ROW_NUMBER() OVER (ORDER BY salary DESC, id ASC) -- id breaks ties stably
Without the tie-break, paginating or deduplicating can return inconsistent results across executions.
Rule of thumb: append a unique column to the window ORDER BY so tied rows get a
stable, repeatable order.
PARTITION BY makes the rank restart at 1 for each group. Without it, ranking
runs across the entire result set as one partition.
-- rank salaries WITHIN each department (resets per dept)
SELECT name, dept_id, salary,
RANK() OVER (PARTITION BY dept_id ORDER BY salary DESC) AS dept_rank
FROM employees;
So an employee can be rank 1 in their department even if they're not the highest-paid company-wide.
Rule of thumb: add PARTITION BY to rank within groups (rank resets per group);
omit it to rank globally.
Window functions — including ranking functions — are evaluated after WHERE,
so the rank doesn't exist yet when WHERE runs. Referencing it there is an error.
-- ILLEGAL
SELECT name FROM employees
WHERE RANK() OVER (ORDER BY salary DESC) <= 5;
-- LEGAL: compute in a CTE, filter outside
WITH r AS (
SELECT name, RANK() OVER (ORDER BY salary DESC) AS rnk FROM employees
)
SELECT name FROM r WHERE rnk <= 5;
Rule of thumb: ranks are computed after filtering — always wrap them in a CTE or subquery before filtering.
When the row count isn't divisible by n, NTILE makes the first buckets one
row larger than the later ones. For example, 10 rows into NTILE(3) gives
buckets of sizes 4, 3, 3.
-- 10 rows, NTILE(3): bucket 1 has 4 rows, buckets 2 and 3 have 3 each
SELECT val, NTILE(3) OVER (ORDER BY val) AS bucket FROM nums;
This guarantees buckets differ in size by at most one, with the extras front-loaded.
Rule of thumb: NTILE front-loads the remainder — earlier buckets get the extra
rows when the count doesn't divide evenly.
Pick based on how you want ties handled:
- Need exactly one row per position (pagination, dedup, "the single latest")
→
ROW_NUMBER(). - Ties should share a rank with gaps (standings where 2 golds means no silver)
→
RANK(). - Ties should share a rank without gaps (Nth distinct value, dense tiers)
→
DENSE_RANK().
ROW_NUMBER() -- 1,2,3,4 (unique)
RANK() -- 1,1,3,4 (gap after tie)
DENSE_RANK() -- 1,1,2,3 (no gap)
Rule of thumb: unique → ROW_NUMBER; ties-with-gaps → RANK; ties-no-gaps →
DENSE_RANK.
A median is the value at the 50th percentile. You can approximate it with
CUME_DIST/PERCENT_RANK, but most databases offer the dedicated ordered-set
aggregate PERCENTILE_CONT/PERCENTILE_DISC (a WITHIN GROUP function, related
to window analytics).
-- exact continuous median per department (PostgreSQL / Oracle / SQL Server)
SELECT dept_id,
PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY salary) AS median_salary
FROM employees
GROUP BY dept_id;
PERCENTILE_CONT interpolates between rows; PERCENTILE_DISC returns an actual
data value.
Rule of thumb: for a median, prefer PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY ...) over hand-rolling it from rank functions.
More Window Functions interview questions
More ways to practice
The self-quiz is live. Get notified when mock interviews and new question packs drop.