SQL Ranking Functions — ROW_NUMBER, RANK, DENSE

Why ranking functions matter

Ranking functions assign a position number to each row within a window partition. They are the practical solution for three everyday problems:

Top-N per group — the best-selling product per category, the latest order per customer, the highest-scoring ticket per agent.
Deduplication — keeping one row per entity when a table has duplicates.
Bucketing — dividing a result set into percentiles or quartiles.

ROW_NUMBER — unique sequential number

ROW_NUMBER() gives each row a unique integer starting from 1 within its partition. Ties get different numbers (arbitrarily assigned by the database).

-- Assign a row number within each customer's orders (newest first)
SELECT
    id,
    customer_id,
    total_amount,
    created_at,
    ROW_NUMBER() OVER (PARTITION BY customer_id ORDER BY created_at DESC) AS rn
FROM orders;

Pattern: get the most recent row per group (the classic N=1 use case):

-- Latest order per customer
WITH ranked AS (
    SELECT
        *,
        ROW_NUMBER() OVER (PARTITION BY customer_id ORDER BY created_at DESC) AS rn
    FROM orders
)
SELECT * FROM ranked WHERE rn = 1;

This works even when MAX(created_at) is ambiguous (multiple orders at the same timestamp) — ROW_NUMBER always assigns a unique 1 to exactly one row.

RANK vs DENSE_RANK — handling ties

Both assign the same number to tied rows, but differ in what comes next.

RANK() — skips position numbers after a tie: 1, 2, 2, 4 (gap)
DENSE_RANK() — no gaps: 1, 2, 2, 3

-- Sales leaderboard for the current month
SELECT
    employee_id,
    total_sales,
    RANK()       OVER (ORDER BY total_sales DESC) AS rank,
    DENSE_RANK() OVER (ORDER BY total_sales DESC) AS dense_rank
FROM monthly_sales
WHERE month = '2026-06-01';

-- rank:        1, 2, 2, 4, 5
-- dense_rank:  1, 2, 2, 3, 4

Use DENSE_RANK when "rank 3" should mean "the third distinct performance level". Use RANK when gaps are acceptable or meaningful (e.g., a sports league where tied teams share a position and the next is truly skipped).

Top-N per group with RANK

-- Top 3 products by revenue in each category
WITH product_revenue AS (
    SELECT
        p.id,
        p.product_name,
        p.category,
        SUM(i.unit_price * i.quantity) AS revenue
    FROM order_items i
    JOIN products    p ON p.id = i.product_id
    GROUP BY p.id, p.product_name, p.category
),
ranked AS (
    SELECT *,
           RANK() OVER (PARTITION BY category ORDER BY revenue DESC) AS rnk
    FROM product_revenue
)
SELECT category, product_name, revenue, rnk
FROM ranked
WHERE rnk <= 3
ORDER BY category, rnk;

NTILE — dividing into buckets

NTILE(n) divides rows into n roughly equal buckets and labels each row with its bucket number (1 to n). Useful for quartiles, deciles, and percentile bands.

-- Divide customers into 4 revenue quartiles
WITH customer_revenue AS (
    SELECT customer_id, SUM(total_amount) AS lifetime_value
    FROM orders
    GROUP BY customer_id
)
SELECT
    customer_id,
    lifetime_value,
    NTILE(4) OVER (ORDER BY lifetime_value DESC) AS quartile
FROM customer_revenue;
-- quartile 1 = top 25% spenders (highest LTV)
-- quartile 4 = bottom 25% spenders

PERCENT_RANK and CUME_DIST

PERCENT_RANK() returns a value between 0 and 1 representing relative rank. CUME_DIST() returns the fraction of rows with a value ≤ the current row.

-- What percentile is each order in, by total amount?
SELECT
    id,
    total_amount,
    ROUND(PERCENT_RANK() OVER (ORDER BY total_amount) * 100, 1) AS percentile,
    ROUND(CUME_DIST()    OVER (ORDER BY total_amount) * 100, 1) AS cumulative_pct
FROM orders;

Deduplication with ROW_NUMBER

Ranking functions are the cleanest way to remove duplicates in a DELETE or identify them in a report:

-- Find duplicate customer rows (same email, keep the oldest id)
WITH dupes AS (
    SELECT id,
           ROW_NUMBER() OVER (PARTITION BY email ORDER BY id ASC) AS rn
    FROM customers
)
DELETE FROM customers WHERE id IN (
    SELECT id FROM dupes WHERE rn > 1
);

Recap

ROW_NUMBER gives every row a unique number — ideal for top-1-per-group and deduplication. RANK and DENSE_RANK handle ties: RANK leaves gaps, DENSE_RANK does not. NTILE(n) buckets rows into n equal groups. All four require an ORDER BY in the OVER clause to be meaningful. For top-N per group, wrap the ranking in a CTE and filter WHERE rn <= N in the outer query.

SQL Ranking Functions — ROW_NUMBER, RANK, DENSE_RANK, and NTILE