The most common aggregation confusion in SQL interviews
Interviewers love asking: "Show me each employee's salary alongside the average salary for
their department." Most candidates reach for GROUP BY and immediately hit a wall — you
can't do it with GROUP BY alone. The answer is a window function, and understanding
exactly why reveals the core difference between the two approaches.
The fundamental difference in one sentence
GROUP BY collapses rows — output has fewer rows than input. A window function
annotates rows — output has the same number of rows as input, with an extra computed
column.
Quick-reference comparison
GROUP BY aggregate | Window function (OVER) | |
|---|---|---|
| Output rows | One per group | Same count as input |
| Detail columns | Only grouping columns | All original columns kept |
| Can compare row to group? | No — detail is gone | Yes — both values on each row |
Requires OVER clause? | No | Yes |
Requires GROUP BY? | Yes | No |
Can use HAVING? | Yes — filter after aggregation | No — use WHERE in a subquery/CTE |
| Ranking support? | No | Yes — ROW_NUMBER, RANK, DENSE_RANK |
| Running totals? | No | Yes — with ORDER BY in OVER |
| Moving averages? | No | Yes — with frame clause |
| Best for | Summary reports, aggregated output | Row-level analysis with group context |
GROUP BY — collapsing rows into summaries
GROUP BY is the right tool when the output should be one row per group — you want
the summary, not the detail.
-- One row per department — detail rows are gone
SELECT dept_id,
COUNT(*) AS headcount,
AVG(salary) AS avg_salary,
MAX(salary) AS max_salary
FROM employees
GROUP BY dept_id;
You can only SELECT columns that are either in GROUP BY or wrapped in an aggregate
function. Trying to select name here raises an error — the engine doesn't know which
employee's name to show per group.
HAVING filters on the aggregated result (after grouping):
-- Only show departments with more than 10 employees
SELECT dept_id, COUNT(*) AS headcount
FROM employees
GROUP BY dept_id
HAVING COUNT(*) > 10;
Deep dive: Aggregation & GROUP BY interview questions
Window functions — keeping the detail
A window function adds a computed value alongside each row. PARTITION BY divides the
window into groups (like GROUP BY), but ORDER BY inside OVER doesn't remove rows —
it just establishes row order for running calculations.
-- Every employee row is kept — salary AND the department average side by side
SELECT name,
dept_id,
salary,
AVG(salary) OVER (PARTITION BY dept_id) AS dept_avg,
salary - AVG(salary) OVER (PARTITION BY dept_id) AS diff_from_avg
FROM employees;
This query is impossible with plain GROUP BY. The department average needs grouping,
but the individual salary and name need the detail rows. Window functions let you have
both at once.
Queries only window functions can answer
1 — Row vs group comparison
-- Who earns above their department average?
SELECT name, dept_id, salary
FROM (
SELECT name, dept_id, salary,
AVG(salary) OVER (PARTITION BY dept_id) AS dept_avg
FROM employees
) t
WHERE salary > dept_avg;
2 — Ranking within a group
-- Top earner per department (rank = 1)
SELECT name, dept_id, salary, rank
FROM (
SELECT name, dept_id, salary,
RANK() OVER (PARTITION BY dept_id ORDER BY salary DESC) AS rank
FROM employees
) t
WHERE rank = 1;
GROUP BY can find the max salary per department, but it can't tell you who earned it
without a second join. RANK() does it in one pass.
Deep dive: Window Function Basics interview questions
3 — Running totals
-- Cumulative salary spend, ordered by hire date
SELECT name, hire_date, salary,
SUM(salary) OVER (ORDER BY hire_date) AS running_total
FROM employees;
4 — Row-over-row comparisons (LAG / LEAD)
-- Each employee's salary compared to the next hire in their department
SELECT name, hire_date, salary,
LAG(salary) OVER (PARTITION BY dept_id ORDER BY hire_date) AS prev_hire_salary,
LEAD(salary) OVER (PARTITION BY dept_id ORDER BY hire_date) AS next_hire_salary
FROM employees;
Can you combine both in one query?
Yes. GROUP BY and window functions can coexist — the window function sees the grouped
rows:
-- Aggregate first, then rank the summaries
SELECT dept_id,
SUM(salary) AS total_spend,
RANK() OVER (ORDER BY SUM(salary) DESC) AS spend_rank
FROM employees
GROUP BY dept_id;
Window functions are evaluated after GROUP BY and HAVING, so SUM(salary) here
operates on the already-grouped rows.
The decision rule
Do I want fewer rows than I started with — one row per group?
└── Yes → GROUP BY
Do I want all rows, but with a per-group aggregate attached?
└── Yes → window function with PARTITION BY
Do I need ranking, running totals, or row-over-row comparisons?
└── Yes → window function (RANK/ROW_NUMBER, ORDER BY inside OVER, LAG/LEAD)
A useful mental model: GROUP BY answers "what is the summary of each group?" while
window functions answer "how does each row relate to its group?"
Recap
GROUP BY is the right tool when you want a summary with one row per group —
HAVING filters those groups, and detail columns are unavailable. A window function
with OVER keeps every row and attaches a group-level aggregate alongside it, enabling
comparisons between a row and its group, rankings, running totals, and moving averages —
queries that are impossible or require multiple self-joins without them. When you need both
(rank the summaries themselves), GROUP BY runs first and the window function operates
on the grouped result.