What set operations do
Joins combine columns from multiple tables into wider rows. Set operations combine
the rows of multiple queries into a single result set — like stacking spreadsheets
vertically. The three operators are UNION, INTERSECT, and EXCEPT (called
MINUS in Oracle).
The rules for all three: the queries must return the same number of columns, and
corresponding columns must have compatible types. Column names come from the
first query. ORDER BY goes at the very end and applies to the combined result.
UNION — combining result sets
UNION stacks two query results and removes duplicates (implicit DISTINCT).
UNION ALL stacks them without deduplication and is faster.
-- All email addresses in the system (customers + staff), deduplicated
SELECT email, 'customer' AS source FROM customers
UNION
SELECT email, 'staff' AS source FROM staff_accounts;
-- Transaction history: credit card charges and bank transfers in one timeline
SELECT
id,
amount,
created_at,
'card' AS payment_type
FROM card_transactions
WHERE customer_id = 4821
UNION ALL
SELECT
id,
amount,
created_at,
'bank'
FROM bank_transfers
WHERE customer_id = 4821
ORDER BY created_at DESC;
Use UNION ALL by default — deduplication requires sorting or hashing the
entire result, which is expensive. Only use UNION when you genuinely need
duplicates removed.
INTERSECT — rows that exist in both queries
INTERSECT returns only rows that appear in both result sets.
-- Customers who have both placed an order AND submitted a support ticket
-- (i.e., paying customers with problems — a useful cohort for churn analysis)
SELECT customer_id FROM orders
INTERSECT
SELECT customer_id FROM support_tickets WHERE created_at >= '2026-01-01';
INTERSECT is equivalent to a semi-join with EXISTS, but more readable when
you are comparing two already-aggregated sets. Note: MySQL added INTERSECT
support only in version 8.0.31.
EXCEPT — rows in one set but not the other
EXCEPT (Oracle: MINUS) returns rows from the first query that do not appear
in the second. Order matters: A EXCEPT B ≠ B EXCEPT A.
-- Products that exist in the catalogue but have never been ordered
SELECT id FROM products
EXCEPT
SELECT DISTINCT product_id FROM order_items;
-- Users registered before 2026 who have NOT logged in this year
SELECT user_id FROM users WHERE created_at < '2026-01-01'
EXCEPT
SELECT DISTINCT user_id FROM login_events WHERE occurred_at >= '2026-01-01';
The alternative with NOT EXISTS is identical in result and often faster because
it can use an index on the subquery:
SELECT id FROM products p
WHERE NOT EXISTS (
SELECT 1 FROM order_items i WHERE i.product_id = p.id
);
Chaining multiple set operations
You can chain several set operations. Without parentheses they are evaluated left to right; use parentheses to control precedence.
-- All premium or trial users, minus those who are currently suspended
(
SELECT user_id FROM subscriptions WHERE plan = 'premium'
UNION
SELECT user_id FROM subscriptions WHERE plan = 'trial'
)
EXCEPT
SELECT user_id FROM user_flags WHERE flag = 'suspended';
Using set operations for data quality checks
Set operations are excellent for comparing what should be in a table versus what is in it:
-- Find order IDs that exist in the payments table but not in the orders table
-- (orphaned payments — a data integrity problem)
SELECT order_id FROM payments
EXCEPT
SELECT id FROM orders;
-- Find products in the warehouse inventory not in the product catalogue
SELECT sku FROM warehouse_stock
EXCEPT
SELECT sku FROM products;
UNION ALL for combining partitioned data
When data is split across multiple tables (e.g., by year), UNION ALL assembles
them without a schema change:
-- Query across three years of archived order data
SELECT id, customer_id, total_amount, created_at FROM orders_2024
UNION ALL
SELECT id, customer_id, total_amount, created_at FROM orders_2025
UNION ALL
SELECT id, customer_id, total_amount, created_at FROM orders_2026
ORDER BY created_at DESC
LIMIT 100;
Modern databases handle this more elegantly with table partitioning, but UNION ALL views are a pragmatic workaround when partitioning is not available.
Recap
UNION ALL is the workhorse — use it whenever you need to stack rows from
multiple queries and do not need deduplication. UNION (with dedup) is for
when duplicates are genuinely unwanted and the cost of sorting is acceptable.
INTERSECT finds the common set; EXCEPT finds the difference. All three
require matching column counts and compatible types. For set membership tests
(NOT IN / NOT EXISTS), the join-based form is usually faster — but EXCEPT
is more readable for comparing two large, pre-computed result sets.