Choosing the right type affects storage size, query performance, and
data integrity. A correct type rejects bad data at insert time (the
database enforces the constraint for free) and lets the engine use internal
optimizations (integer comparisons are faster than string comparisons; a
DATE column can use date arithmetic natively).
-- BAD: storing a price as VARCHAR lets "abc" in and breaks SUM()
price VARCHAR(20)
-- GOOD: exact numeric, 2 decimal places, always positive
price NUMERIC(10, 2) NOT NULL CHECK (price >= 0)
Rule of thumb: choose the narrowest type that correctly represents every valid value — it saves space, speeds up indexes, and keeps invalid data out automatically.
All major databases offer a family of fixed-size integers:
| Type | Bytes | Range (~) | Use when… |
|---|---|---|---|
SMALLINT |
2 | ±32 k | small lookup codes, status flags |
INT / INTEGER |
4 | ±2.1 B | most surrogate keys, counters |
BIGINT |
8 | ±9.2 × 10¹⁸ | high-volume tables, distributed IDs |
-- Postgres auto-increment shorthand
id SERIAL PRIMARY KEY -- alias for INT + sequence
id BIGSERIAL PRIMARY KEY -- alias for BIGINT + sequence
-- Standard SQL (Postgres 10+, MySQL 8, SQL Server)
id INT GENERATED ALWAYS AS IDENTITY PRIMARY KEY
Rule of thumb: default to INT for PKs; switch to BIGINT if you
expect more than ~1 billion rows or use globally distributed IDs (snowflakes,
UUIDs stored as numbers).
NUMERIC(p, s) / DECIMAL(p, s) store exact values using binary-coded
decimal arithmetic. They never introduce rounding errors and are required for
money, tax rates, or any value where "0.10 + 0.20 = 0.30" must hold exactly.
FLOAT / REAL / DOUBLE PRECISION are IEEE-754 floating-point types.
They are faster and more compact but introduce tiny rounding errors, making
them unsuitable for financial calculations.
-- exact: total will always equal the sum of its parts
price NUMERIC(12, 4)
-- approximate: fine for sensor readings, ML feature vectors
latitude DOUBLE PRECISION
Rule of thumb: use NUMERIC for money and anything that will be summed
or compared for equality; use FLOAT/DOUBLE for scientific measurements
where small rounding is acceptable.
CHAR(n)— fixed-length, alwaysncharacters, right-padded with spaces. Trailing spaces are ignored in comparisons in most databases. Useful only for truly fixed-width codes (country codes, ISO currency codes).VARCHAR(n)— variable-length up toncharacters. The limit is a declaration; values shorter thannuse less storage.TEXT— unlimited-length character string (no declared max). Postgres treatsTEXTandVARCHARidentically at the storage level. MySQL and SQL Server have different performance trade-offs for very large TEXT values.
country_code CHAR(2) -- 'US', 'GB' — always exactly 2
email VARCHAR(255) -- typical cap; protects against runaway input
body TEXT -- blog post, unlimited
Rule of thumb: use CHAR only for fixed-width codes, VARCHAR(n) for
fields with a meaningful business-length cap (email, username), and TEXT
for free-form content.
| Type | Stores | Timezone-aware? |
|---|---|---|
DATE |
year-month-day | No |
TIME |
hour-min-sec | No (TIMETZ in Postgres) |
TIMESTAMP |
date + time | No |
TIMESTAMPTZ (Postgres) / DATETIMEOFFSET (SQL Server) |
date + time | Yes — stored as UTC, displayed in session tz |
-- Postgres
created_at TIMESTAMPTZ NOT NULL DEFAULT now()
birth_date DATE
-- MySQL (no TIMESTAMPTZ; DATETIME is naive, TIMESTAMP is UTC-stored)
created_at DATETIME(6) -- 6-digit microsecond precision
Rule of thumb: always store timestamps with time-zone awareness
(TIMESTAMPTZ in Postgres, DATETIMEOFFSET in SQL Server). Store DATE
alone only when the time component is meaningless (birthdays, holidays).
Postgres has a native BOOLEAN type that accepts TRUE/FALSE (and
aliases like 'yes'/'no', 1/0).
MySQL lacks a native boolean — BOOL/BOOLEAN is an alias for
TINYINT(1), where 0 = false and any non-zero = true.
SQL Server uses BIT (0 or 1, no native TRUE/FALSE literal).
-- Postgres
is_active BOOLEAN NOT NULL DEFAULT TRUE
-- MySQL
is_active TINYINT(1) NOT NULL DEFAULT 1
-- SQL Server
is_active BIT NOT NULL DEFAULT 1
Rule of thumb: in Postgres use BOOLEAN; in MySQL use TINYINT(1);
in SQL Server use BIT. In all cases, enforce NOT NULL DEFAULT to keep
the flag unambiguous.
NULL means unknown / missing / not applicable — it is not zero, not
empty string, not FALSE. SQL uses three-valued logic: a comparison
involving NULL evaluates to UNKNOWN, which acts like FALSE in WHERE
clauses (the row is excluded).
-- These all return UNKNOWN, not TRUE or FALSE:
NULL = NULL -- UNKNOWN
NULL <> 1 -- UNKNOWN
NULL IS NULL -- TRUE ← the correct test
NULL IS NOT NULL -- FALSE
-- Practical pitfall: rows where discount IS NULL are excluded
SELECT * FROM orders WHERE discount <> 0;
-- Fix:
SELECT * FROM orders WHERE discount <> 0 OR discount IS NULL;
Rule of thumb: never compare with = NULL; always use IS NULL /
IS NOT NULL. Use COALESCE(col, default) to substitute a default before
comparison.
A UUID (Universally Unique Identifier) is a 128-bit value, usually
written as xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx. It is collision-resistant
without a central coordinator, making it ideal for distributed inserts
and exposing IDs in APIs (not predictable like an integer sequence).
-- Postgres: native UUID type (stored as 16 bytes)
id UUID PRIMARY KEY DEFAULT gen_random_uuid()
-- MySQL: no native UUID; store as CHAR(36) or BINARY(16)
id CHAR(36) PRIMARY KEY DEFAULT (UUID())
Downsides: random UUIDs (v4) cause index fragmentation because inserts scatter across the B-tree. UUIDv7 (time-ordered) mitigates this.
Rule of thumb: prefer integer PKs for internal tables; use UUIDs when rows are created across multiple nodes or when IDs are exposed externally and must not be guessable.
JSON columns let you persist semi-structured, schema-flexible data
(event payloads, third-party API responses) alongside relational data.
Postgres's JSONB stores a parsed binary representation — indexable with
GIN, fast to query. MySQL 8+ and SQL Server 2016+ also support JSON but
store it as text with helper functions.
-- Postgres JSONB with a GIN index for fast containment queries
CREATE TABLE events (
id BIGSERIAL PRIMARY KEY,
payload JSONB NOT NULL
);
CREATE INDEX idx_events_payload ON events USING GIN (payload);
-- Query a nested key
SELECT * FROM events WHERE payload @> '{"type": "click"}';
Rule of thumb: use JSON columns for truly variable structures that would
otherwise require dozens of nullable columns or a separate EAV table. If
you find yourself querying the same JSON key in every WHERE clause,
extract it into a proper column.
An ENUM restricts a column to a predefined list of string labels,
enforcing a domain constraint at the type level. Postgres stores ENUM as a
user-defined type; MySQL stores it internally as an integer but displays the
label.
-- Postgres
CREATE TYPE order_status AS ENUM ('pending', 'shipped', 'delivered', 'cancelled');
ALTER TABLE orders ADD COLUMN status order_status NOT NULL DEFAULT 'pending';
-- MySQL
status ENUM('pending', 'shipped', 'delivered', 'cancelled') NOT NULL DEFAULT 'pending'
Trade-offs:
- ✅ Compact storage, enforced domain, readable values.
- ❌ Adding a new label requires
ALTER TYPE(Postgres) orALTER TABLE(MySQL), which can lock the table. - ❌ Harder to manage via migrations; lookup tables are more flexible.
Rule of thumb: use ENUM for short, stable lists (< 10 values, rarely
changing); use a lookup/reference table when the list is large or frequently
updated.
p = total significant digits, s = digits to the right of the
decimal point.
| Value | Type | p | s |
|---|---|---|---|
12345.67 |
price | 7 | 2 |
0.000001 |
rate | 7 | 6 |
-9999999.9999 |
balance | 11 | 4 |
-- Store prices up to $9,999,999.99 with cent precision
price NUMERIC(9, 2)
-- Interest rate: 0.0000 – 1.0000 with 4 decimal places
rate NUMERIC(5, 4)
Postgres and SQL Server will raise an error if a value exceeds the declared precision. MySQL silently rounds or truncates.
Rule of thumb: set s to the number of decimal places your business
logic requires; set p to s plus the number of digits you expect to the
left of the decimal, then add a few digits of headroom.
Both auto-generate ascending integer PKs, but they differ in SQL standard compliance and control:
SERIAL(Postgres-specific) creates a sequence and sets aDEFAULT nextval(...)on the column. It is an alias, not a type — the column's real type isINTEGER. Users can stillINSERTan explicit value, bypassing the sequence.GENERATED ALWAYS AS IDENTITY(SQL:2003 standard, Postgres 10+, SQL Server, MySQL 8+) formally declares the column as identity-generated.GENERATED ALWAYSprevents manual inserts;GENERATED BY DEFAULTallows them.
-- Old Postgres style
id SERIAL PRIMARY KEY
-- Standard SQL (preferred)
id INT GENERATED ALWAYS AS IDENTITY PRIMARY KEY
Rule of thumb: prefer GENERATED ALWAYS AS IDENTITY for new schemas —
it is portable and prevents accidental sequence skips from manual inserts.
Use NUMERIC(p, 2) (or a higher scale for currencies with sub-cent
precision). Never use FLOAT — floating-point arithmetic makes 0.10 + 0.20
equal 0.30000000000000004, which causes reconciliation errors.
Some teams store money as a BIGINT of the smallest unit (cents, pence) and
convert to a decimal only in the application layer — this avoids any numeric
type ambiguity.
-- Option 1: NUMERIC column in dollars
amount NUMERIC(12, 2) NOT NULL
-- Option 2: BIGINT in cents (no decimal at all)
amount_cents BIGINT NOT NULL -- 1099 = $10.99
Rule of thumb: use NUMERIC in SQL; if you also need speed at very
high throughput, use BIGINT cents and divide by 100 in the application.
Document which approach you use in the column comment.
Binary columns (BYTEA in Postgres, VARBINARY/BLOB in MySQL/SQL Server)
store raw byte sequences — images, PDFs, encrypted values, hashes.
In practice, storing large blobs directly in the database bloats the table, slows backups, and is rarely optimal compared to object storage (S3, GCS) with only a URL or key in the DB.
-- Appropriate: cryptographic hash (fixed 32 bytes)
password_hash BYTEA NOT NULL -- bcrypt/argon2 output
-- Appropriate: small thumbnail thumbnail (< 64 KB)
avatar_thumb BYTEA
-- Avoid: storing full-resolution images in the DB
-- Store the S3 key instead:
avatar_s3_key VARCHAR(500)
Rule of thumb: store binary data in the database only when it is small (< 1 MB), must be transactionally consistent with other columns, or access patterns demand it. Otherwise, use object storage and keep a reference key.
Implicit casting (coercion) happens when the database silently converts a value from one type to another to satisfy a comparison or expression. This can cause index scans to degrade into full-table scans if the cast prevents the engine from using the index on the original column.
-- Table: users(id INT, phone VARCHAR(20))
-- phone has an index.
-- BAD: implicit cast of the integer literal to VARCHAR
-- Some databases may cast the column instead, killing the index
SELECT * FROM users WHERE phone = 12345;
-- GOOD: explicit cast or string literal
SELECT * FROM users WHERE phone = '12345';
-- or
SELECT * FROM users WHERE phone = CAST(12345 AS VARCHAR);
Rule of thumb: always compare like types. Mismatched types in WHERE
predicates are a common source of unexpected full-table scans — check with
EXPLAIN when in doubt.
Postgres supports ARRAY columns that hold a list of any base type
(INTEGER[], TEXT[], UUID[]). They can save a join for read-heavy
denormalized patterns but lose referential integrity and are harder to
index and update partially.
-- Array column: fast read, no join, but no FK enforcement
CREATE TABLE articles (
id SERIAL PRIMARY KEY,
tags TEXT[] NOT NULL DEFAULT '{}'
);
CREATE INDEX idx_articles_tags ON articles USING GIN (tags);
SELECT * FROM articles WHERE tags @> ARRAY['sql', 'indexing'];
-- Child table: full relational integrity, flexible queries
CREATE TABLE article_tags (
article_id INT REFERENCES articles(id),
tag TEXT NOT NULL,
PRIMARY KEY (article_id, tag)
);
Rule of thumb: use arrays when the list is small, ordered, read far more than written, and does not need referential integrity or per-element queries. Use a child table when you need FK constraints, ordering, or per-row metadata.
More Schema & Data Types interview questions
More ways to practice
The self-quiz is live. Get notified when mock interviews and new question packs drop.