Why data types matter
The data type of a column controls what values it stores, how much disk space
it uses, how it is indexed, and what arithmetic you can do on it. Choosing the
wrong type — storing a price as VARCHAR, a user ID as FLOAT, or a timestamp
as a string — leads to silent bugs, poor query performance, and joins that never
match.
Integer types
Use the smallest integer type that fits your expected maximum value. Storage differences matter at scale (millions of rows).
| Type | Storage | Range |
|---|---|---|
SMALLINT | 2 bytes | −32 768 to 32 767 |
INTEGER / INT | 4 bytes | −2.1 B to 2.1 B |
BIGINT | 8 bytes | −9.2 × 10¹⁸ to 9.2 × 10¹⁸ |
CREATE TABLE orders (
id BIGINT PRIMARY KEY GENERATED ALWAYS AS IDENTITY,
customer_id INT NOT NULL,
item_count SMALLINT NOT NULL DEFAULT 0,
status_code SMALLINT NOT NULL DEFAULT 1
);
Use BIGINT for primary keys on high-volume tables — INT runs out at 2.1
billion rows and a busy table can reach that. Never use FLOAT or DOUBLE for
IDs — floating-point arithmetic introduces rounding errors that can cause missed
joins.
NUMERIC / DECIMAL — for money
Floating-point types (FLOAT, DOUBLE, REAL) store approximate values. They
are wrong for money. 0.1 + 0.2 in floating-point is 0.30000000000000004.
Use NUMERIC(precision, scale) or DECIMAL for any monetary column.
CREATE TABLE products (
id INT PRIMARY KEY,
product_name VARCHAR(200) NOT NULL,
unit_price NUMERIC(10, 2) NOT NULL, -- up to 99,999,999.99
cost_price NUMERIC(10, 2) NOT NULL,
tax_rate NUMERIC(5, 4) NOT NULL -- e.g., 0.2000 for 20%
);
NUMERIC(10, 2) stores 10 significant digits with 2 after the decimal point.
Arithmetic on NUMERIC is exact — no rounding surprises.
TEXT vs VARCHAR
VARCHAR(n)— variable-length string capped atncharacters. Exceeding the cap raises an error (Postgres) or silently truncates (MySQL).TEXT— unlimited variable-length string (Postgres, MySQL). In Postgres,TEXTandVARCHARhave identical storage and performance.CHAR(n)— fixed-length, padded with spaces. Rarely useful.
CREATE TABLE users (
id BIGINT PRIMARY KEY GENERATED ALWAYS AS IDENTITY,
email VARCHAR(254) NOT NULL UNIQUE, -- RFC 5321 max
username VARCHAR(50) NOT NULL UNIQUE,
display_name VARCHAR(100),
bio TEXT -- no sensible length cap
);
In Postgres, prefer TEXT with a CHECK constraint over VARCHAR(n) — you
can change the constraint without rewriting the column. In MySQL, VARCHAR(n)
is required for columns you intend to index (large TEXT columns need a prefix
index).
DATE, TIME, and TIMESTAMP
CREATE TABLE orders (
id BIGINT PRIMARY KEY GENERATED ALWAYS AS IDENTITY,
order_date DATE NOT NULL, -- '2026-06-20', no time component
scheduled_at TIME NOT NULL, -- '14:30:00', no date
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(), -- date + time + timezone
updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);
Always use TIMESTAMPTZ (timestamp with time zone) over TIMESTAMP (without).
TIMESTAMPTZ stores UTC internally and converts to the session's time zone on
display — so two users in different time zones see the correct local time.
TIMESTAMP without time zone stores the value literally, with no zone
information — it becomes ambiguous as soon as users or servers span time zones.
BOOLEAN
CREATE TABLE feature_flags (
flag_name VARCHAR(100) PRIMARY KEY,
is_enabled BOOLEAN NOT NULL DEFAULT FALSE,
updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);
-- MySQL: TINYINT(1) is the conventional boolean
-- (MySQL's BOOLEAN type is an alias for TINYINT(1))
JSON / JSONB
Use JSON columns for truly variable or schema-less data — configuration blobs, third-party API payloads, or sparse attributes. Do not use JSON as a workaround for poor relational design.
CREATE TABLE product_attributes (
product_id INT NOT NULL REFERENCES products(id),
attributes JSONB NOT NULL DEFAULT '{}' -- Postgres: binary JSON, indexed
);
-- Query a JSON field
SELECT product_id, attributes->>'colour' AS colour
FROM product_attributes
WHERE attributes @> '{"material": "leather"}';
-- Index a JSON field
CREATE INDEX ON product_attributes USING GIN (attributes);
Use JSONB (binary JSON) in Postgres rather than JSON — it stores the parsed
form, supports indexing, and allows operators like @> (contains).
Recap
Match the type to the data: BIGINT for IDs, NUMERIC(p,s) for money, TEXT
or VARCHAR for strings, TIMESTAMPTZ for timestamps, BOOLEAN for flags.
The wrong type — floats for money, strings for dates, strings for IDs — leads
to rounding errors, missed indexes, and broken comparisons. Get the types right
at schema creation time; changing them later requires an expensive table rewrite.