Normalization is a set of rules that help design relational databases efficiently by reducing redundancy and improving data integrity. These rules are organized into "normal forms".
Rule: Every column should contain atomic (indivisible) values, and each row should be unique.
| CustomerId | Name | PhoneNumbers |
|---|---|---|
| 1 | Alice | 123-456, 789-012 |
Here, PhoneNumbers stores multiple values in a single field.
| CustomerId | Name | PhoneNumber |
|---|---|---|
| 1 | Alice | 123-456 |
| 1 | Alice | 789-012 |
Absolutely! Here's a clearer and slightly more real-world rewrite of Second Normal Form (2NF) with a better example and deeper explanation:
A table is in 2NF if:
- It is already in 1NF (no repeating groups, atomic values),
- Every non-key column is fully dependent on the entire primary key — not just part of it.
This rule mostly applies when you have composite primary keys (i.e., two or more columns making up the key).
Imagine we have a table that records which products are in which orders, along with product information:
| OrderId | ProductId | ProductName | UnitPrice | Quantity |
|---|---|---|---|---|
| 1001 | 2001 | Coffee | 5.99 | 2 |
| 1001 | 2002 | Tea | 3.49 | 1 |
Here, the composite key is (OrderId, ProductId) — together they identify the row.
But:
ProductNameandUnitPriceonly depend onProductId, not the full key.- This causes redundancy (if "Coffee" appears in multiple orders, its name and price are repeated).
Split the table into two:
| OrderId | ProductId | Quantity |
|---|---|---|
| 1001 | 2001 | 2 |
| 1001 | 2002 | 1 |
| ProductId | ProductName | UnitPrice |
|---|---|---|
| 2001 | Coffee | 5.99 |
| 2002 | Tea | 3.49 |
Now:
- Each non-key column in
OrderProductdepends only on the full key (OrderId + ProductId). - Product-related info lives in a separate table.
- Prevents duplication of product data across many orders
- Easier to update (change price once)
- Reduces potential inconsistencies
Rule: Must be in 2NF, and all columns must depend only on the key, not on other non-key columns.
| CustomerId | Name | ZipCode | City |
|---|---|---|---|
| 1 | Alice | 12345 | Oslo |
Here, City depends on ZipCode, not on CustomerId.
Split into:
Customer
| CustomerId | Name | ZipCode |
|---|---|---|
| 1 | Alice | 12345 |
ZipCodeInfo
| ZipCode | City |
|---|---|
| 12345 | Oslo |
| Normal Form | Fixes Problem With |
|---|---|
| 1NF | Repeating groups / multivalued fields |
| 2NF | Partial dependencies (on part of a key) |
| 3NF | Transitive dependencies |
Normalization helps build databases that are:
- Easier to maintain
- Less redundant
- More consistent
However, denormalization (intentionally breaking these rules) is sometimes useful for performance.