In relational database design, the process of organizing data to minimize redundancy is called normalization. It usually involves dividing a database into 2 or more tables and defining relationships between tables. Objective is to isolate data so that additions, deletions, and modifications can be made in just one table.
Benefits:-
- Eliminate data redundancy
- Improve performance
- Query optimization
- Faster update due to less number of columns in one table
- Index improvement
There are multiple forms of Normalizations in a database.
- First Normal Form (1NF)
- Second Normal Form (2NF)
- Third Normal Form (3NF)
- Fourth Normal Form (4NF -BCNF NF)
First normal form (1NF)
Eliminate duplicative columns from the same table.
· Create separate tables for each group of related data and identify each row with a unique column or set of columns.
· Remove repetitive groups
· Create Primary Key
Name | State | Country | Phone1 | Phone2 | Phone3 |
John | 101 | 1 | 488-511-3258 | 781-896-9897 | 425-983-9812 |
Bob | 102 | 1 | 861-856-6987 | | |
Rob | 201 | 2 | 587-963-8425 | 425-698-9684 | |
PK | | | [ Phone Nos ] | | |
? | | | | ? | |
ID | Name | State | Country | Phone | |
1 | John | 101 | 1 | 488-511-3258 | |
2 | John | 101 | 1 | 781-896-9897 | |
3 | John | 101 | 1 | 425-983-9812 | |
4 | Bob | 102 | 1 | 861-856-6987 | |
5 | Rob | 201 | 2 | 587-963-8425 | |
6 | Rob | 201 | 2 | 425-698-9684 | |
Second Normal Form (2NF)
Second normal form (2NF) further addresses the concept of removing duplicative data:
- · Meet all the requirements of the first normal form.
- · Remove subsets of data that apply to multiple rows of a table and place them in separate tables.
- · Create relationships between these new tables and their predecessors through the use of foreign keys.
- Remove columns which create duplicate data in a table and related a new table with Primary Key – Foreign Key relationship.
Third Normal Form (3NF)
Third normal form (3NF) goes one large step further:
- · Meet all the requirements of the second normal form.
- · Remove columns that are not dependent upon the primary key.
Country can be derived from State also… so removing country
ID | Name | State | Country |
1 | John | 101 | 1 |
2 | Bob | 102 | 1 |
3 | Rob | 201 | 2 |
Fourth Normal Form (4NF)
Finally, fourth normal form (4NF) has one additional requirement:
- Meet all the requirements of the third normal form.
- A relation is in 4NF if it has no multi-valued dependencies.
If PK is composed of multiple columns then all non-key attributes should be derived from FULL PK only. If some non-key attribute can be derived from partial PK then remove it
The 4NF also known as BCNF NF
TeacherID | StudentID | SubjectID | StudentName |
101 | 1001 | 1 | John |
101 | 1002 | 2 | Rob |
201 | 1002 | 3 | Bob |
201 | 1001 | 2 | Rob |
| | | |
| | | |
TeacherID | StudentID | SubjectID | StudentName |
101 | 1001 | 1 | X |
101 | 1002 | 2 | X |
201 | 1001 | 3 | X |
201 | 1002 | 2 | X |