Data Normalization
In this section you will learn the concepts of data normalization. The most important
thing in database designing is to make sure that the data get
properly distributed among the tables. Simply we can say that
the designing of the table in proper manner is called Normalization.
Normalization is a process that is used in relational database design to
organize the data for minimizing the duplication. In normalization, we divide
the database in two or more tables and create a relationship between them.
After isolating the data we perform some addition, deletion or modification
on the fields of a table then we propagate and remove the duplicate data from
the related tables.
The main goals of normalization
process are:
- To eliminate the redundancy of data
- To make sure that the data
dependencies (relationship) make sense.
By these two goals we reduce the space occupied by the duplicate data in the database
tables and ensure that the data is logically stored there.
Some of the positive points of the
data normalization in database is as under :
- Data integrity
- To make optimized queries on the normalized tables and produce fast, efficient results.
- To make faster index and also make a perfect sorting.
- To increase the performance of the database.
First normal Form (1NF)
The First Normal Form requires the atomic values in each column. Atomic means
the set of values are not available with in the column. In other words, in
First Normal Form table must have at least one candidate key and make sure
that the
table don?t have any duplicate record. In First Normal Form repeating groups
are not allowed, that is no attributes which occur a different number of times
on different records.
Second normal Form (2NF)
The Second Normal Form can
be achieved only when a table is in the 1NF. We can make the 2NF by
eliminating the partial dependencies. As the First Normal Form deals with
only the atomicity of data, but Second Normal Form deals with the relationships of
tables like composite keys and non-key columns. In
Second Normal Form subset of data is removed and is organized in separate
tables. This process is applied to multiple rows of
a table till the duplicity get reduced.
Third Normal Form (3NF)
The Third Normal Form can be achieved only when a table is in the Second Normal Form. We can make the 3NF by
eliminating all transitive dependencies lying among the fields of a record. In Third Normal Form, all columns
should depend on the primary key only i.e. remove the column, which is not
dependent upon the primary key.
Boyce-Codd Normal Form (BCNF)
In the normalization Boyce - Codd Normal Form needs a table
to meet the Third Normal Form. In
Boyce - Codd Normal Form every non-trivial functional dependency must be a
dependent on a superkey.
Fourth Normal Form (4NF)
The Fourth Normal Form can be achieved when a table meets the
Boyce-Codd Normal Form. Fourth Normal Form confirms that the independent multivalued facts are efficiently and
correctly represented in database design.
Fifth Normal Form (5NF)
The Fifth Normal Form is achieved only when a table meets the
Fourth Normal Form. In this normalization it is ensured
that all non-trivial join dependencies get eliminated.