Database Normalization in MySQL with Examples
In this article, I am going to discuss Database Normalization in MySQL with Examples. Please read our previous article, where we discussed Database Relationships in MySQL. Understanding Database Normalization is a must for a database developer. So, here in this article, we will discuss if we are not following Database Normalization, then what problems we face and then we will discuss how to overcome such problems using Database Normalization. I am going to discuss everything with examples so that it will be easier for you to understand the concepts.
Database Normalization in MySQL
Database Normalization is the most important factor in Database design or Data modeling. Database Normalization is the process to eliminate data redundancies and store the data logically to make data management easier. Database relationships and keys are useful in the Database Normalization process. The Database Normalization was developed by E.F.Codd. In the database normalization process, there are series of rules called Normal Forms.
There are mainly 6 types of Normal Forms: First Normal Form (1NF), second Normal Form (2NF), Third Normal Form (3NF), Fourth Normal Form (4NF), Fifth Normal Form (5NF), and Boyce Codd Normal Form (BCNF). But majorly we used up to third normal form in our database design.
- First Normal Form (1NF): In the first normal form, each column must contain only one value and no table should store repeating groups of related data.
- Second Normal Form (2NF): In the second normal form, first the database must be in the first normal form, it should not store duplicate rows in the same table. And if there are duplicate values in the row, they should be stored in their own separate tables and linked to the table using foreign keys. The ideal way to a database in second normal form is to create one to many relationship tables.
- Third Normal Form (3NF): In the third normal form, the database is already in the third form, if it is in the second normal form and every non-key column is mutually independent. Identify any columns in the table that are interdependent and break those columns into their own separate tables.
- Boyce Codd Normal Form (BCNF): It is the highest form of the third normal form which deals with different types of anomalies that are not handled by the 3NF.
First Normal Form (1NF) in MySQL
In the first normal form, each column must contain only one value. No table should store repeating groups of related data. The easiest way to follow the first normal form is to inspect the database table horizontally.
- Each column of your table should be single-valued.
- The values stored in each column must be of the same type.
- Each column in a table should have a unique name.
- You can store the data in the table in any order.
Example of 1NF
Take a look at the example table.
The old database table ‘users’ store ‘userid’ and ‘fullname’ of the users. According to first normal form rules, each column must contain only one value but in the case of fullname the data value can further be stored as 2 separate columns ‘firstname’ and ‘lastname’. Therefore, we modified the old database and created 2 columns, firstname, and lastname to store the values separately. This change makes the database table in first normal form.
Second Normal Form (2NF) in MySQL
In the second normal form, first, the database must be in the first normal form and there should not be any partial dependency. If there are duplicate values in the row, they should be stored in their own separate tables and linked to the table using foreign keys. The ideal way to database in second normal forms is to create one to many relationships tables. The easiest way to follow second normal forms is to inspect the database table vertically.
- The table should be in first normal form (1NF).
- There should not be any partial dependency.
What is Partial Dependency?
When a non-prime attribute is functionally dependent on a part of a candidate key, it is called partial dependency.
Let’s take a StudentProject table.
In the above table, StudentID and ProjectNo are the primary key attributes. StudentName and ProjectName are the non-primary key attributes that should be functionally dependent on a part of the candidate key. Here, StudentName is determined by StudentID, and ProjectName is determined by ProjectNo which is partially dependent which violates the 2NF and is a bad database design.
How to remove Partial Dependency?
To remove the partial dependency, decompose the table. For the above table, to remove partial dependency remove the ProjectName column from the StudentProject table and add it to a separate table as follows:
Now the tables are not partially dependent.
Example of 2NF:
Take a look at the example old ‘employees’ table. In the ‘jobtitle’ column, the jobtitle values are frequently repeated for the employees, which breaks the second normal form rule. To make the database table in second normal form we created a separate table to store jobtitle values and linked those values using foreign key value ‘jobid’ which refers to the primary key ‘jobid’ in ‘jobtitles’ table. This change makes the database table in second normal form.
Third Normal Form (3NF) in MySQL
In the third normal form, the database is already in the third normal form, if it is in the second normal form. Every non-key column must be mutually independent. Identify any columns in the table that are interdependent and break those columns into their own separate tables.
- The table should be in Second Normal Form (2NF)
- There should not be any transitive dependency for non-prime attributes.
What is Functional Dependency?
When there is a relationship exists between the primary key and non-key attribute within a table it is called functional dependency.
X -> Y
Here, X is known as determinant, and Y is known as the dependent.
Let’s take an employee table.
Here, EmployeeId uniquely identifies the EmployeeName because if we know the EmployeeID we can easily tell the EmployeeName associated with it. In this case, functional dependency can be written as EmployeeID -> EmployeeName which means EmployeeName is functionally dependent on EmployeeID.
What is Transitive Dependency?
When there is an indirect functional dependency between the attributes it is called Transitive Dependency. If A -> B and B -> C then A -> C is called Transitive Dependency. To achieve Third Normal Form (3NF) we have to eliminate Functional Dependency.
In this table, MovieID->ListingID and ListingID->ListingType are the attributes which have transitive functional dependency. Therefore, MovieID -> ListingType has transitive functional dependency.
How to remove Transitive Dependency?
To remove Transitive Dependency, we can split the tables. In the above example to remove transitive dependency from the table we can remove the ListingType column from the table and create a separate table for it as follows:
Now the tables are not transitive dependent.
Example of 3NF:
Take a look at the example old clients table, the address column is interdependent. The address depends upon the city and the city depends upon states and so on. Therefore, we can break the address column to separate cities or states table. Create primary key cityid and stateid for cities and states tables respectively. And link the primary keys to appropriate tables using foreign keys. This change makes the database table in third normal form.
Boyce Codd Normal Form (BCNF)
It is an extension of the third normal form and is also known as the 3.5 Normal Form.
Rules for BCNF
- The table should be in 3NF
- For dependency like a-> B, A should be a super key which means A cannot be a non-prime attribute if B is a prime attribute.
In the above table, one student can enroll in multiple subjects. A professor is assigned for each subject to the student. There can be multiple professors who can teach one subject. Here, StudentID and Subject form the primary key because by using StudentID and Subject we can able to find all the columns of the table. Also, the subject attribute depends on the professor’s name because one professor teaches only one subject but one subject may have two different professors. Therefore, the table is not in BCNF because here the subject is a prime attribute and the professor is a non-prime attribute.
To make this table satisfy BCNF, we can decompose this table into two different tables as follows:
Now the relation satisfies BCNF.
In the next article, I am going to discuss Database De-normalization in MySQL with Example. Here, in this article, I try to explain Database Normalization in MySQL with Examples. I hope you enjoy this Database Normalization in the MySQL article. If you have any questions or queries about this Database Normalization in MySQL with Examples article, then please feel free to ask me in the comment section.