Edi 104 - Chapter 6
Edi 104 - Chapter 6
Edi 104 - Chapter 6
6
The normalization is done, based on the
data model, which is the representation of
tables before they are created in a
database.
The latter part of the chapter describes
the step-by-step approach, as how to
normalize a table into several tables.
Eventually, the chapter also focuses on
more examples to normalize tables.
Example :
1|P a g e C h a p t e r 6 : D a t a b a s e F u n d a m e n t a l
6.1.2 A notation of Relation
Relation Schema
A relation scheme is a list of attributes and their corresponding domains.
Format:
Relation_name (Primary_key, Non_ Key_Attribute)
From the example, we can identify:
· Relation name : PUBLISHER
· Primary Key : Publisher_Code
· Non Primary Key : Name, City
The relation schema will be like this:
PUBLISHER (Publisher_Code, Name, City)
2|P a g e C h a p t e r 6 : D a t a b a s e F u n d a m e n t a l
3|P a g e C h a p t e r 6 : D a t a b a s e F u n d a m e n t a l
The relation schema are :
PUBLISHER (Publisher_Code, Name, City)
AUTHOR (Author_No, Author_Name)
BOOK_AUTHOR (Book_Code, Author_No)
4|P a g e C h a p t e r 6 : D a t a b a s e F u n d a m e n t a l
6.1.3 Problems in Relational Data Model
There are 2 problems in relational data model:
· Anomalies
· Information Redundancy
6.1.3.1 Anomalies
Anomalies - Errors or inconsistencies that may result when a user attempts to update a table
that contains redundant data.
Types of anomalies:
(i) Modification Anomalies
(ii) Insert Anomalies
(iii) Delete Anomalies
(i) Modification Anomalies
The Problem :
Due to the database is not normalized, changing the data of a record may subject to the
problem of changing the data of a few records instead.
For example, when we want to change the Course_Name’s data from “Computer Power 200 (Graphic)” to
“Computer Power 200 (Graphic & Multimedia)” in the STUDENT_COURSE
table, we have to change two rows of students who take that course. (As shown above)
The Solution :
Therefore, to avoid this problem, we have to separate the STUDENT_COURSE table into
5|P a g e C h a p t e r 6 : D a t a b a s e F u n d a m e n t a l
two tables as follows:
The Problem :
When data is not normalized, the confusing error in adding a new record with null primary
6|P a g e C h a p t e r 6 : D a t a b a s e F u n d a m e n t a l
key may happen.
For example, when we want to insert a new Course_Code (i.e. CGMP200) and Course_Name (i.e. Diploma
in Information Technology) onto the STUDENT_COURSE table, a null data of Student_ID appears in the
record and it will cause an error because the Student_ID is the primary key and can not be null. On the other
hand, no student is being admitted to study the new course yet.
The Solution :
Therefore, to avoid this problem, we have to separate the STUDENT_COURSE table as
follows:
7|P a g e C h a p t e r 6 : D a t a b a s e F u n d a m e n t a l
The Problem :
When data is not normalized, a deletion of one record can cause the table to provide
insufficient information.
For example, the deletion of the Student_Code equals to S0159 may cause the course with Course_Code
equals to DIPCGR totally vanish from the database, as if the course is not available at all. Therefore, the
table can no longer provide the information for all the courses available for the student to study any more.
The Solution :
Therefore, to avoid this problem, we have to separate the STUDENT_COURSE table as
follows:
8|P a g e C h a p t e r 6 : D a t a b a s e F u n d a m e n t a l
6.1.3.2 Information Redundancy
Duplication is due the storing of the same data for multiple times.
The Problem :
In the STUDENT_SUBJECT table there is redundant information. It shows that the details
of subject ADM101 (Internet Publishing) are repeated for every member of Student_ID
enrolled for that subject.
The Solution :
9|P a g e C h a p t e r 6 : D a t a b a s e F u n d a m e n t a l
Therefore, to avoid this problem, we have to separate the STUDENT_SUBJECT table into
three tables as follows:
6.2 Normalization
Example 1
10 | P a g e C h a p t e r 6 : D a t a b a s e F u n d a m e n t a l
The table below shows that the Subject_Name, Subject_Code, Credit_Hour and Grade fields
contain multiple or more repeating values.
Unnormalised Form (UNF)
[Please
insert
any
relevant
photo
around
this
box]
11 | P a g e C h a p t e r 6 : D a t a b a s e F u n d a m e n t a l
12 | P a g e C h a p t e r 6 : D a t a b a s e F u n d a m e n t a l
Example 2
The table below shows that the Item_code, Item_desc, Qty, Price, Supplier &
Supplier_Address fields contain multiple or more repeating values.
Unnormalised Form (UNF)
13 | P a g e C h a p t e r 6 : D a t a b a s e F u n d a m e n t a l
14 | P a g e C h a p t e r 6 : D a t a b a s e F u n d a m e n t a l
6.2.3 Level of Normalization
There are 3 levels of normalization:
16 | P a g e C h a p t e r 6 : D a t a b a s e F u n d a m e n t a l
STUDENT (Student_ID, Student_Name, Subject_Name , Subject_Code, Credit_Hour,
Grade, Address, Course, Coordinator)
The table above shows that the Subject_Name, Subject_Code, Credit_Hour and Grade fields contain
multiple repetitive values.
Note 1.1
Problems that occur in tables with repeating groups :
· It’s hard to know exactly which Credit_Hour belongs to which Subject_Name.
· To find a record in the table is very difficult.
How to handle Repeating Groups?
· The repeating group can be handled by separating Unnormalised Form (UNF) into
First Normal Form (1NF) as shown later.
17 | P a g e C h a p t e r 6 : D a t a b a s e F u n d a m e n t a l
18 | P a g e C h a p t e r 6 : D a t a b a s e F u n d a m e n t a l
Note 1.2
What is Functional Dependencies?
Functional Dependencies describe relationships between attributes in a relation.
A relationship between attributes, of which an attribute or a group of attributes determines the value of
another in the same table.
Example :
Consider microcomputers in a COSMOPOINT computer lab. Each has only one size
of main memory. So MemorySize is functionally dependent on ComputerSerialNumber.
How to write?
The attributes on the left side of the arrow are called determinants.
How to read?
“ComputerSerialNumber functionaly determines MemorySize”
or
“ComputerSerialNumber determines MemorySize”
or
“MemorySize is functionally dependent on ComputerSerialNumber”
19 | P a g e C h a p t e r 6 : D a t a b a s e F u n d a m e n t a l
Note 1.3
What is Transitive Dependencies?
· A functional dependency between 2 or more non-key attributes. A condition in which an attribute is
dependent on another attributes that is not part of the primary key.
· A relation is in Third Normal Form (3NF) if it is in 2NF and has no transitive
dependency.
· Non-primary key attribute is functionally dependent on another non-primary key
attribute.
20 | P a g e C h a p t e r 6 : D a t a b a s e F u n d a m e n t a l
Solution : In order to build 3NF table (relation), you have to remove the transitive
Dependencies
21 | P a g e C h a p t e r 6 : D a t a b a s e F u n d a m e n t a l
22 | P a g e C h a p t e r 6 : D a t a b a s e F u n d a m e n t a l
The table above shows that the Item_Code, Item_Desc, Qty, Price, Supplier &
Supplier_Address fields contain multiple repetitive values.
Assumption : One supplier supplies many products.
23 | P a g e C h a p t e r 6 : D a t a b a s e F u n d a m e n t a l
Note 2.1
Problems that occur in table with repeating groups :
· It’s hard to know exactly which Item_Code belongs to which Item_Desc.
· To find a record in the table is very difficult.
How to handle Repeating Groups?
· The repeating group can be handled by separating Unnormalised Form (UNF) into
First Normal Form (1NF) as shown later.
The table above shows that each row contains a single value.
Note 2.2 : Identify Functional Dependencies.
What is Functional Dependencies?
Functional Dependencies describe relationships between attributes in a relation.
A relationship between attributes, of which an attribute or a group of attribute determines the value of
another in the same table.
24 | P a g e C h a p t e r 6 : D a t a b a s e F u n d a m e n t a l
Based on these 3 functional dependencies, there will be 3 tables (relations) in 2NF.
25 | P a g e C h a p t e r 6 : D a t a b a s e F u n d a m e n t a l
Note 2.3
Transitive Dependency
· A functional dependency between 2 or more non-key attributes. A condition in which
an attribute is dependent on another attributes that is not part of the primary key.
· A relation is in Third Normal Form (3NF) if it is in 2NF and has no transitive
dependency.
· Non-primary key attribute is functionally dependent on another non-primary key
attribute.
26 | P a g e C h a p t e r 6 : D a t a b a s e F u n d a m e n t a l
Solution: In order to build 3NF table (relation), you have to remove the transitive
dependencies.
27 | P a g e C h a p t e r 6 : D a t a b a s e F u n d a m e n t a l
28 | P a g e C h a p t e r 6 : D a t a b a s e F u n d a m e n t a l
The above table shows that the fields (Staff_No, Staff_Name, Job_Classification,
Hourly_Rate & Hours) have Repeating Values.
29 | P a g e C h a p t e r 6 : D a t a b a s e F u n d a m e n t a l
Note 3.1:
Problems that occur in those tables with repeating groups are :
· It’s hard to know exactly which Staff_No belongs to which Staff_Name.
· It’s hard to find a record in the table.
How to handle Repeating Groups?
The repeating groups can be handled by separating Unnormalised Form (UNF) into First
Normal Form (1NF) and it will be shown later on.
The above table shows that each row has its own value.
Note 3.2:
What is Functional Dependencies?
Functional Dependencies describe relationships between attributes in a relation.
A relationship between attributes, of which one attribute or group of attribute determines the value of
another in the same table.
Identify Functional Dependencies
30 | P a g e C h a p t e r 6 : D a t a b a s e F u n d a m e n t a l
Note 3.3:
Transitive Dependency
· A functional dependency between 2 or more non-key attributes. A condition in which an attribute is
dependent on another attribute that is not part of the primary key.
· A relation is in Third Normal Form (3NF) if it is in 2NF and has no transitive
dependency.
· Non-primary key attribute is functionally dependent on another non-primary key
attribute.
32 | P a g e C h a p t e r 6 : D a t a b a s e F u n d a m e n t a l
33 | P a g e C h a p t e r 6 : D a t a b a s e F u n d a m e n t a l
34 | P a g e C h a p t e r 6 : D a t a b a s e F u n d a m e n t a l
Table in 1NF is the situation where the intersection of each row and column contains no
repeating values (only one value).
Table in 1NF must be transform to 2NF by identify the functional dependencies.
35 | P a g e C h a p t e r 6 : D a t a b a s e F u n d a m e n t a l
What is Functional Dependencies?
Functional Dependencies describe relationships between attributes in a relation.
A relationship between two attributes, of which is an attribute or a group of attributes
determines the value of another in the same table.
Identify Functional Dependencies
36 | P a g e C h a p t e r 6 : D a t a b a s e F u n d a m e n t a l
Table in 1NF must be transform to 2NF by identify the transitive dependency.
37 | P a g e C h a p t e r 6 : D a t a b a s e F u n d a m e n t a l
Transitive Dependency
· A functional dependency between 2 or more non-key attributes. A condition in which an attribute is
dependent on another attributes that is not part of the primary key.
· A relation is in Third Normal Form (3NF) if it is in 2NF and has no transitive
dependency.
· Non-primary key attribute is functionally dependent on another non-primary key
attribute.
Identify Transitive Dependencies
Example 5:
Normalize the following form to Third Normal Form (3NF). You are required to show the
Normalization steps taken. Provide your answer in a relation scheme.
38 | P a g e C h a p t e r 6 : D a t a b a s e F u n d a m e n t a l
Solution:
The Steps in Normalization:
· Name the appropriate relation name.
· List down all attributes involved.
Relation Scheme Format:
40 | P a g e C h a p t e r 6 : D a t a b a s e F u n d a m e n t a l
Solution:
The Steps in Normalization:
· Name the appropriate relation name.
· List down all attributes involved.
Relation Scheme Format:
Solution:
The Steps in Normalization:
42 | P a g e C h a p t e r 6 : D a t a b a s e F u n d a m e n t a l
· Name the appropriate relation name.
· List down all attributes involved.
Relation Scheme Format:
43 | P a g e C h a p t e r 6 : D a t a b a s e F u n d a m e n t a l
PRODUCT
ORDER (Invoice_No, Product_Code, Quantity, Discount, Subtotal )
Relation Scheme in 3NF:
PRODUCT (Product_Code, Product_Name, Unit_Price )
ORDER (Invoice_No, Product_Code, Quantity, Discount, Subtotal )
INVOICE (Invoice_No, Date, Customer_No)
CUSTOMER (Customer_No, Name, Address, Tel_No)
6.3 Summary
1 NF
2 NF
3 NF
· 1 NF is a relation where the intersection of each row and column contains no repeating values.
44 | P a g e C h a p t e r 6 : D a t a b a s e F u n d a m e n t a l
· NF is a relation that is in 1 NF and every non-primary key attribute is fully functionally
depending on the primary key.
· NF is a relation that is in 2 NF and is based on the concept of transitive dependency
· Steps in normalization :
45 | P a g e C h a p t e r 6 : D a t a b a s e F u n d a m e n t a l
6.4 Exercises
47 | P a g e C h a p t e r 6 : D a t a b a s e F u n d a m e n t a l
48 | P a g e C h a p t e r 6 : D a t a b a s e F u n d a m e n t a l
a) Explain the anomalies that exist when the name for Product_No is going to
change to Black Pen?
b) Explain the anomalies that exist when we are going to add new product but we
don’t receive any order yet?
c) Explain the anomalies that exist when we are going to delete Order_No 123?
a) Convert the following table to 3 NF. This is a table concerning information about
patients of a dentist. Each patient belongs to a household. The head of the household
is designated as HH in the table. The following dependencies exist in Patient.
Patient (HouseholdNumber, HouseholdName, HouseholdStreet, HouseholdCity,
HouseholdState, HouseholdZip, Balance, PatientName, ServiceCode, ServiceName,
ServiceDescription, ServiceFee)
49 | P a g e C h a p t e r 6 : D a t a b a s e F u n d a m e n t a l
3) Using your knowledge of a college environment, determine the functional
dependencies that exist in the following table. After these have been determined,
convert this table to 3 NF.
Student (StudentNumber, StudentName, NumberCredits, AdvisorNumber,
AdvisorName, DeptNumber, DeptName, CourseNumber, CourseDescription,
CourseTerm, Grade)
4) Consider the following relation and assume that the values of SiblingName are the
names of all of a given students brothers and sisters, and also assume that students
have at most one major.
STUDENT (StudentNum, StudentName, SiblingName, Major)
a) Show an example of this relation for 2 students, one of whom has 3 siblings
and the other has only 2 siblings
b) State the functional dependencies in this relation.
c) Show the relation in 3 NF.
Answer the following question
1) Based on the relation below, normalize to 3NF
Employee (EmpCode, LastName, FirstName, Education, DeptCode, Department,
DepartmentManager, JobClass, Title, Dependents, BirthDate, HireDate, Training,
BaseSalary)
2) To keep track of the office furniture, computers, printers, and so on, the ABX
Company uses the following table structure. Normalize it to 3 NF.
50 | P a g e C h a p t e r 6 : D a t a b a s e F u n d a m e n t a l
3) Normalize the table below to 3 NF.
4) Consider the table below, which contains sample data for parts and for vendors who
supply those parts. (Part Number uniquely identifies parts and vendor names uniquely
identify vendors. Normalize the table to 3 NF.
51 | P a g e C h a p t e r 6 : D a t a b a s e F u n d a m e n t a l