Chapter# 14 Database Design Theory and Normalization
Chapter# 14 Database Design Theory and Normalization
Systems
CHAPTER # 14
DATABASE DESIGN THEORY AND NORMALIZATION
Normalization
This violates the entity integrity for EMP_DEPT because its primary
key Ssn cannot be null.
EXAMPLE OF AN INSERT
ANOMALY
Consider the relation:
EMP_PROJ(Emp#, Proj#, Hours
Ename, Pname,Plocation)
Insert Anomaly:
Cannot insert a project unless
an employee is assigned to it.
Conversely
Cannot insert an employee
unless an he/she is assigned to
a project.
EXAMPLE OF AN UPDATE
ANOMALY(Repeated Update)
Consider the relation:
EMP_PROJ(Emp#, Proj#, Hours
Ename, Pname,Plocation)
Update Anomaly:
Changing the name of
project number P1 from
“ProjectX” to “Customer-
Accounting” may cause this
update to be made for all
100 employees working on
project P1.
EXAMPLE OF A DELETE
ANOMALY
Consider the relation:
EMP_PROJ(Emp#, Proj#, Hours
Ename, Pname,Plocation)
Delete Anomaly:
When a project is deleted, it
will result in deleting all the
employees who work on that
project.
Alternately, if an employee is
the sole employee on a
project, deleting that
employee would result in
deleting the corresponding
project.
Guideline 2.
X Y holds if whenever two tuples have the same value for X, they
must have the same value for Y
For any two tuples t1 and t2 in any relation instance r(R): If
t1[X]=t2[X], then t1[Y]=t2[Y]
X Y in R specifies a constraint on all relation instances r(R)
Written as X Y; can be displayed graphically on a relation
schema as in Figures. ( denoted by the arrow: ).
FDs are derived from the real-world constraints on the attributes
Examples of FD constraints
(1)
Social security number determines employee
name
SSN ENAME
Project number determines project name and
location
PNUMBER {PNAME, PLOCATION}
Employee ssn and project number determines the
hours per week that the employee works on the
project
{SSN, PNUMBER} HOURS
Defining FDs from
instances
Note that in order to define the FDs, we need to
understand the meaning of the attributes
involved and the relationship between them.
An FD is a property of the attributes in the schema
R
Given the instance (population) of a relation, all
we can conclude is that an FD may exist between
certain attributes.
What we can definitely conclude is – that certain
FDs do not exist because there are tuples that
show a violation of those dependencies.
Determining Functional
Dependencies
Figure 14.8 What FDs may
exist?
A relation R(A, B, C, D) with its extension.
Which FDs may exist in this relation?
Figure 14.8 What FDs may
exist? FD holds
FD not
B -> C
A relation R(A, B, C, D) with its extension.
holds
C -> B
A -> B
Which FDs may exist in this relation? {A,B} -> C
A -> C
{A,B} -> D
A -> D
{A,C} -> B
B -> A
{A,C} -> D
B -> D
{A,D} -> B
C -> A
{A,D} -> C
C -> D
{B,D} -> A
D -> A
{B,D} -> C
D -> B
{C,D} -> A
D -> C
{C,D} -> B
{B,C} -> A
{A,B,C} -> D
{B,C} -> D
{A,B,D} -> C
{B,C,D} -> A
Important Definitions
Determinant
Refers to the attribute, or group of attributes, on the left-
hand side of the arrow of a functional dependency.
Full Functional dependency:
Indicates that if A and B are attributes of a relation, B is
fully functionally dependent on A if B is functionally
dependent on A, but not on any proper subset of A.
{StaffNo, StaffName} BranchNo
StaffNo BranchNo
Transitive Dependency
A condition where A, B, and C are attributes of a relation
such that if A → B and B → C, then C is transitively
dependent on A via B (provided that A is not functionally
dependent on B or C).
Example Transitive
Dependency
R (A, B, C, D)
FD: {A->BC} is decomposed into R1(ABC) and
R2(AD) which is a lossless join decomposition as:
First condition holds true as Att(R1) U Att(R2) =
(ABC) U (AD) = (ABCD) = Att(R).
Second condition holds true as Att(R1) ∩ Att(R2) =
(ABC) ∩ (AD) ≠ Φ
Third condition holds true as Att(R1) ∩ Att(R2) = A is
a key of R1(ABC) because A->BC is given.
Dependency Preserving
Decomposition
If we decompose a relation R into relations R1 and
R2, All dependencies of R either must be a part of
R1 or R2 or must be derivable from combination of
FD’s of R1 and R2.
For Example,
R (A, B, C, D) with FD set{A->BC} is decomposed
into R1(ABC) and R2(AD) which is dependency
preserving because FD A->BC is a part of
R1(ABC).
FIRST NORMAL FORM
It states that
the domain of an attribute
must include only atomic
(simple, indivisible) values
and
that the value of any
attribute in a tuple must be
a single value from the
domain of
that attribute.
Figure 14.9
Normalization into 1NF. (a)
A relation schema that is not
in 1NF. (b) Sample state of
relation DEPARTMENT. (c)
1NF version of the same
relation with redundancy.
Normalizing nested relations
into 1NF.
Figure 14.10
Normalizing nested
relations into 1NF. (a)
Schema of the
EMP_PROJ relation
with a nested relation
attribute PROJS. (b)
Sample extension of
the EMP_PROJ relation
showing nested
relations within each
tuple. (c)
Decomposition of
EMP_PROJ into
relations EMP_PROJ1
and EMP_PROJ2 by
propagating the
primary key.
Second Normal Form
42
Example Transitive Dependency
Normalized Relations:
R1 (Instructor, Course) and
R2(Instructor, Student)
Second Normal Form (2NF)