Database Mangement Systems
Database Mangement Systems
SYSTEMS
Relational Database Design
Dr Pardeep Kumar
Overview Relational Database Design
• The goal of relational database design is to generate a set
of relation schemas that allows us to store information
without unnecessary redundancy, yet also allows us to
retrieve information easily.
• To determine whether a relation schema is in one of the
desirable normal forms, we need information about the
real-world enterprise that we are modeling with the
database.
DEPARTMENT PROFILE
Goals of Normalization
Let R be a relation scheme with a set F of functional dependencies.
Decide whether a relation scheme R is in “good” form.
In the case that a relation scheme R is not in “good” form, decompose
it into a set of relation scheme {R1, R2, ..., Rn} such that
each relation scheme is in good form
the decomposition is a lossless-join decomposition
Preferably, the decomposition should be dependency preserving.
DEPARTMENT PROFILE
Functional-Dependency Theory
We now consider the formal theory that tells us which functional
dependencies are implied logically by a given set of functional
dependencies.
We then develop algorithms to generate lossless decompositions into
BCNF and 3NF
We then develop algorithms to test if a decomposition is dependency-
preserving
Closure of a Set of Functional
DEPARTMENT PROFILE
Dependencies
Given a set F set of functional dependencies, there are certain other
functional dependencies that are logically implied by F.
For example: If A B and B C, then we can infer that A C
The set of all functional dependencies logically implied by F is the closure
of F.
We denote the closure of F by F+.
We can find all of F+ by applying Armstrong’s Axioms:
if , then (reflexivity)
if , then (augmentation)
if , and , then (transitivity)
These rules are
sound (generate only functional dependencies that actually hold) and
complete (generate all functional dependencies that hold).
DEPARTMENT PROFILE
Example
R = (A, B, C, G, H, I)
F={ AB
AC
CG H
CG I
B H}
some members of F+
AH
by transitivity from A B and B H
AG I
by augmenting A C with G, to get AG CG
and then transitivity with CG I
CG HI
by augmenting CG I to infer CG CGI,
and augmenting of CG H to infer CGI HI,
and then transitivity
DEPARTMENT PROFILE
Procedure for Computing F+
To compute the closure of a set of functional dependencies F:
F+=F
repeat
for each functional dependency f in F+
apply reflexivity and augmentation rules on f
add the resulting functional dependencies to F +
for each pair of functional dependencies f1and f2 in F +
if f1 and f2 can be combined using transitivity
then add the resulting functional dependency to F +
until F + does not change any further
(Cont.)
We can further simplify manual computation of F+ by using the
following additional rules.
If holds and holds, then holds (union)
If holds, then holds and holds
(decomposition)
If holds and holds, then holds
(pseudotransitivity)
The above rules can be inferred from Armstrong’s axioms.
DEPARTMENT PROFILE
Closure of Attribute Sets
Given a set of attributes a, define the closure of a under F (denoted by
a+) as the set of attributes that are functionally determined by a under
F
result := a;
while (changes to result) do
for each in F do
begin
if result then result := result
end
DEPARTMENT PROFILE
Example of Attribute Set Closure
R = (A, B, C, G, H, I)
F = {A B
AC
CG H
CG I
B H}
(AG)+
1. result = AG
2. result = ABCG (A C and A B)
3. result = ABCGH (CG H and CG AGBC)
4. result = ABCGHI (CG I and CG AGBCH)
Is AG a candidate key?
1. Is AG a super key?
1. Does AG R? == Is (AG)+ R
2. Is any subset of AG a superkey?
1. Does A R? == Is (A)+ R
2. Does G R? == Is (G)+ R
DEPARTMENT PROFILE
R = (A, B, C)
F = {A BC
BC
AB
AB C}
Combine A BC and A B into A BC
Set is now {A BC, B C, AB C}
A is extraneous in AB C
Check if the result of deleting A from AB C is implied by the other
dependencies
Yes: in fact, B C is already present!
Set is now {A BC, B C}
C is extraneous in A BC
Check if A C is logically implied by A B and the other dependencies
Yes: using transitivity on A B and B C.
– Can use attribute closure of A in more complex cases
The canonical cover is: AB
BC
DEPARTMENT PROFILE
Lossless-join Decomposition
For the case of R = (R1, R2), we require that for all possible
relations r on schema R
r = R1 (r ) R2 (r )
A decomposition of R into R1 and R2 is lossless join if and
only if at least one of the following dependencies is in F+:
R1 R2 R1
R1 R2 R2
DEPARTMENT PROFILE
Example
R = (A, B, C)
F = {A B, B C)
Can be decomposed in two different ways
R1 = (A, B), R2 = (B, C)
Lossless-join decomposition:
R1 R2 = {B} and B BC
Dependency preserving
R1 = (A, B), R2 = (A, C)
Lossless-join decomposition:
R1 R2 = {A} and A AB
Not dependency preserving
(cannot check B C without computing R1 R2)
DEPARTMENT PROFILE
Dependency Preservation
result := {R };
done := false;
compute F +;
while (not done) do
if (there is a schema Ri in result that is not in BCNF)
then begin
let be a nontrivial functional dependency that holds on
Ri
such that Ri is not in F +,
and = ;
result := (result – Ri ) (Ri – ) (, );
end
else done := true;
J L K
j1 l1 k1
j2 l1 k1
j3 l1 k1
null l2 k2
repetition of information (e.g., the relationship l1, k1)
need to use null values (e.g., to represent the relationship
l2, k2 where there is no corresponding value for J).
DEPARTMENT PROFILE
Testing for 3NF
Optimization: Need to check only FDs in F, need not check all FDs in
F+ .
Use attribute closure to check for each dependency , if is a
superkey.
If is not a superkey, we have to verify if each attribute in is
contained in a candidate key of R
this test is rather more expensive, since it involve finding
candidate keys
testing for 3NF has been shown to be NP-hard
Interestingly, decomposition into third normal form (described
shortly) can be done in polynomial time
DEPARTMENT PROFILE
3NF Decomposition Algorithm