0% found this document useful (0 votes)
143 views86 pages

Functional Dependencies and Normalization4

Let's prove CID, email → grade using the inference rules: 1) SID, CID → grade (given) 2) email → SID (given) 3) CID, email → CID, SID (augmentation) 4) CID, SID → grade (transitivity using 1) and 3) 5) CID, email → grade (transitivity using 4) and 3) Therefore, CID, email → grade follows from the given FDs.

Uploaded by

AbuBakar Sohail
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
143 views86 pages

Functional Dependencies and Normalization4

Let's prove CID, email → grade using the inference rules: 1) SID, CID → grade (given) 2) email → SID (given) 3) CID, email → CID, SID (augmentation) 4) CID, SID → grade (transitivity using 1) and 3) 5) CID, email → grade (transitivity using 4) and 3) Therefore, CID, email → grade follows from the given FDs.

Uploaded by

AbuBakar Sohail
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 86

Functional Dependencies

and
Normalization for Relational
Databases
Relational Database Designs
 There are many different relational database
"designs" (schemas) that can be used to store the
relevant mini-world information.

 Can one schema be much better than another?


EXAMPLE: SHIPMENT OF PARTS
Ship(S#, sname, status, city, P#, pname, colour, weight, qty, date)

Above table contains information regarding shipments of parts from suppliers.

• A supplier belongs to a particular city and a status is associated with each


city.
• A part has a name, color, quantity and weight and is shipped on particular
date
EXAMPLE: SHIPMENT OF PARTS
Ship(S#, sname, status, city, P#, pname, colour, weight, qty, date)

Problems ??

• Redundant Data
• Inability to represent certain information
• Loss of information
Design Anomalies
Ship(S#, sname, status, city, P#, pname, colour, weight,
qty, date)

 This Apply relation exhibits three types of anomalies:


 Redundancy
 Update Anomaly
 Deletion Anomaly

 Good designs:
 No anomalies
 Can reconstruct all original information
EXAMPLE: SHIPMENT OF PARTS
EXAMPLE: SHIPMENT OF PARTS
Informal Design Guidelines for Relational
Databases

 We first discuss informal guidelines for good


relational design
 Then we discuss formal concepts of functional
dependencies and normal forms

9
INFORMAL DESIGN GUIDELINES FOR
RELATIONAL DATABASES

NULL VALUES IN TUPLES


 Relations should have few NULL values
 Attributes that are NULL frequently could be placed
in separate relations (with the primary key)

 Reasons for nulls:


 Attribute not applicable or invalid
 Attribute value unknown (may exist)
 Value known to exist, but unavailable
SEMANTICS OF THE RELATION ATTRIBUTES
 Each tuple in a relation should represent one entity or
relationship instance.

 Attributes of different entities should not be mixed in the


same relation
 Only foreign keys should be used to refer to other entities

11
RELATION DECOMPOSITION
 Relation decomposition can be dangerous we can
loose information.
 Loss-less join decomposition

 If we decompose a relation R into smaller


relations, the join of those relations should return
R, not more, not less.
EXAMPLE: RELATION DECOMPOSITION

• Eliminates redundancy

• To get back to the


original relation use join
RELATION DECOMPOSITION

 Association between CID and grade is lost


 Join returns more rows than the original relation

Spurious Tuples
RELATION DECOMPOSITION
 Join returns the original relation
 Unnecessary: no redundancy is removed, and
now SID is stored twice!

Unnecessary decomposition
Functional Dependencies
 How do we tell if a design is bad?

 This design has redundancy and update anomalies

 How about a systematic approach to detecting and


removing redundancy in designs?
 Dependencies, decompositions, and normal forms
FUNCTIONAL DEPENDENCIES
 Functional dependencies play an important role
in database design.
 They describe relationships between attributes.

 FDs are derived from the real-world constraints


on the attributes
FUNCTIONAL DEPENDENCIES
 A functional dependency (FD) has the form X → Y, where X and
Y are sets of attributes in a relation R

 X → Y means that whenever two tuples in R agree on all the


attributes in X, they must also agree on all attributes in Y

For any two tuples t1 and t2


in any relation instance r(R):
If t1[X]=t2[X], then t1[Y]=t2[Y]

 We say X functionally determines Y


 X -> Y in R specifies a constraint on all relation instances r(R)
EXAMPLE: STUDENT GRADE INFO
 StudentGrade (SID, name, email, CID, grade)
 SID → name, email
 email → SID
 SID, CID → grade

 Not a good design


EXAMPLE: ADDRESS
Address (street_address, city, state, zip)
 street_address, city, state → zip

 zip → city, state

 zip, state → zip?


 This is a trivial FD
 Trivial FD: LHS ⊇ RHS

 zip → state, zip?


 This is non-trivial, but not completely non-trivial
 Completely non-trivial FD: LHS ∩ RHS = ∅
EXAMPLES OF FD CONSTRAINTS
 If K is a key of R, then K functionally determines
all attributes in R

 SSN -> ENAME


 PNUMBER -> {PNAME, PLOCATION}
 {SSN, PNUMBER} -> HOURS

21
EXAMPLES OF FD CONSTRAINTS
• FD is a property of relational schema R. Must be define
by someone who knows the semantic of attributes of R.
• FD’s hold at all times
• Certain FD’s can be ruled out based on a given state of
the database

• Text -> course


• possible FD
• Teacher -> course
• ruled out

22
FUNCTIONAL DEPENDENCIES
 Given a relation R and a set of FD’s F
 Does another FD follow from F?
 Are some of the FD’s in F redundant (i.e., they follow
from the others)?

 Example: Emp_Dept(ename, ssn, bdate, address,


dnumber, dname, dmgrssn)
 F ={ssn -> {ename, bdate, address, dnumber),
dnumber ->{dname, dmgrssn) }
 Additional FD that can be infer from F
 ssn-> dname, dmgrssn
 ssn-> ssn
 dnumber -> dname
FUNCTIONAL DEPENDENCIES

 Which of the following dependencies may hold in the above


relation?
 If the dependency cannot hold, explain why by specifying
the tuples that cause the violation.
 i. A→B,
 ii. B→C,
 iii. C→B,
 iv. B→A,
 v. C→A
 Does the above relation have a potential candidate key? If
it does, what is it? If it does not, why not?
CONVERT BUSINESS STATEMENTS INTO
DEPENDENCIES.
 Consider the relation
DISK_DRIVE (Serial_number, Manufacturer, Model, Batch, Capacity, Retailer)

 Example:
 Disk_drive (‘1978619’, ‘WesternDigital’, ‘A2235X’, ‘765234’, 500, ‘CompUSA’)

 Write each of the following dependencies as an FD:


a) The manufacturer and serial number uniquely identifies the drive.
b) A model number is registered by a manufacturer and therefore can’t
be used by another manufacturer.
c) All disk drives in a particular batch are the same model.
d) All disk drives of a certain model of a particular manufacturer have
exactly the same capacity.
Inference Rules for FDs
 Given a set of FDs F, we can infer additional
FDs that hold whenever the FDs in F hold
 Armstrong's inference rules:
 IR1. (Reflexive)
If Y ⊆ X, then X -> Y
 IR2. (Augmentation)
If X -> Y, then XZ -> YZ
(XZ stands for X U Z)
 IR3. (Transitive)
If X -> Y and Y -> Z, then X -> Z

 IR1, IR2, IR3 form a sound and complete set of


inference rules
26
INFERENCE RULES FOR FDS
 Some additional inference rules that are useful:
 Decomposition:
 If X -> YZ, then X -> Y and X -> Z
 Union:
 If X -> Y and X -> Z, then X -> YZ
 Pseudotransitivity:
 If X -> Y and WY -> Z, then WX -> Z

 The last three inference rules, as well as any


other inference rules, can be deduced from IR1,
IR2, and IR3 (completeness property)

27
PROOFS
 If X -> YZ, then X -> Y and X -> Z
 YZ -> Y (reflexive)
 X -> YZ (given)
 X -> Y (transitivity)

 If X -> Y and X -> Z, then X -> YZ


 X ->XY (augmenting X in X-> Y)
 XY -> YZ (augmenting Y in X-> Z)
 X -> YZ (transitivity)

 If X -> Y and WY -> Z, then WX -> Z


 WX->WY (augmenting)
 WY -> Z (given)
 WX -> Z (transitivity)
INFERENCE RULES FOR FDS
 Closure of a set F of FDs is the set F+ of all FDs
that can be inferred from F

 Closure of a set of attributes X with respect to F


is the set X+ of all attributes that are functionally
determined by X

 X+ can be calculated by repeatedly applying IR1,


IR2, IR3 using the FDs in F

29
ALGORITHM FOR COMPUTING ATTRIBUTE
CLOSURE
 The closure of X (denoted X +) with respect to F is the
set of all attributes functionally determined by X
 Algorithm for computing the closure

X+ = X
repeat
OldX+ =X+
for each functional dependency Y → Z in F do
if Y⊆ X+ then X+ := X+ ∪ Z
until (X + = oldX+ )
EXAMPLE 1: CLOSURE ALGORITHM
Consider following schema

EMP_PROJ(Ssn, Ename, Pno, Pname, Hours)

And FDs
 F = {Ssn -> Ename,
Pno -> {Pname, Plocation},
{Ssn, Pno} -> Hours }

 Find the following Closure Sets


 {Ssn}+ =
 {Pno} + =
 {Ssn, Pno} + =
EXAMPLE 2: CLOSURE ALGORITHM
StudentGrade (SID, name, email, CID, grade)
F includes:
 SID → name, email
 email → SID
 SID, CID → grade

 { CID, email }+ = ?
 X+ = {CID, email}

Consider email → SID


 X+ = {SID, CID, email}

Consider SID → name, email


 X+ = {SID, CID, email, name}
Consider SID, CID → grade
 X+ = {SID, CID, email, name, grade}
USING RULES OF FD’S
 Given a relation R and set of FD’s F
 Does another FD X → Y follow from F?
 Use the rules to come up with a proof

 Example F includes:
 SID → name, email; email → SID; SID, CID → grade
 CID, email → grade?
 email → SID (given in F)
 CID, email → CID, SID (augmentation)
 SID, CID → grade (given in F)
 CID, email → grade (transitivity)
Minimal Sets of FDs
 A set of FDs is minimal if it satisfies the
following conditions:

1. Every dependency in F has a single attribute for


its RHS.
2. We cannot replace any dependency X -> A in F
with Y -> A, Y ⊆ X and still have a set of
dependencies that is equivalent to F.
3. We cannot remove any dependency from F and
have a set of dependencies that is equivalent to F.

 Every set of FDs has an equivalent minimal set


 There can be several equivalent minimal sets

34
EXAMPLE: MINIMAL COVER
 F = { AB -> C, C -> A, BC -> D, ACD -> B, D -> EG,
BE -> C, CG -> BD, CE -> AG}

 Step1: Canonical Form


 Step 2: We need to look for redundant attributes on
the LHS
 Step3: We need to look for FDs that are redundant
EXAMPLE: MINIMAL COVER
 ForSTEP 2: apply either of the rules below to
find redundant attributes on the LHS:
 1. In an FD {X → A} from set F, attribute B ∈ X is
redundant in X if F logically implies
(F – {X → A}) ∪ {(X – B) → A}.
 2. To test if attribute B ∈ X is redundant in X
 Step 1: compute ({X} – B)+ using the dependencies in F
 Step 2: check that ({X} – B)+ contains B; if it does, B is

redundant
EXAMPLE: MINIMAL COVER
 Consider ACD -> B to see if any of the three
attributes on the LHS is redundant
 Computing CD+ ={ACDEGB}
 Closure contains A and thus, A is redundant.
 The closure contains B, which tells us that CD -> B holds.

 Alternatively, we need to prove that F logically


implies CD -> B in place of ACD -> B.
 C -> A (given in F)
 CD -> ACD (augment with CD)
 ACD -> B (given in F)
 CD -> B (transitivity)
EXAMPLE: MINIMAL COVER
 Step3: Look for redundant FDs in set F.
 CE -> A
 CG -> B
 (as CG -> D, and CD -> B in F.)
 No more redundant FDs in F.

 Minimal Cover : {AB -> C, C -> A, BC -> D, CD ->


B, D -> E, D -> G, BE -> C, CG -> D, CE -> G}
EXAMPLE: MINIMAL COVER
 Minimal Cover 2 : {AB -> C, C -> A, BC -> D, D -> E, D -> G,
BE -> C, CG -> B, CE -> G}
 Obtained by eliminating redundant FDs,
 CE -> A,
 CG -> D, (CG->B, BC->D)
 CD -> B (D->G, CG->B)

 We need one of {CG -> D, CG -> B} in the minimal cover – can


not eliminate both.
ALGORITHM FOR FINDING MINIMAL
COVER F FOR A SET OF FDS E
 Set F:= E
 Replace each FD X-> {A1, A2, …, An} in F by n
FDs X-> A1, X->A2, … , X-> An
 For each FD X-> A in F

For each attribute B of X


if {{ F- { X -> A }} U { (X- B) -> A}} = F
then replace X->A with (X- B) -> A in F
 For each remaining FD X->A in F

if {F – {X->A} } = F
then remove X->A from F
EXAMPLE 2: COMPUTE MINIMAL SETS OF FDS

Consider set of FDs E : {B → A, D → A, AB → D}. We have


to find the minimum cover of E.

 Step1: Canonical form


 Step 2: Determine if AB → D has any redundant
attribute?
Consider B → A
BB → AB (augmenting with B -- IR2)
B → AB
AB → D (given)
We get B → D ( transitivity – IR3)
Hence, AB → D may be replaced by B → D.

 Now E′ : {B → A, D → A, B → D}.
41
EXAMPLE 2: COMPUTE MINIMAL SETS OF FDS

Consider set of FDs E : {B → A, D → A, AB → D}. We have


to find the minimum cover of E.

 E′ : {B → A, D → A, B → D}.
 Step 3: Look for a redundant FD in E′
B→D
D → A,
Thus, B → A (transitive rule)
Eliminate B → A in E’.

 The minimum cover of E is {B → D, D → A}.

42
MINIMAL COVER
 Given a relation R (A, B, C, D, E, F) and
 a set of FDs F = {A →BCE, CD →EF, E →F, B →
E, AB → CF}.
 Compute the minimal cover for F
Equivalence of Sets of FDs
 Two sets of FDs F and G are equivalent if:
 Every FD in F can be inferred from G, and
 Every FD in G can be inferred from F
 Hence, F and G are equivalent if F+ =G+

 Covers:
 F covers G if every FD in G can be inferred from F
 (i.e., if G+ ⊆ F+)

 F and G are equivalent if F covers G and G covers F

44
EXAMPLE: EQUIVALENCE BETWEEN SETS
OF FUNCTIONAL DEPENDENCIES

 Consider two sets of FDs, F and G,


 F = {A -> B, B -> C, AC -> D} and
 G = {A -> B, B -> C, A -> D}

 Are F and G equivalent?

 We need to prove that F+ = G+.


 To show that F covers G calculate X+ w.r.t to F for
each FD in G, and then see if this F closure contains
attribute in G.
 Similarly compute G covers F.
EXAMPLE: EQUIVALENCE BETWEEN SETS
OF FUNCTIONAL DEPENDENCIES

 Use attribute closure to infer all FDs in F using G


 F = {A -> B, B -> C, AC -> D}
 G = {A -> B, B -> C, A -> D}

 Take the attributes from the LHS of FDs in F and


compute attribute closure for each using FDs in G:
 (A+ using G) = ABCD;
 (B+ using G) = BC;
 (AC+ using G) = ABCD;

 All FDs in F are inferred using FDs in G.


EXAMPLE: EQUIVALENCE BETWEEN SETS
OF FUNCTIONAL DEPENDENCIES

 F = {A -> B, B -> C, AC -> D}


 G = {A -> B, B -> C, A -> D}

 Now we see if all FDs in G are inferred by F


 A+ using F = ABCD;
 B+ using F = BC;

 All FDs in F can be obtained from G and vice


versa, so we conclude that F and G are equivalent.
EXAMPLE: EQUIVALENCE BETWEEN SETS
OF FUNCTIONAL DEPENDENCIES

 F = {A -> B, A ->C}
 G = {A -> B, B -> C}
 Are F and G equivalent?

 A+ using G = ABC;
 A+ using F = ABC;
 B+ using F = B,
 This indicate B -> C in G is not inferred using the
FDs from F.

 Thus, F and G are not equivalent, because B -> C


in G is not inferred from the FDs in F.
FIND THE KEY FOR RELATION R
 Consider a relation R= {A,B,C,D,E,F,G,H,I,J} and
 The set of FD’s {
 A,B -> C ,
 A,G-> D,H ,
 B,G->E,F ,
 A->I ,
 H->J }.
 Find the key for Relation R with given FD’s.
ALGORITHM: TO FIND A KEY K FOR
R GIVEN A SET F OF FD’S
Input: A universal relation R and a set of
functional dependencies F on the attributes of
R.
1. Set K := R.
2. For each attribute A in K {
compute (K - A)+ with respect to F;
If (K - A)+ contains all the attributes in R,
then set K := K - {A};
}
Normalization of Relations
 Normalization:
 The process of decomposing "bad" relation by
breaking it into smaller relations

 Normalization should maintain properties like


lossless join and dependency preservation to
ensure a good relational design (

51
NORMAL FORMS

 The database designers don’t have to normalize


relations to the highest possible normal form
(usually 3NF, BCNF or 4NF is enough)

 Denormalization:
 Opposite of normalization
 Join relations to form a base relation—which is in
a lower normal form

52
SUPERKEYS AND KEYS
 Superkey
 Key

 Candidate key

 Primary key

 Secondary keys

 Prime attribute: it must be a member of some


candidate key
 Nonprime attribute: it is not a member of any
candidate key.

53
First Normal Form
 It disallows Composite attributes, Multivalued attributes and
Nested relations
 Attributes can have atomic values for an individual tuple
 It is considered to be a part of the definition of relation

54
Normalization of nested relations into
1NF

55
Second Normal Form
 Uses the concepts of FDs
 Full functional dependency: a FD Y -> Z where removal
of any attribute from Y means the FD does not hold any
more

 EMP_PROJ(SSN, Ename, Pno, Pname, Hours)


 {SSN, Pno} -> Hours is a full FD
 since neither SSN -> Hours nor Pno -> Hours hold
 {SSN, Pno} -> Ename is not a full FD, it is called a partial
dependency
 since SSN -> Ename also holds

56
SECOND NORMAL FORM
 Second Normal Form:
A relation that is in 1NF and every non-prime
attribute is fully functionally dependent on the
primary key

 We can decompose a relation into 2NF relations


via the process of 2NF normalization

57
Normalizing into 2NF

58
Third Normal Form
 Transitive functional dependency:
 a FD X -> Z that can be derived from two FDs X -> Y and
Y -> Z

 Emp_Dept(SSN, Ename, Bdate, Address, Dno, Dname,


Dmgrs_ssn)
 SSN -> DMGRSSN is a transitive FD
 Since SSN -> DNUMBER and DNUMBER -> DMGRSSN hold
 SSN -> ENAME is non-transitive
 Since there is no set of attributes X where SSN -> X and
X -> ENAME

59
THIRD NORMAL FORM
 Third Normal Form:
A relation that is in 2NF, and in which no non-prime
attribute is transitively dependent on the primary
key.

60
THIRD NORMAL FORM
 NOTE:
 In X -> Y and Y -> Z, with X as the primary key,
we consider this a problem only if Y is not a
candidate key.

 Consider EMP (SSN, Emp#, Salary ).


 Here, SSN -> Emp# -> Salary and Emp# is a
candidate key.

 When Y is a candidate key, there is no problem


with the transitive dependency
NORMAL FORMS DEFINED INFORMALLY

 1st normal form


 All attributes depend on the key
 2nd normal form
 All attributes depend on the whole key
 3rd normal form
 All attributes depend on nothing but the key

62
SUMMARY OF NORMAL FORMS
based on Primary Keys

63
GENERAL NORMAL FORM DEFINITIONS
(FOR MULTIPLE KEYS)
 The above definitions consider the primary key
only
 The following more general definitions take into
account relations with multiple candidate keys
 A relation schema R is in 2NF if every non-prime
attribute A in R is fully functionally dependent
on every key of R

64
GENERAL NORMAL FORM DEFINITIONS
 A relation R is in 3NF if whenever a FD X -> A
holds in R, then either:
a) X is a superkey of R, or

b) A is a prime attribute of R

65
Successive Normalization of LOTS into 2NF and
3NF

66
BCNF (Boyce-Codd Normal Form)
 A relation schema R is in Boyce-Codd Normal Form
(BCNF) if whenever an FD X -> A holds in R, then X is a
superkey of R
 Each normal form is strictly stronger than the previous one
 Every 2NF relation is in 1NF
 Every 3NF relation is in 2NF
 Every BCNF relation is in 3NF
 There exist relations that are in 3NF but not in BCNF
 The goal is to have each relation in BCNF (or 3NF)

67
Boyce-Codd Normal Form

68
A relation that is in 3NF but not in BCNF

Two FDs exist in the relation TEACH:


FD1: { student, course} -> instructor
FD2: instructor -> course

{student, course} is a
candidate key

69
A relation that is in 3NF but not in BCNF

Two FDs exist in the relation TEACH:


FD1: { student, course} -> instructor
FD2: instructor -> course

Three possible decompositions for relation TEACH


{student, instructor} and {student, course}
{course, instructor } and {student, course}
{course, instructor } and {student, instructor}

All three decompositions lose FD1. We


sacrifice the FD preservation. But cannot
sacrifice the non-additivity property after
decomposition.
Only the 3rd decomposition will not
generate spurious tuples after join

70
EXAMPLE
 What is the key for R ? Given
 F = {A, B->C,
B, D->E, F,
A, D->G, H,
A->I,
H->J }

 Key
 ABD
 2NF
 R1 (A, B, C) R2 (B,D, E, F) R3 (A, D, G, H, J) R4 (A, I) , R=(ABD)
 3NF
 R1 (A, B, C) R2 (B,D, E, F) R3.1 (A, D, G, H) R3.2 (H, J) R4 (A, I),
R=(ABD)
EXAMPLE: DESIGNING A STUDENT
DATABASE
1. Students take courses
2. Students typically take more than one course
3. Students can fail courses and can repeat the same course in
different semesters => Students can take the same course more
than once.
4. Students are assigned a grade for each course they take.

The data that needs to be retained has


repeating groups

Procedure: Remove repeating groups by


adding extra rows to hold the
repeated attributes. => Add redundancy.
DESIGNING A STUDENT DATABASE -2
 Student_course(sID, sname, dept, advisor,
course, credit, semester, grade, course-room,
instructor, instructor-office)

 1NF: Tables should have no repeating groups

 Problems:
 Redundancy
 Insert anomalies
 Delete anomalies
 Update problems
DESIGNING A STUDENT DATABASE -3
 sname, dept, advisor are dependent only upon sID
 credit dependent only on course and is independent of which
semester it is offered and which student is taking it.
 course-room, instructor and instructor-office only depend upon
the course and the semester (are independent of which student is
taking the course).
 Only grade is dependent upon all 3 parts of the original key.

 2NF: The non-prime attributes should depend


on the whole key.
 Procedure: remove partial key dependencies
DESIGNING A STUDENT DATABASE -4
 Student_course(sID, sname, dept, advisor, course, credit,
semester, grade, course-room, instructor, instructor-
office)

 Student (sID, sname, dept, advisor)


 Student_Reg (sID, course, semester, grade)
 Course (course, credit)
 Course_Offering (course, semester, course-room,
instructor, instructor-office)

 2NF: There are no non-key attributes with partial key


dependencies in any table.
 Less redundancy
 Still insert/delete/update problems
DESIGNING A STUDENT DATABASE -5
 3NF: The non-key attributes should not
depend on other non-key attributes.

 Procedure: remove non-key dependencies


(transitive functional dependencies)
DESIGNING A STUDENT DATABASE -3NF
 Student (sID, sname, dept, advisor)
 Student_Reg (sID, course, semester, grade)
 Course (course, credit)
 Course_Offering (course, semester, course-room, instructor,
instructor-office)

 Student (sID, sname, dept, advisor)


 Student_Reg (sID, course, semester, grade)
 Course (course, credit)
 Course_Offering (course, semester, course-room,
instructor)
 Instructor (instructor, instructor-office)

 More organized, Less redundancy (save space??)


 performance problems?? (indirect references)
DESIGNING A STUDENT DATABASE -7
 In normalization, we are seeking to make sure
that attributes depend:
 on the key (1NF)
 on the whole key (2NF)
 on nothing but the key (3NF)

 We have not checked whether some part of the


key itself depends on a non-key attribute(s).
DESIGNING A STUDENT DATABASE -8
 Suppose:
1. each department may have more than one advisor
2. a advisor works for only one department (advisor -> dept)
3. a student may be associated with several departmental
programs.

 Student (sID, sname, dept, advisor)

 Part of the compound key is determined by a non-key attribute

 advisor -> dept


 sname is function of only sID
 dept is function of only advisor
DESIGNING A STUDENT DATABASE -9
 Student (sID, sname)
 Advisor (advisor, dept)
 Student-Advice (sID, advisor)
 Student_Reg (sID, course, semester, grade)
 Course (course, credit)
 Course_Offering (course, semester, course-room, instructor)
 Instructor (instructor, instructor-office)

BCNF: Table in 3NF and there are


no dependencies of part of the
compound key on another attribute
NORMALIZATION EXAMPLE -- SALES ORDER

Fields in the
original data
table will be as
follows:

SalesOrderNo,
Date,
CustomerNo,
CustomerName,
CustomerAdd,
ClerkNo,
ClerkName,
ItemNo,
Description,
Qty, UnitPrice
NORMALIZATION: FIRST NORMAL FORM
 Separate Repeating Groups into New Tables.
 Repeating Groups Fields that may be repeated several
times for one document/entity
 The primary key of the new table (repeating group) is
always a composite key;

 Relations in 1NF:
 SalesOrderNo, ItemNo, Description, Qty, UnitPrice
 SalesOrderNo, Date, CustomerNo, CustomerName,
CustomerAdd, ClerkNo, ClerkName
NORMALIZATION: SECOND NORMAL FORM
 Remove Partial Dependencies.
 Note: Do not treat price as dependent on item. Price
may be different for different sales orders (discounts,
special customers, etc.)

 Relations in 2NF
 ItemNo, Description
 SalesOrderNo, ItemNo, Qty, UnitPrice
 SalesOrderNo, Date, CustomerNo, CustomerName,
CustomerAdd, ClerkNo, ClerkName
NORMALIZATION: SECOND NORMAL FORM
 What if we did not Normalize the Database to
Second Normal Form?
 Redundancy
 Delete Anomalies
 Insert Anomalies
 Update Anomalies
NORMALIZATION: THIRD NORMAL FORM
 Remove transitive dependencies

 Relations in 3NF
 Customers: CustomerNo, CustomerName, CustomerAdd
 Clerks: ClerkNo, ClerkName
 Inventory Items: ItemNo, Description
 Sales Orders: SalesOrderNo, Date, CustomerNo, ClerkNo
 SalesOrderDetail: SalesOrderNo, ItemNo, Qty, UnitPrice
NORMALIZATION: THIRD NORMAL FORM
 What if we did not Normalize the Database to
Third Normal Form?
 Redundancy – Detail for Cust/Clerk would appear on
every SO
 Delete Anomalies – Delete a sales order, delete the
customer/clerk
 Insert Anomalies – To insert a customer/clerk, must
insert sales order.
 Update Anomalies – To change the name/address, etc,
must change it on every SO.
CHAPTER SUMMARY
 Informal Design Guidelines for Relational
Databases
 Functional Dependencies (FDs)
 Definition, Inference Rules, Equivalence of Sets of
FDs, Minimal Sets of FDs
 Normal Forms Based on Primary Keys
 General Normal Form Definitions (For Multiple
Keys)
 BCNF (Boyce-Codd Normal Form)

87

You might also like