21csc205p Dbms Unit IV
21csc205p Dbms Unit IV
UNIT-IV
Pitfalls in Relational Database
Introduction
• Relational databases are widely used in modern software
applications for storing and managing data.
• However, designing a good schema that represents the data
accurately and efficiently is a challenging task.
• In this presentation, we will explore the pitfalls in relational database
design, and discuss how to decompose a bad schema and understand
functional dependencies to improve it.
Pitfalls in Relational Database Design
1.1 Redundancy:
Storing the same data in multiple places can lead to inconsistencies and inefficiencies.
1.2 Inconsistency:
Data inconsistencies can occur when different parts of the database hold different
versions of the same data.
1.3 Inefficiency:
Poor database design can result in slower query performance and increased storage
requirements.
1.4 Complexity:
A complex database schema can be difficult to understand and maintain, leading to
errors and inefficiencies.
Decomposing a Bad Schema
• employees2(department, salary)
Example
In this new design, each table has fewer attributes and there is no
redundancy.
• A→B
• B→C
• C→D
To find the closure of the attribute set {A}, the axioms are used to derive all the other attributes that
are functionally dependent on {A}.
Using reflexivity, we know that A → A holds true, so we can augment {A} to get {A, B}. Then,
using transitivity, we know that {A, B} → C holds true, so we can augment {A, B} to get {A, B,
C}.
Finally, using transitivity again, we know that {A, B, C} → D holds true, so we can augment {A, B,
C} to get the closure of {A}, which is {A, B, C, D}.
In this example, we have identified all the functional dependencies that hold in the relation and
eliminated any redundancy or anomalies by normalizing the relation.
Closure of FD set
● Given a set F of functional dependencies on a schema, we can prove that certain other functional dependencies
also hold on the schema. We say that such functional dependencies are “logically implied” by F.
● Let F be a set of functional dependencies. The closure of F, denoted by F+, is the set of all functional
dependencies logically implied by F. Given F, we can compute F+ directly from the formal definition of
functional dependency.
24
● We can use the following three rules to find logically implied functional
dependencies.
● By applying these rules repeatedly, we can find all of F+, given F. This collection of
rules is called Armstrong’s axioms in honor of the person who first proposed it.
25
● Armstrong’s axioms are sound, because they do not generate any incorrect functional
dependencies. They are complete, because, for a given set F of functional
dependencies, they allow us to generate all F+.
26
ADDITIONAL RULES
27
PROCEDURE TO COMPUTE F+
F+ = F
apply the reflexivity rule /* Generates all trivial dependencies */
repeat
for each functional dependency f in F+
apply the augmentation rule on f
add the resulting functional dependencies to F+
for each pair of functional dependencies f1 and f2 in F+
if f1 and f2 can be combined using transitivity
add the resulting functional dependency to F+
until F+ does not change any further
28
EXAMPLE
For relation R = (A, B, C, G, H, I) and the set F of functional dependencies {A → B, A → C, CG → H,
CG → I , B → H}. Find out the closure of functional dependency.
● Another way of finding that AG → I holds is as follows: We use the augmentation rule on A → C to
infer AG → CG. Applying the transitivity rule to this dependency and CG → I, we infer AG → I .
29
Closure of attributes
30
PSEUDOCODE TO FIND THE CLOSURE OF ATTRIBUTE
31
EXAMPLE
● Given Relational schema R(A,B,C,D,E) with the given FDs find the closure
of an attribute ‘E’.
32
Irreducible a set of functional dependencies
(or)Canonical Cover
● A canonical cover or irreducible a set of functional dependencies FD is a
simplified set of FD that has a similar closure as the original set FD.
Extraneous attributes
● An attribute of an FD is said to be extraneous if we can remove it without
changing the closure of the set of FD.
33
PROCEDURE TO COMPUTE CANONICAL COVER
Fc = F
repeat
Use the union rule to replace any dependencies in Fc of the form
α1 → β1 and α1 → β2 with α1 → β1 β2.
Find a functional dependency α → β in Fc with an extraneous
attribute either in α or in β.
/* Note: the test for extraneous attributes is done using Fc, not F */
If an extraneous attribute is found, delete it from α → β in Fc.
until (Fc does not change)
34
● A canonical cover Fc for F is a set of dependencies such that F logically implies all
dependencies in Fc, and Fc logically implies all dependencies in F. Furthermore, Fc
must have the following properties:
● No functional dependency in Fc contains an extraneous attribute.
● Each left side of a functional dependency in Fc is unique. That is, there are no two
dependencies α1 → β1 and α2 → β2 in Fc such that α1 = α2
35
Normalizatio
✔ Database
n
Normalization is the technique of organizing data into
more than one table in the database.
✔ Updation Anomaly: To update address of student to occurs twice (or) more than
twice in a table we have to update address column in all the rows else the data will
come inconsistent.
✔ Insertion Anomaly: Suppose for a student new admission, we have s_id, s_name,
s_address of a student but the student has not opted for any subject then we have
to insert null for the subject which leads to insertion anomalies.
✔ Deletion Anomaly: If s_id = 401 has only one subject and temporarily drop it when
we delete that row entire student record will be deleted along with it.
NORMAL
FORMS
✔ The term Normalization comes from the concept of normal form which
describe how to organize the data in the database.
✔ Normal Forms were developed around the concept of table based on relational
database.
Type
✔
s Form)
1NF (First Normal
✔ 2NF (Second Normal Form)
✔ 3NF (Third Normal Form)
✔ BCNF (Boyce-Codd Normal Form)
✔ 4NF (Fourth Normal Form)
✔ 5NF (Fifth Normal Form)
UN-NORMALIZED
FORM
✔ If a table contains one or more repeating groups it is called Un-Normalized
form.
Course Content
Programming C, Java, Python
✔ If and only if all the attributes of relation are atomic (Single valued) in
nature.
✔ Must not contain any multi valued (or) composite attributes.
Course Content
Programming C, Java, Python
Web PHP
✔ This 1NF allows data redundancy and there will be many column with same
data in multiple rows but each row as a whole will be unique.
✔ It is used in most small to medium application.
SECOND NORMAL FORM
✔ (2NF)
A relation is said to be in Second Normal Form,
✔ If and only if it is already in First Normal Form.
✔ There exist no Partial Dependency.
✔ Partial Dependency:
✔ It means that a non-prime attribute is functionally dependent on part of a prime
attribute.
✔ Functional Dependency:
✔ All the non key attributes are fully functional depends on key attribute.
47
To eliminate Partial Dependency
✔ After removing the position into another relation store the lesser
amount of data in two relations without the loss of data.
Example Table
Emp_id E_name Month Sales B_id B_name
E1 AA Jan 1000 B1 SBI
E1 AA Feb 1200 B1 SBI
E2 BB Jan 2200 B2 UBI
E2 BB Feb 2500 B2 UBI
E3 CC Jan 1700 B1 SBI
E3 CC Feb 1800 B1 SBI
Table 1 Table 2
Emp_id E_name Month Sales B_id B_id B_name
E1 AA Jan 1000 B1 B1 SBI
E1 AA Feb 1200 B1
B2 UBI
E2 BB Jan 2200 B2
E2 BB Feb 2500 B2
E3 CC Jan 1700 B1
E3 CC Feb 1800 B1
THIRD NORMAL FORM (3NF)
✔ A relation is said to be in Third Normal Form if and only if,
✔ It is already in Second Normal Form.
✔ There exist no Transitive Dependency.
Transitive Dependency
A->B (B depends on A)
B->C
A->C
Eg:
Rollno->marks
Marks->grade
Rollno->grade
To eliminate Transitive Dependency
✔ For each non-key attribute that is determinant in relation, create a
new relation that attribute becomes a primary key in new relation.
✔ Leave the attribute (which serve as primary key in new relation) in old
relation to serve as a foreign key.
S_id S_name City Pincode
1 Shiva Chennai 15
2 Peter Pondy 17
3 Meera Villupuram 20
4 Rohan Delhi 01
5 Shakthi Bangalore 18
Example Table
Table 1 Table 2
S_id S_name Pincode Pincode City
1 Shiva 15 15 Chennai
2 Peter 17 17 Pondy
3 Meera 20 20 Villupuram
4 Rohan 01 01 Delhi
5 Shakthi 18 18 Bangalore
Fourth Normal Form (4NF)
ID→→dept name is a non-trivial multivalued dependency and ID is not a superkey for the schema.
4NF DECOMPOSITION ALGORITHM
Lossless-join Decomposition
2NF A relation will be in 2NF if it is in 1NF and all non-key attributes are fully functional dependent on the primary key.
BCNF A stronger definition of 3NF & on the left-hand side of the functional dependency there is a candidate key
4NF A relation will be in 4NF if it is in Boyce Codd's normal form and has no multi-valued dependency.
5NF A relation is in 5NF. If it is in 4NF and does not contain any join dependency, joining should be lossless.
THANK YOU