Minimal Cover Document
Minimal Cover Document
Minimal Cover Document
{Lecturer_ID} → {Lect_First_Name}
We can continue to find other FDs that are logically implied by F. The set of all such
FDs (and including those in F) is called the closure of F (denoted F+).
Using these axioms we can find all FDs logically implied by some set of FDs F and
hence find its closure F+.
Exercise 11
Using the axioms, find the closure F+ of F = {AB → CD, C → E}.
Looking again at the PROJECTS schema from our previous 3NF example, we had the
following functional dependencies:
Project-Budget
Project Department
Dept-Address
Project-Budget Department
it appears that we have lost one of the original FDs: Project → Dept-Address.
However, using the axioms, we can infer that Project → Dept-Address is logically
implied by Project → Department and Department → Dept-Address so that the FD has
not actually been lost.
(FR1 ∪ FR2)+ = F+
Where FR1 is the set of dependencies that apply between attributes only in R1 and
similarly FR2 is the set of dependencies that apply between attributes only in R2.
Although the set of dependencies (FR1 U FR2) might be different to the original set of
dependencies F, we only require that the two sets are equivalent (that is, their
closures are the same) for the decomposition to be dependency preserving.
For example, the closure of the PROJECTS relation in the previous example is:
The closure of the combined FDs in the decomposed tables is the same, so the
decomposition is dependency preserving.
RELATION: R1
A B C
a1 b1 c1
a3 b1 c2
a2 b2 c3
a4 b2 c4
we would have:
RELATION: R2 RELATION: R3
A B B C
a1 b1 b1 c1
a3 b1 b1 c2
a2 b2 b2 c3
a4 b2 b2 c4
A→B C→B
A B C
a1 b1 c1
a1 b1 c2
a3 b1 c1
a3 b1 c2
a2 b2 c3
a2 b2 c4
a4 b2 c3
a4 b2 c4
The resultant relation contains more rows (spurious tuples) than existed in the original
R1. This is called a “lossy decomposition” as it causes constraint information to be
lost. Lossy decomposition is extremely undesirable and should be avoided at all costs.
(R1 ∩ R2) → R1 or
(R1 ∩ R2) → R2
We can interpret this condition as the condition that common attributes must form a
superkey in either R1 or R2 for the decomposition to be lossless.
Exercise: use the definition above to show that decomposition of R1 = ({A, B, C},
{A → B, C → B}) into R2 = ({A, B}, {A → B}) and R3 = ({B, C}, {C → B}) is not
lossless.
The closure of a set of FDs is of interest, as we have seen, but can be very large and
take a long time to compute. There is no efficient method (shortcut) for computing it.
For algorithms used in decomposition, typically we only want to check if some given
FD X → Y is in the closure of a set of FDs F. There is a way of doing this without
computing the closure F+. One ad hoc way of answering this is to try to derive X → Y
from F using the inference axioms. Another is by use of the concept of the closure of
an attribute.
The attribute closure of X with respect to the set of FDs F is defined as the set of all
attributes A such that X → A is in F+.
Closure of a set of attributes X (denoted X+) is the set of all the attributes that are
functionally dependent on X, given some set of functional dependencies F. The
attribute closure is much more efficient to compute than F+. The aim of computing it
is to find what attributes depend on a given set of attributes and therefore ought to be
together in the same relation. We will use it later to formulate an algorithmic way of
decomposing a relation into relations of higher normal form.
The following algorithm is used to compute the closure of a set of attributes X under
the set of FDs F.
X+ := X
repeat
temp_X+ := X+
for each FD Y → Z in F do
if Y ⊆ X+ then X+ := X+ ∪ Z
until (temp_X+ = X+)
Explanation of Algorithm:
If the attribute X determines some attribute Y and in turn Y determines Z, then X also
determines Z (axiom (3)) and so Z is added to X+. Continuing, if any set of attributes
in X+ determine some other attribute(s) then X also determines this attribute(s) and it
is added to X+, and so on.
We can use attribute closure to quickly answer questions about specific FDs, and also
to find keys of complex relations.
Example:
Is B → G in F+ where
R = {A, B, C, D, E} and
F = {B → CD, E → F, D → E, B → A, AD → B, F → G}
B+ = B
B+ = BCD
B+ = BCDE
B+ = BCDEA
B+ = BCDEAF
B+ = BCDEAFG
stop
Yes G is in B+ so B → G is in F+.
Note that B+ contained all the attributes in the relation R and so B must be a key to
the relation. Finding the closure of attributes can be used as a general method for
checking if multiple attributes taken together are relation keys. For example, is AD a
relation key of R? To answer this, find A+, D+ and AD+. If neither A+ nor D+ contain
all the attributes in R but AD+ does, then AD is a relation key.
Exercise 12
Given the relation:
Compute the X+, showing each step of the algorithm, and from X+ determine if
(a) X is a key of R
(b) X → VZ holds on R
As we’ve seen, the closure of a set of attributes X under a set of FDs F is the set of all
the attributes that are functionally dependent on X. We can use attribute closure to
remove redundant FDs from a set. This will be useful in systematically normalising
relations (later). As an example, remove all redundant functional dependencies from
F, using the attribute closure algorithm, where:
F = {AD → B, B → C, C → D, A → B}
set G = F
for each FD X → A in G {
compute X with respect to the set of dependencies (G – (X → A))
+
G = {AD → B, B → C, C → D, A → B}
For AD → B compute AD+ under (G – (AD → B))
AD+ = AD
AD+ = ADB
AD+ = ADBC
AD+ = ADBCD
stop
G = (G – (AD → B)) = {B → C, C → D, A → B}
For B → C compute B+ under (G – (B → C))
B+ = B
stop
G = {B → C, C → D, A → B}
For C → D compute C+ under (G – (C → D))
C+ = C
stop
G = {B → C, C → D, A → B}
For A → B compute A+ under (G – (A → B))
A+ = A
stop
G = {B → C, C → D, A → B}
end
Note that depending on the FDs and their order we can sometimes find a different G.
There may be further redundancy in a set of FDs in the form of redundant attributes in
the determinants of the FDs. If we also remove such attributes we arrive at what is
called a minimal cover for the set of FDs. The minimal cover must also have the
property of having single attributes on the right hand side of all FDs in the set.
Formally:
The minimal cover for a set of FDs F is the set of FDs G such that:
set G = F
replace each FD X → {A1,A2,...,An} in G by n FDs X → A1,X → A2,...,X → An
for each FD X → A in G {
compute X with respect to the set of dependencies (G – (X → A))
+
The first step simply replaces any FDs with multiple attributes on the rhs with
equivalent sets of FDs with rhs containing only one attribute.
The second step removes any redundant FDs, using the algorithm we had before.
The third step checks if any attribute on the left hand side of any FD can be removed.
Note: it is sometimes possible to have more than one valid minimal cover for a given
set of FDs.
G = {AB → D, B → C, AE → B, A → D, D → E, D → F}
G = {B → C, AE → B, A → D, D → E, D → F}
For AE → B
For A: compute E+ with respect to (G-(AE → B) ∪ (E → B))
E+ wrt {B → C, E → B, A → D, D → E, D → F} = EBC
E+ doesn’t contain A, so A not redundant in AE → B
For E: compute A+ with respect to (G-(AE → B) ∪ (A → B))
A+ wrt {B → C, A → B, A → D, D → E, D → F} = ABDEFC
A+ contains E, so E is redundant in AE → B
Minimal closure = {B → C, A → B, A → D, D → E, D → F}
A → BC
B→C
A→B
AB → C
AC → D
We can use our minimal cover algorithm to systematically decompose relations into a
higher normal form, whilst meeting our objectives of preserving dependencies (at
least for 3NF) and avoiding loss in the decomposition (for either 3NF or BCNF).
We stop here at 3NF but note that the maths can be applied to systematically
normalising to higher normal forms. Note however, that we cannot guarantee that
there will always exist a BCNF (or higher normal form) which is lossless and
dependency preserving. For this reason, 3NF is often a high enough normal form for
many practical applications.
Exercise 15
By first finding a minimal cover, normalise the relation R to 3NF.