functional dependency and Attribute Closure (1)
functional dependency and Attribute Closure (1)
Introduction
Functional dependency can be de ned as the method that describes the relationship between the attributes in a given
relation. It means we represent how the different attributes in a dataset table are related to each other with the help of
functional dependencies. Suppose there is functional dependency A→B; it means 'A determines B' or 'B is determined by A
.'Here 'A' is a determinant attribute, and 'B' is a determined attribute. We can also say that 'B' is dependent on 'A.'
Now let’s understand with the help of an example, suppose there is a functional dependency given STUDENT_ID→
STUDENT_NAME, as mentioned in the table below:
STUDENT_ID STUDENT_NAME
1 David
2 David
Here in the above table, we see two students with the same name. How will we differentiate whether they are the same
students or different ones? To uniquely identify this, we will refer to STUDENT_ID. Now we can clearly see that both have
different IDs, which means both are different students, and each tuple dataset (all the other attributes) will be different.
Here STUDENT_NAME is a determined attribute, and STUDENT_ID is a determinant attribute that solves confusions
related to determinant attribute by uniquely identifying the tuple.
Validating a FunctionalDependency
Now, let's see how functional dependencies can help to validate a data set with a few examples. It will help us to
understand the concept more clearly.
Table 1
In the above table, the functional dependency from STUDENT_ID → STUDENT_NAME is valid since both students have
different names and ids.
STUDENT_ID STUDENT_NAME
1 Bob
2 David
STUDENT_ID STUDENT_NAME
1 Bob
1 Bob
Table 2: In the above table, functional dependency STUDENT_ID → STUDENT_NAME is valid since both students are the same, and there
is redundant data in the table.
Table 3
STUDENT_ID STUDENT_NAME
1 Bob
2 Bob
In the above table, functional dependency STUDENT_ID → STUDENT_NAME is valid since both students are different, and
STUDENT_ID helps identify it uniquely.
Table 4
STUDENT_ID STUDENT_NAME
1 Bob
1 David
In the above table, functional dependency STUDENT_ID,→ STUDENT_NAME is not valid since both students are
different and cannot have the same STUDENT_ID.
For example, STUDENT_ID→STUDENT_ID is a trivial functional dependency; it will always be valid, validity and invalidity
of a functional dependency have already been discussed above.
There is another method to determine trivial functional dependency: if we take the intersection of left and right
attributes, it will never be empty('Ⲫ').
In non-trivial dependency cases of validity and invalidity arise, trivial ones are always valid.
A multivalued dependency consists of at least two attributes that are dependent on a third attribute, which is why at least
three attributes are always required.
1 Science 2020
2 Science 2021
Here, STU_ID can determine both COURSE and PASSING_YEAR. STU_ID→{ COURSE, PASSING_YEAR }, but there is no
functional dependency between COURSE and PASSING_YEAR. Hence we can say that COURSE and PASSING_YEAR
both are independent of each other which makes them a multivalued dependent on STU_ID.
For example, if A→B, B→C holds true, then according to the axiom of transitivity, A→C will also hold true.
1 7 L202
2 6 B101
Here, with the STU_ID we can determine CLASS, and with CLASS, we can determine the LECTURE_HALL number for that
particular class. It means with the help of STU_ID, we can determine LECTURE HALL. Therefore STU_ID→LECTURE_HALL holds
true.
Functional Dependency Set
The functional Dependency set(or FD set) of a relation is the set of all functional dependencies present in the given
relation.
1. STU_ID→STU_NAME,
2. STU_ID→STU_AGE,
3. STU_ID→STU_NO,
4. STU_STATE→STU_COUNTRY
}
Here, STU_ID is a ‘determinant attribute’ that can determine and validate STU_NAME, STU_AGE, STU_NO. Similarly,
STU_STATE→STU_COUNTRY is a functional dependency where STU_STATE can determine and validate STU_COUNTRY.
So set of all these dependencies will form an FD set.
For example, if STU_ID→CLASS, CLASS→LECTURE_HALL holds true then according to the axiom of transitivity,
STU_ID→LECTURE_HALL will also hold true.
For example, STU_ID → STU_NAME, STU_ID→COURSE then STU_ID→ {STU_NAME, COURSE} holds true.
For example STU_ID→ {STU_NAME, COURSE} then STU_ID → STU_NAME, STU_ID→COURSE holds true.
After understanding the functional dependencies and their types, let's know what an attribute closure is:
Attribute Closure
Attribute Closure of an attribute set is de ned as a set of all attributes that can be functionally determined from it.
The closure of an attribute is represented as +
Given a relation R(A,B,C,D) and FD { A→B, B→C, C→D}, then determine the A+ :
Which of the following dependencies is not implied by the above set? (GATE CS 2005).
A. CD→AC
B. BD→CD
C. BC→CD
D. AC→BC
Solution:
1. CD can determine C and D(can determine themselves) and E(as CD→E, given in FD
set) . Therefore, (CD) + =CDE
2. Now E can determine A (as E→A), (CD) + =CDEA
3. With the help of A, we can determine B, (CD) + =CDEAB
So,(CD) + =ABCDE
We can clearly see that with CD, AC can be determined. Therefore, CD→AC holds true.
On taking BD closure (BD) + =BD as they can determine themselves only, no other
dependencies in the FD set.
We can see that BD→CD does not hold; hence B is the correct option.
Another example:
Finding Candidate Keys and Super Keys using the Attribute Closure
Question: The following functional dependencies are given:
A. CF+ = {ACDEFG}
B. BG+ = {ABCDG}
C. AF+ = {ACDEFG}
D. AB+ = {ABCDFG}
Solution:
This can be done with the steps discussed above in the article.
But option C and D have attribute closure: (AF) + =(AFDE) and (AB) + =(ABCDG).
A candidate key is the minimal set of attributes that can uniquely identify a tuple. For example, STU_ID in the 'Student'
relation can be a candidate key. The candidate key should be unique and not null.
A superkey is also a set of attributes that can uniquely identify a tuple in a given relation. The concept of the candidate key
and the super key is closely related. Super key reduced to the minimum number of attributes makes the candidate key,
which means the super key is the superset of a candidate key. In a given relation, ‘Student’ STU_ID and STU_NAME can be
super key.
The attribute closure method can also be used to nd all the candidate keys and super keys in the given relation.
If the attribute closure of an attribute set contains all the attributes of the given relation, then the attribute set will be the
superkey of the relation.
Similarly, if no subset of this attribute set can functionally determine all relation attributes, the present set will be a
candidate key.
For example given a relation R(A,B,C,D) and FD { A→B, B→C, C→D,D→A}, then A+ =BCDA,
B+ =ACDB
C+ =ACDB D+ =ACDB
Since all A, B, C, D can functionally determine all the attributes, so candidate key set will be
{A, B, C, D}. With the help of this, we can also determine prime and non-prime attributes.
Prime attributes are the set of all attributes present in the candidate key, and non-prime attributes are the remaining
attributes of the relation.
Superkeys can be formed by combining other attributes with the candidate key. Suppose
AB can uniquely identify all the other attributes in the given, and A alone can do this, so AB
GATE Question: Consider a relation scheme R = (A, B, C, D, E, H) on which the following
functional dependencies hold: {A–>B, BC–>D, E–>C, D–>A}. What are the candidate keys
of R?
1. AE, BE
2. AE, BE, DE
(DEH)+={ABCDEH}