Sample Questions
Sample Questions
Sample Questions
• For each member of staff, their staff identity number, name, job title, and salary.
• For each member of staff, all departments that they belong to. It is required that every member
• For each department, the head of department. It is required that each department has exactly
(a) Draw an ER diagram that expresses the requirements for the database. Make sure that you
(b) Are there any other natural constraints one might impose on the data that are not captured by
the requirements above? For each such constraint, say whether it would be possible to modify
your ER diagram to include the constraint, and, if so, explain how this would be done
Solution
(a) Here is one possible ER diagram:
Question:
Consider the following relational database schema consisting of the four relation schemas:
passenger ( pid, pname, pgender, pcity)
agency ( aid, aname, acity)
flight (fid, fdate, time, src, dest)
booking (pid, aid, fid, fdate)
Answer the following questions using relational algebra queries;
Solution:
Relational algebra operators:
σ – selection with conditions (It selects all tuples that satisfies the conditions. Shows entire table with
respect to the structure)
Π – projection operator (It selects the attributes which are listed here)
⨝ - natural join operator (Binary operator that join two relations on common attributes’ values)
-, ∪, and ∩ - set operators (difference, union and intersection)
b) Get the details about all flights from Chennai to New Delhi.
σ src = “Chennai” ^ dest = “New Delhi” (flight)
-----------------------------------------------------------------------------------------------------
c) Find only the flight numbers for passenger with pid 123 for flights to Chennai before 06/11/2020.
Π fid (σ pid = 123 (booking) ⨝ σ dest = “Chennai” ^ fdate <
06/11/2020 (flight))
[Hint: Given conditions are pid, dest, and fdate. To get the flight id for a passenger given a pid, we have
two tables flight and booking to be joined with necessary conditions. From the result, the flight id can be
projected]
-----------------------------------------------------------------------------------------------------
d) Find the passenger names for passengers who have bookings on at least one flight.
Π pname (passenger ⨝ booking)
-----------------------------------------------------------------------------------------------------
e) Find the passenger names for those who do not have any bookings in any flights.
Π pname ((Π pid (passenger) - Π pid (booking)) ⨝ passenger)
[Hint: here applied a set difference operation. The set difference operation returns only pids that have no
booking. The result is joined with passenger table to get the passenger names.]
-----------------------------------------------------------------------------------------------------
f) Find the agency names for agencies that located in the same city as passenger with passenger id 123.
Π aname (agency ⨝ acity = pcity (σ pid = 123 (passenger)))
[Hint: we performed a theta join on equality conditions (equi join) here. This is done between details of
passenger 123 and the agency table to get the valid records where the city values are same. From the
results, aname is projected.]
-----------------------------------------------------------------------------------------------------
g) Get the details of flights that are scheduled on both dates 01/12/2020 and 02/12/2020 at 16:00
hours.
(σ fdate = 01/12/2020 ^ time = 16:00 (flight)) ∩ (σ fdate = 02/12/2020 ^ time =
16:00 (flight))
[Hint: the requirement is for flight details for both dates in common. Hence, set intersection is
used between the temporary relations generated from application of various conditions.]
-----------------------------------------------------------------------------------------------------
h) Get the details of flights that are scheduled on either of the dates 01/12/2020 or 02/12/2020 or both
at 16:00 hours.
(σ fdate = 01/12/2020 ^ time = 16:00 (flight)) ∪ (σ fdate = 02/12/2020 ^ time =
16:00 (flight))
-----------------------------------------------------------------------------------------------------
i) Find the agency names for agencies who do not have any bookings for passenger with id 123.
Π aname (agency ⨝ (Π aid (agency) – Π aid (σ pid =
123 (booking)))
-----------------------------------------------------------------------------------------------------
j) Find the details of all male passengers who are associated with Jet agency.
Π passengers.pid, pname, pcity (σ pgender = “Male” (passengers ⨝
booking ⨝ agency))
[Hint: To get the link between passengers and agency, we need to join all three tables passengers,
booking, and agency with necessary condition. Here, agency links both passengers and agency. As we
have performed natural join operation between all three tables, the degree of the result will consist of all
attributes from all the three tables. Hence, we project only passengers details as these are mentioned as
required.]
Consider the following schema
For the above schema (the primary key for each relation is denoted by the underlined attribute), provide
relational algebra expressions for the following queries:
Note:
For notational convenience, I am using pname instead of person-name, cname instead of company-
name, and mname instead of manager-name.
1. Find the name of all employees (i.e., persons) who work for the City Bank company (which is a
specific company in the database).
2. Find the name and city of all employees who work for City Bank.
Similar to previous query, except we have to access the lives table to extract the city of the
employee. The join condition is the same person name in the two tables Lives and Works.
3. Find the name, street and city of all employees who work for City Bank and earn more
than$10,000.
Similar to previous query except an additional condition on salary attribute.
4. Find all employees who live in the same city as the company they work for.
(For this query we need to access the lives table to get city of the employee and the located-in
table to get city of the company; plus the works table to associate employee with their
company. The selection condition is then that the two cities are the same.)
5.
Find all persons who do not work for City Bank.
Can write this in multiple ways – one solution is to use set difference:
From above arrow diagram on R, we can see that an attributes AB is not determined by any of the given
FD, hence AB will be the integral part of the Candidate key, i.e. no matter what will be the candidate
key, and how many will be the candidate key, but all will have W compulsory attribute.
4.7M
57
Competitive questions on Structures
Let us calculate the closure of AB
AB + = ABCD (from the method we studied earlier)
Since the closure of AB contains all the attributes of R, hence AB is Candidate Key
From the definition of Candidate Key(Candidate Key is a Super Key whose no proper subset is a Super
key)
Since all key will have AB as an integral part, and we have proved that AB is Candidate Key, Therefore,
any superset of AB will be Super Key but not Candidate key.
Hence there will be only one candidate key AB
Definition of 2NF: No non-prime attribute should be partially dependent on Candidate Key
Since R has 4 attributes: - A, B, C, D, and Candidate Key is AB, Therefore, prime attributes (part of
candidate key) are A and B while a non-prime attribute are C and D
a) FD: AB → CD satisfies the definition of 2NF, that non-prime attribute(C and D) are fully dependent on
candidate key AB
b) FD: B → C does not satisfy the definition of 2NF, as a non-prime attribute(C) is partially dependent on
candidate key AB( i.e. key should not be broken at any cost)
As FD B → C, the above table R( A, B, C, D) is not in 2NF
Convert the table R(A, B, C, D) in 2NF:
Since FD: B → C, our table was not in 2NF, let's decompose the table
R1(B, C)
Since the key is AB, and from FD AB → CD, we can create R2(A, B, C, D) but this will again have a
problem of partial dependency B → C, hence R2(A, B, D).
Finally, the decomposed table which is in 2NF
a) R1( B, C)
b) R2(A, B, D)
From above arrow diagram on R, we can see that an attributes PQS is not determined by any of the
given FD, hence PQS will be the integral part of the Candidate key, i.e., no matter what will be the
candidate key, and how many will be the candidate key, but all will have PQS compulsory attribute.
Let us calculate the closure of PQS
PQS + = PQSRT (from the method we studied earlier)
Since the closure of PQS contains all the attributes of R, hence PQS is Candidate Key
From the definition of Candidate Key (Candidate Key is a Super Key whose no proper subset is a Super
key)
Since all key will have PQS as an integral part, and we have proved that PQS is Candidate Key. Therefore,
any superset of PQS will be Super Key but not Candidate key.
Hence there will be only one candidate key PQS
Definition of 2NF: No non-prime attribute should be partially dependent on Candidate Key.
Since R has 5 attributes: - P, Q, R, S, T and Candidate Key is PQS, Therefore, prime attributes (part of
candidate key) are P, Q, and S while a non-prime attribute is R and T
a) FD: PQ → R does not satisfy the definition of 2NF, that non-prime attribute( R) is partially dependent
on part of candidate key PQS.
b) FD: S → T does not satisfy the definition of 2NF, as a non-prime attribute(T) is partially dependent on
candidate key PQS (i.e., key should not be broken at any cost).
Hence, FD PQ → R and S → T, the above table R( P, Q, R, S, T) is not in 2NF
Convert the table R( P, Q, R, S, T) in 2NF:
Since due to FD: PQ → R and S → T, our table was not in 2NF, let's decompose the table
R1(P, Q, R) (Now in table R1 FD: PQ → R is Full F D, hence R1 is in 2NF)
R2( S, T) (Now in table R2 FD: S → T is Full F D, hence R2 is in 2NF)
And create one table for the key, since the key is PQS.
R3(P, Q, S)
Finally, the decomposed tables which is in 2NF are:
a) R1( P, Q, R)
b) R2(S, T)
c) R3(P, Q, S)
Example:
F: AB C
BD
Solution:
AB+
is candidate key in above table because AB+
= ABCD
(The closure of AB contains all the attributes of R)
AB – Prime attribute ( because AB are the part of the candidate key)
CD – Non prime attribute (because CD are not the part of the candidate key)
Now, the functional dependency AB C follows the rule of 2NF,
But functional dependency B D violates the rule of 2NF, because attribute B
which is prime attribute (part of the candidate key) is determining the non prime
attribute D. It is partial dependency and this type of partial dependency is not allowed in
2NF.
Therefore, to convert the relation R(A B C D) in 2NF, It is divided into two relations R1,
R2 as following:
R1 (A B C)
R2(B D)
In R1 (A B C), AB is candidate key and, since AB C holds.
In R2 (B D), B is candidate key since B D holds.
Now R1 and R2 are following the rules of 2NF.
The B D
violates the rule of
2NF in R, so BD is
kept in separate
table.
Solution: Let us construct an arrow diagram on R using FD to calculate the candidate key.
From the above arrow diagram on R, we can see that an attribute X is not determined by any of the
given FD, hence X will be the integral part of the Candidate key, i.e. no matter what will be the candidate
key, and how many will be the candidate key, but all will have X compulsory attribute.
Let us calculate the closure of X
X + = X(from the closure method we studied earlier)
Since the closure of X contains only X, hence it is not a candidate key.
Let us check the combination of Y, i.e. XY, XZ.
a) XY + = XYZ ( from the closure method we studied earlier)
Since the closure of XY contains all the attributes of R, hence XY is Candidate Key
b) XZ + = XZY (from the closure method we studied earlier)
Since the closure of XZ contains all the attributes of R, hence XZ is Candidate Key
Hence there are two candidate key XY and XZ
Since R has 3 attributes: - X, Y, Z, and Candidate Key is XY and XZ, Therefore, prime attribute(part of
candidate key) are X, Y, and Z while a non-prime attribute is none.
Using the Definition of 3NF to check whether R is in 3NF?: First, it should be in 2NF and if there exists a
non-trivial dependency between two sets of attributes X and Y such that X → Y ( i.e. Y is not a subset of
X) then
a) Either X is Super Key
b) Or Y is a prime attribute.
Given FD are XY → Z, and Z → Y and Super Key / Candidate Key are XZ and XY
a) FD: X Y → Z satisfies the definition of 3NF, as XY is Super Key also Z is a prime attribute.
b) FD: Z → Y satisfies the definition of 3NF, even though Z is not Super Key but Y is a prime attribute.
Since both FD of R, XY → Z and Z → Y satisfy the definition of 3NF hence R is in 3 NF
Using the Definition of BCNF to check whether R is in BCNF?: First, it should be in 3NF and if there
exists a non-trivial dependency between two sets of attributes X and Y such that X → Y ( i.e. Y is not a
subset of X) then
a) X is Super Key
Given FD are XY → Z, and Z → Y and Super Key / Candidate Key is XZ and XY
b) FD: X Y → Z satisfies the definition of BCNF, as XY is Super Key.
c) FD: Z → Y does not satisfy the definition of BCNF, as Z is not Super Key Since both FD of R, XY → Z and
Z → Y satisfy the definition of 3NF hence R is in 3 NF
Convert the table R( X, Y, Z) into BCNF:
Since due to FD: Z → Y, our table was not in BCNF, let's decompose the table
FD: Z→ Y was creating an issue, hence one table R1( Z, Y )
Create Table for key XY R2(X, Y) as XY was candidate key
Create Table for key XZ R2(X, Z) as XZ was candidate key
Note: When we have more than one key( eg: XY and XY) then while decomposing keep in
mind that you compare both R2 and R3 with R1 such that among R1 and R2 or R1 and R3
there should be at least one common attribute and, that common attribute must be key in
any of the table.
Considering R1( Z, Y) and R2(X, Y) both tables have one common attribute Y, but Y is not key in any of
the table R1 and R2, hence we discard R2(X, Y) i.e. discarding candidate key XY.
Considering R1( Z, Y) and R3(X, Z) both tables have one common attribute Z, and Z is key of the table R1,
hence we include R3(X, Z) i.e. including candidate key XZ.
Hence decomposed tables which are in BCNF:
R1(Z, Y)
R2(X, Z)
Question : Given a relation R( X, Y, Z) and Functional Dependency set FD = { X → Y and Y
→ Z }, determine whether the given R is in BCNF? If not convert it into BCNF.
Solution: Let us construct an arrow diagram on R using FD to calculate the candidate key.
From the above arrow diagram on R, we can see that an attribute X is not determined by
any of the given FD, hence X will be the integral part of the Candidate key, i.e. no matter
what will be the candidate key, and how many will be the candidate key, but all will have
X compulsory attribute.
Let us calculate the closure of X
X + = XYZ (from the closure method we studied earlier)
Since the closure of X contains all the attributes of R, hence X is Candidate Key
From the definition of Candidate Key (Candidate Key is a Super Key whose no proper
subset is a Super key)
Using the Definition of BCNF to check whether R is in BCNF?: First, it should be in 3NF
and if there exists a non-trivial dependency between two sets of attributes X and Y such
that X → Y ( i.e. Y is not a subset of X) then
a) X is Super Key
First, we check that table is in 3NF?
Using the Definition of 3NF to check whether R is in 3NF?: If there exists a non-trivial
dependency between two sets of attributes X and Y such that X → Y ( i.e. Y is not a subset
of X) then
a) Either X is Super Key
b) Or Y is a prime attribute.
a) FD: X → Y is in 3NF (as X is a super Key)
b) FD: Y → Z is not in 3NF (as neither Y is Key nor Z is a prime attribute)
Hence because of Y → Z using definition 2 of 3NF, we can say that above table R is not in
3NF.
Convert the table R( X, Y, Z) into 3NF:
Since due to FD: Y → Z our table was not in 3NF, let's decompose the table
FD: Y → Z was creating issue, hence one table R1(Y, Z)
Create one Table for key X, R2(X, Y), since X → Y
Hence decomposed tables which are in 3NF:
R1(X, Y)
R2(Y, Z)
Both R1(X, Y) and R2(Y, Z) are in BCNF
Question Suppose you are given a relationR= (A, B, C, D, E) with the following
functional dependencies:{CE→D, D→B, C→A}
a. Find all candidate keys.
b. Identify the best normal form that R satisfies (1NF, 2NF, 3NF, or BCNF).
c. If the relation is not in BCNF, decompose it until it becomes BCNF.
At each step, identify a new relation, decompose and re-compute the keys and the
normal forms they satisfy.
Answer.a. The only key is{C, E}
b. The relation is in 1NF
c. Decompose into R1=(A,C) and R2=(B,C,D,E). R1 is in BCNF, R2 is in 2NF.
Decompose R2into, R21=(C,D,E) and R22=(B,D). Both relations are in BCNF.
Question You are given the following set of functional dependencies for a relation
R(A,B,C,D,E,F),F={AB→C, DC→AE, E→F}.
a. What are the keys of this relation?
b. Is this relation in BCNF? If not, explain why by showing one violation.
c. Is the decomposition (A,B,C,D) (B,C,D,E,F) a dependency preserving
decomposition? If not,explain briefly.
Answer
.a. What are the keys of this relation?{A, B, D}and{B, C, D}.
b. Is this relation in BCNF? If not, explain why by showing one violation. No, all
functional dependencies are actually violating this. No dependency contains a
superkey on its left side.
c. Is the decomposition (A,B,C,D) (B,C,D,E,F) a dependency preserving
decomposition? If not, explain briefly. Yes,AB→C and DC→A are preserved in the
first relation.DC→E and E→F are preserved in the second relation
QUESTIONS ON CLOSURE SET OF ATTRIBUTE:
1) Given relational schema R( P Q R S T U V) having following attribute P Q R S T U and
V, also there is a set of functional dependency denoted by FD = { P->Q, QR->ST, PTV-
>V }.
Now as per algorithm look into a set of FD that complete the left side of any FD contains
either Q, R, or QR since in FD QR→ST has complete QR.
Again, trace the remaining two FD that any left part of FD contains any Q, R, S, T.
Since no complete left side of the remaining two FD{P->Q, PTV->V} contain Q, R, S, T.
Note: In FD PTV→V, T is in QRST but that cannot be entertained, as complete PTV should
be a subset of QRST
Now as per algorithm look into a set of FD, and check that complete left side of any FD
contains either P, R, or PR. Since in FD P→Q, P is a subset of PR, Hence PR+ = PRQ
Again, trace the remaining two FD that any left part of FD contains any P, R, Q, Since, in
FD QR → ST has its complete left part QR in PQR
Again trace the remaining one FD { PTV->V } that its complete left belongs to PRQST.
Since complete PTV is not in PRQST, hence we ignore it.
T + = T (as the closure of an attribute or set of attributes contain same) Now as per
algorithm look into a set of FD that complete the left side of any FD contains T since, in
FD T → P, T is in T, Hence T+ = TP Again trace the remaining three FD that any left part of
FD contain any TP, Since in FD P → QR has its complete left part P in TP, Hence T+ = TPQR
Again trace the remaining two FD { RS->T, Q->S } that any of its Complete left belongs to
TPQR, Since in FD Q → S has its complete left part Q in TPQR, Hence T+ = TPQRS Again
trace the remaining one FD { RS->T } that its complete left belongs to TPQRS, Since in FD
RS → T has its complete left part RS in TPQRS Hence T+ = TPQRS ( no changes, as T, is
already in TPQRS) Therefore T+ = TPQRS ( Answer)