Exam 04
Exam 04
Exam 04
3 Hours Answer all questions. Make sure that your answers are clear and to the point. Calculators and printed foreign language dictionaries are allowed. No reference material is allowed. There are 180 marks on the exam.
CONTENTS: Question 1. Question 2. Question 3. Question 4. Question 5. Question 6. Question 7. Relational Data Model and Relational Algebra SQL and Enhanced Entity Relationship Data Model Mapping ER to Relational Data Model Functional Dependencies and Normalization Query Optimisation Concurrency Control [26 marks] [26 marks] [25 marks] [25 marks] [33 marks] [25 marks] [20 marks]
COMP302
Continued . . . .
COMP302
Continued . . . .
Student ID.
[26 marks]
(a) What is a Relation in the Relational Model, and what two properties must its elements have? [3 marks] ANSWER
(b) What is the difference between a Relation and a Relation Schema? ANSWER
[3 marks]
[3 marks]
[3 marks]
COMP302
Continued . . . .
(e) Suppose R = ({A,B,C},{A}) and S = ({C,D,E},{C}) are relation schemas, and r(R) and s(S) are relations over R and S respectively. (i) Write a relational algebra expression for the relation consisting of those tuples in r(R) for which C > 10, restricted to the attributes A,B. [2 marks] ANSWER
(ii) Write a relational algebra expression for the relation over {B,C,D} consisting of the values of B, C, and D from all pairs of tuples from r(R) and s(S) that have the same value for the attribute C. [2 marks] ANSWER
[1 marks]
(f) Suppose r1 and r2 are relations over relation schemas N1({A, B}, {A}) and N2({B, C}, {B}), respectively. Using the relational algebra project operator, express the referential integrity constraint N1[B] N2[B]. [2 marks] ANSWER
COMP302
Continued . . . .
Student ID.
(g) Consider the following set of three relation schemas and referential integrity constraints: N1 ({A, B }, {A }) N2 ({A, C, D }, {AC }) N3 ({A, C, E, F }, {ACE }) N2 [A] N1 [A], N3 [A] N1 [A] N3 [(A, C )] N2 [(A, C )]
(i) Which of the three referential integrity constraints given above is a consequence of the other two constraints? [2 marks] ANSWER
(ii) Use the transitivity property of the set inclusion operator to show that the referential integrity constraint you identified in part (i) will be always satisfied if the other two referential integrity constraints are satisfied. [5 marks] ANSWER
COMP302
Continued . . . .
Question 2. SQL
[26 marks]
The three relation schemas below are part of a relational database schema to record details of tests measuring the time it takes for a test car to stop on different road surfaces when fitted with different tyres. The tyres are specified by their make and model. The database records the width and tread pattern of each tyre, the gravel-size and kind of tar of each road surface, and the stopping time of each test. Tyre ({Make, Model, Width, TreadPattern }, {Make+Model }) RoadSurface ({SurfaceCode, GravelSize, Tar } {SurfaceCode }) StoppingTest ({SurfaceCode, Make, Model, Time }, {SurfaceCode+Make+Model }) The attribute constraints are given in the following table: Attribute Make Model Width TreadPattern SurfaceCode GravelSize Tar Time Data Type Char Char Decimal Char Char Int Char Integer Max. Length (bytes) variable length 8 6 15 6 4 Variable length 4 Null No No No Yes No No Yes No [4 Marks]
(b) Write an SQL statement to return the make and model of all tyres with a width greater than 15. [3 Marks] ANSWER
COMP302
Continued . . . .
Student ID.
(c) Write an SQL statement to return a list of all tyres and their stopping time on roads with a gravel size of 5, ordered from the best tyres (fastest to stop) to the worst. [4 Marks] ANSWER
(d) Write an SQL statement that would define the StoppingTest table. Include any key and referential integrity constraints. [7 Marks] Note, deleting a tyre or a road surface from the database should also remove any stopping tests involving the tyre or the road surface. Modifying data in the tyre or road surface relations should be possible. ANSWER
COMP302
Continued . . . .
(e) Write an SQL statement to return the make and model of each tyre that has not been tested on all possible road surfaces, along with the number of road surfaces that the tyre has not been tested on. [8 marks] ANSWER
COMP302
Continued . . . .
Student ID.
[25 marks]
Construct an EER diagram for each of the situations described below. The names of the entities are in italic font. Your diagrams should indicate the keys of all entities, as well as the cardinality and participation constraints of all relationships. (a) An HR database. [4 marks] The HR section of a company keeps a database of all Applicants for positions. It identifies each Applicant by an application number, and stores their name and address (house number, street, and town)
ANSWER
(b) A Factory Database [5 marks] A factory has a large number of milling machines and a staff of machine operators. Each milling Machine is identified by a machine number, and also has a machine type. Each Operator is identified by their name (they use nicknames if two people have the same real name) and has a seniority level. There are several working sessions in a day. At the beginning of the day, the manager assigns each operator to a machine for each session. An operator may have a different machine in each session. There are more machines than there are operators.
ANSWER
COMP302
Continued . . . .
(c) A Storage Database [8 marks] A Geology department keeps its collection of rocks in large cabinets. Each Cabinet is identified by the name of the researcher who collected the rocks in the cabinet. (No one has more than one cabinet). Each cabinet contains a set of Drawers where the rocks are put. The drawers of a cabinet are numbered 1, 2, 3, Each drawer is labeled with the field trip or trips where the rocks were collected (each field trip is specified by a place and date). ANSWER
COMP302
10
Continued . . . .
Student ID.
(d) A Mouse Database. [8 marks] A Psychology department has a database of the mice that they have bred. They use the male mice for maze experiments. Each Mouse is identified by a name, and is described by its sex and date of birth. Each FemaleMouse has a count of her children, and her largest litter size. Each MaleMouse has a count of the number of experiments it has done, its total score, and its average score. The database must also record which two mice are the parents of each mouse. ANSWER
COMP302
11
Continued . . . .
[25 marks]
For each of the following EER diagrams, map the diagram to an equivalent relational database schema. Specify the attributes and keys of the relation schemas, any attribute constraints, and a set of non-redundant referential integrity constraints. You do not need to specify the domains of the attributes. (a) A database of all the measuring scales in a research facility and a record of their calibration history. [5 Marks]
IDNum ScaleType Location
WeighScale
Calibrated
Date
Initials Correct
COMP302
12
Continued . . . .
Student ID.
Floor
Number
OfficeID
Size
ID
Name
Position
Office
N
occupies
Person
[4 marks]
(ii) Explain how the relationship with total participation is enforced in your relational database schema. [2 marks] ANSWER
COMP302
13
Continued . . . .
(c) A database of experiment results. A rat may not be tested on the same maze more than once.
RatID Sex Age MazeNum Difficulty
[5 Marks]
Rat
M
test
Maze
Date
Speed
ANSWER
(d) The specimens in the Geology Department collection are either rocks or thin slices prepared from the rock specimens. Your schema design should not require any null attributes.
Location SNum
RockClass Category
Specimen
Category d
thickness
ThinSlice
pattern
Rock 1
prepared from
Name
1 Technician
COMP302
14
Continued . . . .
Student ID.
[4 marks]
(ii) In the EER diagram, the specialisation is total. Explain why your relational database schema does or does not enforce this constraint. [2 marks] ANSWER
(iii) Express the additional constraints that are present in the EER but not in your relational database schema and explain what would be required to enforce them: [3 marks] ANSWER
COMP302
15
Continued . . . .
[33 marks]
[6 marks]
Transform the set F1 into a new set of fds F1 , in which each fd is left reduced (i.e. has no redundant attributes on its left hand side). ANSWER
(b) Consider the following set of left reduced functional dependencies: F2 = {ABC, ANSWER CD, ABD }.
[4 marks]
(c) Suppose (U, F ) is a universal relation schema, where U = {A, B, C, D } and F = {AB, CB, DB }.
[7 marks]
Suppose that starting from (U, F ) the following set of BCNF relation schemas has been produced: S = {N1({A, B }, {A}), N2({C, B }, {C}), N3({D, B }, {D })}.
If you consider that S is a lossless join decomposition of (U, F ), explain why it is. If you consider that S is not a lossless join decomposition of (U, F ), explain why it is not, and transform it into a new set of relation schemas S that will be at least in BCNF and a lossless join decomposition of (U, F ).
COMP302
16
Continued . . . .
Student ID.
ANSWER
(d) Consider the following relation schema N({A, B, C, D}, {ABC, CA, DB, ABD}) (i) Find all relation schema N keys. ANSWER [4 marks]
(ii) What is the highest normal form that relation schema N is in? Justify your answer. [4 marks] ANSWER
COMP302
17
Continued . . . .
(e) The Inference Rules for functional dependencies are given in Figure 5.1. You have seen them in the lectures and in the textbook. Given U, F, and X, Y, Z, V, W U 1. (Reflexivity) Y X XY (trivial fd) XZYW XZ XYZ WXZ 2. (Augmentation) XY and W Z 3. (Transitivity) XY and YZ 4. (Decomposition) XYZ 5. (Union) XY and XZ
XY and XZ
Figure 5.1 State whether each of the following implications is true or false. If it is false, give a relation with only two tuples that satisfies the functional dependencies in the left-hand side of the implication but does not satisfy the dependencies in the right-hand side. If it is true, show which inference rules you used to prove it is true. (i) VX and VXY ANSWER VY [4 marks]
XY
[4 marks]
COMP302
18
Continued . . . .
Student ID.
COMP302
19
Continued . . . .
[25 marks]
Table 1 below contains a description of a part of the LTSA database schema (as in question 2). Table 2 below contains the corresponding parameters of the physical database structure. Attribute Make Model Width TreadPattern SurfaceCode GravelSize Tar SurfaceCode Make Model Time Data Type char char decimal Char char int char char char char int Max. Length 15 8 6,2 15 6 4 15 6 15 8 4 Table 1 Relation schema Tyre RoadSurface StoppingTest L F r x tuple blocking number primary key length factor of tuples index height 44 11 3,000 3 25 20 500 2 33 15 150,000 5 Table 2 Number of B* tree leaf nodes 120 10 10,000 Null N N N N N N N N N N Default
(a) Draw the heuristic optimization tree that corresponds to the SQL query: SELECT *
[5 marks]
FROM (RoadSurface r NATURAL JOIN StoppingTest s NATURAL JOIN Tyre t) WHERE t.Make = Goodyear;
COMP302
20
Continued . . . .
Student ID.
ANSWER
(b) Draw the heuristic optimization tree that corresponds to the SQL query: SELECT Make, AVG(Time) FROM StoppingTest GROUP BY Make; ANSWER
[5 marks]
COMP302
21
Continued . . . .
For parts (c) and (d), assume: The StoppingTest relation contains data about tests of all 3000 tyres on only 50 surfaces, The intermediate results of the query evaluation are materialized, The final result of the query is materialized, The size of each intermediate or final result block should not exceed 500 bytes, There is a buffer pool of 6000 bytes provided for query processing in the main memory, There are primary indexes on Tyre.(Make + Model), RoadSurface.(SurfaceCode), and StoppingTest.( SurfaceCode + Make + Model) available, Primary indexes are B* trees with nodes of size at most 500 bytes.
NOTE:
Some of the formulae you may need when computing the estimated query costs are given at the end of this exam paper. (c) Calculate the lowest execution cost of the query SELECT * FROM RoadSurface NATURAL JOIN StoppingTest; ANSWER [7 marks]
COMP302
22
Continued . . . .
Student ID.
(d) The following SQL statement retrieves SurfaceCodes that have been used so far to test tyres [8 marks] SELECT DISTINCT SurfaceCode FROM StoppingTest Find the lowest cost estimate of the query above. Hint: If there is an index on the queried relation that is sorted by the attribute list of a SELECT DISTINCT clause, it can be used to evaluate SELECT DISTINCT operation without accessing the queried relation itself. Also note, in a B* tree index, all primary key values of a relation are stored in the leaf nodes, leaf nodes are linked into a linked list, and there is a pointer to the leftmost leaf node. ANSWER
COMP302
23
Continued . . . .
COMP302
24
Continued . . . .
Student ID.
[20 marks]
[6 marks]
(i) A transaction T1 reads a database item A. Another transaction T2 reads the same database item A. The transaction T1 updates A and commits. Then T2 updates the database item A and commits. ANSWER
(ii) A transaction T1 reads a database item A that is updated by a not committed transaction T2 and then T2 aborts. ANSWER
(iii) A transaction T1 reads a database item A. Another transaction T2 updates A and commits. Then T1 reads the database item A again. ANSWER
(iv) A transaction T1 locks database items that satisfy a selection condition and updates them. During that update, another transaction T2 inserts a data item that satisfies the same selection condition. After that, transaction T1 reads and displays all database items that satisfy the selection condition and a user discovers a not updated item. ANSWER
COMP302
25
Continued . . . .
(b) Two transaction programs Borrow_Book and Return_Book are given in Figures 7.1 and 7.2, respectively. They use a database whose schema is given in Figure 7.3. These programs were written in SQL using some of the PostgreSQL specific commands we discussed in tutorials and you used in your Project 2. Suppose the two programs are run concurrently as shown in Figure 7.4. After the Return_Book program issued the third SQL statement, both programs were stalled. [14 marks] Borrow_Book BEGIN; SELECT * FROM Customer WHERE CustomerId = 007 FOR UPDATE; SELECT Copies_Left FROM Book WHERE isbn = 1111 FOR UPDATE; INSERT INTO Cust_Book VALUES(007, 111, 2004-05-31, 2004-06-19); UPDATE Customer SET BooksBorrowed = BooksBorrowed + 1; UPDATE Book SET CopiesLeft = CopiesLeft - 1; COMMIT; Figure 7.1 Return_Book BEGIN; SELECT * FROM Book WHERE isbn = 1111 FOR UPDATE; SELECT * FROM Customer WHERE CustomerId = 007 FOR UPDATE; DELETE FROM Cust_Book WHERE CustomerId = 007 AND isbn = 1111; UPDATE Customer SET BooksBorrowed = BooksBorrowed - 1; UPDATE Book SET CopiesLeft = CopiesLeft + 1; COMMIT; Figure 7.2 Customer({CustomerId, Name, BooksBorrowed}, {CustomerId}) Book({isbn, Title, NoCopies, CopiesLeft}, {isbn}) Cust_Book({CustomerId, isbn, BorrowDate, DueDate}, {CustomerId + isbn}) Figure 7.3 (i) Why were the programs stalled? ANSWER [2 marks]
COMP302
26
Continued . . . .
Borrow_Book BEGIN; SELECT * FROM Customer WHERE CustomerId = 007 FOR UPDATE;
Return_Book
T i m e
BEGIN; SELECT Copies_Left FROM Book WHERE isbn = 1111 FOR UPDATE;
SELECT Copies_Left FROM Book WHERE isbn = 1111 FOR UPDATE; //Here, the program was stalled SELECT * FROM Customer WHERE CustomerId = 007 FOR UPDATE; //Here, the program was stalled Figure 7.4
[2 marks]
(iii) Rewrite the first three statements of either of the programs to avoid the anomalous situation you identified in question (i). Your rewriting of the program should not allow any transaction anomaly to happen. [10 marks] ANSWER
*******************
COMP302 27 End
COMP302
28
Continued . . . .
SELECT
SELECT [ ALL | DISTINCT ] * | expression [ AS output_name ] [, ...] [ FROM from_item [, ...] ] [ WHERE condition ] [ GROUP BY expression [, ...] ] [ HAVING condition [, ...] ] [ { UNION | INTERSECT | EXCEPT } [ ALL ] select ] [ ORDER BY expression [ ASC | DESC | USING operator ] [, ...] ] [ FOR UPDATE [ OF tablename [, ...] ] ] where from_item can be: [ ONLY ] table_name [ * ] [ [ AS ] alias [ ( column_alias_list ) ] ] | ( select ) [ AS ] alias [ ( column_alias_list ) ] | from_item [ NATURAL ] [ join_type ] JOIN from_item [ ON join_condition | USING ( join_column_list ) ] and join_type can be: INNER | LEFT [ OUTER ] | RIGHT [ OUTER ] | FULL [ OUTER ] | CROSS For INNER (the default) and OUTER join types, exactly one of NATURAL, ON join_condition, or USING ( join_column_list ) must appear. For CROSS JOIN, none of these items may appear.
CREATE VIEW
CREATE VIEW view [ ( column name list ) ] AS SELECT query