INFS2200/INFS7903 – Relational Database Systems

Tutorial 8 Query Processing and Optimization

Exercise 1. Assume that relation R has attributes A, B, C, D, and relation S has attributes D, E, F. Which of the
following are valid relational algebra transformations during query optimization and which are not?

Please explain the reason briefly when one is not a valid transformation; you do not need to give any
explanation for valid transformations. Note that ⋈ corresponds to the natural join operator on the common
attribute (i.e., attribute D).

A. πA,B(πA,B,C,D(R ⋈ S)) = πA,B(R ⋈ S)

B. σA>15 or B<10 or C=20(R) = σA>15(σB<10(σC=20(R)))

C. πA,B(σC<12(R)) = σC<12(πA,B(R))

D. πA,F(R ⋈ S) = (πA(R)) ⋈ (πF(S))

Exercise 2. Consider the following relational schema and SQL query. The schema captures information about
employees, departments, and company finances (organized on a per department basis).

Emp (eid:integer, did:integer, sal:integer, hobby:char(20))

Dept (did:integer, dname:char(20), floor:integer, phone:char(10))
Finance (did:integer, budget:real, sales:real, expenses:real)

Consider the following query:

SELECT D.dname, F.budget

FROM Emp E, Dept D, Finance F
WHERE E.did = D.did AND D.did = F.did
AND D.floor = 1 AND E.hobby = 'camping';

A. List the join orders (i.e., orders in which pairs of relations can be joined to compute the query result) that a
relational query optimizer will consider. (Assume that the optimizer follows the heuristic of never
considering plans that require the computation of cross-products.)

B. For one of the join orders above, identify a relational algebra tree (or a relational algebra expression) that
reflects the order of operations a query optimizer would choose.

C. Suppose that the following additional information is available: B+ tree indexes exist on Emp.did, Emp.sal,
Dept.floor, Dept.did, and Finance.did. The system’s statistics indicate that employees enjoy 200 different
hobbies, and the company owns two floors in the building, with a uniform value distribution. There are a
total of 50,000 employees and 5,000 departments (each with corresponding financial information) in the
database. The DBMS used by the company has only one join method available: single nested loop join.
a. For each of the query’s base relations (Emp, Dept, and Finance), estimate the number of tuples that
would be initially selected from that relation if all the non-join predicates on that relation were applied
to it before any join processing begins.
b. Given your answer to the above question, which of the join orders considered by the query optimizer
would be more efficient?

Exercise 3. Consider the following relational schemas and instances:

Student (SID, Name, Class, Major)

Student_Dir (SID, Address, Phone)
FK: (SID) → Student (SID)
Course (Course_No, Name, Level)
Course_Taken (Course_No, Term, SID, Grade)
FK: (Course_No) → Course (Course_No); (SID) → Student (SID)


Student
SID   Name    Class   Major
123   John    3       CS
124   Mary    3       CS
126   Sam     2       CS
129   Julie   2       Math

Student_Dir
SID   Address          Phone
123   333 Library St   555-535-5263
124   219 Library St   555-963-9635
129   555 Library St   555-123-4567

Course
Course_No   Name                                          Level
CS1520      Web Programming                               UGrad
CS1555      Database Management Systems                   UGrad
CS1550      Operating Systems                             UGrad
CS1655      Secure Data Management and Web Applications   UGrad
CS2550      Database Management Systems                   Grad

Course_Taken
Course_No   Term        SID   Grade
CS1520      Fall 11     123   3.75
CS1520      Fall 11     124   4
CS1520      Fall 11     126   3
CS1555      Fall 11     123   4
CS1555      Fall 11     124   NULL
CS1550      Spring 12   123   NULL
CS1550      Spring 12   124   NULL
CS1550      Spring 12   126   NULL
CS1550      Spring 12   129   NULL
CS2550      Spring 12   124   NULL
CS1520      Spring 12   126   NULL

For each of the relational algebra expressions below, identify the expected arity (number of attributes),
resulting schema, and min/max cardinality (number of distinct tuples) of the relation resulting from the query.
Do not evaluate the query; base your answer only on the schemas and cardinalities of the four given
relations.

A. σTerm = 'Spring 12' (Course_Taken)

B. Course_Taken * Course

Note
(a) Cardinality means the number of distinct values (NDV). The cardinality of a relation is its number of
tuples, since every tuple in a relation is unique.
(b) The symbol '*' corresponds to the natural join operator on the common attribute (i.e., attribute
Course_No). When ⋈ carries no subscript, it is regarded as an equijoin by default. If there is
no ambiguity within the context, ⋈ can also be used as a natural join.


Exercise 4. Consider the following schema, annotated with the number of records, whose population is for
the calendar year 2022. Further, there are 10 distinct values of ProdType in Product, and 5 distinct values of
Type in Customer.

Customer (CustID, Name, Type) 10,000

Invoice (InvID, CustID, Date, Amount) 10 per customer per month (1.2 million)
FK: (CustID) → Customer (CustID)
LineItem (InvID, LineNo, ProdID, Qty) 10 per invoice (12 million)
FK: (InvID) → Invoice (InvID)
FK: (ProdID) → Product (ProdID)
Product (ProdID, Description, ProdType) 1,000

A. How many tuples are there in the natural join of the four relations (Customer, Invoice, LineItem,
Product)?

B. What would be the cost of computing these natural joins, step by step, in the sequence indicated? Is
there another sequence that would cost more, or less?

C. Suppose we want to know the types of customers who have bought a given type of product (widget) in
July. How many tuples would you expect in the result?

D. What relational projection operations are there? Can any be done immediately?

E. What relational selections can be done immediately? How big are the resulting tables?

F. What joins are remaining, in increasing order of cost?

G. How many tuples are there in the result of the cheapest join?

H. Repeat above questions F. and G. until the joins are completed. What operation is left?

Exercise 5. An SQL statement for the query of Exercise 4 is

SELECT Type
FROM Customer, Invoice, LineItem, Product
WHERE Invoice.CustID = Customer.CustID
AND LineItem.InvID = Invoice.InvID
AND Product.ProdID = LineItem.ProdID
AND Product.ProdType = 'Widget'
AND Invoice.Date.Month = 'July';

Construct a query tree for this query and show the steps in the optimization process.


Solutions of Tutorial 8
Exercise 1 Solution:

A. Valid.

B. Invalid. The left-hand side is a disjunction, whereas a cascade of selections is equivalent to a
conjunction (A>15 and B<10 and C=20). For instance, records with A>15 or B<10 but C≠20 should appear in
the query answer, yet under this transformation they are all eliminated when evaluating σC=20(R).

C. Invalid. The projection does not retain the attribute used in the selection condition (i.e., attribute C).
Under this transformation, applying πA,B first eliminates attribute C from the intermediate result, so the
selection σC<12 can no longer be applied.

D. Invalid. The projections do not retain the attribute used in the join condition (i.e., attribute D). The
condition for the natural join is R.D=S.D, since D is the common attribute. Under this transformation,
however, applying πA eliminates attribute D from relation R, and applying πF eliminates attribute D from
relation S. With no common attribute left, the join on the right-hand side degenerates into a Cartesian product.
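
The argument for B can be checked with a minimal Python sketch. The three-row instance of R and the helper name select are made up for illustration and are not part of the tutorial; the point is only that a cascade of selections behaves as a conjunction and therefore returns fewer rows than the disjunctive selection.

```python
# Hypothetical instance of R(A, B, C, D); the rows are invented for illustration.
R = [
    {"A": 20, "B": 5, "C": 20, "D": 1},   # satisfies all three conditions
    {"A": 20, "B": 50, "C": 7, "D": 2},   # satisfies only A > 15
    {"A": 1, "B": 50, "C": 7, "D": 3},    # satisfies none of them
]

def select(rows, pred):
    """Relational selection: keep the rows for which pred holds."""
    return [r for r in rows if pred(r)]

# Left-hand side of B: one selection with a disjunctive condition.
lhs = select(R, lambda r: r["A"] > 15 or r["B"] < 10 or r["C"] == 20)

# Right-hand side of B: cascaded selections, which amount to a conjunction.
rhs = select(
    select(select(R, lambda r: r["C"] == 20), lambda r: r["B"] < 10),
    lambda r: r["A"] > 15,
)

print(len(lhs), len(rhs))  # 2 1
```

The second row is in the disjunctive result but is dropped by the cascade as soon as σC=20 is evaluated, which is exactly the case described in B.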

Exercise 2 Solution:

A. There are two join orders considered, assuming that the optimizer ignores cross-products:
• First possible join order: ((E ⋈ D) ⋈ F)
• Second possible join order: ((D ⋈ F) ⋈ E)

B. A query optimizer would typically push the selections and projections as far down the tree as possible.
For the join order ((E ⋈ D) ⋈ F), this would result in:

πD.dname,F.budget(
(πE.did(σE.hobby='camping'(E)) ⋈ πD.did,D.dname(σD.floor=1(D)))
⋈ πF.budget,F.did(F))

C. Given the additional information and statistics of the database:

a. Emp size = 50,000 records, E.hobby = ‘camping’

Resulting size = 50,000 * (1/200) = 250 records.

Dept size = 5,000 records, D.floor = 1

Resulting size = 5,000 * (1/2) = 2,500 records.

Finance size = 5,000 records, as the query has no non-join predicates on Finance.
Resulting size = 5,000 records.
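
The arithmetic above can be reproduced with a short Python sketch. It assumes, as the exercise states, a uniform value distribution, so an equality predicate on an attribute with n distinct values has selectivity 1/n. The function name estimate is a hypothetical helper, not part of any DBMS.

```python
from fractions import Fraction

def estimate(cardinality, distinct_values):
    """Expected number of tuples matching an equality predicate,
    assuming values are uniformly distributed."""
    return cardinality * Fraction(1, distinct_values)

emp = estimate(50_000, 200)   # E.hobby = 'camping', 200 distinct hobbies
dept = estimate(5_000, 2)     # D.floor = 1, two floors
finance = 5_000               # no non-join predicate on Finance

print(emp, dept, finance)  # 250 2500 5000
```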

b. Plan ((E ⋈ D) ⋈ F) is more efficient because it executes the most restrictive operations first: the
selections reduce Emp to 250 records and Dept to 2,500 records before they are joined. In plan
((D ⋈ F) ⋈ E), by contrast, there is no restriction on the Finance relation to reduce the number of
records in the first intermediate result.

Exercise 3 Solution:

A. Arity = Arity of Course_Taken = 4.

Schema = the schema of Course_Taken = (Course_No, Term, SID, Grade).

Cardinality = Cardinality of Course_Taken * Selectivity of σTerm = 'Spring 12'
• Cardinality of Course_Taken = 11;
• Selectivity is within the range of 0 to 1;


Hence, Min Cardinality = 0 and Max Cardinality = 11.

B. Arity = Arity of Course_Taken + Arity of Course - number of common attributes = 4 + 3 - 1 = 6.

Schema = (Course_No, Term, SID, Grade, Name, Level).

Attribute Course_No is a foreign key of Course_Taken that refers to Course, which means that for every
Course_Taken tuple there is exactly one matching Course tuple (because Course.Course_No is a primary
key). Hence, Cardinality = Cardinality of Course_Taken = 11.
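
Although the exercise asks for bounds without evaluating the queries, the answers can be sanity-checked against the given instances. The sketch below copies the Course and Course_Taken tables from Exercise 3 into Python tuples (NULL grades are written as None) and uses a plain nested-loop natural join on Course_No. The variable names are illustrative.

```python
# Instances copied from Exercise 3; NULL grades are represented as None.
course = [
    ("CS1520", "Web Programming", "UGrad"),
    ("CS1555", "Database Management Systems", "UGrad"),
    ("CS1550", "Operating Systems", "UGrad"),
    ("CS1655", "Secure Data Management and Web Applications", "UGrad"),
    ("CS2550", "Database Management Systems", "Grad"),
]
course_taken = [
    ("CS1520", "Fall 11", 123, 3.75),
    ("CS1520", "Fall 11", 124, 4),
    ("CS1520", "Fall 11", 126, 3),
    ("CS1555", "Fall 11", 123, 4),
    ("CS1555", "Fall 11", 124, None),
    ("CS1550", "Spring 12", 123, None),
    ("CS1550", "Spring 12", 124, None),
    ("CS1550", "Spring 12", 126, None),
    ("CS1550", "Spring 12", 129, None),
    ("CS2550", "Spring 12", 124, None),
    ("CS1520", "Spring 12", 126, None),
]

# A: selection on Term keeps the schema; its size must lie between 0 and 11.
spring = [ct for ct in course_taken if ct[1] == "Spring 12"]

# B: natural join on Course_No; the common attribute appears only once.
joined = [ct + c[1:] for ct in course_taken for c in course if ct[0] == c[0]]

arity = len(joined[0])
cardinality = len(set(joined))
print(len(spring), arity, cardinality)  # 6 6 11
```

On this instance the selection returns 6 tuples, which lies within the bounds 0 and 11, and the join has arity 6 and cardinality 11, as predicted from the foreign key.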

Exercise 4 Solution:

A. The four relations can be joined pairwise. First, we join them in two pairs:

|Customer ⋈ Invoice| = 1.2 million

|LineItem ⋈ Product| = 12 million

Joining these two intermediate results then gives a final size of 12 million. LineItem governs the size
because every join follows a foreign key.

B. See below the join cost calculations.

Join(10,000, 1,200,000) + Join(1,200,000, 12,000,000) + Join(12,000,000, 1,000)

If LineItem took part in the first join, the sequence would cost more, because every intermediate result
would then have 12 million tuples, whereas in the original sequence the first result has only 1.2 million.

C. At most 5 (the number of distinct customer types).

D. (This question refers to the query of Question 4-C above. Please see the solution of Exercise 5 for a
graphical illustration.)

Projection onto customer type. It cannot be done until the end.

Projection of line items onto invoice ID. It must be done after the join of LineItem with Product.

E. (This question refers to the query of Question 4-C above. Please see the solution of Exercise 5 for a
graphical illustration.)

We assume uniform distributions of product type values and monthly invoices.

Selection of Product on product type (around 1,000 / 10 = 100 tuples)

Selection of Invoice on month July (around 1,200,000 / 12 = 100,000 tuples)

F. Excluding Cartesian products, the following three joins can be performed:

First Join R = Customer ⋈ (invoices for July), cost = Join(10,000, 100,000), result 100,000 tuples
Second Join S = (invoices for July) ⋈ LineItem, cost = Join(100,000, 12,000,000), result 1,000,000 tuples
Third Join T = (LineItem for July) ⋈ (products of given type), cost = Join(1,000,000, 100), result 100,000 tuples

The final join can be done on R and T: R ⋈ T, cost = Join(100,000, 100,000), result 100,000 tuples
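
The result sizes in E and F follow from the schema annotations and the uniformity assumption. A minimal Python sketch of that arithmetic is given below; the variable names are illustrative, and the estimates assume that every invoice has exactly 10 line items and that product types are spread evenly over line items.

```python
customers = 10_000
invoices = customers * 10 * 12      # 10 invoices per customer per month
line_items = invoices * 10          # 10 line items per invoice
products = 1_000

# Selections that can be applied immediately (Question E).
july_invoices = invoices // 12      # one month out of twelve
widget_products = products // 10    # one product type out of ten

# Join result sizes (Question F); each join follows a foreign key.
r = july_invoices                   # every July invoice matches one customer
s = july_invoices * 10              # line items belonging to July invoices
t = s // 10                         # those line items whose product is a widget
final = t                           # R joined with T keeps one row per such line item

print(july_invoices, widget_products, r, s, t, final)
# 100000 100 100000 1000000 100000 100000
```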

G. See the result sizes given in the solution to Exercise 4-F above.


H. After joins R, T and R ⋈ T, a projection on Customer Type is needed, which requires a sorting operation
followed by duplicate elimination.

Exercise 5 Solution:

The steps of this optimisation process are as follows:

(1) First, we build the canonical tree for the query.

πType
  σInvoice.CustID = Customer.CustID
    σLineItem.InvID = Invoice.InvID
      σProduct.ProdID = LineItem.ProdID
        σProdType = Widget
          σMonth = July
            (Customer × Invoice × LineItem × Product)

(2) Then we push the selections down as far as possible.


(3) Next, we replace the Cartesian products with joins.

(4) This leaves a sequence of joins, so we choose the least expensive order.

(5) In the final step, we push down the projections to eliminate irrelevant attributes.

---ooo000O000ooo---
