0% found this document useful (0 votes)

9 views

08 Query Processing Strategies and Optimization

Uploaded by

beshahashenafi32

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

9 views

08 Query Processing Strategies and Optimization

Uploaded by

beshahashenafi32

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

You are on page 1/ 32

Query Processing

Strategies and
Optimization
CPS352: Database Systems

Simon Miner
Gordon College
Last Revised: 10/25/12
Agenda

• Check-in

• Design Project Presentations

• Query Processing

• Programming Project

• Exam 1 (time permitting)

• Homework 4
Check-in
Design Project
Presentations
Query Processing and
Optimization
Different Ways to Execute
Queries
• Database creates a plan to get the results for a query
• Not just one way to do this.

• Example : Find the titles of all books written by

Korth.
• π title σ author = ‘Korth’ Book |X| BookAuthor
• π title Book |X| σ author = ‘Korth’ BookAuthor

• Good DBMS will transform queries to make them

as efficient as possible
• Minimize disk accesses
Selection Strategies
• Linear search – full table scan
• Cost of potentially accessing each disk block containing the
desired data

• Binary search (with B+ tree index)

• Exact matches
• Multiple matches
• Range queries
• Complex queries

• Index often requires disk accesses the index structure as

well as for actual data
• Typically far fewer for data than linear search
• Index root and (perhaps) first level kept in buffer pool
Query Type vs. Index Type

Condition Example Clustering / Secondary Hashed Index

Primary Index Index
Exact match id = 12345 Great! Great! Great!
on candidate
key
Exact match status = N/A Find first Find first
on non-key ‘Active’ match (+ match (+
potential scan) potential scan)
Range query age between 21 Find first Less helpful Not useful
and 65 match +
sequential scan
Complex query color = ‘blue’ Not useful Not useful Not useful
or status = (multiple
‘Inactive’ indexes help)
Join Strategies

• Joins are most expensive part of query processing

• Number of tuples examined can approach the product of the
number of records in tables being joined

• Example
• σ Borrower.lastName = BookAuthor.authorName Borrower X BookAuthor
• Where BookAuthor has 10K tuples and Borrower has 2K tuples
• Cartesian join yields 20 million tuples to process
Nested Loop Join
Nested Block Join
Buffering an Entire Relation
Using Indexes to Speed Up
Joins
• Example: Borrower |X| CheckedOut
• Assume
• 2K Borrower tuples, 1K CheckedOut tuples
• 20 records per block (so 100 and 50 blocks for each table, respectively)
• We cannot buffer either table entirely
• Without indexes – nested block join takes 5050 or 5100 disk accesses,
depending on which table is in the outer loop
• With index on Borrower.borrowerID – exactly one match (PK)
• Scan all 1000 CheckedOut records (50 blocks) – each matches exactly one
Borrower record, which can be looked up in the index
• Requires processing only 2000 tuples
• Not quite as good as it seems
• Each borrower may require a separate disk access (50 + 1000 = 1050 accesses)
• Traversing index might take multiple disk accesses (especially B+ Tree indexes)
Temporary Indexes

• Indexes created and buffered for the purpose of a single

query and then discarded
• Example: neither Borrower nor CheckedOut is indexed
• Borrower |X| CheckedOut might cause a temporary index
to be built on Borrower.borrowerID
• If each (dense) index entry takes ~10 bytes, entire index will
be ~20K
• Index construction requires reading all 2K borrowers = 100
disk accesses
• Join itself costs up to 1050 disk accesses (see previous slide)
• Total of 1150 disk accesses
Merge Join
Order of Joins
• For multiple joins, performance can be greatly impacted by the order
in which the joins are done
• Example
• π last, first, authorName Borrower |X| BookAuthor |X| CheckedOut
• Assume 2K borrowers, 1K CheckedOut records, and 10K authors
• Each book has an average of 2 authors
• 3 ways to do the (binary commutative) join operations
• ( Borrower|X| BookAuthor ) |X| CheckedOut
• ( BookAuthor |X| CheckedOut ) |X| Borrower
• ( Borrower |X| CheckedOut ) \X| BookAuthor
• Final number of tuples is the same, but intermediate joins create
temporary tables which are then joined with the remaining table
• Which way is most efficient in light of this?
Rules of Equivalence
• Two formulations of a query are equivalent if the
produce the same set of results
• Not necessarily in the same order

• Example : Find the titles of all books written by Korth.

• select title
from Book natural join BookAuthor
where authorName = ‘Korth’;
• Equivalent relational algebra queries
• π title σ author = ‘Korth’ Book |X| BookAuthor

• π title Book |X| σ author = ‘Korth’ BookAuthor

• Equivalent, but the same in terms of performance

Equivalence Rules

1. Conjunctive selection operations can be deconstructed into a

sequence of individual selections.
s q Ùq ( E ) = s q (s q ( E ))
1 2 1 2
2. Selection operations are commutative.
s q (s q ( E )) = s q (s q ( E ))
1 2 2 1

3. Only the last in a sequence of projection operations is

needed, the others can be omitted.
 L1 ( L2 (( Ln ( E ))))   L1 ( E )

4. Selections can be combined with Cartesian products and

theta joins.
a. (E1 X E2) = E1  E2
b. 1(E1 2 E2) = E1 1 2 E2

Database System Concepts - 6th Edition 1.18 ©Silberschatz, Korth and Sudarshan
Equivalence Rules (Cont.)
5. Theta-join operations (and natural joins) are commutative.
E1  E2 = E2  E1
6. (a) Natural join operations are associative:
(E1 E2) E3 = E1 (E2 E3)

(b) Theta joins are associative in the following manner:

(E1 1 E2) 2 3 E3 = E1 1 3 (E2 2 E3)

where 2 involves attributes from only E2 and E3.

Database System Concepts - 6th Edition 1.19 ©Silberschatz, Korth and Sudarshan
Equivalence Rules (Cont.)
7. The selection operation distributes over the theta join operation
under the following two conditions:
(a) When all the attributes in 0 involve only the attributes of one
of the expressions (E1) being joined.

0E1  E2) = (0(E1))  E2

(b) When  1 involves only the attributes of E1 and 2 involves

only the attributes of E2.
1 E1  E2) = (1(E1))  ( (E2))

Database System Concepts - 6th Edition 1.20 ©Silberschatz, Korth and Sudarshan
Equivalence Rules (Cont.)
8. The projection operation distributes over the theta join operation
as follows:
(a) if  involves only attributes from L1  L2:
 L1 L2 ( E1  E2 )  ( L1 ( E1 ))  ( L2 ( E2 ))

(b) Consider a join E1  E2.

l Let L1 and L2 be sets of attributes from E1 and E2,
respectively.
l Let L3 be attributes of E1 that are involved in join condition ,
but are not in L1  L2, and
l let L4 be attributes of E2 that are involved in join condition ,
but are not in L1  L2.
 L  L ( E1
1 2  E2 )   L  L (( L  L ( E1 ))
1 2 1 3  ( L  L ( E 2 )))
2 4

9. The set operations union and intersection are commutative

E1  E2 = E2  E1
E1  E2 = E2  E1
n (set difference is not commutative).
10. Set union and intersection are associative.
(E1  E2)  E3 = E1  (E2  E3)
(E1  E2)  E3 = E1  (E2  E3)
11. The selection operation distributes over ,  and –.
 (E1 – E2) =  (E1) – (E2)
and similarly for  and  in place of –
Also:  (E1 – E2) = (E1) – E2
and similarly for  in place of –, but not for 
12. The projection operation distributes over union
L(E1  E2) = (L(E1))  (L(E2))
Database System Concepts - 6th Edition 1.22 ©Silberschatz, Korth and Sudarshan
Push Selections Inward
• Do selections as early as possible
• Reduces (“flattens”) the number of records in the relation(s) being
joined

• Example:
• π title σ author = ‘Korth’ Book |X| BookAuthor
• π title Book |X| σ author = ‘Korth’ BookAuthor

• Sometimes this is not feasible

• σ Borrower.lastName = BookAuthor.authorName Borrower X BookAuthor
• i.e. when there are no shared attributes

• Alter the structure of the selection itself

• Find late checked out books that cost more than $20.00.
• σ purchasePrice > 20 ∧ dateDue < today Book |X| CheckedOut
• σ purchasePrice > 20 Book |X|σ dateDue < today CheckedOut
Push Projections Inward

• Do projections as early as possible

• Reduces (“narrows”) the number of columns in the relation(s)
being joined

• Example:
• π lastName, firstName, title, dateDue Borrower|X| CheckedOut |X| Book
• π lastName, firstName, title, dateDue Borrower|X|
(π borrowerID, title, dateDue CheckedOut |X| Book )
• Reduces the number of columns in the temporary table from the
intermediate join
Statistics and Query
Optimization
• Using statistics about database objects can help speed
up queries

• Maintaining statistics as the data in the database

changes is a manageable process

• Types of statistics
• Table statistics
• Column statistics
Table Statistics

• On a relation r
• nr = number of tuples in the relation
• br = number of blocks used by the relation
• lr = size (in bytes) of a tuple in the relation
• fr = blocking factor, number of tuples per block
• Note that fr = floor( block size / lr ) if tuples do not span
blocks
• Note that br = ceiling( nr / fr ) if tuples in r reside in a single
file and are not clustered with other relations
Column Statistics
• on a column A
• V( A, r ) = number of distinct values in the column
• If A is a superkey, then V( A, r ) = nr
• If A is not a superkey, the number of times each
column value occurs can be estimated by nr / V( A, r )
• If column A is indexed, V( A, r ) s relatively easy to
maintain
• Keep track of the count of entries in the index

• Histogram of the relative frequency of column

values in different ranges
Estimating the Size of a Join
• Cartesian product– r X s
• Number of tuples in join = nr X s = nr * ns
• Size of each tuple in join = lr X s = lr + ls

• Natural join – r |X| s, where r and s have A in common

• The size of the join can be estimated in two ways
• The ns tuples of s will join with nr / V( A, r ) tuples of r
for ns * nr / V( A, r ) total tuples
• The nr tuples of r will join with ns / V( A, s ) tuples of s
for nr * ns / V( A, s ) total tuples
• We want to use the smaller of these estimates
• min(nr * ns / V(A, s) , ns * nr / V(A, r) ) = ns * nr / max( V(A, r), V(A, s) )
• Also note that V(A, r |X| s) = min( V(A, r), V(A, s) )
• Some tuples in the relation with the larger number of column values do not join
with any tuples in the other relation
Example Join Estimation
• π last, first, authorName Borrower |X| BookAuthor |X| CheckedOut

• 3 ways to do the join operations – Which is most efficient?

• ( Book |X| BookAuthor ) |X| CheckedOut
• ( BookAuthor |X| CheckedOut ) |X| Borrower
• ( Borrower |X| CheckedOut |X| BookAuthor

• Statistics
nr V( A, r )
nBorrower = 2000 V( borrowerID, Borrower ) = 2000
nCheckedOut = 1000 V( borrower, CheckedOut ) = 100
nBookAuthor = 10,000 V( callNo, CheckedOut ) = 500
V( callNo, BookAuthor ) = 5000
Programming Project
Part I
Exam 1
Homework 4

28-Query Processing-30-09-2024
No ratings yet
28-Query Processing-30-09-2024
17 pages
11 Ch13 Query Optimization
No ratings yet
11 Ch13 Query Optimization
54 pages
Chapter 13 (2)
No ratings yet
Chapter 13 (2)
57 pages
Chapter 13: Query Optimization: Database System Concepts, 6 Ed
No ratings yet
Chapter 13: Query Optimization: Database System Concepts, 6 Ed
62 pages
Ch13 QueryOptimization Korth6E
No ratings yet
Ch13 QueryOptimization Korth6E
24 pages
04 - Relational Algebra and Calculus
No ratings yet
04 - Relational Algebra and Calculus
38 pages
Unit 3 Query Languages
No ratings yet
Unit 3 Query Languages
80 pages
FALLSEM2023 24 - BCSE302L - TH - VL2023240100776 - 2023 06 14 - Reference Material I 2
No ratings yet
FALLSEM2023 24 - BCSE302L - TH - VL2023240100776 - 2023 06 14 - Reference Material I 2
14 pages
ADB Chapter 2 DB Part1
No ratings yet
ADB Chapter 2 DB Part1
10 pages
C817b299unit 2 - Relational Algebra
No ratings yet
C817b299unit 2 - Relational Algebra
20 pages
Relational Algebra
No ratings yet
Relational Algebra
31 pages
1.6 PPT - Query Optimization
No ratings yet
1.6 PPT - Query Optimization
53 pages
数据库原理与实践 Database Systems-Principle and Practice
No ratings yet
数据库原理与实践 Database Systems-Principle and Practice
182 pages
Query Execution
No ratings yet
Query Execution
87 pages
Unit4 SQL and Database Project at Students Notes
No ratings yet
Unit4 SQL and Database Project at Students Notes
29 pages
DBMS Unit - 7
No ratings yet
DBMS Unit - 7
34 pages
Chapter 5: Query Optimization: Acknowledgements: Slides Are Adapted From Böhlen and
No ratings yet
Chapter 5: Query Optimization: Acknowledgements: Slides Are Adapted From Böhlen and
53 pages
6 Query Optimization-Ch 16
No ratings yet
6 Query Optimization-Ch 16
35 pages
4 Chapter Four
No ratings yet
4 Chapter Four
34 pages
Chapter 3
No ratings yet
Chapter 3
41 pages
Relational Model Introduction For Noncse
No ratings yet
Relational Model Introduction For Noncse
45 pages
CH 11
No ratings yet
CH 11
19 pages
Chapter 12, 13 - Query Processing and Optimization
No ratings yet
Chapter 12, 13 - Query Processing and Optimization
24 pages
Relational Algebra
No ratings yet
Relational Algebra
42 pages
DBMS - Unit 2
No ratings yet
DBMS - Unit 2
108 pages
Important Alg Rela
No ratings yet
Important Alg Rela
39 pages
Relational Algebra and Relational Calculus
No ratings yet
Relational Algebra and Relational Calculus
45 pages
Chapter 4 - RA
No ratings yet
Chapter 4 - RA
59 pages
Chapter 3 - The Relational Database Model
No ratings yet
Chapter 3 - The Relational Database Model
36 pages
DE_Module5_QueryOptimization
No ratings yet
DE_Module5_QueryOptimization
11 pages
Tut8 QPO Qa
No ratings yet
Tut8 QPO Qa
7 pages
Outer join and aggregate function
No ratings yet
Outer join and aggregate function
64 pages
CH - 6 Algebra Operation
No ratings yet
CH - 6 Algebra Operation
39 pages
Session - 10 Querying
No ratings yet
Session - 10 Querying
36 pages
Lecture 06
No ratings yet
Lecture 06
41 pages
DS UNIT 3
No ratings yet
DS UNIT 3
38 pages
Chapter 13: Query Processing
No ratings yet
Chapter 13: Query Processing
25 pages
Clear
No ratings yet
Clear
60 pages
Unit 6: Query Processing and Optimization
No ratings yet
Unit 6: Query Processing and Optimization
21 pages
1.5 Relational Algebra
No ratings yet
1.5 Relational Algebra
94 pages
Relational Algebra Final PPT 08-06-2023
No ratings yet
Relational Algebra Final PPT 08-06-2023
70 pages
Lesson 06
No ratings yet
Lesson 06
44 pages
Relational Algebra
No ratings yet
Relational Algebra
26 pages
Q. State The Basic Database Concepts
No ratings yet
Q. State The Basic Database Concepts
7 pages
IF3140 Query Optimization
No ratings yet
IF3140 Query Optimization
77 pages
Relational Algebra
No ratings yet
Relational Algebra
15 pages
The Relational Algebra and Calculus
No ratings yet
The Relational Algebra and Calculus
34 pages
Relational Algebra
No ratings yet
Relational Algebra
31 pages
SELECT Operation in Relational Algebra - 20241024 - 102938 - 0000
No ratings yet
SELECT Operation in Relational Algebra - 20241024 - 102938 - 0000
11 pages
DBMS Unit - 7
No ratings yet
DBMS Unit - 7
33 pages
Finals
No ratings yet
Finals
15 pages
Unit_2
No ratings yet
Unit_2
85 pages
Definitions of Database Terms
No ratings yet
Definitions of Database Terms
7 pages
Linear Search: Collision Chain
No ratings yet
Linear Search: Collision Chain
16 pages
Chapter2-Part 3 (New)
No ratings yet
Chapter2-Part 3 (New)
21 pages
Ch13-Query Optimization
No ratings yet
Ch13-Query Optimization
42 pages
Chapter 6 Query Languges
No ratings yet
Chapter 6 Query Languges
26 pages
Relational Algebra
No ratings yet
Relational Algebra
54 pages
Co-So-Du-Lieu - Truong-Tuan-Anh - Dbs-Algebra - (Cuuduongthancong - Com)
No ratings yet
Co-So-Du-Lieu - Truong-Tuan-Anh - Dbs-Algebra - (Cuuduongthancong - Com)
53 pages
Oracle SQL and PL/SQL
From Everand
Oracle SQL and PL/SQL
Niraj Gupta
4.5/5 (8)
Chapter5 ExternalMemory
No ratings yet
Chapter5 ExternalMemory
66 pages
Chapter5 InternalMemory
No ratings yet
Chapter5 InternalMemory
107 pages
DSA Chapter 4 - Stack
No ratings yet
DSA Chapter 4 - Stack
62 pages
DSA Chapter 6 - Tree
No ratings yet
DSA Chapter 6 - Tree
67 pages
DSA Chapter 7 - Graphs
No ratings yet
DSA Chapter 7 - Graphs
71 pages
Chapter 1
No ratings yet
Chapter 1
16 pages
DSA Chapter 1 - Intro w12
No ratings yet
DSA Chapter 1 - Intro w12
98 pages
Chapter 15
No ratings yet
Chapter 15
7 pages
Chapter 4 Distributed Database Systems
No ratings yet
Chapter 4 Distributed Database Systems
69 pages
DSA Chapter 8 - Advanced Sorting and Searching
No ratings yet
DSA Chapter 8 - Advanced Sorting and Searching
50 pages
Lect Note On Chapter 3 - Part II - Event-Driven Component
No ratings yet
Lect Note On Chapter 3 - Part II - Event-Driven Component
73 pages
Chapter-6 SN
No ratings yet
Chapter-6 SN
70 pages
Chapter 1
No ratings yet
Chapter 1
50 pages
Transaction Management, Concurrency Control and Deadlocks
No ratings yet
Transaction Management, Concurrency Control and Deadlocks
30 pages
Chapter 5 - Switiching and Network Devices
No ratings yet
Chapter 5 - Switiching and Network Devices
45 pages
Lect Note On Chapter 2 - Part I - Prog With Event Driven
No ratings yet
Lect Note On Chapter 2 - Part I - Prog With Event Driven
49 pages
Lect None On Chapter 2 - Part II - String and StringBuilder
No ratings yet
Lect None On Chapter 2 - Part II - String and StringBuilder
49 pages
Lect Note On Chapter 3 - Part I - Event-Driven Component
No ratings yet
Lect Note On Chapter 3 - Part I - Event-Driven Component
60 pages
Lect Note On Chapter 1 - Event Driven Fundamentals
No ratings yet
Lect Note On Chapter 1 - Event Driven Fundamentals
18 pages
Lec Note On Chapter 5 - Database Programming
No ratings yet
Lec Note On Chapter 5 - Database Programming
20 pages
COA Lab Session 8
No ratings yet
COA Lab Session 8
5 pages
Asterisk CDR Mysql
No ratings yet
Asterisk CDR Mysql
11 pages
CPM Examples
No ratings yet
CPM Examples
198 pages
Data Abstraction and Basic Data Structures: Improving Efficiency by Building Better Object IN
No ratings yet
Data Abstraction and Basic Data Structures: Improving Efficiency by Building Better Object IN
12 pages
Technical Report: Usability Test of My Cross - Course Project
No ratings yet
Technical Report: Usability Test of My Cross - Course Project
3 pages
Instant Download Beginning JavaScript For Adobe Acrobat A Non Programmer S Guide John Deubert PDF All Chapter
100% (4)
Instant Download Beginning JavaScript For Adobe Acrobat A Non Programmer S Guide John Deubert PDF All Chapter
84 pages
Short A Word Cards
No ratings yet
Short A Word Cards
14 pages
Smartmicro-Traffic Radar Product Brochure 2017 Web MT
No ratings yet
Smartmicro-Traffic Radar Product Brochure 2017 Web MT
24 pages
Artificial Intelligence: Chapter 6: Representing Knowledge Using Rules
No ratings yet
Artificial Intelligence: Chapter 6: Representing Knowledge Using Rules
54 pages
Farm Management Information System
No ratings yet
Farm Management Information System
45 pages
Index:: SL No Name of Experiment No Date
No ratings yet
Index:: SL No Name of Experiment No Date
26 pages
Dahua HDCVI DVR Users Manual V4 - 0 - 0 201805239 PDF
No ratings yet
Dahua HDCVI DVR Users Manual V4 - 0 - 0 201805239 PDF
327 pages
AI20-Session 07 - 01 - Yampolskiy - 2015 - Taxonomy of Pathways To Dangerous AI
No ratings yet
AI20-Session 07 - 01 - Yampolskiy - 2015 - Taxonomy of Pathways To Dangerous AI
6 pages
Smart Card For Health Care System
No ratings yet
Smart Card For Health Care System
6 pages
Performance and Scalability Models Website
No ratings yet
Performance and Scalability Models Website
15 pages
Full JavaScript Frameworks For Modern Web Development, 2nd Edition Sufyan Bin Uzayr Ebook All Chapters
No ratings yet
Full JavaScript Frameworks For Modern Web Development, 2nd Edition Sufyan Bin Uzayr Ebook All Chapters
49 pages
DN 6857 PDF
No ratings yet
DN 6857 PDF
6 pages
SAP HANA EIM Installation and Configuration Guide en
No ratings yet
SAP HANA EIM Installation and Configuration Guide en
598 pages
Best Resume Format For 10 Years Experience
100% (1)
Best Resume Format For 10 Years Experience
7 pages
FDI 01 Falling Dart Impact Tester
No ratings yet
FDI 01 Falling Dart Impact Tester
2 pages
Eunex - How To Register Your Account
No ratings yet
Eunex - How To Register Your Account
8 pages
Sad 2
No ratings yet
Sad 2
8 pages
Navbar
No ratings yet
Navbar
6 pages
PDF Spry User Manual DL
No ratings yet
PDF Spry User Manual DL
23 pages
Javascript Notes
No ratings yet
Javascript Notes
49 pages
GT300 User Manual V2.06
No ratings yet
GT300 User Manual V2.06
15 pages
Java Lab QB - CIE-2024
No ratings yet
Java Lab QB - CIE-2024
4 pages
ISO-IEC-24712-2007
No ratings yet
ISO-IEC-24712-2007
12 pages
Nuevas Actualizaciones Mayo 02 2021 PC
No ratings yet
Nuevas Actualizaciones Mayo 02 2021 PC
6 pages
Manual of 5th Controller
No ratings yet
Manual of 5th Controller
88 pages
Cutting Force of End Cutting Tool Milling Machining
No ratings yet
Cutting Force of End Cutting Tool Milling Machining
24 pages