0% found this document useful (0 votes)
2 views

L2 Distributed QueryProcessing

Uploaded by

naziya1531
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
2 views

L2 Distributed QueryProcessing

Uploaded by

naziya1531
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 9

10-08-2024

Query decomposition and data


localization
Dario Della Monica

These slides are a modified version of the slides provided with the book
Özsu and Valduriez, Principles of Distributed Database Systems (3rd Ed.), 2011
The original version of the slides is available at: extras.springer.com

Distributed DBMS © M. T. Özsu & P. Valduriez Ch.7/1

Outline (distributed DB)


• Introduction (Ch. 1) ⋆

• Distributed Database Design (Ch. 3) ⋆

• Distributed Query Processing (Ch. 6-8) ⋆


➡ Overview (Ch. 6) ⋆

➡ Query decomposition and data localization (Ch. 7) ⋆


➡ Distributed query optimization (Ch. 8) ⋆

• Distributed Transaction Management (Ch. 10-12) ⋆

⋆ Özsu and Valduriez, Principles of Distributed Database Systems (3rd Ed.), 2011

Distributed DBMS © M. T. Özsu & P. Valduriez


Ch.7/2

1
10-08-2024

Outline (today)
• Query decomposition and data localization (Ch. 7) ⋆
➡ The problem of distributed data localization
➡ A naïve algorithm
➡ Optimization steps (reductions)
✦ PHF (selection, join)
✦ VF (projection)
✦ DHF (join)
✦ Hybrid Fragmentation (selection/join + projection)

⋆ Özsu and Valduriez, Principles of Distributed Database Systems (3rd Ed.), 2011

Distributed DBMS © M. T. Özsu & P. Valduriez Ch.7/3

Data Localization
Input: Relational algebra expression on global, distributed relations (distributed
query)
Output: Relational algebra expression on fragments (localized query)
• Localization uses global information about distribution of fragments (no use of
quantitative information, e.g., catalog statistics)
• Recall that fragmentation is obtained by several application of rules expressed
by relational algebra …
➡ primary horizontal fragmentation: selection σ
➡ derived horizontal fragmentation: semijoin ⋉
➡ vertical fragmentation: projection 

• … and that reconstruction (reverse fragmentation) rules are also expressed in


relational algebra
➡ horizontal fragmentation: union ∪
➡ vertical fragmentation: join ⋈

Distributed DBMS © M. T. Özsu & P. Valduriez


Ch.7/4

2
10-08-2024

A naïve algorithm to localize


distribute queries
• Localization program: relational algebra expression that reconstructs a
global relation from its fragments, by reverting the rules employed for
fragmentation
• A localized query is obtained from distributed, global query by replacing
leaves (global relations) with (the tree of) its corresponding localization
program
➡ Leaves of localized queries are fragments

• This approach to obtain a localized query from a distributed one is


inefficient and the result can be improved
➡ During data localization there is a first optimization phase
✦ we call it reduction
✦ different from the “proper” global optimization phase (“proper” in the sense of
the centralized case, i.e., finding the “best” strategy for executing the query)

Distributed DBMS © M. T. Özsu & P. Valduriez Ch.7/5

Example
Assume
EMP ⋈ ASG
• EMP is fragmented as follows:
=
(EMP1 ∪ EMP2 ∪ EMP3) ⋈ (ASG1 ∪ ASG2 )
➡ EMP1= ENO≤“E3”(EMP)

➡ EMP2= “E3”<ENO≤“E6”(EMP)

➡ EMP3= ENO≥“E6”(EMP)

• ASG is fragmented as follows: ⋈ENO

➡ ASG1= ENO≤“E3”(ASG)
 
➡ ASG2= ENO>“E3”(ASG)

Replace EMP by (EMP1  EMP2  EMP3) EMP1 EMP3 ASG1 ASG2


and ASG by (ASG1  ASG2) in any query EMP2

Distributed DBMS © M. T. Özsu & P. Valduriez


Ch.7/6

3
10-08-2024

Provides Parallellism
( EMP1 ∪ EMP2 ∪ EMP3) ⋈ (ASG1 ∪ ASG2 )
EMP1= ENO≤“E3”(EMP)
=
EMP2= “E3”<ENO≤“E6”(EMP) ( EMP1 ⋈ ASG1) ∪ ( EMP1 ⋈ ASG2) ∪
EMP3= ENO≥“E6”(EMP) ( EMP2 ⋈ ASG1) ∪ ( EMP2 ⋈ ASG2) ∪
( EMP3 ⋈ ASG1) ∪ ( EMP3 ⋈ ASG2)
ASG1= ENO≤“E3”(ASG)
ASG2= ENO>“E3”(ASG) 
⋈ENO

 

⋈ENO ⋈ENO ⋈ENO ⋈ENO ⋈ ⋈ENO


EMP 1 EMP3 ASG1 ASG2 ENO
EMP2

EMP1 ASG1 EMP1 ASG2 EMP2 ASG1 EMP2 ASG2 EMP3 ASG1 EMP3 ASG2

Distributed DBMS © M. T. Özsu & P. Valduriez


Ch.7/7

Eliminates Unnecessary Work



⋈ENO ⋈ENO ⋈ENO

EMP1 ASG1 EMP2 ASG2 EMP3 ASG2

Identify (pairs of) fragments that can be ignored because they produce
empty relations (e.g., when a selection or a join is applied to them)
Distributed DBMS © M. T. Özsu & P. Valduriez
Ch.7/8

4
10-08-2024

Reduction for PHF – Selection


• Reduction of a selection over a relation fragmented with PHF: ignore a fragment
if selection predicate and fragment predicate are contradictory
➡ Consider p(R)

➡ Horizontal fragmentation on R: FR={R1, R2, …, Rw}, where Rj=p (R)


j

➡ p(Rj)= if x in R: ¬(p(x) pj(x)) i.e., p and pj are contradictory

Distributed DBMS © M. T. Özsu & P. Valduriez Ch.7/9

Reduction for PHF – Selection


(Example)
• Reduction of a selection over a relation fragmented with PHF: ignore a fragment
if selection predicate and fragment predicate are contradictory
➡ Example SELECT * EMP1= ENO≤“E3”(EMP)
FROM EMP EMP2= “E3”<ENO≤“E6”(EMP)
WHERE ENO="E5" EMP3= ENO≥“E6”(EMP)
ASG1= ENO≤“E3”(ASG)
ASG2= ENO>“E3”(ASG)

ENO=“E5” ENO=“E5” ENO=“E5”


EMP EMP1 EMP2 EMP3 EMP2


distributed query localized query reduced local query

Distributed DBMS © M. T. Özsu & P. Valduriez


Ch.7/10

10

5
10-08-2024

Reduction for PHF – Join


• Reduction of a join over relations fragmented with PHF: ignore the join of 2
fragments if their fragment predicates are contradictory over the join attributes

➡ Possible if fragmentation predicates (minterms) involve the join attribute

➡ Distribute join over union

R ⋈ S  (R1 R2) ⋈ (S1  S2)


 (R1 ⋈ S1)  (R1 ⋈ S2)  (R2 ⋈ S1)  (R2 ⋈ S2)
➡ Then, join between 2 fragments can be simplified in some cases

✦ Given Ri =pi(R) and Sj = pj(S) [pi and pj defined over join attributes]

Ri ⋈Sj = if x in R ⋈ S : ¬(pi(x) pj(x)) [there is a mistake in the textbook]


i.e., pi and pj are contradictory

Distributed DBMS © M. T. Özsu & P. Valduriez Ch.7/11

11

Reduction for PHF – Join (Example)


EMP1= ENO≤“E3”(EMP) ⋈ENO
EMP2= “E3”<ENO≤“E6”(EMP)
EMP3= ENO≥“E6”(EMP)
ASG1= ENO≤“E3”(ASG)  
ASG2= ENO>“E3”(ASG)

EMP1 EMP2 EMP3 ASG1 ASG2


• Consider the query
SELECT *
FROM EMP,ASG 
WHERE EMP.ENO=ASG.ENO
• Distribute join over unions ⋈ENO ⋈ENO ⋈ENO
• Apply the reduction rule
Not always useful EMP1 ASG1 EMP2 ASG2 EMP3 ASG2

Distributed DBMS © M. T. Özsu & P. Valduriez Ch.7/12

12

6
10-08-2024

Reduction for VF
• Reduction of a projection over a relation fragmented with VF: ignore the fragment for
which the set of fragmentation attributes intersected with the set of projection
attributes is contained in the primary key
• Recall that the localization program consists in joins over key attributes
• Let R1 be a fragment of R obtained as R1 = A’ (R) where A’  attr(R) :
➡ Reduction of a projection A’’ over R1 is possible when A’’ ∩ A’  key(R)

Ex.: EMP1 = ENO,ENAME (EMP)


ENAME ENAME
EMP2 = ENO,TITLE (EMP)
⋈ENO
SELECT ENAME
FROM EMP
EMP1 EMP2 EMP1

Distributed DBMS © M. T. Özsu & P. Valduriez Ch.7/13

13

Reduction for DHF


• Similar to the case PHF
• DHF: 2 relations S (owner) and R (member) in association one-to-many
➡ S participates with cardinality N , R participates with cardinality 1
➡ Fragmentation propagate from S to R
➡ Localization program: union
➡ Compatible fragments (i.e., fragments that agree on the values of join
attributes) are placed at the same site
• Reduction of a join over relations fragmented with DHF: only join
“corresponding” fragments
➡ Distribute joins over unions
➡ Apply the join reduction for horizontal fragmentation

Distributed DBMS © M. T. Özsu & P. Valduriez


Ch.7/14

14

7
10-08-2024

Reduction for DHF – Example


• Example [EMP is owner , ASG is member]
EMP1: TITLE=“Programmer” (EMP) Always convenient
EMP2: TITLE ≠“Programmer” (EMP) - the number of joins is always equal to the number
ASG1: ASG ⋉ENO EMP1 of fragments
ASG2: ASG ⋉ENO EMP2 - all joins can be performed in parallel (are disjoint)
• Query SELECT *
FROM EMP, ASG
WHERE ASG.ENO = EMP.ENO

⋈ENO 

  ⋈ENO ⋈ENO

ASG1 ASG2 EMP1 EMP2 ASG1 EMP1 ASG2 EMP2

Distributed DBMS © M. T. Özsu & P. Valduriez


Ch.7/15

15

Complex reduction for PHF and DHF


1. Generic query ⋈ENO SELECT *
FROM EMP, ASG
WHERE ASG.ENO = EMP.ENO
TITLE=“Mech. Eng.” AND EMP.TITLE = “Mech. Eng”

 

ASG1 ASG2 EMP1 EMP2

2. Reduction of selection over a 3. Reduction of join over a relation


relation fragmented with HF fragmented with DHF
⋈ENO ⋈ENO

 TITLE=“Mech. Eng.” TITLE=“Mech. Eng.”

ASG1 ASG2 EMP2 ASG2 EMP2

Distributed DBMS © M. T. Özsu & P. Valduriez


Ch.7/16

16

8
10-08-2024

Reduction for Hybrid Fragmentation


• Combine the rules already specified
➡ Remove empty relations generated by contradicting predicates (inside
selections or joins) on horizontal fragments

➡ Remove useless relations generated by projections on vertical fragments

➡ Distribute joins/selections/projections over unions in order to isolate and


remove useless operands

Distributed DBMS © M. T. Özsu & P. Valduriez Ch.7/17

17

Reduction for Hybrid Fragmentation


Example
ENAME
Consider the following hybrid fragmentation:
ENAME
EMP1= ENO≤"E4" (ENO,ENAME (EMP))
EMP2= ENO>"E4" (ENO,ENAME (EMP)) ENO=“E5”
EMP3= ENO,TITLE (EMP)
⋈ENO
 ENO=“E5”
Thus, the localization program for EMP is:
EMP = (EMP1 ∪ EMP2) ⋈ EMP3

Consider also the query: 


EMP2
SELECT ENAME
FROM EMP EMP1 EMP2 EMP3
WHERE ENO="E5"

Distributed DBMS © M. T. Özsu & P. Valduriez


Ch.7/18

18

You might also like