0% found this document useful (0 votes)
62 views

7 Query Localization

Uploaded by

Nida Fatima
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
62 views

7 Query Localization

Uploaded by

Nida Fatima
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 27

Outline

• Introduction
• Background
• Distributed Database Design
• Database Integration
• Semantic Data Control
• Distributed Query Processing
➡ Overview
➡ Query decomposition and localization
➡ Distributed query optimization
• Multidatabase query processing
• Distributed Transaction Management
• Data Replication
• Parallel Database Systems
• Distributed Object DBMS
• Peer-to-Peer Data Management
• Web Data Management
• Current Issues
Distributed DBMS © M. T. Özsu & P. Valduriez
Ch.7/1
Step 1 – Query Decomposition
Input : Calculus query on global relations
• Normalization
➡ manipulate query quantifiers and qualification
• Analysis
➡ detect and reject “incorrect” queries
➡ possible for only a subset of relational calculus
• Simplification
➡ eliminate redundant predicates
• Restructuring
➡ calculus query  algebraic query
➡ more than one translation is possible
➡ use transformation rules

Distributed DBMS © M. T. Özsu & P. Valduriez


Ch.7/2
Normalization
• Lexical and syntactic analysis
➡ check validity (similar to compilers)
➡ check for attributes and relations
➡ type checking on the qualification

• Put into normal form


➡ Conjunctive normal form
(p11 p12  …  p1n)  …  (pm1  pm2  …  pmn)
➡ Disjunctive normal form
(p11  p12  …  p1n)  …  (pm1  pm2  …  pmn)
➡ OR's mapped into union
➡ AND's mapped into join or selection
Distributed DBMS © M. T. Özsu & P. Valduriez
Ch.7/3
Analysis
• Refute incorrect queries
• Type incorrect
➡ If any of its attribute or relation names are not defined in the global schema
➡ If operations are applied to attributes of the wrong type
• Semantically incorrect
➡ Components do not contribute in any way to the generation of the result
➡ Only a subset of relational calculus queries can be tested for correctness
➡ Those that do not contain disjunction and negation
➡ To detect
✦ connection graph (query graph)
✦ join graph

Distributed DBMS © M. T. Özsu & P. Valduriez


Ch.7/4
Analysis – Example
SELECT ENAME,RESP
FROM EMP, ASG, PROJ
WHERE EMP.ENO = ASG.ENO
AND ASG.PNO = PROJ.PNO
AND PNAME = "CAD/CAM"
AND DUR ≥ 36
AND TITLE = "Programmer"

Query graph Join graph


DUR≥36

ASG ASG
EMP.ENO=ASG.ENO ASG.PNO=PROJ.PNO EMP.ENO=ASG.ENO ASG.PNO=PROJ.PNO

TITLE =
EMP RESP PROJ EMP PROJ
“Programmer”

ENAME
RESULT
PNAME=“CAD/CAM”

Distributed DBMS © M. T. Özsu & P. Valduriez


Ch.7/5
Analysis
If the query graph is not connected, the query may be wrong or
use Cartesian product
SELECT ENAME,RESP
FROM EMP, ASG, PROJ
WHERE EMP.ENO = ASG.ENO
AND PNAME = "CAD/CAM"
AND DUR > 36
AND TITLE = "Programmer"

ASG

EMP RESP PROJ

ENAME
RESULT
PNAME=“CAD/CAM”

Distributed DBMS © M. T. Özsu & P. Valduriez


Ch.7/6
Simplification
• Why simplify?

➡ Remember the example


• How? Use transformation rules

➡ Elimination of redundancy
✦ idempotency rules
p1  ¬( p1)  false
p1  (p1 p2)  p1
p1  false  p1

➡ Application of transitivity
➡ Use of integrity rules

Distributed DBMS © M. T. Özsu & P. Valduriez


Ch.7/7
Simplification – Example
SELECT TITLE
FROM EMP
WHERE EMP.ENAME = "J. Doe"
OR (NOT(EMP.TITLE = "Programmer")
AND (EMP.TITLE = "Programmer"
OR EMP.TITLE = "Elect. Eng.")
AND NOT(EMP.TITLE = "Elect. Eng."))


SELECT TITLE
FROM EMP
WHERE EMP.ENAME = "J. Doe"

Distributed DBMS © M. T. Özsu & P. Valduriez


Ch.7/8
Restructuring
• Convert relational calculus to relational ENAME Project
algebra
• Make use of query trees σDUR=12 OR DUR=24
• Example
Find the names of employees other than
J. Doe who worked on the CAD/CAM
σ PNAME=“CAD/CAM” Select
project for either 1 or 2 years.
SELECT ENAME σENAME≠“J. DOE”
FROM EMP, ASG, PROJ
WHERE EMP.ENO = ASG.ENO ⋈PNO
AND ASG.PNO = PROJ.PNO
AND ENAME≠ "J. Doe" ⋈ENO Join
AND PNAME = "CAD/CAM"
AND (DUR = 12 OR DUR = 24) PROJ ASG EMP
Distributed DBMS © M. T. Özsu & P. Valduriez
Ch.7/9
Restructuring –Transformation
Rules
• Commutativity of binary operations
➡ R×SS×R
➡ R ⋈S  S ⋈R
➡ RSSR
• Associativity of binary operations
➡ ( R × S) × T  R × (S × T)
➡ (R ⋈S) ⋈T  R ⋈ (S ⋈T)
• Idempotence of unary operations
➡  A’( A’(R))   A’(R)
➡ p1(A1)(p2(A2)(R))  p1(A1)p2(A2)(R)
where R[A] and A'  A, A"  A and A'  A"
• Commuting selection with projection
Distributed DBMS © M. T. Özsu & P. Valduriez
Ch.7/10
Restructuring – Transformation
Rules
• Commuting selection with binary operations
➡ p(A)(R × S)  (p(A) (R)) × S

➡ p(A )(R ⋈(A ,B )S)  (p(A ) (R)) ⋈(A ,B )S


i j k i j k

➡ p(A )(R  T)  p(A ) (R)  p(A ) (T)


i i i
where Ai belongs to R and T
• Commuting projection with binary operations
➡  C(R × S)   A’(R) ×  B’(S)

➡  C(R ⋈(A ,B )S)   A’(R) ⋈(A ,B )  B’(S)


j k j k

➡  C(R  S)   C(R)   C(S)

where R[A] and S[B]; C = A'  B' where A'  A, B'  B


Distributed DBMS © M. T. Özsu & P. Valduriez
Ch.7/11
Example
Recall the previous example: ENAME
Project
Find the names of employees other
than J. Doe who worked on the DUR=12  DUR=24
CAD/CAM project for either one or
two years.
PNAME=“CAD/CAM” Select
SELECT ENAME
FROM PROJ, ASG, EMP
ENAME≠“J. DOE”
WHERE ASG.ENO=EMP.ENO
AND ASG.PNO=PROJ.PNO
⋈PNO
ENAME ≠ "J. Doe"
⋈ENO
AND
Join
AND PROJ.PNAME="CAD/CAM"
AND (DUR=12 OR DUR=24) PROJ ASG EMP
Distributed DBMS © M. T. Özsu & P. Valduriez
Ch.7/12
Equivalent Query
ENAME

PNAME=“CAD/CAM”  (DUR=12  DUR=24) ENAME≠“J. Doe”

⋈PNO,ENO

EMP PROJ ASG


Distributed DBMS © M. T. Özsu & P. Valduriez
Ch.7/13
Restructuring
 ENAME

⋈PNO
 PNO,ENAME

⋈ENO

 PNO  PNO,ENO  PNO,ENAME

PNAME = "CAD/CAM" DUR =12DUR=24 ENAME ≠ "J. Doe"

PROJ ASG EMP


Distributed DBMS © M. T. Özsu & P. Valduriez
Ch.7/14
Step 2 – Data Localization
Input: Algebraic query on distributed relations
• Determine which fragments are involved
• Localization program
➡ substitute for each global query its materialization program
➡ optimize

Distributed DBMS © M. T. Özsu & P. Valduriez


Ch.7/15
Example
Assume  ENAME

➡ EMP is fragmented into EMP1, EMP2, DUR=12 DUR=24


EMP3 as follows:
✦ EMP1= ENO≤“E3”(EMP) PNAME=“CAD/CAM”
✦ EMP2= “E3”<ENO≤“E6”(EMP)
ENAME≠“J. DOE”
✦ EMP3= ENO≥“E6”(EMP)
➡ ASG fragmented into ASG1 and ASG2 ⋈PNO
as follows:
✦ ASG1= ENO≤“E3”(ASG) ⋈ENO

✦ ASG2= ENO>“E3”(ASG) PROJ  


Replace EMP by (EMP1  EMP2  EMP3)
and ASG by (ASG1  ASG2) in any query EMP1EMP2 EMP3 ASG1 ASG2
Distributed DBMS © M. T. Özsu & P. Valduriez
Ch.7/16
Provides Parallellism

⋈ENO ⋈ENO ⋈ENO ⋈ENO

EMP1 ASG1 EMP2 ASG2 EMP3 ASG1 EMP3 ASG2

Distributed DBMS © M. T. Özsu & P. Valduriez


Ch.7/17
Eliminates Unnecessary Work

⋈ENO ⋈ENO ⋈ENO

EMP1 ASG1 EMP2 ASG2 EMP3 ASG2

Distributed DBMS © M. T. Özsu & P. Valduriez


Ch.7/18
Reduction for PHF
• Reduction with selection
➡ Relation R and FR={R1, R2, …, Rw} where Rj=pj(R)

pi(Rj)= if x in R: ¬(pi(x) pj(x))


➡ Example
SELECT *
FROM EMP
WHERE ENO="E5"
ENO=“E5” ENO=“E5”

EMP1 EMP2 EMP3 EMP2


Distributed DBMS © M. T. Özsu & P. Valduriez
Ch.7/19
Reduction for PHF
• Reduction with join

➡ Possible if fragmentation is done on join attribute

➡ Distribute join over union

(R1 R2)⋈S  (R1⋈S)  (R2⋈S)

➡ Given Ri =p (R) and Rj = p (R)


i j

Ri ⋈Rj = if x in Ri, y in Rj: ¬(pi(x) pj(y))

Distributed DBMS © M. T. Özsu & P. Valduriez


Ch.7/20
Reduction for PHF
• Assume EMP is fragmented as ⋈ENO
before and
➡ ASG1: ENO ≤ "E3"(ASG)  
➡ ASG2: ENO > "E3"(ASG)
• Consider the query EMP1 EMP2 EMP3 ASG1 ASG2
SELECT *
FROM EMP,ASG
WHERE EMP.ENO=ASG.ENO 
• Distribute join over unions
• Apply the reduction rule ⋈ENO ⋈ENO ⋈ENO

EMP1 ASG1 EMP2 ASG2 EMP3 ASG2

Distributed DBMS © M. T. Özsu & P. Valduriez


Ch.7/21
Reduction for VF
• Find useless (not empty) intermediate relations

Relation R defined over attributes A = {A1, ..., An} vertically fragmented


as Ri = A'(R) where A' A:
 D,K(Ri) is useless if the set of projection attributes D is not in A'
Example: EMP1= ENO,ENAME (EMP); EMP2= ENO,TITLE (EMP)

SELECT ENAME
FROM EMP
ENAME ENAME

⋈ENO

EMP1 EMP2 EMP1


Distributed DBMS © M. T. Özsu & P. Valduriez
Ch.7/22
Reduction for DHF
• Rule :
➡ Distribute joins over unions
➡ Apply the join reduction for horizontal fragmentation
• Example
ASG1: ASG ⋉ENO EMP1
ASG2: ASG ⋉ENO EMP2
EMP1: TITLE=“Programmer” (EMP)
EMP2: TITLE=“Programmer” (EMP)
• Query
SELECT *
FROM EMP, ASG
WHEREASG.ENO = EMP.ENO
AND EMP.TITLE = "Mech. Eng."
Distributed DBMS © M. T. Özsu & P. Valduriez
Ch.7/23
Reduction for DHF
Generic query ⋈ENO
TITLE=“Mech. Eng.”

 

ASG1 ASG2 EMP1 EMP2

Selections first ⋈ENO

 TITLE=“Mech. Eng.”

ASG1 ASG2 EMP2


Distributed DBMS © M. T. Özsu & P. Valduriez
Ch.7/24
Reduction for DHF

Joins over unions

⋈ENO ⋈ENO

TITLE=“Mech. Eng.” TITLE=“Mech. Eng.”

ASG1 EMP2 ASG2 EMP2


Elimination of the empty intermediate relations
(left sub-tree) ⋈ENO

TITLE=“Mech. Eng.”

ASG2 EMP2
Distributed DBMS © M. T. Özsu & P. Valduriez
Ch.7/25
Reduction for Hybrid
Fragmentation
• Combine the rules already specified:
➡ Remove empty relations generated by contradicting selections on horizontal
fragments;
➡ Remove useless relations generated by projections on vertical fragments;

➡ Distribute joins over unions in order to isolate and remove useless joins.

Distributed DBMS © M. T. Özsu & P. Valduriez


Ch.7/26
Reduction for HF
Example
ENAME
Consider the following hybrid
fragmentation: ENAME
ENO=“E5”
EMP1= ENO≤"E4" ( ENO,ENAME (EMP))

EMP2= ENO>"E4" ( ENO,ENAME (EMP))


⋈ENO
 ENO=“E5”
EMP3= ENO,TITLE (EMP)

and the query 


EMP2
SELECT ENAME
FROM EMP
WHERE ENO="E5" EMP1 EMP2 EMP3

Distributed DBMS © M. T. Özsu & P. Valduriez


Ch.7/27

You might also like