Distributed Dbms Ca1

NSHM KNOWLEDGE CAMPUS, DURGAPUR-GOI (College Code: 273)

CA1 Assessment
Distributed Query Optimization Techniques
Presented By
Student Name: DEBASIS GARAI

University Roll No.: 27300121013
University Registration No.: 212730100110040 (2021-22)
Branch: Computer Science Engineering
Year: 3rd
Semester: 6th
Paper Name: DISTRIBUTED SYSTEMS

Paper Code: PEC-IT601B
TABLE OF CONTENTS
• What are distributed systems?
• Why do we need query optimization?
• Distributed query processing architecture
• Optimal utilization of resources
• Distributed query processing
• Key techniques for optimization
What are Distributed Systems?
A distributed system is a collection of independent computers or nodes that work together to provide a unified and coherent set of services. In
a distributed system, these nodes are connected and communicate with each other to achieve a common goal or provide a specific
functionality. The key characteristics of distributed systems include:
1. Independent Nodes: Nodes in a distributed system are separate entities, each having its own memory, processing power, and possibly its own operating system.
2. Communication: Nodes in a distributed system communicate with each other by passing messages. Communication can occur through various methods, such as direct inter-process communication or through a network.
3. Shared Resources: Distributed systems often share resources such as data, files, or computational capabilities among the nodes. This sharing allows for efficient utilization of resources and collaboration.
4. Concurrency: Distributed systems handle multiple tasks or processes concurrently. Nodes can work independently on different parts of a task, contributing to parallel processing and improved performance.
5. Scalability: Distributed systems can scale horizontally by adding more nodes to the network. This scalability allows them to handle increased workloads and adapt to changing requirements.
6. Fault Tolerance: Distributed systems are designed to be resilient in the face of failures. If one node fails, others can continue to operate, ensuring the system's availability.
7. Consistency: Maintaining data consistency across distributed nodes is a challenge. Distributed systems employ various mechanisms, such as distributed transactions and consensus algorithms, to ensure data consistency.
Why do we need to optimize queries in distributed systems?
• Performance Enhancement: Optimize queries to reduce latency, improve throughput, and enhance overall system performance.
• Resource Efficiency: Ensure efficient utilization of distributed resources, minimizing communication overhead and conserving bandwidth.
• Cost Reduction: Optimized queries lead to cost savings, particularly in scenarios where data transfer incurs charges.
• Scalability Support: Facilitate horizontal scalability by distributing and processing data efficiently across multiple nodes.
• Consistency and Reliability: Maintain data consistency across distributed nodes, preserving the integrity of the database.
• Adaptation to Heterogeneity: Handle diverse hardware and software configurations in distributed environments for seamless query execution.
• Adherence to SLAs: Meet service level agreements by optimizing queries to achieve agreed-upon standards.
• Improved User Experience: Faster response times contribute to an enhanced overall user experience.
Distributed Query Processing Architecture
• In a distributed database system, processing a query involves optimization at both the global and the local level. The query enters the database system at the client or controlling site. Here, the user is validated, and the query is checked, translated, and optimized at the global level.
• Distributed query optimization requires evaluating a large number of query trees, each of which produces the required result of the query. This is primarily due to the large amount of replicated and fragmented data. Because searching all of these trees is too expensive, the aim is to find a good (near-optimal) solution at acceptable cost rather than to guarantee the best one.
• The main issues for distributed query optimization are:
1. Optimal utilization of resources in the distributed system.
2. Query trading.
3. Reduction of solution space of the query.
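The third issue, reducing the solution space, can be illustrated with a small sketch. It compares an exhaustive search over all left-deep join orders with a greedy heuristic that examines far fewer plans. The relation names, cardinalities, selectivities, and the cost measure (sum of intermediate result sizes) are all assumptions made for illustration and do not come from the slides.

```python
from fractions import Fraction
from itertools import permutations

# Hypothetical relation sizes in rows (illustrative values only).
CARD = {"A": 1000, "B": 100, "C": 10}

# Hypothetical join selectivities; 1 means no join predicate (cross product).
SEL = {
    frozenset("AB"): Fraction(1, 100),
    frozenset("BC"): Fraction(1, 10),
    frozenset("AC"): Fraction(1),
}


def plan_cost(order):
    """Cost of a left-deep join order: sum of intermediate result sizes."""
    joined = [order[0]]
    rows = Fraction(CARD[order[0]])
    cost = Fraction(0)
    for rel in order[1:]:
        # Apply the selectivity against every relation already joined.
        for prev in joined:
            rows *= SEL[frozenset((prev, rel))]
        rows *= CARD[rel]
        cost += rows
        joined.append(rel)
    return cost


def exhaustive_best(relations):
    """Examine all n! left-deep orders and return the cheapest."""
    return min(permutations(relations), key=plan_cost)


def greedy_order(relations):
    """Start from the smallest relation, then always add the cheapest next join."""
    remaining = sorted(relations)
    order = [min(remaining, key=CARD.get)]
    remaining.remove(order[0])
    while remaining:
        best = min(remaining, key=lambda rel: plan_cost(order + [rel]))
        order.append(best)
        remaining.remove(best)
    return order
```

With these assumed numbers the greedy order happens to match the exhaustive optimum, but a greedy heuristic gives no such guarantee in general. It trades plan quality for a much smaller search.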
Optimal Utilization of Resources: Techniques
1. Operation Shipping: In operation shipping, the operation is run at the site where the data is stored and not at the client site. The results are then transferred to the client site. This is appropriate for operations whose operands are available at the same site. Example: Select and Project operations.
2. Data Shipping: In data shipping, the data fragments are transferred to the database server, where the operations are executed. This is used for operations whose operands are distributed across different sites. It is also appropriate in systems where communication costs are low and the processors at the data sites are much slower than the server that executes the operation.
3. Hybrid Shipping: This is a combination of data and operation shipping. Here, data fragments are transferred to the high-speed processors, where the operation runs. The results are then sent to the client site.
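The trade-off between operation shipping and data shipping can be sketched with a simple cost comparison. The cost model below (processing time plus transfer time, measured in rows per second) and every parameter name and value are assumptions for illustration. A real optimizer would use a richer model, and hybrid shipping is not modelled here.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    """Hypothetical description of one operation on one fragment."""
    rows: int             # rows stored at the data site
    result_rows: int      # rows remaining after the operation (e.g. a selection)
    data_site_speed: int  # rows/second processed at the data site
    server_speed: int     # rows/second processed at the database server
    network_speed: int    # rows/second transferred over the network


def operation_shipping_cost(w):
    # Run the operation where the data lives, then ship only the result.
    return w.rows / w.data_site_speed + w.result_rows / w.network_speed


def data_shipping_cost(w):
    # Ship the whole fragment, then run the operation on the server.
    return w.rows / w.network_speed + w.rows / w.server_speed


def choose_strategy(w):
    if operation_shipping_cost(w) <= data_shipping_cost(w):
        return "operation shipping"
    return "data shipping"


# Slow network, reasonably fast data site: shipping the small result is cheaper.
slow_network = Workload(1_000_000, 10_000, 100_000, 1_000_000, 50_000)

# Fast network, slow data site: moving the data to the fast server is cheaper.
fast_network = Workload(1_000_000, 10_000, 10_000, 1_000_000, 10_000_000)
```

Under these assumed figures, `choose_strategy(slow_network)` picks operation shipping and `choose_strategy(fast_network)` picks data shipping, which matches the conditions described in the list above.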
Distributed Query Processing
1. Parallel Execution: Utilize parallel processing to enhance query performance in a distributed environment.
2. Data Fragmentation and Replication: Efficiently manage data distribution through fragmentation and replication strategies.
3. Query Routing: Route queries to relevant nodes effectively, optimizing the path for query execution.
4. Global Query Optimization: Tackle challenges of optimizing queries across multiple nodes for a cohesive and efficient approach.
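Query routing and parallel execution over fragmented data can be sketched as a scatter-gather step. In this minimal example the "nodes" are in-memory lists and threads stand in for remote sites. The node names, the region-based fragmentation, and the sample rows are all hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical horizontal fragments, one per node, split by region.
FRAGMENTS = {
    "node_east": [
        {"id": 1, "region": "east", "amount": 120},
        {"id": 2, "region": "east", "amount": 40},
    ],
    "node_west": [
        {"id": 3, "region": "west", "amount": 300},
    ],
    "node_north": [
        {"id": 4, "region": "north", "amount": 500},
    ],
}

# Catalog information: which region each node stores.
NODE_REGION = {"node_east": "east", "node_west": "west", "node_north": "north"}


def route(regions):
    """Query routing: pick only the nodes whose fragment can contain matches."""
    return [node for node, region in NODE_REGION.items() if region in regions]


def local_select(node, min_amount):
    """Local execution: the selection runs against one node's fragment."""
    return [row for row in FRAGMENTS[node] if row["amount"] >= min_amount]


def run_query(regions, min_amount):
    """Scatter the selection to the relevant nodes in parallel, then merge."""
    nodes = route(regions)
    with ThreadPoolExecutor(max_workers=len(nodes) or 1) as pool:
        partials = pool.map(lambda node: local_select(node, min_amount), nodes)
        merged = [row for part in partials for row in part]
    return sorted(merged, key=lambda row: row["id"])
```

For example, `run_query({"east", "west"}, 100)` never touches `node_north`, and the two remaining fragments are scanned concurrently before the partial results are merged.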
Key Techniques for Optimization
1. Partitioning Strategies: Optimize data distribution through horizontal and vertical partitioning techniques.
2. Replication Strategies: Improve availability and performance by strategically implementing data replication methods.
3. Caching Mechanisms: Enhance query response times by implementing efficient caching mechanisms for frequently accessed data.
4. Load Balancing: Ensure even distribution of workload among distributed nodes for optimal resource utilization.
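Three of these techniques (partitioning, caching, and load balancing) can be shown in a few lines each. This is a minimal sketch on an in-memory table. The sample rows, the modulo-based placement rule, the least-loaded balancing policy, and all function names are assumptions chosen for illustration.

```python
from functools import lru_cache

# Hypothetical table (illustrative rows only).
ROWS = [
    {"id": 1, "city": "X", "salary": 50},
    {"id": 2, "city": "Y", "salary": 70},
    {"id": 3, "city": "X", "salary": 30},
    {"id": 4, "city": "Y", "salary": 90},
]


def horizontal_partition(rows, n_nodes):
    """Horizontal partitioning: each node receives a subset of whole rows."""
    parts = [[] for _ in range(n_nodes)]
    for row in rows:
        parts[row["id"] % n_nodes].append(row)
    return parts


def vertical_partition(rows, columns):
    """Vertical partitioning: keep the key plus a subset of the columns."""
    return [{"id": r["id"], **{c: r[c] for c in columns}} for r in rows]


def least_loaded(load):
    """Load balancing: send the next query to the node with the least work."""
    return min(load, key=load.get)


@lru_cache(maxsize=128)
def total_salary(city):
    """Caching: repeated requests for the same city are served from memory."""
    return sum(r["salary"] for r in ROWS if r["city"] == city)
```

A cache such as this one must be invalidated when the underlying data changes, for example by calling `total_salary.cache_clear()`. The sketch does not show when to do that.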
THANK YOU
