0% found this document useful (0 votes)

3 views

8 Distributed Databases

A distributed system is a software application that coordinates multiple processes across a network using protocols for communication. Distributed databases, which can be homogeneous or heterogeneous, improve reliability, availability, and performance by storing data across different locations and managing it through a Distributed Database Management System. Key concepts include data fragmentation, replication, query processing, and concurrency control to ensure efficient operation and recovery in the event of failures.

Uploaded by

canolowana6

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as DOCX, PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

3 views

8 Distributed Databases

Uploaded by

canolowana6

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as DOCX, PDF, TXT or read online on Scribd

You are on page 1/ 13

DISTRIBUTED SYSTEMS

What is a distributed system? It's one of those things that's hard to define without first defining many other things.
Here is a "cascading" definition of a distributed system:

A program
is the code you write.

A process
is what you get when you run it.

A message
is used to communicate between processes.

A packet
is a fragment of a message that might travel on a wire.

A protocol
is a formal description of message formats and the rules that two processes must follow in order to
exchange those messages.

A network
is the infrastructure that links computers, workstations, terminals, servers, etc. It consists of routers which
are connected by communication links.

A component
can be a process or any piece of hardware required to run a process, support communications between
processes, store data, etc.

A distributed system
is an application that executes a collection of protocols to coordinate the actions of multiple processes
on a network, such that all components cooperate together to perform a single or small set of related tasks.

A distributed system is a software system in which components located on networked

computers communicate and coordinate their actions by passing messages.

Motivation, Definition and Challenges of Distributed Systems

Why build Distributed Systems

 Fault-Tolerant: It can recover from component failures without performing incorrect actions.
 Highly Available: It can restore operations, permitting it to resume providing services even when
some components have failed.
 Recoverable: Failed components can restart themselves and rejoin the system, after the cause of
failure has been repaired.
 Consistent: The system can coordinate actions by multiple components often in the presence of
concurrency and failure. This underlies the ability of a distributed system to act like a non-
distributed system.
 Scalable: It can operate correctly even as some aspect of the system is scaled to a larger size. For
example, we might increase the size of the network on which the system is running. This increases
the frequency of network outages and could degrade a "non-scalable" system. Similarly, we might
increase the number of users or servers, or overall load on the system. In a scalable system, this
should not have a significant effect.
 Predictable Performance: The ability to provide desired responsiveness in a timely manner.
 Secure: The system authenticates access to data and services

DISTRIBUTED DATABASES

What are distributed databases?

 Distributed database is a system in which storage devices are not connected to a common
processing unit.
 Database is controlled by Distributed Database Management System and data may be
stored at the same location or spread over the interconnected network. It is a loosely
coupled system.
 Shared nothing architecture is used in distributed databases.
 The above diagram is a typical example of distributed database system, in which
communication channel is used to communicate with the different locations and every
system has its own memory and database.

Goals of Distributed Database system.

The concept of distributed database was built with a goal to improve:

Reliability: In distributed database system, if one system fails down or stops working for
some time another system can complete the task.
Availability: In distributed database system reliability can be achieved even if sever fails
down. Another system is available to serve the client request.
Performance: Performance can be achieved by distributing database over different
locations. So the databases are available to every location which is easy to maintain.

Types of distributed databases.

The two types of distributed systems are as follows:

1. Homogeneous distributed databases system:

 Homogeneous distributed database system is a network of two or more databases (With
same type of DBMS software) which can be stored on one or more machines.
 So, in this system data can be accessed and modified simultaneously on several databases
in the network. Homogeneous distributed system is easy to handle.
Example: Consider that we have three departments using Oracle-9i for DBMS. If some
changes are made in one department then, it would update the other department also.

2. Heterogeneous distributed database system.

 Heterogeneous distributed database system is a network of two or more databases with
different types of DBMS software, which can be stored on one or more machines.
 In this system data can be accessible to several databases in the network with the help of
generic connectivity (ODBC and JDBC).
Example: In the following diagram, different DBMS software are accessible to each
other using ODBC and JDBC.

The basic types of distributed DBMS are as follows:

1. Client-server architecture of Distributed system.

 A client server architecture has a number of clients and a few servers connected in a
network.
 A client sends a query to one of the servers. The earliest available server solves it and
replies.
 A Client-server architecture is simple to implement and execute due to centralized server
system.

2. Collaborating server architecture.

 Collaborating server architecture is designed to run a single query on multiple servers.

 Servers break single query into multiple small queries and the result is sent to the client.
 Collaborating server architecture has a collection of database servers. Each server is
capable for executing the current transactions across the databases.

3. Middleware architecture.

 Middleware architectures are designed in such a way that single query is executed on
multiple servers.
 This system needs only one server which is capable of managing queries and transactions
from multiple servers.
 Middleware architecture uses local servers to handle local queries and transactions.
 The software are used for execution of queries and transactions across one or more
independent database servers, this type of software is called as middleware.

What is fragmentation?

 The process of dividing the database into a smaller multiple parts is called
as fragmentation.
 These fragments may be stored at different locations.
 The data fragmentation process should be carried out in such a way that the
reconstruction of original database from the fragments is possible.

Types of data Fragmentation

There are three types of data fragmentation:

1. Horizontal data fragmentation

Horizontal fragmentation divides a relation(table) horizontally into the group of rows to

create subsets of tables.

Example:
Account (Acc_No, Balance, Branch_Name, Type).
In this example if values are inserted in table Branch_Name as Pune, Baroda, Delhi.

The query can be written as:

SELECT*FROM ACCOUNT WHERE Branch_Name= “Baroda”

Types of horizontal data fragmentation are as follows:

1) Primary horizontal fragmentation

Primary horizontal fragmentation is the process of fragmenting a single table, row wise
using a set of conditions.

Example:

Acc_No Balance Branch_Name

A_101 5000 Pune

A_102 10,000 Baroda

A_103 25,000 Delhi

For the above table we can define any simple condition like, Branch_Name= 'Pune',
Branch_Name= 'Delhi', Balance < 50,000

Fragmentation1:
SELECT * FROM Account WHERE Branch_Name= 'Pune' AND Balance < 50,000

Fragmentation2:
SELECT * FROM Account WHERE Branch_Name= 'Delhi' AND Balance < 50,000

2) Derived horizontal fragmentation

Fragmentation derived from the primary relation is called as derived horizontal
fragmentation.

Example: Refer the example of primary fragmentation given above.

The following fragmentation are derived from primary fragmentation.

Fragmentation1:
SELECT * FROM Account WHERE Branch_Name= 'Baroda' AND Balance < 50,000

Fragmentation2:
SELECT * FROM Account WHERE Branch_Name= 'Delhi' AND Balance < 50,000

3) Complete horizontal fragmentation

 The complete horizontal fragmentation generates a set of horizontal fragmentation, which
includes every table of original relation.
 Completeness is required for reconstruction of relation so that every table belongs to at
least one of the partitions.
4) Disjoint horizontal fragmentation
The disjoint horizontal fragmentation generates a set of horizontal fragmentation in which
no two fragments have common tables. That means every table of relation belongs to only
one fragment.

5) Reconstruction of horizontal fragmentation

Reconstruction of horizontal fragmentation can be performed using UNION operation on
fragments.

2. Vertical Fragmentation

Vertical fragmentation divides a relation(table) vertically into groups of columns to create

subsets of tables.

Example:

Acc_No Balance Branch_Name

A_101 5000 Pune

A_102 10,000 Baroda

A_103 25,000 Delhi

Fragmentation1:
SELECT * FROM Acc_NO

Fragmentation2:
SELECT * FROM Balance

Complete vertical fragmentation

 The complete vertical fragmentation generates a set of vertical fragments, which can
include all the attributes of original relation.
 Reconstruction of vertical fragmentation is performed by using Full Outer Join operation
on fragments.

3) Hybrid Fragmentation

 Hybrid fragmentation can be achieved by performing horizontal and vertical partition

together.
 Mixed fragmentation is group of rows and columns in relation.

Example: Consider the following table which consists of employee information.

Emp_ID Emp_Nam Emp_Addres Emp_Age Emp_Salary
e s

101 Surendra Baroda 25 15000

102 Jaya Pune 37 12000

103 Jayesh Pune 47 10000

Fragmentation1:
SELECT * FROM Emp_Name WHERE Emp_Age < 40

Fragmentation2:
SELECT * FROM Emp_Id WHERE Emp_Address= 'Pune' AND Salary < 14000

Reconstruction of Hybrid Fragmentation

The original relation in hybrid fragmentation is reconstructed by performing UNION and
FULL OUTER JOIN.

What is data replication?

Data replication is the process in which the data is copied at multiple locations (Different
computers or servers) to improve the availability of data.

Goals of data replication

Data replication is done with an aim to:

 Increase the availability of data.
 Speed up the query evaluation.

Types of data replication

There are two types of data replication:

1. Synchronous Replication:
In synchronous replication, the replica will be modified immediately after some changes are
made in the relation table. So there is no difference between original data and replica.

2. Asynchronous replication:
In asynchronous replication, the replica will be modified after commit is fired on to the
database.

Replication Schemes

The three replication schemes are as follows:

1. Full Replication
In full replication scheme, the database is available to almost every location or user in
communication network.

Advantages of full replication

 High availability of data, as database is available to almost every location.
 Faster execution of queries.
Disadvantages of full replication
 Concurrency control is difficult to achieve in full replication.
 Update operation is slower.

2. No Replication

No replication means, each fragment is stored exactly at one location.

Advantages of no replication
 Concurrency can be minimized.
 Easy recovery of data.
Disadvantages of no replication
 Poor availability of data.
 Slows down the query execution process, as multiple clients are accessing the same server.

3. Partial replication

Partial replication means only some fragments are replicated from the database.

Advantages of partial replication

The number of replicas created for fragments depend upon the importance of data in that
fragment.

Distributed databases - Query processing and Optimization

DDBMS processes and optimizes a query in terms of communication cost of processing a

distributed query and other parameters.

Various factors which are considered while processing a query are as follows:

Costs of Data transfer

 This is a very important factor while processing queries. The intermediate data is
transferred to other location for data processing and the final result will be sent to the
location where the actual query is processing.
 The cost of data increases if the locations are connected via high performance
communicating channel.
 The DDBMS query optimization algorithms are used to minimize the cost of data
transfer.
Semi-join based query optimization

 Semi-join is used to reduce the number of relations in a table before transferring it to

another location.
 Only joining columns are transferred in this method.
 This method reduces the cost of data transfer.

Cost based query optimization

 Query optimization involves many operations like, selection, projection, aggregation.

 Cost of communication is considered in query optimization.
 In centralized database system, the information of relations at remote location is obtained
from the server system catalogs.
 The data (query) which is manipulated at local location is considered as a sub query to
other global locations. This process estimates the total cost which is needed to compute
the intermediate relations.

Distributed Transactions

 A Distributed Databases Management System should be able to survive in a system failure

without losing any data in the database.
 This property is provided in transaction processing.
 The local transaction works only on own location(Local Location) where it is considered as
a global transaction for other locations.
 Transactions are assigned to transaction monitor which works as a supervisor.
 A distributed transaction process is designed to distribute data over many locations and
transactions are carried out successfully or terminated successfully.
 Transaction Processing is very useful for concurrent execution and recovery of data.

What is recovery in distributed databases?

Recovery is the most complicated process in distributed databases. Recovery of a failed

system in the communication network is very difficult.

For example:
Consider that, location A sends message to location B and expects response from B but B is
unable to receive it. There are several problems for this situation which are as follows.

 Message was failed due to failure in the network.

 Location B sent message but not delivered to location A.
 Location B crashed down.
 So it is actually very difficult to find the cause of failure in a large communication network.
 Distributed commit in the network is also a serious problem which can affect the recovery
in a distributed databases.

Two-phase commit protocol in Distributed databases

 Two-phase protocol is a type of atomic commitment protocol. This is a distributed

algorithm which can coordinate all the processes that participate in the database and
decide to commit or terminate the transactions. The protocol is based on commit and
terminate action.
 The two-phase protocol ensures that all participant which are accessing the database
server can receive and implement the same action (Commit or terminate), in case of local
network failure.
 Two-phase commit protocol provides automatic recovery mechanism in case of a system
failure.
 The location at which original transaction takes place is called as coordinator and where
the sub process takes place is called as Cohort.

Commit request:
In commit phase the coordinator attempts to prepare all cohorts and take necessary steps
to commit or terminate the transactions.

Commit phase:
The commit phase is based on voting of cohorts and the coordinator decides to commit or
terminate the transaction.

Concurrency problems in distributed databases.

Some problems which occur while accessing the database are as follows:

1. Failure at local locations

When system recovers from failure the database is out dated compared to other locations.
So it is necessary to update the database.

2. Failure at communication location

System should have a ability to manage temporary failure in a communicating network in
distributed databases. In this case, partition occurs which can limit the communication
between two locations.

3. Dealing with multiple copies of data

It is very important to maintain multiple copies of distributed data at different locations.

4. Distributed commit
While committing a transaction which is accessing databases stored on multiple locations, if
failure occurs on some location during the commit process then this problem is called as
distributed commit.

5. Distributed deadlock
Deadlock can occur at several locations due to recovery problem and concurrency problem
(multiple locations are accessing same system in the communication network).

Concurrency Controls in distributed databases

There are three different ways of making distinguish copy of data by applying:

1) Lock based protocol

A lock is applied to avoid concurrency problem between two transaction in such a way that
the lock is applied on one transaction and other transaction can access it only when the lock
is released. The lock is applied on write or read operations. It is an important method to
avoid deadlock.

2) Shared lock system (Read lock)

The transaction can activate shared lock on data to read its content. The lock is shared in
such a way that any other transaction can activate the shared lock on the same data for
reading purpose.

3) Exclusive lock
The transaction can activate exclusive lock on a data to read and write operation. In this
system, no other transaction can activate any kind of lock on that same data.

Distributed Database Concepts
No ratings yet
Distributed Database Concepts
52 pages
Adt Unit I
No ratings yet
Adt Unit I
18 pages
Adt Unitnotes 1to3
No ratings yet
Adt Unitnotes 1to3
107 pages
Unit 5
No ratings yet
Unit 5
17 pages
Lecture 8 - Distributed Database Management Systems
No ratings yet
Lecture 8 - Distributed Database Management Systems
60 pages
Week 12- Distributed Databases
No ratings yet
Week 12- Distributed Databases
37 pages
Unit 1 DISTRIBUTED DATABASE
No ratings yet
Unit 1 DISTRIBUTED DATABASE
6 pages
Distributed Databases
No ratings yet
Distributed Databases
46 pages
Dd Mid Answers
No ratings yet
Dd Mid Answers
29 pages
DDB Slides
No ratings yet
DDB Slides
30 pages
DB unit-2
No ratings yet
DB unit-2
27 pages
Midterm Elective Database Notes
No ratings yet
Midterm Elective Database Notes
14 pages
Chapter 4 - Distributed Database System
No ratings yet
Chapter 4 - Distributed Database System
52 pages
Unit V NoSQL Databases
No ratings yet
Unit V NoSQL Databases
124 pages
Distributed Database Frank Chinembiri and Florence-2
No ratings yet
Distributed Database Frank Chinembiri and Florence-2
42 pages
Distributed Databases
No ratings yet
Distributed Databases
53 pages
Tybca Recent Trends in It Chpter 1
No ratings yet
Tybca Recent Trends in It Chpter 1
16 pages
Distributed DBM S
No ratings yet
Distributed DBM S
67 pages
Lefikir PowerPoint
No ratings yet
Lefikir PowerPoint
15 pages
Chapter 5 - Distributed Databases Roobera
No ratings yet
Chapter 5 - Distributed Databases Roobera
58 pages
Distrubuted Database Concept
No ratings yet
Distrubuted Database Concept
22 pages
DBMS-Unit 5
No ratings yet
DBMS-Unit 5
27 pages
10 Distributeddbms
No ratings yet
10 Distributeddbms
56 pages
Unit-V Distributed and Client Server Databases: A Lalitha Associate Professor Avinash Degree College
No ratings yet
Unit-V Distributed and Client Server Databases: A Lalitha Associate Professor Avinash Degree College
24 pages
Distributed DB
No ratings yet
Distributed DB
16 pages
Unit 2 DDMS
No ratings yet
Unit 2 DDMS
26 pages
advanced database individual assignment
No ratings yet
advanced database individual assignment
4 pages
Distributed Database
100% (1)
Distributed Database
24 pages
Distributed Database - Unit 5
No ratings yet
Distributed Database - Unit 5
4 pages
Distributed Database and Big Data
No ratings yet
Distributed Database and Big Data
72 pages
Distributed Databases and Client Server Architectures
No ratings yet
Distributed Databases and Client Server Architectures
14 pages
DISTRIBUTED DATABASES Presentation
No ratings yet
DISTRIBUTED DATABASES Presentation
13 pages
ch6 Distributed Database
No ratings yet
ch6 Distributed Database
25 pages
Lecture 2 Distriburted Databases
No ratings yet
Lecture 2 Distriburted Databases
45 pages
Distributed Databases
No ratings yet
Distributed Databases
12 pages
W7 DBMS Chapter23
No ratings yet
W7 DBMS Chapter23
33 pages
04_Distributed DBMSs - Concepts and Design
No ratings yet
04_Distributed DBMSs - Concepts and Design
72 pages
Advanced Database Chapter 6 and 7
No ratings yet
Advanced Database Chapter 6 and 7
30 pages
Module 1
No ratings yet
Module 1
24 pages
Unit 5 Notes
No ratings yet
Unit 5 Notes
30 pages
ddb unit 1-5
No ratings yet
ddb unit 1-5
190 pages
Distributed Database System
No ratings yet
Distributed Database System
4 pages
Unit i Distributed Databases
No ratings yet
Unit i Distributed Databases
15 pages
Unit - 2 (1) DBMS
No ratings yet
Unit - 2 (1) DBMS
25 pages
DDIS U1-3
No ratings yet
DDIS U1-3
40 pages
Distributed Databases and Client-Server Architectures
No ratings yet
Distributed Databases and Client-Server Architectures
41 pages
Distributed Databases: Presentation-I
No ratings yet
Distributed Databases: Presentation-I
30 pages
Distributed Databases: Centralized Database System Distributed Database System Advantages and Disadvantages of DDBMS
No ratings yet
Distributed Databases: Centralized Database System Distributed Database System Advantages and Disadvantages of DDBMS
26 pages
Chapter 6
No ratings yet
Chapter 6
28 pages
Advanced Data Base Management Systems
No ratings yet
Advanced Data Base Management Systems
35 pages
Unit V
No ratings yet
Unit V
22 pages
Q # 1: What Are The Components of Distributed Database System? Explain With The Help of A Diagram. Answer
No ratings yet
Q # 1: What Are The Components of Distributed Database System? Explain With The Help of A Diagram. Answer
12 pages
MC4202 - Adavanced Database Technology
No ratings yet
MC4202 - Adavanced Database Technology
159 pages
A Distributed Database Management System ('DDBMS') Is A Software System
No ratings yet
A Distributed Database Management System ('DDBMS') Is A Software System
5 pages
DDB Slides
No ratings yet
DDB Slides
67 pages
Distributed
No ratings yet
Distributed
83 pages
Topic 7 DDBMS
No ratings yet
Topic 7 DDBMS
28 pages
ch6 Distributed Database
No ratings yet
ch6 Distributed Database
35 pages
06 - Distributed DBMSs and Replication
No ratings yet
06 - Distributed DBMSs and Replication
55 pages
Database Management System
From Everand
Database Management System
Knowledge Flow
No ratings yet
Tanfilm Chip Carrier Resistor Network: Electrical Data
No ratings yet
Tanfilm Chip Carrier Resistor Network: Electrical Data
4 pages
Energy Efficient Data Aggregation Techniques in Wireless Sensor Networks
No ratings yet
Energy Efficient Data Aggregation Techniques in Wireless Sensor Networks
1 page
Ih 2303
No ratings yet
Ih 2303
8 pages
Stack and Queue Using Linked List
No ratings yet
Stack and Queue Using Linked List
17 pages
G Luxon
No ratings yet
G Luxon
170 pages
CotBG Jumpdoc
No ratings yet
CotBG Jumpdoc
34 pages
MKT 401 Sec 1 HUAWEI Marketing Retards
No ratings yet
MKT 401 Sec 1 HUAWEI Marketing Retards
15 pages
Davidson2013 Ebook
No ratings yet
Davidson2013 Ebook
10 pages
Rapid7 Buyers Guide Appsec Web
No ratings yet
Rapid7 Buyers Guide Appsec Web
10 pages
Requirement Routines - Output Control
No ratings yet
Requirement Routines - Output Control
3 pages
344 4787, 346 2842 10 El Batal Medhat Abd El Hamid Street, Mohandiseen, Cairo
64% (11)
344 4787, 346 2842 10 El Batal Medhat Abd El Hamid Street, Mohandiseen, Cairo
22 pages
Service Manual: 8M266 Chassis
50% (2)
Service Manual: 8M266 Chassis
57 pages
7 4kw
No ratings yet
7 4kw
30 pages
Project Planning Notes Google Project Planning - Google Certificate
No ratings yet
Project Planning Notes Google Project Planning - Google Certificate
4 pages
Boundary Value Analysis
No ratings yet
Boundary Value Analysis
5 pages
PLDT Osp Training
100% (1)
PLDT Osp Training
55 pages
Check List For Fire Alarm & Voice Evacuation System
No ratings yet
Check List For Fire Alarm & Voice Evacuation System
1 page
Ix58a-Axp Ix58b-Axp 090806
No ratings yet
Ix58a-Axp Ix58b-Axp 090806
69 pages
Ilovepdf Merged PDF
No ratings yet
Ilovepdf Merged PDF
41 pages
Architecture
No ratings yet
Architecture
1 page
CMMI V2 0 Model at A Glance - Print Ready - ENG - 2019 04 29v2 1
No ratings yet
CMMI V2 0 Model at A Glance - Print Ready - ENG - 2019 04 29v2 1
7 pages
Import Import Import From Import From Import Import: Drive STR STR STR
No ratings yet
Import Import Import From Import From Import Import: Drive STR STR STR
7 pages
TALLYGENICOM T6212 User Guide
No ratings yet
TALLYGENICOM T6212 User Guide
2 pages
Diagrama Electrico PDF
No ratings yet
Diagrama Electrico PDF
2 pages
Database Management Systems Ramakrishnan 3rd Edition Raghu Ramakrishnan - The ebook with rich content is ready for you to download
100% (1)
Database Management Systems Ramakrishnan 3rd Edition Raghu Ramakrishnan - The ebook with rich content is ready for you to download
57 pages
Phased Array Ultrasonic Technique Speeds Up Examination of Aluminothermic Rail Welds
No ratings yet
Phased Array Ultrasonic Technique Speeds Up Examination of Aluminothermic Rail Welds
1 page
m094 h7cc Digital Counter Tachometer Datasheet en
No ratings yet
m094 h7cc Digital Counter Tachometer Datasheet en
66 pages
Data Protection Liberia
No ratings yet
Data Protection Liberia
6 pages
RME TotalMix - Features - Operation
No ratings yet
RME TotalMix - Features - Operation
13 pages
MVS21 USB Kill Switch
No ratings yet
MVS21 USB Kill Switch
74 pages