Distributed Database Management System Notes

The document provides a comprehensive overview of Distributed Database Management Systems (DDBMS), covering their features, architecture, and the need for distributed systems in organizations. It discusses the benefits of DDBMS, such as autonomy, data sharing, availability, and performance improvements, while also addressing the challenges of managing distributed transactions and data transparency. Additionally, it outlines the reference architecture for DDBMS, including global schemas, fragmentation, and allocation schemas, emphasizing the importance of distribution transparency in application development.

Distributed DBMS

Syllabus

Introduction to DDBMS, features and needs, reference architecture, levels of distribution transparency, replication, distributed database design, fragmentation and allocation criteria, storage mechanisms; translation of global queries, global query optimization, query execution and access plans.

Concurrency control, 2-phase locks, distributed deadlocks, quorum-based and time-based protocols, comparison, reliability, non-blocking commitment protocols (3-phase), partitioned networks, checkpoints and cold starts.

Management of distributed transactions, 2-phase commit protocols, architectural aspects, node and link failure recoveries, distributed data dictionary management, distributed database administration.

Heterogeneous databases: federated database reference architecture, loosely and tightly coupled alternative architectures, development tasks, operational global task management, client-server databases, SQL Server, Open Database Connectivity, constructing an application.
Introduction
Independent or centralized systems were the norm in the earlier days
of information management. There was duplication of hardware and
facilities. Incompatible procedures and lack of management control were the
consequences of the evolving nature of the field.

In a centralized database system, the DBMS and the data reside at a single location, and control and processing are limited to this location. However, many organizations have geographically dispersed operations. The reliability of the system is compromised, since messages between sites may be lost or the communication links may fail. The excessive load on the system at the central site would likely cause all accesses to be delayed.

An organization located in a single building with quasi-independent operational divisions, each with its own information processing needs and using a centralized database on a local network, would have similar problems.

The current trend is toward distributed systems. A distributed system is a central system connected to intelligent remote devices, each of which can itself be a computer, or a set of interconnected, possibly heterogeneous computers. This distribution of processing power creates a feasible environment for data distribution. All distributed data must still be accessible from each site. Thus, a distributed database can be defined as a collection of data whose different parts are under the control of separate DBMSs running on independent computer systems. All the computers are interconnected, and each system has autonomous processing capability, serving local applications. Each system also participates in the execution of one or more global applications; such applications require data from more than one site.

Another typical definition of a distributed database is the following: a distributed database is a collection of data which belong logically to the same system but are spread over the sites of a computer network. This definition emphasizes two equally important aspects of a distributed database:

1. Distribution: the fact that the data are not resident at the same site, so that we can distinguish a distributed database from a single, centralized database.
2. Logical correlation: the fact that the data have some properties which tie them together, so that we can distinguish a distributed database from a set of local databases or files which are resident at different sites of a computer network.

The problem with the above definition is that the properties of distribution and logical correlation are too vaguely defined. In order to develop a more specific definition, let us consider an example.

Fig. A distributed database on a geographically dispersed network.

Consider a bank that has three branches at different locations. At each branch, a computer controls the teller terminals of the branch and the account database of that branch. Each computer with its local account database at one branch constitutes one site of the distributed database; the computers are connected by a communication network. During normal operation, the applications which are requested from the terminals of a branch need only access the database of that branch. These applications are completely executed by the computer of the branch where they are issued and will therefore be called local applications. An example of a local application is a debit or credit application performed on an account stored at the same branch at which the application is requested.

Therefore, a distributed database can be defined as a collection of data which are distributed over different computers of a computer network. Each site of the network has autonomous processing capability and can perform local applications. Each site also participates in the execution of at least one global application, which requires accessing data at several sites using a communication subsystem.

Need of Distributed DBMS


There are several reasons why distributed databases are developed.

1. Organizational and economic reasons:

Many organizations are decentralized, and a distributed database approach fits more naturally the structure of the organization. The problems of a distributed organizational structure and of the corresponding information systems are the subject of several books and papers. With recent developments in computer technology, the economy-of-scale motivation for having large, centralized computer centers is becoming questionable. The organizational and economic reasons are probably the most important reasons for developing distributed databases.

2. Interconnection of existing databases:

Distributed databases are the natural solution when several databases already exist in an organization and the necessity of performing global applications arises. In this case, the distributed database is created bottom-up from the pre-existing local databases. This process may require a certain degree of local restructuring; however, the effort required by this restructuring is much less than that needed for the creation of a completely new centralized database.

3. Incremental growth:

If an organization grows by adding new, relatively autonomous organizational units (new branches, warehouses, etc.), then the distributed database approach supports smooth incremental growth with a minimum degree of impact on the already existing units. In a centralized approach, it is very difficult to add new branches.

4. Reduced communication overhead:

In a geographically distributed database, many applications are local; this clearly reduces the communication overhead with respect to a centralized database.

5. Performance considerations:

The existence of several autonomous processors results in increased performance through a high degree of parallelism. This consideration applies to any multiprocessor system, not only to distributed databases. Distributed databases have the advantage that the decomposition of data reflects application-dependent criteria which maximize application locality; in this way, the mutual interference between different processors is minimized. The load is shared between the different processors, and critical bottlenecks are avoided.

Features of DDBMS
A distributed database can be defined as consisting of a collection of
data with different parts under the control of separate DBMSs, running on
independent computer systems. All the computers are interconnected and
each system has autonomous processing capability, serving local
applications.

The main features of a DDBMS are as follows:

1. Autonomy: In distributed databases, it is possible to identify a hierarchical control structure based on a global database administrator, who has central responsibility for the whole database, and on local database administrators, who have responsibility for their respective local databases. However, it must be emphasized that local database administrators may have a high degree of autonomy, up to the point that a global database administrator is completely missing and inter-site coordination is performed by the local administrators themselves. This characteristic is usually called site autonomy.

2. Sharing: Data sharing is also provided by a DDBMS. Users at a given site are able to access data stored at other sites and at the same time retain control over the data at their own site.

3. Availability and reliability: Even when a portion of a system (i.e. a local site) is down, the system remains available. With replicated data, the failure of one site still allows access to the replicated copy of the data from another site; the remaining sites continue to function. This greater accessibility enhances the reliability of the system.

4. Parallel evaluation: A query involving data from several sites can be subdivided into subqueries and the parts evaluated in parallel.

5. Distributed data: Data distribution with redundant copies can be used to increase system availability and reliability. If data can be obtained from a site other than the one that has failed, availability improves, and since the system can still run, reliability does too. Data distribution can also be used to decrease communication costs: if most of the data used at a given site are available locally, the communication cost is obviously reduced compared with that of a remote centralized system.

6. Data distribution transparency: Data independence means that the actual organization of data is transparent to the application programmer.

7. Management of distributed data with different levels of transparency: Ideally, a DBMS should be distribution transparent in the sense of hiding the details of where each file (table, relation) is physically stored within the system.

8. Distribution or network transparency: This refers to freedom for the users from the operational details of the network. It may be divided into location transparency and naming transparency. Location transparency refers to the fact that the command issued to perform a task is independent of the location of the data and of the system where the command was issued. Naming transparency implies that once a name is specified, the named object can be accessed unambiguously without additional specification.
9. Replication transparency: Copies of data may be stored at multiple sites for better availability, performance and reliability. Replication transparency makes the user unaware of the existence of these copies.

10. Fragmentation transparency: Two types of fragmentation are possible. Horizontal fragmentation decomposes the rows of a relation into several subsets that are called fragments of that relation. Vertical fragmentation decomposes the attributes of a relation into several subsets that are likewise referred to as fragments. A global query issued by the user must be transformed into several fragment queries. Fragmentation transparency makes the user unaware of the existence of fragments.

11. Improved performance: A distributed DBMS fragments the database, keeping the data closer to where it is needed most. Data localization reduces contention for CPU and I/O services and simultaneously reduces the access delays involved in wide area networks. When a large database is distributed over multiple sites, smaller databases exist at each site. As a result, local queries and transactions accessing data at a single site have better performance because of the smaller local databases.

12. Easier expansion: In a distributed environment, expansion of the system in terms of adding more data, increasing database sizes, or adding more processors is much easier.

Reference Architecture for Distributed Database


Fig: Reference Architecture for Distributed Database.

The above figure shows the reference architecture for a distributed database.
This architecture is not explicitly implemented in all distributed databases.
However, its levels are conceptually relevant in order to understand the
organization of any distributed database.

The major components of the reference architecture are as follows:

Global Schema
At the top level of the reference architecture is the global schema. The global schema defines all the data contained in the distributed database as if the database were not distributed at all. For this reason, the global schema can be defined exactly in the same way as in a nondistributed database. The global schema consists of the definition of a set of global relations.

Fragmentation Schema
Each global relation can be split into several non-overlapping portions which are called fragments. Fragments are indicated by a global relation name with an index. There are several different ways to perform the splitting operation; the mapping between global relations and fragments is defined in the fragmentation schema. This mapping is one to many.

Allocation Schema
The allocation schema defines at which sites a fragment is located. Fragments are logical portions of a global relation which are physically located at one or several sites of the network.

All the fragments which correspond to the same global relation R and are located at the same site j constitute the physical image of the global relation R at site j. The physical image is indicated by the global relation name and a site index: Rj, where R is the global relation and j the site.


Fig: fragments and physical images for global relation.

A global relation R is split into four fragments R1, R2, R3 and R4, which are allocated, possibly redundantly, at three sites, thus building three physical images, one per site.

To complete the terminology, we will refer to a copy of a fragment at a given site and denote it using the global relation name and two indexes (the fragment index and the site index). Two physical images can be identical; in that case one physical image is a copy of the other.

Example: the physical image R1 is a copy of the physical image R2.
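To make the mapping between these levels concrete, the following minimal Python sketch models the three site-independent levels as plain dictionaries. All relation, fragment and site names here are hypothetical illustrations, not part of the reference architecture itself.

# Hypothetical sketch: global schema -> fragmentation schema -> allocation schema.
global_schema = {
    "SUPPLIER": ["SNUM", "NAME", "CITY"],   # one global relation
}

# One-to-many mapping from each global relation to its fragments.
fragmentation_schema = {
    "SUPPLIER": ["SUPPLIER1", "SUPPLIER2"],
}

# Where each fragment is (possibly redundantly) located.
allocation_schema = {
    "SUPPLIER1": [1, 2],    # fragment stored at sites 1 and 2
    "SUPPLIER2": [2, 3],    # fragment stored at sites 2 and 3
}

def physical_image(relation, site):
    """All fragments of `relation` allocated at `site` (the Rj of the text)."""
    return [f for f in fragmentation_schema[relation]
            if site in allocation_schema[f]]

print(physical_image("SUPPLIER", 2))   # ['SUPPLIER1', 'SUPPLIER2']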

The three top levels of the reference architecture are site-independent. At a lower level, it is necessary to map the physical images to the objects which are manipulated by the local DBMSs. This mapping is called a local mapping schema and depends on the type of local DBMS.

Objectives
The three most important objectives which motivate the features of this
architecture are the separation of data fragmentation and allocation, the
control of redundancy and the independence from local DBMS.

Separating the concept of data fragmentation from the concept of data allocation: This separation allows us to distinguish two different levels of distribution transparency, as follows.

a) Fragmentation transparency: the highest degree of transparency; the user works on global relations.

b) Location transparency: a lower degree of transparency, which requires the user to work on fragments instead of global relations.

Explicit control of redundancy: The reference architecture provides explicit control of redundancy at the fragment level. In the above figure, the physical images R2 and R3 contain common data. The definition of disjoint fragments as the building block of physical images allows us to refer explicitly to the overlapping part: the replicated fragment R2.

Independence from local DBMSs, or local mapping transparency: This allows us to study several problems of distributed DBMSs without having to take into account the specific data models of the local DBMSs.

Distribution Transparency

Distribution transparency means that programs can be written as if the database were not distributed. Thus the correctness of programs is unaffected by the movement of data from site to site; however, their speed of execution is affected.

There are three levels of transparency, as follows:

1) Fragmentation transparency (the user need not be aware of the fragmentation of the data),
2) Location transparency (the user does not know the location of the data), and
3) Local mapping transparency.

Fragmentation transparency is the highest degree of transparency and consists of the fact that the user or application programmer works on global relations. Location transparency is a lower degree of transparency and requires the user or application programmer to work on fragments instead of global relations; however, he or she does not need to know where the fragments are located.

The separation between the concept of fragmentation and allocation is very


convenient in distributed database design, because the determination of
relevant portions of the data is thus distinguished from the problem of
optimal allocation.

To understand the concept of distribution transparency, consider a simple application, called SUPINQUIRY, which consists of accepting a supplier number from a terminal, finding the corresponding supplier name, and displaying it at the terminal.

Level 1 Fragmentation Transparency

The way in which the application accesses the database if the DDBMS provides fragmentation transparency is shown below:

Read (terminal, $SNUM);
Select NAME into $NAME
From SUPPLIER
Where SNUM = $SNUM;
Write (terminal, $NAME).


First, the application accepts a supplier number from the terminal; then it accesses the database. The whole SQL statement represents a single distributed database access primitive, which receives the variable $SNUM as input parameter and returns the variable $NAME as output parameter. The DDBMS interprets the primitive by accessing the database at one of the three sites in a way which is completely determined by the system.

From the viewpoint of distribution transparency, notice that the application refers to the global relation name SUPPLIER, completely ignoring the fact that the database is distributed. In this way, the application is completely unaffected by any change applied to the schemata which are below the global schema in the reference architecture.

Level 2 Location Transparency

If the DDBMS provides location transparency but not fragmentation transparency, the same application can be written as shown below. The request for the supplier with the given number is first issued referring to fragment SUPPLIER1, and if the DDBMS returns a negative answer in the control variable #FOUND, a similar request is issued referring to SUPPLIER2:

Read (terminal, $SNUM);
Select NAME into $NAME
From SUPPLIER1
Where SNUM = $SNUM;
If not #FOUND then
Select NAME into $NAME
From SUPPLIER2
Where SNUM = $SNUM;
Write (terminal, $NAME).

At this point, this naive implementation assumes that the supplier has been found and displays the result. Of course, several variations of this solution are possible, for instance issuing both requests in parallel in order to exploit the parallelism of the distributed system; however, this does not change the distribution transparency characteristics. This application is clearly independent of changes in the allocation schema, but not of changes in the fragmentation schema, because the fragmentation structure is incorporated in the application.

However, location transparency is by itself very useful, because it allows the application to ignore which copies exist of each fragment, therefore allowing copies to be moved from one site to another and new copies to be created without affecting the application. When location transparency is provided without fragmentation transparency, applications can be written that take advantage of knowing the fragmentation structure.

Level 3 Local Mapping Transparency

Read (terminal, $SNUM);
Select NAME into $NAME
From SUPPLIER1 AT SITE1
Where SNUM = $SNUM;
If not #FOUND then
Select NAME into $NAME
From SUPPLIER2 AT SITE2
Where SNUM = $SNUM;
Write (terminal, $NAME).

At this level we assume that the application refers to objects using names which are independent of the individual local systems; however, it has to specify the site at which the data reside. Sites are indicated in the SQL statements by adding an 'at' clause to the 'from' clause. In this case each database access primitive is routed by the DDBMS to a specific site. However, these primitives use site-independent fragment names. If this mapping were not provided, the application would have to incorporate directly the file names which are used by the local systems.

Replication
Replication improves the performance of simple read operations in a distributed system and improves its reliability. However, updates incur greater overhead, and the requirement that all replicas of the data be updated and consistent adds complexity to a distributed database system.

A database is said to be:

1. Strictly partitioned: when no replicas of the fragments are allowed.
2. Fully redundant: when complete database copies are distributed at all sites.
3. Partially redundant: when only certain fragments are replicated.

Choice 1 would lead to relatively expensive query evaluation, due to the unavailability of the data locally or at some nearby site.

Choice 2 is very expensive in terms of storage space and of the overhead needed to maintain consistency; it is meaningless to replicate data at nodes where they are unlikely to be accessed.

Choice 3 allows reduced access time for frequent reads of local data or of data from a nearby site. This choice also allows higher reliability and availability during site crashes. However, because of the replication, updates are expensive.

Updates require access to all copies of the data item. Because the copies are distributed over different sites of the network, the sites must reach a consensus on the possibility of an update: failed sites may not participate in the agreement, and sites may fail after the process has started. These issues are dealt with later in the sections on concurrency control and recovery.

Although a major aim of a database system is to reduce, if not eliminate, redundancy, planned data redundancy can improve distributed database performance. If a number of copies of a data item are available, a read operation can be directed to any one of these copies. A write operation, however, must update all copies; otherwise we would have inconsistent data. The system is required to ensure that any update operation is performed on all replicas. This results in increased overhead, a price to be paid in distributed databases.
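The read-one/write-all behavior described above can be sketched in a few lines of Python; the data and site names are hypothetical, and real systems add the consensus and recovery machinery discussed in this section.

# Minimal read-one/write-all sketch: reads use any one copy,
# writes must reach every copy to keep the replicas consistent.
copies = {1: {"x": 10}, 2: {"x": 10}, 3: {"x": 10}}   # site -> local copy

def read(item, available_sites):
    site = next(iter(available_sites))      # any single copy will do
    return copies[site][item]

def write(item, value, available_sites):
    if set(copies) - set(available_sites):  # some replica unreachable
        raise RuntimeError("update must wait or abort: not all copies reachable")
    for site in available_sites:
        copies[site][item] = value

write("x", 42, {1, 2, 3})
print(read("x", {2, 3}))   # 42: readable even if site 1 later fails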

Fragmentation
Each global relation can be split into several non-overlapping
portions which are called fragments.

Fragmentation Schema
The mapping between global relations and fragments is defined in the fragmentation schema. This mapping is one to many, i.e. several fragments correspond to one global relation, but only one global relation corresponds to each fragment. Fragments are indicated by a global relation name with an index (the fragment index).

Fragments are logical portions of a global relation which are physically located at one or several sites of the network. All the fragments which correspond to the same global relation R and are located at the same site j constitute the physical image of global relation R at site j (i.e. Rj).

There are some rules which must be followed when defining fragments:

(a) Completeness condition: All the data of the global relation must be mapped into the fragments, i.e. it must not happen that a data item which belongs to the global relation does not belong to any fragment.

(b) Reconstruction condition: It must always be possible to reconstruct each global relation from its fragments. The necessity of this condition is obvious: only fragments are stored in the distributed database, and global relations have to be built through this reconstruction operation when necessary.

(c) Disjointness condition: It is convenient that fragments be disjoint, so that the replication of data can be controlled explicitly at the allocation level. However, this condition is useful mainly with horizontal fragmentation, while for vertical fragmentation we will sometimes allow it to be violated; the reason for this exception is discussed when dealing with vertical fragmentation. A sketch of how these conditions can be checked is given below.
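As a minimal illustration, the three conditions can be checked mechanically for a horizontal fragmentation; the sketch below models relations as Python sets of tuples, with hypothetical sample data.

# Checking completeness, reconstruction and disjointness for a
# horizontal fragmentation (relations as sets of tuples).
global_relation = {("S1", "Smith", "SF"), ("S2", "Jones", "LA")}
fragments = [
    {("S1", "Smith", "SF")},    # e.g. tuples with CITY = "SF"
    {("S2", "Jones", "LA")},    # e.g. tuples with CITY = "LA"
]

union = set().union(*fragments)
complete = union >= global_relation             # every tuple is in some fragment
reconstructible = union == global_relation      # the union rebuilds the relation
disjoint = sum(map(len, fragments)) == len(union)  # no tuple in two fragments

print(complete, reconstructible, disjoint)      # True True True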

Types of Fragmentation
The decomposition of global relations into fragments can be performed by applying two basic types of fragmentation, and their combination:

1. Horizontal fragmentation
2. Vertical fragmentation
3. Mixed fragmentation

We will first consider the two basic types of fragmentation separately and then consider the more complex fragmentations which can be obtained by applying a combination of both.

1. Horizontal fragmentation
Horizontal fragmentation consists of partitioning the tuples of a global relation into subsets; this is clearly useful in distributed databases, where each subset can contain data which have common geographical properties. It can be defined by expressing each fragment as a selection operation on the global relation. For example, let a global relation be

SUPPLIER(SNUM, NAME, CITY)

Then the horizontal fragmentation can be defined in the following way:

SUPPLIER1 = SL CITY="SF" SUPPLIER
SUPPLIER2 = SL CITY="LA" SUPPLIER

The above fragmentation satisfies the completeness condition if "SF" and "LA" are the only possible values of the CITY attribute; otherwise we would not know to which fragment the tuples with other CITY values belong.

The reconstruction condition is easily verified, because it is always possible to reconstruct the SUPPLIER global relation through the following operation:

SUPPLIER = SUPPLIER1 UN SUPPLIER2

The disjointness condition is also clearly verified. We will call the predicate which is used in the selection operation defining a fragment its qualification. The qualifications of the two fragments above are:

q1: CITY = "SF"
q2: CITY = "LA"

The above example shows that, in order to satisfy the completeness condition, the set of qualifications of all fragments must be complete, at least with respect to the set of allowed values. The reconstruction condition is always satisfied through the union operation, and the disjointness condition requires that qualifications be mutually exclusive.

Derived Horizontal Fragmentation:

In some cases, the horizontal fragmentation of a relation cannot be based on a property of its own attributes, but is derived from the horizontal fragmentation of another relation. Consider the relation

SUPPLY(SNUM, PNUM, DEPTNUM, QUAN)

where SNUM is a supplier number. It is meaningful to partition this relation so that a fragment contains the tuples for suppliers which are in a given city. However, CITY is not an attribute of the SUPPLY relation; it is an attribute of the SUPPLIER relation considered in the above example.

Therefore, we need a semi-join operation in order to determine the tuples of SUPPLY which correspond to the suppliers in a given city. The derived fragmentation of SUPPLY can therefore be defined as follows:

SUPPLY1 = SUPPLY SJ SNUM=SNUM SUPPLIER1
SUPPLY2 = SUPPLY SJ SNUM=SNUM SUPPLIER2

The effect of the semi-join operation is to select from SUPPLY the tuples which satisfy the join condition between SUPPLIER1 or SUPPLIER2 and SUPPLY. The reconstruction of the global relation SUPPLY can be performed through the union, as shown for SUPPLIER. The qualifications of the fragments are:

q1: SUPPLY.SNUM = SUPPLIER.SNUM AND SUPPLIER.CITY = "SF"
q2: SUPPLY.SNUM = SUPPLIER.SNUM AND SUPPLIER.CITY = "LA"

A small sketch of the semi-join operation follows.
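As a rough illustration of the semi-join used above, the following Python sketch keeps only the SUPPLY tuples whose SNUM appears in a SUPPLIER fragment; the sample data is hypothetical.

# SUPPLY SJ SNUM=SNUM SUPPLIER1: SUPPLY tuples with a matching SUPPLIER1 tuple.
supplier1 = [{"SNUM": "S1", "NAME": "Smith", "CITY": "SF"}]    # CITY = "SF"
supply = [
    {"SNUM": "S1", "PNUM": "P1", "DEPTNUM": 10, "QUAN": 100},
    {"SNUM": "S2", "PNUM": "P2", "DEPTNUM": 20, "QUAN": 200},
]

def semi_join(left, right, attr):
    keys = {t[attr] for t in right}             # join values present in `right`
    return [t for t in left if t[attr] in keys]

supply1 = semi_join(supply, supplier1, "SNUM")
print(supply1)   # only the S1 tuple: its supplier is located in SF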

2. VERTICAL FRAGMENTATION:
The vertical fragmentation of a global relation is the subdivision of its attributes into groups; fragments are obtained by projecting the global relation over each group. This can be useful in a distributed database where each group of attributes can contain data which have common geographical properties. The fragmentation is correct if each attribute is mapped into at least one attribute of the fragments; moreover, it must be possible to reconstruct the original relation by joining the fragments together.

Ex: EMP(EMPNUM, NAME, SAL, TAX, MGRNUM, DEPTNUM)

A vertical fragmentation of this relation can be defined as

EMP1 = PJ EMPNUM, NAME, MGRNUM, DEPTNUM EMP
EMP2 = PJ EMPNUM, SAL, TAX EMP

This fragmentation could, for instance, reflect an organization in which salaries and taxes are managed separately. The reconstruction of relation EMP can be obtained as:

EMP = EMP1 JN EMPNUM=EMPNUM EMP2

Notice that the above formula for the reconstruction of the global relation EMP is not complete, because the result of joining EMP1 and EMP2 contains the column EMPNUM twice. This undesired replication can be eliminated by a projection operation which we omit to indicate.

Ex: consider the following vertical fragmentation of relation EMP:

EMP1 = PJ EMPNUM, NAME, MGRNUM, DEPTNUM EMP
EMP2 = PJ EMPNUM, NAME, SAL, TAX EMP

The attribute NAME is replicated in both fragments. We can explicitly eliminate this attribute when reconstructing EMP through an additional projection operation:

EMP = EMP1 JN EMPNUM=EMPNUM PJ EMPNUM, SAL, TAX EMP2

A sketch of projection-based fragmentation and its reconstruction by join follows.
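The projection and join used above can be sketched as follows in Python (hypothetical sample data); the final comparison confirms that the natural join on the key EMPNUM reconstructs EMP.

# Vertical fragmentation of EMP by projection, reconstruction by join.
emp = [{"EMPNUM": 1, "NAME": "Ann", "SAL": 100, "TAX": 10,
        "MGRNUM": 7, "DEPTNUM": 15}]

def project(rel, attrs):
    return [{a: t[a] for a in attrs} for t in rel]

emp1 = project(emp, ["EMPNUM", "NAME", "MGRNUM", "DEPTNUM"])
emp2 = project(emp, ["EMPNUM", "SAL", "TAX"])

def join(r1, r2, key):
    # Natural join on `key`; the key column appears only once in the result.
    return [{**t1, **t2} for t1 in r1 for t2 in r2 if t1[key] == t2[key]]

print(join(emp1, emp2, "EMPNUM") == emp)   # True: EMP is reconstructed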

3. MIXED FRAGMENTATION:
The fragments which are obtained by the above fragmentation operations are relations themselves, so it is possible to apply the fragmentation operations recursively, provided that the correctness conditions are satisfied each time. The reconstruction can be obtained by applying the reconstruction rules in inverse order.

Ex: EMP(EMPNUM, NAME, SAL, TAX, MGRNUM, DEPTNUM)

The following is a mixed fragmentation which is obtained by applying a vertical fragmentation, followed by a horizontal fragmentation on DEPTNUM:

EMP1 = SL DEPTNUM <= 10 PJ EMPNUM, NAME, MGRNUM, DEPTNUM EMP
EMP2 = SL 10 < DEPTNUM <= 20 PJ EMPNUM, NAME, MGRNUM, DEPTNUM EMP
EMP3 = SL DEPTNUM > 20 PJ EMPNUM, NAME, MGRNUM, DEPTNUM EMP
EMP4 = PJ EMPNUM, NAME, SAL, TAX EMP

The reconstruction of relation EMP is defined by the following expression:

EMP = UN(EMP1, EMP2, EMP3) JN EMPNUM=EMPNUM PJ EMPNUM, SAL, TAX EMP4

Mixed fragmentation can be conveniently represented by a fragmentation tree.

Fig. The fragmentation tree of relation EMP: the root EMP has two sons, the vertical fragment EMP4 and an intermediate node which is horizontally partitioned into EMP1, EMP2 and EMP3.

In a fragmentation tree, the root corresponds to a global relation, the leaves correspond to the fragments, and the intermediate nodes correspond to the intermediate results of the fragment-defining expressions. The set of nodes which are sons of a given node represents the decomposition of this node by a fragmentation operation (vertical or horizontal).

Ex: In the fragmentation tree of relation EMP, the root (relation EMP) is vertically fragmented into two portions; one portion corresponds to a leaf node of the tree (EMP4), while the other portion is horizontally partitioned, thus generating the other three leaves, corresponding to fragments EMP1, EMP2 and EMP3.

Distributed Database Design

Designing a distributed database involves many technical and organizational issues that become more difficult in a multiple-site system. The technical problems are the interconnection of sites by a computer network and the optimal distribution of data and applications to the sites, both for meeting the requirements of applications and for optimizing performance. From an organizational viewpoint, the issue of decentralization is crucial, since distributed systems typically substitute for large centralized systems, and distributing an application has a major impact on the organization.

The mathematical problem of optimally distributing data over a computer network has been widely analyzed in the context of distributed systems and distributed databases. The major outcomes of this research are:

Framework for Distributed Database Design

The term "distributed database design" has a very broad and imprecise meaning. The design of a centralized database amounts to:

1. Designing the 'conceptual schema', which describes the integrated database (all the data which are used by the database applications).
2. Designing the 'physical database', i.e. mapping the conceptual schema to storage areas and determining appropriate access methods.

In a distributed database, these two problems become the design of the global schema and the design of the local physical databases at each site. The techniques which can be applied to these problems are the same as in centralized databases. The distribution of the database adds two new problems:

3. Designing the fragmentation, i.e. determining how global relations are subdivided into horizontal, vertical or mixed fragments.
4. Designing the allocation of fragments, i.e. determining how fragments are mapped to physical images; in this way, the replication of fragments is also determined.

These two problems fully characterize the design of data distribution, since fragmentation has been established as a distinguishing feature of distributed databases.

The distinction between these two problems is conceptually relevant, since the first one deals with the 'logical criteria' which motivate the fragmentation of global relations, while the second one deals with the physical placement of data at the various sites.
Although the design of application programs comes after the design of the schemata, knowledge of the application requirements influences schema design, since the schemata must be able to support the applications efficiently. Thus, in the design of a distributed database, sufficiently precise knowledge of the application requirements is needed; clearly, this knowledge is required only for the more 'important' applications, i.e. those which will be executed frequently or whose performance is critical.

In the application requirements we include:

1. The site from which the application is issued (also called the site of origin of the application).
2. The frequency of activation of the application (i.e. the number of activation requests in the unit time).
3. The number, type and statistical distribution of accesses made by each application to each required data 'object'.

Objectives of the design of Data Distribution

In the design of data distribution, the following objectives should be taken into account.

1. Processing locality: Distributing data to maximize processing locality corresponds to the simple principle of placing data as close as possible to the applications which use them. The simplest way of characterizing processing locality is to consider two types of references to data: 'local' references and 'remote' references. Clearly, once the sites of origin of applications are known, locality and remoteness of references depend only on data distribution.

Designing data distribution for maximizing processing locality can be done by counting the number of local and remote references corresponding to each candidate fragmentation and fragment allocation, and selecting the best solution among them.

The term complete locality designates those applications which can be completely executed at their site of origin. The advantage of complete locality is not only the reduction of remote accesses, but also the increased simplicity of controlling the execution of the application.

2. Availability and reliability of distributed data: A high degree of availability is achieved by storing multiple copies of the same information; the system must be able to switch to an alternative copy when the one that should be accessed under normal conditions is not available.

Reliability is also achieved by storing multiple copies of the same information, since it is possible to recover from crashes or from the physical destruction of one of the copies by using the other, still available copies. Since physical destruction can be caused by events such as fire, earthquake or sabotage, it is relevant to store the replicated copies in geographically dispersed locations.

3. Workload distribution: Distributing the workload over the sites is an important feature of distributed computer systems. Workload distribution is done in order to take advantage of the computer resources available at each site, and to maximize the degree of parallelism in the execution of applications.

4. Storage cost and availability: Database distribution should reflect the cost and availability of storage at the different sites. Typically, the cost of data storage is not relevant compared with the CPU, I/O and transmission costs of applications, but the limitation of available storage at each site must be considered.

Using all these criteria at the same time is extremely difficult, since this leads to complex optimization models. It is possible to consider some of the above features as constraints, rather than as objectives.

Top-Down and Bottom-Up Approaches to the Design of Data Distribution

There are two alternative approaches to the design of data distribution:

1. The top-down approach, and
2. The bottom-up approach.

In the top-down approach, we start by designing the global schema, and we proceed by designing the fragmentation of the database and then by allocating the fragments to the sites, creating the physical images. The approach is completed by performing, at each site, the 'physical design' of the data which are allocated to it. This approach is the most attractive for systems which are developed from scratch, since it allows the design to be performed rationally.

When the distributed database is developed as the aggregation of existing databases, it is not easy to follow the top-down approach. In this case the global schema is often produced as a compromise between existing data descriptions. It is even possible that each pair of existing databases is independently interfaced using a different translation schema, without the notion of a global schema; this leads to systems which differ in conception from the distributed database architecture.

When existing databases are aggregated, a bottom-up approach to the design of data distribution can be used. This approach is based on the integration of existing schemata into a single global schema. By integration, we mean the merging of common data definitions and the resolution of conflicts among different representations given to the same data.

It should be noted that the bottom-up approach lends itself less easily to the development of horizontally fragmented relations. Horizontal fragments of the same global relation must have the same relation schema; this feature is easily enforced in a top-down design, while it is difficult to discover bottom-up. Since horizontal fragments are a relevant and useful feature of a distributed database, the integration process should attempt to modify the definitions of local relations so that they can be regarded as horizontal fragments of common, global relations.

When existing databases are aggregated into a distributed database, it is possible that they use different DBMSs. A heterogeneous system adds to the complexity of data integration the need for translation between the different representations. In this case it is possible to perform a one-to-one translation between each pair of different DBMSs; however, the approach mostly used in prototypes of heterogeneous systems is to select a common data model and then to translate into this unique representation all the different schemata of the involved DBMSs.

In summary, the bottom-up design of a distributed database requires:

1. The selection of a common data model for describing the global schema of the database.
2. The translation of each local schema into the common data model.
3. The integration of the local schemata into a common global schema.
Concurrency Control
A distributed database (DDB) is a collection of multiple, logically
interrelated databases distributed over a computer network, like LAN,
WAN, MAN, etc.

A distributed database management system (D-DBMS) is the software that manages the DDB and provides an access mechanism that makes this distribution transparent to the users, where transparency is the separation of the higher-level semantics of a system from the lower-level implementation issues.

Concurrency control deals with the situation where several transactions execute concurrently in the database and there is the possibility of conflicts and deadlocks; such situations are handled, and deadlocks prevented, by a concurrency control scheme or protocol.

Examples:

1) Two-phase locking (2PL)
2) Timestamp ordering (TO)

Concurrency control also deals with ensuring transaction atomicity in the presence of the concurrent execution of transactions; this is a synchronization problem, and in distributed systems the synchronization problem is harder than in centralized systems.

Atomicity is the property of a transaction whereby either all the operations of the transaction are reflected properly in the database or none of them is performed.

A transaction is a collection of actions that make consistent transformations of system states while preserving system consistency.

A concurrency control method is responsible for maintaining consistency among the multiple copies of the data items placed in the DDBMS.

2-Phase Locking

Two-phase locking ensures serializability. A schedule is serializable if it produces the same result as some serial execution of the transactions; in a serial execution, each transaction runs to completion before any statements from any other transaction are executed. A lock is a variable associated with each data item, and the manipulation of this variable is called locking.

Two-phase locking is so called because each transaction has two phases:

1) Growing phase: a transaction may obtain locks but may not release any lock.
2) Shrinking phase: a transaction may release locks but may not obtain any new locks.

Both phases are monotonic: the number of locks is only increasing during the first phase and only decreasing during the second. All lock requests by a transaction or its sub-transactions must be made in the growing phase and released in the shrinking phase; when a transaction issues its first unlock instruction, the shrinking phase starts, indicating that all required locks have been obtained. Where data are replicated, all sub-transactions of a transaction which modify a replicated data item must jointly observe the two-phase locking protocol: otherwise one sub-transaction could release a lock while another sub-transaction subsequently requests a lock. This requires each sub-transaction to notify all the other sub-transactions when it has acquired all its locks, and the shrinking phase starts only once all sub-transactions have acquired their locks. Since all sub-transactions must be known to have finished their growing phase, the number of messages involved is high; the exchange of messages between the sites is done using the two-phase commit protocol. A minimal sketch of the two-phase rule is shown below.
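The two-phase rule itself is small enough to sketch; the class below is a hypothetical single-transaction view (a real DDBMS also coordinates the sub-transactions across sites, as described above).

# Minimal two-phase locking sketch: once the first unlock happens
# (shrinking phase), no further lock may be acquired.
class TwoPhaseTransaction:
    def __init__(self):
        self.locks = set()
        self.shrinking = False      # flips to True at the first unlock

    def lock(self, item):
        if self.shrinking:
            raise RuntimeError("2PL violation: lock requested after an unlock")
        self.locks.add(item)        # growing phase

    def unlock(self, item):
        self.shrinking = True       # shrinking phase begins
        self.locks.discard(item)

t = TwoPhaseTransaction()
t.lock("A"); t.lock("B")        # growing phase
t.unlock("A"); t.unlock("B")    # shrinking phase
# t.lock("C")                   # would raise: forbidden by the 2PL rule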

Ex. Consider two transactions T1 and T2 and sites S1 and S2. Suppose a distributed data item A is stored at site S1 and B at site S2; then for the execution of these two transactions, each transaction generates two local sub-transactions, T1s1, T1s2 and T2s1, T2s2 respectively, executed at sites S1 and S2.

Transaction T1          Transaction T2

Lock x (A)              Lock x (A)
A := 100                A := 200
Write (A)               Write (A)
Unlock (A)              Unlock (A)
Lock x (B)              Lock x (B)
B := 1000               B := 2000
Write (B)               Write (B)
Unlock (B)              Unlock (B)

Fig. (a) Two modifying Transactions.

The execution schedule is shown in the following figure.

        Site S1                   Site S2
Time    T1s1        T2s1          T1s2        T2s2
t1      Lock x(A)                             Lock x(B)
t2      A := 100                              B := 2000
t3      Write (A)                             Write (B)
t4      Unlock (A)                            Unlock (B)
t5                  Lock x(A)     Lock x(B)
t6                  A := 200      B := 1000
t7                  Write (A)     Write (B)
t8                  Unlock (A)    Unlock (B)

Fig.(b) A schedule for transactions.

Distributed Deadlocks

A system is in a deadlock state if there exists a set of transactions such that every transaction in the set is waiting for another transaction in the set.

More precisely, consider a set of transactions T0, T1, ..., Tn such that T0 is waiting for a data item that is held by T1, T1 is waiting for a data item that is held by T2, ..., and Tn-1 is waiting for a data item that is held by T0. None of the transactions can proceed in such a situation, which is called a deadlock.

A deadlock occurs in a distributed database system when a cycle is produced, during the execution of transactions, in the Distributed Wait-For Graph (DWFG); the transactions involved wait forever unless some of them are aborted or restarted. Such a situation is known as a distributed deadlock, as shown in the following figure.

A deadlock occurs when each transaction T in a set of two or more transactions is waiting for an item that is locked by some other transaction T' in that set; hence each transaction in the set is waiting for one of the other transactions to release a lock.

A deadlock is determined by a cycle in the wait-for graph, produced when more than one transaction executes operations on the same resources. Detecting a distributed deadlock is more difficult than detecting one in a centralized database.

Fig. (a) A distributed wait-for graph (spanning sites 1 and 2) showing a distributed deadlock; (b) a local wait-for graph.

Labeling: T = transaction, A = agent.

The distributed wait-for graph shown in fig. (a) contains a cycle which corresponds to a distributed deadlock. The figure consists of two sites, two transactions T1 and T2, and two agents A1 and A2 for each transaction.

Assume that each transaction has only one agent at each site where it is executed. A directed edge from an agent TiAj to an agent TrAs means that TiAj is blocked and waiting for TrAs.

There are the following two reasons for an agent to wait for another one:

1. Agent TiAj waits for agent TrAj to release a resource which it needs. Here Ti and Tr are different transactions, and the two agents are at the same site, because agents request only local resources. Such a wait is shown in fig. (a), where T1A1 waits for T2A1 to release a resource at site 1.

2. Agent TiAj waits for agent TiAs to perform some required function. Here the two agents belong to the same transaction: agent TiAj has requested that agent TiAs perform some function at a different site, as shown in fig. (a) with dashed lines.

Fig. (b) shows a local wait-for graph (LWFG). The square nodes are called input ports if they have an edge entering the local wait-for graph and output ports if they have an edge exiting the LWFG. A deadlock within one site appears as a cycle in the LWFG.

In a distributed system, a deadlock occurs when a cycle is produced in the DWFG; it is detected when, through an exchange of information between the sites, the cycle is discovered. This is a different task from detecting local deadlocks. Deadlock resolution requires one or more transactions to be aborted or restarted, so that their resources are released for the other transactions.

Following are the criteria for aborting the transactions.

a) Abort the youngest transaction.

b) Abort the transaction which owns fewer resources

c) Abort the transaction with the smallest abort cost.

d) Abort the transaction with the longest expected time to complete.

Redundancy increases the probability of deadlock in a distributed DBMS. Consider two transactions T1 and T2, both of which must exclusively lock the same data item x. If x is replicated, i.e. if x1 and x2 are two copies at sites 1 and 2, and both transactions try to obtain their locks, then a deadlock can occur:

At site 1: T1 locks x1; T2 waits.
At site 2: T2 locks x2; T1 waits.

Hence, a deadlock occurs.

Methods of deadlock detection

1) Deadlock detection using centralized or hierarchical control.
2) Distributed deadlock detection.
3) Deadlock prevention.

1) Deadlock detection using centralized or hierarchical control
In the centralized control method, a single deadlock detector has the responsibility of discovering cycles in the DWFG. The centralized deadlock detector receives information from all the sites of the distributed database; in addition, a local deadlock detector at each site has the responsibility of finding all potential deadlock cycles at that particular site.

Fig. (a) Local wait-for graph at site 1; (b) potential global deadlock cycle derived at site 1.

The above figure shows the derivation of a potential global deadlock cycle from a local wait-for graph.

To determine potential deadlock cycles, the local deadlock detector starts from an input port and searches backward along the local graph until it reaches an output port; such a path is a potential deadlock cycle. Only the initial and final agents of each potential cycle must be transmitted, as shown in fig. (b).

The centralized deadlock detector collects these messages, builds the DWFG, finds cycles, and selects the transactions to abort.

Drawbacks of centralized deadlock detection

1) It is vulnerable to failures of the site where the centralized detector runs.
2) It requires a large communication cost.

To reduce communication costs, hierarchical controllers are used.

Hierarchical controllers

Fig. A tree of deadlock detectors: a root non-local detector NLDD0 controls NLDD1 (covering sites 1 and 2) and NLDD2 (covering sites 3, 4 and 5); each site also runs a local deadlock detector.

Labeling:
NLDD = non-local deadlock detector.
LDD = local deadlock detector.

The local deadlock detectors determine local deadlocks and transmit information about the potential global cycles to the non-local deadlock detectors. The performance of the hierarchical deadlock detection mechanism depends on the choice of the hierarchy. Here, sites 1 and 2 have a common detector NLDD1, while sites 3, 4 and 5 have a common detector NLDD2; finally, both of them are controlled by NLDD0.

Distributed Deadlock Detection

Here there is no distinction between local and non-local deadlock detectors: each site has the same responsibility, and the sites exchange information in order to determine global deadlocks.

The potential deadlock cycles detected at the local sites are transmitted according to an algorithm. In the LWFG, all the input and output ports are collected into a single node called the external node (EX). The main difference between centralized and distributed deadlock detection is that in the centralized case all the potential deadlock cycles are sent to one designated site, while distributed deadlock detectors need a rule for determining to which site the potential deadlock cycles are transmitted; this rule must attempt to minimize the amount of transmitted information.

The following figure illustrates the distributed deadlock detection algorithm.

Fig. Distributed deadlock detection: (a), (b) and (c) show the local wait-for graphs at sites 1 and 2, with transactions T1 and T2 and the external node EX, as potential deadlock cycles are exchanged.

A local deadlock detector performs the following actions at each site:

1) It forms the LWFG, including the EX node.
2) On receiving a message, it performs the following modifications of the LWFG:
a) for each transaction in the message, it adds the transaction to the LWFG if it does not already exist;
b) for each transaction in the message, starting with EX, it creates an edge to the next transaction in the message.
3) Cycles without the EX node in the LWFG indicate the existence of a deadlock.
4) Cycles with the EX node are potential deadlock cycles, which must be transmitted to a different site. A sketch of this local step is given below.
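The local step of this algorithm can be sketched in Python: build the LWFG with the EX node, then separate true local cycles from potential distributed ones. The graph below is a hypothetical example.

# Cycles avoiding EX are local deadlocks; cycles through EX are potential
# distributed deadlock cycles, to be transmitted to another site.
def find_cycles(graph, start):
    cycles, stack = [], [(start, [start])]
    while stack:
        node, path = stack.pop()
        for nxt in graph.get(node, []):
            if nxt == start:
                cycles.append(path + [start])
            elif nxt not in path:
                stack.append((nxt, path + [nxt]))
    return cycles

# T1 waits for T2 locally; T2 waits for a remote agent (edge to EX);
# some remote agent waits for T1 (edge from EX).
lwfg = {"T1": ["T2"], "T2": ["EX"], "EX": ["T1"]}

local = [c for t in lwfg if t != "EX"
         for c in find_cycles(lwfg, t) if "EX" not in c]
potential = find_cycles(lwfg, "EX")
print(local)       # []: no purely local deadlock
print(potential)   # [['EX', 'T1', 'T2', 'EX']]: transmit this cycle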

3) Distributed deadlock prevention

Deadlock prevention eliminates the need for deadlock detection and resolution. It works as follows: if a transaction T1 requests a resource which is held by T2, a 'prevention test' is applied. If this test indicates a risk of deadlock, T1 is not allowed to enter a wait state; instead, either T1 is aborted and restarted, or T2 is aborted and restarted.

The prevention test must ensure that, if T1 is allowed to wait for T2, then deadlock can never occur. This can be obtained by ordering the transactions, for instance lexicographically: let Ti be allowed to wait for Tj only if i < j. Then it is impossible to build a closed chain

i1 -> i2 -> ... -> in -> i1,

because each arc ij -> ik in the chain would require ij < ik.

Following are the methods of deadlock prevention, based on transaction timestamps (a smaller timestamp means an older transaction); see the sketch after this list.

1) Non-preemptive method:

In the non-preemptive method, a younger transaction which requests a resource held by an older transaction is not allowed to wait for it; it is aborted and restarted instead. An older transaction is allowed to wait for a younger one.

2) Preemptive method:

In the preemptive method, an older transaction preempts a younger transaction which holds the resource it needs; a younger transaction, instead, is allowed to wait for an older one.
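These two methods correspond to the standard wait-die (non-preemptive) and wound-wait (preemptive) schemes; a minimal sketch of both prevention tests follows, with a smaller timestamp meaning an older transaction.

# Wait-die: an older requester may wait; a younger requester is aborted
# ("dies") and restarted later with its original timestamp.
def wait_die(requester_ts, holder_ts):
    return "wait" if requester_ts < holder_ts else "abort requester"

# Wound-wait: an older requester preempts ("wounds") the younger holder;
# a younger requester is allowed to wait.
def wound_wait(requester_ts, holder_ts):
    return "abort holder" if requester_ts < holder_ts else "wait"

print(wait_die(1, 2), wait_die(2, 1))       # wait, abort requester
print(wound_wait(1, 2), wound_wait(2, 1))   # abort holder, wait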

Reliability

Reliability is a measure of the success with which a system conforms to some authoritative specification of its behavior. A deviation from this specified behavior is known as a failure. The reliability of a system is inversely related to the frequency of failures; reliability means the consistently good quality delivered by the DDBMS.

The reliability of a system is measured in several ways, based on the incidence of failures. These measures include the following:

1) Mean time between failures (MTBF)
2) Mean time to repair (MTTR)
3) Availability (the fraction of time that the system meets its specification)
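These measures are related by the standard steady-state formula, availability = MTBF / (MTBF + MTTR); for example:

# A system failing on average every 1000 hours and taking 10 hours
# to repair is available about 99% of the time.
def availability(mtbf, mttr):
    return mtbf / (mtbf + mttr)

print(availability(1000, 10))   # 0.9900990099009901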

In a database system application, the reliability problem is divided into two parts:

1) Application dependent: it requires that the transactions fulfill the general system specification.

2) Application independent: it requires that the transactions maintain their atomicity, durability, serializability and isolation properties.

Types of Failures
There are the following types of failures in a distributed system:

1. Transaction failures
- Transaction aborts (unilaterally or due to deadlock).
- On average, about 3% of transactions abort abnormally.

2. System (site) failures
- Failure of processor, main memory, power supply, etc.
- Main memory contents are lost, but secondary storage contents are safe.

3. Media failures
- Failure of secondary storage devices such that the stored data is lost.
- Head crash or controller failure.

4. Communication failures
- Lost or undeliverable messages.
- Network partitioning.
Components of Distributed Reliability Protocols

Commit protocols:
- How to commit a transaction properly when more than one site is involved in the commitment; this is different from the centralized DB case.
- How to execute the commit command for distributed transactions.
- Issue: how to ensure atomicity and durability?

Termination protocols:
- Designed for the sites that are affected by a failed site; they tell a site how to commit or abort properly when another site fails.
- If a failure occurs, how can the remaining operational sites deal with it?
- Non-blocking: the occurrence of failures should not force the operational sites to wait until the failure is repaired in order to terminate the transaction.

Recovery protocols:
- Address the recovery procedure for a failed site once it restarts; they are the counterpart of the termination protocols.
- When a failure occurs, how do the sites where the failure occurred deal with it?
- Independent: a failed site can determine the outcome of a transaction without having to obtain remote information.

Non-blocking commitment protocols

A commitment protocol is called blocking if the occurrence of failures forces some of the participating sites to wait until the failure is repaired before terminating the transaction; a transaction which is not yet terminated at a site is called pending at that site.

The basic 2-phase-commitment protocol ensures that transactions are correctly committed or aborted at all sites even in the presence of failures. However, in some cases transactions remain blocked until the failure is repaired, which ties up resources and reduces the system's availability.

The following figure shows the state diagram of the 2-phase-commitment protocol without ACK messages. The input and output messages are indicated for each transition; a transition occurs when an input message arrives, and it sends the output message. The state information of the transaction is recorded in stable storage for recovery purposes.

[Figure: state diagram for the 2-phase-commit protocol, showing the coordinator and participant states connected by input/output message pairs.]

States:
I = initial state
U = uncertain (waiting for commit/abort information)
R = ready to commit
A = abort (transaction)
C = commit (transaction)

Messages (exchanged between coordinator and participants):
PM = prepare message
RM = ready answer message
AAM = abort answer message
ACM = abort command message
CM = commit command message

Local conditions:
ua = local unilateral abort
tm = timeout (causes a unilateral transition)

Fig. state diagram for the 2-phase-commit protocol

The problems due to 2-phase-commit are as follows:

• Blocking
  o Ready implies that the participant waits for the coordinator.
  o If the coordinator fails, the site is blocked until recovery.
  o Blocking reduces availability.
• Independent recovery is not possible.

These problems are solved by the 3-phase protocol.

3-Phase Protocol:

It is a non-blocking protocol: it extends the 2-phase protocol and solves the problems caused by site failures in the DDBMS.

When a failure occurs at the coordinator site, there are two possible cases:

1) At least one of the participants has received the command message; this participant can tell the other participants how to terminate the transaction.

2) None of the participants has received the command, and only the coordinator site has crashed; all the participant sites are operational and can elect another coordinator site.

In both of the above cases the transaction is terminated correctly. However, consider the situation where none of the operational participants has received any command and one participant fails together with the coordinator: the failed participant might already have received the command and acted on it, so for the operational sites termination becomes impossible.

The blocking arises because, in the second phase of 2-phase commitment, a participant moves from the R (ready) state directly to A or C. Therefore, even when none of the operational participants has received the command, a failed participant may already have performed the abort or commit action, and the operational participants cannot safely guess which. This problem is eliminated by modifying the 2-phase protocol into the 3-phase protocol, shown in the following figure.

[Figure: state diagram for the 3-phase-commit protocol, extending the 2-phase diagram with a prepared-to-commit state between ready and commit; restart transitions are possible from the new states.]

New states:
PC = prepared to commit (participant)
BC = before commitment (coordinator)

New messages:
PCM = enter-the-PC-state message
OK = entered-the-PC-state message

Fig. state diagram for the 3-phase-commit protocol

The 3-phase protocol consists of three phases; it is an extension of the 2-phase protocol which avoids the blocking problem. The additional phase ensures that the decision to commit is known at multiple sites before any site actually commits.

The figure above contains the new prepared-to-commit (PC) state, through which a participant must pass before committing the transaction. In the second phase the coordinator issues either an ACM (abort command) or a PCM message. When the PCM messages are sent, the coordinator enters the new before-commitment (BC) state. Each participant which receives a PCM enters the PC state, records it in stable storage, and sends an OK message. Finally, when the coordinator has received the OK messages, it enters the final commit state (C) and sends the final commit command (CM). The intermediate PC state is what eliminates the blocking problem.
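
The message flow can be summarized by the following minimal Python sketch of the coordinator's side only. It is an in-memory simulation under simplifying assumptions: networking, timeouts and stable-storage logging are reduced to comments, and participants are plain dictionaries rather than remote sites.

    # Minimal, illustrative sketch of the 3-phase-commit message flow
    # (not a real implementation; messaging and logging are simplified).

    def participant(vote):
        """A participant as a dict holding its vote and current state."""
        return {"vote": vote, "state": "I"}          # I = initial state

    def coordinator_3pc(participants):
        # Phase 1: send PM (prepare message), collect RM / AAM answers.
        answers = []
        for p in participants:
            p["state"] = "R" if p["vote"] == "ready" else "A"
            answers.append("RM" if p["vote"] == "ready" else "AAM")
        if "AAM" in answers:
            for p in participants:
                p["state"] = "A"                     # ACM: abort command message
            return "aborted"
        # Phase 2: send PCM; each participant enters the PC state, records it
        # in stable storage, and replies OK. The coordinator is in BC.
        for p in participants:
            p["state"] = "PC"
        # Phase 3: on receiving all OK messages, send the final CM command.
        for p in participants:
            p["state"] = "C"
        return "committed"

    print(coordinator_3pc([participant("ready"), participant("ready")]))  # committed
    print(coordinator_3pc([participant("ready"), participant("abort")]))  # aborted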

Network Partition
When network partitions are created, the following two possibilities arise:

1) The coordinator and all its participants remain in one partition; here the failure has no effect on the commit protocol.

2) The coordinator and its participants belong to several partitions; from within one partition, the sites in the other partitions appear to have failed. Sites that are not in the partition containing the coordinator simply execute the protocol that deals with the failure of the coordinator. The coordinator and the sites that are in the same partition as the coordinator follow the usual commit protocol, assuming that the sites in the other partitions have failed.
DEADLOCK HANDLING

A system is said to be in a deadlock state if there exists a set of transactions such that every transaction in the set is waiting for another transaction in the set.

Suppose there exists a set of waiting transactions {T0, T1, …, Tn} such that T0 is waiting for a data item that T1 holds, T1 is waiting for a data item that T2 holds, …, Tn-1 is waiting for a data item that Tn holds, and Tn is waiting for a data item that T0 holds. None of the transactions can make progress in such a situation.

The only remedy to this undesirable situation is for the system to invoke some drastic action, such as rolling back some of the transactions involved in the deadlock. Rollback of a transaction may be partial: i.e., a transaction may be rolled back only to the point where it obtained a lock whose release resolves the deadlock.

There are two principal methods for dealing with the problem of deadlock:

1) We can use a DEADLOCK PREVENTION protocol to ensure that the system will never enter a deadlock state.

2) We can allow the system to enter a deadlock state, and then try to recover by using a deadlock detection and recovery scheme.

Both methods may result in transaction rollback. Prevention is commonly used if the probability that the system would enter a deadlock state is relatively high; otherwise, detection and recovery are more efficient.

DEADLOCK PREVENTION

There are two approaches to deadlock prevention:

One approach ensures that no cyclic waits can occur, by ordering the requests for locks or by requiring all locks to be acquired together.
The other approach is closer to deadlock recovery: it performs transaction rollback instead of waiting for a lock whenever the wait could potentially result in a deadlock.

The simplest scheme under the first approach requires that each transaction lock all its data items before it begins execution. Moreover, either all are locked in one step or none are locked. There are two main disadvantages of this protocol:

1. It is often hard to predict, before the transaction begins, what data items need to be locked;

2. Data-item utilization may be very low, since many of the data items may be locked but unused for a long time.

Another scheme under the first approach is to impose an ordering on all data items and to require that a transaction lock data items only in a sequence consistent with that ordering.

The second approach for preventing deadlock is to use PREEMPTION and TRANSACTION ROLLBACKS.

In preemption, when a transaction T2 requests a lock that transaction T1 holds, the lock granted to T1 may be preempted by rolling back T1 and granting the lock to T2. To control the preemption, we assign a unique timestamp to each transaction. The system uses these timestamps only to decide whether a transaction should wait or be rolled back; if a transaction is rolled back, it retains its old timestamp when restarted. Two different deadlock-prevention schemes using timestamps have been proposed:

1) Wait-die Scheme: this is a non-preemptive technique. When transaction Ti requests a data item currently held by Tj, Ti is allowed to wait only if it has a timestamp smaller than that of Tj (i.e., Ti is older than Tj). Otherwise, Ti is rolled back (dies).

For example, suppose that transactions T22, T23, and T24 have timestamps 5, 10, and 15 respectively. If T22 requests a data item held by T23, then T22 will wait. If T24 requests a data item held by T23, then T24 will be rolled back.

2) Wound-wait Scheme: this scheme is a preemptive technique. It is a counterpart to the wait-die scheme. When transaction Ti requests a data item currently held by Tj, Ti is allowed to wait only if it has a timestamp larger than that of Tj (i.e., Ti is younger than Tj). Otherwise, Tj is rolled back (Tj is wounded by Ti).

Returning to the example with transactions T22, T23, and T24: if T22 requests a data item held by T23, then the data item will be preempted from T23, and T23 will be rolled back. If T24 requests a data item held by T23, then T24 will wait.

Whenever the system rolls back transactions, it is important to ensure that there is no STARVATION, i.e., that no transaction gets rolled back repeatedly and is never allowed to make progress.

Both the wound-wait and the wait-die schemes avoid starvation: at any time, there is a transaction with the smallest timestamp, and this transaction cannot be required to roll back in either scheme. Since timestamps always increase, and since transactions are not assigned new timestamps when they are rolled back, a transaction that is rolled back repeatedly will eventually have the smallest timestamp, at which point it will not be rolled back again.

There are some differences in the way the two schemes operate:

1) In the wait-die scheme, an older transaction must wait for a younger one to release its data items. Thus, the older the transaction gets, the more it tends to wait. In the wound-wait scheme, on the other hand, an older transaction never waits for a younger transaction.

2) In the wait-die scheme, if a transaction Ti dies and is rolled back because it requested a data item held by transaction Tj, then Ti may reissue the same sequence of requests when it is restarted. If the data item is still held by Tj, then Ti will die again. Thus, Ti may die several times before acquiring the needed data item.

In the wound-wait scheme, on the other hand, transaction Ti is wounded and rolled back because Tj requested a data item that it held. When Ti is restarted and requests the data item now being held by Tj, Ti waits. Thus, there may be fewer rollbacks in the wound-wait scheme.
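
The two schemes reduce to a single decision rule. The following minimal Python sketch assumes that a smaller timestamp means an older transaction; it shows only the wait/rollback decision, not lock management itself:

    # Illustrative decision rules for wait-die and wound-wait
    # (smaller timestamp = older transaction).

    def wait_die(ts_requester, ts_holder):
        # Non-preemptive: an older requester may wait; a younger one dies.
        return "wait" if ts_requester < ts_holder else "rollback requester"

    def wound_wait(ts_requester, ts_holder):
        # Preemptive: an older requester wounds (rolls back) the holder;
        # a younger requester waits.
        return "rollback holder" if ts_requester < ts_holder else "wait"

    # T22, T23, T24 with timestamps 5, 10, 15 (the example above):
    print(wait_die(5, 10))     # T22 requests item held by T23 -> wait
    print(wait_die(15, 10))    # T24 requests item held by T23 -> rollback requester
    print(wound_wait(5, 10))   # T22 requests item held by T23 -> rollback holder
    print(wound_wait(15, 10))  # T24 requests item held by T23 -> wait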
DEADLOCK DETECTION & RECOVERY

If the system does not employ some protocol that ensures deadlock
freedom, then a detection and recovery scheme must be used. An algorithm
that examines the state of the system is invoked periodically to determine
whether a deadlock has occurred. If one has, then the system must attempt
to recover from the deadlock. To do so, the system must:

1) Maintain information about the current allocation of data items to transactions, as well as any outstanding data item requests.

2) Provide an algorithm that uses this information to determine whether the system has entered a deadlock state.

3) Recover from the deadlock when the detection algorithm determines that a deadlock exists.
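
A common way to implement step 2 is to maintain a wait-for graph and test it for cycles: a cycle means a deadlock. A minimal Python sketch, with an assumed graph purely for illustration:

    # Deadlock detection sketch: a cycle in the wait-for graph means deadlock.
    def has_cycle(wait_for):
        """wait_for maps each transaction to the set of transactions it waits for."""
        visited, on_stack = set(), set()

        def dfs(t):
            visited.add(t)
            on_stack.add(t)
            for u in wait_for.get(t, ()):
                if u in on_stack or (u not in visited and dfs(u)):
                    return True
            on_stack.discard(t)
            return False

        return any(t not in visited and dfs(t) for t in wait_for)

    # T1 waits for T2, T2 waits for T3, T3 waits for T1: a deadlock.
    print(has_cycle({"T1": {"T2"}, "T2": {"T3"}, "T3": {"T1"}}))   # True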

NODES RECOVERY

At each site (node) of a distributed database, a local transaction manager is available, and it can use the following techniques for local recovery.

LOGS

The most widely used structure for recording database modifications is the log. The log is a sequence of log records, recording all the update activities in the database. A log contains the information needed for undoing or redoing all actions performed by transactions. To undo the actions of a transaction means to reconstruct the database as it was prior to the transaction's execution. To redo the actions of a transaction means to perform its actions again.

A log record contains the information required for undoing or redoing actions. Whenever a transaction performs an action on the database, a log record is written in the log file. Moreover, when a transaction is started, committed or aborted, a begin_transaction, commit or abort record is written in the log.
The writing of a database update and the writing of the corresponding log record are two distinct operations; therefore, it is possible that a failure occurs between them. If the database update were performed before writing the log record, the recovery procedure would be unable to undo the update: the corresponding log record would not be available. In order to avoid this problem, the log write-ahead protocol is used, which consists of two basic rules:

1. Before performing a database update, at least the undo portion of the corresponding log record must already have been recorded on stable storage.

2. Before committing a transaction, all log records of the transaction must already have been recorded on stable storage.
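
The essential point of the two rules is the ordering of writes. The following small Python sketch is purely illustrative; a list stands in for the log on stable storage and a dictionary stands in for the database:

    # Write-ahead logging sketch: the undo information is forced to the log
    # (stable storage) BEFORE the database page is updated.
    log = []             # stands in for the log on stable storage
    database = {"x": 1}  # stands in for the database

    def update(key, new_value, tid):
        old_value = database[key]
        # Rule 1: record the undo (and redo) information first.
        log.append((tid, key, old_value, new_value))
        # Only then perform the database update.
        database[key] = new_value

    def commit(tid):
        # Rule 2: all log records of the transaction precede the commit record.
        log.append((tid, "COMMIT"))

    update("x", 42, "T1")
    commit("T1")
    print(log)   # the undo/redo record precedes the commit record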

RECOVERY PROCEDURES

When a failure with loss of volatile storage occurs, a recovery procedure reads the log file and performs the following operations:

1. Determine all non-committed transactions that have to be undone. Non-committed transactions are recognized because they have a begin_transaction record in the log file without having a commit or abort record.

2. Determine all transactions which need to be redone. In principle, this set includes all transactions which have a commit record in the log file. In practice, most of their updates were safely written to stable storage before the failure, hence they do not need to be redone. In order to distinguish the transactions which need to be redone from those which do not, checkpoints are used.

3. Undo the transactions determined at step 1 and redo the transactions determined at step 2.

Checkpoints are operations which are performed periodically (typically, every 5 minutes) in order to simplify the first two steps of the recovery procedure. Performing a checkpoint requires the following operations:

a) Writing to stable storage all log records and all database updates which are still in volatile storage.

b) Writing to stable storage a checkpoint record. A checkpoint record in the log contains the indication of the transactions which are active at the time when the checkpoint is taken.

The existence of checkpoints facilitates the recovery procedure. Steps 1 and 2 of the recovery procedure are substituted by the following: the log is scanned backwards only as far as the most recent checkpoint record; the transactions listed in that record, together with those started after it, are the only candidates for undo or redo. Among them, the transactions without a commit record are undone, and the transactions with a commit record are redone.
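
A minimal Python sketch of this simplified scan, under the assumption that the log is a list of records and that the checkpoint record lists the transactions active at checkpoint time (the log format here is invented for illustration):

    # Recovery sketch: scan from the last checkpoint, then classify
    # transactions into undo and redo sets. The log format is assumed.
    def recover(log):
        # The last checkpoint record lists the then-active transactions.
        cp_index = max(i for i, r in enumerate(log) if r[0] == "CHECKPOINT")
        candidates = set(log[cp_index][1])           # active at checkpoint time
        committed = set()
        for record in log[cp_index:]:
            if record[0] == "BEGIN":
                candidates.add(record[1])
            elif record[0] == "COMMIT":
                committed.add(record[1])
        redo = candidates & committed
        undo = candidates - committed
        return undo, redo

    log = [("BEGIN", "T1"), ("CHECKPOINT", ["T1"]),
           ("BEGIN", "T2"), ("COMMIT", "T1")]
    print(recover(log))   # undo {'T2'}, redo {'T1'}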
The types of failures which may occur in a database system are classified as follows:

1) Transaction failure:

a) Logical error:

These failures include, for example, the abort of a transaction due to an error condition which is discovered inside the transaction, like an arithmetic overflow or a division by zero.

b) System error:

The system has entered an undesirable state (for example, deadlock), as a result of which a transaction cannot continue with its normal execution. The transaction, however, can be re-executed at a later time.

2) System crash (Failure with loss of volatile storage):

In these failures, the content of main memory is lost. A hardware malfunction, or a bug in the database software or the operating system, causes the loss of the content of volatile storage and brings transaction processing to a halt. However, the information which is recorded on disks is not affected by the failure.

The assumption that hardware errors and bugs in the software bring the system to a halt, but do not corrupt the non-volatile storage contents, is known as the Fail-Stop assumption. Well-designed systems have numerous internal checks at the hardware and software level that bring the system to a halt whenever there is an error. Hence, the fail-stop assumption is a reasonable one.

3) Disk Failure (Failure with loss of non-volatile storage):

In these failures, the content of disk storage is lost. They are also called media failures. A disk block loses its content as a result of either a head crash or a failure during a data transfer operation. Copies of the data on other media, such as backup disks or tapes, are used to recover from the failure.

The probability of failures of the third type is less than that of the other two types. It is further reduced by using the stable storage technique.

The stable storage technique is typically implemented by replicating the same information on several disks with independent failure modes, and by using the so-called careful replacement strategy: at every update operation, first one copy of the information is updated, then the correctness of the update is verified, and finally the second copy is updated.

With stable storage, we introduce a new type of failure:

4) Failure with loss of stable storage:

Information stored in stable storage is lost because of several simultaneous failures of the third type. Again, careful replacement strategies reduce this probability, but we cannot reduce it to zero.

5) Communication or Link Failure:

When a message is sent from site x to site y, we require a communication network. There is a great variety of possible failures during the process of message transmission. For example, the message might be corrupted, the messages might arrive out of order, x might not receive an acknowledgement although the message was delivered, x might receive an acknowledgement without the message being delivered, and so on.

With modern communication networks, which are capable of routing messages, the following assumption about the network is also reasonable: if site x cannot communicate with site y but can communicate with site z, then site z cannot communicate with y either. In this case, when two operational sites x and y cannot communicate, this means that no communication path is available between them and the network is partitioned into two or more completely disconnected subnetworks, one including x and the other one including y. All the operational sites which belong to the same subnetwork can communicate with each other.

Therefore, the two basic types of possible communication errors are: lost messages and partitions.

Dealing with network partitions is a harder problem than dealing with site crashes or lost messages. Fortunately, in many computer networks, partitions occur much less frequently than site crashes. If communication failures are not dealt with properly, the system behaves in an unpredictable, erroneous way.

MAJORITY PROTOCOLS
The majority protocol works as follows:

If data item Q is replicated at different sites, then a lock-request message must be sent to more than one-half of the sites at which Q is stored. Each lock manager determines whether the lock can be granted immediately; if not, the response is delayed until the request can be granted. The transaction does not operate on Q until it has successfully obtained locks on a majority of the replicas of Q.

The above scheme has the following disadvantages:

1) Implementation: The majority protocol is more complicated to implement. It requires 2(n/2 + 1) messages for handling lock requests and (n/2 + 1) messages for handling unlock requests.

2) Deadlock Handling: In addition to the problem of global deadlocks due to the use of a distributed lock-manager approach, it is possible for a deadlock to occur even if only one data item is being locked.

For example: suppose that transactions T1 and T2 both wish to lock data item Q in exclusive mode, and that Q is replicated at four sites. Transaction T1 may succeed in locking Q at sites S1 and S3, while transaction T2 may succeed in locking Q at sites S2 and S4. Each then must wait to acquire the third lock; hence, a deadlock has occurred.
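
The majority rule itself is simple to state in code. A small Python sketch (the per-site lock managers are simulated by a set of granting sites; in a real system each request is a network message):

    # Majority protocol sketch: a lock on Q is obtained only when more than
    # half of the sites holding a replica of Q grant the request.
    def majority_lock(sites_with_replica, grants):
        """grants: the set of sites that granted the lock request."""
        n = len(sites_with_replica)
        return len(grants) > n // 2     # strictly more than half

    sites = ["S1", "S2", "S3", "S4", "S5"]
    print(majority_lock(sites, {"S1", "S2", "S3"}))   # True  (3 of 5)
    print(majority_lock(sites, {"S1", "S2"}))         # False (2 of 5)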

BIASED PROTOCOL

This is another approach to handling replication. The difference from the majority protocol is that requests for shared locks are given more favorable treatment than requests for exclusive locks:

1) Shared locks: when a transaction needs to lock data item Q, it simply requests a lock on Q from the lock manager at one site that contains a replica of Q.

2) Exclusive locks: when a transaction needs to lock data item Q, it requests a lock on Q from the lock managers at all sites that contain a replica of Q.

The biased scheme has the advantage of imposing less overhead on read operations than does the majority protocol. This saving is significant in the common case in which the frequency of reads is much greater than the frequency of writes.
QUORUM CONSENSUS PROTOCOL

This protocol is a generalization of the majority protocol. It assigns each site a nonnegative weight, and it assigns read and write operations on an item two integers, called the read quorum Qr and the write quorum Qw, that must satisfy the following condition, where S is the total weight of all sites at which the item resides:

Qr + Qw > S and 2 * Qw > S

To execute a read operation, enough replicas must be read that their total weight is >= Qr. To execute a write operation, enough replicas must be written that their total weight is >= Qw.

The advantage of this approach is that it can permit the cost of either reads or writes to be selectively reduced by appropriately defining the read and write quorums.

For example: with a small read quorum, reads need to access fewer replicas, but the write quorum will be higher; hence writes can succeed only if correspondingly more replicas are available. Also, if higher weights are given to some sites (for example, those which are less likely to fail), fewer sites need to be accessed for acquiring the lock.
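
The quorum conditions can be checked directly. A small Python sketch with assumed site weights (the values are illustrative, not from any particular configuration):

    # Quorum consensus sketch: read/write quorums over weighted sites.
    def valid_quorums(weights, q_read, q_write):
        s = sum(weights.values())
        # Qr + Qw > S prevents a read and a write from using disjoint
        # replica sets; 2 * Qw > S prevents two concurrent writes.
        return (q_read + q_write > s) and (2 * q_write > s)

    weights = {"S1": 1, "S2": 1, "S3": 2}     # assumed site weights, S = 4
    print(valid_quorums(weights, 2, 3))       # True:  2 + 3 > 4 and 2*3 > 4
    print(valid_quorums(weights, 1, 2))       # False: 1 + 2 is not > 4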

TIMESTAMPING

The basic idea behind this scheme is that each transaction is given a unique timestamp that the system uses in deciding the serialization order. Our first task, in generalizing the centralized scheme to a distributed one, is therefore to develop a scheme for generating unique timestamps across sites; a common approach is to combine a unique local timestamp with the site identifier.

In a homogeneous distributed database, all sites run the same DBMS software and surrender part of their autonomy in terms of their right to change schemas or DBMS software. That software must also cooperate with the other sites in exchanging information about transactions, to make transaction processing possible across multiple sites. In contrast, in a heterogeneous distributed database, different sites may use different schemas and different DBMS software. The sites may not be aware of one another, and they may provide only limited facilities for cooperation in transaction processing; the divergence in schemas and software makes it difficult to process transactions that access multiple sites.
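
A minimal sketch of the counter-plus-site-identifier approach in Python; the class and method names are hypothetical, introduced only for illustration:

    # Globally unique timestamps: (local counter, site id). The site id
    # breaks ties, so no two sites ever generate the same timestamp.
    import itertools

    class TimestampGenerator:
        def __init__(self, site_id):
            self.site_id = site_id
            self.counter = itertools.count(1)   # local logical counter

        def next_ts(self):
            return (next(self.counter), self.site_id)

    g1, g2 = TimestampGenerator("S1"), TimestampGenerator("S2")
    print(g1.next_ts(), g2.next_ts())   # (1, 'S1') (1, 'S2'): distinct, ordered

Because Python tuples compare lexicographically, these timestamps are totally ordered, which is what the serialization order requires.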
Check Point And Cold Restart

Check Point: A checkpoint is used to reduce the number of log records that the system must scan when it recovers from a crash.

Cold Restart: A cold restart is required if the log information is lost at a site of the network and the affected site is not capable of reconstructing its most recent state; the previous state marked by a checkpoint is reconstructed instead. This results in the loss of some local subtransactions.

Cold restart is required after some catastrophic failure which has caused the loss of log information on stable storage, so that the current consistent state of the database cannot be reconstructed and a previous consistent state must be restored. A previous consistent state is marked by a checkpoint.

In a distributed database, the problem of cold restart is worse than in a centralized one, because if one site has to establish an earlier state, then all the other sites also have to establish earlier states which are consistent with it.

A consistent global state C is characterized by the following two properties:

1) For each transaction T, C contains the updates performed by all the subtransactions of T at all sites, or it does not contain any of them; in the former case, we say that T is contained in C.

2) If transaction T is contained in C, then all conflicting transactions which precede T in the serialization order are also contained in C.

Property 1 is related to the atomicity of transactions: either all the effects of T or none of them can appear in a consistent state.

Property 2 is related to serializability: if the effects of T are kept, then we must also keep all the effects of the transactions on which T depends.

A way to restore a globally consistent state in a distributed database is to use local dumps, local logs and global checkpoints. A global checkpoint is a set of local checkpoints which are performed at all sites of the network and are synchronized by the following condition:

If a subtransaction of a transaction T is contained in the local checkpoint at some site, then all the other subtransactions of T must be contained in the local checkpoints at their sites.

If global checkpoints are available, it is easy to reconstruct a globally consistent state: first the recovering site reestablishes its latest local checkpoint, thereby determining which earlier global state has to be reconstructed; then all the other sites are requested to reestablish the local states of the corresponding local checkpoints.

[Figure: timing diagram over three sites. T2 and T3 are subtransactions of transaction T; T3's site is the coordinator for the 2-phase commitment. C1, C2 and C3 are local checkpoints; the READY message of the 2-phase-commitment protocol is exchanged between sites 2 and 3, crossing the checkpoint line. In the scenario shown, C2 does not contain T2 while C3 contains T3.]

The main problem of global checkpoint synchronization is shown in the above figure, which contains two subtransactions T2 and T3 of the same transaction T and local checkpoints C1, C2 and C3, where C2 does not contain subtransaction T2 while C3 contains subtransaction T3. This violates the basic requirement on global checkpoints. Note that the transaction T performed a 2-phase commitment, which does not by itself eliminate this synchronization problem.

Therefore, to avoid such problems, it is required that all the sites become inactive simultaneously before each one records its local checkpoint.

Management Of Distributed Transactions

A transaction is a collection of actions that make consistent transformations of system states while preserving system consistency.

Example of a transaction:

Begin_transaction Reservation
begin
    input(flight_no, date, customer_name);
    EXEC SQL UPDATE FLIGHT
        SET STSOLD = STSOLD + 1
        WHERE FNO = flight_no AND DATE = date;
    EXEC SQL INSERT
        INTO FC(FNO, DATE, CNAME, SPECIAL)
        VALUES (flight_no, date, customer_name, null);
    output("reservation completed")
end. {Reservation}

The management of distributed transactions requires dealing with several problems which are strictly interconnected, like reliability, concurrency control, and the efficient utilization of the resources of the whole system.

In this section we consider only the most well-known techniques, used in building most commercial systems and research prototypes: 2-phase commitment for recovery, and 2-phase locking for concurrency control.

Framework for Transaction Management

In this section, we define the properties of transactions and state the goals of distributed transaction management.

Properties of Transactions
1. Atomicity: either all or none of the transaction's operations are performed. Atomicity requires that if a transaction is interrupted by a failure, its partial results are undone.

There are two typical reasons why a transaction is not completed: transaction aborts and system crashes.

The abort of a transaction can be requested by the transaction itself, because some of its inputs are wrong or because some condition is recognized that makes transaction completion inappropriate or useless. A transaction abort can also be forced by the system for system-dependent reasons, typically system overload and deadlocks.

The activity of ensuring atomicity in the presence of transaction aborts is called transaction recovery, and the activity of ensuring atomicity in the presence of system crashes is called crash recovery.

2. Durability: once a transaction has committed, the system must guarantee that the results of its operations will never be lost, independent of subsequent failures. Since the results of a transaction which must be preserved by the system are stored in the database, the activity of providing transaction durability is called database recovery.

3. Serializability: if several transactions are executed concurrently, the result must be the same as if they were executed serially in some order. The activity of guaranteeing transaction serializability is called concurrency control.

If a system provides concurrency control, the programmer can write each transaction as if it executed alone.

4. Isolation: an incomplete transaction cannot reveal its results to other transactions before its commitment. This property is needed in order to avoid the problem of cascading aborts, i.e., the necessity to abort all the transactions which have observed the partial results of a transaction which was later aborted.

Goals Of Transaction Management:

After considering the characteristics of transactions, let us return to the goals of transaction management: the efficient, reliable and concurrent execution of transactions. These three goals are strongly interrelated; moreover, there is a trade-off between them, because the effort required in order to implement the properties of transactions in a reliable way causes an obvious performance penalty.

1. CPU & Main Memory Utilization

This aspect is common to both centralized and distributed databases. Although typical database applications are I/O bound, large systems can reveal a bottleneck in main memory or in CPU time. If the operating system has to create a process for each active transaction, most of these processes will be swapped in and out of main memory, and a lot of context switching will be required. In order to reduce this overhead, transaction managers apply specialized techniques which take advantage of the typical characteristics of database applications, avoiding treating them in the same way as the generalized processes dealt with by a general-purpose operating system.

2. Control Messages

In a distributed database, we also have to consider another aspect of efficiency: the number of control messages which are exchanged between sites. A control message is not used to transfer data, but is needed in order to control the execution of the application.

3. Response Time

As the third important efficiency aspect, we have to consider the response time of each individual transaction. Clearly, obtaining an acceptable response time can be more critical for distributed applications than for local applications, because of the additional time required for communication between different sites.

4. Availability

Another aspect which must be considered by the transaction manager of a distributed database is the availability of the whole system. Of course, in a distributed system the failure of one site must not stop the whole system's operation. Therefore, the algorithms implemented by the transaction manager must not block the execution of those transactions which do not strictly need to access a site which is not operational. As a result, increasing the system's availability in the presence of site failures is a significant characteristic of the algorithms used for transaction recovery and concurrency control in a DDBMS.
Summary

• Transactions have the atomicity, durability, serializability and isolation properties.

• Their cost, in terms of main memory, CPU, and number of transmitted control messages, and their response time should be minimized.

• The availability of the system should be maximized.

The 2-Phase-Commitment Protocol

In the basic 2-phase-commitment protocol, there is one agent which has a special role. This agent is called the coordinator; all the other agents which must commit together are called participants. The coordinator is responsible for making the final commit or abort decision.

Each participant corresponds to a subtransaction which has performed some write action; it is responsible for performing the write actions at its local database. We assume that each participant is at a different site.

Note that when the transaction performs some write action at the site of the coordinator, then the coordinator and one participant are at the same site; though they do not need to communicate using the network, we assume that they follow the protocol as if they were at different sites.

The 2-phase-commitment protocol consists of two phases:

• The goal of the first phase is to reach a common decision;

• The goal of the second phase is to implement this decision.

Phase one: During the first phase the coordinator asks all the participants to prepare for commitment; each participant answers READY if it is ready to commit and willing to do so. Before sending the first prepare-for-commitment message, the coordinator records on stable storage a log record of a new type, called a 'prepare' log record, in which the identifiers of all the subtransactions participating in the 2-phase commitment are recorded. The coordinator also activates a timeout mechanism, which will interrupt the coordinator after a given time interval has expired.

When a participant answers READY, it ensures that it will be able to commit the subtransaction even if failures occur at its site. In practice, this means that each participant has to record two things on stable storage:

• All the information which is required for locally committing the subtransaction; this means that all the log records of the subtransaction must be recorded on stable storage.

• The fact that this subtransaction has declared itself ready to commit; this means that a log record of a new type, called a 'ready' log record, must be recorded on stable storage.

The coordinator decides whether to commit or abort the transaction as a result of the answers it has received from the participants. If all participants answered READY, it decides to commit the transaction. If instead some participant has answered ABORT, or has not yet answered when the timeout expires, it decides to abort the transaction.

Phase two: The coordinator begins the second phase of 2-phase-commitment by recording its decision on stable storage. This corresponds to writing a 'global_commit' or 'global_abort' record in the log. The fact that the coordinator records its decision on stable storage means that the distributed transaction will eventually be committed or aborted, in spite of failures.

Then the coordinator informs all the participants of its decision by sending them the command message.

All the participants write a commit or abort record in the log, based on the command message received from the coordinator. From this moment, the local recovery procedure is capable of ensuring that the effects of the subtransaction will not be lost.

Finally, all participants send a final acknowledgement (ACK) message to the coordinator and perform the actions required for committing or aborting the subtransaction. When the coordinator has received an ACK message from all participants, it writes a log record of a new type, called a 'complete' record. After having written this record, the coordinator can forget the outcome of the transaction; thus all records related to this transaction can be taken offline after the next checkpoint.
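
The complete exchange can be summarized in a short Python sketch. This is an illustrative simulation of the message flow only; real implementations force log records to stable storage at each point marked in the comments, and handle timeouts and site failures:

    # Illustrative simulation of the 2-phase-commitment message flow.
    # Stable-storage writes are shown as comments at the points where the
    # protocol requires them; networking and timeouts are omitted.

    def two_phase_commit(votes):
        # --- Phase 1 ---
        # Coordinator: write the 'prepare' log record, send PREPARE to all.
        # Each participant answering READY first forces its log records and
        # a 'ready' record to stable storage.
        answers = ["READY" if v == "ready" else "ABORT" for v in votes]
        decision = "commit" if all(a == "READY" for a in answers) else "abort"
        # --- Phase 2 ---
        # Coordinator: write 'global_commit' / 'global_abort', then send the
        # command; participants write commit/abort records, perform the
        # action, and reply with ACK messages.
        acks = ["ACK" for _ in votes]
        assert len(acks) == len(votes)  # all ACKs in: write the 'complete' record
        return decision

    print(two_phase_commit(["ready", "ready"]))   # commit
    print(two_phase_commit(["ready", "abort"]))   # abort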

Questions
2007

1. Explain in detail the concept of data definition language, data manipulation language and the various types of data manipulation language.

2. What are the different types of failures occurring in transaction processing? Explain.

3. What is query optimization? How is it carried out? Explain with your own example.

4. Explain in detail the timestamp-based protocol used in concurrency control.

5. What do you mean by a deadlock? Explain any one protocol used for deadlock handling.

6. Explain the different factors affecting the design of a distributed database environment.

7. Explain in detail the procedure of replication and fragmentation techniques carried out for designing a distributed database environment.

8. What do you mean by serializability? Explain its importance in database management.

9. Explain in detail the execution of client/server database technology.

10. Write a short note on:

a) Need of distributed databases

b) Heterogeneous database

2004
1. A) List the features of distributed DBMS and its need.
B) Explain the mechanism for storage of data.
2. Write a detailed note on concurrency control.
3. What are time-based and quorum-based protocols?
4. What are deadlocks? Explain distributed deadlocks. How are deadlocks avoided?
5. What is access planning? Why is it necessary?
6. Explain in detail the working of client-server technology in the distributed database system.
7. Write a short note on:
a) Constructing an application,
b) Distributed data dictionary management,
c) Checkpoints and cold starts

2003

1. A) What do you mean by the compatibility of two lock modes? Explain the compatibility relations using a matrix format.
B) What is the need for concurrent execution of transactions? What are the different factors affecting concurrent execution?
2. Explain the two-phase locking protocol in detail. What are the main advantages of this protocol? Is it able to eliminate the problem of deadlock? Explain briefly.
3. What are the different types of failures occurring in the distributed database environment? Explain in detail their causes and the solutions for them.
4. What is the working of client-server technology in the distributed database system? Explain in detail.
5. Explain the following architecture used in client-server technology: i) Shared Everything
6. What is meant by concurrency control? Why is it important in the management of several transactions over the database? Explain in brief.
7. Write a short note on: i) Partitioned networks, ii) distributed data dictionary management, iii) checkpoints.
