
Chapter 22: Distributed Databases


Database System Concepts, 5th Ed.


Silberschatz, Korth and Sudarshan
See www.db-book.com for conditions on re-use

Distributed Database System

A distributed database system consists of loosely coupled sites that share no physical component.

Put another way, it is a collection of sites, connected together via some kind of communication network, in which:
a. Each site is a full database system in its own right.

Database systems that run on each site are independent of each other.

Transactions may access data at one or more sites.

Database System Concepts - 5th Edition, Aug 22, 2005.

22.2

Silberschatz, Korth and Sudarshan

Homogeneous Distributed Databases

In a homogeneous distributed database:

All sites have identical software.

Sites are aware of each other and agree to cooperate in processing user requests.

The system appears to the user as a single system.

In a heterogeneous distributed database:

Different sites may use different schemas and software.

Difference in schema is a major problem for query processing.

Difference in software is a major problem for transaction processing.

Sites may not be aware of each other and may provide only limited facilities for cooperation in transaction processing.


Data Replication

A relation or fragment of a relation is replicated if it is stored redundantly in two or more sites.

Full replication of a relation is the case where the relation is stored at all
sites.

Fully redundant databases are those in which every site contains a copy
of the entire database.
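
The three definitions above can be checked mechanically against a placement map. The following is a minimal sketch, assuming a made-up set of sites (S1 to S3) and a made-up assignment of relations to sites; none of these names come from the slides.

```python
# Hypothetical sites and placement map: relation name -> sites holding a copy.
SITES = {"S1", "S2", "S3"}
placement = {
    "account": {"S1", "S2", "S3"},
    "branch": {"S1", "S2"},
    "customer": {"S3"},
}

def is_replicated(relation):
    """A relation is replicated if it is stored at two or more sites."""
    return len(placement[relation]) >= 2

def is_fully_replicated(relation):
    """Full replication: the relation is stored at all sites."""
    return placement[relation] == SITES

def is_fully_redundant():
    """Fully redundant database: every site holds every relation."""
    return all(sites == SITES for sites in placement.values())

print(is_replicated("branch"))         # True
print(is_fully_replicated("account"))  # True
print(is_fully_redundant())            # False
```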


Data Replication (Cont.)

Advantages of Replication

Availability: failure of a site containing relation r does not result in unavailability of r if replicas exist.

Parallelism: queries on r may be processed by several nodes in parallel.

Reduced data transfer: relation r is available locally at each site containing a replica of r.

Disadvantages of Replication

Increased cost of updates: each replica of relation r must be updated.

Increased complexity of concurrency control: concurrent updates to distinct replicas may lead to inconsistent data unless special concurrency control mechanisms are implemented.

One solution: choose one copy as the primary copy and apply concurrency control operations on the primary copy.
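
The trade-offs above can be made concrete with a small in-memory model. This is only a sketch of the primary-copy idea under simplifying assumptions: the class `ReplicatedItem`, its fields, and the site names are hypothetical, a single lock stands in for the concurrency control at the primary, and failure of the primary or resynchronization of recovered replicas is not handled.

```python
import threading

class ReplicatedItem:
    """One data item replicated at several sites (all names are illustrative)."""

    def __init__(self, sites, primary, value):
        assert primary in sites
        self.copies = {site: value for site in sites}  # site -> local copy
        self.primary = primary
        self.primary_lock = threading.Lock()  # concurrency control at the primary
        self.failed = set()                   # sites currently down
        self.replica_writes = 0               # counts the cost of updates

    def read(self, preferred_site):
        """Read locally if possible, otherwise from any surviving replica."""
        for site in [preferred_site] + sorted(self.copies):
            if site in self.copies and site not in self.failed:
                return self.copies[site]
        raise RuntimeError("no replica available")

    def update(self, fn):
        """Serialize updates through the primary, then write every live replica."""
        if self.primary in self.failed:
            raise RuntimeError("primary unavailable (not handled in this sketch)")
        with self.primary_lock:
            new_value = fn(self.copies[self.primary])
            for site in self.copies:
                if site not in self.failed:
                    self.copies[site] = new_value
                    self.replica_writes += 1
            return new_value

balance = ReplicatedItem(sites=["S1", "S2", "S3"], primary="S1", value=500)
balance.update(lambda v: v + 100)
balance.failed.add("S2")
print(balance.read("S2"))      # 600, served by a surviving replica
print(balance.replica_writes)  # 3, one write per replica for a single update
```

The read after the simulated failure illustrates availability, while the write counter illustrates the increased cost of updates.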


Data Fragmentation

Division of relation r into fragments r1, r2, ..., rn which contain sufficient information to reconstruct relation r.

Horizontal fragmentation: each tuple of r is assigned to one or more fragments.

Vertical fragmentation: the schema for relation r is split into several smaller schemas.

All schemas must contain a common candidate key (or superkey) to ensure the lossless-join property.

A special attribute, the tuple-id attribute, may be added to each schema to serve as a candidate key.

Example: relation account with the following schema:

Account = (account_number, branch_name, balance)


Horizontal Fragmentation of account Relation


account_number  branch_name  balance
A-305           Hillside     500
A-226           Hillside     336
A-155           Hillside     62

$account_1 = \sigma_{branch\_name = \text{"Hillside"}}(account)$

account_number  branch_name  balance
A-177           Valleyview   205
A-402           Valleyview   10000
A-408           Valleyview   1123
A-639           Valleyview   750

$account_2 = \sigma_{branch\_name = \text{"Valleyview"}}(account)$
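
The horizontal fragmentation above amounts to a selection per branch, and the original relation is recovered as the union of the fragments. A minimal sketch, assuming the relation is held as a plain list of tuples; the helper name `select_branch` and the order of the tuples in `account` are illustrative choices, while the data values are those of the example.

```python
# The account relation: (account_number, branch_name, balance).
account = [
    ("A-305", "Hillside", 500),
    ("A-226", "Hillside", 336),
    ("A-177", "Valleyview", 205),
    ("A-402", "Valleyview", 10000),
    ("A-155", "Hillside", 62),
    ("A-408", "Valleyview", 1123),
    ("A-639", "Valleyview", 750),
]

def select_branch(relation, branch):
    """Selection on branch_name: keep tuples whose branch_name equals `branch`."""
    return [t for t in relation if t[1] == branch]

account1 = select_branch(account, "Hillside")
account2 = select_branch(account, "Valleyview")

# Reconstruction: the union of the fragments gives back the whole relation.
reconstructed = set(account1) | set(account2)
print(reconstructed == set(account))  # True
```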



Vertical Fragmentation of employee_info Relation


branch_name  customer_name  tuple_id
Hillside     Lowman         1
Hillside     Camp           2
Valleyview   Camp           3
Valleyview   Kahn           4
Hillside     Kahn           5
Valleyview   Kahn           6
Valleyview   Green          7

$deposit_1 = \Pi_{branch\_name,\ customer\_name,\ tuple\_id}(employee\_info)$

account_number  balance  tuple_id
A-305           500      1
A-226           336      2
A-177           205      3
A-402           10000    4
A-155           62       5
A-408           1123     6
A-639           750      7

$deposit_2 = \Pi_{account\_number,\ balance,\ tuple\_id}(employee\_info)$
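
The vertical fragmentation above is a pair of projections that both keep tuple_id, and joining the fragments on tuple_id restores the original relation. A minimal sketch, assuming the relation is a list of tuples with tuple_id as the last column; the helper names `project` and `join_on_tuple_id` are hypothetical, and the rows combine the values of the two fragment tables.

```python
# employee_info: (branch_name, customer_name, account_number, balance, tuple_id)
employee_info = [
    ("Hillside", "Lowman", "A-305", 500, 1),
    ("Hillside", "Camp", "A-226", 336, 2),
    ("Valleyview", "Camp", "A-177", 205, 3),
    ("Valleyview", "Kahn", "A-402", 10000, 4),
    ("Hillside", "Kahn", "A-155", 62, 5),
    ("Valleyview", "Kahn", "A-408", 1123, 6),
    ("Valleyview", "Green", "A-639", 750, 7),
]

def project(relation, columns):
    """Projection: keep the given column positions and drop duplicate rows."""
    seen, out = set(), []
    for row in relation:
        projected = tuple(row[i] for i in columns)
        if projected not in seen:
            seen.add(projected)
            out.append(projected)
    return out

deposit1 = project(employee_info, [0, 1, 4])  # branch_name, customer_name, tuple_id
deposit2 = project(employee_info, [2, 3, 4])  # account_number, balance, tuple_id

def join_on_tuple_id(left, right):
    """Natural join of two fragments on their shared last column, tuple_id."""
    right_by_id = {row[-1]: row for row in right}
    return [l[:-1] + right_by_id[l[-1]] for l in left if l[-1] in right_by_id]

print(join_on_tuple_id(deposit1, deposit2) == employee_info)  # True
```

Tuples 4 and 6 share the same branch_name and customer_name, so without tuple_id the first projection would collapse them into one row and the join could no longer recover the original relation.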

Advantages of Fragmentation

Horizontal:

allows parallel processing on fragments of a relation

allows a relation to be split so that tuples are located where they are
most frequently accessed

Vertical:

allows tuples to be split so that each part of the tuple is stored where
it is most frequently accessed

tuple-id attribute allows efficient joining of vertical fragments

Vertical and horizontal fragmentation can be mixed.

Fragments may be successively fragmented to an arbitrary depth.

Replication and fragmentation can be combined

Relation is partitioned into several fragments; the system maintains several identical replicas of each such fragment.


Data Transparency

Data transparency: degree to which a system user may remain unaware of the details of how and where the data items are stored in a distributed system.

Consider transparency issues in relation to:

Fragmentation transparency

Replication transparency

Location transparency

Naming of data items: criteria

1. Every data item must have a system-wide unique name.
2. It should be possible to find the location of data items efficiently.
3. It should be possible to change the location of data items transparently.
4. Each site should be able to create new data items autonomously.
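
One way to see how the four criteria interact is a toy scheme in which each name is prefixed with the identifier of the site that created it. This is an assumption made for illustration, not a scheme stated in this section: the class `Catalog`, its methods, and the site names are hypothetical, and the catalog is shown as a single in-memory object for brevity, leaving open how it would itself be distributed.

```python
class Catalog:
    """Hypothetical naming scheme: global name = '<creating_site>.<local_name>'."""

    def __init__(self):
        self.location = {}  # global name -> site currently storing the item

    def create(self, site, local_name):
        # Criterion 4: a site needs only its own identifier to mint a name.
        # Criterion 1: the site prefix keeps names from different sites distinct.
        name = f"{site}.{local_name}"
        if name in self.location:
            raise ValueError(f"{name} already exists")
        self.location[name] = site
        return name

    def locate(self, name):
        # Criterion 2: finding the location is a single dictionary lookup.
        return self.location[name]

    def move(self, name, new_site):
        # Criterion 3: the name is unchanged; only the catalog entry moves.
        self.location[name] = new_site

catalog = Catalog()
name = catalog.create("site17", "account")
catalog.move(name, "site5")
print(name, catalog.locate(name))  # site17.account site5
```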
