Distributed Database Management System

Distributed Database Management Systems (DDBMS) allow organizations to store databases across multiple locations while maintaining centralized administration. Advantages include faster data access and improved communication, while disadvantages involve management complexity and security concerns. DDBMS supports various configurations, such as single-site and multiple-site processing, and utilizes data replication for reliability and performance enhancement.

Uploaded by

shakyapranil1
Copyright
© All Rights Reserved

Distributed Database Management Systems
When an organization is geographically dispersed, it may choose to store its
databases on a central database server or to distribute them to local servers
(or a combination of both). A distributed database is a single logical
database that is spread physically across computers in multiple locations
that are connected by a data communications network.
The distributed database is still centrally administered as a corporate
resource while providing local flexibility and customization. The network
must allow the users to share the data; thus, a user (or program) at location
A must be able to access (and perhaps update) data at location B. The sites
of a distributed system may be spread over a large area (e.g., a country or the
world) or over a small area (e.g., a building or campus). The computers may
range from PCs to large-scale servers or even supercomputers. A distributed
database requires multiple instances of a database management system (or
several DBMSs), one running at each remote site.

Evolution of DDBMS
Distributed database management system (DDBMS)
Interconnected computer systems
Data and processing functions reside on multiple sites
1970s: Centralized DBMS
1980s: Decentralized management structures become common
1990s: New forces
Internet and the World Wide Web used for data access and distribution

DDBMS Advantages
Data located near site with greatest demand
Faster data access
Faster data processing
Growth facilitation
Improved communications
Reduced operating costs
User-friendly interface
Less danger of single-point failure
Processor independence

DDBMS Disadvantages
Complexity of management and control
Security
Increased storage requirements
Greater difficulty in managing the data environment
Increased training costs

Distributed Processing
Shares the database's logical processing among physically independent sites
that are connected through a network

Distributed Database
Stores a logically related database over two or more physically independent sites

Distributed Database vs. Distributed Processing
Distributed processing
  Does not require a distributed database
  May be based on a single database on a single computer
  Copies or parts of the database processing functions must be distributed to all data storage sites
Distributed database
  Requires distributed processing
Both
  Require a network to connect components

Functions of DDBMS
Application/end user interface
Transformation to determine request components
Query optimization to find the best access strategy
Mapping to determine the data location
I/O interface to read or write data
Formatting to prepare the data for presentation
Security to provide data privacy
Backup and recovery
DB Administration
Concurrency Control
Transaction Management

Centralized Database

Figure 10.3

Fully Distributed Database Management System

Figure 10.4

DDBMS Components
Computer workstations
Network hardware and software components
Communications media
Transaction processor (TP)
  Also called application processor (AP) or transaction manager (TM)
Data processor (DP)
  Also called data manager (DM)
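
The split between the two software components can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class names, the `catalog` lookup table, and the sample customer rows are hypothetical and do not come from the slides. The TP receives a request, looks up which DPs hold the data, and merges what each DP returns.

```python
from dataclasses import dataclass, field


@dataclass
class DataProcessor:
    """Data processor (DP): stores and retrieves the rows held at one site."""
    site: str
    tables: dict = field(default_factory=dict)  # table name -> list of row dicts

    def read(self, table):
        # Return copies so callers cannot mutate the site's local data.
        return [dict(row) for row in self.tables.get(table, [])]


@dataclass
class TransactionProcessor:
    """Transaction processor (TP): receives a request and routes it to DPs."""
    catalog: dict  # hypothetical catalog: table name -> DPs holding part of it

    def select_all(self, table):
        rows = []
        for dp in self.catalog.get(table, []):
            rows.extend(dp.read(table))  # one read per DP that holds the data
        return rows


# Hypothetical sample data spread over two sites.
ny = DataProcessor("NY", {"customer": [{"id": 1, "name": "Ana"}]})
mia = DataProcessor("MIA", {"customer": [{"id": 2, "name": "Ben"}]})
tp = TransactionProcessor(catalog={"customer": [ny, mia]})
result = tp.select_all("customer")
```

The caller talks only to the TP and never names a site, which is the role the slides assign to the transaction processor.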

Distributed Database Components

Transaction processor (TP), Data processor (DP)


DDBMS Protocols
Interface with network to transport data and commands between DPs and TPs
Synchronize data received from DPs and route to appropriate TPs
Ensure common database functions
  Security
  Concurrency control
  Backup and recovery

Levels of Data and Process Distribution
Database systems can be classified based on process distribution and data distribution

Single-Site Processing, Single-Site Data (SPSD)

All processing on a single CPU or host computer
All data are stored on the host computer's disk
DBMS located on the host computer
Typical of mainframe and minicomputer DBMSs
Typical of the first generation of single-user microcomputer databases

Single-Site Processing, Single-Site Data (cont.)

Figure 10.6

Multiple-Site Processing, Single-Site Data (MPSD)
• Requires network file server
• Applications accessed through LAN
• Variation known as client/server architecture

Figure 10.7

Multiple-Site Processing, Multiple-Site Data (MPMD)
Fully distributed DDBMS with support for multiple DPs and TPs at multiple sites
Homogeneous
  Integrate one type of centralized DBMS over the network
Heterogeneous
  Integrate different types of centralized DBMSs over a network

Homogeneous Distributed Database Scenario

Heterogeneous Distributed Database Scenario

Distributed DB Transparency
Allows end users to feel like the database's only user
Hides complexities of distributed database
Transparency features
Distribution
Transaction
Failure
Performance
Heterogeneity

Distribution Transparency
Allows management of a physically dispersed database as though it were centralized
Three Levels
  Fragmentation transparency
  Location transparency
  Local mapping transparency
Table 10.2
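
The three levels differ in how much the user must know about where the data lives. The following Python sketch illustrates that difference under assumptions that are not in the slides: a hypothetical EMPLOYEE table split into fragments E1 and E2, stored at sites named NY and ATL, with simple dictionaries standing in for the distributed data dictionary.

```python
# Hypothetical layout: EMPLOYEE is split into two fragments stored at two sites.
SITES = {
    "NY": {"E1": [{"name": "Ana", "year": 1950}]},
    "ATL": {"E2": [{"name": "Ben", "year": 1975}]},
}
FRAGMENTS_OF = {"EMPLOYEE": ["E1", "E2"]}  # table -> fragment names
LOCATION_OF = {"E1": "NY", "E2": "ATL"}    # fragment -> site name


def read_local_mapping(fragment, site):
    """Local mapping transparency: caller names both the fragment and its site."""
    return list(SITES[site][fragment])


def read_location_transparent(fragment):
    """Location transparency: caller names the fragment; the site is looked up."""
    return read_local_mapping(fragment, LOCATION_OF[fragment])


def read_fragmentation_transparent(table):
    """Fragmentation transparency: caller names only the table."""
    rows = []
    for fragment in FRAGMENTS_OF[table]:
        rows.extend(read_location_transparent(fragment))
    return rows


# The same question asked at the highest level of transparency.
born_before_1960 = [
    row["name"]
    for row in read_fragmentation_transparent("EMPLOYEE")
    if row["year"] < 1960
]
```

Each function hides one more detail than the one above it: first the site, then the fragment names.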

Transaction Transparency
Ensures transactions maintain integrity and consistency
Completed only if all involved database sites complete their part of the transaction
Management mechanisms
Remote request
Remote transaction
Distributed transaction
Distributed request
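
The rule that a transaction completes only if every involved site completes its part can be illustrated with a small Python sketch. It uses a two-phase commit pattern, a technique the slides do not name, so treat it as one possible way to get this all-or-nothing behavior rather than the method the slides describe. The `Site` class and its `can_commit` flag are hypothetical.

```python
class Site:
    """One database site taking part in a transaction (hypothetical)."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.state = "idle"

    def prepare(self):
        # Phase 1: the site votes on whether it can complete its part.
        self.state = "prepared" if self.can_commit else "aborted"
        return self.can_commit

    def commit(self):
        self.state = "committed"

    def rollback(self):
        self.state = "aborted"


def run_transaction(sites):
    """Commit at every site only if every site votes yes; otherwise undo all."""
    votes = [site.prepare() for site in sites]  # ask every site, no short-circuit
    if all(votes):
        for site in sites:
            site.commit()    # Phase 2: every site commits
        return True
    for site in sites:
        site.rollback()      # Phase 2: every site rolls back
    return False


ok = run_transaction([Site("A"), Site("B")])
failed_sites = [Site("A"), Site("B", can_commit=False)]
failed = run_transaction(failed_sites)
```

In the second run, site A was able to commit but is still rolled back because site B could not complete its part.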

Remote Request

Figure 10.10

Remote Transaction

Distributed Transaction
Figure 10.12

Distributed Requests

Figure 10.13

Distributed Requests (cont.)

Figure 10.14
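
The figures for the four mechanisms did not survive extraction, so the distinctions are sketched here in Python instead. The definitions used are an assumption based on common textbook usage, not on text in the slides: a request is one SQL statement, a transaction is one or more requests, and what matters is how many remote sites each request and the whole transaction touch. The function name `classify` is hypothetical.

```python
def classify(requests):
    """Classify a transaction given as a list of requests.

    Each request is the set of remote site names that one SQL statement touches.
    """
    if not requests or any(len(sites) == 0 for sites in requests):
        raise ValueError("every request must touch at least one site")
    if any(len(sites) > 1 for sites in requests):
        return "distributed request"      # one statement spans several sites
    all_sites = set().union(*requests)
    if len(requests) == 1:
        return "remote request"           # one statement, one remote site
    if len(all_sites) == 1:
        return "remote transaction"       # several statements, same remote site
    return "distributed transaction"      # several statements, several sites


examples = {
    "remote request": classify([{"B"}]),
    "remote transaction": classify([{"B"}, {"B"}]),
    "distributed transaction": classify([{"B"}, {"C"}]),
    "distributed request": classify([{"B", "C"}]),
}
```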

Synchronous and Asynchronous Distributed Databases

Data Replication
A popular option for data distribution as well as for fault tolerance of a
database is to store a separate copy of the database at each of two or more
sites. Replication may use either synchronous or asynchronous distributed
database technologies, although asynchronous technologies are more typical
in a replicated environment. If a copy is stored at every site, we have the case
of full replication, which may be impractical except for relatively small
databases. However, as disk storage and network technology costs have
decreased, full data replication, or mirror imaging, has become more
common, especially for "always on" services, such as electronic commerce
and search engines.
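
The difference between the synchronous and asynchronous options can be sketched in Python. This is a toy model under stated assumptions: the `ReplicatedStore` class, its method names, and the key-value data are hypothetical, and real products handle failures and conflicts that this sketch ignores.

```python
class ReplicatedStore:
    """Key-value data copied to several sites (hypothetical names)."""

    def __init__(self, site_names):
        self.copies = {name: {} for name in site_names}
        self.pending = []  # updates not yet applied at the other sites

    def write_sync(self, key, value):
        # Synchronous: every copy is updated before the write returns.
        for copy in self.copies.values():
            copy[key] = value

    def write_async(self, site, key, value):
        # Asynchronous: update the local copy now, queue the rest for later.
        self.copies[site][key] = value
        self.pending.append((site, key, value))

    def refresh(self):
        # Scheduled propagation of queued updates, in arrival order.
        for origin, key, value in self.pending:
            for name, copy in self.copies.items():
                if name != origin:
                    copy[key] = value
        self.pending.clear()


store = ReplicatedStore(["NY", "ATL"])
store.write_async("NY", "price", 10)
stale = store.copies["ATL"].get("price")  # ATL has not seen the update yet
store.refresh()
fresh = store.copies["ATL"].get("price")  # ATL catches up after the refresh
```

Between the asynchronous write and the refresh, the two copies disagree. That temporary gap is the price of not coordinating every write across the network.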

There are five advantages to data replication:
1. Reliability: If one of the sites containing the relation (or database) fails, a
copy can always be found at another site without network traffic delays.
2. Fast response: Each site that has a full copy can process queries locally.
3. Possible avoidance of complicated distributed transaction integrity routines:
Replicated databases are usually refreshed at scheduled intervals, so most
forms of replication are used when some relaxing of synchronization across
database copies is acceptable.
4. Node decoupling: Each transaction may proceed without coordination across
the network. If nodes are down, busy, or disconnected (e.g., in the case of
mobile personal computers), a transaction is handled when the user desires. In
the place of real-time synchronization of updates, a behind-the-scenes process
coordinates all data copies.
5. Reduced network traffic at prime time: Often updating data happens during
prime business hours, when network traffic is highest and the demands for
rapid response greatest. Replication, with delayed updating of copies of data,
moves network traffic for sending updates to other nodes to non-prime-time
hours.

Snapshot Replication
Different schemes exist for updating data copies. Snapshot replication is one
of them, and it assumes that multiple sites are updating the same data. First,
updates from all replicated sites are periodically collected at a master, or
primary, site, where all the updates are made to form a consolidated record of
all changes. With some distributed DBMSs, this list of changes is collected in a
snapshot log, which is a table of row identifiers for the records to go into the
snapshot. Then a read-only snapshot of the replicated portion of the database is
taken at the master site. Finally, the snapshot is sent to each site where there
is a copy.
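
The steps of one snapshot cycle can be sketched in Python as follows. The function name, the dictionary layout, and the rule that sites are processed in name order (so a later site overwrites an earlier one on the same row) are assumptions made for illustration. The slides do not say how conflicting updates are resolved.

```python
from types import MappingProxyType


def snapshot_replicate(master, site_updates, replicas):
    """Run one snapshot cycle and return the snapshot log of changed row ids.

    master:       dict of row id -> row, held at the primary site
    site_updates: dict of site name -> {row id: new row} since the last cycle
    replicas:     dict of site name -> that site's copy (replaced in place)
    """
    snapshot_log = []
    # Step 1: collect every site's updates at the master site.
    for site in sorted(site_updates):
        for row_id, row in site_updates[site].items():
            master[row_id] = row  # assumption: later sites overwrite earlier ones
            if row_id not in snapshot_log:
                snapshot_log.append(row_id)
    # Step 2: take a read-only snapshot of the data at the master site.
    snapshot = MappingProxyType({rid: dict(row) for rid, row in master.items()})
    # Step 3: send the snapshot to each site that holds a copy.
    for site in replicas:
        replicas[site] = snapshot
    return snapshot_log


master = {1: {"qty": 5}}
replicas = {"NY": {}, "ATL": {}}
log = snapshot_replicate(
    master,
    {"NY": {1: {"qty": 4}}, "ATL": {2: {"qty": 9}}},
    replicas,
)
```

`MappingProxyType` only blocks adding or replacing rows in the snapshot; it stands in for the read-only property described above and is not a full access-control mechanism.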

Thank you!