ADBMS Notes 3
1. Modular Development − If the system needs to be expanded to new locations or new units, in centralized database systems the action requires substantial effort and disruption of the existing functioning. In distributed databases, the work simply requires adding new computers and local data to the new site and finally connecting them to the distributed system, with no interruption of current functions.
2. More Reliable − In case of database failures, the total system of centralized databases comes to a halt. In distributed systems, however, when a component fails, the rest of the system continues to function, perhaps at reduced performance, so the system as a whole is more reliable.
3. Better Response − If data is distributed in an efficient manner, then user requests can be met from local data itself, thus providing a faster response. In centralized systems, on the other hand, all queries have to pass through the central computer for processing, which increases the response time.
4. Lower Communication Cost − In distributed database systems, if data is located locally where it is mostly used, then the communication costs for data manipulation can be minimized. This is not feasible in centralized systems.
Features
● Databases in the collection are logically interrelated with each other. Often they represent a single logical database.
● Data is physically stored across multiple sites. The data at each site can be managed by a DBMS independent of the other sites.
● The processors at the sites are connected via a network. They do not have any multiprocessor configuration.
Replication involves using specialized software that looks for changes in the distributed
database. Once the changes have been identified, the replication process makes all the
databases look the same. The replication process can be complex and time-consuming,
depending on the size and number of the distributed databases. This process can also require
protocols that ensure the consistency of the replicas, i.e. copies of the same data item have the
same value.
• These protocols can be eager, in that they force the updates to be applied to all the replicas before the transaction completes, or lazy, in which case the transaction updates one copy (called the master) from which updates are propagated to the others after the transaction completes.
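The eager/lazy distinction above can be sketched in a few lines of Python (the `Replica` class and the function names are invented for illustration, not part of any real replication product):

```python
# Sketch of eager vs. lazy replica-update propagation (illustrative only).

class Replica:
    def __init__(self, name):
        self.name = name
        self.data = {}

def write_eager(replicas, key, value):
    """Eager: apply the update to every replica before the transaction commits."""
    for r in replicas:
        r.data[key] = value          # all copies updated synchronously
    return "committed"

def write_lazy(master, key, value):
    """Lazy: update only the master copy; propagation happens after commit."""
    master.data[key] = value
    return "committed"

def propagate_lazy(master, others, key):
    """Asynchronous propagation from the master to the other replicas."""
    for r in others:
        r.data[key] = master.data[key]

r1, r2, r3 = Replica("site1"), Replica("site2"), Replica("site3")
write_eager([r1, r2, r3], "x", 1)    # all replicas agree: x = 1
write_lazy(r1, "x", 2)               # only the master sees x = 2 ...
propagate_lazy(r1, [r2, r3], "x")    # ... until propagation runs
```

Between `write_lazy` and `propagate_lazy` the replicas disagree, which is exactly the window in which mutual consistency is temporarily violated under lazy protocols.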
Q.5 Write short notes on distributed concurrency control.
• Concurrency control involves the synchronization of access to the distributed database, such that the integrity of the database is maintained. It is, without any doubt, one of the most extensively studied problems in the DDBS field. The problem differs from that in a centralized framework: one not only has to worry about the integrity of a single database, but also about the consistency of multiple copies of the database. The condition that requires all the values of multiple copies of every data item to converge to the same value is called mutual consistency.
• Let us only mention that the two general classes are pessimistic, synchronizing the execution of user requests before the execution starts, and optimistic, executing requests and then checking whether the execution has compromised the consistency of the database.
• Two fundamental primitives that can be used with both approaches are locking, which is based on the mutual exclusion of access to data items, and timestamping, where transaction executions are ordered according to timestamps.
• There are variations of these schemes, as well as hybrid algorithms that attempt to combine the two basic mechanisms.
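As a rough illustration of the timestamping primitive, here is a minimal sketch of basic timestamp ordering on a single data item (all names are invented; a real distributed protocol must also assign globally unique timestamps across sites):

```python
# Basic timestamp ordering: an operation is rejected if a younger
# transaction has already accessed the item in a conflicting way.

class Item:
    def __init__(self):
        self.read_ts = 0    # largest timestamp that has read this item
        self.write_ts = 0   # largest timestamp that has written this item

def try_write(item, ts):
    """Allow the write only if no younger transaction has touched the item."""
    if ts < item.read_ts or ts < item.write_ts:
        return "abort"              # too late: restart with a new timestamp
    item.write_ts = ts
    return "ok"

def try_read(item, ts):
    """Allow the read only if no younger transaction has written the item."""
    if ts < item.write_ts:
        return "abort"
    item.read_ts = max(item.read_ts, ts)
    return "ok"

x = Item()
assert try_write(x, ts=5) == "ok"
assert try_read(x, ts=7) == "ok"
assert try_write(x, ts=6) == "abort"   # transaction 6 is older than reader 7
```

The aborted transaction would restart with a fresh (larger) timestamp, which is how the scheme avoids locking entirely.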
• One of the main questions being addressed is how the database and the applications that run against it should be placed across the sites.
• In the partitioned scheme the database is divided into a number of disjoint partitions, each of which is placed at a different site. Replicated designs can be either fully replicated (also called fully duplicated), where the entire database is stored at each site, or partially replicated (or partially duplicated), where each partition of the database is stored at more than one site, but not at all the sites.
• The two fundamental design issues are fragmentation, the separation of the database into partitions called fragments, and distribution, the optimum distribution of fragments. The research in this area mostly involves mathematical programming in order to minimize the combined cost of storing the database, processing transactions against it, and message communication among the sites.
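The flavour of this allocation problem can be shown with a toy calculation (all site names, query rates and costs below are made up): place a fragment at the site that minimizes storage cost plus the communication cost of remote queries.

```python
# Toy fragment-allocation example: cost(site) = storage cost at that site
# plus query-rate-weighted communication cost from every other site.

storage = {"A": 10, "B": 12, "C": 8}             # storage cost per site
queries_per_site = {"A": 50, "B": 5, "C": 5}     # query rate from each site
comm = {("A", "B"): 2, ("A", "C"): 3, ("B", "C"): 1}  # cost per remote query

def link_cost(s, t):
    """Communication cost between two sites (0 for a local access)."""
    if s == t:
        return 0
    return comm.get((s, t)) or comm.get((t, s))

def total_cost(site):
    """Storage cost plus communication cost if the fragment lives at `site`."""
    return storage[site] + sum(rate * link_cost(src, site)
                               for src, rate in queries_per_site.items())

best = min(storage, key=total_cost)   # site "A": 10 + 0 + 10 + 15 = 35
```

Real allocation models add constraints (site capacity, update propagation to replicas), which is why the literature treats this as a mathematical-programming problem rather than a simple minimum.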
• A directory contains information (such as descriptions and locations) about the data items in the database. Problems related to directory management are similar in nature to the database placement problem discussed above.
• A directory may be global to the entire DDBS or local to each site; it can be centralized at one
site or distributed over several sites; there can be a single copy or multiple copies.
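At its simplest, a global directory is a mapping from data items to the sites that hold copies of them; a minimal sketch (item and site names invented):

```python
# A toy global directory: data item -> list of sites holding a copy.
directory = {"EMP": ["site1", "site3"], "DEPT": ["site2"]}

def locate(item):
    """Return the sites holding a copy of the item, or [] if it is unknown."""
    return directory.get(item, [])
```

Whether this mapping is kept at one site, replicated everywhere, or split per site is exactly the centralized/distributed, single-copy/multiple-copy design choice described above.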
• Query processing deals with designing algorithms that analyze queries and convert them
into a series of data manipulation operations. The problem is how to decide on a strategy for
executing each query over the network in the most cost-effective way, however cost is defined.
• The factors to be considered are the distribution of data, communication costs, and the lack of sufficient locally available information. The objective is to exploit the inherent parallelism to improve the performance of executing the transaction, subject to the above-mentioned constraints.
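As a toy example of cost-based strategy choice (all sizes below are invented), compare shipping a whole relation to the join site against a semijoin-style reduction that first ships only the join keys and then the matching tuples:

```python
# Two candidate strategies for a distributed join of R (site 1) with S (site 2),
# costed purely by the number of tuples/values sent over the network.

R_size = 10_000        # tuples of R at site 1
join_keys = 500        # distinct join-key values of S, much smaller than R
matching = 800         # R tuples that actually join with S

cost_ship_whole = R_size               # send every R tuple to site 2
cost_semijoin = join_keys + matching   # ship keys to site 1, ship matches back

best = "semijoin" if cost_semijoin < cost_ship_whole else "ship_whole"
```

With these (made-up) numbers the semijoin wins by a wide margin; with highly selective joins or tiny relations the ordering can flip, which is why the optimizer must cost each query individually.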
• The deadlock problem in DDBSs is similar in nature to that encountered in operating systems. The competition among users for access to a set of resources (data, in this case) can result in a deadlock if the synchronization mechanism is based on locking.
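Lock-based deadlocks are commonly detected by finding a cycle in a wait-for graph, where an edge T1 → T2 means transaction T1 is waiting for a lock held by T2; a minimal sketch (function name invented):

```python
# Deadlock detection: depth-first search for a cycle in the wait-for graph.

def has_deadlock(wait_for):
    """Return True iff the wait-for graph contains a cycle (a deadlock)."""
    visiting, done = set(), set()

    def dfs(t):
        if t in visiting:
            return True          # back edge: we are waiting on ourselves
        if t in done:
            return False
        visiting.add(t)
        for u in wait_for.get(t, []):
            if dfs(u):
                return True
        visiting.discard(t)
        done.add(t)
        return False

    return any(dfs(t) for t in wait_for)

assert has_deadlock({"T1": ["T2"], "T2": ["T1"]})   # mutual wait: deadlock
assert not has_deadlock({"T1": ["T2"], "T2": []})   # simple wait chain: fine
```

In a DDBS the extra difficulty is that the edges of this graph are scattered across sites, so the graph must either be assembled centrally or searched by a distributed algorithm.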
• It is important that mechanisms be provided to ensure the consistency of the database as well as to detect failures and recover from them. The implication for DDBSs is that when a failure occurs and various sites become either inoperable or inaccessible, the databases at the operational sites remain consistent and up to date.
• Furthermore, when the computer system or network recovers from the failure, the DDBSs should be able to recover and bring the databases at the failed sites up to date. This may be especially difficult in the case of network partitioning, where the sites are divided into two or more groups with no communication among them.
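One common way to keep partitioned sites from diverging is a majority-quorum rule: only a partition containing a strict majority of the sites may continue accepting updates, so two partitions can never both update. A minimal sketch (function name invented):

```python
# Majority-quorum check for operating during a network partition.

def may_update(partition_size, total_sites):
    """A partition may accept updates only if it holds a strict majority."""
    return partition_size > total_sites // 2

# With 5 sites split 3 / 2, only the 3-site partition continues:
assert may_update(3, 5)
assert not may_update(2, 5)
```

The price of this rule is availability: a minority partition must refuse updates until the network heals, at which point it catches up from the majority side.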
In a homogeneous distributed database, all the sites use identical DBMS and operating systems. Its properties are −
● The sites use identical DBMS or DBMS from the same vendor.
● Each site is aware of all other sites and cooperates with other sites to process user requests.
In a heterogeneous distributed database, different sites may have different operating systems and DBMS products. Its properties are −
● The system may be composed of a variety of DBMSs like relational, network, hierarchical or object-oriented.
● A site may not be aware of other sites and so there is limited co-operation in processing user
requests.
● Autonomous − Each database is independent and functions on its own. The databases are integrated by a controlling application and use message passing to share data updates.
● Un-federated − The database systems employ a central coordinating module through which the databases are accessed.
Autonomy − It indicates the distribution of control of the database system and the degree to which each constituent DBMS can operate independently.
Distribution − It states the physical distribution of data across the different sites.
This is a two-level architecture where the functionality is divided into servers and clients. The
server functions primarily encompass data management, query processing, optimization and
transaction management. Client functions mainly include the user interface. However, they also have some functions like consistency checking and transaction management.
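The division of labour described above can be caricatured in a few lines (class and method names invented): the server owns data management and query processing, while the client mostly formats requests and presents results.

```python
# Minimal sketch of the client-server functional split.

class Server:
    def __init__(self, data):
        self.data = data                      # data management lives here

    def query(self, key):                     # query processing on the server
        return self.data.get(key, "not found")

class Client:
    def __init__(self, server):
        self.server = server

    def run(self, key):                       # client side: user interface only
        return f"result: {self.server.query(key)}"

c = Client(Server({"emp42": "Alice"}))
```

In a real system the call from `Client.run` to `Server.query` would cross the network, and the client would additionally perform light checks (consistency checking, local transaction management) before shipping the request.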