0% found this document useful (0 votes)
29 views

DDB Module-3 Notes 22 29-09-2024 Lecture

Uploaded by

laharipriya6435
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
29 views

DDB Module-3 Notes 22 29-09-2024 Lecture

Uploaded by

laharipriya6435
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 9

III Year I Sem - CSE Distributed Databases - Class Notes 2024-2025

MODULE – III
Transaction Management: Definition, properties of transaction, types of transactions
Distributed concurrency control: serializability, concurrency control mechanisms & algorithms,
time - stamped & optimistic concurrency control Algorithms, deadlock Management.

1 Differentiate between centralized 2PL protocol and Distributed 2PL protocol. 4 L2 3


OR
2 Classify and explain types of transaction models with suitable diagrams. 4 L6 3
3 Define a transaction. Discuss the ACID properties of a transaction. 4 L2 3
OR
4 Explore the computational structure of distributed transaction 4 L3 3

Transaction Management
• Definition: The Transaction consists of a set of operations that performs a single logical unit of
work in database environment.
• A transaction is used in database systems as a basic unit of consistent and reliable computing.
• a transaction takes a database, performs an action on it, and generates a new version of the
database, causing a state transition.
• A transaction is considered to be made up of a sequence of read and write operations on the
database, together with computation steps.
• In that sense, a transaction may be thought of as a program with embedded database access
queries.

Example: FLIGHT(FNO, DATE, SRC, DEST, STSOLD, CAP)


CUST(CNAME, ADDR, BAL)
FC(FNO, DATE, CNAME, SPECIAL)

Begin_transaction Reservation
begin
input(flight no, date, customer name); (1)
EXEC SQL UPDATE FLIGHT (2)
SET STSOLD = STSOLD + 1
WHERE FNO = flight no
AND DATE = date;
1 P. V Ramana Murthy
LEE; B.E(Comp); M.Tech(CS); (Ph.D(CSE));
Malla Reddy Engineering College (Autonomous)
III Year I Sem - CSE Distributed Databases - Class Notes 2024-2025

EXEC SQL INSERT (3)


INTO FC(FNO,DATE,CNAME,SPECIAL)
VALUES (flight no, date, customer name, null);
output(“reservation completed”) (4)
end.

Properties of Transactions
• A transaction is a unit of consistent and reliable computation.
• The consistency and reliability aspects of transactions are due to four properties:
(1) Atomicity,
(2) Consistency,
(3) Isolation, and
(4) Durability.
• Together, these are commonly referred to as the ACID properties of transactions.
• They are not entirely independent of each other;
• Usually there are dependencies among them as we will indicate below.

Atomicity:
• Atomicity refers to the fact that a transaction is treated as a unit of operation.
• Therefore, either all the transaction’s actions are completed, or none of them are.
• This is also known as the “all-or-nothing property.”
• Atomicity requires that if the execution of a transaction is interrupted by any sort of failure, the
DBMS will be responsible for determining what to do with the transaction upon recovery from the
failure.
• There are, of course, two possible courses of action:
➢ it can either be terminated by completing the remaining actions, or
➢ it can be terminated by undoing all the actions that have already been executed.
• There are two types of failures.
• A transaction itself may fail due to input data errors, deadlocks, or other factors.
➢ Maintaining transaction atomicity in the presence of this type of failure is commonly called the
transaction recovery.
• The second type of failure is caused by system crashes, such as media failures, processor failures,
communication link breakages, power outages, and so on.
➢ Ensuring transaction atomicity in the presence of system crashes is called crash recovery.
• An important difference between the two types of failures is that during some types of system
crashes, the information in volatile storage may be lost or inaccessible.

Consistency:
• The consistency of a transaction is simply its correctness.
• In other words, a transaction is a correct program that maps one consistent database state to
another.
• There is an interesting classification of consistency groups databases into four levels.
• Then, based on the concept of dirty data, the four levels are defined as follows:
Degree 3: Transaction T sees degree 3 consistency if:
1. T does not overwrite dirty data of other transactions.
2. T does not commit any writes until it completes all its writes [i.e., until the end of transaction
(EOT)].
2 P. V Ramana Murthy
LEE; B.E(Comp); M.Tech(CS); (Ph.D(CSE));
Malla Reddy Engineering College (Autonomous)
III Year I Sem - CSE Distributed Databases - Class Notes 2024-2025

3. T does not read dirty data from other transactions.


4. Other transactions do not dirty any data read by T before T completes.
Degree 2: Transaction T sees degree 2 consistency if:
1. T does not overwrite dirty data of other transactions.
2. T does not commit any writes before EOT.
3. T does not read dirty data from other transactions.
Degree 1: Transaction T sees degree 1 consistency if:
1. T does not overwrite dirty data of other transactions.
2. T does not commit any writes before EOT.
Degree 0: Transaction T sees degree 0 consistency if:
1. T does not overwrite dirty data of other transactions.”

• Of course, it is true that a higher degree of consistency encompasses all the lower degrees.
• The point in defining multiple levels of consistency is to provide application programmers the
flexibility to define transactions that operate at different levels.
• Consequently, while some transactions operate at Degree 3 consistency level, others may operate at
lower levels.

Isolation:
• Isolation property ensures that multiple transactions can occur concurrently without leading to the
inconsistency of database state.
• Isolation is the property of transactions that requires each transaction to see a consistent database at
all times.
• In other words, an executing transaction cannot reveal its results to other concurrent transactions
before its commitment.
• There are a number of reasons for insisting on isolation.
• One has to do with maintaining the interconsistency of transactions.
• If two concurrent transactions access a data item that is being updated by one of them, it is not
possible to guarantee that the second will read the correct value.

• Ex. Consider the following two concurrent transactions (T1 and T2), both of which access data item x.
Assume that the value of x before they start executing is 50.
T1: Read(x) T2: Read(x)
x ← x+1 x ← x+1
Write(x) Write(x)
Commit Commit
• The following is one possible sequence of execution of the actions of these transactions:
T1: Read(x)
T1: x ← x+1
T1: Write(x)
T1: Commit
T2: Read(x)
T2: x ← x+1
T2: Write(x)
T2: Commit

3 P. V Ramana Murthy
LEE; B.E(Comp); M.Tech(CS); (Ph.D(CSE));
Malla Reddy Engineering College (Autonomous)
III Year I Sem - CSE Distributed Databases - Class Notes 2024-2025

• In this case, there are no problems; transactions T1 and T2 are executed one after the other and
transaction T2 reads 51 as the value of x.
• Note that if, instead, T2 executes before T1, T2 reads 50 as the value of x.
• So, if T1 and T2 are executed one after the other (regardless of the order), the second transaction will read
51 as the value of x and x will have 52 as its value at the end of execution of these two transactions.

• However, since transactions are executing concurrently, the following execution sequence is also
possible:
T1: Read(x)
T1: x ← x+1
T2: Read(x)
T1: Write(x)
T2: x ← x+1
T2: Write(x)
T1: Commit
T2: Commit

• In this case, transaction T2 reads 50 as the value of x.


• This is incorrect since T2 reads x while its value is being changed from 50 to 51.
• Furthermore, the value of x is 51 at the end of execution of T1 and T2 since T2’s Write will overwrite
T1’s Write.

Durability:
• Durability refers to that property of transactions which ensures that once a transaction commits, its
results are permanent and cannot be erased from the database.
• Therefore, the DBMS ensures that the results of a transaction will survive subsequent system
failures.
• This is exactly the transaction commit before it informs the user of its successful completion.
• The durability property brings forth the issue of database recovery, that is, how to recover the
database to a consistent state where all the committed actions are reflected.

Types of Transactions
• Transactions have been classified according to a number of criteria.
➢ Fist one is the duration of transactions.
➢ Second is according to their structure
• According to duration transactions may be classified as
➢ online or
➢ batch transactions.
• These two classes are also called
➢ short-life and
➢ long-life transactions, respectively.

• Online transactions are characterized by very short execution/ response times (typically, on the order
of a couple of seconds) and by access to a relatively small portion of the database.
• Examples include
4 P. V Ramana Murthy
LEE; B.E(Comp); M.Tech(CS); (Ph.D(CSE));
Malla Reddy Engineering College (Autonomous)
III Year I Sem - CSE Distributed Databases - Class Notes 2024-2025

➢ banking transactions and


➢ airline reservation transactions.
• Batch transactions, on the other hand, take longer to execute (response time being measured in
minutes, hours, or even days) and access a larger portion of the database.
• Typical applications that might require batch transactions are
➢ design databases,
➢ statistical applications,
➢ report generation,
➢ complex queries, and
➢ image processing.

➢ Another classification that has been proposed is with respect to the


➢ organization of the read and write actions.
➢ The examples that we have considered so far intermix their read and write actions without any specific
ordering.
➢ We call this type of transactions general.
➢ If the transactions are restricted so that all the read actions are performed before any write action,
the transaction is called a
➢ two-step transaction.
➢ Similarly, if the transaction is restricted so that a data item has to be read before it can be updated
(written), the corresponding class is called
➢ restricted (or read-before-write).
➢ If a transaction is both twostep and restricted, it is called a
➢ restricted two-step transaction.

• Finally, there is the action model of transactions, which consists of the restricted class with the
further restriction that each (read, write) pair be executed atomically.
• This classification is shown in Fig, where the generality increases upward.

• According to their structure Transactions can have four broad categories in increasing
complexity:
1. Flat transactions,
2. Closed nested transactions,
3. Open nested transactions, and
5 P. V Ramana Murthy
LEE; B.E(Comp); M.Tech(CS); (Ph.D(CSE));
Malla Reddy Engineering College (Autonomous)
III Year I Sem - CSE Distributed Databases - Class Notes 2024-2025

4. Workflow models.

Flat Transactions:
• Flat transactions have
➢ a single start point (Begin transaction) and
➢ a single termination point (End transaction).
• Most of the transaction management work in databases has concentrated on flat transactions.
• Flat transactions are simple and perform well in short activity.
• It is benefited to commit partial results.
• Bulk updates can be expensive to undo all updates.

Nested Transactions:
• An alternative transaction model is to permit a transaction to include other transactions with their
own begin and commit points.
• Such transactions are called nested transactions.
• These transactions that are embedded in another one are usually called subtransactions.
• Nested transactions have received considerable interest as a more generalized transaction concept.

Begin transaction Reservation


begin
Begin transaction Airline
...
end. {Airline}
Begin transaction Hotel
...
end. {Hotel}
Begin transaction Car
...
end. {Car}
end.

• The level of nesting is generally open, allowing subtransactions themselves to have nested
transactions.
• This generality is necessary to support application areas where transactions are more complex than
in traditional data processing.
• In this taxonomy, we differentiate between closed and open nesting because of their termination
characteristics.
• Closed nested transactions commit in a bottom-up fashion through the root.
• Thus, a nested subtransaction begins after its parent and finishes before it, and the commitment of
the subtransactions is conditional upon the commitment of the parent.
• The semantics of these transactions enforce atomicity at the top-most level.
• Open nesting relaxes the top-level atomicity restriction of closed nested transactions.
• Therefore, an open nested transaction allows its partial results to be observed outside the
transaction.

Workflows:
• Flat transactions model relatively simple and short activities very well.
➢ However, they are less appropriate for modeling longer and more elaborate activities.
• That is the reason for the development of the various nested transaction models.
6 P. V Ramana Murthy
LEE; B.E(Comp); M.Tech(CS); (Ph.D(CSE));
Malla Reddy Engineering College (Autonomous)
III Year I Sem - CSE Distributed Databases - Class Notes 2024-2025

• It has been argued that flat and nested transaction extensions are not sufficiently powerful to model
business activities.
• Hence workflows models came into existence.
• A working definition is that a workflow is “a collection of tasks organized to accomplish some
business process.”
• Three types of workflows are identified
1. Human-oriented workflows
2. System-oriented workflows
3. Transactional workflows

1. Human-oriented workflows, which involve humans in performing the tasks.


The system support is provided to facilitate collaboration and coordination among humans, but it is
the humans themselves who are ultimately responsible for the consistency of the actions.
2. System-oriented workflows are those that consist of computation-intensive and specialized tasks
that can be executed by a computer.
The system support in this case is substantial and involves concurrency control and recovery,
automatic task execution, notification, etc.
3. Transactional workflows range in between human-oriented and system-oriented workflows and
borrow characteristics from both.
They involve “coordinated execution of multiple tasks that
(a) may involve humans,
(b) require access to HAD [heterogeneous, autonomous, and/or distributed] systems, and
(c) support selective use of transactional properties [i.e., ACID properties] for individual tasks or
entire workflows.”

Differentiate between centralized 2PL protocol and Distributed 2PL protocol.

Centralized 2PL

• The 2PL algorithm can easily be extended to the distributed DBMS environment.
• One way of doing this is to delegate lock management responsibility to a single site only.
• This means that only one of the sites has a lock manager;
• The transaction managers at the other sites communicate with it rather than with their own lock
managers.
• This approach is also known as the primary site 2PL algorithm.
• The communication between the cooperating sites in executing a transaction according to a
centralized 2PL (C2PL) algorithm is depicted in Fig.
• This communication is between the transaction manager at the site where the transaction is
initiated (called the coordinating TM), the lock manager at the central site, and the data processors
(DP) at the other participating sites.
7 P. V Ramana Murthy
LEE; B.E(Comp); M.Tech(CS); (Ph.D(CSE));
Malla Reddy Engineering College (Autonomous)
III Year I Sem - CSE Distributed Databases - Class Notes 2024-2025

• The participating sites are those that store the data item and at which the operation is to be carried
out.
• The order of messages is denoted in the figure.

• The transaction manager runs forever and waits until a message arrives from either an application
(with a transaction operation) or from a lock manager, or from a data processor.
• One common criticism of C2PL algorithms is that a bottleneck may quickly form around the
central site.
• Furthermore, the system may be less reliable since the failure or inaccessibility of the central site
would cause major system failures.
• The bottleneck will certainly form as the transaction rate increases.

Distributed 2PL

• Distributed 2PL (D2PL) requires the availability of lock managers at each site.
• The communication between cooperating sites that execute a transaction according to the
distributed 2PL protocol is depicted in Fig.

• The distributed 2PL transaction management algorithm is similar to the C2PLTM, with two
major modifications.
• The messages that are sent to the central site lock manager in C2PL-TM are sent to the lock
managers at all participating sites in D2PL-TM.

8 P. V Ramana Murthy
LEE; B.E(Comp); M.Tech(CS); (Ph.D(CSE));
Malla Reddy Engineering College (Autonomous)
III Year I Sem - CSE Distributed Databases - Class Notes 2024-2025

• The second difference is that the operations are not passed to the data processors by the
coordinating transaction manager, but by the participating lock managers.
• This means that the coordinating transaction manager does not wait for a “lock request
granted” message.
• The participating data processors send the “end of operation” messages to the coordinating
TM.
• The alternative is for each DP to send it to its own lock manager who can then release the locks
and inform the coordinating TM.
• Centralized 2PL A single site handles all locks, Primary 2PL Each data item is assigned a primary
site to handle its locks.
• Centralized 2PL Data is not necessarily replicated, Distributed 2PL Assumes data can be
replicated.
• Each primary is responsible for handling locks for its data, which may reside at remote data
managers.

Types for deadlock avoidance in distributed system.

• Deadlock avoidance schemes either employ concurrency control techniques that will never result
in deadlocks or require that potential deadlock situations are detected in advance and steps are
taken such that they will not occur.

Types for deadlock avoidance schemes in distributed system are:

1. The simplest means of avoiding deadlocks is to order the resources and insist that each process
request access to these resources in that order.
2. Another alternative is to make use of transaction timestamps to prioritize transactions and
resolve deadlocks by aborting transactions with higher (or lower) priorities.

9 P. V Ramana Murthy
LEE; B.E(Comp); M.Tech(CS); (Ph.D(CSE));
Malla Reddy Engineering College (Autonomous)

You might also like