DBMS - Transactions Management

Download as ppt, pdf, or txt
Download as ppt, pdf, or txt
You are on page 1of 40

DBMS : Transaction Management

Chapter Outline
1 Introduction to Transaction Processing
2 Transaction States
3 Properties of Transactions
4 Recovery Management
4 Characterizing Schedules based on Recoverability
5 Characterizing Schedules based on Serializability
6. Concurrency Control ( Locking )
Introduction to Transaction Processing
 Single-User System:
 At most one user at a time can use the system.
 Multiuser System:
 Many users can access the system concurrently.
 Concurrency
 Interleaved processing:
 Concurrent execution of processes is interleaved in
a single CPU
 Parallel processing:
 Processes are concurrently executed in multiple
CPUs.
What is a Transaction ?
 A Transaction:
 Logical unit of database processing that includes
one or more access operations (read -retrieval,
write - insert or update, delete).

 Transaction boundaries:
 Begin and End transaction.

 An application program may contain


several transactions separated by the Begin
and End transaction boundaries.
Transaction Example
An example of transaction can be bank accounts of two
users, say A & B. When a bank employee transfers amount of
Rs. 500 from A's account to B's account:
Open_Account(A)
Old_Balance = A.balance
New_Balance = Old_Balance - 500
A.balance = New_Balance
Close_Account(A)

Open_Account(B)
Old_Balance = B.balance
New_Balance = Old_Balance + 500
B.balance = New_Balance
Close_Account(B)
Basic Operations of a Transaction

 Basic operations are read and write


 read_item(X): Reads a database item
named X into a program variable. To
simplify our notation, we assume that
the program variable is also named X.

 write_item(X): Writes the value of


program variable X into the database
item named X.
Two sample transactions:
 (a) Transaction T1
 (b) Transaction T2
Transaction States
 Transaction may have the following states:

 Active state

 Partially committed state

 Committed state

 Failed state

 Terminated State
State transition diagram illustrating
the states for transaction execution
Active: In this state the transaction is being executed. This is the initial
state of every transaction.
Partially Committed: When a transaction executes its final operation, it is
said to be in this state. After execution of all operations, the database
system performs some checks e.g. the consistency state of database after
applying output of transaction onto the database.
Failed: If any checks made by database recovery system fails, the
transaction is said to be in failed state, from where it can no longer proceed
further.
Aborted: If any of checks fails and transaction reached in Failed state, the
recovery manager rolls back all its write operation on the database to make
database in the state where it was prior to start of execution of transaction.
Transactions in this state are called aborted. Database recovery module
can select one of the two operations after a transaction aborts:
Re-start the transaction
Kill the transaction

• Committed: If transaction executes all its operations successfully it is


said to be committed. All its effects are now permanently made on
database system.
Problems during multiple transaction processing
( Why Concurrency Control is Needed ? )
Problems that can occur for certain transaction schedules
without appropriate concurrency control mechanisms:

 The Lost Update Problem


 This occurs when two transactions that access the same
database items have their operations interleaved in a way
that makes the value of some database item incorrect.
 The Temporary Update (or Dirty Read) Problem
 This occurs when one transaction updates a database item
and then the transaction fails for some reason
 The updated item is accessed by another transaction before
it is changed back to its original value.
 The Incorrect Summary Problem
 If one transaction is calculating an aggregate summary
function on a number of records while other transactions are
updating some of these records, the aggregate function may
calculate some values before they are updated and others
after they are updated.
Concurrent execution is uncontrolled:
(a) The lost update problem.
(b) The temporary update problem.
(c) The incorrect summary problem.
Properties of Transactions (ACID properties)
 Atomicity: A transaction is an atomic unit of processing; it
is either performed in its entirety or not performed at all.
 Consistency preservation: A correct execution of the
transaction must take the database from one consistent
state to another.
 Isolation: A transaction should not make its updates visible
to other transactions until it is committed; this property,
when enforced strictly, solves the temporary update
problem and makes cascading rollbacks of transactions
unnecessary.
 Durability or permanency: Once a transaction changes
the database and the changes are committed, these
changes must never be lost because of subsequent failure.
Recovery Management ( Why recovery is needed )
(What causes a Transaction to fail)
1. A computer failure (system crash):
A hardware or software error occurs in the computer system
during transaction execution. If the hardware crashes, the
contents of the computer’s internal memory may be lost.

2. A transaction or system error:


Some operation in the transaction may cause it to fail, such as
integer overflow or division by zero. Transaction failure may also
occur because of erroneous parameter values or because of a
logical programming error. In addition, the user may interrupt the
transaction during its execution.

3.Local errors or exception conditions detected by the transaction:


Certain conditions necessitate cancellation of the transaction.
For example, data for the transaction may not be found. A
condition, such as insufficient account balance in a banking
database, may cause a transaction, such as a fund withdrawal
from that account, to be canceled. A programmed abort in the
transaction causes it to fail.
4. Concurrency control enforcement:
The concurrency control method may decide to abort the
transaction, to be restarted later, because it violates
serializability or because several transactions are in a
state of deadlock.

5. Disk failure:
Some disk blocks may lose their data because of a read
or write malfunction or because of a disk read/write head
crash. This may happen during a read or a write
operation of the transaction.

6. Physical problems and catastrophes:


This refers to an endless list of problems that includes
power or air-conditioning failure, fire, theft, sabotage,
overwriting disks or tapes by mistake, and mounting of a
wrong tape by the operator.
The System Log
 Log or Journal: The log keeps track of all
transaction operations that affect the values of
database items.
 This information may be needed to permit recovery
from transaction failures.

 The log is kept on disk, so it is not affected by any


type of failure except for disk or catastrophic failure.

 In addition, the log is periodically backed up to


archival storage (tape) to guard against such
catastrophic failures.
 T refers to a unique transaction-id that is generated
automatically by the system and is used to identify each
transaction:
 Types of log record:
 [start_transaction,T]: Records that transaction T has
started execution.

 [write_item,T,X,old_value,new_value]: Records that


transaction T has changed the value of database item X
from old_value to new_value.

 [read_item,T,X]: Records that transaction T has read the


value of database item X.

 [commit,T]: Records that transaction T has completed


successfully, and affirms that its effect can be committed
(recorded permanently) to the database.

 [abort,T]: Records that transaction T has been aborted.


Recovery using log records:

If the system crashes, we can recover to a consistent


database state by examining the log and using one of the
techniques :
 Commit Point:
 A transaction T reaches its commit point when all its
operations that access the database have been executed
successfully and the effect of all the transaction operations on
the database has been recorded in the log.
 Beyond the commit point, the transaction is said to be
committed, and its effect is assumed to be permanently
recorded in the database.
 The transaction then writes an entry [commit,T] into the log.

 Roll Back :
 Needed for transactions that have a [start_transaction,T]
entry into the log but no commit entry [commit,T] into the
log.
Transaction Schedule ( or history )
 Transaction schedule:
When transactions are executing concurrently in an
interleaved fashion, the order of execution of
operations from the various transactions forms what
is known as a transaction schedule (or history).

 A schedule (or history) S of n transactions T1, T2, ..Tn:


It is an ordering of the operations of the
transactions subject to the constraint that, for each
transaction Ti that participates in S, the operations
of T1 in S must appear in the same order in which
they occur in T1.
Characterizing Schedules based on Recoverability:
a. Recoverable schedule: One where no transaction
needs to be rolled back. A schedule S is
recoverable if no transaction T in S commits until
all transactions T’ that have written an item that T
reads have committed.
b. Cascadeless schedule: One where every
transaction reads only the items that are written by
committed transactions.
c. Schedules requiring cascaded rollback: A schedule
in which uncommitted transactions that read an
item from a failed transaction must be rolled back.
d. Strict Schedules: A schedule in which a transaction
can neither read or write an item X until the last
transaction that wrote X has committed.
Characterizing Schedules based on Serializability:

i. Serial schedule: A schedule S is serial if, for every


transaction T participating in the schedule, all the
operations of T are executed consecutively in the
schedule. Otherwise, the schedule is called nonserial
schedule.
ii. Serializable schedule: A schedule S is serializable if it
is equivalent to some serial schedule of the same n
transactions.
iii.Result equivalent: Two schedules are called result
equivalent if they produce the same final state of the
database.
iv. Conflict equivalent : Two schedules are said to
be conflict equivalent if the order of any two
conflicting operations is the same in both
schedules.
v. Conflict serializable: A schedule S is said to be
conflict serializable if it is conflict equivalent to
some serial schedule S’.
vi. View equivalence: A less restrictive definition of
equivalence of schedules
vii.View serializability: Definition of serializability
based on view equivalence. A schedule is view
serializable if it is view equivalent to a serial
schedule.
Serial Schedules example
 Schedule A  Schedule B
T1 T2 T1 T2

read_item(x); read_item(x);
x:=x-n; x:=x+m;
write_item(x); write_item(x);
read_item(y);
Y:=y+n; read_item(x);
write_item(y); x:=x-n;
read_item(x); write_item(x);
x:=x+m; read_item(y);
write_item(x); Y:=y+n;
write_item(y);
Non Serial Schedules example
 Schedule C  Schedule D
T1 T2 T1 T2

read_item(x); read_item(x);
x:=x-n; x:=x-n;
read_item(x); write_item(x);
x:=x+m;
write_item(x); read_item(x);
read_item(y); x:=x+m;
write_item(x); write_item(x);

Y:=y+n; read_item(y);
write_item(y); Y:=y+n;
write_item(y);
Result equivalent Schedules example
S1 S2

read_item(x); read_item(x);
x:=x+10; x:=x*1.1;
write-item(x); write_item(x);
Testing for conflict serializability: Algorithm
 Looks at only read_Item (X) and write_Item (X)
operations
 Constructs a precedence graph (serialization
graph) - a graph with directed edges
 An edge is created from Ti to Tj if one of the
operations in Ti appears before a conflicting
operation in Tj
 The schedule is serializable if and only if the
precedence graph has no cycles.
 For each Ti in S, create a node labeled Ti in
precedence graph.
 Create an edge from Ti->Tj when Tj executes
read-item(x) after Ti executes write_item(x).
 Create an edge from Ti->Tj when Tj executes
write-item(x) after Ti executes read_item(x).
 Create an edge from Ti->Tj when Tj executes
write-item(x) after Ti executes write _item(x).
 S is serializable if the graph has no cycles.

Slide 17- 28
Constructing the precedence graphs for schedules A
and D and to test for conflict serializability.
(a) Precedence graph for serial schedule A.
(b) Precedence graph for serial schedule B.
(c) Precedence graph for schedule C (not serializable).
(d) Precedence graph for schedule D (serializable, equivalent to
schedule A).
Another Example of Serializability testing.
(a) The read and Write operations of three transactions T1, T2, and T3.
(b) Schedule E.
(c) Schedule F.
 Two schedules are said to be view equivalent if
the following three conditions hold:
1. The same set of transactions participates in S
and S’, and S and S’ include the same
operations of those transactions.
2. For any operation Ri(X) of Ti in S, if the value of
X read by the operation has been written by an
operation Wj(X) of Tj (or if it is the original value
of X before the schedule started), the same
condition must hold for the value of X read by
operation Ri(X) of Ti in S’.
3. If the operation Wk(Y) of Tk is the last operation
to write item Y in S, then Wk(Y) of Tk must also
be the last operation to write item Y in S’.
Concurrency Control Techniques ( Locking )
( Purpose of Concurrency Control )

• To enforce Isolation (through mutual exclusion)


among conflicting transactions.
• To preserve database consistency through
consistency preserving execution of
transactions.
• To resolve read-write and write-write conflicts.

Example: In concurrent execution environment if


T1 conflicts with T2 over a data item A, then the
existing concurrency control decides if T1 or T2
should get the A and if the other transaction is
rolled-back or waits.
What is Locking ?
Locking is an operation which secures
(a) permission to Read or
(b) permission to Write a data item for a transaction.
Example: Lock (X). Data item X is locked in behalf of the
requesting transaction.

Unlocking is an operation which removes these permissions


from the data item.
Example: Unlock (X). Data item X is made available to all
other transactions.

Lock and Unlock are Atomic operations.

Two locks modes


(a) shared (read) and
(b) exclusive (write).
Shared mode: shared lock (X).
More than one transaction can apply share lock on X for
reading its value but no write lock can be applied on X by
any other transaction.
Exclusive mode: Write lock (X).
Only one write lock on X can exist at any time and no
shared lock can be applied by any other transaction on X.

Read Write
Read Write

Y N

N N

Conflict matrix
Database requires that all transactions should be well-
formed. A transaction is well-formed if:
• It must lock the data item before it reads or writes to it.
• It must not lock an already locked data items and it must
not try to unlock a free data item.

The following code performs the lock operation:


B : if LOCK (X) = 0 (*item is unlocked*)
then LOCK (X)  1 (*lock the item*)
else begin
wait (until lock (X) = 0) and
the lock manager wakes up the transaction);
goto B
end;

The following code performs the unlock operation:


LOCK (X)  0 (*unlock the item*)
if any transactions are waiting then
wake up one of the waiting the transactions;
Two-Phase Locking Technique:
Two Phases: (a) Locking (Growing)
(b) Unlocking (Shrinking).

Locking (Growing) Phase: A transaction applies locks


(read or write) on desired data items one at a time.

Unlocking (Shrinking) Phase: A transaction unlocks its


locked data items one at a time.

Requirement: For a transaction these two phases


must be mutually exclusively, that is, during locking
phase unlocking phase must not start and during
unlocking phase locking phase must not begin.
Two-Phase Locking Example:
T1 T2 Result
read_lock (Y); read_lock (X); Initial values: X=20; Y=30
read_item (Y); read_item (X); Result of serial execution
unlock (Y); unlock (X); T1 followed by T2
write_lock (X); Write_lock (Y); X=50, Y=80.
read_item (X); read_item (Y); Result of serial execution
X:=X+Y; Y:=X+Y; T2 followed by T1
write_item (X); write_item (Y); X=70, Y=50
unlock (X); unlock (Y);
Two-Phase Locking Example:
T1 T2 Result

read_lock (Y); X=50; Y=50


read_item (Y); Non-serializable because it.
unlock (Y); violated two-phase policy.
read_lock (X);
read_item (X);
unlock (X);
write_lock (Y);
read_item (Y);
Y:=X+Y;
write_item (Y);
unlock (Y);
write_lock (X);
read_item (X);
X:=X+Y;
write_item (X);
unlock (X);

You might also like