Transaction Management
Transaction Management
Transaction Management
• Transaction Concept
• Transaction State
• Concurrent Executions
• Serializability
Transaction State
• Active – the initial state; the transaction stays in this
state while it is executing
• Partially committed – after the final statement has been
executed.
• Failed -- after the discovery that normal execution can no
longer proceed.
• Aborted – after the transaction has been rolled back and
the database restored to its state prior to the start of the
transaction. Two options after it has been aborted:
• restart the transaction
• can be done only if no internal logical error
• kill the transaction
• Committed – after successful completion.
Transaction State (Cont.)
Transaction Concept
• A transaction is a unit of program execution that
accesses and possibly updates various data items.
• E.g. transaction to transfer $50 from account A to
account B:
1. read(A)
2. A := A – 50
3. write(A)
4. read(B)
5. B := B + 50
6. write(B)
buffer
Buffer Block A input(A)
X A
Buffer Block B Y B
output(B)
read(X)
write(Y)
x2
x1
y1
memory disk
Example of Fund Transfer (Cont.)
• Transaction to transfer $50 from account A to account B:
1. read(A)
2. A := A – 50
3. write(A)
4. read(B)
5. B := B + 50
6. write(B)
• Consistency requirement in above example:
• the sum of A and B is unchanged by the execution of the transaction
• In general, consistency requirements include
• Explicitly specified integrity constraints such as primary keys and foreign keys
• Implicit integrity constraints
• e.g. sum of balances of all accounts, minus sum of loan amounts must equal value of cash-in-hand
• A transaction must see a consistent database.
• During transaction execution the database may be temporarily inconsistent.
• When the transaction completes successfully the database must be consistent
• Erroneous transaction logic can lead to inconsistency
Example of Fund Transfer (Cont.)
• Isolation requirement — if between steps 3 and 6, another
transaction T2 is allowed to access the partially updated
database, it will see an inconsistent database (the sum A + B
will be less than it should be).
T1 T2
1. read(A)
2. A := A – 50
3. write(A)
read(A), read(B), print(A+B)
4. read(B)
5. B := B + 50
6. write(B
• Schedule – a sequences of instructions that specify the chronological order in which instructions of
concurrent transactions are executed
• a schedule for a set of transactions must consist of all instructions of those transactions.
• must preserve the order in which the instructions appear in each individual transaction.
• A transaction that successfully completes its execution will have a commit instructions as the last
statement
• by default transaction assumed to execute commit instruction as its last step
• A transaction that fails to successfully complete its execution will have an abort instruction as the last
statement
Schedule 1
• Let T1 transfer $50 from A to B, and T2 transfer 10% of the balance from A to
B.
• A serial schedule in which T1 is followed by T2 :
Schedule 2
• A serial schedule where T2 is followed by T1
Schedule 3
• Let T1 and T2 be the transactions defined previously. The following
schedule is not a serial schedule, but it is equivalent to Schedule 1.
Schedule 3 Schedule 6
Anomalies with Interleaved Execution
Reading Uncommitted Data (WR Conflicts, “dirty reads”):
Anomalies with Interleaved Execution
Unrepeatable Reads (RW Conflicts):
Anomalies (Continued)
• Overwriting Uncommitted Data (WW Conflicts):
Concurrency Control
Outline
• Lock-Based Protocols
• Timestamp-Based Protocols
Lock-Based Protocols
• A lock is a mechanism to control concurrent access to a data item
• Data items can be locked in two modes :
1. exclusive (X) mode. Data item can be both read as well as
written. X-lock is requested using lock-X instruction.
2. shared (S) mode. Data item can only be read. S-lock is
requested using lock-S instruction.
• Lock requests are made to the concurrency-control manager by the
programmer. Transaction can proceed only after request is granted.
Lock-Based Protocols (Cont.)
• Lock-compatibility matrix
• A transaction may be granted a lock on an item if the requested lock is compatible with
locks already held on the item by other transactions
• Any number of transactions can hold shared locks on an item,
• But if any transaction holds an exclusive on the item no other transaction may hold any lock on the
item.
• If a lock cannot be granted, the requesting transaction is made to wait till all
incompatible locks held by other transactions have been released. The lock is then
granted.
Lock-Based Protocols (Cont.)
• Example of a transaction performing locking:
T2: lock-S(A);
read (A);
unlock(A);
lock-S(B);
read (B);
unlock(B);
display(A+B)
• Locking as above is not sufficient to guarantee serializability — if A and B get
updated in-between the read of A and B, the displayed sum would be wrong.
• A locking protocol is a set of rules followed by all transactions while requesting
and releasing locks. Locking protocols restrict the set of possible schedules.
The Two-Phase Locking Protocol
• This protocol ensures conflict-serializable schedules.
• Phase 1: Growing Phase
• Transaction may obtain locks
• Transaction may not release locks
• Phase 2: Shrinking Phase
• Transaction may release locks
• Transaction may not obtain locks
• The protocol assures serializability. It can be proved that the transactions can
be serialized in the order of their lock points (i.e., the point where a
transaction acquired its final lock).
The Two-Phase Locking Protocol (Cont.)
• There can be conflict serializable schedules that cannot be obtained if
two-phase locking is used.
• However, in the absence of extra information (e.g., ordering of access to
data), two-phase locking is needed for conflict serializability in the
following sense:
• Given a transaction Ti that does not follow two-phase locking, we can find a
transaction Tj that uses two-phase locking, and a schedule for Ti and Tj that is not
conflict serializable.
Lock Conversions
• Two-phase locking with lock conversions:
– First Phase:
• can acquire a lock-S on item
• can acquire a lock-X on item
• can convert a lock-S to a lock-X (upgrade)
– Second Phase:
• can release a lock-S
• can release a lock-X
• can convert a lock-X to a lock-S (downgrade)
• This protocol assures serializability. But still relies on the programmer to
insert the various locking instructions.
Deadlocks
• Consider the partial schedule
• Neither T3 nor T4 can make progress — executing lock-S(B) causes T4 to wait for T3 to release its lock
on B, while executing lock-X(A) causes T3 to wait for T4 to release its lock on A.
• Such a situation is called a deadlock.
• To handle a deadlock one of T3 or T4 must be rolled back
and its locks released.
Deadlocks (Cont.)
<T0 start>
<T0, A, 1000, 950>
<To, B, 2000, 2050
A = 950
B = 2050
<T0 commit>
<T1 start> BC output before T1
<T1, C, 700, 600> commits
C = 600
BB , BC
<T1 commit>
BA BA output after T0
commits
• Note: BX denotes block containing X.
Concurrency Control and Recovery
• With concurrent transactions, all transactions share a single disk buffer and a
single log
• A buffer block can have data items updated by one or more transactions
• We assume that if a transaction Ti has modified an item, no other transaction
can modify the same item until Ti has committed or aborted
• i.e. the updates of uncommitted transactions should not be visible to other
transactions
• Otherwise how to perform undo if T1 updates A, then T2 updates A and commits, and finally T1
has to abort?
• Can be ensured by obtaining exclusive locks on updated items and holding the locks
till end of transaction (strict two-phase locking)
• Log records of different transactions may be interspersed in the log.
Undo and Redo Operations
• Undo of a log record <Ti, X, V1, V2> writes the old value V1 to X
• Redo of a log record <Ti, X, V1, V2> writes the new value V2 to X
• Undo and Redo of Transactions
• undo(Ti) restores the value of all data items updated by Ti to their old values, going
backwards from the last log record for Ti
• each time a data item X is restored to its old value V a special log record <Ti , X, V> is written out
• when undo of a transaction is complete, a log record
<Ti abort> is written out.
• redo(Ti) sets the value of all data items updated by Ti to the new values, going
forward from the first log record for Ti
• No logging is done in this case
Undo and Redo on Recovering from Failure
• Note that If transaction Ti was undone earlier and the <Ti abort> record written to
the log, and then a failure occurs, on recovery from failure Ti is redone
• such a redo redoes all the original actions including the steps that restored old values
• Known as repeating history
• Seems wasteful, but simplifies recovery greatly
Immediate DB Modification Recovery Example
Below we show the log as it appears at three instances of time.