Transaction Management - I
Transaction Management - I
Transaction Management - I
Introduction
• Time sharing systems executes more than one program at the
same time by interleaving the execution of the programs
• In DBMS, we consider transactions, not programs
• A transaction is a database program that must be completed
entirely in order to retain the consistency of the database; if the
transaction cannot be completed, the database should remain at
the same state as if the transaction hadn’t been executed at all
If the database is initially in consistent state (or empty), a
sequence of transactions would bring the database from
one consistent state to another
• Additional operations:
– Commit - the transaction is successful and the data items value must
be changed (if any) on the database permanently
– Rollback/Abort - the transaction is not successful, do not change any
of the data item values
– BEGIN_TRANSACTION, END_TRANSACTION
Active: transaction is started and is issuing reads and writes to the database
Partially committed: operations are done and values are ready to be written to
the database
Committed: writing to the database is permitted and successfully completed
Failed: the transaction or the system detects a fatal error
Terminated: transaction leaves the system(Aborted)
Implementation of Atomicity and
Durability
The recovery-management component of a database system implements
the support for atomicity and durability.
E.g. the shadow-database scheme:
all updates are made on a shadow copy of the database
db_pointer is made to point to the updated shadow copy after
– the transaction reaches partial commit and
– all updated pages have been flushed to disk.
Implementation of Atomicity and Durability
db_pointer always points to the current consistent copy of
the database.
In case transaction fails, old consistent copy pointed to by
db_pointer can be used, and the shadow copy can be
deleted.
The shadow-database scheme:
Assumes that only one transaction is active at a time.
Assumes disks do not fail
Useful for text editors, but
extremely inefficient for large databases
– Variant called shadow paging reduces copying of data,
but is still not practical for large databases
Does not handle concurrent transactions
Concurrent Executions
Multiple transactions are allowed to run concurrently in
the system. Advantages are:
increased processor and disk utilization, leading
to better transaction throughput
E.g. one transaction can be using the CPU while
another is reading from or writing to the disk
reduced average response time for transactions:
short transactions need not wait behind long ones.
Concurrency control schemes – mechanisms to
achieve isolation
that is, to control the interaction among the
concurrent transactions in order to prevent them
from destroying the consistency of the database
Schedules
Schedule – a sequences of instructions that specify the
chronological order in which instructions of concurrent
transactions are executed
a schedule for a set of transactions must consist of all
instructions of those transactions
must preserve the order in which the instructions
appear in each individual transaction.
A transaction that successfully completes its execution
will have a commit instructions as the last statement
by default transaction assumed to execute commit
instruction as its last step
A transaction that fails to successfully complete its
execution will have an abort instruction as the last
statement
Schedule 1
Let T1 transfer $50 from A to B, and T2 transfer 10% of the
balance from A to B.
A serial schedule in which T1 is followed by T2 :
Schedule 2
• A serial schedule where T2 is followed by T1
Schedule 3
Let T1 and T2 be the transactions defined previously. The
following schedule is not a serial schedule, but it is equivalent
to Schedule 1.
Schedule 3 Schedule 6
Conflict Serializability (Cont.)
y
Example Schedule (Schedule A) + Precedence Graph
T1 T2 T3 T4 T5
read(X)
read(Y)
read(Z)
read(V)
read(W)
T1 T2
read(W)
read(Y)
write(Y)
write(Z)
read(U)
read(Y)
T3 T4
write(Y)
read(Z)
write(Z)
read(U)
write(U) T5
Test for Conflict Serializability
A schedule is conflict serializable if and only
if its precedence graph is acyclic.
Cycle-detection algorithms exist which take
order n2 time, where n is the number of
vertices in the graph.
(Better algorithms take order n + e where
e is the number of edges.)
If precedence graph is acyclic, the
serializability order can be obtained by a
topological sorting of the graph.
This is a linear order consistent with the
partial order of the graph.
For example, a serializability order for
Schedule A would be
T5 → T1 → T3 → T2 → T4
Are there others?
Test for View Serializability