Class 14
Class 14
Class 14
Slides based on “Database Management Systems” 3rd ed, Ramakrishnan and Gehrke
What are Transactions?
So far, we looked at individual queries; in practice, a task
consists of a sequence of actions
E.g., “Transfer $1000 from account A to account B”
Subtract $1000 from account A
Subtract transfer fee from account A
Credit $1000 to account B
A transaction is the DBMS’s view of a user program:
Must be interpreted as “unit of work”: either entire transaction
executes, or no part of it executes/has any effect on DBMS
Two special final actions: COMMIT or ABORT
2
Concurrent Execution
DBMS receives large numbers of concurrent requests
Concurrent (or parallel) execution improves performance
Two transactions are concurrent if they overlap in time.
Disk accesses are frequent, and relatively slow; CPU can do a lot of
work while waiting for the disk, or even SSD
Goal is to increase/maximize system throughput
Number of transactions executed per time unit
Concurrency control
Protocols that ensure things execute correctly in parallel
Broad and difficult challenge that goes beyond DBMS realm
OS, Distributed Programming, hardware scheduling (CPU registers), etc
Our focus is DBMS, but some principles span beyond DBMS
3
Major Example: the web app
Concurrent web
requests from
users
Web layer
JDBC
Database server Database
Web app in execution (CS636)
To keep transactions executing concurrently, yet isolated
from each other, each has own objects related to DB data
Transaction Transaction
Thread using Thread using
objects objects
Employee objects
Database
Cache Employee rows
(rows)
Database
On disk
Web app Transactions
Each application action turns into a database transaction
A well-designed app has a “service API” describing those
actions
A request execution calls the service API one or more times.
Each service call represents an application action and contains
a transaction
Thus transactions are contained in request-response cycles
This ensures that transactions are short-lived, good for
performance
But they still can run concurrently under high-enough load
The web app service API
Concurrent web
requests from
users
Web layer
JDBC
Database server Database
ACID Properties
Transaction Management must fulfill four requirements:
1. Atomicity: either all actions within a transaction are carried
out, or none is
Only actions of committed transactions must be visible
2. Consistency: concurrent execution must leave DBMS in
consistent state
3. Isolation: each transaction is protected from effects of other
concurrent transactions
Net effect is that of some sequential execution
4. Durability: once a transaction commits, DBMS changes will
persist
Conversely, if a transaction aborts/is aborted, there are no effects
8
Roles of Transaction Manager
Concurrency Control
Ensuring correct execution in the presence of multiple transactions
running in parallel
Crash recovery
Ensure that atomicity is preserved if the system crashes while one
or more transactions are still incomplete
Main idea is to keep a log of operations; every action is logged
before execution (Write-Ahead Log or WAL)
9
Modeling Transactions
User programs may carry out many operations …
Data-related computations
Prompting user for input, handling web requests
… but the DBMS is only concerned about what data is
read/written from/to the database
A transaction is abstracted by a sequence of time-ordered
read and write actions
e.g., R(X), R(Y), W(X), W(Y)
R=read, W=write, data element in parentheses
Each individual action is indivisible, or atomic
SQL UPDATE = R(X) W(X)
10
Important dataflow assumptions
Transactions interact with one another as they run only via
database read and write operations.
No messages exchanged between transactions
No use of shared memory between transactions
Oracle, other DBs, enforce this
Transactions may accept information from the
environment when they start and return information to
the environment when they finish by committing.
The agent that starts a transaction will come to know whether it
committed or aborted, and can act on that information.
Thus it is possible for data to go from one transaction to the
environment and then to another starting transaction, but note
that these transactions are not concurrent.
Scheduling Transactions
12
Serializable schedule example
13
If execution is not serializable…
Non-serializable concurrent executions can show
anomalies, i.e., clearly bad behavior
Let’s look at some examples
Concurrency: lost update anomaly
Consider two transactions (in a really bad DB) where A = 100
T1: A = A + 100
T2: A = A + 100
16
Concurrency: when things go wrong (2/3)
Assume that initially there are $500 in both accounts
Consider a possible interleaving or schedule
CORRECT
17
Concurrency: when things go wrong (3/3)
Consider another interleaving or schedule:
T1: A=A+100, B=B-100
T2: A=1.06*A, B=1.06*B
WRONG!!!
The DBMS view
T1: R(A), W(A), R(B), W(B)
T2: R(A), W(A), R(B), W(B)
18
Concurrent Execution Anomalies
20
RW Conflicts
Unrepeatable Reads
21
WW Conflicts
Overwriting Uncommitted Data
T1: W(A), W(B), Commit
T2: W(A), W(B), Commit
22
Scheduling Transactions: recall terminology
Serial schedule: no interleaving of transactions
Safe, but poor performance!
23
Conflict Serializable Schedules
Two schedules are conflict equivalent if:
Involve the same actions of the same transactions
Every pair of conflicting actions is ordered the same way
Schedule S is conflict serializable if S is conflict equivalent
to some serial schedule
A conflict serializable schedule is serializable (to be
shown in future classes)
Some other schedules are also serializable
Why is serializability important?
25
Strict Two-Phase Locking (Strict 2PL)
Protocol steps
Each transaction must obtain a S (shared) lock on object before
reading, and an X (exclusive) lock on object before writing.
All locks held are released when the transaction completes
(Non-strict) 2PL: Release locks anytime, but cannot acquire locks after
releasing any lock.
28
Strict 2PL Example (red op is blocked)
T1: S(A) R(A) S(B)
T2: S(A) R(A) X(B)
29
Aborting Transactions
When Ti is aborted, all its actions have to be undone
if Tj reads an object last written by Ti, Tj must be aborted as well!
cascading aborts can be avoided with 2PL by releasing locks only at
commit (Strict 2PL)
If Ti writes an object, Tj can read this only after Ti commits
This also means the schedule is “recoverable”: transactions commit only
after all transactions whose changes they read commit.
In general, recoverable and serializable are separate properties of
concurrency protocols, but Strict 2PL has both.
30
Deadlocks
Cycle of transactions waiting for locks to be released by
each other
31
Locking Performance
Lock-based schemes rely on two mechanisms
Blocking
Aborting
Both blocking and aborting cause performance overhead
Transactions may have to wait
Transactions may need to be re-executed
How does blocking affect throughput?
First few transactions do not conflict – no blocking
Parallel execution, performance increase
As more transactions execute, blocking occurs
After a point, adding more transactions decreases throughput!
Locking Performance (2)
Thrashing
36
Setting Transaction Properties in SQL
Access Mode
READ ONLY vs READ WRITE
Isolation Level (decreasing level of concurrency)
Level Dirty Read Unrepeatable Phantom
Read
READ UNCOMMITTED Possible Possible Possible
READ COMMITTED No Possible Possible
REPEATABLE READ No No Possible
SERIALIZABLE No No No
37
Isolation Levels in Practice
Databases default to RC, read-committed, so many apps
run that way, can have their read data changed, and
phantoms
Web apps (JEE, anyway) have a hard time overriding RC,
so most are running at RC
The 2PL locking scheme we studied was for RR,
repeatable read: transaction takes long term read and
write locks
Long term = until commit of that transaction
Read Committed (RC) Isolation
2PL can be modified for RC: take long-term write locks
but not long term read locks
Reads are atomic as operations, but that’s it
Lost updates can happen in RC: system takes 2PC locks
only for the write operations:
R1(A)R2(A)W2(B)C2W1(B)C1
R1(A)R2(A)X2(B)W2(B)C2X1(B)W1(B)C1 (RC isolation)
Update statements are atomic, so that case of read-then-
write is safe even at RC
Update T set A = A + 100 (safe at RC isolation)
Remember to use update when possible!
Syntax for SQL
SET TRANSACTION ISOLATION LEVEL
SERIALIZABLE READ WRITE
Note:
READ UNCOMITTED cannot be READ WRITE
40