5 Transaction Processing

The document outlines the principles of distributed database systems, covering topics such as transaction processing, concurrency control, reliability, and data replication. It discusses various algorithms for managing transactions, including locking-based and timestamp ordering methods, and addresses issues like deadlocks and multiversion concurrency control. The content is aimed at providing a comprehensive understanding of how distributed databases maintain consistency and reliability in the face of concurrent operations and failures.


Principles of Distributed Database Systems
M. Tamer Özsu
Patrick Valduriez

© 2020, M.T. Özsu & P. Valduriez 1


Outline
 Introduction
 Distributed and parallel database design
 Distributed data control
 Distributed Transaction Processing
 Data Replication
 Database Integration – Multidatabase Systems
 Parallel Database Systems
 Peer-to-Peer Data Management
 Big Data Processing
 NoSQL, NewSQL and Polystores
 Web Data Management
© 2020, M.T. Özsu & P. Valduriez 2
Outline
 Distributed Transaction Processing
 Distributed Concurrency Control
 Distributed Reliability

© 2020, M.T. Özsu & P. Valduriez 3


Transaction
 A database is in a consistent state if it obeys all of the
consistency (integrity) constraints defined over it.
 Transaction consistency, on the other hand, refers to the
operations of concurrent transactions: the database should
remain in a consistent state even when a number of user
requests are concurrently accessing (reading or updating) it.
 Reliability refers to both the resiliency of a system to various
types of failures and its capability to recover from them.
© 2020, M.T. Özsu & P. Valduriez 4
Transaction
Reliability refers to both the resiliency of a system to various
types of failures and its capability to recover from them.

 A resilient system is tolerant of system failures and can
continue to provide services even when failures occur.
 A recoverable DBMS is one that can get to a consistent state
(by moving back to a previous consistent state or forward to a
new consistent state) following various types of failures.

© 2020, M.T. Özsu & P. Valduriez 5


Transaction

 A transaction is a collection of actions that make consistent
transformations of system states while preserving system
consistency.
 concurrency transparency
 failure transparency

© 2020, M.T. Özsu & P. Valduriez 6


Transaction Characterization
Begin_transaction
  Read
  Read
  Write
  Read
Commit
 Read set (RS)
 The set of data items that are read by a transaction
 Write set (WS)
 The set of data items whose values are changed by this transaction
 Base set (BS)
 RS ∪ WS

© 2020, M.T. Özsu & P. Valduriez 7


Principles of Transactions

ATOMICITY
 all or nothing

CONSISTENCY
 no violation of integrity constraints

ISOLATION
 concurrent changes invisible → serializable

DURABILITY
 committed updates persist

© 2020, M.T. Özsu & P. Valduriez 8


Transactions

 Atomic and reliable execution in the presence of failures

 Correct execution in the presence of multiple user
accesses

 Correct management of replicas (if they support it)

© 2020, M.T. Özsu & P. Valduriez 9


Distributed TM Architecture

© 2020, M.T. Özsu & P. Valduriez 10


Outline
 Distributed Transaction Processing
 Distributed Concurrency Control

© 2020, M.T. Özsu & P. Valduriez 11


5.2 Distributed Concurrency Control
(p.188)
 The problem of synchronizing concurrent transactions
such that the consistency of the database is maintained
while, at the same time, maximum degree of concurrency
is achieved.
 Enforce isolation property
 Anomalies:
 Lost updates
 The effects of some transactions are not reflected on the database.
 Inconsistent retrievals
 A transaction, if it reads the same data item more than once, should
always read the same value.

© 2020, M.T. Özsu & P. Valduriez 12


Serializability in Distributed DBMS

 Two histories have to be considered:
 local histories
 global history
 For global transactions (i.e., global history) to be
serializable, two conditions are necessary:
 Each local history should be serializable → local serializability
 Two conflicting operations should be in the same relative order
in all of the local histories where they appear together →
global serializability

© 2020, M.T. Özsu & P. Valduriez 13


Global Non-serializability (p.188)
Example 5.1
T1: Read(x)           T2: Read(x)
    x ← x - 100           Read(y)
    Write(x)              Commit
    Read(y)
    y ← y + 100
    Write(y)
    Commit

 x stored at Site 1, y stored at Site 2


H1 = {R1(x), W1(x), R2(x)}   T1 -> T2
H2 = {R1(y), W1(y), R2(y)}   T1 -> T2
 H1 and H2 are individually serializable; since the conflicting operations
are in the same relative order at both sites, the two transactions are also
globally serializable.
© 2020, M.T. Özsu & P. Valduriez 14
Global Non-serializability (p.188)

H'1 = {R1(x), W1(x), R2(x)}   T1 -> T2
H'2 = {R2(y), R1(y), W1(y)}   T2 -> T1

 H’1, H’2 are individually serializable (in fact serial), but the
two transactions are not globally serializable.

© 2020, M.T. Özsu & P. Valduriez 15


Concurrency Control Algorithms

 Pessimistic
 Two-Phase Locking-based (2PL)
 Centralized (primary site) 2PL
 Primary copy 2PL
 Distributed 2PL
 Timestamp Ordering (TO)
 Basic TO
 Multiversion TO
 Conservative TO
 Optimistic
 Locking-based
 Timestamp ordering-based

© 2020, M.T. Özsu & P. Valduriez 16


5.2.1 Locking-Based Algorithms (p.189)

 Transactions indicate their intentions by requesting locks
from the scheduler (called lock manager).
 Locks are either read lock (rl) [also called shared lock] or
write lock (wl) [also called exclusive lock].
 Read locks and write locks conflict (because Read and
Write operations are incompatible):

        rl    wl
  rl   yes    no
  wl    no    no

 Locking works nicely to allow concurrent processing of
transactions (a minimal lock-table sketch follows below).
© 2020, M.T. Özsu & P. Valduriez 17
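
To make the compatibility matrix concrete, here is a minimal lock-table sketch in Python (not from the book); the LockManager class, its method names, and the strict release-at-end behavior are illustrative assumptions.

# Minimal lock-table sketch for the rl/wl compatibility matrix above (Python).
# Names (LockManager, acquire, release_all) are illustrative, not the book's.
from collections import defaultdict

COMPATIBLE = {("rl", "rl"): True, ("rl", "wl"): False,
              ("wl", "rl"): False, ("wl", "wl"): False}

class LockManager:
    def __init__(self):
        self.granted = defaultdict(list)   # item -> [(txn, mode), ...]

    def acquire(self, txn, item, mode):
        """Grant the lock only if it is compatible with locks held by others."""
        for holder, held in self.granted[item]:
            if holder != txn and not COMPATIBLE[(held, mode)]:
                return False               # caller must wait (or restart)
        self.granted[item].append((txn, mode))
        return True

    def release_all(self, txn):
        """Strict 2PL: release all of txn's locks at commit/abort time."""
        for item in self.granted:
            self.granted[item] = [(t, m) for t, m in self.granted[item] if t != txn]

lm = LockManager()
print(lm.acquire("T1", "x", "rl"), lm.acquire("T2", "x", "rl"))  # True True
print(lm.acquire("T3", "x", "wl"))                               # False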
5.2.1.1 Centralized 2PL (p.190)
 There is only one 2PL scheduler in the distributed system.
 Lock requests are issued to the central scheduler.

© 2020, M.T. Özsu & P. Valduriez 18


Centralized 2PL

For reference:
 Algorithm 5.1: Centralized 2PL Transaction Manager (C2PL-TM) (p.191)
 Algorithm 5.2: Centralized 2PL Lock Manager (C2PL-LM) (p.192)
 Algorithm 5.3: Data Processor (DP) (p.193)

© 2020, M.T. Özsu & P. Valduriez 19


5.2.1.2 Distributed 2PL (p.193)

 2PL schedulers are placed at each site. Each scheduler
handles lock requests for data at that site.
 A transaction may read any of the replicated copies of
item x, by obtaining a read lock on one of the copies of x.
Writing into x requires obtaining write locks for all copies
of x.

© 2020, M.T. Özsu & P. Valduriez 20
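
A minimal sketch of the read-one/write-all rule described above, assuming a copies map and a request_lock helper that stands in for a message to the remote 2PL scheduler (both are illustrative, not the book's interface):

# Read-one/write-all (ROWA) locking sketch for a replicated item (Python).
# The copies map and request_lock() stand in for messages to remote schedulers.
copies = {"x": ["site1", "site2", "site3"]}    # assumed replica placement

def request_lock(site, txn, item, mode):
    # Placeholder for a lock request sent to the 2PL scheduler at `site`;
    # assume it eventually answers True when the lock is granted.
    return True

def read_lock(txn, item):
    # A read lock on any single copy is enough.
    return request_lock(copies[item][0], txn, item, "rl")

def write_lock(txn, item):
    # A write lock must be obtained on every copy.
    return all(request_lock(s, txn, item, "wl") for s in copies[item])

print(read_lock("T1", "x"), write_lock("T2", "x"))   # True True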


Distributed 2PL Execution

© 2020, M.T. Özsu & P. Valduriez 21


Distributed 2PL Execution
The distributed 2PL transaction management algorithm is
similar to C2PL-TM, with two major modifications:
 The messages that are sent to the central site lock
manager in C2PL-TM are sent to the lock managers at all
participating sites in D2PL-TM.
 The operations are not passed to the data processors by
the coordinating transaction manager, but by the
participating lock managers.
=> This means that the coordinating transaction
manager does not wait for a “lock request granted”
message. The participating data processors send the
“end of operation” messages to the coordinating TM.
© 2020, M.T. Özsu & P. Valduriez 22


5.2.1.3 Deadlock (p.194)
 A transaction is deadlocked if it is blocked and will
remain blocked until there is intervention.
 Locking-based concurrency control algorithms may
cause deadlocks.
 TO-based algorithms that involve waiting may cause
deadlocks.
 Wait-for graph (WFG)
If transaction Ti waits for another transaction Tj to
release a lock on an entity, then Ti → Tj in WFG.

Ti → Tj

© 2020, M.T. Özsu & P. Valduriez 23


Local versus Global WFG
Example 5.2 (p.194)
 T1 and T2 run at site 1, T3 and T4 run at site 2.
 T3 waits for a lock held by T4 which waits for a lock held by T1 which
waits for a lock held by T2 which, in turn, waits for a lock held by T3.

Local WFG

Global WFG

© 2020, M.T. Özsu & P. Valduriez 24


Deadlock Detection

 Transactions are allowed to wait freely.
 Wait-for graphs and cycles.
 Topologies for deadlock detection algorithms
 Centralized
 Distributed
 Hierarchical

© 2020, M.T. Özsu & P. Valduriez 25
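
Deadlock detection on a WFG reduces to cycle detection. The sketch below is an assumption about representation only (the WFG as an adjacency map); it finds one cycle with a depth-first search.

# Deadlock detection sketch (Python): find a cycle in a wait-for graph.
# The WFG is an adjacency map Ti -> set of transactions that Ti waits for.
def find_cycle(wfg):
    WHITE, GREY, BLACK = 0, 1, 2
    color = {t: WHITE for t in wfg}

    def dfs(t, path):
        color[t] = GREY
        path.append(t)
        for u in wfg.get(t, ()):
            if color.get(u, WHITE) == GREY:          # back edge => cycle
                return path[path.index(u):] + [u]
            if color.get(u, WHITE) == WHITE:
                found = dfs(u, path)
                if found:
                    return found
        color[t] = BLACK
        path.pop()
        return None

    for t in list(wfg):
        if color[t] == WHITE:
            cycle = dfs(t, [])
            if cycle:
                return cycle
    return None

# Four transactions waiting in a ring, as in the global WFG example:
print(find_cycle({"T1": {"T2"}, "T2": {"T3"}, "T3": {"T4"}, "T4": {"T1"}}))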


Centralized Deadlock Detection

 One site is designated as the deadlock detector for the
system. Each scheduler periodically sends its local WFG
to the central site which merges them to a global WFG to
determine cycles.
 How often to transmit?
 Too often ⇒ higher communication cost but lower delays due to
undetected deadlocks
 Too late ⇒ higher delays due to deadlocks, but lower
communication cost
 Would be a reasonable choice if the concurrency control
algorithm is also centralized (centralized 2PL)
 Proposed for Distributed INGRES

© 2020, M.T. Özsu & P. Valduriez 26


Hierarchical Deadlock Detection

 Build a hierarchy of detectors.
 Deadlocks that are local to a single site would be detected
at that site using the LWFG.
 Each site also sends its LWFG to the deadlock detector at
the next level.

© 2020, M.T. Özsu & P. Valduriez 27


Distributed Deadlock Detection
Distributed deadlock detection algorithms delegate the responsibility of
detecting deadlocks to individual sites. Local deadlock detectors at each
site that communicate their LWFGs with one another (in fact, only the
potential deadlock cycles are transmitted)
 Sites cooperate in detection of deadlocks.
 One example:
 Form local WFGs at each site, modified as follows:
1) Potential deadlock cycles from other sites are
added as edges
2) Join these with edges in LWFGs depicting that
remote transactions are waiting for local ones

 Each local deadlock detector:
 looks for a cycle that does not involve the external edge. If it exists,
there is a local deadlock which can be handled locally.
 looks for a cycle involving the external edge. If it exists, it indicates a
potential global deadlock. Pass on the information to the next site.

© 2020, M.T. Özsu & P. Valduriez 28


5.2.2 Timestamp Ordering (p.197)
 Transaction Ti is assigned a globally unique timestamp ts(Ti).
 The transaction manager attaches the timestamp to all
operations issued by the transaction.
 Each data item is assigned a write timestamp (wts) and a
read timestamp (rts):
 rts(x) = largest timestamp of any read on x
 wts(x) = largest timestamp of any write (update) on x
 Conflicting operations are resolved by timestamp order.
© 2020, M.T. Özsu & P. Valduriez 29
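
One common way to obtain globally unique timestamps, consistent with the slide but not prescribed by it, is to pair a monotonically increasing local counter with the site identifier; the TimestampGenerator class below is an illustrative sketch.

# Globally unique timestamp sketch (Python): a (local counter, site id) pair.
# Ties on the counter are broken by the site id, so no two sites ever
# generate the same timestamp. Class and method names are illustrative.
import itertools

class TimestampGenerator:
    def __init__(self, site_id):
        self.site_id = site_id
        self.counter = itertools.count(1)    # monotonically increasing locally

    def next_ts(self):
        return (next(self.counter), self.site_id)

site1, site2 = TimestampGenerator(1), TimestampGenerator(2)
print(site1.next_ts(), site2.next_ts())   # (1, 1) (1, 2): distinct and ordered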
Basic Timestamp Ordering
Two conflicting operations Oij of Ti and Okl of Tk → Oij executed before
Okl iff ts(Ti) < ts(Tk).
 Ti is called older transaction
 Tk is called younger transaction

 Basic TO algorithm never causes operations to wait, but
instead, restarts them.
© 2020, M.T. Özsu & P. Valduriez 30
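
A minimal sketch of the basic TO accept/restart rules, assuming in-memory rts/wts dictionaries and a caller that restarts the transaction with a new timestamp whenever "restart" is returned:

# Basic TO sketch (Python): accept an operation only if it is not "too late"
# with respect to rts(x)/wts(x); otherwise the transaction is restarted
# (with a new timestamp, which is left to the caller here).
rts, wts = {}, {}        # read / write timestamps per data item

def read(ts, x):
    if ts < wts.get(x, float("-inf")):
        return "restart"            # a younger transaction already wrote x
    rts[x] = max(rts.get(x, float("-inf")), ts)
    return "accept"

def write(ts, x):
    if ts < rts.get(x, float("-inf")) or ts < wts.get(x, float("-inf")):
        return "restart"            # a younger transaction already read or wrote x
    wts[x] = ts
    return "accept"

print(write(5, "x"), read(3, "x"), read(7, "x"), write(6, "x"))
# -> accept restart accept restart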
5.2.2.2 Conservative Timestamp
Ordering (p.201)
 Basic timestamp ordering tries to execute an operation
as soon as it receives it
 progressive
 too many restarts since there is no delaying
 Conservative timestamping delays each operation until
there is an assurance that it will not be restarted
 Assurance?
 No other operation with a smaller timestamp can arrive at the
scheduler
 Note that the delay may result in the formation of deadlocks

© 2020, M.T. Özsu & P. Valduriez 31


5.2.3 Multiversion Concurrency Control
(MVCC)

 Each version of a data item that is created is labeled with
the timestamp of the transaction that creates it
-> Do not modify the values in the database; create
new values instead.
 Typically timestamp-based implementation
 Implemented in a number of systems: IBM DB2, Oracle,
SQL Server, SAP HANA, BerkeleyDB, PostgreSQL

© 2020, M.T. Özsu & P. Valduriez 32


MVCC Reads

 A Ri(x) is translated into a read on one version of x:
 Find a version of x (say xv) such that ts(xv) is the largest
timestamp less than ts(Ti).

© 2020, M.T. Özsu & P. Valduriez 33
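
A minimal sketch of this version-selection rule, assuming the versions of x are kept as a list of (write timestamp, value) pairs sorted by timestamp (an illustrative in-memory structure, not the book's implementation):

# MVCC read sketch (Python): Ri(x) reads the version of x whose timestamp is
# the largest one smaller than ts(Ti).
import bisect

versions_of_x = [(2, "a"), (5, "b"), (9, "c")]   # (wts of version, value)

def mvcc_read(ts_ti):
    idx = bisect.bisect_left(versions_of_x, (ts_ti,)) - 1
    if idx < 0:
        return None                  # no version older than the reading txn
    return versions_of_x[idx][1]

print(mvcc_read(7))   # 'b': the version with wts 5, the largest timestamp < 7
print(mvcc_read(1))   # None: there is no version with timestamp < 1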


MVCC Writes

 A Wi(x) is translated into Wi(xw) and accepted if the
scheduler has not yet processed any Rj(xr) such that
ts(Ti) < ts(xr) < ts(Tj)
 If accepted, a new version xw with timestamp ts(Ti) is created.

© 2020, M.T. Özsu & P. Valduriez 34


5.2.4 Optimistic Concurrency Control
Algorithms (p.205)
Optimistic algorithms assume that transaction conflicts and
contention for data will not be predominant.

 Each transaction follows five phases: read (R), execute
(E), write (W), validate (V), and commit (C); the commit
phase becomes abort if the transaction is not validated.
 This algorithm assigns timestamps to transactions at the
beginning of their validation step rather than at the
beginning of transactions as is done with (pessimistic) TO
algorithms.
© 2020, M.T. Özsu & P. Valduriez 35
Optimistic Concurrency Control
Algorithms
 Transaction execution model: divide Ti into subtransactions,
each of which executes at a site
 T_i^s: subtransaction of Ti that executes at site s
 At the beginning of the validation phase a timestamp is
assigned to the transaction (and is also the timestamp of its
subtransactions)
 The local validation of T_i^s is performed according to the
following rules:

© 2020, M.T. Özsu & P. Valduriez 36


Optimistic CC Validation Test

 Rule 1: At each site s, if all transactions T_k^s where
ts(T_k^s) < ts(T_i^s) have completed their write phase
before T_i^s has started its read phase (Fig. 5.9a),
validation succeeds, because the transaction executions
are in serial order.

© 2020, M.T. Özsu & P. Valduriez 37


Optimistic CC Validation Test

 Rule 2: At each site s, if there is any transaction T_k^s such that
ts(T_k^s) < ts(T_i^s), and which completes its write phase
(commit) while T_i^s is in its read phase (Fig. 5.9b), the
validation succeeds if WS(T_k^s) ∩ RS(T_i^s) = ∅.
 The updates of T_i^s will not be overwritten by the
updates of T_k^s.
© 2020, M.T. Özsu & P. Valduriez 38
Optimistic CC Validation Test

 Rule 3: At each site s, if there is any transaction T_k^s such that
ts(T_k^s) < ts(T_i^s), and which completes its read phase
before T_i^s completes its read phase (Fig. 5.9c), the
validation succeeds if WS(T_k^s) ∩ RS(T_i^s) = ∅ and
WS(T_k^s) ∩ WS(T_i^s) = ∅.
 This simply requires that the updates of T_k^s not affect the read
phase or the write phase of T_i^s.

© 2020, M.T. Özsu & P. Valduriez 39
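
A sketch of the three local validation rules above, assuming each (sub)transaction is described by its read/write sets and the boundary times of its read and write phases (the dictionary layout is illustrative):

# Sketch of the three local validation rules (Python). Each (sub)transaction
# is described by its read/write sets and the boundary times of its phases;
# this representation is an assumption made only for illustration.
def validate(ti, older):
    """Validate ti against every older transaction tk with ts(tk) < ts(ti)."""
    for tk in older:
        if tk["write_end"] <= ti["read_start"]:
            continue                                   # Rule 1: serial order
        if tk["write_end"] <= ti["read_end"]:          # Rule 2: tk commits while
            if tk["WS"] & ti["RS"]:                    # ti is in its read phase
                return False
        elif tk["read_end"] <= ti["read_end"]:         # Rule 3: read phases overlap
            if (tk["WS"] & ti["RS"]) or (tk["WS"] & ti["WS"]):
                return False
        else:
            return False                               # ordering cannot be guaranteed
    return True

ti = {"RS": {"x"}, "WS": {"y"}, "read_start": 10, "read_end": 20, "write_end": 25}
tk = {"RS": {"z"}, "WS": {"x"}, "read_start": 5,  "read_end": 12, "write_end": 18}
print(validate(ti, [tk]))   # False: tk wrote x while ti was reading x (Rule 2)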


5.3 Snapshot Isolation (SI)

 Each transaction “sees” a consistent snapshot of the
database when it starts, and reads/writes this snapshot
 Repeatable reads, but not serializable isolation
 Read-only transactions proceed without significant
synchronization overhead
 Centralized SI-based CC (a commit-test sketch follows below)
1) Ti starts, obtains a begin timestamp tsb(Ti)
2) Ti ready to commit, obtains a commit timestamp tsc(Ti) that is greater
than any of the existing tsb or tsc
3) Ti commits if there is no other Tj such that tsc(Tj) ∈ [tsb(Ti), tsc(Ti)];
otherwise it is aborted (first committer wins)
4) When Ti commits, its changes become visible to all Tk where tsb(Tk) > tsc(Ti)

© 2020, M.T. Özsu & P. Valduriez 40
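
A sketch of the centralized SI commit test; the write-set overlap check used here is the usual refinement behind "first committer wins" and is an assumption not spelled out on the slide.

# Centralized SI commit-test sketch (Python), "first committer wins".
import itertools

clock = itertools.count(1)
committed = []               # (tsb, tsc, write_set) of committed transactions

def begin():
    return next(clock)                        # tsb(Ti)

def try_commit(tsb_i, write_set):
    tsc_i = next(clock)                       # tsc(Ti), larger than all existing
    for tsb_j, tsc_j, ws_j in committed:
        if tsb_i <= tsc_j <= tsc_i and ws_j & write_set:
            return None                       # abort: a concurrent writer won
    committed.append((tsb_i, tsc_i, write_set))
    return tsc_i

t1, t2 = begin(), begin()
print(try_commit(t1, {"x"}))   # a commit timestamp: T1 commits
print(try_commit(t2, {"x"}))   # None: T1 committed x within [tsb(T2), tsc(T2)]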


Distributed CC with SI
 Computing a consistent distributed snapshot is hard
 Similar rules to serializability
 Each local history should be SI
 Global history is SI → commitment orders at each site are the same
 Dependence relationship: Ti at site s (T_i^s) is dependent on T_j^s
(dependent(T_i^s, T_j^s)) iff
(RS(T_i^s) ∩ WS(T_j^s) ≠ ∅) ∨ (WS(T_i^s) ∩ RS(T_j^s) ≠ ∅) ∨ (WS(T_i^s) ∩ WS(T_j^s) ≠ ∅)
 Conditions
1) dependent(Ti, Tj) ∧ tsb(T_i^s) < tsc(T_j^s) ⟹ tsb(T_i^t) < tsc(T_j^t) at every
site t where Ti and Tj execute together
2) dependent(Ti, Tj) ∧ tsc(T_i^s) < tsc(T_j^s) ⟹ tsc(T_i^t) < tsb(T_j^t) at every
site t where Ti and Tj execute together
3) tsc(T_i^s) < tsc(T_j^s) ⟹ tsc(T_i^t) < tsb(T_j^t) at every site t where Ti and Tj
execute together
© 2020, M.T. Özsu & P. Valduriez 41
Distributed CC with SI – Executing Ti
[Fig.: commit processing between the coordinating TM and each site s]
 Each site s checks whether the first two conditions hold, answers positive
or negative, and updates its event clock to max(own, coordinating TM).
 The coordinating TM waits for the answers, checks whether any is negative,
and updates its event clock to the max of the event clocks of all sites s.
 If the decision is global commit, each site persists Ti's updates and
updates its event clock.
© 2020, M.T. Özsu & P. Valduriez 42
Outline
 Distributed Transaction Processing

 Distributed Reliability

© 2020, M.T. Özsu & P. Valduriez 43


Reliability

Problem: How to maintain the
 atomicity
 durability
properties of transactions

© 2020, M.T. Özsu & P. Valduriez 44


Types of Failures

 Transaction failures
 Transaction aborts (unilaterally or due to deadlock)
 System (site) failures
 Failure of processor, main memory, power supply, …
 Main memory contents are lost, but secondary storage contents
are safe
 Partial vs. total failure
 Media failures
 Failure of secondary storage devices → stored data is lost
 Head crash/controller failure
 Communication failures
 Lost/undeliverable messages
 Network partitioning
© 2020, M.T. Özsu & P. Valduriez 45
Distributed Reliability Protocols

 Commit protocols
 How to execute commit command for distributed transactions.
 Issue: how to ensure atomicity and durability?
 Termination protocols
 If a failure occurs, how can the remaining operational sites deal with it.
 Non-blocking: the occurrence of failures should not force the sites to
wait until the failure is repaired to terminate the transaction.
 Recovery protocols
 When a failure occurs, how do the sites where the failure occurred deal
with it.
 Independent: a failed site can determine the outcome of a transaction
without having to obtain remote information.
 Independent recovery ⇒ non-blocking termination

© 2020, M.T. Özsu & P. Valduriez 46


Two-Phase Commit (2PC)

 Phase 1: The coordinator gets the participants ready to
write the results into the database
 Phase 2: Everybody writes the results into the database
 Coordinator: The process at the site where the transaction
originates and which controls the execution
 Participant: The process at the other sites that participate in
executing the transaction
Global Commit Rule:
 The coordinator aborts a transaction if and only if at least one
participant votes to abort it.
 The coordinator commits a transaction if and only if all of the
participants vote to commit it.

© 2020, M.T. Özsu & P. Valduriez 47
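
A coordinator-side sketch of the global commit rule above, with message exchange replaced by direct function calls to participants (an illustrative simplification; logging, timeouts, and the termination protocol are omitted):

# Coordinator-side sketch of the 2PC global commit rule (Python).
def two_phase_commit(participants):
    # Phase 1: send "prepare" and collect the votes.
    votes = [p("prepare") for p in participants]
    decision = "global-commit" if all(v == "vote-commit" for v in votes) \
               else "global-abort"
    # Phase 2: broadcast the decision; participants then apply or undo.
    for p in participants:
        p(decision)
    return decision

always_yes = lambda msg: "vote-commit" if msg == "prepare" else "ack"
always_no  = lambda msg: "vote-abort"  if msg == "prepare" else "ack"
print(two_phase_commit([always_yes, always_yes]))   # global-commit
print(two_phase_commit([always_yes, always_no]))    # global-abort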


State Transitions in 2PC

Coordinator / Participant state transition diagrams (Fig. 5.10)

© 2020, M.T. Özsu & P. Valduriez 48


Centralized 2PC

Fig. 5.12
C: Coordinator, P: Participant
© 2020, M.T. Özsu & P. Valduriez 49
2PC Protocol Actions

centralized 2PC

Fig. 5.11

© 2020, M.T. Özsu & P. Valduriez 50


Linear 2PC

V-C: Vote-Commit, V-A: Vote-Abort, G-C: Global-commit, G-A: Global-abort

© 2020, M.T. Özsu & P. Valduriez 51


Distributed 2PC
Fig. 5.14 Distributed 2PC communication structure
 Each participant sends its vote to all other participants (and the
coordinator), so every site can determine the global decision on its own;
there is no need for the second phase of the protocol.


© 2020, M.T. Özsu & P. Valduriez 52
Variations of 2PC

To improve performance:
1) Reduce the number of messages between coordinator &
participants
2) Reduce the number of times logs are written
 Presumed Abort 2PC
 Participant polls coordinator about transaction’s outcome
 No information → abort the transaction
 Presumed Commit 2PC
 Participant polls coordinator about transaction’s outcome
 No information → assume transaction is committed
 Not an exact dual of presumed abort 2PC

© 2020, M.T. Özsu & P. Valduriez 53


Site Failures - 2PC Termination
The termination protocols serve the timeouts for both the
coordinator and the participant processes.

Coordinator Timeouts

 Timeout in WAIT
 Cannot unilaterally commit
 Can unilaterally abort
 Timeout in ABORT or COMMIT
 Stay blocked and wait for the acks

© 2020, M.T. Özsu & P. Valduriez 54


Site Failures - 2PC Termination

Participant
Participant Timeouts

 Timeout in INITIAL
 Coordinator must have failed
in INITIAL state
 Unilaterally abort
 Timeout in READY
 Stay blocked

© 2020, M.T. Özsu & P. Valduriez 55


Site Failures - 2PC Recovery

Coordinator
 Failure in INITIAL
 Start the commit process upon
recovery
 Failure in WAIT
 Restart the commit process upon
recovery
 Failure in ABORT or COMMIT
 Nothing special if all the acks have
been received
 Otherwise the termination protocol is
involved

© 2020, M.T. Özsu & P. Valduriez 56


Site Failures - 2PC Recovery

Participant
 Failure in INITIAL
 Unilaterally abort upon recovery
 Failure in READY
 The coordinator has been
informed about the local decision
 Treat as timeout in READY state
and invoke the termination
protocol
 Failure in ABORT or COMMIT
 Nothing special needs to be done

© 2020, M.T. Özsu & P. Valduriez 57


2PC Recovery Protocols –
Additional Cases
Arise due to non-atomicity of log and message send
actions
 Coordinator site fails after writing “begin_commit” log
and before sending “prepare” command
 treat it as a failure in WAIT state; send “prepare” command
 Participant site fails after writing “ready” record in log but
before “vote-commit” is sent
 treat it as failure in READY state
 alternatively, can send “vote-commit” upon recovery
 Participant site fails after writing “abort” record in log but
before “vote-abort” is sent
 no need to do anything upon recovery
© 2020, M.T. Özsu & P. Valduriez 58
2PC Recovery Protocols –
Additional Case

 Coordinator site fails after logging its final decision


record but before sending its decision to the participants
 coordinator treats it as a failure in COMMIT or ABORT state
 participants treat it as timeout in the READY state
 Participant site fails after writing “abort” or “commit”
record in log but before acknowledgement is sent
 participant treats it as failure in COMMIT or ABORT state
 coordinator will handle it by timeout in COMMIT or ABORT state

© 2020, M.T. Özsu & P. Valduriez 59


Problem With 2PC

 Blocking
 Ready implies that the participant waits for the coordinator
 If coordinator fails, site is blocked until recovery
 Blocking reduces availability
 Independent recovery is not possible
 However, it is known that:
 Independent recovery protocols exist only for single site failures;
no independent recovery protocol exists which is resilient to
multiple-site failures.
 So we search for these protocols – 3PC

© 2020, M.T. Özsu & P. Valduriez 60


Three-Phase Commit

 3PC is non-blocking.
 A commit protocol is non-blocking iff
 it is synchronous within one state transition, and
 its state transition diagram contains
 no state which is “adjacent” to both a commit and an abort state,
and
 no non-committable state which is “adjacent” to a commit state
 Adjacent: possible to go from one state to another with a
single state transition
 Committable: all sites have voted to commit a
transaction
 e.g.: COMMIT state

© 2020, M.T. Özsu & P. Valduriez 61


State Transitions in 3PC

Coordinator / Participant state transition diagrams

3PC Protocol Actions

© 2020, M.T. Özsu & P. Valduriez 63


Network Partitioning

 Simple partitioning
 Only two partitions
 Multiple partitioning
 More than two partitions
 Formal bounds:
 There exists no non-blocking protocol that is resilient to a
network partition if messages are lost when partition occurs.
 There exist non-blocking protocols which are resilient to a single
network partition if all undeliverable messages are returned to
sender.
 There exists no non-blocking protocol which is resilient to a
multiple partition.

© 2020, M.T. Özsu & P. Valduriez 64


Independent Recovery Protocols for
Network Partitioning

 No general solution possible
 allow one group to terminate while the other is blocked
 improve availability
 How to determine which group to proceed?
 The group with a majority
 How does a group know if it has majority?
 Centralized
 Whichever partition contains the central site should terminate the
transaction
 Voting-based (quorum)

© 2020, M.T. Özsu & P. Valduriez 65


Quorum Protocols

 The network partitioning problem is handled by the
commit protocol.
 Every site is assigned a vote Vi.
 Total number of votes in the system V
 Abort quorum Va, commit quorum Vc
 Va + Vc > V where 0 ≤ Va , Vc ≤ V
 Before a transaction commits, it must obtain a commit quorum Vc
 Before a transaction aborts, it must obtain an abort quorum Va

© 2020, M.T. Özsu & P. Valduriez 66
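
A sketch of the quorum condition above, assuming equal vote weights per site; because Va + Vc > V, commit and abort quorums can never be formed in two disjoint partitions at the same time.

# Quorum check sketch (Python), with assumed equal vote weights per site.
site_votes = {"s1": 1, "s2": 1, "s3": 1, "s4": 1, "s5": 1}   # assumed weights
V = sum(site_votes.values())                                  # V = 5
Va, Vc = 3, 3                                                 # Va + Vc = 6 > V

def can_commit(reachable_sites):
    return sum(site_votes[s] for s in reachable_sites) >= Vc

def can_abort(reachable_sites):
    return sum(site_votes[s] for s in reachable_sites) >= Va

print(can_commit({"s1", "s2", "s3"}))   # True: majority partition may commit
print(can_abort({"s4", "s5"}))          # False: minority partition stays blocked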


Paxos Consensus Protocol

 General problem: how to reach an agreement
(consensus) among TMs about the fate of a transaction
 2PC and 3PC are special cases
 General idea: If a majority reaches a decision, the global
decision is reached (like voting)
 Roles:
 Proposer: recommends a decision
 Acceptor: decides whether to accept the proposed decision
 Learner: discovers the agreed-upon decision by asking or it is
pushed

© 2020, M.T. Özsu & P. Valduriez 67


Paxos & Complications

 Naïve Paxos: one proposer
 Operates like a 2PC
 Complications
 Multiple proposers can exist at the same time; acceptor has to
choose
 Attach a ballot number
 Multiple proposals may result in split votes with no majority
 Run multiple consensus rounds → performance implication
 Choose a leader
 Some acceptors fail after they accept a decision; the remaining
acceptors may not constitute a majority
 Use ballot numbers

© 2020, M.T. Özsu & P. Valduriez 68


Basic Paxos – No Failures
[Fig.: message flow between the Proposer (or Leader) and the Acceptors]
 The proposer sends prepare(bal) to the acceptors.
 An acceptor records bal and replies with an Ack if bal is greater than any
ballot it has previously received; the Ack carries any previously accepted
proposal (bal', val'). Otherwise it ignores the message.
 Once the proposer has Acks from a majority: if no Ack reports a previous
proposal, it sets val to the value it wants and nbal ← bal; otherwise it
sets val ← val' and nbal ← bal.
 The proposer then sends accept(nbal, val); an acceptor records
accepted(nbal, val) if nbal still matches the ballot it acknowledged,
and ignores the message otherwise.
© 2020, M.T. Özsu & P. Valduriez 69
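
A minimal single-decree acceptor sketch matching the prepare/accept exchange described above; the message tuples and method names are illustrative assumptions, and the proposer side (choosing the value from the highest-ballot Ack of a majority) is omitted.

# Single-decree Paxos acceptor sketch (Python) for the prepare/accept flow above.
class Acceptor:
    def __init__(self):
        self.promised_bal = -1     # highest ballot acknowledged in prepare
        self.accepted = None       # (ballot, value) of the last accepted proposal

    def prepare(self, bal):
        if bal > self.promised_bal:
            self.promised_bal = bal
            return ("ack", self.accepted)    # report any earlier accepted proposal
        return ("ignore", None)

    def accept(self, bal, val):
        if bal >= self.promised_bal:         # still the ballot we promised
            self.promised_bal = bal
            self.accepted = (bal, val)
            return "accepted"
        return "ignore"

a = Acceptor()
print(a.prepare(1))            # ('ack', None)
print(a.accept(1, "commit"))   # 'accepted'
print(a.prepare(0))            # ('ignore', None): an older ballot is ignored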


Basic Paxos with Failures

 Some acceptors fail but there is quorum
 Not a problem
 Enough acceptors fail to eliminate quorum
 Run a new ballot
 Proposer/leader fails
 Choose a new leader and start a new ballot

© 2020, M.T. Özsu & P. Valduriez 70
