Commit Protocols Non-Blocking Commit Protocols
3-phase Commit
Lemma: If a protocol contains a local state of a site with
both abort and commit states in its concurrency set, then
under independent recovery conditions it is not resilient to
an arbitrary single failure.
In the previous figure, C(w2) can contain both abort and commit
states.
To make the protocol non-blocking, introduce a buffer state (p)
at both the coordinator and the cohorts.
Now, C(w1) = {q2, w2, a2} and C(w2) = {a1, p1, w1}.
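The lemma's condition can be checked mechanically. The following is a minimal sketch, not part of the original slides: the helper names are hypothetical, and the 2-phase concurrency set for w2 is an assumption, since the previous figure is not reproduced here.

```python
def kind(state):
    """Classify a state by its leading letter: 'a' is abort, 'c' is commit."""
    return {"a": "abort", "c": "commit"}.get(state[0], "other")


def has_abort_and_commit(concurrency_set):
    """True if the set holds both an abort and a commit state (the lemma's condition)."""
    kinds = {kind(s) for s in concurrency_set}
    return {"abort", "commit"} <= kinds


# Assumed 2-phase concurrency set for w2 (illustrative only).
c_w2_two_phase = {"w1", "a1", "c1"}

# Concurrency sets stated on the slide, after adding the buffer state.
c_w1_three_phase = {"q2", "w2", "a2"}
c_w2_three_phase = {"a1", "p1", "w1"}
```

Under these assumptions, `has_abort_and_commit(c_w2_two_phase)` is true, while both 3-phase sets fail the condition, which is what the buffer state is meant to achieve.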
3-phase commit: State Machine
Coordinator:
  q1 -> w1: Commit_Request message sent to all cohorts
  w1 -> a1: one or more Abort replies / Abort msg sent to all cohorts
  w1 -> p1: all Agreed / Prepare msg sent to all cohorts
  p1 -> c1: all cohorts Ack / Commit msg sent to all cohorts
Cohort i:
  qi -> wi: Commit_Request received / Agreed msg sent
  qi -> ai: Commit_Request received / Abort msg sent
  wi -> ai: Abort received from coordinator
  wi -> pi: Prepare msg received / Ack sent
  pi -> ci: Commit received from coordinator
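The failure-free state machines can be written as transition tables. This is a minimal sketch: the event names (`all_agreed`, `prepare_received`, and so on) are hypothetical labels chosen for illustration, not identifiers from the slides.

```python
# (state, event) -> next state, for the failure-free protocol.
COORDINATOR = {
    ("q1", "commit_request_sent"): "w1",
    ("w1", "abort_reply"): "a1",
    ("w1", "all_agreed"): "p1",
    ("p1", "all_acked"): "c1",
}

COHORT = {
    ("q", "agree"): "w",             # Commit_Request received, Agreed sent
    ("q", "refuse"): "a",            # Commit_Request received, Abort sent
    ("w", "abort_received"): "a",
    ("w", "prepare_received"): "p",  # Ack sent
    ("p", "commit_received"): "c",
}


def run(fsa, start, events):
    """Follow the given events from `start` and return the final state."""
    state = start
    for event in events:
        state = fsa[(state, event)]  # KeyError means an illegal transition
    return state
```

For example, `run(COHORT, "q", ["agree", "prepare_received", "commit_received"])` ends in the commit state `"c"`.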
Failure, Timeout Transitions
A failure transition occurs at a failed site at the instant it
fails or immediately after it recovers from the failure.
Rule for failure transition: For every non-final state s (i.e., qi, wi,
pi) in the protocol, if C(s) contains a commit state, then assign a
failure transition from s to a commit state in its FSA. Otherwise,
assign a failure transition from s to an abort state.
Reason: pi is the only state with a commit state in its concurrency
set. If a site fails in pi, it can commit on recovery. For a failure
in any other state, it is safer to abort.
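The failure-transition rule can be sketched as a small function over concurrency sets. The function name is hypothetical, and C(p2) below is an illustrative assumption: the slides state only that the concurrency set of pi contains a commit state, not its full contents.

```python
def failure_transition(state, concurrency):
    """Apply the rule: commit if C(state) contains a commit state, else abort."""
    has_commit = any(s.startswith("c") for s in concurrency[state])
    return "commit" if has_commit else "abort"


CONCURRENCY = {
    "w2": {"a1", "p1", "w1"},  # given earlier in the slides
    "p2": {"p1", "c1"},        # assumption: contains a commit state
}
```

With these sets, a site failing in w2 aborts on recovery and a site failing in p2 commits.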
If site i is waiting for a message from site j, i can time out. From
the message it was expecting, i can determine the state j was in.
Given j's state, the final state of j can be determined using the
failure transition at j.
This can be used to derive the timeout transitions at i.
Rule for timeout transition: For each non-final state s, if site j is
in S(s) (the set of sites from which a message can be received in
state s), and site j has a failure transition to a commit (abort)
state, then assign a timeout transition from s to a commit (abort)
state.
Reason:
Failed site makes a transition to a commit (abort) state using failure
transition rule.
So, the operational site must make the same transition to ensure that
the final outcome is the same at all sites.
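The timeout rule can be sketched as a lookup: find the state the silent site must be in, then copy its failure transition. Both tables below are assumptions read off the protocol description, and the names are hypothetical.

```python
# Failure-transition target for each non-final state (per the failure rule).
FAILURE_TARGET = {
    "q1": "abort", "w1": "abort", "p1": "commit",  # coordinator
    "q": "abort", "w": "abort", "p": "commit",     # cohort
}

# State the silent site j is taken to be in when the message that
# site i expects in the given state has not arrived.
SENDER_STATE = {
    "w1": "q",  # coordinator awaits replies: cohort has not replied
    "p1": "w",  # coordinator awaits Ack: cohort taken to be in w
    "q": "q1",  # cohort awaits Commit_Request: coordinator has not sent it
    "w": "w1",  # cohort awaits Prepare or Abort: coordinator undecided
    "p": "p1",  # cohort awaits Commit: coordinator has sent Prepare
}


def timeout_target(state):
    """Timeout at the waiting site mirrors the failed site's failure transition."""
    return FAILURE_TARGET[SENDER_STATE[state]]
```

Note that the coordinator in p1 times out to abort even though its own failure transition from p1 goes to commit; the two transitions are derived from different sites' states.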
3-phase commit + Failure Trans.
F: Failure Transition
T: Timeout Transition
F,T: Failure/Timeout
Coordinator:
  q1 -> w1: Commit_Request message sent to all cohorts
  q1 -> a1: F,T
  w1 -> a1: one or more Abort replies / Abort msg sent to all cohorts
  w1 -> a1: F,T
  w1 -> p1: all Agreed / Prepare msg sent to all cohorts
  p1 -> a1: T / Abort sent to all cohorts
  p1 -> c1: all cohorts Ack / Commit msg sent to all cohorts
  p1 -> c1: F
Cohort i:
  qi -> wi: Commit_Request received / Agreed msg sent
  qi -> ai: Commit_Request received / Abort msg sent
  qi -> ai: F,T
  wi -> ai: Abort received from coordinator
  wi -> ai: F,T
  wi -> pi: Prepare msg received / Ack sent
  pi -> ai: Abort received from coordinator
  pi -> ci: Commit received from coordinator
  pi -> ci: F,T
Nonblocking Commit Protocol
Phase 1:
The first phase is identical to that of 2-phase commit, except for
failure handling. At the end of this phase, the coordinator is in w1
and each cohort is in a, w, or q, depending on whether it has received
the commit_request message and how it replied.
Phase 2:
The coordinator sends a Prepare message to all the cohorts if all of
them sent an Agreed message in phase 1.
Otherwise, it sends an Abort message to them.
On receiving a Prepare message, a cohort sends an
acknowledgement to the coordinator.
If the coordinator fails before sending a Prepare message, it
aborts the transaction on recovery.
Cohorts that time out waiting for a Prepare message also abort the
transaction.
Phase 3:
On receiving acknowledgements to the Prepare messages from all
cohorts, the coordinator sends a Commit message to all cohorts.
A cohort commits on receiving this message.
If the coordinator fails before sending the Commit message, it
commits upon recovery.
So a cohort that times out waiting for the Commit message commits
the transaction.
If a cohort fails before sending an acknowledgement, the coordinator
times out and sends an Abort message to all other cohorts.
The failed cohort aborts the transaction upon recovery.
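The coordinator's decisions across phases 2 and 3 can be condensed into one function. This is a simplified sketch with hypothetical names and inputs; it models the outcome only, not message passing or recovery.

```python
def coordinator_outcome(votes, acks):
    """Return the coordinator's final decision.

    votes: one 'agreed' or 'abort' reply per cohort (phase 1).
    acks:  one bool per cohort, True if its Ack arrived before the timeout.
    """
    if any(v != "agreed" for v in votes):
        return "abort"   # phase 2: send Abort instead of Prepare
    if not all(acks):
        return "abort"   # phase 3: timed out on an Ack, send Abort
    return "commit"      # phase 3: all Acks received, send Commit
```

For instance, `coordinator_outcome(["agreed", "agreed"], [True, False])` yields an abort, matching the case where a cohort fails before acknowledging.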
Use of buffer state:
Example: suppose state pi is not present in the cohorts. Let the
coordinator be in state p1, waiting for acks. Let cohort 2 (in
w2) acknowledge and commit.
Suppose cohort 3 fails in w3. The coordinator will time out and
abort. Cohort 3 will abort on recovery. This is inconsistent with
cohort 2, which has already committed.
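The scenario can be replayed in a few lines to compare the outcomes with and without the buffer state. This is a hand-written trace under the assumptions of the example (cohort 3 fails in w3, the coordinator times out on its ack); the function name is hypothetical.

```python
def final_states(with_buffer_state):
    """Replay the example and return each site's final state."""
    # Cohort 2 receives Prepare and acknowledges. With the buffer state it
    # moves to p; without it, it commits immediately.
    cohort2 = "p" if with_buffer_state else "c"

    # Cohort 3 fails in w3; the coordinator times out on its Ack and aborts.
    coordinator = "a"
    cohort3 = "a"  # aborts on recovery (failure transition from w)

    # The coordinator's Abort reaches cohort 2, which can still abort
    # only if it has not yet committed.
    if cohort2 == "p":
        cohort2 = "a"

    return {"coordinator": coordinator, "cohort2": cohort2, "cohort3": cohort3}
```

Without the buffer state, cohort 2 ends in `"c"` while the other sites end in `"a"`; with it, all three sites end in `"a"`.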