Testcase failure exposed regression from fix to
Bug#22602898 NDB : CURIOUS STATE OF TC COMMIT_SENT / COMPLETE_SENT TIMEOUT HANDLING
Node failure handling in TC performs one pass through the local active
transactions to find those affected by a node failure.
In this pass, all transactions affected by the node failure are queued
for processing (rollback, commit, complete), for example via the serial
abort, commit or complete protocols.
The exceptions are transactions in transient internal states such as
CS_PREPARE_TO_COMMIT, CS_COMMITTING, CS_COMPLETING, which are then followed
by stable 'wait' states such as CS_COMMIT_SENT, CS_COMPLETE_SENT.
Transactions in these states were handled by doing nothing in the node failure
handling pass, and relying on the timeout handling in the subsequent
stable states to queue transactions for processing.
The fix to Bug#22602898 removed this stable state handling to prevent it
from triggering accidentally, but that also stopped it from triggering
when it is needed, i.e. when node failure handling found a transaction
in a transient state.
This is solved by modifying the CS_COMMIT_SENT and CS_COMPLETE_SENT stable
state handling to also perform node failure processing if a timeout has
occurred for a transaction with a failure number different to the current
latest failure number.
This ensures that all transactions involving the failed node are handled
eventually.
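
The interaction described above can be sketched roughly as follows. This is
a simplified illustration only, not the actual TC code: the types and the
names Transaction, nodeFailurePass and onTimeout are hypothetical, and the
sketch assumes that each transaction records the failure number it was last
checked against.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical, simplified model of TC transaction states (not the real code).
enum class ConnectState {
  Started,
  PrepareToCommit,  // transient
  Committing,       // transient
  CommitSent,       // stable 'wait' state
  Completing,       // transient
  CompleteSent      // stable 'wait' state
};

struct Transaction {
  ConnectState state;
  bool usesFailedNode;      // assumption: known per transaction
  std::uint32_t failureNr;  // failure number last checked against
  bool queuedForNF;         // queued for node failure processing
};

bool isTransient(ConnectState s) {
  return s == ConnectState::PrepareToCommit ||
         s == ConnectState::Committing ||
         s == ConnectState::Completing;
}

bool isStableWait(ConnectState s) {
  return s == ConnectState::CommitSent || s == ConnectState::CompleteSent;
}

// Single pass over the active transactions after a node failure.
// Transactions in transient states are skipped and keep their old
// failure number.
void nodeFailurePass(std::vector<Transaction>& txns,
                     std::uint32_t currentFailureNr) {
  for (Transaction& t : txns) {
    if (isTransient(t.state)) continue;  // left to the timeout handling
    t.failureNr = currentFailureNr;
    if (t.usesFailedNode) t.queuedForNF = true;
  }
}

// Timeout handling in the stable wait states: a stale failure number
// means the node failure pass skipped this transaction, so perform the
// node failure check now.
void onTimeout(Transaction& t, std::uint32_t currentFailureNr) {
  if (isStableWait(t.state) && t.failureNr != currentFailureNr) {
    t.failureNr = currentFailureNr;
    if (t.usesFailedNode) t.queuedForNF = true;
  }
}
```

In this sketch a transaction found in Committing during the pass is not
queued, but is picked up once it times out in CommitSent because its failure
number is stale.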
A new testcase testNodeRestart -n TransientStatesNF T1 is added to the
AT testsuite to give coverage.
Change-Id: I0c0d4b6f75a97a3a7ff892cc4eafd2351491a8ff