Only disable heartbeat thread once at beginning of join-group #2617
+32
−30
In #1438 we added logic to disable the heartbeat thread while processing group rebalances. Heartbeats are then reenabled via a callback on the final join group success future, which is triggered after the final syncgroup request succeeds and the member assignment data is received.
However, there is a race condition: (1) the `join_group()` call times out while the join/sync requests are in flight, (2) the join/sync future succeeds after the timeout, while the consumer is not calling `join_group()`, (3) the future's callback enables the heartbeat thread, and (4) the next call to `join_group()` disables the heartbeat thread before finding that the future has already succeeded. In this case the consumer will be in a stable group membership but will not send any heartbeat requests.

The fix here is to only disable the heartbeat thread on the first call to `join_group()`.

Fix #2610
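The race can be reproduced with a small single-threaded model. This is only an illustrative sketch, not the kafka-python implementation: `ToyHeartbeat`, `ToyFuture`, `ToyCoordinator`, `simulate`, and the `disable_once` flag are all hypothetical names, and the "timeout" is modeled simply as `join_group()` returning before the future has succeeded.

```python
class ToyHeartbeat:
    """Hypothetical stand-in for the heartbeat thread's on/off switch."""

    def __init__(self):
        self.enabled = True

    def enable(self):
        self.enabled = True

    def disable(self):
        self.enabled = False


class ToyFuture:
    """Hypothetical minimal future that runs callbacks on success."""

    def __init__(self):
        self.succeeded = False
        self._callbacks = []

    def add_callback(self, fn):
        self._callbacks.append(fn)

    def success(self):
        self.succeeded = True
        for fn in self._callbacks:
            fn()


class ToyCoordinator:
    """Hypothetical coordinator; disable_once=True models the fix."""

    def __init__(self, disable_once):
        self.heartbeat = ToyHeartbeat()
        self.disable_once = disable_once
        self.join_future = None

    def join_group(self):
        """One join attempt. Returns False to model a timeout."""
        first_call = self.join_future is None
        # Old behavior: disable on every call. Fixed behavior: first call only.
        if first_call or not self.disable_once:
            self.heartbeat.disable()
        if first_call:
            self.join_future = ToyFuture()
            # Heartbeats are reenabled when the join/sync future succeeds.
            self.join_future.add_callback(self.heartbeat.enable)
        if self.join_future.succeeded:
            self.join_future = None
            return True
        return False


def simulate(disable_once):
    """Replay steps (1)-(4) and report whether heartbeats end up enabled."""
    coordinator = ToyCoordinator(disable_once)
    coordinator.join_group()            # (1) times out, requests in flight
    coordinator.join_future.success()   # (2) + (3) future succeeds, enables heartbeat
    coordinator.join_group()            # (4) next call finds the future succeeded
    return coordinator.heartbeat.enabled


print("disable on every call:", simulate(disable_once=False))
print("disable on first call only:", simulate(disable_once=True))
```

In this model, disabling on every call leaves the member joined with heartbeats off, while disabling only on the first call leaves them on once the future's callback has run.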
I also moved heartbeat-related logs from the `kafka.coordinator` logger to the `kafka.coordinator.heartbeat` logger.