Only disable heartbeat thread once at beginning of join-group #2617

Merged
dpkp merged 3 commits into master from dpkp/join-group-heartbeat-disable-once
May 8, 2025

Conversation

dpkp
Owner

@dpkp dpkp commented May 8, 2025

In #1438 we added logic to disable the heartbeat thread while processing group rebalances. Heartbeats are then reenabled via a callback on the final join group success future, which is triggered after the final syncgroup request succeeds and the member assignment data is received.

However, there is a race condition: (1) the join_group() call times out while the join/sync requests are in flight; (2) the join/sync future succeeds after the timeout, while the consumer is not calling join_group(); (3) the future's success callback enables the heartbeat thread; (4) the next call to join_group() disables the heartbeat thread before finding that the future has already succeeded. In this case the consumer will be in a stable group membership but will not send any heartbeat requests.

The fix here is to only disable the heartbeat thread on the first call to join_group().
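
To make the sequence concrete, here is a minimal toy model of the race and of the fix. It is only a sketch: `ToyCoordinator`, `complete_join()`, `replay_race()` and the `disable_once` flag are hypothetical names invented for illustration, not the actual kafka-python code, and the real coordinator tracks far more state.

```python
class ToyCoordinator:
    """Hypothetical stand-in for the consumer coordinator (not kafka-python code)."""

    def __init__(self, disable_once):
        self.disable_once = disable_once   # True models the fixed behavior
        self.heartbeat_enabled = True
        self.join_future = None            # None, "pending", or "succeeded"

    def join_group(self, response_arrives_in_time):
        """One call to join_group(); returns True once the group is stable."""
        first_call = self.join_future is None
        # Old behavior: disable on every call. Fixed behavior: only on the first.
        if first_call or not self.disable_once:
            self.heartbeat_enabled = False
        if first_call:
            self.join_future = "pending"   # join/sync requests are now in flight
        if self.join_future == "pending":
            if not response_arrives_in_time:
                return False               # timed out with requests in flight
            self.complete_join()
        self.join_future = None            # rebalance finished
        return True

    def complete_join(self):
        """Success callback on the join/sync future: re-enable heartbeats."""
        self.join_future = "succeeded"
        self.heartbeat_enabled = True


def replay_race(disable_once):
    """Replay steps (1)-(4) and report whether heartbeats end up enabled."""
    c = ToyCoordinator(disable_once)
    c.join_group(response_arrives_in_time=False)  # (1) call times out
    c.complete_join()                             # (2)+(3) future succeeds between calls
    c.join_group(response_arrives_in_time=False)  # (4) next call finds the success
    return c.heartbeat_enabled


print("disable on every call:", replay_race(disable_once=False))
print("disable only once:   ", replay_race(disable_once=True))
```

In this model, disabling on every call leaves the member stable with heartbeats off, while disabling only on the first call leaves the callback's re-enable in effect.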

Fix #2610

I also moved heartbeat-related logs from the kafka.coordinator logger to the kafka.coordinator.heartbeat logger.
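
One practical consequence of the logger move is that heartbeat output can be tuned separately from the rest of the coordinator. The snippet below is a sketch using only the standard `logging` module; the two logger names come from the description above, while the chosen levels are arbitrary examples.

```python
import logging

logging.basicConfig(level=logging.INFO)

# Keep general coordinator logging quiet (level chosen only for illustration).
coordinator_log = logging.getLogger("kafka.coordinator")
coordinator_log.setLevel(logging.WARNING)

# Turn up heartbeat logging on its own logger, e.g. while debugging rebalances.
heartbeat_log = logging.getLogger("kafka.coordinator.heartbeat")
heartbeat_log.setLevel(logging.DEBUG)
```

Because logger names are hierarchical, `kafka.coordinator.heartbeat` is a child of `kafka.coordinator` and inherits its level unless one is set explicitly, as done here.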

@dpkp dpkp merged commit 32a9285 into master May 8, 2025
18 checks passed
@dpkp dpkp deleted the dpkp/join-group-heartbeat-disable-once branch May 8, 2025 19:28
Development

Successfully merging this pull request may close these issues.

CommitFailedError when there are very few messages to be read from topic