Bump KafkaConsumer's request_timeout to 305000ms #1002

Closed

@jeffwidman (Contributor) commented Mar 2, 2017

The Java Consumer [changed the value to 305000 in `0.10.1.0`](https://kafka.apache.org/documentation/#upgrade_1010_notable):

> The new Java Consumer now supports heartbeating from a background thread. There is a new configuration max.poll.interval.ms which controls the maximum time between poll invocations before the consumer will proactively leave the group (5 minutes by default). The value of the configuration request.timeout.ms must always be larger than max.poll.interval.ms because this is the maximum time that a JoinGroup request can block on the server while the consumer is rebalancing, so we have changed its default value to just above 5 minutes. Finally, the default value of session.timeout.ms has been adjusted down to 10 seconds, and the default value of max.poll.records has been changed to 500.

See the current upstream value here: https://kafka.apache.org/documentation/#configuration
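
To make the constraint in the quoted note concrete, here is a minimal sketch in plain Python. The dictionary holds the Java consumer defaults described in the quote, expressed in milliseconds; the names `JAVA_CONSUMER_DEFAULTS` and `request_timeout_is_valid` are hypothetical and exist only for this illustration, they are not part of any Kafka client.

```python
# Java consumer defaults as described in the 0.10.1.0 upgrade note quoted above.
# Time values are in milliseconds.
JAVA_CONSUMER_DEFAULTS = {
    "max.poll.interval.ms": 300000,  # 5 minutes
    "request.timeout.ms": 305000,    # just above 5 minutes
    "session.timeout.ms": 10000,     # 10 seconds
    "max.poll.records": 500,
}


def request_timeout_is_valid(config):
    """Return True if request.timeout.ms is strictly larger than max.poll.interval.ms.

    Hypothetical helper: it only encodes the rule stated in the upgrade note,
    namely that a JoinGroup request may block for up to max.poll.interval.ms.
    """
    return config["request.timeout.ms"] > config["max.poll.interval.ms"]


if __name__ == "__main__":
    print(request_timeout_is_valid(JAVA_CONSUMER_DEFAULTS))

    # A 40000 ms request timeout combined with a 5 minute poll interval
    # would violate the rule.
    old_style = dict(JAVA_CONSUMER_DEFAULTS, **{"request.timeout.ms": 40000})
    print(request_timeout_is_valid(old_style))
```

The check is strict (`>`), matching the wording "must always be larger than" in the note.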

Upstream only changed this value for the consumer, so we don't need to update KafkaProducer / KafkaClient as well: the upstream Java implementations still default to 30000 ms, and the broker default is also 30000 ms.


Note:
Merging this may be slightly premature since #948 hasn't been done to add max.poll.interval.ms yet, but I don't see any harm in adding this now.

Sessions will still time out; this just gives the brokers/cluster a bit longer to handle things like consumer group coordination.

In one of our dev environments, we were seeing a lot of these timeouts at the current default of 40000 ms using the new KafkaConsumer. Weirdly, the timeouts were being hit even though the consumers had no messages to consume and were starting from the default offset of latest.

This doesn't quite feel right. It's a dev cluster, so everything is on VMs and the network can be flaky/slow. But I still wouldn't have expected a consumer that was simply sitting idle to hit this request timeout at 40 seconds.

Still, when I bumped their request timeouts up to this new 305000 value, the timeout errors went away, and the consumers remained stable as best I could tell.
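
For anyone who wants to try the same workaround without waiting for a new default, a rough sketch follows. `request_timeout_ms` and `auto_offset_reset` are existing `KafkaConsumer` keyword arguments in kafka-python; the helper names `consumer_overrides` and `make_consumer`, and the topic and broker address in the usage comment, are placeholders made up for this example.

```python
def consumer_overrides(request_timeout_ms=305000):
    """Build keyword arguments that raise the consumer's request timeout.

    Hypothetical helper. 305000 ms mirrors the Java consumer default
    discussed in this PR.
    """
    return {
        "request_timeout_ms": request_timeout_ms,
        "auto_offset_reset": "latest",  # start from the latest offset, as in the dev setup above
    }


def make_consumer(topic, bootstrap_servers):
    """Create a KafkaConsumer with the overrides applied.

    kafka-python is imported lazily so the helper above can be used
    (and tested) without the library or a broker being available.
    """
    from kafka import KafkaConsumer

    return KafkaConsumer(
        topic,
        bootstrap_servers=bootstrap_servers,
        **consumer_overrides()
    )


# Usage, assuming a reachable broker (placeholder values):
# consumer = make_consumer("my-topic", "localhost:9092")
```

Passing the value explicitly keeps the change local to the affected consumers, whatever the library default happens to be.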

@jeffwidman changed the title from "Bump KafkaConsumer's request_timeout to 305000" to "Bump KafkaConsumer's request_timeout to 305000ms" on Mar 2, 2017

@dpkp (Owner) commented Mar 3, 2017

It does seem strange to hit 40sec timeouts in an environment like that.

@dpkp (Owner) commented Mar 7, 2017

I would like to wait on changing this default until the background heartbeat thread is implemented.

@yaman-jain commented

Hi @dpkp,
Would it be possible to merge this PR?
Thanks in advance.

@jeffwidman (Contributor, Author) commented

Closing in favor of #1266

@jeffwidman jeffwidman closed this Oct 20, 2017
@jeffwidman jeffwidman deleted the patch-1 branch October 20, 2017 07:26