BrokerConnection | Error receiving network data closing socket #2111

takwas opened this issue Aug 24, 2020 · 6 comments
takwas commented Aug 24, 2020

We have a relatively new setup that uses the Kafka protocol over Azure Event Hubs. Our KafkaConsumer client mostly uses the default configuration, except for:

  • group_id - some string
  • auto_offset_reset - 'earliest'
  • enable_auto_commit - True
  • security_protocol - 'SASL_SSL'
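
For reference, a minimal sketch of a consumer built with the four overrides listed above. The helper names, host, group, and connection string are placeholders. The `sasl_mechanism="PLAIN"` and `$ConnectionString` username follow the commonly documented Event Hubs setup and are an assumption here, since the report does not show its SASL settings.

```python
def build_consumer_config(bootstrap_server, connection_string, group_id):
    """Return keyword arguments for kafka.KafkaConsumer (hypothetical helper)."""
    return {
        "bootstrap_servers": bootstrap_server,
        # The four non-default settings named in the report.
        "group_id": group_id,
        "auto_offset_reset": "earliest",
        "enable_auto_commit": True,
        "security_protocol": "SASL_SSL",
        # Assumption: SASL PLAIN credentials as commonly used with Event Hubs.
        "sasl_mechanism": "PLAIN",
        "sasl_plain_username": "$ConnectionString",
        "sasl_plain_password": connection_string,
    }


def make_consumer(topic, config):
    """Create the consumer; kafka-python is imported lazily on purpose."""
    from kafka import KafkaConsumer  # requires the kafka-python package

    return KafkaConsumer(topic, **config)
```

Usage would look like `make_consumer("my-topic", build_consumer_config("my.kafka.host:9093", "<connection string>", "myapplication"))`, with every argument replaced by real values.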

Everything ran fine until recently, when, on a single occasion, we found the following in the log. The app appeared to resume only after a restart.

Given the above, I'd like to ask:

  • Has anyone experienced this before?
  • How might one reproduce this?
  • What is the right way to resume from this error without having to restart the application?

[15:24:09] myapplication - INFO | Processing new message

[20:16:03] kafka.coordinator - WARNING | Heartbeat session expired, marking coordinator dead

[20:16:03] kafka.coordinator - WARNING | Heartbeat session expired, marking coordinator dead

[20:16:03] kafka.coordinator - WARNING | Marking the coordinator dead (node coordinator-0) for group myapplication: Heartbeat session expired.

[20:16:03] kafka.coordinator - WARNING | Marking the coordinator dead (node coordinator-0) for group myapplication: Heartbeat session expired.

[20:16:03] kafka.cluster - INFO | Group coordinator for myapplication is BrokerMetadata(nodeId='coordinator-0', host='my.kafka.host', port=9092, rack=None)

[20:16:03] kafka.cluster - INFO | Group coordinator for myapplication is BrokerMetadata(nodeId='coordinator-0', host='my.kafka.host', port=9092, rack=None)

[20:16:03] kafka.coordinator - INFO | Discovered coordinator coordinator-0 for group myapplication

[20:16:03] kafka.coordinator - INFO | Discovered coordinator coordinator-0 for group myapplication

[20:16:09] kafka.coordinator.consumer - WARNING | Auto offset commit failed for group myapplication: NodeNotReadyError: coordinator-0

[20:16:09] kafka.coordinator.consumer - WARNING | Auto offset commit failed for group myapplication: NodeNotReadyError: coordinator-0

[20:16:09] kafka.coordinator.consumer - WARNING | Auto offset commit failed for group myapplication: NodeNotReadyError: coordinator-0

[20:16:09] kafka.coordinator.consumer - WARNING | Auto offset commit failed for group myapplication: NodeNotReadyError: coordinator-0

[20:16:09] kafka.coordinator.consumer - WARNING | Auto offset commit failed for group myapplication: NodeNotReadyError: coordinator-0

[20:16:09] kafka.coordinator.consumer - WARNING | Auto offset commit failed for group myapplication: NodeNotReadyError: coordinator-0

[20:16:10] kafka.coordinator.consumer - WARNING | Auto offset commit failed for group myapplication: NodeNotReadyError: coordinator-0

[20:16:10] kafka.coordinator.consumer - WARNING | Auto offset commit failed for group myapplication: NodeNotReadyError: coordinator-0

[20:16:10] kafka.coordinator.consumer - WARNING | Auto offset commit failed for group myapplication: NodeNotReadyError: coordinator-0

[20:16:10] kafka.coordinator.consumer - WARNING | Auto offset commit failed for group myapplication: NodeNotReadyError: coordinator-0

[20:16:10] kafka.coordinator.consumer - WARNING | Auto offset commit failed for group myapplication: NodeNotReadyError: coordinator-0

[20:16:10] kafka.coordinator.consumer - WARNING | Auto offset commit failed for group myapplication: NodeNotReadyError: coordinator-0

[20:16:10] kafka.coordinator.consumer - WARNING | Auto offset commit failed for group myapplication: NodeNotReadyError: coordinator-0

[20:16:10] kafka.coordinator.consumer - WARNING | Auto offset commit failed for group myapplication: NodeNotReadyError: coordinator-0

[20:16:10] kafka.coordinator.consumer - WARNING | Auto offset commit failed for group myapplication: NodeNotReadyError: coordinator-0

[20:16:10] kafka.coordinator.consumer - WARNING | Auto offset commit failed for group myapplication: NodeNotReadyError: coordinator-0

[20:16:11] kafka.coordinator.consumer - WARNING | Auto offset commit failed for group myapplication: NodeNotReadyError: coordinator-0

[20:16:11] kafka.coordinator.consumer - WARNING | Auto offset commit failed for group myapplication: NodeNotReadyError: coordinator-0

[20:16:11] kafka.coordinator.consumer - WARNING | Auto offset commit failed for group myapplication: NodeNotReadyError: coordinator-0

[20:16:11] kafka.coordinator.consumer - WARNING | Auto offset commit failed for group myapplication: NodeNotReadyError: coordinator-0

[20:16:11] kafka.coordinator.consumer - WARNING | Auto offset commit failed for group myapplication: NodeNotReadyError: coordinator-0

[20:16:11] kafka.coordinator.consumer - WARNING | Auto offset commit failed for group myapplication: NodeNotReadyError: coordinator-0

[20:16:11] kafka.coordinator.consumer - WARNING | Auto offset commit failed for group myapplication: NodeNotReadyError: coordinator-0

[20:16:11] kafka.coordinator.consumer - WARNING | Auto offset commit failed for group myapplication: NodeNotReadyError: coordinator-0

[20:16:11] kafka.coordinator.consumer - WARNING | Auto offset commit failed for group myapplication: NodeNotReadyError: coordinator-0

[20:16:11] kafka.coordinator.consumer - WARNING | Auto offset commit failed for group myapplication: NodeNotReadyError: coordinator-0

[20:16:12] kafka.coordinator.consumer - WARNING | Auto offset commit failed for group myapplication: NodeNotReadyError: coordinator-0

[20:16:12] kafka.coordinator.consumer - WARNING | Auto offset commit failed for group myapplication: NodeNotReadyError: coordinator-0

[20:16:12] kafka.coordinator.consumer - WARNING | Auto offset commit failed for group myapplication: NodeNotReadyError: coordinator-0

[20:16:12] kafka.coordinator.consumer - WARNING | Auto offset commit failed for group myapplication: NodeNotReadyError: coordinator-0

[20:16:12] kafka.coordinator.consumer - WARNING | Auto offset commit failed for group myapplication: NodeNotReadyError: coordinator-0

[20:16:12] kafka.coordinator.consumer - WARNING | Auto offset commit failed for group myapplication: NodeNotReadyError: coordinator-0

[20:16:12] kafka.coordinator.consumer - WARNING | Auto offset commit failed for group myapplication: NodeNotReadyError: coordinator-0

[20:16:12] kafka.coordinator.consumer - WARNING | Auto offset commit failed for group myapplication: NodeNotReadyError: coordinator-0

[20:16:12] kafka.coordinator.consumer - WARNING | Auto offset commit failed for group myapplication: NodeNotReadyError: coordinator-0

[20:16:12] kafka.coordinator.consumer - WARNING | Auto offset commit failed for group myapplication: NodeNotReadyError: coordinator-0

[20:16:13] kafka.coordinator.consumer - WARNING | Auto offset commit failed for group myapplication: NodeNotReadyError: coordinator-0

[20:16:13] kafka.coordinator.consumer - WARNING | Auto offset commit failed for group myapplication: NodeNotReadyError: coordinator-0

[20:16:13] kafka.coordinator.consumer - WARNING | Auto offset commit failed for group myapplication: NodeNotReadyError: coordinator-0

[20:16:13] kafka.coordinator - WARNING | Heartbeat session expired, marking coordinator dead

[20:16:13] kafka.coordinator.consumer - WARNING | Auto offset commit failed for group myapplication: NodeNotReadyError: coordinator-0

[20:16:13] kafka.coordinator - WARNING | Heartbeat session expired, marking coordinator dead

[20:16:13] kafka.coordinator - WARNING | Marking the coordinator dead (node coordinator-0) for group myapplication: Heartbeat session expired.

[20:16:13] kafka.coordinator - WARNING | Marking the coordinator dead (node coordinator-0) for group myapplication: Heartbeat session expired.

[20:16:13] kafka.cluster - INFO | Group coordinator for myapplication is BrokerMetadata(nodeId='coordinator-0', host='my.kafka.host', port=9092, rack=None)

[20:16:13] kafka.cluster - INFO | Group coordinator for myapplication is BrokerMetadata(nodeId='coordinator-0', host='my.kafka.host', port=9092, rack=None)

[20:16:13] kafka.coordinator - INFO | Discovered coordinator coordinator-0 for group myapplication

[20:16:13] kafka.coordinator - INFO | Discovered coordinator coordinator-0 for group myapplication

[20:16:14] kafka.coordinator.consumer - WARNING | Auto offset commit failed for group myapplication: NodeNotReadyError: coordinator-0

[20:16:14] kafka.coordinator.consumer - WARNING | Auto offset commit failed for group myapplication: NodeNotReadyError: coordinator-0

[20:16:15] kafka.coordinator.consumer - WARNING | Auto offset commit failed for group myapplication: NodeNotReadyError: coordinator-0

[20:16:15] kafka.coordinator.consumer - WARNING | Auto offset commit failed for group myapplication: NodeNotReadyError: coordinator-0

[20:16:16] kafka.coordinator.consumer - WARNING | Auto offset commit failed for group myapplication: NodeNotReadyError: coordinator-0

[20:16:16] kafka.coordinator.consumer - WARNING | Auto offset commit failed for group myapplication: NodeNotReadyError: coordinator-0

[20:16:16] kafka.coordinator.consumer - WARNING | Auto offset commit failed for group myapplication: NodeNotReadyError: coordinator-0

[20:16:16] kafka.coordinator.consumer - WARNING | Auto offset commit failed for group myapplication: NodeNotReadyError: coordinator-0

[20:16:16] kafka.coordinator.consumer - WARNING | Auto offset commit failed for group myapplication: NodeNotReadyError: coordinator-0

[20:16:16] kafka.coordinator.consumer - WARNING | Auto offset commit failed for group myapplication: NodeNotReadyError: coordinator-0

[20:16:16] kafka.coordinator.consumer - WARNING | Auto offset commit failed for group myapplication: NodeNotReadyError: coordinator-0

[20:16:16] kafka.coordinator.consumer - WARNING | Auto offset commit failed for group myapplication: NodeNotReadyError: coordinator-0

[20:16:16] kafka.coordinator.consumer - WARNING | Auto offset commit failed for group myapplication: NodeNotReadyError: coordinator-0

[20:16:16] kafka.coordinator.consumer - WARNING | Auto offset commit failed for group myapplication: NodeNotReadyError: coordinator-0

[20:16:17] kafka.coordinator.consumer - WARNING | Auto offset commit failed for group myapplication: NodeNotReadyError: coordinator-0

[20:16:17] kafka.coordinator.consumer - WARNING | Auto offset commit failed for group myapplication: NodeNotReadyError: coordinator-0

[20:16:17] kafka.coordinator.consumer - WARNING | Auto offset commit failed for group myapplication: NodeNotReadyError: coordinator-0

[20:16:17] kafka.coordinator.consumer - WARNING | Auto offset commit failed for group myapplication: NodeNotReadyError: coordinator-0

[20:16:17] kafka.coordinator.consumer - WARNING | Auto offset commit failed for group myapplication: NodeNotReadyError: coordinator-0

[20:16:17] kafka.coordinator.consumer - WARNING | Auto offset commit failed for group myapplication: NodeNotReadyError: coordinator-0

[20:16:17] kafka.coordinator.consumer - WARNING | Auto offset commit failed for group myapplication: NodeNotReadyError: coordinator-0

[20:16:17] kafka.coordinator.consumer - WARNING | Auto offset commit failed for group myapplication: NodeNotReadyError: coordinator-0

[20:16:17] kafka.coordinator.consumer - WARNING | Auto offset commit failed for group myapplication: NodeNotReadyError: coordinator-0

[20:16:17] kafka.coordinator.consumer - WARNING | Auto offset commit failed for group myapplication: NodeNotReadyError: coordinator-0

[20:16:18] kafka.coordinator.consumer - WARNING | Auto offset commit failed for group myapplication: NodeNotReadyError: coordinator-0

[20:16:18] kafka.coordinator.consumer - WARNING | Auto offset commit failed for group myapplication: NodeNotReadyError: coordinator-0

[20:16:18] kafka.coordinator.consumer - WARNING | Auto offset commit failed for group myapplication: NodeNotReadyError: coordinator-0

[20:16:18] kafka.coordinator.consumer - WARNING | Auto offset commit failed for group myapplication: NodeNotReadyError: coordinator-0

[20:16:18] kafka.coordinator.consumer - WARNING | Auto offset commit failed for group myapplication: NodeNotReadyError: coordinator-0

[20:16:18] kafka.coordinator.consumer - WARNING | Auto offset commit failed for group myapplication: NodeNotReadyError: coordinator-0

[20:16:18] kafka.coordinator.consumer - WARNING | Auto offset commit failed for group myapplication: NodeNotReadyError: coordinator-0

[20:16:18] kafka.coordinator.consumer - WARNING | Auto offset commit failed for group myapplication: NodeNotReadyError: coordinator-0

[20:16:18] kafka.coordinator.consumer - WARNING | Auto offset commit failed for group myapplication: NodeNotReadyError: coordinator-0

[20:16:18] kafka.coordinator.consumer - WARNING | Auto offset commit failed for group myapplication: NodeNotReadyError: coordinator-0

[20:16:19] kafka.coordinator.consumer - WARNING | Auto offset commit failed for group myapplication: NodeNotReadyError: coordinator-0

[20:16:19] kafka.coordinator.consumer - WARNING | Auto offset commit failed for group myapplication: NodeNotReadyError: coordinator-0

[20:16:19] kafka.coordinator.consumer - WARNING | Auto offset commit failed for group myapplication: NodeNotReadyError: coordinator-0

[20:16:19] kafka.coordinator.consumer - WARNING | Auto offset commit failed for group myapplication: NodeNotReadyError: coordinator-0

[20:16:19] kafka.coordinator.consumer - WARNING | Auto offset commit failed for group myapplication: NodeNotReadyError: coordinator-0

[20:16:19] kafka.coordinator.consumer - WARNING | Auto offset commit failed for group myapplication: NodeNotReadyError: coordinator-0

[20:16:19] kafka.coordinator.consumer - WARNING | Auto offset commit failed for group myapplication: NodeNotReadyError: coordinator-0

[20:16:19] kafka.coordinator.consumer - WARNING | Auto offset commit failed for group myapplication: NodeNotReadyError: coordinator-0

[20:16:19] kafka.coordinator.consumer - WARNING | Auto offset commit failed for group myapplication: NodeNotReadyError: coordinator-0

[20:16:19] kafka.coordinator.consumer - WARNING | Auto offset commit failed for group myapplication: NodeNotReadyError: coordinator-0

[20:16:19] kafka.coordinator.consumer - WARNING | Auto offset commit failed for group myapplication: NodeNotReadyError: coordinator-0

[20:16:19] kafka.coordinator.consumer - WARNING | Auto offset commit failed for group myapplication: NodeNotReadyError: coordinator-0

[20:16:20] kafka.coordinator.consumer - WARNING | Auto offset commit failed for group myapplication: NodeNotReadyError: coordinator-0

[20:16:20] kafka.coordinator.consumer - WARNING | Auto offset commit failed for group myapplication: NodeNotReadyError: coordinator-0

[20:16:20] kafka.coordinator.consumer - WARNING | Auto offset commit failed for group myapplication: NodeNotReadyError: coordinator-0

[20:16:20] kafka.coordinator.consumer - WARNING | Auto offset commit failed for group myapplication: NodeNotReadyError: coordinator-0

[20:16:20] kafka.coordinator.consumer - WARNING | Auto offset commit failed for group myapplication: NodeNotReadyError: coordinator-0

[20:16:20] kafka.coordinator.consumer - WARNING | Auto offset commit failed for group myapplication: NodeNotReadyError: coordinator-0

[20:16:20] kafka.conn - ERROR | <BrokerConnection node_id=coordinator-0 host=my.kafka.host:9092 <connected> [IPv4 ('127.0.0.1', 9092)]>: Error receiving network data closing socket

Traceback (most recent call last):
  File "/usr/local/lib/python3.8/site-packages/kafka/conn.py", line 1087, in _recv
    data = self._sock.recv(self.config['sock_chunk_bytes'])
  File "/usr/local/lib/python3.8/ssl.py", line 1226, in recv
    return self.read(buflen)
  File "/usr/local/lib/python3.8/ssl.py", line 1101, in read
    return self._sslobj.read(len)
ConnectionResetError: [Errno 104] Connection reset by peer

[20:16:20] kafka.conn - ERROR | <BrokerConnection node_id=coordinator-0 host=my.kafka.host:9092 <connected> [IPv4 ('127.0.0.1', 9092)]>: Error receiving network data closing socket

Traceback (most recent call last):
  File "/usr/local/lib/python3.8/site-packages/kafka/conn.py", line 1087, in _recv
    data = self._sock.recv(self.config['sock_chunk_bytes'])
  File "/usr/local/lib/python3.8/ssl.py", line 1226, in recv
    return self.read(buflen)
  File "/usr/local/lib/python3.8/ssl.py", line 1101, in read
    return self._sslobj.read(len)
ConnectionResetError: [Errno 104] Connection reset by peer

[20:16:20] kafka.conn - INFO | <BrokerConnection node_id=coordinator-0 host=my.kafka.host:9092 <connected> [IPv4 ('127.0.0.1', 9092)]>: Closing connection. KafkaConnectionError: [Errno 104] Connection reset by peer

[20:16:20] kafka.conn - INFO | <BrokerConnection node_id=coordinator-0 host=my.kafka.host:9092 <connected> [IPv4 ('127.0.0.1', 9092)]>: Closing connection. KafkaConnectionError: [Errno 104] Connection reset by peer

[20:16:20] kafka.client - WARNING | Node coordinator-0 connection failed -- refreshing metadata

[20:16:20] kafka.client - WARNING | Node coordinator-0 connection failed -- refreshing metadata

[20:16:20] kafka.coordinator - ERROR | Error sending HeartbeatRequest_v1 to node coordinator-0 [KafkaConnectionError: [Errno 104] Connection reset by peer]

[20:16:20] kafka.coordinator - ERROR | Error sending HeartbeatRequest_v1 to node coordinator-0 [KafkaConnectionError: [Errno 104] Connection reset by peer]

[20:16:20] kafka.coordinator - WARNING | Marking the coordinator dead (node coordinator-0) for group myapplication: KafkaConnectionError: [Errno 104] Connection reset by peer.

[20:16:20] kafka.coordinator - WARNING | Marking the coordinator dead (node coordinator-0) for group myapplication: KafkaConnectionError: [Errno 104] Connection reset by peer.

[20:16:20] kafka.coordinator - ERROR | Error sending HeartbeatRequest_v1 to node coordinator-0 [KafkaConnectionError: [Errno 104] Connection reset by peer]

[20:16:20] kafka.coordinator - ERROR | Error sending HeartbeatRequest_v1 to node coordinator-0 [KafkaConnectionError: [Errno 104] Connection reset by peer]

[20:16:20] kafka.coordinator - ERROR | Error sending OffsetCommitRequest_v2 to node coordinator-0 [KafkaConnectionError: [Errno 104] Connection reset by peer]

[20:16:20] kafka.coordinator - ERROR | Error sending OffsetCommitRequest_v2 to node coordinator-0 [KafkaConnectionError: [Errno 104] Connection reset by peer]

[20:16:20] kafka.coordinator - ERROR | Error sending HeartbeatRequest_v1 to node coordinator-0 [KafkaConnectionError: [Errno 104] Connection reset by peer]

[20:16:20] kafka.coordinator - ERROR | Error sending HeartbeatRequest_v1 to node coordinator-0 [KafkaConnectionError: [Errno 104] Connection reset by peer]


Caisho commented Aug 26, 2020

Hi @takwas, I am having the same problem with Azure Event Hubs. In my logs I have seen the client reconnect after this KafkaConnectionError. However, when the error occurs while sending HeartbeatRequest_v1 to the coordinator node, it stops reconnecting.

Which version of kafka-python are you using?

Reconnected:

2020-08-15 13:29:57,027.27.72998809814453:kafka.conn:139642177861440:ERROR:6:<BrokerConnection node_id=0 host=udpsentinel.servicebus.windows.net:9093 <connected> [IPv4 ('40.78.234.0', 9093)]>: Error receiving network data closing socket
Traceback (most recent call last):
  File "/opt/sentinel/venv/lib/python3.6/site-packages/kafka/conn.py", line 1087, in _recv
    data = self._sock.recv(self.config['sock_chunk_bytes'])
  File "/usr/local/lib/python3.6/ssl.py", line 997, in recv
    return self.read(buflen)
  File "/usr/local/lib/python3.6/ssl.py", line 874, in read
    return self._sslobj.read(len, buffer)
  File "/usr/local/lib/python3.6/ssl.py", line 633, in read
    v = self._sslobj.read(len)
ConnectionResetError: [Errno 104] Connection reset by peer
2020-08-15 13:29:57,030.30.178546905517578:kafka.conn:139642177861440:INFO:6:<BrokerConnection node_id=0 host=udpsentinel.servicebus.windows.net:9093 <connected> [IPv4 ('40.78.234.0', 9093)]>: Closing connection. KafkaConnectionError: [Errno 104] Connection reset by peer
2020-08-15 13:29:57,030.30.27200698852539:kafka.conn:139642177861440:DEBUG:6:<BrokerConnection node_id=0 host=udpsentinel.servicebus.windows.net:9093 <connected> [IPv4 ('40.78.234.0', 9093)]>: reconnect backoff 0.0427705769103242 after 1 failures
2020-08-15 13:29:57,030.30.40313720703125:kafka.client:139642177861440:WARNING:6:Node 0 connection failed -- refreshing metadata
2020-08-15 13:29:57,030.30.4868221282959:kafka.consumer.fetcher:139642177861440:ERROR:6:Fetch to node 0 failed: KafkaConnectionError: [Errno 104] Connection reset by peer
2020-08-15 13:29:57,030.30.698537826538086:kafka.client:139642131724032:DEBUG:6:Initializing connection to node 0 for meadata request

Failed and stopped:

2020-08-15 13:50:48,748.748.1091022491455:kafka.conn:139642177861440:ERROR:6:<BrokerConnection node_id=coordinator-0 host=udpsentinel.servicebus.windows.net:9093 <connected> [IPv4 ('40.78.234.0', 9093)]>: Error receiving network data closing socket
Traceback (most recent call last):
  File "/opt/sentinel/venv/lib/python3.6/site-packages/kafka/conn.py", line 1087, in _recv
    data = self._sock.recv(self.config['sock_chunk_bytes'])
  File "/usr/local/lib/python3.6/ssl.py", line 997, in recv
    return self.read(buflen)
  File "/usr/local/lib/python3.6/ssl.py", line 874, in read
    return self._sslobj.read(len, buffer)
  File "/usr/local/lib/python3.6/ssl.py", line 633, in read
    v = self._sslobj.read(len)
ConnectionResetError: [Errno 104] Connection reset by peer
2020-08-15 13:50:48,748.748.4414577484131:kafka.conn:139642177861440:INFO:6:<BrokerConnection node_id=coordinator-0 host=udpsentinel.servicebus.windows.net:9093 <connected> [IPv4 ('40.78.234.0', 9093)]>: Closing connection. KafkaConnectionError: [Errno 104] Connection reset by peer
2020-08-15 13:50:48,748.748.5194206237793:kafka.conn:139642177861440:DEBUG:6:<BrokerConnection node_id=coordinator-0 host=udpsentinel.servicebus.windows.net:9093 <connected> [IPv4 ('40.78.234.0', 9093)]>: reconnect backoff 0.042068663551742 after 1 failures
2020-08-15 13:50:48,748.748.6476898193359:kafka.client:139642177861440:WARNING:6:Node coordinator-0 connection failed -- refreshing metadata
2020-08-15 13:50:48,748.748.7637996673584:kafka.coordinator:139642177861440:ERROR:6:Error sending HeartbeatRequest_v1 to node coordinator-0 [KafkaConnectionError: [Errno 104] Connection reset by peer]
2020-08-15 13:50:48,748.748.8193511962891:kafka.coordinator:139642177861440:WARNING:6:Marking the coordinator dead (node coordinator-0) for group default_consumer_group: KafkaConnectionError: [Errno 104] Connection reset by peer.

Can anyone shed light on why it doesn't attempt to reconnect after HeartbeatRequest_v1 fails to send?
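
Some of the lines above, such as the `reconnect backoff` message, are logged at DEBUG level. A minimal sketch, using only the standard `logging` module, for capturing the same detail from the `kafka.*` loggers seen in these logs. The helper name and the format string are illustrative and not part of kafka-python.

```python
import logging


def enable_kafka_debug_logging(stream=None):
    """Attach a DEBUG-level handler to the 'kafka' logger hierarchy."""
    handler = logging.StreamHandler(stream)  # stream=None means sys.stderr
    handler.setFormatter(
        logging.Formatter(
            "%(asctime)s:%(name)s:%(thread)d:%(levelname)s:%(message)s"
        )
    )
    logger = logging.getLogger("kafka")  # parent of kafka.conn, kafka.client, ...
    logger.setLevel(logging.DEBUG)
    logger.addHandler(handler)
    # Avoid emitting each record twice if the root logger also has a handler.
    logger.propagate = False
    return logger
```

Calling `enable_kafka_debug_logging()` once at startup, before the consumer is created, should be enough for child loggers such as `kafka.conn` and `kafka.coordinator` to inherit the level.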


takwas commented Aug 26, 2020

Hi, @Caisho. Thanks for sharing. I am using 2.0.1.

jeffwidman (Contributor) commented

This may be related to #1985. I'm not sure, as I haven't looked closely, but at first glance they sound similar.


takwas commented Sep 17, 2020

@jeffwidman: I think it is indeed quite similar, but mightn't be the exact same thing.

That last comment by @mjattiot also reflects our situation. The unpleasant part is that the Kubernetes pod keeps running as if nothing has happened, when in fact it is in a deadlocked state.
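
As a stopgap for the silent stall described above, one option is a watchdog that ends the process when the consume loop stops making progress, so that Kubernetes restarts the pod. This is a minimal sketch and not a fix for the underlying issue. `PollWatchdog` and its parameters are hypothetical names, and it assumes the stall shows up as the loop no longer reaching `beat()`.

```python
import os
import threading
import time


class PollWatchdog:
    """Runs a callback (default: exit the process) when no beat arrives in time."""

    def __init__(self, max_idle_seconds, on_stall=None, clock=time.monotonic):
        self._max_idle = max_idle_seconds
        # os._exit skips cleanup on purpose: the main thread may be stuck.
        self._on_stall = on_stall or (lambda: os._exit(1))
        self._clock = clock
        self._last_beat = clock()
        self._lock = threading.Lock()

    def beat(self):
        """Record progress; call this each time the consume loop iterates."""
        with self._lock:
            self._last_beat = self._clock()

    def is_stalled(self):
        with self._lock:
            return self._clock() - self._last_beat > self._max_idle

    def check(self):
        """Fire the callback if stalled; return whether it fired."""
        if self.is_stalled():
            self._on_stall()
            return True
        return False

    def start(self, interval_seconds=5.0):
        """Check periodically from a daemon thread until a stall is seen."""
        def run():
            while not self.check():
                time.sleep(interval_seconds)

        thread = threading.Thread(target=run, daemon=True, name="poll-watchdog")
        thread.start()
        return thread
```

In a consume loop, `watchdog.beat()` would be called after every `consumer.poll(...)` return, including empty ones, so that a quiet topic does not trigger a restart. If `poll()` keeps returning while the group membership is dead, the beat would need to be tied to a different signal.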


takwas commented Sep 29, 2020

@dpkp, @jeffwidman: Any insights on this issue, please? It now happens frequently, unpredictably, and silently.

One way to attempt to reproduce this is to force-kill the TCP connection, but I don't know all the other factors that come into play.
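
To illustrate what force-killing the connection looks like at the socket level, here is a minimal loopback sketch using only the standard library. It reproduces the `ConnectionResetError` from the traceback, not the consumer stall itself. Reproducing the stall would presumably need the same abort applied between the client and the broker, for example through a proxy, which is an assumption not tested here.

```python
import socket
import struct


def provoke_connection_reset(timeout=5.0):
    """Abort a loopback TCP connection and return the client's recv() error."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.settimeout(timeout)
    server.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    server.listen(1)
    client = socket.create_connection(server.getsockname(), timeout=timeout)
    peer, _ = server.accept()
    # SO_LINGER with l_onoff=1, l_linger=0 makes close() send RST, not FIN.
    peer.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER, struct.pack("ii", 1, 0))
    peer.close()
    try:
        client.recv(4096)
        return None  # an orderly close would land here instead
    except ConnectionResetError as exc:
        return exc
    finally:
        client.close()
        server.close()
```

`provoke_connection_reset()` returns the exception instance so the caller can inspect `errno`, which is 104 on Linux as in the logs above.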


Based on the error logs shared by myself and @Caisho, the following portions of the codebase are of interest:


takwas commented Sep 29, 2020

@jeffwidman: Having taken a closer look, I would surmise that this is indeed related to the case with #1985. With that thought, do you think that there will be a release with the changes in #2064 soon?

dpkp closed this as completed Feb 21, 2025