Spotted while reviewing #1230. The code for sending bytes into the network currently looks like this:
    # In the future we might manage an internal write buffer
    # and send bytes asynchronously. For now, just block
    # sending each request payload
    self._sock.setblocking(True)
    total_sent = 0
    while total_sent < len(data):
        sent_bytes = self._sock.send(data[total_sent:])
        total_sent += sent_bytes
    assert total_sent == len(data)
    if self._sensors:
        self._sensors.bytes_sent.record(total_sent)
    self._sock.setblocking(False)
The problem here is that if one of the nodes goes down only for writes (that is, it stops reading bytes on the other end), the blocking send() never returns and the sending thread suspends indefinitely, which stalls application code. This is not too uncommon in Kubernetes or other virtualized environments that do not handle socket proxying perfectly.
To avoid the issue, we have to make the call asynchronous or at least set a socket timeout based on the request_timeout_ms configuration. For example:
    # In the future we might manage an internal write buffer
    # and send bytes asynchronously. For now, just block
    # sending each request payload
    self._sock.settimeout(self.config['request_timeout_ms'] / 1000.0)
    total_sent = 0
    while total_sent < len(data):
        sent_bytes = self._sock.send(data[total_sent:])
        total_sent += sent_bytes
    assert total_sent == len(data)
    if self._sensors:
        self._sensors.bytes_sent.record(total_sent)
    self._sock.setblocking(False)
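To see the effect outside the client, here is a minimal standalone sketch of the same idea. The helper name `send_all_with_timeout` and the use of `socket.socketpair()` to simulate a peer that never reads are assumptions made for illustration; they are not part of the library. Note that `settimeout()` bounds each individual `send()` call, not the whole payload, so a very slow but still draining peer could take longer than the timeout in total.

```python
import socket


def send_all_with_timeout(sock, data, timeout_secs):
    """Send all of `data`; raise socket.timeout if a single send() stalls.

    Hypothetical helper for illustration only. The socket is switched back
    to non-blocking mode afterwards, mirroring the snippet above.
    """
    sock.settimeout(timeout_secs)
    try:
        total_sent = 0
        while total_sent < len(data):
            total_sent += sock.send(data[total_sent:])
        return total_sent
    finally:
        sock.setblocking(False)


# Simulate a peer that is connected but never reads: once the kernel
# buffers fill up, send() can make no progress and the timeout fires.
writer, silent_peer = socket.socketpair()
try:
    send_all_with_timeout(writer, b"x" * (8 * 1024 * 1024), 0.2)
    stalled = False
except socket.timeout:
    stalled = True
finally:
    writer.close()
    silent_peer.close()
```

With plain `setblocking(True)` the same call would hang instead of raising, which is the failure mode described above.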
Yes -- good catch. I filed #981 for work on non-blocking sends and have been hacking on that locally. Until then, I agree that we should set the socket timeout based on request timeout for blocking sends.
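For comparison, a non-blocking send with an overall deadline could look roughly like the sketch below. This is not the work tracked in #981; the function name `send_nonblocking` is hypothetical, and the sketch assumes a plain (non-SSL) socket and uses `select.select()` to wait for writability.

```python
import select
import socket
import time


def send_nonblocking(sock, data, timeout_secs):
    """Send all of `data` without blocking past an overall deadline.

    Hypothetical helper for illustration; assumes a plain, non-SSL socket.
    """
    sock.setblocking(False)
    deadline = time.monotonic() + timeout_secs
    view = memoryview(data)  # avoids copying the payload on each slice
    total_sent = 0
    while total_sent < len(view):
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            raise socket.timeout("send did not complete before the deadline")
        # Wait until the socket is writable or the deadline passes.
        _, writable, _ = select.select([], [sock], [], remaining)
        if not writable:
            continue  # the loop re-checks the deadline
        try:
            total_sent += sock.send(view[total_sent:])
        except BlockingIOError:
            continue  # buffer filled again between select() and send()
    return total_sent
```

Unlike a per-call socket timeout, the deadline here covers the whole payload, so the caller never waits longer than `timeout_secs` in total.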