Skip to content

Improve timeout in Updater. #6117

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 8 commits into from
Apr 5, 2021
Merged

Improve timeout in Updater. #6117

merged 8 commits into from
Apr 5, 2021

Conversation

adrian-dybwad
Copy link
Contributor

@adrian-dybwad adrian-dybwad commented May 19, 2019

I added an optional timeout parameter in the writeStream function of the Updater. This makes the update more reliable (it finishes) on slow networks.

edit
fixes #6113

adrionics added 4 commits May 19, 2019 12:02
Improve timeout to allow reliable updates on slow networks.
@adrian-dybwad
Copy link
Contributor Author

Can anyone help with suggesting how to use the polledTimeout class properly?

@d-a-v
Copy link
Collaborator

d-a-v commented May 20, 2019

in examples

@adrian-dybwad
Copy link
Contributor Author

Thanks, I will update the pull request when I get the time.

@adrian-dybwad
Copy link
Contributor Author

As requested, polledTimeout implemented.

Copy link
Collaborator

@devyte devyte left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good job with polledTimeout.
Before moving forward, I have a question @d-a-v: is it ok to handle an additional timeout here, i. e. big stream timeout over the wifi Client timeout? Or should the timeout be handled in the wifi Client itself?

@d-a-v
Copy link
Collaborator

d-a-v commented May 21, 2019

@devyte It is true saying Stream::readBytes() already implements a timeout, opposed to non-arduino StreamInheritance::read(buf,len) which immediately returns when no more data is available (implemented in HardwareSerial and WiFiClient at least).

There are several locations in the core or our libraries where a retry is done before giving up. All of them do the same thing: move data from a Stream instance to a Print insntance (or local buffer). Fixing that would need to be done all at one by using a proper generic API or function template that allows transferring a Stream instance to a Print instance. I may have a something with no intermediate buffer but it is still an alpha state and I will propose something at some point.

@adrionics you may try with data.setTimeout(streamTimeout) or even better set data's timeout before starting the updater without changing writeStream() signature, and remove the retry.

@devyte
Copy link
Collaborator

devyte commented May 21, 2019

set data's timeout before starting the updater

that is exactly what I was thinking.

@adrian-dybwad
Copy link
Contributor Author

adrian-dybwad commented May 21, 2019

"data" is a WiFiClient and it is using the default timeout. I found that by setting a timeout on data, it seems to make things less reliable. I tried setting a timeout before connecting to the host, but that makes the client wait too long to request a retry (I think?) and so if anything, you need a different timeout before connecting to the host and after connecting(?). Through experimenting, I found it works better to not mess with the WiFiClient's setTimeout() and just wait long enough for it to get data. For sure though, the way this timeout was implemented before was much worse. I found it never is able to complete the update on some devices with low signal strength.

I am of course willing to follow your advice and retry the experiment, so if you have specifics on the setTimeout as related to WiFiClient, I will try again.

@d-a-v
Copy link
Collaborator

d-a-v commented May 21, 2019

@adrionics

Default timeout is 1000ms in Stream.
I don't understand why you would need a shorter timeout for other operations than in updater. Can you elaborate "I found that by setting a timeout on data, it seems to make things less reliable" if you set a timeout bigger ?

If really a bigger timeout is needed in Updater <<this update is really needed give it best luck>>, what about adding a Stream::getTimeout(), saving it in ::writeStream()'s beginning then setting it to twice its value (more? a big constant?), and restoring it in the end of ::writeStream() (still removing the retry) ?

@adrian-dybwad
Copy link
Contributor Author

There are two timeouts. A connect timeout and a stream timeout. setTimeout seems to change both though? I would not want a different timeout for other things. I was setting a timeout of 8 seconds - and tried 12 for the WiFiClient before connecting (it's default is 5). I then tried setting timeout on the stream of around 2 or 3 seconds.

Correct me if I am wrong though: The stream timeout is a timeout that after it expires, the stream will retry so if you have too long timeout set on that, it will wait for nothing and not help getting the data. That is why, if the stream timeout is lower, and the read is retried more times it works. Would that be right?

@d-a-v
Copy link
Collaborator

d-a-v commented May 21, 2019

setTimeout seems to change both though?

Yes, ClientContext uses WiFiClient/Stream timeout in ::connect(). And yes WiFiClient overrides timeout to 5000.

The stream timeout is a timeout that after it expires, the stream will retry

Stream itself does not retry. If the delay between two separate receives is greater than timeout, it returns.
If timeout was longer, then there'd be no need to externally retry (like it does now).
That's why I propose to increase timeout in ::writeStream() (and restore it after), or increase your timeout once for all (onto data) to something big enough for the updater Stream/client.

@d-a-v d-a-v added the waiting for feedback Waiting on additional info. If it's not received, the issue may be closed. label May 23, 2019
@devyte
Copy link
Collaborator

devyte commented Nov 6, 2019

@adrionics What is the status here? did you test based on @d-a-v 's comments?

@adrian-dybwad
Copy link
Contributor Author

@adrionics What is the status here? did you test based on @d-a-v 's comments?

Sorry, I did not have time to spend on this and used the modifications as posted already with excellent performance or reliability. I can revisit in the future, but it can be hard to test or verify on thousands of devices as has already been done now. I think it is better than it has ever been in terms of successful updates across the base with the changes described.

@earlephilhower earlephilhower added the merge-conflict PR has a merge conflict that needs manual correction label Feb 26, 2020
@devyte
Copy link
Collaborator

devyte commented Apr 4, 2021

@d-a-v does this PR still make sense in view of your Stream:send implementation? Or would a different Stream-ing approach be possible/better?

@d-a-v d-a-v removed the merge-conflict PR has a merge conflict that needs manual correction label Apr 4, 2021
Copy link
Collaborator

@d-a-v d-a-v left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The stream::send approach should indeed be considered but that could be in another PR.
This PR apparently solves some updater API issues.
I wonder if @PurpleAir is still using it.
(merge conflict is fixed)

@adrian-dybwad
Copy link
Contributor Author

The stream::send approach should indeed be considered but that could be in another PR.
This PR apparently solves some updater API issues.
I wonder if @PurpleAir is still using it.
(merge conflict is fixed)

Thank you, yes we are.

@devyte devyte merged commit bde6ab1 into esp8266:master Apr 5, 2021
@devyte devyte added component: core type: bug and removed waiting for feedback Waiting on additional info. If it's not received, the issue may be closed. labels Apr 5, 2021
@devyte devyte added this to the 3.0.0 milestone Apr 5, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Improvement to timeout in Updater.cpp
5 participants