chore: use a cache to select best latency #7879
Merged
Problem
Currently, the latency report for workspace proxies is occasionally inconsistent: some page loads show very high latency on the proxy closest to the user, causing a slower proxy to be selected instead.
Debugging this has proved difficult.
There are two core issues with the current implementation:
This PR only begins to address #2. Collecting data on this has been challenging because it is all client-side.
What this does
Each time we load proxy latencies, we store the fetched latencies in local storage.
We then always use the minimum of the cached latencies and the latest measurement, since the minimum best describes the best-case round trip time. We might later change this to some sort of "median" or "average without outliers".
We store at most 8 cached latencies, and remove any latencies gathered more than 1 week ago.
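A minimal sketch of this caching scheme (the names `LatencyEntry`, `updateCache`, and `bestLatency` are illustrative, not the actual identifiers in the PR):

```typescript
// Illustrative sketch of the cache described above; the real
// implementation may differ in names and storage layout.
interface LatencyEntry {
  latencyMs: number;
  checkedAt: number; // unix millis when the check was made
}

const MAX_ENTRIES = 8;
const ONE_WEEK_MS = 7 * 24 * 60 * 60 * 1000;

// Append the latest measurement, drop entries older than a week,
// and keep only the most recent MAX_ENTRIES.
function updateCache(
  cached: LatencyEntry[],
  latest: LatencyEntry,
  now: number = Date.now(),
): LatencyEntry[] {
  return [...cached, latest]
    .filter((e) => now - e.checkedAt <= ONE_WEEK_MS)
    .slice(-MAX_ENTRIES);
}

// Use the minimum as the best-case round trip time.
function bestLatency(entries: LatencyEntry[]): number {
  return Math.min(...entries.map((e) => e.latencyMs));
}
```

The pruning happens on every update, so the stored payload stays bounded regardless of how often latencies are refreshed.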
Debugging + Gathering Data
Honestly, I am unsure what the magic solution is here. Is the minimum really the best? Is the average better?
Without some data, it is hard to know. Unfortunately, my test deployments behave differently, and appear more consistent, so it is hard to gather data outside of production. And since all this latency data is client-side, we cannot push it into our telemetry or metrics.
To get some data to inform these decisions, I added a local storage value called `workspace-proxy-latencies-max`. This value can be set to any number, and your browser will save up to that many latency checks into local storage. We can set it to something like 300, dogfood the product, then download the data and see how inconsistent our latencies really are.

Performance
Performance here is probably not the best and can be improved. Local storage lives on the user's disk, so access is not as fast as memory. We can optimize this later if this proves to be a useful strategy.
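One possible optimization, sketched under the assumption of a simple write-through design: parse local storage once, serve reads from memory, and write back only on update. The `StorageLike` interface and the `proxy-latencies` key below are stand-ins for `window.localStorage` and the real key, used here so the sketch stays self-contained:

```typescript
// Write-through cache: parse from storage once at construction, serve
// reads from memory, and persist only when the entries change.
interface StorageLike {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

class LatencyStore {
  private entries: number[];

  constructor(
    private storage: StorageLike,
    private key = "proxy-latencies", // illustrative key, not the real one
  ) {
    // Single disk read; malformed/missing data falls back to empty.
    this.entries = JSON.parse(storage.getItem(key) ?? "[]");
  }

  // In-memory read: no storage access after construction.
  get(): number[] {
    return this.entries;
  }

  // Update memory first, then persist the bounded list.
  add(latencyMs: number): void {
    this.entries = [...this.entries, latencyMs].slice(-8);
    this.storage.setItem(this.key, JSON.stringify(this.entries));
  }
}
```

Reads on every page render then cost nothing, and disk writes happen only as often as new latency checks arrive.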