-
Notifications
You must be signed in to change notification settings - Fork 886
feat: Fix Deployment DAUs to work with local timezones #7647
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Previously only worked for UTC timezone
// closest returns the value in the values map that has a key with the value most | ||
// close to the requested key. This is so if a user requests a timezone offset that | ||
// we do not have, we return the closest one we do have to the user. | ||
func closest[V any](values map[int]V, offset int) (int, V, bool) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't really understand the purpose of only allowing a set number of timezones. What would be the harm in just passing them to Postgres verbatim?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What do you mean by that? There are only 24 timezones, and I just used this instead of input validation because idk if there is some odd edge case of a -13 offset.
As for passing it to postgres. We cache these values and reuse them for all calls. So when the caller requests the data, they pull from a cache, they do not trigger a query. So the data must already be available.
I suppose a better cache would populate things lazily, but that would require a larger refactor.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are more than 24 time zones, e.g. India is 10.5h ahead of CST.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am doing a +/- 12hr offset to UTC (0). So India would use the +10 bucket if it's +10.5.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Gotcha, I think I misinterpreted your comment as stating that there are 24 hours timezone in the world. I think an hour error tolerance is immaterial, so am good with the implentation you've described. But, I haven't checked the code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, that makes sense. I forgot that everything is cached in the background rather than on-demand. It's unfortunate that we need count(templates) * 24
queries for each tick though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is unfortunate, but it runs once per hour, so I am not too worried.
Why not do the frontend in this PR too? I assume querying the browser for timezone is just a few LoC, and let's us merge this in vertically. |
I really wanted to make sure this solution was acceptable before I did the FE. FE always takes me a longer time to figure out and test, and did not want to spend the hours if this was in vain. |
// closest returns the value in the values map that has a key with the value most | ||
// close to the requested key. This is so if a user requests a timezone offset that | ||
// we do not have, we return the closest one we do have to the user. | ||
func closest[V any](values map[int]V, offset int) (int, V, bool) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, that makes sense. I forgot that everything is cached in the background rather than on-demand. It's unfortunate that we need count(templates) * 24
queries for each tick though.
What this does
This calculates the DAUs for the deployment and all templates for all 24 timezone offsets [-12,12]. The user requesting the DAUs can request a timezone offset, and the server will return the closest one available. The payload includes the offset chosen.
This will accommodate half timezones, daylight savings, etc. All the user needs to do is calculate their timezone offset to UTC at the time of calling. So if you are in daylight savings, just move your offset +/- 1. The days are still 24hours long, so the lost/added hour will not be in the correct bucket. That 1 hour accuracy lost is probably not a big deal though, since the day boundary is when people should not be working anyway.
Doing it this way abstracts away the concept of timezones from the back-end, and lets the caller decide what boundaries make up a given "day". It gives us pretty good accuracy to what the user is actually requesting, without having to introduce the concept of timezone to the backend.
Fixes: #7275
An ideal world
I think the perfect solution would be to track all activity by the workspace in the user's local timezone. Then we could measure all activity on "Monday" even if 1 person in Europe is technically in "Sunday" for the US deployment. So all "Monday" activity is in the "Monday" bucket.
This would be a big refactor though as we don't track the user's local timezone and all timestamps are in UTC in the database.
Future Work
Have the UI use the local browser timezone. Currently the browser still queries the UTC timezone. Once this is merged the UI can use the
tz_offset
query param to change the offset for the DAU calculation