Skip to content

feat: add user quiet hours schedule and restart requirement feature flag #8115

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 32 commits into from
Jul 20, 2023

Conversation

deansheather
Copy link
Member

@deansheather deansheather commented Jun 20, 2023

Adds feature flag template_restart_requirement which changes max_deadline to be calculated using a new "restart requirement" setting on the template, rendering the max_ttl field unused.

Adds a quiet_hours_schedule field to the users table which stores a daily cron expression e.g. CRON_TZ=Australia/Sydney 2 0 * * * (every day at 2am Sydney time).

Adds restart_requirement_days_of_week and restart_requirement_weeks to templates table. The first value is a uint8 for all 7 days of the week (and an unused bit) governing which days of the week the policy should require restarts on (in the user's quiet hours schedule and timezone). The second value says how many weeks between required restarts. 0-1 means weekly, 2 means fortnightly etc.

The week calculations are done as an offset from an "epoch" date, so every workspace from every customer will have the same n-week calculations for consistency.

Admins must configure the default schedule that applies to users who haven't configured a custom schedule yet, and can also configure the window duration. Users can set a custom schedule in any timezone. The cron expression must be a daily expression with a single time each day.

TODO:

  • Configurable default schedule as flag
    • Feature is disabled until admin picks a default schedule
  • Configurable window duration as flag
  • Tests for schedule package
  • Tests for new APIs
  • Tests for provisionerdserver applying the expected values to builds

Closes #6917

@github-actions github-actions bot added the stale This issue is like stale bread. label Jul 6, 2023
prisma-cloud-devsecops[bot]

This comment was marked as spam.

@github-actions github-actions bot removed the stale This issue is like stale bread. label Jul 8, 2023
return autostop, xerrors.New("coder server system clock is incorrect, cannot calculate template restart requirement")
}
since := startOfStopDay.Sub(epoch)
weeksSinceEpoch := int64(since.Hours() / (24 * 7))
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This algorithm might be too naive??? More research/testing required to ensure this generates correct dates

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Redid the algorithm completely since this algorithm fails if DST has passed once already during the year. The new algorithm is good in a bunch of timezones for the next 100 years (and this is tested).

@deansheather deansheather changed the title feat: add user maintenance schedule for max_ttl autostop feat: add user quiet hours schedule and restart requirement feature flag Jul 12, 2023
@deansheather deansheather marked this pull request as ready for review July 12, 2023 15:18
@deansheather deansheather requested review from Emyrk and coadler July 13, 2023 19:48
Copy link
Contributor

@coadler coadler left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will get to the rest tomorrow

Copy link
Member

@Emyrk Emyrk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

❤️ the docs

I need a longer stretch to grock some of this time stuff. The comments and code look well formatted, will get back to this 👍

Copy link
Member

@Emyrk Emyrk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have not manually tested this, but the time stuff all looks good 👍

The bitmap feels unnecessary at first, since we always convert it into and out of a list. But it does give us validation that it uses each day at most once, all days are in order, and there is a maximum size to the list. I think LoC wise it's more to use the bitmap.

Example is where we use it here:

requirementDays := templateSchedule.RestartRequirement.DaysMap()
. We just expand it into a map.


I have not looked at the schedule package much at all, so there are some things I am not sure about.

This has Sunday as the first day. Is this ok?

func (s Schedule) DaysOfWeek() string {

Comment on lines 316 to 326
// Loop until we find a week that doesn't fail.
var lastErr error
for i := int64(0); i < 3; i++ {
monday, err := GetMondayOfWeek(now.Location(), week+i)
if err != nil {
lastErr = err
continue
}

return monday, nil
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you have any example of why this would fail?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is my only open question.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be honest, I'm not sure how this could fail, but the GetMondayOfWeek algorithm just adds 7 * n days to the epoch which seems kinda naive to me. This is tested though for the next 100 years in all of those timezones so I think it's stable, I'll add a comment saying this is probably not needed but here just in case.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perfect. Was just making sure I wasn't missing something.

@deansheather
Copy link
Member Author

You're probably right about the bitmap thing being more LoC. I'll see about making it a string array and validating it in the DB.

@coadler
Copy link
Contributor

coadler commented Jul 17, 2023

Personally I like the bitmap for storing in the DB, it makes the Postgres side pretty simple without the need for arrays. It felt overkill at first, but I think it's a suitable tradeoff.

@Emyrk
Copy link
Member

Emyrk commented Jul 17, 2023

Personally I like the bitmap for storing in the DB, it makes the Postgres side pretty simple without the need for arrays. It felt overkill at first, but I think it's a suitable tradeoff.

It's probably not worth changing. I could go either way in it.

Copy link
Member

@Emyrk Emyrk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 Just my one open comment on how something might fail. It's fine to keep the defensive programming, just curious because I don't see how it could.

@deansheather
Copy link
Member Author

This has Sunday as the first day. Is this ok?

Yeah we're not using it in this code, that's used for the autostart scheduling.

@github-actions
Copy link

github-actions bot commented Jul 19, 2023

✔️ Deployed PR 8115 successfully.
🚀 Access the deployment link here.
⚠️ This deployment will be deleted when the PR is closed.

@bpmct
Copy link
Member

bpmct commented Jul 19, 2023

I just deployed this, this doesn't include the restart requirement in the template settings form, right? That will be a separate PR?

Also, thoughts on another name instead of "restart requirement" since the action will be a stop, but the user could intervene and bump or restart if they wish to keep working?

  • Max session length
  • Auto-stop schedule (different than inactivity timeout)

Perhaps "restart requirement" could work, but didn't feel natural to me. I tried documenting it and couldn't find a natural way to explain it or show it in the UI. Feel free to make the final call here @deansheather
main...bpmct/scheduling-docs

@deansheather
Copy link
Member Author

Also, thoughts on another name instead of "restart requirement" since the action will be a stop, but the user could intervene and bump or restart if they wish to keep working?

In the interest of not making this PR wait for too much longer I'm going to merge it with the types and field names as they are with restart_requirement and we can rename them later once we have a good name.

I dislike "max session length" as you do not set a duration with this new method. If you were setting a duration in hours like the old max_ttl then it would make sense.

Auto-stop schedule doesn't convey that it's a required stop and the workspace can't stay on after this point. Auto-stop requirement could work though.

@bpmct
Copy link
Member

bpmct commented Jul 20, 2023

I like auto-stop requirement 👍

@deansheather deansheather merged commit dc8b731 into main Jul 20, 2023
@deansheather deansheather deleted the dean/user-maintenance-window branch July 20, 2023 13:35
@github-actions github-actions bot locked and limited conversation to collaborators Jul 20, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Template max TTL time bounds
4 participants