Skip to content

feat: notifications: report failed workspace builds #14571

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 126 commits into from
Sep 18, 2024
Merged

Conversation

mtojek
Copy link
Member

@mtojek mtojek commented Sep 5, 2024

Fixes: coder/internal#40

This PR introduces support for email reports, including a report with failed workspace builds. A report generator runs periodically to see if users expect a new report. Once the report is successfully generated and spread, a record (or records) are logged in the database table.

In the multi-instance setup, report generators use an exclusive locking mechanism to ensure only one instance is running.

@mtojek mtojek self-assigned this Sep 5, 2024
@mtojek
Copy link
Member Author

mtojek commented Sep 17, 2024

Alright, I addressed most of the feedback. Important changes:

  • I decided to stick with the concept of email per template. Template admins should not receive emails if there are no failed workspace builds.
  • Don't need to remove old entries in the notification_report_generator_log as right now there will be one record for notifications.TemplateWorkspaceBuildsFailedReport. We will see more if there are more report type defined.
  • Reports are sent to template admins at the same time. If somebody disables notifications, they will receive the report in the next week. Actually, it simplified the tracking logic.

Copy link
Contributor

@dannykopping dannykopping left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Approving to unblock, but I really feel we should try add some testing around NewReportGenerator. I'm happy to land this first and have a follow-up PR which just focuses on that, since it could get complicated.

Excited to see this in action!

delay = 15 * time.Minute
)

func NewReportGenerator(ctx context.Context, logger slog.Logger, db database.Store, enqueuer notifications.Enqueuer, clk quartz.Clock) io.Closer {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is not tested using the current approach is exclusive locking, and this will be hard.

What makes this particularly hard (asking for real, not rhetorically)? To me it seems like we'd need to invoke NewReportGenerator in two separate goroutines and just ensure that we only get what we're expecting, which you've implemented in other tests.

I think it's quite risky to have functionality which should be guarding against pathological behaviour which is not covered by a test.

@mtojek mtojek merged commit 6de5937 into main Sep 18, 2024
26 checks passed
@mtojek mtojek deleted the 40-failure-summary branch September 18, 2024 07:11
@github-actions github-actions bot locked and limited conversation to collaborators Sep 18, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

feat: Email reports/summaries
4 participants