-
Notifications
You must be signed in to change notification settings - Fork 874
feat: improve resources_monitoring for OOM & OOD monitoring #16241
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
0b76f49
to
68b022f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks mostly fine to me, but missing some test cases around the unhappy code paths added to ConvertState()
.
DROP TABLE IF EXISTS workspace_agent_memory_resource_monitors; | ||
DROP TABLE IF EXISTS workspace_agent_volume_resource_monitors; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Obligatory reminder to double-check migration number before merging!
provisioner/terraform/testdata/calling-module/calling-module.tfplan.json
Show resolved
Hide resolved
provisioner/terraform/testdata/calling-module/calling-module.tfstate.json
Show resolved
Hide resolved
coderd/database/migrations/000288_agent_resource_monitors.up.sql
Outdated
Show resolved
Hide resolved
coderd/database/migrations/000288_agent_resource_monitors.up.sql
Outdated
Show resolved
Hide resolved
@@ -0,0 +1,14 @@ | |||
CREATE TABLE workspace_agent_memory_resource_monitors ( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Something to consider...
CREATE TYPE workspace_agent_resource_monitor AS ENUM ('memory', 'volume');
CREATE TABLE workspace_agent_resource_monitors (
id uuid not null primary key,
agent_id uuid not null foreign key ...,
type workspace_agent_resource_monitor not null,
enabled bool not null,
threshold integer not null,
path text not null,
created_at ...
)
This would obviously be less ideal if we plan to add other attributes later.
coderd/rbac/roles_test.go
Outdated
AuthorizeMap: map[bool][]hasAuthSubjects{ | ||
true: {owner}, | ||
false: { | ||
memberMe, orgMemberMe, otherOrgMember, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not authorizing memberMe
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not really sure how to configure RBAC for my needs - I'll let the PR opened until I have enough in the agent one so I can validate RBAC are well configured if good for you.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we will need memberMe
and orgMemberMe
here, although it will become apparent once we add an API endpoint or similar that tries to read this data.
Adding or extending some existing tests should make it apparent what we need here.
Fixes :
|
enabled boolean NOT NULL, | ||
threshold integer NOT NULL, | ||
created_at timestamp with time zone NOT NULL, | ||
PRIMARY KEY (agent_id) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hhmm, I don't think this will work? The primary key needs to be unique.
https://www.postgresql.org/docs/current/ddl-constraints.html#DDL-CONSTRAINTS-PRIMARY-KEYS
A primary key constraint indicates that a column, or group of columns, can be used as a unique identifier for rows in the table.
In general, when you're creating database migrations, it's good practice to perform a few basic operations against them to validate that your design works.
I see what you're trying to do, but I think you're try to be overly efficient.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you explain to me how we can have duplicates here ? I maybe am missing some logic.
As we authorize only one resource monitor of type memory
here - and the resource monitor is linked to an agent id - I feel like it is doing what we want ?
If this one breaks it means that there is multiple memory resource monitors declared in the terraform configuration.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hhmm, fair point. It's true that each agent should only have one memory resource monitor.
Please leave a comment in the migration explaining this constraint for future readers without context 👍
I should not review PRs before bed 🫤
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As the existing tests pass, I'm in favour of merging this and addressing further issues as they arise (unless @dannykopping has objections). The existing PR has become large enough as well (discounting the changes to generated resources) that further changes here may become unwieldy to review.
Don't forget to check the migration number before merging!
As requested for this issue we need to have a new resource
resources_monitoring
in the agent.It needs to be parsed from the provisioner and inserted into a new db table.