-
Notifications
You must be signed in to change notification settings - Fork 887
refactor: workspace builds #7541
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Signed-off-by: Spike Curtis <spike@coder.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me overall, just a couple of comments around unit tests and concurrency safety.
var workspaceBuild *database.WorkspaceBuild | ||
var provisionerJob *database.ProvisionerJob | ||
err := store.InTx(func(store database.Store) error { | ||
b.store = store |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need to be worried about a builder being used across multiple goroutines?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Depends on what you mean.
When you are configuring the builder, it's passed around by value, so it's safe to use in multiple go routines. However, it's not safe to call b.Build()
from more than one goroutine. It's hard to imagine why you would do that, as we explicitly ensure that only one build can be running at a time per workspace.
If, however, you're talking about multiple goroutines attempting to insert a build on the same logical workspace (i.e. the goroutines don't share memory, but they are operating on the same database object), then the use of database transactions will ensure at most one build is inserted.
|
||
// Run with RepeatableRead isolation so that the build process sees the same data | ||
// as our calculation that determines whether an autobuild is necessary. | ||
}, &sql.TxOptions{Isolation: sql.LevelRepeatableRead}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given that this new package is taking on the burden of being the one-stop-shop for building workspaces, it might be a good idea to add dedicated unit tests for the package instead of relying on the consuming packages to do the job for us.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The first review round is done...
I admit that I don't understand all concepts you've applied to this PR, so it might be a good idea to arrange some walking-through-code session (state.explicit, build reason: autostart/autostop, retry 5 times).
The idea of clean separation between fetching objects from DB, and processing the build is obviously the correct one 👍 . I wouldn't call this PR a refactor, this is a new truly new workspace builder!
Site question: did you think about the deprecation of legacy parameters? Will it be as simple as stripping references to ParameterValues
and resolver.Legacy
?
@@ -60,6 +59,7 @@ type Parameter struct { | |||
DestinationScheme ParameterDestinationScheme `json:"destination_scheme" table:"destination scheme" validate:"ne=none" enums:"none,environment_variable,provisioner_variable"` | |||
CreatedAt time.Time `json:"created_at" table:"created at" format:"date-time"` | |||
UpdatedAt time.Time `json:"updated_at" table:"updated at" format:"date-time"` | |||
SourceValue string `json:"source_value"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Wasn't that intentionally skipped not to leak any secret values?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe?
I don't consider it a security risk to allow users to read back out the values that they themselves created. It's very useful, and we allow it in the case of Rich Parameters.
In this case, we use it while migrating legacy parameters to rich parameters in codersdk.ParameterResolver
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see your point, maybe it is not so bad.
In the case of rich parameters (vel. workspace build parameters), it is true. Template owners can use managed template variables to hide sensitive information (property sensitive
).
return nil | ||
} | ||
|
||
func (b *Builder) getLastBuildParameters() ([]database.WorkspaceBuildParameter, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: should we add some context to all return nil, err
for debugging purposes?
Legacy []Parameter | ||
Rich []WorkspaceBuildParameter |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: should we unexpose these properties or do you intend to use them in future PRs?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
They're exposed so you can do
resolver := &codersdk.ParameterResolver{
Legacy: legacyParams,
Rich: richParams,
}
Otherwise we'd need a NewParameterResolver()
function to initialize this struct from outside the package.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, so the only questionable thing is the location of ParameterResolver
, but I think that I have already raised this concern in some other comment.
|
||
// ValidateResolve checks the provided value, v, against the parameter, p, and the previous build. If v is nil, it also | ||
// resolves the correct value. It returns the value of the parameter, if valid, and an error if invalid. | ||
func (r *ParameterResolver) ValidateResolve(p TemplateVersionParameter, v *WorkspaceBuildParameter) (value string, err error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know that we're testing this logic mostly in the coderd API tests, but maybe a few extra unit tests here will be useful.
} | ||
if err != nil { | ||
// Other (hard) error | ||
return nil, nil, err |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe add some error wrapping here if this error is really unexpected.
I did think about removing legacy parameter support from workspaces. We'll want to remove
So, neither the |
func (b Builder) VersionID(v uuid.UUID) Builder { | ||
// nolint: revive | ||
b.version = versionTarget{specific: &v} | ||
return b | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we expose these on the struct directly and remove these helper functions?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Then the New
func could be discarded too if we wanted, which could be nice.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can.
I'll admit I do like the Fluent interface that we'd lose.
builder := wsbuilder.New(workspace, database.WorkspaceTransitionStart).
Reason(database.BuildReasonInitiator).
Initiator(apiKey.UserID).
ActiveVersion().
LegacyParameterValues(createWorkspace.ParameterValues).
RichParameterValues(createWorkspace.RichParameterValues).
State(state)
Would become
builder := wsbuilder.Builder{
Workspace: workspace,
Transition: database.WorkspaceTransitionStart,
Reason: database.BuildReasonInitiator,
Initiator: apiKey.UserID,
Version: VersionTarget{Active: true},
LegacyParameterValues: createWorkspace.ParameterValues,
RichParameterValues: createWorkspace.RichParameterValues,
State: StateTarget{Explicit: state}
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm playing around with the ...opts
pattern that we use elsewhere in the codebase, but I don't like it for this.
builder := wsbuilder.New(
workspace, database.WorkspaceTransition(createBuild.Transition),
wsbuilder.Initiator(apiKey.UserID),
wsbuilder.LegacyParameterValues(createBuild.ParameterValues),
wsbuilder.RichParameterValues(createBuild.RichParameterValues),
wsbuilder.LogLevel(string(createBuild.LogLevel)),
)
if createBuild.TemplateVersionID != uuid.Nil {
builder = wsbuilder.VersionID(createBuild.TemplateVersionID)(builder)
}
You have to type wsbuilder.
a bunch of extra times, and it's more awkward to modify the builder after the first call to New()
, as in the VersionID
call above.
err := b.checkTemplateVersionMatchesTemplate() | ||
if err != nil { | ||
return nil, nil, err | ||
} | ||
err = b.checkTemplateJobStatus() | ||
if err != nil { | ||
return nil, nil, err | ||
} | ||
err = b.checkRunningBuild() | ||
if err != nil { | ||
return nil, nil, err | ||
} | ||
|
||
template, err := b.getTemplate() | ||
if err != nil { | ||
return nil, nil, BuildError{http.StatusInternalServerError, "failed to fetch template", err} | ||
} | ||
|
||
templateVersionJob, err := b.getTemplateVersionJob() | ||
if err != nil { | ||
return nil, nil, BuildError{ | ||
http.StatusInternalServerError, "failed to fetch template version job", err, | ||
} | ||
} | ||
|
||
legacyParameters, err := b.getLastParameterValues() | ||
if err != nil { | ||
return nil, nil, BuildError{ | ||
http.StatusInternalServerError, | ||
"failed to fetch previous legacy parameters.", | ||
err, | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems like all of these can happen in parallel and use an errgroup.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That adds a fair amount of complexity.
The method calls read & modify an internal cache of db objects, which would have to have a mutex protecting it if it gets accessed from multiple goroutines. Holding that lock while querying the database would bring us back serial requests, but not holding the lock opens a big can of worms where you have to track requests in flight, lest we query the database multiple times for the same thing. So, the database calls get partially serialized as a natural consequence, meaning the level of parallelism we can achieve isn't that great.
I say not worth it!
continue | ||
} | ||
|
||
_, err = b.store.InsertParameterValue(b.ctx, database.InsertParameterValueParams{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This can also use an errgroup
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we make that a later optimization? Like if we have some traces or whatever that show this is a bottleneck.
I feel like your typical template has like 5 or fewer parameters, so serializing the calls isn't a big deal. It's what the code I'm refactoring from does.
Signed-off-by: Spike Curtis <spike@coder.com>
Signed-off-by: Spike Curtis <spike@coder.com>
Signed-off-by: Spike Curtis <spike@coder.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for arranging the walk-thru session. I don't see any blockers or major issues. It would be great to shift some tests from the API level to the builder.
Signed-off-by: Spike Curtis <spike@coder.com>
Signed-off-by: Spike Curtis <spike@coder.com>
Signed-off-by: Spike Curtis <spike@coder.com>
coderd/database/mock/doc.go
Outdated
@@ -0,0 +1,4 @@ | |||
// package mock contains a mocked implementation of the database.Store interface for use in tests | |||
package mock |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's arguably some redundancy because of the name db
inside the database
package, but it would be kinda nice if this was called dbmock
instead for consistency.
Signed-off-by: Spike Curtis <spike@coder.com>
Signed-off-by: Spike Curtis <spike@coder.com>
Refactors workspace builds to consolidate the business logic of inserting a new build in one place. Used by API calls to create workspaces or trigger new builds, as well as the autobuild executor (auto-start, auto-stop).
I realize that we had a bad experience with
models
package in v1, so I'm wary of creating a giant package that tries to encapsulate all the business logic of all the things. Here I create a modest package with (for now) a single main object that handles creating a new build. Note that in the case of creating a workspace, the API handler still creates the workspace, then calls into the new package to create the build.Some other smaller changes to point out:
ParameterResolver
type incodersdk
, which handles resolving and validating rich parameter values on the build. This is a separate type because I intend to use it in the future to also consolidate the business logic around whether new parameter values are needed on the CLI and during auto-starts that automatically upgrade to the latest template version.