Skip to content

refactor: workspace builds #7541

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 10 commits into from
May 23, 2023
Merged

refactor: workspace builds #7541

merged 10 commits into from
May 23, 2023

Conversation

spikecurtis
Copy link
Contributor

Refactors workspace builds to consolidate the business logic of inserting a new build in one place. Used by API calls to create workspaces or trigger new builds, as well as the autobuild executor (auto-start, auto-stop).

I realize that we had a bad experience with models package in v1, so I'm wary of creating a giant package that tries to encapsulate all the business logic of all the things. Here I create a modest package with (for now) a single main object that handles creating a new build. Note that in the case of creating a workspace, the API handler still creates the workspace, then calls into the new package to create the build.

Some other smaller changes to point out:

  • I created a ParameterResolver type in codersdk, which handles resolving and validating rich parameter values on the build. This is a separate type because I intend to use it in the future to also consolidate the business logic around whether new parameter values are needed on the CLI and during auto-starts that automatically upgrade to the latest template version.
  • I added checks that verify the template version for a build belongs to the template, and found that our template creation API was bugged and allowed you to create a new template for a version that was already part of an existing template. This would have the effect of "stealing" the template version from the existing template in the database. Some of our tests were exploiting this behavior in a lazy way, so I fixed the tests as well.

Signed-off-by: Spike Curtis <spike@coder.com>
@spikecurtis spikecurtis changed the title refactor workspace builds refactor: workspace builds May 16, 2023
Signed-off-by: Spike Curtis <spike@coder.com>
Copy link
Member

@johnstcn johnstcn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me overall, just a couple of comments around unit tests and concurrency safety.

var workspaceBuild *database.WorkspaceBuild
var provisionerJob *database.ProvisionerJob
err := store.InTx(func(store database.Store) error {
b.store = store
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to be worried about a builder being used across multiple goroutines?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Depends on what you mean.

When you are configuring the builder, it's passed around by value, so it's safe to use in multiple go routines. However, it's not safe to call b.Build() from more than one goroutine. It's hard to imagine why you would do that, as we explicitly ensure that only one build can be running at a time per workspace.

If, however, you're talking about multiple goroutines attempting to insert a build on the same logical workspace (i.e. the goroutines don't share memory, but they are operating on the same database object), then the use of database transactions will ensure at most one build is inserted.


// Run with RepeatableRead isolation so that the build process sees the same data
// as our calculation that determines whether an autobuild is necessary.
}, &sql.TxOptions{Isolation: sql.LevelRepeatableRead})
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given that this new package is taking on the burden of being the one-stop-shop for building workspaces, it might be a good idea to add dedicated unit tests for the package instead of relying on the consuming packages to do the job for us.

Copy link
Member

@mtojek mtojek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The first review round is done...

I admit that I don't understand all concepts you've applied to this PR, so it might be a good idea to arrange some walking-through-code session (state.explicit, build reason: autostart/autostop, retry 5 times).

The idea of clean separation between fetching objects from DB, and processing the build is obviously the correct one 👍 . I wouldn't call this PR a refactor, this is a new truly new workspace builder!

Site question: did you think about the deprecation of legacy parameters? Will it be as simple as stripping references to ParameterValues and resolver.Legacy?

@@ -60,6 +59,7 @@ type Parameter struct {
DestinationScheme ParameterDestinationScheme `json:"destination_scheme" table:"destination scheme" validate:"ne=none" enums:"none,environment_variable,provisioner_variable"`
CreatedAt time.Time `json:"created_at" table:"created at" format:"date-time"`
UpdatedAt time.Time `json:"updated_at" table:"updated at" format:"date-time"`
SourceValue string `json:"source_value"`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Wasn't that intentionally skipped not to leak any secret values?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe?

I don't consider it a security risk to allow users to read back out the values that they themselves created. It's very useful, and we allow it in the case of Rich Parameters.

In this case, we use it while migrating legacy parameters to rich parameters in codersdk.ParameterResolver.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see your point, maybe it is not so bad.

In the case of rich parameters (vel. workspace build parameters), it is true. Template owners can use managed template variables to hide sensitive information (property sensitive).

return nil
}

func (b *Builder) getLastBuildParameters() ([]database.WorkspaceBuildParameter, error) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: should we add some context to all return nil, err for debugging purposes?

Comment on lines +124 to +125
Legacy []Parameter
Rich []WorkspaceBuildParameter
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: should we unexpose these properties or do you intend to use them in future PRs?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They're exposed so you can do

resolver := &codersdk.ParameterResolver{
    Legacy: legacyParams,
    Rich: richParams,
}

Otherwise we'd need a NewParameterResolver() function to initialize this struct from outside the package.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, so the only questionable thing is the location of ParameterResolver, but I think that I have already raised this concern in some other comment.


// ValidateResolve checks the provided value, v, against the parameter, p, and the previous build. If v is nil, it also
// resolves the correct value. It returns the value of the parameter, if valid, and an error if invalid.
func (r *ParameterResolver) ValidateResolve(p TemplateVersionParameter, v *WorkspaceBuildParameter) (value string, err error) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know that we're testing this logic mostly in the coderd API tests, but maybe a few extra unit tests here will be useful.

}
if err != nil {
// Other (hard) error
return nil, nil, err
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe add some error wrapping here if this error is really unexpected.

@spikecurtis
Copy link
Contributor Author

Site question: did you think about the deprecation of legacy parameters? Will it be as simple as stripping references to ParameterValues and resolver.Legacy?

I did think about removing legacy parameter support from workspaces. We'll want to remove

  • ParameterResolver.Legacy
  • Builder.legacyParameterValues
  • Builder.lastParameterValues

So, neither the Builder nor the ParameterResolver should have to care about them: they're handled entirely on the template.

Comment on lines +79 to +83
func (b Builder) VersionID(v uuid.UUID) Builder {
// nolint: revive
b.version = versionTarget{specific: &v}
return b
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we expose these on the struct directly and remove these helper functions?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then the New func could be discarded too if we wanted, which could be nice.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can.

I'll admit I do like the Fluent interface that we'd lose.

builder := wsbuilder.New(workspace, database.WorkspaceTransitionStart).
    Reason(database.BuildReasonInitiator).
    Initiator(apiKey.UserID).
    ActiveVersion().
    LegacyParameterValues(createWorkspace.ParameterValues).
    RichParameterValues(createWorkspace.RichParameterValues).
    State(state)

Would become

builder := wsbuilder.Builder{
    Workspace: workspace,
    Transition: database.WorkspaceTransitionStart,
    Reason: database.BuildReasonInitiator,
    Initiator: apiKey.UserID,
    Version: VersionTarget{Active: true},
    LegacyParameterValues: createWorkspace.ParameterValues,
    RichParameterValues: createWorkspace.RichParameterValues,
    State: StateTarget{Explicit: state}
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm playing around with the ...opts pattern that we use elsewhere in the codebase, but I don't like it for this.

	builder := wsbuilder.New(
		workspace, database.WorkspaceTransition(createBuild.Transition),
		wsbuilder.Initiator(apiKey.UserID),
		wsbuilder.LegacyParameterValues(createBuild.ParameterValues),
		wsbuilder.RichParameterValues(createBuild.RichParameterValues),
		wsbuilder.LogLevel(string(createBuild.LogLevel)),
	)

	if createBuild.TemplateVersionID != uuid.Nil {
		builder = wsbuilder.VersionID(createBuild.TemplateVersionID)(builder)
	}

You have to type wsbuilder. a bunch of extra times, and it's more awkward to modify the builder after the first call to New(), as in the VersionID call above.

Comment on lines +227 to +259
err := b.checkTemplateVersionMatchesTemplate()
if err != nil {
return nil, nil, err
}
err = b.checkTemplateJobStatus()
if err != nil {
return nil, nil, err
}
err = b.checkRunningBuild()
if err != nil {
return nil, nil, err
}

template, err := b.getTemplate()
if err != nil {
return nil, nil, BuildError{http.StatusInternalServerError, "failed to fetch template", err}
}

templateVersionJob, err := b.getTemplateVersionJob()
if err != nil {
return nil, nil, BuildError{
http.StatusInternalServerError, "failed to fetch template version job", err,
}
}

legacyParameters, err := b.getLastParameterValues()
if err != nil {
return nil, nil, BuildError{
http.StatusInternalServerError,
"failed to fetch previous legacy parameters.",
err,
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like all of these can happen in parallel and use an errgroup.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That adds a fair amount of complexity.

The method calls read & modify an internal cache of db objects, which would have to have a mutex protecting it if it gets accessed from multiple goroutines. Holding that lock while querying the database would bring us back serial requests, but not holding the lock opens a big can of worms where you have to track requests in flight, lest we query the database multiple times for the same thing. So, the database calls get partially serialized as a natural consequence, meaning the level of parallelism we can achieve isn't that great.

I say not worth it!

continue
}

_, err = b.store.InsertParameterValue(b.ctx, database.InsertParameterValueParams{
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can also use an errgroup

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we make that a later optimization? Like if we have some traces or whatever that show this is a bottleneck.

I feel like your typical template has like 5 or fewer parameters, so serializing the calls isn't a big deal. It's what the code I'm refactoring from does.

Signed-off-by: Spike Curtis <spike@coder.com>
Signed-off-by: Spike Curtis <spike@coder.com>
Signed-off-by: Spike Curtis <spike@coder.com>
@mtojek mtojek self-requested a review May 17, 2023 10:02
Copy link
Member

@mtojek mtojek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for arranging the walk-thru session. I don't see any blockers or major issues. It would be great to shift some tests from the API level to the builder.

Signed-off-by: Spike Curtis <spike@coder.com>
Signed-off-by: Spike Curtis <spike@coder.com>
Signed-off-by: Spike Curtis <spike@coder.com>
@spikecurtis spikecurtis requested a review from kylecarbs May 22, 2023 10:40
@ammario ammario removed the sprint-0 label May 22, 2023
@@ -0,0 +1,4 @@
// package mock contains a mocked implementation of the database.Store interface for use in tests
package mock
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's arguably some redundancy because of the name db inside the database package, but it would be kinda nice if this was called dbmock instead for consistency.

Signed-off-by: Spike Curtis <spike@coder.com>
Signed-off-by: Spike Curtis <spike@coder.com>
@spikecurtis spikecurtis merged commit cd416c8 into main May 23, 2023
@spikecurtis spikecurtis deleted the spike/workspace-builds-refactor branch May 23, 2023 08:06
@github-actions github-actions bot locked and limited conversation to collaborators May 23, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants