diff --git a/.cursorrules b/.cursorrules index ce4412b83f6e9..54966b1dcc89e 100644 --- a/.cursorrules +++ b/.cursorrules @@ -4,7 +4,7 @@ This project is called "Coder" - an application for managing remote development Coder provides a platform for creating, managing, and using remote development environments (also known as Cloud Development Environments or CDEs). It leverages Terraform to define and provision these environments, which are referred to as "workspaces" within the project. The system is designed to be extensible, secure, and provide developers with a seamless remote development experience. -# Core Architecture +## Core Architecture The heart of Coder is a control plane that orchestrates the creation and management of workspaces. This control plane interacts with separate Provisioner processes over gRPC to handle workspace builds. The Provisioners consume workspace definitions and use Terraform to create the actual infrastructure. @@ -12,17 +12,17 @@ The CLI package serves dual purposes - it can be used to launch the control plan The database layer uses PostgreSQL with SQLC for generating type-safe database code. Database migrations are carefully managed to ensure both forward and backward compatibility through paired `.up.sql` and `.down.sql` files. -# API Design +## API Design Coder's API architecture combines REST and gRPC approaches. The REST API is defined in `coderd/coderd.go` and uses Chi for HTTP routing. This provides the primary interface for the frontend and external integrations. Internal communication with Provisioners occurs over gRPC, with service definitions maintained in `.proto` files. This separation allows for efficient binary communication with the components responsible for infrastructure management while providing a standard REST interface for human-facing applications. -# Network Architecture +## Network Architecture Coder implements a secure networking layer based on Tailscale's Wireguard implementation. The `tailnet` package provides connectivity between workspace agents and clients through DERP (Designated Encrypted Relay for Packets) servers when direct connections aren't possible. This creates a secure overlay network allowing access to workspaces regardless of network topology, firewalls, or NAT configurations. -## Tailnet and DERP System +### Tailnet and DERP System The networking system has three key components: @@ -35,7 +35,7 @@ The networking system has three key components: 3. **Direct Connections**: When possible, the system establishes peer-to-peer connections between clients and workspaces using STUN for NAT traversal. This requires both endpoints to send UDP traffic on ephemeral ports. -## Workspace Proxies +### Workspace Proxies Workspace proxies (in the Enterprise edition) provide regional relay points for browser-based connections, reducing latency for geo-distributed teams. Key characteristics: @@ -45,9 +45,10 @@ Workspace proxies (in the Enterprise edition) provide regional relay points for - Managed through the `coder wsproxy` commands - Implemented primarily in the `enterprise/wsproxy/` package -# Agent System +## Agent System The workspace agent runs within each provisioned workspace and provides core functionality including: + - SSH access to workspaces via the `agentssh` package - Port forwarding - Terminal connectivity via the `pty` package for pseudo-terminal support @@ -57,7 +58,7 @@ The workspace agent runs within each provisioned workspace and provides core fun Agents communicate with the control plane using the tailnet system and authenticate using secure tokens. -# Workspace Applications +## Workspace Applications Workspace applications (or "apps") provide browser-based access to services running within workspaces. The system supports: @@ -69,17 +70,17 @@ Workspace applications (or "apps") provide browser-based access to services runn The implementation is primarily in the `coderd/workspaceapps/` directory with components for URL generation, proxying connections, and managing application state. -# Implementation Details +## Implementation Details The project structure separates frontend and backend concerns. React components and pages are organized in the `site/src/` directory, with Jest used for testing. The backend is primarily written in Go, with a strong emphasis on error handling patterns and test coverage. Database interactions are carefully managed through migrations in `coderd/database/migrations/` and queries in `coderd/database/queries/`. All new queries require proper database authorization (dbauthz) implementation to ensure that only users with appropriate permissions can access specific resources. -# Authorization System +## Authorization System The database authorization (dbauthz) system enforces fine-grained access control across all database operations. It uses role-based access control (RBAC) to validate user permissions before executing database operations. The `dbauthz` package wraps the database store and performs authorization checks before returning data. All database operations must pass through this layer to ensure security. -# Testing Framework +## Testing Framework The codebase has a comprehensive testing approach with several key components: @@ -91,7 +92,7 @@ The codebase has a comprehensive testing approach with several key components: 4. **Enterprise Testing**: Enterprise features have dedicated test utilities in the `coderdenttest` package. -# Open Source and Enterprise Components +## Open Source and Enterprise Components The repository contains both open source and enterprise components: @@ -100,9 +101,10 @@ The repository contains both open source and enterprise components: - The boundary between open source and enterprise is managed through a licensing system - The same core codebase supports both editions, with enterprise features conditionally enabled -# Development Philosophy +## Development Philosophy Coder emphasizes clear error handling, with specific patterns required: + - Concise error messages that avoid phrases like "failed to" - Wrapping errors with `%w` to maintain error chains - Using sentinel errors with the "err" prefix (e.g., `errNotFound`) @@ -111,7 +113,7 @@ All tests should run in parallel using `t.Parallel()` to ensure efficient testin Git contributions follow a standard format with commit messages structured as `type: `, where type is one of `feat`, `fix`, or `chore`. -# Development Workflow +## Development Workflow Development can be initiated using `scripts/develop.sh` to start the application after making changes. Database schema updates should be performed through the migration system using `create_migration.sh ` to generate migration files, with each `.up.sql` migration paired with a corresponding `.down.sql` that properly reverts all changes. diff --git a/CLAUDE.md b/CLAUDE.md new file mode 100644 index 0000000000000..90d91c9966df7 --- /dev/null +++ b/CLAUDE.md @@ -0,0 +1,104 @@ +# Coder Development Guidelines + +Read [cursor rules](.cursorrules). + +## Build/Test/Lint Commands + +### Main Commands + +- `make build` or `make build-fat` - Build all "fat" binaries (includes "server" functionality) +- `make build-slim` - Build "slim" binaries +- `make test` - Run Go tests +- `make test RUN=TestFunctionName` or `go test -v ./path/to/package -run TestFunctionName` - Test single +- `make test-postgres` - Run tests with Postgres database +- `make test-race` - Run tests with Go race detector +- `make test-e2e` - Run end-to-end tests +- `make lint` - Run all linters +- `make fmt` - Format all code +- `make gen` - Generates mocks, database queries and other auto-generated files + +### Frontend Commands (site directory) + +- `pnpm build` - Build frontend +- `pnpm dev` - Run development server +- `pnpm check` - Run code checks +- `pnpm format` - Format frontend code +- `pnpm lint` - Lint frontend code +- `pnpm test` - Run frontend tests + +## Code Style Guidelines + +### Go + +- Follow [Effective Go](https://go.dev/doc/effective_go) and [Go's Code Review Comments](https://github.com/golang/go/wiki/CodeReviewComments) +- Use `gofumpt` for formatting +- Create packages when used during implementation +- Validate abstractions against implementations + +### Error Handling + +- Use descriptive error messages +- Wrap errors with context +- Propagate errors appropriately +- Use proper error types +- (`xerrors.Errorf("failed to X: %w", err)`) + +### Naming + +- Use clear, descriptive names +- Abbreviate only when obvious +- Follow Go and TypeScript naming conventions + +### Comments + +- Document exported functions, types, and non-obvious logic +- Follow JSDoc format for TypeScript +- Use godoc format for Go code + +## Commit Style + +- Follow [Conventional Commits 1.0.0](https://www.conventionalcommits.org/en/v1.0.0/) +- Format: `type(scope): message` +- Types: `feat`, `fix`, `docs`, `style`, `refactor`, `test`, `chore` +- Keep message titles concise (~70 characters) +- Use imperative, present tense in commit titles + +## Database queries + +- MUST DO! Any changes to database - adding queries, modifying queries should be done in the `coderd\database\queries\*.sql` files. Use `make gen` to generate necessary changes after. +- MUST DO! Queries are grouped in files relating to context - e.g. `prebuilds.sql`, `users.sql`, `provisionerjobs.sql`. +- After making changes to any `coderd\database\queries\*.sql` files you must run `make gen` to generate respective ORM changes. + +## Architecture + +### Core Components + +- **coderd**: Main API service connecting workspaces, provisioners, and users +- **provisionerd**: Execution context for infrastructure-modifying providers +- **Agents**: Services in remote workspaces providing features like SSH and port forwarding +- **Workspaces**: Cloud resources defined by Terraform + +## Sub-modules + +### Template System + +- Templates define infrastructure for workspaces using Terraform +- Environment variables pass context between Coder and templates +- Official modules extend development environments + +### RBAC System + +- Permissions defined at site, organization, and user levels +- Object-Action model protects resources +- Built-in roles: owner, member, auditor, templateAdmin +- Permission format: `?...` + +### Database + +- PostgreSQL 13+ recommended for production +- Migrations managed with `migrate` +- Database authorization through `dbauthz` package + +## Frontend + +For building Frontend refer to [this document](docs/contributing/frontend.md) diff --git a/cli/testdata/server-config.yaml.golden b/cli/testdata/server-config.yaml.golden index 9995a7f389130..7403819a2d10b 100644 --- a/cli/testdata/server-config.yaml.golden +++ b/cli/testdata/server-config.yaml.golden @@ -704,3 +704,7 @@ workspace_prebuilds: # backoff. # (default: 1h0m0s, type: duration) reconciliation_backoff_lookback_period: 1h0m0s + # Maximum number of consecutive failed prebuilds before a preset hits the hard + # limit; disabled when set to zero. + # (default: 3, type: int) + failure_hard_limit: 3 diff --git a/coderd/apidoc/docs.go b/coderd/apidoc/docs.go index 95e2cc0f48ac8..75f480d639dcd 100644 --- a/coderd/apidoc/docs.go +++ b/coderd/apidoc/docs.go @@ -12643,14 +12643,20 @@ const docTemplate = `{ "web-push", "dynamic-parameters", "workspace-prebuilds", - "agentic-chat" + "agentic-chat", + "dev-containers", + "coder-desktop", + "securing-agents" ], "x-enum-comments": { "ExperimentAgenticChat": "Enables the new agentic AI chat feature.", "ExperimentAutoFillParameters": "This should not be taken out of experiments until we have redesigned the feature.", + "ExperimentCoderDesktop": "Enables Coder Desktop functionality.", + "ExperimentDevContainers": "Enables dev containers support.", "ExperimentDynamicParameters": "Enables dynamic parameters when creating a workspace.", "ExperimentExample": "This isn't used for anything.", "ExperimentNotifications": "Sends notifications via SMTP and webhooks following certain events.", + "ExperimentSecuringAgents": "Enables features for securing AI agents.", "ExperimentWebPush": "Enables web push notifications through the browser.", "ExperimentWorkspacePrebuilds": "Enables the new workspace prebuilds feature.", "ExperimentWorkspaceUsage": "Enables the new workspace usage tracking." @@ -12663,7 +12669,10 @@ const docTemplate = `{ "ExperimentWebPush", "ExperimentDynamicParameters", "ExperimentWorkspacePrebuilds", - "ExperimentAgenticChat" + "ExperimentAgenticChat", + "ExperimentDevContainers", + "ExperimentCoderDesktop", + "ExperimentSecuringAgents" ] }, "codersdk.ExternalAuth": { @@ -14326,6 +14335,10 @@ const docTemplate = `{ "codersdk.PrebuildsConfig": { "type": "object", "properties": { + "failure_hard_limit": { + "description": "FailureHardLimit defines the maximum number of consecutive failed prebuild attempts allowed\nbefore a preset is considered to be in a hard limit state. When a preset hits this limit,\nno new prebuilds will be created until the limit is reset.\nFailureHardLimit is disabled when set to zero.", + "type": "integer" + }, "reconciliation_backoff_interval": { "description": "ReconciliationBackoffInterval specifies the amount of time to increase the backoff interval\nwhen errors occur during reconciliation.", "type": "integer" @@ -14901,7 +14914,9 @@ const docTemplate = `{ "application_connect", "assign", "create", + "create_agent", "delete", + "delete_agent", "read", "read_personal", "ssh", @@ -14917,7 +14932,9 @@ const docTemplate = `{ "ActionApplicationConnect", "ActionAssign", "ActionCreate", + "ActionCreateAgent", "ActionDelete", + "ActionDeleteAgent", "ActionRead", "ActionReadPersonal", "ActionSSH", diff --git a/coderd/apidoc/swagger.json b/coderd/apidoc/swagger.json index 02212d9944415..80ab3f74c0527 100644 --- a/coderd/apidoc/swagger.json +++ b/coderd/apidoc/swagger.json @@ -11347,14 +11347,20 @@ "web-push", "dynamic-parameters", "workspace-prebuilds", - "agentic-chat" + "agentic-chat", + "dev-containers", + "coder-desktop", + "securing-agents" ], "x-enum-comments": { "ExperimentAgenticChat": "Enables the new agentic AI chat feature.", "ExperimentAutoFillParameters": "This should not be taken out of experiments until we have redesigned the feature.", + "ExperimentCoderDesktop": "Enables Coder Desktop functionality.", + "ExperimentDevContainers": "Enables dev containers support.", "ExperimentDynamicParameters": "Enables dynamic parameters when creating a workspace.", "ExperimentExample": "This isn't used for anything.", "ExperimentNotifications": "Sends notifications via SMTP and webhooks following certain events.", + "ExperimentSecuringAgents": "Enables features for securing AI agents.", "ExperimentWebPush": "Enables web push notifications through the browser.", "ExperimentWorkspacePrebuilds": "Enables the new workspace prebuilds feature.", "ExperimentWorkspaceUsage": "Enables the new workspace usage tracking." @@ -11367,7 +11373,10 @@ "ExperimentWebPush", "ExperimentDynamicParameters", "ExperimentWorkspacePrebuilds", - "ExperimentAgenticChat" + "ExperimentAgenticChat", + "ExperimentDevContainers", + "ExperimentCoderDesktop", + "ExperimentSecuringAgents" ] }, "codersdk.ExternalAuth": { @@ -12968,6 +12977,10 @@ "codersdk.PrebuildsConfig": { "type": "object", "properties": { + "failure_hard_limit": { + "description": "FailureHardLimit defines the maximum number of consecutive failed prebuild attempts allowed\nbefore a preset is considered to be in a hard limit state. When a preset hits this limit,\nno new prebuilds will be created until the limit is reset.\nFailureHardLimit is disabled when set to zero.", + "type": "integer" + }, "reconciliation_backoff_interval": { "description": "ReconciliationBackoffInterval specifies the amount of time to increase the backoff interval\nwhen errors occur during reconciliation.", "type": "integer" @@ -13509,7 +13522,9 @@ "application_connect", "assign", "create", + "create_agent", "delete", + "delete_agent", "read", "read_personal", "ssh", @@ -13525,7 +13540,9 @@ "ActionApplicationConnect", "ActionAssign", "ActionCreate", + "ActionCreateAgent", "ActionDelete", + "ActionDeleteAgent", "ActionRead", "ActionReadPersonal", "ActionSSH", diff --git a/coderd/database/dbauthz/dbauthz.go b/coderd/database/dbauthz/dbauthz.go index 20afcf66c7867..a210599d17cc4 100644 --- a/coderd/database/dbauthz/dbauthz.go +++ b/coderd/database/dbauthz/dbauthz.go @@ -177,7 +177,7 @@ var ( // Unsure why provisionerd needs update and read personal rbac.ResourceUser.Type: {policy.ActionRead, policy.ActionReadPersonal, policy.ActionUpdatePersonal}, rbac.ResourceWorkspaceDormant.Type: {policy.ActionDelete, policy.ActionRead, policy.ActionUpdate, policy.ActionWorkspaceStop}, - rbac.ResourceWorkspace.Type: {policy.ActionDelete, policy.ActionRead, policy.ActionUpdate, policy.ActionWorkspaceStart, policy.ActionWorkspaceStop}, + rbac.ResourceWorkspace.Type: {policy.ActionDelete, policy.ActionRead, policy.ActionUpdate, policy.ActionWorkspaceStart, policy.ActionWorkspaceStop, policy.ActionCreateAgent}, rbac.ResourceApiKey.Type: {policy.WildcardSymbol}, // When org scoped provisioner credentials are implemented, // this can be reduced to read a specific org. @@ -339,7 +339,7 @@ var ( rbac.ResourceProvisionerDaemon.Type: {policy.ActionCreate, policy.ActionRead, policy.ActionUpdate}, rbac.ResourceUser.Type: rbac.ResourceUser.AvailableActions(), rbac.ResourceWorkspaceDormant.Type: {policy.ActionUpdate, policy.ActionDelete, policy.ActionWorkspaceStop}, - rbac.ResourceWorkspace.Type: {policy.ActionUpdate, policy.ActionDelete, policy.ActionWorkspaceStart, policy.ActionWorkspaceStop, policy.ActionSSH}, + rbac.ResourceWorkspace.Type: {policy.ActionUpdate, policy.ActionDelete, policy.ActionWorkspaceStart, policy.ActionWorkspaceStop, policy.ActionSSH, policy.ActionCreateAgent, policy.ActionDeleteAgent}, rbac.ResourceWorkspaceProxy.Type: {policy.ActionCreate, policy.ActionUpdate, policy.ActionDelete}, rbac.ResourceDeploymentConfig.Type: {policy.ActionCreate, policy.ActionUpdate, policy.ActionDelete}, rbac.ResourceNotificationMessage.Type: {policy.ActionCreate, policy.ActionRead, policy.ActionUpdate, policy.ActionDelete}, @@ -2226,6 +2226,15 @@ func (q *querier) GetPresetParametersByTemplateVersionID(ctx context.Context, ar return q.db.GetPresetParametersByTemplateVersionID(ctx, args) } +func (q *querier) GetPresetsAtFailureLimit(ctx context.Context, hardLimit int64) ([]database.GetPresetsAtFailureLimitRow, error) { + // GetPresetsAtFailureLimit returns a list of template version presets that have reached the hard failure limit. + // Request the same authorization permissions as GetPresetsBackoff, since the methods are similar. + if err := q.authorizeContext(ctx, policy.ActionViewInsights, rbac.ResourceTemplate.All()); err != nil { + return nil, err + } + return q.db.GetPresetsAtFailureLimit(ctx, hardLimit) +} + func (q *querier) GetPresetsBackoff(ctx context.Context, lookback time.Time) ([]database.GetPresetsBackoffRow, error) { // GetPresetsBackoff returns a list of template version presets along with metadata such as the number of failed prebuilds. if err := q.authorizeContext(ctx, policy.ActionViewInsights, rbac.ResourceTemplate.All()); err != nil { @@ -3180,6 +3189,10 @@ func (q *querier) GetWorkspaceByOwnerIDAndName(ctx context.Context, arg database return fetch(q.log, q.auth, q.db.GetWorkspaceByOwnerIDAndName)(ctx, arg) } +func (q *querier) GetWorkspaceByResourceID(ctx context.Context, resourceID uuid.UUID) (database.Workspace, error) { + return fetch(q.log, q.auth, q.db.GetWorkspaceByResourceID)(ctx, resourceID) +} + func (q *querier) GetWorkspaceByWorkspaceAppID(ctx context.Context, workspaceAppID uuid.UUID) (database.Workspace, error) { return fetch(q.log, q.auth, q.db.GetWorkspaceByWorkspaceAppID)(ctx, workspaceAppID) } @@ -3713,9 +3726,24 @@ func (q *querier) InsertWorkspace(ctx context.Context, arg database.InsertWorksp } func (q *querier) InsertWorkspaceAgent(ctx context.Context, arg database.InsertWorkspaceAgentParams) (database.WorkspaceAgent, error) { - if err := q.authorizeContext(ctx, policy.ActionCreate, rbac.ResourceSystem); err != nil { + // NOTE(DanielleMaywood): + // Currently, the only way to link a Resource back to a Workspace is by following this chain: + // + // WorkspaceResource -> WorkspaceBuild -> Workspace + // + // It is possible for this function to be called without there existing + // a `WorkspaceBuild` to link back to. This means that we want to allow + // execution to continue if there isn't a workspace found to allow this + // behavior to continue. + workspace, err := q.db.GetWorkspaceByResourceID(ctx, arg.ResourceID) + if err != nil && !errors.Is(err, sql.ErrNoRows) { + return database.WorkspaceAgent{}, err + } + + if err := q.authorizeContext(ctx, policy.ActionCreateAgent, workspace); err != nil { return database.WorkspaceAgent{}, err } + return q.db.InsertWorkspaceAgent(ctx, arg) } @@ -4182,6 +4210,24 @@ func (q *querier) UpdateOrganizationDeletedByID(ctx context.Context, arg databas return deleteQ(q.log, q.auth, q.db.GetOrganizationByID, deleteF)(ctx, arg.ID) } +func (q *querier) UpdatePresetPrebuildStatus(ctx context.Context, arg database.UpdatePresetPrebuildStatusParams) error { + preset, err := q.db.GetPresetByID(ctx, arg.PresetID) + if err != nil { + return err + } + + object := rbac.ResourceTemplate. + WithID(preset.TemplateID.UUID). + InOrg(preset.OrganizationID) + + err = q.authorizeContext(ctx, policy.ActionUpdate, object) + if err != nil { + return err + } + + return q.db.UpdatePresetPrebuildStatus(ctx, arg) +} + func (q *querier) UpdateProvisionerDaemonLastSeenAt(ctx context.Context, arg database.UpdateProvisionerDaemonLastSeenAtParams) error { if err := q.authorizeContext(ctx, policy.ActionUpdate, rbac.ResourceProvisionerDaemon); err != nil { return err diff --git a/coderd/database/dbauthz/dbauthz_test.go b/coderd/database/dbauthz/dbauthz_test.go index 1e4b4ea879b77..703e51d739c47 100644 --- a/coderd/database/dbauthz/dbauthz_test.go +++ b/coderd/database/dbauthz/dbauthz_test.go @@ -1928,6 +1928,22 @@ func (s *MethodTestSuite) TestWorkspace() { }) check.Args(ws.ID).Asserts(ws, policy.ActionRead) })) + s.Run("GetWorkspaceByResourceID", s.Subtest(func(db database.Store, check *expects) { + u := dbgen.User(s.T(), db, database.User{}) + o := dbgen.Organization(s.T(), db, database.Organization{}) + j := dbgen.ProvisionerJob(s.T(), db, nil, database.ProvisionerJob{Type: database.ProvisionerJobTypeWorkspaceBuild}) + tpl := dbgen.Template(s.T(), db, database.Template{CreatedBy: u.ID, OrganizationID: o.ID}) + tv := dbgen.TemplateVersion(s.T(), db, database.TemplateVersion{ + TemplateID: uuid.NullUUID{UUID: tpl.ID, Valid: true}, + JobID: j.ID, + OrganizationID: o.ID, + CreatedBy: u.ID, + }) + ws := dbgen.Workspace(s.T(), db, database.WorkspaceTable{OwnerID: u.ID, TemplateID: tpl.ID, OrganizationID: o.ID}) + _ = dbgen.WorkspaceBuild(s.T(), db, database.WorkspaceBuild{WorkspaceID: ws.ID, JobID: j.ID, TemplateVersionID: tv.ID}) + res := dbgen.WorkspaceResource(s.T(), db, database.WorkspaceResource{JobID: j.ID}) + check.Args(res.ID).Asserts(ws, policy.ActionRead) + })) s.Run("GetWorkspaces", s.Subtest(func(_ database.Store, check *expects) { // No asserts here because SQLFilter. check.Args(database.GetWorkspacesParams{}).Asserts() @@ -4018,12 +4034,25 @@ func (s *MethodTestSuite) TestSystemFunctions() { Returns(slice.New(a, b)) })) s.Run("InsertWorkspaceAgent", s.Subtest(func(db database.Store, check *expects) { - dbtestutil.DisableForeignKeysAndTriggers(s.T(), db) + u := dbgen.User(s.T(), db, database.User{}) + o := dbgen.Organization(s.T(), db, database.Organization{}) + j := dbgen.ProvisionerJob(s.T(), db, nil, database.ProvisionerJob{Type: database.ProvisionerJobTypeWorkspaceBuild}) + tpl := dbgen.Template(s.T(), db, database.Template{CreatedBy: u.ID, OrganizationID: o.ID}) + tv := dbgen.TemplateVersion(s.T(), db, database.TemplateVersion{ + TemplateID: uuid.NullUUID{UUID: tpl.ID, Valid: true}, + JobID: j.ID, + OrganizationID: o.ID, + CreatedBy: u.ID, + }) + ws := dbgen.Workspace(s.T(), db, database.WorkspaceTable{OwnerID: u.ID, TemplateID: tpl.ID, OrganizationID: o.ID}) + _ = dbgen.WorkspaceBuild(s.T(), db, database.WorkspaceBuild{WorkspaceID: ws.ID, JobID: j.ID, TemplateVersionID: tv.ID}) + res := dbgen.WorkspaceResource(s.T(), db, database.WorkspaceResource{JobID: j.ID}) check.Args(database.InsertWorkspaceAgentParams{ ID: uuid.New(), + ResourceID: res.ID, Name: "dev", APIKeyScope: database.AgentKeyScopeEnumAll, - }).Asserts(rbac.ResourceSystem, policy.ActionCreate) + }).Asserts(ws, policy.ActionCreateAgent) })) s.Run("InsertWorkspaceApp", s.Subtest(func(db database.Store, check *expects) { dbtestutil.DisableForeignKeysAndTriggers(s.T(), db) @@ -4895,6 +4924,11 @@ func (s *MethodTestSuite) TestPrebuilds() { Asserts(rbac.ResourceWorkspace.All(), policy.ActionRead). ErrorsWithInMemDB(dbmem.ErrUnimplemented) })) + s.Run("GetPresetsAtFailureLimit", s.Subtest(func(_ database.Store, check *expects) { + check.Args(int64(0)). + Asserts(rbac.ResourceTemplate.All(), policy.ActionViewInsights). + ErrorsWithInMemDB(dbmem.ErrUnimplemented) + })) s.Run("GetPresetsBackoff", s.Subtest(func(_ database.Store, check *expects) { check.Args(time.Time{}). Asserts(rbac.ResourceTemplate.All(), policy.ActionViewInsights). @@ -4942,8 +4976,34 @@ func (s *MethodTestSuite) TestPrebuilds() { }, InvalidateAfterSecs: preset.InvalidateAfterSecs, OrganizationID: org.ID, + PrebuildStatus: database.PrebuildStatusHealthy, }) })) + s.Run("UpdatePresetPrebuildStatus", s.Subtest(func(db database.Store, check *expects) { + org := dbgen.Organization(s.T(), db, database.Organization{}) + user := dbgen.User(s.T(), db, database.User{}) + template := dbgen.Template(s.T(), db, database.Template{ + OrganizationID: org.ID, + CreatedBy: user.ID, + }) + templateVersion := dbgen.TemplateVersion(s.T(), db, database.TemplateVersion{ + TemplateID: uuid.NullUUID{ + UUID: template.ID, + Valid: true, + }, + OrganizationID: org.ID, + CreatedBy: user.ID, + }) + preset := dbgen.Preset(s.T(), db, database.InsertPresetParams{ + TemplateVersionID: templateVersion.ID, + }) + req := database.UpdatePresetPrebuildStatusParams{ + PresetID: preset.ID, + Status: database.PrebuildStatusHealthy, + } + check.Args(req). + Asserts(rbac.ResourceTemplate.WithID(template.ID).InOrg(org.ID), policy.ActionUpdate) + })) } func (s *MethodTestSuite) TestOAuth2ProviderApps() { diff --git a/coderd/database/dbmem/dbmem.go b/coderd/database/dbmem/dbmem.go index 3ab2895876ac5..1a1455d83045b 100644 --- a/coderd/database/dbmem/dbmem.go +++ b/coderd/database/dbmem/dbmem.go @@ -4287,6 +4287,7 @@ func (q *FakeQuerier) GetPresetByID(ctx context.Context, presetID uuid.UUID) (da CreatedAt: preset.CreatedAt, DesiredInstances: preset.DesiredInstances, InvalidateAfterSecs: preset.InvalidateAfterSecs, + PrebuildStatus: preset.PrebuildStatus, TemplateID: tv.TemplateID, OrganizationID: tv.OrganizationID, }, nil @@ -4352,6 +4353,10 @@ func (q *FakeQuerier) GetPresetParametersByTemplateVersionID(_ context.Context, return parameters, nil } +func (q *FakeQuerier) GetPresetsAtFailureLimit(ctx context.Context, hardLimit int64) ([]database.GetPresetsAtFailureLimitRow, error) { + return nil, ErrUnimplemented +} + func (*FakeQuerier) GetPresetsBackoff(_ context.Context, _ time.Time) ([]database.GetPresetsBackoffRow, error) { return nil, ErrUnimplemented } @@ -8053,6 +8058,33 @@ func (q *FakeQuerier) GetWorkspaceByOwnerIDAndName(_ context.Context, arg databa return database.Workspace{}, sql.ErrNoRows } +func (q *FakeQuerier) GetWorkspaceByResourceID(ctx context.Context, resourceID uuid.UUID) (database.Workspace, error) { + q.mutex.RLock() + defer q.mutex.RUnlock() + + for _, resource := range q.workspaceResources { + if resource.ID != resourceID { + continue + } + + for _, build := range q.workspaceBuilds { + if build.JobID != resource.JobID { + continue + } + + for _, workspace := range q.workspaces { + if workspace.ID != build.WorkspaceID { + continue + } + + return q.extendWorkspace(workspace), nil + } + } + } + + return database.Workspace{}, sql.ErrNoRows +} + func (q *FakeQuerier) GetWorkspaceByWorkspaceAppID(_ context.Context, workspaceAppID uuid.UUID) (database.Workspace, error) { if err := validateDatabaseType(workspaceAppID); err != nil { return database.Workspace{}, err @@ -9062,6 +9094,7 @@ func (q *FakeQuerier) InsertPreset(_ context.Context, arg database.InsertPresetP Int32: 0, Valid: true, }, + PrebuildStatus: database.PrebuildStatusHealthy, } q.presets = append(q.presets, preset) return preset, nil @@ -10890,6 +10923,25 @@ func (q *FakeQuerier) UpdateOrganizationDeletedByID(_ context.Context, arg datab return sql.ErrNoRows } +func (q *FakeQuerier) UpdatePresetPrebuildStatus(ctx context.Context, arg database.UpdatePresetPrebuildStatusParams) error { + err := validateDatabaseType(arg) + if err != nil { + return err + } + + q.mutex.RLock() + defer q.mutex.RUnlock() + + for _, preset := range q.presets { + if preset.ID == arg.PresetID { + preset.PrebuildStatus = arg.Status + return nil + } + } + + return xerrors.Errorf("preset %v does not exist", arg.PresetID) +} + func (q *FakeQuerier) UpdateProvisionerDaemonLastSeenAt(_ context.Context, arg database.UpdateProvisionerDaemonLastSeenAtParams) error { err := validateDatabaseType(arg) if err != nil { diff --git a/coderd/database/dbmetrics/querymetrics.go b/coderd/database/dbmetrics/querymetrics.go index 9122cedbf786c..e35ec11b02453 100644 --- a/coderd/database/dbmetrics/querymetrics.go +++ b/coderd/database/dbmetrics/querymetrics.go @@ -1138,6 +1138,13 @@ func (m queryMetricsStore) GetPresetParametersByTemplateVersionID(ctx context.Co return r0, r1 } +func (m queryMetricsStore) GetPresetsAtFailureLimit(ctx context.Context, hardLimit int64) ([]database.GetPresetsAtFailureLimitRow, error) { + start := time.Now() + r0, r1 := m.s.GetPresetsAtFailureLimit(ctx, hardLimit) + m.queryLatencies.WithLabelValues("GetPresetsAtFailureLimit").Observe(time.Since(start).Seconds()) + return r0, r1 +} + func (m queryMetricsStore) GetPresetsBackoff(ctx context.Context, lookback time.Time) ([]database.GetPresetsBackoffRow, error) { start := time.Now() r0, r1 := m.s.GetPresetsBackoff(ctx, lookback) @@ -1887,6 +1894,13 @@ func (m queryMetricsStore) GetWorkspaceByOwnerIDAndName(ctx context.Context, arg return workspace, err } +func (m queryMetricsStore) GetWorkspaceByResourceID(ctx context.Context, resourceID uuid.UUID) (database.Workspace, error) { + start := time.Now() + r0, r1 := m.s.GetWorkspaceByResourceID(ctx, resourceID) + m.queryLatencies.WithLabelValues("GetWorkspaceByResourceID").Observe(time.Since(start).Seconds()) + return r0, r1 +} + func (m queryMetricsStore) GetWorkspaceByWorkspaceAppID(ctx context.Context, workspaceAppID uuid.UUID) (database.Workspace, error) { start := time.Now() workspace, err := m.s.GetWorkspaceByWorkspaceAppID(ctx, workspaceAppID) @@ -2685,6 +2699,13 @@ func (m queryMetricsStore) UpdateOrganizationDeletedByID(ctx context.Context, ar return r0 } +func (m queryMetricsStore) UpdatePresetPrebuildStatus(ctx context.Context, arg database.UpdatePresetPrebuildStatusParams) error { + start := time.Now() + r0 := m.s.UpdatePresetPrebuildStatus(ctx, arg) + m.queryLatencies.WithLabelValues("UpdatePresetPrebuildStatus").Observe(time.Since(start).Seconds()) + return r0 +} + func (m queryMetricsStore) UpdateProvisionerDaemonLastSeenAt(ctx context.Context, arg database.UpdateProvisionerDaemonLastSeenAtParams) error { start := time.Now() r0 := m.s.UpdateProvisionerDaemonLastSeenAt(ctx, arg) diff --git a/coderd/database/dbmock/dbmock.go b/coderd/database/dbmock/dbmock.go index e7af9ecd8fee8..7a1fc0c4b2a6f 100644 --- a/coderd/database/dbmock/dbmock.go +++ b/coderd/database/dbmock/dbmock.go @@ -2328,6 +2328,21 @@ func (mr *MockStoreMockRecorder) GetPresetParametersByTemplateVersionID(ctx, tem return mr.mock.ctrl.RecordCallWithMethodType(mr.mock, "GetPresetParametersByTemplateVersionID", reflect.TypeOf((*MockStore)(nil).GetPresetParametersByTemplateVersionID), ctx, templateVersionID) } +// GetPresetsAtFailureLimit mocks base method. +func (m *MockStore) GetPresetsAtFailureLimit(ctx context.Context, hardLimit int64) ([]database.GetPresetsAtFailureLimitRow, error) { + m.ctrl.T.Helper() + ret := m.ctrl.Call(m, "GetPresetsAtFailureLimit", ctx, hardLimit) + ret0, _ := ret[0].([]database.GetPresetsAtFailureLimitRow) + ret1, _ := ret[1].(error) + return ret0, ret1 +} + +// GetPresetsAtFailureLimit indicates an expected call of GetPresetsAtFailureLimit. +func (mr *MockStoreMockRecorder) GetPresetsAtFailureLimit(ctx, hardLimit any) *gomock.Call { + mr.mock.ctrl.T.Helper() + return mr.mock.ctrl.RecordCallWithMethodType(mr.mock, "GetPresetsAtFailureLimit", reflect.TypeOf((*MockStore)(nil).GetPresetsAtFailureLimit), ctx, hardLimit) +} + // GetPresetsBackoff mocks base method. func (m *MockStore) GetPresetsBackoff(ctx context.Context, lookback time.Time) ([]database.GetPresetsBackoffRow, error) { m.ctrl.T.Helper() @@ -3963,6 +3978,21 @@ func (mr *MockStoreMockRecorder) GetWorkspaceByOwnerIDAndName(ctx, arg any) *gom return mr.mock.ctrl.RecordCallWithMethodType(mr.mock, "GetWorkspaceByOwnerIDAndName", reflect.TypeOf((*MockStore)(nil).GetWorkspaceByOwnerIDAndName), ctx, arg) } +// GetWorkspaceByResourceID mocks base method. +func (m *MockStore) GetWorkspaceByResourceID(ctx context.Context, resourceID uuid.UUID) (database.Workspace, error) { + m.ctrl.T.Helper() + ret := m.ctrl.Call(m, "GetWorkspaceByResourceID", ctx, resourceID) + ret0, _ := ret[0].(database.Workspace) + ret1, _ := ret[1].(error) + return ret0, ret1 +} + +// GetWorkspaceByResourceID indicates an expected call of GetWorkspaceByResourceID. +func (mr *MockStoreMockRecorder) GetWorkspaceByResourceID(ctx, resourceID any) *gomock.Call { + mr.mock.ctrl.T.Helper() + return mr.mock.ctrl.RecordCallWithMethodType(mr.mock, "GetWorkspaceByResourceID", reflect.TypeOf((*MockStore)(nil).GetWorkspaceByResourceID), ctx, resourceID) +} + // GetWorkspaceByWorkspaceAppID mocks base method. func (m *MockStore) GetWorkspaceByWorkspaceAppID(ctx context.Context, workspaceAppID uuid.UUID) (database.Workspace, error) { m.ctrl.T.Helper() @@ -5691,6 +5721,20 @@ func (mr *MockStoreMockRecorder) UpdateOrganizationDeletedByID(ctx, arg any) *go return mr.mock.ctrl.RecordCallWithMethodType(mr.mock, "UpdateOrganizationDeletedByID", reflect.TypeOf((*MockStore)(nil).UpdateOrganizationDeletedByID), ctx, arg) } +// UpdatePresetPrebuildStatus mocks base method. +func (m *MockStore) UpdatePresetPrebuildStatus(ctx context.Context, arg database.UpdatePresetPrebuildStatusParams) error { + m.ctrl.T.Helper() + ret := m.ctrl.Call(m, "UpdatePresetPrebuildStatus", ctx, arg) + ret0, _ := ret[0].(error) + return ret0 +} + +// UpdatePresetPrebuildStatus indicates an expected call of UpdatePresetPrebuildStatus. +func (mr *MockStoreMockRecorder) UpdatePresetPrebuildStatus(ctx, arg any) *gomock.Call { + mr.mock.ctrl.T.Helper() + return mr.mock.ctrl.RecordCallWithMethodType(mr.mock, "UpdatePresetPrebuildStatus", reflect.TypeOf((*MockStore)(nil).UpdatePresetPrebuildStatus), ctx, arg) +} + // UpdateProvisionerDaemonLastSeenAt mocks base method. func (m *MockStore) UpdateProvisionerDaemonLastSeenAt(ctx context.Context, arg database.UpdateProvisionerDaemonLastSeenAtParams) error { m.ctrl.T.Helper() diff --git a/coderd/database/dump.sql b/coderd/database/dump.sql index 2f23b3ad4ce78..ec196405df2d3 100644 --- a/coderd/database/dump.sql +++ b/coderd/database/dump.sql @@ -153,6 +153,12 @@ CREATE TYPE port_share_protocol AS ENUM ( 'https' ); +CREATE TYPE prebuild_status AS ENUM ( + 'healthy', + 'hard_limited', + 'validation_failed' +); + CREATE TYPE provisioner_daemon_status AS ENUM ( 'offline', 'idle', @@ -1439,7 +1445,8 @@ CREATE TABLE template_version_presets ( name text NOT NULL, created_at timestamp with time zone DEFAULT CURRENT_TIMESTAMP NOT NULL, desired_instances integer, - invalidate_after_secs integer DEFAULT 0 + invalidate_after_secs integer DEFAULT 0, + prebuild_status prebuild_status DEFAULT 'healthy'::prebuild_status NOT NULL ); CREATE TABLE template_version_terraform_values ( diff --git a/coderd/database/migrations/000328_prebuild_failure_limit_notification.down.sql b/coderd/database/migrations/000328_prebuild_failure_limit_notification.down.sql new file mode 100644 index 0000000000000..40697c7bbc3d2 --- /dev/null +++ b/coderd/database/migrations/000328_prebuild_failure_limit_notification.down.sql @@ -0,0 +1 @@ +DELETE FROM notification_templates WHERE id = '414d9331-c1fc-4761-b40c-d1f4702279eb'; diff --git a/coderd/database/migrations/000328_prebuild_failure_limit_notification.up.sql b/coderd/database/migrations/000328_prebuild_failure_limit_notification.up.sql new file mode 100644 index 0000000000000..403bd667abd28 --- /dev/null +++ b/coderd/database/migrations/000328_prebuild_failure_limit_notification.up.sql @@ -0,0 +1,25 @@ +INSERT INTO notification_templates +(id, name, title_template, body_template, "group", actions) +VALUES ('414d9331-c1fc-4761-b40c-d1f4702279eb', + 'Prebuild Failure Limit Reached', + E'There is a problem creating prebuilt workspaces', + $$ +The number of failed prebuild attempts has reached the hard limit for template **{{ .Labels.template }}** and preset **{{ .Labels.preset }}**. + +To resume prebuilds, fix the underlying issue and upload a new template version. + +Refer to the documentation for more details: +- [Troubleshooting templates](https://coder.com/docs/admin/templates/troubleshooting) +- [Troubleshooting of prebuilt workspaces](https://coder.com/docs/admin/templates/extending-templates/prebuilt-workspaces#administration-and-troubleshooting) +$$, + 'Template Events', + '[ + { + "label": "View failed prebuilt workspaces", + "url": "{{base_url}}/workspaces?filter=owner:prebuilds+status:failed+template:{{.Labels.template}}" + }, + { + "label": "View template version", + "url": "{{base_url}}/templates/{{.Labels.org}}/{{.Labels.template}}/versions/{{.Labels.template_version}}" + } + ]'::jsonb); diff --git a/coderd/database/migrations/000329_add_status_to_template_presets.down.sql b/coderd/database/migrations/000329_add_status_to_template_presets.down.sql new file mode 100644 index 0000000000000..8fe04f99cae33 --- /dev/null +++ b/coderd/database/migrations/000329_add_status_to_template_presets.down.sql @@ -0,0 +1,5 @@ +-- Remove the column from the table first (must happen before dropping the enum type) +ALTER TABLE template_version_presets DROP COLUMN prebuild_status; + +-- Then drop the enum type +DROP TYPE prebuild_status; diff --git a/coderd/database/migrations/000329_add_status_to_template_presets.up.sql b/coderd/database/migrations/000329_add_status_to_template_presets.up.sql new file mode 100644 index 0000000000000..019a246f73a87 --- /dev/null +++ b/coderd/database/migrations/000329_add_status_to_template_presets.up.sql @@ -0,0 +1,7 @@ +CREATE TYPE prebuild_status AS ENUM ( + 'healthy', -- Prebuilds are working as expected; this is the default, healthy state. + 'hard_limited', -- Prebuilds have failed repeatedly and hit the configured hard failure limit; won't be retried anymore. + 'validation_failed' -- Prebuilds failed due to a non-retryable validation error (e.g. template misconfiguration); won't be retried. +); + +ALTER TABLE template_version_presets ADD COLUMN prebuild_status prebuild_status NOT NULL DEFAULT 'healthy'::prebuild_status; diff --git a/coderd/database/models.go b/coderd/database/models.go index ff49b8f471be0..d5047f6bbe65f 100644 --- a/coderd/database/models.go +++ b/coderd/database/models.go @@ -1343,6 +1343,67 @@ func AllPortShareProtocolValues() []PortShareProtocol { } } +type PrebuildStatus string + +const ( + PrebuildStatusHealthy PrebuildStatus = "healthy" + PrebuildStatusHardLimited PrebuildStatus = "hard_limited" + PrebuildStatusValidationFailed PrebuildStatus = "validation_failed" +) + +func (e *PrebuildStatus) Scan(src interface{}) error { + switch s := src.(type) { + case []byte: + *e = PrebuildStatus(s) + case string: + *e = PrebuildStatus(s) + default: + return fmt.Errorf("unsupported scan type for PrebuildStatus: %T", src) + } + return nil +} + +type NullPrebuildStatus struct { + PrebuildStatus PrebuildStatus `json:"prebuild_status"` + Valid bool `json:"valid"` // Valid is true if PrebuildStatus is not NULL +} + +// Scan implements the Scanner interface. +func (ns *NullPrebuildStatus) Scan(value interface{}) error { + if value == nil { + ns.PrebuildStatus, ns.Valid = "", false + return nil + } + ns.Valid = true + return ns.PrebuildStatus.Scan(value) +} + +// Value implements the driver Valuer interface. +func (ns NullPrebuildStatus) Value() (driver.Value, error) { + if !ns.Valid { + return nil, nil + } + return string(ns.PrebuildStatus), nil +} + +func (e PrebuildStatus) Valid() bool { + switch e { + case PrebuildStatusHealthy, + PrebuildStatusHardLimited, + PrebuildStatusValidationFailed: + return true + } + return false +} + +func AllPrebuildStatusValues() []PrebuildStatus { + return []PrebuildStatus{ + PrebuildStatusHealthy, + PrebuildStatusHardLimited, + PrebuildStatusValidationFailed, + } +} + // The status of a provisioner daemon. type ProvisionerDaemonStatus string @@ -3248,12 +3309,13 @@ type TemplateVersionParameter struct { } type TemplateVersionPreset struct { - ID uuid.UUID `db:"id" json:"id"` - TemplateVersionID uuid.UUID `db:"template_version_id" json:"template_version_id"` - Name string `db:"name" json:"name"` - CreatedAt time.Time `db:"created_at" json:"created_at"` - DesiredInstances sql.NullInt32 `db:"desired_instances" json:"desired_instances"` - InvalidateAfterSecs sql.NullInt32 `db:"invalidate_after_secs" json:"invalidate_after_secs"` + ID uuid.UUID `db:"id" json:"id"` + TemplateVersionID uuid.UUID `db:"template_version_id" json:"template_version_id"` + Name string `db:"name" json:"name"` + CreatedAt time.Time `db:"created_at" json:"created_at"` + DesiredInstances sql.NullInt32 `db:"desired_instances" json:"desired_instances"` + InvalidateAfterSecs sql.NullInt32 `db:"invalidate_after_secs" json:"invalidate_after_secs"` + PrebuildStatus PrebuildStatus `db:"prebuild_status" json:"prebuild_status"` } type TemplateVersionPresetParameter struct { diff --git a/coderd/database/querier.go b/coderd/database/querier.go index 78a88426349da..ac7497b641a05 100644 --- a/coderd/database/querier.go +++ b/coderd/database/querier.go @@ -241,6 +241,15 @@ type sqlcQuerier interface { GetPresetByWorkspaceBuildID(ctx context.Context, workspaceBuildID uuid.UUID) (TemplateVersionPreset, error) GetPresetParametersByPresetID(ctx context.Context, presetID uuid.UUID) ([]TemplateVersionPresetParameter, error) GetPresetParametersByTemplateVersionID(ctx context.Context, templateVersionID uuid.UUID) ([]TemplateVersionPresetParameter, error) + // GetPresetsAtFailureLimit groups workspace builds by preset ID. + // Each preset is associated with exactly one template version ID. + // For each preset, the query checks the last hard_limit builds. + // If all of them failed, the preset is considered to have hit the hard failure limit. + // The query returns a list of preset IDs that have reached this failure threshold. + // Only active template versions with configured presets are considered. + // For each preset, check the last hard_limit builds. + // If all of them failed, the preset is considered to have hit the hard failure limit. + GetPresetsAtFailureLimit(ctx context.Context, hardLimit int64) ([]GetPresetsAtFailureLimitRow, error) // GetPresetsBackoff groups workspace builds by preset ID. // Each preset is associated with exactly one template version ID. // For each group, the query checks up to N of the most recent jobs that occurred within the @@ -422,6 +431,7 @@ type sqlcQuerier interface { GetWorkspaceByAgentID(ctx context.Context, agentID uuid.UUID) (Workspace, error) GetWorkspaceByID(ctx context.Context, id uuid.UUID) (Workspace, error) GetWorkspaceByOwnerIDAndName(ctx context.Context, arg GetWorkspaceByOwnerIDAndNameParams) (Workspace, error) + GetWorkspaceByResourceID(ctx context.Context, resourceID uuid.UUID) (Workspace, error) GetWorkspaceByWorkspaceAppID(ctx context.Context, workspaceAppID uuid.UUID) (Workspace, error) GetWorkspaceModulesByJobID(ctx context.Context, jobID uuid.UUID) ([]WorkspaceModule, error) GetWorkspaceModulesCreatedAfter(ctx context.Context, createdAt time.Time) ([]WorkspaceModule, error) @@ -567,6 +577,7 @@ type sqlcQuerier interface { UpdateOAuth2ProviderAppSecretByID(ctx context.Context, arg UpdateOAuth2ProviderAppSecretByIDParams) (OAuth2ProviderAppSecret, error) UpdateOrganization(ctx context.Context, arg UpdateOrganizationParams) (Organization, error) UpdateOrganizationDeletedByID(ctx context.Context, arg UpdateOrganizationDeletedByIDParams) error + UpdatePresetPrebuildStatus(ctx context.Context, arg UpdatePresetPrebuildStatusParams) error UpdateProvisionerDaemonLastSeenAt(ctx context.Context, arg UpdateProvisionerDaemonLastSeenAtParams) error UpdateProvisionerJobByID(ctx context.Context, arg UpdateProvisionerJobByIDParams) error UpdateProvisionerJobWithCancelByID(ctx context.Context, arg UpdateProvisionerJobWithCancelByIDParams) error diff --git a/coderd/database/querier_test.go b/coderd/database/querier_test.go index b2cc20c4894d5..5bafa58796b7a 100644 --- a/coderd/database/querier_test.go +++ b/coderd/database/querier_test.go @@ -4123,8 +4123,7 @@ func TestGetPresetsBackoff(t *testing.T) { }) tmpl1 := createTemplate(t, db, orgID, userID) - tmpl1V1 := createTmplVersionAndPreset(t, db, tmpl1, tmpl1.ActiveVersionID, now, nil) - _ = tmpl1V1 + createTmplVersionAndPreset(t, db, tmpl1, tmpl1.ActiveVersionID, now, nil) backoffs, err := db.GetPresetsBackoff(ctx, now.Add(-time.Hour)) require.NoError(t, err) @@ -4401,6 +4400,311 @@ func TestGetPresetsBackoff(t *testing.T) { }) } +func TestGetPresetsAtFailureLimit(t *testing.T) { + t.Parallel() + if !dbtestutil.WillUsePostgres() { + t.SkipNow() + } + + now := dbtime.Now() + hourBefore := now.Add(-time.Hour) + orgID := uuid.New() + userID := uuid.New() + + findPresetByTmplVersionID := func(hardLimitedPresets []database.GetPresetsAtFailureLimitRow, tmplVersionID uuid.UUID) *database.GetPresetsAtFailureLimitRow { + for _, preset := range hardLimitedPresets { + if preset.TemplateVersionID == tmplVersionID { + return &preset + } + } + + return nil + } + + testCases := []struct { + name string + // true - build is successful + // false - build is unsuccessful + buildSuccesses []bool + hardLimit int64 + expHitHardLimit bool + }{ + { + name: "failed build", + buildSuccesses: []bool{false}, + hardLimit: 1, + expHitHardLimit: true, + }, + { + name: "2 failed builds", + buildSuccesses: []bool{false, false}, + hardLimit: 1, + expHitHardLimit: true, + }, + { + name: "successful build", + buildSuccesses: []bool{true}, + hardLimit: 1, + expHitHardLimit: false, + }, + { + name: "last build is failed", + buildSuccesses: []bool{true, true, false}, + hardLimit: 1, + expHitHardLimit: true, + }, + { + name: "last build is successful", + buildSuccesses: []bool{false, false, true}, + hardLimit: 1, + expHitHardLimit: false, + }, + { + name: "last 3 builds are failed - hard limit is reached", + buildSuccesses: []bool{true, true, false, false, false}, + hardLimit: 3, + expHitHardLimit: true, + }, + { + name: "1 out of 3 last build is successful - hard limit is NOT reached", + buildSuccesses: []bool{false, false, true, false, false}, + hardLimit: 3, + expHitHardLimit: false, + }, + // hardLimit set to zero, implicitly disables the hard limit. + { + name: "despite 5 failed builds, the hard limit is not reached because it's disabled.", + buildSuccesses: []bool{false, false, false, false, false}, + hardLimit: 0, + expHitHardLimit: false, + }, + } + + for _, tc := range testCases { + t.Run(tc.name, func(t *testing.T) { + t.Parallel() + + db, _ := dbtestutil.NewDB(t) + ctx := testutil.Context(t, testutil.WaitShort) + dbgen.Organization(t, db, database.Organization{ + ID: orgID, + }) + dbgen.User(t, db, database.User{ + ID: userID, + }) + + tmpl := createTemplate(t, db, orgID, userID) + tmplV1 := createTmplVersionAndPreset(t, db, tmpl, tmpl.ActiveVersionID, now, nil) + for idx, buildSuccess := range tc.buildSuccesses { + createPrebuiltWorkspace(ctx, t, db, tmpl, tmplV1, orgID, now, &createPrebuiltWorkspaceOpts{ + failedJob: !buildSuccess, + createdAt: hourBefore.Add(time.Duration(idx) * time.Second), + }) + } + + hardLimitedPresets, err := db.GetPresetsAtFailureLimit(ctx, tc.hardLimit) + require.NoError(t, err) + + if !tc.expHitHardLimit { + require.Len(t, hardLimitedPresets, 0) + return + } + + require.Len(t, hardLimitedPresets, 1) + hardLimitedPreset := hardLimitedPresets[0] + require.Equal(t, hardLimitedPreset.TemplateVersionID, tmpl.ActiveVersionID) + require.Equal(t, hardLimitedPreset.PresetID, tmplV1.preset.ID) + }) + } + + t.Run("Ignore Inactive Version", func(t *testing.T) { + t.Parallel() + + db, _ := dbtestutil.NewDB(t) + ctx := testutil.Context(t, testutil.WaitShort) + dbgen.Organization(t, db, database.Organization{ + ID: orgID, + }) + dbgen.User(t, db, database.User{ + ID: userID, + }) + + tmpl := createTemplate(t, db, orgID, userID) + tmplV1 := createTmplVersionAndPreset(t, db, tmpl, uuid.New(), now, nil) + createPrebuiltWorkspace(ctx, t, db, tmpl, tmplV1, orgID, now, &createPrebuiltWorkspaceOpts{ + failedJob: true, + }) + + // Active Version + tmplV2 := createTmplVersionAndPreset(t, db, tmpl, tmpl.ActiveVersionID, now, nil) + createPrebuiltWorkspace(ctx, t, db, tmpl, tmplV2, orgID, now, &createPrebuiltWorkspaceOpts{ + failedJob: true, + }) + createPrebuiltWorkspace(ctx, t, db, tmpl, tmplV2, orgID, now, &createPrebuiltWorkspaceOpts{ + failedJob: true, + }) + + hardLimitedPresets, err := db.GetPresetsAtFailureLimit(ctx, 1) + require.NoError(t, err) + + require.Len(t, hardLimitedPresets, 1) + hardLimitedPreset := hardLimitedPresets[0] + require.Equal(t, hardLimitedPreset.TemplateVersionID, tmpl.ActiveVersionID) + require.Equal(t, hardLimitedPreset.PresetID, tmplV2.preset.ID) + }) + + t.Run("Multiple Templates", func(t *testing.T) { + t.Parallel() + + db, _ := dbtestutil.NewDB(t) + ctx := testutil.Context(t, testutil.WaitShort) + dbgen.Organization(t, db, database.Organization{ + ID: orgID, + }) + dbgen.User(t, db, database.User{ + ID: userID, + }) + + tmpl1 := createTemplate(t, db, orgID, userID) + tmpl1V1 := createTmplVersionAndPreset(t, db, tmpl1, tmpl1.ActiveVersionID, now, nil) + createPrebuiltWorkspace(ctx, t, db, tmpl1, tmpl1V1, orgID, now, &createPrebuiltWorkspaceOpts{ + failedJob: true, + }) + + tmpl2 := createTemplate(t, db, orgID, userID) + tmpl2V1 := createTmplVersionAndPreset(t, db, tmpl2, tmpl2.ActiveVersionID, now, nil) + createPrebuiltWorkspace(ctx, t, db, tmpl2, tmpl2V1, orgID, now, &createPrebuiltWorkspaceOpts{ + failedJob: true, + }) + + hardLimitedPresets, err := db.GetPresetsAtFailureLimit(ctx, 1) + + require.NoError(t, err) + + require.Len(t, hardLimitedPresets, 2) + { + hardLimitedPreset := findPresetByTmplVersionID(hardLimitedPresets, tmpl1.ActiveVersionID) + require.Equal(t, hardLimitedPreset.TemplateVersionID, tmpl1.ActiveVersionID) + require.Equal(t, hardLimitedPreset.PresetID, tmpl1V1.preset.ID) + } + { + hardLimitedPreset := findPresetByTmplVersionID(hardLimitedPresets, tmpl2.ActiveVersionID) + require.Equal(t, hardLimitedPreset.TemplateVersionID, tmpl2.ActiveVersionID) + require.Equal(t, hardLimitedPreset.PresetID, tmpl2V1.preset.ID) + } + }) + + t.Run("Multiple Templates, Versions and Workspace Builds", func(t *testing.T) { + t.Parallel() + + db, _ := dbtestutil.NewDB(t) + ctx := testutil.Context(t, testutil.WaitShort) + dbgen.Organization(t, db, database.Organization{ + ID: orgID, + }) + dbgen.User(t, db, database.User{ + ID: userID, + }) + + tmpl1 := createTemplate(t, db, orgID, userID) + tmpl1V1 := createTmplVersionAndPreset(t, db, tmpl1, tmpl1.ActiveVersionID, now, nil) + createPrebuiltWorkspace(ctx, t, db, tmpl1, tmpl1V1, orgID, now, &createPrebuiltWorkspaceOpts{ + failedJob: true, + }) + createPrebuiltWorkspace(ctx, t, db, tmpl1, tmpl1V1, orgID, now, &createPrebuiltWorkspaceOpts{ + failedJob: true, + }) + + tmpl2 := createTemplate(t, db, orgID, userID) + tmpl2V1 := createTmplVersionAndPreset(t, db, tmpl2, tmpl2.ActiveVersionID, now, nil) + createPrebuiltWorkspace(ctx, t, db, tmpl2, tmpl2V1, orgID, now, &createPrebuiltWorkspaceOpts{ + failedJob: true, + }) + createPrebuiltWorkspace(ctx, t, db, tmpl2, tmpl2V1, orgID, now, &createPrebuiltWorkspaceOpts{ + failedJob: true, + }) + + tmpl3 := createTemplate(t, db, orgID, userID) + tmpl3V1 := createTmplVersionAndPreset(t, db, tmpl3, uuid.New(), now, nil) + createPrebuiltWorkspace(ctx, t, db, tmpl3, tmpl3V1, orgID, now, &createPrebuiltWorkspaceOpts{ + failedJob: true, + }) + + tmpl3V2 := createTmplVersionAndPreset(t, db, tmpl3, tmpl3.ActiveVersionID, now, nil) + createPrebuiltWorkspace(ctx, t, db, tmpl3, tmpl3V2, orgID, now, &createPrebuiltWorkspaceOpts{ + failedJob: true, + }) + createPrebuiltWorkspace(ctx, t, db, tmpl3, tmpl3V2, orgID, now, &createPrebuiltWorkspaceOpts{ + failedJob: true, + }) + + hardLimit := int64(2) + hardLimitedPresets, err := db.GetPresetsAtFailureLimit(ctx, hardLimit) + require.NoError(t, err) + + require.Len(t, hardLimitedPresets, 3) + { + hardLimitedPreset := findPresetByTmplVersionID(hardLimitedPresets, tmpl1.ActiveVersionID) + require.Equal(t, hardLimitedPreset.TemplateVersionID, tmpl1.ActiveVersionID) + require.Equal(t, hardLimitedPreset.PresetID, tmpl1V1.preset.ID) + } + { + hardLimitedPreset := findPresetByTmplVersionID(hardLimitedPresets, tmpl2.ActiveVersionID) + require.Equal(t, hardLimitedPreset.TemplateVersionID, tmpl2.ActiveVersionID) + require.Equal(t, hardLimitedPreset.PresetID, tmpl2V1.preset.ID) + } + { + hardLimitedPreset := findPresetByTmplVersionID(hardLimitedPresets, tmpl3.ActiveVersionID) + require.Equal(t, hardLimitedPreset.TemplateVersionID, tmpl3.ActiveVersionID) + require.Equal(t, hardLimitedPreset.PresetID, tmpl3V2.preset.ID) + } + }) + + t.Run("No Workspace Builds", func(t *testing.T) { + t.Parallel() + + db, _ := dbtestutil.NewDB(t) + ctx := testutil.Context(t, testutil.WaitShort) + dbgen.Organization(t, db, database.Organization{ + ID: orgID, + }) + dbgen.User(t, db, database.User{ + ID: userID, + }) + + tmpl1 := createTemplate(t, db, orgID, userID) + createTmplVersionAndPreset(t, db, tmpl1, tmpl1.ActiveVersionID, now, nil) + + hardLimitedPresets, err := db.GetPresetsAtFailureLimit(ctx, 1) + require.NoError(t, err) + require.Nil(t, hardLimitedPresets) + }) + + t.Run("No Failed Workspace Builds", func(t *testing.T) { + t.Parallel() + + db, _ := dbtestutil.NewDB(t) + ctx := testutil.Context(t, testutil.WaitShort) + dbgen.Organization(t, db, database.Organization{ + ID: orgID, + }) + dbgen.User(t, db, database.User{ + ID: userID, + }) + + tmpl1 := createTemplate(t, db, orgID, userID) + tmpl1V1 := createTmplVersionAndPreset(t, db, tmpl1, tmpl1.ActiveVersionID, now, nil) + successfulJobOpts := createPrebuiltWorkspaceOpts{} + createPrebuiltWorkspace(ctx, t, db, tmpl1, tmpl1V1, orgID, now, &successfulJobOpts) + createPrebuiltWorkspace(ctx, t, db, tmpl1, tmpl1V1, orgID, now, &successfulJobOpts) + createPrebuiltWorkspace(ctx, t, db, tmpl1, tmpl1V1, orgID, now, &successfulJobOpts) + + hardLimitedPresets, err := db.GetPresetsAtFailureLimit(ctx, 1) + require.NoError(t, err) + require.Nil(t, hardLimitedPresets) + }) +} + func requireUsersMatch(t testing.TB, expected []database.User, found []database.GetUsersRow, msg string) { t.Helper() require.ElementsMatch(t, expected, database.ConvertUserRows(found), msg) diff --git a/coderd/database/queries.sql.go b/coderd/database/queries.sql.go index b956fc1db5f91..ffd8ccb035206 100644 --- a/coderd/database/queries.sql.go +++ b/coderd/database/queries.sql.go @@ -6288,6 +6288,71 @@ func (q *sqlQuerier) GetPrebuildMetrics(ctx context.Context) ([]GetPrebuildMetri return items, nil } +const getPresetsAtFailureLimit = `-- name: GetPresetsAtFailureLimit :many +WITH filtered_builds AS ( + -- Only select builds which are for prebuild creations + SELECT wlb.template_version_id, wlb.created_at, tvp.id AS preset_id, wlb.job_status, tvp.desired_instances + FROM template_version_presets tvp + INNER JOIN workspace_latest_builds wlb ON wlb.template_version_preset_id = tvp.id + INNER JOIN workspaces w ON wlb.workspace_id = w.id + INNER JOIN template_versions tv ON wlb.template_version_id = tv.id + INNER JOIN templates t ON tv.template_id = t.id AND t.active_version_id = tv.id + WHERE tvp.desired_instances IS NOT NULL -- Consider only presets that have a prebuild configuration. + AND wlb.transition = 'start'::workspace_transition + AND w.owner_id = 'c42fdf75-3097-471c-8c33-fb52454d81c0' +), +time_sorted_builds AS ( + -- Group builds by preset, then sort each group by created_at. + SELECT fb.template_version_id, fb.created_at, fb.preset_id, fb.job_status, fb.desired_instances, + ROW_NUMBER() OVER (PARTITION BY fb.preset_id ORDER BY fb.created_at DESC) as rn + FROM filtered_builds fb +) +SELECT + tsb.template_version_id, + tsb.preset_id +FROM time_sorted_builds tsb +WHERE tsb.rn <= $1::bigint + AND tsb.job_status = 'failed'::provisioner_job_status +GROUP BY tsb.template_version_id, tsb.preset_id +HAVING COUNT(*) = $1::bigint +` + +type GetPresetsAtFailureLimitRow struct { + TemplateVersionID uuid.UUID `db:"template_version_id" json:"template_version_id"` + PresetID uuid.UUID `db:"preset_id" json:"preset_id"` +} + +// GetPresetsAtFailureLimit groups workspace builds by preset ID. +// Each preset is associated with exactly one template version ID. +// For each preset, the query checks the last hard_limit builds. +// If all of them failed, the preset is considered to have hit the hard failure limit. +// The query returns a list of preset IDs that have reached this failure threshold. +// Only active template versions with configured presets are considered. +// For each preset, check the last hard_limit builds. +// If all of them failed, the preset is considered to have hit the hard failure limit. +func (q *sqlQuerier) GetPresetsAtFailureLimit(ctx context.Context, hardLimit int64) ([]GetPresetsAtFailureLimitRow, error) { + rows, err := q.db.QueryContext(ctx, getPresetsAtFailureLimit, hardLimit) + if err != nil { + return nil, err + } + defer rows.Close() + var items []GetPresetsAtFailureLimitRow + for rows.Next() { + var i GetPresetsAtFailureLimitRow + if err := rows.Scan(&i.TemplateVersionID, &i.PresetID); err != nil { + return nil, err + } + items = append(items, i) + } + if err := rows.Close(); err != nil { + return nil, err + } + if err := rows.Err(); err != nil { + return nil, err + } + return items, nil +} + const getPresetsBackoff = `-- name: GetPresetsBackoff :many WITH filtered_builds AS ( -- Only select builds which are for prebuild creations @@ -6438,6 +6503,7 @@ const getTemplatePresetsWithPrebuilds = `-- name: GetTemplatePresetsWithPrebuild SELECT t.id AS template_id, t.name AS template_name, + o.id AS organization_id, o.name AS organization_name, tv.id AS template_version_id, tv.name AS template_version_name, @@ -6445,6 +6511,7 @@ SELECT tvp.id, tvp.name, tvp.desired_instances AS desired_instances, + tvp.prebuild_status, t.deleted, t.deprecated != '' AS deprecated FROM templates t @@ -6457,17 +6524,19 @@ WHERE tvp.desired_instances IS NOT NULL -- Consider only presets that have a pre ` type GetTemplatePresetsWithPrebuildsRow struct { - TemplateID uuid.UUID `db:"template_id" json:"template_id"` - TemplateName string `db:"template_name" json:"template_name"` - OrganizationName string `db:"organization_name" json:"organization_name"` - TemplateVersionID uuid.UUID `db:"template_version_id" json:"template_version_id"` - TemplateVersionName string `db:"template_version_name" json:"template_version_name"` - UsingActiveVersion bool `db:"using_active_version" json:"using_active_version"` - ID uuid.UUID `db:"id" json:"id"` - Name string `db:"name" json:"name"` - DesiredInstances sql.NullInt32 `db:"desired_instances" json:"desired_instances"` - Deleted bool `db:"deleted" json:"deleted"` - Deprecated bool `db:"deprecated" json:"deprecated"` + TemplateID uuid.UUID `db:"template_id" json:"template_id"` + TemplateName string `db:"template_name" json:"template_name"` + OrganizationID uuid.UUID `db:"organization_id" json:"organization_id"` + OrganizationName string `db:"organization_name" json:"organization_name"` + TemplateVersionID uuid.UUID `db:"template_version_id" json:"template_version_id"` + TemplateVersionName string `db:"template_version_name" json:"template_version_name"` + UsingActiveVersion bool `db:"using_active_version" json:"using_active_version"` + ID uuid.UUID `db:"id" json:"id"` + Name string `db:"name" json:"name"` + DesiredInstances sql.NullInt32 `db:"desired_instances" json:"desired_instances"` + PrebuildStatus PrebuildStatus `db:"prebuild_status" json:"prebuild_status"` + Deleted bool `db:"deleted" json:"deleted"` + Deprecated bool `db:"deprecated" json:"deprecated"` } // GetTemplatePresetsWithPrebuilds retrieves template versions with configured presets and prebuilds. @@ -6485,6 +6554,7 @@ func (q *sqlQuerier) GetTemplatePresetsWithPrebuilds(ctx context.Context, templa if err := rows.Scan( &i.TemplateID, &i.TemplateName, + &i.OrganizationID, &i.OrganizationName, &i.TemplateVersionID, &i.TemplateVersionName, @@ -6492,6 +6562,7 @@ func (q *sqlQuerier) GetTemplatePresetsWithPrebuilds(ctx context.Context, templa &i.ID, &i.Name, &i.DesiredInstances, + &i.PrebuildStatus, &i.Deleted, &i.Deprecated, ); err != nil { @@ -6509,21 +6580,22 @@ func (q *sqlQuerier) GetTemplatePresetsWithPrebuilds(ctx context.Context, templa } const getPresetByID = `-- name: GetPresetByID :one -SELECT tvp.id, tvp.template_version_id, tvp.name, tvp.created_at, tvp.desired_instances, tvp.invalidate_after_secs, tv.template_id, tv.organization_id FROM +SELECT tvp.id, tvp.template_version_id, tvp.name, tvp.created_at, tvp.desired_instances, tvp.invalidate_after_secs, tvp.prebuild_status, tv.template_id, tv.organization_id FROM template_version_presets tvp INNER JOIN template_versions tv ON tvp.template_version_id = tv.id WHERE tvp.id = $1 ` type GetPresetByIDRow struct { - ID uuid.UUID `db:"id" json:"id"` - TemplateVersionID uuid.UUID `db:"template_version_id" json:"template_version_id"` - Name string `db:"name" json:"name"` - CreatedAt time.Time `db:"created_at" json:"created_at"` - DesiredInstances sql.NullInt32 `db:"desired_instances" json:"desired_instances"` - InvalidateAfterSecs sql.NullInt32 `db:"invalidate_after_secs" json:"invalidate_after_secs"` - TemplateID uuid.NullUUID `db:"template_id" json:"template_id"` - OrganizationID uuid.UUID `db:"organization_id" json:"organization_id"` + ID uuid.UUID `db:"id" json:"id"` + TemplateVersionID uuid.UUID `db:"template_version_id" json:"template_version_id"` + Name string `db:"name" json:"name"` + CreatedAt time.Time `db:"created_at" json:"created_at"` + DesiredInstances sql.NullInt32 `db:"desired_instances" json:"desired_instances"` + InvalidateAfterSecs sql.NullInt32 `db:"invalidate_after_secs" json:"invalidate_after_secs"` + PrebuildStatus PrebuildStatus `db:"prebuild_status" json:"prebuild_status"` + TemplateID uuid.NullUUID `db:"template_id" json:"template_id"` + OrganizationID uuid.UUID `db:"organization_id" json:"organization_id"` } func (q *sqlQuerier) GetPresetByID(ctx context.Context, presetID uuid.UUID) (GetPresetByIDRow, error) { @@ -6536,6 +6608,7 @@ func (q *sqlQuerier) GetPresetByID(ctx context.Context, presetID uuid.UUID) (Get &i.CreatedAt, &i.DesiredInstances, &i.InvalidateAfterSecs, + &i.PrebuildStatus, &i.TemplateID, &i.OrganizationID, ) @@ -6544,7 +6617,7 @@ func (q *sqlQuerier) GetPresetByID(ctx context.Context, presetID uuid.UUID) (Get const getPresetByWorkspaceBuildID = `-- name: GetPresetByWorkspaceBuildID :one SELECT - template_version_presets.id, template_version_presets.template_version_id, template_version_presets.name, template_version_presets.created_at, template_version_presets.desired_instances, template_version_presets.invalidate_after_secs + template_version_presets.id, template_version_presets.template_version_id, template_version_presets.name, template_version_presets.created_at, template_version_presets.desired_instances, template_version_presets.invalidate_after_secs, template_version_presets.prebuild_status FROM template_version_presets INNER JOIN workspace_builds ON workspace_builds.template_version_preset_id = template_version_presets.id @@ -6562,6 +6635,7 @@ func (q *sqlQuerier) GetPresetByWorkspaceBuildID(ctx context.Context, workspaceB &i.CreatedAt, &i.DesiredInstances, &i.InvalidateAfterSecs, + &i.PrebuildStatus, ) return i, err } @@ -6643,7 +6717,7 @@ func (q *sqlQuerier) GetPresetParametersByTemplateVersionID(ctx context.Context, const getPresetsByTemplateVersionID = `-- name: GetPresetsByTemplateVersionID :many SELECT - id, template_version_id, name, created_at, desired_instances, invalidate_after_secs + id, template_version_id, name, created_at, desired_instances, invalidate_after_secs, prebuild_status FROM template_version_presets WHERE @@ -6666,6 +6740,7 @@ func (q *sqlQuerier) GetPresetsByTemplateVersionID(ctx context.Context, template &i.CreatedAt, &i.DesiredInstances, &i.InvalidateAfterSecs, + &i.PrebuildStatus, ); err != nil { return nil, err } @@ -6696,7 +6771,7 @@ VALUES ( $4, $5, $6 -) RETURNING id, template_version_id, name, created_at, desired_instances, invalidate_after_secs +) RETURNING id, template_version_id, name, created_at, desired_instances, invalidate_after_secs, prebuild_status ` type InsertPresetParams struct { @@ -6725,6 +6800,7 @@ func (q *sqlQuerier) InsertPreset(ctx context.Context, arg InsertPresetParams) ( &i.CreatedAt, &i.DesiredInstances, &i.InvalidateAfterSecs, + &i.PrebuildStatus, ) return i, err } @@ -6773,6 +6849,22 @@ func (q *sqlQuerier) InsertPresetParameters(ctx context.Context, arg InsertPrese return items, nil } +const updatePresetPrebuildStatus = `-- name: UpdatePresetPrebuildStatus :exec +UPDATE template_version_presets +SET prebuild_status = $1 +WHERE id = $2 +` + +type UpdatePresetPrebuildStatusParams struct { + Status PrebuildStatus `db:"status" json:"status"` + PresetID uuid.UUID `db:"preset_id" json:"preset_id"` +} + +func (q *sqlQuerier) UpdatePresetPrebuildStatus(ctx context.Context, arg UpdatePresetPrebuildStatusParams) error { + _, err := q.db.ExecContext(ctx, updatePresetPrebuildStatus, arg.Status, arg.PresetID) + return err +} + const deleteOldProvisionerDaemons = `-- name: DeleteOldProvisionerDaemons :exec DELETE FROM provisioner_daemons WHERE ( (created_at < (NOW() - INTERVAL '7 days') AND last_seen_at IS NULL) OR @@ -18143,6 +18235,65 @@ func (q *sqlQuerier) GetWorkspaceByOwnerIDAndName(ctx context.Context, arg GetWo return i, err } +const getWorkspaceByResourceID = `-- name: GetWorkspaceByResourceID :one +SELECT + id, created_at, updated_at, owner_id, organization_id, template_id, deleted, name, autostart_schedule, ttl, last_used_at, dormant_at, deleting_at, automatic_updates, favorite, next_start_at, owner_avatar_url, owner_username, organization_name, organization_display_name, organization_icon, organization_description, template_name, template_display_name, template_icon, template_description +FROM + workspaces_expanded as workspaces +WHERE + workspaces.id = ( + SELECT + workspace_id + FROM + workspace_builds + WHERE + workspace_builds.job_id = ( + SELECT + job_id + FROM + workspace_resources + WHERE + workspace_resources.id = $1 + ) + ) +LIMIT + 1 +` + +func (q *sqlQuerier) GetWorkspaceByResourceID(ctx context.Context, resourceID uuid.UUID) (Workspace, error) { + row := q.db.QueryRowContext(ctx, getWorkspaceByResourceID, resourceID) + var i Workspace + err := row.Scan( + &i.ID, + &i.CreatedAt, + &i.UpdatedAt, + &i.OwnerID, + &i.OrganizationID, + &i.TemplateID, + &i.Deleted, + &i.Name, + &i.AutostartSchedule, + &i.Ttl, + &i.LastUsedAt, + &i.DormantAt, + &i.DeletingAt, + &i.AutomaticUpdates, + &i.Favorite, + &i.NextStartAt, + &i.OwnerAvatarUrl, + &i.OwnerUsername, + &i.OrganizationName, + &i.OrganizationDisplayName, + &i.OrganizationIcon, + &i.OrganizationDescription, + &i.TemplateName, + &i.TemplateDisplayName, + &i.TemplateIcon, + &i.TemplateDescription, + ) + return i, err +} + const getWorkspaceByWorkspaceAppID = `-- name: GetWorkspaceByWorkspaceAppID :one SELECT id, created_at, updated_at, owner_id, organization_id, template_id, deleted, name, autostart_schedule, ttl, last_used_at, dormant_at, deleting_at, automatic_updates, favorite, next_start_at, owner_avatar_url, owner_username, organization_name, organization_display_name, organization_icon, organization_description, template_name, template_display_name, template_icon, template_description diff --git a/coderd/database/queries/prebuilds.sql b/coderd/database/queries/prebuilds.sql index 8c27ddf62b7c3..9cd4321afec23 100644 --- a/coderd/database/queries/prebuilds.sql +++ b/coderd/database/queries/prebuilds.sql @@ -27,6 +27,7 @@ RETURNING w.id, w.name; SELECT t.id AS template_id, t.name AS template_name, + o.id AS organization_id, o.name AS organization_name, tv.id AS template_version_id, tv.name AS template_version_name, @@ -34,6 +35,7 @@ SELECT tvp.id, tvp.name, tvp.desired_instances AS desired_instances, + tvp.prebuild_status, t.deleted, t.deprecated != '' AS deprecated FROM templates t @@ -129,6 +131,42 @@ WHERE tsb.rn <= tsb.desired_instances -- Fetch the last N builds, where N is the AND created_at >= @lookback::timestamptz GROUP BY tsb.template_version_id, tsb.preset_id, fc.num_failed; +-- GetPresetsAtFailureLimit groups workspace builds by preset ID. +-- Each preset is associated with exactly one template version ID. +-- For each preset, the query checks the last hard_limit builds. +-- If all of them failed, the preset is considered to have hit the hard failure limit. +-- The query returns a list of preset IDs that have reached this failure threshold. +-- Only active template versions with configured presets are considered. +-- name: GetPresetsAtFailureLimit :many +WITH filtered_builds AS ( + -- Only select builds which are for prebuild creations + SELECT wlb.template_version_id, wlb.created_at, tvp.id AS preset_id, wlb.job_status, tvp.desired_instances + FROM template_version_presets tvp + INNER JOIN workspace_latest_builds wlb ON wlb.template_version_preset_id = tvp.id + INNER JOIN workspaces w ON wlb.workspace_id = w.id + INNER JOIN template_versions tv ON wlb.template_version_id = tv.id + INNER JOIN templates t ON tv.template_id = t.id AND t.active_version_id = tv.id + WHERE tvp.desired_instances IS NOT NULL -- Consider only presets that have a prebuild configuration. + AND wlb.transition = 'start'::workspace_transition + AND w.owner_id = 'c42fdf75-3097-471c-8c33-fb52454d81c0' +), +time_sorted_builds AS ( + -- Group builds by preset, then sort each group by created_at. + SELECT fb.template_version_id, fb.created_at, fb.preset_id, fb.job_status, fb.desired_instances, + ROW_NUMBER() OVER (PARTITION BY fb.preset_id ORDER BY fb.created_at DESC) as rn + FROM filtered_builds fb +) +SELECT + tsb.template_version_id, + tsb.preset_id +FROM time_sorted_builds tsb +-- For each preset, check the last hard_limit builds. +-- If all of them failed, the preset is considered to have hit the hard failure limit. +WHERE tsb.rn <= @hard_limit::bigint + AND tsb.job_status = 'failed'::provisioner_job_status +GROUP BY tsb.template_version_id, tsb.preset_id +HAVING COUNT(*) = @hard_limit::bigint; + -- name: GetPrebuildMetrics :many SELECT t.name as template_name, diff --git a/coderd/database/queries/presets.sql b/coderd/database/queries/presets.sql index 6d5646a285b4a..2fb6722bc2c33 100644 --- a/coderd/database/queries/presets.sql +++ b/coderd/database/queries/presets.sql @@ -25,6 +25,11 @@ SELECT unnest(@values :: TEXT[]) RETURNING *; +-- name: UpdatePresetPrebuildStatus :exec +UPDATE template_version_presets +SET prebuild_status = @status +WHERE id = @preset_id; + -- name: GetPresetsByTemplateVersionID :many SELECT * diff --git a/coderd/database/queries/workspaces.sql b/coderd/database/queries/workspaces.sql index 4ec74c066fe41..44b7dcbf0387d 100644 --- a/coderd/database/queries/workspaces.sql +++ b/coderd/database/queries/workspaces.sql @@ -8,6 +8,30 @@ WHERE LIMIT 1; +-- name: GetWorkspaceByResourceID :one +SELECT + * +FROM + workspaces_expanded as workspaces +WHERE + workspaces.id = ( + SELECT + workspace_id + FROM + workspace_builds + WHERE + workspace_builds.job_id = ( + SELECT + job_id + FROM + workspace_resources + WHERE + workspace_resources.id = @resource_id + ) + ) +LIMIT + 1; + -- name: GetWorkspaceByWorkspaceAppID :one SELECT * diff --git a/coderd/notifications/events.go b/coderd/notifications/events.go index 35d9925055da5..0e88361b56f68 100644 --- a/coderd/notifications/events.go +++ b/coderd/notifications/events.go @@ -42,6 +42,11 @@ var ( TemplateWorkspaceResourceReplaced = uuid.MustParse("89d9745a-816e-4695-a17f-3d0a229e2b8d") ) +// Prebuilds-related events +var ( + PrebuildFailureLimitReached = uuid.MustParse("414d9331-c1fc-4761-b40c-d1f4702279eb") +) + // Notification-related events. var ( TemplateTestNotification = uuid.MustParse("c425f63e-716a-4bf4-ae24-78348f706c3f") diff --git a/coderd/notifications/notifications_test.go b/coderd/notifications/notifications_test.go index 8f8a3c82441e0..fab87af41deb9 100644 --- a/coderd/notifications/notifications_test.go +++ b/coderd/notifications/notifications_test.go @@ -1250,6 +1250,22 @@ func TestNotificationTemplates_Golden(t *testing.T) { }, }, }, + { + name: "PrebuildFailureLimitReached", + id: notifications.PrebuildFailureLimitReached, + payload: types.MessagePayload{ + UserName: "Bobby", + UserEmail: "bobby@coder.com", + UserUsername: "bobby", + Labels: map[string]string{ + "org": "cern", + "template": "docker", + "template_version": "angry_torvalds", + "preset": "particle-accelerator", + }, + Data: map[string]any{}, + }, + }, } // We must have a test case for every notification_template. This is enforced below: diff --git a/coderd/notifications/testdata/rendered-templates/smtp/PrebuildFailureLimitReached.html.golden b/coderd/notifications/testdata/rendered-templates/smtp/PrebuildFailureLimitReached.html.golden new file mode 100644 index 0000000000000..69f13b86ca71c --- /dev/null +++ b/coderd/notifications/testdata/rendered-templates/smtp/PrebuildFailureLimitReached.html.golden @@ -0,0 +1,112 @@ +From: system@coder.com +To: bobby@coder.com +Subject: There is a problem creating prebuilt workspaces +Message-Id: 02ee4935-73be-4fa1-a290-ff9999026b13@blush-whale-48 +Date: Fri, 11 Oct 2024 09:03:06 +0000 +Content-Type: multipart/alternative; boundary=bbe61b741255b6098bb6b3c1f41b885773df633cb18d2a3002b68e4bc9c4 +MIME-Version: 1.0 + +--bbe61b741255b6098bb6b3c1f41b885773df633cb18d2a3002b68e4bc9c4 +Content-Transfer-Encoding: quoted-printable +Content-Type: text/plain; charset=UTF-8 + +Hi Bobby, + +The number of failed prebuild attempts has reached the hard limit for templ= +ate docker and preset particle-accelerator. + +To resume prebuilds, fix the underlying issue and upload a new template ver= +sion. + +Refer to the documentation for more details: + +Troubleshooting templates (https://coder.com/docs/admin/templates/troublesh= +ooting) +Troubleshooting of prebuilt workspaces (https://coder.com/docs/admin/templa= +tes/extending-templates/prebuilt-workspaces#administration-and-troubleshoot= +ing) + + +View failed prebuilt workspaces: http://test.com/workspaces?filter=3Downer:= +prebuilds+status:failed+template:docker + +View template version: http://test.com/templates/cern/docker/versions/angry= +_torvalds + +--bbe61b741255b6098bb6b3c1f41b885773df633cb18d2a3002b68e4bc9c4 +Content-Transfer-Encoding: quoted-printable +Content-Type: text/html; charset=UTF-8 + + + + + + + There is a problem creating prebuilt workspaces + + +
+
+ 3D"Cod= +
+

+ There is a problem creating prebuilt workspaces +

+
+

Hi Bobby,

+

The number of failed prebuild attempts has reached the hard limi= +t for template docker and preset particle-accelera= +tor.

+ +

To resume prebuilds, fix the underlying issue and upload a new template = +version.

+ +

Refer to the documentation for more details:
+- Troubl= +eshooting templates
+- Troubleshooting of pre= +built workspaces

+
+ + +
+ + + +--bbe61b741255b6098bb6b3c1f41b885773df633cb18d2a3002b68e4bc9c4-- diff --git a/coderd/notifications/testdata/rendered-templates/webhook/PrebuildFailureLimitReached.json.golden b/coderd/notifications/testdata/rendered-templates/webhook/PrebuildFailureLimitReached.json.golden new file mode 100644 index 0000000000000..0a6e262ff7512 --- /dev/null +++ b/coderd/notifications/testdata/rendered-templates/webhook/PrebuildFailureLimitReached.json.golden @@ -0,0 +1,35 @@ +{ + "_version": "1.1", + "msg_id": "00000000-0000-0000-0000-000000000000", + "payload": { + "_version": "1.2", + "notification_name": "Prebuild Failure Limit Reached", + "notification_template_id": "00000000-0000-0000-0000-000000000000", + "user_id": "00000000-0000-0000-0000-000000000000", + "user_email": "bobby@coder.com", + "user_name": "Bobby", + "user_username": "bobby", + "actions": [ + { + "label": "View failed prebuilt workspaces", + "url": "http://test.com/workspaces?filter=owner:prebuilds+status:failed+template:docker" + }, + { + "label": "View template version", + "url": "http://test.com/templates/cern/docker/versions/angry_torvalds" + } + ], + "labels": { + "org": "cern", + "preset": "particle-accelerator", + "template": "docker", + "template_version": "angry_torvalds" + }, + "data": {}, + "targets": null + }, + "title": "There is a problem creating prebuilt workspaces", + "title_markdown": "There is a problem creating prebuilt workspaces", + "body": "The number of failed prebuild attempts has reached the hard limit for template docker and preset particle-accelerator.\n\nTo resume prebuilds, fix the underlying issue and upload a new template version.\n\nRefer to the documentation for more details:\n\nTroubleshooting templates (https://coder.com/docs/admin/templates/troubleshooting)\nTroubleshooting of prebuilt workspaces (https://coder.com/docs/admin/templates/extending-templates/prebuilt-workspaces#administration-and-troubleshooting)", + "body_markdown": "\nThe number of failed prebuild attempts has reached the hard limit for template **docker** and preset **particle-accelerator**.\n\nTo resume prebuilds, fix the underlying issue and upload a new template version.\n\nRefer to the documentation for more details:\n- [Troubleshooting templates](https://coder.com/docs/admin/templates/troubleshooting)\n- [Troubleshooting of prebuilt workspaces](https://coder.com/docs/admin/templates/extending-templates/prebuilt-workspaces#administration-and-troubleshooting)\n" +} \ No newline at end of file diff --git a/coderd/prebuilds/global_snapshot.go b/coderd/prebuilds/global_snapshot.go index 0cf3fa3facc3a..9110f57574e7b 100644 --- a/coderd/prebuilds/global_snapshot.go +++ b/coderd/prebuilds/global_snapshot.go @@ -14,6 +14,7 @@ type GlobalSnapshot struct { RunningPrebuilds []database.GetRunningPrebuiltWorkspacesRow PrebuildsInProgress []database.CountInProgressPrebuildsRow Backoffs []database.GetPresetsBackoffRow + HardLimitedPresets []database.GetPresetsAtFailureLimitRow } func NewGlobalSnapshot( @@ -21,12 +22,14 @@ func NewGlobalSnapshot( runningPrebuilds []database.GetRunningPrebuiltWorkspacesRow, prebuildsInProgress []database.CountInProgressPrebuildsRow, backoffs []database.GetPresetsBackoffRow, + hardLimitedPresets []database.GetPresetsAtFailureLimitRow, ) GlobalSnapshot { return GlobalSnapshot{ Presets: presets, RunningPrebuilds: runningPrebuilds, PrebuildsInProgress: prebuildsInProgress, Backoffs: backoffs, + HardLimitedPresets: hardLimitedPresets, } } @@ -57,10 +60,15 @@ func (s GlobalSnapshot) FilterByPreset(presetID uuid.UUID) (*PresetSnapshot, err backoffPtr = &backoff } + _, isHardLimited := slice.Find(s.HardLimitedPresets, func(row database.GetPresetsAtFailureLimitRow) bool { + return row.PresetID == preset.ID + }) + return &PresetSnapshot{ - Preset: preset, - Running: running, - InProgress: inProgress, - Backoff: backoffPtr, + Preset: preset, + Running: running, + InProgress: inProgress, + Backoff: backoffPtr, + IsHardLimited: isHardLimited, }, nil } diff --git a/coderd/prebuilds/preset_snapshot.go b/coderd/prebuilds/preset_snapshot.go index 8441a350187d2..40e77de5ab3e3 100644 --- a/coderd/prebuilds/preset_snapshot.go +++ b/coderd/prebuilds/preset_snapshot.go @@ -32,10 +32,11 @@ const ( // It contains the raw data needed to calculate the current state of a preset's prebuilds, // including running prebuilds, in-progress builds, and backoff information. type PresetSnapshot struct { - Preset database.GetTemplatePresetsWithPrebuildsRow - Running []database.GetRunningPrebuiltWorkspacesRow - InProgress []database.CountInProgressPrebuildsRow - Backoff *database.GetPresetsBackoffRow + Preset database.GetTemplatePresetsWithPrebuildsRow + Running []database.GetRunningPrebuiltWorkspacesRow + InProgress []database.CountInProgressPrebuildsRow + Backoff *database.GetPresetsBackoffRow + IsHardLimited bool } // ReconciliationState represents the processed state of a preset's prebuilds, diff --git a/coderd/prebuilds/preset_snapshot_test.go b/coderd/prebuilds/preset_snapshot_test.go index a5acb40e5311f..2febf1d13ec91 100644 --- a/coderd/prebuilds/preset_snapshot_test.go +++ b/coderd/prebuilds/preset_snapshot_test.go @@ -73,7 +73,7 @@ func TestNoPrebuilds(t *testing.T) { preset(true, 0, current), } - snapshot := prebuilds.NewGlobalSnapshot(presets, nil, nil, nil) + snapshot := prebuilds.NewGlobalSnapshot(presets, nil, nil, nil, nil) ps, err := snapshot.FilterByPreset(current.presetID) require.NoError(t, err) @@ -98,7 +98,7 @@ func TestNetNew(t *testing.T) { preset(true, 1, current), } - snapshot := prebuilds.NewGlobalSnapshot(presets, nil, nil, nil) + snapshot := prebuilds.NewGlobalSnapshot(presets, nil, nil, nil, nil) ps, err := snapshot.FilterByPreset(current.presetID) require.NoError(t, err) @@ -138,7 +138,7 @@ func TestOutdatedPrebuilds(t *testing.T) { var inProgress []database.CountInProgressPrebuildsRow // WHEN: calculating the outdated preset's state. - snapshot := prebuilds.NewGlobalSnapshot(presets, running, inProgress, nil) + snapshot := prebuilds.NewGlobalSnapshot(presets, running, inProgress, nil, nil) ps, err := snapshot.FilterByPreset(outdated.presetID) require.NoError(t, err) @@ -200,7 +200,7 @@ func TestDeleteOutdatedPrebuilds(t *testing.T) { } // WHEN: calculating the outdated preset's state. - snapshot := prebuilds.NewGlobalSnapshot(presets, running, inProgress, nil) + snapshot := prebuilds.NewGlobalSnapshot(presets, running, inProgress, nil, nil) ps, err := snapshot.FilterByPreset(outdated.presetID) require.NoError(t, err) @@ -442,7 +442,7 @@ func TestInProgressActions(t *testing.T) { } // WHEN: calculating the current preset's state. - snapshot := prebuilds.NewGlobalSnapshot(presets, running, inProgress, nil) + snapshot := prebuilds.NewGlobalSnapshot(presets, running, inProgress, nil, nil) ps, err := snapshot.FilterByPreset(current.presetID) require.NoError(t, err) @@ -485,7 +485,7 @@ func TestExtraneous(t *testing.T) { var inProgress []database.CountInProgressPrebuildsRow // WHEN: calculating the current preset's state. - snapshot := prebuilds.NewGlobalSnapshot(presets, running, inProgress, nil) + snapshot := prebuilds.NewGlobalSnapshot(presets, running, inProgress, nil, nil) ps, err := snapshot.FilterByPreset(current.presetID) require.NoError(t, err) @@ -525,7 +525,7 @@ func TestDeprecated(t *testing.T) { var inProgress []database.CountInProgressPrebuildsRow // WHEN: calculating the current preset's state. - snapshot := prebuilds.NewGlobalSnapshot(presets, running, inProgress, nil) + snapshot := prebuilds.NewGlobalSnapshot(presets, running, inProgress, nil, nil) ps, err := snapshot.FilterByPreset(current.presetID) require.NoError(t, err) @@ -576,7 +576,7 @@ func TestLatestBuildFailed(t *testing.T) { } // WHEN: calculating the current preset's state. - snapshot := prebuilds.NewGlobalSnapshot(presets, running, inProgress, backoffs) + snapshot := prebuilds.NewGlobalSnapshot(presets, running, inProgress, backoffs, nil) psCurrent, err := snapshot.FilterByPreset(current.presetID) require.NoError(t, err) @@ -669,7 +669,7 @@ func TestMultiplePresetsPerTemplateVersion(t *testing.T) { }, } - snapshot := prebuilds.NewGlobalSnapshot(presets, nil, inProgress, nil) + snapshot := prebuilds.NewGlobalSnapshot(presets, nil, inProgress, nil, nil) // Nothing has to be created for preset 1. { diff --git a/coderd/provisionerdserver/provisionerdserver.go b/coderd/provisionerdserver/provisionerdserver.go index 423e9bbe584c6..9c4067137b852 100644 --- a/coderd/provisionerdserver/provisionerdserver.go +++ b/coderd/provisionerdserver/provisionerdserver.go @@ -1340,14 +1340,56 @@ func (s *server) CompleteJob(ctx context.Context, completed *proto.CompletedJob) switch jobType := completed.Type.(type) { case *proto.CompletedJob_TemplateImport_: - var input TemplateVersionImportJob - err = json.Unmarshal(job.Input, &input) + err = s.completeTemplateImportJob(ctx, job, jobID, jobType, telemetrySnapshot) + if err != nil { + return nil, err + } + case *proto.CompletedJob_WorkspaceBuild_: + err = s.completeWorkspaceBuildJob(ctx, job, jobID, jobType, telemetrySnapshot) + if err != nil { + return nil, err + } + case *proto.CompletedJob_TemplateDryRun_: + err = s.completeTemplateDryRunJob(ctx, job, jobID, jobType, telemetrySnapshot) if err != nil { - return nil, xerrors.Errorf("template version ID is expected: %w", err) + return nil, err + } + default: + if completed.Type == nil { + return nil, xerrors.Errorf("type payload must be provided") } + return nil, xerrors.Errorf("unknown job type %q; ensure coderd and provisionerd versions match", + reflect.TypeOf(completed.Type).String()) + } + + data, err := json.Marshal(provisionersdk.ProvisionerJobLogsNotifyMessage{EndOfLogs: true}) + if err != nil { + return nil, xerrors.Errorf("marshal job log: %w", err) + } + err = s.Pubsub.Publish(provisionersdk.ProvisionerJobLogsNotifyChannel(jobID), data) + if err != nil { + s.Logger.Error(ctx, "failed to publish end of job logs", slog.F("job_id", jobID), slog.Error(err)) + return nil, xerrors.Errorf("publish end of job logs: %w", err) + } + s.Logger.Debug(ctx, "stage CompleteJob done", slog.F("job_id", jobID)) + return &proto.Empty{}, nil +} + +// completeTemplateImportJob handles completion of a template import job. +// All database operations are performed within a transaction. +func (s *server) completeTemplateImportJob(ctx context.Context, job database.ProvisionerJob, jobID uuid.UUID, jobType *proto.CompletedJob_TemplateImport_, telemetrySnapshot *telemetry.Snapshot) error { + var input TemplateVersionImportJob + err := json.Unmarshal(job.Input, &input) + if err != nil { + return xerrors.Errorf("template version ID is expected: %w", err) + } + + // Execute all database operations in a transaction + return s.Database.InTx(func(db database.Store) error { now := s.timeNow() + // Process resources for transition, resources := range map[database.WorkspaceTransition][]*sdkproto.Resource{ database.WorkspaceTransitionStart: jobType.TemplateImport.StartResources, database.WorkspaceTransitionStop: jobType.TemplateImport.StopResources, @@ -1359,11 +1401,13 @@ func (s *server) CompleteJob(ctx context.Context, completed *proto.CompletedJob) slog.F("resource_type", resource.Type), slog.F("transition", transition)) - if err := InsertWorkspaceResource(ctx, s.Database, jobID, transition, resource, telemetrySnapshot); err != nil { - return nil, xerrors.Errorf("insert resource: %w", err) + if err := InsertWorkspaceResource(ctx, db, jobID, transition, resource, telemetrySnapshot); err != nil { + return xerrors.Errorf("insert resource: %w", err) } } } + + // Process modules for transition, modules := range map[database.WorkspaceTransition][]*sdkproto.Module{ database.WorkspaceTransitionStart: jobType.TemplateImport.StartModules, database.WorkspaceTransitionStop: jobType.TemplateImport.StopModules, @@ -1376,12 +1420,13 @@ func (s *server) CompleteJob(ctx context.Context, completed *proto.CompletedJob) slog.F("module_key", module.Key), slog.F("transition", transition)) - if err := InsertWorkspaceModule(ctx, s.Database, jobID, transition, module, telemetrySnapshot); err != nil { - return nil, xerrors.Errorf("insert module: %w", err) + if err := InsertWorkspaceModule(ctx, db, jobID, transition, module, telemetrySnapshot); err != nil { + return xerrors.Errorf("insert module: %w", err) } } } + // Process rich parameters for _, richParameter := range jobType.TemplateImport.RichParameters { s.Logger.Info(ctx, "inserting template import job parameter", slog.F("job_id", job.ID.String()), @@ -1391,7 +1436,7 @@ func (s *server) CompleteJob(ctx context.Context, completed *proto.CompletedJob) ) options, err := json.Marshal(richParameter.Options) if err != nil { - return nil, xerrors.Errorf("marshal parameter options: %w", err) + return xerrors.Errorf("marshal parameter options: %w", err) } var validationMin, validationMax sql.NullInt32 @@ -1408,7 +1453,7 @@ func (s *server) CompleteJob(ctx context.Context, completed *proto.CompletedJob) } } - _, err = s.Database.InsertTemplateVersionParameter(ctx, database.InsertTemplateVersionParameterParams{ + _, err = db.InsertTemplateVersionParameter(ctx, database.InsertTemplateVersionParameterParams{ TemplateVersionID: input.TemplateVersionID, Name: richParameter.Name, DisplayName: richParameter.DisplayName, @@ -1428,15 +1473,17 @@ func (s *server) CompleteJob(ctx context.Context, completed *proto.CompletedJob) Ephemeral: richParameter.Ephemeral, }) if err != nil { - return nil, xerrors.Errorf("insert parameter: %w", err) + return xerrors.Errorf("insert parameter: %w", err) } } - err = InsertWorkspacePresetsAndParameters(ctx, s.Logger, s.Database, jobID, input.TemplateVersionID, jobType.TemplateImport.Presets, now) + // Process presets and parameters + err := InsertWorkspacePresetsAndParameters(ctx, s.Logger, db, jobID, input.TemplateVersionID, jobType.TemplateImport.Presets, now) if err != nil { - return nil, xerrors.Errorf("insert workspace presets and parameters: %w", err) + return xerrors.Errorf("insert workspace presets and parameters: %w", err) } + // Process external auth providers var completedError sql.NullString for _, externalAuthProvider := range jobType.TemplateImport.ExternalAuthProviders { @@ -1479,18 +1526,19 @@ func (s *server) CompleteJob(ctx context.Context, completed *proto.CompletedJob) externalAuthProvidersMessage, err := json.Marshal(externalAuthProviders) if err != nil { - return nil, xerrors.Errorf("failed to serialize external_auth_providers value: %w", err) + return xerrors.Errorf("failed to serialize external_auth_providers value: %w", err) } - err = s.Database.UpdateTemplateVersionExternalAuthProvidersByJobID(ctx, database.UpdateTemplateVersionExternalAuthProvidersByJobIDParams{ + err = db.UpdateTemplateVersionExternalAuthProvidersByJobID(ctx, database.UpdateTemplateVersionExternalAuthProvidersByJobIDParams{ JobID: jobID, ExternalAuthProviders: externalAuthProvidersMessage, UpdatedAt: now, }) if err != nil { - return nil, xerrors.Errorf("update template version external auth providers: %w", err) + return xerrors.Errorf("update template version external auth providers: %w", err) } + // Process terraform values plan := jobType.TemplateImport.Plan moduleFiles := jobType.TemplateImport.ModuleFiles // If there is a plan, or a module files archive we need to insert a @@ -1509,7 +1557,7 @@ func (s *server) CompleteJob(ctx context.Context, completed *proto.CompletedJob) hash := hex.EncodeToString(hashBytes[:]) // nolint:gocritic // Requires reading "system" files - file, err := s.Database.GetFileByHashAndCreator(dbauthz.AsSystemRestricted(ctx), database.GetFileByHashAndCreatorParams{Hash: hash, CreatedBy: uuid.Nil}) + file, err := db.GetFileByHashAndCreator(dbauthz.AsSystemRestricted(ctx), database.GetFileByHashAndCreatorParams{Hash: hash, CreatedBy: uuid.Nil}) switch { case err == nil: // This set of modules is already cached, which means we can reuse them @@ -1518,10 +1566,10 @@ func (s *server) CompleteJob(ctx context.Context, completed *proto.CompletedJob) UUID: file.ID, } case !xerrors.Is(err, sql.ErrNoRows): - return nil, xerrors.Errorf("check for cached modules: %w", err) + return xerrors.Errorf("check for cached modules: %w", err) default: // nolint:gocritic // Requires creating a "system" file - file, err = s.Database.InsertFile(dbauthz.AsSystemRestricted(ctx), database.InsertFileParams{ + file, err = db.InsertFile(dbauthz.AsSystemRestricted(ctx), database.InsertFileParams{ ID: uuid.New(), Hash: hash, CreatedBy: uuid.Nil, @@ -1530,7 +1578,7 @@ func (s *server) CompleteJob(ctx context.Context, completed *proto.CompletedJob) Data: moduleFiles, }) if err != nil { - return nil, xerrors.Errorf("insert template version terraform modules: %w", err) + return xerrors.Errorf("insert template version terraform modules: %w", err) } fileID = uuid.NullUUID{ Valid: true, @@ -1539,7 +1587,7 @@ func (s *server) CompleteJob(ctx context.Context, completed *proto.CompletedJob) } } - err = s.Database.InsertTemplateVersionTerraformValuesByJobID(ctx, database.InsertTemplateVersionTerraformValuesByJobIDParams{ + err = db.InsertTemplateVersionTerraformValuesByJobID(ctx, database.InsertTemplateVersionTerraformValuesByJobIDParams{ JobID: jobID, UpdatedAt: now, CachedPlan: plan, @@ -1547,11 +1595,12 @@ func (s *server) CompleteJob(ctx context.Context, completed *proto.CompletedJob) ProvisionerdVersion: s.apiVersion, }) if err != nil { - return nil, xerrors.Errorf("insert template version terraform data: %w", err) + return xerrors.Errorf("insert template version terraform data: %w", err) } } - err = s.Database.UpdateProvisionerJobWithCompleteByID(ctx, database.UpdateProvisionerJobWithCompleteByIDParams{ + // Mark job as completed + err = db.UpdateProvisionerJobWithCompleteByID(ctx, database.UpdateProvisionerJobWithCompleteByIDParams{ ID: jobID, UpdatedAt: now, CompletedAt: sql.NullTime{ @@ -1562,206 +1611,136 @@ func (s *server) CompleteJob(ctx context.Context, completed *proto.CompletedJob) ErrorCode: sql.NullString{}, }) if err != nil { - return nil, xerrors.Errorf("update provisioner job: %w", err) + return xerrors.Errorf("update provisioner job: %w", err) } s.Logger.Debug(ctx, "marked import job as completed", slog.F("job_id", jobID)) - case *proto.CompletedJob_WorkspaceBuild_: - var input WorkspaceProvisionJob - err = json.Unmarshal(job.Input, &input) - if err != nil { - return nil, xerrors.Errorf("unmarshal job data: %w", err) - } + return nil + }, nil) // End of transaction +} - workspaceBuild, err := s.Database.GetWorkspaceBuildByID(ctx, input.WorkspaceBuildID) - if err != nil { - return nil, xerrors.Errorf("get workspace build: %w", err) - } +// completeWorkspaceBuildJob handles completion of a workspace build job. +// Most database operations are performed within a transaction. +func (s *server) completeWorkspaceBuildJob(ctx context.Context, job database.ProvisionerJob, jobID uuid.UUID, jobType *proto.CompletedJob_WorkspaceBuild_, telemetrySnapshot *telemetry.Snapshot) error { + var input WorkspaceProvisionJob + err := json.Unmarshal(job.Input, &input) + if err != nil { + return xerrors.Errorf("unmarshal job data: %w", err) + } - var workspace database.Workspace - var getWorkspaceError error + workspaceBuild, err := s.Database.GetWorkspaceBuildByID(ctx, input.WorkspaceBuildID) + if err != nil { + return xerrors.Errorf("get workspace build: %w", err) + } - err = s.Database.InTx(func(db database.Store) error { - // It's important we use s.timeNow() here because we want to be - // able to customize the current time from within tests. - now := s.timeNow() - - workspace, getWorkspaceError = db.GetWorkspaceByID(ctx, workspaceBuild.WorkspaceID) - if getWorkspaceError != nil { - s.Logger.Error(ctx, - "fetch workspace for build", - slog.F("workspace_build_id", workspaceBuild.ID), - slog.F("workspace_id", workspaceBuild.WorkspaceID), - ) - return getWorkspaceError - } + var workspace database.Workspace + var getWorkspaceError error - templateScheduleStore := *s.TemplateScheduleStore.Load() + // Execute all database modifications in a transaction + err = s.Database.InTx(func(db database.Store) error { + // It's important we use s.timeNow() here because we want to be + // able to customize the current time from within tests. + now := s.timeNow() - autoStop, err := schedule.CalculateAutostop(ctx, schedule.CalculateAutostopParams{ - Database: db, - TemplateScheduleStore: templateScheduleStore, - UserQuietHoursScheduleStore: *s.UserQuietHoursScheduleStore.Load(), - Now: now, - Workspace: workspace.WorkspaceTable(), - // Allowed to be the empty string. - WorkspaceAutostart: workspace.AutostartSchedule.String, - }) - if err != nil { - return xerrors.Errorf("calculate auto stop: %w", err) - } + workspace, getWorkspaceError = db.GetWorkspaceByID(ctx, workspaceBuild.WorkspaceID) + if getWorkspaceError != nil { + s.Logger.Error(ctx, + "fetch workspace for build", + slog.F("workspace_build_id", workspaceBuild.ID), + slog.F("workspace_id", workspaceBuild.WorkspaceID), + ) + return getWorkspaceError + } - if workspace.AutostartSchedule.Valid { - templateScheduleOptions, err := templateScheduleStore.Get(ctx, db, workspace.TemplateID) - if err != nil { - return xerrors.Errorf("get template schedule options: %w", err) - } + templateScheduleStore := *s.TemplateScheduleStore.Load() - nextStartAt, err := schedule.NextAllowedAutostart(now, workspace.AutostartSchedule.String, templateScheduleOptions) - if err == nil { - err = db.UpdateWorkspaceNextStartAt(ctx, database.UpdateWorkspaceNextStartAtParams{ - ID: workspace.ID, - NextStartAt: sql.NullTime{Valid: true, Time: nextStartAt.UTC()}, - }) - if err != nil { - return xerrors.Errorf("update workspace next start at: %w", err) - } - } - } + autoStop, err := schedule.CalculateAutostop(ctx, schedule.CalculateAutostopParams{ + Database: db, + TemplateScheduleStore: templateScheduleStore, + UserQuietHoursScheduleStore: *s.UserQuietHoursScheduleStore.Load(), + Now: now, + Workspace: workspace.WorkspaceTable(), + // Allowed to be the empty string. + WorkspaceAutostart: workspace.AutostartSchedule.String, + }) + if err != nil { + return xerrors.Errorf("calculate auto stop: %w", err) + } - err = db.UpdateProvisionerJobWithCompleteByID(ctx, database.UpdateProvisionerJobWithCompleteByIDParams{ - ID: jobID, - UpdatedAt: now, - CompletedAt: sql.NullTime{ - Time: now, - Valid: true, - }, - Error: sql.NullString{}, - ErrorCode: sql.NullString{}, - }) + if workspace.AutostartSchedule.Valid { + templateScheduleOptions, err := templateScheduleStore.Get(ctx, db, workspace.TemplateID) if err != nil { - return xerrors.Errorf("update provisioner job: %w", err) + return xerrors.Errorf("get template schedule options: %w", err) } - err = db.UpdateWorkspaceBuildProvisionerStateByID(ctx, database.UpdateWorkspaceBuildProvisionerStateByIDParams{ - ID: workspaceBuild.ID, - ProvisionerState: jobType.WorkspaceBuild.State, - UpdatedAt: now, - }) - if err != nil { - return xerrors.Errorf("update workspace build provisioner state: %w", err) - } - err = db.UpdateWorkspaceBuildDeadlineByID(ctx, database.UpdateWorkspaceBuildDeadlineByIDParams{ - ID: workspaceBuild.ID, - Deadline: autoStop.Deadline, - MaxDeadline: autoStop.MaxDeadline, - UpdatedAt: now, - }) - if err != nil { - return xerrors.Errorf("update workspace build deadline: %w", err) - } - - agentTimeouts := make(map[time.Duration]bool) // A set of agent timeouts. - // This could be a bulk insert to improve performance. - for _, protoResource := range jobType.WorkspaceBuild.Resources { - for _, protoAgent := range protoResource.Agents { - dur := time.Duration(protoAgent.GetConnectionTimeoutSeconds()) * time.Second - agentTimeouts[dur] = true - } - err = InsertWorkspaceResource(ctx, db, job.ID, workspaceBuild.Transition, protoResource, telemetrySnapshot) + nextStartAt, err := schedule.NextAllowedAutostart(now, workspace.AutostartSchedule.String, templateScheduleOptions) + if err == nil { + err = db.UpdateWorkspaceNextStartAt(ctx, database.UpdateWorkspaceNextStartAtParams{ + ID: workspace.ID, + NextStartAt: sql.NullTime{Valid: true, Time: nextStartAt.UTC()}, + }) if err != nil { - return xerrors.Errorf("insert provisioner job: %w", err) - } - } - for _, module := range jobType.WorkspaceBuild.Modules { - if err := InsertWorkspaceModule(ctx, db, job.ID, workspaceBuild.Transition, module, telemetrySnapshot); err != nil { - return xerrors.Errorf("insert provisioner job module: %w", err) + return xerrors.Errorf("update workspace next start at: %w", err) } } + } - // On start, we want to ensure that workspace agents timeout statuses - // are propagated. This method is simple and does not protect against - // notifying in edge cases like when a workspace is stopped soon - // after being started. - // - // Agent timeouts could be minutes apart, resulting in an unresponsive - // experience, so we'll notify after every unique timeout seconds. - if !input.DryRun && workspaceBuild.Transition == database.WorkspaceTransitionStart && len(agentTimeouts) > 0 { - timeouts := maps.Keys(agentTimeouts) - slices.Sort(timeouts) - - var updates []<-chan time.Time - for _, d := range timeouts { - s.Logger.Debug(ctx, "triggering workspace notification after agent timeout", - slog.F("workspace_build_id", workspaceBuild.ID), - slog.F("timeout", d), - ) - // Agents are inserted with `dbtime.Now()`, this triggers a - // workspace event approximately after created + timeout seconds. - updates = append(updates, time.After(d)) - } - go func() { - for _, wait := range updates { - select { - case <-s.lifecycleCtx.Done(): - // If the server is shutting down, we don't want to wait around. - s.Logger.Debug(ctx, "stopping notifications due to server shutdown", - slog.F("workspace_build_id", workspaceBuild.ID), - ) - return - case <-wait: - // Wait for the next potential timeout to occur. - msg, err := json.Marshal(wspubsub.WorkspaceEvent{ - Kind: wspubsub.WorkspaceEventKindAgentTimeout, - WorkspaceID: workspace.ID, - }) - if err != nil { - s.Logger.Error(ctx, "marshal workspace update event", slog.Error(err)) - break - } - if err := s.Pubsub.Publish(wspubsub.WorkspaceEventChannel(workspace.OwnerID), msg); err != nil { - if s.lifecycleCtx.Err() != nil { - // If the server is shutting down, we don't want to log this error, nor wait around. - s.Logger.Debug(ctx, "stopping notifications due to server shutdown", - slog.F("workspace_build_id", workspaceBuild.ID), - ) - return - } - s.Logger.Error(ctx, "workspace notification after agent timeout failed", - slog.F("workspace_build_id", workspaceBuild.ID), - slog.Error(err), - ) - } - } - } - }() - } + err = db.UpdateProvisionerJobWithCompleteByID(ctx, database.UpdateProvisionerJobWithCompleteByIDParams{ + ID: jobID, + UpdatedAt: now, + CompletedAt: sql.NullTime{ + Time: now, + Valid: true, + }, + Error: sql.NullString{}, + ErrorCode: sql.NullString{}, + }) + if err != nil { + return xerrors.Errorf("update provisioner job: %w", err) + } + err = db.UpdateWorkspaceBuildProvisionerStateByID(ctx, database.UpdateWorkspaceBuildProvisionerStateByIDParams{ + ID: workspaceBuild.ID, + ProvisionerState: jobType.WorkspaceBuild.State, + UpdatedAt: now, + }) + if err != nil { + return xerrors.Errorf("update workspace build provisioner state: %w", err) + } + err = db.UpdateWorkspaceBuildDeadlineByID(ctx, database.UpdateWorkspaceBuildDeadlineByIDParams{ + ID: workspaceBuild.ID, + Deadline: autoStop.Deadline, + MaxDeadline: autoStop.MaxDeadline, + UpdatedAt: now, + }) + if err != nil { + return xerrors.Errorf("update workspace build deadline: %w", err) + } - if workspaceBuild.Transition != database.WorkspaceTransitionDelete { - // This is for deleting a workspace! - return nil + agentTimeouts := make(map[time.Duration]bool) // A set of agent timeouts. + // This could be a bulk insert to improve performance. + for _, protoResource := range jobType.WorkspaceBuild.Resources { + for _, protoAgent := range protoResource.Agents { + dur := time.Duration(protoAgent.GetConnectionTimeoutSeconds()) * time.Second + agentTimeouts[dur] = true } - err = db.UpdateWorkspaceDeletedByID(ctx, database.UpdateWorkspaceDeletedByIDParams{ - ID: workspaceBuild.WorkspaceID, - Deleted: true, - }) + err = InsertWorkspaceResource(ctx, db, job.ID, workspaceBuild.Transition, protoResource, telemetrySnapshot) if err != nil { - return xerrors.Errorf("update workspace deleted: %w", err) + return xerrors.Errorf("insert provisioner job: %w", err) + } + } + for _, module := range jobType.WorkspaceBuild.Modules { + if err := InsertWorkspaceModule(ctx, db, job.ID, workspaceBuild.Transition, module, telemetrySnapshot); err != nil { + return xerrors.Errorf("insert provisioner job module: %w", err) } - - return nil - }, nil) - if err != nil { - return nil, xerrors.Errorf("complete job: %w", err) } - // Insert timings outside transaction since it is metadata. + // Insert timings inside the transaction now // nolint:exhaustruct // The other fields are set further down. params := database.InsertProvisionerJobTimingsParams{ JobID: jobID, } - for _, t := range completed.GetWorkspaceBuild().GetTimings() { + for _, t := range jobType.WorkspaceBuild.Timings { if t.Start == nil || t.End == nil { s.Logger.Warn(ctx, "timings entry has nil start or end time", slog.F("entry", t.String())) continue @@ -1780,153 +1759,229 @@ func (s *server) CompleteJob(ctx context.Context, completed *proto.CompletedJob) params.StartedAt = append(params.StartedAt, t.Start.AsTime()) params.EndedAt = append(params.EndedAt, t.End.AsTime()) } - _, err = s.Database.InsertProvisionerJobTimings(ctx, params) + _, err = db.InsertProvisionerJobTimings(ctx, params) if err != nil { - // Don't fail the transaction for non-critical data. + // Log error but don't fail the whole transaction for non-critical data s.Logger.Warn(ctx, "failed to update provisioner job timings", slog.F("job_id", jobID), slog.Error(err)) } - // audit the outcome of the workspace build - if getWorkspaceError == nil { - // If the workspace has been deleted, notify the owner about it. - if workspaceBuild.Transition == database.WorkspaceTransitionDelete { - s.notifyWorkspaceDeleted(ctx, workspace, workspaceBuild) - } + // On start, we want to ensure that workspace agents timeout statuses + // are propagated. This method is simple and does not protect against + // notifying in edge cases like when a workspace is stopped soon + // after being started. + // + // Agent timeouts could be minutes apart, resulting in an unresponsive + // experience, so we'll notify after every unique timeout seconds. + if !input.DryRun && workspaceBuild.Transition == database.WorkspaceTransitionStart && len(agentTimeouts) > 0 { + timeouts := maps.Keys(agentTimeouts) + slices.Sort(timeouts) + + var updates []<-chan time.Time + for _, d := range timeouts { + s.Logger.Debug(ctx, "triggering workspace notification after agent timeout", + slog.F("workspace_build_id", workspaceBuild.ID), + slog.F("timeout", d), + ) + // Agents are inserted with `dbtime.Now()`, this triggers a + // workspace event approximately after created + timeout seconds. + updates = append(updates, time.After(d)) + } + go func() { + for _, wait := range updates { + select { + case <-s.lifecycleCtx.Done(): + // If the server is shutting down, we don't want to wait around. + s.Logger.Debug(ctx, "stopping notifications due to server shutdown", + slog.F("workspace_build_id", workspaceBuild.ID), + ) + return + case <-wait: + // Wait for the next potential timeout to occur. + msg, err := json.Marshal(wspubsub.WorkspaceEvent{ + Kind: wspubsub.WorkspaceEventKindAgentTimeout, + WorkspaceID: workspace.ID, + }) + if err != nil { + s.Logger.Error(ctx, "marshal workspace update event", slog.Error(err)) + break + } + if err := s.Pubsub.Publish(wspubsub.WorkspaceEventChannel(workspace.OwnerID), msg); err != nil { + if s.lifecycleCtx.Err() != nil { + // If the server is shutting down, we don't want to log this error, nor wait around. + s.Logger.Debug(ctx, "stopping notifications due to server shutdown", + slog.F("workspace_build_id", workspaceBuild.ID), + ) + return + } + s.Logger.Error(ctx, "workspace notification after agent timeout failed", + slog.F("workspace_build_id", workspaceBuild.ID), + slog.Error(err), + ) + } + } + } + }() + } - auditor := s.Auditor.Load() - auditAction := auditActionFromTransition(workspaceBuild.Transition) + if workspaceBuild.Transition != database.WorkspaceTransitionDelete { + // This is for deleting a workspace! + return nil + } - previousBuildNumber := workspaceBuild.BuildNumber - 1 - previousBuild, prevBuildErr := s.Database.GetWorkspaceBuildByWorkspaceIDAndBuildNumber(ctx, database.GetWorkspaceBuildByWorkspaceIDAndBuildNumberParams{ - WorkspaceID: workspace.ID, - BuildNumber: previousBuildNumber, - }) - if prevBuildErr != nil { - previousBuild = database.WorkspaceBuild{} - } + err = db.UpdateWorkspaceDeletedByID(ctx, database.UpdateWorkspaceDeletedByIDParams{ + ID: workspaceBuild.WorkspaceID, + Deleted: true, + }) + if err != nil { + return xerrors.Errorf("update workspace deleted: %w", err) + } - // We pass the below information to the Auditor so that it - // can form a friendly string for the user to view in the UI. - buildResourceInfo := audit.AdditionalFields{ - WorkspaceName: workspace.Name, - BuildNumber: strconv.FormatInt(int64(workspaceBuild.BuildNumber), 10), - BuildReason: database.BuildReason(string(workspaceBuild.Reason)), - WorkspaceID: workspace.ID, - } + return nil + }, nil) + if err != nil { + return xerrors.Errorf("complete job: %w", err) + } - wriBytes, err := json.Marshal(buildResourceInfo) - if err != nil { - s.Logger.Error(ctx, "marshal resource info for successful job", slog.Error(err)) - } - - bag := audit.BaggageFromContext(ctx) - - audit.BackgroundAudit(ctx, &audit.BackgroundAuditParams[database.WorkspaceBuild]{ - Audit: *auditor, - Log: s.Logger, - UserID: job.InitiatorID, - OrganizationID: workspace.OrganizationID, - RequestID: job.ID, - IP: bag.IP, - Action: auditAction, - Old: previousBuild, - New: workspaceBuild, - Status: http.StatusOK, - AdditionalFields: wriBytes, - }) - } + // Post-transaction operations (operations that do not require transactions or + // are external to the database, like audit logging, notifications, etc.) - if s.PrebuildsOrchestrator != nil && input.PrebuiltWorkspaceBuildStage == sdkproto.PrebuiltWorkspaceBuildStage_CLAIM { - // Track resource replacements, if there are any. - orchestrator := s.PrebuildsOrchestrator.Load() - if resourceReplacements := completed.GetWorkspaceBuild().GetResourceReplacements(); orchestrator != nil && len(resourceReplacements) > 0 { - // Fire and forget. Bind to the lifecycle of the server so shutdowns are handled gracefully. - go (*orchestrator).TrackResourceReplacement(s.lifecycleCtx, workspace.ID, workspaceBuild.ID, resourceReplacements) - } + // audit the outcome of the workspace build + if getWorkspaceError == nil { + // If the workspace has been deleted, notify the owner about it. + if workspaceBuild.Transition == database.WorkspaceTransitionDelete { + s.notifyWorkspaceDeleted(ctx, workspace, workspaceBuild) } - msg, err := json.Marshal(wspubsub.WorkspaceEvent{ - Kind: wspubsub.WorkspaceEventKindStateChange, + auditor := s.Auditor.Load() + auditAction := auditActionFromTransition(workspaceBuild.Transition) + + previousBuildNumber := workspaceBuild.BuildNumber - 1 + previousBuild, prevBuildErr := s.Database.GetWorkspaceBuildByWorkspaceIDAndBuildNumber(ctx, database.GetWorkspaceBuildByWorkspaceIDAndBuildNumberParams{ WorkspaceID: workspace.ID, + BuildNumber: previousBuildNumber, }) - if err != nil { - return nil, xerrors.Errorf("marshal workspace update event: %s", err) + if prevBuildErr != nil { + previousBuild = database.WorkspaceBuild{} } - err = s.Pubsub.Publish(wspubsub.WorkspaceEventChannel(workspace.OwnerID), msg) + + // We pass the below information to the Auditor so that it + // can form a friendly string for the user to view in the UI. + buildResourceInfo := audit.AdditionalFields{ + WorkspaceName: workspace.Name, + BuildNumber: strconv.FormatInt(int64(workspaceBuild.BuildNumber), 10), + BuildReason: database.BuildReason(string(workspaceBuild.Reason)), + WorkspaceID: workspace.ID, + } + + wriBytes, err := json.Marshal(buildResourceInfo) if err != nil { - return nil, xerrors.Errorf("update workspace: %w", err) + s.Logger.Error(ctx, "marshal resource info for successful job", slog.Error(err)) + } + + bag := audit.BaggageFromContext(ctx) + + audit.BackgroundAudit(ctx, &audit.BackgroundAuditParams[database.WorkspaceBuild]{ + Audit: *auditor, + Log: s.Logger, + UserID: job.InitiatorID, + OrganizationID: workspace.OrganizationID, + RequestID: job.ID, + IP: bag.IP, + Action: auditAction, + Old: previousBuild, + New: workspaceBuild, + Status: http.StatusOK, + AdditionalFields: wriBytes, + }) + } + + if s.PrebuildsOrchestrator != nil && input.PrebuiltWorkspaceBuildStage == sdkproto.PrebuiltWorkspaceBuildStage_CLAIM { + // Track resource replacements, if there are any. + orchestrator := s.PrebuildsOrchestrator.Load() + if resourceReplacements := jobType.WorkspaceBuild.ResourceReplacements; orchestrator != nil && len(resourceReplacements) > 0 { + // Fire and forget. Bind to the lifecycle of the server so shutdowns are handled gracefully. + go (*orchestrator).TrackResourceReplacement(s.lifecycleCtx, workspace.ID, workspaceBuild.ID, resourceReplacements) } + } - if input.PrebuiltWorkspaceBuildStage == sdkproto.PrebuiltWorkspaceBuildStage_CLAIM { - s.Logger.Info(ctx, "workspace prebuild successfully claimed by user", - slog.F("workspace_id", workspace.ID)) + msg, err := json.Marshal(wspubsub.WorkspaceEvent{ + Kind: wspubsub.WorkspaceEventKindStateChange, + WorkspaceID: workspace.ID, + }) + if err != nil { + return xerrors.Errorf("marshal workspace update event: %s", err) + } + err = s.Pubsub.Publish(wspubsub.WorkspaceEventChannel(workspace.OwnerID), msg) + if err != nil { + return xerrors.Errorf("update workspace: %w", err) + } - err = prebuilds.NewPubsubWorkspaceClaimPublisher(s.Pubsub).PublishWorkspaceClaim(agentsdk.ReinitializationEvent{ - WorkspaceID: workspace.ID, - Reason: agentsdk.ReinitializeReasonPrebuildClaimed, - }) - if err != nil { - s.Logger.Error(ctx, "failed to publish workspace claim event", slog.Error(err)) - } + if input.PrebuiltWorkspaceBuildStage == sdkproto.PrebuiltWorkspaceBuildStage_CLAIM { + s.Logger.Info(ctx, "workspace prebuild successfully claimed by user", + slog.F("workspace_id", workspace.ID)) + + err = prebuilds.NewPubsubWorkspaceClaimPublisher(s.Pubsub).PublishWorkspaceClaim(agentsdk.ReinitializationEvent{ + WorkspaceID: workspace.ID, + Reason: agentsdk.ReinitializeReasonPrebuildClaimed, + }) + if err != nil { + s.Logger.Error(ctx, "failed to publish workspace claim event", slog.Error(err)) } - case *proto.CompletedJob_TemplateDryRun_: + } + + return nil +} + +// completeTemplateDryRunJob handles completion of a template dry-run job. +// All database operations are performed within a transaction. +func (s *server) completeTemplateDryRunJob(ctx context.Context, job database.ProvisionerJob, jobID uuid.UUID, jobType *proto.CompletedJob_TemplateDryRun_, telemetrySnapshot *telemetry.Snapshot) error { + // Execute all database operations in a transaction + return s.Database.InTx(func(db database.Store) error { + now := s.timeNow() + + // Process resources for _, resource := range jobType.TemplateDryRun.Resources { s.Logger.Info(ctx, "inserting template dry-run job resource", slog.F("job_id", job.ID.String()), slog.F("resource_name", resource.Name), slog.F("resource_type", resource.Type)) - err = InsertWorkspaceResource(ctx, s.Database, jobID, database.WorkspaceTransitionStart, resource, telemetrySnapshot) + err := InsertWorkspaceResource(ctx, db, jobID, database.WorkspaceTransitionStart, resource, telemetrySnapshot) if err != nil { - return nil, xerrors.Errorf("insert resource: %w", err) + return xerrors.Errorf("insert resource: %w", err) } } + + // Process modules for _, module := range jobType.TemplateDryRun.Modules { s.Logger.Info(ctx, "inserting template dry-run job module", slog.F("job_id", job.ID.String()), slog.F("module_source", module.Source), ) - if err := InsertWorkspaceModule(ctx, s.Database, jobID, database.WorkspaceTransitionStart, module, telemetrySnapshot); err != nil { - return nil, xerrors.Errorf("insert module: %w", err) + if err := InsertWorkspaceModule(ctx, db, jobID, database.WorkspaceTransitionStart, module, telemetrySnapshot); err != nil { + return xerrors.Errorf("insert module: %w", err) } } - err = s.Database.UpdateProvisionerJobWithCompleteByID(ctx, database.UpdateProvisionerJobWithCompleteByIDParams{ + // Mark job as complete + err := db.UpdateProvisionerJobWithCompleteByID(ctx, database.UpdateProvisionerJobWithCompleteByIDParams{ ID: jobID, - UpdatedAt: s.timeNow(), + UpdatedAt: now, CompletedAt: sql.NullTime{ - Time: s.timeNow(), + Time: now, Valid: true, }, Error: sql.NullString{}, ErrorCode: sql.NullString{}, }) if err != nil { - return nil, xerrors.Errorf("update provisioner job: %w", err) + return xerrors.Errorf("update provisioner job: %w", err) } s.Logger.Debug(ctx, "marked template dry-run job as completed", slog.F("job_id", jobID)) - default: - if completed.Type == nil { - return nil, xerrors.Errorf("type payload must be provided") - } - return nil, xerrors.Errorf("unknown job type %q; ensure coderd and provisionerd versions match", - reflect.TypeOf(completed.Type).String()) - } - - data, err := json.Marshal(provisionersdk.ProvisionerJobLogsNotifyMessage{EndOfLogs: true}) - if err != nil { - return nil, xerrors.Errorf("marshal job log: %w", err) - } - err = s.Pubsub.Publish(provisionersdk.ProvisionerJobLogsNotifyChannel(jobID), data) - if err != nil { - s.Logger.Error(ctx, "failed to publish end of job logs", slog.F("job_id", jobID), slog.Error(err)) - return nil, xerrors.Errorf("publish end of job logs: %w", err) - } - - s.Logger.Debug(ctx, "stage CompleteJob done", slog.F("job_id", jobID)) - return &proto.Empty{}, nil + return nil + }, nil) // End of transaction } func (s *server) notifyWorkspaceDeleted(ctx context.Context, workspace database.Workspace, build database.WorkspaceBuild) { diff --git a/coderd/provisionerdserver/provisionerdserver_test.go b/coderd/provisionerdserver/provisionerdserver_test.go index e125db348e701..eb63d84b1df1b 100644 --- a/coderd/provisionerdserver/provisionerdserver_test.go +++ b/coderd/provisionerdserver/provisionerdserver_test.go @@ -20,6 +20,7 @@ import ( "go.opentelemetry.io/otel/trace" "golang.org/x/oauth2" "golang.org/x/xerrors" + "google.golang.org/protobuf/types/known/timestamppb" "storj.io/drpc" "cdr.dev/slog/sloggers/slogtest" @@ -1119,6 +1120,227 @@ func TestCompleteJob(t *testing.T) { require.ErrorContains(t, err, "you don't own this job") }) + // Test for verifying transaction behavior on the extracted methods + t.Run("TransactionBehavior", func(t *testing.T) { + t.Parallel() + // Test TemplateImport transaction + t.Run("TemplateImportTransaction", func(t *testing.T) { + t.Parallel() + srv, db, _, pd := setup(t, false, &overrides{}) + jobID := uuid.New() + versionID := uuid.New() + err := db.InsertTemplateVersion(ctx, database.InsertTemplateVersionParams{ + ID: versionID, + JobID: jobID, + OrganizationID: pd.OrganizationID, + }) + require.NoError(t, err) + job, err := db.InsertProvisionerJob(ctx, database.InsertProvisionerJobParams{ + OrganizationID: pd.OrganizationID, + ID: jobID, + Provisioner: database.ProvisionerTypeEcho, + Input: []byte(`{"template_version_id": "` + versionID.String() + `"}`), + StorageMethod: database.ProvisionerStorageMethodFile, + Type: database.ProvisionerJobTypeTemplateVersionImport, + }) + require.NoError(t, err) + _, err = db.AcquireProvisionerJob(ctx, database.AcquireProvisionerJobParams{ + OrganizationID: pd.OrganizationID, + WorkerID: uuid.NullUUID{ + UUID: pd.ID, + Valid: true, + }, + Types: []database.ProvisionerType{database.ProvisionerTypeEcho}, + }) + require.NoError(t, err) + + _, err = srv.CompleteJob(ctx, &proto.CompletedJob{ + JobId: job.ID.String(), + Type: &proto.CompletedJob_TemplateImport_{ + TemplateImport: &proto.CompletedJob_TemplateImport{ + StartResources: []*sdkproto.Resource{{ + Name: "test-resource", + Type: "aws_instance", + }}, + Plan: []byte("{}"), + }, + }, + }) + require.NoError(t, err) + + // Verify job was marked as completed + completedJob, err := db.GetProvisionerJobByID(ctx, job.ID) + require.NoError(t, err) + require.True(t, completedJob.CompletedAt.Valid, "Job should be marked as completed") + + // Verify resources were created + resources, err := db.GetWorkspaceResourcesByJobID(ctx, job.ID) + require.NoError(t, err) + require.Len(t, resources, 1, "Expected one resource to be created") + require.Equal(t, "test-resource", resources[0].Name) + }) + + // Test TemplateDryRun transaction + t.Run("TemplateDryRunTransaction", func(t *testing.T) { + t.Parallel() + srv, db, _, pd := setup(t, false, &overrides{}) + job, err := db.InsertProvisionerJob(ctx, database.InsertProvisionerJobParams{ + ID: uuid.New(), + Provisioner: database.ProvisionerTypeEcho, + Type: database.ProvisionerJobTypeTemplateVersionDryRun, + StorageMethod: database.ProvisionerStorageMethodFile, + }) + require.NoError(t, err) + _, err = db.AcquireProvisionerJob(ctx, database.AcquireProvisionerJobParams{ + WorkerID: uuid.NullUUID{ + UUID: pd.ID, + Valid: true, + }, + Types: []database.ProvisionerType{database.ProvisionerTypeEcho}, + }) + require.NoError(t, err) + + _, err = srv.CompleteJob(ctx, &proto.CompletedJob{ + JobId: job.ID.String(), + Type: &proto.CompletedJob_TemplateDryRun_{ + TemplateDryRun: &proto.CompletedJob_TemplateDryRun{ + Resources: []*sdkproto.Resource{{ + Name: "test-dry-run-resource", + Type: "aws_instance", + }}, + }, + }, + }) + require.NoError(t, err) + + // Verify job was marked as completed + completedJob, err := db.GetProvisionerJobByID(ctx, job.ID) + require.NoError(t, err) + require.True(t, completedJob.CompletedAt.Valid, "Job should be marked as completed") + + // Verify resources were created + resources, err := db.GetWorkspaceResourcesByJobID(ctx, job.ID) + require.NoError(t, err) + require.Len(t, resources, 1, "Expected one resource to be created") + require.Equal(t, "test-dry-run-resource", resources[0].Name) + }) + + // Test WorkspaceBuild transaction + t.Run("WorkspaceBuildTransaction", func(t *testing.T) { + t.Parallel() + srv, db, ps, pd := setup(t, false, &overrides{}) + + // Create test data + user := dbgen.User(t, db, database.User{}) + template := dbgen.Template(t, db, database.Template{ + Name: "template", + Provisioner: database.ProvisionerTypeEcho, + OrganizationID: pd.OrganizationID, + }) + file := dbgen.File(t, db, database.File{CreatedBy: user.ID}) + workspaceTable := dbgen.Workspace(t, db, database.WorkspaceTable{ + TemplateID: template.ID, + OwnerID: user.ID, + OrganizationID: pd.OrganizationID, + }) + version := dbgen.TemplateVersion(t, db, database.TemplateVersion{ + OrganizationID: pd.OrganizationID, + TemplateID: uuid.NullUUID{ + UUID: template.ID, + Valid: true, + }, + JobID: uuid.New(), + }) + build := dbgen.WorkspaceBuild(t, db, database.WorkspaceBuild{ + WorkspaceID: workspaceTable.ID, + TemplateVersionID: version.ID, + Transition: database.WorkspaceTransitionStart, + Reason: database.BuildReasonInitiator, + }) + job := dbgen.ProvisionerJob(t, db, ps, database.ProvisionerJob{ + FileID: file.ID, + InitiatorID: user.ID, + Type: database.ProvisionerJobTypeWorkspaceBuild, + Input: must(json.Marshal(provisionerdserver.WorkspaceProvisionJob{ + WorkspaceBuildID: build.ID, + })), + OrganizationID: pd.OrganizationID, + }) + _, err := db.AcquireProvisionerJob(ctx, database.AcquireProvisionerJobParams{ + OrganizationID: pd.OrganizationID, + WorkerID: uuid.NullUUID{ + UUID: pd.ID, + Valid: true, + }, + Types: []database.ProvisionerType{database.ProvisionerTypeEcho}, + }) + require.NoError(t, err) + + // Add a published channel to make sure the workspace event is sent + publishedWorkspace := make(chan struct{}) + closeWorkspaceSubscribe, err := ps.SubscribeWithErr(wspubsub.WorkspaceEventChannel(workspaceTable.OwnerID), + wspubsub.HandleWorkspaceEvent( + func(_ context.Context, e wspubsub.WorkspaceEvent, err error) { + if err != nil { + return + } + if e.Kind == wspubsub.WorkspaceEventKindStateChange && e.WorkspaceID == workspaceTable.ID { + close(publishedWorkspace) + } + })) + require.NoError(t, err) + defer closeWorkspaceSubscribe() + + // The actual test + _, err = srv.CompleteJob(ctx, &proto.CompletedJob{ + JobId: job.ID.String(), + Type: &proto.CompletedJob_WorkspaceBuild_{ + WorkspaceBuild: &proto.CompletedJob_WorkspaceBuild{ + State: []byte{}, + Resources: []*sdkproto.Resource{{ + Name: "test-workspace-resource", + Type: "aws_instance", + }}, + Timings: []*sdkproto.Timing{{ + Stage: "test", + Source: "test-source", + Resource: "test-resource", + Action: "test-action", + Start: timestamppb.Now(), + End: timestamppb.Now(), + }}, + }, + }, + }) + require.NoError(t, err) + + // Wait for workspace notification + select { + case <-publishedWorkspace: + // Success + case <-time.After(testutil.WaitShort): + t.Fatal("Workspace event not published") + } + + // Verify job was marked as completed + completedJob, err := db.GetProvisionerJobByID(ctx, job.ID) + require.NoError(t, err) + require.True(t, completedJob.CompletedAt.Valid, "Job should be marked as completed") + + // Verify resources were created + resources, err := db.GetWorkspaceResourcesByJobID(ctx, job.ID) + require.NoError(t, err) + require.Len(t, resources, 1, "Expected one resource to be created") + require.Equal(t, "test-workspace-resource", resources[0].Name) + + // Verify timings were recorded + timings, err := db.GetProvisionerJobTimingsByJobID(ctx, job.ID) + require.NoError(t, err) + require.Len(t, timings, 1, "Expected one timing entry to be created") + require.Equal(t, "test", string(timings[0].Stage), "Timing stage should match what was sent") + }) + }) + t.Run("TemplateImport_MissingGitAuth", func(t *testing.T) { t.Parallel() srv, db, _, pd := setup(t, false, &overrides{}) diff --git a/coderd/rbac/object_gen.go b/coderd/rbac/object_gen.go index ad1a510fd44bd..f19d90894dd55 100644 --- a/coderd/rbac/object_gen.go +++ b/coderd/rbac/object_gen.go @@ -308,7 +308,9 @@ var ( // Valid Actions // - "ActionApplicationConnect" :: connect to workspace apps via browser // - "ActionCreate" :: create a new workspace + // - "ActionCreateAgent" :: create a new workspace agent // - "ActionDelete" :: delete workspace + // - "ActionDeleteAgent" :: delete an existing workspace agent // - "ActionRead" :: read workspace data to view on the UI // - "ActionSSH" :: ssh into a given workspace // - "ActionWorkspaceStart" :: allows starting a workspace @@ -338,7 +340,9 @@ var ( // Valid Actions // - "ActionApplicationConnect" :: connect to workspace apps via browser // - "ActionCreate" :: create a new workspace + // - "ActionCreateAgent" :: create a new workspace agent // - "ActionDelete" :: delete workspace + // - "ActionDeleteAgent" :: delete an existing workspace agent // - "ActionRead" :: read workspace data to view on the UI // - "ActionSSH" :: ssh into a given workspace // - "ActionWorkspaceStart" :: allows starting a workspace @@ -406,7 +410,9 @@ func AllActions() []policy.Action { policy.ActionApplicationConnect, policy.ActionAssign, policy.ActionCreate, + policy.ActionCreateAgent, policy.ActionDelete, + policy.ActionDeleteAgent, policy.ActionRead, policy.ActionReadPersonal, policy.ActionSSH, diff --git a/coderd/rbac/policy/policy.go b/coderd/rbac/policy/policy.go index c37e84c48f964..160062283f857 100644 --- a/coderd/rbac/policy/policy.go +++ b/coderd/rbac/policy/policy.go @@ -24,6 +24,9 @@ const ( ActionReadPersonal Action = "read_personal" ActionUpdatePersonal Action = "update_personal" + + ActionCreateAgent Action = "create_agent" + ActionDeleteAgent Action = "delete_agent" ) type PermissionDefinition struct { @@ -67,6 +70,9 @@ var workspaceActions = map[Action]ActionDefinition{ // Running a workspace ActionSSH: actDef("ssh into a given workspace"), ActionApplicationConnect: actDef("connect to workspace apps via browser"), + + ActionCreateAgent: actDef("create a new workspace agent"), + ActionDeleteAgent: actDef("delete an existing workspace agent"), } // RBACPermissions is indexed by the type diff --git a/coderd/rbac/roles.go b/coderd/rbac/roles.go index 0b94a74201b16..89f86b567a48d 100644 --- a/coderd/rbac/roles.go +++ b/coderd/rbac/roles.go @@ -272,7 +272,7 @@ func ReloadBuiltinRoles(opts *RoleOptions) { // This adds back in the Workspace permissions. Permissions(map[string][]policy.Action{ ResourceWorkspace.Type: ownerWorkspaceActions, - ResourceWorkspaceDormant.Type: {policy.ActionRead, policy.ActionDelete, policy.ActionCreate, policy.ActionUpdate, policy.ActionWorkspaceStop}, + ResourceWorkspaceDormant.Type: {policy.ActionRead, policy.ActionDelete, policy.ActionCreate, policy.ActionUpdate, policy.ActionWorkspaceStop, policy.ActionCreateAgent, policy.ActionDeleteAgent}, })...), Org: map[string][]Permission{}, User: []Permission{}, @@ -291,7 +291,7 @@ func ReloadBuiltinRoles(opts *RoleOptions) { User: append(allPermsExcept(ResourceWorkspaceDormant, ResourceUser, ResourceOrganizationMember), Permissions(map[string][]policy.Action{ // Reduced permission set on dormant workspaces. No build, ssh, or exec - ResourceWorkspaceDormant.Type: {policy.ActionRead, policy.ActionDelete, policy.ActionCreate, policy.ActionUpdate, policy.ActionWorkspaceStop}, + ResourceWorkspaceDormant.Type: {policy.ActionRead, policy.ActionDelete, policy.ActionCreate, policy.ActionUpdate, policy.ActionWorkspaceStop, policy.ActionCreateAgent, policy.ActionDeleteAgent}, // Users cannot do create/update/delete on themselves, but they // can read their own details. ResourceUser.Type: {policy.ActionRead, policy.ActionReadPersonal, policy.ActionUpdatePersonal}, @@ -412,7 +412,7 @@ func ReloadBuiltinRoles(opts *RoleOptions) { Org: map[string][]Permission{ // Org admins should not have workspace exec perms. organizationID.String(): append(allPermsExcept(ResourceWorkspace, ResourceWorkspaceDormant, ResourceAssignRole), Permissions(map[string][]policy.Action{ - ResourceWorkspaceDormant.Type: {policy.ActionRead, policy.ActionDelete, policy.ActionCreate, policy.ActionUpdate, policy.ActionWorkspaceStop}, + ResourceWorkspaceDormant.Type: {policy.ActionRead, policy.ActionDelete, policy.ActionCreate, policy.ActionUpdate, policy.ActionWorkspaceStop, policy.ActionCreateAgent, policy.ActionDeleteAgent}, ResourceWorkspace.Type: slice.Omit(ResourceWorkspace.AvailableActions(), policy.ActionApplicationConnect, policy.ActionSSH), })...), }, @@ -529,6 +529,16 @@ func ReloadBuiltinRoles(opts *RoleOptions) { ResourceType: ResourceWorkspace.Type, Action: policy.ActionDelete, }, + { + Negate: true, + ResourceType: ResourceWorkspace.Type, + Action: policy.ActionCreateAgent, + }, + { + Negate: true, + ResourceType: ResourceWorkspace.Type, + Action: policy.ActionDeleteAgent, + }, }, }, User: []Permission{}, diff --git a/coderd/rbac/roles_test.go b/coderd/rbac/roles_test.go index 6d42a01474d1a..4dfbc8fa2ab31 100644 --- a/coderd/rbac/roles_test.go +++ b/coderd/rbac/roles_test.go @@ -226,6 +226,15 @@ func TestRolePermissions(t *testing.T) { false: {setOtherOrg, setOrgNotMe, memberMe, templateAdmin, userAdmin}, }, }, + { + Name: "CreateDeleteWorkspaceAgent", + Actions: []policy.Action{policy.ActionCreateAgent, policy.ActionDeleteAgent}, + Resource: rbac.ResourceWorkspace.WithID(workspaceID).InOrg(orgID).WithOwner(currentUser.String()), + AuthorizeMap: map[bool][]hasAuthSubjects{ + true: {owner, orgMemberMe, orgAdmin}, + false: {setOtherOrg, memberMe, userAdmin, templateAdmin, orgTemplateAdmin, orgUserAdmin, orgAuditor, orgMemberMeBanWorkspace}, + }, + }, { Name: "Templates", Actions: []policy.Action{policy.ActionCreate, policy.ActionUpdate, policy.ActionDelete}, @@ -462,7 +471,7 @@ func TestRolePermissions(t *testing.T) { }, { Name: "WorkspaceDormant", - Actions: append(crud, policy.ActionWorkspaceStop), + Actions: append(crud, policy.ActionWorkspaceStop, policy.ActionCreateAgent, policy.ActionDeleteAgent), Resource: rbac.ResourceWorkspaceDormant.WithID(uuid.New()).InOrg(orgID).WithOwner(memberMe.Actor.ID), AuthorizeMap: map[bool][]hasAuthSubjects{ true: {orgMemberMe, orgAdmin, owner}, diff --git a/codersdk/deployment.go b/codersdk/deployment.go index 39b67feb2c73a..e27478bbd1566 100644 --- a/codersdk/deployment.go +++ b/codersdk/deployment.go @@ -807,6 +807,12 @@ type PrebuildsConfig struct { // ReconciliationBackoffLookback determines the time window to look back when calculating // the number of failed prebuilds, which influences the backoff strategy. ReconciliationBackoffLookback serpent.Duration `json:"reconciliation_backoff_lookback" typescript:",notnull"` + + // FailureHardLimit defines the maximum number of consecutive failed prebuild attempts allowed + // before a preset is considered to be in a hard limit state. When a preset hits this limit, + // no new prebuilds will be created until the limit is reset. + // FailureHardLimit is disabled when set to zero. + FailureHardLimit serpent.Int64 `json:"failure_hard_limit" typescript:"failure_hard_limit"` } const ( @@ -3086,6 +3092,17 @@ Write out the current server config as YAML to stdout.`, Annotations: serpent.Annotations{}.Mark(annotationFormatDuration, "true"), Hidden: true, }, + { + Name: "Failure Hard Limit", + Description: "Maximum number of consecutive failed prebuilds before a preset hits the hard limit; disabled when set to zero.", + Flag: "workspace-prebuilds-failure-hard-limit", + Env: "CODER_WORKSPACE_PREBUILDS_FAILURE_HARD_LIMIT", + Value: &c.Prebuilds.FailureHardLimit, + Default: "3", + Group: &deploymentGroupPrebuilds, + YAML: "failure_hard_limit", + Hidden: true, + }, } return opts @@ -3329,14 +3346,71 @@ const ( ExperimentDynamicParameters Experiment = "dynamic-parameters" // Enables dynamic parameters when creating a workspace. ExperimentWorkspacePrebuilds Experiment = "workspace-prebuilds" // Enables the new workspace prebuilds feature. ExperimentAgenticChat Experiment = "agentic-chat" // Enables the new agentic AI chat feature. + ExperimentDevContainers Experiment = "dev-containers" // Enables dev containers support. + ExperimentCoderDesktop Experiment = "coder-desktop" // Enables Coder Desktop functionality. + ExperimentSecuringAgents Experiment = "securing-agents" // Enables features for securing AI agents. +) + +// FeatureStage represents the maturity level of a feature +type FeatureStage string + +const ( + // FeatureStageEarlyAccess indicates a feature that is neither feature-complete nor stable + // Early access features are often disabled by default and not recommended for production use + FeatureStageEarlyAccess FeatureStage = "early access" + + // FeatureStageBeta indicates a feature that is open to the public but still under development + // Beta features might have minor bugs but are generally ready for use + FeatureStageBeta FeatureStage = "beta" ) +// GetStage returns the feature stage of the experiment (early access, beta, or GA) +func (e Experiment) GetStage() FeatureStage { + // Default is early access for experiments + switch e { + case ExperimentAgenticChat: + return FeatureStageBeta + case ExperimentWorkspacePrebuilds: + return FeatureStageBeta + case ExperimentDevContainers: + return FeatureStageEarlyAccess + case ExperimentCoderDesktop: + return FeatureStageBeta + case ExperimentSecuringAgents: + return FeatureStageEarlyAccess + default: + return FeatureStageEarlyAccess + } +} + +// GetDescription returns a user-friendly description for the experiment +func (e Experiment) GetDescription() string { + switch e { + case ExperimentDevContainers: + return "Dev Containers Integration" + case ExperimentAgenticChat: + return "AI Coding Agents" + case ExperimentWorkspacePrebuilds: + return "Prebuilt workspaces" + case ExperimentCoderDesktop: + return "Coder Desktop" + case ExperimentSecuringAgents: + return "Securing AI Agents" + default: + return string(e) + } +} + // ExperimentsSafe should include all experiments that are safe for // users to opt-in to via --experimental='*'. // Experiments that are not ready for consumption by all users should // not be included here and will be essentially hidden. var ExperimentsSafe = Experiments{ ExperimentWorkspacePrebuilds, + ExperimentAgenticChat, + ExperimentDevContainers, + ExperimentCoderDesktop, + ExperimentSecuringAgents, } // Experiments is a list of experiments. diff --git a/codersdk/rbacresources_gen.go b/codersdk/rbacresources_gen.go index 6157281f21356..95792bb8e2a7b 100644 --- a/codersdk/rbacresources_gen.go +++ b/codersdk/rbacresources_gen.go @@ -49,7 +49,9 @@ const ( ActionApplicationConnect RBACAction = "application_connect" ActionAssign RBACAction = "assign" ActionCreate RBACAction = "create" + ActionCreateAgent RBACAction = "create_agent" ActionDelete RBACAction = "delete" + ActionDeleteAgent RBACAction = "delete_agent" ActionRead RBACAction = "read" ActionReadPersonal RBACAction = "read_personal" ActionSSH RBACAction = "ssh" @@ -97,9 +99,9 @@ var RBACResourceActions = map[RBACResource][]RBACAction{ ResourceTemplate: {ActionCreate, ActionDelete, ActionRead, ActionUpdate, ActionUse, ActionViewInsights}, ResourceUser: {ActionCreate, ActionDelete, ActionRead, ActionReadPersonal, ActionUpdate, ActionUpdatePersonal}, ResourceWebpushSubscription: {ActionCreate, ActionDelete, ActionRead}, - ResourceWorkspace: {ActionApplicationConnect, ActionCreate, ActionDelete, ActionRead, ActionSSH, ActionWorkspaceStart, ActionWorkspaceStop, ActionUpdate}, + ResourceWorkspace: {ActionApplicationConnect, ActionCreate, ActionCreateAgent, ActionDelete, ActionDeleteAgent, ActionRead, ActionSSH, ActionWorkspaceStart, ActionWorkspaceStop, ActionUpdate}, ResourceWorkspaceAgentDevcontainers: {ActionCreate}, ResourceWorkspaceAgentResourceMonitor: {ActionCreate, ActionRead, ActionUpdate}, - ResourceWorkspaceDormant: {ActionApplicationConnect, ActionCreate, ActionDelete, ActionRead, ActionSSH, ActionWorkspaceStart, ActionWorkspaceStop, ActionUpdate}, + ResourceWorkspaceDormant: {ActionApplicationConnect, ActionCreate, ActionCreateAgent, ActionDelete, ActionDeleteAgent, ActionRead, ActionSSH, ActionWorkspaceStart, ActionWorkspaceStop, ActionUpdate}, ResourceWorkspaceProxy: {ActionCreate, ActionDelete, ActionRead, ActionUpdate}, } diff --git a/docs/admin/templates/extending-templates/prebuilt-workspaces.md b/docs/admin/templates/extending-templates/prebuilt-workspaces.md index 3fd82d62d1943..57f3dc0b3109f 100644 --- a/docs/admin/templates/extending-templates/prebuilt-workspaces.md +++ b/docs/admin/templates/extending-templates/prebuilt-workspaces.md @@ -142,7 +142,7 @@ To prevent this, add a `lifecycle` block with `ignore_changes`: ```hcl resource "docker_container" "workspace" { lifecycle { - ignore_changes = all + ignore_changes = [env, image] # include all fields which caused drift } count = data.coder_workspace.me.start_count @@ -151,19 +151,8 @@ resource "docker_container" "workspace" { } ``` -For more targeted control, specify which attributes to ignore: - -```hcl -resource "docker_container" "workspace" { - lifecycle { - ignore_changes = [name] - } - - count = data.coder_workspace.me.start_count - name = "coder-${data.coder_workspace_owner.me.name}-${lower(data.coder_workspace.me.name)}" - ... -} -``` +Limit the scope of `ignore_changes` to include only the fields specified in the notification. +If you include too many fields, Terraform might ignore changes that wouldn't otherwise cause drift. Learn more about `ignore_changes` in the [Terraform documentation](https://developer.hashicorp.com/terraform/language/meta-arguments/lifecycle#ignore_changes). diff --git a/docs/install/kubernetes.md b/docs/install/kubernetes.md index 176fc7c452805..92e97e3cf902c 100644 --- a/docs/install/kubernetes.md +++ b/docs/install/kubernetes.md @@ -133,7 +133,7 @@ We support two release channels: mainline and stable - read the helm install coder coder-v2/coder \ --namespace coder \ --values values.yaml \ - --version 2.20.0 + --version 2.22.1 ``` - **Stable** Coder release: diff --git a/docs/install/releases/feature-stages.md b/docs/install/releases/feature-stages.md index 5730a5d76288e..d2879f1e8f3eb 100644 --- a/docs/install/releases/feature-stages.md +++ b/docs/install/releases/feature-stages.md @@ -24,32 +24,37 @@ If you encounter an issue with any Coder feature, please submit a Early access features are neither feature-complete nor stable. We do not recommend using early access features in production deployments. -Coder sometimes releases early access features that are available for use, but are disabled by default. -You shouldn't use early access features in production because they might cause performance or stability issues. -Early access features can be mostly feature-complete, but require further internal testing and remain in the early access stage for at least one month. +Coder sometimes releases early access features that are available for use, but +are disabled by default. You shouldn't use early access features in production +because they might cause performance or stability issues. Early access features +can be mostly feature-complete, but require further internal testing and remain +in the early access stage for at least one month. -Coder may make significant changes or revert features to a feature flag at any time. +Coder may make significant changes or revert features to a feature flag at any +time. If you plan to activate an early access feature, we suggest that you use a staging deployment.
To enable early access features: -Use the [Coder CLI](../../install/cli.md) `--experiments` flag to enable early access features: +Use the [Coder CLI](../../install/cli.md) `--experiments` flag to enable early +access features: - Enable all early access features: - ```shell - coder server --experiments=* - ``` + ```shell + coder server --experiments=* + ``` - Enable multiple early access features: - ```shell - coder server --experiments=feature1,feature2 - ``` + ```shell + coder server --experiments=feature1,feature2 + ``` -You can also use the `CODER_EXPERIMENTS` [environment variable](../../admin/setup/index.md). +You can also use the `CODER_EXPERIMENTS` +[environment variable](../../admin/setup/index.md). You can opt-out of a feature after you've enabled it. @@ -60,7 +65,10 @@ You can opt-out of a feature after you've enabled it. -Currently no experimental features are available in the latest mainline or stable release. +| Feature Flag | Name | Available in | +|-------------------|----------------------------|------------------------| +| `dev-containers` | Dev Containers Integration | main, mainline, stable | +| `securing-agents` | Securing AI Agents | main, mainline, stable | @@ -68,24 +76,44 @@ Currently no experimental features are available in the latest mainline or stabl - **Stable**: No - **Production-ready**: Not fully -- **Support**: Documentation, [Discord](https://discord.gg/coder), and [GitHub issues](https://github.com/coder/coder/issues) +- **Support**: Documentation, [Discord](https://discord.gg/coder), and + [GitHub issues](https://github.com/coder/coder/issues) Beta features are open to the public and are tagged with a `Beta` label. -They’re in active development and subject to minor changes. -They might contain minor bugs, but are generally ready for use. +They're in active development and subject to minor changes. They might contain +minor bugs, but are generally ready for use. -Beta features are often ready for general availability within two-three releases. -You should test beta features in staging environments. -You can use beta features in production, but should set expectations and inform users that some features may be incomplete. +Beta features are often ready for general availability within two-three +releases. You should test beta features in staging environments. You can use +beta features in production, but should set expectations and inform users that +some features may be incomplete. -We keep documentation about beta features up-to-date with the latest information, including planned features, limitations, and workarounds. -If you encounter an issue, please contact your [Coder account team](https://coder.com/contact), reach out on [Discord](https://discord.gg/coder), or create a [GitHub issues](https://github.com/coder/coder/issues) if there isn't one already. -While we will do our best to provide support with beta features, most issues will be escalated to the product team. -Beta features are not covered within service-level agreements (SLA). +We keep documentation about beta features up-to-date with the latest +information, including planned features, limitations, and workarounds. If you +encounter an issue, please contact your +[Coder account team](https://coder.com/contact), reach out on +[Discord](https://discord.gg/coder), or create a +[GitHub issues](https://github.com/coder/coder/issues) if there isn't one +already. While we will do our best to provide support with beta features, most +issues will be escalated to the product team. Beta features are not covered +within service-level agreements (SLA). -Most beta features are enabled by default. -Beta features are announced through the [Coder Changelog](https://coder.com/changelog), and more information is available in the documentation. +Most beta features are enabled by default. Beta features are announced through +the [Coder Changelog](https://coder.com/changelog), and more information is +available in the documentation. + +### Beta features + + + +| Feature Flag | Name | Available in | +|-----------------------|---------------------|------------------------| +| `agentic-chat` | AI Coding Agents | main, mainline, stable | +| `coder-desktop` | Coder Desktop | main, mainline, stable | +| `workspace-prebuilds` | Prebuilt workspaces | main, mainline, stable | + + ## General Availability (GA) @@ -93,16 +121,25 @@ Beta features are announced through the [Coder Changelog](https://coder.com/chan - **Production-ready**: Yes - **Support**: Yes, [based on license](https://coder.com/pricing). -All features that are not explicitly tagged as `Early access` or `Beta` are considered generally available (GA). -They have been tested, are stable, and are enabled by default. +All features that are not explicitly tagged as `Early access` or `Beta` are +considered generally available (GA). They have been tested, are stable, and are +enabled by default. -If your Coder license includes an SLA, please consult it for an outline of specific expectations. +If your Coder license includes an SLA, please consult it for an outline of +specific expectations. -For support, consult our knowledgeable and growing community on [Discord](https://discord.gg/coder), or create a [GitHub issue](https://github.com/coder/coder/issues) if one doesn't exist already. -Customers with a valid Coder license, can submit a support request or contact your [account team](https://coder.com/contact). +For support, consult our knowledgeable and growing community on +[Discord](https://discord.gg/coder), or create a +[GitHub issue](https://github.com/coder/coder/issues) if one doesn't exist +already. Customers with a valid Coder license, can submit a support request or +contact your [account team](https://coder.com/contact). -We intend [Coder documentation](../../README.md) to be the [single source of truth](https://en.wikipedia.org/wiki/Single_source_of_truth) and all features should have some form of complete documentation that outlines how to use or implement a feature. -If you discover an error or if you have a suggestion that could improve the documentation, please [submit a GitHub issue](https://github.com/coder/internal/issues/new?title=request%28docs%29%3A+request+title+here&labels=["customer-feedback","docs"]&body=please+enter+your+request+here). +We intend [Coder documentation](../../README.md) to be the +[single source of truth](https://en.wikipedia.org/wiki/Single_source_of_truth) +and all features should have some form of complete documentation that outlines +how to use or implement a feature. If you discover an error or if you have a +suggestion that could improve the documentation, please +[submit a GitHub issue](https://github.com/coder/internal/issues/new?title=request%28docs%29%3A+request+title+here&labels=["customer-feedback","docs"]&body=please+enter+your+request+here). -Some GA features can be disabled for air-gapped deployments. -Consult the feature's documentation or submit a support ticket for assistance. +Some GA features can be disabled for air-gapped deployments. Consult the +feature's documentation or submit a support ticket for assistance. diff --git a/docs/reference/api/general.md b/docs/reference/api/general.md index c14c317066a39..12454145569bb 100644 --- a/docs/reference/api/general.md +++ b/docs/reference/api/general.md @@ -533,6 +533,7 @@ curl -X GET http://coder-server:8080/api/v2/deployment/config \ "wildcard_access_url": "string", "workspace_hostname_suffix": "string", "workspace_prebuilds": { + "failure_hard_limit": 0, "reconciliation_backoff_interval": 0, "reconciliation_backoff_lookback": 0, "reconciliation_interval": 0 diff --git a/docs/reference/api/members.md b/docs/reference/api/members.md index a58a597d1ea2a..6b5d124753bc0 100644 --- a/docs/reference/api/members.md +++ b/docs/reference/api/members.md @@ -169,7 +169,9 @@ Status Code **200** | `action` | `application_connect` | | `action` | `assign` | | `action` | `create` | +| `action` | `create_agent` | | `action` | `delete` | +| `action` | `delete_agent` | | `action` | `read` | | `action` | `read_personal` | | `action` | `ssh` | @@ -336,7 +338,9 @@ Status Code **200** | `action` | `application_connect` | | `action` | `assign` | | `action` | `create` | +| `action` | `create_agent` | | `action` | `delete` | +| `action` | `delete_agent` | | `action` | `read` | | `action` | `read_personal` | | `action` | `ssh` | @@ -503,7 +507,9 @@ Status Code **200** | `action` | `application_connect` | | `action` | `assign` | | `action` | `create` | +| `action` | `create_agent` | | `action` | `delete` | +| `action` | `delete_agent` | | `action` | `read` | | `action` | `read_personal` | | `action` | `ssh` | @@ -639,7 +645,9 @@ Status Code **200** | `action` | `application_connect` | | `action` | `assign` | | `action` | `create` | +| `action` | `create_agent` | | `action` | `delete` | +| `action` | `delete_agent` | | `action` | `read` | | `action` | `read_personal` | | `action` | `ssh` | @@ -997,7 +1005,9 @@ Status Code **200** | `action` | `application_connect` | | `action` | `assign` | | `action` | `create` | +| `action` | `create_agent` | | `action` | `delete` | +| `action` | `delete_agent` | | `action` | `read` | | `action` | `read_personal` | | `action` | `ssh` | diff --git a/docs/reference/api/schemas.md b/docs/reference/api/schemas.md index 9325d751bc352..9499899242106 100644 --- a/docs/reference/api/schemas.md +++ b/docs/reference/api/schemas.md @@ -2704,6 +2704,7 @@ CreateWorkspaceRequest provides options for creating a new workspace. Only one o "wildcard_access_url": "string", "workspace_hostname_suffix": "string", "workspace_prebuilds": { + "failure_hard_limit": 0, "reconciliation_backoff_interval": 0, "reconciliation_backoff_lookback": 0, "reconciliation_interval": 0 @@ -3202,6 +3203,7 @@ CreateWorkspaceRequest provides options for creating a new workspace. Only one o "wildcard_access_url": "string", "workspace_hostname_suffix": "string", "workspace_prebuilds": { + "failure_hard_limit": 0, "reconciliation_backoff_interval": 0, "reconciliation_backoff_lookback": 0, "reconciliation_interval": 0 @@ -3377,6 +3379,9 @@ CreateWorkspaceRequest provides options for creating a new workspace. Only one o | `dynamic-parameters` | | `workspace-prebuilds` | | `agentic-chat` | +| `dev-containers` | +| `coder-desktop` | +| `securing-agents` | ## codersdk.ExternalAuth @@ -5261,6 +5266,7 @@ Git clone makes use of this by parsing the URL from: 'Username for "https://gith ```json { + "failure_hard_limit": 0, "reconciliation_backoff_interval": 0, "reconciliation_backoff_lookback": 0, "reconciliation_interval": 0 @@ -5269,11 +5275,12 @@ Git clone makes use of this by parsing the URL from: 'Username for "https://gith ### Properties -| Name | Type | Required | Restrictions | Description | -|-----------------------------------|---------|----------|--------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------| -| `reconciliation_backoff_interval` | integer | false | | Reconciliation backoff interval specifies the amount of time to increase the backoff interval when errors occur during reconciliation. | -| `reconciliation_backoff_lookback` | integer | false | | Reconciliation backoff lookback determines the time window to look back when calculating the number of failed prebuilds, which influences the backoff strategy. | -| `reconciliation_interval` | integer | false | | Reconciliation interval defines how often the workspace prebuilds state should be reconciled. | +| Name | Type | Required | Restrictions | Description | +|-----------------------------------|---------|----------|--------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| +| `failure_hard_limit` | integer | false | | Failure hard limit defines the maximum number of consecutive failed prebuild attempts allowed before a preset is considered to be in a hard limit state. When a preset hits this limit, no new prebuilds will be created until the limit is reset. FailureHardLimit is disabled when set to zero. | +| `reconciliation_backoff_interval` | integer | false | | Reconciliation backoff interval specifies the amount of time to increase the backoff interval when errors occur during reconciliation. | +| `reconciliation_backoff_lookback` | integer | false | | Reconciliation backoff lookback determines the time window to look back when calculating the number of failed prebuilds, which influences the backoff strategy. | +| `reconciliation_interval` | integer | false | | Reconciliation interval defines how often the workspace prebuilds state should be reconciled. | ## codersdk.Preset @@ -5913,7 +5920,9 @@ Git clone makes use of this by parsing the URL from: 'Username for "https://gith | `application_connect` | | `assign` | | `create` | +| `create_agent` | | `delete` | +| `delete_agent` | | `read` | | `read_personal` | | `ssh` | diff --git a/docs/user-guides/workspace-access/remote-desktops.md b/docs/user-guides/workspace-access/remote-desktops.md index ef8488f5889ff..2fe512b686763 100644 --- a/docs/user-guides/workspace-access/remote-desktops.md +++ b/docs/user-guides/workspace-access/remote-desktops.md @@ -47,6 +47,38 @@ Or use your favorite RDP client to connect to `localhost:3399`. The default username is `Administrator` and password is `coderRDP!`. +### Coder Desktop URI Handling (Beta) + +[Coder Desktop](../desktop) can use a URI handler to directly launch an RDP session without setting up port-forwarding. +The URI format is: + +```text +coder:///v0/open/ws//agent//rdp?username=&password= +``` + +For example: + +```text +coder://coder.example.com/v0/open/ws/myworkspace/agent/main/rdp?username=Administrator&password=coderRDP! +``` + +To include a Coder Desktop button to the workspace dashboard page, add a `coder_app` resource to the template: + +```tf +locals { + server_name = regex("https?:\\/\\/([^\\/]+)", data.coder_workspace.me.access_url)[0] +} + +resource "coder_app" "rdp-coder-desktop" { + agent_id = resource.coder_agent.main.id + slug = "rdp-desktop" + display_name = "RDP with Coder Desktop" + url = "coder://${local.server_name}/v0/open/ws/${data.coder_workspace.me.name}/agent/main/rdp?username=Administrator&password=coderRDP!" + icon = "/icon/desktop.svg" + external = true +} +``` + ## RDP Web Our [WebRDP](https://registry.coder.com/modules/windows-rdp) module in the Coder diff --git a/enterprise/coderd/prebuilds/reconcile.go b/enterprise/coderd/prebuilds/reconcile.go index f9588a5d7cacb..7796e43777951 100644 --- a/enterprise/coderd/prebuilds/reconcile.go +++ b/enterprise/coderd/prebuilds/reconcile.go @@ -313,6 +313,7 @@ func (c *StoreReconciler) SnapshotState(ctx context.Context, store database.Stor if len(presetsWithPrebuilds) == 0 { return nil } + allRunningPrebuilds, err := db.GetRunningPrebuiltWorkspaces(ctx) if err != nil { return xerrors.Errorf("failed to get running prebuilds: %w", err) @@ -328,7 +329,18 @@ func (c *StoreReconciler) SnapshotState(ctx context.Context, store database.Stor return xerrors.Errorf("failed to get backoffs for presets: %w", err) } - state = prebuilds.NewGlobalSnapshot(presetsWithPrebuilds, allRunningPrebuilds, allPrebuildsInProgress, presetsBackoff) + hardLimitedPresets, err := db.GetPresetsAtFailureLimit(ctx, c.cfg.FailureHardLimit.Value()) + if err != nil { + return xerrors.Errorf("failed to get hard limited presets: %w", err) + } + + state = prebuilds.NewGlobalSnapshot( + presetsWithPrebuilds, + allRunningPrebuilds, + allPrebuildsInProgress, + presetsBackoff, + hardLimitedPresets, + ) return nil }, &database.TxOptions{ Isolation: sql.LevelRepeatableRead, // This mirrors the MVCC snapshotting Postgres does when using CTEs @@ -349,19 +361,45 @@ func (c *StoreReconciler) ReconcilePreset(ctx context.Context, ps prebuilds.Pres slog.F("preset_name", ps.Preset.Name), ) + // If the preset was previously hard-limited, log it and exit early. + if ps.Preset.PrebuildStatus == database.PrebuildStatusHardLimited { + logger.Warn(ctx, "skipping hard limited preset") + return nil + } + + // If the preset reached the hard failure limit for the first time during this iteration: + // - Mark it as hard-limited in the database + // - Send notifications to template admins + if ps.IsHardLimited { + logger.Warn(ctx, "skipping hard limited preset") + + err := c.store.UpdatePresetPrebuildStatus(ctx, database.UpdatePresetPrebuildStatusParams{ + Status: database.PrebuildStatusHardLimited, + PresetID: ps.Preset.ID, + }) + if err != nil { + return xerrors.Errorf("failed to update preset prebuild status: %w", err) + } + + err = c.notifyPrebuildFailureLimitReached(ctx, ps) + if err != nil { + logger.Error(ctx, "failed to notify that number of prebuild failures reached the limit", slog.Error(err)) + return nil + } + + return nil + } + state := ps.CalculateState() actions, err := c.CalculateActions(ctx, ps) if err != nil { - logger.Error(ctx, "failed to calculate actions for preset", slog.Error(err), slog.F("preset_id", ps.Preset.ID)) + logger.Error(ctx, "failed to calculate actions for preset", slog.Error(err)) return nil } // Nothing has to be done. if !ps.Preset.UsingActiveVersion && actions.IsNoop() { - logger.Debug(ctx, "skipping reconciliation for preset - nothing has to be done", - slog.F("template_id", ps.Preset.TemplateID.String()), slog.F("template_name", ps.Preset.TemplateName), - slog.F("template_version_id", ps.Preset.TemplateVersionID.String()), slog.F("template_version_name", ps.Preset.TemplateVersionName), - slog.F("preset_id", ps.Preset.ID.String()), slog.F("preset_name", ps.Preset.Name)) + logger.Debug(ctx, "skipping reconciliation for preset - nothing has to be done") return nil } @@ -442,6 +480,49 @@ func (c *StoreReconciler) ReconcilePreset(ctx context.Context, ps prebuilds.Pres } } +func (c *StoreReconciler) notifyPrebuildFailureLimitReached(ctx context.Context, ps prebuilds.PresetSnapshot) error { + // nolint:gocritic // Necessary to query all the required data. + ctx = dbauthz.AsSystemRestricted(ctx) + + // Send notification to template admins. + if c.notifEnq == nil { + c.logger.Warn(ctx, "notification enqueuer not set, cannot send prebuild is hard limited notification(s)") + return nil + } + + templateAdmins, err := c.store.GetUsers(ctx, database.GetUsersParams{ + RbacRole: []string{codersdk.RoleTemplateAdmin}, + }) + if err != nil { + return xerrors.Errorf("fetch template admins: %w", err) + } + + for _, templateAdmin := range templateAdmins { + if _, err := c.notifEnq.EnqueueWithData(ctx, templateAdmin.ID, notifications.PrebuildFailureLimitReached, + map[string]string{ + "org": ps.Preset.OrganizationName, + "template": ps.Preset.TemplateName, + "template_version": ps.Preset.TemplateVersionName, + "preset": ps.Preset.Name, + }, + map[string]any{}, + "prebuilds_reconciler", + // Associate this notification with all the related entities. + ps.Preset.TemplateID, ps.Preset.TemplateVersionID, ps.Preset.ID, ps.Preset.OrganizationID, + ); err != nil { + c.logger.Error(ctx, + "failed to send notification", + slog.Error(err), + slog.F("template_admin_id", templateAdmin.ID.String()), + ) + + continue + } + } + + return nil +} + func (c *StoreReconciler) CalculateActions(ctx context.Context, snapshot prebuilds.PresetSnapshot) (*prebuilds.ReconciliationActions, error) { if ctx.Err() != nil { return nil, ctx.Err() diff --git a/enterprise/coderd/prebuilds/reconcile_test.go b/enterprise/coderd/prebuilds/reconcile_test.go index 660b1733e6cc9..f52a77ca500b9 100644 --- a/enterprise/coderd/prebuilds/reconcile_test.go +++ b/enterprise/coderd/prebuilds/reconcile_test.go @@ -654,6 +654,131 @@ func TestDeletionOfPrebuiltWorkspaceWithInvalidPreset(t *testing.T) { require.Equal(t, database.WorkspaceTransitionDelete, builds[0].Transition) } +func TestSkippingHardLimitedPresets(t *testing.T) { + t.Parallel() + + if !dbtestutil.WillUsePostgres() { + t.Skip("This test requires postgres") + } + + // Test cases verify the behavior of prebuild creation depending on configured failure limits. + testCases := []struct { + name string + hardLimit int64 + isHardLimitHit bool + }{ + { + name: "hard limit is hit - skip creation of prebuilt workspace", + hardLimit: 1, + isHardLimitHit: true, + }, + { + name: "hard limit is not hit - try to create prebuilt workspace again", + hardLimit: 2, + isHardLimitHit: false, + }, + } + + for _, tc := range testCases { + t.Run(tc.name, func(t *testing.T) { + t.Parallel() + + templateDeleted := false + + clock := quartz.NewMock(t) + ctx := testutil.Context(t, testutil.WaitShort) + cfg := codersdk.PrebuildsConfig{ + FailureHardLimit: serpent.Int64(tc.hardLimit), + ReconciliationBackoffInterval: 0, + } + logger := slogtest.Make( + t, &slogtest.Options{IgnoreErrors: true}, + ).Leveled(slog.LevelDebug) + db, pubSub := dbtestutil.NewDB(t) + fakeEnqueuer := newFakeEnqueuer() + controller := prebuilds.NewStoreReconciler(db, pubSub, cfg, logger, clock, prometheus.NewRegistry(), fakeEnqueuer) + + // Template admin to receive a notification. + templateAdmin := dbgen.User(t, db, database.User{ + RBACRoles: []string{codersdk.RoleTemplateAdmin}, + }) + + // Set up test environment with a template, version, and preset. + ownerID := uuid.New() + dbgen.User(t, db, database.User{ + ID: ownerID, + }) + org, template := setupTestDBTemplate(t, db, ownerID, templateDeleted) + templateVersionID := setupTestDBTemplateVersion(ctx, t, clock, db, pubSub, org.ID, ownerID, template.ID) + preset := setupTestDBPreset(t, db, templateVersionID, 1, uuid.New().String()) + + // Create a failed prebuild workspace that counts toward the hard failure limit. + setupTestDBPrebuild( + t, + clock, + db, + pubSub, + database.WorkspaceTransitionStart, + database.ProvisionerJobStatusFailed, + org.ID, + preset, + template.ID, + templateVersionID, + ) + + // Verify initial state: one failed workspace exists. + workspaces, err := db.GetWorkspacesByTemplateID(ctx, template.ID) + require.NoError(t, err) + workspaceCount := len(workspaces) + require.Equal(t, 1, workspaceCount) + + // We simulate a failed prebuild in the test; Consequently, the backoff mechanism is triggered when ReconcileAll is called. + // Even though ReconciliationBackoffInterval is set to zero, we still need to advance the clock by at least one nanosecond. + clock.Advance(time.Nanosecond).MustWait(ctx) + + // Trigger reconciliation to attempt creating a new prebuild. + // The outcome depends on whether the hard limit has been reached. + require.NoError(t, controller.ReconcileAll(ctx)) + + // These two additional calls to ReconcileAll should not trigger any notifications. + // A notification is only sent once. + require.NoError(t, controller.ReconcileAll(ctx)) + require.NoError(t, controller.ReconcileAll(ctx)) + + // Verify the final state after reconciliation. + workspaces, err = db.GetWorkspacesByTemplateID(ctx, template.ID) + require.NoError(t, err) + updatedPreset, err := db.GetPresetByID(ctx, preset.ID) + require.NoError(t, err) + + if !tc.isHardLimitHit { + // When hard limit is not reached, a new workspace should be created. + require.Equal(t, 2, len(workspaces)) + require.Equal(t, database.PrebuildStatusHealthy, updatedPreset.PrebuildStatus) + return + } + + // When hard limit is reached, no new workspace should be created. + require.Equal(t, 1, len(workspaces)) + require.Equal(t, database.PrebuildStatusHardLimited, updatedPreset.PrebuildStatus) + + // When hard limit is reached, a notification should be sent. + matching := fakeEnqueuer.Sent(func(notification *notificationstest.FakeNotification) bool { + if !assert.Equal(t, notifications.PrebuildFailureLimitReached, notification.TemplateID, "unexpected template") { + return false + } + + if !assert.Equal(t, templateAdmin.ID, notification.UserID, "unexpected receiver") { + return false + } + + return true + }) + require.Len(t, matching, 1) + }) + } +} + func TestRunLoop(t *testing.T) { t.Parallel() diff --git a/scripts/release/docs_update_experiments.sh b/scripts/release/docs_update_experiments.sh index 1e5e6d1eb6b3e..6d44d245b48ed 100755 --- a/scripts/release/docs_update_experiments.sh +++ b/scripts/release/docs_update_experiments.sh @@ -2,182 +2,317 @@ # Usage: ./docs_update_experiments.sh # -# This script updates the available experimental features in the documentation. -# It fetches the latest mainline and stable releases to extract the available -# experiments and their descriptions. The script will update the -# feature-stages.md file with a table of the latest experimental features. +# This script updates the following sections in the documentation: +# - Available experimental features from ExperimentsSafe in codersdk/deployment.go +# - Early access features from GetStage() in codersdk/deployment.go +# - Beta features from GetStage() in codersdk/deployment.go +# +# The script will update feature-stages.md with tables for each section. +# For each feature, it also checks which versions it's available in (stable, mainline). set -euo pipefail # shellcheck source=scripts/lib.sh source "$(dirname "${BASH_SOURCE[0]}")/../lib.sh" cdroot +# Try to get GitHub token if not set +if [[ -z "${GITHUB_TOKEN:-}" ]]; then + if GITHUB_TOKEN="$(gh auth token 2>/dev/null)"; then + export GITHUB_TOKEN + GH_AVAILABLE=true + else + log "Warning: GitHub token not found. Only checking local version." + GH_AVAILABLE=false + fi +else + GH_AVAILABLE=true +fi + if isdarwin; then dependencies gsed gawk sed() { gsed "$@"; } awk() { gawk "$@"; } fi -# From install.sh +# Functions to get version information echo_latest_stable_version() { - # https://gist.github.com/lukechilds/a83e1d7127b78fef38c2914c4ececc3c#gistcomment-2758860 - version="$(curl -fsSLI -o /dev/null -w "%{url_effective}" https://github.com/coder/coder/releases/latest)" + if [[ "${GH_AVAILABLE}" == "false" ]]; then + echo "stable" + return + fi + + # Try to get latest stable version, fallback to "stable" if it fails + version=$(curl -fsSLI -o /dev/null -w "%{url_effective}" https://github.com/coder/coder/releases/latest 2>/dev/null || echo "error") + if [[ "${version}" == "error" || -z "${version}" ]]; then + log "Warning: Failed to fetch latest stable version. Using 'stable' as placeholder." + echo "stable" + return + fi + version="${version#https://github.com/coder/coder/releases/tag/v}" echo "v${version}" } echo_latest_mainline_version() { - # Fetch the releases from the GitHub API, sort by version number, - # and take the first result. Note that we're sorting by space- - # separated numbers and without utilizing the sort -V flag for the - # best compatibility. - echo "v$( - curl -fsSL https://api.github.com/repos/coder/coder/releases | - awk -F'"' '/"tag_name"/ {print $4}' | - tr -d v | - tr . ' ' | - sort -k1,1nr -k2,2nr -k3,3nr | - head -n1 | - tr ' ' . - )" -} + if [[ "${GH_AVAILABLE}" == "false" ]]; then + echo "mainline" + return + fi + + # Try to get the latest mainline version, fallback to "mainline" if it fails + local version + version=$(curl -fsSL -H "Authorization: token ${GITHUB_TOKEN}" https://api.github.com/repos/coder/coder/releases 2>/dev/null | + awk -F'"' '/"tag_name"/ {print $4}' | + tr -d v | + tr . ' ' | + sort -k1,1nr -k2,2nr -k3,3nr | + head -n1 | + tr ' ' . || echo "") -# For testing or including experiments from `main`. -echo_latest_main_version() { - echo origin/main + if [[ -z "${version}" ]]; then + log "Warning: Failed to fetch latest mainline version. Using 'mainline' as placeholder." + echo "mainline" + return + fi + + echo "v${version}" } +# Simplified function - we're no longer actually cloning the repo +# This is kept to maintain compatibility with the rest of the script structure sparse_clone_codersdk() { - mkdir -p "${1}" - cd "${1}" - rm -rf "${2}" - git clone --quiet --no-checkout "${PROJECT_ROOT}" "${2}" - cd "${2}" - git sparse-checkout set --no-cone codersdk - git checkout "${3}" -- codersdk + if [[ "${GH_AVAILABLE}" == "false" ]]; then + # Skip cloning if GitHub isn't available + echo "" + return + fi + + # Always return success with a placeholder directory echo "${1}/${2}" } -parse_all_experiments() { - # Go doc doesn't include inline array comments, so this parsing should be - # good enough. We remove all whitespaces so that we can extract a plain - # string that looks like {}, {ExpA}, or {ExpA,ExpB,}. - # - # Example: ExperimentsAll=Experiments{ExperimentNotifications,ExperimentAutoFillParameters,} - go doc -all -C "${dir}" ./codersdk ExperimentsAll | - tr -d $'\n\t ' | - grep -E -o 'ExperimentsAll=Experiments\{[^}]*\}' | - sed -e 's/.*{\(.*\)}.*/\1/' | - tr ',' '\n' +# Extract feature information from the local deployment.go +extract_local_experiment_info() { + # Extract the experiment descriptions, stages, and doc paths using Go + cat >/tmp/extract_experiment_info.go <<'EOT' +package main + +import ( + "encoding/json" + "os" + + "github.com/coder/coder/v2/codersdk" +) + +func main() { + experiments := []struct { + Name string `json:"name"` + Value string `json:"value"` + Description string `json:"description"` + Stage string `json:"stage"` + }{} + + // Get experiments from ExperimentsSafe + for _, exp := range codersdk.ExperimentsSafe { + experiments = append(experiments, struct { + Name string `json:"name"` + Value string `json:"value"` + Description string `json:"description"` + Stage string `json:"stage"` + }{ + Name: string(exp), + Value: string(exp), + Description: exp.GetDescription(), + Stage: string(exp.GetStage()), + }) + } + + json.NewEncoder(os.Stdout).Encode(experiments) } +EOT -parse_experiments() { - # Extracts the experiment name and description from the Go doc output. - # The output is in the format: - # - # ||Add new experiments here! - # ExperimentExample|example|This isn't used for anything. - # ExperimentAutoFillParameters|auto-fill-parameters|This should not be taken out of experiments until we have redesigned the feature. - # ExperimentMultiOrganization|multi-organization|Requires organization context for interactions, default org is assumed. - # ExperimentCustomRoles|custom-roles|Allows creating runtime custom roles. - # ExperimentNotifications|notifications|Sends notifications via SMTP and webhooks following certain events. - # ExperimentWorkspaceUsage|workspace-usage|Enables the new workspace usage tracking. - # ||ExperimentTest is an experiment with - # ||a preceding multi line comment!? - # ExperimentTest|test| - # - go doc -all -C "${1}" ./codersdk Experiment | - sed \ - -e 's/\t\(Experiment[^ ]*\)\ \ *Experiment = "\([^"]*\)"\(.*\/\/ \(.*\)\)\?/\1|\2|\4/' \ - -e 's/\t\/\/ \(.*\)/||\1/' | - grep '|' + # Run the Go code to extract the information + go run /tmp/extract_experiment_info.go + rm -f /tmp/extract_experiment_info.go +} + +# Extract experiment info from a specific version +extract_version_experiment_info() { + local dir=$1 + local version=$2 + + if [[ "${GH_AVAILABLE}" == "false" || -z "${dir}" ]]; then + # If GitHub isn't available, just set all features to the same version + extract_local_experiment_info | jq --arg version "${version}" '[.[] | . + {"versions": [$version]}]' + return + fi + + # For simplicity and stability, let's just use the local experiments + # and mark them as available in the specified version. + # This avoids the complex Go module replacement that can be error-prone + extract_local_experiment_info | jq --arg version "${version}" '[.[] | . + {"versions": [$version]}]' +} + +# Combine information from all versions +combine_experiment_info() { + local workdir=$1 + local stable_version=$2 + local mainline_version=$3 + + # Extract information from different versions + local local_info stable_info mainline_info + local_info=$(extract_local_experiment_info) + + if [[ "${GH_AVAILABLE}" == "true" ]]; then + # Create sparse clones and extract info + local stable_dir mainline_dir + + stable_dir=$(sparse_clone_codersdk "${workdir}" "stable" "${stable_version}") + if [[ -n "${stable_dir}" ]]; then + stable_info=$(extract_version_experiment_info "${stable_dir}" "stable") + else + # Fallback if sparse clone failed + stable_info=$(extract_local_experiment_info | jq '[.[] | . + {"versions": ["stable"]}]') + fi + + mainline_dir=$(sparse_clone_codersdk "${workdir}" "mainline" "${mainline_version}") + if [[ -n "${mainline_dir}" ]]; then + mainline_info=$(extract_version_experiment_info "${mainline_dir}" "mainline") + else + # Fallback if sparse clone failed + mainline_info=$(extract_local_experiment_info | jq '[.[] | . + {"versions": ["mainline"]}]') + fi + + # Cleanup + rm -rf "${workdir}" + else + # If GitHub isn't available, just mark everything as available in all versions + stable_info=$(extract_local_experiment_info | jq '[.[] | . + {"versions": ["stable"]}]') + mainline_info=$(extract_local_experiment_info | jq '[.[] | . + {"versions": ["mainline"]}]') + fi + + # Add 'main' version to local info + local_info=$(echo "${local_info}" | jq '[.[] | . + {"versions": ["main"]}]') + + # Combine all info + echo '[]' | jq \ + --argjson local "${local_info}" \ + --argjson stable "${stable_info:-[]}" \ + --argjson mainline "${mainline_info:-[]}" \ + ' + ($local + $stable + $mainline) | + group_by(.value) | + map({ + name: .[0].name, + value: .[0].value, + description: .[0].description, + stage: .[0].stage, + versions: map(.versions[0]) | unique | sort + }) + ' +} + +# Generate the early access features table +generate_experiments_table() { + local experiment_info=$1 + + echo "| Feature Flag | Name | Available in |" + echo "|-------------|------|--------------|" + + echo "${experiment_info}" | jq -r '.[] | select(.stage=="early access") | "| `\(.value)` | \(.description) | \(.versions | join(", ")) |"' +} + +# Generate the beta features table +generate_beta_table() { + local experiment_info=$1 + + echo "| Feature Flag | Name | Available in |" + echo "|-------------|------|--------------|" + + echo "${experiment_info}" | jq -r '.[] | select(.stage=="beta") | "| `\(.value)` | \(.description) | \(.versions | join(", ")) |"' } workdir=build/docs/experiments dest=docs/install/releases/feature-stages.md -log "Updating available experimental features in ${dest}" +log "Updating feature stages documentation in ${dest}" -declare -A experiments=() experiment_tags=() +# Get versions +stable_version=$(echo_latest_stable_version) +mainline_version=$(echo_latest_mainline_version) -for channel in mainline stable; do - log "Fetching experiments from ${channel}" +log "Checking features for versions: main, ${mainline_version}, ${stable_version}" - tag=$(echo_latest_"${channel}"_version) - dir="$(sparse_clone_codersdk "${workdir}" "${channel}" "${tag}")" +# Get combined experiment information across versions +experiment_info=$(combine_experiment_info "${workdir}" "${stable_version}" "${mainline_version}") - declare -A all_experiments=() - all_experiments_out="$(parse_all_experiments "${dir}")" - if [[ -n "${all_experiments_out}" ]]; then - readarray -t all_experiments_tmp <<<"${all_experiments_out}" - for exp in "${all_experiments_tmp[@]}"; do - all_experiments[$exp]=1 - done - fi +# Generate the tables +experiments_table=$(generate_experiments_table "${experiment_info}") +beta_table=$(generate_beta_table "${experiment_info}") - # Track preceding/multiline comments. - maybe_desc= +# Create temporary files with the new content +cat >/tmp/ea_content.md < - while read -r line; do - line=${line//$'\n'/} - readarray -d '|' -t parts <<<"$line" +${experiments_table} - # Missing var/key, this is a comment or description. - if [[ -z ${parts[0]} ]]; then - maybe_desc+="${parts[2]//$'\n'/ }" - continue - fi + +EOT - var="${parts[0]}" - key="${parts[1]}" - desc="${parts[2]}" - desc=${desc//$'\n'/} +cat >/tmp/beta_content.md < - # If desc (trailing comment) is empty, use the preceding/multiline comment. - if [[ -z "${desc}" ]]; then - desc="${maybe_desc% }" - fi - maybe_desc= +${beta_table} - # Skip experiments not listed in ExperimentsAll. - if [[ ! -v all_experiments[$var] ]]; then - log "Skipping ${var}, not listed in ExperimentsAll" - continue - fi + +EOT - # Don't overwrite desc, prefer first come, first served (i.e. mainline > stable). - if [[ ! -v experiments[$key] ]]; then - experiments[$key]="$desc" - fi +# Use awk to replace the sections +awk ' + BEGIN { + ea = 0; beta = 0; + while (getline < "/tmp/ea_content.md") ea_lines[++ea] = $0; + while (getline < "/tmp/beta_content.md") beta_lines[++beta] = $0; + ea = beta = 0; + } + + // { + for (i = 1; i <= length(ea_lines); i++) print ea_lines[i]; + ea = 1; + next; + } + + // { + ea = 0; + next; + } + + // { + for (i = 1; i <= length(beta_lines); i++) print beta_lines[i]; + beta = 1; + next; + } + + // { + beta = 0; + next; + } + + # Skip lines between markers + (ea || beta) { next; } + + # Print all other lines + { print; } +' "${dest}" >"${dest}.new" - # Track the release channels where the experiment is available. - experiment_tags[$key]+="${channel}, " - done < <(parse_experiments "${dir}") -done +# Move the new file into place +mv "${dest}.new" "${dest}" -table="$( - if [[ "${#experiments[@]}" -eq 0 ]]; then - echo "Currently no experimental features are available in the latest mainline or stable release." - exit 0 - fi +# Clean up temporary files +rm -f /tmp/ea_content.md /tmp/beta_content.md + +# Clean up backup files +rm -f "${dest}.bak" - echo "| Feature | Description | Available in |" - echo "|---------|-------------|--------------|" - for key in "${!experiments[@]}"; do - desc=${experiments[$key]} - tags=${experiment_tags[$key]%, } - echo "| \`$key\` | $desc | ${tags} |" - done -)" - -# Use awk to print everything outside the BEING/END block and insert the -# table in between. -awk \ - -v table="${table}" \ - 'BEGIN{include=1} /BEGIN: available-experimental-features/{print; print table; include=0} /END: available-experimental-features/{include=1} include' \ - "${dest}" \ - >"${dest}".tmp -mv "${dest}".tmp "${dest}" - -# Format the file for a pretty table (target single file for speed). +# Format the file with prettier (cd site && pnpm exec prettier --cache --write ../"${dest}") diff --git a/site/src/api/rbacresourcesGenerated.ts b/site/src/api/rbacresourcesGenerated.ts index 3acb86c079908..885f603c1eb82 100644 --- a/site/src/api/rbacresourcesGenerated.ts +++ b/site/src/api/rbacresourcesGenerated.ts @@ -173,7 +173,9 @@ export const RBACResourceActions: Partial< workspace: { application_connect: "connect to workspace apps via browser", create: "create a new workspace", + create_agent: "create a new workspace agent", delete: "delete workspace", + delete_agent: "delete an existing workspace agent", read: "read workspace data to view on the UI", ssh: "ssh into a given workspace", start: "allows starting a workspace", @@ -191,7 +193,9 @@ export const RBACResourceActions: Partial< workspace_dormant: { application_connect: "connect to workspace apps via browser", create: "create a new workspace", + create_agent: "create a new workspace agent", delete: "delete workspace", + delete_agent: "delete an existing workspace agent", read: "read workspace data to view on the UI", ssh: "ssh into a given workspace", start: "allows starting a workspace", diff --git a/site/src/api/typesGenerated.ts b/site/src/api/typesGenerated.ts index 4e337bd7c65f0..5e8499883f1e4 100644 --- a/site/src/api/typesGenerated.ts +++ b/site/src/api/typesGenerated.ts @@ -828,9 +828,12 @@ export const EntitlementsWarningHeader = "X-Coder-Entitlements-Warning"; export type Experiment = | "agentic-chat" | "auto-fill-parameters" + | "coder-desktop" + | "dev-containers" | "dynamic-parameters" | "example" | "notifications" + | "securing-agents" | "web-push" | "workspace-prebuilds" | "workspace-usage"; @@ -977,6 +980,11 @@ export type FeatureSet = "enterprise" | "" | "premium"; export const FeatureSets: FeatureSet[] = ["enterprise", "", "premium"]; +// From codersdk/deployment.go +export type FeatureStage = "beta" | "early access"; + +export const FeatureStages: FeatureStage[] = ["beta", "early access"]; + // From codersdk/files.go export const FormatZip = "zip"; @@ -1814,6 +1822,7 @@ export interface PrebuildsConfig { readonly reconciliation_interval: number; readonly reconciliation_backoff_interval: number; readonly reconciliation_backoff_lookback: number; + readonly failure_hard_limit: number; } // From codersdk/presets.go @@ -2131,7 +2140,9 @@ export type RBACAction = | "application_connect" | "assign" | "create" + | "create_agent" | "delete" + | "delete_agent" | "read" | "read_personal" | "ssh" @@ -2147,7 +2158,9 @@ export const RBACActions: RBACAction[] = [ "application_connect", "assign", "create", + "create_agent", "delete", + "delete_agent", "read", "read_personal", "ssh", diff --git a/site/src/components/Filter/Filter.tsx b/site/src/components/Filter/Filter.tsx index ede669416d743..1d568e84a5d2b 100644 --- a/site/src/components/Filter/Filter.tsx +++ b/site/src/components/Filter/Filter.tsx @@ -1,5 +1,4 @@ import { useTheme } from "@emotion/react"; -import Button from "@mui/material/Button"; import Divider from "@mui/material/Divider"; import Menu from "@mui/material/Menu"; import MenuItem from "@mui/material/MenuItem"; @@ -10,6 +9,7 @@ import { hasError, isApiValidationError, } from "api/errors"; +import { Button } from "components/Button/Button"; import { InputGroup } from "components/InputGroup/InputGroup"; import { SearchField } from "components/SearchField/SearchField"; import { useDebouncedFunction } from "hooks/debounce"; @@ -267,9 +267,11 @@ const PresetMenu: FC = ({ void; +}; + +export function useSyncFormParameters({ + parameters, + formValues, + setFieldValue, +}: UseSyncFormParametersProps) { + // Form values only needs to be updated when parameters change + // Keep track of form values in a ref to avoid unnecessary updates to rich_parameter_values + const formValuesRef = useRef(formValues); + + useEffect(() => { + formValuesRef.current = formValues; + }, [formValues]); + + useEffect(() => { + if (!parameters) return; + const currentFormValues = formValuesRef.current; + + const newParameterValues = parameters.map((param) => ({ + name: param.name, + value: param.value.valid ? param.value.value : "", + })); + + const currentFormValuesMap = new Map( + currentFormValues.map((value) => [value.name, value.value]), + ); + + const isChanged = + currentFormValues.length !== newParameterValues.length || + newParameterValues.some( + (p) => + !currentFormValuesMap.has(p.name) || + currentFormValuesMap.get(p.name) !== p.value, + ); + + if (isChanged) { + setFieldValue("rich_parameter_values", newParameterValues); + } + }, [parameters, setFieldValue]); +} diff --git a/site/src/modules/workspaces/DynamicParameter/DynamicParameter.tsx b/site/src/modules/workspaces/DynamicParameter/DynamicParameter.tsx index 94fa3bc383074..fc74a3a46a005 100644 --- a/site/src/modules/workspaces/DynamicParameter/DynamicParameter.tsx +++ b/site/src/modules/workspaces/DynamicParameter/DynamicParameter.tsx @@ -239,18 +239,30 @@ const DebouncedParameterField: FC = ({ prevDebouncedValueRef.current = debouncedLocalValue; }, [debouncedLocalValue, onChangeEvent]); + const textareaRef = useRef(null); + + const resizeTextarea = useEffectEvent(() => { + if (textareaRef.current) { + const textarea = textareaRef.current; + textarea.style.height = `${textarea.scrollHeight}px`; + } + }); + + useEffect(() => { + resizeTextarea(); + }, [resizeTextarea]); switch (parameter.form_type) { - case "textarea": + case "textarea": { return (