diff --git a/.cursorrules b/.cursorrules index ce4412b83f6e9..54966b1dcc89e 100644 --- a/.cursorrules +++ b/.cursorrules @@ -4,7 +4,7 @@ This project is called "Coder" - an application for managing remote development Coder provides a platform for creating, managing, and using remote development environments (also known as Cloud Development Environments or CDEs). It leverages Terraform to define and provision these environments, which are referred to as "workspaces" within the project. The system is designed to be extensible, secure, and provide developers with a seamless remote development experience. -# Core Architecture +## Core Architecture The heart of Coder is a control plane that orchestrates the creation and management of workspaces. This control plane interacts with separate Provisioner processes over gRPC to handle workspace builds. The Provisioners consume workspace definitions and use Terraform to create the actual infrastructure. @@ -12,17 +12,17 @@ The CLI package serves dual purposes - it can be used to launch the control plan The database layer uses PostgreSQL with SQLC for generating type-safe database code. Database migrations are carefully managed to ensure both forward and backward compatibility through paired `.up.sql` and `.down.sql` files. -# API Design +## API Design Coder's API architecture combines REST and gRPC approaches. The REST API is defined in `coderd/coderd.go` and uses Chi for HTTP routing. This provides the primary interface for the frontend and external integrations. Internal communication with Provisioners occurs over gRPC, with service definitions maintained in `.proto` files. This separation allows for efficient binary communication with the components responsible for infrastructure management while providing a standard REST interface for human-facing applications. -# Network Architecture +## Network Architecture Coder implements a secure networking layer based on Tailscale's Wireguard implementation. 
The `tailnet` package provides connectivity between workspace agents and clients through DERP (Designated Encrypted Relay for Packets) servers when direct connections aren't possible. This creates a secure overlay network allowing access to workspaces regardless of network topology, firewalls, or NAT configurations. -## Tailnet and DERP System +### Tailnet and DERP System The networking system has three key components: @@ -35,7 +35,7 @@ The networking system has three key components: 3. **Direct Connections**: When possible, the system establishes peer-to-peer connections between clients and workspaces using STUN for NAT traversal. This requires both endpoints to send UDP traffic on ephemeral ports. -## Workspace Proxies +### Workspace Proxies Workspace proxies (in the Enterprise edition) provide regional relay points for browser-based connections, reducing latency for geo-distributed teams. Key characteristics: @@ -45,9 +45,10 @@ Workspace proxies (in the Enterprise edition) provide regional relay points for - Managed through the `coder wsproxy` commands - Implemented primarily in the `enterprise/wsproxy/` package -# Agent System +## Agent System The workspace agent runs within each provisioned workspace and provides core functionality including: + - SSH access to workspaces via the `agentssh` package - Port forwarding - Terminal connectivity via the `pty` package for pseudo-terminal support @@ -57,7 +58,7 @@ The workspace agent runs within each provisioned workspace and provides core fun Agents communicate with the control plane using the tailnet system and authenticate using secure tokens. -# Workspace Applications +## Workspace Applications Workspace applications (or "apps") provide browser-based access to services running within workspaces. 
The system supports: @@ -69,17 +70,17 @@ Workspace applications (or "apps") provide browser-based access to services runn The implementation is primarily in the `coderd/workspaceapps/` directory with components for URL generation, proxying connections, and managing application state. -# Implementation Details +## Implementation Details The project structure separates frontend and backend concerns. React components and pages are organized in the `site/src/` directory, with Jest used for testing. The backend is primarily written in Go, with a strong emphasis on error handling patterns and test coverage. Database interactions are carefully managed through migrations in `coderd/database/migrations/` and queries in `coderd/database/queries/`. All new queries require proper database authorization (dbauthz) implementation to ensure that only users with appropriate permissions can access specific resources. -# Authorization System +## Authorization System The database authorization (dbauthz) system enforces fine-grained access control across all database operations. It uses role-based access control (RBAC) to validate user permissions before executing database operations. The `dbauthz` package wraps the database store and performs authorization checks before returning data. All database operations must pass through this layer to ensure security. -# Testing Framework +## Testing Framework The codebase has a comprehensive testing approach with several key components: @@ -91,7 +92,7 @@ The codebase has a comprehensive testing approach with several key components: 4. **Enterprise Testing**: Enterprise features have dedicated test utilities in the `coderdenttest` package. 
-# Open Source and Enterprise Components
+## Open Source and Enterprise Components

 The repository contains both open source and enterprise components:

@@ -100,9 +101,10 @@ The repository contains both open source and enterprise components:
 - The boundary between open source and enterprise is managed through a licensing system
 - The same core codebase supports both editions, with enterprise features conditionally enabled

-# Development Philosophy
+## Development Philosophy

 Coder emphasizes clear error handling, with specific patterns required:
+
 - Concise error messages that avoid phrases like "failed to"
 - Wrapping errors with `%w` to maintain error chains
 - Using sentinel errors with the "err" prefix (e.g., `errNotFound`)
@@ -111,7 +113,7 @@ All tests should run in parallel using `t.Parallel()` to ensure efficient testin

 Git contributions follow a standard format with commit messages structured as `type: `, where type is one of `feat`, `fix`, or `chore`.

-# Development Workflow
+## Development Workflow

 Development can be initiated using `scripts/develop.sh` to start the application after making changes. Database schema updates should be performed through the migration system using `create_migration.sh ` to generate migration files, with each `.up.sql` migration paired with a corresponding `.down.sql` that properly reverts all changes.
diff --git a/.github/actions/setup-go-paths/action.yml b/.github/actions/setup-go-paths/action.yml
new file mode 100644
index 0000000000000..8423ddb4c5dab
--- /dev/null
+++ b/.github/actions/setup-go-paths/action.yml
@@ -0,0 +1,57 @@
+name: "Setup Go Paths"
+description: Overrides Go paths like GOCACHE and GOMODCACHE to use temporary directories.
+outputs:
+  gocache:
+    description: "Value of GOCACHE"
+    value: ${{ steps.paths.outputs.gocache }}
+  gomodcache:
+    description: "Value of GOMODCACHE"
+    value: ${{ steps.paths.outputs.gomodcache }}
+  gopath:
+    description: "Value of GOPATH"
+    value: ${{ steps.paths.outputs.gopath }}
+  gotmp:
+    description: "Value of GOTMPDIR"
+    value: ${{ steps.paths.outputs.gotmp }}
+  cached-dirs:
+    description: "Go directories that should be cached between CI runs"
+    value: ${{ steps.paths.outputs.cached-dirs }}
+runs:
+  using: "composite"
+  steps:
+    - name: Override Go paths
+      id: paths
+      uses: actions/github-script@60a0d83039c74a4aee543508d2ffcb1c3799cdea # v7
+      with:
+        script: |
+          const path = require('path');
+
+          // RUNNER_TEMP should be backed by a RAM disk on Windows if
+          // coder/setup-ramdisk-action was used
+          const runnerTemp = process.env.RUNNER_TEMP;
+          const gocacheDir = path.join(runnerTemp, 'go-cache');
+          const gomodcacheDir = path.join(runnerTemp, 'go-mod-cache');
+          const gopathDir = path.join(runnerTemp, 'go-path');
+          const gotmpDir = path.join(runnerTemp, 'go-tmp');
+
+          core.exportVariable('GOCACHE', gocacheDir);
+          core.exportVariable('GOMODCACHE', gomodcacheDir);
+          core.exportVariable('GOPATH', gopathDir);
+          core.exportVariable('GOTMPDIR', gotmpDir);
+
+          core.setOutput('gocache', gocacheDir);
+          core.setOutput('gomodcache', gomodcacheDir);
+          core.setOutput('gopath', gopathDir);
+          core.setOutput('gotmp', gotmpDir);
+
+          const cachedDirs = `${gocacheDir}\n${gomodcacheDir}`;
+          core.setOutput('cached-dirs', cachedDirs);
+
+    - name: Create directories
+      shell: bash
+      run: |
+        set -e
+        mkdir -p "$GOCACHE"
+        mkdir -p "$GOMODCACHE"
+        mkdir -p "$GOPATH"
+        mkdir -p "$GOTMPDIR"
diff --git a/.github/actions/setup-go/action.yaml b/.github/actions/setup-go/action.yaml
index 6ee57ff57db6b..6656ba5d06490 100644
--- a/.github/actions/setup-go/action.yaml
+++ b/.github/actions/setup-go/action.yaml
@@ -8,42 +8,26 @@ inputs:
   use-preinstalled-go:
     description: "Whether to use preinstalled Go."
     default: "false"
-  use-temp-cache-dirs:
-    description: "Whether to use temporary GOCACHE and GOMODCACHE directories."
-    default: "false"
+  use-cache:
+    description: "Whether to use the cache."
+    default: "true"
 runs:
   using: "composite"
   steps:
-    - name: Override GOCACHE and GOMODCACHE
-      shell: bash
-      if: inputs.use-temp-cache-dirs == 'true'
-      run: |
-        # cd to another directory to ensure we're not inside a Go project.
-        # That'd trigger Go to download the toolchain for that project.
-        cd "$RUNNER_TEMP"
-        # RUNNER_TEMP should be backed by a RAM disk on Windows if
-        # coder/setup-ramdisk-action was used
-        export GOCACHE_DIR="$RUNNER_TEMP""\go-cache"
-        export GOMODCACHE_DIR="$RUNNER_TEMP""\go-mod-cache"
-        export GOPATH_DIR="$RUNNER_TEMP""\go-path"
-        export GOTMP_DIR="$RUNNER_TEMP""\go-tmp"
-        mkdir -p "$GOCACHE_DIR"
-        mkdir -p "$GOMODCACHE_DIR"
-        mkdir -p "$GOPATH_DIR"
-        mkdir -p "$GOTMP_DIR"
-        go env -w GOCACHE="$GOCACHE_DIR"
-        go env -w GOMODCACHE="$GOMODCACHE_DIR"
-        go env -w GOPATH="$GOPATH_DIR"
-        go env -w GOTMPDIR="$GOTMP_DIR"
     - name: Setup Go
       uses: actions/setup-go@0a12ed9d6a96ab950c8f026ed9f722fe0da7ef32 # v5.0.2
       with:
         go-version: ${{ inputs.use-preinstalled-go == 'false' && inputs.version || '' }}
+        cache: ${{ inputs.use-cache }}

     - name: Install gotestsum
       shell: bash
       run: go install gotest.tools/gotestsum@0d9599e513d70e5792bb9334869f82f6e8b53d4d # main as of 2025-05-15

+    - name: Install mtimehash
+      shell: bash
+      run: go install github.com/slsyy/mtimehash/cmd/mtimehash@a6b5da4ed2c4a40e7b805534b004e9fde7b53ce0 # v1.0.0
+
     # It isn't necessary that we ever do this, but it helps
     # separate the "setup" from the "run" times.
- name: go mod download diff --git a/.github/actions/setup-imdisk/action.yaml b/.github/actions/setup-imdisk/action.yaml deleted file mode 100644 index 52ef7eb08fd81..0000000000000 --- a/.github/actions/setup-imdisk/action.yaml +++ /dev/null @@ -1,27 +0,0 @@ -name: "Setup ImDisk" -if: runner.os == 'Windows' -description: | - Sets up the ImDisk toolkit for Windows and creates a RAM disk on drive R:. -runs: - using: "composite" - steps: - - name: Download ImDisk - if: runner.os == 'Windows' - shell: bash - run: | - mkdir imdisk - cd imdisk - curl -L -o files.cab https://github.com/coder/imdisk-artifacts/raw/92a17839ebc0ee3e69be019f66b3e9b5d2de4482/files.cab - curl -L -o install.bat https://github.com/coder/imdisk-artifacts/raw/92a17839ebc0ee3e69be019f66b3e9b5d2de4482/install.bat - cd .. - - - name: Install ImDisk - shell: cmd - run: | - cd imdisk - install.bat /silent - - - name: Create RAM Disk - shell: cmd - run: | - imdisk -a -s 4096M -m R: -p "/fs:ntfs /q /y" diff --git a/.github/workflows/ci.yaml b/.github/workflows/ci.yaml index ad8f5d1289715..b64d0e610721f 100644 --- a/.github/workflows/ci.yaml +++ b/.github/workflows/ci.yaml @@ -24,7 +24,7 @@ jobs: docs-only: ${{ steps.filter.outputs.docs_count == steps.filter.outputs.all_count }} docs: ${{ steps.filter.outputs.docs }} go: ${{ steps.filter.outputs.go }} - ts: ${{ steps.filter.outputs.ts }} + site: ${{ steps.filter.outputs.site }} k8s: ${{ steps.filter.outputs.k8s }} ci: ${{ steps.filter.outputs.ci }} db: ${{ steps.filter.outputs.db }} @@ -92,9 +92,8 @@ jobs: gomod: - "go.mod" - "go.sum" - ts: + site: - "site/**" - - "Makefile" k8s: - "helm/**" - "scripts/Dockerfile" @@ -224,7 +223,7 @@ jobs: gen: timeout-minutes: 8 runs-on: ${{ github.repository_owner == 'coder' && 'depot-ubuntu-22.04-8' || 'ubuntu-latest' }} - if: always() + if: ${{ !cancelled() }} steps: - name: Harden Runner uses: step-security/harden-runner@0634a2670c59f64b4a01f0f96f84700a4088b9f0 # v2.12.0 @@ -336,13 +335,16 @@ jobs: # a separate 
repository to allow its use before actions/checkout. - name: Setup RAM Disks if: runner.os == 'Windows' - uses: coder/setup-ramdisk-action@81c5c441bda00c6c3d6bcee2e5a33ed4aadbbcc1 + uses: coder/setup-ramdisk-action@a4b59caa8be2e88c348abeef042d7c1a33d8743e - name: Checkout uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2 with: fetch-depth: 1 + - name: Setup Go Paths + uses: ./.github/actions/setup-go-paths + - name: Setup Go uses: ./.github/actions/setup-go with: @@ -350,7 +352,6 @@ jobs: # download the toolchain configured in go.mod, so we don't # need to reinstall it. It's faster on Windows runners. use-preinstalled-go: ${{ runner.os == 'Windows' }} - use-temp-cache-dirs: ${{ runner.os == 'Windows' }} - name: Setup Terraform uses: ./.github/actions/setup-tf @@ -398,63 +399,10 @@ jobs: with: api-key: ${{ secrets.DATADOG_API_KEY }} - # We don't run the full test-suite for Windows & MacOS, so we just run the CLI tests on every PR. - # We run the test suite in test-go-pg, including CLI. - test-cli: - runs-on: ${{ matrix.os == 'macos-latest' && github.repository_owner == 'coder' && 'depot-macos-latest' || matrix.os == 'windows-2022' && github.repository_owner == 'coder' && 'windows-latest-16-cores' || matrix.os }} - needs: changes - if: needs.changes.outputs.go == 'true' || needs.changes.outputs.ci == 'true' || github.ref == 'refs/heads/main' - strategy: - matrix: - os: - - macos-latest - - windows-2022 - steps: - - name: Harden Runner - uses: step-security/harden-runner@0634a2670c59f64b4a01f0f96f84700a4088b9f0 # v2.12.0 - with: - egress-policy: audit - - - name: Checkout - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2 - with: - fetch-depth: 1 - - - name: Setup Go - uses: ./.github/actions/setup-go - - - name: Setup Terraform - uses: ./.github/actions/setup-tf - - # Sets up the ImDisk toolkit for Windows and creates a RAM disk on drive R:. 
- - name: Setup ImDisk - if: runner.os == 'Windows' - uses: ./.github/actions/setup-imdisk - - - name: Test CLI - env: - TS_DEBUG_DISCO: "true" - LC_CTYPE: "en_US.UTF-8" - LC_ALL: "en_US.UTF-8" - TEST_RETRIES: 2 - shell: bash - run: | - # By default Go will use the number of logical CPUs, which - # is a fine default. - PARALLEL_FLAG="" - - make test-cli - - - name: Upload test stats to Datadog - timeout-minutes: 1 - continue-on-error: true - uses: ./.github/actions/upload-datadog - if: success() || failure() - with: - api-key: ${{ secrets.DATADOG_API_KEY }} - test-go-pg: - runs-on: ${{ matrix.os == 'ubuntu-latest' && github.repository_owner == 'coder' && 'depot-ubuntu-22.04-8' || matrix.os }} + # make sure to adjust NUM_PARALLEL_PACKAGES and NUM_PARALLEL_TESTS below + # when changing runner sizes + runs-on: ${{ matrix.os == 'ubuntu-latest' && github.repository_owner == 'coder' && 'depot-ubuntu-22.04-8' || matrix.os && matrix.os == 'macos-latest' && github.repository_owner == 'coder' && 'depot-macos-latest' || matrix.os == 'windows-2022' && github.repository_owner == 'coder' && 'depot-windows-2022-16' || matrix.os }} needs: changes if: needs.changes.outputs.go == 'true' || needs.changes.outputs.ci == 'true' || github.ref == 'refs/heads/main' # This timeout must be greater than the timeout set by `go test` in @@ -466,48 +414,157 @@ jobs: matrix: os: - ubuntu-latest + - macos-latest + - windows-2022 steps: - name: Harden Runner uses: step-security/harden-runner@0634a2670c59f64b4a01f0f96f84700a4088b9f0 # v2.12.0 with: egress-policy: audit + # macOS indexes all new files in the background. Our Postgres tests + # create and destroy thousands of databases on disk, and Spotlight + # tries to index all of them, seriously slowing down the tests. 
+ - name: Disable Spotlight Indexing + if: runner.os == 'macOS' + run: | + sudo mdutil -a -i off + sudo mdutil -X / + sudo launchctl bootout system /System/Library/LaunchDaemons/com.apple.metadata.mds.plist + + # Set up RAM disks to speed up the rest of the job. This action is in + # a separate repository to allow its use before actions/checkout. + - name: Setup RAM Disks + if: runner.os == 'Windows' + uses: coder/setup-ramdisk-action@a4b59caa8be2e88c348abeef042d7c1a33d8743e + - name: Checkout uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2 with: fetch-depth: 1 + - name: Setup Go Paths + id: go-paths + uses: ./.github/actions/setup-go-paths + + - name: Download Go Build Cache + id: download-go-build-cache + uses: ./.github/actions/test-cache/download + with: + key-prefix: test-go-build-${{ runner.os }}-${{ runner.arch }} + cache-path: ${{ steps.go-paths.outputs.cached-dirs }} + - name: Setup Go uses: ./.github/actions/setup-go + with: + # Runners have Go baked-in and Go will automatically + # download the toolchain configured in go.mod, so we don't + # need to reinstall it. It's faster on Windows runners. + use-preinstalled-go: ${{ runner.os == 'Windows' }} + # Cache is already downloaded above + use-cache: false - name: Setup Terraform uses: ./.github/actions/setup-tf - # Sets up the ImDisk toolkit for Windows and creates a RAM disk on drive R:. - - name: Setup ImDisk - if: runner.os == 'Windows' - uses: ./.github/actions/setup-imdisk - - name: Download Test Cache id: download-cache uses: ./.github/actions/test-cache/download with: key-prefix: test-go-pg-${{ runner.os }}-${{ runner.arch }} + - name: Normalize File and Directory Timestamps + shell: bash + # Normalize file modification timestamps so that go test can use the + # cache from the previous CI run. See https://github.com/golang/go/issues/58571 + # for more details. + run: | + find . -type f ! -path ./.git/\*\* | mtimehash + find . -type d ! 
-path ./.git/\*\* -exec touch -t 200601010000 {} + + - name: Test with PostgreSQL Database env: POSTGRES_VERSION: "13" TS_DEBUG_DISCO: "true" LC_CTYPE: "en_US.UTF-8" LC_ALL: "en_US.UTF-8" - TEST_RETRIES: 2 shell: bash run: | - # By default Go will use the number of logical CPUs, which - # is a fine default. - PARALLEL_FLAG="" + set -o errexit + set -o pipefail + + if [ "${{ runner.os }}" == "Windows" ]; then + # Create a temp dir on the R: ramdisk drive for Windows. The default + # C: drive is extremely slow: https://github.com/actions/runner-images/issues/8755 + mkdir -p "R:/temp/embedded-pg" + go run scripts/embedded-pg/main.go -path "R:/temp/embedded-pg" + elif [ "${{ runner.os }}" == "macOS" ]; then + # Postgres runs faster on a ramdisk on macOS too + mkdir -p /tmp/tmpfs + sudo mount_tmpfs -o noowners -s 8g /tmp/tmpfs + go run scripts/embedded-pg/main.go -path /tmp/tmpfs/embedded-pg + elif [ "${{ runner.os }}" == "Linux" ]; then + make test-postgres-docker + fi - make test-postgres + # if macOS, install google-chrome for scaletests + # As another concern, should we really have this kind of external dependency + # requirement on standard CI? + if [ "${{ matrix.os }}" == "macos-latest" ]; then + brew install google-chrome + fi + + # macOS will output "The default interactive shell is now zsh" + # intermittently in CI... + if [ "${{ matrix.os }}" == "macos-latest" ]; then + touch ~/.bash_profile && echo "export BASH_SILENCE_DEPRECATION_WARNING=1" >> ~/.bash_profile + fi + + if [ "${{ runner.os }}" == "Windows" ]; then + # Our Windows runners have 16 cores. + # On Windows Postgres chokes up when we have 16x16=256 tests + # running in parallel, and dbtestutil.NewDB starts to take more than + # 10s to complete sometimes causing test timeouts. With 16x8=128 tests + # Postgres tends not to choke. + NUM_PARALLEL_PACKAGES=8 + NUM_PARALLEL_TESTS=16 + elif [ "${{ runner.os }}" == "macOS" ]; then + # Our macOS runners have 8 cores. 
We set NUM_PARALLEL_TESTS to 16 + # because the tests complete faster and Postgres doesn't choke. It seems + # that macOS's tmpfs is faster than the one on Windows. + NUM_PARALLEL_PACKAGES=8 + NUM_PARALLEL_TESTS=16 + elif [ "${{ runner.os }}" == "Linux" ]; then + # Our Linux runners have 8 cores. + NUM_PARALLEL_PACKAGES=8 + NUM_PARALLEL_TESTS=8 + fi + + # by default, run tests with cache + TESTCOUNT="" + if [ "${{ github.ref }}" == "refs/heads/main" ]; then + # on main, run tests without cache + TESTCOUNT="-count=1" + fi + + mkdir -p "$RUNNER_TEMP/sym" + source scripts/normalize_path.sh + # terraform gets installed in a random directory, so we need to normalize + # the path to the terraform binary or a bunch of cached tests will be + # invalidated. See scripts/normalize_path.sh for more details. + normalize_path_with_symlinks "$RUNNER_TEMP/sym" "$(dirname $(which terraform))" + + # We rerun failing tests to counteract flakiness coming from Postgres + # choking on macOS and Windows sometimes. + DB=ci gotestsum --rerun-fails=2 --rerun-fails-max-failures=50 \ + --format standard-quiet --packages "./..." 
\ + -- -timeout=20m -v -p $NUM_PARALLEL_PACKAGES -parallel=$NUM_PARALLEL_TESTS $TESTCOUNT + + - name: Upload Go Build Cache + uses: ./.github/actions/test-cache/upload + with: + cache-key: ${{ steps.download-go-build-cache.outputs.cache-key }} + cache-path: ${{ steps.go-paths.outputs.cached-dirs }} - name: Upload Test Cache uses: ./.github/actions/test-cache/upload @@ -716,7 +773,7 @@ jobs: test-js: runs-on: ${{ github.repository_owner == 'coder' && 'depot-ubuntu-22.04-8' || 'ubuntu-latest' }} needs: changes - if: needs.changes.outputs.ts == 'true' || needs.changes.outputs.ci == 'true' || github.ref == 'refs/heads/main' + if: needs.changes.outputs.site == 'true' || needs.changes.outputs.ci == 'true' || github.ref == 'refs/heads/main' timeout-minutes: 20 steps: - name: Harden Runner @@ -747,7 +804,7 @@ jobs: #- premium: true # name: test-e2e-premium # Skip test-e2e on forks as they don't have access to CI secrets - if: (needs.changes.outputs.go == 'true' || needs.changes.outputs.ts == 'true' || needs.changes.outputs.ci == 'true' || github.ref == 'refs/heads/main') && !(github.event.pull_request.head.repo.fork) + if: (needs.changes.outputs.go == 'true' || needs.changes.outputs.site == 'true' || needs.changes.outputs.ci == 'true' || github.ref == 'refs/heads/main') && !(github.event.pull_request.head.repo.fork) timeout-minutes: 20 name: ${{ matrix.variant.name }} steps: @@ -816,11 +873,13 @@ jobs: path: ./site/test-results/**/debug-pprof-*.txt retention-days: 7 + # Reference guide: + # https://www.chromatic.com/docs/turbosnap-best-practices/#run-with-caution-when-using-the-pull_request-event chromatic: # REMARK: this is only used to build storybook and deploy it to Chromatic. 
runs-on: ubuntu-latest needs: changes - if: needs.changes.outputs.ts == 'true' || needs.changes.outputs.ci == 'true' + if: needs.changes.outputs.site == 'true' || needs.changes.outputs.ci == 'true' steps: - name: Harden Runner uses: step-security/harden-runner@0634a2670c59f64b4a01f0f96f84700a4088b9f0 # v2.12.0 @@ -830,9 +889,10 @@ jobs: - name: Checkout uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2 with: - # Required by Chromatic for build-over-build history, otherwise we - # only get 1 commit on shallow checkout. + # 👇 Ensures Chromatic can read your full git history fetch-depth: 0 + # 👇 Tells the checkout which commit hash to reference + ref: ${{ github.event.pull_request.head.ref }} - name: Setup Node uses: ./.github/actions/setup-node @@ -842,7 +902,7 @@ jobs: # the check to pass. This is desired in PRs, but not in mainline. - name: Publish to Chromatic (non-mainline) if: github.ref != 'refs/heads/main' && github.repository_owner == 'coder' - uses: chromaui/action@30b6228aa809059d46219e0f556752e8672a7e26 # v11.11.0 + uses: chromaui/action@1cfa065cbdab28f6ca3afaeb3d761383076a35aa # v11.29.0 env: NODE_OPTIONS: "--max_old_space_size=4096" STORYBOOK: true @@ -857,6 +917,7 @@ jobs: projectToken: 695c25b6cb65 workingDir: "./site" storybookBaseDir: "./site" + storybookConfigDir: "./site/.storybook" # Prevent excessive build runs on minor version changes skip: "@(renovate/**|dependabot/**)" # Run TurboSnap to trace file dependencies to related stories @@ -873,7 +934,7 @@ jobs: # infinitely "in progress" in mainline unless we re-review each build. 
- name: Publish to Chromatic (mainline) if: github.ref == 'refs/heads/main' && github.repository_owner == 'coder' - uses: chromaui/action@30b6228aa809059d46219e0f556752e8672a7e26 # v11.11.0 + uses: chromaui/action@1cfa065cbdab28f6ca3afaeb3d761383076a35aa # v11.29.0 env: NODE_OPTIONS: "--max_old_space_size=4096" STORYBOOK: true @@ -886,6 +947,7 @@ jobs: projectToken: 695c25b6cb65 workingDir: "./site" storybookBaseDir: "./site" + storybookConfigDir: "./site/.storybook" # Run TurboSnap to trace file dependencies to related stories # and tell chromatic to only take snapshots of relevant stories onlyChanged: true diff --git a/.github/workflows/nightly-gauntlet.yaml b/.github/workflows/nightly-gauntlet.yaml deleted file mode 100644 index 64b520d07ba6e..0000000000000 --- a/.github/workflows/nightly-gauntlet.yaml +++ /dev/null @@ -1,183 +0,0 @@ -# The nightly-gauntlet runs tests that are either too flaky or too slow to block -# every PR. -name: nightly-gauntlet -on: - schedule: - # Every day at 4AM - - cron: "0 4 * * 1-5" - workflow_dispatch: - -permissions: - contents: read - -jobs: - test-go-pg: - # make sure to adjust NUM_PARALLEL_PACKAGES and NUM_PARALLEL_TESTS below - # when changing runner sizes - runs-on: ${{ matrix.os == 'macos-latest' && github.repository_owner == 'coder' && 'depot-macos-latest' || matrix.os == 'windows-2022' && github.repository_owner == 'coder' && 'depot-windows-2022-16' || matrix.os }} - # This timeout must be greater than the timeout set by `go test` in - # `make test-postgres` to ensure we receive a trace of running - # goroutines. Setting this to the timeout +5m should work quite well - # even if some of the preceding steps are slow. - timeout-minutes: 25 - strategy: - fail-fast: false - matrix: - os: - - macos-latest - - windows-2022 - steps: - - name: Harden Runner - uses: step-security/harden-runner@0634a2670c59f64b4a01f0f96f84700a4088b9f0 # v2.12.0 - with: - egress-policy: audit - - # macOS indexes all new files in the background. 
Our Postgres tests - # create and destroy thousands of databases on disk, and Spotlight - # tries to index all of them, seriously slowing down the tests. - - name: Disable Spotlight Indexing - if: runner.os == 'macOS' - run: | - sudo mdutil -a -i off - sudo mdutil -X / - sudo launchctl bootout system /System/Library/LaunchDaemons/com.apple.metadata.mds.plist - - # Set up RAM disks to speed up the rest of the job. This action is in - # a separate repository to allow its use before actions/checkout. - - name: Setup RAM Disks - if: runner.os == 'Windows' - uses: coder/setup-ramdisk-action@79dacfe70c47ad6d6c0dd7f45412368802641439 - - - name: Checkout - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2 - with: - fetch-depth: 1 - - - name: Setup Go - uses: ./.github/actions/setup-go - with: - # Runners have Go baked-in and Go will automatically - # download the toolchain configured in go.mod, so we don't - # need to reinstall it. It's faster on Windows runners. - use-preinstalled-go: ${{ runner.os == 'Windows' }} - use-temp-cache-dirs: ${{ runner.os == 'Windows' }} - - - name: Setup Terraform - uses: ./.github/actions/setup-tf - - - name: Test with PostgreSQL Database - env: - POSTGRES_VERSION: "13" - TS_DEBUG_DISCO: "true" - LC_CTYPE: "en_US.UTF-8" - LC_ALL: "en_US.UTF-8" - shell: bash - run: | - if [ "${{ runner.os }}" == "Windows" ]; then - # Create a temp dir on the R: ramdisk drive for Windows. 
The default - # C: drive is extremely slow: https://github.com/actions/runner-images/issues/8755 - mkdir -p "R:/temp/embedded-pg" - go run scripts/embedded-pg/main.go -path "R:/temp/embedded-pg" - fi - if [ "${{ runner.os }}" == "macOS" ]; then - # Postgres runs faster on a ramdisk on macOS too - mkdir -p /tmp/tmpfs - sudo mount_tmpfs -o noowners -s 8g /tmp/tmpfs - go run scripts/embedded-pg/main.go -path /tmp/tmpfs/embedded-pg - fi - - # if macOS, install google-chrome for scaletests - # As another concern, should we really have this kind of external dependency - # requirement on standard CI? - if [ "${{ matrix.os }}" == "macos-latest" ]; then - brew install google-chrome - fi - - # By default Go will use the number of logical CPUs, which - # is a fine default. - PARALLEL_FLAG="" - - # macOS will output "The default interactive shell is now zsh" - # intermittently in CI... - if [ "${{ matrix.os }}" == "macos-latest" ]; then - touch ~/.bash_profile && echo "export BASH_SILENCE_DEPRECATION_WARNING=1" >> ~/.bash_profile - fi - - # Golang's default for these 2 variables is the number of logical CPUs. - # Our Windows and Linux runners have 16 cores, so they match up there. - NUM_PARALLEL_PACKAGES=16 - NUM_PARALLEL_TESTS=16 - if [ "${{ runner.os }}" == "Windows" ]; then - # On Windows Postgres chokes up when we have 16x16=256 tests - # running in parallel, and dbtestutil.NewDB starts to take more than - # 10s to complete sometimes causing test timeouts. With 16x8=128 tests - # Postgres tends not to choke. - NUM_PARALLEL_PACKAGES=8 - fi - if [ "${{ runner.os }}" == "macOS" ]; then - # Our macOS runners have 8 cores. We leave NUM_PARALLEL_TESTS at 16 - # because the tests complete faster and Postgres doesn't choke. It seems - # that macOS's tmpfs is faster than the one on Windows. - NUM_PARALLEL_PACKAGES=8 - fi - - # We rerun failing tests to counteract flakiness coming from Postgres - # choking on macOS and Windows sometimes. 
-          DB=ci gotestsum --rerun-fails=2 --rerun-fails-max-failures=1000 \
-            --format standard-quiet --packages "./..." \
-            -- -v -p $NUM_PARALLEL_PACKAGES -parallel=$NUM_PARALLEL_TESTS -count=1
-
-      - name: Upload test stats to Datadog
-        timeout-minutes: 1
-        continue-on-error: true
-        uses: ./.github/actions/upload-datadog
-        if: success() || failure()
-        with:
-          api-key: ${{ secrets.DATADOG_API_KEY }}
-
-  notify-slack-on-failure:
-    needs:
-      - test-go-pg
-    runs-on: ubuntu-latest
-    if: failure() && github.ref == 'refs/heads/main'
-
-    steps:
-      - name: Send Slack notification
-        run: |
-          curl -X POST -H 'Content-type: application/json' \
-          --data '{
-            "blocks": [
-              {
-                "type": "header",
-                "text": {
-                  "type": "plain_text",
-                  "text": "❌ Nightly gauntlet failed",
-                  "emoji": true
-                }
-              },
-              {
-                "type": "section",
-                "fields": [
-                  {
-                    "type": "mrkdwn",
-                    "text": "*Workflow:*\n${{ github.workflow }}"
-                  },
-                  {
-                    "type": "mrkdwn",
-                    "text": "*Committer:*\n${{ github.actor }}"
-                  },
-                  {
-                    "type": "mrkdwn",
-                    "text": "*Commit:*\n${{ github.sha }}"
-                  }
-                ]
-              },
-              {
-                "type": "section",
-                "text": {
-                  "type": "mrkdwn",
-                  "text": "*View failure:* <${{ github.server_url }}/${{ github.repository }}/actions/runs/${{ github.run_id }}|Click here>"
-                }
-              }
-            ]
-          }' ${{ secrets.CI_FAILURE_SLACK_WEBHOOK }}
diff --git a/CLAUDE.md b/CLAUDE.md
new file mode 100644
index 0000000000000..90d91c9966df7
--- /dev/null
+++ b/CLAUDE.md
@@ -0,0 +1,104 @@
+# Coder Development Guidelines
+
+Read [cursor rules](.cursorrules).
+ +## Build/Test/Lint Commands + +### Main Commands + +- `make build` or `make build-fat` - Build all "fat" binaries (includes "server" functionality) +- `make build-slim` - Build "slim" binaries +- `make test` - Run Go tests +- `make test RUN=TestFunctionName` or `go test -v ./path/to/package -run TestFunctionName` - Test single +- `make test-postgres` - Run tests with Postgres database +- `make test-race` - Run tests with Go race detector +- `make test-e2e` - Run end-to-end tests +- `make lint` - Run all linters +- `make fmt` - Format all code +- `make gen` - Generates mocks, database queries and other auto-generated files + +### Frontend Commands (site directory) + +- `pnpm build` - Build frontend +- `pnpm dev` - Run development server +- `pnpm check` - Run code checks +- `pnpm format` - Format frontend code +- `pnpm lint` - Lint frontend code +- `pnpm test` - Run frontend tests + +## Code Style Guidelines + +### Go + +- Follow [Effective Go](https://go.dev/doc/effective_go) and [Go's Code Review Comments](https://github.com/golang/go/wiki/CodeReviewComments) +- Use `gofumpt` for formatting +- Create packages when used during implementation +- Validate abstractions against implementations + +### Error Handling + +- Use descriptive error messages +- Wrap errors with context +- Propagate errors appropriately +- Use proper error types +- (`xerrors.Errorf("failed to X: %w", err)`) + +### Naming + +- Use clear, descriptive names +- Abbreviate only when obvious +- Follow Go and TypeScript naming conventions + +### Comments + +- Document exported functions, types, and non-obvious logic +- Follow JSDoc format for TypeScript +- Use godoc format for Go code + +## Commit Style + +- Follow [Conventional Commits 1.0.0](https://www.conventionalcommits.org/en/v1.0.0/) +- Format: `type(scope): message` +- Types: `feat`, `fix`, `docs`, `style`, `refactor`, `test`, `chore` +- Keep message titles concise (~70 characters) +- Use imperative, present tense in commit titles + +## 
Database queries + +- MUST DO! Any changes to database - adding queries, modifying queries should be done in the `coderd/database/queries/*.sql` files. Use `make gen` to generate necessary changes after. +- MUST DO! Queries are grouped in files relating to context - e.g. `prebuilds.sql`, `users.sql`, `provisionerjobs.sql`. +- After making changes to any `coderd/database/queries/*.sql` files you must run `make gen` to generate respective ORM changes. + +## Architecture + +### Core Components + +- **coderd**: Main API service connecting workspaces, provisioners, and users +- **provisionerd**: Execution context for infrastructure-modifying providers +- **Agents**: Services in remote workspaces providing features like SSH and port forwarding +- **Workspaces**: Cloud resources defined by Terraform + +## Sub-modules + +### Template System + +- Templates define infrastructure for workspaces using Terraform +- Environment variables pass context between Coder and templates +- Official modules extend development environments + +### RBAC System + +- Permissions defined at site, organization, and user levels +- Object-Action model protects resources +- Built-in roles: owner, member, auditor, templateAdmin +- Permission format: `?...` + +### Database + +- PostgreSQL 13+ recommended for production +- Migrations managed with `migrate` +- Database authorization through `dbauthz` package + +## Frontend + +For building Frontend refer to [this document](docs/contributing/frontend.md) diff --git a/agent/agent.go b/agent/agent.go index 927612302bf71..4aaef05661184 100644 --- a/agent/agent.go +++ b/agent/agent.go @@ -1176,12 +1176,6 @@ func (a *agent) handleManifest(manifestOK *checkpoint) func(ctx context.Context, } a.metrics.startupScriptSeconds.WithLabelValues(label).Set(dur) a.scriptRunner.StartCron() - if containerAPI := a.containerAPI.Load(); containerAPI != nil { - // Inform the container API that the agent is ready. 
- // This allows us to start watching for changes to - // the devcontainer configuration files. - containerAPI.SignalReady() - } }) if err != nil { return xerrors.Errorf("track conn goroutine: %w", err) diff --git a/agent/agentcontainers/api.go b/agent/agentcontainers/api.go index f2164c9a874ff..7fd42175db7d4 100644 --- a/agent/agentcontainers/api.go +++ b/agent/agentcontainers/api.go @@ -8,6 +8,7 @@ import ( "path" "slices" "strings" + "sync" "time" "github.com/fsnotify/fsnotify" @@ -25,35 +26,35 @@ import ( ) const ( - defaultGetContainersCacheDuration = 10 * time.Second - dockerCreatedAtTimeFormat = "2006-01-02 15:04:05 -0700 MST" - getContainersTimeout = 5 * time.Second + defaultUpdateInterval = 10 * time.Second + listContainersTimeout = 15 * time.Second ) // API is responsible for container-related operations in the agent. // It provides methods to list and manage containers. type API struct { - ctx context.Context - cancel context.CancelFunc - done chan struct{} - logger slog.Logger - watcher watcher.Watcher - - cacheDuration time.Duration - execer agentexec.Execer - cl Lister - dccli DevcontainerCLI - clock quartz.Clock - scriptLogger func(logSourceID uuid.UUID) ScriptLogger - - // lockCh protects the below fields. We use a channel instead of a - // mutex so we can handle cancellation properly. - lockCh chan struct{} - containers codersdk.WorkspaceAgentListContainersResponse - mtime time.Time - devcontainerNames map[string]struct{} // Track devcontainer names to avoid duplicates. - knownDevcontainers []codersdk.WorkspaceAgentDevcontainer // Track predefined and runtime-detected devcontainers. - configFileModifiedTimes map[string]time.Time // Track when config files were last modified. + ctx context.Context + cancel context.CancelFunc + watcherDone chan struct{} + updaterDone chan struct{} + initialUpdateDone chan struct{} // Closed after first update in updaterLoop. + updateTrigger chan chan error // Channel to trigger manual refresh. 
+ updateInterval time.Duration // Interval for periodic container updates. + logger slog.Logger + watcher watcher.Watcher + execer agentexec.Execer + cl Lister + dccli DevcontainerCLI + clock quartz.Clock + scriptLogger func(logSourceID uuid.UUID) ScriptLogger + + mu sync.RWMutex + closed bool + containers codersdk.WorkspaceAgentListContainersResponse // Output from the last list operation. + containersErr error // Error from the last list operation. + devcontainerNames map[string]struct{} + knownDevcontainers []codersdk.WorkspaceAgentDevcontainer + configFileModifiedTimes map[string]time.Time devcontainerLogSourceIDs map[string]uuid.UUID // Track devcontainer log source IDs. } @@ -69,15 +70,6 @@ func WithClock(clock quartz.Clock) Option { } } -// WithCacheDuration sets the cache duration for the API. -// This is used to control how often the API refreshes the list of -// containers. The default is 10 seconds. -func WithCacheDuration(d time.Duration) Option { - return func(api *API) { - api.cacheDuration = d - } -} - // WithExecer sets the agentexec.Execer implementation to use. 
func WithExecer(execer agentexec.Execer) Option { return func(api *API) { @@ -169,12 +161,14 @@ func NewAPI(logger slog.Logger, options ...Option) *API { api := &API{ ctx: ctx, cancel: cancel, - done: make(chan struct{}), + watcherDone: make(chan struct{}), + updaterDone: make(chan struct{}), + initialUpdateDone: make(chan struct{}), + updateTrigger: make(chan chan error), + updateInterval: defaultUpdateInterval, logger: logger, clock: quartz.NewReal(), execer: agentexec.DefaultExecer, - cacheDuration: defaultGetContainersCacheDuration, - lockCh: make(chan struct{}, 1), devcontainerNames: make(map[string]struct{}), knownDevcontainers: []codersdk.WorkspaceAgentDevcontainer{}, configFileModifiedTimes: make(map[string]time.Time), @@ -200,33 +194,16 @@ func NewAPI(logger slog.Logger, options ...Option) *API { } } - go api.loop() + go api.watcherLoop() + go api.updaterLoop() return api } -// SignalReady signals the API that we are ready to begin watching for -// file changes. This is used to prime the cache with the current list -// of containers and to start watching the devcontainer config files for -// changes. It should be called after the agent ready. -func (api *API) SignalReady() { - // Prime the cache with the current list of containers. - _, _ = api.cl.List(api.ctx) - - // Make sure we watch the devcontainer config files for changes. 
- for _, devcontainer := range api.knownDevcontainers { - if devcontainer.ConfigPath == "" { - continue - } - - if err := api.watcher.Add(devcontainer.ConfigPath); err != nil { - api.logger.Error(api.ctx, "watch devcontainer config file failed", slog.Error(err), slog.F("file", devcontainer.ConfigPath)) - } - } -} - -func (api *API) loop() { - defer close(api.done) +func (api *API) watcherLoop() { + defer close(api.watcherDone) + defer api.logger.Debug(api.ctx, "watcher loop stopped") + api.logger.Debug(api.ctx, "watcher loop started") for { event, err := api.watcher.Next(api.ctx) @@ -263,10 +240,94 @@ func (api *API) loop() { } } +// updaterLoop is responsible for periodically updating the container +// list and handling manual refresh requests. +func (api *API) updaterLoop() { + defer close(api.updaterDone) + defer api.logger.Debug(api.ctx, "updater loop stopped") + api.logger.Debug(api.ctx, "updater loop started") + + // Perform an initial update to populate the container list, this + // gives us a guarantee that the API has loaded the initial state + // before returning any responses. This is useful for both tests + // and anyone looking to interact with the API. + api.logger.Debug(api.ctx, "performing initial containers update") + if err := api.updateContainers(api.ctx); err != nil { + api.logger.Error(api.ctx, "initial containers update failed", slog.Error(err)) + } else { + api.logger.Debug(api.ctx, "initial containers update complete") + } + // Signal that the initial update attempt (successful or not) is done. + // Other services can wait on this if they need the first data to be available. + close(api.initialUpdateDone) + + // We utilize a TickerFunc here instead of a regular Ticker so that + // we can guarantee execution of the updateContainers method after + // advancing the clock. 
+ ticker := api.clock.TickerFunc(api.ctx, api.updateInterval, func() error { + done := make(chan error, 1) + defer close(done) + + select { + case <-api.ctx.Done(): + return api.ctx.Err() + case api.updateTrigger <- done: + err := <-done + if err != nil { + api.logger.Error(api.ctx, "updater loop ticker failed", slog.Error(err)) + } + default: + api.logger.Debug(api.ctx, "updater loop ticker skipped, update in progress") + } + + return nil // Always nil to keep the ticker going. + }, "updaterLoop") + defer func() { + if err := ticker.Wait("updaterLoop"); err != nil && !errors.Is(err, context.Canceled) { + api.logger.Error(api.ctx, "updater loop ticker failed", slog.Error(err)) + } + }() + + for { + select { + case <-api.ctx.Done(): + return + case done := <-api.updateTrigger: + // Note that although we pass api.ctx here, updateContainers + // has an internal timeout to prevent long blocking calls. + done <- api.updateContainers(api.ctx) + } + } +} + // Routes returns the HTTP handler for container-related routes. func (api *API) Routes() http.Handler { r := chi.NewRouter() + ensureInitialUpdateDoneMW := func(next http.Handler) http.Handler { + return http.HandlerFunc(func(rw http.ResponseWriter, r *http.Request) { + select { + case <-api.ctx.Done(): + httpapi.Write(r.Context(), rw, http.StatusServiceUnavailable, codersdk.Response{ + Message: "API closed", + Detail: "The API is closed and cannot process requests.", + }) + return + case <-r.Context().Done(): + return + case <-api.initialUpdateDone: + // Initial update is done, we can start processing + // requests. + } + next.ServeHTTP(rw, r) + }) + } + + // For now, all endpoints require the initial update to be done. + // If we want to allow some endpoints to be available before + // the initial update, we can enable this per-route. 
+ r.Use(ensureInitialUpdateDoneMW) + r.Get("/", api.handleList) r.Route("/devcontainers", func(r chi.Router) { r.Get("/", api.handleDevcontainersList) @@ -278,62 +339,53 @@ func (api *API) Routes() http.Handler { // handleList handles the HTTP request to list containers. func (api *API) handleList(rw http.ResponseWriter, r *http.Request) { - select { - case <-r.Context().Done(): - // Client went away. + ct, err := api.getContainers() + if err != nil { + httpapi.Write(r.Context(), rw, http.StatusInternalServerError, codersdk.Response{ + Message: "Could not get containers", + Detail: err.Error(), + }) return - default: - ct, err := api.getContainers(r.Context()) - if err != nil { - if errors.Is(err, context.Canceled) { - httpapi.Write(r.Context(), rw, http.StatusRequestTimeout, codersdk.Response{ - Message: "Could not get containers.", - Detail: "Took too long to list containers.", - }) - return - } - httpapi.Write(r.Context(), rw, http.StatusInternalServerError, codersdk.Response{ - Message: "Could not get containers.", - Detail: err.Error(), - }) - return - } - - httpapi.Write(r.Context(), rw, http.StatusOK, ct) } + httpapi.Write(r.Context(), rw, http.StatusOK, ct) } -func copyListContainersResponse(resp codersdk.WorkspaceAgentListContainersResponse) codersdk.WorkspaceAgentListContainersResponse { - return codersdk.WorkspaceAgentListContainersResponse{ - Containers: slices.Clone(resp.Containers), - Warnings: slices.Clone(resp.Warnings), - } -} +// updateContainers fetches the latest container list, processes it, and +// updates the cache. It performs locking for updating shared API state. 
+func (api *API) updateContainers(ctx context.Context) error { + listCtx, listCancel := context.WithTimeout(ctx, listContainersTimeout) + defer listCancel() -func (api *API) getContainers(ctx context.Context) (codersdk.WorkspaceAgentListContainersResponse, error) { - select { - case <-api.ctx.Done(): - return codersdk.WorkspaceAgentListContainersResponse{}, api.ctx.Err() - case <-ctx.Done(): - return codersdk.WorkspaceAgentListContainersResponse{}, ctx.Err() - case api.lockCh <- struct{}{}: - defer func() { <-api.lockCh }() - } + updated, err := api.cl.List(listCtx) + if err != nil { + // If the context was canceled, we hold off on clearing the + // containers cache. This is to avoid clearing the cache if + // the update was canceled due to a timeout. Hopefully this + // will clear up on the next update. + if !errors.Is(err, context.Canceled) { + api.mu.Lock() + api.containers = codersdk.WorkspaceAgentListContainersResponse{} + api.containersErr = err + api.mu.Unlock() + } - now := api.clock.Now() - if now.Sub(api.mtime) < api.cacheDuration { - return copyListContainersResponse(api.containers), nil + return xerrors.Errorf("list containers failed: %w", err) } - timeoutCtx, timeoutCancel := context.WithTimeout(ctx, getContainersTimeout) - defer timeoutCancel() - updated, err := api.cl.List(timeoutCtx) - if err != nil { - return codersdk.WorkspaceAgentListContainersResponse{}, xerrors.Errorf("get containers: %w", err) - } - api.containers = updated - api.mtime = now + api.mu.Lock() + defer api.mu.Unlock() + api.processUpdatedContainersLocked(ctx, updated) + + api.logger.Debug(ctx, "containers updated successfully", slog.F("container_count", len(api.containers.Containers)), slog.F("warning_count", len(api.containers.Warnings)), slog.F("devcontainer_count", len(api.knownDevcontainers))) + + return nil +} + +// processUpdatedContainersLocked updates the devcontainer state based +// on the latest list of containers. This method assumes that api.mu is +// held. 
+func (api *API) processUpdatedContainersLocked(ctx context.Context, updated codersdk.WorkspaceAgentListContainersResponse) { dirtyStates := make(map[string]bool) // Reset all known devcontainers to not running. for i := range api.knownDevcontainers { @@ -345,6 +397,7 @@ func (api *API) getContainers(ctx context.Context) (codersdk.WorkspaceAgentListC } // Check if the container is running and update the known devcontainers. + updatedDevcontainers := make(map[string]bool) for i := range updated.Containers { container := &updated.Containers[i] workspaceFolder := container.Labels[DevcontainerLocalFolderLabel] @@ -354,18 +407,16 @@ func (api *API) getContainers(ctx context.Context) (codersdk.WorkspaceAgentListC continue } - container.DevcontainerDirty = dirtyStates[workspaceFolder] - if container.DevcontainerDirty { - lastModified, hasModTime := api.configFileModifiedTimes[configFile] - if hasModTime && container.CreatedAt.After(lastModified) { - api.logger.Info(ctx, "new container created after config modification, not marking as dirty", - slog.F("container", container.ID), - slog.F("created_at", container.CreatedAt), - slog.F("config_modified_at", lastModified), - slog.F("file", configFile), - ) - container.DevcontainerDirty = false - } + if lastModified, hasModTime := api.configFileModifiedTimes[configFile]; !hasModTime || container.CreatedAt.Before(lastModified) { + api.logger.Debug(ctx, "container created before config modification, setting dirty state from devcontainer", + slog.F("container", container.ID), + slog.F("created_at", container.CreatedAt), + slog.F("config_modified_at", lastModified), + slog.F("file", configFile), + slog.F("workspace_folder", workspaceFolder), + slog.F("dirty", dirtyStates[workspaceFolder]), + ) + container.DevcontainerDirty = dirtyStates[workspaceFolder] } // Check if this is already in our known list. 
@@ -373,29 +424,17 @@ func (api *API) getContainers(ctx context.Context) (codersdk.WorkspaceAgentListC return dc.WorkspaceFolder == workspaceFolder }); knownIndex != -1 { // Update existing entry with runtime information. - if configFile != "" && api.knownDevcontainers[knownIndex].ConfigPath == "" { - api.knownDevcontainers[knownIndex].ConfigPath = configFile + dc := &api.knownDevcontainers[knownIndex] + if configFile != "" && dc.ConfigPath == "" { + dc.ConfigPath = configFile if err := api.watcher.Add(configFile); err != nil { api.logger.Error(ctx, "watch devcontainer config file failed", slog.Error(err), slog.F("file", configFile)) } } - api.knownDevcontainers[knownIndex].Running = container.Running - api.knownDevcontainers[knownIndex].Container = container - - // Check if this container was created after the config - // file was modified. - if configFile != "" && api.knownDevcontainers[knownIndex].Dirty { - lastModified, hasModTime := api.configFileModifiedTimes[configFile] - if hasModTime && container.CreatedAt.After(lastModified) { - api.logger.Info(ctx, "clearing dirty flag for container created after config modification", - slog.F("container", container.ID), - slog.F("created_at", container.CreatedAt), - slog.F("config_modified_at", lastModified), - slog.F("file", configFile), - ) - api.knownDevcontainers[knownIndex].Dirty = false - } - } + dc.Running = container.Running + dc.Container = container + dc.Dirty = container.DevcontainerDirty + updatedDevcontainers[workspaceFolder] = true continue } @@ -428,9 +467,73 @@ func (api *API) getContainers(ctx context.Context) (codersdk.WorkspaceAgentListC Dirty: container.DevcontainerDirty, Container: container, }) + updatedDevcontainers[workspaceFolder] = true + } + + for i := range api.knownDevcontainers { + if _, ok := updatedDevcontainers[api.knownDevcontainers[i].WorkspaceFolder]; ok { + continue + } + + dc := &api.knownDevcontainers[i] + + if !dc.Running && !dc.Dirty && dc.Container == nil { + // Already marked 
as not running, skip. + continue + } + + api.logger.Debug(ctx, "devcontainer is not running anymore, marking as not running", + slog.F("workspace_folder", dc.WorkspaceFolder), + slog.F("config_path", dc.ConfigPath), + slog.F("name", dc.Name), + ) + dc.Running = false + dc.Dirty = false + dc.Container = nil + } + + api.containers = updated + api.containersErr = nil +} + +// refreshContainers triggers an immediate update of the container list +// and waits for it to complete. +func (api *API) refreshContainers(ctx context.Context) (err error) { + defer func() { + if err != nil { + err = xerrors.Errorf("refresh containers failed: %w", err) + } + }() + + done := make(chan error, 1) + select { + case <-api.ctx.Done(): + return xerrors.Errorf("API closed: %w", api.ctx.Err()) + case <-ctx.Done(): + return ctx.Err() + case api.updateTrigger <- done: + select { + case <-api.ctx.Done(): + return xerrors.Errorf("API closed: %w", api.ctx.Err()) + case <-ctx.Done(): + return ctx.Err() + case err := <-done: + return err + } } +} - return copyListContainersResponse(api.containers), nil +func (api *API) getContainers() (codersdk.WorkspaceAgentListContainersResponse, error) { + api.mu.RLock() + defer api.mu.RUnlock() + + if api.containersErr != nil { + return codersdk.WorkspaceAgentListContainersResponse{}, api.containersErr + } + return codersdk.WorkspaceAgentListContainersResponse{ + Containers: slices.Clone(api.containers.Containers), + Warnings: slices.Clone(api.containers.Warnings), + }, nil } // handleDevcontainerRecreate handles the HTTP request to recreate a @@ -447,7 +550,7 @@ func (api *API) handleDevcontainerRecreate(w http.ResponseWriter, r *http.Reques return } - containers, err := api.getContainers(ctx) + containers, err := api.getContainers() if err != nil { httpapi.Write(ctx, w, http.StatusInternalServerError, codersdk.Response{ Message: "Could not list containers", @@ -509,30 +612,9 @@ func (api *API) handleDevcontainerRecreate(w http.ResponseWriter, r *http.Reques 
return } - // TODO(mafredri): Temporarily handle clearing the dirty state after - // recreation, later on this should be handled by a "container watcher". - if !api.doLockedHandler(w, r, func() { - for i := range api.knownDevcontainers { - if api.knownDevcontainers[i].WorkspaceFolder == workspaceFolder { - if api.knownDevcontainers[i].Dirty { - api.logger.Info(ctx, "clearing dirty flag after recreation", - slog.F("workspace_folder", workspaceFolder), - slog.F("name", api.knownDevcontainers[i].Name), - ) - api.knownDevcontainers[i].Dirty = false - // TODO(mafredri): This should be handled by a service that - // updates the devcontainer state periodically and on-demand. - api.knownDevcontainers[i].Container = nil - // Set the modified time to the zero value to indicate that - // the containers list must be refreshed. This will see to - // it that the new container is re-assigned. - api.mtime = time.Time{} - } - return - } - } - }) { - return + // NOTE(mafredri): This won't be needed once recreation is done async. + if err := api.refreshContainers(r.Context()); err != nil { + api.logger.Error(ctx, "failed to trigger immediate refresh after devcontainer recreation", slog.Error(err)) } w.WriteHeader(http.StatusNoContent) @@ -542,8 +624,10 @@ func (api *API) handleDevcontainerRecreate(w http.ResponseWriter, r *http.Reques func (api *API) handleDevcontainersList(w http.ResponseWriter, r *http.Request) { ctx := r.Context() - // Run getContainers to detect the latest devcontainers and their state. 
- _, err := api.getContainers(ctx) + api.mu.RLock() + err := api.containersErr + devcontainers := slices.Clone(api.knownDevcontainers) + api.mu.RUnlock() if err != nil { httpapi.Write(ctx, w, http.StatusInternalServerError, codersdk.Response{ Message: "Could not list containers", @@ -552,13 +636,6 @@ func (api *API) handleDevcontainersList(w http.ResponseWriter, r *http.Request) return } - var devcontainers []codersdk.WorkspaceAgentDevcontainer - if !api.doLockedHandler(w, r, func() { - devcontainers = slices.Clone(api.knownDevcontainers) - }) { - return - } - slices.SortFunc(devcontainers, func(a, b codersdk.WorkspaceAgentDevcontainer) int { if cmp := strings.Compare(a.WorkspaceFolder, b.WorkspaceFolder); cmp != 0 { return cmp @@ -576,75 +653,56 @@ func (api *API) handleDevcontainersList(w http.ResponseWriter, r *http.Request) // markDevcontainerDirty finds the devcontainer with the given config file path // and marks it as dirty. It acquires the lock before modifying the state. func (api *API) markDevcontainerDirty(configPath string, modifiedAt time.Time) { - ok := api.doLocked(func() { - // Record the timestamp of when this configuration file was modified. - api.configFileModifiedTimes[configPath] = modifiedAt + api.mu.Lock() + defer api.mu.Unlock() - for i := range api.knownDevcontainers { - if api.knownDevcontainers[i].ConfigPath != configPath { - continue - } + // Record the timestamp of when this configuration file was modified. + api.configFileModifiedTimes[configPath] = modifiedAt - // TODO(mafredri): Simplistic mark for now, we should check if the - // container is running and if the config file was modified after - // the container was created. 
- if !api.knownDevcontainers[i].Dirty { - api.logger.Info(api.ctx, "marking devcontainer as dirty", - slog.F("file", configPath), - slog.F("name", api.knownDevcontainers[i].Name), - slog.F("workspace_folder", api.knownDevcontainers[i].WorkspaceFolder), - slog.F("modified_at", modifiedAt), - ) - api.knownDevcontainers[i].Dirty = true - if api.knownDevcontainers[i].Container != nil { - api.knownDevcontainers[i].Container.DevcontainerDirty = true - } - } + for i := range api.knownDevcontainers { + dc := &api.knownDevcontainers[i] + if dc.ConfigPath != configPath { + continue } - }) - if !ok { - api.logger.Debug(api.ctx, "mark devcontainer dirty failed", slog.F("file", configPath)) - } -} -func (api *API) doLockedHandler(w http.ResponseWriter, r *http.Request, f func()) bool { - select { - case <-r.Context().Done(): - httpapi.Write(r.Context(), w, http.StatusRequestTimeout, codersdk.Response{ - Message: "Request canceled", - Detail: "Request was canceled before we could process it.", - }) - return false - case <-api.ctx.Done(): - httpapi.Write(r.Context(), w, http.StatusServiceUnavailable, codersdk.Response{ - Message: "API closed", - Detail: "The API is closed and cannot process requests.", - }) - return false - case api.lockCh <- struct{}{}: - defer func() { <-api.lockCh }() + logger := api.logger.With( + slog.F("file", configPath), + slog.F("name", dc.Name), + slog.F("workspace_folder", dc.WorkspaceFolder), + slog.F("modified_at", modifiedAt), + ) + + // TODO(mafredri): Simplistic mark for now, we should check if the + // container is running and if the config file was modified after + // the container was created. 
+ if !dc.Dirty { + logger.Info(api.ctx, "marking devcontainer as dirty") + dc.Dirty = true + } + if dc.Container != nil && !dc.Container.DevcontainerDirty { + logger.Info(api.ctx, "marking devcontainer container as dirty") + dc.Container.DevcontainerDirty = true + } } - f() - return true } -func (api *API) doLocked(f func()) bool { - select { - case <-api.ctx.Done(): - return false - case api.lockCh <- struct{}{}: - defer func() { <-api.lockCh }() +func (api *API) Close() error { + api.mu.Lock() + if api.closed { + api.mu.Unlock() + return nil } - f() - return true -} + api.closed = true + + api.logger.Debug(api.ctx, "closing API") + defer api.logger.Debug(api.ctx, "closed API") -func (api *API) Close() error { api.cancel() - <-api.done err := api.watcher.Close() - if err != nil { - return err - } - return nil + + api.mu.Unlock() + <-api.watcherDone + <-api.updaterDone + + return err } diff --git a/agent/agentcontainers/api_test.go b/agent/agentcontainers/api_test.go index 2e173b7d5a6b4..a687cb8c001f8 100644 --- a/agent/agentcontainers/api_test.go +++ b/agent/agentcontainers/api_test.go @@ -161,15 +161,16 @@ func TestAPI(t *testing.T) { return codersdk.WorkspaceAgentListContainersResponse{Containers: cts} } + type initialDataPayload struct { + val codersdk.WorkspaceAgentListContainersResponse + err error + } + // Each test case is called multiple times to ensure idempotency for _, tc := range []struct { name string - // data to be stored in the handler - cacheData codersdk.WorkspaceAgentListContainersResponse - // duration of cache - cacheDur time.Duration - // relative age of the cached data - cacheAge time.Duration + // initialData to be stored in the handler + initialData initialDataPayload // function to set up expectations for the mock setupMock func(mcl *acmock.MockLister, preReq *gomock.Call) // expected result @@ -178,104 +179,119 @@ func TestAPI(t *testing.T) { expectedErr string }{ { - name: "no cache", + name: "no initial data", + initialData: 
initialDataPayload{makeResponse(), nil}, setupMock: func(mcl *acmock.MockLister, preReq *gomock.Call) { mcl.EXPECT().List(gomock.Any()).Return(makeResponse(fakeCt), nil).After(preReq).AnyTimes() }, expected: makeResponse(fakeCt), }, { - name: "no data", - cacheData: makeResponse(), - cacheAge: 2 * time.Second, - cacheDur: time.Second, + name: "repeat initial data", + initialData: initialDataPayload{makeResponse(fakeCt), nil}, + expected: makeResponse(fakeCt), + }, + { + name: "lister error always", + initialData: initialDataPayload{makeResponse(), assert.AnError}, + expectedErr: assert.AnError.Error(), + }, + { + name: "lister error only during initial data", + initialData: initialDataPayload{makeResponse(), assert.AnError}, setupMock: func(mcl *acmock.MockLister, preReq *gomock.Call) { mcl.EXPECT().List(gomock.Any()).Return(makeResponse(fakeCt), nil).After(preReq).AnyTimes() }, expected: makeResponse(fakeCt), }, { - name: "cached data", - cacheAge: time.Second, - cacheData: makeResponse(fakeCt), - cacheDur: 2 * time.Second, - expected: makeResponse(fakeCt), - }, - { - name: "lister error", + name: "lister error after initial data", + initialData: initialDataPayload{makeResponse(fakeCt), nil}, setupMock: func(mcl *acmock.MockLister, preReq *gomock.Call) { mcl.EXPECT().List(gomock.Any()).Return(makeResponse(), assert.AnError).After(preReq).AnyTimes() }, expectedErr: assert.AnError.Error(), }, { - name: "stale cache", - cacheAge: 2 * time.Second, - cacheData: makeResponse(fakeCt), - cacheDur: time.Second, + name: "updated data", + initialData: initialDataPayload{makeResponse(fakeCt), nil}, setupMock: func(mcl *acmock.MockLister, preReq *gomock.Call) { mcl.EXPECT().List(gomock.Any()).Return(makeResponse(fakeCt2), nil).After(preReq).AnyTimes() }, expected: makeResponse(fakeCt2), }, } { - tc := tc t.Run(tc.name, func(t *testing.T) { t.Parallel() var ( ctx = testutil.Context(t, testutil.WaitShort) - clk = quartz.NewMock(t) - ctrl = gomock.NewController(t) - mockLister = 
acmock.NewMockLister(ctrl) - now = time.Now().UTC() - logger = slogtest.Make(t, nil).Leveled(slog.LevelDebug) + mClock = quartz.NewMock(t) + tickerTrap = mClock.Trap().TickerFunc("updaterLoop") + mCtrl = gomock.NewController(t) + mLister = acmock.NewMockLister(mCtrl) + logger = slogtest.Make(t, &slogtest.Options{IgnoreErrors: true}).Leveled(slog.LevelDebug) r = chi.NewRouter() - api = agentcontainers.NewAPI(logger, - agentcontainers.WithCacheDuration(tc.cacheDur), - agentcontainers.WithClock(clk), - agentcontainers.WithLister(mockLister), - ) ) - defer api.Close() - - r.Mount("/", api.Routes()) - preReq := mockLister.EXPECT().List(gomock.Any()).Return(tc.cacheData, nil).Times(1) + initialDataCall := mLister.EXPECT().List(gomock.Any()).Return(tc.initialData.val, tc.initialData.err) if tc.setupMock != nil { - tc.setupMock(mockLister, preReq) + tc.setupMock(mLister, initialDataCall.Times(1)) + } else { + initialDataCall.AnyTimes() } - if tc.cacheAge != 0 { - clk.Set(now.Add(-tc.cacheAge)).MustWait(ctx) + api := agentcontainers.NewAPI(logger, + agentcontainers.WithClock(mClock), + agentcontainers.WithLister(mLister), + ) + defer api.Close() + r.Mount("/", api.Routes()) + + // Make sure the ticker function has been registered + // before advancing the clock. + tickerTrap.MustWait(ctx).Release() + tickerTrap.Close() + + // Initial request returns the initial data. + req := httptest.NewRequest(http.MethodGet, "/", nil). 
+ WithContext(ctx) + rec := httptest.NewRecorder() + r.ServeHTTP(rec, req) + + if tc.initialData.err != nil { + got := &codersdk.Error{} + err := json.NewDecoder(rec.Body).Decode(got) + require.NoError(t, err, "unmarshal response failed") + require.ErrorContains(t, got, tc.initialData.err.Error(), "want error") } else { - clk.Set(now).MustWait(ctx) + var got codersdk.WorkspaceAgentListContainersResponse + err := json.NewDecoder(rec.Body).Decode(&got) + require.NoError(t, err, "unmarshal response failed") + require.Equal(t, tc.initialData.val, got, "want initial data") } - // Prime the cache with the initial data. - req := httptest.NewRequest(http.MethodGet, "/", nil) - rec := httptest.NewRecorder() + // Advance the clock to run updaterLoop. + _, aw := mClock.AdvanceNext() + aw.MustWait(ctx) + + // Second request returns the updated data. + req = httptest.NewRequest(http.MethodGet, "/", nil). + WithContext(ctx) + rec = httptest.NewRecorder() r.ServeHTTP(rec, req) - clk.Set(now).MustWait(ctx) - - // Repeat the test to ensure idempotency - for i := 0; i < 2; i++ { - req = httptest.NewRequest(http.MethodGet, "/", nil) - rec = httptest.NewRecorder() - r.ServeHTTP(rec, req) - - if tc.expectedErr != "" { - got := &codersdk.Error{} - err := json.NewDecoder(rec.Body).Decode(got) - require.NoError(t, err, "unmarshal response failed") - require.ErrorContains(t, got, tc.expectedErr, "expected error (attempt %d)", i) - } else { - var got codersdk.WorkspaceAgentListContainersResponse - err := json.NewDecoder(rec.Body).Decode(&got) - require.NoError(t, err, "unmarshal response failed") - require.Equal(t, tc.expected, got, "expected containers to be equal (attempt %d)", i) - } + if tc.expectedErr != "" { + got := &codersdk.Error{} + err := json.NewDecoder(rec.Body).Decode(got) + require.NoError(t, err, "unmarshal response failed") + require.ErrorContains(t, got, tc.expectedErr, "want error") + return } + + var got codersdk.WorkspaceAgentListContainersResponse + err := 
json.NewDecoder(rec.Body).Decode(&got) + require.NoError(t, err, "unmarshal response failed") + require.Equal(t, tc.expected, got, "want updated data") }) } }) @@ -380,7 +396,7 @@ func TestAPI(t *testing.T) { t.Run(tt.name, func(t *testing.T) { t.Parallel() - logger := slogtest.Make(t, nil).Leveled(slog.LevelDebug) + logger := slogtest.Make(t, &slogtest.Options{IgnoreErrors: true}).Leveled(slog.LevelDebug) // Setup router with the handler under test. r := chi.NewRouter() @@ -393,8 +409,11 @@ func TestAPI(t *testing.T) { defer api.Close() r.Mount("/", api.Routes()) + ctx := testutil.Context(t, testutil.WaitShort) + // Simulate HTTP request to the recreate endpoint. - req := httptest.NewRequest(http.MethodPost, "/devcontainers/container/"+tt.containerID+"/recreate", nil) + req := httptest.NewRequest(http.MethodPost, "/devcontainers/container/"+tt.containerID+"/recreate", nil). + WithContext(ctx) rec := httptest.NewRecorder() r.ServeHTTP(rec, req) @@ -688,7 +707,7 @@ func TestAPI(t *testing.T) { t.Run(tt.name, func(t *testing.T) { t.Parallel() - logger := slogtest.Make(t, nil).Leveled(slog.LevelDebug) + logger := slogtest.Make(t, &slogtest.Options{IgnoreErrors: true}).Leveled(slog.LevelDebug) // Setup router with the handler under test. r := chi.NewRouter() @@ -712,9 +731,13 @@ func TestAPI(t *testing.T) { api := agentcontainers.NewAPI(logger, apiOptions...) defer api.Close() + r.Mount("/", api.Routes()) - req := httptest.NewRequest(http.MethodGet, "/devcontainers", nil) + ctx := testutil.Context(t, testutil.WaitShort) + + req := httptest.NewRequest(http.MethodGet, "/devcontainers", nil). 
+ WithContext(ctx) rec := httptest.NewRecorder() r.ServeHTTP(rec, req) @@ -739,15 +762,110 @@ func TestAPI(t *testing.T) { } }) + t.Run("List devcontainers running then not running", func(t *testing.T) { + t.Parallel() + + container := codersdk.WorkspaceAgentContainer{ + ID: "container-id", + FriendlyName: "container-name", + Running: true, + CreatedAt: time.Now().Add(-1 * time.Minute), + Labels: map[string]string{ + agentcontainers.DevcontainerLocalFolderLabel: "/home/coder/project", + agentcontainers.DevcontainerConfigFileLabel: "/home/coder/project/.devcontainer/devcontainer.json", + }, + } + dc := codersdk.WorkspaceAgentDevcontainer{ + ID: uuid.New(), + Name: "test-devcontainer", + WorkspaceFolder: "/home/coder/project", + ConfigPath: "/home/coder/project/.devcontainer/devcontainer.json", + } + + ctx := testutil.Context(t, testutil.WaitShort) + + logger := slogtest.Make(t, nil).Leveled(slog.LevelDebug) + fLister := &fakeLister{ + containers: codersdk.WorkspaceAgentListContainersResponse{ + Containers: []codersdk.WorkspaceAgentContainer{container}, + }, + } + fWatcher := newFakeWatcher(t) + mClock := quartz.NewMock(t) + mClock.Set(time.Now()).MustWait(ctx) + tickerTrap := mClock.Trap().TickerFunc("updaterLoop") + + api := agentcontainers.NewAPI(logger, + agentcontainers.WithClock(mClock), + agentcontainers.WithLister(fLister), + agentcontainers.WithWatcher(fWatcher), + agentcontainers.WithDevcontainers( + []codersdk.WorkspaceAgentDevcontainer{dc}, + []codersdk.WorkspaceAgentScript{{LogSourceID: uuid.New(), ID: dc.ID}}, + ), + ) + defer api.Close() + + // Make sure the ticker function has been registered + // before advancing any use of mClock.Advance. + tickerTrap.MustWait(ctx).Release() + tickerTrap.Close() + + // Make sure the start loop has been called. + fWatcher.waitNext(ctx) + + // Simulate a file modification event to make the devcontainer dirty. 
+ fWatcher.sendEventWaitNextCalled(ctx, fsnotify.Event{ + Name: "/home/coder/project/.devcontainer/devcontainer.json", + Op: fsnotify.Write, + }) + + // Initially the devcontainer should be running and dirty. + req := httptest.NewRequest(http.MethodGet, "/devcontainers", nil). + WithContext(ctx) + rec := httptest.NewRecorder() + api.Routes().ServeHTTP(rec, req) + + require.Equal(t, http.StatusOK, rec.Code) + var resp1 codersdk.WorkspaceAgentDevcontainersResponse + err := json.NewDecoder(rec.Body).Decode(&resp1) + require.NoError(t, err) + require.Len(t, resp1.Devcontainers, 1) + require.True(t, resp1.Devcontainers[0].Running, "devcontainer should be running initially") + require.True(t, resp1.Devcontainers[0].Dirty, "devcontainer should be dirty initially") + require.NotNil(t, resp1.Devcontainers[0].Container, "devcontainer should have a container initially") + + // Next, simulate a situation where the container is no longer + // running. + fLister.containers.Containers = []codersdk.WorkspaceAgentContainer{} + + // Trigger a refresh which will use the second response from mock + // lister (no containers). + _, aw := mClock.AdvanceNext() + aw.MustWait(ctx) + + // Afterwards the devcontainer should not be running and not dirty. + req = httptest.NewRequest(http.MethodGet, "/devcontainers", nil). 
+ WithContext(ctx) + rec = httptest.NewRecorder() + api.Routes().ServeHTTP(rec, req) + + require.Equal(t, http.StatusOK, rec.Code) + var resp2 codersdk.WorkspaceAgentDevcontainersResponse + err = json.NewDecoder(rec.Body).Decode(&resp2) + require.NoError(t, err) + require.Len(t, resp2.Devcontainers, 1) + require.False(t, resp2.Devcontainers[0].Running, "devcontainer should not be running after empty list") + require.False(t, resp2.Devcontainers[0].Dirty, "devcontainer should not be dirty after empty list") + require.Nil(t, resp2.Devcontainers[0].Container, "devcontainer should not have a container after empty list") + }) + t.Run("FileWatcher", func(t *testing.T) { t.Parallel() - ctx := testutil.Context(t, testutil.WaitMedium) + ctx := testutil.Context(t, testutil.WaitShort) startTime := time.Date(2025, 1, 1, 12, 0, 0, 0, time.UTC) - mClock := quartz.NewMock(t) - mClock.Set(startTime) - fWatcher := newFakeWatcher(t) // Create a fake container with a config file. configPath := "/workspace/project/.devcontainer/devcontainer.json" @@ -762,6 +880,10 @@ func TestAPI(t *testing.T) { }, } + mClock := quartz.NewMock(t) + mClock.Set(startTime) + tickerTrap := mClock.Trap().TickerFunc("updaterLoop") + fWatcher := newFakeWatcher(t) fLister := &fakeLister{ containers: codersdk.WorkspaceAgentListContainersResponse{ Containers: []codersdk.WorkspaceAgentContainer{container}, @@ -777,14 +899,18 @@ func TestAPI(t *testing.T) { ) defer api.Close() - api.SignalReady() - r := chi.NewRouter() r.Mount("/", api.Routes()) + // Make sure the ticker function has been registered + // before advancing any use of mClock.Advance. + tickerTrap.MustWait(ctx).Release() + tickerTrap.Close() + // Call the list endpoint first to ensure config files are // detected and watched. - req := httptest.NewRequest(http.MethodGet, "/devcontainers", nil) + req := httptest.NewRequest(http.MethodGet, "/devcontainers", nil). 
+ WithContext(ctx) rec := httptest.NewRecorder() r.ServeHTTP(rec, req) require.Equal(t, http.StatusOK, rec.Code) @@ -813,10 +939,13 @@ func TestAPI(t *testing.T) { Op: fsnotify.Write, }) - mClock.Advance(time.Minute).MustWait(ctx) + // Advance the clock to run updaterLoop. + _, aw := mClock.AdvanceNext() + aw.MustWait(ctx) // Check if the container is marked as dirty. - req = httptest.NewRequest(http.MethodGet, "/devcontainers", nil) + req = httptest.NewRequest(http.MethodGet, "/devcontainers", nil). + WithContext(ctx) rec = httptest.NewRecorder() r.ServeHTTP(rec, req) require.Equal(t, http.StatusOK, rec.Code) @@ -830,15 +959,18 @@ func TestAPI(t *testing.T) { assert.True(t, response.Devcontainers[0].Container.DevcontainerDirty, "container should be marked as dirty after config file was modified") - mClock.Advance(time.Minute).MustWait(ctx) - container.ID = "new-container-id" // Simulate a new container ID after recreation. container.FriendlyName = "new-container-name" container.CreatedAt = mClock.Now() // Update the creation time. fLister.containers.Containers = []codersdk.WorkspaceAgentContainer{container} + // Advance the clock to run updaterLoop. + _, aw = mClock.AdvanceNext() + aw.MustWait(ctx) + // Check if dirty flag is cleared. - req = httptest.NewRequest(http.MethodGet, "/devcontainers", nil) + req = httptest.NewRequest(http.MethodGet, "/devcontainers", nil). 
+ WithContext(ctx) rec = httptest.NewRecorder() r.ServeHTTP(rec, req) require.Equal(t, http.StatusOK, rec.Code) diff --git a/cli/exp_mcp.go b/cli/exp_mcp.go index 6174f0cffbf0e..fb866666daf4a 100644 --- a/cli/exp_mcp.go +++ b/cli/exp_mcp.go @@ -255,7 +255,7 @@ func (*RootCmd) mcpConfigureClaudeCode() *serpent.Command { { Name: "app-status-slug", Description: "The app status slug to use when running the Coder MCP server.", - Env: "CODER_MCP_CLAUDE_APP_STATUS_SLUG", + Env: "CODER_MCP_APP_STATUS_SLUG", Flag: "claude-app-status-slug", Value: serpent.StringOf(&appStatusSlug), }, diff --git a/cli/open_test.go b/cli/open_test.go index 9ba16a32674e2..97d24f0634d9d 100644 --- a/cli/open_test.go +++ b/cli/open_test.go @@ -326,7 +326,7 @@ func TestOpenVSCodeDevContainer(t *testing.T) { }, }, }, nil, - ) + ).AnyTimes() client, workspace, agentToken := setupWorkspaceForAgent(t, func(agents []*proto.Agent) []*proto.Agent { agents[0].Directory = agentDir @@ -501,7 +501,7 @@ func TestOpenVSCodeDevContainer_NoAgentDirectory(t *testing.T) { }, }, }, nil, - ) + ).AnyTimes() client, workspace, agentToken := setupWorkspaceForAgent(t, func(agents []*proto.Agent) []*proto.Agent { agents[0].Name = agentName diff --git a/cli/parameterresolver.go b/cli/parameterresolver.go index 41c61d5315a77..40625331fa6aa 100644 --- a/cli/parameterresolver.go +++ b/cli/parameterresolver.go @@ -226,7 +226,7 @@ func (pr *ParameterResolver) resolveWithInput(resolved []codersdk.WorkspaceBuild if p != nil { continue } - // Parameter has not been resolved yet, so CLI needs to determine if user should input it. + // PreviewParameter has not been resolved yet, so CLI needs to determine if user should input it. 
firstTimeUse := pr.isFirstTimeUse(tvp.Name) promptParameterOption := pr.isLastBuildParameterInvalidOption(tvp) diff --git a/cli/server.go b/cli/server.go index c5532e07e7a81..1794044bce48f 100644 --- a/cli/server.go +++ b/cli/server.go @@ -87,6 +87,7 @@ import ( "github.com/coder/coder/v2/coderd/externalauth" "github.com/coder/coder/v2/coderd/gitsshkey" "github.com/coder/coder/v2/coderd/httpmw" + "github.com/coder/coder/v2/coderd/jobreaper" "github.com/coder/coder/v2/coderd/notifications" "github.com/coder/coder/v2/coderd/oauthpki" "github.com/coder/coder/v2/coderd/prometheusmetrics" @@ -95,7 +96,6 @@ import ( "github.com/coder/coder/v2/coderd/schedule" "github.com/coder/coder/v2/coderd/telemetry" "github.com/coder/coder/v2/coderd/tracing" - "github.com/coder/coder/v2/coderd/unhanger" "github.com/coder/coder/v2/coderd/updatecheck" "github.com/coder/coder/v2/coderd/util/ptr" "github.com/coder/coder/v2/coderd/util/slice" @@ -1124,14 +1124,14 @@ func (r *RootCmd) Server(newAPI func(context.Context, *coderd.Options) (*coderd. 
autobuildTicker := time.NewTicker(vals.AutobuildPollInterval.Value()) defer autobuildTicker.Stop() autobuildExecutor := autobuild.NewExecutor( - ctx, options.Database, options.Pubsub, options.PrometheusRegistry, coderAPI.TemplateScheduleStore, &coderAPI.Auditor, coderAPI.AccessControlStore, logger, autobuildTicker.C, options.NotificationsEnqueuer) + ctx, options.Database, options.Pubsub, options.PrometheusRegistry, coderAPI.TemplateScheduleStore, &coderAPI.Auditor, coderAPI.AccessControlStore, logger, autobuildTicker.C, options.NotificationsEnqueuer, coderAPI.Experiments) autobuildExecutor.Run() - hangDetectorTicker := time.NewTicker(vals.JobHangDetectorInterval.Value()) - defer hangDetectorTicker.Stop() - hangDetector := unhanger.New(ctx, options.Database, options.Pubsub, logger, hangDetectorTicker.C) - hangDetector.Start() - defer hangDetector.Close() + jobReaperTicker := time.NewTicker(vals.JobReaperDetectorInterval.Value()) + defer jobReaperTicker.Stop() + jobReaper := jobreaper.New(ctx, options.Database, options.Pubsub, logger, jobReaperTicker.C) + jobReaper.Start() + defer jobReaper.Close() waitForProvisionerJobs := false // Currently there is no way to ask the server to shut diff --git a/cli/ssh_test.go b/cli/ssh_test.go index 49f83daa0612a..147fc07372032 100644 --- a/cli/ssh_test.go +++ b/cli/ssh_test.go @@ -2056,12 +2056,6 @@ func TestSSH_Container(t *testing.T) { client, workspace, agentToken := setupWorkspaceForAgent(t) ctrl := gomock.NewController(t) mLister := acmock.NewMockLister(ctrl) - _ = agenttest.New(t, client.URL, agentToken, func(o *agent.Options) { - o.ExperimentalDevcontainersEnabled = true - o.ContainerAPIOptions = append(o.ContainerAPIOptions, agentcontainers.WithLister(mLister)) - }) - _ = coderdtest.NewWorkspaceAgentWaiter(t, client, workspace.ID).Wait() - mLister.EXPECT().List(gomock.Any()).Return(codersdk.WorkspaceAgentListContainersResponse{ Containers: []codersdk.WorkspaceAgentContainer{ { @@ -2070,7 +2064,12 @@ func 
TestSSH_Container(t *testing.T) { }, }, Warnings: nil, - }, nil) + }, nil).AnyTimes() + _ = agenttest.New(t, client.URL, agentToken, func(o *agent.Options) { + o.ExperimentalDevcontainersEnabled = true + o.ContainerAPIOptions = append(o.ContainerAPIOptions, agentcontainers.WithLister(mLister)) + }) + _ = coderdtest.NewWorkspaceAgentWaiter(t, client, workspace.ID).Wait() cID := uuid.NewString() inv, root := clitest.New(t, "ssh", workspace.Name, "-c", cID) diff --git a/cli/testdata/coder_list_--output_json.golden b/cli/testdata/coder_list_--output_json.golden index 5f293787de719..c37c89c4efe2a 100644 --- a/cli/testdata/coder_list_--output_json.golden +++ b/cli/testdata/coder_list_--output_json.golden @@ -15,6 +15,7 @@ "template_allow_user_cancel_workspace_jobs": false, "template_active_version_id": "============[version ID]============", "template_require_active_version": false, + "template_use_classic_parameter_flow": false, "latest_build": { "id": "========[workspace build ID]========", "created_at": "====[timestamp]=====", @@ -23,7 +24,6 @@ "workspace_name": "test-workspace", "workspace_owner_id": "==========[first user ID]===========", "workspace_owner_name": "testuser", - "workspace_owner_avatar_url": "", "template_version_id": "============[version ID]============", "template_version_name": "===========[version name]===========", "build_number": 1, diff --git a/cli/testdata/coder_users_edit-roles_--help.golden b/cli/testdata/coder_users_edit-roles_--help.golden index 02dd9155b4d4e..5a21c152e63fc 100644 --- a/cli/testdata/coder_users_edit-roles_--help.golden +++ b/cli/testdata/coder_users_edit-roles_--help.golden @@ -8,8 +8,7 @@ USAGE: OPTIONS: --roles string-array A list of roles to give to the user. This removes any existing roles - the user may have. The available roles are: auditor, member, owner, - template-admin, user-admin. + the user may have. -y, --yes bool Bypass prompts. 
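Note on the `edit-roles` help change above: the help text no longer lists the available roles, because the command now checks the requested roles at run time against the list returned by the server rather than a list compiled into the CLI. A minimal sketch of that check, assuming a hypothetical `validateRoles` helper and hard-coded role names standing in for the server response (neither is part of this change):

```go
package main

import (
	"fmt"
	"slices"
	"strings"
)

// validateRoles is a hypothetical helper for illustration only. It returns an
// error naming the first requested role that is not in the available list.
func validateRoles(available, given []string) error {
	for _, role := range given {
		if !slices.Contains(available, role) {
			return fmt.Errorf(
				"the role %s is not valid; use one or more of: %s",
				role, strings.Join(available, ", "),
			)
		}
	}
	return nil
}

func main() {
	// Assumption: in the real command these names come from the server;
	// they are hard-coded here so the sketch runs on its own.
	available := []string{"auditor", "member", "owner", "template-admin", "user-admin"}

	fmt.Println(validateRoles(available, []string{"owner", "member"}))
	fmt.Println(validateRoles(available, []string{"superuser"}))
}
```

Checking against a list fetched at run time means a server with a different role set does not require a CLI rebuild, at the cost of the help output no longer enumerating the roles.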
diff --git a/cli/testdata/coder_users_list_--output_json.golden b/cli/testdata/coder_users_list_--output_json.golden index 61b17e026d290..7243200f6bdb1 100644 --- a/cli/testdata/coder_users_list_--output_json.golden +++ b/cli/testdata/coder_users_list_--output_json.golden @@ -2,7 +2,6 @@ { "id": "==========[first user ID]===========", "username": "testuser", - "avatar_url": "", "name": "Test User", "email": "testuser@coder.com", "created_at": "====[timestamp]=====", @@ -23,8 +22,6 @@ { "id": "==========[second user ID]==========", "username": "testuser2", - "avatar_url": "", - "name": "", "email": "testuser2@coder.com", "created_at": "====[timestamp]=====", "updated_at": "====[timestamp]=====", diff --git a/cli/testdata/server-config.yaml.golden b/cli/testdata/server-config.yaml.golden index fc76a6c2ec8a0..7403819a2d10b 100644 --- a/cli/testdata/server-config.yaml.golden +++ b/cli/testdata/server-config.yaml.golden @@ -183,7 +183,7 @@ networking: # Interval to poll for scheduled workspace builds. # (default: 1m0s, type: duration) autobuildPollInterval: 1m0s -# Interval to poll for hung jobs and automatically terminate them. +# Interval to poll for hung and pending jobs and automatically terminate them. # (default: 1m0s, type: duration) jobHangDetectorInterval: 1m0s introspection: @@ -704,3 +704,7 @@ workspace_prebuilds: # backoff. # (default: 1h0m0s, type: duration) reconciliation_backoff_lookback_period: 1h0m0s + # Maximum number of consecutive failed prebuilds before a preset hits the hard + # limit; disabled when set to zero. 
+ # (default: 3, type: int) + failure_hard_limit: 3 diff --git a/cli/usereditroles.go b/cli/usereditroles.go index 815d8f47dc186..5bdad7a66863b 100644 --- a/cli/usereditroles.go +++ b/cli/usereditroles.go @@ -1,32 +1,19 @@ package cli import ( - "fmt" "slices" - "sort" "strings" "golang.org/x/xerrors" "github.com/coder/coder/v2/cli/cliui" - "github.com/coder/coder/v2/coderd/rbac" "github.com/coder/coder/v2/codersdk" "github.com/coder/serpent" ) func (r *RootCmd) userEditRoles() *serpent.Command { client := new(codersdk.Client) - - roles := rbac.SiteRoles() - - siteRoles := make([]string, 0) - for _, role := range roles { - siteRoles = append(siteRoles, role.Identifier.Name) - } - sort.Strings(siteRoles) - var givenRoles []string - cmd := &serpent.Command{ Use: "edit-roles ", Short: "Edit a user's roles by username or id", @@ -34,7 +21,7 @@ func (r *RootCmd) userEditRoles() *serpent.Command { cliui.SkipPromptOption(), { Name: "roles", - Description: fmt.Sprintf("A list of roles to give to the user. This removes any existing roles the user may have. The available roles are: %s.", strings.Join(siteRoles, ", ")), + Description: "A list of roles to give to the user. 
This removes any existing roles the user may have.", Flag: "roles", Value: serpent.StringArrayOf(&givenRoles), }, @@ -52,13 +39,21 @@ func (r *RootCmd) userEditRoles() *serpent.Command { if err != nil { return xerrors.Errorf("fetch user roles: %w", err) } + siteRoles, err := client.ListSiteRoles(ctx) + if err != nil { + return xerrors.Errorf("fetch site roles: %w", err) + } + siteRoleNames := make([]string, 0, len(siteRoles)) + for _, role := range siteRoles { + siteRoleNames = append(siteRoleNames, role.Name) + } var selectedRoles []string if len(givenRoles) > 0 { // Make sure all of the given roles are valid site roles for _, givenRole := range givenRoles { - if !slices.Contains(siteRoles, givenRole) { - siteRolesPretty := strings.Join(siteRoles, ", ") + if !slices.Contains(siteRoleNames, givenRole) { + siteRolesPretty := strings.Join(siteRoleNames, ", ") return xerrors.Errorf("The role %s is not valid. Please use one or more of the following roles: %s\n", givenRole, siteRolesPretty) } } @@ -67,7 +62,7 @@ func (r *RootCmd) userEditRoles() *serpent.Command { } else { selectedRoles, err = cliui.MultiSelect(inv, cliui.MultiSelectOptions{ Message: "Select the roles you'd like to assign to the user", - Options: siteRoles, + Options: siteRoleNames, Defaults: userRoles.Roles, }) if err != nil { diff --git a/coderd/agentapi/manifest.go b/coderd/agentapi/manifest.go index 66bfe4cb5f94f..855ff4b8acd37 100644 --- a/coderd/agentapi/manifest.go +++ b/coderd/agentapi/manifest.go @@ -47,7 +47,6 @@ func (a *ManifestAPI) GetManifest(ctx context.Context, _ *agentproto.GetManifest scripts []database.WorkspaceAgentScript metadata []database.WorkspaceAgentMetadatum workspace database.Workspace - owner database.User devcontainers []database.WorkspaceAgentDevcontainer ) @@ -76,10 +75,6 @@ func (a *ManifestAPI) GetManifest(ctx context.Context, _ *agentproto.GetManifest if err != nil { return xerrors.Errorf("getting workspace by id: %w", err) } - owner, err = a.Database.GetUserByID(ctx, 
workspace.OwnerID) - if err != nil { - return xerrors.Errorf("getting workspace owner by id: %w", err) - } return err }) eg.Go(func() (err error) { @@ -98,7 +93,7 @@ func (a *ManifestAPI) GetManifest(ctx context.Context, _ *agentproto.GetManifest AppSlugOrPort: "{{port}}", AgentName: workspaceAgent.Name, WorkspaceName: workspace.Name, - Username: owner.Username, + Username: workspace.OwnerUsername, } vscodeProxyURI := vscodeProxyURI(appSlug, a.AccessURL, a.AppHostname) @@ -115,7 +110,7 @@ func (a *ManifestAPI) GetManifest(ctx context.Context, _ *agentproto.GetManifest } } - apps, err := dbAppsToProto(dbApps, workspaceAgent, owner.Username, workspace) + apps, err := dbAppsToProto(dbApps, workspaceAgent, workspace.OwnerUsername, workspace) if err != nil { return nil, xerrors.Errorf("converting workspace apps: %w", err) } @@ -128,7 +123,7 @@ func (a *ManifestAPI) GetManifest(ctx context.Context, _ *agentproto.GetManifest return &agentproto.Manifest{ AgentId: workspaceAgent.ID[:], AgentName: workspaceAgent.Name, - OwnerUsername: owner.Username, + OwnerUsername: workspace.OwnerUsername, WorkspaceId: workspace.ID[:], WorkspaceName: workspace.Name, GitAuthConfigs: gitAuthConfigs, diff --git a/coderd/agentapi/manifest_test.go b/coderd/agentapi/manifest_test.go index 9273acb0c40ff..fc46f5fe480f8 100644 --- a/coderd/agentapi/manifest_test.go +++ b/coderd/agentapi/manifest_test.go @@ -46,9 +46,10 @@ func TestGetManifest(t *testing.T) { Username: "cool-user", } workspace = database.Workspace{ - ID: uuid.New(), - OwnerID: owner.ID, - Name: "cool-workspace", + ID: uuid.New(), + OwnerID: owner.ID, + OwnerUsername: owner.Username, + Name: "cool-workspace", } agent = database.WorkspaceAgent{ ID: uuid.New(), @@ -336,7 +337,6 @@ func TestGetManifest(t *testing.T) { }).Return(metadata, nil) mDB.EXPECT().GetWorkspaceAgentDevcontainersByAgentID(gomock.Any(), agent.ID).Return(devcontainers, nil) mDB.EXPECT().GetWorkspaceByID(gomock.Any(), workspace.ID).Return(workspace, nil) - 
mDB.EXPECT().GetUserByID(gomock.Any(), workspace.OwnerID).Return(owner, nil) got, err := api.GetManifest(context.Background(), &agentproto.GetManifestRequest{}) require.NoError(t, err) @@ -404,7 +404,6 @@ func TestGetManifest(t *testing.T) { }).Return([]database.WorkspaceAgentMetadatum{}, nil) mDB.EXPECT().GetWorkspaceAgentDevcontainersByAgentID(gomock.Any(), childAgent.ID).Return([]database.WorkspaceAgentDevcontainer{}, nil) mDB.EXPECT().GetWorkspaceByID(gomock.Any(), workspace.ID).Return(workspace, nil) - mDB.EXPECT().GetUserByID(gomock.Any(), workspace.OwnerID).Return(owner, nil) got, err := api.GetManifest(context.Background(), &agentproto.GetManifestRequest{}) require.NoError(t, err) @@ -468,7 +467,6 @@ func TestGetManifest(t *testing.T) { }).Return(metadata, nil) mDB.EXPECT().GetWorkspaceAgentDevcontainersByAgentID(gomock.Any(), agent.ID).Return(devcontainers, nil) mDB.EXPECT().GetWorkspaceByID(gomock.Any(), workspace.ID).Return(workspace, nil) - mDB.EXPECT().GetUserByID(gomock.Any(), workspace.OwnerID).Return(owner, nil) got, err := api.GetManifest(context.Background(), &agentproto.GetManifestRequest{}) require.NoError(t, err) diff --git a/coderd/apidoc/docs.go b/coderd/apidoc/docs.go index 075f33aeac02f..7cee63e183e7e 100644 --- a/coderd/apidoc/docs.go +++ b/coderd/apidoc/docs.go @@ -11998,6 +11998,10 @@ const docTemplate = `{ "dry_run": { "type": "boolean" }, + "enable_dynamic_parameters": { + "description": "EnableDynamicParameters skips some of the static parameter checking.\nIt will default to whatever the template has marked as the default experience.\nRequires the \"dynamic-experiment\" to be used.", + "type": "boolean" + }, "log_level": { "description": "Log level changes the default logging verbosity of a provider (\"info\" if empty).", "enum": [ @@ -14322,6 +14326,10 @@ const docTemplate = `{ "codersdk.PrebuildsConfig": { "type": "object", "properties": { + "failure_hard_limit": { + "description": "FailureHardLimit defines the maximum number of 
consecutive failed prebuild attempts allowed\nbefore a preset is considered to be in a hard limit state. When a preset hits this limit,\nno new prebuilds will be created until the limit is reset.\nFailureHardLimit is disabled when set to zero.", + "type": "integer" + }, "reconciliation_backoff_interval": { "description": "ReconciliationBackoffInterval specifies the amount of time to increase the backoff interval\nwhen errors occur during reconciliation.", "type": "integer" @@ -14897,7 +14905,9 @@ const docTemplate = `{ "application_connect", "assign", "create", + "create_agent", "delete", + "delete_agent", "read", "read_personal", "ssh", @@ -14913,7 +14923,9 @@ const docTemplate = `{ "ActionApplicationConnect", "ActionAssign", "ActionCreate", + "ActionCreateAgent", "ActionDelete", + "ActionDeleteAgent", "ActionRead", "ActionReadPersonal", "ActionSSH", @@ -17006,6 +17018,9 @@ const docTemplate = `{ "template_require_active_version": { "type": "boolean" }, + "template_use_classic_parameter_flow": { + "type": "boolean" + }, "ttl_ms": { "type": "integer" }, diff --git a/coderd/apidoc/swagger.json b/coderd/apidoc/swagger.json index e00ab22232483..89a582091496f 100644 --- a/coderd/apidoc/swagger.json +++ b/coderd/apidoc/swagger.json @@ -10716,6 +10716,10 @@ "dry_run": { "type": "boolean" }, + "enable_dynamic_parameters": { + "description": "EnableDynamicParameters skips some of the static parameter checking.\nIt will default to whatever the template has marked as the default experience.\nRequires the \"dynamic-experiment\" to be used.", + "type": "boolean" + }, "log_level": { "description": "Log level changes the default logging verbosity of a provider (\"info\" if empty).", "enum": ["debug"], @@ -12964,6 +12968,10 @@ "codersdk.PrebuildsConfig": { "type": "object", "properties": { + "failure_hard_limit": { + "description": "FailureHardLimit defines the maximum number of consecutive failed prebuild attempts allowed\nbefore a preset is considered to be in a hard limit 
state. When a preset hits this limit,\nno new prebuilds will be created until the limit is reset.\nFailureHardLimit is disabled when set to zero.", + "type": "integer" + }, "reconciliation_backoff_interval": { "description": "ReconciliationBackoffInterval specifies the amount of time to increase the backoff interval\nwhen errors occur during reconciliation.", "type": "integer" @@ -13505,7 +13513,9 @@ "application_connect", "assign", "create", + "create_agent", "delete", + "delete_agent", "read", "read_personal", "ssh", @@ -13521,7 +13531,9 @@ "ActionApplicationConnect", "ActionAssign", "ActionCreate", + "ActionCreateAgent", "ActionDelete", + "ActionDeleteAgent", "ActionRead", "ActionReadPersonal", "ActionSSH", @@ -15513,6 +15525,9 @@ "template_require_active_version": { "type": "boolean" }, + "template_use_classic_parameter_flow": { + "type": "boolean" + }, "ttl_ms": { "type": "integer" }, diff --git a/coderd/autobuild/lifecycle_executor.go b/coderd/autobuild/lifecycle_executor.go index cc4e48b43544c..b0cba60111335 100644 --- a/coderd/autobuild/lifecycle_executor.go +++ b/coderd/autobuild/lifecycle_executor.go @@ -27,6 +27,7 @@ import ( "github.com/coder/coder/v2/coderd/notifications" "github.com/coder/coder/v2/coderd/schedule" "github.com/coder/coder/v2/coderd/wsbuilder" + "github.com/coder/coder/v2/codersdk" ) // Executor automatically starts or stops workspaces. @@ -43,6 +44,7 @@ type Executor struct { // NotificationsEnqueuer handles enqueueing notifications for delivery by SMTP, webhook, etc. notificationsEnqueuer notifications.Enqueuer reg prometheus.Registerer + experiments codersdk.Experiments metrics executorMetrics } @@ -59,7 +61,7 @@ type Stats struct { } // New returns a new wsactions executor. 
-func NewExecutor(ctx context.Context, db database.Store, ps pubsub.Pubsub, reg prometheus.Registerer, tss *atomic.Pointer[schedule.TemplateScheduleStore], auditor *atomic.Pointer[audit.Auditor], acs *atomic.Pointer[dbauthz.AccessControlStore], log slog.Logger, tick <-chan time.Time, enqueuer notifications.Enqueuer) *Executor { +func NewExecutor(ctx context.Context, db database.Store, ps pubsub.Pubsub, reg prometheus.Registerer, tss *atomic.Pointer[schedule.TemplateScheduleStore], auditor *atomic.Pointer[audit.Auditor], acs *atomic.Pointer[dbauthz.AccessControlStore], log slog.Logger, tick <-chan time.Time, enqueuer notifications.Enqueuer, exp codersdk.Experiments) *Executor { factory := promauto.With(reg) le := &Executor{ //nolint:gocritic // Autostart has a limited set of permissions. @@ -73,6 +75,7 @@ func NewExecutor(ctx context.Context, db database.Store, ps pubsub.Pubsub, reg p accessControlStore: acs, notificationsEnqueuer: enqueuer, reg: reg, + experiments: exp, metrics: executorMetrics{ autobuildExecutionDuration: factory.NewHistogram(prometheus.HistogramOpts{ Namespace: "coderd", @@ -258,6 +261,7 @@ func (e *Executor) runOnce(t time.Time) Stats { builder := wsbuilder.New(ws, nextTransition). SetLastWorkspaceBuildInTx(&latestBuild). SetLastWorkspaceBuildJobInTx(&latestJob). + Experiments(e.experiments). 
Reason(reason) log.Debug(e.ctx, "auto building workspace", slog.F("transition", nextTransition)) if nextTransition == database.WorkspaceTransitionStart && @@ -349,13 +353,18 @@ func (e *Executor) runOnce(t time.Time) Stats { nextBuildReason = string(nextBuild.Reason) } + templateVersionMessage := activeTemplateVersion.Message + if templateVersionMessage == "" { + templateVersionMessage = "None provided" + } + if _, err := e.notificationsEnqueuer.Enqueue(e.ctx, ws.OwnerID, notifications.TemplateWorkspaceAutoUpdated, map[string]string{ "name": ws.Name, "initiator": "autobuild", "reason": nextBuildReason, "template_version_name": activeTemplateVersion.Name, - "template_version_message": activeTemplateVersion.Message, + "template_version_message": templateVersionMessage, }, "autobuild", // Associate this notification with all the related entities. ws.ID, ws.OwnerID, ws.TemplateID, ws.OrganizationID, diff --git a/coderd/coderdtest/coderdtest.go b/coderd/coderdtest/coderdtest.go index b395a2cf2afbe..a8f444c8f632e 100644 --- a/coderd/coderdtest/coderdtest.go +++ b/coderd/coderdtest/coderdtest.go @@ -68,6 +68,7 @@ import ( "github.com/coder/coder/v2/coderd/externalauth" "github.com/coder/coder/v2/coderd/gitsshkey" "github.com/coder/coder/v2/coderd/httpmw" + "github.com/coder/coder/v2/coderd/jobreaper" "github.com/coder/coder/v2/coderd/notifications" "github.com/coder/coder/v2/coderd/notifications/notificationstest" "github.com/coder/coder/v2/coderd/rbac" @@ -75,7 +76,6 @@ import ( "github.com/coder/coder/v2/coderd/runtimeconfig" "github.com/coder/coder/v2/coderd/schedule" "github.com/coder/coder/v2/coderd/telemetry" - "github.com/coder/coder/v2/coderd/unhanger" "github.com/coder/coder/v2/coderd/updatecheck" "github.com/coder/coder/v2/coderd/util/ptr" "github.com/coder/coder/v2/coderd/webpush" @@ -354,6 +354,7 @@ func NewOptions(t testing.TB, options *Options) (func(http.Handler), context.Can auditor.Store(&options.Auditor) ctx, cancelFunc := 
context.WithCancel(context.Background()) + experiments := coderd.ReadExperiments(*options.Logger, options.DeploymentValues.Experiments) lifecycleExecutor := autobuild.NewExecutor( ctx, options.Database, @@ -365,14 +366,15 @@ func NewOptions(t testing.TB, options *Options) (func(http.Handler), context.Can *options.Logger, options.AutobuildTicker, options.NotificationsEnqueuer, + experiments, ).WithStatsChannel(options.AutobuildStats) lifecycleExecutor.Run() - hangDetectorTicker := time.NewTicker(options.DeploymentValues.JobHangDetectorInterval.Value()) - defer hangDetectorTicker.Stop() - hangDetector := unhanger.New(ctx, options.Database, options.Pubsub, options.Logger.Named("unhanger.detector"), hangDetectorTicker.C) - hangDetector.Start() - t.Cleanup(hangDetector.Close) + jobReaperTicker := time.NewTicker(options.DeploymentValues.JobReaperDetectorInterval.Value()) + defer jobReaperTicker.Stop() + jobReaper := jobreaper.New(ctx, options.Database, options.Pubsub, options.Logger.Named("reaper.detector"), jobReaperTicker.C) + jobReaper.Start() + t.Cleanup(jobReaper.Close) if options.TelemetryReporter == nil { options.TelemetryReporter = telemetry.NewNoop() diff --git a/coderd/database/db2sdk/db2sdk.go b/coderd/database/db2sdk/db2sdk.go index 18d1d8a6ac788..ed258a07820ab 100644 --- a/coderd/database/db2sdk/db2sdk.go +++ b/coderd/database/db2sdk/db2sdk.go @@ -12,6 +12,7 @@ import ( "time" "github.com/google/uuid" + "github.com/hashicorp/hcl/v2" "golang.org/x/xerrors" "tailscale.com/tailcfg" @@ -24,6 +25,7 @@ import ( "github.com/coder/coder/v2/codersdk" "github.com/coder/coder/v2/provisionersdk/proto" "github.com/coder/coder/v2/tailnet" + previewtypes "github.com/coder/preview/types" ) // List is a helper function to reduce boilerplate when converting slices of @@ -764,3 +766,83 @@ func Chat(chat database.Chat) codersdk.Chat { func Chats(chats []database.Chat) []codersdk.Chat { return List(chats, Chat) } + +func PreviewParameter(param previewtypes.Parameter) 
codersdk.PreviewParameter { + return codersdk.PreviewParameter{ + PreviewParameterData: codersdk.PreviewParameterData{ + Name: param.Name, + DisplayName: param.DisplayName, + Description: param.Description, + Type: codersdk.OptionType(param.Type), + FormType: codersdk.ParameterFormType(param.FormType), + Styling: codersdk.PreviewParameterStyling{ + Placeholder: param.Styling.Placeholder, + Disabled: param.Styling.Disabled, + Label: param.Styling.Label, + }, + Mutable: param.Mutable, + DefaultValue: PreviewHCLString(param.DefaultValue), + Icon: param.Icon, + Options: List(param.Options, PreviewParameterOption), + Validations: List(param.Validations, PreviewParameterValidation), + Required: param.Required, + Order: param.Order, + Ephemeral: param.Ephemeral, + }, + Value: PreviewHCLString(param.Value), + Diagnostics: PreviewDiagnostics(param.Diagnostics), + } +} + +func HCLDiagnostics(d hcl.Diagnostics) []codersdk.FriendlyDiagnostic { + return PreviewDiagnostics(previewtypes.Diagnostics(d)) +} + +func PreviewDiagnostics(d previewtypes.Diagnostics) []codersdk.FriendlyDiagnostic { + f := d.FriendlyDiagnostics() + return List(f, func(f previewtypes.FriendlyDiagnostic) codersdk.FriendlyDiagnostic { + return codersdk.FriendlyDiagnostic{ + Severity: codersdk.DiagnosticSeverityString(f.Severity), + Summary: f.Summary, + Detail: f.Detail, + Extra: codersdk.DiagnosticExtra{ + Code: f.Extra.Code, + }, + } + }) +} + +func PreviewHCLString(h previewtypes.HCLString) codersdk.NullHCLString { + n := h.NullHCLString() + return codersdk.NullHCLString{ + Value: n.Value, + Valid: n.Valid, + } +} + +func PreviewParameterOption(o *previewtypes.ParameterOption) codersdk.PreviewParameterOption { + if o == nil { + // This should never be sent + return codersdk.PreviewParameterOption{} + } + return codersdk.PreviewParameterOption{ + Name: o.Name, + Description: o.Description, + Value: PreviewHCLString(o.Value), + Icon: o.Icon, + } +} + +func PreviewParameterValidation(v 
*previewtypes.ParameterValidation) codersdk.PreviewParameterValidation { + if v == nil { + // This should never be sent + return codersdk.PreviewParameterValidation{} + } + return codersdk.PreviewParameterValidation{ + Error: v.Error, + Regex: v.Regex, + Min: v.Min, + Max: v.Max, + Monotonic: v.Monotonic, + } +} diff --git a/coderd/database/dbauthz/dbauthz.go b/coderd/database/dbauthz/dbauthz.go index 928dee0e30ea3..a210599d17cc4 100644 --- a/coderd/database/dbauthz/dbauthz.go +++ b/coderd/database/dbauthz/dbauthz.go @@ -170,14 +170,14 @@ var ( Identifier: rbac.RoleIdentifier{Name: "provisionerd"}, DisplayName: "Provisioner Daemon", Site: rbac.Permissions(map[string][]policy.Action{ - // TODO: Add ProvisionerJob resource type. - rbac.ResourceFile.Type: {policy.ActionRead}, - rbac.ResourceSystem.Type: {policy.WildcardSymbol}, - rbac.ResourceTemplate.Type: {policy.ActionRead, policy.ActionUpdate}, + rbac.ResourceProvisionerJobs.Type: {policy.ActionRead, policy.ActionUpdate, policy.ActionCreate}, + rbac.ResourceFile.Type: {policy.ActionRead}, + rbac.ResourceSystem.Type: {policy.WildcardSymbol}, + rbac.ResourceTemplate.Type: {policy.ActionRead, policy.ActionUpdate}, // Unsure why provisionerd needs update and read personal rbac.ResourceUser.Type: {policy.ActionRead, policy.ActionReadPersonal, policy.ActionUpdatePersonal}, rbac.ResourceWorkspaceDormant.Type: {policy.ActionDelete, policy.ActionRead, policy.ActionUpdate, policy.ActionWorkspaceStop}, - rbac.ResourceWorkspace.Type: {policy.ActionDelete, policy.ActionRead, policy.ActionUpdate, policy.ActionWorkspaceStart, policy.ActionWorkspaceStop}, + rbac.ResourceWorkspace.Type: {policy.ActionDelete, policy.ActionRead, policy.ActionUpdate, policy.ActionWorkspaceStart, policy.ActionWorkspaceStop, policy.ActionCreateAgent}, rbac.ResourceApiKey.Type: {policy.WildcardSymbol}, // When org scoped provisioner credentials are implemented, // this can be reduced to read a specific org. 
@@ -219,19 +219,20 @@ var ( Scope: rbac.ScopeAll, }.WithCachedASTValue() - // See unhanger package. - subjectHangDetector = rbac.Subject{ - Type: rbac.SubjectTypeHangDetector, - FriendlyName: "Hang Detector", + // See reaper package. + subjectJobReaper = rbac.Subject{ + Type: rbac.SubjectTypeJobReaper, + FriendlyName: "Job Reaper", ID: uuid.Nil.String(), Roles: rbac.Roles([]rbac.Role{ { - Identifier: rbac.RoleIdentifier{Name: "hangdetector"}, - DisplayName: "Hang Detector Daemon", + Identifier: rbac.RoleIdentifier{Name: "jobreaper"}, + DisplayName: "Job Reaper Daemon", Site: rbac.Permissions(map[string][]policy.Action{ - rbac.ResourceSystem.Type: {policy.WildcardSymbol}, - rbac.ResourceTemplate.Type: {policy.ActionRead}, - rbac.ResourceWorkspace.Type: {policy.ActionRead, policy.ActionUpdate}, + rbac.ResourceSystem.Type: {policy.WildcardSymbol}, + rbac.ResourceTemplate.Type: {policy.ActionRead}, + rbac.ResourceWorkspace.Type: {policy.ActionRead, policy.ActionUpdate}, + rbac.ResourceProvisionerJobs.Type: {policy.ActionRead, policy.ActionUpdate}, }), Org: map[string][]rbac.Permission{}, User: []rbac.Permission{}, @@ -338,7 +339,7 @@ var ( rbac.ResourceProvisionerDaemon.Type: {policy.ActionCreate, policy.ActionRead, policy.ActionUpdate}, rbac.ResourceUser.Type: rbac.ResourceUser.AvailableActions(), rbac.ResourceWorkspaceDormant.Type: {policy.ActionUpdate, policy.ActionDelete, policy.ActionWorkspaceStop}, - rbac.ResourceWorkspace.Type: {policy.ActionUpdate, policy.ActionDelete, policy.ActionWorkspaceStart, policy.ActionWorkspaceStop, policy.ActionSSH}, + rbac.ResourceWorkspace.Type: {policy.ActionUpdate, policy.ActionDelete, policy.ActionWorkspaceStart, policy.ActionWorkspaceStop, policy.ActionSSH, policy.ActionCreateAgent, policy.ActionDeleteAgent}, rbac.ResourceWorkspaceProxy.Type: {policy.ActionCreate, policy.ActionUpdate, policy.ActionDelete}, rbac.ResourceDeploymentConfig.Type: {policy.ActionCreate, policy.ActionUpdate, policy.ActionDelete}, 
rbac.ResourceNotificationMessage.Type: {policy.ActionCreate, policy.ActionRead, policy.ActionUpdate, policy.ActionDelete}, @@ -346,6 +347,7 @@ var ( rbac.ResourceNotificationTemplate.Type: {policy.ActionCreate, policy.ActionUpdate, policy.ActionDelete}, rbac.ResourceCryptoKey.Type: {policy.ActionCreate, policy.ActionUpdate, policy.ActionDelete}, rbac.ResourceFile.Type: {policy.ActionCreate, policy.ActionRead}, + rbac.ResourceProvisionerJobs.Type: {policy.ActionRead, policy.ActionUpdate, policy.ActionCreate}, }), Org: map[string][]rbac.Permission{}, User: []rbac.Permission{}, @@ -407,10 +409,10 @@ func AsAutostart(ctx context.Context) context.Context { return As(ctx, subjectAutostart) } -// AsHangDetector returns a context with an actor that has permissions required -// for unhanger.Detector to function. -func AsHangDetector(ctx context.Context) context.Context { - return As(ctx, subjectHangDetector) +// AsJobReaper returns a context with an actor that has permissions required +// for reaper.Detector to function. +func AsJobReaper(ctx context.Context) context.Context { + return As(ctx, subjectJobReaper) } // AsKeyRotator returns a context with an actor that has permissions required for rotating crypto keys. @@ -1085,11 +1087,10 @@ func (q *querier) AcquireNotificationMessages(ctx context.Context, arg database. 
return q.db.AcquireNotificationMessages(ctx, arg) } -// TODO: We need to create a ProvisionerJob resource type func (q *querier) AcquireProvisionerJob(ctx context.Context, arg database.AcquireProvisionerJobParams) (database.ProvisionerJob, error) { - // if err := q.authorizeContext(ctx, policy.ActionUpdate, rbac.ResourceSystem); err != nil { - // return database.ProvisionerJob{}, err - // } + if err := q.authorizeContext(ctx, policy.ActionUpdate, rbac.ResourceProvisionerJobs); err != nil { + return database.ProvisionerJob{}, err + } return q.db.AcquireProvisionerJob(ctx, arg) } @@ -1912,14 +1913,6 @@ func (q *querier) GetHealthSettings(ctx context.Context) (string, error) { return q.db.GetHealthSettings(ctx) } -// TODO: We need to create a ProvisionerJob resource type -func (q *querier) GetHungProvisionerJobs(ctx context.Context, hungSince time.Time) ([]database.ProvisionerJob, error) { - // if err := q.authorizeContext(ctx, policy.ActionCreate, rbac.ResourceSystem); err != nil { - // return nil, err - // } - return q.db.GetHungProvisionerJobs(ctx, hungSince) -} - func (q *querier) GetInboxNotificationByID(ctx context.Context, id uuid.UUID) (database.InboxNotification, error) { return fetchWithAction(q.log, q.auth, policy.ActionRead, q.db.GetInboxNotificationByID)(ctx, id) } @@ -2233,6 +2226,15 @@ func (q *querier) GetPresetParametersByTemplateVersionID(ctx context.Context, ar return q.db.GetPresetParametersByTemplateVersionID(ctx, args) } +func (q *querier) GetPresetsAtFailureLimit(ctx context.Context, hardLimit int64) ([]database.GetPresetsAtFailureLimitRow, error) { + // GetPresetsAtFailureLimit returns a list of template version presets that have reached the hard failure limit. + // Request the same authorization permissions as GetPresetsBackoff, since the methods are similar. 
+ if err := q.authorizeContext(ctx, policy.ActionViewInsights, rbac.ResourceTemplate.All()); err != nil { + return nil, err + } + return q.db.GetPresetsAtFailureLimit(ctx, hardLimit) +} + func (q *querier) GetPresetsBackoff(ctx context.Context, lookback time.Time) ([]database.GetPresetsBackoffRow, error) { // GetPresetsBackoff returns a list of template version presets along with metadata such as the number of failed prebuilds. if err := q.authorizeContext(ctx, policy.ActionViewInsights, rbac.ResourceTemplate.All()); err != nil { @@ -2307,6 +2309,13 @@ func (q *querier) GetProvisionerJobByID(ctx context.Context, id uuid.UUID) (data return job, nil } +func (q *querier) GetProvisionerJobByIDForUpdate(ctx context.Context, id uuid.UUID) (database.ProvisionerJob, error) { + if err := q.authorizeContext(ctx, policy.ActionRead, rbac.ResourceProvisionerJobs); err != nil { + return database.ProvisionerJob{}, err + } + return q.db.GetProvisionerJobByIDForUpdate(ctx, id) +} + func (q *querier) GetProvisionerJobTimingsByJobID(ctx context.Context, jobID uuid.UUID) ([]database.ProvisionerJobTiming, error) { _, err := q.GetProvisionerJobByID(ctx, jobID) if err != nil { @@ -2315,31 +2324,49 @@ func (q *querier) GetProvisionerJobTimingsByJobID(ctx context.Context, jobID uui return q.db.GetProvisionerJobTimingsByJobID(ctx, jobID) } -// TODO: We have a ProvisionerJobs resource, but it hasn't been checked for this use-case. 
func (q *querier) GetProvisionerJobsByIDs(ctx context.Context, ids []uuid.UUID) ([]database.ProvisionerJob, error) { - // if err := q.authorizeContext(ctx, policy.ActionRead, rbac.ResourceSystem); err != nil { - // return nil, err - // } - return q.db.GetProvisionerJobsByIDs(ctx, ids) + provisionerJobs, err := q.db.GetProvisionerJobsByIDs(ctx, ids) + if err != nil { + return nil, err + } + orgIDs := make(map[uuid.UUID]struct{}) + for _, job := range provisionerJobs { + orgIDs[job.OrganizationID] = struct{}{} + } + for orgID := range orgIDs { + if err := q.authorizeContext(ctx, policy.ActionRead, rbac.ResourceProvisionerJobs.InOrg(orgID)); err != nil { + return nil, err + } + } + return provisionerJobs, nil } -// TODO: We have a ProvisionerJobs resource, but it hasn't been checked for this use-case. func (q *querier) GetProvisionerJobsByIDsWithQueuePosition(ctx context.Context, ids []uuid.UUID) ([]database.GetProvisionerJobsByIDsWithQueuePositionRow, error) { + // TODO: Remove this once we have a proper rbac check for provisioner jobs. + // Details in https://github.com/coder/coder/issues/16160 return q.db.GetProvisionerJobsByIDsWithQueuePosition(ctx, ids) } func (q *querier) GetProvisionerJobsByOrganizationAndStatusWithQueuePositionAndProvisioner(ctx context.Context, arg database.GetProvisionerJobsByOrganizationAndStatusWithQueuePositionAndProvisionerParams) ([]database.GetProvisionerJobsByOrganizationAndStatusWithQueuePositionAndProvisionerRow, error) { + // TODO: Remove this once we have a proper rbac check for provisioner jobs. + // Details in https://github.com/coder/coder/issues/16160 return fetchWithPostFilter(q.auth, policy.ActionRead, q.db.GetProvisionerJobsByOrganizationAndStatusWithQueuePositionAndProvisioner)(ctx, arg) } -// TODO: We have a ProvisionerJobs resource, but it hasn't been checked for this use-case. 
func (q *querier) GetProvisionerJobsCreatedAfter(ctx context.Context, createdAt time.Time) ([]database.ProvisionerJob, error) { - // if err := q.authorizeContext(ctx, policy.ActionRead, rbac.ResourceSystem); err != nil { - // return nil, err - // } + if err := q.authorizeContext(ctx, policy.ActionRead, rbac.ResourceProvisionerJobs); err != nil { + return nil, err + } return q.db.GetProvisionerJobsCreatedAfter(ctx, createdAt) } +func (q *querier) GetProvisionerJobsToBeReaped(ctx context.Context, arg database.GetProvisionerJobsToBeReapedParams) ([]database.ProvisionerJob, error) { + if err := q.authorizeContext(ctx, policy.ActionRead, rbac.ResourceProvisionerJobs); err != nil { + return nil, err + } + return q.db.GetProvisionerJobsToBeReaped(ctx, arg) +} + func (q *querier) GetProvisionerKeyByHashedSecret(ctx context.Context, hashedSecret []byte) (database.ProvisionerKey, error) { return fetch(q.log, q.auth, q.db.GetProvisionerKeyByHashedSecret)(ctx, hashedSecret) } @@ -3162,6 +3189,10 @@ func (q *querier) GetWorkspaceByOwnerIDAndName(ctx context.Context, arg database return fetch(q.log, q.auth, q.db.GetWorkspaceByOwnerIDAndName)(ctx, arg) } +func (q *querier) GetWorkspaceByResourceID(ctx context.Context, resourceID uuid.UUID) (database.Workspace, error) { + return fetch(q.log, q.auth, q.db.GetWorkspaceByResourceID)(ctx, resourceID) +} + func (q *querier) GetWorkspaceByWorkspaceAppID(ctx context.Context, workspaceAppID uuid.UUID) (database.Workspace, error) { return fetch(q.log, q.auth, q.db.GetWorkspaceByWorkspaceAppID)(ctx, workspaceAppID) } @@ -3533,27 +3564,22 @@ func (q *querier) InsertPresetParameters(ctx context.Context, arg database.Inser return q.db.InsertPresetParameters(ctx, arg) } -// TODO: We need to create a ProvisionerJob resource type func (q *querier) InsertProvisionerJob(ctx context.Context, arg database.InsertProvisionerJobParams) (database.ProvisionerJob, error) { - // if err := q.authorizeContext(ctx, policy.ActionCreate, rbac.ResourceSystem); 
err != nil { - // return database.ProvisionerJob{}, err - // } + // TODO: Remove this once we have a proper rbac check for provisioner jobs. + // Details in https://github.com/coder/coder/issues/16160 return q.db.InsertProvisionerJob(ctx, arg) } -// TODO: We need to create a ProvisionerJob resource type func (q *querier) InsertProvisionerJobLogs(ctx context.Context, arg database.InsertProvisionerJobLogsParams) ([]database.ProvisionerJobLog, error) { - // if err := q.authorizeContext(ctx, policy.ActionCreate, rbac.ResourceSystem); err != nil { - // return nil, err - // } + // TODO: Remove this once we have a proper rbac check for provisioner jobs. + // Details in https://github.com/coder/coder/issues/16160 return q.db.InsertProvisionerJobLogs(ctx, arg) } -// TODO: We need to create a ProvisionerJob resource type func (q *querier) InsertProvisionerJobTimings(ctx context.Context, arg database.InsertProvisionerJobTimingsParams) ([]database.ProvisionerJobTiming, error) { - // if err := q.authorizeContext(ctx, policy.ActionCreate, rbac.ResourceSystem); err != nil { - // return nil, err - // } + if err := q.authorizeContext(ctx, policy.ActionUpdate, rbac.ResourceProvisionerJobs); err != nil { + return nil, err + } return q.db.InsertProvisionerJobTimings(ctx, arg) } @@ -3700,9 +3726,24 @@ func (q *querier) InsertWorkspace(ctx context.Context, arg database.InsertWorksp } func (q *querier) InsertWorkspaceAgent(ctx context.Context, arg database.InsertWorkspaceAgentParams) (database.WorkspaceAgent, error) { - if err := q.authorizeContext(ctx, policy.ActionCreate, rbac.ResourceSystem); err != nil { + // NOTE(DanielleMaywood): + // Currently, the only way to link a Resource back to a Workspace is by following this chain: + // + // WorkspaceResource -> WorkspaceBuild -> Workspace + // + // It is possible for this function to be called without there existing + // a `WorkspaceBuild` to link back to. 
This means that we want to allow + // execution to continue if there isn't a workspace found to allow this + // behavior to continue. + workspace, err := q.db.GetWorkspaceByResourceID(ctx, arg.ResourceID) + if err != nil && !errors.Is(err, sql.ErrNoRows) { return database.WorkspaceAgent{}, err } + + if err := q.authorizeContext(ctx, policy.ActionCreateAgent, workspace); err != nil { + return database.WorkspaceAgent{}, err + } + return q.db.InsertWorkspaceAgent(ctx, arg) } @@ -4169,6 +4210,24 @@ func (q *querier) UpdateOrganizationDeletedByID(ctx context.Context, arg databas return deleteQ(q.log, q.auth, q.db.GetOrganizationByID, deleteF)(ctx, arg.ID) } +func (q *querier) UpdatePresetPrebuildStatus(ctx context.Context, arg database.UpdatePresetPrebuildStatusParams) error { + preset, err := q.db.GetPresetByID(ctx, arg.PresetID) + if err != nil { + return err + } + + object := rbac.ResourceTemplate. + WithID(preset.TemplateID.UUID). + InOrg(preset.OrganizationID) + + err = q.authorizeContext(ctx, policy.ActionUpdate, object) + if err != nil { + return err + } + + return q.db.UpdatePresetPrebuildStatus(ctx, arg) +} + func (q *querier) UpdateProvisionerDaemonLastSeenAt(ctx context.Context, arg database.UpdateProvisionerDaemonLastSeenAtParams) error { if err := q.authorizeContext(ctx, policy.ActionUpdate, rbac.ResourceProvisionerDaemon); err != nil { return err @@ -4176,15 +4235,17 @@ func (q *querier) UpdateProvisionerDaemonLastSeenAt(ctx context.Context, arg dat return q.db.UpdateProvisionerDaemonLastSeenAt(ctx, arg) } -// TODO: We need to create a ProvisionerJob resource type func (q *querier) UpdateProvisionerJobByID(ctx context.Context, arg database.UpdateProvisionerJobByIDParams) error { - // if err := q.authorizeContext(ctx, policy.ActionUpdate, rbac.ResourceSystem); err != nil { - // return err - // } + if err := q.authorizeContext(ctx, policy.ActionUpdate, rbac.ResourceProvisionerJobs); err != nil { + return err + } return q.db.UpdateProvisionerJobByID(ctx, arg) 
} func (q *querier) UpdateProvisionerJobWithCancelByID(ctx context.Context, arg database.UpdateProvisionerJobWithCancelByIDParams) error { + // TODO: Remove this once we have a proper rbac check for provisioner jobs. + // Details in https://github.com/coder/coder/issues/16160 + job, err := q.db.GetProvisionerJobByID(ctx, arg.ID) if err != nil { return err @@ -4251,14 +4312,20 @@ func (q *querier) UpdateProvisionerJobWithCancelByID(ctx context.Context, arg da return q.db.UpdateProvisionerJobWithCancelByID(ctx, arg) } -// TODO: We need to create a ProvisionerJob resource type func (q *querier) UpdateProvisionerJobWithCompleteByID(ctx context.Context, arg database.UpdateProvisionerJobWithCompleteByIDParams) error { - // if err := q.authorizeContext(ctx, policy.ActionUpdate, rbac.ResourceSystem); err != nil { - // return err - // } + if err := q.authorizeContext(ctx, policy.ActionUpdate, rbac.ResourceProvisionerJobs); err != nil { + return err + } return q.db.UpdateProvisionerJobWithCompleteByID(ctx, arg) } +func (q *querier) UpdateProvisionerJobWithCompleteWithStartedAtByID(ctx context.Context, arg database.UpdateProvisionerJobWithCompleteWithStartedAtByIDParams) error { + if err := q.authorizeContext(ctx, policy.ActionUpdate, rbac.ResourceProvisionerJobs); err != nil { + return err + } + return q.db.UpdateProvisionerJobWithCompleteWithStartedAtByID(ctx, arg) +} + func (q *querier) UpdateReplica(ctx context.Context, arg database.UpdateReplicaParams) (database.Replica, error) { if err := q.authorizeContext(ctx, policy.ActionUpdate, rbac.ResourceSystem); err != nil { return database.Replica{}, err diff --git a/coderd/database/dbauthz/dbauthz_test.go b/coderd/database/dbauthz/dbauthz_test.go index a0289f222392b..703e51d739c47 100644 --- a/coderd/database/dbauthz/dbauthz_test.go +++ b/coderd/database/dbauthz/dbauthz_test.go @@ -694,9 +694,12 @@ func (s *MethodTestSuite) TestProvisionerJob() { Asserts(v.RBACObject(tpl), []policy.Action{policy.ActionRead, 
policy.ActionUpdate}).Returns() })) s.Run("GetProvisionerJobsByIDs", s.Subtest(func(db database.Store, check *expects) { - a := dbgen.ProvisionerJob(s.T(), db, nil, database.ProvisionerJob{}) - b := dbgen.ProvisionerJob(s.T(), db, nil, database.ProvisionerJob{}) - check.Args([]uuid.UUID{a.ID, b.ID}).Asserts().Returns(slice.New(a, b)) + o := dbgen.Organization(s.T(), db, database.Organization{}) + a := dbgen.ProvisionerJob(s.T(), db, nil, database.ProvisionerJob{OrganizationID: o.ID}) + b := dbgen.ProvisionerJob(s.T(), db, nil, database.ProvisionerJob{OrganizationID: o.ID}) + check.Args([]uuid.UUID{a.ID, b.ID}). + Asserts(rbac.ResourceProvisionerJobs.InOrg(o.ID), policy.ActionRead). + Returns(slice.New(a, b)) })) s.Run("GetProvisionerLogsAfterID", s.Subtest(func(db database.Store, check *expects) { u := dbgen.User(s.T(), db, database.User{}) @@ -1925,6 +1928,22 @@ func (s *MethodTestSuite) TestWorkspace() { }) check.Args(ws.ID).Asserts(ws, policy.ActionRead) })) + s.Run("GetWorkspaceByResourceID", s.Subtest(func(db database.Store, check *expects) { + u := dbgen.User(s.T(), db, database.User{}) + o := dbgen.Organization(s.T(), db, database.Organization{}) + j := dbgen.ProvisionerJob(s.T(), db, nil, database.ProvisionerJob{Type: database.ProvisionerJobTypeWorkspaceBuild}) + tpl := dbgen.Template(s.T(), db, database.Template{CreatedBy: u.ID, OrganizationID: o.ID}) + tv := dbgen.TemplateVersion(s.T(), db, database.TemplateVersion{ + TemplateID: uuid.NullUUID{UUID: tpl.ID, Valid: true}, + JobID: j.ID, + OrganizationID: o.ID, + CreatedBy: u.ID, + }) + ws := dbgen.Workspace(s.T(), db, database.WorkspaceTable{OwnerID: u.ID, TemplateID: tpl.ID, OrganizationID: o.ID}) + _ = dbgen.WorkspaceBuild(s.T(), db, database.WorkspaceBuild{WorkspaceID: ws.ID, JobID: j.ID, TemplateVersionID: tv.ID}) + res := dbgen.WorkspaceResource(s.T(), db, database.WorkspaceResource{JobID: j.ID}) + check.Args(res.ID).Asserts(ws, policy.ActionRead) + })) s.Run("GetWorkspaces", s.Subtest(func(_ 
database.Store, check *expects) { // No asserts here because SQLFilter. check.Args(database.GetWorkspacesParams{}).Asserts() @@ -3923,9 +3942,8 @@ func (s *MethodTestSuite) TestSystemFunctions() { check.Args().Asserts(rbac.ResourceSystem, policy.ActionDelete) })) s.Run("GetProvisionerJobsCreatedAfter", s.Subtest(func(db database.Store, check *expects) { - // TODO: add provisioner job resource type _ = dbgen.ProvisionerJob(s.T(), db, nil, database.ProvisionerJob{CreatedAt: time.Now().Add(-time.Hour)}) - check.Args(time.Now()).Asserts( /*rbac.ResourceSystem, policy.ActionRead*/ ) + check.Args(time.Now()).Asserts(rbac.ResourceProvisionerJobs, policy.ActionRead) })) s.Run("GetTemplateVersionsByIDs", s.Subtest(func(db database.Store, check *expects) { dbtestutil.DisableForeignKeysAndTriggers(s.T(), db) @@ -4008,20 +4026,33 @@ func (s *MethodTestSuite) TestSystemFunctions() { Returns([]database.WorkspaceAgent{agt}) })) s.Run("GetProvisionerJobsByIDs", s.Subtest(func(db database.Store, check *expects) { - // TODO: add a ProvisionerJob resource type - a := dbgen.ProvisionerJob(s.T(), db, nil, database.ProvisionerJob{}) - b := dbgen.ProvisionerJob(s.T(), db, nil, database.ProvisionerJob{}) + o := dbgen.Organization(s.T(), db, database.Organization{}) + a := dbgen.ProvisionerJob(s.T(), db, nil, database.ProvisionerJob{OrganizationID: o.ID}) + b := dbgen.ProvisionerJob(s.T(), db, nil, database.ProvisionerJob{OrganizationID: o.ID}) check.Args([]uuid.UUID{a.ID, b.ID}). - Asserts( /*rbac.ResourceSystem, policy.ActionRead*/ ). + Asserts(rbac.ResourceProvisionerJobs.InOrg(o.ID), policy.ActionRead). 
Returns(slice.New(a, b)) })) s.Run("InsertWorkspaceAgent", s.Subtest(func(db database.Store, check *expects) { - dbtestutil.DisableForeignKeysAndTriggers(s.T(), db) + u := dbgen.User(s.T(), db, database.User{}) + o := dbgen.Organization(s.T(), db, database.Organization{}) + j := dbgen.ProvisionerJob(s.T(), db, nil, database.ProvisionerJob{Type: database.ProvisionerJobTypeWorkspaceBuild}) + tpl := dbgen.Template(s.T(), db, database.Template{CreatedBy: u.ID, OrganizationID: o.ID}) + tv := dbgen.TemplateVersion(s.T(), db, database.TemplateVersion{ + TemplateID: uuid.NullUUID{UUID: tpl.ID, Valid: true}, + JobID: j.ID, + OrganizationID: o.ID, + CreatedBy: u.ID, + }) + ws := dbgen.Workspace(s.T(), db, database.WorkspaceTable{OwnerID: u.ID, TemplateID: tpl.ID, OrganizationID: o.ID}) + _ = dbgen.WorkspaceBuild(s.T(), db, database.WorkspaceBuild{WorkspaceID: ws.ID, JobID: j.ID, TemplateVersionID: tv.ID}) + res := dbgen.WorkspaceResource(s.T(), db, database.WorkspaceResource{JobID: j.ID}) check.Args(database.InsertWorkspaceAgentParams{ ID: uuid.New(), + ResourceID: res.ID, Name: "dev", APIKeyScope: database.AgentKeyScopeEnumAll, - }).Asserts(rbac.ResourceSystem, policy.ActionCreate) + }).Asserts(ws, policy.ActionCreateAgent) })) s.Run("InsertWorkspaceApp", s.Subtest(func(db database.Store, check *expects) { dbtestutil.DisableForeignKeysAndTriggers(s.T(), db) @@ -4048,7 +4079,6 @@ func (s *MethodTestSuite) TestSystemFunctions() { }).Asserts(rbac.ResourceSystem, policy.ActionUpdate).Returns() })) s.Run("AcquireProvisionerJob", s.Subtest(func(db database.Store, check *expects) { - // TODO: we need to create a ProvisionerJob resource j := dbgen.ProvisionerJob(s.T(), db, nil, database.ProvisionerJob{ StartedAt: sql.NullTime{Valid: false}, UpdatedAt: time.Now(), @@ -4058,47 +4088,48 @@ func (s *MethodTestSuite) TestSystemFunctions() { OrganizationID: j.OrganizationID, Types: []database.ProvisionerType{j.Provisioner}, ProvisionerTags: must(json.Marshal(j.Tags)), - }).Asserts( 
/*rbac.ResourceSystem, policy.ActionUpdate*/ ) + }).Asserts(rbac.ResourceProvisionerJobs, policy.ActionUpdate) })) s.Run("UpdateProvisionerJobWithCompleteByID", s.Subtest(func(db database.Store, check *expects) { - // TODO: we need to create a ProvisionerJob resource j := dbgen.ProvisionerJob(s.T(), db, nil, database.ProvisionerJob{}) check.Args(database.UpdateProvisionerJobWithCompleteByIDParams{ ID: j.ID, - }).Asserts( /*rbac.ResourceSystem, policy.ActionUpdate*/ ) + }).Asserts(rbac.ResourceProvisionerJobs, policy.ActionUpdate) + })) + s.Run("UpdateProvisionerJobWithCompleteWithStartedAtByID", s.Subtest(func(db database.Store, check *expects) { + j := dbgen.ProvisionerJob(s.T(), db, nil, database.ProvisionerJob{}) + check.Args(database.UpdateProvisionerJobWithCompleteWithStartedAtByIDParams{ + ID: j.ID, + }).Asserts(rbac.ResourceProvisionerJobs, policy.ActionUpdate) })) s.Run("UpdateProvisionerJobByID", s.Subtest(func(db database.Store, check *expects) { - // TODO: we need to create a ProvisionerJob resource j := dbgen.ProvisionerJob(s.T(), db, nil, database.ProvisionerJob{}) check.Args(database.UpdateProvisionerJobByIDParams{ ID: j.ID, UpdatedAt: time.Now(), - }).Asserts( /*rbac.ResourceSystem, policy.ActionUpdate*/ ) + }).Asserts(rbac.ResourceProvisionerJobs, policy.ActionUpdate) })) s.Run("InsertProvisionerJob", s.Subtest(func(db database.Store, check *expects) { dbtestutil.DisableForeignKeysAndTriggers(s.T(), db) - // TODO: we need to create a ProvisionerJob resource check.Args(database.InsertProvisionerJobParams{ ID: uuid.New(), Provisioner: database.ProvisionerTypeEcho, StorageMethod: database.ProvisionerStorageMethodFile, Type: database.ProvisionerJobTypeWorkspaceBuild, Input: json.RawMessage("{}"), - }).Asserts( /*rbac.ResourceSystem, policy.ActionCreate*/ ) + }).Asserts( /* rbac.ResourceProvisionerJobs, policy.ActionCreate */ ) })) s.Run("InsertProvisionerJobLogs", s.Subtest(func(db database.Store, check *expects) { - // TODO: we need to create a 
ProvisionerJob resource j := dbgen.ProvisionerJob(s.T(), db, nil, database.ProvisionerJob{}) check.Args(database.InsertProvisionerJobLogsParams{ JobID: j.ID, - }).Asserts( /*rbac.ResourceSystem, policy.ActionCreate*/ ) + }).Asserts( /* rbac.ResourceProvisionerJobs, policy.ActionUpdate */ ) })) s.Run("InsertProvisionerJobTimings", s.Subtest(func(db database.Store, check *expects) { - // TODO: we need to create a ProvisionerJob resource j := dbgen.ProvisionerJob(s.T(), db, nil, database.ProvisionerJob{}) check.Args(database.InsertProvisionerJobTimingsParams{ JobID: j.ID, - }).Asserts( /*rbac.ResourceSystem, policy.ActionCreate*/ ) + }).Asserts(rbac.ResourceProvisionerJobs, policy.ActionUpdate) })) s.Run("UpsertProvisionerDaemon", s.Subtest(func(db database.Store, check *expects) { dbtestutil.DisableForeignKeysAndTriggers(s.T(), db) @@ -4234,8 +4265,8 @@ func (s *MethodTestSuite) TestSystemFunctions() { s.Run("GetFileTemplates", s.Subtest(func(db database.Store, check *expects) { check.Args(uuid.New()).Asserts(rbac.ResourceSystem, policy.ActionRead) })) - s.Run("GetHungProvisionerJobs", s.Subtest(func(db database.Store, check *expects) { - check.Args(time.Time{}).Asserts() + s.Run("GetProvisionerJobsToBeReaped", s.Subtest(func(db database.Store, check *expects) { + check.Args(database.GetProvisionerJobsToBeReapedParams{}).Asserts(rbac.ResourceProvisionerJobs, policy.ActionRead) })) s.Run("UpsertOAuthSigningKey", s.Subtest(func(db database.Store, check *expects) { check.Args("foo").Asserts(rbac.ResourceSystem, policy.ActionUpdate) @@ -4479,6 +4510,9 @@ func (s *MethodTestSuite) TestSystemFunctions() { VapidPrivateKey: "test", }).Asserts(rbac.ResourceDeploymentConfig, policy.ActionUpdate) })) + s.Run("GetProvisionerJobByIDForUpdate", s.Subtest(func(db database.Store, check *expects) { + check.Args(uuid.New()).Asserts(rbac.ResourceProvisionerJobs, policy.ActionRead).Errors(sql.ErrNoRows) + })) } func (s *MethodTestSuite) TestNotifications() { @@ -4890,6 +4924,11 @@ func 
(s *MethodTestSuite) TestPrebuilds() { Asserts(rbac.ResourceWorkspace.All(), policy.ActionRead). ErrorsWithInMemDB(dbmem.ErrUnimplemented) })) + s.Run("GetPresetsAtFailureLimit", s.Subtest(func(_ database.Store, check *expects) { + check.Args(int64(0)). + Asserts(rbac.ResourceTemplate.All(), policy.ActionViewInsights). + ErrorsWithInMemDB(dbmem.ErrUnimplemented) + })) s.Run("GetPresetsBackoff", s.Subtest(func(_ database.Store, check *expects) { check.Args(time.Time{}). Asserts(rbac.ResourceTemplate.All(), policy.ActionViewInsights). @@ -4937,8 +4976,34 @@ func (s *MethodTestSuite) TestPrebuilds() { }, InvalidateAfterSecs: preset.InvalidateAfterSecs, OrganizationID: org.ID, + PrebuildStatus: database.PrebuildStatusHealthy, }) })) + s.Run("UpdatePresetPrebuildStatus", s.Subtest(func(db database.Store, check *expects) { + org := dbgen.Organization(s.T(), db, database.Organization{}) + user := dbgen.User(s.T(), db, database.User{}) + template := dbgen.Template(s.T(), db, database.Template{ + OrganizationID: org.ID, + CreatedBy: user.ID, + }) + templateVersion := dbgen.TemplateVersion(s.T(), db, database.TemplateVersion{ + TemplateID: uuid.NullUUID{ + UUID: template.ID, + Valid: true, + }, + OrganizationID: org.ID, + CreatedBy: user.ID, + }) + preset := dbgen.Preset(s.T(), db, database.InsertPresetParams{ + TemplateVersionID: templateVersion.ID, + }) + req := database.UpdatePresetPrebuildStatusParams{ + PresetID: preset.ID, + Status: database.PrebuildStatusHealthy, + } + check.Args(req). 
+ Asserts(rbac.ResourceTemplate.WithID(template.ID).InOrg(org.ID), policy.ActionUpdate) + })) } func (s *MethodTestSuite) TestOAuth2ProviderApps() { diff --git a/coderd/database/dbmem/dbmem.go b/coderd/database/dbmem/dbmem.go index 7dec84f8aaeb0..1a1455d83045b 100644 --- a/coderd/database/dbmem/dbmem.go +++ b/coderd/database/dbmem/dbmem.go @@ -8,6 +8,7 @@ import ( "errors" "fmt" "math" + insecurerand "math/rand" //#nosec // this is only used for shuffling an array to pick random jobs to reap "reflect" "regexp" "slices" @@ -3707,23 +3708,6 @@ func (q *FakeQuerier) GetHealthSettings(_ context.Context) (string, error) { return string(q.healthSettings), nil } -func (q *FakeQuerier) GetHungProvisionerJobs(_ context.Context, hungSince time.Time) ([]database.ProvisionerJob, error) { - q.mutex.RLock() - defer q.mutex.RUnlock() - - hungJobs := []database.ProvisionerJob{} - for _, provisionerJob := range q.provisionerJobs { - if provisionerJob.StartedAt.Valid && !provisionerJob.CompletedAt.Valid && provisionerJob.UpdatedAt.Before(hungSince) { - // clone the Tags before appending, since maps are reference types and - // we don't want the caller to be able to mutate the map we have inside - // dbmem! 
- provisionerJob.Tags = maps.Clone(provisionerJob.Tags) - hungJobs = append(hungJobs, provisionerJob) - } - } - return hungJobs, nil -} - func (q *FakeQuerier) GetInboxNotificationByID(_ context.Context, id uuid.UUID) (database.InboxNotification, error) { q.mutex.RLock() defer q.mutex.RUnlock() @@ -4303,6 +4287,7 @@ func (q *FakeQuerier) GetPresetByID(ctx context.Context, presetID uuid.UUID) (da CreatedAt: preset.CreatedAt, DesiredInstances: preset.DesiredInstances, InvalidateAfterSecs: preset.InvalidateAfterSecs, + PrebuildStatus: preset.PrebuildStatus, TemplateID: tv.TemplateID, OrganizationID: tv.OrganizationID, }, nil @@ -4368,6 +4353,10 @@ func (q *FakeQuerier) GetPresetParametersByTemplateVersionID(_ context.Context, return parameters, nil } +func (q *FakeQuerier) GetPresetsAtFailureLimit(ctx context.Context, hardLimit int64) ([]database.GetPresetsAtFailureLimitRow, error) { + return nil, ErrUnimplemented +} + func (*FakeQuerier) GetPresetsBackoff(_ context.Context, _ time.Time) ([]database.GetPresetsBackoffRow, error) { return nil, ErrUnimplemented } @@ -4642,6 +4631,13 @@ func (q *FakeQuerier) GetProvisionerJobByID(ctx context.Context, id uuid.UUID) ( return q.getProvisionerJobByIDNoLock(ctx, id) } +func (q *FakeQuerier) GetProvisionerJobByIDForUpdate(ctx context.Context, id uuid.UUID) (database.ProvisionerJob, error) { + q.mutex.RLock() + defer q.mutex.RUnlock() + + return q.getProvisionerJobByIDNoLock(ctx, id) +} + func (q *FakeQuerier) GetProvisionerJobTimingsByJobID(_ context.Context, jobID uuid.UUID) ([]database.ProvisionerJobTiming, error) { q.mutex.RLock() defer q.mutex.RUnlock() @@ -4884,6 +4880,33 @@ func (q *FakeQuerier) GetProvisionerJobsCreatedAfter(_ context.Context, after ti return jobs, nil } +func (q *FakeQuerier) GetProvisionerJobsToBeReaped(_ context.Context, arg database.GetProvisionerJobsToBeReapedParams) ([]database.ProvisionerJob, error) { + q.mutex.RLock() + defer q.mutex.RUnlock() + maxJobs := arg.MaxJobs + + hungJobs := 
[]database.ProvisionerJob{} + for _, provisionerJob := range q.provisionerJobs { + if !provisionerJob.CompletedAt.Valid { + if (provisionerJob.StartedAt.Valid && provisionerJob.UpdatedAt.Before(arg.HungSince)) || + (!provisionerJob.StartedAt.Valid && provisionerJob.UpdatedAt.Before(arg.PendingSince)) { + // clone the Tags before appending, since maps are reference types and + // we don't want the caller to be able to mutate the map we have inside + // dbmem! + provisionerJob.Tags = maps.Clone(provisionerJob.Tags) + hungJobs = append(hungJobs, provisionerJob) + if len(hungJobs) >= int(maxJobs) { + break + } + } + } + } + insecurerand.Shuffle(len(hungJobs), func(i, j int) { + hungJobs[i], hungJobs[j] = hungJobs[j], hungJobs[i] + }) + return hungJobs, nil +} + func (q *FakeQuerier) GetProvisionerKeyByHashedSecret(_ context.Context, hashedSecret []byte) (database.ProvisionerKey, error) { q.mutex.RLock() defer q.mutex.RUnlock() @@ -8035,6 +8058,33 @@ func (q *FakeQuerier) GetWorkspaceByOwnerIDAndName(_ context.Context, arg databa return database.Workspace{}, sql.ErrNoRows } +func (q *FakeQuerier) GetWorkspaceByResourceID(ctx context.Context, resourceID uuid.UUID) (database.Workspace, error) { + q.mutex.RLock() + defer q.mutex.RUnlock() + + for _, resource := range q.workspaceResources { + if resource.ID != resourceID { + continue + } + + for _, build := range q.workspaceBuilds { + if build.JobID != resource.JobID { + continue + } + + for _, workspace := range q.workspaces { + if workspace.ID != build.WorkspaceID { + continue + } + + return q.extendWorkspace(workspace), nil + } + } + } + + return database.Workspace{}, sql.ErrNoRows +} + func (q *FakeQuerier) GetWorkspaceByWorkspaceAppID(_ context.Context, workspaceAppID uuid.UUID) (database.Workspace, error) { if err := validateDatabaseType(workspaceAppID); err != nil { return database.Workspace{}, err @@ -9044,6 +9094,7 @@ func (q *FakeQuerier) InsertPreset(_ context.Context, arg database.InsertPresetP Int32: 0, Valid: 
true, }, + PrebuildStatus: database.PrebuildStatusHealthy, } q.presets = append(q.presets, preset) return preset, nil @@ -10872,6 +10923,25 @@ func (q *FakeQuerier) UpdateOrganizationDeletedByID(_ context.Context, arg datab return sql.ErrNoRows } +func (q *FakeQuerier) UpdatePresetPrebuildStatus(ctx context.Context, arg database.UpdatePresetPrebuildStatusParams) error { + err := validateDatabaseType(arg) + if err != nil { + return err + } + + q.mutex.Lock() + defer q.mutex.Unlock() + + for i := range q.presets { + if q.presets[i].ID == arg.PresetID { + q.presets[i].PrebuildStatus = arg.Status + return nil + } + } + + return xerrors.Errorf("preset %v does not exist", arg.PresetID) +} + func (q *FakeQuerier) UpdateProvisionerDaemonLastSeenAt(_ context.Context, arg database.UpdateProvisionerDaemonLastSeenAtParams) error { err := validateDatabaseType(arg) if err != nil { @@ -10958,6 +11028,30 @@ func (q *FakeQuerier) UpdateProvisionerJobWithCompleteByID(_ context.Context, ar return sql.ErrNoRows } +func (q *FakeQuerier) UpdateProvisionerJobWithCompleteWithStartedAtByID(_ context.Context, arg database.UpdateProvisionerJobWithCompleteWithStartedAtByIDParams) error { + if err := validateDatabaseType(arg); err != nil { + return err + } + + q.mutex.Lock() + defer q.mutex.Unlock() + + for index, job := range q.provisionerJobs { + if arg.ID != job.ID { + continue + } + job.UpdatedAt = arg.UpdatedAt + job.CompletedAt = arg.CompletedAt + job.Error = arg.Error + job.ErrorCode = arg.ErrorCode + job.StartedAt = arg.StartedAt + job.JobStatus = provisionerJobStatus(job) + q.provisionerJobs[index] = job + return nil + } + return sql.ErrNoRows +} + func (q *FakeQuerier) UpdateReplica(_ context.Context, arg database.UpdateReplicaParams) (database.Replica, error) { if err := validateDatabaseType(arg); err != nil { return database.Replica{}, err diff --git a/coderd/database/dbmetrics/querymetrics.go b/coderd/database/dbmetrics/querymetrics.go index a5a22aad1a0bf..e35ec11b02453 100644 --- 
a/coderd/database/dbmetrics/querymetrics.go +++ b/coderd/database/dbmetrics/querymetrics.go @@ -865,13 +865,6 @@ func (m queryMetricsStore) GetHealthSettings(ctx context.Context) (string, error return r0, r1 } -func (m queryMetricsStore) GetHungProvisionerJobs(ctx context.Context, hungSince time.Time) ([]database.ProvisionerJob, error) { - start := time.Now() - jobs, err := m.s.GetHungProvisionerJobs(ctx, hungSince) - m.queryLatencies.WithLabelValues("GetHungProvisionerJobs").Observe(time.Since(start).Seconds()) - return jobs, err -} - func (m queryMetricsStore) GetInboxNotificationByID(ctx context.Context, id uuid.UUID) (database.InboxNotification, error) { start := time.Now() r0, r1 := m.s.GetInboxNotificationByID(ctx, id) @@ -1145,6 +1138,13 @@ func (m queryMetricsStore) GetPresetParametersByTemplateVersionID(ctx context.Co return r0, r1 } +func (m queryMetricsStore) GetPresetsAtFailureLimit(ctx context.Context, hardLimit int64) ([]database.GetPresetsAtFailureLimitRow, error) { + start := time.Now() + r0, r1 := m.s.GetPresetsAtFailureLimit(ctx, hardLimit) + m.queryLatencies.WithLabelValues("GetPresetsAtFailureLimit").Observe(time.Since(start).Seconds()) + return r0, r1 +} + func (m queryMetricsStore) GetPresetsBackoff(ctx context.Context, lookback time.Time) ([]database.GetPresetsBackoffRow, error) { start := time.Now() r0, r1 := m.s.GetPresetsBackoff(ctx, lookback) @@ -1194,6 +1194,13 @@ func (m queryMetricsStore) GetProvisionerJobByID(ctx context.Context, id uuid.UU return job, err } +func (m queryMetricsStore) GetProvisionerJobByIDForUpdate(ctx context.Context, id uuid.UUID) (database.ProvisionerJob, error) { + start := time.Now() + r0, r1 := m.s.GetProvisionerJobByIDForUpdate(ctx, id) + m.queryLatencies.WithLabelValues("GetProvisionerJobByIDForUpdate").Observe(time.Since(start).Seconds()) + return r0, r1 +} + func (m queryMetricsStore) GetProvisionerJobTimingsByJobID(ctx context.Context, jobID uuid.UUID) ([]database.ProvisionerJobTiming, error) { start := 
time.Now() r0, r1 := m.s.GetProvisionerJobTimingsByJobID(ctx, jobID) @@ -1229,6 +1236,13 @@ func (m queryMetricsStore) GetProvisionerJobsCreatedAfter(ctx context.Context, c return jobs, err } +func (m queryMetricsStore) GetProvisionerJobsToBeReaped(ctx context.Context, arg database.GetProvisionerJobsToBeReapedParams) ([]database.ProvisionerJob, error) { + start := time.Now() + r0, r1 := m.s.GetProvisionerJobsToBeReaped(ctx, arg) + m.queryLatencies.WithLabelValues("GetProvisionerJobsToBeReaped").Observe(time.Since(start).Seconds()) + return r0, r1 +} + func (m queryMetricsStore) GetProvisionerKeyByHashedSecret(ctx context.Context, hashedSecret []byte) (database.ProvisionerKey, error) { start := time.Now() r0, r1 := m.s.GetProvisionerKeyByHashedSecret(ctx, hashedSecret) @@ -1880,6 +1894,13 @@ func (m queryMetricsStore) GetWorkspaceByOwnerIDAndName(ctx context.Context, arg return workspace, err } +func (m queryMetricsStore) GetWorkspaceByResourceID(ctx context.Context, resourceID uuid.UUID) (database.Workspace, error) { + start := time.Now() + r0, r1 := m.s.GetWorkspaceByResourceID(ctx, resourceID) + m.queryLatencies.WithLabelValues("GetWorkspaceByResourceID").Observe(time.Since(start).Seconds()) + return r0, r1 +} + func (m queryMetricsStore) GetWorkspaceByWorkspaceAppID(ctx context.Context, workspaceAppID uuid.UUID) (database.Workspace, error) { start := time.Now() workspace, err := m.s.GetWorkspaceByWorkspaceAppID(ctx, workspaceAppID) @@ -2678,6 +2699,13 @@ func (m queryMetricsStore) UpdateOrganizationDeletedByID(ctx context.Context, ar return r0 } +func (m queryMetricsStore) UpdatePresetPrebuildStatus(ctx context.Context, arg database.UpdatePresetPrebuildStatusParams) error { + start := time.Now() + r0 := m.s.UpdatePresetPrebuildStatus(ctx, arg) + m.queryLatencies.WithLabelValues("UpdatePresetPrebuildStatus").Observe(time.Since(start).Seconds()) + return r0 +} + func (m queryMetricsStore) UpdateProvisionerDaemonLastSeenAt(ctx context.Context, arg 
database.UpdateProvisionerDaemonLastSeenAtParams) error { start := time.Now() r0 := m.s.UpdateProvisionerDaemonLastSeenAt(ctx, arg) @@ -2706,6 +2734,13 @@ func (m queryMetricsStore) UpdateProvisionerJobWithCompleteByID(ctx context.Cont return err } +func (m queryMetricsStore) UpdateProvisionerJobWithCompleteWithStartedAtByID(ctx context.Context, arg database.UpdateProvisionerJobWithCompleteWithStartedAtByIDParams) error { + start := time.Now() + r0 := m.s.UpdateProvisionerJobWithCompleteWithStartedAtByID(ctx, arg) + m.queryLatencies.WithLabelValues("UpdateProvisionerJobWithCompleteWithStartedAtByID").Observe(time.Since(start).Seconds()) + return r0 +} + func (m queryMetricsStore) UpdateReplica(ctx context.Context, arg database.UpdateReplicaParams) (database.Replica, error) { start := time.Now() replica, err := m.s.UpdateReplica(ctx, arg) diff --git a/coderd/database/dbmock/dbmock.go b/coderd/database/dbmock/dbmock.go index 0d66dcec11848..7a1fc0c4b2a6f 100644 --- a/coderd/database/dbmock/dbmock.go +++ b/coderd/database/dbmock/dbmock.go @@ -1743,21 +1743,6 @@ func (mr *MockStoreMockRecorder) GetHealthSettings(ctx any) *gomock.Call { return mr.mock.ctrl.RecordCallWithMethodType(mr.mock, "GetHealthSettings", reflect.TypeOf((*MockStore)(nil).GetHealthSettings), ctx) } -// GetHungProvisionerJobs mocks base method. -func (m *MockStore) GetHungProvisionerJobs(ctx context.Context, updatedAt time.Time) ([]database.ProvisionerJob, error) { - m.ctrl.T.Helper() - ret := m.ctrl.Call(m, "GetHungProvisionerJobs", ctx, updatedAt) - ret0, _ := ret[0].([]database.ProvisionerJob) - ret1, _ := ret[1].(error) - return ret0, ret1 -} - -// GetHungProvisionerJobs indicates an expected call of GetHungProvisionerJobs. 
-func (mr *MockStoreMockRecorder) GetHungProvisionerJobs(ctx, updatedAt any) *gomock.Call { - mr.mock.ctrl.T.Helper() - return mr.mock.ctrl.RecordCallWithMethodType(mr.mock, "GetHungProvisionerJobs", reflect.TypeOf((*MockStore)(nil).GetHungProvisionerJobs), ctx, updatedAt) -} - // GetInboxNotificationByID mocks base method. func (m *MockStore) GetInboxNotificationByID(ctx context.Context, id uuid.UUID) (database.InboxNotification, error) { m.ctrl.T.Helper() @@ -2343,6 +2328,21 @@ func (mr *MockStoreMockRecorder) GetPresetParametersByTemplateVersionID(ctx, tem return mr.mock.ctrl.RecordCallWithMethodType(mr.mock, "GetPresetParametersByTemplateVersionID", reflect.TypeOf((*MockStore)(nil).GetPresetParametersByTemplateVersionID), ctx, templateVersionID) } +// GetPresetsAtFailureLimit mocks base method. +func (m *MockStore) GetPresetsAtFailureLimit(ctx context.Context, hardLimit int64) ([]database.GetPresetsAtFailureLimitRow, error) { + m.ctrl.T.Helper() + ret := m.ctrl.Call(m, "GetPresetsAtFailureLimit", ctx, hardLimit) + ret0, _ := ret[0].([]database.GetPresetsAtFailureLimitRow) + ret1, _ := ret[1].(error) + return ret0, ret1 +} + +// GetPresetsAtFailureLimit indicates an expected call of GetPresetsAtFailureLimit. +func (mr *MockStoreMockRecorder) GetPresetsAtFailureLimit(ctx, hardLimit any) *gomock.Call { + mr.mock.ctrl.T.Helper() + return mr.mock.ctrl.RecordCallWithMethodType(mr.mock, "GetPresetsAtFailureLimit", reflect.TypeOf((*MockStore)(nil).GetPresetsAtFailureLimit), ctx, hardLimit) +} + // GetPresetsBackoff mocks base method. 
func (m *MockStore) GetPresetsBackoff(ctx context.Context, lookback time.Time) ([]database.GetPresetsBackoffRow, error) { m.ctrl.T.Helper() @@ -2448,6 +2448,21 @@ func (mr *MockStoreMockRecorder) GetProvisionerJobByID(ctx, id any) *gomock.Call return mr.mock.ctrl.RecordCallWithMethodType(mr.mock, "GetProvisionerJobByID", reflect.TypeOf((*MockStore)(nil).GetProvisionerJobByID), ctx, id) } +// GetProvisionerJobByIDForUpdate mocks base method. +func (m *MockStore) GetProvisionerJobByIDForUpdate(ctx context.Context, id uuid.UUID) (database.ProvisionerJob, error) { + m.ctrl.T.Helper() + ret := m.ctrl.Call(m, "GetProvisionerJobByIDForUpdate", ctx, id) + ret0, _ := ret[0].(database.ProvisionerJob) + ret1, _ := ret[1].(error) + return ret0, ret1 +} + +// GetProvisionerJobByIDForUpdate indicates an expected call of GetProvisionerJobByIDForUpdate. +func (mr *MockStoreMockRecorder) GetProvisionerJobByIDForUpdate(ctx, id any) *gomock.Call { + mr.mock.ctrl.T.Helper() + return mr.mock.ctrl.RecordCallWithMethodType(mr.mock, "GetProvisionerJobByIDForUpdate", reflect.TypeOf((*MockStore)(nil).GetProvisionerJobByIDForUpdate), ctx, id) +} + // GetProvisionerJobTimingsByJobID mocks base method. func (m *MockStore) GetProvisionerJobTimingsByJobID(ctx context.Context, jobID uuid.UUID) ([]database.ProvisionerJobTiming, error) { m.ctrl.T.Helper() @@ -2523,6 +2538,21 @@ func (mr *MockStoreMockRecorder) GetProvisionerJobsCreatedAfter(ctx, createdAt a return mr.mock.ctrl.RecordCallWithMethodType(mr.mock, "GetProvisionerJobsCreatedAfter", reflect.TypeOf((*MockStore)(nil).GetProvisionerJobsCreatedAfter), ctx, createdAt) } +// GetProvisionerJobsToBeReaped mocks base method. 
+func (m *MockStore) GetProvisionerJobsToBeReaped(ctx context.Context, arg database.GetProvisionerJobsToBeReapedParams) ([]database.ProvisionerJob, error) { + m.ctrl.T.Helper() + ret := m.ctrl.Call(m, "GetProvisionerJobsToBeReaped", ctx, arg) + ret0, _ := ret[0].([]database.ProvisionerJob) + ret1, _ := ret[1].(error) + return ret0, ret1 +} + +// GetProvisionerJobsToBeReaped indicates an expected call of GetProvisionerJobsToBeReaped. +func (mr *MockStoreMockRecorder) GetProvisionerJobsToBeReaped(ctx, arg any) *gomock.Call { + mr.mock.ctrl.T.Helper() + return mr.mock.ctrl.RecordCallWithMethodType(mr.mock, "GetProvisionerJobsToBeReaped", reflect.TypeOf((*MockStore)(nil).GetProvisionerJobsToBeReaped), ctx, arg) +} + // GetProvisionerKeyByHashedSecret mocks base method. func (m *MockStore) GetProvisionerKeyByHashedSecret(ctx context.Context, hashedSecret []byte) (database.ProvisionerKey, error) { m.ctrl.T.Helper() @@ -3948,6 +3978,21 @@ func (mr *MockStoreMockRecorder) GetWorkspaceByOwnerIDAndName(ctx, arg any) *gom return mr.mock.ctrl.RecordCallWithMethodType(mr.mock, "GetWorkspaceByOwnerIDAndName", reflect.TypeOf((*MockStore)(nil).GetWorkspaceByOwnerIDAndName), ctx, arg) } +// GetWorkspaceByResourceID mocks base method. +func (m *MockStore) GetWorkspaceByResourceID(ctx context.Context, resourceID uuid.UUID) (database.Workspace, error) { + m.ctrl.T.Helper() + ret := m.ctrl.Call(m, "GetWorkspaceByResourceID", ctx, resourceID) + ret0, _ := ret[0].(database.Workspace) + ret1, _ := ret[1].(error) + return ret0, ret1 +} + +// GetWorkspaceByResourceID indicates an expected call of GetWorkspaceByResourceID. +func (mr *MockStoreMockRecorder) GetWorkspaceByResourceID(ctx, resourceID any) *gomock.Call { + mr.mock.ctrl.T.Helper() + return mr.mock.ctrl.RecordCallWithMethodType(mr.mock, "GetWorkspaceByResourceID", reflect.TypeOf((*MockStore)(nil).GetWorkspaceByResourceID), ctx, resourceID) +} + // GetWorkspaceByWorkspaceAppID mocks base method. 
func (m *MockStore) GetWorkspaceByWorkspaceAppID(ctx context.Context, workspaceAppID uuid.UUID) (database.Workspace, error) { m.ctrl.T.Helper() @@ -5676,6 +5721,20 @@ func (mr *MockStoreMockRecorder) UpdateOrganizationDeletedByID(ctx, arg any) *go return mr.mock.ctrl.RecordCallWithMethodType(mr.mock, "UpdateOrganizationDeletedByID", reflect.TypeOf((*MockStore)(nil).UpdateOrganizationDeletedByID), ctx, arg) } +// UpdatePresetPrebuildStatus mocks base method. +func (m *MockStore) UpdatePresetPrebuildStatus(ctx context.Context, arg database.UpdatePresetPrebuildStatusParams) error { + m.ctrl.T.Helper() + ret := m.ctrl.Call(m, "UpdatePresetPrebuildStatus", ctx, arg) + ret0, _ := ret[0].(error) + return ret0 +} + +// UpdatePresetPrebuildStatus indicates an expected call of UpdatePresetPrebuildStatus. +func (mr *MockStoreMockRecorder) UpdatePresetPrebuildStatus(ctx, arg any) *gomock.Call { + mr.mock.ctrl.T.Helper() + return mr.mock.ctrl.RecordCallWithMethodType(mr.mock, "UpdatePresetPrebuildStatus", reflect.TypeOf((*MockStore)(nil).UpdatePresetPrebuildStatus), ctx, arg) +} + // UpdateProvisionerDaemonLastSeenAt mocks base method. func (m *MockStore) UpdateProvisionerDaemonLastSeenAt(ctx context.Context, arg database.UpdateProvisionerDaemonLastSeenAtParams) error { m.ctrl.T.Helper() @@ -5732,6 +5791,20 @@ func (mr *MockStoreMockRecorder) UpdateProvisionerJobWithCompleteByID(ctx, arg a return mr.mock.ctrl.RecordCallWithMethodType(mr.mock, "UpdateProvisionerJobWithCompleteByID", reflect.TypeOf((*MockStore)(nil).UpdateProvisionerJobWithCompleteByID), ctx, arg) } +// UpdateProvisionerJobWithCompleteWithStartedAtByID mocks base method. 
+func (m *MockStore) UpdateProvisionerJobWithCompleteWithStartedAtByID(ctx context.Context, arg database.UpdateProvisionerJobWithCompleteWithStartedAtByIDParams) error { + m.ctrl.T.Helper() + ret := m.ctrl.Call(m, "UpdateProvisionerJobWithCompleteWithStartedAtByID", ctx, arg) + ret0, _ := ret[0].(error) + return ret0 +} + +// UpdateProvisionerJobWithCompleteWithStartedAtByID indicates an expected call of UpdateProvisionerJobWithCompleteWithStartedAtByID. +func (mr *MockStoreMockRecorder) UpdateProvisionerJobWithCompleteWithStartedAtByID(ctx, arg any) *gomock.Call { + mr.mock.ctrl.T.Helper() + return mr.mock.ctrl.RecordCallWithMethodType(mr.mock, "UpdateProvisionerJobWithCompleteWithStartedAtByID", reflect.TypeOf((*MockStore)(nil).UpdateProvisionerJobWithCompleteWithStartedAtByID), ctx, arg) +} + // UpdateReplica mocks base method. func (m *MockStore) UpdateReplica(ctx context.Context, arg database.UpdateReplicaParams) (database.Replica, error) { m.ctrl.T.Helper() diff --git a/coderd/database/dump.sql b/coderd/database/dump.sql index 2f23b3ad4ce78..ec196405df2d3 100644 --- a/coderd/database/dump.sql +++ b/coderd/database/dump.sql @@ -153,6 +153,12 @@ CREATE TYPE port_share_protocol AS ENUM ( 'https' ); +CREATE TYPE prebuild_status AS ENUM ( + 'healthy', + 'hard_limited', + 'validation_failed' +); + CREATE TYPE provisioner_daemon_status AS ENUM ( 'offline', 'idle', @@ -1439,7 +1445,8 @@ CREATE TABLE template_version_presets ( name text NOT NULL, created_at timestamp with time zone DEFAULT CURRENT_TIMESTAMP NOT NULL, desired_instances integer, - invalidate_after_secs integer DEFAULT 0 + invalidate_after_secs integer DEFAULT 0, + prebuild_status prebuild_status DEFAULT 'healthy'::prebuild_status NOT NULL ); CREATE TABLE template_version_terraform_values ( diff --git a/coderd/database/migrations/000328_prebuild_failure_limit_notification.down.sql b/coderd/database/migrations/000328_prebuild_failure_limit_notification.down.sql new file mode 100644 index 
0000000000000..40697c7bbc3d2 --- /dev/null +++ b/coderd/database/migrations/000328_prebuild_failure_limit_notification.down.sql @@ -0,0 +1 @@ +DELETE FROM notification_templates WHERE id = '414d9331-c1fc-4761-b40c-d1f4702279eb'; diff --git a/coderd/database/migrations/000328_prebuild_failure_limit_notification.up.sql b/coderd/database/migrations/000328_prebuild_failure_limit_notification.up.sql new file mode 100644 index 0000000000000..403bd667abd28 --- /dev/null +++ b/coderd/database/migrations/000328_prebuild_failure_limit_notification.up.sql @@ -0,0 +1,25 @@ +INSERT INTO notification_templates +(id, name, title_template, body_template, "group", actions) +VALUES ('414d9331-c1fc-4761-b40c-d1f4702279eb', + 'Prebuild Failure Limit Reached', + E'There is a problem creating prebuilt workspaces', + $$ +The number of failed prebuild attempts has reached the hard limit for template **{{ .Labels.template }}** and preset **{{ .Labels.preset }}**. + +To resume prebuilds, fix the underlying issue and upload a new template version. 
+ +Refer to the documentation for more details: +- [Troubleshooting templates](https://coder.com/docs/admin/templates/troubleshooting) +- [Troubleshooting of prebuilt workspaces](https://coder.com/docs/admin/templates/extending-templates/prebuilt-workspaces#administration-and-troubleshooting) +$$, + 'Template Events', + '[ + { + "label": "View failed prebuilt workspaces", + "url": "{{base_url}}/workspaces?filter=owner:prebuilds+status:failed+template:{{.Labels.template}}" + }, + { + "label": "View template version", + "url": "{{base_url}}/templates/{{.Labels.org}}/{{.Labels.template}}/versions/{{.Labels.template_version}}" + } + ]'::jsonb); diff --git a/coderd/database/migrations/000329_add_status_to_template_presets.down.sql b/coderd/database/migrations/000329_add_status_to_template_presets.down.sql new file mode 100644 index 0000000000000..8fe04f99cae33 --- /dev/null +++ b/coderd/database/migrations/000329_add_status_to_template_presets.down.sql @@ -0,0 +1,5 @@ +-- Remove the column from the table first (must happen before dropping the enum type) +ALTER TABLE template_version_presets DROP COLUMN prebuild_status; + +-- Then drop the enum type +DROP TYPE prebuild_status; diff --git a/coderd/database/migrations/000329_add_status_to_template_presets.up.sql b/coderd/database/migrations/000329_add_status_to_template_presets.up.sql new file mode 100644 index 0000000000000..019a246f73a87 --- /dev/null +++ b/coderd/database/migrations/000329_add_status_to_template_presets.up.sql @@ -0,0 +1,7 @@ +CREATE TYPE prebuild_status AS ENUM ( + 'healthy', -- Prebuilds are working as expected; this is the default, healthy state. + 'hard_limited', -- Prebuilds have failed repeatedly and hit the configured hard failure limit; won't be retried anymore. + 'validation_failed' -- Prebuilds failed due to a non-retryable validation error (e.g. template misconfiguration); won't be retried. 
+); + +ALTER TABLE template_version_presets ADD COLUMN prebuild_status prebuild_status NOT NULL DEFAULT 'healthy'::prebuild_status; diff --git a/coderd/database/models.go b/coderd/database/models.go index ff49b8f471be0..d5047f6bbe65f 100644 --- a/coderd/database/models.go +++ b/coderd/database/models.go @@ -1343,6 +1343,67 @@ func AllPortShareProtocolValues() []PortShareProtocol { } } +type PrebuildStatus string + +const ( + PrebuildStatusHealthy PrebuildStatus = "healthy" + PrebuildStatusHardLimited PrebuildStatus = "hard_limited" + PrebuildStatusValidationFailed PrebuildStatus = "validation_failed" +) + +func (e *PrebuildStatus) Scan(src interface{}) error { + switch s := src.(type) { + case []byte: + *e = PrebuildStatus(s) + case string: + *e = PrebuildStatus(s) + default: + return fmt.Errorf("unsupported scan type for PrebuildStatus: %T", src) + } + return nil +} + +type NullPrebuildStatus struct { + PrebuildStatus PrebuildStatus `json:"prebuild_status"` + Valid bool `json:"valid"` // Valid is true if PrebuildStatus is not NULL +} + +// Scan implements the Scanner interface. +func (ns *NullPrebuildStatus) Scan(value interface{}) error { + if value == nil { + ns.PrebuildStatus, ns.Valid = "", false + return nil + } + ns.Valid = true + return ns.PrebuildStatus.Scan(value) +} + +// Value implements the driver Valuer interface. +func (ns NullPrebuildStatus) Value() (driver.Value, error) { + if !ns.Valid { + return nil, nil + } + return string(ns.PrebuildStatus), nil +} + +func (e PrebuildStatus) Valid() bool { + switch e { + case PrebuildStatusHealthy, + PrebuildStatusHardLimited, + PrebuildStatusValidationFailed: + return true + } + return false +} + +func AllPrebuildStatusValues() []PrebuildStatus { + return []PrebuildStatus{ + PrebuildStatusHealthy, + PrebuildStatusHardLimited, + PrebuildStatusValidationFailed, + } +} + // The status of a provisioner daemon. 
type ProvisionerDaemonStatus string @@ -3248,12 +3309,13 @@ type TemplateVersionParameter struct { } type TemplateVersionPreset struct { - ID uuid.UUID `db:"id" json:"id"` - TemplateVersionID uuid.UUID `db:"template_version_id" json:"template_version_id"` - Name string `db:"name" json:"name"` - CreatedAt time.Time `db:"created_at" json:"created_at"` - DesiredInstances sql.NullInt32 `db:"desired_instances" json:"desired_instances"` - InvalidateAfterSecs sql.NullInt32 `db:"invalidate_after_secs" json:"invalidate_after_secs"` + ID uuid.UUID `db:"id" json:"id"` + TemplateVersionID uuid.UUID `db:"template_version_id" json:"template_version_id"` + Name string `db:"name" json:"name"` + CreatedAt time.Time `db:"created_at" json:"created_at"` + DesiredInstances sql.NullInt32 `db:"desired_instances" json:"desired_instances"` + InvalidateAfterSecs sql.NullInt32 `db:"invalidate_after_secs" json:"invalidate_after_secs"` + PrebuildStatus PrebuildStatus `db:"prebuild_status" json:"prebuild_status"` } type TemplateVersionPresetParameter struct { diff --git a/coderd/database/no_slim.go b/coderd/database/no_slim.go index 561466490f53e..edb81e23ad1c7 100644 --- a/coderd/database/no_slim.go +++ b/coderd/database/no_slim.go @@ -1,8 +1,9 @@ +//go:build slim + package database const ( - // This declaration protects against imports in slim builds, see - // no_slim_slim.go. - //nolint:revive,unused - _DO_NOT_IMPORT_THIS_PACKAGE_IN_SLIM_BUILDS = "DO_NOT_IMPORT_THIS_PACKAGE_IN_SLIM_BUILDS" + // This line fails to compile, preventing this package from being imported + // in slim builds. 
+ _DO_NOT_IMPORT_THIS_PACKAGE_IN_SLIM_BUILDS = _DO_NOT_IMPORT_THIS_PACKAGE_IN_SLIM_BUILDS ) diff --git a/coderd/database/no_slim_slim.go b/coderd/database/no_slim_slim.go deleted file mode 100644 index 845ac0df77942..0000000000000 --- a/coderd/database/no_slim_slim.go +++ /dev/null @@ -1,14 +0,0 @@ -//go:build slim - -package database - -const ( - // This re-declaration will result in a compilation error and is present to - // prevent increasing the slim binary size by importing this package, - // directly or indirectly. - // - // no_slim_slim.go:7:2: _DO_NOT_IMPORT_THIS_PACKAGE_IN_SLIM_BUILDS redeclared in this block - // no_slim.go:4:2: other declaration of _DO_NOT_IMPORT_THIS_PACKAGE_IN_SLIM_BUILDS - //nolint:revive,unused - _DO_NOT_IMPORT_THIS_PACKAGE_IN_SLIM_BUILDS = "DO_NOT_IMPORT_THIS_PACKAGE_IN_SLIM_BUILDS" -) diff --git a/coderd/database/querier.go b/coderd/database/querier.go index 81b8d58758ada..ac7497b641a05 100644 --- a/coderd/database/querier.go +++ b/coderd/database/querier.go @@ -196,7 +196,6 @@ type sqlcQuerier interface { GetGroupMembersCountByGroupID(ctx context.Context, arg GetGroupMembersCountByGroupIDParams) (int64, error) GetGroups(ctx context.Context, arg GetGroupsParams) ([]GetGroupsRow, error) GetHealthSettings(ctx context.Context) (string, error) - GetHungProvisionerJobs(ctx context.Context, updatedAt time.Time) ([]ProvisionerJob, error) GetInboxNotificationByID(ctx context.Context, id uuid.UUID) (InboxNotification, error) // Fetches inbox notifications for a user filtered by templates and targets // param user_id: The user ID @@ -242,6 +241,15 @@ type sqlcQuerier interface { GetPresetByWorkspaceBuildID(ctx context.Context, workspaceBuildID uuid.UUID) (TemplateVersionPreset, error) GetPresetParametersByPresetID(ctx context.Context, presetID uuid.UUID) ([]TemplateVersionPresetParameter, error) GetPresetParametersByTemplateVersionID(ctx context.Context, templateVersionID uuid.UUID) ([]TemplateVersionPresetParameter, error) + // 
GetPresetsAtFailureLimit groups workspace builds by preset ID. + // Each preset is associated with exactly one template version ID. + // For each preset, the query checks the last hard_limit builds. + // If all of them failed, the preset is considered to have hit the hard failure limit. + // The query returns a list of preset IDs that have reached this failure threshold. + // Only active template versions with configured presets are considered. + // For each preset, check the last hard_limit builds. + // If all of them failed, the preset is considered to have hit the hard failure limit. + GetPresetsAtFailureLimit(ctx context.Context, hardLimit int64) ([]GetPresetsAtFailureLimitRow, error) // GetPresetsBackoff groups workspace builds by preset ID. // Each preset is associated with exactly one template version ID. // For each group, the query checks up to N of the most recent jobs that occurred within the @@ -265,11 +273,16 @@ type sqlcQuerier interface { // Previous job information. GetProvisionerDaemonsWithStatusByOrganization(ctx context.Context, arg GetProvisionerDaemonsWithStatusByOrganizationParams) ([]GetProvisionerDaemonsWithStatusByOrganizationRow, error) GetProvisionerJobByID(ctx context.Context, id uuid.UUID) (ProvisionerJob, error) + // Gets a single provisioner job by ID for update. + // This is used to securely reap jobs that have been hung/pending for a long time. 
+	GetProvisionerJobByIDForUpdate(ctx context.Context, id uuid.UUID) (ProvisionerJob, error)
 	GetProvisionerJobTimingsByJobID(ctx context.Context, jobID uuid.UUID) ([]ProvisionerJobTiming, error)
 	GetProvisionerJobsByIDs(ctx context.Context, ids []uuid.UUID) ([]ProvisionerJob, error)
 	GetProvisionerJobsByIDsWithQueuePosition(ctx context.Context, ids []uuid.UUID) ([]GetProvisionerJobsByIDsWithQueuePositionRow, error)
 	GetProvisionerJobsByOrganizationAndStatusWithQueuePositionAndProvisioner(ctx context.Context, arg GetProvisionerJobsByOrganizationAndStatusWithQueuePositionAndProvisionerParams) ([]GetProvisionerJobsByOrganizationAndStatusWithQueuePositionAndProvisionerRow, error)
 	GetProvisionerJobsCreatedAfter(ctx context.Context, createdAt time.Time) ([]ProvisionerJob, error)
+	// To avoid repeatedly attempting to reap the same jobs, we randomly order and limit to @max_jobs.
+	GetProvisionerJobsToBeReaped(ctx context.Context, arg GetProvisionerJobsToBeReapedParams) ([]ProvisionerJob, error)
 	GetProvisionerKeyByHashedSecret(ctx context.Context, hashedSecret []byte) (ProvisionerKey, error)
 	GetProvisionerKeyByID(ctx context.Context, id uuid.UUID) (ProvisionerKey, error)
 	GetProvisionerKeyByName(ctx context.Context, arg GetProvisionerKeyByNameParams) (ProvisionerKey, error)
@@ -418,6 +431,7 @@ type sqlcQuerier interface {
 	GetWorkspaceByAgentID(ctx context.Context, agentID uuid.UUID) (Workspace, error)
 	GetWorkspaceByID(ctx context.Context, id uuid.UUID) (Workspace, error)
 	GetWorkspaceByOwnerIDAndName(ctx context.Context, arg GetWorkspaceByOwnerIDAndNameParams) (Workspace, error)
+	GetWorkspaceByResourceID(ctx context.Context, resourceID uuid.UUID) (Workspace, error)
 	GetWorkspaceByWorkspaceAppID(ctx context.Context, workspaceAppID uuid.UUID) (Workspace, error)
 	GetWorkspaceModulesByJobID(ctx context.Context, jobID uuid.UUID) ([]WorkspaceModule, error)
 	GetWorkspaceModulesCreatedAfter(ctx context.Context, createdAt time.Time) ([]WorkspaceModule, error)
@@ -563,10 +577,12 @@ type sqlcQuerier interface {
 	UpdateOAuth2ProviderAppSecretByID(ctx context.Context, arg UpdateOAuth2ProviderAppSecretByIDParams) (OAuth2ProviderAppSecret, error)
 	UpdateOrganization(ctx context.Context, arg UpdateOrganizationParams) (Organization, error)
 	UpdateOrganizationDeletedByID(ctx context.Context, arg UpdateOrganizationDeletedByIDParams) error
+	UpdatePresetPrebuildStatus(ctx context.Context, arg UpdatePresetPrebuildStatusParams) error
 	UpdateProvisionerDaemonLastSeenAt(ctx context.Context, arg UpdateProvisionerDaemonLastSeenAtParams) error
 	UpdateProvisionerJobByID(ctx context.Context, arg UpdateProvisionerJobByIDParams) error
 	UpdateProvisionerJobWithCancelByID(ctx context.Context, arg UpdateProvisionerJobWithCancelByIDParams) error
 	UpdateProvisionerJobWithCompleteByID(ctx context.Context, arg UpdateProvisionerJobWithCompleteByIDParams) error
+	UpdateProvisionerJobWithCompleteWithStartedAtByID(ctx context.Context, arg UpdateProvisionerJobWithCompleteWithStartedAtByIDParams) error
 	UpdateReplica(ctx context.Context, arg UpdateReplicaParams) (Replica, error)
 	UpdateTailnetPeerStatusByCoordinator(ctx context.Context, arg UpdateTailnetPeerStatusByCoordinatorParams) error
 	UpdateTemplateACLByID(ctx context.Context, arg UpdateTemplateACLByIDParams) error
diff --git a/coderd/database/querier_test.go b/coderd/database/querier_test.go
index b2cc20c4894d5..5bafa58796b7a 100644
--- a/coderd/database/querier_test.go
+++ b/coderd/database/querier_test.go
@@ -4123,8 +4123,7 @@ func TestGetPresetsBackoff(t *testing.T) {
 		})
 
 		tmpl1 := createTemplate(t, db, orgID, userID)
-		tmpl1V1 := createTmplVersionAndPreset(t, db, tmpl1, tmpl1.ActiveVersionID, now, nil)
-		_ = tmpl1V1
+		createTmplVersionAndPreset(t, db, tmpl1, tmpl1.ActiveVersionID, now, nil)
 
 		backoffs, err := db.GetPresetsBackoff(ctx, now.Add(-time.Hour))
 		require.NoError(t, err)
@@ -4401,6 +4400,311 @@ func TestGetPresetsBackoff(t *testing.T) {
 	})
 }
 
+func TestGetPresetsAtFailureLimit(t *testing.T) {
+	t.Parallel()
+	if !dbtestutil.WillUsePostgres() {
+		t.SkipNow()
+	}
+
+	now := dbtime.Now()
+	hourBefore := now.Add(-time.Hour)
+	orgID := uuid.New()
+	userID := uuid.New()
+
+	findPresetByTmplVersionID := func(hardLimitedPresets []database.GetPresetsAtFailureLimitRow, tmplVersionID uuid.UUID) *database.GetPresetsAtFailureLimitRow {
+		for _, preset := range hardLimitedPresets {
+			if preset.TemplateVersionID == tmplVersionID {
+				return &preset
+			}
+		}
+
+		return nil
+	}
+
+	testCases := []struct {
+		name string
+		// true - build is successful
+		// false - build is unsuccessful
+		buildSuccesses  []bool
+		hardLimit       int64
+		expHitHardLimit bool
+	}{
+		{
+			name:            "failed build",
+			buildSuccesses:  []bool{false},
+			hardLimit:       1,
+			expHitHardLimit: true,
+		},
+		{
+			name:            "2 failed builds",
+			buildSuccesses:  []bool{false, false},
+			hardLimit:       1,
+			expHitHardLimit: true,
+		},
+		{
+			name:            "successful build",
+			buildSuccesses:  []bool{true},
+			hardLimit:       1,
+			expHitHardLimit: false,
+		},
+		{
+			name:            "last build is failed",
+			buildSuccesses:  []bool{true, true, false},
+			hardLimit:       1,
+			expHitHardLimit: true,
+		},
+		{
+			name:            "last build is successful",
+			buildSuccesses:  []bool{false, false, true},
+			hardLimit:       1,
+			expHitHardLimit: false,
+		},
+		{
+			name:            "last 3 builds are failed - hard limit is reached",
+			buildSuccesses:  []bool{true, true, false, false, false},
+			hardLimit:       3,
+			expHitHardLimit: true,
+		},
+		{
+			name:            "1 out of 3 last build is successful - hard limit is NOT reached",
+			buildSuccesses:  []bool{false, false, true, false, false},
+			hardLimit:       3,
+			expHitHardLimit: false,
+		},
+		// hardLimit set to zero, implicitly disables the hard limit.
+		{
+			name:            "despite 5 failed builds, the hard limit is not reached because it's disabled.",
+			buildSuccesses:  []bool{false, false, false, false, false},
+			hardLimit:       0,
+			expHitHardLimit: false,
+		},
+	}
+
+	for _, tc := range testCases {
+		t.Run(tc.name, func(t *testing.T) {
+			t.Parallel()
+
+			db, _ := dbtestutil.NewDB(t)
+			ctx := testutil.Context(t, testutil.WaitShort)
+			dbgen.Organization(t, db, database.Organization{
+				ID: orgID,
+			})
+			dbgen.User(t, db, database.User{
+				ID: userID,
+			})
+
+			tmpl := createTemplate(t, db, orgID, userID)
+			tmplV1 := createTmplVersionAndPreset(t, db, tmpl, tmpl.ActiveVersionID, now, nil)
+			for idx, buildSuccess := range tc.buildSuccesses {
+				createPrebuiltWorkspace(ctx, t, db, tmpl, tmplV1, orgID, now, &createPrebuiltWorkspaceOpts{
+					failedJob: !buildSuccess,
+					createdAt: hourBefore.Add(time.Duration(idx) * time.Second),
+				})
+			}
+
+			hardLimitedPresets, err := db.GetPresetsAtFailureLimit(ctx, tc.hardLimit)
+			require.NoError(t, err)
+
+			if !tc.expHitHardLimit {
+				require.Len(t, hardLimitedPresets, 0)
+				return
+			}
+
+			require.Len(t, hardLimitedPresets, 1)
+			hardLimitedPreset := hardLimitedPresets[0]
+			require.Equal(t, hardLimitedPreset.TemplateVersionID, tmpl.ActiveVersionID)
+			require.Equal(t, hardLimitedPreset.PresetID, tmplV1.preset.ID)
+		})
+	}
+
+	t.Run("Ignore Inactive Version", func(t *testing.T) {
+		t.Parallel()
+
+		db, _ := dbtestutil.NewDB(t)
+		ctx := testutil.Context(t, testutil.WaitShort)
+		dbgen.Organization(t, db, database.Organization{
+			ID: orgID,
+		})
+		dbgen.User(t, db, database.User{
+			ID: userID,
+		})
+
+		tmpl := createTemplate(t, db, orgID, userID)
+		tmplV1 := createTmplVersionAndPreset(t, db, tmpl, uuid.New(), now, nil)
+		createPrebuiltWorkspace(ctx, t, db, tmpl, tmplV1, orgID, now, &createPrebuiltWorkspaceOpts{
+			failedJob: true,
+		})
+
+		// Active Version
+		tmplV2 := createTmplVersionAndPreset(t, db, tmpl, tmpl.ActiveVersionID, now, nil)
+		createPrebuiltWorkspace(ctx, t, db, tmpl, tmplV2, orgID, now, &createPrebuiltWorkspaceOpts{
+			failedJob: true,
+		})
+		createPrebuiltWorkspace(ctx, t, db, tmpl, tmplV2, orgID, now, &createPrebuiltWorkspaceOpts{
+			failedJob: true,
+		})
+
+		hardLimitedPresets, err := db.GetPresetsAtFailureLimit(ctx, 1)
+		require.NoError(t, err)
+
+		require.Len(t, hardLimitedPresets, 1)
+		hardLimitedPreset := hardLimitedPresets[0]
+		require.Equal(t, hardLimitedPreset.TemplateVersionID, tmpl.ActiveVersionID)
+		require.Equal(t, hardLimitedPreset.PresetID, tmplV2.preset.ID)
+	})
+
+	t.Run("Multiple Templates", func(t *testing.T) {
+		t.Parallel()
+
+		db, _ := dbtestutil.NewDB(t)
+		ctx := testutil.Context(t, testutil.WaitShort)
+		dbgen.Organization(t, db, database.Organization{
+			ID: orgID,
+		})
+		dbgen.User(t, db, database.User{
+			ID: userID,
+		})
+
+		tmpl1 := createTemplate(t, db, orgID, userID)
+		tmpl1V1 := createTmplVersionAndPreset(t, db, tmpl1, tmpl1.ActiveVersionID, now, nil)
+		createPrebuiltWorkspace(ctx, t, db, tmpl1, tmpl1V1, orgID, now, &createPrebuiltWorkspaceOpts{
+			failedJob: true,
+		})
+
+		tmpl2 := createTemplate(t, db, orgID, userID)
+		tmpl2V1 := createTmplVersionAndPreset(t, db, tmpl2, tmpl2.ActiveVersionID, now, nil)
+		createPrebuiltWorkspace(ctx, t, db, tmpl2, tmpl2V1, orgID, now, &createPrebuiltWorkspaceOpts{
+			failedJob: true,
+		})
+
+		hardLimitedPresets, err := db.GetPresetsAtFailureLimit(ctx, 1)
+
+		require.NoError(t, err)
+
+		require.Len(t, hardLimitedPresets, 2)
+		{
+			hardLimitedPreset := findPresetByTmplVersionID(hardLimitedPresets, tmpl1.ActiveVersionID)
+			require.Equal(t, hardLimitedPreset.TemplateVersionID, tmpl1.ActiveVersionID)
+			require.Equal(t, hardLimitedPreset.PresetID, tmpl1V1.preset.ID)
+		}
+		{
+			hardLimitedPreset := findPresetByTmplVersionID(hardLimitedPresets, tmpl2.ActiveVersionID)
+			require.Equal(t, hardLimitedPreset.TemplateVersionID, tmpl2.ActiveVersionID)
+			require.Equal(t, hardLimitedPreset.PresetID, tmpl2V1.preset.ID)
+		}
+	})
+
+	t.Run("Multiple Templates, Versions and Workspace Builds", func(t *testing.T) {
+		t.Parallel()
+
+		db, _ := dbtestutil.NewDB(t)
+		ctx := testutil.Context(t, testutil.WaitShort)
+		dbgen.Organization(t, db, database.Organization{
+			ID: orgID,
+		})
+		dbgen.User(t, db, database.User{
+			ID: userID,
+		})
+
+		tmpl1 := createTemplate(t, db, orgID, userID)
+		tmpl1V1 := createTmplVersionAndPreset(t, db, tmpl1, tmpl1.ActiveVersionID, now, nil)
+		createPrebuiltWorkspace(ctx, t, db, tmpl1, tmpl1V1, orgID, now, &createPrebuiltWorkspaceOpts{
+			failedJob: true,
+		})
+		createPrebuiltWorkspace(ctx, t, db, tmpl1, tmpl1V1, orgID, now, &createPrebuiltWorkspaceOpts{
+			failedJob: true,
+		})
+
+		tmpl2 := createTemplate(t, db, orgID, userID)
+		tmpl2V1 := createTmplVersionAndPreset(t, db, tmpl2, tmpl2.ActiveVersionID, now, nil)
+		createPrebuiltWorkspace(ctx, t, db, tmpl2, tmpl2V1, orgID, now, &createPrebuiltWorkspaceOpts{
+			failedJob: true,
+		})
+		createPrebuiltWorkspace(ctx, t, db, tmpl2, tmpl2V1, orgID, now, &createPrebuiltWorkspaceOpts{
+			failedJob: true,
+		})
+
+		tmpl3 := createTemplate(t, db, orgID, userID)
+		tmpl3V1 := createTmplVersionAndPreset(t, db, tmpl3, uuid.New(), now, nil)
+		createPrebuiltWorkspace(ctx, t, db, tmpl3, tmpl3V1, orgID, now, &createPrebuiltWorkspaceOpts{
+			failedJob: true,
+		})
+
+		tmpl3V2 := createTmplVersionAndPreset(t, db, tmpl3, tmpl3.ActiveVersionID, now, nil)
+		createPrebuiltWorkspace(ctx, t, db, tmpl3, tmpl3V2, orgID, now, &createPrebuiltWorkspaceOpts{
+			failedJob: true,
+		})
+		createPrebuiltWorkspace(ctx, t, db, tmpl3, tmpl3V2, orgID, now, &createPrebuiltWorkspaceOpts{
+			failedJob: true,
+		})
+
+		hardLimit := int64(2)
+		hardLimitedPresets, err := db.GetPresetsAtFailureLimit(ctx, hardLimit)
+		require.NoError(t, err)
+
+		require.Len(t, hardLimitedPresets, 3)
+		{
+			hardLimitedPreset := findPresetByTmplVersionID(hardLimitedPresets, tmpl1.ActiveVersionID)
+			require.Equal(t, hardLimitedPreset.TemplateVersionID, tmpl1.ActiveVersionID)
+			require.Equal(t, hardLimitedPreset.PresetID, tmpl1V1.preset.ID)
+		}
+		{
+			hardLimitedPreset := findPresetByTmplVersionID(hardLimitedPresets, tmpl2.ActiveVersionID)
+			require.Equal(t, hardLimitedPreset.TemplateVersionID, tmpl2.ActiveVersionID)
+			require.Equal(t, hardLimitedPreset.PresetID, tmpl2V1.preset.ID)
+		}
+		{
+			hardLimitedPreset := findPresetByTmplVersionID(hardLimitedPresets, tmpl3.ActiveVersionID)
+			require.Equal(t, hardLimitedPreset.TemplateVersionID, tmpl3.ActiveVersionID)
+			require.Equal(t, hardLimitedPreset.PresetID, tmpl3V2.preset.ID)
+		}
+	})
+
+	t.Run("No Workspace Builds", func(t *testing.T) {
+		t.Parallel()
+
+		db, _ := dbtestutil.NewDB(t)
+		ctx := testutil.Context(t, testutil.WaitShort)
+		dbgen.Organization(t, db, database.Organization{
+			ID: orgID,
+		})
+		dbgen.User(t, db, database.User{
+			ID: userID,
+		})
+
+		tmpl1 := createTemplate(t, db, orgID, userID)
+		createTmplVersionAndPreset(t, db, tmpl1, tmpl1.ActiveVersionID, now, nil)
+
+		hardLimitedPresets, err := db.GetPresetsAtFailureLimit(ctx, 1)
+		require.NoError(t, err)
+		require.Nil(t, hardLimitedPresets)
+	})
+
+	t.Run("No Failed Workspace Builds", func(t *testing.T) {
+		t.Parallel()
+
+		db, _ := dbtestutil.NewDB(t)
+		ctx := testutil.Context(t, testutil.WaitShort)
+		dbgen.Organization(t, db, database.Organization{
+			ID: orgID,
+		})
+		dbgen.User(t, db, database.User{
+			ID: userID,
+		})
+
+		tmpl1 := createTemplate(t, db, orgID, userID)
+		tmpl1V1 := createTmplVersionAndPreset(t, db, tmpl1, tmpl1.ActiveVersionID, now, nil)
+		successfulJobOpts := createPrebuiltWorkspaceOpts{}
+		createPrebuiltWorkspace(ctx, t, db, tmpl1, tmpl1V1, orgID, now, &successfulJobOpts)
+		createPrebuiltWorkspace(ctx, t, db, tmpl1, tmpl1V1, orgID, now, &successfulJobOpts)
+		createPrebuiltWorkspace(ctx, t, db, tmpl1, tmpl1V1, orgID, now, &successfulJobOpts)
+
+		hardLimitedPresets, err := db.GetPresetsAtFailureLimit(ctx, 1)
+		require.NoError(t, err)
+		require.Nil(t, hardLimitedPresets)
+	})
+}
+
 func requireUsersMatch(t testing.TB, expected []database.User, found []database.GetUsersRow, msg string) {
 	t.Helper()
 	require.ElementsMatch(t, expected, database.ConvertUserRows(found), msg)
diff --git a/coderd/database/queries.sql.go b/coderd/database/queries.sql.go
index fdb9252bf27ee..ffd8ccb035206 100644
--- a/coderd/database/queries.sql.go
+++ b/coderd/database/queries.sql.go
@@ -6288,6 +6288,71 @@ func (q *sqlQuerier) GetPrebuildMetrics(ctx context.Context) ([]GetPrebuildMetri
 	return items, nil
 }
 
+const getPresetsAtFailureLimit = `-- name: GetPresetsAtFailureLimit :many
+WITH filtered_builds AS (
+    -- Only select builds which are for prebuild creations
+    SELECT wlb.template_version_id, wlb.created_at, tvp.id AS preset_id, wlb.job_status, tvp.desired_instances
+    FROM template_version_presets tvp
+        INNER JOIN workspace_latest_builds wlb ON wlb.template_version_preset_id = tvp.id
+        INNER JOIN workspaces w ON wlb.workspace_id = w.id
+        INNER JOIN template_versions tv ON wlb.template_version_id = tv.id
+        INNER JOIN templates t ON tv.template_id = t.id AND t.active_version_id = tv.id
+    WHERE tvp.desired_instances IS NOT NULL -- Consider only presets that have a prebuild configuration.
+      AND wlb.transition = 'start'::workspace_transition
+      AND w.owner_id = 'c42fdf75-3097-471c-8c33-fb52454d81c0'
+),
+time_sorted_builds AS (
+    -- Group builds by preset, then sort each group by created_at.
+    SELECT fb.template_version_id, fb.created_at, fb.preset_id, fb.job_status, fb.desired_instances,
+        ROW_NUMBER() OVER (PARTITION BY fb.preset_id ORDER BY fb.created_at DESC) as rn
+    FROM filtered_builds fb
+)
+SELECT
+    tsb.template_version_id,
+    tsb.preset_id
+FROM time_sorted_builds tsb
+WHERE tsb.rn <= $1::bigint
+  AND tsb.job_status = 'failed'::provisioner_job_status
+GROUP BY tsb.template_version_id, tsb.preset_id
+HAVING COUNT(*) = $1::bigint
+`
+
+type GetPresetsAtFailureLimitRow struct {
+	TemplateVersionID uuid.UUID `db:"template_version_id" json:"template_version_id"`
+	PresetID          uuid.UUID `db:"preset_id" json:"preset_id"`
+}
+
+// GetPresetsAtFailureLimit groups workspace builds by preset ID.
+// Each preset is associated with exactly one template version ID.
+// For each preset, the query checks the last hard_limit builds.
+// If all of them failed, the preset is considered to have hit the hard failure limit.
+// The query returns a list of preset IDs that have reached this failure threshold.
+// Only active template versions with configured presets are considered.
+// For each preset, check the last hard_limit builds.
+// If all of them failed, the preset is considered to have hit the hard failure limit.
+func (q *sqlQuerier) GetPresetsAtFailureLimit(ctx context.Context, hardLimit int64) ([]GetPresetsAtFailureLimitRow, error) {
+	rows, err := q.db.QueryContext(ctx, getPresetsAtFailureLimit, hardLimit)
+	if err != nil {
+		return nil, err
+	}
+	defer rows.Close()
+	var items []GetPresetsAtFailureLimitRow
+	for rows.Next() {
+		var i GetPresetsAtFailureLimitRow
+		if err := rows.Scan(&i.TemplateVersionID, &i.PresetID); err != nil {
+			return nil, err
+		}
+		items = append(items, i)
+	}
+	if err := rows.Close(); err != nil {
+		return nil, err
+	}
+	if err := rows.Err(); err != nil {
+		return nil, err
+	}
+	return items, nil
+}
+
 const getPresetsBackoff = `-- name: GetPresetsBackoff :many
 WITH filtered_builds AS (
     -- Only select builds which are for prebuild creations
@@ -6438,6 +6503,7 @@ const getTemplatePresetsWithPrebuilds = `-- name: GetTemplatePresetsWithPrebuild
 SELECT
     t.id AS template_id,
     t.name AS template_name,
+    o.id AS organization_id,
     o.name AS organization_name,
     tv.id AS template_version_id,
     tv.name AS template_version_name,
@@ -6445,6 +6511,7 @@ SELECT
     tvp.id,
     tvp.name,
     tvp.desired_instances AS desired_instances,
+    tvp.prebuild_status,
     t.deleted,
     t.deprecated != '' AS deprecated
 FROM templates t
@@ -6457,17 +6524,19 @@ WHERE tvp.desired_instances IS NOT NULL -- Consider only presets that have a pre
 `
 
 type GetTemplatePresetsWithPrebuildsRow struct {
-	TemplateID          uuid.UUID     `db:"template_id" json:"template_id"`
-	TemplateName        string        `db:"template_name" json:"template_name"`
-	OrganizationName    string        `db:"organization_name" json:"organization_name"`
-	TemplateVersionID   uuid.UUID     `db:"template_version_id" json:"template_version_id"`
-	TemplateVersionName string        `db:"template_version_name" json:"template_version_name"`
-	UsingActiveVersion  bool          `db:"using_active_version" json:"using_active_version"`
-	ID                  uuid.UUID     `db:"id" json:"id"`
-	Name                string        `db:"name" json:"name"`
-	DesiredInstances    sql.NullInt32 `db:"desired_instances" json:"desired_instances"`
-	Deleted             bool          `db:"deleted" json:"deleted"`
-	Deprecated          bool          `db:"deprecated" json:"deprecated"`
+	TemplateID          uuid.UUID      `db:"template_id" json:"template_id"`
+	TemplateName        string         `db:"template_name" json:"template_name"`
+	OrganizationID      uuid.UUID      `db:"organization_id" json:"organization_id"`
+	OrganizationName    string         `db:"organization_name" json:"organization_name"`
+	TemplateVersionID   uuid.UUID      `db:"template_version_id" json:"template_version_id"`
+	TemplateVersionName string         `db:"template_version_name" json:"template_version_name"`
+	UsingActiveVersion  bool           `db:"using_active_version" json:"using_active_version"`
+	ID                  uuid.UUID      `db:"id" json:"id"`
+	Name                string         `db:"name" json:"name"`
+	DesiredInstances    sql.NullInt32  `db:"desired_instances" json:"desired_instances"`
+	PrebuildStatus      PrebuildStatus `db:"prebuild_status" json:"prebuild_status"`
+	Deleted             bool           `db:"deleted" json:"deleted"`
+	Deprecated          bool           `db:"deprecated" json:"deprecated"`
 }
 
 // GetTemplatePresetsWithPrebuilds retrieves template versions with configured presets and prebuilds.
@@ -6485,6 +6554,7 @@ func (q *sqlQuerier) GetTemplatePresetsWithPrebuilds(ctx context.Context, templa
 		if err := rows.Scan(
 			&i.TemplateID,
 			&i.TemplateName,
+			&i.OrganizationID,
 			&i.OrganizationName,
 			&i.TemplateVersionID,
 			&i.TemplateVersionName,
@@ -6492,6 +6562,7 @@ func (q *sqlQuerier) GetTemplatePresetsWithPrebuilds(ctx context.Context, templa
 			&i.ID,
 			&i.Name,
 			&i.DesiredInstances,
+			&i.PrebuildStatus,
 			&i.Deleted,
 			&i.Deprecated,
 		); err != nil {
@@ -6509,21 +6580,22 @@ func (q *sqlQuerier) GetTemplatePresetsWithPrebuilds(ctx context.Context, templa
 }
 
 const getPresetByID = `-- name: GetPresetByID :one
-SELECT tvp.id, tvp.template_version_id, tvp.name, tvp.created_at, tvp.desired_instances, tvp.invalidate_after_secs, tv.template_id, tv.organization_id FROM
+SELECT tvp.id, tvp.template_version_id, tvp.name, tvp.created_at, tvp.desired_instances, tvp.invalidate_after_secs, tvp.prebuild_status, tv.template_id, tv.organization_id FROM
 	template_version_presets tvp
 	INNER JOIN template_versions tv ON tvp.template_version_id = tv.id
 WHERE tvp.id = $1
 `
 
 type GetPresetByIDRow struct {
-	ID                  uuid.UUID     `db:"id" json:"id"`
-	TemplateVersionID   uuid.UUID     `db:"template_version_id" json:"template_version_id"`
-	Name                string        `db:"name" json:"name"`
-	CreatedAt           time.Time     `db:"created_at" json:"created_at"`
-	DesiredInstances    sql.NullInt32 `db:"desired_instances" json:"desired_instances"`
-	InvalidateAfterSecs sql.NullInt32 `db:"invalidate_after_secs" json:"invalidate_after_secs"`
-	TemplateID          uuid.NullUUID `db:"template_id" json:"template_id"`
-	OrganizationID      uuid.UUID     `db:"organization_id" json:"organization_id"`
+	ID                  uuid.UUID      `db:"id" json:"id"`
+	TemplateVersionID   uuid.UUID      `db:"template_version_id" json:"template_version_id"`
+	Name                string         `db:"name" json:"name"`
+	CreatedAt           time.Time      `db:"created_at" json:"created_at"`
+	DesiredInstances    sql.NullInt32  `db:"desired_instances" json:"desired_instances"`
+	InvalidateAfterSecs sql.NullInt32  `db:"invalidate_after_secs" json:"invalidate_after_secs"`
+	PrebuildStatus      PrebuildStatus `db:"prebuild_status" json:"prebuild_status"`
+	TemplateID          uuid.NullUUID  `db:"template_id" json:"template_id"`
+	OrganizationID      uuid.UUID      `db:"organization_id" json:"organization_id"`
 }
 
 func (q *sqlQuerier) GetPresetByID(ctx context.Context, presetID uuid.UUID) (GetPresetByIDRow, error) {
@@ -6536,6 +6608,7 @@ func (q *sqlQuerier) GetPresetByID(ctx context.Context, presetID uuid.UUID) (Get
 		&i.CreatedAt,
 		&i.DesiredInstances,
 		&i.InvalidateAfterSecs,
+		&i.PrebuildStatus,
 		&i.TemplateID,
 		&i.OrganizationID,
 	)
@@ -6544,7 +6617,7 @@ func (q *sqlQuerier) GetPresetByID(ctx context.Context, presetID uuid.UUID) (Get
 
 const getPresetByWorkspaceBuildID = `-- name: GetPresetByWorkspaceBuildID :one
 SELECT
-	template_version_presets.id, template_version_presets.template_version_id, template_version_presets.name, template_version_presets.created_at, template_version_presets.desired_instances, template_version_presets.invalidate_after_secs
+	template_version_presets.id, template_version_presets.template_version_id, template_version_presets.name, template_version_presets.created_at, template_version_presets.desired_instances, template_version_presets.invalidate_after_secs, template_version_presets.prebuild_status
 FROM
 	template_version_presets
 	INNER JOIN workspace_builds ON workspace_builds.template_version_preset_id = template_version_presets.id
@@ -6562,6 +6635,7 @@ func (q *sqlQuerier) GetPresetByWorkspaceBuildID(ctx context.Context, workspaceB
 		&i.CreatedAt,
 		&i.DesiredInstances,
 		&i.InvalidateAfterSecs,
+		&i.PrebuildStatus,
 	)
 	return i, err
 }
@@ -6643,7 +6717,7 @@ func (q *sqlQuerier) GetPresetParametersByTemplateVersionID(ctx context.Context,
 
 const getPresetsByTemplateVersionID = `-- name: GetPresetsByTemplateVersionID :many
 SELECT
-	id, template_version_id, name, created_at, desired_instances, invalidate_after_secs
+	id, template_version_id, name, created_at, desired_instances, invalidate_after_secs, prebuild_status
 FROM
 	template_version_presets
 WHERE
@@ -6666,6 +6740,7 @@ func (q *sqlQuerier) GetPresetsByTemplateVersionID(ctx context.Context, templa
 			&i.CreatedAt,
 			&i.DesiredInstances,
 			&i.InvalidateAfterSecs,
+			&i.PrebuildStatus,
 		); err != nil {
 			return nil, err
 		}
@@ -6696,7 +6771,7 @@ VALUES (
 	$4,
 	$5,
 	$6
-) RETURNING id, template_version_id, name, created_at, desired_instances, invalidate_after_secs
+) RETURNING id, template_version_id, name, created_at, desired_instances, invalidate_after_secs, prebuild_status
 `
 
 type InsertPresetParams struct {
@@ -6725,6 +6800,7 @@ func (q *sqlQuerier) InsertPreset(ctx context.Context, arg InsertPresetParams) (
 		&i.CreatedAt,
 		&i.DesiredInstances,
 		&i.InvalidateAfterSecs,
+		&i.PrebuildStatus,
 	)
 	return i, err
 }
@@ -6773,6 +6849,22 @@ func (q *sqlQuerier) InsertPresetParameters(ctx context.Context, arg InsertPrese
 	return items, nil
 }
 
+const updatePresetPrebuildStatus = `-- name: UpdatePresetPrebuildStatus :exec
+UPDATE template_version_presets
+SET prebuild_status = $1
+WHERE id = $2
+`
+
+type UpdatePresetPrebuildStatusParams struct {
+	Status   PrebuildStatus `db:"status" json:"status"`
+	PresetID uuid.UUID      `db:"preset_id" json:"preset_id"`
+}
+
+func (q *sqlQuerier) UpdatePresetPrebuildStatus(ctx context.Context, arg UpdatePresetPrebuildStatusParams) error {
+	_, err := q.db.ExecContext(ctx, updatePresetPrebuildStatus, arg.Status, arg.PresetID)
+	return err
+}
+
 const deleteOldProvisionerDaemons = `-- name: DeleteOldProvisionerDaemons :exec
 DELETE FROM provisioner_daemons WHERE (
 	(created_at < (NOW() - INTERVAL '7 days') AND last_seen_at IS NULL) OR
@@ -7384,71 +7476,57 @@ func (q *sqlQuerier) AcquireProvisionerJob(ctx context.Context, arg AcquireProvi
 	return i, err
 }
 
-const getHungProvisionerJobs = `-- name: GetHungProvisionerJobs :many
+const getProvisionerJobByID = `-- name: GetProvisionerJobByID :one
 SELECT
 	id, created_at, updated_at, started_at, canceled_at, completed_at, error, organization_id, initiator_id, provisioner, storage_method, type, input, worker_id, file_id, tags, error_code, trace_metadata, job_status
 FROM
 	provisioner_jobs
 WHERE
-	updated_at < $1
-	AND started_at IS NOT NULL
-	AND completed_at IS NULL
+	id = $1
 `
 
-func (q *sqlQuerier) GetHungProvisionerJobs(ctx context.Context, updatedAt time.Time) ([]ProvisionerJob, error) {
-	rows, err := q.db.QueryContext(ctx, getHungProvisionerJobs, updatedAt)
-	if err != nil {
-		return nil, err
-	}
-	defer rows.Close()
-	var items []ProvisionerJob
-	for rows.Next() {
-		var i ProvisionerJob
-		if err := rows.Scan(
-			&i.ID,
-			&i.CreatedAt,
-			&i.UpdatedAt,
-			&i.StartedAt,
-			&i.CanceledAt,
-			&i.CompletedAt,
-			&i.Error,
-			&i.OrganizationID,
-			&i.InitiatorID,
-			&i.Provisioner,
-			&i.StorageMethod,
-			&i.Type,
-			&i.Input,
-			&i.WorkerID,
-			&i.FileID,
-			&i.Tags,
-			&i.ErrorCode,
-			&i.TraceMetadata,
-			&i.JobStatus,
-		); err != nil {
-			return nil, err
-		}
-		items = append(items, i)
-	}
-	if err := rows.Close(); err != nil {
-		return nil, err
-	}
-	if err := rows.Err(); err != nil {
-		return nil, err
-	}
-	return items, nil
+func (q *sqlQuerier) GetProvisionerJobByID(ctx context.Context, id uuid.UUID) (ProvisionerJob, error) {
+	row := q.db.QueryRowContext(ctx, getProvisionerJobByID, id)
+	var i ProvisionerJob
+	err := row.Scan(
+		&i.ID,
+		&i.CreatedAt,
+		&i.UpdatedAt,
+		&i.StartedAt,
+		&i.CanceledAt,
+		&i.CompletedAt,
+		&i.Error,
+		&i.OrganizationID,
+		&i.InitiatorID,
+		&i.Provisioner,
+		&i.StorageMethod,
+		&i.Type,
+		&i.Input,
+		&i.WorkerID,
+		&i.FileID,
+		&i.Tags,
+		&i.ErrorCode,
+		&i.TraceMetadata,
+		&i.JobStatus,
+	)
+	return i, err
 }
 
-const getProvisionerJobByID = `-- name: GetProvisionerJobByID :one
+const getProvisionerJobByIDForUpdate = `-- name: GetProvisionerJobByIDForUpdate :one
 SELECT
 	id, created_at, updated_at, started_at, canceled_at, completed_at, error, organization_id, initiator_id, provisioner, storage_method, type, input, worker_id, file_id, tags, error_code, trace_metadata, job_status
 FROM
 	provisioner_jobs
 WHERE
 	id = $1
+FOR UPDATE
+SKIP LOCKED
 `
-func (q *sqlQuerier) GetProvisionerJobByID(ctx context.Context, id uuid.UUID) (ProvisionerJob, error) {
-	row := q.db.QueryRowContext(ctx, getProvisionerJobByID, id)
+// Gets a single provisioner job by ID for update.
+// This is used to securely reap jobs that have been hung/pending for a long time.
+func (q *sqlQuerier) GetProvisionerJobByIDForUpdate(ctx context.Context, id uuid.UUID) (ProvisionerJob, error) {
+	row := q.db.QueryRowContext(ctx, getProvisionerJobByIDForUpdate, id)
 	var i ProvisionerJob
 	err := row.Scan(
 		&i.ID,
@@ -7913,6 +7991,79 @@ func (q *sqlQuerier) GetProvisionerJobsCreatedAfter(ctx context.Context, created
 	return items, nil
 }
 
+const getProvisionerJobsToBeReaped = `-- name: GetProvisionerJobsToBeReaped :many
+SELECT
+	id, created_at, updated_at, started_at, canceled_at, completed_at, error, organization_id, initiator_id, provisioner, storage_method, type, input, worker_id, file_id, tags, error_code, trace_metadata, job_status
+FROM
+	provisioner_jobs
+WHERE
+	(
+		-- If the job has not been started before @pending_since, reap it.
+		updated_at < $1
+		AND started_at IS NULL
+		AND completed_at IS NULL
+	)
+	OR
+	(
+		-- If the job has been started but not completed before @hung_since, reap it.
+		updated_at < $2
+		AND started_at IS NOT NULL
+		AND completed_at IS NULL
+	)
+ORDER BY random()
+LIMIT $3
+`
+
+type GetProvisionerJobsToBeReapedParams struct {
+	PendingSince time.Time `db:"pending_since" json:"pending_since"`
+	HungSince    time.Time `db:"hung_since" json:"hung_since"`
+	MaxJobs      int32     `db:"max_jobs" json:"max_jobs"`
+}
+
+// To avoid repeatedly attempting to reap the same jobs, we randomly order and limit to @max_jobs.
+func (q *sqlQuerier) GetProvisionerJobsToBeReaped(ctx context.Context, arg GetProvisionerJobsToBeReapedParams) ([]ProvisionerJob, error) {
+	rows, err := q.db.QueryContext(ctx, getProvisionerJobsToBeReaped, arg.PendingSince, arg.HungSince, arg.MaxJobs)
+	if err != nil {
+		return nil, err
+	}
+	defer rows.Close()
+	var items []ProvisionerJob
+	for rows.Next() {
+		var i ProvisionerJob
+		if err := rows.Scan(
+			&i.ID,
+			&i.CreatedAt,
+			&i.UpdatedAt,
+			&i.StartedAt,
+			&i.CanceledAt,
+			&i.CompletedAt,
+			&i.Error,
+			&i.OrganizationID,
+			&i.InitiatorID,
+			&i.Provisioner,
+			&i.StorageMethod,
+			&i.Type,
+			&i.Input,
+			&i.WorkerID,
+			&i.FileID,
+			&i.Tags,
+			&i.ErrorCode,
+			&i.TraceMetadata,
+			&i.JobStatus,
+		); err != nil {
+			return nil, err
+		}
+		items = append(items, i)
+	}
+	if err := rows.Close(); err != nil {
+		return nil, err
+	}
+	if err := rows.Err(); err != nil {
+		return nil, err
+	}
+	return items, nil
+}
+
 const insertProvisionerJob = `-- name: InsertProvisionerJob :one
 INSERT INTO
 	provisioner_jobs (
@@ -8121,6 +8272,40 @@ func (q *sqlQuerier) UpdateProvisionerJobWithCompleteByID(ctx context.Context, a
 	return err
 }
 
+const updateProvisionerJobWithCompleteWithStartedAtByID = `-- name: UpdateProvisionerJobWithCompleteWithStartedAtByID :exec
+UPDATE
+	provisioner_jobs
+SET
+	updated_at = $2,
+	completed_at = $3,
+	error = $4,
+	error_code = $5,
+	started_at = $6
+WHERE
+	id = $1
+`
+
+type UpdateProvisionerJobWithCompleteWithStartedAtByIDParams struct {
+	ID          uuid.UUID      `db:"id" json:"id"`
+	UpdatedAt   time.Time      `db:"updated_at" json:"updated_at"`
+	CompletedAt sql.NullTime   `db:"completed_at" json:"completed_at"`
+	Error       sql.NullString `db:"error" json:"error"`
+	ErrorCode   sql.NullString `db:"error_code" json:"error_code"`
+	StartedAt   sql.NullTime   `db:"started_at" json:"started_at"`
+}
+
+func (q *sqlQuerier) UpdateProvisionerJobWithCompleteWithStartedAtByID(ctx context.Context, arg UpdateProvisionerJobWithCompleteWithStartedAtByIDParams) error {
+	_, err := q.db.ExecContext(ctx, updateProvisionerJobWithCompleteWithStartedAtByID,
+		arg.ID,
+		arg.UpdatedAt,
+		arg.CompletedAt,
+		arg.Error,
+		arg.ErrorCode,
+		arg.StartedAt,
+	)
+	return err
+}
+
 const deleteProvisionerKey = `-- name: DeleteProvisionerKey :exec
 DELETE FROM
 	provisioner_keys
@@ -18050,6 +18235,65 @@ func (q *sqlQuerier) GetWorkspaceByOwnerIDAndName(ctx context.Context, arg GetWo
 	return i, err
 }
 
+const getWorkspaceByResourceID = `-- name: GetWorkspaceByResourceID :one
+SELECT
+	id, created_at, updated_at, owner_id, organization_id, template_id, deleted, name, autostart_schedule, ttl, last_used_at, dormant_at, deleting_at, automatic_updates, favorite, next_start_at, owner_avatar_url, owner_username, organization_name, organization_display_name, organization_icon, organization_description, template_name, template_display_name, template_icon, template_description
+FROM
+	workspaces_expanded as workspaces
+WHERE
+	workspaces.id = (
+		SELECT
+			workspace_id
+		FROM
+			workspace_builds
+		WHERE
+			workspace_builds.job_id = (
+				SELECT
+					job_id
+				FROM
+					workspace_resources
+				WHERE
+					workspace_resources.id = $1
+			)
+	)
+LIMIT
+	1
+`
+
+func (q *sqlQuerier) GetWorkspaceByResourceID(ctx context.Context, resourceID uuid.UUID) (Workspace, error) {
+	row := q.db.QueryRowContext(ctx, getWorkspaceByResourceID, resourceID)
+	var i Workspace
+	err := row.Scan(
+		&i.ID,
+		&i.CreatedAt,
+		&i.UpdatedAt,
+		&i.OwnerID,
+		&i.OrganizationID,
+		&i.TemplateID,
+		&i.Deleted,
+		&i.Name,
+		&i.AutostartSchedule,
+		&i.Ttl,
+		&i.LastUsedAt,
+		&i.DormantAt,
+		&i.DeletingAt,
+		&i.AutomaticUpdates,
+		&i.Favorite,
+		&i.NextStartAt,
+		&i.OwnerAvatarUrl,
+		&i.OwnerUsername,
+		&i.OrganizationName,
+		&i.OrganizationDisplayName,
+		&i.OrganizationIcon,
+		&i.OrganizationDescription,
+		&i.TemplateName,
+		&i.TemplateDisplayName,
+		&i.TemplateIcon,
+		&i.TemplateDescription,
+	)
+	return i, err
+}
+
 const getWorkspaceByWorkspaceAppID = `-- name: GetWorkspaceByWorkspaceAppID :one
 SELECT
 	id, created_at, updated_at, owner_id, organization_id, template_id, deleted, name, autostart_schedule, ttl, last_used_at, dormant_at, deleting_at, automatic_updates, favorite, next_start_at, owner_avatar_url, owner_username, organization_name, organization_display_name, organization_icon, organization_description, template_name, template_display_name, template_icon, template_description
diff --git a/coderd/database/queries/prebuilds.sql b/coderd/database/queries/prebuilds.sql
index 8c27ddf62b7c3..9cd4321afec23 100644
--- a/coderd/database/queries/prebuilds.sql
+++ b/coderd/database/queries/prebuilds.sql
@@ -27,6 +27,7 @@ RETURNING w.id, w.name;
 SELECT
     t.id AS template_id,
     t.name AS template_name,
+    o.id AS organization_id,
     o.name AS organization_name,
     tv.id AS template_version_id,
     tv.name AS template_version_name,
@@ -34,6 +35,7 @@ SELECT
     tvp.id,
     tvp.name,
     tvp.desired_instances AS desired_instances,
+    tvp.prebuild_status,
     t.deleted,
     t.deprecated != '' AS deprecated
 FROM templates t
@@ -129,6 +131,42 @@ WHERE tsb.rn <= tsb.desired_instances -- Fetch the last N builds, where N is the
   AND created_at >= @lookback::timestamptz
 GROUP BY tsb.template_version_id, tsb.preset_id, fc.num_failed;
 
+-- GetPresetsAtFailureLimit groups workspace builds by preset ID.
+-- Each preset is associated with exactly one template version ID.
+-- For each preset, the query checks the last hard_limit builds.
+-- If all of them failed, the preset is considered to have hit the hard failure limit.
+-- The query returns a list of preset IDs that have reached this failure threshold.
+-- Only active template versions with configured presets are considered.
+-- name: GetPresetsAtFailureLimit :many
+WITH filtered_builds AS (
+    -- Only select builds which are for prebuild creations
+    SELECT wlb.template_version_id, wlb.created_at, tvp.id AS preset_id, wlb.job_status, tvp.desired_instances
+    FROM template_version_presets tvp
+        INNER JOIN workspace_latest_builds wlb ON wlb.template_version_preset_id = tvp.id
+        INNER JOIN workspaces w ON wlb.workspace_id = w.id
+        INNER JOIN template_versions tv ON wlb.template_version_id = tv.id
+        INNER JOIN templates t ON tv.template_id = t.id AND t.active_version_id = tv.id
+    WHERE tvp.desired_instances IS NOT NULL -- Consider only presets that have a prebuild configuration.
+      AND wlb.transition = 'start'::workspace_transition
+      AND w.owner_id = 'c42fdf75-3097-471c-8c33-fb52454d81c0'
+),
+time_sorted_builds AS (
+    -- Group builds by preset, then sort each group by created_at.
+    SELECT fb.template_version_id, fb.created_at, fb.preset_id, fb.job_status, fb.desired_instances,
+        ROW_NUMBER() OVER (PARTITION BY fb.preset_id ORDER BY fb.created_at DESC) as rn
+    FROM filtered_builds fb
+)
+SELECT
+    tsb.template_version_id,
+    tsb.preset_id
+FROM time_sorted_builds tsb
+-- For each preset, check the last hard_limit builds.
+-- If all of them failed, the preset is considered to have hit the hard failure limit.
+WHERE tsb.rn <= @hard_limit::bigint
+  AND tsb.job_status = 'failed'::provisioner_job_status
+GROUP BY tsb.template_version_id, tsb.preset_id
+HAVING COUNT(*) = @hard_limit::bigint;
+
 -- name: GetPrebuildMetrics :many
 SELECT
     t.name as template_name,
diff --git a/coderd/database/queries/presets.sql b/coderd/database/queries/presets.sql
index 6d5646a285b4a..2fb6722bc2c33 100644
--- a/coderd/database/queries/presets.sql
+++ b/coderd/database/queries/presets.sql
@@ -25,6 +25,11 @@ SELECT
 	unnest(@values :: TEXT[])
 RETURNING *;
 
+-- name: UpdatePresetPrebuildStatus :exec
+UPDATE template_version_presets
+SET prebuild_status = @status
+WHERE id = @preset_id;
+
 -- name: GetPresetsByTemplateVersionID :many
 SELECT
 	*
diff --git a/coderd/database/queries/provisionerjobs.sql b/coderd/database/queries/provisionerjobs.sql
index 2ab7774e660b8..88bacc705601c 100644
--- a/coderd/database/queries/provisionerjobs.sql
+++ b/coderd/database/queries/provisionerjobs.sql
@@ -41,6 +41,18 @@ FROM
 WHERE
 	id = $1;
 
+-- name: GetProvisionerJobByIDForUpdate :one
+-- Gets a single provisioner job by ID for update.
+-- This is used to securely reap jobs that have been hung/pending for a long time.
+SELECT
+	*
+FROM
+	provisioner_jobs
+WHERE
+	id = $1
+FOR UPDATE
+SKIP LOCKED;
+
 -- name: GetProvisionerJobsByIDs :many
 SELECT
 	*
@@ -262,15 +274,40 @@ SET
 WHERE
 	id = $1;
 
--- name: GetHungProvisionerJobs :many
+-- name: UpdateProvisionerJobWithCompleteWithStartedAtByID :exec
+UPDATE
+	provisioner_jobs
+SET
+	updated_at = $2,
+	completed_at = $3,
+	error = $4,
+	error_code = $5,
+	started_at = $6
+WHERE
+	id = $1;
+
+-- name: GetProvisionerJobsToBeReaped :many
 SELECT
 	*
 FROM
 	provisioner_jobs
 WHERE
-	updated_at < $1
-	AND started_at IS NOT NULL
-	AND completed_at IS NULL;
+	(
+		-- If the job has not been started before @pending_since, reap it.
+ updated_at < @pending_since + AND started_at IS NULL + AND completed_at IS NULL + ) + OR + ( + -- If the job has been started but not completed before @hung_since, reap it. + updated_at < @hung_since + AND started_at IS NOT NULL + AND completed_at IS NULL + ) +-- To avoid repeatedly attempting to reap the same jobs, we randomly order and limit to @max_jobs. +ORDER BY random() +LIMIT @max_jobs; -- name: InsertProvisionerJobTimings :many INSERT INTO provisioner_job_timings (job_id, started_at, ended_at, stage, source, action, resource) diff --git a/coderd/database/queries/workspaces.sql b/coderd/database/queries/workspaces.sql index 4ec74c066fe41..44b7dcbf0387d 100644 --- a/coderd/database/queries/workspaces.sql +++ b/coderd/database/queries/workspaces.sql @@ -8,6 +8,30 @@ WHERE LIMIT 1; +-- name: GetWorkspaceByResourceID :one +SELECT + * +FROM + workspaces_expanded as workspaces +WHERE + workspaces.id = ( + SELECT + workspace_id + FROM + workspace_builds + WHERE + workspace_builds.job_id = ( + SELECT + job_id + FROM + workspace_resources + WHERE + workspace_resources.id = @resource_id + ) + ) +LIMIT + 1; + -- name: GetWorkspaceByWorkspaceAppID :one SELECT * diff --git a/coderd/httpapi/authz.go b/coderd/httpapi/authz.go new file mode 100644 index 0000000000000..f0f208d31b937 --- /dev/null +++ b/coderd/httpapi/authz.go @@ -0,0 +1,28 @@ +//go:build !slim + +package httpapi + +import ( + "context" + "net/http" + + "github.com/coder/coder/v2/coderd/rbac" +) + +// This is defined separately in slim builds to avoid importing the rbac +// package, which is a large dependency. +func SetAuthzCheckRecorderHeader(ctx context.Context, rw http.ResponseWriter) { + if rec, ok := rbac.GetAuthzCheckRecorder(ctx); ok { + // If you're here because you saw this header in a response, and you're + // trying to investigate the code, here are a couple of notable things + // for you to know: + // - If any of the checks are `false`, they might not represent the whole + // picture. 
There could be additional checks that weren't performed, + // because processing stopped after the failure. + // - The checks are recorded by the `authzRecorder` type, which is + // configured on server startup for development and testing builds. + // - If this header is missing from a response, make sure the response is + // being written by calling `httpapi.Write`! + rw.Header().Set("x-authz-checks", rec.String()) + } +} diff --git a/coderd/httpapi/authz_slim.go b/coderd/httpapi/authz_slim.go new file mode 100644 index 0000000000000..0ebe7ca01aa86 --- /dev/null +++ b/coderd/httpapi/authz_slim.go @@ -0,0 +1,13 @@ +//go:build slim + +package httpapi + +import ( + "context" + "net/http" +) + +func SetAuthzCheckRecorderHeader(ctx context.Context, rw http.ResponseWriter) { + // There's no RBAC on the agent API, so this is separately defined to + // avoid importing the RBAC package, which is a large dependency. +} diff --git a/coderd/httpapi/httpapi.go b/coderd/httpapi/httpapi.go index 5c5c623474a47..466d45de82e5d 100644 --- a/coderd/httpapi/httpapi.go +++ b/coderd/httpapi/httpapi.go @@ -20,7 +20,6 @@ import ( "github.com/coder/websocket/wsjson" "github.com/coder/coder/v2/coderd/httpapi/httpapiconstraints" - "github.com/coder/coder/v2/coderd/rbac" "github.com/coder/coder/v2/coderd/tracing" "github.com/coder/coder/v2/codersdk" ) @@ -199,19 +198,7 @@ func Write(ctx context.Context, rw http.ResponseWriter, status int, response int _, span := tracing.StartSpan(ctx) defer span.End() - if rec, ok := rbac.GetAuthzCheckRecorder(ctx); ok { - // If you're here because you saw this header in a response, and you're - // trying to investigate the code, here are a couple of notable things - // for you to know: - // - If any of the checks are `false`, they might not represent the whole - // picture. There could be additional checks that weren't performed, - // because processing stopped after the failure. 
- // - The checks are recorded by the `authzRecorder` type, which is - // configured on server startup for development and testing builds. - // - If this header is missing from a response, make sure the response is - // being written by calling `httpapi.Write`! - rw.Header().Set("x-authz-checks", rec.String()) - } + SetAuthzCheckRecorderHeader(ctx, rw) rw.Header().Set("Content-Type", "application/json; charset=utf-8") rw.WriteHeader(status) @@ -228,9 +215,7 @@ func WriteIndent(ctx context.Context, rw http.ResponseWriter, status int, respon _, span := tracing.StartSpan(ctx) defer span.End() - if rec, ok := rbac.GetAuthzCheckRecorder(ctx); ok { - rw.Header().Set("x-authz-checks", rec.String()) - } + SetAuthzCheckRecorderHeader(ctx, rw) rw.Header().Set("Content-Type", "application/json; charset=utf-8") rw.WriteHeader(status) diff --git a/coderd/httpmw/authz.go b/coderd/httpmw/authz.go index 53aadb6cb7a57..9f1f397c858e0 100644 --- a/coderd/httpmw/authz.go +++ b/coderd/httpmw/authz.go @@ -1,3 +1,5 @@ +//go:build !slim + package httpmw import ( diff --git a/coderd/httpmw/loggermw/logger.go b/coderd/httpmw/loggermw/logger.go index 9eeb07a5f10e5..30e5e2d811ad8 100644 --- a/coderd/httpmw/loggermw/logger.go +++ b/coderd/httpmw/loggermw/logger.go @@ -132,7 +132,7 @@ var actorLogOrder = []rbac.SubjectType{ rbac.SubjectTypeAutostart, rbac.SubjectTypeCryptoKeyReader, rbac.SubjectTypeCryptoKeyRotator, - rbac.SubjectTypeHangDetector, + rbac.SubjectTypeJobReaper, rbac.SubjectTypeNotifier, rbac.SubjectTypePrebuildsOrchestrator, rbac.SubjectTypeProvisionerd, diff --git a/coderd/unhanger/detector.go b/coderd/jobreaper/detector.go similarity index 72% rename from coderd/unhanger/detector.go rename to coderd/jobreaper/detector.go index 14383b1839363..ad5774ee6b95d 100644 --- a/coderd/unhanger/detector.go +++ b/coderd/jobreaper/detector.go @@ -1,11 +1,10 @@ -package unhanger +package jobreaper import ( "context" "database/sql" "encoding/json" - "fmt" - "math/rand" //#nosec // this is 
only used for shuffling an array to pick random jobs to unhang + "fmt" "time" "golang.org/x/xerrors" @@ -21,10 +20,14 @@ import ( ) const ( - // HungJobDuration is the duration of time since the last update to a job - // before it is considered hung. + // HungJobDuration is the duration of time since the last update + // to a RUNNING job before it is considered hung. HungJobDuration = 5 * time.Minute + // PendingJobDuration is the duration of time since the last update + // to a PENDING job before it is considered dead. + PendingJobDuration = 30 * time.Minute + // HungJobExitTimeout is the duration of time that provisioners should allow // for a graceful exit upon cancellation due to failing to send an update to // a job. @@ -38,16 +41,30 @@ const ( MaxJobsPerRun = 10 ) -// HungJobLogMessages are written to provisioner job logs when a job is hung and -// terminated. -var HungJobLogMessages = []string{ - "", - "====================", - "Coder: Build has been detected as hung for 5 minutes and will be terminated.", - "====================", - "", +// JobLogMessages returns the messages written to provisioner job logs when a job is reaped. +func JobLogMessages(reapType ReapType, threshold time.Duration) []string { + return []string{ + "", + "====================", + fmt.Sprintf("Coder: Build has been detected as %s for %.0f minutes and will be terminated.", reapType, threshold.Minutes()), + "====================", + "", + } +} + +type jobToReap struct { + ID uuid.UUID + Threshold time.Duration + Type ReapType } +type ReapType string + +const ( + Pending ReapType = "pending" + Hung ReapType = "hung" +) + // acquireLockError is returned when the detector fails to acquire a lock and // cancels the current run. type acquireLockError struct{} @@ -93,10 +110,10 @@ type Stats struct { Error error } -// New returns a new hang detector. +// New returns a new job reaper.
func New(ctx context.Context, db database.Store, pub pubsub.Pubsub, log slog.Logger, tick <-chan time.Time) *Detector { - //nolint:gocritic // Hang detector has a limited set of permissions. - ctx, cancel := context.WithCancel(dbauthz.AsHangDetector(ctx)) + //nolint:gocritic // Job reaper has a limited set of permissions. + ctx, cancel := context.WithCancel(dbauthz.AsJobReaper(ctx)) d := &Detector{ ctx: ctx, cancel: cancel, @@ -172,34 +189,42 @@ func (d *Detector) run(t time.Time) Stats { Error: nil, } - // Find all provisioner jobs that are currently running but have not - // received an update in the last 5 minutes. - jobs, err := d.db.GetHungProvisionerJobs(ctx, t.Add(-HungJobDuration)) + // Find all provisioner jobs to be reaped + jobs, err := d.db.GetProvisionerJobsToBeReaped(ctx, database.GetProvisionerJobsToBeReapedParams{ + PendingSince: t.Add(-PendingJobDuration), + HungSince: t.Add(-HungJobDuration), + MaxJobs: MaxJobsPerRun, + }) if err != nil { - stats.Error = xerrors.Errorf("get hung provisioner jobs: %w", err) + stats.Error = xerrors.Errorf("get provisioner jobs to be reaped: %w", err) return stats } - // Limit the number of jobs we'll unhang in a single run to avoid - // timing out. - if len(jobs) > MaxJobsPerRun { - // Pick a random subset of the jobs to unhang. - rand.Shuffle(len(jobs), func(i, j int) { - jobs[i], jobs[j] = jobs[j], jobs[i] - }) - jobs = jobs[:MaxJobsPerRun] - } + jobsToReap := make([]*jobToReap, 0, len(jobs)) - // Send a message into the build log for each hung job saying that it - // has been detected and will be terminated, then mark the job as - // failed. 
for _, job := range jobs { + j := &jobToReap{ + ID: job.ID, + } + if job.JobStatus == database.ProvisionerJobStatusPending { + j.Threshold = PendingJobDuration + j.Type = Pending + } else { + j.Threshold = HungJobDuration + j.Type = Hung + } + jobsToReap = append(jobsToReap, j) + } + + // Send a message into the build log for each hung or pending job saying that it + // has been detected and will be terminated, then mark the job as failed. + for _, job := range jobsToReap { log := d.log.With(slog.F("job_id", job.ID)) - err := unhangJob(ctx, log, d.db, d.pubsub, job.ID) + err := reapJob(ctx, log, d.db, d.pubsub, job) if err != nil { if !(xerrors.As(err, &acquireLockError{}) || xerrors.As(err, &jobIneligibleError{})) { - log.Error(ctx, "error forcefully terminating hung provisioner job", slog.Error(err)) + log.Error(ctx, "error forcefully terminating provisioner job", slog.F("type", job.Type), slog.Error(err)) } continue } @@ -210,47 +235,34 @@ func (d *Detector) run(t time.Time) Stats { return stats } -func unhangJob(ctx context.Context, log slog.Logger, db database.Store, pub pubsub.Pubsub, jobID uuid.UUID) error { +func reapJob(ctx context.Context, log slog.Logger, db database.Store, pub pubsub.Pubsub, jobToReap *jobToReap) error { var lowestLogID int64 err := db.InTx(func(db database.Store) error { - locked, err := db.TryAcquireLock(ctx, database.GenLockID(fmt.Sprintf("hang-detector:%s", jobID))) - if err != nil { - return xerrors.Errorf("acquire lock: %w", err) - } - if !locked { - // This error is ignored. - return acquireLockError{} - } - // Refetch the job while we hold the lock. - job, err := db.GetProvisionerJobByID(ctx, jobID) + job, err := db.GetProvisionerJobByIDForUpdate(ctx, jobToReap.ID) if err != nil { + if xerrors.Is(err, sql.ErrNoRows) { + return acquireLockError{} + } return xerrors.Errorf("get provisioner job: %w", err) } - // Check if we should still unhang it. 
- if !job.StartedAt.Valid { - // This shouldn't be possible to hit because the query only selects - // started and not completed jobs, and a job can't be "un-started". - return jobIneligibleError{ - Err: xerrors.New("job is not started"), - } - } if job.CompletedAt.Valid { return jobIneligibleError{ Err: xerrors.Errorf("job is completed (status %s)", job.JobStatus), } } - if job.UpdatedAt.After(time.Now().Add(-HungJobDuration)) { + if job.UpdatedAt.After(time.Now().Add(-jobToReap.Threshold)) { return jobIneligibleError{ Err: xerrors.New("job has been updated recently"), } } log.Warn( - ctx, "detected hung provisioner job, forcefully terminating", - "threshold", HungJobDuration, + ctx, "forcefully terminating provisioner job", + "type", jobToReap.Type, + "threshold", jobToReap.Threshold, ) // First, get the latest logs from the build so we can make sure @@ -260,7 +272,7 @@ func unhangJob(ctx context.Context, log slog.Logger, db database.Store, pub pubs CreatedAfter: 0, }) if err != nil { - return xerrors.Errorf("get logs for hung job: %w", err) + return xerrors.Errorf("get logs for %s job: %w", jobToReap.Type, err) } logStage := "" if len(logs) != 0 { @@ -280,7 +292,7 @@ func unhangJob(ctx context.Context, log slog.Logger, db database.Store, pub pubs Output: nil, } now := dbtime.Now() - for i, msg := range HungJobLogMessages { + for i, msg := range JobLogMessages(jobToReap.Type, jobToReap.Threshold) { // Set the created at in a way that ensures each message has // a unique timestamp so they will be sorted correctly. 
insertParams.CreatedAt = append(insertParams.CreatedAt, now.Add(time.Millisecond*time.Duration(i))) @@ -291,13 +303,22 @@ func unhangJob(ctx context.Context, log slog.Logger, db database.Store, pub pubs } newLogs, err := db.InsertProvisionerJobLogs(ctx, insertParams) if err != nil { - return xerrors.Errorf("insert logs for hung job: %w", err) + return xerrors.Errorf("insert logs for %s job: %w", job.JobStatus, err) } lowestLogID = newLogs[0].ID // Mark the job as failed. now = dbtime.Now() - err = db.UpdateProvisionerJobWithCompleteByID(ctx, database.UpdateProvisionerJobWithCompleteByIDParams{ + + // If the job was never started (pending), set the StartedAt time to the current + // time so that the build duration is correct. + if job.JobStatus == database.ProvisionerJobStatusPending { + job.StartedAt = sql.NullTime{ + Time: now, + Valid: true, + } + } + err = db.UpdateProvisionerJobWithCompleteWithStartedAtByID(ctx, database.UpdateProvisionerJobWithCompleteWithStartedAtByIDParams{ ID: job.ID, UpdatedAt: now, CompletedAt: sql.NullTime{ @@ -305,12 +326,13 @@ func unhangJob(ctx context.Context, log slog.Logger, db database.Store, pub pubs Valid: true, }, Error: sql.NullString{ - String: "Coder: Build has been detected as hung for 5 minutes and has been terminated by hang detector.", + String: fmt.Sprintf("Coder: Build has been detected as %s for %.0f minutes and has been terminated by the reaper.", jobToReap.Type, jobToReap.Threshold.Minutes()), Valid: true, }, ErrorCode: sql.NullString{ Valid: false, }, + StartedAt: job.StartedAt, }) if err != nil { return xerrors.Errorf("mark job as failed: %w", err) @@ -364,7 +386,7 @@ func unhangJob(ctx context.Context, log slog.Logger, db database.Store, pub pubs if err != nil { return xerrors.Errorf("marshal log notification: %w", err) } - err = pub.Publish(provisionersdk.ProvisionerJobLogsNotifyChannel(jobID), data) + err = pub.Publish(provisionersdk.ProvisionerJobLogsNotifyChannel(jobToReap.ID), data) if err != nil { return 
xerrors.Errorf("publish log notification: %w", err) } diff --git a/coderd/unhanger/detector_test.go b/coderd/jobreaper/detector_test.go similarity index 73% rename from coderd/unhanger/detector_test.go rename to coderd/jobreaper/detector_test.go index 43eb62bfa884b..28457aeeca3a8 100644 --- a/coderd/unhanger/detector_test.go +++ b/coderd/jobreaper/detector_test.go @@ -1,4 +1,4 @@ -package unhanger_test +package jobreaper_test import ( "context" @@ -20,9 +20,9 @@ import ( "github.com/coder/coder/v2/coderd/database/dbauthz" "github.com/coder/coder/v2/coderd/database/dbgen" "github.com/coder/coder/v2/coderd/database/dbtestutil" + "github.com/coder/coder/v2/coderd/jobreaper" "github.com/coder/coder/v2/coderd/provisionerdserver" "github.com/coder/coder/v2/coderd/rbac" - "github.com/coder/coder/v2/coderd/unhanger" "github.com/coder/coder/v2/provisionersdk" "github.com/coder/coder/v2/testutil" ) @@ -39,10 +39,10 @@ func TestDetectorNoJobs(t *testing.T) { db, pubsub = dbtestutil.NewDB(t) log = testutil.Logger(t) tickCh = make(chan time.Time) - statsCh = make(chan unhanger.Stats) + statsCh = make(chan jobreaper.Stats) ) - detector := unhanger.New(ctx, wrapDBAuthz(db, log), pubsub, log, tickCh).WithStatsChannel(statsCh) + detector := jobreaper.New(ctx, wrapDBAuthz(db, log), pubsub, log, tickCh).WithStatsChannel(statsCh) detector.Start() tickCh <- time.Now() @@ -62,7 +62,7 @@ func TestDetectorNoHungJobs(t *testing.T) { db, pubsub = dbtestutil.NewDB(t) log = testutil.Logger(t) tickCh = make(chan time.Time) - statsCh = make(chan unhanger.Stats) + statsCh = make(chan jobreaper.Stats) ) // Insert some jobs that are running and haven't been updated in a while, @@ -89,7 +89,7 @@ func TestDetectorNoHungJobs(t *testing.T) { }) } - detector := unhanger.New(ctx, wrapDBAuthz(db, log), pubsub, log, tickCh).WithStatsChannel(statsCh) + detector := jobreaper.New(ctx, wrapDBAuthz(db, log), pubsub, log, tickCh).WithStatsChannel(statsCh) detector.Start() tickCh <- now @@ -109,7 +109,7 @@ func 
TestDetectorHungWorkspaceBuild(t *testing.T) { db, pubsub = dbtestutil.NewDB(t) log = testutil.Logger(t) tickCh = make(chan time.Time) - statsCh = make(chan unhanger.Stats) + statsCh = make(chan jobreaper.Stats) ) var ( @@ -195,7 +195,7 @@ func TestDetectorHungWorkspaceBuild(t *testing.T) { t.Log("previous job ID: ", previousWorkspaceBuildJob.ID) t.Log("current job ID: ", currentWorkspaceBuildJob.ID) - detector := unhanger.New(ctx, wrapDBAuthz(db, log), pubsub, log, tickCh).WithStatsChannel(statsCh) + detector := jobreaper.New(ctx, wrapDBAuthz(db, log), pubsub, log, tickCh).WithStatsChannel(statsCh) detector.Start() tickCh <- now @@ -231,7 +231,7 @@ func TestDetectorHungWorkspaceBuildNoOverrideState(t *testing.T) { db, pubsub = dbtestutil.NewDB(t) log = testutil.Logger(t) tickCh = make(chan time.Time) - statsCh = make(chan unhanger.Stats) + statsCh = make(chan jobreaper.Stats) ) var ( @@ -318,7 +318,7 @@ func TestDetectorHungWorkspaceBuildNoOverrideState(t *testing.T) { t.Log("previous job ID: ", previousWorkspaceBuildJob.ID) t.Log("current job ID: ", currentWorkspaceBuildJob.ID) - detector := unhanger.New(ctx, wrapDBAuthz(db, log), pubsub, log, tickCh).WithStatsChannel(statsCh) + detector := jobreaper.New(ctx, wrapDBAuthz(db, log), pubsub, log, tickCh).WithStatsChannel(statsCh) detector.Start() tickCh <- now @@ -354,7 +354,7 @@ func TestDetectorHungWorkspaceBuildNoOverrideStateIfNoExistingBuild(t *testing.T db, pubsub = dbtestutil.NewDB(t) log = testutil.Logger(t) tickCh = make(chan time.Time) - statsCh = make(chan unhanger.Stats) + statsCh = make(chan jobreaper.Stats) ) var ( @@ -411,7 +411,7 @@ func TestDetectorHungWorkspaceBuildNoOverrideStateIfNoExistingBuild(t *testing.T t.Log("current job ID: ", currentWorkspaceBuildJob.ID) - detector := unhanger.New(ctx, wrapDBAuthz(db, log), pubsub, log, tickCh).WithStatsChannel(statsCh) + detector := jobreaper.New(ctx, wrapDBAuthz(db, log), pubsub, log, tickCh).WithStatsChannel(statsCh) detector.Start() tickCh <- now @@ 
-439,6 +439,100 @@ func TestDetectorHungWorkspaceBuildNoOverrideStateIfNoExistingBuild(t *testing.T detector.Wait() } +func TestDetectorPendingWorkspaceBuildNoOverrideStateIfNoExistingBuild(t *testing.T) { + t.Parallel() + + var ( + ctx = testutil.Context(t, testutil.WaitLong) + db, pubsub = dbtestutil.NewDB(t) + log = testutil.Logger(t) + tickCh = make(chan time.Time) + statsCh = make(chan jobreaper.Stats) + ) + + var ( + now = time.Now() + thirtyFiveMinAgo = now.Add(-time.Minute * 35) + org = dbgen.Organization(t, db, database.Organization{}) + user = dbgen.User(t, db, database.User{}) + file = dbgen.File(t, db, database.File{}) + template = dbgen.Template(t, db, database.Template{ + OrganizationID: org.ID, + CreatedBy: user.ID, + }) + templateVersion = dbgen.TemplateVersion(t, db, database.TemplateVersion{ + OrganizationID: org.ID, + TemplateID: uuid.NullUUID{ + UUID: template.ID, + Valid: true, + }, + CreatedBy: user.ID, + }) + workspace = dbgen.Workspace(t, db, database.WorkspaceTable{ + OwnerID: user.ID, + OrganizationID: org.ID, + TemplateID: template.ID, + }) + + // First build. + expectedWorkspaceBuildState = []byte(`{"dean":"cool","colin":"also cool"}`) + currentWorkspaceBuildJob = dbgen.ProvisionerJob(t, db, pubsub, database.ProvisionerJob{ + CreatedAt: thirtyFiveMinAgo, + UpdatedAt: thirtyFiveMinAgo, + StartedAt: sql.NullTime{ + Time: time.Time{}, + Valid: false, + }, + OrganizationID: org.ID, + InitiatorID: user.ID, + Provisioner: database.ProvisionerTypeEcho, + StorageMethod: database.ProvisionerStorageMethodFile, + FileID: file.ID, + Type: database.ProvisionerJobTypeWorkspaceBuild, + Input: []byte("{}"), + }) + currentWorkspaceBuild = dbgen.WorkspaceBuild(t, db, database.WorkspaceBuild{ + WorkspaceID: workspace.ID, + TemplateVersionID: templateVersion.ID, + BuildNumber: 1, + JobID: currentWorkspaceBuildJob.ID, + // Should not be overridden. 
+ ProvisionerState: expectedWorkspaceBuildState, + }) + ) + + t.Log("current job ID: ", currentWorkspaceBuildJob.ID) + + detector := jobreaper.New(ctx, wrapDBAuthz(db, log), pubsub, log, tickCh).WithStatsChannel(statsCh) + detector.Start() + tickCh <- now + + stats := <-statsCh + require.NoError(t, stats.Error) + require.Len(t, stats.TerminatedJobIDs, 1) + require.Equal(t, currentWorkspaceBuildJob.ID, stats.TerminatedJobIDs[0]) + + // Check that the current provisioner job was updated. + job, err := db.GetProvisionerJobByID(ctx, currentWorkspaceBuildJob.ID) + require.NoError(t, err) + require.WithinDuration(t, now, job.UpdatedAt, 30*time.Second) + require.True(t, job.CompletedAt.Valid) + require.WithinDuration(t, now, job.CompletedAt.Time, 30*time.Second) + require.True(t, job.StartedAt.Valid) + require.WithinDuration(t, now, job.StartedAt.Time, 30*time.Second) + require.True(t, job.Error.Valid) + require.Contains(t, job.Error.String, "Build has been detected as pending") + require.False(t, job.ErrorCode.Valid) + + // Check that the provisioner state was NOT updated. 
+ build, err := db.GetWorkspaceBuildByID(ctx, currentWorkspaceBuild.ID) + require.NoError(t, err) + require.Equal(t, expectedWorkspaceBuildState, build.ProvisionerState) + + detector.Close() + detector.Wait() +} + func TestDetectorHungOtherJobTypes(t *testing.T) { t.Parallel() @@ -447,7 +541,7 @@ func TestDetectorHungOtherJobTypes(t *testing.T) { db, pubsub = dbtestutil.NewDB(t) log = testutil.Logger(t) tickCh = make(chan time.Time) - statsCh = make(chan unhanger.Stats) + statsCh = make(chan jobreaper.Stats) ) var ( @@ -509,7 +603,7 @@ func TestDetectorHungOtherJobTypes(t *testing.T) { t.Log("template import job ID: ", templateImportJob.ID) t.Log("template dry-run job ID: ", templateDryRunJob.ID) - detector := unhanger.New(ctx, wrapDBAuthz(db, log), pubsub, log, tickCh).WithStatsChannel(statsCh) + detector := jobreaper.New(ctx, wrapDBAuthz(db, log), pubsub, log, tickCh).WithStatsChannel(statsCh) detector.Start() tickCh <- now @@ -543,6 +637,113 @@ func TestDetectorHungOtherJobTypes(t *testing.T) { detector.Wait() } +func TestDetectorPendingOtherJobTypes(t *testing.T) { + t.Parallel() + + var ( + ctx = testutil.Context(t, testutil.WaitLong) + db, pubsub = dbtestutil.NewDB(t) + log = testutil.Logger(t) + tickCh = make(chan time.Time) + statsCh = make(chan jobreaper.Stats) + ) + + var ( + now = time.Now() + thirtyFiveMinAgo = now.Add(-time.Minute * 35) + org = dbgen.Organization(t, db, database.Organization{}) + user = dbgen.User(t, db, database.User{}) + file = dbgen.File(t, db, database.File{}) + + // Template import job. 
+ templateImportJob = dbgen.ProvisionerJob(t, db, pubsub, database.ProvisionerJob{ + CreatedAt: thirtyFiveMinAgo, + UpdatedAt: thirtyFiveMinAgo, + StartedAt: sql.NullTime{ + Time: time.Time{}, + Valid: false, + }, + OrganizationID: org.ID, + InitiatorID: user.ID, + Provisioner: database.ProvisionerTypeEcho, + StorageMethod: database.ProvisionerStorageMethodFile, + FileID: file.ID, + Type: database.ProvisionerJobTypeTemplateVersionImport, + Input: []byte("{}"), + }) + _ = dbgen.TemplateVersion(t, db, database.TemplateVersion{ + OrganizationID: org.ID, + JobID: templateImportJob.ID, + CreatedBy: user.ID, + }) + ) + + // Template dry-run job. + dryRunVersion := dbgen.TemplateVersion(t, db, database.TemplateVersion{ + OrganizationID: org.ID, + CreatedBy: user.ID, + }) + input, err := json.Marshal(provisionerdserver.TemplateVersionDryRunJob{ + TemplateVersionID: dryRunVersion.ID, + }) + require.NoError(t, err) + templateDryRunJob := dbgen.ProvisionerJob(t, db, pubsub, database.ProvisionerJob{ + CreatedAt: thirtyFiveMinAgo, + UpdatedAt: thirtyFiveMinAgo, + StartedAt: sql.NullTime{ + Time: time.Time{}, + Valid: false, + }, + OrganizationID: org.ID, + InitiatorID: user.ID, + Provisioner: database.ProvisionerTypeEcho, + StorageMethod: database.ProvisionerStorageMethodFile, + FileID: file.ID, + Type: database.ProvisionerJobTypeTemplateVersionDryRun, + Input: input, + }) + + t.Log("template import job ID: ", templateImportJob.ID) + t.Log("template dry-run job ID: ", templateDryRunJob.ID) + + detector := jobreaper.New(ctx, wrapDBAuthz(db, log), pubsub, log, tickCh).WithStatsChannel(statsCh) + detector.Start() + tickCh <- now + + stats := <-statsCh + require.NoError(t, stats.Error) + require.Len(t, stats.TerminatedJobIDs, 2) + require.Contains(t, stats.TerminatedJobIDs, templateImportJob.ID) + require.Contains(t, stats.TerminatedJobIDs, templateDryRunJob.ID) + + // Check that the template import job was updated. 
+ job, err := db.GetProvisionerJobByID(ctx, templateImportJob.ID) + require.NoError(t, err) + require.WithinDuration(t, now, job.UpdatedAt, 30*time.Second) + require.True(t, job.CompletedAt.Valid) + require.WithinDuration(t, now, job.CompletedAt.Time, 30*time.Second) + require.True(t, job.StartedAt.Valid) + require.WithinDuration(t, now, job.StartedAt.Time, 30*time.Second) + require.True(t, job.Error.Valid) + require.Contains(t, job.Error.String, "Build has been detected as pending") + require.False(t, job.ErrorCode.Valid) + + // Check that the template dry-run job was updated. + job, err = db.GetProvisionerJobByID(ctx, templateDryRunJob.ID) + require.NoError(t, err) + require.WithinDuration(t, now, job.UpdatedAt, 30*time.Second) + require.True(t, job.CompletedAt.Valid) + require.WithinDuration(t, now, job.CompletedAt.Time, 30*time.Second) + require.True(t, job.StartedAt.Valid) + require.WithinDuration(t, now, job.StartedAt.Time, 30*time.Second) + require.True(t, job.Error.Valid) + require.Contains(t, job.Error.String, "Build has been detected as pending") + require.False(t, job.ErrorCode.Valid) + + detector.Close() + detector.Wait() +} + func TestDetectorHungCanceledJob(t *testing.T) { t.Parallel() @@ -551,7 +752,7 @@ func TestDetectorHungCanceledJob(t *testing.T) { db, pubsub = dbtestutil.NewDB(t) log = testutil.Logger(t) tickCh = make(chan time.Time) - statsCh = make(chan unhanger.Stats) + statsCh = make(chan jobreaper.Stats) ) var ( @@ -591,7 +792,7 @@ func TestDetectorHungCanceledJob(t *testing.T) { t.Log("template import job ID: ", templateImportJob.ID) - detector := unhanger.New(ctx, wrapDBAuthz(db, log), pubsub, log, tickCh).WithStatsChannel(statsCh) + detector := jobreaper.New(ctx, wrapDBAuthz(db, log), pubsub, log, tickCh).WithStatsChannel(statsCh) detector.Start() tickCh <- now @@ -653,7 +854,7 @@ func TestDetectorPushesLogs(t *testing.T) { db, pubsub = dbtestutil.NewDB(t) log = testutil.Logger(t) tickCh = make(chan time.Time) - statsCh = make(chan 
unhanger.Stats) + statsCh = make(chan jobreaper.Stats) ) var ( @@ -706,7 +907,7 @@ func TestDetectorPushesLogs(t *testing.T) { require.Len(t, logs, 10) } - detector := unhanger.New(ctx, wrapDBAuthz(db, log), pubsub, log, tickCh).WithStatsChannel(statsCh) + detector := jobreaper.New(ctx, wrapDBAuthz(db, log), pubsub, log, tickCh).WithStatsChannel(statsCh) detector.Start() // Create pubsub subscription to listen for new log events. @@ -741,12 +942,19 @@ func TestDetectorPushesLogs(t *testing.T) { CreatedAfter: after, }) require.NoError(t, err) - require.Len(t, logs, len(unhanger.HungJobLogMessages)) + threshold := jobreaper.HungJobDuration + jobType := jobreaper.Hung + if templateImportJob.JobStatus == database.ProvisionerJobStatusPending { + threshold = jobreaper.PendingJobDuration + jobType = jobreaper.Pending + } + expectedLogs := jobreaper.JobLogMessages(jobType, threshold) + require.Len(t, logs, len(expectedLogs)) for i, log := range logs { assert.Equal(t, database.LogLevelError, log.Level) assert.Equal(t, c.expectStage, log.Stage) assert.Equal(t, database.LogSourceProvisionerDaemon, log.Source) - assert.Equal(t, unhanger.HungJobLogMessages[i], log.Output) + assert.Equal(t, expectedLogs[i], log.Output) } // Double check the full log count. @@ -755,7 +963,7 @@ func TestDetectorPushesLogs(t *testing.T) { CreatedAfter: 0, }) require.NoError(t, err) - require.Len(t, logs, c.preLogCount+len(unhanger.HungJobLogMessages)) + require.Len(t, logs, c.preLogCount+len(expectedLogs)) detector.Close() detector.Wait() @@ -771,15 +979,15 @@ func TestDetectorMaxJobsPerRun(t *testing.T) { db, pubsub = dbtestutil.NewDB(t) log = testutil.Logger(t) tickCh = make(chan time.Time) - statsCh = make(chan unhanger.Stats) + statsCh = make(chan jobreaper.Stats) org = dbgen.Organization(t, db, database.Organization{}) user = dbgen.User(t, db, database.User{}) file = dbgen.File(t, db, database.File{}) ) - // Create unhanger.MaxJobsPerRun + 1 hung jobs. + // Create MaxJobsPerRun + 1 hung jobs. 
now := time.Now() - for i := 0; i < unhanger.MaxJobsPerRun+1; i++ { + for i := 0; i < jobreaper.MaxJobsPerRun+1; i++ { pj := dbgen.ProvisionerJob(t, db, pubsub, database.ProvisionerJob{ CreatedAt: now.Add(-time.Hour), UpdatedAt: now.Add(-time.Hour), @@ -802,14 +1010,14 @@ func TestDetectorMaxJobsPerRun(t *testing.T) { }) } - detector := unhanger.New(ctx, wrapDBAuthz(db, log), pubsub, log, tickCh).WithStatsChannel(statsCh) + detector := jobreaper.New(ctx, wrapDBAuthz(db, log), pubsub, log, tickCh).WithStatsChannel(statsCh) detector.Start() tickCh <- now - // Make sure that only unhanger.MaxJobsPerRun jobs are terminated. + // Make sure that only MaxJobsPerRun jobs are terminated. stats := <-statsCh require.NoError(t, stats.Error) - require.Len(t, stats.TerminatedJobIDs, unhanger.MaxJobsPerRun) + require.Len(t, stats.TerminatedJobIDs, jobreaper.MaxJobsPerRun) // Run the detector again and make sure that only the remaining job is // terminated. @@ -823,7 +1031,7 @@ func TestDetectorMaxJobsPerRun(t *testing.T) { } // wrapDBAuthz adds our Authorization/RBAC around the given database store, to -// ensure the unhanger has the right permissions to do its work. +// ensure the reaper has the right permissions to do its work. func wrapDBAuthz(db database.Store, logger slog.Logger) database.Store { return dbauthz.New( db, diff --git a/coderd/notifications/events.go b/coderd/notifications/events.go index 35d9925055da5..0e88361b56f68 100644 --- a/coderd/notifications/events.go +++ b/coderd/notifications/events.go @@ -42,6 +42,11 @@ var ( TemplateWorkspaceResourceReplaced = uuid.MustParse("89d9745a-816e-4695-a17f-3d0a229e2b8d") ) +// Prebuilds-related events +var ( + PrebuildFailureLimitReached = uuid.MustParse("414d9331-c1fc-4761-b40c-d1f4702279eb") +) + // Notification-related events. 
var ( TemplateTestNotification = uuid.MustParse("c425f63e-716a-4bf4-ae24-78348f706c3f") diff --git a/coderd/notifications/notifications_test.go b/coderd/notifications/notifications_test.go index 8f8a3c82441e0..fab87af41deb9 100644 --- a/coderd/notifications/notifications_test.go +++ b/coderd/notifications/notifications_test.go @@ -1250,6 +1250,22 @@ func TestNotificationTemplates_Golden(t *testing.T) { }, }, }, + { + name: "PrebuildFailureLimitReached", + id: notifications.PrebuildFailureLimitReached, + payload: types.MessagePayload{ + UserName: "Bobby", + UserEmail: "bobby@coder.com", + UserUsername: "bobby", + Labels: map[string]string{ + "org": "cern", + "template": "docker", + "template_version": "angry_torvalds", + "preset": "particle-accelerator", + }, + Data: map[string]any{}, + }, + }, } // We must have a test case for every notification_template. This is enforced below: diff --git a/coderd/notifications/testdata/rendered-templates/smtp/PrebuildFailureLimitReached.html.golden b/coderd/notifications/testdata/rendered-templates/smtp/PrebuildFailureLimitReached.html.golden new file mode 100644 index 0000000000000..69f13b86ca71c --- /dev/null +++ b/coderd/notifications/testdata/rendered-templates/smtp/PrebuildFailureLimitReached.html.golden @@ -0,0 +1,112 @@ +From: system@coder.com +To: bobby@coder.com +Subject: There is a problem creating prebuilt workspaces +Message-Id: 02ee4935-73be-4fa1-a290-ff9999026b13@blush-whale-48 +Date: Fri, 11 Oct 2024 09:03:06 +0000 +Content-Type: multipart/alternative; boundary=bbe61b741255b6098bb6b3c1f41b885773df633cb18d2a3002b68e4bc9c4 +MIME-Version: 1.0 + +--bbe61b741255b6098bb6b3c1f41b885773df633cb18d2a3002b68e4bc9c4 +Content-Transfer-Encoding: quoted-printable +Content-Type: text/plain; charset=UTF-8 + +Hi Bobby, + +The number of failed prebuild attempts has reached the hard limit for templ= +ate docker and preset particle-accelerator. + +To resume prebuilds, fix the underlying issue and upload a new template ver= +sion. 
+
+Refer to the documentation for more details:
+
+Troubleshooting templates (https://coder.com/docs/admin/templates/troublesh=
+ooting)
+Troubleshooting of prebuilt workspaces (https://coder.com/docs/admin/templa=
+tes/extending-templates/prebuilt-workspaces#administration-and-troubleshoot=
+ing)
+
+
+View failed prebuilt workspaces: http://test.com/workspaces?filter=3Downer:=
+prebuilds+status:failed+template:docker
+
+View template version: http://test.com/templates/cern/docker/versions/angry=
+_torvalds
+
+--bbe61b741255b6098bb6b3c1f41b885773df633cb18d2a3002b68e4bc9c4
+Content-Transfer-Encoding: quoted-printable
+Content-Type: text/html; charset=UTF-8
+
+
+
+
+
+
+ There is a problem creating prebuilt workspaces
+
+
+
+
+ 3D"Cod= +
+

+ There is a problem creating prebuilt workspaces +

+
+

Hi Bobby,

+

The number of failed prebuild attempts has reached the hard limi= +t for template docker and preset particle-accelera= +tor.

+ +

To resume prebuilds, fix the underlying issue and upload a new template = +version.

+ +

Refer to the documentation for more details:
+- Troubl= +eshooting templates
+- Troubleshooting of pre= +built workspaces

+
+ + +
+
+
+
+--bbe61b741255b6098bb6b3c1f41b885773df633cb18d2a3002b68e4bc9c4--
diff --git a/coderd/notifications/testdata/rendered-templates/webhook/PrebuildFailureLimitReached.json.golden b/coderd/notifications/testdata/rendered-templates/webhook/PrebuildFailureLimitReached.json.golden
new file mode 100644
index 0000000000000..0a6e262ff7512
--- /dev/null
+++ b/coderd/notifications/testdata/rendered-templates/webhook/PrebuildFailureLimitReached.json.golden
@@ -0,0 +1,35 @@
+{
+	"_version": "1.1",
+	"msg_id": "00000000-0000-0000-0000-000000000000",
+	"payload": {
+		"_version": "1.2",
+		"notification_name": "Prebuild Failure Limit Reached",
+		"notification_template_id": "00000000-0000-0000-0000-000000000000",
+		"user_id": "00000000-0000-0000-0000-000000000000",
+		"user_email": "bobby@coder.com",
+		"user_name": "Bobby",
+		"user_username": "bobby",
+		"actions": [
+			{
+				"label": "View failed prebuilt workspaces",
+				"url": "http://test.com/workspaces?filter=owner:prebuilds+status:failed+template:docker"
+			},
+			{
+				"label": "View template version",
+				"url": "http://test.com/templates/cern/docker/versions/angry_torvalds"
+			}
+		],
+		"labels": {
+			"org": "cern",
+			"preset": "particle-accelerator",
+			"template": "docker",
+			"template_version": "angry_torvalds"
+		},
+		"data": {},
+		"targets": null
+	},
+	"title": "There is a problem creating prebuilt workspaces",
+	"title_markdown": "There is a problem creating prebuilt workspaces",
+	"body": "The number of failed prebuild attempts has reached the hard limit for template docker and preset particle-accelerator.\n\nTo resume prebuilds, fix the underlying issue and upload a new template version.\n\nRefer to the documentation for more details:\n\nTroubleshooting templates (https://coder.com/docs/admin/templates/troubleshooting)\nTroubleshooting of prebuilt workspaces (https://coder.com/docs/admin/templates/extending-templates/prebuilt-workspaces#administration-and-troubleshooting)",
+	"body_markdown": "\nThe number of failed prebuild attempts has reached the hard limit for template **docker** and preset **particle-accelerator**.\n\nTo resume prebuilds, fix the underlying issue and upload a new template version.\n\nRefer to the documentation for more details:\n- [Troubleshooting templates](https://coder.com/docs/admin/templates/troubleshooting)\n- [Troubleshooting of prebuilt workspaces](https://coder.com/docs/admin/templates/extending-templates/prebuilt-workspaces#administration-and-troubleshooting)\n"
+}
\ No newline at end of file
diff --git a/coderd/parameters.go b/coderd/parameters.go
index c3fc4ffdeeede..1a0c1f92ddbf9 100644
--- a/coderd/parameters.go
+++ b/coderd/parameters.go
@@ -12,13 +12,14 @@ import (
 	"golang.org/x/sync/errgroup"
 	"golang.org/x/xerrors"
 
-	"github.com/coder/coder/v2/apiversion"
 	"github.com/coder/coder/v2/coderd/database"
+	"github.com/coder/coder/v2/coderd/database/db2sdk"
 	"github.com/coder/coder/v2/coderd/database/dbauthz"
 	"github.com/coder/coder/v2/coderd/files"
 	"github.com/coder/coder/v2/coderd/httpapi"
 	"github.com/coder/coder/v2/coderd/httpmw"
 	"github.com/coder/coder/v2/coderd/util/ptr"
+	"github.com/coder/coder/v2/coderd/wsbuilder"
 	"github.com/coder/coder/v2/codersdk"
 	"github.com/coder/coder/v2/codersdk/wsjson"
 	sdkproto "github.com/coder/coder/v2/provisionersdk/proto"
@@ -69,13 +70,10 @@ func (api *API) templateVersionDynamicParameters(rw http.ResponseWriter, r *http
 		return
 	}
 
-	major, minor, err := apiversion.Parse(tf.ProvisionerdVersion)
-	// If the api version is not valid or less than 1.5, we need to use the static parameters
-	useStaticParams := err != nil || major < 1 || (major == 1 && minor < 6)
-	if useStaticParams {
-		api.handleStaticParameters(rw, r, templateVersion.ID)
-	} else {
+	if wsbuilder.ProvisionerVersionSupportsDynamicParameters(tf.ProvisionerdVersion) {
 		api.handleDynamicParameters(rw, r, tf, templateVersion)
+	} else {
+		api.handleStaticParameters(rw, r, templateVersion.ID)
 	}
 }
 
@@ -289,10 +287,10 @@ func (api *API) handleParameterWebsocket(rw http.ResponseWriter, r *http.Request
 	result, diagnostics := render(ctx, map[string]string{})
 	response := codersdk.DynamicParametersResponse{
 		ID:          -1, // Always start with -1.
-		Diagnostics: previewtypes.Diagnostics(diagnostics),
+		Diagnostics: db2sdk.HCLDiagnostics(diagnostics),
 	}
 	if result != nil {
-		response.Parameters = result.Parameters
+		response.Parameters = db2sdk.List(result.Parameters, db2sdk.PreviewParameter)
 	}
 	err = stream.Send(response)
 	if err != nil {
@@ -317,10 +315,10 @@ func (api *API) handleParameterWebsocket(rw http.ResponseWriter, r *http.Request
 			result, diagnostics := render(ctx, update.Inputs)
 			response := codersdk.DynamicParametersResponse{
 				ID:          update.ID,
-				Diagnostics: previewtypes.Diagnostics(diagnostics),
+				Diagnostics: db2sdk.HCLDiagnostics(diagnostics),
 			}
 			if result != nil {
-				response.Parameters = result.Parameters
+				response.Parameters = db2sdk.List(result.Parameters, db2sdk.PreviewParameter)
 			}
 			err = stream.Send(response)
 			if err != nil {
diff --git a/coderd/parameters_test.go b/coderd/parameters_test.go
index e7fc77f141efc..8edadc9b7e797 100644
--- a/coderd/parameters_test.go
+++ b/coderd/parameters_test.go
@@ -68,8 +68,8 @@ func TestDynamicParametersOwnerSSHPublicKey(t *testing.T) {
 	require.Equal(t, -1, preview.ID)
 	require.Empty(t, preview.Diagnostics)
 	require.Equal(t, "public_key", preview.Parameters[0].Name)
-	require.True(t, preview.Parameters[0].Value.Valid())
-	require.Equal(t, sshKey.PublicKey, preview.Parameters[0].Value.Value.AsString())
+	require.True(t, preview.Parameters[0].Value.Valid)
+	require.Equal(t, sshKey.PublicKey, preview.Parameters[0].Value.Value)
 }
 
 func TestDynamicParametersWithTerraformValues(t *testing.T) {
@@ -103,8 +103,8 @@ func TestDynamicParametersWithTerraformValues(t *testing.T) {
 
 		require.Len(t, preview.Parameters, 1)
 		require.Equal(t, "jetbrains_ide", preview.Parameters[0].Name)
-		require.True(t, preview.Parameters[0].Value.Valid())
-		require.Equal(t, "CL", preview.Parameters[0].Value.AsString())
+		require.True(t, preview.Parameters[0].Value.Valid)
+		require.Equal(t, "CL", preview.Parameters[0].Value.Value)
 	})
 
 	// OldProvisioners use the static parameters in the dynamic param flow
@@ -154,8 +154,8 @@ func TestDynamicParametersWithTerraformValues(t *testing.T) {
 		require.Contains(t, preview.Diagnostics[0].Summary, "required metadata to support dynamic parameters")
 		require.Len(t, preview.Parameters, 1)
 		require.Equal(t, "jetbrains_ide", preview.Parameters[0].Name)
-		require.True(t, preview.Parameters[0].Value.Valid())
-		require.Equal(t, defaultValue, preview.Parameters[0].Value.AsString())
+		require.True(t, preview.Parameters[0].Value.Valid)
+		require.Equal(t, defaultValue, preview.Parameters[0].Value.Value)
 
 		// Test some inputs
 		for _, exp := range []string{defaultValue, "GO", "Invalid", defaultValue} {
@@ -182,8 +182,8 @@ func TestDynamicParametersWithTerraformValues(t *testing.T) {
 				require.Len(t, preview.Parameters[0].Diagnostics, 0)
 			}
 			require.Equal(t, "jetbrains_ide", preview.Parameters[0].Name)
-			require.True(t, preview.Parameters[0].Value.Valid())
-			require.Equal(t, exp, preview.Parameters[0].Value.AsString())
+			require.True(t, preview.Parameters[0].Value.Valid)
+			require.Equal(t, exp, preview.Parameters[0].Value.Value)
 		}
 	})
 
diff --git a/coderd/prebuilds/global_snapshot.go b/coderd/prebuilds/global_snapshot.go
index 0cf3fa3facc3a..9110f57574e7b 100644
--- a/coderd/prebuilds/global_snapshot.go
+++ b/coderd/prebuilds/global_snapshot.go
@@ -14,6 +14,7 @@ type GlobalSnapshot struct {
 	RunningPrebuilds    []database.GetRunningPrebuiltWorkspacesRow
 	PrebuildsInProgress []database.CountInProgressPrebuildsRow
 	Backoffs            []database.GetPresetsBackoffRow
+	HardLimitedPresets  []database.GetPresetsAtFailureLimitRow
 }
 
 func NewGlobalSnapshot(
@@ -21,12 +22,14 @@ func NewGlobalSnapshot(
 	runningPrebuilds []database.GetRunningPrebuiltWorkspacesRow,
 	prebuildsInProgress []database.CountInProgressPrebuildsRow,
 	backoffs []database.GetPresetsBackoffRow,
+	hardLimitedPresets []database.GetPresetsAtFailureLimitRow,
 ) GlobalSnapshot {
 	return GlobalSnapshot{
 		Presets:             presets,
 		RunningPrebuilds:    runningPrebuilds,
 		PrebuildsInProgress: prebuildsInProgress,
 		Backoffs:            backoffs,
+		HardLimitedPresets:  hardLimitedPresets,
 	}
 }
 
@@ -57,10 +60,15 @@ func (s GlobalSnapshot) FilterByPreset(presetID uuid.UUID) (*PresetSnapshot, err
 		backoffPtr = &backoff
 	}
 
+	_, isHardLimited := slice.Find(s.HardLimitedPresets, func(row database.GetPresetsAtFailureLimitRow) bool {
+		return row.PresetID == preset.ID
+	})
+
 	return &PresetSnapshot{
-		Preset:     preset,
-		Running:    running,
-		InProgress: inProgress,
-		Backoff:    backoffPtr,
+		Preset:        preset,
+		Running:       running,
+		InProgress:    inProgress,
+		Backoff:       backoffPtr,
+		IsHardLimited: isHardLimited,
 	}, nil
 }
diff --git a/coderd/prebuilds/preset_snapshot.go b/coderd/prebuilds/preset_snapshot.go
index 8441a350187d2..40e77de5ab3e3 100644
--- a/coderd/prebuilds/preset_snapshot.go
+++ b/coderd/prebuilds/preset_snapshot.go
@@ -32,10 +32,11 @@ const (
 // It contains the raw data needed to calculate the current state of a preset's prebuilds,
 // including running prebuilds, in-progress builds, and backoff information.
 type PresetSnapshot struct {
-	Preset     database.GetTemplatePresetsWithPrebuildsRow
-	Running    []database.GetRunningPrebuiltWorkspacesRow
-	InProgress []database.CountInProgressPrebuildsRow
-	Backoff    *database.GetPresetsBackoffRow
+	Preset        database.GetTemplatePresetsWithPrebuildsRow
+	Running       []database.GetRunningPrebuiltWorkspacesRow
+	InProgress    []database.CountInProgressPrebuildsRow
+	Backoff       *database.GetPresetsBackoffRow
+	IsHardLimited bool
 }
 
 // ReconciliationState represents the processed state of a preset's prebuilds,
diff --git a/coderd/prebuilds/preset_snapshot_test.go b/coderd/prebuilds/preset_snapshot_test.go
index a5acb40e5311f..2febf1d13ec91 100644
--- a/coderd/prebuilds/preset_snapshot_test.go
+++ b/coderd/prebuilds/preset_snapshot_test.go
@@ -73,7 +73,7 @@ func TestNoPrebuilds(t *testing.T) {
 		preset(true, 0, current),
 	}
 
-	snapshot := prebuilds.NewGlobalSnapshot(presets, nil, nil, nil)
+	snapshot := prebuilds.NewGlobalSnapshot(presets, nil, nil, nil, nil)
 	ps, err := snapshot.FilterByPreset(current.presetID)
 	require.NoError(t, err)
 
@@ -98,7 +98,7 @@ func TestNetNew(t *testing.T) {
 		preset(true, 1, current),
 	}
 
-	snapshot := prebuilds.NewGlobalSnapshot(presets, nil, nil, nil)
+	snapshot := prebuilds.NewGlobalSnapshot(presets, nil, nil, nil, nil)
 	ps, err := snapshot.FilterByPreset(current.presetID)
 	require.NoError(t, err)
 
@@ -138,7 +138,7 @@ func TestOutdatedPrebuilds(t *testing.T) {
 	var inProgress []database.CountInProgressPrebuildsRow
 
 	// WHEN: calculating the outdated preset's state.
-	snapshot := prebuilds.NewGlobalSnapshot(presets, running, inProgress, nil)
+	snapshot := prebuilds.NewGlobalSnapshot(presets, running, inProgress, nil, nil)
 	ps, err := snapshot.FilterByPreset(outdated.presetID)
 	require.NoError(t, err)
 
@@ -200,7 +200,7 @@ func TestDeleteOutdatedPrebuilds(t *testing.T) {
 	}
 
 	// WHEN: calculating the outdated preset's state.
-	snapshot := prebuilds.NewGlobalSnapshot(presets, running, inProgress, nil)
+	snapshot := prebuilds.NewGlobalSnapshot(presets, running, inProgress, nil, nil)
 	ps, err := snapshot.FilterByPreset(outdated.presetID)
 	require.NoError(t, err)
 
@@ -442,7 +442,7 @@ func TestInProgressActions(t *testing.T) {
 			}
 
 			// WHEN: calculating the current preset's state.
-			snapshot := prebuilds.NewGlobalSnapshot(presets, running, inProgress, nil)
+			snapshot := prebuilds.NewGlobalSnapshot(presets, running, inProgress, nil, nil)
 			ps, err := snapshot.FilterByPreset(current.presetID)
 			require.NoError(t, err)
 
@@ -485,7 +485,7 @@ func TestExtraneous(t *testing.T) {
 	var inProgress []database.CountInProgressPrebuildsRow
 
 	// WHEN: calculating the current preset's state.
-	snapshot := prebuilds.NewGlobalSnapshot(presets, running, inProgress, nil)
+	snapshot := prebuilds.NewGlobalSnapshot(presets, running, inProgress, nil, nil)
 	ps, err := snapshot.FilterByPreset(current.presetID)
 	require.NoError(t, err)
 
@@ -525,7 +525,7 @@ func TestDeprecated(t *testing.T) {
 	var inProgress []database.CountInProgressPrebuildsRow
 
 	// WHEN: calculating the current preset's state.
-	snapshot := prebuilds.NewGlobalSnapshot(presets, running, inProgress, nil)
+	snapshot := prebuilds.NewGlobalSnapshot(presets, running, inProgress, nil, nil)
 	ps, err := snapshot.FilterByPreset(current.presetID)
 	require.NoError(t, err)
 
@@ -576,7 +576,7 @@ func TestLatestBuildFailed(t *testing.T) {
 	}
 
 	// WHEN: calculating the current preset's state.
-	snapshot := prebuilds.NewGlobalSnapshot(presets, running, inProgress, backoffs)
+	snapshot := prebuilds.NewGlobalSnapshot(presets, running, inProgress, backoffs, nil)
 	psCurrent, err := snapshot.FilterByPreset(current.presetID)
 	require.NoError(t, err)
 
@@ -669,7 +669,7 @@ func TestMultiplePresetsPerTemplateVersion(t *testing.T) {
 		},
 	}
 
-	snapshot := prebuilds.NewGlobalSnapshot(presets, nil, inProgress, nil)
+	snapshot := prebuilds.NewGlobalSnapshot(presets, nil, inProgress, nil, nil)
 
 	// Nothing has to be created for preset 1.
 	{
diff --git a/coderd/provisionerdserver/provisionerdserver.go b/coderd/provisionerdserver/provisionerdserver.go
index 423e9bbe584c6..9c4067137b852 100644
--- a/coderd/provisionerdserver/provisionerdserver.go
+++ b/coderd/provisionerdserver/provisionerdserver.go
@@ -1340,14 +1340,56 @@ func (s *server) CompleteJob(ctx context.Context, completed *proto.CompletedJob)
 
 	switch jobType := completed.Type.(type) {
 	case *proto.CompletedJob_TemplateImport_:
-		var input TemplateVersionImportJob
-		err = json.Unmarshal(job.Input, &input)
+		err = s.completeTemplateImportJob(ctx, job, jobID, jobType, telemetrySnapshot)
+		if err != nil {
+			return nil, err
+		}
+	case *proto.CompletedJob_WorkspaceBuild_:
+		err = s.completeWorkspaceBuildJob(ctx, job, jobID, jobType, telemetrySnapshot)
+		if err != nil {
+			return nil, err
+		}
+	case *proto.CompletedJob_TemplateDryRun_:
+		err = s.completeTemplateDryRunJob(ctx, job, jobID, jobType, telemetrySnapshot)
 		if err != nil {
-			return nil, xerrors.Errorf("template version ID is expected: %w", err)
+			return nil, err
+		}
+	default:
+		if completed.Type == nil {
+			return nil, xerrors.Errorf("type payload must be provided")
 		}
+		return nil, xerrors.Errorf("unknown job type %q; ensure coderd and provisionerd versions match",
+			reflect.TypeOf(completed.Type).String())
+	}
+
+	data, err := json.Marshal(provisionersdk.ProvisionerJobLogsNotifyMessage{EndOfLogs: true})
+	if err != nil {
+		return nil, xerrors.Errorf("marshal job log: %w", err)
+	}
+	err = s.Pubsub.Publish(provisionersdk.ProvisionerJobLogsNotifyChannel(jobID), data)
+	if err != nil {
+		s.Logger.Error(ctx, "failed to publish end of job logs", slog.F("job_id", jobID), slog.Error(err))
+		return nil, xerrors.Errorf("publish end of job logs: %w", err)
+	}
 
+	s.Logger.Debug(ctx, "stage CompleteJob done", slog.F("job_id", jobID))
+	return &proto.Empty{}, nil
+}
+
+// completeTemplateImportJob handles completion of a template import job.
+// All database operations are performed within a transaction.
+func (s *server) completeTemplateImportJob(ctx context.Context, job database.ProvisionerJob, jobID uuid.UUID, jobType *proto.CompletedJob_TemplateImport_, telemetrySnapshot *telemetry.Snapshot) error {
+	var input TemplateVersionImportJob
+	err := json.Unmarshal(job.Input, &input)
+	if err != nil {
+		return xerrors.Errorf("template version ID is expected: %w", err)
+	}
+
+	// Execute all database operations in a transaction
+	return s.Database.InTx(func(db database.Store) error {
 		now := s.timeNow()
 
+		// Process resources
 		for transition, resources := range map[database.WorkspaceTransition][]*sdkproto.Resource{
 			database.WorkspaceTransitionStart: jobType.TemplateImport.StartResources,
 			database.WorkspaceTransitionStop:  jobType.TemplateImport.StopResources,
@@ -1359,11 +1401,13 @@ func (s *server) CompleteJob(ctx context.Context, completed *proto.CompletedJob)
 					slog.F("resource_type", resource.Type),
 					slog.F("transition", transition))
 
-				if err := InsertWorkspaceResource(ctx, s.Database, jobID, transition, resource, telemetrySnapshot); err != nil {
-					return nil, xerrors.Errorf("insert resource: %w", err)
+				if err := InsertWorkspaceResource(ctx, db, jobID, transition, resource, telemetrySnapshot); err != nil {
+					return xerrors.Errorf("insert resource: %w", err)
 				}
 			}
 		}
+
+		// Process modules
 		for transition, modules := range map[database.WorkspaceTransition][]*sdkproto.Module{
 			database.WorkspaceTransitionStart: jobType.TemplateImport.StartModules,
 			database.WorkspaceTransitionStop:  jobType.TemplateImport.StopModules,
@@ -1376,12 +1420,13 @@ func (s *server) CompleteJob(ctx context.Context, completed *proto.CompletedJob)
 					slog.F("module_key", module.Key),
 					slog.F("transition", transition))
 
-				if err := InsertWorkspaceModule(ctx, s.Database, jobID, transition, module, telemetrySnapshot); err != nil {
-					return nil, xerrors.Errorf("insert module: %w", err)
+				if err := InsertWorkspaceModule(ctx, db, jobID, transition, module, telemetrySnapshot); err != nil {
+					return xerrors.Errorf("insert module: %w", err)
 				}
 			}
 		}
 
+		// Process rich parameters
 		for _, richParameter := range jobType.TemplateImport.RichParameters {
 			s.Logger.Info(ctx, "inserting template import job parameter",
 				slog.F("job_id", job.ID.String()),
@@ -1391,7 +1436,7 @@ func (s *server) CompleteJob(ctx context.Context, completed *proto.CompletedJob)
 			)
 			options, err := json.Marshal(richParameter.Options)
 			if err != nil {
-				return nil, xerrors.Errorf("marshal parameter options: %w", err)
+				return xerrors.Errorf("marshal parameter options: %w", err)
 			}
 
 			var validationMin, validationMax sql.NullInt32
@@ -1408,7 +1453,7 @@ func (s *server) CompleteJob(ctx context.Context, completed *proto.CompletedJob)
 				}
 			}
 
-			_, err = s.Database.InsertTemplateVersionParameter(ctx, database.InsertTemplateVersionParameterParams{
+			_, err = db.InsertTemplateVersionParameter(ctx, database.InsertTemplateVersionParameterParams{
 				TemplateVersionID:   input.TemplateVersionID,
 				Name:                richParameter.Name,
 				DisplayName:         richParameter.DisplayName,
@@ -1428,15 +1473,17 @@ func (s *server) CompleteJob(ctx context.Context, completed *proto.CompletedJob)
 				Ephemeral:           richParameter.Ephemeral,
 			})
 			if err != nil {
-				return nil, xerrors.Errorf("insert parameter: %w", err)
+				return xerrors.Errorf("insert parameter: %w", err)
 			}
 		}
 
-		err = InsertWorkspacePresetsAndParameters(ctx, s.Logger, s.Database, jobID, input.TemplateVersionID, jobType.TemplateImport.Presets, now)
+		// Process presets and parameters
+		err := InsertWorkspacePresetsAndParameters(ctx, s.Logger, db, jobID, input.TemplateVersionID, jobType.TemplateImport.Presets, now)
 		if err != nil {
-			return nil, xerrors.Errorf("insert workspace presets and parameters: %w", err)
+			return xerrors.Errorf("insert workspace presets and parameters: %w", err)
 		}
 
+		// Process external auth providers
 		var completedError sql.NullString
 
 		for _, externalAuthProvider := range jobType.TemplateImport.ExternalAuthProviders {
@@ -1479,18 +1526,19 @@ func (s *server) CompleteJob(ctx context.Context, completed *proto.CompletedJob)
 
 		externalAuthProvidersMessage, err := json.Marshal(externalAuthProviders)
 		if err != nil {
-			return nil, xerrors.Errorf("failed to serialize external_auth_providers value: %w", err)
+			return xerrors.Errorf("failed to serialize external_auth_providers value: %w", err)
 		}
 
-		err = s.Database.UpdateTemplateVersionExternalAuthProvidersByJobID(ctx, database.UpdateTemplateVersionExternalAuthProvidersByJobIDParams{
+		err = db.UpdateTemplateVersionExternalAuthProvidersByJobID(ctx, database.UpdateTemplateVersionExternalAuthProvidersByJobIDParams{
 			JobID:                 jobID,
 			ExternalAuthProviders: externalAuthProvidersMessage,
 			UpdatedAt:             now,
 		})
 		if err != nil {
-			return nil, xerrors.Errorf("update template version external auth providers: %w", err)
+			return xerrors.Errorf("update template version external auth providers: %w", err)
 		}
 
+		// Process terraform values
 		plan := jobType.TemplateImport.Plan
 		moduleFiles := jobType.TemplateImport.ModuleFiles
 		// If there is a plan, or a module files archive we need to insert a
@@ -1509,7 +1557,7 @@ func (s *server) CompleteJob(ctx context.Context, completed *proto.CompletedJob)
 				hash := hex.EncodeToString(hashBytes[:])
 
 				// nolint:gocritic // Requires reading "system" files
-				file, err := s.Database.GetFileByHashAndCreator(dbauthz.AsSystemRestricted(ctx), database.GetFileByHashAndCreatorParams{Hash: hash, CreatedBy: uuid.Nil})
+				file, err := db.GetFileByHashAndCreator(dbauthz.AsSystemRestricted(ctx), database.GetFileByHashAndCreatorParams{Hash: hash, CreatedBy: uuid.Nil})
 				switch {
 				case err == nil:
 					// This set of modules is already cached, which means we can reuse them
@@ -1518,10 +1566,10 @@ func (s *server) CompleteJob(ctx context.Context, completed *proto.CompletedJob)
 						UUID:  file.ID,
 					}
 				case !xerrors.Is(err, sql.ErrNoRows):
-					return nil, xerrors.Errorf("check for cached modules: %w", err)
+					return xerrors.Errorf("check for cached modules: %w", err)
 				default:
 					// nolint:gocritic // Requires creating a "system" file
-					file, err = s.Database.InsertFile(dbauthz.AsSystemRestricted(ctx), database.InsertFileParams{
+					file, err = db.InsertFile(dbauthz.AsSystemRestricted(ctx), database.InsertFileParams{
 						ID:        uuid.New(),
 						Hash:      hash,
 						CreatedBy: uuid.Nil,
@@ -1530,7 +1578,7 @@ func (s *server) CompleteJob(ctx context.Context, completed *proto.CompletedJob)
 						Data:      moduleFiles,
 					})
 					if err != nil {
-						return nil, xerrors.Errorf("insert template version terraform modules: %w", err)
+						return xerrors.Errorf("insert template version terraform modules: %w", err)
 					}
 					fileID = uuid.NullUUID{
 						Valid: true,
@@ -1539,7 +1587,7 @@ func (s *server) CompleteJob(ctx context.Context, completed *proto.CompletedJob)
 				}
 			}
 
-			err = s.Database.InsertTemplateVersionTerraformValuesByJobID(ctx, database.InsertTemplateVersionTerraformValuesByJobIDParams{
+			err = db.InsertTemplateVersionTerraformValuesByJobID(ctx, database.InsertTemplateVersionTerraformValuesByJobIDParams{
 				JobID:               jobID,
 				UpdatedAt:           now,
 				CachedPlan:          plan,
@@ -1547,11 +1595,12 @@ func (s *server) CompleteJob(ctx context.Context, completed *proto.CompletedJob)
 				ProvisionerdVersion: s.apiVersion,
 			})
 			if err != nil {
-				return nil, xerrors.Errorf("insert template version terraform data: %w", err)
+				return xerrors.Errorf("insert template version terraform data: %w", err)
 			}
 		}
 
-		err = s.Database.UpdateProvisionerJobWithCompleteByID(ctx, database.UpdateProvisionerJobWithCompleteByIDParams{
+		// Mark job as completed
+		err = db.UpdateProvisionerJobWithCompleteByID(ctx, database.UpdateProvisionerJobWithCompleteByIDParams{
 			ID:        jobID,
 			UpdatedAt: now,
 			CompletedAt: sql.NullTime{
@@ -1562,206 +1611,136 @@ func (s *server) CompleteJob(ctx context.Context, completed *proto.CompletedJob)
 			ErrorCode: sql.NullString{},
 		})
 		if err != nil {
-			return nil, xerrors.Errorf("update provisioner job: %w", err)
+			return xerrors.Errorf("update provisioner job: %w", err)
 		}
 		s.Logger.Debug(ctx, "marked import job as completed", slog.F("job_id", jobID))
-	case *proto.CompletedJob_WorkspaceBuild_:
-		var input WorkspaceProvisionJob
-		err = json.Unmarshal(job.Input, &input)
-		if err != nil {
-			return nil, xerrors.Errorf("unmarshal job data: %w", err)
-		}
+		return nil
+	}, nil) // End of transaction
+}
 
-		workspaceBuild, err := s.Database.GetWorkspaceBuildByID(ctx, input.WorkspaceBuildID)
-		if err != nil {
-			return nil, xerrors.Errorf("get workspace build: %w", err)
-		}
+// completeWorkspaceBuildJob handles completion of a workspace build job.
+// Most database operations are performed within a transaction.
+func (s *server) completeWorkspaceBuildJob(ctx context.Context, job database.ProvisionerJob, jobID uuid.UUID, jobType *proto.CompletedJob_WorkspaceBuild_, telemetrySnapshot *telemetry.Snapshot) error {
+	var input WorkspaceProvisionJob
+	err := json.Unmarshal(job.Input, &input)
+	if err != nil {
+		return xerrors.Errorf("unmarshal job data: %w", err)
+	}
 
-		var workspace database.Workspace
-		var getWorkspaceError error
+	workspaceBuild, err := s.Database.GetWorkspaceBuildByID(ctx, input.WorkspaceBuildID)
+	if err != nil {
+		return xerrors.Errorf("get workspace build: %w", err)
+	}
 
-		err = s.Database.InTx(func(db database.Store) error {
-			// It's important we use s.timeNow() here because we want to be
-			// able to customize the current time from within tests.
-			now := s.timeNow()
-
-			workspace, getWorkspaceError = db.GetWorkspaceByID(ctx, workspaceBuild.WorkspaceID)
-			if getWorkspaceError != nil {
-				s.Logger.Error(ctx,
-					"fetch workspace for build",
-					slog.F("workspace_build_id", workspaceBuild.ID),
-					slog.F("workspace_id", workspaceBuild.WorkspaceID),
-				)
-				return getWorkspaceError
-			}
+	var workspace database.Workspace
+	var getWorkspaceError error
 
-			templateScheduleStore := *s.TemplateScheduleStore.Load()
+	// Execute all database modifications in a transaction
+	err = s.Database.InTx(func(db database.Store) error {
+		// It's important we use s.timeNow() here because we want to be
+		// able to customize the current time from within tests.
+		now := s.timeNow()
 
-			autoStop, err := schedule.CalculateAutostop(ctx, schedule.CalculateAutostopParams{
-				Database:                    db,
-				TemplateScheduleStore:       templateScheduleStore,
-				UserQuietHoursScheduleStore: *s.UserQuietHoursScheduleStore.Load(),
-				Now:                         now,
-				Workspace:                   workspace.WorkspaceTable(),
-				// Allowed to be the empty string.
-				WorkspaceAutostart: workspace.AutostartSchedule.String,
-			})
-			if err != nil {
-				return xerrors.Errorf("calculate auto stop: %w", err)
-			}
+		workspace, getWorkspaceError = db.GetWorkspaceByID(ctx, workspaceBuild.WorkspaceID)
+		if getWorkspaceError != nil {
+			s.Logger.Error(ctx,
+				"fetch workspace for build",
+				slog.F("workspace_build_id", workspaceBuild.ID),
+				slog.F("workspace_id", workspaceBuild.WorkspaceID),
+			)
+			return getWorkspaceError
+		}
 
-			if workspace.AutostartSchedule.Valid {
-				templateScheduleOptions, err := templateScheduleStore.Get(ctx, db, workspace.TemplateID)
-				if err != nil {
-					return xerrors.Errorf("get template schedule options: %w", err)
-				}
+		templateScheduleStore := *s.TemplateScheduleStore.Load()
 
-				nextStartAt, err := schedule.NextAllowedAutostart(now, workspace.AutostartSchedule.String, templateScheduleOptions)
-				if err == nil {
-					err = db.UpdateWorkspaceNextStartAt(ctx, database.UpdateWorkspaceNextStartAtParams{
-						ID:          workspace.ID,
-						NextStartAt: sql.NullTime{Valid: true, Time: nextStartAt.UTC()},
-					})
-					if err != nil {
-						return xerrors.Errorf("update workspace next start at: %w", err)
-					}
-				}
-			}
+		autoStop, err := schedule.CalculateAutostop(ctx, schedule.CalculateAutostopParams{
+			Database:                    db,
+			TemplateScheduleStore:       templateScheduleStore,
+			UserQuietHoursScheduleStore: *s.UserQuietHoursScheduleStore.Load(),
+			Now:                         now,
+			Workspace:                   workspace.WorkspaceTable(),
+			// Allowed to be the empty string.
+			WorkspaceAutostart: workspace.AutostartSchedule.String,
+		})
+		if err != nil {
+			return xerrors.Errorf("calculate auto stop: %w", err)
+		}
 
-			err = db.UpdateProvisionerJobWithCompleteByID(ctx, database.UpdateProvisionerJobWithCompleteByIDParams{
-				ID:        jobID,
-				UpdatedAt: now,
-				CompletedAt: sql.NullTime{
-					Time:  now,
-					Valid: true,
-				},
-				Error:     sql.NullString{},
-				ErrorCode: sql.NullString{},
-			})
+		if workspace.AutostartSchedule.Valid {
+			templateScheduleOptions, err := templateScheduleStore.Get(ctx, db, workspace.TemplateID)
 			if err != nil {
-				return xerrors.Errorf("update provisioner job: %w", err)
+				return xerrors.Errorf("get template schedule options: %w", err)
 			}
-			err = db.UpdateWorkspaceBuildProvisionerStateByID(ctx, database.UpdateWorkspaceBuildProvisionerStateByIDParams{
-				ID:               workspaceBuild.ID,
-				ProvisionerState: jobType.WorkspaceBuild.State,
-				UpdatedAt:        now,
-			})
-			if err != nil {
-				return xerrors.Errorf("update workspace build provisioner state: %w", err)
-			}
-			err = db.UpdateWorkspaceBuildDeadlineByID(ctx, database.UpdateWorkspaceBuildDeadlineByIDParams{
-				ID:          workspaceBuild.ID,
-				Deadline:    autoStop.Deadline,
-				MaxDeadline: autoStop.MaxDeadline,
-				UpdatedAt:   now,
-			})
-			if err != nil {
-				return xerrors.Errorf("update workspace build deadline: %w", err)
-			}
-
-			agentTimeouts := make(map[time.Duration]bool) // A set of agent timeouts.
-			// This could be a bulk insert to improve performance.
-			for _, protoResource := range jobType.WorkspaceBuild.Resources {
-				for _, protoAgent := range protoResource.Agents {
-					dur := time.Duration(protoAgent.GetConnectionTimeoutSeconds()) * time.Second
-					agentTimeouts[dur] = true
-				}
-				err = InsertWorkspaceResource(ctx, db, job.ID, workspaceBuild.Transition, protoResource, telemetrySnapshot)
 
+			nextStartAt, err := schedule.NextAllowedAutostart(now, workspace.AutostartSchedule.String, templateScheduleOptions)
+			if err == nil {
+				err = db.UpdateWorkspaceNextStartAt(ctx, database.UpdateWorkspaceNextStartAtParams{
+					ID:          workspace.ID,
+					NextStartAt: sql.NullTime{Valid: true, Time: nextStartAt.UTC()},
+				})
 				if err != nil {
-					return xerrors.Errorf("insert provisioner job: %w", err)
-				}
-			}
-			for _, module := range jobType.WorkspaceBuild.Modules {
-				if err := InsertWorkspaceModule(ctx, db, job.ID, workspaceBuild.Transition, module, telemetrySnapshot); err != nil {
-					return xerrors.Errorf("insert provisioner job module: %w", err)
+					return xerrors.Errorf("update workspace next start at: %w", err)
 				}
 			}
+		}
 
-			// On start, we want to ensure that workspace agents timeout statuses
-			// are propagated. This method is simple and does not protect against
-			// notifying in edge cases like when a workspace is stopped soon
-			// after being started.
-			//
-			// Agent timeouts could be minutes apart, resulting in an unresponsive
-			// experience, so we'll notify after every unique timeout seconds.
-			if !input.DryRun && workspaceBuild.Transition == database.WorkspaceTransitionStart && len(agentTimeouts) > 0 {
-				timeouts := maps.Keys(agentTimeouts)
-				slices.Sort(timeouts)
-
-				var updates []<-chan time.Time
-				for _, d := range timeouts {
-					s.Logger.Debug(ctx, "triggering workspace notification after agent timeout",
-						slog.F("workspace_build_id", workspaceBuild.ID),
-						slog.F("timeout", d),
-					)
-					// Agents are inserted with `dbtime.Now()`, this triggers a
-					// workspace event approximately after created + timeout seconds.
-					updates = append(updates, time.After(d))
-				}
-				go func() {
-					for _, wait := range updates {
-						select {
-						case <-s.lifecycleCtx.Done():
-							// If the server is shutting down, we don't want to wait around.
-							s.Logger.Debug(ctx, "stopping notifications due to server shutdown",
-								slog.F("workspace_build_id", workspaceBuild.ID),
-							)
-							return
-						case <-wait:
-							// Wait for the next potential timeout to occur.
-							msg, err := json.Marshal(wspubsub.WorkspaceEvent{
-								Kind:        wspubsub.WorkspaceEventKindAgentTimeout,
-								WorkspaceID: workspace.ID,
-							})
-							if err != nil {
-								s.Logger.Error(ctx, "marshal workspace update event", slog.Error(err))
-								break
-							}
-							if err := s.Pubsub.Publish(wspubsub.WorkspaceEventChannel(workspace.OwnerID), msg); err != nil {
-								if s.lifecycleCtx.Err() != nil {
-									// If the server is shutting down, we don't want to log this error, nor wait around.
-									s.Logger.Debug(ctx, "stopping notifications due to server shutdown",
-										slog.F("workspace_build_id", workspaceBuild.ID),
-									)
-									return
-								}
-								s.Logger.Error(ctx, "workspace notification after agent timeout failed",
-									slog.F("workspace_build_id", workspaceBuild.ID),
-									slog.Error(err),
-								)
-							}
-						}
-					}
-				}()
-			}
+		err = db.UpdateProvisionerJobWithCompleteByID(ctx, database.UpdateProvisionerJobWithCompleteByIDParams{
+			ID:        jobID,
+			UpdatedAt: now,
+			CompletedAt: sql.NullTime{
+				Time:  now,
+				Valid: true,
+			},
+			Error:     sql.NullString{},
+			ErrorCode: sql.NullString{},
+		})
+		if err != nil {
+			return xerrors.Errorf("update provisioner job: %w", err)
+		}
+		err = db.UpdateWorkspaceBuildProvisionerStateByID(ctx, database.UpdateWorkspaceBuildProvisionerStateByIDParams{
+			ID:               workspaceBuild.ID,
+			ProvisionerState: jobType.WorkspaceBuild.State,
+			UpdatedAt:        now,
+		})
+		if err != nil {
+			return xerrors.Errorf("update workspace build provisioner state: %w", err)
+		}
+		err = db.UpdateWorkspaceBuildDeadlineByID(ctx, database.UpdateWorkspaceBuildDeadlineByIDParams{
+			ID:          workspaceBuild.ID,
+			Deadline:    autoStop.Deadline,
+			MaxDeadline: autoStop.MaxDeadline,
+			UpdatedAt:   now,
+		})
+		if err != nil {
+			return xerrors.Errorf("update workspace build deadline: %w", err)
+		}
 
-			if workspaceBuild.Transition != database.WorkspaceTransitionDelete {
-				// This is for deleting a workspace!
-				return nil
+		agentTimeouts := make(map[time.Duration]bool) // A set of agent timeouts.
+		// This could be a bulk insert to improve performance.
+		for _, protoResource := range jobType.WorkspaceBuild.Resources {
+			for _, protoAgent := range protoResource.Agents {
+				dur := time.Duration(protoAgent.GetConnectionTimeoutSeconds()) * time.Second
+				agentTimeouts[dur] = true
 			}
-			err = db.UpdateWorkspaceDeletedByID(ctx, database.UpdateWorkspaceDeletedByIDParams{
-				ID:      workspaceBuild.WorkspaceID,
-				Deleted: true,
-			})
+			err = InsertWorkspaceResource(ctx, db, job.ID, workspaceBuild.Transition, protoResource, telemetrySnapshot)
 			if err != nil {
-				return xerrors.Errorf("update workspace deleted: %w", err)
+				return xerrors.Errorf("insert provisioner job: %w", err)
+			}
+		}
+		for _, module := range jobType.WorkspaceBuild.Modules {
+			if err := InsertWorkspaceModule(ctx, db, job.ID, workspaceBuild.Transition, module, telemetrySnapshot); err != nil {
+				return xerrors.Errorf("insert provisioner job module: %w", err)
 			}
-
-			return nil
-		}, nil)
-		if err != nil {
-			return nil, xerrors.Errorf("complete job: %w", err)
 		}
 
-		// Insert timings outside transaction since it is metadata.
+		// Insert timings inside the transaction now
 		// nolint:exhaustruct // The other fields are set further down.
params := database.InsertProvisionerJobTimingsParams{ JobID: jobID, } - for _, t := range completed.GetWorkspaceBuild().GetTimings() { + for _, t := range jobType.WorkspaceBuild.Timings { if t.Start == nil || t.End == nil { s.Logger.Warn(ctx, "timings entry has nil start or end time", slog.F("entry", t.String())) continue @@ -1780,153 +1759,229 @@ func (s *server) CompleteJob(ctx context.Context, completed *proto.CompletedJob) params.StartedAt = append(params.StartedAt, t.Start.AsTime()) params.EndedAt = append(params.EndedAt, t.End.AsTime()) } - _, err = s.Database.InsertProvisionerJobTimings(ctx, params) + _, err = db.InsertProvisionerJobTimings(ctx, params) if err != nil { - // Don't fail the transaction for non-critical data. + // Log error but don't fail the whole transaction for non-critical data s.Logger.Warn(ctx, "failed to update provisioner job timings", slog.F("job_id", jobID), slog.Error(err)) } - // audit the outcome of the workspace build - if getWorkspaceError == nil { - // If the workspace has been deleted, notify the owner about it. - if workspaceBuild.Transition == database.WorkspaceTransitionDelete { - s.notifyWorkspaceDeleted(ctx, workspace, workspaceBuild) - } + // On start, we want to ensure that workspace agents timeout statuses + // are propagated. This method is simple and does not protect against + // notifying in edge cases like when a workspace is stopped soon + // after being started. + // + // Agent timeouts could be minutes apart, resulting in an unresponsive + // experience, so we'll notify after every unique timeout seconds. 
+ if !input.DryRun && workspaceBuild.Transition == database.WorkspaceTransitionStart && len(agentTimeouts) > 0 { + timeouts := maps.Keys(agentTimeouts) + slices.Sort(timeouts) + + var updates []<-chan time.Time + for _, d := range timeouts { + s.Logger.Debug(ctx, "triggering workspace notification after agent timeout", + slog.F("workspace_build_id", workspaceBuild.ID), + slog.F("timeout", d), + ) + // Agents are inserted with `dbtime.Now()`, this triggers a + // workspace event approximately after created + timeout seconds. + updates = append(updates, time.After(d)) + } + go func() { + for _, wait := range updates { + select { + case <-s.lifecycleCtx.Done(): + // If the server is shutting down, we don't want to wait around. + s.Logger.Debug(ctx, "stopping notifications due to server shutdown", + slog.F("workspace_build_id", workspaceBuild.ID), + ) + return + case <-wait: + // Wait for the next potential timeout to occur. + msg, err := json.Marshal(wspubsub.WorkspaceEvent{ + Kind: wspubsub.WorkspaceEventKindAgentTimeout, + WorkspaceID: workspace.ID, + }) + if err != nil { + s.Logger.Error(ctx, "marshal workspace update event", slog.Error(err)) + break + } + if err := s.Pubsub.Publish(wspubsub.WorkspaceEventChannel(workspace.OwnerID), msg); err != nil { + if s.lifecycleCtx.Err() != nil { + // If the server is shutting down, we don't want to log this error, nor wait around. + s.Logger.Debug(ctx, "stopping notifications due to server shutdown", + slog.F("workspace_build_id", workspaceBuild.ID), + ) + return + } + s.Logger.Error(ctx, "workspace notification after agent timeout failed", + slog.F("workspace_build_id", workspaceBuild.ID), + slog.Error(err), + ) + } + } + } + }() + } - auditor := s.Auditor.Load() - auditAction := auditActionFromTransition(workspaceBuild.Transition) + if workspaceBuild.Transition != database.WorkspaceTransitionDelete { + // This is for deleting a workspace! 
+ return nil + } - previousBuildNumber := workspaceBuild.BuildNumber - 1 - previousBuild, prevBuildErr := s.Database.GetWorkspaceBuildByWorkspaceIDAndBuildNumber(ctx, database.GetWorkspaceBuildByWorkspaceIDAndBuildNumberParams{ - WorkspaceID: workspace.ID, - BuildNumber: previousBuildNumber, - }) - if prevBuildErr != nil { - previousBuild = database.WorkspaceBuild{} - } + err = db.UpdateWorkspaceDeletedByID(ctx, database.UpdateWorkspaceDeletedByIDParams{ + ID: workspaceBuild.WorkspaceID, + Deleted: true, + }) + if err != nil { + return xerrors.Errorf("update workspace deleted: %w", err) + } - // We pass the below information to the Auditor so that it - // can form a friendly string for the user to view in the UI. - buildResourceInfo := audit.AdditionalFields{ - WorkspaceName: workspace.Name, - BuildNumber: strconv.FormatInt(int64(workspaceBuild.BuildNumber), 10), - BuildReason: database.BuildReason(string(workspaceBuild.Reason)), - WorkspaceID: workspace.ID, - } + return nil + }, nil) + if err != nil { + return xerrors.Errorf("complete job: %w", err) + } - wriBytes, err := json.Marshal(buildResourceInfo) - if err != nil { - s.Logger.Error(ctx, "marshal resource info for successful job", slog.Error(err)) - } - - bag := audit.BaggageFromContext(ctx) - - audit.BackgroundAudit(ctx, &audit.BackgroundAuditParams[database.WorkspaceBuild]{ - Audit: *auditor, - Log: s.Logger, - UserID: job.InitiatorID, - OrganizationID: workspace.OrganizationID, - RequestID: job.ID, - IP: bag.IP, - Action: auditAction, - Old: previousBuild, - New: workspaceBuild, - Status: http.StatusOK, - AdditionalFields: wriBytes, - }) - } + // Post-transaction operations (operations that do not require transactions or + // are external to the database, like audit logging, notifications, etc.) - if s.PrebuildsOrchestrator != nil && input.PrebuiltWorkspaceBuildStage == sdkproto.PrebuiltWorkspaceBuildStage_CLAIM { - // Track resource replacements, if there are any. 
- orchestrator := s.PrebuildsOrchestrator.Load() - if resourceReplacements := completed.GetWorkspaceBuild().GetResourceReplacements(); orchestrator != nil && len(resourceReplacements) > 0 { - // Fire and forget. Bind to the lifecycle of the server so shutdowns are handled gracefully. - go (*orchestrator).TrackResourceReplacement(s.lifecycleCtx, workspace.ID, workspaceBuild.ID, resourceReplacements) - } + // audit the outcome of the workspace build + if getWorkspaceError == nil { + // If the workspace has been deleted, notify the owner about it. + if workspaceBuild.Transition == database.WorkspaceTransitionDelete { + s.notifyWorkspaceDeleted(ctx, workspace, workspaceBuild) } - msg, err := json.Marshal(wspubsub.WorkspaceEvent{ - Kind: wspubsub.WorkspaceEventKindStateChange, + auditor := s.Auditor.Load() + auditAction := auditActionFromTransition(workspaceBuild.Transition) + + previousBuildNumber := workspaceBuild.BuildNumber - 1 + previousBuild, prevBuildErr := s.Database.GetWorkspaceBuildByWorkspaceIDAndBuildNumber(ctx, database.GetWorkspaceBuildByWorkspaceIDAndBuildNumberParams{ WorkspaceID: workspace.ID, + BuildNumber: previousBuildNumber, }) - if err != nil { - return nil, xerrors.Errorf("marshal workspace update event: %s", err) + if prevBuildErr != nil { + previousBuild = database.WorkspaceBuild{} } - err = s.Pubsub.Publish(wspubsub.WorkspaceEventChannel(workspace.OwnerID), msg) + + // We pass the below information to the Auditor so that it + // can form a friendly string for the user to view in the UI. 
+ buildResourceInfo := audit.AdditionalFields{ + WorkspaceName: workspace.Name, + BuildNumber: strconv.FormatInt(int64(workspaceBuild.BuildNumber), 10), + BuildReason: database.BuildReason(string(workspaceBuild.Reason)), + WorkspaceID: workspace.ID, + } + + wriBytes, err := json.Marshal(buildResourceInfo) if err != nil { - return nil, xerrors.Errorf("update workspace: %w", err) + s.Logger.Error(ctx, "marshal resource info for successful job", slog.Error(err)) + } + + bag := audit.BaggageFromContext(ctx) + + audit.BackgroundAudit(ctx, &audit.BackgroundAuditParams[database.WorkspaceBuild]{ + Audit: *auditor, + Log: s.Logger, + UserID: job.InitiatorID, + OrganizationID: workspace.OrganizationID, + RequestID: job.ID, + IP: bag.IP, + Action: auditAction, + Old: previousBuild, + New: workspaceBuild, + Status: http.StatusOK, + AdditionalFields: wriBytes, + }) + } + + if s.PrebuildsOrchestrator != nil && input.PrebuiltWorkspaceBuildStage == sdkproto.PrebuiltWorkspaceBuildStage_CLAIM { + // Track resource replacements, if there are any. + orchestrator := s.PrebuildsOrchestrator.Load() + if resourceReplacements := jobType.WorkspaceBuild.ResourceReplacements; orchestrator != nil && len(resourceReplacements) > 0 { + // Fire and forget. Bind to the lifecycle of the server so shutdowns are handled gracefully. 
+ go (*orchestrator).TrackResourceReplacement(s.lifecycleCtx, workspace.ID, workspaceBuild.ID, resourceReplacements) } + } - if input.PrebuiltWorkspaceBuildStage == sdkproto.PrebuiltWorkspaceBuildStage_CLAIM { - s.Logger.Info(ctx, "workspace prebuild successfully claimed by user", - slog.F("workspace_id", workspace.ID)) + msg, err := json.Marshal(wspubsub.WorkspaceEvent{ + Kind: wspubsub.WorkspaceEventKindStateChange, + WorkspaceID: workspace.ID, + }) + if err != nil { + return xerrors.Errorf("marshal workspace update event: %w", err) + } + err = s.Pubsub.Publish(wspubsub.WorkspaceEventChannel(workspace.OwnerID), msg) + if err != nil { + return xerrors.Errorf("update workspace: %w", err) + } - err = prebuilds.NewPubsubWorkspaceClaimPublisher(s.Pubsub).PublishWorkspaceClaim(agentsdk.ReinitializationEvent{ - WorkspaceID: workspace.ID, - Reason: agentsdk.ReinitializeReasonPrebuildClaimed, - }) - if err != nil { - s.Logger.Error(ctx, "failed to publish workspace claim event", slog.Error(err)) - } + if input.PrebuiltWorkspaceBuildStage == sdkproto.PrebuiltWorkspaceBuildStage_CLAIM { + s.Logger.Info(ctx, "workspace prebuild successfully claimed by user", + slog.F("workspace_id", workspace.ID)) + + err = prebuilds.NewPubsubWorkspaceClaimPublisher(s.Pubsub).PublishWorkspaceClaim(agentsdk.ReinitializationEvent{ + WorkspaceID: workspace.ID, + Reason: agentsdk.ReinitializeReasonPrebuildClaimed, + }) + if err != nil { + s.Logger.Error(ctx, "failed to publish workspace claim event", slog.Error(err)) } - case *proto.CompletedJob_TemplateDryRun_: + } + + return nil +} + +// completeTemplateDryRunJob handles completion of a template dry-run job. +// All database operations are performed within a transaction.
+func (s *server) completeTemplateDryRunJob(ctx context.Context, job database.ProvisionerJob, jobID uuid.UUID, jobType *proto.CompletedJob_TemplateDryRun_, telemetrySnapshot *telemetry.Snapshot) error { + // Execute all database operations in a transaction + return s.Database.InTx(func(db database.Store) error { + now := s.timeNow() + + // Process resources for _, resource := range jobType.TemplateDryRun.Resources { s.Logger.Info(ctx, "inserting template dry-run job resource", slog.F("job_id", job.ID.String()), slog.F("resource_name", resource.Name), slog.F("resource_type", resource.Type)) - err = InsertWorkspaceResource(ctx, s.Database, jobID, database.WorkspaceTransitionStart, resource, telemetrySnapshot) + err := InsertWorkspaceResource(ctx, db, jobID, database.WorkspaceTransitionStart, resource, telemetrySnapshot) if err != nil { - return nil, xerrors.Errorf("insert resource: %w", err) + return xerrors.Errorf("insert resource: %w", err) } } + + // Process modules for _, module := range jobType.TemplateDryRun.Modules { s.Logger.Info(ctx, "inserting template dry-run job module", slog.F("job_id", job.ID.String()), slog.F("module_source", module.Source), ) - if err := InsertWorkspaceModule(ctx, s.Database, jobID, database.WorkspaceTransitionStart, module, telemetrySnapshot); err != nil { - return nil, xerrors.Errorf("insert module: %w", err) + if err := InsertWorkspaceModule(ctx, db, jobID, database.WorkspaceTransitionStart, module, telemetrySnapshot); err != nil { + return xerrors.Errorf("insert module: %w", err) } } - err = s.Database.UpdateProvisionerJobWithCompleteByID(ctx, database.UpdateProvisionerJobWithCompleteByIDParams{ + // Mark job as complete + err := db.UpdateProvisionerJobWithCompleteByID(ctx, database.UpdateProvisionerJobWithCompleteByIDParams{ ID: jobID, - UpdatedAt: s.timeNow(), + UpdatedAt: now, CompletedAt: sql.NullTime{ - Time: s.timeNow(), + Time: now, Valid: true, }, Error: sql.NullString{}, ErrorCode: sql.NullString{}, }) if err != nil { - 
return nil, xerrors.Errorf("update provisioner job: %w", err) + return xerrors.Errorf("update provisioner job: %w", err) } s.Logger.Debug(ctx, "marked template dry-run job as completed", slog.F("job_id", jobID)) - default: - if completed.Type == nil { - return nil, xerrors.Errorf("type payload must be provided") - } - return nil, xerrors.Errorf("unknown job type %q; ensure coderd and provisionerd versions match", - reflect.TypeOf(completed.Type).String()) - } - - data, err := json.Marshal(provisionersdk.ProvisionerJobLogsNotifyMessage{EndOfLogs: true}) - if err != nil { - return nil, xerrors.Errorf("marshal job log: %w", err) - } - err = s.Pubsub.Publish(provisionersdk.ProvisionerJobLogsNotifyChannel(jobID), data) - if err != nil { - s.Logger.Error(ctx, "failed to publish end of job logs", slog.F("job_id", jobID), slog.Error(err)) - return nil, xerrors.Errorf("publish end of job logs: %w", err) - } - - s.Logger.Debug(ctx, "stage CompleteJob done", slog.F("job_id", jobID)) - return &proto.Empty{}, nil + return nil + }, nil) // End of transaction } func (s *server) notifyWorkspaceDeleted(ctx context.Context, workspace database.Workspace, build database.WorkspaceBuild) { diff --git a/coderd/provisionerdserver/provisionerdserver_test.go b/coderd/provisionerdserver/provisionerdserver_test.go index e125db348e701..eb63d84b1df1b 100644 --- a/coderd/provisionerdserver/provisionerdserver_test.go +++ b/coderd/provisionerdserver/provisionerdserver_test.go @@ -20,6 +20,7 @@ import ( "go.opentelemetry.io/otel/trace" "golang.org/x/oauth2" "golang.org/x/xerrors" + "google.golang.org/protobuf/types/known/timestamppb" "storj.io/drpc" "cdr.dev/slog/sloggers/slogtest" @@ -1119,6 +1120,227 @@ func TestCompleteJob(t *testing.T) { require.ErrorContains(t, err, "you don't own this job") }) + // Test for verifying transaction behavior on the extracted methods + t.Run("TransactionBehavior", func(t *testing.T) { + t.Parallel() + // Test TemplateImport transaction + 
t.Run("TemplateImportTransaction", func(t *testing.T) { + t.Parallel() + srv, db, _, pd := setup(t, false, &overrides{}) + jobID := uuid.New() + versionID := uuid.New() + err := db.InsertTemplateVersion(ctx, database.InsertTemplateVersionParams{ + ID: versionID, + JobID: jobID, + OrganizationID: pd.OrganizationID, + }) + require.NoError(t, err) + job, err := db.InsertProvisionerJob(ctx, database.InsertProvisionerJobParams{ + OrganizationID: pd.OrganizationID, + ID: jobID, + Provisioner: database.ProvisionerTypeEcho, + Input: []byte(`{"template_version_id": "` + versionID.String() + `"}`), + StorageMethod: database.ProvisionerStorageMethodFile, + Type: database.ProvisionerJobTypeTemplateVersionImport, + }) + require.NoError(t, err) + _, err = db.AcquireProvisionerJob(ctx, database.AcquireProvisionerJobParams{ + OrganizationID: pd.OrganizationID, + WorkerID: uuid.NullUUID{ + UUID: pd.ID, + Valid: true, + }, + Types: []database.ProvisionerType{database.ProvisionerTypeEcho}, + }) + require.NoError(t, err) + + _, err = srv.CompleteJob(ctx, &proto.CompletedJob{ + JobId: job.ID.String(), + Type: &proto.CompletedJob_TemplateImport_{ + TemplateImport: &proto.CompletedJob_TemplateImport{ + StartResources: []*sdkproto.Resource{{ + Name: "test-resource", + Type: "aws_instance", + }}, + Plan: []byte("{}"), + }, + }, + }) + require.NoError(t, err) + + // Verify job was marked as completed + completedJob, err := db.GetProvisionerJobByID(ctx, job.ID) + require.NoError(t, err) + require.True(t, completedJob.CompletedAt.Valid, "Job should be marked as completed") + + // Verify resources were created + resources, err := db.GetWorkspaceResourcesByJobID(ctx, job.ID) + require.NoError(t, err) + require.Len(t, resources, 1, "Expected one resource to be created") + require.Equal(t, "test-resource", resources[0].Name) + }) + + // Test TemplateDryRun transaction + t.Run("TemplateDryRunTransaction", func(t *testing.T) { + t.Parallel() + srv, db, _, pd := setup(t, false, &overrides{}) + job, 
err := db.InsertProvisionerJob(ctx, database.InsertProvisionerJobParams{ + ID: uuid.New(), + Provisioner: database.ProvisionerTypeEcho, + Type: database.ProvisionerJobTypeTemplateVersionDryRun, + StorageMethod: database.ProvisionerStorageMethodFile, + }) + require.NoError(t, err) + _, err = db.AcquireProvisionerJob(ctx, database.AcquireProvisionerJobParams{ + WorkerID: uuid.NullUUID{ + UUID: pd.ID, + Valid: true, + }, + Types: []database.ProvisionerType{database.ProvisionerTypeEcho}, + }) + require.NoError(t, err) + + _, err = srv.CompleteJob(ctx, &proto.CompletedJob{ + JobId: job.ID.String(), + Type: &proto.CompletedJob_TemplateDryRun_{ + TemplateDryRun: &proto.CompletedJob_TemplateDryRun{ + Resources: []*sdkproto.Resource{{ + Name: "test-dry-run-resource", + Type: "aws_instance", + }}, + }, + }, + }) + require.NoError(t, err) + + // Verify job was marked as completed + completedJob, err := db.GetProvisionerJobByID(ctx, job.ID) + require.NoError(t, err) + require.True(t, completedJob.CompletedAt.Valid, "Job should be marked as completed") + + // Verify resources were created + resources, err := db.GetWorkspaceResourcesByJobID(ctx, job.ID) + require.NoError(t, err) + require.Len(t, resources, 1, "Expected one resource to be created") + require.Equal(t, "test-dry-run-resource", resources[0].Name) + }) + + // Test WorkspaceBuild transaction + t.Run("WorkspaceBuildTransaction", func(t *testing.T) { + t.Parallel() + srv, db, ps, pd := setup(t, false, &overrides{}) + + // Create test data + user := dbgen.User(t, db, database.User{}) + template := dbgen.Template(t, db, database.Template{ + Name: "template", + Provisioner: database.ProvisionerTypeEcho, + OrganizationID: pd.OrganizationID, + }) + file := dbgen.File(t, db, database.File{CreatedBy: user.ID}) + workspaceTable := dbgen.Workspace(t, db, database.WorkspaceTable{ + TemplateID: template.ID, + OwnerID: user.ID, + OrganizationID: pd.OrganizationID, + }) + version := dbgen.TemplateVersion(t, db, 
database.TemplateVersion{ + OrganizationID: pd.OrganizationID, + TemplateID: uuid.NullUUID{ + UUID: template.ID, + Valid: true, + }, + JobID: uuid.New(), + }) + build := dbgen.WorkspaceBuild(t, db, database.WorkspaceBuild{ + WorkspaceID: workspaceTable.ID, + TemplateVersionID: version.ID, + Transition: database.WorkspaceTransitionStart, + Reason: database.BuildReasonInitiator, + }) + job := dbgen.ProvisionerJob(t, db, ps, database.ProvisionerJob{ + FileID: file.ID, + InitiatorID: user.ID, + Type: database.ProvisionerJobTypeWorkspaceBuild, + Input: must(json.Marshal(provisionerdserver.WorkspaceProvisionJob{ + WorkspaceBuildID: build.ID, + })), + OrganizationID: pd.OrganizationID, + }) + _, err := db.AcquireProvisionerJob(ctx, database.AcquireProvisionerJobParams{ + OrganizationID: pd.OrganizationID, + WorkerID: uuid.NullUUID{ + UUID: pd.ID, + Valid: true, + }, + Types: []database.ProvisionerType{database.ProvisionerTypeEcho}, + }) + require.NoError(t, err) + + // Add a published channel to make sure the workspace event is sent + publishedWorkspace := make(chan struct{}) + closeWorkspaceSubscribe, err := ps.SubscribeWithErr(wspubsub.WorkspaceEventChannel(workspaceTable.OwnerID), + wspubsub.HandleWorkspaceEvent( + func(_ context.Context, e wspubsub.WorkspaceEvent, err error) { + if err != nil { + return + } + if e.Kind == wspubsub.WorkspaceEventKindStateChange && e.WorkspaceID == workspaceTable.ID { + close(publishedWorkspace) + } + })) + require.NoError(t, err) + defer closeWorkspaceSubscribe() + + // The actual test + _, err = srv.CompleteJob(ctx, &proto.CompletedJob{ + JobId: job.ID.String(), + Type: &proto.CompletedJob_WorkspaceBuild_{ + WorkspaceBuild: &proto.CompletedJob_WorkspaceBuild{ + State: []byte{}, + Resources: []*sdkproto.Resource{{ + Name: "test-workspace-resource", + Type: "aws_instance", + }}, + Timings: []*sdkproto.Timing{{ + Stage: "test", + Source: "test-source", + Resource: "test-resource", + Action: "test-action", + Start: timestamppb.Now(), + 
End: timestamppb.Now(), + }}, + }, + }, + }) + require.NoError(t, err) + + // Wait for workspace notification + select { + case <-publishedWorkspace: + // Success + case <-time.After(testutil.WaitShort): + t.Fatal("Workspace event not published") + } + + // Verify job was marked as completed + completedJob, err := db.GetProvisionerJobByID(ctx, job.ID) + require.NoError(t, err) + require.True(t, completedJob.CompletedAt.Valid, "Job should be marked as completed") + + // Verify resources were created + resources, err := db.GetWorkspaceResourcesByJobID(ctx, job.ID) + require.NoError(t, err) + require.Len(t, resources, 1, "Expected one resource to be created") + require.Equal(t, "test-workspace-resource", resources[0].Name) + + // Verify timings were recorded + timings, err := db.GetProvisionerJobTimingsByJobID(ctx, job.ID) + require.NoError(t, err) + require.Len(t, timings, 1, "Expected one timing entry to be created") + require.Equal(t, "test", string(timings[0].Stage), "Timing stage should match what was sent") + }) + }) + t.Run("TemplateImport_MissingGitAuth", func(t *testing.T) { t.Parallel() srv, db, _, pd := setup(t, false, &overrides{}) diff --git a/coderd/rbac/authz.go b/coderd/rbac/authz.go index d2c6d5d0675be..c63042a2a1363 100644 --- a/coderd/rbac/authz.go +++ b/coderd/rbac/authz.go @@ -65,7 +65,7 @@ const ( SubjectTypeUser SubjectType = "user" SubjectTypeProvisionerd SubjectType = "provisionerd" SubjectTypeAutostart SubjectType = "autostart" - SubjectTypeHangDetector SubjectType = "hang_detector" + SubjectTypeJobReaper SubjectType = "job_reaper" SubjectTypeResourceMonitor SubjectType = "resource_monitor" SubjectTypeCryptoKeyRotator SubjectType = "crypto_key_rotator" SubjectTypeCryptoKeyReader SubjectType = "crypto_key_reader" diff --git a/coderd/rbac/no_slim.go b/coderd/rbac/no_slim.go new file mode 100644 index 0000000000000..d1baaeade4108 --- /dev/null +++ b/coderd/rbac/no_slim.go @@ -0,0 +1,9 @@ +//go:build slim + +package rbac + +const ( + // This line 
fails to compile, preventing this package from being imported + // in slim builds. + _DO_NOT_IMPORT_THIS_PACKAGE_IN_SLIM_BUILDS = _DO_NOT_IMPORT_THIS_PACKAGE_IN_SLIM_BUILDS +) diff --git a/coderd/rbac/object_gen.go b/coderd/rbac/object_gen.go index 40b7dc87a56f8..f19d90894dd55 100644 --- a/coderd/rbac/object_gen.go +++ b/coderd/rbac/object_gen.go @@ -234,7 +234,9 @@ var ( // ResourceProvisionerJobs // Valid Actions + // - "ActionCreate" :: create provisioner jobs // - "ActionRead" :: read provisioner jobs + // - "ActionUpdate" :: update provisioner jobs ResourceProvisionerJobs = Object{ Type: "provisioner_jobs", } @@ -306,7 +308,9 @@ var ( // Valid Actions // - "ActionApplicationConnect" :: connect to workspace apps via browser // - "ActionCreate" :: create a new workspace + // - "ActionCreateAgent" :: create a new workspace agent // - "ActionDelete" :: delete workspace + // - "ActionDeleteAgent" :: delete an existing workspace agent // - "ActionRead" :: read workspace data to view on the UI // - "ActionSSH" :: ssh into a given workspace // - "ActionWorkspaceStart" :: allows starting a workspace @@ -336,7 +340,9 @@ var ( // Valid Actions // - "ActionApplicationConnect" :: connect to workspace apps via browser // - "ActionCreate" :: create a new workspace + // - "ActionCreateAgent" :: create a new workspace agent // - "ActionDelete" :: delete workspace + // - "ActionDeleteAgent" :: delete an existing workspace agent // - "ActionRead" :: read workspace data to view on the UI // - "ActionSSH" :: ssh into a given workspace // - "ActionWorkspaceStart" :: allows starting a workspace @@ -404,7 +410,9 @@ func AllActions() []policy.Action { policy.ActionApplicationConnect, policy.ActionAssign, policy.ActionCreate, + policy.ActionCreateAgent, policy.ActionDelete, + policy.ActionDeleteAgent, policy.ActionRead, policy.ActionReadPersonal, policy.ActionSSH, diff --git a/coderd/rbac/policy/policy.go b/coderd/rbac/policy/policy.go index 35da0892abfdb..160062283f857 100644 --- 
a/coderd/rbac/policy/policy.go +++ b/coderd/rbac/policy/policy.go @@ -24,6 +24,9 @@ const ( ActionReadPersonal Action = "read_personal" ActionUpdatePersonal Action = "update_personal" + + ActionCreateAgent Action = "create_agent" + ActionDeleteAgent Action = "delete_agent" ) type PermissionDefinition struct { @@ -67,6 +70,9 @@ var workspaceActions = map[Action]ActionDefinition{ // Running a workspace ActionSSH: actDef("ssh into a given workspace"), ActionApplicationConnect: actDef("connect to workspace apps via browser"), + + ActionCreateAgent: actDef("create a new workspace agent"), + ActionDeleteAgent: actDef("delete an existing workspace agent"), } // RBACPermissions is indexed by the type @@ -182,7 +188,9 @@ var RBACPermissions = map[string]PermissionDefinition{ }, "provisioner_jobs": { Actions: map[Action]ActionDefinition{ - ActionRead: actDef("read provisioner jobs"), + ActionRead: actDef("read provisioner jobs"), + ActionUpdate: actDef("update provisioner jobs"), + ActionCreate: actDef("create provisioner jobs"), }, }, "organization": { diff --git a/coderd/rbac/roles.go b/coderd/rbac/roles.go index 56124faee44e2..8b98f5f2f2bc7 100644 --- a/coderd/rbac/roles.go +++ b/coderd/rbac/roles.go @@ -272,7 +272,7 @@ func ReloadBuiltinRoles(opts *RoleOptions) { // This adds back in the Workspace permissions. 
Permissions(map[string][]policy.Action{ ResourceWorkspace.Type: ownerWorkspaceActions, - ResourceWorkspaceDormant.Type: {policy.ActionRead, policy.ActionDelete, policy.ActionCreate, policy.ActionUpdate, policy.ActionWorkspaceStop}, + ResourceWorkspaceDormant.Type: {policy.ActionRead, policy.ActionDelete, policy.ActionCreate, policy.ActionUpdate, policy.ActionWorkspaceStop, policy.ActionCreateAgent, policy.ActionDeleteAgent}, })...), Org: map[string][]Permission{}, User: []Permission{}, @@ -291,7 +291,7 @@ func ReloadBuiltinRoles(opts *RoleOptions) { User: append(allPermsExcept(ResourceWorkspaceDormant, ResourceUser, ResourceOrganizationMember), Permissions(map[string][]policy.Action{ // Reduced permission set on dormant workspaces. No build, ssh, or exec - ResourceWorkspaceDormant.Type: {policy.ActionRead, policy.ActionDelete, policy.ActionCreate, policy.ActionUpdate, policy.ActionWorkspaceStop}, + ResourceWorkspaceDormant.Type: {policy.ActionRead, policy.ActionDelete, policy.ActionCreate, policy.ActionUpdate, policy.ActionWorkspaceStop, policy.ActionCreateAgent, policy.ActionDeleteAgent}, // Users cannot do create/update/delete on themselves, but they // can read their own details. ResourceUser.Type: {policy.ActionRead, policy.ActionReadPersonal, policy.ActionUpdatePersonal}, @@ -412,7 +412,7 @@ func ReloadBuiltinRoles(opts *RoleOptions) { Org: map[string][]Permission{ // Org admins should not have workspace exec perms. 
organizationID.String(): append(allPermsExcept(ResourceWorkspace, ResourceWorkspaceDormant, ResourceAssignRole), Permissions(map[string][]policy.Action{ - ResourceWorkspaceDormant.Type: {policy.ActionRead, policy.ActionDelete, policy.ActionCreate, policy.ActionUpdate, policy.ActionWorkspaceStop}, + ResourceWorkspaceDormant.Type: {policy.ActionRead, policy.ActionDelete, policy.ActionCreate, policy.ActionUpdate, policy.ActionWorkspaceStop, policy.ActionCreateAgent, policy.ActionDeleteAgent}, ResourceWorkspace.Type: slice.Omit(ResourceWorkspace.AvailableActions(), policy.ActionApplicationConnect, policy.ActionSSH), })...), }, @@ -503,7 +503,7 @@ func ReloadBuiltinRoles(opts *RoleOptions) { // the ability to create templates and provisioners has // a lot of overlap. ResourceProvisionerDaemon.Type: {policy.ActionCreate, policy.ActionRead, policy.ActionUpdate, policy.ActionDelete}, - ResourceProvisionerJobs.Type: {policy.ActionRead}, + ResourceProvisionerJobs.Type: {policy.ActionRead, policy.ActionUpdate, policy.ActionCreate}, }), }, User: []Permission{}, @@ -529,6 +529,16 @@ func ReloadBuiltinRoles(opts *RoleOptions) { ResourceType: ResourceWorkspace.Type, Action: policy.ActionDelete, }, + { + Negate: true, + ResourceType: ResourceWorkspace.Type, + Action: policy.ActionCreateAgent, + }, + { + Negate: true, + ResourceType: ResourceWorkspace.Type, + Action: policy.ActionDeleteAgent, + }, }, }, User: []Permission{}, @@ -788,12 +798,12 @@ func OrganizationRoles(organizationID uuid.UUID) []Role { return roles } -// SiteRoles lists all roles that can be applied to a user. +// SiteBuiltInRoles lists all roles that can be applied to a user. // This is the list of available roles, and not specific to a user // // This should be a list in a database, but until then we build // the list from the builtins. -func SiteRoles() []Role { +func SiteBuiltInRoles() []Role { var roles []Role for _, roleF := range builtInRoles { // Must provide some non-nil uuid to filter out org roles. 
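Reviewer note on the `coderd/rbac/roles.go` change above: the hunk at `@@ -529,6 +529,16 @@` adds two `Negate: true` entries so that a role which already denies workspace deletion also denies `ActionCreateAgent` and `ActionDeleteAgent`. The sketch below illustrates the intended effect, assuming a negated permission always overrides a matching grant. It is a simplified stand-in, not the project's actual authorizer: the `Permission` struct, the `allowed` helper, and the `"workspace"` resource-type string are hypothetical names chosen for illustration. Only the action strings `create_agent` and `delete_agent` come from the `policy.go` hunk.

```go
package main

import "fmt"

// Permission is a hypothetical, simplified stand-in for an RBAC permission.
// It is not the real rbac.Permission type from the repository.
type Permission struct {
	Negate       bool
	ResourceType string
	Action       string
}

// allowed reports whether perms grant action on resourceType.
// Assumption for this sketch: any matching negated permission
// overrides every matching positive permission.
func allowed(perms []Permission, resourceType, action string) bool {
	granted := false
	for _, p := range perms {
		if p.ResourceType != resourceType || p.Action != action {
			continue
		}
		if p.Negate {
			return false
		}
		granted = true
	}
	return granted
}

func main() {
	// A member that may manage agents and read workspaces.
	member := []Permission{
		{ResourceType: "workspace", Action: "create_agent"},
		{ResourceType: "workspace", Action: "delete_agent"},
		{ResourceType: "workspace", Action: "read"},
	}

	// The same member with the two negated entries layered on top.
	banned := append(append([]Permission{}, member...),
		Permission{Negate: true, ResourceType: "workspace", Action: "create_agent"},
		Permission{Negate: true, ResourceType: "workspace", Action: "delete_agent"},
	)

	fmt.Println(allowed(member, "workspace", "create_agent")) // true
	fmt.Println(allowed(banned, "workspace", "create_agent")) // false
	fmt.Println(allowed(banned, "workspace", "read"))         // true
}
```

Under that assumption the result lines up with the new `CreateDeleteWorkspaceAgent` case in `roles_test.go`, which lists `orgMemberMeBanWorkspace` among the subjects that must be denied while `orgMemberMe` remains allowed.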
diff --git a/coderd/rbac/roles_test.go b/coderd/rbac/roles_test.go index e90c89914fdec..5738edfe8caa2 100644 --- a/coderd/rbac/roles_test.go +++ b/coderd/rbac/roles_test.go @@ -34,7 +34,7 @@ func (a authSubject) Subjects() []authSubject { return []authSubject{a} } // rules. If this is incorrect, that is a mistake. func TestBuiltInRoles(t *testing.T) { t.Parallel() - for _, r := range rbac.SiteRoles() { + for _, r := range rbac.SiteBuiltInRoles() { r := r t.Run(r.Identifier.String(), func(t *testing.T) { t.Parallel() @@ -226,6 +226,15 @@ func TestRolePermissions(t *testing.T) { false: {setOtherOrg, setOrgNotMe, memberMe, templateAdmin, userAdmin}, }, }, + { + Name: "CreateDeleteWorkspaceAgent", + Actions: []policy.Action{policy.ActionCreateAgent, policy.ActionDeleteAgent}, + Resource: rbac.ResourceWorkspace.WithID(workspaceID).InOrg(orgID).WithOwner(currentUser.String()), + AuthorizeMap: map[bool][]hasAuthSubjects{ + true: {owner, orgMemberMe, orgAdmin}, + false: {setOtherOrg, memberMe, userAdmin, templateAdmin, orgTemplateAdmin, orgUserAdmin, orgAuditor, orgMemberMeBanWorkspace}, + }, + }, { Name: "Templates", Actions: []policy.Action{policy.ActionCreate, policy.ActionUpdate, policy.ActionDelete}, @@ -462,7 +471,7 @@ func TestRolePermissions(t *testing.T) { }, { Name: "WorkspaceDormant", - Actions: append(crud, policy.ActionWorkspaceStop), + Actions: append(crud, policy.ActionWorkspaceStop, policy.ActionCreateAgent, policy.ActionDeleteAgent), Resource: rbac.ResourceWorkspaceDormant.WithID(uuid.New()).InOrg(orgID).WithOwner(memberMe.Actor.ID), AuthorizeMap: map[bool][]hasAuthSubjects{ true: {orgMemberMe, orgAdmin, owner}, @@ -580,7 +589,7 @@ func TestRolePermissions(t *testing.T) { }, { Name: "ProvisionerJobs", - Actions: []policy.Action{policy.ActionRead}, + Actions: []policy.Action{policy.ActionRead, policy.ActionUpdate, policy.ActionCreate}, Resource: rbac.ResourceProvisionerJobs.InOrg(orgID), AuthorizeMap: map[bool][]hasAuthSubjects{ true: {owner, 
orgTemplateAdmin, orgAdmin}, @@ -988,7 +997,7 @@ func TestIsOrgRole(t *testing.T) { func TestListRoles(t *testing.T) { t.Parallel() - siteRoles := rbac.SiteRoles() + siteRoles := rbac.SiteBuiltInRoles() siteRoleNames := make([]string, 0, len(siteRoles)) for _, role := range siteRoles { siteRoleNames = append(siteRoleNames, role.Identifier.Name) diff --git a/coderd/roles.go b/coderd/roles.go index 89e6a964aba31..ed650f41fd6c9 100644 --- a/coderd/roles.go +++ b/coderd/roles.go @@ -43,7 +43,7 @@ func (api *API) AssignableSiteRoles(rw http.ResponseWriter, r *http.Request) { return } - httpapi.Write(ctx, rw, http.StatusOK, assignableRoles(actorRoles.Roles, rbac.SiteRoles(), dbCustomRoles)) + httpapi.Write(ctx, rw, http.StatusOK, assignableRoles(actorRoles.Roles, rbac.SiteBuiltInRoles(), dbCustomRoles)) } // assignableOrgRoles returns all org wide roles that can be assigned. diff --git a/coderd/telemetry/telemetry_test.go b/coderd/telemetry/telemetry_test.go index 6f97ce8a1270b..7de4c98e07fa8 100644 --- a/coderd/telemetry/telemetry_test.go +++ b/coderd/telemetry/telemetry_test.go @@ -1,6 +1,7 @@ package telemetry_test import ( + "context" "database/sql" "encoding/json" "net/http" @@ -115,7 +116,7 @@ func TestTelemetry(t *testing.T) { _ = dbgen.WorkspaceAgentMemoryResourceMonitor(t, db, database.WorkspaceAgentMemoryResourceMonitor{}) _ = dbgen.WorkspaceAgentVolumeResourceMonitor(t, db, database.WorkspaceAgentVolumeResourceMonitor{}) - _, snapshot := collectSnapshot(t, db, nil) + _, snapshot := collectSnapshot(ctx, t, db, nil) require.Len(t, snapshot.ProvisionerJobs, 1) require.Len(t, snapshot.Licenses, 1) require.Len(t, snapshot.Templates, 1) @@ -168,17 +169,19 @@ func TestTelemetry(t *testing.T) { }) t.Run("HashedEmail", func(t *testing.T) { t.Parallel() + ctx := testutil.Context(t, testutil.WaitMedium) db := dbmem.New() _ = dbgen.User(t, db, database.User{ Email: "kyle@coder.com", }) - _, snapshot := collectSnapshot(t, db, nil) + _, snapshot := collectSnapshot(ctx, t, 
db, nil) require.Len(t, snapshot.Users, 1) require.Equal(t, snapshot.Users[0].EmailHashed, "bb44bf07cf9a2db0554bba63a03d822c927deae77df101874496df5a6a3e896d@coder.com") }) t.Run("HashedModule", func(t *testing.T) { t.Parallel() db, _ := dbtestutil.NewDB(t) + ctx := testutil.Context(t, testutil.WaitMedium) pj := dbgen.ProvisionerJob(t, db, nil, database.ProvisionerJob{}) _ = dbgen.WorkspaceModule(t, db, database.WorkspaceModule{ JobID: pj.ID, @@ -190,7 +193,7 @@ func TestTelemetry(t *testing.T) { Source: "https://internal-url.com/some-module", Version: "1.0.0", }) - _, snapshot := collectSnapshot(t, db, nil) + _, snapshot := collectSnapshot(ctx, t, db, nil) require.Len(t, snapshot.WorkspaceModules, 2) modules := snapshot.WorkspaceModules sort.Slice(modules, func(i, j int) bool { @@ -286,11 +289,11 @@ func TestTelemetry(t *testing.T) { db, _ := dbtestutil.NewDB(t) // 1. No org sync settings - deployment, _ := collectSnapshot(t, db, nil) + deployment, _ := collectSnapshot(ctx, t, db, nil) require.False(t, *deployment.IDPOrgSync) // 2. 
Org sync settings set in server flags - deployment, _ = collectSnapshot(t, db, func(opts telemetry.Options) telemetry.Options { + deployment, _ = collectSnapshot(ctx, t, db, func(opts telemetry.Options) telemetry.Options { opts.DeploymentConfig = &codersdk.DeploymentValues{ OIDC: codersdk.OIDCConfig{ OrganizationField: "organizations", @@ -312,7 +315,7 @@ func TestTelemetry(t *testing.T) { AssignDefault: true, }) require.NoError(t, err) - deployment, _ = collectSnapshot(t, db, nil) + deployment, _ = collectSnapshot(ctx, t, db, nil) require.True(t, *deployment.IDPOrgSync) }) } @@ -320,8 +323,9 @@ func TestTelemetry(t *testing.T) { // nolint:paralleltest func TestTelemetryInstallSource(t *testing.T) { t.Setenv("CODER_TELEMETRY_INSTALL_SOURCE", "aws_marketplace") + ctx := testutil.Context(t, testutil.WaitMedium) db := dbmem.New() - deployment, _ := collectSnapshot(t, db, nil) + deployment, _ := collectSnapshot(ctx, t, db, nil) require.Equal(t, "aws_marketplace", deployment.InstallSource) } @@ -436,7 +440,7 @@ func TestRecordTelemetryStatus(t *testing.T) { } } -func mockTelemetryServer(t *testing.T) (*url.URL, chan *telemetry.Deployment, chan *telemetry.Snapshot) { +func mockTelemetryServer(ctx context.Context, t *testing.T) (*url.URL, chan *telemetry.Deployment, chan *telemetry.Snapshot) { t.Helper() deployment := make(chan *telemetry.Deployment, 64) snapshot := make(chan *telemetry.Snapshot, 64) @@ -446,7 +450,11 @@ func mockTelemetryServer(t *testing.T) (*url.URL, chan *telemetry.Deployment, ch dd := &telemetry.Deployment{} err := json.NewDecoder(r.Body).Decode(dd) require.NoError(t, err) - deployment <- dd + ok := testutil.AssertSend(ctx, t, deployment, dd) + if !ok { + w.WriteHeader(http.StatusInternalServerError) + return + } // Ensure the header is sent only after deployment is sent w.WriteHeader(http.StatusAccepted) }) @@ -455,7 +463,11 @@ func mockTelemetryServer(t *testing.T) (*url.URL, chan *telemetry.Deployment, ch ss := &telemetry.Snapshot{} err := 
json.NewDecoder(r.Body).Decode(ss) require.NoError(t, err) - snapshot <- ss + ok := testutil.AssertSend(ctx, t, snapshot, ss) + if !ok { + w.WriteHeader(http.StatusInternalServerError) + return + } // Ensure the header is sent only after snapshot is sent w.WriteHeader(http.StatusAccepted) }) @@ -467,10 +479,15 @@ func mockTelemetryServer(t *testing.T) (*url.URL, chan *telemetry.Deployment, ch return serverURL, deployment, snapshot } -func collectSnapshot(t *testing.T, db database.Store, addOptionsFn func(opts telemetry.Options) telemetry.Options) (*telemetry.Deployment, *telemetry.Snapshot) { +func collectSnapshot( + ctx context.Context, + t *testing.T, + db database.Store, + addOptionsFn func(opts telemetry.Options) telemetry.Options, +) (*telemetry.Deployment, *telemetry.Snapshot) { t.Helper() - serverURL, deployment, snapshot := mockTelemetryServer(t) + serverURL, deployment, snapshot := mockTelemetryServer(ctx, t) options := telemetry.Options{ Database: db, @@ -485,5 +502,6 @@ func collectSnapshot(t *testing.T, db database.Store, addOptionsFn func(opts tel reporter, err := telemetry.New(options) require.NoError(t, err) t.Cleanup(reporter.Close) - return <-deployment, <-snapshot + + return testutil.RequireReceive(ctx, t, deployment), testutil.RequireReceive(ctx, t, snapshot) } diff --git a/coderd/workspaceagents_test.go b/coderd/workspaceagents_test.go index 27da80b3c579b..1d17560c38816 100644 --- a/coderd/workspaceagents_test.go +++ b/coderd/workspaceagents_test.go @@ -439,25 +439,55 @@ func TestWorkspaceAgentConnectRPC(t *testing.T) { t.Run("Connect", func(t *testing.T) { t.Parallel() - client, db := coderdtest.NewWithDatabase(t, nil) - user := coderdtest.CreateFirstUser(t, client) - r := dbfake.WorkspaceBuild(t, db, database.WorkspaceTable{ - OrganizationID: user.OrganizationID, - OwnerID: user.UserID, - }).WithAgent().Do() - _ = agenttest.New(t, client.URL, r.AgentToken) - resources := coderdtest.AwaitWorkspaceAgents(t, client, r.Workspace.ID) + for _, tc := 
range []struct { + name string + apiKeyScope rbac.ScopeName + }{ + { + name: "empty (backwards compat)", + apiKeyScope: "", + }, + { + name: "all", + apiKeyScope: rbac.ScopeAll, + }, + { + name: "no_user_data", + apiKeyScope: rbac.ScopeNoUserData, + }, + { + name: "application_connect", + apiKeyScope: rbac.ScopeApplicationConnect, + }, + } { + t.Run(tc.name, func(t *testing.T) { + client, db := coderdtest.NewWithDatabase(t, nil) + user := coderdtest.CreateFirstUser(t, client) + r := dbfake.WorkspaceBuild(t, db, database.WorkspaceTable{ + OrganizationID: user.OrganizationID, + OwnerID: user.UserID, + }).WithAgent(func(agents []*proto.Agent) []*proto.Agent { + for _, agent := range agents { + agent.ApiKeyScope = string(tc.apiKeyScope) + } - ctx, cancel := context.WithTimeout(context.Background(), testutil.WaitLong) - defer cancel() + return agents + }).Do() + _ = agenttest.New(t, client.URL, r.AgentToken) + resources := coderdtest.NewWorkspaceAgentWaiter(t, client, r.Workspace.ID).AgentNames([]string{}).Wait() - conn, err := workspacesdk.New(client). - DialAgent(ctx, resources[0].Agents[0].ID, nil) - require.NoError(t, err) - defer func() { - _ = conn.Close() - }() - conn.AwaitReachable(ctx) + ctx, cancel := context.WithTimeout(context.Background(), testutil.WaitLong) + defer cancel() + + conn, err := workspacesdk.New(client). 
+ DialAgent(ctx, resources[0].Agents[0].ID, nil) + require.NoError(t, err) + defer func() { + _ = conn.Close() + }() + conn.AwaitReachable(ctx) + }) + } }) t.Run("FailNonLatestBuild", func(t *testing.T) { @@ -1295,14 +1325,14 @@ func TestWorkspaceAgentContainers(t *testing.T) { { name: "test response", setupMock: func(mcl *acmock.MockLister) (codersdk.WorkspaceAgentListContainersResponse, error) { - mcl.EXPECT().List(gomock.Any()).Return(testResponse, nil).Times(1) + mcl.EXPECT().List(gomock.Any()).Return(testResponse, nil).AnyTimes() return testResponse, nil }, }, { name: "error response", setupMock: func(mcl *acmock.MockLister) (codersdk.WorkspaceAgentListContainersResponse, error) { - mcl.EXPECT().List(gomock.Any()).Return(codersdk.WorkspaceAgentListContainersResponse{}, assert.AnError).Times(1) + mcl.EXPECT().List(gomock.Any()).Return(codersdk.WorkspaceAgentListContainersResponse{}, assert.AnError).AnyTimes() return codersdk.WorkspaceAgentListContainersResponse{}, assert.AnError }, }, @@ -1314,7 +1344,10 @@ func TestWorkspaceAgentContainers(t *testing.T) { ctrl := gomock.NewController(t) mcl := acmock.NewMockLister(ctrl) expected, expectedErr := tc.setupMock(mcl) - client, db := coderdtest.NewWithDatabase(t, &coderdtest.Options{}) + logger := slogtest.Make(t, &slogtest.Options{IgnoreErrors: true}).Leveled(slog.LevelDebug) + client, db := coderdtest.NewWithDatabase(t, &coderdtest.Options{ + Logger: &logger, + }) user := coderdtest.CreateFirstUser(t, client) r := dbfake.WorkspaceBuild(t, db, database.WorkspaceTable{ OrganizationID: user.OrganizationID, @@ -1323,6 +1356,7 @@ func TestWorkspaceAgentContainers(t *testing.T) { return agents }).Do() _ = agenttest.New(t, client.URL, r.AgentToken, func(o *agent.Options) { + o.Logger = logger.Named("agent") o.ExperimentalDevcontainersEnabled = true o.ContainerAPIOptions = append(o.ContainerAPIOptions, agentcontainers.WithLister(mcl)) }) @@ -1392,7 +1426,7 @@ func TestWorkspaceAgentRecreateDevcontainer(t *testing.T) { 
setupMock: func(mcl *acmock.MockLister, mdccli *acmock.MockDevcontainerCLI) int { mcl.EXPECT().List(gomock.Any()).Return(codersdk.WorkspaceAgentListContainersResponse{ Containers: []codersdk.WorkspaceAgentContainer{devContainer}, - }, nil).Times(1) + }, nil).AnyTimes() mdccli.EXPECT().Up(gomock.Any(), workspaceFolder, configFile, gomock.Any()).Return("someid", nil).Times(1) return 0 }, @@ -1400,7 +1434,7 @@ func TestWorkspaceAgentRecreateDevcontainer(t *testing.T) { { name: "Container does not exist", setupMock: func(mcl *acmock.MockLister, mdccli *acmock.MockDevcontainerCLI) int { - mcl.EXPECT().List(gomock.Any()).Return(codersdk.WorkspaceAgentListContainersResponse{}, nil).Times(1) + mcl.EXPECT().List(gomock.Any()).Return(codersdk.WorkspaceAgentListContainersResponse{}, nil).AnyTimes() return http.StatusNotFound }, }, @@ -1409,7 +1443,7 @@ func TestWorkspaceAgentRecreateDevcontainer(t *testing.T) { setupMock: func(mcl *acmock.MockLister, mdccli *acmock.MockDevcontainerCLI) int { mcl.EXPECT().List(gomock.Any()).Return(codersdk.WorkspaceAgentListContainersResponse{ Containers: []codersdk.WorkspaceAgentContainer{plainContainer}, - }, nil).Times(1) + }, nil).AnyTimes() return http.StatusNotFound }, }, @@ -1421,7 +1455,10 @@ func TestWorkspaceAgentRecreateDevcontainer(t *testing.T) { mcl := acmock.NewMockLister(ctrl) mdccli := acmock.NewMockDevcontainerCLI(ctrl) wantStatus := tc.setupMock(mcl, mdccli) - client, db := coderdtest.NewWithDatabase(t, &coderdtest.Options{}) + logger := slogtest.Make(t, &slogtest.Options{IgnoreErrors: true}).Leveled(slog.LevelDebug) + client, db := coderdtest.NewWithDatabase(t, &coderdtest.Options{ + Logger: &logger, + }) user := coderdtest.CreateFirstUser(t, client) r := dbfake.WorkspaceBuild(t, db, database.WorkspaceTable{ OrganizationID: user.OrganizationID, @@ -1430,6 +1467,7 @@ func TestWorkspaceAgentRecreateDevcontainer(t *testing.T) { return agents }).Do() _ = agenttest.New(t, client.URL, r.AgentToken, func(o *agent.Options) { + 
o.Logger = logger.Named("agent") o.ExperimentalDevcontainersEnabled = true o.ContainerAPIOptions = append( o.ContainerAPIOptions, diff --git a/coderd/workspaceagentsrpc.go b/coderd/workspaceagentsrpc.go index 43da35410f632..2dcf65bd8c7d5 100644 --- a/coderd/workspaceagentsrpc.go +++ b/coderd/workspaceagentsrpc.go @@ -76,17 +76,8 @@ func (api *API) workspaceAgentRPC(rw http.ResponseWriter, r *http.Request) { return } - owner, err := api.Database.GetUserByID(ctx, workspace.OwnerID) - if err != nil { - httpapi.Write(ctx, rw, http.StatusBadRequest, codersdk.Response{ - Message: "Internal error fetching user.", - Detail: err.Error(), - }) - return - } - logger = logger.With( - slog.F("owner", owner.Username), + slog.F("owner", workspace.OwnerUsername), slog.F("workspace_name", workspace.Name), slog.F("agent_name", workspaceAgent.Name), ) @@ -170,7 +161,7 @@ func (api *API) workspaceAgentRPC(rw http.ResponseWriter, r *http.Request) { }) streamID := tailnet.StreamID{ - Name: fmt.Sprintf("%s-%s-%s", owner.Username, workspace.Name, workspaceAgent.Name), + Name: fmt.Sprintf("%s-%s-%s", workspace.OwnerUsername, workspace.Name, workspaceAgent.Name), ID: workspaceAgent.ID, Auth: tailnet.AgentCoordinateeAuth{ID: workspaceAgent.ID}, } diff --git a/coderd/workspaceagentsrpc_test.go b/coderd/workspaceagentsrpc_test.go index caea9b39c2f54..5175f80b0b723 100644 --- a/coderd/workspaceagentsrpc_test.go +++ b/coderd/workspaceagentsrpc_test.go @@ -13,6 +13,7 @@ import ( "github.com/coder/coder/v2/coderd/database" "github.com/coder/coder/v2/coderd/database/dbfake" "github.com/coder/coder/v2/coderd/database/dbtime" + "github.com/coder/coder/v2/coderd/rbac" "github.com/coder/coder/v2/codersdk/agentsdk" "github.com/coder/coder/v2/provisionersdk/proto" "github.com/coder/coder/v2/testutil" @@ -22,89 +23,150 @@ import ( func TestWorkspaceAgentReportStats(t *testing.T) { t.Parallel() - tickCh := make(chan time.Time) - flushCh := make(chan int, 1) - client, db := coderdtest.NewWithDatabase(t, 
&coderdtest.Options{ - WorkspaceUsageTrackerFlush: flushCh, - WorkspaceUsageTrackerTick: tickCh, - }) - user := coderdtest.CreateFirstUser(t, client) - r := dbfake.WorkspaceBuild(t, db, database.WorkspaceTable{ - OrganizationID: user.OrganizationID, - OwnerID: user.UserID, - LastUsedAt: dbtime.Now().Add(-time.Minute), - }).WithAgent().Do() + for _, tc := range []struct { + name string + apiKeyScope rbac.ScopeName + }{ + { + name: "empty (backwards compat)", + apiKeyScope: "", + }, + { + name: "all", + apiKeyScope: rbac.ScopeAll, + }, + { + name: "no_user_data", + apiKeyScope: rbac.ScopeNoUserData, + }, + { + name: "application_connect", + apiKeyScope: rbac.ScopeApplicationConnect, + }, + } { + t.Run(tc.name, func(t *testing.T) { + t.Parallel() - ac := agentsdk.New(client.URL) - ac.SetSessionToken(r.AgentToken) - conn, err := ac.ConnectRPC(context.Background()) - require.NoError(t, err) - defer func() { - _ = conn.Close() - }() - agentAPI := agentproto.NewDRPCAgentClient(conn) + tickCh := make(chan time.Time) + flushCh := make(chan int, 1) + client, db := coderdtest.NewWithDatabase(t, &coderdtest.Options{ + WorkspaceUsageTrackerFlush: flushCh, + WorkspaceUsageTrackerTick: tickCh, + }) + user := coderdtest.CreateFirstUser(t, client) + r := dbfake.WorkspaceBuild(t, db, database.WorkspaceTable{ + OrganizationID: user.OrganizationID, + OwnerID: user.UserID, + LastUsedAt: dbtime.Now().Add(-time.Minute), + }).WithAgent( + func(agent []*proto.Agent) []*proto.Agent { + for _, a := range agent { + a.ApiKeyScope = string(tc.apiKeyScope) + } - _, err = agentAPI.UpdateStats(context.Background(), &agentproto.UpdateStatsRequest{ - Stats: &agentproto.Stats{ - ConnectionsByProto: map[string]int64{"TCP": 1}, - ConnectionCount: 1, - RxPackets: 1, - RxBytes: 1, - TxPackets: 1, - TxBytes: 1, - SessionCountVscode: 1, - SessionCountJetbrains: 0, - SessionCountReconnectingPty: 0, - SessionCountSsh: 0, - ConnectionMedianLatencyMs: 10, - }, - }) - require.NoError(t, err) + return agent + }, 
+ ).Do() + + ac := agentsdk.New(client.URL) + ac.SetSessionToken(r.AgentToken) + conn, err := ac.ConnectRPC(context.Background()) + require.NoError(t, err) + defer func() { + _ = conn.Close() + }() + agentAPI := agentproto.NewDRPCAgentClient(conn) + + _, err = agentAPI.UpdateStats(context.Background(), &agentproto.UpdateStatsRequest{ + Stats: &agentproto.Stats{ + ConnectionsByProto: map[string]int64{"TCP": 1}, + ConnectionCount: 1, + RxPackets: 1, + RxBytes: 1, + TxPackets: 1, + TxBytes: 1, + SessionCountVscode: 1, + SessionCountJetbrains: 0, + SessionCountReconnectingPty: 0, + SessionCountSsh: 0, + ConnectionMedianLatencyMs: 10, + }, + }) + require.NoError(t, err) - tickCh <- dbtime.Now() - count := <-flushCh - require.Equal(t, 1, count, "expected one flush with one id") + tickCh <- dbtime.Now() + count := <-flushCh + require.Equal(t, 1, count, "expected one flush with one id") - newWorkspace, err := client.Workspace(context.Background(), r.Workspace.ID) - require.NoError(t, err) + newWorkspace, err := client.Workspace(context.Background(), r.Workspace.ID) + require.NoError(t, err) - assert.True(t, - newWorkspace.LastUsedAt.After(r.Workspace.LastUsedAt), - "%s is not after %s", newWorkspace.LastUsedAt, r.Workspace.LastUsedAt, - ) + assert.True(t, + newWorkspace.LastUsedAt.After(r.Workspace.LastUsedAt), + "%s is not after %s", newWorkspace.LastUsedAt, r.Workspace.LastUsedAt, + ) + }) + } } func TestAgentAPI_LargeManifest(t *testing.T) { t.Parallel() - ctx := testutil.Context(t, testutil.WaitLong) - client, store := coderdtest.NewWithDatabase(t, nil) - adminUser := coderdtest.CreateFirstUser(t, client) - n := 512000 - longScript := make([]byte, n) - for i := range longScript { - longScript[i] = 'q' + + for _, tc := range []struct { + name string + apiKeyScope rbac.ScopeName + }{ + { + name: "empty (backwards compat)", + apiKeyScope: "", + }, + { + name: "all", + apiKeyScope: rbac.ScopeAll, + }, + { + name: "no_user_data", + apiKeyScope: rbac.ScopeNoUserData, + }, + 
{ + name: "application_connect", + apiKeyScope: rbac.ScopeApplicationConnect, + }, + } { + t.Run(tc.name, func(t *testing.T) { + t.Parallel() + ctx := testutil.Context(t, testutil.WaitLong) + client, store := coderdtest.NewWithDatabase(t, nil) + adminUser := coderdtest.CreateFirstUser(t, client) + n := 512000 + longScript := make([]byte, n) + for i := range longScript { + longScript[i] = 'q' + } + r := dbfake.WorkspaceBuild(t, store, database.WorkspaceTable{ + OrganizationID: adminUser.OrganizationID, + OwnerID: adminUser.UserID, + }).WithAgent(func(agents []*proto.Agent) []*proto.Agent { + agents[0].Scripts = []*proto.Script{ + { + Script: string(longScript), + }, + } + agents[0].ApiKeyScope = string(tc.apiKeyScope) + return agents + }).Do() + ac := agentsdk.New(client.URL) + ac.SetSessionToken(r.AgentToken) + conn, err := ac.ConnectRPC(ctx) + defer func() { + _ = conn.Close() + }() + require.NoError(t, err) + agentAPI := agentproto.NewDRPCAgentClient(conn) + manifest, err := agentAPI.GetManifest(ctx, &agentproto.GetManifestRequest{}) + require.NoError(t, err) + require.Len(t, manifest.Scripts, 1) + require.Len(t, manifest.Scripts[0].Script, n) + }) } - r := dbfake.WorkspaceBuild(t, store, database.WorkspaceTable{ - OrganizationID: adminUser.OrganizationID, - OwnerID: adminUser.UserID, - }).WithAgent(func(agents []*proto.Agent) []*proto.Agent { - agents[0].Scripts = []*proto.Script{ - { - Script: string(longScript), - }, - } - return agents - }).Do() - ac := agentsdk.New(client.URL) - ac.SetSessionToken(r.AgentToken) - conn, err := ac.ConnectRPC(ctx) - defer func() { - _ = conn.Close() - }() - require.NoError(t, err) - agentAPI := agentproto.NewDRPCAgentClient(conn) - manifest, err := agentAPI.GetManifest(ctx, &agentproto.GetManifestRequest{}) - require.NoError(t, err) - require.Len(t, manifest.Scripts, 1) - require.Len(t, manifest.Scripts[0].Script, n) } diff --git a/coderd/workspacebuilds.go b/coderd/workspacebuilds.go index 719d4e2a48123..08b90b834ccca 100644 
--- a/coderd/workspacebuilds.go +++ b/coderd/workspacebuilds.go @@ -338,6 +338,7 @@ func (api *API) postWorkspaceBuilds(rw http.ResponseWriter, r *http.Request) { RichParameterValues(createBuild.RichParameterValues). LogLevel(string(createBuild.LogLevel)). DeploymentValues(api.Options.DeploymentValues). + Experiments(api.Experiments). TemplateVersionPresetID(createBuild.TemplateVersionPresetID) var ( @@ -383,6 +384,22 @@ func (api *API) postWorkspaceBuilds(rw http.ResponseWriter, r *http.Request) { builder = builder.State(createBuild.ProvisionerState) } + // Only defer to dynamic parameters if the experiment is enabled. + if api.Experiments.Enabled(codersdk.ExperimentDynamicParameters) { + if createBuild.EnableDynamicParameters != nil { + // Explicit opt-in + builder = builder.DynamicParameters(*createBuild.EnableDynamicParameters) + } + } else { + if createBuild.EnableDynamicParameters != nil { + api.Logger.Warn(ctx, "ignoring dynamic parameter field sent by request, the experiment is not enabled", + slog.F("field", *createBuild.EnableDynamicParameters), + slog.F("user", apiKey.UserID.String()), + slog.F("transition", string(createBuild.Transition)), + ) + } + } + workspaceBuild, provisionerJob, provisionerDaemons, err = builder.Build( ctx, tx, diff --git a/coderd/workspaces.go b/coderd/workspaces.go index 203c9f8599298..fe0c2d3f609a2 100644 --- a/coderd/workspaces.go +++ b/coderd/workspaces.go @@ -704,6 +704,8 @@ func createWorkspace( Reason(database.BuildReasonInitiator). Initiator(initiatorID). ActiveVersion(). + Experiments(api.Experiments). + DeploymentValues(api.DeploymentValues). 
RichParameterValues(req.RichParameterValues) if req.TemplateVersionID != uuid.Nil { builder = builder.VersionID(req.TemplateVersionID) @@ -716,7 +718,7 @@ func createWorkspace( } if req.EnableDynamicParameters && api.Experiments.Enabled(codersdk.ExperimentDynamicParameters) { - builder = builder.UsingDynamicParameters() + builder = builder.DynamicParameters(req.EnableDynamicParameters) } workspaceBuild, provisionerJob, provisionerDaemons, err = builder.Build( @@ -2259,6 +2261,7 @@ func convertWorkspace( TemplateAllowUserCancelWorkspaceJobs: template.AllowUserCancelWorkspaceJobs, TemplateActiveVersionID: template.ActiveVersionID, TemplateRequireActiveVersion: template.RequireActiveVersion, + TemplateUseClassicParameterFlow: template.UseClassicParameterFlow, Outdated: workspaceBuild.TemplateVersionID.String() != template.ActiveVersionID.String(), Name: workspace.Name, AutostartSchedule: autostartSchedule, diff --git a/coderd/wsbuilder/wsbuilder.go b/coderd/wsbuilder/wsbuilder.go index 91638c63e436f..46035f28dda77 100644 --- a/coderd/wsbuilder/wsbuilder.go +++ b/coderd/wsbuilder/wsbuilder.go @@ -13,7 +13,9 @@ import ( "github.com/hashicorp/hcl/v2" "github.com/hashicorp/hcl/v2/hclsyntax" + "github.com/coder/coder/v2/apiversion" "github.com/coder/coder/v2/coderd/rbac/policy" + "github.com/coder/coder/v2/coderd/util/ptr" "github.com/coder/coder/v2/provisioner/terraform/tfparse" "github.com/coder/coder/v2/provisionersdk" sdkproto "github.com/coder/coder/v2/provisionersdk/proto" @@ -51,9 +53,11 @@ type Builder struct { state stateTarget logLevel string deploymentValues *codersdk.DeploymentValues + experiments codersdk.Experiments - richParameterValues []codersdk.WorkspaceBuildParameter - dynamicParametersEnabled bool + richParameterValues []codersdk.WorkspaceBuildParameter + // dynamicParametersEnabled is non-nil if set externally + dynamicParametersEnabled *bool initiator uuid.UUID reason database.BuildReason templateVersionPresetID uuid.UUID @@ -66,6 +70,7 @@ type 
Builder struct { template *database.Template templateVersion *database.TemplateVersion templateVersionJob *database.ProvisionerJob + terraformValues *database.TemplateVersionTerraformValue templateVersionParameters *[]database.TemplateVersionParameter templateVersionVariables *[]database.TemplateVersionVariable templateVersionWorkspaceTags *[]database.TemplateVersionWorkspaceTag @@ -155,6 +160,14 @@ func (b Builder) DeploymentValues(dv *codersdk.DeploymentValues) Builder { return b } +func (b Builder) Experiments(exp codersdk.Experiments) Builder { + // nolint: revive + cpy := make(codersdk.Experiments, len(exp)) + copy(cpy, exp) + b.experiments = cpy + return b +} + func (b Builder) Initiator(u uuid.UUID) Builder { // nolint: revive b.initiator = u @@ -187,8 +200,9 @@ func (b Builder) MarkPrebuiltWorkspaceClaim() Builder { return b } -func (b Builder) UsingDynamicParameters() Builder { - b.dynamicParametersEnabled = true +func (b Builder) DynamicParameters(using bool) Builder { + // nolint: revive + b.dynamicParametersEnabled = ptr.Ref(using) return b } @@ -516,6 +530,22 @@ func (b *Builder) getTemplateVersionID() (uuid.UUID, error) { return bld.TemplateVersionID, nil } +func (b *Builder) getTemplateTerraformValues() (*database.TemplateVersionTerraformValue, error) { + if b.terraformValues != nil { + return b.terraformValues, nil + } + v, err := b.getTemplateVersion() + if err != nil { + return nil, xerrors.Errorf("get template version so we can get terraform values: %w", err) + } + vals, err := b.store.GetTemplateVersionTerraformValues(b.ctx, v.ID) + if err != nil { + return nil, xerrors.Errorf("get template version terraform values %s: %w", v.JobID, err) + } + b.terraformValues = &vals + return b.terraformValues, err +} + func (b *Builder) getLastBuild() (*database.WorkspaceBuild, error) { if b.lastBuild != nil { return b.lastBuild, nil @@ -593,30 +623,43 @@ func (b *Builder) getParameters() (names, values []string, err error) { return nil, nil, 
BuildError{http.StatusBadRequest, "Unable to build workspace with unsupported parameters", err} } + // Dynamic parameters skip all parameter validation. + // Deleting a workspace also should skip parameter validation. + // Pass the user's input as is. + if b.usingDynamicParameters() { + // TODO: The previous behavior was only to pass param values + // for parameters that exist. Since dynamic params can have + // conditional parameter existence, the static frame of reference + // is not sufficient. So assume the user is correct, or pull in the + // dynamic param code to find the actual parameters. + for _, value := range b.richParameterValues { + names = append(names, value.Name) + values = append(values, value.Value) + } + b.parameterNames = &names + b.parameterValues = &values + return names, values, nil + } + resolver := codersdk.ParameterResolver{ Rich: db2sdk.WorkspaceBuildParameters(lastBuildParameters), } + for _, templateVersionParameter := range templateVersionParameters { tvp, err := db2sdk.TemplateVersionParameter(templateVersionParameter) if err != nil { return nil, nil, BuildError{http.StatusInternalServerError, "failed to convert template version parameter", err} } - var value string - if !b.dynamicParametersEnabled { - var err error - value, err = resolver.ValidateResolve( - tvp, - b.findNewBuildParameterValue(templateVersionParameter.Name), - ) - if err != nil { - // At this point, we've queried all the data we need from the database, - // so the only errors are problems with the request (missing data, failed - // validation, immutable parameters, etc.) 
- return nil, nil, BuildError{http.StatusBadRequest, fmt.Sprintf("Unable to validate parameter %q", templateVersionParameter.Name), err} - } - } else { - value = resolver.Resolve(tvp, b.findNewBuildParameterValue(templateVersionParameter.Name)) + value, err := resolver.ValidateResolve( + tvp, + b.findNewBuildParameterValue(templateVersionParameter.Name), + ) + if err != nil { + // At this point, we've queried all the data we need from the database, + // so the only errors are problems with the request (missing data, failed + // validation, immutable parameters, etc.) + return nil, nil, BuildError{http.StatusBadRequest, fmt.Sprintf("Unable to validate parameter %q", templateVersionParameter.Name), err} } names = append(names, templateVersionParameter.Name) @@ -977,3 +1020,36 @@ func (b *Builder) checkRunningBuild() error { } return nil } + +func (b *Builder) usingDynamicParameters() bool { + if !b.experiments.Enabled(codersdk.ExperimentDynamicParameters) { + // Experiment required + return false + } + + vals, err := b.getTemplateTerraformValues() + if err != nil { + return false + } + + if !ProvisionerVersionSupportsDynamicParameters(vals.ProvisionerdVersion) { + return false + } + + if b.dynamicParametersEnabled != nil { + return *b.dynamicParametersEnabled + } + + tpl, err := b.getTemplate() + if err != nil { + return false // Let another part of the code get this error + } + return !tpl.UseClassicParameterFlow +} + +func ProvisionerVersionSupportsDynamicParameters(version string) bool { + major, minor, err := apiversion.Parse(version) + // If the api version is not valid or less than 1.6, we need to use the static parameters + useStaticParams := err != nil || major < 1 || (major == 1 && minor < 6) + return !useStaticParams +} diff --git a/coderd/wsbuilder/wsbuilder_test.go b/coderd/wsbuilder/wsbuilder_test.go index 00b7b5f0ae08b..abe5e3fe9b8b7 100644 --- a/coderd/wsbuilder/wsbuilder_test.go +++ b/coderd/wsbuilder/wsbuilder_test.go @@ -839,6 +839,32 @@ func 
TestWorkspaceBuildWithPreset(t *testing.T) { req.NoError(err) } +func TestProvisionerVersionSupportsDynamicParameters(t *testing.T) { + t.Parallel() + + for v, dyn := range map[string]bool{ + "": false, + "na": false, + "0.0": false, + "0.10": false, + "1.4": false, + "1.5": false, + "1.6": true, + "1.7": true, + "1.8": true, + "2.0": true, + "2.17": true, + "4.0": true, + } { + t.Run(v, func(t *testing.T) { + t.Parallel() + + does := wsbuilder.ProvisionerVersionSupportsDynamicParameters(v) + require.Equal(t, dyn, does) + }) + } +} + type txExpect func(mTx *dbmock.MockStore) func expectDB(t *testing.T, opts ...txExpect) *dbmock.MockStore { diff --git a/codersdk/deployment.go b/codersdk/deployment.go index 0741bf9e3844a..89834f163affd 100644 --- a/codersdk/deployment.go +++ b/codersdk/deployment.go @@ -345,7 +345,7 @@ type DeploymentValues struct { // HTTPAddress is a string because it may be set to zero to disable. HTTPAddress serpent.String `json:"http_address,omitempty" typescript:",notnull"` AutobuildPollInterval serpent.Duration `json:"autobuild_poll_interval,omitempty"` - JobHangDetectorInterval serpent.Duration `json:"job_hang_detector_interval,omitempty"` + JobReaperDetectorInterval serpent.Duration `json:"job_hang_detector_interval,omitempty"` DERP DERP `json:"derp,omitempty" typescript:",notnull"` Prometheus PrometheusConfig `json:"prometheus,omitempty" typescript:",notnull"` Pprof PprofConfig `json:"pprof,omitempty" typescript:",notnull"` @@ -807,6 +807,12 @@ type PrebuildsConfig struct { // ReconciliationBackoffLookback determines the time window to look back when calculating // the number of failed prebuilds, which influences the backoff strategy. ReconciliationBackoffLookback serpent.Duration `json:"reconciliation_backoff_lookback" typescript:",notnull"` + + // FailureHardLimit defines the maximum number of consecutive failed prebuild attempts allowed + // before a preset is considered to be in a hard limit state. 
When a preset hits this limit, + // no new prebuilds will be created until the limit is reset. + // FailureHardLimit is disabled when set to zero. + FailureHardLimit serpent.Int64 `json:"failure_hard_limit" typescript:"failure_hard_limit"` } const ( @@ -1287,13 +1293,13 @@ func (c *DeploymentValues) Options() serpent.OptionSet { Annotations: serpent.Annotations{}.Mark(annotationFormatDuration, "true"), }, { - Name: "Job Hang Detector Interval", - Description: "Interval to poll for hung jobs and automatically terminate them.", + Name: "Job Reaper Detect Interval", + Description: "Interval to poll for hung and pending jobs and automatically terminate them.", Flag: "job-hang-detector-interval", Env: "CODER_JOB_HANG_DETECTOR_INTERVAL", Hidden: true, Default: time.Minute.String(), - Value: &c.JobHangDetectorInterval, + Value: &c.JobReaperDetectorInterval, YAML: "jobHangDetectorInterval", Annotations: serpent.Annotations{}.Mark(annotationFormatDuration, "true"), }, @@ -3086,6 +3092,17 @@ Write out the current server config as YAML to stdout.`, Annotations: serpent.Annotations{}.Mark(annotationFormatDuration, "true"), Hidden: true, }, + { + Name: "Failure Hard Limit", + Description: "Maximum number of consecutive failed prebuilds before a preset hits the hard limit; disabled when set to zero.", + Flag: "workspace-prebuilds-failure-hard-limit", + Env: "CODER_WORKSPACE_PREBUILDS_FAILURE_HARD_LIMIT", + Value: &c.Prebuilds.FailureHardLimit, + Default: "3", + Group: &deploymentGroupPrebuilds, + YAML: "failure_hard_limit", + Hidden: true, + }, } return opts diff --git a/codersdk/organizations.go b/codersdk/organizations.go index dd2eab50cf57e..728540ef2e6e1 100644 --- a/codersdk/organizations.go +++ b/codersdk/organizations.go @@ -74,8 +74,8 @@ type OrganizationMember struct { type OrganizationMemberWithUserData struct { Username string `table:"username,default_sort" json:"username"` - Name string `table:"name" json:"name"` - AvatarURL string `json:"avatar_url"` + Name string 
`table:"name" json:"name,omitempty"` + AvatarURL string `json:"avatar_url,omitempty"` Email string `json:"email"` GlobalRoles []SlimRole `json:"global_roles"` OrganizationMember `table:"m,recursive_inline"` diff --git a/codersdk/parameters.go b/codersdk/parameters.go index 881aaf99f573c..d81dc7cf55ca0 100644 --- a/codersdk/parameters.go +++ b/codersdk/parameters.go @@ -7,17 +7,121 @@ import ( "github.com/google/uuid" "github.com/coder/coder/v2/codersdk/wsjson" - previewtypes "github.com/coder/preview/types" "github.com/coder/websocket" ) -// FriendlyDiagnostic is included to guarantee it is generated in the output -// types. This is used as the type override for `previewtypes.Diagnostic`. -type FriendlyDiagnostic = previewtypes.FriendlyDiagnostic +type ParameterFormType string -// NullHCLString is included to guarantee it is generated in the output -// types. This is used as the type override for `previewtypes.HCLString`. -type NullHCLString = previewtypes.NullHCLString +const ( + ParameterFormTypeDefault ParameterFormType = "" + ParameterFormTypeRadio ParameterFormType = "radio" + ParameterFormTypeSlider ParameterFormType = "slider" + ParameterFormTypeInput ParameterFormType = "input" + ParameterFormTypeDropdown ParameterFormType = "dropdown" + ParameterFormTypeCheckbox ParameterFormType = "checkbox" + ParameterFormTypeSwitch ParameterFormType = "switch" + ParameterFormTypeMultiSelect ParameterFormType = "multi-select" + ParameterFormTypeTagSelect ParameterFormType = "tag-select" + ParameterFormTypeTextArea ParameterFormType = "textarea" + ParameterFormTypeError ParameterFormType = "error" +) + +type OptionType string + +const ( + OptionTypeString OptionType = "string" + OptionTypeNumber OptionType = "number" + OptionTypeBoolean OptionType = "bool" + OptionTypeListString OptionType = "list(string)" +) + +type DiagnosticSeverityString string + +const ( + DiagnosticSeverityError DiagnosticSeverityString = "error" + DiagnosticSeverityWarning DiagnosticSeverityString 
= "warning" +) + +// FriendlyDiagnostic == previewtypes.FriendlyDiagnostic +// Copied to avoid import deps +type FriendlyDiagnostic struct { + Severity DiagnosticSeverityString `json:"severity"` + Summary string `json:"summary"` + Detail string `json:"detail"` + + Extra DiagnosticExtra `json:"extra"` +} + +type DiagnosticExtra struct { + Code string `json:"code"` +} + +// NullHCLString == `previewtypes.NullHCLString`. +type NullHCLString struct { + Value string `json:"value"` + Valid bool `json:"valid"` +} + +type PreviewParameter struct { + PreviewParameterData + Value NullHCLString `json:"value"` + Diagnostics []FriendlyDiagnostic `json:"diagnostics"` +} + +type PreviewParameterData struct { + Name string `json:"name"` + DisplayName string `json:"display_name"` + Description string `json:"description"` + Type OptionType `json:"type"` + FormType ParameterFormType `json:"form_type"` + Styling PreviewParameterStyling `json:"styling"` + Mutable bool `json:"mutable"` + DefaultValue NullHCLString `json:"default_value"` + Icon string `json:"icon"` + Options []PreviewParameterOption `json:"options"` + Validations []PreviewParameterValidation `json:"validations"` + Required bool `json:"required"` + // legacy_variable_name was removed (= 14) + Order int64 `json:"order"` + Ephemeral bool `json:"ephemeral"` +} + +type PreviewParameterStyling struct { + Placeholder *string `json:"placeholder,omitempty"` + Disabled *bool `json:"disabled,omitempty"` + Label *string `json:"label,omitempty"` +} + +type PreviewParameterOption struct { + Name string `json:"name"` + Description string `json:"description"` + Value NullHCLString `json:"value"` + Icon string `json:"icon"` +} + +type PreviewParameterValidation struct { + Error string `json:"validation_error"` + + // All validation attributes are optional. 
+ Regex *string `json:"validation_regex"` + Min *int64 `json:"validation_min"` + Max *int64 `json:"validation_max"` + Monotonic *string `json:"validation_monotonic"` +} + +type DynamicParametersRequest struct { + // ID identifies the request. The response contains the same + // ID so that the client can match it to the request. + ID int `json:"id"` + Inputs map[string]string `json:"inputs"` +} + +type DynamicParametersResponse struct { + ID int `json:"id"` + Diagnostics []FriendlyDiagnostic `json:"diagnostics"` + Parameters []PreviewParameter `json:"parameters"` + // TODO: Workspace tags +} func (c *Client) TemplateVersionDynamicParameters(ctx context.Context, userID, version uuid.UUID) (*wsjson.Stream[DynamicParametersResponse, DynamicParametersRequest], error) { conn, err := c.Dial(ctx, fmt.Sprintf("/api/v2/users/%s/templateversions/%s/parameters", userID, version), nil) diff --git a/codersdk/rbacresources_gen.go b/codersdk/rbacresources_gen.go index 54f65767928d6..95792bb8e2a7b 100644 --- a/codersdk/rbacresources_gen.go +++ b/codersdk/rbacresources_gen.go @@ -49,7 +49,9 @@ const ( ActionApplicationConnect RBACAction = "application_connect" ActionAssign RBACAction = "assign" ActionCreate RBACAction = "create" + ActionCreateAgent RBACAction = "create_agent" ActionDelete RBACAction = "delete" + ActionDeleteAgent RBACAction = "delete_agent" ActionRead RBACAction = "read" ActionReadPersonal RBACAction = "read_personal" ActionSSH RBACAction = "ssh" @@ -90,16 +92,16 @@ var RBACResourceActions = map[RBACResource][]RBACAction{ ResourceOrganization: {ActionCreate, ActionDelete, ActionRead, ActionUpdate}, ResourceOrganizationMember: {ActionCreate, ActionDelete, ActionRead, ActionUpdate}, ResourceProvisionerDaemon: {ActionCreate, ActionDelete, ActionRead, ActionUpdate}, - ResourceProvisionerJobs: {ActionRead}, + ResourceProvisionerJobs: {ActionCreate, ActionRead, ActionUpdate}, ResourceReplicas: {ActionRead}, ResourceSystem: {ActionCreate, ActionDelete, ActionRead, 
ActionUpdate}, ResourceTailnetCoordinator: {ActionCreate, ActionDelete, ActionRead, ActionUpdate}, ResourceTemplate: {ActionCreate, ActionDelete, ActionRead, ActionUpdate, ActionUse, ActionViewInsights}, ResourceUser: {ActionCreate, ActionDelete, ActionRead, ActionReadPersonal, ActionUpdate, ActionUpdatePersonal}, ResourceWebpushSubscription: {ActionCreate, ActionDelete, ActionRead}, - ResourceWorkspace: {ActionApplicationConnect, ActionCreate, ActionDelete, ActionRead, ActionSSH, ActionWorkspaceStart, ActionWorkspaceStop, ActionUpdate}, + ResourceWorkspace: {ActionApplicationConnect, ActionCreate, ActionCreateAgent, ActionDelete, ActionDeleteAgent, ActionRead, ActionSSH, ActionWorkspaceStart, ActionWorkspaceStop, ActionUpdate}, ResourceWorkspaceAgentDevcontainers: {ActionCreate}, ResourceWorkspaceAgentResourceMonitor: {ActionCreate, ActionRead, ActionUpdate}, - ResourceWorkspaceDormant: {ActionApplicationConnect, ActionCreate, ActionDelete, ActionRead, ActionSSH, ActionWorkspaceStart, ActionWorkspaceStop, ActionUpdate}, + ResourceWorkspaceDormant: {ActionApplicationConnect, ActionCreate, ActionCreateAgent, ActionDelete, ActionDeleteAgent, ActionRead, ActionSSH, ActionWorkspaceStart, ActionWorkspaceStop, ActionUpdate}, ResourceWorkspaceProxy: {ActionCreate, ActionDelete, ActionRead, ActionUpdate}, } diff --git a/codersdk/templateversions.go b/codersdk/templateversions.go index 42b381fadebce..de8bb7b970957 100644 --- a/codersdk/templateversions.go +++ b/codersdk/templateversions.go @@ -9,8 +9,6 @@ import ( "time" "github.com/google/uuid" - - previewtypes "github.com/coder/preview/types" ) type TemplateVersionWarning string @@ -125,20 +123,6 @@ func (c *Client) CancelTemplateVersion(ctx context.Context, version uuid.UUID) e return nil } -type DynamicParametersRequest struct { - // ID identifies the request. The response contains the same - // ID so that the client can match it to the request. 
- ID int `json:"id"` - Inputs map[string]string `json:"inputs"` -} - -type DynamicParametersResponse struct { - ID int `json:"id"` - Diagnostics previewtypes.Diagnostics `json:"diagnostics"` - Parameters []previewtypes.Parameter `json:"parameters"` - // TODO: Workspace tags -} - // TemplateVersionParameters returns parameters a template version exposes. func (c *Client) TemplateVersionRichParameters(ctx context.Context, version uuid.UUID) ([]TemplateVersionParameter, error) { res, err := c.Request(ctx, http.MethodGet, fmt.Sprintf("/api/v2/templateversions/%s/rich-parameters", version), nil) diff --git a/codersdk/users.go b/codersdk/users.go index 3d9d95e683066..3207e3fbabaa1 100644 --- a/codersdk/users.go +++ b/codersdk/users.go @@ -40,7 +40,7 @@ type UsersRequest struct { type MinimalUser struct { ID uuid.UUID `json:"id" validate:"required" table:"id" format:"uuid"` Username string `json:"username" validate:"required" table:"username,default_sort"` - AvatarURL string `json:"avatar_url" format:"uri"` + AvatarURL string `json:"avatar_url,omitempty" format:"uri"` } // ReducedUser omits role and organization information. Roles are deduced from @@ -49,11 +49,11 @@ type MinimalUser struct { // required by the frontend. 
type ReducedUser struct { MinimalUser `table:"m,recursive_inline"` - Name string `json:"name"` + Name string `json:"name,omitempty"` Email string `json:"email" validate:"required" table:"email" format:"email"` CreatedAt time.Time `json:"created_at" validate:"required" table:"created at" format:"date-time"` UpdatedAt time.Time `json:"updated_at" table:"updated at" format:"date-time"` - LastSeenAt time.Time `json:"last_seen_at" format:"date-time"` + LastSeenAt time.Time `json:"last_seen_at,omitempty" format:"date-time"` Status UserStatus `json:"status" table:"status" enums:"active,suspended"` LoginType LoginType `json:"login_type"` diff --git a/codersdk/workspacebuilds.go b/codersdk/workspacebuilds.go index 7b67dc3b86171..ee31876f44fab 100644 --- a/codersdk/workspacebuilds.go +++ b/codersdk/workspacebuilds.go @@ -58,7 +58,7 @@ type WorkspaceBuild struct { WorkspaceName string `json:"workspace_name"` WorkspaceOwnerID uuid.UUID `json:"workspace_owner_id" format:"uuid"` WorkspaceOwnerName string `json:"workspace_owner_name"` - WorkspaceOwnerAvatarURL string `json:"workspace_owner_avatar_url"` + WorkspaceOwnerAvatarURL string `json:"workspace_owner_avatar_url,omitempty"` TemplateVersionID uuid.UUID `json:"template_version_id" format:"uuid"` TemplateVersionName string `json:"template_version_name"` BuildNumber int32 `json:"build_number"` diff --git a/codersdk/workspaces.go b/codersdk/workspaces.go index 311c4bcba35d4..e0f1b9b1e2c2a 100644 --- a/codersdk/workspaces.go +++ b/codersdk/workspaces.go @@ -41,6 +41,7 @@ type Workspace struct { TemplateAllowUserCancelWorkspaceJobs bool `json:"template_allow_user_cancel_workspace_jobs"` TemplateActiveVersionID uuid.UUID `json:"template_active_version_id" format:"uuid"` TemplateRequireActiveVersion bool `json:"template_require_active_version"` + TemplateUseClassicParameterFlow bool `json:"template_use_classic_parameter_flow"` LatestBuild WorkspaceBuild `json:"latest_build"` LatestAppStatus *WorkspaceAppStatus 
`json:"latest_app_status"` Outdated bool `json:"outdated"` @@ -109,6 +110,10 @@ type CreateWorkspaceBuildRequest struct { LogLevel ProvisionerLogLevel `json:"log_level,omitempty" validate:"omitempty,oneof=debug"` // TemplateVersionPresetID is the ID of the template version preset to use for the build. TemplateVersionPresetID uuid.UUID `json:"template_version_preset_id,omitempty" format:"uuid"` + // EnableDynamicParameters skips some of the static parameter checking. + // It will default to whatever the template has marked as the default experience. + // Requires the "dynamic-experiment" to be used. + EnableDynamicParameters *bool `json:"enable_dynamic_parameters,omitempty"` } type WorkspaceOptions struct { diff --git a/docs/admin/provisioners/manage-provisioner-jobs.md b/docs/admin/provisioners/manage-provisioner-jobs.md index 05d5d9dddff9f..b2581e6020fc6 100644 --- a/docs/admin/provisioners/manage-provisioner-jobs.md +++ b/docs/admin/provisioners/manage-provisioner-jobs.md @@ -48,6 +48,10 @@ Each provisioner job has a lifecycle state: | **Failed** | Provisioner encountered an error while executing the job. | | **Canceled** | Job was manually terminated by an admin. | +The following diagram shows how a provisioner job transitions between lifecycle states: + +![Provisioner jobs state transitions](../../images/admin/provisioners/provisioner-jobs-status-flow.png) + ## When to cancel provisioner jobs A job might need to be cancelled when: diff --git a/docs/admin/setup/index.md b/docs/admin/setup/index.md index 96000292266e2..1a34920e733e8 100644 --- a/docs/admin/setup/index.md +++ b/docs/admin/setup/index.md @@ -140,7 +140,7 @@ To configure Coder behind a corporate proxy, set the environment variables `HTTP_PROXY` and `HTTPS_PROXY`. Be sure to restart the server. Lowercase values (e.g. `http_proxy`) are also respected in this case. -## External Authentication +## Continue your setup with external authentication Coder supports external authentication via OAuth2.0. 
This allows enabling integrations with Git providers, such as GitHub, GitLab, and Bitbucket. diff --git a/docs/admin/templates/extending-templates/devcontainers.md b/docs/admin/templates/extending-templates/devcontainers.md index 4894a012476a1..d4284bf48efde 100644 --- a/docs/admin/templates/extending-templates/devcontainers.md +++ b/docs/admin/templates/extending-templates/devcontainers.md @@ -122,3 +122,5 @@ resource "docker_container" "workspace" { ## Next Steps - [Dev Containers Integration](../../../user-guides/devcontainers/index.md) +- [Working with Dev Containers](../../../user-guides/devcontainers/working-with-dev-containers.md) +- [Troubleshooting Dev Containers](../../../user-guides/devcontainers/troubleshooting-dev-containers.md) diff --git a/docs/admin/templates/extending-templates/docker-in-workspaces.md b/docs/admin/templates/extending-templates/docker-in-workspaces.md index 4c88c2471de3f..51b1634d20371 100644 --- a/docs/admin/templates/extending-templates/docker-in-workspaces.md +++ b/docs/admin/templates/extending-templates/docker-in-workspaces.md @@ -266,6 +266,45 @@ Before using Podman, please review the following documentation: > For more information around the requirements of rootless podman pods, see: > [How to run Podman inside of Kubernetes](https://www.redhat.com/sysadmin/podman-inside-kubernetes) +### Rootless Podman on Bottlerocket nodes + +Rootless containers rely on Linux user-namespaces. +[Bottlerocket](https://github.com/bottlerocket-os/bottlerocket) disables them by default (`user.max_user_namespaces = 0`), so Podman commands will return an error until you raise the limit: + +```output +cannot clone: Invalid argument +user namespaces are not enabled in /proc/sys/user/max_user_namespaces +``` + +1. Add a `user.max_user_namespaces` value to your Bottlerocket user data to use rootless Podman on the node: + + ```toml + [settings.kernel.sysctl] + "user.max_user_namespaces" = "65536" + ``` + +1. Reboot the node. +1. 
Verify that the value is more than `0`: + + ```shell + sysctl -n user.max_user_namespaces + ``` + +For Karpenter-managed Bottlerocket nodes, add the `user.max_user_namespaces` setting in your `EC2NodeClass`: + +```yaml +apiVersion: karpenter.k8s.aws/v1 +kind: EC2NodeClass +metadata: + name: bottlerocket-rootless +spec: + amiFamily: Bottlerocket # required for BR-style userData + # … + userData: | + [settings.kernel] + sysctl = { "user.max_user_namespaces" = "65536" } +``` + ## Privileged sidecar container A diff --git a/docs/admin/templates/extending-templates/parameters.md b/docs/admin/templates/extending-templates/parameters.md index b5e6473ab6b4f..9c1235d51a915 100644 --- a/docs/admin/templates/extending-templates/parameters.md +++ b/docs/admin/templates/extending-templates/parameters.md @@ -252,7 +252,7 @@ data "coder_parameter" "force_rebuild" { ## Validating parameters -Coder supports rich parameters with multiple validation modes: min, max, +Coder supports parameters with multiple validation modes: min, max, monotonic numbers, and regular expressions. ### Number @@ -391,3 +391,547 @@ parameters in one of two ways: ``` Or set the [environment variable](../../setup/index.md), `CODER_EXPERIMENTS=auto-fill-parameters` + +## Dynamic Parameters + +Dynamic Parameters enhances Coder's existing parameter system with real-time validation, +conditional parameter behavior, and richer input types. +This feature allows template authors to create more interactive and responsive workspace creation experiences. + +### Enable Dynamic Parameters (Early Access) + +To use Dynamic Parameters, enable the experiment flag or set the environment variable. + +Note that as of v2.22.0, Dynamic parameters are an unsafe experiment and will not be enabled with the experiment wildcard. + +
+
+#### Flag
+
+```shell
+coder server --experiments=dynamic-parameters
+```
+
+#### Env Variable
+
+```shell
+CODER_EXPERIMENTS=dynamic-parameters
+```
+
+
+
+Dynamic Parameters also require version >=2.4.0 of the Coder provider.
+
+Enable the experiment, then include the following at the top of your template:
+
+```terraform
+terraform {
+  required_providers {
+    coder = {
+      source  = "coder/coder"
+      version = ">=2.4.0"
+    }
+  }
+}
+```
+
+Once enabled, users can toggle between the experimental and classic interfaces during
+workspace creation using an escape hatch in the workspace creation form.
+
+## Features and Capabilities
+
+Dynamic Parameters introduces three primary enhancements to the standard parameter system:
+
+- **Conditional Parameters**
+
+  - Parameters can respond to changes in other parameters
+  - Show or hide parameters based on other selections
+  - Modify validation rules conditionally
+  - Create branching paths in workspace creation forms
+
+- **Reference User Properties**
+
+  - Read user data at build time from [`coder_workspace_owner`](https://registry.terraform.io/providers/coder/coder/latest/docs/data-sources/workspace_owner)
+  - Conditionally hide parameters based on user's role
+  - Change parameter options based on user groups
+  - Reference user name in parameters
+
+- **Additional Form Inputs**
+
+  - Searchable dropdown lists for easier selection
+  - Multi-select options for choosing multiple items
+  - Secret text inputs for sensitive information
+  - Key-value pair inputs for complex data
+  - Button parameters for toggling sections
+
+## Available Form Input Types
+
+Dynamic Parameters supports a variety of form types to create rich, interactive user experiences.
+
+You can specify the form type using the `form_type` property.
+Different parameter types support different form types.
+
+The "Options" column in the table below indicates whether the form type requires options to be defined (Yes) or doesn't support/require them (No).
When required, options are specified using one or more `option` blocks in your parameter definition, where each option has a `name` (displayed to the user) and a `value` (used in your template logic).
+
+| Form Type | Parameter Types | Options | Notes |
+|----------------|--------------------------------------------|---------|------------------------------------------------------------------------------------------------------------------------------|
+| `checkbox` | `bool` | No | A single checkbox for boolean parameters. Default for boolean parameters. |
+| `dropdown` | `string`, `number` | Yes | Searchable dropdown list for choosing a single option from a list. Default for `string` or `number` parameters with options. |
+| `input` | `string`, `number` | No | Standard single-line text input field. Default for string/number parameters without options. |
+| `key-value` | `string` | No | For entering key-value pairs (as JSON). |
+| `multi-select` | `list(string)` | Yes | Select multiple items from a list with checkboxes. |
+| `radio` | `string`, `number`, `bool`, `list(string)` | Yes | Radio buttons for selecting a single option with all choices visible at once. |
+| `slider` | `number` | No | Slider selection with min/max validation for numeric values. |
+| `switch` | `bool` | No | Toggle switch alternative for boolean parameters. |
+| `tag-select` | `list(string)` | No | Default for list(string) parameters without options. |
+| `textarea` | `string` | No | Multi-line text input field for longer content. |
+
+### Form Type Examples
+
+
`checkbox`: A single checkbox for boolean values
+
+```tf
+data "coder_parameter" "enable_gpu" {
+  name         = "enable_gpu"
+  display_name = "Enable GPU"
+  type         = "bool"
+  form_type    = "checkbox" # This is the default for boolean parameters
+  default      = false
+}
+```
+
+
+ +
`dropdown`: A searchable select menu for choosing a single option from a list
+
+```tf
+data "coder_parameter" "region" {
+  name         = "region"
+  display_name = "Region"
+  description  = "Select a region"
+  type         = "string"
+  form_type    = "dropdown" # This is the default for string parameters with options
+
+  option {
+    name  = "US East"
+    value = "us-east-1"
+  }
+  option {
+    name  = "US West"
+    value = "us-west-2"
+  }
+}
+```
+
+
+ +
`input`: A standard text input field
+
+```tf
+data "coder_parameter" "custom_domain" {
+  name         = "custom_domain"
+  display_name = "Custom Domain"
+  type         = "string"
+  form_type    = "input" # This is the default for string parameters without options
+  default      = ""
+}
+```
+
+
+ +
`key-value`: Input for entering key-value pairs
+
+```tf
+data "coder_parameter" "environment_vars" {
+  name         = "environment_vars"
+  display_name = "Environment Variables"
+  type         = "string"
+  form_type    = "key-value"
+  default      = jsonencode({"NODE_ENV": "development"})
+}
+```
+
+
+ +
`multi-select`: Checkboxes for selecting multiple options from a list
+
+```tf
+data "coder_parameter" "tools" {
+  name         = "tools"
+  display_name = "Developer Tools"
+  type         = "list(string)"
+  form_type    = "multi-select"
+  default      = jsonencode(["git", "docker"])
+
+  option {
+    name  = "Git"
+    value = "git"
+  }
+  option {
+    name  = "Docker"
+    value = "docker"
+  }
+  option {
+    name  = "Kubernetes CLI"
+    value = "kubectl"
+  }
+}
+```
+
+
+ +
`password`: A text input that masks sensitive information
+
+```tf
+data "coder_parameter" "api_key" {
+  name         = "api_key"
+  display_name = "API Key"
+  type         = "string"
+  form_type    = "password"
+  secret       = true
+}
+```
+
+
+ +
`radio`: Radio buttons for selecting a single option with high visibility
+
+```tf
+data "coder_parameter" "environment" {
+  name         = "environment"
+  display_name = "Environment"
+  type         = "string"
+  form_type    = "radio"
+  default      = "dev"
+
+  option {
+    name  = "Development"
+    value = "dev"
+  }
+  option {
+    name  = "Staging"
+    value = "staging"
+  }
+}
+```
+
+
+ +
`slider`: A slider for selecting numeric values within a range
+
+```tf
+data "coder_parameter" "cpu_cores" {
+  name         = "cpu_cores"
+  display_name = "CPU Cores"
+  type         = "number"
+  form_type    = "slider"
+  default      = 2
+  validation {
+    min = 1
+    max = 8
+  }
+}
+```
+
+
+ +
`switch`: A toggle switch for boolean values
+
+```tf
+data "coder_parameter" "advanced_mode" {
+  name         = "advanced_mode"
+  display_name = "Advanced Mode"
+  type         = "bool"
+  form_type    = "switch"
+  default      = false
+}
+```
+
+
+ +
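The table above lists `tag-select` as the default form type for `list(string)` parameters without options, but the examples here do not include one. A minimal sketch, assuming it is declared like the other form types; the parameter name `labels` and its default value are hypothetical:

```tf
data "coder_parameter" "labels" {
  name         = "labels" # hypothetical parameter name
  display_name = "Labels"
  type         = "list(string)"
  form_type    = "tag-select" # no option blocks: users enter free-form tags
  default      = jsonencode(["dev"]) # hypothetical default, encoded like the multi-select example
}
```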
`textarea`: A multi-line text input field for longer content
+
+```tf
+data "coder_parameter" "init_script" {
+  name         = "init_script"
+  display_name = "Initialization Script"
+  type         = "string"
+  form_type    = "textarea"
+  default      = "#!/bin/bash\necho 'Hello World'"
+}
+```
+
+
+
+## Dynamic Parameter Use Case Examples
+
+
Conditional Parameters: Region and Instance Types
+
+This example shows instance types based on the selected region:
+
+```tf
+data "coder_parameter" "region" {
+  name         = "region"
+  display_name = "Region"
+  description  = "Select a region for your workspace"
+  type         = "string"
+  default      = "us-east-1"
+
+  option {
+    name  = "US East (N. Virginia)"
+    value = "us-east-1"
+  }
+
+  option {
+    name  = "US West (Oregon)"
+    value = "us-west-2"
+  }
+}
+
+data "coder_parameter" "instance_type" {
+  name         = "instance_type"
+  display_name = "Instance Type"
+  description  = "Select an instance type available in the selected region"
+  type         = "string"
+
+  # This option will only appear when us-east-1 is selected
+  dynamic "option" {
+    for_each = data.coder_parameter.region.value == "us-east-1" ? [1] : []
+    content {
+      name  = "t3.large (US East)"
+      value = "t3.large"
+    }
+  }
+
+  # This option will only appear when us-west-2 is selected
+  dynamic "option" {
+    for_each = data.coder_parameter.region.value == "us-west-2" ? [1] : []
+    content {
+      name  = "t3.medium (US West)"
+      value = "t3.medium"
+    }
+  }
+}
+```
+
+
+ +
Advanced Options Toggle
+
+This example shows how to create an advanced options section:
+
+```tf
+data "coder_parameter" "show_advanced" {
+  name         = "show_advanced"
+  display_name = "Show Advanced Options"
+  description  = "Enable to show advanced configuration options"
+  type         = "bool"
+  default      = false
+  order        = 0
+}
+
+data "coder_parameter" "advanced_setting" {
+  # This parameter is only visible when show_advanced is true
+  count        = data.coder_parameter.show_advanced.value ? 1 : 0
+  name         = "advanced_setting"
+  display_name = "Advanced Setting"
+  description  = "An advanced configuration option"
+  type         = "string"
+  default      = "default_value"
+  mutable      = true
+  order        = 1
+}
+```
+
+
+ +
Multi-select IDE Options
+
+This example allows selecting multiple IDEs to install:
+
+```tf
+data "coder_parameter" "ides" {
+  name         = "ides"
+  display_name = "IDEs to Install"
+  description  = "Select which IDEs to install in your workspace"
+  type         = "list(string)"
+  default      = jsonencode(["vscode"])
+  mutable      = true
+  form_type    = "multi-select"
+
+  option {
+    name  = "VS Code"
+    value = "vscode"
+    icon  = "/icon/vscode.png"
+  }
+
+  option {
+    name  = "JetBrains IntelliJ"
+    value = "intellij"
+    icon  = "/icon/intellij.png"
+  }
+
+  option {
+    name  = "JupyterLab"
+    value = "jupyter"
+    icon  = "/icon/jupyter.png"
+  }
+}
+```
+
+
+ +
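The feature list mentions referencing the user name in parameters. A minimal sketch of that idea, assuming the `name` attribute of `coder_workspace_owner` can be interpolated into a parameter default; the parameter `project_dir` and the path layout are hypothetical:

```tf
data "coder_workspace_owner" "me" {}

data "coder_parameter" "project_dir" {
  name         = "project_dir" # hypothetical parameter name
  display_name = "Project Directory"
  type         = "string"
  form_type    = "input"
  # Hypothetical path; pre-fills the field with a per-user default
  default      = "/home/${data.coder_workspace_owner.me.name}/project"
}
```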
Team-specific Resources
+
+This example filters resources based on user group membership:
+
+```tf
+data "coder_parameter" "instance_type" {
+  name         = "instance_type"
+  display_name = "Instance Type"
+  description  = "Select an instance type for your workspace"
+  type         = "string"
+
+  # Show GPU options only if user belongs to the "data-science" group
+  dynamic "option" {
+    for_each = contains(data.coder_workspace_owner.me.groups, "data-science") ? [1] : []
+    content {
+      name  = "p3.2xlarge (GPU)"
+      value = "p3.2xlarge"
+    }
+  }
+
+  # Standard options for all users
+  option {
+    name  = "t3.medium (Standard)"
+    value = "t3.medium"
+  }
+}
+```
+
+### Advanced Usage Patterns
+
+
Creating Branching Paths
+
+For templates serving multiple teams or use cases, you can create comprehensive branching paths:
+
+```tf
+data "coder_parameter" "environment_type" {
+  name         = "environment_type"
+  display_name = "Environment Type"
+  description  = "Select your preferred development environment"
+  type         = "string"
+  default      = "container"
+
+  option {
+    name  = "Container"
+    value = "container"
+  }
+
+  option {
+    name  = "Virtual Machine"
+    value = "vm"
+  }
+}
+
+# Container-specific parameters
+data "coder_parameter" "container_image" {
+  # Only show when container environment is selected
+  count        = data.coder_parameter.environment_type.value == "container" ? 1 : 0
+  name         = "container_image"
+  display_name = "Container Image"
+  description  = "Select a container image for your environment"
+  type         = "string"
+  default      = "ubuntu:latest"
+
+  option {
+    name  = "Ubuntu"
+    value = "ubuntu:latest"
+  }
+
+  option {
+    name  = "Python"
+    value = "python:3.9"
+  }
+}
+
+# VM-specific parameters
+data "coder_parameter" "vm_image" {
+  # Only show when VM environment is selected
+  count        = data.coder_parameter.environment_type.value == "vm" ? 1 : 0
+  name         = "vm_image"
+  display_name = "VM Image"
+  description  = "Select a VM image for your environment"
+  type         = "string"
+  default      = "ubuntu-20.04"
+
+  option {
+    name  = "Ubuntu 20.04"
+    value = "ubuntu-20.04"
+  }
+
+  option {
+    name  = "Debian 11"
+    value = "debian-11"
+  }
+}
+```
+
+
+ +
Conditional Validation
+
+Adjust validation rules dynamically based on parameter values:
+
+```tf
+data "coder_parameter" "team" {
+  name         = "team"
+  display_name = "Team"
+  type         = "string"
+  default      = "engineering"
+
+  option {
+    name  = "Engineering"
+    value = "engineering"
+  }
+
+  option {
+    name  = "Data Science"
+    value = "data-science"
+  }
+}
+
+data "coder_parameter" "cpu_count" {
+  name         = "cpu_count"
+  display_name = "CPU Count"
+  type         = "number"
+  default      = 2
+
+  # Engineering team has lower limits
+  dynamic "validation" {
+    for_each = data.coder_parameter.team.value == "engineering" ? [1] : []
+    content {
+      min = 1
+      max = 4
+    }
+  }
+
+  # Data Science team has higher limits
+  dynamic "validation" {
+    for_each = data.coder_parameter.team.value == "data-science" ? [1] : []
+    content {
+      min = 2
+      max = 8
+    }
+  }
+}
+```
+
+
diff --git a/docs/admin/templates/extending-templates/prebuilt-workspaces.md b/docs/admin/templates/extending-templates/prebuilt-workspaces.md index 3fd82d62d1943..57f3dc0b3109f 100644 --- a/docs/admin/templates/extending-templates/prebuilt-workspaces.md +++ b/docs/admin/templates/extending-templates/prebuilt-workspaces.md @@ -142,7 +142,7 @@ To prevent this, add a `lifecycle` block with `ignore_changes`: ```hcl resource "docker_container" "workspace" { lifecycle { - ignore_changes = all + ignore_changes = [env, image] # include all fields which caused drift } count = data.coder_workspace.me.start_count @@ -151,19 +151,8 @@ resource "docker_container" "workspace" { } ``` -For more targeted control, specify which attributes to ignore: - -```hcl -resource "docker_container" "workspace" { - lifecycle { - ignore_changes = [name] - } - - count = data.coder_workspace.me.start_count - name = "coder-${data.coder_workspace_owner.me.name}-${lower(data.coder_workspace.me.name)}" - ... -} -``` +Limit the scope of `ignore_changes` to include only the fields specified in the notification. +If you include too many fields, Terraform might ignore changes that wouldn't otherwise cause drift. Learn more about `ignore_changes` in the [Terraform documentation](https://developer.hashicorp.com/terraform/language/meta-arguments/lifecycle#ignore_changes). diff --git a/docs/admin/users/sessions-tokens.md b/docs/admin/users/sessions-tokens.md index 6332b8182fc17..8152c92290877 100644 --- a/docs/admin/users/sessions-tokens.md +++ b/docs/admin/users/sessions-tokens.md @@ -61,7 +61,7 @@ behalf of other users. Use the API for earlier versions of Coder. 
#### CLI ```sh -coder tokens create my-token --user +coder tokens create --name my-token --user ``` See the full CLI reference for diff --git a/docs/images/admin/provisioners/provisioner-jobs-status-flow.png b/docs/images/admin/provisioners/provisioner-jobs-status-flow.png new file mode 100644 index 0000000000000..384a7c9efba82 Binary files /dev/null and b/docs/images/admin/provisioners/provisioner-jobs-status-flow.png differ diff --git a/docs/install/kubernetes.md b/docs/install/kubernetes.md index 176fc7c452805..92e97e3cf902c 100644 --- a/docs/install/kubernetes.md +++ b/docs/install/kubernetes.md @@ -133,7 +133,7 @@ We support two release channels: mainline and stable - read the helm install coder coder-v2/coder \ --namespace coder \ --values values.yaml \ - --version 2.20.0 + --version 2.22.1 ``` - **Stable** Coder release: diff --git a/docs/install/releases/feature-stages.md b/docs/install/releases/feature-stages.md index 5730a5d76288e..216b9c01d28af 100644 --- a/docs/install/releases/feature-stages.md +++ b/docs/install/releases/feature-stages.md @@ -24,32 +24,37 @@ If you encounter an issue with any Coder feature, please submit a Early access features are neither feature-complete nor stable. We do not recommend using early access features in production deployments. -Coder sometimes releases early access features that are available for use, but are disabled by default. -You shouldn't use early access features in production because they might cause performance or stability issues. -Early access features can be mostly feature-complete, but require further internal testing and remain in the early access stage for at least one month. +Coder sometimes releases early access features that are available for use, but +are disabled by default. You shouldn't use early access features in production +because they might cause performance or stability issues. 
Early access features +can be mostly feature-complete, but require further internal testing and remain +in the early access stage for at least one month. -Coder may make significant changes or revert features to a feature flag at any time. +Coder may make significant changes or revert features to a feature flag at any +time. If you plan to activate an early access feature, we suggest that you use a staging deployment.
To enable early access features: -Use the [Coder CLI](../../install/cli.md) `--experiments` flag to enable early access features: +Use the [Coder CLI](../../install/cli.md) `--experiments` flag to enable early +access features: - Enable all early access features: - ```shell - coder server --experiments=* - ``` + ```shell + coder server --experiments=* + ``` - Enable multiple early access features: - ```shell - coder server --experiments=feature1,feature2 - ``` + ```shell + coder server --experiments=feature1,feature2 + ``` -You can also use the `CODER_EXPERIMENTS` [environment variable](../../admin/setup/index.md). +You can also use the `CODER_EXPERIMENTS` +[environment variable](../../admin/setup/index.md). You can opt-out of a feature after you've enabled it. @@ -60,7 +65,9 @@ You can opt-out of a feature after you've enabled it. -Currently no experimental features are available in the latest mainline or stable release. +| Feature | Description | Available in | +|-----------------------|----------------------------------------------|--------------| +| `workspace-prebuilds` | Enables the new workspace prebuilds feature. | mainline | @@ -68,24 +75,32 @@ Currently no experimental features are available in the latest mainline or stabl - **Stable**: No - **Production-ready**: Not fully -- **Support**: Documentation, [Discord](https://discord.gg/coder), and [GitHub issues](https://github.com/coder/coder/issues) +- **Support**: Documentation, [Discord](https://discord.gg/coder), and + [GitHub issues](https://github.com/coder/coder/issues) Beta features are open to the public and are tagged with a `Beta` label. -They’re in active development and subject to minor changes. -They might contain minor bugs, but are generally ready for use. +They’re in active development and subject to minor changes. They might contain +minor bugs, but are generally ready for use. -Beta features are often ready for general availability within two-three releases. 
-You should test beta features in staging environments. -You can use beta features in production, but should set expectations and inform users that some features may be incomplete. +Beta features are often ready for general availability within two-three +releases. You should test beta features in staging environments. You can use +beta features in production, but should set expectations and inform users that +some features may be incomplete. -We keep documentation about beta features up-to-date with the latest information, including planned features, limitations, and workarounds. -If you encounter an issue, please contact your [Coder account team](https://coder.com/contact), reach out on [Discord](https://discord.gg/coder), or create a [GitHub issues](https://github.com/coder/coder/issues) if there isn't one already. -While we will do our best to provide support with beta features, most issues will be escalated to the product team. -Beta features are not covered within service-level agreements (SLA). +We keep documentation about beta features up-to-date with the latest +information, including planned features, limitations, and workarounds. If you +encounter an issue, please contact your +[Coder account team](https://coder.com/contact), reach out on +[Discord](https://discord.gg/coder), or create a +[GitHub issue](https://github.com/coder/coder/issues) if there isn't one +already. While we will do our best to provide support with beta features, most +issues will be escalated to the product team. Beta features are not covered +within service-level agreements (SLA). -Most beta features are enabled by default. -Beta features are announced through the [Coder Changelog](https://coder.com/changelog), and more information is available in the documentation. +Most beta features are enabled by default. Beta features are announced through +the [Coder Changelog](https://coder.com/changelog), and more information is +available in the documentation.
## General Availability (GA) @@ -93,16 +108,25 @@ Beta features are announced through the [Coder Changelog](https://coder.com/chan - **Production-ready**: Yes - **Support**: Yes, [based on license](https://coder.com/pricing). -All features that are not explicitly tagged as `Early access` or `Beta` are considered generally available (GA). -They have been tested, are stable, and are enabled by default. +All features that are not explicitly tagged as `Early access` or `Beta` are +considered generally available (GA). They have been tested, are stable, and are +enabled by default. -If your Coder license includes an SLA, please consult it for an outline of specific expectations. +If your Coder license includes an SLA, please consult it for an outline of +specific expectations. -For support, consult our knowledgeable and growing community on [Discord](https://discord.gg/coder), or create a [GitHub issue](https://github.com/coder/coder/issues) if one doesn't exist already. -Customers with a valid Coder license, can submit a support request or contact your [account team](https://coder.com/contact). +For support, consult our knowledgeable and growing community on +[Discord](https://discord.gg/coder), or create a +[GitHub issue](https://github.com/coder/coder/issues) if one doesn't exist +already. Customers with a valid Coder license can submit a support request or +contact their [account team](https://coder.com/contact). -We intend [Coder documentation](../../README.md) to be the [single source of truth](https://en.wikipedia.org/wiki/Single_source_of_truth) and all features should have some form of complete documentation that outlines how to use or implement a feature. -If you discover an error or if you have a suggestion that could improve the documentation, please [submit a GitHub issue](https://github.com/coder/internal/issues/new?title=request%28docs%29%3A+request+title+here&labels=["customer-feedback","docs"]&body=please+enter+your+request+here).
+We intend [Coder documentation](../../README.md) to be the +[single source of truth](https://en.wikipedia.org/wiki/Single_source_of_truth) +and all features should have some form of complete documentation that outlines +how to use or implement a feature. If you discover an error or if you have a +suggestion that could improve the documentation, please +[submit a GitHub issue](https://github.com/coder/internal/issues/new?title=request%28docs%29%3A+request+title+here&labels=["customer-feedback","docs"]&body=please+enter+your+request+here). -Some GA features can be disabled for air-gapped deployments. -Consult the feature's documentation or submit a support ticket for assistance. +Some GA features can be disabled for air-gapped deployments. Consult the +feature's documentation or submit a support ticket for assistance. diff --git a/docs/manifest.json b/docs/manifest.json index 3af0cc7505057..1ec955c6244cc 100644 --- a/docs/manifest.json +++ b/docs/manifest.json @@ -143,6 +143,11 @@ "title": "JetBrains Gateway in an air-gapped environment", "description": "Use JetBrains Gateway in an air-gapped offline environment", "path": "./user-guides/workspace-access/jetbrains/jetbrains-airgapped.md" + }, + { + "title": "JetBrains Toolbox", + "description": "Access Coder workspaces through JetBrains Toolbox", + "path": "./user-guides/workspace-access/jetbrains/jetbrains-toolbox.md" } ] }, @@ -506,7 +511,8 @@ { "title": "Configure a template for dev containers", "description": "How to use configure your template for dev containers", - "path": "./admin/templates/extending-templates/devcontainers.md" + "path": "./admin/templates/extending-templates/devcontainers.md", + "state": ["early access"] }, { "title": "Process Logging", @@ -550,7 +556,7 @@ ] }, { - "title": "External Auth", + "title": "External Authentication", "description": "Learn how to configure external authentication", "path": "./admin/external-auth.md", "icon_path": "./images/icons/plug.svg" diff --git 
a/docs/reference/api/builds.md b/docs/reference/api/builds.md index 00417c700cdfd..3cfd25f2a6e0f 100644 --- a/docs/reference/api/builds.md +++ b/docs/reference/api/builds.md @@ -1731,6 +1731,7 @@ curl -X POST http://coder-server:8080/api/v2/workspaces/{workspace}/builds \ ```json { "dry_run": true, + "enable_dynamic_parameters": true, "log_level": "debug", "orphan": true, "rich_parameter_values": [ diff --git a/docs/reference/api/general.md b/docs/reference/api/general.md index c14c317066a39..12454145569bb 100644 --- a/docs/reference/api/general.md +++ b/docs/reference/api/general.md @@ -533,6 +533,7 @@ curl -X GET http://coder-server:8080/api/v2/deployment/config \ "wildcard_access_url": "string", "workspace_hostname_suffix": "string", "workspace_prebuilds": { + "failure_hard_limit": 0, "reconciliation_backoff_interval": 0, "reconciliation_backoff_lookback": 0, "reconciliation_interval": 0 diff --git a/docs/reference/api/members.md b/docs/reference/api/members.md index a58a597d1ea2a..6b5d124753bc0 100644 --- a/docs/reference/api/members.md +++ b/docs/reference/api/members.md @@ -169,7 +169,9 @@ Status Code **200** | `action` | `application_connect` | | `action` | `assign` | | `action` | `create` | +| `action` | `create_agent` | | `action` | `delete` | +| `action` | `delete_agent` | | `action` | `read` | | `action` | `read_personal` | | `action` | `ssh` | @@ -336,7 +338,9 @@ Status Code **200** | `action` | `application_connect` | | `action` | `assign` | | `action` | `create` | +| `action` | `create_agent` | | `action` | `delete` | +| `action` | `delete_agent` | | `action` | `read` | | `action` | `read_personal` | | `action` | `ssh` | @@ -503,7 +507,9 @@ Status Code **200** | `action` | `application_connect` | | `action` | `assign` | | `action` | `create` | +| `action` | `create_agent` | | `action` | `delete` | +| `action` | `delete_agent` | | `action` | `read` | | `action` | `read_personal` | | `action` | `ssh` | @@ -639,7 +645,9 @@ Status Code **200** | `action` 
| `application_connect` | | `action` | `assign` | | `action` | `create` | +| `action` | `create_agent` | | `action` | `delete` | +| `action` | `delete_agent` | | `action` | `read` | | `action` | `read_personal` | | `action` | `ssh` | @@ -997,7 +1005,9 @@ Status Code **200** | `action` | `application_connect` | | `action` | `assign` | | `action` | `create` | +| `action` | `create_agent` | | `action` | `delete` | +| `action` | `delete_agent` | | `action` | `read` | | `action` | `read_personal` | | `action` | `ssh` | diff --git a/docs/reference/api/schemas.md b/docs/reference/api/schemas.md index 91f70950e989e..2374c6af8800f 100644 --- a/docs/reference/api/schemas.md +++ b/docs/reference/api/schemas.md @@ -1917,6 +1917,7 @@ This is required on creation to enable a user-flow of validating a template work ```json { "dry_run": true, + "enable_dynamic_parameters": true, "log_level": "debug", "orphan": true, "rich_parameter_values": [ @@ -1939,6 +1940,7 @@ This is required on creation to enable a user-flow of validating a template work | Name | Type | Required | Restrictions | Description | |------------------------------|-------------------------------------------------------------------------------|----------|--------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| | `dry_run` | boolean | false | | | +| `enable_dynamic_parameters` | boolean | false | | Enable dynamic parameters skips some of the static parameter checking. It will default to whatever the template has marked as the default experience. Requires the "dynamic-experiment" to be used. | | `log_level` | [codersdk.ProvisionerLogLevel](#codersdkprovisionerloglevel) | false | | Log level changes the default logging verbosity of a provider ("info" if empty). | | `orphan` | boolean | false | | Orphan may be set for the Destroy transition. 
| | `rich_parameter_values` | array of [codersdk.WorkspaceBuildParameter](#codersdkworkspacebuildparameter) | false | | Rich parameter values are optional. It will write params to the 'workspace' scope. This will overwrite any existing parameters with the same name. This will not delete old params not included in this list. | @@ -2702,6 +2704,7 @@ CreateWorkspaceRequest provides options for creating a new workspace. Only one o "wildcard_access_url": "string", "workspace_hostname_suffix": "string", "workspace_prebuilds": { + "failure_hard_limit": 0, "reconciliation_backoff_interval": 0, "reconciliation_backoff_lookback": 0, "reconciliation_interval": 0 @@ -3200,6 +3203,7 @@ CreateWorkspaceRequest provides options for creating a new workspace. Only one o "wildcard_access_url": "string", "workspace_hostname_suffix": "string", "workspace_prebuilds": { + "failure_hard_limit": 0, "reconciliation_backoff_interval": 0, "reconciliation_backoff_lookback": 0, "reconciliation_interval": 0 @@ -5259,6 +5263,7 @@ Git clone makes use of this by parsing the URL from: 'Username for "https://gith ```json { + "failure_hard_limit": 0, "reconciliation_backoff_interval": 0, "reconciliation_backoff_lookback": 0, "reconciliation_interval": 0 @@ -5267,11 +5272,12 @@ Git clone makes use of this by parsing the URL from: 'Username for "https://gith ### Properties -| Name | Type | Required | Restrictions | Description | -|-----------------------------------|---------|----------|--------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------| -| `reconciliation_backoff_interval` | integer | false | | Reconciliation backoff interval specifies the amount of time to increase the backoff interval when errors occur during reconciliation. 
| -| `reconciliation_backoff_lookback` | integer | false | | Reconciliation backoff lookback determines the time window to look back when calculating the number of failed prebuilds, which influences the backoff strategy. | -| `reconciliation_interval` | integer | false | | Reconciliation interval defines how often the workspace prebuilds state should be reconciled. | +| Name | Type | Required | Restrictions | Description | +|-----------------------------------|---------|----------|--------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| +| `failure_hard_limit` | integer | false | | Failure hard limit defines the maximum number of consecutive failed prebuild attempts allowed before a preset is considered to be in a hard limit state. When a preset hits this limit, no new prebuilds will be created until the limit is reset. FailureHardLimit is disabled when set to zero. | +| `reconciliation_backoff_interval` | integer | false | | Reconciliation backoff interval specifies the amount of time to increase the backoff interval when errors occur during reconciliation. | +| `reconciliation_backoff_lookback` | integer | false | | Reconciliation backoff lookback determines the time window to look back when calculating the number of failed prebuilds, which influences the backoff strategy. | +| `reconciliation_interval` | integer | false | | Reconciliation interval defines how often the workspace prebuilds state should be reconciled. 
| ## codersdk.Preset @@ -5911,7 +5917,9 @@ Git clone makes use of this by parsing the URL from: 'Username for "https://gith | `application_connect` | | `assign` | | `create` | +| `create_agent` | | `delete` | +| `delete_agent` | | `read` | | `read_personal` | | `ssh` | @@ -8416,6 +8424,7 @@ If the schedule is empty, the user will be updated to use the default schedule.| "template_id": "c6d67e98-83ea-49f0-8812-e4abae2b68bc", "template_name": "string", "template_require_active_version": true, + "template_use_classic_parameter_flow": true, "ttl_ms": 0, "updated_at": "2019-08-24T14:15:22Z" } @@ -8452,6 +8461,7 @@ If the schedule is empty, the user will be updated to use the default schedule.| | `template_id` | string | false | | | | `template_name` | string | false | | | | `template_require_active_version` | boolean | false | | | +| `template_use_classic_parameter_flow` | boolean | false | | | | `ttl_ms` | integer | false | | | | `updated_at` | string | false | | | @@ -10088,6 +10098,7 @@ If the schedule is empty, the user will be updated to use the default schedule.| "template_id": "c6d67e98-83ea-49f0-8812-e4abae2b68bc", "template_name": "string", "template_require_active_version": true, + "template_use_classic_parameter_flow": true, "ttl_ms": 0, "updated_at": "2019-08-24T14:15:22Z" } diff --git a/docs/reference/api/workspaces.md b/docs/reference/api/workspaces.md index 49377ec14c6fd..241d80ac05f7d 100644 --- a/docs/reference/api/workspaces.md +++ b/docs/reference/api/workspaces.md @@ -296,6 +296,7 @@ of the template will be used. 
"template_id": "c6d67e98-83ea-49f0-8812-e4abae2b68bc", "template_name": "string", "template_require_active_version": true, + "template_use_classic_parameter_flow": true, "ttl_ms": 0, "updated_at": "2019-08-24T14:15:22Z" } @@ -578,6 +579,7 @@ curl -X GET http://coder-server:8080/api/v2/users/{user}/workspace/{workspacenam "template_id": "c6d67e98-83ea-49f0-8812-e4abae2b68bc", "template_name": "string", "template_require_active_version": true, + "template_use_classic_parameter_flow": true, "ttl_ms": 0, "updated_at": "2019-08-24T14:15:22Z" } @@ -886,6 +888,7 @@ of the template will be used. "template_id": "c6d67e98-83ea-49f0-8812-e4abae2b68bc", "template_name": "string", "template_require_active_version": true, + "template_use_classic_parameter_flow": true, "ttl_ms": 0, "updated_at": "2019-08-24T14:15:22Z" } @@ -1154,6 +1157,7 @@ curl -X GET http://coder-server:8080/api/v2/workspaces \ "template_id": "c6d67e98-83ea-49f0-8812-e4abae2b68bc", "template_name": "string", "template_require_active_version": true, + "template_use_classic_parameter_flow": true, "ttl_ms": 0, "updated_at": "2019-08-24T14:15:22Z" } @@ -1437,6 +1441,7 @@ curl -X GET http://coder-server:8080/api/v2/workspaces/{workspace} \ "template_id": "c6d67e98-83ea-49f0-8812-e4abae2b68bc", "template_name": "string", "template_require_active_version": true, + "template_use_classic_parameter_flow": true, "ttl_ms": 0, "updated_at": "2019-08-24T14:15:22Z" } @@ -1835,6 +1840,7 @@ curl -X PUT http://coder-server:8080/api/v2/workspaces/{workspace}/dormant \ "template_id": "c6d67e98-83ea-49f0-8812-e4abae2b68bc", "template_name": "string", "template_require_active_version": true, + "template_use_classic_parameter_flow": true, "ttl_ms": 0, "updated_at": "2019-08-24T14:15:22Z" } diff --git a/docs/reference/cli/users_edit-roles.md b/docs/reference/cli/users_edit-roles.md index 23e0baa42afff..04f12ce701584 100644 --- a/docs/reference/cli/users_edit-roles.md +++ b/docs/reference/cli/users_edit-roles.md @@ -25,4 +25,4 @@ 
Bypass prompts. |------|---------------------------| | Type | string-array | -A list of roles to give to the user. This removes any existing roles the user may have. The available roles are: auditor, member, owner, template-admin, user-admin. +A list of roles to give to the user. This removes any existing roles the user may have. diff --git a/docs/user-guides/workspace-access/jetbrains/jetbrains-toolbox.md b/docs/user-guides/workspace-access/jetbrains/jetbrains-toolbox.md new file mode 100644 index 0000000000000..b2b558d9b52b4 --- /dev/null +++ b/docs/user-guides/workspace-access/jetbrains/jetbrains-toolbox.md @@ -0,0 +1,46 @@ +# JetBrains Toolbox Integration + +JetBrains Toolbox helps you manage JetBrains products and includes remote development capabilities for connecting to Coder workspaces. + +## Install the Coder plugin for Toolbox + +1. Install [JetBrains Toolbox](https://www.jetbrains.com/toolbox-app/) version 2.6.0.40632 or later. + +1. Open Toolbox and navigate to the **Remote Development** section. +1. Install the Coder plugin using one of these methods: + - Search for `Coder` in the **Remote Development** plugins section. + - Use this URI to install directly: `jetbrains://gateway/com.coder.toolbox`. + - Download from [JetBrains Marketplace](https://plugins.jetbrains.com/). + - Download from [GitHub Releases](https://github.com/coder/coder-jetbrains-toolbox/releases). 
+ +## Use URI parameters + +For direct connections or creating bookmarks, use custom URI links with parameters: + +```shell +jetbrains://gateway/com.coder.toolbox?url=https://coder.example.com&token=&workspace=my-workspace +``` + +Required parameters: + +- `url`: Your Coder deployment URL +- `token`: Coder authentication token +- `workspace`: Name of your workspace + +Optional parameters: + +- `agent_id`: ID of the agent (only required if workspace has multiple agents) +- `folder`: Specific project folder path to open +- `ide_product_code`: Specific IDE product code (e.g., "IU" for IntelliJ IDEA Ultimate) +- `ide_build_number`: Specific build number of the JetBrains IDE + +For more details, see the [coder-jetbrains-toolbox repository](https://github.com/coder/coder-jetbrains-toolbox#connect-to-a-coder-workspace-via-jetbrains-toolbox-uri). + +## Configure internal certificates + +To connect to a Coder deployment that uses internal certificates, configure the certificates directly in JetBrains Toolbox: + +1. Click the settings icon (⚙) in the lower left corner of JetBrains Toolbox. +1. Select **Settings**. +1. Go to the **Coder** section. +1. Add your certificate path in the **CA Path** field. diff --git a/docs/user-guides/workspace-access/remote-desktops.md b/docs/user-guides/workspace-access/remote-desktops.md index ef8488f5889ff..2fe512b686763 100644 --- a/docs/user-guides/workspace-access/remote-desktops.md +++ b/docs/user-guides/workspace-access/remote-desktops.md @@ -47,6 +47,38 @@ Or use your favorite RDP client to connect to `localhost:3399`. The default username is `Administrator` and password is `coderRDP!`. +### Coder Desktop URI Handling (Beta) + +[Coder Desktop](../desktop) can use a URI handler to directly launch an RDP session without setting up port-forwarding. 
+The URI format is: + +```text +coder:///v0/open/ws//agent//rdp?username=&password= +``` + +For example: + +```text +coder://coder.example.com/v0/open/ws/myworkspace/agent/main/rdp?username=Administrator&password=coderRDP! +``` + +To include a Coder Desktop button to the workspace dashboard page, add a `coder_app` resource to the template: + +```tf +locals { + server_name = regex("https?:\\/\\/([^\\/]+)", data.coder_workspace.me.access_url)[0] +} + +resource "coder_app" "rdp-coder-desktop" { + agent_id = resource.coder_agent.main.id + slug = "rdp-desktop" + display_name = "RDP with Coder Desktop" + url = "coder://${local.server_name}/v0/open/ws/${data.coder_workspace.me.name}/agent/main/rdp?username=Administrator&password=coderRDP!" + icon = "/icon/desktop.svg" + external = true +} +``` + ## RDP Web Our [WebRDP](https://registry.coder.com/modules/windows-rdp) module in the Coder diff --git a/dogfood/coder/main.tf b/dogfood/coder/main.tf index e21602a26e922..06da4d79c549a 100644 --- a/dogfood/coder/main.tf +++ b/dogfood/coder/main.tf @@ -30,6 +30,81 @@ locals { container_name = "coder-${data.coder_workspace_owner.me.name}-${lower(data.coder_workspace.me.name)}" } +data "coder_workspace_preset" "cpt" { + name = "Cape Town" + parameters = { + (data.coder_parameter.region.name) = "za-cpt" + (data.coder_parameter.image_type.name) = "codercom/oss-dogfood:latest" + (data.coder_parameter.repo_base_dir.name) = "~" + (data.coder_parameter.res_mon_memory_threshold.name) = 80 + (data.coder_parameter.res_mon_volume_threshold.name) = 90 + (data.coder_parameter.res_mon_volume_path.name) = "/home/coder" + } + prebuilds { + instances = 1 + } +} + +data "coder_workspace_preset" "pittsburgh" { + name = "Pittsburgh" + parameters = { + (data.coder_parameter.region.name) = "us-pittsburgh" + (data.coder_parameter.image_type.name) = "codercom/oss-dogfood:latest" + (data.coder_parameter.repo_base_dir.name) = "~" + (data.coder_parameter.res_mon_memory_threshold.name) = 80 + 
(data.coder_parameter.res_mon_volume_threshold.name) = 90 + (data.coder_parameter.res_mon_volume_path.name) = "/home/coder" + } + prebuilds { + instances = 2 + } +} + +data "coder_workspace_preset" "falkenstein" { + name = "Falkenstein" + parameters = { + (data.coder_parameter.region.name) = "eu-helsinki" + (data.coder_parameter.image_type.name) = "codercom/oss-dogfood:latest" + (data.coder_parameter.repo_base_dir.name) = "~" + (data.coder_parameter.res_mon_memory_threshold.name) = 80 + (data.coder_parameter.res_mon_volume_threshold.name) = 90 + (data.coder_parameter.res_mon_volume_path.name) = "/home/coder" + } + prebuilds { + instances = 1 + } +} + +data "coder_workspace_preset" "sydney" { + name = "Sydney" + parameters = { + (data.coder_parameter.region.name) = "ap-sydney" + (data.coder_parameter.image_type.name) = "codercom/oss-dogfood:latest" + (data.coder_parameter.repo_base_dir.name) = "~" + (data.coder_parameter.res_mon_memory_threshold.name) = 80 + (data.coder_parameter.res_mon_volume_threshold.name) = 90 + (data.coder_parameter.res_mon_volume_path.name) = "/home/coder" + } + prebuilds { + instances = 1 + } +} + +data "coder_workspace_preset" "saopaulo" { + name = "São Paulo" + parameters = { + (data.coder_parameter.region.name) = "sa-saopaulo" + (data.coder_parameter.image_type.name) = "codercom/oss-dogfood:latest" + (data.coder_parameter.repo_base_dir.name) = "~" + (data.coder_parameter.res_mon_memory_threshold.name) = 80 + (data.coder_parameter.res_mon_volume_threshold.name) = 90 + (data.coder_parameter.res_mon_volume_path.name) = "/home/coder" + } + prebuilds { + instances = 1 + } +} + data "coder_parameter" "repo_base_dir" { type = "string" name = "Coder Repository Base Directory" @@ -438,6 +513,14 @@ resource "docker_image" "dogfood" { } resource "docker_container" "workspace" { + lifecycle { + // Ignore changes that would invalidate prebuilds + ignore_changes = [ + name, + hostname, + labels, + ] + } count = data.coder_workspace.me.start_count image 
= docker_image.dogfood.name name = local.container_name diff --git a/enterprise/coderd/parameters_test.go b/enterprise/coderd/parameters_test.go index e6bc564e43da2..76bd5a1eafdbb 100644 --- a/enterprise/coderd/parameters_test.go +++ b/enterprise/coderd/parameters_test.go @@ -70,8 +70,8 @@ func TestDynamicParametersOwnerGroups(t *testing.T) { require.Equal(t, -1, preview.ID) require.Empty(t, preview.Diagnostics) require.Equal(t, "group", preview.Parameters[0].Name) - require.True(t, preview.Parameters[0].Value.Valid()) - require.Equal(t, database.EveryoneGroup, preview.Parameters[0].Value.Value.AsString()) + require.True(t, preview.Parameters[0].Value.Valid) + require.Equal(t, database.EveryoneGroup, preview.Parameters[0].Value.Value) // Send a new value, and see it reflected err = stream.Send(codersdk.DynamicParametersRequest{ @@ -83,8 +83,8 @@ func TestDynamicParametersOwnerGroups(t *testing.T) { require.Equal(t, 1, preview.ID) require.Empty(t, preview.Diagnostics) require.Equal(t, "group", preview.Parameters[0].Name) - require.True(t, preview.Parameters[0].Value.Valid()) - require.Equal(t, group.Name, preview.Parameters[0].Value.Value.AsString()) + require.True(t, preview.Parameters[0].Value.Valid) + require.Equal(t, group.Name, preview.Parameters[0].Value.Value) // Back to default err = stream.Send(codersdk.DynamicParametersRequest{ @@ -96,6 +96,6 @@ func TestDynamicParametersOwnerGroups(t *testing.T) { require.Equal(t, 3, preview.ID) require.Empty(t, preview.Diagnostics) require.Equal(t, "group", preview.Parameters[0].Name) - require.True(t, preview.Parameters[0].Value.Valid()) - require.Equal(t, database.EveryoneGroup, preview.Parameters[0].Value.Value.AsString()) + require.True(t, preview.Parameters[0].Value.Valid) + require.Equal(t, database.EveryoneGroup, preview.Parameters[0].Value.Value) } diff --git a/enterprise/coderd/prebuilds/reconcile.go b/enterprise/coderd/prebuilds/reconcile.go index f9588a5d7cacb..7796e43777951 100644 --- 
a/enterprise/coderd/prebuilds/reconcile.go +++ b/enterprise/coderd/prebuilds/reconcile.go @@ -313,6 +313,7 @@ func (c *StoreReconciler) SnapshotState(ctx context.Context, store database.Stor if len(presetsWithPrebuilds) == 0 { return nil } + allRunningPrebuilds, err := db.GetRunningPrebuiltWorkspaces(ctx) if err != nil { return xerrors.Errorf("failed to get running prebuilds: %w", err) @@ -328,7 +329,18 @@ func (c *StoreReconciler) SnapshotState(ctx context.Context, store database.Stor return xerrors.Errorf("failed to get backoffs for presets: %w", err) } - state = prebuilds.NewGlobalSnapshot(presetsWithPrebuilds, allRunningPrebuilds, allPrebuildsInProgress, presetsBackoff) + hardLimitedPresets, err := db.GetPresetsAtFailureLimit(ctx, c.cfg.FailureHardLimit.Value()) + if err != nil { + return xerrors.Errorf("failed to get hard limited presets: %w", err) + } + + state = prebuilds.NewGlobalSnapshot( + presetsWithPrebuilds, + allRunningPrebuilds, + allPrebuildsInProgress, + presetsBackoff, + hardLimitedPresets, + ) return nil }, &database.TxOptions{ Isolation: sql.LevelRepeatableRead, // This mirrors the MVCC snapshotting Postgres does when using CTEs @@ -349,19 +361,45 @@ func (c *StoreReconciler) ReconcilePreset(ctx context.Context, ps prebuilds.Pres slog.F("preset_name", ps.Preset.Name), ) + // If the preset was previously hard-limited, log it and exit early. 
+ if ps.Preset.PrebuildStatus == database.PrebuildStatusHardLimited { + logger.Warn(ctx, "skipping hard limited preset") + return nil + } + + // If the preset reached the hard failure limit for the first time during this iteration: + // - Mark it as hard-limited in the database + // - Send notifications to template admins + if ps.IsHardLimited { + logger.Warn(ctx, "skipping hard limited preset") + + err := c.store.UpdatePresetPrebuildStatus(ctx, database.UpdatePresetPrebuildStatusParams{ + Status: database.PrebuildStatusHardLimited, + PresetID: ps.Preset.ID, + }) + if err != nil { + return xerrors.Errorf("failed to update preset prebuild status: %w", err) + } + + err = c.notifyPrebuildFailureLimitReached(ctx, ps) + if err != nil { + logger.Error(ctx, "failed to notify that number of prebuild failures reached the limit", slog.Error(err)) + return nil + } + + return nil + } + state := ps.CalculateState() actions, err := c.CalculateActions(ctx, ps) if err != nil { - logger.Error(ctx, "failed to calculate actions for preset", slog.Error(err), slog.F("preset_id", ps.Preset.ID)) + logger.Error(ctx, "failed to calculate actions for preset", slog.Error(err)) return nil } // Nothing has to be done. 
if !ps.Preset.UsingActiveVersion && actions.IsNoop() { - logger.Debug(ctx, "skipping reconciliation for preset - nothing has to be done", - slog.F("template_id", ps.Preset.TemplateID.String()), slog.F("template_name", ps.Preset.TemplateName), - slog.F("template_version_id", ps.Preset.TemplateVersionID.String()), slog.F("template_version_name", ps.Preset.TemplateVersionName), - slog.F("preset_id", ps.Preset.ID.String()), slog.F("preset_name", ps.Preset.Name)) + logger.Debug(ctx, "skipping reconciliation for preset - nothing has to be done") return nil } @@ -442,6 +480,49 @@ func (c *StoreReconciler) ReconcilePreset(ctx context.Context, ps prebuilds.Pres } } +func (c *StoreReconciler) notifyPrebuildFailureLimitReached(ctx context.Context, ps prebuilds.PresetSnapshot) error { + // nolint:gocritic // Necessary to query all the required data. + ctx = dbauthz.AsSystemRestricted(ctx) + + // Send notification to template admins. + if c.notifEnq == nil { + c.logger.Warn(ctx, "notification enqueuer not set, cannot send prebuild is hard limited notification(s)") + return nil + } + + templateAdmins, err := c.store.GetUsers(ctx, database.GetUsersParams{ + RbacRole: []string{codersdk.RoleTemplateAdmin}, + }) + if err != nil { + return xerrors.Errorf("fetch template admins: %w", err) + } + + for _, templateAdmin := range templateAdmins { + if _, err := c.notifEnq.EnqueueWithData(ctx, templateAdmin.ID, notifications.PrebuildFailureLimitReached, + map[string]string{ + "org": ps.Preset.OrganizationName, + "template": ps.Preset.TemplateName, + "template_version": ps.Preset.TemplateVersionName, + "preset": ps.Preset.Name, + }, + map[string]any{}, + "prebuilds_reconciler", + // Associate this notification with all the related entities. 
+ ps.Preset.TemplateID, ps.Preset.TemplateVersionID, ps.Preset.ID, ps.Preset.OrganizationID, + ); err != nil { + c.logger.Error(ctx, + "failed to send notification", + slog.Error(err), + slog.F("template_admin_id", templateAdmin.ID.String()), + ) + + continue + } + } + + return nil +} + func (c *StoreReconciler) CalculateActions(ctx context.Context, snapshot prebuilds.PresetSnapshot) (*prebuilds.ReconciliationActions, error) { if ctx.Err() != nil { return nil, ctx.Err() diff --git a/enterprise/coderd/prebuilds/reconcile_test.go b/enterprise/coderd/prebuilds/reconcile_test.go index 660b1733e6cc9..f52a77ca500b9 100644 --- a/enterprise/coderd/prebuilds/reconcile_test.go +++ b/enterprise/coderd/prebuilds/reconcile_test.go @@ -654,6 +654,131 @@ func TestDeletionOfPrebuiltWorkspaceWithInvalidPreset(t *testing.T) { require.Equal(t, database.WorkspaceTransitionDelete, builds[0].Transition) } +func TestSkippingHardLimitedPresets(t *testing.T) { + t.Parallel() + + if !dbtestutil.WillUsePostgres() { + t.Skip("This test requires postgres") + } + + // Test cases verify the behavior of prebuild creation depending on configured failure limits. 
+ testCases := []struct { + name string + hardLimit int64 + isHardLimitHit bool + }{ + { + name: "hard limit is hit - skip creation of prebuilt workspace", + hardLimit: 1, + isHardLimitHit: true, + }, + { + name: "hard limit is not hit - try to create prebuilt workspace again", + hardLimit: 2, + isHardLimitHit: false, + }, + } + + for _, tc := range testCases { + t.Run(tc.name, func(t *testing.T) { + t.Parallel() + + templateDeleted := false + + clock := quartz.NewMock(t) + ctx := testutil.Context(t, testutil.WaitShort) + cfg := codersdk.PrebuildsConfig{ + FailureHardLimit: serpent.Int64(tc.hardLimit), + ReconciliationBackoffInterval: 0, + } + logger := slogtest.Make( + t, &slogtest.Options{IgnoreErrors: true}, + ).Leveled(slog.LevelDebug) + db, pubSub := dbtestutil.NewDB(t) + fakeEnqueuer := newFakeEnqueuer() + controller := prebuilds.NewStoreReconciler(db, pubSub, cfg, logger, clock, prometheus.NewRegistry(), fakeEnqueuer) + + // Template admin to receive a notification. + templateAdmin := dbgen.User(t, db, database.User{ + RBACRoles: []string{codersdk.RoleTemplateAdmin}, + }) + + // Set up test environment with a template, version, and preset. + ownerID := uuid.New() + dbgen.User(t, db, database.User{ + ID: ownerID, + }) + org, template := setupTestDBTemplate(t, db, ownerID, templateDeleted) + templateVersionID := setupTestDBTemplateVersion(ctx, t, clock, db, pubSub, org.ID, ownerID, template.ID) + preset := setupTestDBPreset(t, db, templateVersionID, 1, uuid.New().String()) + + // Create a failed prebuild workspace that counts toward the hard failure limit. + setupTestDBPrebuild( + t, + clock, + db, + pubSub, + database.WorkspaceTransitionStart, + database.ProvisionerJobStatusFailed, + org.ID, + preset, + template.ID, + templateVersionID, + ) + + // Verify initial state: one failed workspace exists. 
+ workspaces, err := db.GetWorkspacesByTemplateID(ctx, template.ID) + require.NoError(t, err) + workspaceCount := len(workspaces) + require.Equal(t, 1, workspaceCount) + + // We simulate a failed prebuild in the test; Consequently, the backoff mechanism is triggered when ReconcileAll is called. + // Even though ReconciliationBackoffInterval is set to zero, we still need to advance the clock by at least one nanosecond. + clock.Advance(time.Nanosecond).MustWait(ctx) + + // Trigger reconciliation to attempt creating a new prebuild. + // The outcome depends on whether the hard limit has been reached. + require.NoError(t, controller.ReconcileAll(ctx)) + + // These two additional calls to ReconcileAll should not trigger any notifications. + // A notification is only sent once. + require.NoError(t, controller.ReconcileAll(ctx)) + require.NoError(t, controller.ReconcileAll(ctx)) + + // Verify the final state after reconciliation. + workspaces, err = db.GetWorkspacesByTemplateID(ctx, template.ID) + require.NoError(t, err) + updatedPreset, err := db.GetPresetByID(ctx, preset.ID) + require.NoError(t, err) + + if !tc.isHardLimitHit { + // When hard limit is not reached, a new workspace should be created. + require.Equal(t, 2, len(workspaces)) + require.Equal(t, database.PrebuildStatusHealthy, updatedPreset.PrebuildStatus) + return + } + + // When hard limit is reached, no new workspace should be created. + require.Equal(t, 1, len(workspaces)) + require.Equal(t, database.PrebuildStatusHardLimited, updatedPreset.PrebuildStatus) + + // When hard limit is reached, a notification should be sent. 
+ matching := fakeEnqueuer.Sent(func(notification *notificationstest.FakeNotification) bool { + if !assert.Equal(t, notifications.PrebuildFailureLimitReached, notification.TemplateID, "unexpected template") { + return false + } + + if !assert.Equal(t, templateAdmin.ID, notification.UserID, "unexpected receiver") { + return false + } + + return true + }) + require.Len(t, matching, 1) + }) + } +} + func TestRunLoop(t *testing.T) { t.Parallel() diff --git a/enterprise/coderd/workspaceagents_test.go b/enterprise/coderd/workspaceagents_test.go index 44aba69b9ffaa..f0c9b37f3b2a3 100644 --- a/enterprise/coderd/workspaceagents_test.go +++ b/enterprise/coderd/workspaceagents_test.go @@ -7,6 +7,7 @@ import ( "net/http" "os" "regexp" + "runtime" "testing" "time" @@ -89,6 +90,12 @@ func TestReinitializeAgent(t *testing.T) { t.Skip("dbmem cannot currently claim a workspace") } + if runtime.GOOS == "windows" { + t.Skip("test startup script is not supported on windows") + } + + startupScript := fmt.Sprintf("printenv >> %s; echo '---\n' >> %s", tempAgentLog.Name(), tempAgentLog.Name()) + db, ps := dbtestutil.NewDB(t) // GIVEN a live enterprise API with the prebuilds feature enabled client, user := coderdenttest.New(t, &coderdenttest.Options{ @@ -155,7 +162,7 @@ func TestReinitializeAgent(t *testing.T) { Scripts: []*proto.Script{ { RunOnStart: true, - Script: fmt.Sprintf("printenv >> %s; echo '---\n' >> %s", tempAgentLog.Name(), tempAgentLog.Name()), // Make reinitialization take long enough to assert that it happened + Script: startupScript, }, }, Auth: &proto.Agent_Token{ diff --git a/enterprise/coderd/workspaces_test.go b/enterprise/coderd/workspaces_test.go index 7005c93ca36f5..226232f37bf7f 100644 --- a/enterprise/coderd/workspaces_test.go +++ b/enterprise/coderd/workspaces_test.go @@ -1659,6 +1659,119 @@ func TestTemplateDoesNotAllowUserAutostop(t *testing.T) { }) } +// TestWorkspaceTemplateParamsChange tests a workspace with a parameter that +// validation changes on apply. 
The params used in create workspace are invalid +// according to the static params on import. +// +// This is testing that dynamic params defers input validation to terraform. +// It does not try to do this in coder/coder. +func TestWorkspaceTemplateParamsChange(t *testing.T) { + mainTfTemplate := ` + terraform { + required_providers { + coder = { + source = "coder/coder" + } + } + } + provider "coder" {} + data "coder_workspace" "me" {} + data "coder_workspace_owner" "me" {} + + data "coder_parameter" "param_min" { + name = "param_min" + type = "number" + default = 10 + } + + data "coder_parameter" "param" { + name = "param" + type = "number" + default = 12 + validation { + min = data.coder_parameter.param_min.value + } + } + ` + tfCliConfigPath := downloadProviders(t, mainTfTemplate) + t.Setenv("TF_CLI_CONFIG_FILE", tfCliConfigPath) + + logger := slogtest.Make(t, &slogtest.Options{IgnoreErrors: false}) + dv := coderdtest.DeploymentValues(t) + dv.Experiments = []string{string(codersdk.ExperimentDynamicParameters)} + client, owner := coderdenttest.New(t, &coderdenttest.Options{ + Options: &coderdtest.Options{ + Logger: &logger, + // We intentionally do not run a built-in provisioner daemon here. + IncludeProvisionerDaemon: false, + DeploymentValues: dv, + }, + LicenseOptions: &coderdenttest.LicenseOptions{ + Features: license.Features{ + codersdk.FeatureExternalProvisionerDaemons: 1, + }, + }, + }) + templateAdmin, _ := coderdtest.CreateAnotherUser(t, client, owner.OrganizationID, rbac.RoleTemplateAdmin()) + member, memberUser := coderdtest.CreateAnotherUser(t, client, owner.OrganizationID) + + _ = coderdenttest.NewExternalProvisionerDaemonTerraform(t, client, owner.OrganizationID, nil) + + // This can take a while, so set a relatively long timeout. 
+ ctx := testutil.Context(t, 2*testutil.WaitSuperLong) + + // Creating a template as a template admin must succeed + templateFiles := map[string]string{"main.tf": mainTfTemplate} + tarBytes := testutil.CreateTar(t, templateFiles) + fi, err := templateAdmin.Upload(ctx, "application/x-tar", bytes.NewReader(tarBytes)) + require.NoError(t, err, "failed to upload file") + + tv, err := templateAdmin.CreateTemplateVersion(ctx, owner.OrganizationID, codersdk.CreateTemplateVersionRequest{ + Name: testutil.GetRandomName(t), + FileID: fi.ID, + StorageMethod: codersdk.ProvisionerStorageMethodFile, + Provisioner: codersdk.ProvisionerTypeTerraform, + UserVariableValues: []codersdk.VariableValue{}, + }) + require.NoError(t, err, "failed to create template version") + coderdtest.AwaitTemplateVersionJobCompleted(t, templateAdmin, tv.ID) + tpl := coderdtest.CreateTemplate(t, templateAdmin, owner.OrganizationID, tv.ID) + require.False(t, tpl.UseClassicParameterFlow, "template to use dynamic parameters") + + // When: we create a workspace build using the above template but with + // parameter values that are different from those defined in the template. + // The new values are not valid according to the original plan, but are valid. + ws, err := member.CreateUserWorkspace(ctx, memberUser.Username, codersdk.CreateWorkspaceRequest{ + TemplateID: tpl.ID, + Name: coderdtest.RandomUsername(t), + RichParameterValues: []codersdk.WorkspaceBuildParameter{ + { + Name: "param_min", + Value: "5", + }, + { + Name: "param", + Value: "7", + }, + }, + EnableDynamicParameters: true, + }) + + // Then: the build should succeed. 
The updated value of param_min should be + // used to validate param instead of the value defined in the temp + require.NoError(t, err, "failed to create workspace") + createBuild := coderdtest.AwaitWorkspaceBuildJobCompleted(t, member, ws.LatestBuild.ID) + require.Equal(t, createBuild.Status, codersdk.WorkspaceStatusRunning) + + // Now delete the workspace + build, err := member.CreateWorkspaceBuild(ctx, ws.ID, codersdk.CreateWorkspaceBuildRequest{ + Transition: codersdk.WorkspaceTransitionDelete, + }) + require.NoError(t, err) + build = coderdtest.AwaitWorkspaceBuildJobCompleted(t, member, build.ID) + require.Equal(t, codersdk.WorkspaceStatusDeleted, build.Status) +} + // TestWorkspaceTagsTerraform tests that a workspace can be created with tags. // This is an end-to-end-style test, meaning that we actually run the // real Terraform provisioner and validate that the workspace is created diff --git a/flake.nix b/flake.nix index bff207662f913..c0f36c3be6e0f 100644 --- a/flake.nix +++ b/flake.nix @@ -141,6 +141,7 @@ kubectl kubectx kubernetes-helm + lazydocker lazygit less mockgen diff --git a/go.mod b/go.mod index c43feefefee4d..41105cf13535e 100644 --- a/go.mod +++ b/go.mod @@ -96,12 +96,12 @@ require ( github.com/chromedp/chromedp v0.13.3 github.com/cli/safeexec v1.0.1 github.com/coder/flog v1.1.0 - github.com/coder/guts v1.3.1-0.20250428170043-ad369017e95b + github.com/coder/guts v1.5.0 github.com/coder/pretty v0.0.0-20230908205945-e89ba86370e0 github.com/coder/quartz v0.1.3 github.com/coder/retry v1.5.1 github.com/coder/serpent v0.10.0 - github.com/coder/terraform-provider-coder/v2 v2.4.1 + github.com/coder/terraform-provider-coder/v2 v2.4.2 github.com/coder/websocket v1.8.13 github.com/coder/wgtunnel v0.1.13-0.20240522110300-ade90dfb2da0 github.com/coreos/go-oidc/v3 v3.14.1 @@ -204,7 +204,7 @@ require ( golang.org/x/sys v0.33.0 golang.org/x/term v0.32.0 golang.org/x/text v0.25.0 // indirect - golang.org/x/tools v0.32.0 + golang.org/x/tools v0.33.0 
golang.org/x/xerrors v0.0.0-20240903120638-7835f813f4da google.golang.org/api v0.231.0 google.golang.org/grpc v1.72.0 @@ -485,7 +485,7 @@ require ( require ( github.com/anthropics/anthropic-sdk-go v0.2.0-beta.3 - github.com/coder/preview v0.0.2-0.20250516233606-a1da43489319 + github.com/coder/preview v0.0.2-0.20250521212114-e6a60ffa74f2 github.com/fsnotify/fsnotify v1.9.0 github.com/kylecarbs/aisdk-go v0.0.8 github.com/mark3labs/mcp-go v0.28.0 diff --git a/go.sum b/go.sum index 9ffd716b334de..8a9d79820bce8 100644 --- a/go.sum +++ b/go.sum @@ -905,14 +905,14 @@ github.com/coder/go-httpstat v0.0.0-20230801153223-321c88088322 h1:m0lPZjlQ7vdVp github.com/coder/go-httpstat v0.0.0-20230801153223-321c88088322/go.mod h1:rOLFDDVKVFiDqZFXoteXc97YXx7kFi9kYqR+2ETPkLQ= github.com/coder/go-scim/pkg/v2 v2.0.0-20230221055123-1d63c1222136 h1:0RgB61LcNs24WOxc3PBvygSNTQurm0PYPujJjLLOzs0= github.com/coder/go-scim/pkg/v2 v2.0.0-20230221055123-1d63c1222136/go.mod h1:VkD1P761nykiq75dz+4iFqIQIZka189tx1BQLOp0Skc= -github.com/coder/guts v1.3.1-0.20250428170043-ad369017e95b h1:tfLKcE2s6D7YpFk7MUUCDE0Xbbmac+k2GqO8KMjv/Ug= -github.com/coder/guts v1.3.1-0.20250428170043-ad369017e95b/go.mod h1:31NO4z6MVTOD4WaCLqE/hUAHGgNok9sRbuMc/LZFopI= +github.com/coder/guts v1.5.0 h1:a94apf7xMf5jDdg1bIHzncbRiTn3+BvBZgrFSDbUnyI= +github.com/coder/guts v1.5.0/go.mod h1:0Sbv5Kp83u1Nl7MIQiV2zmacJ3o02I341bkWkjWXSUQ= github.com/coder/pq v1.10.5-0.20240813183442-0c420cb5a048 h1:3jzYUlGH7ZELIH4XggXhnTnP05FCYiAFeQpoN+gNR5I= github.com/coder/pq v1.10.5-0.20240813183442-0c420cb5a048/go.mod h1:AlVN5x4E4T544tWzH6hKfbfQvm3HdbOxrmggDNAPY9o= github.com/coder/pretty v0.0.0-20230908205945-e89ba86370e0 h1:3A0ES21Ke+FxEM8CXx9n47SZOKOpgSE1bbJzlE4qPVs= github.com/coder/pretty v0.0.0-20230908205945-e89ba86370e0/go.mod h1:5UuS2Ts+nTToAMeOjNlnHFkPahrtDkmpydBen/3wgZc= -github.com/coder/preview v0.0.2-0.20250516233606-a1da43489319 h1:flPwcvOZ9RwENDYcLOnfYEClbKWfFvpQCddODdSS6Co= -github.com/coder/preview 
v0.0.2-0.20250516233606-a1da43489319/go.mod h1:GfkwIv5gQLpL01qeGU1/YoxoFtt5trzCqnWZLo77clU= +github.com/coder/preview v0.0.2-0.20250521212114-e6a60ffa74f2 h1:D52yPPupcbNWppZzWAjZJG5L34TGpNyKj7vG1VT13FU= +github.com/coder/preview v0.0.2-0.20250521212114-e6a60ffa74f2/go.mod h1:9bwyhQSVDjcxAWuFFaG6/qBqhaiW5oqF5PEQMhevKLs= github.com/coder/quartz v0.1.3 h1:hA2nI8uUA2fNN9uhXv2I4xZD4aHkA7oH3g2t03v4xf8= github.com/coder/quartz v0.1.3/go.mod h1:vsiCc+AHViMKH2CQpGIpFgdHIEQsxwm8yCscqKmzbRA= github.com/coder/retry v1.5.1 h1:iWu8YnD8YqHs3XwqrqsjoBTAVqT9ml6z9ViJ2wlMiqc= @@ -925,8 +925,8 @@ github.com/coder/tailscale v1.1.1-0.20250422090654-5090e715905e h1:nope/SZfoLB9M github.com/coder/tailscale v1.1.1-0.20250422090654-5090e715905e/go.mod h1:1ggFFdHTRjPRu9Yc1yA7nVHBYB50w9Ce7VIXNqcW6Ko= github.com/coder/terraform-config-inspect v0.0.0-20250107175719-6d06d90c630e h1:JNLPDi2P73laR1oAclY6jWzAbucf70ASAvf5mh2cME0= github.com/coder/terraform-config-inspect v0.0.0-20250107175719-6d06d90c630e/go.mod h1:Gz/z9Hbn+4KSp8A2FBtNszfLSdT2Tn/uAKGuVqqWmDI= -github.com/coder/terraform-provider-coder/v2 v2.4.1 h1:+HxLJVENJ+kvGhibQ0jbr8Evi6M857d9691ytxNbv90= -github.com/coder/terraform-provider-coder/v2 v2.4.1/go.mod h1:2kaBpn5k9ZWtgKq5k4JbkVZG9DzEqR4mJSmpdshcO+s= +github.com/coder/terraform-provider-coder/v2 v2.4.2 h1:41SJkgwgiA555kwQzGIQcNS3bCm12sVMUmBSa5zGr+A= +github.com/coder/terraform-provider-coder/v2 v2.4.2/go.mod h1:2kaBpn5k9ZWtgKq5k4JbkVZG9DzEqR4mJSmpdshcO+s= github.com/coder/trivy v0.0.0-20250409153844-e6b004bc465a h1:yryP7e+IQUAArlycH4hQrjXQ64eRNbxsV5/wuVXHgME= github.com/coder/trivy v0.0.0-20250409153844-e6b004bc465a/go.mod h1:dDvq9axp3kZsT63gY2Znd1iwzfqDq3kXbQnccIrjRYY= github.com/coder/websocket v1.8.13 h1:f3QZdXy7uGVz+4uCJy2nTZyM0yTBj8yANEHhqlXZ9FE= @@ -2412,8 +2412,8 @@ golang.org/x/tools v0.6.0/go.mod h1:Xwgl3UAJ/d3gWutnCtw505GrjyAbvKui8lOU390QaIU= golang.org/x/tools v0.7.0/go.mod h1:4pg6aUX35JBAogB10C9AtvVL+qowtN4pT3CGSQex14s= golang.org/x/tools v0.13.0/go.mod 
h1:HvlwmtVNQAhOuCjW7xxvovg8wbNq7LwfXh/k7wXUl58= golang.org/x/tools v0.21.1-0.20240508182429-e35e4ccd0d2d/go.mod h1:aiJjzUbINMkxbQROHiO6hDPo2LHcIPhhQsa9DLh0yGk= -golang.org/x/tools v0.32.0 h1:Q7N1vhpkQv7ybVzLFtTjvQya2ewbwNDZzUgfXGqtMWU= -golang.org/x/tools v0.32.0/go.mod h1:ZxrU41P/wAbZD8EDa6dDCa6XfpkhJ7HFMjHJXfBDu8s= +golang.org/x/tools v0.33.0 h1:4qz2S3zmRxbGIhDIAgjxvFutSvH5EfnsYrRBj0UI0bc= +golang.org/x/tools v0.33.0/go.mod h1:CIJMaWEY88juyUfo7UbgPqbC8rU2OqfAV1h2Qp0oMYI= golang.org/x/xerrors v0.0.0-20190717185122-a985d3407aa7/go.mod h1:I/5z698sn9Ka8TeJc9MKroUUfqBBauWjQqLJ2OPfmY0= golang.org/x/xerrors v0.0.0-20191011141410-1b5146add898/go.mod h1:I/5z698sn9Ka8TeJc9MKroUUfqBBauWjQqLJ2OPfmY0= golang.org/x/xerrors v0.0.0-20191204190536-9bdfabe68543/go.mod h1:I/5z698sn9Ka8TeJc9MKroUUfqBBauWjQqLJ2OPfmY0= diff --git a/provisioner/terraform/serve.go b/provisioner/terraform/serve.go index 562946d8ef92e..3e671b0c68e56 100644 --- a/provisioner/terraform/serve.go +++ b/provisioner/terraform/serve.go @@ -16,7 +16,7 @@ import ( "cdr.dev/slog" "github.com/coder/coder/v2/coderd/database" - "github.com/coder/coder/v2/coderd/unhanger" + "github.com/coder/coder/v2/coderd/jobreaper" "github.com/coder/coder/v2/provisionersdk" ) @@ -39,9 +39,9 @@ type ServeOptions struct { // // This is a no-op on Windows where the process can't be interrupted. // - // Default value: 3 minutes (unhanger.HungJobExitTimeout). This value should + // Default value: 3 minutes (jobreaper.HungJobExitTimeout). This value should // be kept less than the value that Coder uses to mark hung jobs as failed, - // which is 5 minutes (see unhanger package). + // which is 5 minutes (see jobreaper package). 
ExitTimeout time.Duration } @@ -131,7 +131,7 @@ func Serve(ctx context.Context, options *ServeOptions) error { options.Tracer = trace.NewNoopTracerProvider().Tracer("noop") } if options.ExitTimeout == 0 { - options.ExitTimeout = unhanger.HungJobExitTimeout + options.ExitTimeout = jobreaper.HungJobExitTimeout } return provisionersdk.Serve(ctx, &server{ execMut: &sync.Mutex{}, diff --git a/scripts/build_go.sh b/scripts/build_go.sh index 3e23e15d8b962..97d9431beb544 100755 --- a/scripts/build_go.sh +++ b/scripts/build_go.sh @@ -144,10 +144,10 @@ fi # We use ts_omit_aws here because on Linux it prevents Tailscale from importing # github.com/aws/aws-sdk-go-v2/aws, which adds 7 MB to the binary. TS_EXTRA_SMALL="ts_omit_aws,ts_omit_bird,ts_omit_tap,ts_omit_kube" -if [[ "$slim" == 0 ]]; then - build_args+=(-tags "embed,$TS_EXTRA_SMALL") -else +if [[ "$slim" == 1 || "$dylib" == 1 ]]; then build_args+=(-tags "slim,$TS_EXTRA_SMALL") +else + build_args+=(-tags "embed,$TS_EXTRA_SMALL") fi if [[ "$agpl" == 1 ]]; then # We don't use a tag to control AGPL because we don't want code to depend on diff --git a/scripts/normalize_path.sh b/scripts/normalize_path.sh new file mode 100644 index 0000000000000..07427aa2bae77 --- /dev/null +++ b/scripts/normalize_path.sh @@ -0,0 +1,55 @@ +#!/bin/bash + +# Call: normalize_path_with_symlinks [target_dir] [dir_prefix] +# +# Normalizes the PATH environment variable by replacing each directory that +# begins with dir_prefix with a symbolic link in target_dir. For example, if +# PATH is "/usr/bin:/bin", target_dir is /tmp, and dir_prefix is /usr, then +# PATH will become "/tmp/0:/bin", where /tmp/0 links to /usr/bin. +# +# This is useful for ensuring that PATH is consistent across CI runs and helps +# with reusing the same cache across them. Many of our go tests read the PATH +# variable, and if it changes between runs, the cache gets invalidated. 
+normalize_path_with_symlinks() { + local target_dir="${1:-}" + local dir_prefix="${2:-}" + + if [[ -z "$target_dir" || -z "$dir_prefix" ]]; then + echo "Usage: normalize_path_with_symlinks <target_dir> <dir_prefix>" + return 1 + fi + + local old_path="$PATH" + local -a new_parts=() + local i=0 + + IFS=':' read -ra _parts <<<"$old_path" + for dir in "${_parts[@]}"; do + # Skip empty components that can arise from "::" + [[ -z $dir ]] && continue + + # Skip directories that don't start with $dir_prefix + if [[ "$dir" != "$dir_prefix"* ]]; then + new_parts+=("$dir") + continue + fi + + local link="$target_dir/$i" + + # Replace any pre-existing file or link at $target_dir/$i + if [[ -e $link || -L $link ]]; then + rm -rf -- "$link" + fi + + # without MSYS ln will deepcopy the directory on Windows + MSYS=winsymlinks:nativestrict ln -s -- "$dir" "$link" + new_parts+=("$link") + i=$((i + 1)) + done + + export PATH + PATH="$( + IFS=':' + echo "${new_parts[*]}" + )" +} diff --git a/scripts/release/docs_update_experiments.sh b/scripts/release/docs_update_experiments.sh index 1e5e6d1eb6b3e..7d7c178a9d4e9 100755 --- a/scripts/release/docs_update_experiments.sh +++ b/scripts/release/docs_update_experiments.sh @@ -12,27 +12,33 @@ set -euo pipefail source "$(dirname "${BASH_SOURCE[0]}")/../lib.sh" cdroot +# Ensure GITHUB_TOKEN is available +if [[ -z "${GITHUB_TOKEN:-}" ]]; then + if GITHUB_TOKEN="$(gh auth token 2>/dev/null)"; then + export GITHUB_TOKEN + else + echo "Error: GitHub token not found. Please run 'gh auth login' to authenticate."
>&2 + exit 1 + fi +fi + if isdarwin; then dependencies gsed gawk sed() { gsed "$@"; } awk() { gawk "$@"; } fi -# From install.sh echo_latest_stable_version() { - # https://gist.github.com/lukechilds/a83e1d7127b78fef38c2914c4ececc3c#gistcomment-2758860 + # Extract redirect URL to determine latest stable tag version="$(curl -fsSLI -o /dev/null -w "%{url_effective}" https://github.com/coder/coder/releases/latest)" version="${version#https://github.com/coder/coder/releases/tag/v}" echo "v${version}" } echo_latest_mainline_version() { - # Fetch the releases from the GitHub API, sort by version number, - # and take the first result. Note that we're sorting by space- - # separated numbers and without utilizing the sort -V flag for the - # best compatibility. + # Use GitHub API to get latest release version, authenticated echo "v$( - curl -fsSL https://api.github.com/repos/coder/coder/releases | + curl -fsSL -H "Authorization: token ${GITHUB_TOKEN}" https://api.github.com/repos/coder/coder/releases | awk -F'"' '/"tag_name"/ {print $4}' | tr -d v | tr . ' ' | @@ -42,7 +48,6 @@ echo_latest_mainline_version() { )" } -# For testing or including experiments from `main`. echo_latest_main_version() { echo origin/main } @@ -59,33 +64,29 @@ sparse_clone_codersdk() { } parse_all_experiments() { - # Go doc doesn't include inline array comments, so this parsing should be - # good enough. We remove all whitespaces so that we can extract a plain - # string that looks like {}, {ExpA}, or {ExpA,ExpB,}. 
- # - # Example: ExperimentsAll=Experiments{ExperimentNotifications,ExperimentAutoFillParameters,} - go doc -all -C "${dir}" ./codersdk ExperimentsAll | + # Try ExperimentsSafe first, then fall back to ExperimentsAll if needed + experiments_var="ExperimentsSafe" + experiments_output=$(go doc -all -C "${dir}" ./codersdk "${experiments_var}" 2>/dev/null || true) + + if [[ -z "${experiments_output}" ]]; then + # Fall back to ExperimentsAll if ExperimentsSafe is not found + experiments_var="ExperimentsAll" + experiments_output=$(go doc -all -C "${dir}" ./codersdk "${experiments_var}" 2>/dev/null || true) + + if [[ -z "${experiments_output}" ]]; then + log "Warning: Neither ExperimentsSafe nor ExperimentsAll found in ${dir}" + return + fi + fi + + echo "${experiments_output}" | tr -d $'\n\t ' | - grep -E -o 'ExperimentsAll=Experiments\{[^}]*\}' | + grep -E -o "${experiments_var}=Experiments\{[^}]*\}" | sed -e 's/.*{\(.*\)}.*/\1/' | tr ',' '\n' } parse_experiments() { - # Extracts the experiment name and description from the Go doc output. - # The output is in the format: - # - # ||Add new experiments here! - # ExperimentExample|example|This isn't used for anything. - # ExperimentAutoFillParameters|auto-fill-parameters|This should not be taken out of experiments until we have redesigned the feature. - # ExperimentMultiOrganization|multi-organization|Requires organization context for interactions, default org is assumed. - # ExperimentCustomRoles|custom-roles|Allows creating runtime custom roles. - # ExperimentNotifications|notifications|Sends notifications via SMTP and webhooks following certain events. - # ExperimentWorkspaceUsage|workspace-usage|Enables the new workspace usage tracking. - # ||ExperimentTest is an experiment with - # ||a preceding multi line comment!? 
- # ExperimentTest|test| - # go doc -all -C "${1}" ./codersdk Experiment | sed \ -e 's/\t\(Experiment[^ ]*\)\ \ *Experiment = "\([^"]*\)"\(.*\/\/ \(.*\)\)\?/\1|\2|\4/' \ @@ -104,6 +105,11 @@ for channel in mainline stable; do log "Fetching experiments from ${channel}" tag=$(echo_latest_"${channel}"_version) + if [[ -z "${tag}" || "${tag}" == "v" ]]; then + echo "Error: Failed to retrieve valid ${channel} version tag. Check your GitHub token or rate limit." >&2 + exit 1 + fi + dir="$(sparse_clone_codersdk "${workdir}" "${channel}" "${tag}")" declare -A all_experiments=() @@ -115,14 +121,12 @@ for channel in mainline stable; do done fi - # Track preceding/multiline comments. maybe_desc= while read -r line; do line=${line//$'\n'/} readarray -d '|' -t parts <<<"$line" - # Missing var/key, this is a comment or description. if [[ -z ${parts[0]} ]]; then maybe_desc+="${parts[2]//$'\n'/ }" continue @@ -133,24 +137,20 @@ for channel in mainline stable; do desc="${parts[2]}" desc=${desc//$'\n'/} - # If desc (trailing comment) is empty, use the preceding/multiline comment. if [[ -z "${desc}" ]]; then desc="${maybe_desc% }" fi maybe_desc= - # Skip experiments not listed in ExperimentsAll. if [[ ! -v all_experiments[$var] ]]; then - log "Skipping ${var}, not listed in ExperimentsAll" + log "Skipping ${var}, not listed in experiments list" continue fi - # Don't overwrite desc, prefer first come, first served (i.e. mainline > stable). if [[ ! -v experiments[$key] ]]; then experiments[$key]="$desc" fi - # Track the release channels where the experiment is available. experiment_tags[$key]+="${channel}, " done < <(parse_experiments "${dir}") done @@ -170,8 +170,6 @@ table="$( done )" -# Use awk to print everything outside the BEING/END block and insert the -# table in between. 
awk \ -v table="${table}" \ 'BEGIN{include=1} /BEGIN: available-experimental-features/{print; print table; include=0} /END: available-experimental-features/{include=1} include' \ @@ -179,5 +177,4 @@ awk \ >"${dest}".tmp mv "${dest}".tmp "${dest}" -# Format the file for a pretty table (target single file for speed). (cd site && pnpm exec prettier --cache --write ../"${dest}") diff --git a/site/src/api/rbacresourcesGenerated.ts b/site/src/api/rbacresourcesGenerated.ts index 079dcb4a87a61..885f603c1eb82 100644 --- a/site/src/api/rbacresourcesGenerated.ts +++ b/site/src/api/rbacresourcesGenerated.ts @@ -130,7 +130,9 @@ export const RBACResourceActions: Partial< update: "update a provisioner daemon", }, provisioner_jobs: { + create: "create provisioner jobs", read: "read provisioner jobs", + update: "update provisioner jobs", }, replicas: { read: "read replicas", @@ -171,7 +173,9 @@ export const RBACResourceActions: Partial< workspace: { application_connect: "connect to workspace apps via browser", create: "create a new workspace", + create_agent: "create a new workspace agent", delete: "delete workspace", + delete_agent: "delete an existing workspace agent", read: "read workspace data to view on the UI", ssh: "ssh into a given workspace", start: "allows starting a workspace", @@ -189,7 +193,9 @@ export const RBACResourceActions: Partial< workspace_dormant: { application_connect: "connect to workspace apps via browser", create: "create a new workspace", + create_agent: "create a new workspace agent", delete: "delete workspace", + delete_agent: "delete an existing workspace agent", read: "read workspace data to view on the UI", ssh: "ssh into a given workspace", start: "allows starting a workspace", diff --git a/site/src/api/typesGenerated.ts b/site/src/api/typesGenerated.ts index 68cf0940ad8e1..74631c2be32fd 100644 --- a/site/src/api/typesGenerated.ts +++ b/site/src/api/typesGenerated.ts @@ -349,7 +349,7 @@ export interface ConvertLoginRequest { // From codersdk/chat.go 
export interface CreateChatMessageRequest { readonly model: string; - // embedded anonymous struct, please fix by naming it + // external type "github.com/kylecarbs/aisdk-go.Message", to include this type the package must be explicitly included in the parsing readonly message: unknown; readonly thinking: boolean; } @@ -490,6 +490,7 @@ export interface CreateWorkspaceBuildRequest { readonly rich_parameter_values?: readonly WorkspaceBuildParameter[]; readonly log_level?: ProvisionerLogLevel; readonly template_version_preset_id?: string; + readonly enable_dynamic_parameters?: boolean; } // From codersdk/workspaceproxy.go @@ -740,6 +741,19 @@ export interface DeploymentValues { readonly address?: string; } +// From codersdk/parameters.go +export interface DiagnosticExtra { + readonly code: string; +} + +// From codersdk/parameters.go +export type DiagnosticSeverityString = "error" | "warning"; + +export const DiagnosticSeverityStrings: DiagnosticSeverityString[] = [ + "error", + "warning", +]; + // From codersdk/workspaceagents.go export type DisplayApp = | "port_forwarding_helper" @@ -756,16 +770,16 @@ export const DisplayApps: DisplayApp[] = [ "web_terminal", ]; -// From codersdk/templateversions.go +// From codersdk/parameters.go export interface DynamicParametersRequest { readonly id: number; readonly inputs: Record; } -// From codersdk/templateversions.go +// From codersdk/parameters.go export interface DynamicParametersResponse { readonly id: number; - readonly diagnostics: PreviewDiagnostics; + readonly diagnostics: readonly FriendlyDiagnostic[]; readonly parameters: readonly PreviewParameter[]; } @@ -968,10 +982,10 @@ export const FormatZip = "zip"; // From codersdk/parameters.go export interface FriendlyDiagnostic { - readonly severity: PreviewDiagnosticSeverityString; + readonly severity: DiagnosticSeverityString; readonly summary: string; readonly detail: string; - readonly extra: PreviewDiagnosticExtra; + readonly extra: DiagnosticExtra; } // From 
codersdk/apikey.go @@ -1352,7 +1366,7 @@ export interface MinimalOrganization { export interface MinimalUser { readonly id: string; readonly username: string; - readonly avatar_url: string; + readonly avatar_url?: string; } // From netcheck/netcheck.go @@ -1595,6 +1609,16 @@ export interface OIDCConfig { readonly skip_issuer_checks: boolean; } +// From codersdk/parameters.go +export type OptionType = "bool" | "list(string)" | "number" | "string"; + +export const OptionTypes: OptionType[] = [ + "bool", + "list(string)", + "number", + "string", +]; + // From codersdk/organizations.go export interface Organization extends MinimalOrganization { readonly description: string; @@ -1615,8 +1639,8 @@ export interface OrganizationMember { // From codersdk/organizations.go export interface OrganizationMemberWithUserData extends OrganizationMember { readonly username: string; - readonly name: string; - readonly avatar_url: string; + readonly name?: string; + readonly avatar_url?: string; readonly email: string; readonly global_roles: readonly SlimRole[]; } @@ -1662,6 +1686,34 @@ export interface Pagination { readonly offset?: number; } +// From codersdk/parameters.go +export type ParameterFormType = + | "checkbox" + | "" + | "dropdown" + | "error" + | "input" + | "multi-select" + | "radio" + | "slider" + | "switch" + | "tag-select" + | "textarea"; + +export const ParameterFormTypes: ParameterFormType[] = [ + "checkbox", + "", + "dropdown", + "error", + "input", + "multi-select", + "radio", + "slider", + "switch", + "tag-select", + "textarea", +]; + // From codersdk/idpsync.go export interface PatchGroupIDPSyncConfigRequest { readonly field: string; @@ -1762,6 +1814,7 @@ export interface PrebuildsConfig { readonly reconciliation_interval: number; readonly reconciliation_backoff_interval: number; readonly reconciliation_backoff_lookback: number; + readonly failure_hard_limit: number; } // From codersdk/presets.go @@ -1777,33 +1830,19 @@ export interface PresetParameter { 
readonly Value: string; } -// From types/diagnostics.go -export interface PreviewDiagnosticExtra { - readonly code: string; - // empty interface{} type, falling back to unknown - readonly Wrapped: unknown; -} - -// From types/diagnostics.go -export type PreviewDiagnosticSeverityString = string; - -// From types/diagnostics.go -export type PreviewDiagnostics = readonly FriendlyDiagnostic[]; - -// From types/parameter.go +// From codersdk/parameters.go export interface PreviewParameter extends PreviewParameterData { readonly value: NullHCLString; - readonly diagnostics: PreviewDiagnostics; + readonly diagnostics: readonly FriendlyDiagnostic[]; } -// From types/parameter.go +// From codersdk/parameters.go export interface PreviewParameterData { readonly name: string; readonly display_name: string; readonly description: string; - readonly type: PreviewParameterType; - // this is likely an enum in an external package "github.com/coder/terraform-provider-coder/v2/provider.ParameterFormType" - readonly form_type: string; + readonly type: OptionType; + readonly form_type: ParameterFormType; readonly styling: PreviewParameterStyling; readonly mutable: boolean; readonly default_value: NullHCLString; @@ -1815,7 +1854,7 @@ export interface PreviewParameterData { readonly ephemeral: boolean; } -// From types/parameter.go +// From codersdk/parameters.go export interface PreviewParameterOption { readonly name: string; readonly description: string; @@ -1823,17 +1862,14 @@ export interface PreviewParameterOption { readonly icon: string; } -// From types/parameter.go +// From codersdk/parameters.go export interface PreviewParameterStyling { readonly placeholder?: string; readonly disabled?: boolean; readonly label?: string; } -// From types/enum.go -export type PreviewParameterType = string; - -// From types/parameter.go +// From codersdk/parameters.go export interface PreviewParameterValidation { readonly validation_error: string; readonly validation_regex: string | null; @@ 
-2096,7 +2132,9 @@ export type RBACAction = | "application_connect" | "assign" | "create" + | "create_agent" | "delete" + | "delete_agent" | "read" | "read_personal" | "ssh" @@ -2112,7 +2150,9 @@ export const RBACActions: RBACAction[] = [ "application_connect", "assign", "create", + "create_agent", "delete", + "delete_agent", "read", "read_personal", "ssh", @@ -2213,11 +2253,11 @@ export interface RateLimitConfig { // From codersdk/users.go export interface ReducedUser extends MinimalUser { - readonly name: string; + readonly name?: string; readonly email: string; readonly created_at: string; readonly updated_at: string; - readonly last_seen_at: string; + readonly last_seen_at?: string; readonly status: UserStatus; readonly login_type: LoginType; readonly theme_preference?: string; @@ -3246,6 +3286,7 @@ export interface Workspace { readonly template_allow_user_cancel_workspace_jobs: boolean; readonly template_active_version_id: string; readonly template_require_active_version: boolean; + readonly template_use_classic_parameter_flow: boolean; readonly latest_build: WorkspaceBuild; readonly latest_app_status: WorkspaceAppStatus | null; readonly outdated: boolean; @@ -3568,7 +3609,7 @@ export interface WorkspaceBuild { readonly workspace_name: string; readonly workspace_owner_id: string; readonly workspace_owner_name: string; - readonly workspace_owner_avatar_url: string; + readonly workspace_owner_avatar_url?: string; readonly template_version_id: string; readonly template_version_name: string; readonly build_number: number; diff --git a/site/src/components/FeatureStageBadge/FeatureStageBadge.tsx b/site/src/components/FeatureStageBadge/FeatureStageBadge.tsx index 25339d3120778..18b03b2e93661 100644 --- a/site/src/components/FeatureStageBadge/FeatureStageBadge.tsx +++ b/site/src/components/FeatureStageBadge/FeatureStageBadge.tsx @@ -18,6 +18,7 @@ export const featureStageBadgeTypes = { type FeatureStageBadgeProps = Readonly< Omit, "children"> & { contentType: keyof 
typeof featureStageBadgeTypes; + labelText?: string; size?: "sm" | "md" | "lg"; showTooltip?: boolean; } @@ -25,6 +26,7 @@ type FeatureStageBadgeProps = Readonly< export const FeatureStageBadge: FC = ({ contentType, + labelText = "", size = "md", showTooltip = true, // This is a temporary until the deprecated popover is removed ...delegatedProps @@ -43,7 +45,8 @@ export const FeatureStageBadge: FC = ({ {...delegatedProps} > (This is a - + + {labelText && `${labelText} `} {featureStageBadgeTypes[contentType]} feature) @@ -105,13 +108,6 @@ const styles = { backgroundColor: theme.branding.featureStage.hover.background, }), - badgeLabel: { - // Have to set display mode to anything other than inline, or else the - // CSS capitalization algorithm won't capitalize the element - display: "inline-block", - textTransform: "capitalize", - }, - badgeLargeText: { fontSize: "1rem", }, diff --git a/site/src/components/Filter/Filter.tsx b/site/src/components/Filter/Filter.tsx index ede669416d743..1d568e84a5d2b 100644 --- a/site/src/components/Filter/Filter.tsx +++ b/site/src/components/Filter/Filter.tsx @@ -1,5 +1,4 @@ import { useTheme } from "@emotion/react"; -import Button from "@mui/material/Button"; import Divider from "@mui/material/Divider"; import Menu from "@mui/material/Menu"; import MenuItem from "@mui/material/MenuItem"; @@ -10,6 +9,7 @@ import { hasError, isApiValidationError, } from "api/errors"; +import { Button } from "components/Button/Button"; import { InputGroup } from "components/InputGroup/InputGroup"; import { SearchField } from "components/SearchField/SearchField"; import { useDebouncedFunction } from "hooks/debounce"; @@ -267,9 +267,11 @@ const PresetMenu: FC = ({ = ({ {selectedOption?.label ?? 
placeholder} diff --git a/site/src/components/IconField/IconField.tsx b/site/src/components/IconField/IconField.tsx index b55ed59445dc6..5a272d44bfd80 100644 --- a/site/src/components/IconField/IconField.tsx +++ b/site/src/components/IconField/IconField.tsx @@ -1,17 +1,16 @@ import { Global, css, useTheme } from "@emotion/react"; -import Button from "@mui/material/Button"; import InputAdornment from "@mui/material/InputAdornment"; import TextField, { type TextFieldProps } from "@mui/material/TextField"; import { visuallyHidden } from "@mui/utils"; -import { DropdownArrow } from "components/DropdownArrow/DropdownArrow"; +import { Button } from "components/Button/Button"; import { ExternalImage } from "components/ExternalImage/ExternalImage"; import { Loader } from "components/Loader/Loader"; -import { Stack } from "components/Stack/Stack"; import { Popover, PopoverContent, PopoverTrigger, } from "components/deprecated/Popover/Popover"; +import { ChevronDownIcon } from "lucide-react"; import { type FC, Suspense, lazy, useState } from "react"; // See: https://github.com/missive/emoji-mart/issues/51#issuecomment-287353222 @@ -40,7 +39,7 @@ export const IconField: FC = ({ const [open, setOpen] = useState(false); return ( - +
= ({ /> - - + }> { @@ -128,6 +125,6 @@ export const IconField: FC = ({
)} -
+ ); }; diff --git a/site/src/components/RichParameterInput/RichParameterInput.tsx b/site/src/components/RichParameterInput/RichParameterInput.tsx index c9a5c895e5825..1af3245b98c7b 100644 --- a/site/src/components/RichParameterInput/RichParameterInput.tsx +++ b/site/src/components/RichParameterInput/RichParameterInput.tsx @@ -1,5 +1,4 @@ import type { Interpolation, Theme } from "@emotion/react"; -import Button from "@mui/material/Button"; import FormControlLabel from "@mui/material/FormControlLabel"; import FormHelperText from "@mui/material/FormHelperText"; import type { InputBaseComponentProps } from "@mui/material/InputBase"; @@ -8,6 +7,7 @@ import RadioGroup from "@mui/material/RadioGroup"; import TextField, { type TextFieldProps } from "@mui/material/TextField"; import Tooltip from "@mui/material/Tooltip"; import type { TemplateVersionParameter } from "api/typesGenerated"; +import { Button } from "components/Button/Button"; import { ExternalImage } from "components/ExternalImage/ExternalImage"; import { MemoizedMarkdown } from "components/Markdown/Markdown"; import { Pill } from "components/Pill/Pill"; @@ -240,7 +240,9 @@ export const RichParameterInput: FC = ({ !hideSuggestion && ( - ); - }, -); +export type SelectMenuButtonProps = ButtonProps & { + startIcon?: React.ReactNode; +}; + +export const SelectMenuButton = forwardRef< + HTMLButtonElement, + SelectMenuButtonProps +>((props, ref) => { + const { startIcon, ...restProps } = props; + return ( + + ); +}); export const SelectMenuSearch: FC = (props) => { return ( diff --git a/site/src/components/SignInLayout/SignInLayout.tsx b/site/src/components/SignInLayout/SignInLayout.tsx index 6a0d4f5865ea1..c557fd3b4c797 100644 --- a/site/src/components/SignInLayout/SignInLayout.tsx +++ b/site/src/components/SignInLayout/SignInLayout.tsx @@ -17,7 +17,8 @@ export const SignInLayout: FC = ({ children }) => { const styles = { container: { flex: 1, - height: "-webkit-fill-available", + // Fallback to 100vh + height: 
["100vh", "-webkit-fill-available"], display: "flex", justifyContent: "center", alignItems: "center", diff --git a/site/src/components/UserAutocomplete/UserAutocomplete.tsx b/site/src/components/UserAutocomplete/UserAutocomplete.tsx index e375116cd2d22..c1b86e4d23afc 100644 --- a/site/src/components/UserAutocomplete/UserAutocomplete.tsx +++ b/site/src/components/UserAutocomplete/UserAutocomplete.tsx @@ -20,7 +20,7 @@ import { prepareQuery } from "utils/filters"; // The common properties between users and org members that we need. export type SelectedUser = { - avatar_url: string; + avatar_url?: string; email: string; username: string; }; diff --git a/site/src/index.css b/site/src/index.css index f3bf0918ddb3a..04b388a5cba99 100644 --- a/site/src/index.css +++ b/site/src/index.css @@ -107,4 +107,22 @@ --removed-body-scroll-bar-size: 0 !important; margin-right: 0 !important; } + + /* Prevent layout shift when modals open by maintaining scrollbar width */ + html { + scrollbar-gutter: stable; + } + + /* + This is a temporary fix for MUI Modals/Popovers until they are removed. + When html has scrollbar-gutter: stable, the browser reserves space for the scrollbar. + MUI Modals/Popovers, when locking body scroll, add `overflow: hidden` and `padding-right` + to the body to compensate for the scrollbar they are hiding. This added padding-right + conflicts with the already reserved gutter space, causing a layout shift. + This rule overrides MUI's added padding-right on the body specifically when MUI + is likely to have set both overflow:hidden and padding-right. 
+ */ + body[style*="overflow: hidden"][style*="padding-right"] { + padding-right: 0px !important; + } } diff --git a/site/src/modules/hooks/useSyncFormParameters.ts b/site/src/modules/hooks/useSyncFormParameters.ts new file mode 100644 index 0000000000000..4f6952331eaaf --- /dev/null +++ b/site/src/modules/hooks/useSyncFormParameters.ts @@ -0,0 +1,53 @@ +import type * as TypesGen from "api/typesGenerated"; +import { useEffect, useRef } from "react"; + +import type { PreviewParameter } from "api/typesGenerated"; + +type UseSyncFormParametersProps = { + parameters: readonly PreviewParameter[]; + formValues: readonly TypesGen.WorkspaceBuildParameter[]; + setFieldValue: ( + field: string, + value: TypesGen.WorkspaceBuildParameter[], + ) => void; +}; + +export function useSyncFormParameters({ + parameters, + formValues, + setFieldValue, +}: UseSyncFormParametersProps) { + // Form values only needs to be updated when parameters change + // Keep track of form values in a ref to avoid unnecessary updates to rich_parameter_values + const formValuesRef = useRef(formValues); + + useEffect(() => { + formValuesRef.current = formValues; + }, [formValues]); + + useEffect(() => { + if (!parameters) return; + const currentFormValues = formValuesRef.current; + + const newParameterValues = parameters.map((param) => ({ + name: param.name, + value: param.value.valid ? 
param.value.value : "", + })); + + const currentFormValuesMap = new Map( + currentFormValues.map((value) => [value.name, value.value]), + ); + + const isChanged = + currentFormValues.length !== newParameterValues.length || + newParameterValues.some( + (p) => + !currentFormValuesMap.has(p.name) || + currentFormValuesMap.get(p.name) !== p.value, + ); + + if (isChanged) { + setFieldValue("rich_parameter_values", newParameterValues); + } + }, [parameters, setFieldValue]); +} diff --git a/site/src/modules/resources/AgentDevcontainerCard.tsx b/site/src/modules/resources/AgentDevcontainerCard.tsx index d9a591625b2f8..543004de5c1e2 100644 --- a/site/src/modules/resources/AgentDevcontainerCard.tsx +++ b/site/src/modules/resources/AgentDevcontainerCard.tsx @@ -88,7 +88,7 @@ export const AgentDevcontainerCard: FC = ({ return ( - + diff --git a/site/src/modules/resources/AgentRow.tsx b/site/src/modules/resources/AgentRow.tsx index 4e53c2cf2ba2c..f97c91e89af2a 100644 --- a/site/src/modules/resources/AgentRow.tsx +++ b/site/src/modules/resources/AgentRow.tsx @@ -15,6 +15,7 @@ import { DropdownArrow } from "components/DropdownArrow/DropdownArrow"; import type { Line } from "components/Logs/LogLine"; import { Stack } from "components/Stack/Stack"; import { useProxy } from "contexts/ProxyContext"; +import { AppStatuses } from "pages/WorkspacePage/AppStatuses"; import { type FC, useCallback, @@ -225,6 +226,13 @@ export const AgentRow: FC = ({
+ {workspace.latest_app_status?.agent_id === agent.id && ( +
+

App statuses

+ +
+ )} + {agent.status === "connected" && (
{shouldDisplayApps && ( diff --git a/site/src/modules/resources/useAgentLogs.test.ts b/site/src/modules/resources/useAgentLogs.test.ts index 8480f756611d2..a5339e00c87eb 100644 --- a/site/src/modules/resources/useAgentLogs.test.ts +++ b/site/src/modules/resources/useAgentLogs.test.ts @@ -1,4 +1,4 @@ -import { renderHook } from "@testing-library/react"; +import { renderHook, waitFor } from "@testing-library/react"; import type { WorkspaceAgentLog } from "api/typesGenerated"; import WS from "jest-websocket-mock"; import { MockWorkspaceAgent } from "testHelpers/entities"; @@ -29,17 +29,23 @@ describe("useAgentLogs", () => { // Send 3 logs server.send(JSON.stringify(generateLogs(3))); - expect(result.current).toHaveLength(3); + await waitFor(() => { + expect(result.current).toHaveLength(3); + }); // Disable the hook rerender({ enabled: false }); - expect(result.current).toHaveLength(0); + await waitFor(() => { + expect(result.current).toHaveLength(0); + }); // Enable the hook again rerender({ enabled: true }); await server.connected; server.send(JSON.stringify(generateLogs(3))); - expect(result.current).toHaveLength(3); + await waitFor(() => { + expect(result.current).toHaveLength(3); + }); }); }); diff --git a/site/src/modules/workspaces/DynamicParameter/DynamicParameter.tsx b/site/src/modules/workspaces/DynamicParameter/DynamicParameter.tsx index cbc7852bd14e5..96727cd0c796f 100644 --- a/site/src/modules/workspaces/DynamicParameter/DynamicParameter.tsx +++ b/site/src/modules/workspaces/DynamicParameter/DynamicParameter.tsx @@ -89,9 +89,7 @@ export const DynamicParameter: FC = ({ /> )}
- {parameter.diagnostics.length > 0 && ( - - )} + ); }; @@ -112,6 +110,9 @@ const ParameterLabel: FC = ({ const displayName = parameter.display_name ? parameter.display_name : parameter.name; + const hasRequiredDiagnostic = parameter.diagnostics?.find( + (d) => d.extra?.code === "required", + ); return (
@@ -186,6 +187,22 @@ const ParameterLabel: FC = ({ )} + {hasRequiredDiagnostic && ( + + + + + + Required + + + + + {hasRequiredDiagnostic.summary || "Required parameter"} + + + + )} {Boolean(parameter.description) && ( @@ -222,6 +239,15 @@ const DebouncedParameterField: FC = ({ const onChangeEvent = useEffectEvent(onChange); // prevDebouncedValueRef is to prevent calling the onChangeEvent on the initial render const prevDebouncedValueRef = useRef(); + const prevValueRef = useRef(value); + + // This is necessary in the case of fields being set by preset parameters + useEffect(() => { + if (value !== undefined && value !== prevValueRef.current) { + setLocalValue(value); + prevValueRef.current = value; + } + }, [value]); useEffect(() => { if (prevDebouncedValueRef.current !== undefined) { @@ -230,18 +256,30 @@ const DebouncedParameterField: FC = ({ prevDebouncedValueRef.current = debouncedLocalValue; }, [debouncedLocalValue, onChangeEvent]); + const textareaRef = useRef(null); + + const resizeTextarea = useEffectEvent(() => { + if (textareaRef.current) { + const textarea = textareaRef.current; + textarea.style.height = `${textarea.scrollHeight}px`; + } + }); + + useEffect(() => { + resizeTextarea(); + }, [resizeTextarea]); switch (parameter.form_type) { - case "textarea": + case "textarea": { return (