Skip to content

docs: add workspace process logging doc #9002

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Aug 18, 2023
Merged
Show file tree
Hide file tree
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Prev Previous commit
fixup! docs: add workspace process logging doc
  • Loading branch information
deansheather committed Aug 18, 2023
commit 69358cd6b5ae973fc574aa5ad9d503a32e3a09dd
2 changes: 1 addition & 1 deletion docs/templates/docker-in-workspaces.md
Original file line number Diff line number Diff line change
Expand Up @@ -13,7 +13,7 @@ There are a few ways to run Docker within container-based Coder workspaces.

The [Sysbox](https://github.com/nestybox/sysbox) container runtime allows unprivileged users to run system-level applications, such as Docker, securely from the workspace containers. Sysbox requires a [compatible Linux distribution](https://github.com/nestybox/sysbox/blob/master/docs/distro-compat.md) to implement these security features. Sysbox can also be used to run systemd inside Coder workspaces. See [Systemd in Docker](#systemd-in-docker).

The Sysbox container runtime is not compatible with our [workspace process logging](./process-logging.md) feature.
The Sysbox container runtime is not compatible with our [workspace process logging](./process-logging.md) feature. Envbox is compatible with process logging, however.

### Use Sysbox in Docker-based templates

Expand Down
315 changes: 169 additions & 146 deletions docs/templates/process-logging.md
Original file line number Diff line number Diff line change
Expand Up @@ -6,14 +6,16 @@ processes executing in the workspace.
> **Note:** This feature is only available on Linux in Kubernetes. There are
> additional requirements outlined further in this document.
>
> This is an Enterprise feature. To learn more about Coder Enterprise, please
> contact sales.
> This is an [Enterprise](https://coder.com/docs/v2/latest/enterprise) feature.
> To learn more about Coder Enterprise, please
> [contact sales](https://coder.com/contact).

Workspace process logging adds a sidecar container to workspace pods that will
log all processes started in the workspace container (e.g., commands executed in
the terminal or processes created in the background by other processes). You
can view the output from the sidecar or send it to a monitoring stack, such as
CloudWatch, for further analysis or long-term storage.
the terminal or processes created in the background by other processes).
Processes launched inside containers or nested containers within the workspace
are also logged. You can view the output from the sidecar or send it to a
monitoring stack, such as CloudWatch, for further analysis or long-term storage.

Please note that these logs are not recorded or captured by the Coder
organization in any way, shape, or form.
Expand All @@ -25,10 +27,10 @@ impact) to perform in-kernel logging and filtering of all exec system calls
originating from the workspace container.

The core of this feature is also open source and can be found in the
[exectrace](https://github.com/coder/exectrace) repo on GitHub repo. The
enterprise component (in the `enterprise/` directory of the repo) is responsible
for starting the eBPF program with the correct filtering options for the
specific workspace.
[exectrace](https://github.com/coder/exectrace) GitHub repo. The enterprise
component (in the `enterprise/` directory of the repo) is responsible for
starting the eBPF program with the correct filtering options for the specific
workspace.

## Requirements

Expand All @@ -38,18 +40,18 @@ The host machine must be running a Linux kernel >= 5.8 with the kernel config
To check your kernel version, run:

```shell
$ uname -r
uname -r
```

To validate the required kernel config is enabled, run either of the following
commands on your nodes directly (_not_ from a workspace terminal):

```shell
$ cat /proc/config.gz | gunzip | grep CONFIG_DEBUG_INFO_BTF
cat /proc/config.gz | gunzip | grep CONFIG_DEBUG_INFO_BTF
```

```shell
$ cat "/boot/config-$(uname -r)" | grep CONFIG_DEBUG_INFO_BTF
cat "/boot/config-$(uname -r)" | grep CONFIG_DEBUG_INFO_BTF
```

If these requirements are not met, workspaces will fail to start for security
Expand All @@ -75,145 +77,165 @@ would like to add workspace process logging to, follow these steps:

1. Add the following section to your template's `main.tf` file:

```hcl
locals {
# This is the init script for the main workspace container that runs before the
# agent starts to configure workspace process logging.
exectrace_init_script = <<EOT
set -eu
pidns_inum=$(readlink /proc/self/ns/pid | sed 's/[^0-9]//g')
if [ -z "$pidns_inum" ]; then
echo "Could not determine process ID namespace inum"
exit 1
fi

# Before we start the script, does curl exist?
if ! command -v curl >/dev/null 2>&1; then
echo "curl is required to download the Coder binary"
echo "Please install curl to your image and try again"
# 127 is command not found.
exit 127
fi

echo "Sending process ID namespace inum to exectrace sidecar"
rc=0
max_retry=5
counter=0
until [ $counter -ge $max_retry ]; do
set +e
curl \
--fail \
--silent \
--connect-timeout 5 \
-X POST \
-H "Content-Type: text/plain" \
--data "$pidns_inum" \
http://127.0.0.1:56123
rc=$?
set -e
if [ $rc -eq 0 ]; then
break
fi

counter=$((counter+1))
echo "Curl failed with exit code $${rc}, attempt $${counter}/$${max_retry}; Retrying in 3 seconds..."
sleep 3
done
if [ $rc -ne 0 ]; then
echo "Failed to send process ID namespace inum to exectrace sidecar"
exit $rc
fi

EOT
}
```
<!--
If you are updating this section, please also update the example templates
in the exectrace repo.
-->

```hcl
locals {
# This is the init script for the main workspace container that runs before the
# agent starts to configure workspace process logging.
exectrace_init_script = <<EOT
set -eu
pidns_inum=$(readlink /proc/self/ns/pid | sed 's/[^0-9]//g')
if [ -z "$pidns_inum" ]; then
echo "Could not determine process ID namespace inum"
exit 1
fi

# Before we start the script, does curl exist?
if ! command -v curl >/dev/null 2>&1; then
echo "curl is required to download the Coder binary"
echo "Please install curl to your image and try again"
# 127 is command not found.
exit 127
fi

echo "Sending process ID namespace inum to exectrace sidecar"
rc=0
max_retry=5
counter=0
until [ $counter -ge $max_retry ]; do
set +e
curl \
--fail \
--silent \
--connect-timeout 5 \
-X POST \
-H "Content-Type: text/plain" \
--data "$pidns_inum" \
http://127.0.0.1:56123
rc=$?
set -e
if [ $rc -eq 0 ]; then
break
fi

counter=$((counter+1))
echo "Curl failed with exit code $${rc}, attempt $${counter}/$${max_retry}; Retrying in 3 seconds..."
sleep 3
done
if [ $rc -ne 0 ]; then
echo "Failed to send process ID namespace inum to exectrace sidecar"
exit $rc
fi

EOT
}
```

1. Update the `command` of your workspace container like the following:

```hcl
resource "kubernetes_pod" "main" {
...
spec {
...
container {
...
// NOTE: this command is changed compared to the upstream kubernetes
// template
command = [
"sh",
"-c",
"${local.exectrace_init_script}\n\n${coder_agent.main.init_script}",
]
...
}
...
}
...
}
```

> **Note:** If you are using the `envbox` template, you will need to update
> the third argument to be
> `"${local.exectrace_init_script}\n\nexec /envbox docker"` instead.
<!--
If you are updating this section, please also update the example templates
in the exectrace repo.
-->

```hcl
resource "kubernetes_pod" "main" {
...
spec {
...
container {
...
// NOTE: this command is changed compared to the upstream kubernetes
// template
command = [
"sh",
"-c",
"${local.exectrace_init_script}\n\n${coder_agent.main.init_script}",
]
...
}
...
}
...
}
```

> **Note:** If you are using the `envbox` template, you will need to update
> the third argument to be
> `"${local.exectrace_init_script}\n\nexec /envbox docker"` instead.

1. Add the following container to your workspace pod spec.

```hcl
resource "kubernetes_pod" "main" {
...
spec {
...
// NOTE: this container is added compared to the upstream kubernetes
// template
container {
name = "exectrace"
image = "ghcr.io/coder/exectrace:latest"
image_pull_policy = "Always"
command = [
"/opt/exectrace",
"--init-address", "127.0.0.1:56123",
"--label", "workspace_id=${data.coder_workspace.me.id}",
"--label", "workspace_name=${data.coder_workspace.me.name}",
"--label", "user_id=${data.coder_workspace.me.owner_id}",
"--label", "username=${data.coder_workspace.me.owner}",
"--label", "user_email=${data.coder_workspace.me.owner_email}",
]
security_context {
// exectrace must be started as root so it can attach probes into the
// kernel to record process events with high throughput.
run_as_user = "0"
run_as_group = "0"
// exectrace requires a privileged container so it can control mounts
// and perform privileged syscalls against the host kernel to attach
// probes.
privileged = true
}
}
...
}
...
}
```

> **Note:** `exectrace` requires root privileges and a privileged container
> to attach probes to the kernel. This is a requirement of eBPF.
<!--
If you are updating this section, please also update the example templates
in the exectrace repo.
-->

```hcl
resource "kubernetes_pod" "main" {
...
spec {
...
// NOTE: this container is added compared to the upstream kubernetes
// template
container {
name = "exectrace"
image = "ghcr.io/coder/exectrace:latest"
image_pull_policy = "Always"
command = [
"/opt/exectrace",
"--init-address", "127.0.0.1:56123",
"--label", "workspace_id=${data.coder_workspace.me.id}",
"--label", "workspace_name=${data.coder_workspace.me.name}",
"--label", "user_id=${data.coder_workspace.me.owner_id}",
"--label", "username=${data.coder_workspace.me.owner}",
"--label", "user_email=${data.coder_workspace.me.owner_email}",
]
security_context {
// exectrace must be started as root so it can attach probes into the
// kernel to record process events with high throughput.
run_as_user = "0"
run_as_group = "0"
// exectrace requires a privileged container so it can control mounts
// and perform privileged syscalls against the host kernel to attach
// probes.
privileged = true
}
}
...
}
...
}
```

> **Note:** `exectrace` requires root privileges and a privileged container
> to attach probes to the kernel. This is a requirement of eBPF.

1. Add the following environment variable to your workspace pod:

```hcl
resource "kubernetes_pod" "main" {
...
spec {
...
env {
name = "CODER_AGENT_SUBSYSTEM"
value = "exectrace"
}
...
}
...
}
```
<!--
If you are updating this section, please also update the example templates
in the exectrace repo.
-->

```hcl
resource "kubernetes_pod" "main" {
...
spec {
...
env {
name = "CODER_AGENT_SUBSYSTEM"
value = "exectrace"
}
...
}
...
}
```

Once you have made these changes, you can push a new version of your template
and workspace process logging will be enabled for all workspaces once they are
Expand All @@ -225,7 +247,7 @@ To view the process logs for a specific workspace you can use `kubectl` to print
the logs:

```bash
$ kubectl logs pod-name --container exectrace
kubectl logs pod-name --container exectrace
```

The raw logs will look something like this:
Expand Down Expand Up @@ -259,9 +281,10 @@ The raw logs will look something like this:

### View logs in AWS EKS

If you're using AWS' Elastic Kubernetes Service, you can [configure your
cluster][eks-cloudwatch] to send logs to CloudWatch. This allows you to view the
logs for a specific user or workspace.
If you're using AWS' Elastic Kubernetes Service, you can
[configure your cluster](https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/Container-Insights-EKS-logs.html)
to send logs to CloudWatch. This allows you to view the logs for a specific user
or workspace.

To view your logs, go to the CloudWatch dashboard (which is available on the
**Log Insights** tab) and run a query similar to the following:
Expand Down