Agent Metadata #3480

Closed
ammario opened this issue Aug 11, 2022 · 13 comments · Fixed by #6614

@ammario
Member

ammario commented Aug 11, 2022

The original idea came from @kconley-sq.

Right now, Resource Metadata values are generated at workspace build time. For example:

resource "coder_metadata" "deployment" {
  count = data.coder_workspace.me.start_count
  resource_id = kubernetes_deployment.coder[0].id
  item {
    key = "name"
    value = kubernetes_deployment.coder[0].metadata[0].name
  }
}

The feature excels at showing workspace configuration but can't expose key dynamic information such as CPU usage, load average, and the number of active connections. Here's an alternate syntax that could capture these dynamic values:

resource "coder_metadata" "deployment" {
  count = data.coder_workspace.me.start_count
  resource_id = kubernetes_deployment.coder[0].id
  item {
    key = "name"
    value_cmd = "cat /proc/loadavg"
    # This could also be a cron
    value_cmd_interval = "5s"
  }
}

The agent would execute the value_cmd. So, it could also make sense to configure dynamic metadata in the coder_agent definition block. However they are defined, I think it makes sense to expose these values in the Agent section of the resources table. See below:

image

Ideas:

  • CPU
  • Memory
  • curl ifconfig.me
  • git status
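
To make the load-average idea concrete, here is a minimal sketch of a helper that a `value_cmd` could call. The function name `load_average` and its optional path argument are assumptions made for illustration; they are not part of any Coder API, and `/proc/loadavg` only exists on Linux.

```shell
# Print the 1-, 5-, and 15-minute load averages from a loadavg-style file.
# $1: file to read (hypothetical parameter; defaults to /proc/loadavg on Linux).
load_average() {
  # The first three space-separated fields are the load averages;
  # the remaining fields (run queue, last PID) are dropped.
  cut -d ' ' -f 1-3 "${1:-/proc/loadavg}"
}
```

With a definition like this, the proposed `value_cmd` would only need to invoke `load_average`, and trimming the extra fields keeps the value short enough for a table cell.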
@spikecurtis
Contributor

I wonder whether there is just a small set of key metrics that basically everyone wants like CPU, memory, network IO that we could just implement on each supported platform.

Or is each deployment a special snowflake in this regard such that template authors should be defining these?

@ammario
Member Author

ammario commented Aug 12, 2022

@spikecurtis it's a good inquiry. There is so much ambiguity around "compute resources" that I struggle to see how hard-coded metrics would be compatible with our product's tenet of flexibility. Some examples:

  • Disk is consumed across both storage and inodes. In a massive npm codebase (many small files), you may run out of inodes before you run out of storage.
  • There could be multiple disks attached to different mountpoints, further complicating "disk used".
  • CPU can be presented across usr and sys or combined.
  • Memory can be looked at from various perspectives similar to CPU.
  • In containers and overprovisioning scenarios, it may be misleading to show the host's CPU, memory, etc.
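
As a rough illustration of the storage-versus-inodes point, the sketch below reports both figures for one mount point. The function names are hypothetical. It assumes a `df` that accepts `-i` alongside `-P` (GNU coreutils does; `-i` is not required by POSIX) and a filesystem name without spaces.

```shell
# Read `df -P`-style output on stdin and print the use% column (5th field)
# of the first data row. Breaks if the filesystem name contains spaces.
usage_percent() {
  awk 'NR == 2 { print $5 }'
}

# Print storage and inode usage for one path.
# $1: path to inspect (hypothetical parameter; defaults to $HOME).
disk_summary() {
  path="${1:-$HOME}"
  printf 'storage %s, inodes %s\n' \
    "$(df -P "$path" | usage_percent)" \
    "$(df -Pi "$path" | usage_percent)"
}
```

A template author with several mount points could call `disk_summary` once per path, which is the kind of per-deployment choice that hard-coded metrics would not cover.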

@mattlqx

mattlqx commented Aug 12, 2022

I like the idea of defining these in blocks. In many cases, though, the metrics might already be collected somewhere else. For example, if you're launching workspaces as pods in a Kubernetes cluster, there are already going to be a lot of metrics available via a Prometheus stack. Perhaps there could be a type of block that connects these by specifying a Prometheus endpoint and query? I think both the simple exec definitions and a way to link existing metrics would be desirable.

@github-actions

This issue is becoming stale. In order to keep the tracker readable and actionable, I'm going to close this issue in 7 days if there isn't more activity.

github-actions bot added the "stale" label Oct 12, 2022
github-actions bot closed this as not planned Oct 20, 2022
ammario reopened this Feb 6, 2023
@ammario
Member Author

ammario commented Feb 6, 2023

It could be cool if there were a value_type = "tty", so that we could preserve coloring/formatting on commands like git diff.

@kylecarbs
Member

It'd be interesting if output would render as markdown... then you could do something like:

echo "[$(git rev-parse --abbrev-ref HEAD)](https://github.com/coder/coder/tree/$(git rev-parse --abbrev-ref HEAD))"

Which would render as:

[branch-name](https://github.com/coder/coder/tree/branch-name)

and be clickable in the UI...

@ammario
Member Author

ammario commented Feb 6, 2023

value_type = "markdown|tty|code|string"...

github-actions bot removed the "stale" label Feb 7, 2023
@BrunoQuaresma
Collaborator

BrunoQuaresma commented Feb 7, 2023

I see a lot of potential for these. Talking to @bpmct and @ammario yesterday I can visualize it being used for metrics as well:

...
item {
  type = "metrics:usage"
  key = "Disk"
  value_cmd = "sh ./getDiskUsage.sh"
  value_cmd_interval = "1hour"
  value_total_usage = "2" // Only available for metrics:usage type and used for UI purposes to display "1GB of 2GB"
  value_unit = "GB" // When the value has a unit, used for UI
  ttl = "30days" // By default we don't save the results, but we can save them for a specified time and display a chart of that data in the UI
}

item {
  type = "metrics:execution" // This type would create an alias for "make build" and store the execution time, so the user could track the performance of a command execution over time.
  key = "Build time"
  value_cmd = "make build"
  ttl = "30days"
}
...
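
To sketch what a "metrics:execution" item might measure, here is a minimal wrapper that runs a command and prints how long it took. The name `time_command` is hypothetical, and the one-second resolution comes from `date +%s`; it is an illustration of the idea, not the proposed implementation.

```shell
# Run a command and print its wall-clock duration in whole seconds.
# The command's own output is discarded so only the duration is printed.
time_command() {
  start=$(date +%s)
  "$@" >/dev/null 2>&1
  status=$?            # keep the command's exit code for the caller
  end=$(date +%s)
  printf '%d\n' "$((end - start))"
  return "$status"
}
```

For the "Build time" item above, the call would be `time_command make build`; the exit status is passed through so a failed build can still be told apart from a fast one.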

ammario assigned ammario and unassigned ammario Feb 13, 2023
ammario mentioned this issue Mar 15, 2023
10 tasks
@bpmct
Member

bpmct commented Mar 15, 2023

How is this going to work for Docker/Kubernetes, where top and htop don't show accurate data? Would it use cgroups or the Kubernetes metrics server?

@ammario
Member Author

ammario commented Mar 16, 2023

@bpmct, probably one of those two methods, but the decision is left to the user.
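
As one possible shape for the cgroup route, the sketch below reads a container's memory usage from cgroup v2 accounting files rather than from host-wide tools. It assumes cgroup v2 mounted at /sys/fs/cgroup (the files `memory.current` and `memory.max`); the function name and directory argument are hypothetical, and cgroup v1 hosts use different file names.

```shell
# Print memory usage for the current cgroup (v2 layout assumed).
# $1: cgroup directory (hypothetical parameter; defaults to /sys/fs/cgroup).
cgroup_memory() {
  dir="${1:-/sys/fs/cgroup}"
  used=$(cat "$dir/memory.current")
  limit=$(cat "$dir/memory.max")   # the literal string "max" means no limit
  if [ "$limit" = "max" ]; then
    printf '%d MiB (no limit)\n' "$((used / 1048576))"
  else
    printf '%d MiB of %d MiB\n' "$((used / 1048576))" "$((limit / 1048576))"
  fi
}
```

Because the values come from the container's own cgroup, they reflect the workspace's limit rather than the host's total memory.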

@bpmct
Member

bpmct commented Mar 16, 2023

Are we going to include agent metadata in our Kubernetes example? Want to make sure we have a solid answer for monitoring Kubernetes CPU, memory, disk usage.

@ammario
Member Author

ammario commented Mar 17, 2023

Yeah, we should. I probably won't go through that effort in my PR though.

bpmct added the "roadmap" label Mar 24, 2023
@matifali
Member

> Are we going to include agent metadata in our Kubernetes example? Want to make sure we have a solid answer for monitoring Kubernetes CPU, memory, disk usage.

I can update the templates after #6614 is merged.
