coder stat returns wrong memory units #10952

Closed
f0ssel opened this issue Nov 30, 2023 · 5 comments · Fixed by #11107

@f0ssel
Contributor

f0ssel commented Nov 30, 2023

I noticed that the dashboard reports my RAM usage as 0.197/8589934592 GiB (0%), which is obviously incorrect. It looks like just a unit output issue.
[Screenshot 2023-11-30 at 9:25:15 AM]

This is from running scripts/develop.sh on dogfood.

I thought this might be a frontend bug, but I looked at the metadata stream and the value from the backend is already incorrect, so I assume this is coming from coder stat mem.

Here's the message from the backend:

[
  {
    "result": {
      "collected_at": "2023-11-30T14:26:56.332349Z",
      "age": 6,
      "value": "0.464 cores\n",
      "error": ""
    },
    "description": {
      "display_name": "CPU Usage",
      "key": "0_cpu_usage",
      "script": "coder stat cpu",
      "interval": 10,
      "timeout": 1
    }
  },
  {
    "result": {
      "collected_at": "2023-11-30T14:26:57.331663Z",
      "age": 5,
      "value": "0.197/8589934592 GiB (0%)\n",
      "error": ""
    },
    "description": {
      "display_name": "RAM Usage",
      "key": "1_ram_usage",
      "script": "coder stat mem",
      "interval": 10,
      "timeout": 1
    }
  },
  {
    "result": {
      "collected_at": "2023-11-30T14:26:36.332216Z",
      "age": 26,
      "value": "67.7/468 GiB (14%)\n",
      "error": ""
    },
    "description": {
      "display_name": "Home Disk",
      "key": "3_home_disk",
      "script": "coder stat disk --path ${HOME}",
      "interval": 60,
      "timeout": 1
    }
  },
  {
    "result": {
      "collected_at": "2023-11-30T14:26:56.332349Z",
      "age": 6,
      "value": "1.2/64 cores (2%)\n",
      "error": ""
    },
    "description": {
      "display_name": "CPU Usage (Host)",
      "key": "4_cpu_usage_host",
      "script": "coder stat cpu --host",
      "interval": 10,
      "timeout": 1
    }
  },
  {
    "result": {
      "collected_at": "2023-11-30T14:26:56.332349Z",
      "age": 6,
      "value": "51.6/251 GiB (21%)\n",
      "error": ""
    },
    "description": {
      "display_name": "Memory Usage (Host)",
      "key": "5_mem_usage_host",
      "script": "coder stat mem --host",
      "interval": 10,
      "timeout": 1
    }
  },
  {
    "result": {
      "collected_at": "2023-11-30T14:26:36.332216Z",
      "age": 26,
      "value": "0.03",
      "error": ""
    },
    "description": {
      "display_name": "Load Average (Host)",
      "key": "6_load_host",
      "script": "      echo \"`cat /proc/loadavg | awk '{ print $1 }'` `nproc`\" | awk '{ printf \"%0.2f\", $1/$2 }'\n",
      "interval": 60,
      "timeout": 1
    }
  },
  {
    "result": {
      "collected_at": "2023-11-30T14:26:56.332349Z",
      "age": 6,
      "value": "1.0/8.0",
      "error": ""
    },
    "description": {
      "display_name": "Swap Usage (Host)",
      "key": "7_swap_host",
      "script": "      free -b | awk '/^Swap/ { printf(\"%.1f/%.1f\", $3/1024.0/1024.0/1024.0, $2/1024.0/1024.0/1024.0) }'\n",
      "interval": 10,
      "timeout": 1
    }
  }
]
@cdr-bot (bot) added the bug label Nov 30, 2023
@f0ssel self-assigned this Nov 30, 2023
@f0ssel
Contributor Author

f0ssel commented Nov 30, 2023

Assigning myself because I'm interested in learning the CLI code for this, but if anyone feels compelled to take this on before I get to it, go for it.

@f0ssel
Contributor Author

f0ssel commented Nov 30, 2023

More information: running coder server on a Mac with Docker does not seem to cause this issue:
[image]

@johnstcn
Member

Correct, this is likely a problem in cli/clistat.

@f0ssel
Contributor Author

f0ssel commented Dec 8, 2023

Some learnings: this result is a magic number. See https://unix.stackexchange.com/questions/420906/what-is-the-value-for-the-cgroups-limit-in-bytes-if-the-memory-is-not-restricte

The value is the maximum 64-bit signed integer, rounded down to a page boundary (by dropping the low bits). With 4 KiB pages that is 9223372036854771712 bytes, which prints as 8589934592 when divided down to GiB.

This happens when the memory limit is not set on the container. We should just detect this magic number and behave accordingly.
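
A minimal sketch of what detecting the magic number could look like, in Go since cli/clistat is Go. This is not the actual fix from #11107: `cgroupV1Unlimited`, `isUnlimited`, and `formatMem` are hypothetical names, the 4 KiB page size is an assumption, and the output format only loosely imitates what coder stat prints.

```go
package main

import (
	"fmt"
	"math"
)

// pageSize is an assumption: 4 KiB pages, as on typical x86-64 Linux hosts.
const pageSize = 4096

// cgroupV1Unlimited is the largest signed 64-bit integer with the low
// page-offset bits cleared: 9223372036854771712 (2^63 - 4096).
const cgroupV1Unlimited uint64 = math.MaxInt64 &^ (pageSize - 1)

// isUnlimited reports whether a cgroup memory limit (in bytes) should be
// treated as "no limit set". Hypothetical helper, not the clistat API.
func isUnlimited(limitBytes uint64) bool {
	return limitBytes >= cgroupV1Unlimited
}

// formatMem renders used/total in GiB and omits the total (and percentage)
// when the limit is the "unlimited" sentinel. Hypothetical helper.
func formatMem(usedBytes, limitBytes uint64) string {
	const gib = 1 << 30
	used := float64(usedBytes) / gib
	if isUnlimited(limitBytes) {
		return fmt.Sprintf("%.1f GiB", used)
	}
	total := float64(limitBytes) / gib
	return fmt.Sprintf("%.1f/%.1f GiB (%.0f%%)", used, total, used/total*100)
}

func main() {
	fmt.Println(cgroupV1Unlimited)                      // the sentinel value in bytes
	fmt.Println(formatMem(2<<30, 8<<30))                // container with an 8 GiB limit
	fmt.Println(formatMem(2<<30, cgroupV1Unlimited))    // container with no limit set
}
```

Comparing with `>=` rather than `==` is a deliberate choice in this sketch: any limit at or above the sentinel cannot correspond to real memory, so it is treated as unset. A real fix might instead fall back to the host's total memory when no container limit is present.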
