Add startup logs to build log UI #2957
So, the log is in …
Let's make sure we use a …
An implementation challenge is …
The only issue is if we're expecting some sort of prompt to be written to the log (something I know @kylecarbs mentioned when we discussed this Monday), then it might get truncated by the ….

We could potentially only keep the startup logs of the most recent build and maybe also use a …. The startup logs of previous builds are not likely to be useful.
Some kind of limiter is absolutely necessary to prevent DoS, e.g. …
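As an editorial illustration, the kind of limiter discussed here can be as simple as an `io.Writer` wrapper that drops output once a byte budget is spent. A minimal Go sketch with illustrative names (`limitWriter`, `maxStartupLogSize`), not anything from the Coder codebase:

```go
package main

import (
	"bytes"
	"fmt"
	"io"
)

const maxStartupLogSize = 1 << 20 // hypothetical 1MB budget

// limitWriter forwards writes until the budget is exhausted,
// then silently drops the rest so the producer keeps running.
type limitWriter struct {
	w io.Writer
	n int64 // bytes remaining
}

func (lw *limitWriter) Write(p []byte) (int, error) {
	if lw.n <= 0 {
		return len(p), nil // budget exhausted: drop silently
	}
	keep := p
	if int64(len(keep)) > lw.n {
		keep = keep[:lw.n]
	}
	n, err := lw.w.Write(keep)
	lw.n -= int64(n)
	if err != nil {
		return n, err
	}
	return len(p), nil // claim the full write; excess was dropped
}

func main() {
	var buf bytes.Buffer
	lw := &limitWriter{w: &buf, n: 16} // tiny budget for demonstration
	io.WriteString(lw, "hello, ")
	io.WriteString(lw, "world! this part is dropped")
	fmt.Printf("kept %d bytes: %q\n", buf.Len(), buf.String())
}
```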
This sounds good to me.
This issue is becoming stale. In order to keep the tracker readable and actionable, I'm going to close this issue in 7 days if there isn't more activity.
What would the workspace status be if startup scripts are still running?
I think the MVP here should be just putting the logs in the UI, and then in a follow-up feature we address waiting for startup scripts to finish. I believe some templates have startup scripts that block, e.g. they run …
This is a good question. I don't know; I think it could be either running or pending. However, I don't think this influences showing the startup logs on the build page too much.
The discussion in #4677 is relevant to workspace status during startup (/shutdown). Relating to blocking in the startup script, I think that should be avoided/discouraged, and we should introduce something new, like ….

I agree with @ammario that we limit this scope to simply storing/streaming the log.

I suggest we apply compression (e.g. zstd via klauspost/compress) to the logs, but do we compress pre- or post-limiting the output to 1MB? On the one hand, compressing pre-limit will ensure the output isn't usually truncated (in failure scenarios there could be a lot of output to sift through). On the other hand, compression plus a for loop echoing the same thing could produce a very large uncompressed log.
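For illustration, a minimal sketch of the limit-before-compression side of this trade-off, which avoids the echo-loop blowup. It uses the real klauspost/compress zstd API; the 1MB cap and the log source are assumptions from this thread, not Coder code:

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"strings"

	"github.com/klauspost/compress/zstd"
)

func main() {
	var stored bytes.Buffer

	enc, err := zstd.NewWriter(&stored)
	if err != nil {
		panic(err)
	}

	// Cap the *uncompressed* input at 1MB so a for loop echoing the
	// same line can't balloon into a huge log once decompressed.
	const limit = 1 << 20
	logSource := strings.NewReader(strings.Repeat("starting service...\n", 100_000))
	if _, err := io.Copy(enc, io.LimitReader(logSource, limit)); err != nil {
		panic(err)
	}
	if err := enc.Close(); err != nil {
		panic(err)
	}
	fmt.Printf("stored %d compressed bytes (<=1MB uncompressed)\n", stored.Len())
}
```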
If we compress the log on capture, wouldn't that make it difficult / impossible to stream the log piecewise in the future? |
@ammario I understand your concern. I think it's doable, but we can punt it for now. It'll add some complexity and raises the question of whether we should just utilize postgres compression instead (for the storage layer). My main thought with suggesting it was to allow larger logs to be stored, but reduced bandwidth could be a nice bonus.

On the topic, it's possible to use zstd as a transparent compression layer via SaveTheRbtz/zstd-seekable-format-go (I've only tested that it works, no other experience with seekable). Assuming we can trust it's one frame/line (observed behavior), this would map pretty well to inserts in the database.

One simpler solution we may employ is to compact logs once they're done streaming and store them as a single …. We could also just do compaction for the "historical" jobs, which might allow us to keep them around. I personally think they'd be useful for debugging purposes by both admins and users (i.e. "what's different today? this worked yesterday!").
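A small sketch of the compaction idea, under the assumption that finished per-line log rows are concatenated and stored as one compressed blob. `EncodeAll`/`DecodeAll` are actual klauspost/compress APIs; the row representation is hypothetical:

```go
package main

import (
	"fmt"
	"strings"

	"github.com/klauspost/compress/zstd"
)

func main() {
	// Imagine these came from individual per-line log rows in the database.
	rows := []string{"cloning repo...", "installing deps...", "done."}
	joined := []byte(strings.Join(rows, "\n"))

	// A nil writer/reader is allowed when only EncodeAll/DecodeAll are used.
	enc, _ := zstd.NewWriter(nil)
	defer enc.Close()
	blob := enc.EncodeAll(joined, nil) // single compressed column value

	dec, _ := zstd.NewReader(nil)
	defer dec.Close()
	back, _ := dec.DecodeAll(blob, nil)
	fmt.Printf("%d bytes -> %d bytes; roundtrip ok: %v\n",
		len(joined), len(blob), string(back) == string(joined))
}
```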
Right now logs older than 30 days are getting deleted automatically, so I don't think we have to worry about storage. An upper-bounded workspace would generate 30MB of logs per month. And good points on the lack of seekability in the pq driver. That is unfortunate. |
Removing myself from this until we have the BE done. It makes it easier for me to see what issues I should focus on. |
@ammario I've got a few questions about implementing this functionality: … (One thing also worth noting is that "startup scripts that end" has been implemented, and our examples adjusted; see #5749.)
To expand on 1., push architecture pros/cons: …
Pull architecture pros/cons: …
If we go with push, do we go with (d)RPC to reduce load? If we go with pull, would this be a suitable addition to the "statistics server" on the agent, @kylecarbs? (We'd probably rename it, though.) The reason I ask is that I worry about allocating too many listen ports; we've not reserved that many.

And re: 2., assuming we go with push, I imagine an implementation where we allow 1MB(-64KB) straight off to be streamed and stored, then the remaining "tail" is kept in a circular buffer of, say, 64KB and committed once the startup script exits. Does this match what you had in mind, @ammario? The downside of this is that the user will not see what's currently happening in the log (an edge case, since they'd have exceeded the ~1MB limit), while the alternative of streaming the tail part doesn't really prevent DoS.

As for …, none of that would allow inspecting the current log, though. That'd still fall under the pull category.

As for why we need to break it up into parts: my workspace log after running for over a day is 3.8MB.
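To make the proposed buffering concrete, a hedged sketch of the 1MB head plus 64KB circular-buffer tail described above; all names and sizes are illustrative, not Coder code:

```go
package main

import (
	"fmt"
	"io"
	"strings"
)

const (
	headLimit = 1 << 20  // streamed and stored immediately
	tailSize  = 64 << 10 // circular buffer for the remainder
)

// tailBuffer keeps only the most recent tailSize bytes written to it.
type tailBuffer struct {
	buf  []byte
	pos  int
	full bool
}

func (t *tailBuffer) Write(p []byte) (int, error) {
	for _, b := range p {
		t.buf[t.pos] = b
		t.pos = (t.pos + 1) % len(t.buf)
		if t.pos == 0 {
			t.full = true
		}
	}
	return len(p), nil
}

// Bytes returns the tail in chronological order.
func (t *tailBuffer) Bytes() []byte {
	if !t.full {
		return append([]byte(nil), t.buf[:t.pos]...)
	}
	return append(append([]byte(nil), t.buf[t.pos:]...), t.buf[:t.pos]...)
}

func main() {
	var head strings.Builder
	tail := &tailBuffer{buf: make([]byte, tailSize)}

	// First headLimit bytes go straight to storage; the rest wraps in the tail.
	script := strings.NewReader(strings.Repeat("log line\n", 200_000))
	io.CopyN(&head, script, headLimit)
	io.Copy(tail, script)

	// On startup-script exit, the tail would be committed as the final chunk.
	fmt.Printf("head=%dB, tail=%dB (of %dB max)\n",
		head.Len(), len(tail.Bytes()), tailSize)
}
```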
I think the pull architecture is the way to go: …
I also think we should only have a single listen server on the agent, as is typical of most daemons. |
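For illustration only (not the actual agent API): under the pull model, the startup log becomes one more route on the agent's single HTTP server. The `/startup-logs` path, port, and log location below are assumptions:

```go
package main

import (
	"io"
	"net/http"
	"os"
)

func main() {
	mux := http.NewServeMux()
	mux.HandleFunc("/startup-logs", func(w http.ResponseWriter, r *http.Request) {
		f, err := os.Open("/tmp/startup.log") // hypothetical log location
		if err != nil {
			http.Error(w, err.Error(), http.StatusNotFound)
			return
		}
		defer f.Close()
		w.Header().Set("Content-Type", "text/plain; charset=utf-8")
		io.Copy(w, f) // coderd pulls on demand; no push connection needed
	})
	// One listener shared by stats, logs, etc., as suggested above.
	http.ListenAndServe("127.0.0.1:2113", mux)
}
```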
Closing as completed by #6558. |
I have a complex startup script: …
And it's not clear to me where I could see its log output. It would also be nice to not have to redirect command execution to logs manually.
The natural place for me to find this is the build log on the dashboard, but I also expect some file in my workspace to contain the log.
While we're at it, we should include the agent logs (/var/log/coder-agent.log) in the dashboard build log.