Skip to content

feat: add support package #12289

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Feb 29, 2024
Merged

feat: add support package #12289

merged 2 commits into from
Feb 29, 2024

Conversation

johnstcn
Copy link
Member

@johnstcn johnstcn commented Feb 23, 2024

Part of #12161

More to be added in later PRs.

@johnstcn johnstcn self-assigned this Feb 23, 2024
@johnstcn johnstcn requested review from mafredri and mtojek February 28, 2024 13:39
@johnstcn johnstcn marked this pull request as ready for review February 28, 2024 13:39
Copy link
Member

@mafredri mafredri left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, my suggestions are mainly to add safety rails to prevent overly high memory usage. Then again, if this is only running on the client, maybe we don't need to be so careful (I'd be more worried if coderd was assembling the bundle).

log.Error(ctx, "fetch experiments", slog.Error(err))
} else {
d.Experiments = exp
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we do these concurrently for better speed? The data snapshot is "closer together" as well, in case that matters (same for the other functions too?).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 Will address in a follow-up.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should be good to go with single-threaded workflow. It is a support bundle not really a time-critical function.

log.Error(ctx, "fetch coordinator debug page", slog.Error(err))
} else {
defer coordResp.Body.Close()
bs, err := io.ReadAll(coordResp.Body)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we use a limit reader here (and elsewhere)? Protective measure just in case.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The more I think about it, the less I like limiting the amount of data read.
A support bundle with truncated information is likely to be useless.

log.Error(ctx, "fetch provisioner job logs", slog.Error(err), slog.F("job_id", ws.LatestBuild.Job.ID.String()))
} else {
defer closer.Close()
for log := range buildLogCh {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we have a max size, only pick the tail-end if too big?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we can know the number of logs ahead of time.

log.Error(ctx, "fetch agent startup logs", slog.Error(err), slog.F("agent_id", agentID.String()))
} else {
defer closer.Close()
for logChunk := range agentLogCh {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as with workspace build, agent logs are limited to 1MB currently, but that assumption could change.


b.Deployment = DeploymentInfo(ctx, d.Client, d.Log)
b.Workspace = WorkspaceInfo(ctx, d.Client, d.Log, d.WorkspaceID, d.AgentID)
b.Network = NetworkInfo(ctx, d.Client, d.Log, d.AgentID)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fetch concurrently?

Copy link
Member

@mtojek mtojek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR is 👍

// Bundle is a set of information discovered about a deployment.
// Even though we do attempt to sanitize data, it may still contain
// sensitive information and should thus be treated as secret.
type Bundle struct {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thinking loud: don't need to move it to codersdk? I'm thinking about some scripts or our site reviewing these files.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd defer doing this until it's absolutely necessary.

Comment on lines +36 to +41
type Network struct {
CoordinatorDebug string `json:"coordinator_debug"`
TailnetDebug string `json:"tailnet_debug"`
NetcheckLocal *codersdk.WorkspaceAgentConnectionInfo `json:"netcheck_local"`
NetcheckRemote *codersdk.WorkspaceAgentConnectionInfo `json:"netcheck_remote"`
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: it might be useful to annotate these fields and give some examples what exactly can a user/developer expect. Might be nice to highlight fields with sensitive data.

Comment on lines +43 to +48
type Workspace struct {
Workspace codersdk.Workspace `json:"workspace"`
BuildLogs []codersdk.ProvisionerJobLog `json:"build_logs"`
Agent codersdk.WorkspaceAgent `json:"agent"`
AgentStartupLogs []codersdk.WorkspaceAgentLog `json:"startup_logs"`
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we include last N-workspace builds? User may retry building the workspace a few times.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's leave N=1 for now and add more if we find a use-case. I would generally always look at the latest build when troubleshooting an issue.

log.Error(ctx, "fetch experiments", slog.Error(err))
} else {
d.Experiments = exp
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should be good to go with single-threaded workflow. It is a support bundle not really a time-critical function.

Comment on lines 99 to 100
// Get /api/v2/debug/coordinator
coordResp, err := client.Request(ctx, http.MethodGet, "/api/v2/debug/coordinator", nil)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I believe this line of code is self-descriptive :)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you can add debug logging instead comments

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comments were more of a reminder to myself :-) Will remove.

var d Deployment
bi, err := client.BuildInfo(ctx)
if err != nil {
log.Error(ctx, "fetch build info", slog.Error(err))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see the approach you implemented here. It is appropriate, but maybe we could keep a field for bundling errors within the bundle? It would make debugging easier when coderd returns HTTP 500, and the client error is saved.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll see if there's a nice generic solution to this in a follow-up.

Copy link
Member

@mtojek mtojek Feb 29, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Absolutely 👍 This is something that can be done in a follow-up.

@@ -44,6 +44,19 @@ type UpdateHealthSettings struct {
DismissedHealthchecks []HealthSection `json:"dismissed_healthchecks"`
}

func (c *Client) GetDebugHealth(ctx context.Context) (HealthcheckReport, error) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could probably remove the Get here since other API methods don't have it.

@johnstcn johnstcn merged commit e57c101 into main Feb 29, 2024
@johnstcn johnstcn deleted the cj/pkg-support branch February 29, 2024 11:58
@github-actions github-actions bot locked and limited conversation to collaborators Feb 29, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants