Skip to content

fix: let workspace pages download partial logs for unhealthy workspaces #13761

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 19 commits into from
Jul 3, 2024

Conversation

Parkreiner
Copy link
Member

@Parkreiner Parkreiner commented Jul 2, 2024

Addresses the most important parts of #13723

Changes made

  • Updated useAgentLogs to make it more resistant to data being missing, or having agent log data be updated in other parts of the app
  • Updated the log generation logic to ensure that it wouldn't recalculate the log files on every single render
  • Updated "Download Logs" UI to notify user that an unhealthy workspace may make some logs unavailable, but still let users download a partial set of logs
  • Made a couple of bug fixes to helpers, and added extra safety nets
  • Updated one of the helpers to simplify its API design
Screen.Recording.2024-07-02.at.9.52.11.AM.mov

Notes

  • Just to get this PR out faster, this doesn't cover any changes to any test files. I'll be adding those in a separate PR
  • As I mentioned in the linked issue, at this point, I really think we need to beef up our UI error handling

Comment on lines 45 to 49
const lastLog = logs?.at(-1);
const canSetLogId = lastLog !== undefined && lastQueriedLogId.current === 0;

if (canSetLogId) {
lastQueriedLogId.current = lastLog.id;
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Previously: we were checking that the array existed, but not that it actually had data

Really do think that we need to bite the bullet and get TypeScript's noUncheckedIndexedAccess compiler option turned on, because it would've caught this

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a good one and would require a ticket and PR by itself.

files.push({ name, blob });
});

return files;
}, [agentLogResults, agents, buildLogsQuery.data, workspace.name]);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was a problem with the previous logic: because agents was getting redefined every render, we were also invalidating the useMemo dependency array every render

Comment on lines +143 to +149
setIsDownloading(true);
const zip = new JSZip();
allFiles.forEach((f) => {
if (f.blob) {
zip.file(f.name, f.blob);
}
});
Copy link
Member Author

@Parkreiner Parkreiner Jul 2, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moved this out of the try block, because it should be the part that shouldn't be able to fail

Comment on lines 279 to 295
notAvailableText: (theme) => ({
display: "flex",
flexFlow: "row nowrap",
alignItems: "center",
columnGap: "4px",

"& > span": {
maxHeight: "fit-content",
display: "flex",
alignItems: "center",
color: theme.palette.error.light,
},

"& > p": {
opacity: "80%",
},
}),
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've been stuck in the pre-Emotion, Backstage-specific version of MUI for the past few months, so please let me know if anything looks funky

// don't accidentally break the memo cache every render. We can't tuck
// everything into a single memo call, because we need to set up React Query
// state between processing the agents, but we can't violate rules of hooks by
// putting hooks inside of hooks
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know, but I feel it is an over-optimization. If it is fast enough and does not impact UI usage, I don't see why we should complicate things.

Copy link
Member Author

@Parkreiner Parkreiner Jul 2, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I was considering that, but didn't really have time to test out how long it would take to keep generating blobs every render

To be clear, we basically have two choices:

  • Keep the clunky but functional memoization
  • Remove the memoization altogether

The original memoization has worse performance than not having memoization at all, because we're still recalculating the blobs every render, but React has to compare dependencies and decide if it should re-calculate the value

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, it was my fault. I usually use "useMemo" to compute a variable value but it would probably be better to be a function. I would rather remove memorization altogether to make things simple.

];

return {
agents: uniqueAgents,
logOptionsArray: uniqueAgents.map((agent) => {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need to calculate the query options here and not mount it and pass it in the query? I think it is kinda far from what is used and I took some time to link logOptionsArray to agentQueryOptions

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, this was an over-optimization on my part. I basically figured that since the query objects shouldn't change unless the agents do, it was safe to throw them all together

let i = 0;
while (size > 1024 && i < units.length) {
while (size > 1024 && i < BLOB_SIZE_UNITS.length) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since the BLOB_SIZE_UNITS is only used here, I would not move it to leave outside of the function where it is used.

.filter((a) => a !== undefined) as WorkspaceAgent[];
// The while condition can break if we accidentally exceed the bounds of the
// array. Have to be extra sure we have a unit at the very end.
const finalUnit = BLOB_SIZE_UNITS[i] ?? BLOB_SIZE_UNITS.at(-1) ?? "TB";
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't get this 😞

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry – just updated the wording on the comment for more clarity!

Copy link
Collaborator

@BrunoQuaresma BrunoQuaresma left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left a few comments. I'm going to QA this later today.

@BrunoQuaresma
Copy link
Collaborator

Screenshot 2024-07-02 at 13 10 26

I did some QA, and it works as expected, but I have some design considerations:

  • Since it's not an error, I would use "warning" as the severity for the alert.
  • I would not change the button label to "Download partial". I think it makes the action more confusing.
  • Instead of "N/A", I would use "Not available" colored with the secondary or disabled text color, since we already have an alert to make it explicit.

@Parkreiner
Copy link
Member Author

@BrunoQuaresma Tried implementing all your feedback, and also added some styling for handling long filenames:
Screenshot 2024-07-02 at 3 30 31 PM
Screenshot 2024-07-02 at 3 30 49 PM

Copy link
Collaborator

@BrunoQuaresma BrunoQuaresma left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome job!

@Parkreiner Parkreiner merged commit 940afa1 into main Jul 3, 2024
29 checks passed
@Parkreiner Parkreiner deleted the mes/log-fix branch July 3, 2024 14:17
@github-actions github-actions bot locked and limited conversation to collaborators Jul 3, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants