Report actionable error when GC fails due to disk pressure
When the image GC fails to free enough space due to disk pressure, the
user-visible event it generates is misleading (see #71869) - the
technical detail leads operators to suspect GC problems. This PR makes
the message actionable by focusing on the disk pressure.
I had hoped to include the total disk space consumed by images that
cannot be collected, to help operators quickly determine if disk
pressure is caused by a large number of active images or by other files
on the node, but this is blocked because the CRI API doesn't reveal the
total disk use (depending on the runtime it has either the compressed or
uncompressed space: #120698).
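To illustrate the kind of message the description argues for, here is a minimal sketch of an error that leads with the disk pressure rather than with GC internals. The helper name `diskPressureError`, its parameters, and the wording of the message are assumptions made for illustration; they are not the text this commit actually adds.

```go
package main

import "fmt"

// diskPressureError is a hypothetical helper, not part of the kubelet.
// It builds an error that names the disk pressure first and only then
// reports how far image GC fell short of the amount it needed to free.
func diskPressureError(usagePercent, highThresholdPercent int, amountToFree, freed int64) error {
	return fmt.Errorf(
		"image filesystem is under disk pressure: usage %d%% exceeds the high threshold %d%%; "+
			"needed to free %d bytes but only %d bytes of images were eligible for garbage collection",
		usagePercent, highThresholdPercent, amountToFree, freed)
}

func main() {
	// Illustrative numbers only.
	err := diskPressureError(92, 85, 5000000, 1200000)
	fmt.Println(err)
}
```

Putting the usage and threshold first tells an operator what to fix (free disk space or remove other files on the node), while the byte counts remain available as supporting detail.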
klog.InfoS("Disk usage on image filesystem is over the high threshold, trying to free bytes down to the low threshold", "usage", usagePercent, "highThreshold", im.policy.HighThresholdPercent, "amountToFree", amountToFree, "lowThreshold", im.policy.LowThresholdPercent)
 		// Failed to delete images, eg due to a read-only filesystem.
 		return err
 	}

 	im.runPostGCHooks(remainingImages, freeTime)

 	if freed < amountToFree {
-		err := fmt.Errorf("Failed to garbage collect required amount of images. Attempted to free %d bytes, but only found %d bytes eligible to free.", amountToFree, freed)
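The log line in the diff refers to a high threshold that triggers GC and a low threshold that GC tries to reach. The sketch below shows one way that arithmetic can work; the function `bytesToFree` and its parameters are hypothetical, and the formulas are an assumption about the general approach rather than a copy of the kubelet code.

```go
package main

import "fmt"

// bytesToFree is a hypothetical helper, not a kubelet function.
// Given the filesystem capacity and available bytes, it reports how many
// bytes must be freed to bring usage down to lowPct, but only when usage
// has reached highPct. The boolean is false when no GC is needed.
func bytesToFree(capacity, available int64, highPct, lowPct int) (int64, bool) {
	if capacity <= 0 {
		return 0, false // avoid dividing by zero on a bogus capacity
	}
	usagePercent := 100 - int(available*100/capacity)
	if usagePercent < highPct {
		return 0, false
	}
	// Free space the filesystem should have once usage is at lowPct.
	targetAvailable := capacity * int64(100-lowPct) / 100
	return targetAvailable - available, true
}

func main() {
	amount, needed := bytesToFree(1000, 100, 85, 80)
	fmt.Println(amount, needed)
}
```

With a capacity of 1000 bytes and 100 bytes available, usage is 90%, which is over the 85% high threshold; reaching the 80% low threshold requires 200 bytes available, so 100 bytes must be freed.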