fix ListObjectVersions pagination when version id marker is deleted #12259
Motivation
As reported in #12255, we've had an issue with ListObjectVersions and deletion. The failing scenario is reproduced in the added test, but it can be summarized as follows:
The list operation, when paginated and having reached the maximum number of entries it can return, returns a VersionIdMarker representing the last version it returned (newer S3 operations return the next one, which is a bit smarter). If you happen to delete this version, AWS will still return the next value in line after the now-deleted marker. I'm not sure exactly how they do it: whether they encode the data in the version id or keep track of deleted versions. I tried to reverse engineer the VersionId base64 encoding, but didn't get anywhere and didn't try very hard.
We didn't have anything to compare against, as those values were randomly generated and had no ordering between them. I've introduced an ever-increasing sequence number as the prefix of the base64-encoded value, so that we can lexicographically sort and compare them.
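To illustrate the idea, here is a minimal sketch, not the code from this PR. The helper names, the 18 random trailing bytes, and the use of the standard base64 alphabet are all assumptions made for the example. It compares the decoded 6-byte prefix rather than the encoded strings, because the standard base64 alphabet is not in ASCII order.

```python
import base64
import itertools
import secrets
import time

# Hypothetical helper names. The counter starts at the current time in
# milliseconds, as described in the side notes of this PR.
_sequence = itertools.count(start=int(time.time() * 1000))


def generate_version_id() -> str:
    """6-byte big-endian sequence number plus 18 random bytes, base64-encoded."""
    prefix = next(_sequence).to_bytes(6, "big")
    return base64.b64encode(prefix + secrets.token_bytes(18)).decode()


def sequence_of(version_id: str) -> int:
    """Recover the sequence number from the first 6 decoded bytes."""
    return int.from_bytes(base64.b64decode(version_id)[:6], "big")


def versions_after_marker(version_ids: list[str], marker: str) -> list[str]:
    """Return the versions strictly older than the marker, newest first.

    The marker only serves as a cutoff, so it does not need to be present
    in version_ids anymore (for example, because it was deleted).
    """
    cutoff = sequence_of(marker)
    older = [v for v in version_ids if sequence_of(v) < cutoff]
    return sorted(older, key=sequence_of, reverse=True)
```

With four versions created in order, deleting the third and then using it as the marker still yields the second and first versions, in that order, because the lookup never needs to find the marker itself.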
Changes
ListObjectVersions
Side notes
I'm packing the sequence number into 6 bytes, which gives 2^48 = 281 474 976 710 656 possible values. Since the counter starts at the current time in milliseconds, that leaves approx. 279 735 559 077 354 increments (meaning new versions) before we overflow. I think we're good 😄 (irrational fear of overflow).
Pagination is a mess in S3: no two operations implement it the same way, there's always a subtle difference 😅
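The overflow arithmetic can be checked with a short sketch. The function name is hypothetical, and the start value used in the comment (1 739 417 633 302 ms) is simply what the figure quoted above implies when subtracted from 2^48.

```python
import time

SEQUENCE_BYTES = 6
CAPACITY = 2 ** (8 * SEQUENCE_BYTES)  # 281_474_976_710_656 distinct values


def remaining_increments(start_ms: int) -> int:
    """Number of sequence values left when the counter starts at start_ms."""
    return CAPACITY - start_ms


# Start value implied by the figure quoted in this note.
print(remaining_increments(1_739_417_633_302))  # 279735559077354

# The same calculation for a counter started right now.
print(remaining_increments(int(time.time() * 1000)))
```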
fixes #12255