-
-
Notifications
You must be signed in to change notification settings - Fork 4.2k
S3/ASF parser: fix parsing header list
shape type with multivalue headers
#13022
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
S3 Image Test Results (AMD64 / ARM64) 2 files 2 suites 8m 11s ⏱️ Results for commit 4df5e5c. ♻️ This comment has been updated with latest results. |
Test Results - Preflight, Unit22 140 tests +1 20 404 ✅ ±0 6m 43s ⏱️ +16s For more details on these failures, see this check. Results for commit 4df5e5c. ± Comparison against base commit 5bee08e. ♻️ This comment has been updated with latest results. |
Test Results (amd64) - Acceptance7 tests 5 ✅ 3m 4s ⏱️ Results for commit ac1d143. |
Test Results (amd64) - Integration, Bootstrap 5 files 5 suites 2h 21m 10s ⏱️ Results for commit ac1d143. |
LocalStack Community integration with Pro 2 files 2 suites 1h 43m 3s ⏱️ Results for commit ac1d143. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that's a nice catch! LGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Absolutely great catch! I am just wondering about a super-edge case, which really doesn't block a merge here. 💯
# headers may contain a comma separated list of values (e.g., the ObjectAttributes member in | ||
# s3.GetObjectAttributes), so we prepare it here for the handler, which will be `_parse_list`. | ||
# Header lists can contain optional whitespace, so we strip it | ||
# https://www.rfc-editor.org/rfc/rfc9110.html#name-lists-rule-abnf-extension | ||
payload = [value.strip() for value in payload.split(",")] | ||
# It can also directly contain a list of headers |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thought (non-blocking): Is it also possible that there are multiple headers with the same key, some contain a single value, and some contain multiple values (comma-separated)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I also thought of this one 👀 I was thinking about splitting all the individual values in the list over ,
at first. It's a bit tricky, because I know for example Set-Cookie
(which was one of the first case I had with multi values headers in localstack/rolo#29) can contain commas in the value, which is why they are sending it a multiple values with the same key instead of a comma-separated single header.
The RFC 2616 states the following:
Multiple message-header fields with the same field-name MAY be
present in a message if and only if the entire field-value for that
header field is defined as a comma-separated list [i.e., #(values)].
It MUST be possible to combine the multiple header fields into one
"field-name: field-value" pair, without changing the semantics of the
message, by appending each subsequent field-value to the first, each
separated by a comma. The order in which header fields with the same
field-name are received is therefore significant to the
interpretation of the combined field value, and thus a proxy MUST NOT
change the order of these field values when a message is forwarded.
So it's not super clear either. I'll give it a try, and depending on the result will update or merge.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've tried with curl
, which treats multi value headers properly but couldn't get it to run against AWS because GetObjectAttributes
needs to be signed, so I've also tried with awscurl
but it doesn't manage multi value headers properly and just overwrite with the last one specified 😕
But reading the RFC, I'm thinking of actually joining all returned values from getlist
over ,
, then splitting them like the previous behavior, following this particular part: It MUST be possible to combine the multiple header fields into one "field-name: field-value" pair, without changing the semantics of the message, by appending each subsequent field-value to the first, each separated by a comma.
I'll update, I think this is probably never gonna happen, but it might simplify the code a little bit 👍
Motivation
We got a report with #12999 that sending a
GetObjectAttributes
request would fail depending on the order of the fields being passed.While debugging and printing out low level HTTP request on the Java side, I realized that Java was sending the request with multiple headers for the same key, like some others SDK (Go in particular)
When receiving headers that way, our web server will return them as a list of values, accessible via
Headers.getlist()
.Because the parser was calling
Headers.get()
, it would only get the first value, and in the user sample above, would not even return the second requested attribute.This PR adapts the parser to use
getlist
in the header shape type islist
, and keep the old automatic splitting on,
that is still required.Changes
headers
override! This was so much easier to write a test for it!!!)requests
concatenates multivalue headers by default buturllib3
does not-Testing
If you remove the parser fix, the tests now fail.
I also validated that the user-provided sample in the linked issue now works 👍