fix proxy Accept-Header handling, combine http clients #9026
Motivation

This PR addresses #8793, which describes an issue with the handling of the `Accept-Encoding` header when communicating with OpenSearch clusters that have HTTP compression enabled (as enabled statically with #8628).

The bug is actually caused by the HTTP client in our proxy infrastructure, which proxies requests from the client via LocalStack to the started OpenSearch server. This HTTP client always adds `Accept-Encoding: gzip, deflate` to the proxied HTTP request if no `Accept-Encoding` is set in the originating request.

With compression support enabled for OpenSearch (i.e. OpenSearch compresses responses with GZIP `Content-Encoding` if the client sends `gzip` in the `Accept-Encoding` request header), this issue surfaced: every request sent to OpenSearch via LocalStack (which is the default) without the `Accept-Encoding` header (which is the default, e.g. for a plain `curl` request) had the header added by the proxy, so the client received a compressed response it never asked for.

It turns out that sending HTTP requests without any `Accept-Encoding` header is not possible with `requests`, because it uses `urllib3` under the hood, which in turn uses Python's `http.client` standard lib:
- `urllib3` automatically adds a default value if the `Accept-Encoding` header is not set at all. See `urllib3.util.request.ACCEPT_ENCODING`.
  - This can be avoided by setting `Accept-Encoding` explicitly to `None`.
- `http.client` sets `Accept-Encoding: identity` in case it's not set or `None`.

This means it's not really possible to identically relay a request which does not contain the `Accept-Encoding` header at all.

Changes
- Set the `Accept-Encoding` header to `identity` in case it's not set or `None`.
  - The same could be achieved in `urllib3` by setting the `Accept-Encoding` header to `None`, but I wanted to make this behavior as explicit as possible (because it was quite a pain to get to the root of this weird behavior).
- Combine the `SimpleHttpClient` with the `SimpleStreamingHttpClient`. The latter was introduced with "fix auto decoding of gzip for s3 vhost proxied requests" (#8148) in addition to the `SimpleHttpClient`, where it should actually just have been an extension of the default (but that was understandable at the time in order to prevent too wide-ranging changes).

Testing
Testing this in Python is actually quite hard, since the test explicitly has to send a request without any `Accept-Encoding` header whatsoever. I had to use the `HTTPConnection` class of Python's standard library directly.

Fixes #8793.
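For illustration, here is a minimal sketch of how a request without any `Accept-Encoding` header can be sent with `HTTPConnection`. It is not the test added in this PR. The local echo server, the `EchoAcceptEncoding` handler, and the `received_accept_encoding` helper are hypothetical names made up for this example. It relies on the `skip_accept_encoding` parameter of `HTTPConnection.putrequest`, which suppresses the `Accept-Encoding: identity` default described above.

```python
import threading
from http.client import HTTPConnection
from http.server import BaseHTTPRequestHandler, HTTPServer


class EchoAcceptEncoding(BaseHTTPRequestHandler):
    """Hypothetical handler: replies with the Accept-Encoding value it received."""

    def do_GET(self):
        body = (self.headers.get("Accept-Encoding") or "<absent>").encode()
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # suppress the default per-request logging


def received_accept_encoding(port, skip_accept_encoding):
    """Send a GET via the low-level API and return what the server saw."""
    conn = HTTPConnection("127.0.0.1", port)
    try:
        # putrequest() adds "Accept-Encoding: identity" unless told to skip it
        conn.putrequest("GET", "/", skip_accept_encoding=skip_accept_encoding)
        conn.endheaders()
        return conn.getresponse().read().decode()
    finally:
        conn.close()


# Port 0 lets the OS pick a free port for the throwaway server
server = HTTPServer(("127.0.0.1", 0), EchoAcceptEncoding)
threading.Thread(target=server.serve_forever, daemon=True).start()
port = server.server_address[1]

default_value = received_accept_encoding(port, skip_accept_encoding=False)
omitted_value = received_accept_encoding(port, skip_accept_encoding=True)

server.shutdown()
server.server_close()

print(default_value)
print(omitted_value)
```

In a real test, the connection would presumably point at the LocalStack-proxied OpenSearch endpoint instead of the echo server, and the assertion would check that the response carries no `Content-Encoding: gzip` header.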