Skip to content

[HttpClient] Fix using https with proxies #38368

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Oct 1, 2020
Merged

[HttpClient] Fix using https with proxies #38368

merged 1 commit into from
Oct 1, 2020

Conversation

bohanyang
Copy link
Contributor

@bohanyang bohanyang commented Oct 1, 2020

Q A
Branch? 4.4
Bug fix? yes
New feature? no
Deprecations? no
License MIT
Doc PR -

According to my test, when request_fulluri is set to true,
the host appears in the URL will be the Host header,
even if the Host header is set in the context http header.

Since HttpClient has its own DNS cache, the host inside the URL is usually an IP address.
So this can break many things.

{
  "args": {},
  "headers": {
    "Accept": "*/*",
    "Accept-Encoding": "gzip",
    "Host": "3.211.1.78",
    "User-Agent": "Symfony HttpClient/Native",
    "X-Amzn-Trace-Id": "Root=1-5f75a59e-62c8c81e4490e09c700d6180"
  },
  "origin": "xxx.xxx.xxx.xxx",
  "url": "https://3.211.1.78/get"
}
* Hostname was NOT found in DNS cache
* Added httpbin.org:0:3.211.1.78 to DNS cache
* Establish HTTP proxy tunnel to tcp://10.22.22.21:7777
> GET https://3.211.1.78/get HTTP/1.1
Accept: */*
Accept-Encoding: gzip
Host: httpbin.org
User-Agent: Symfony HttpClient/Native

< HTTP/1.1 200 OK
< Date: Thu, 01 Oct 2020 09:47:10 GMT
< Content-Type: application/json
< Content-Length: 300
< Connection: close
< Server: gunicorn/19.9.0
< Access-Control-Allow-Origin: *
< Access-Control-Allow-Credentials: true
<

I've also found this guzzle/guzzle#791

We can also create an option to make it customizable.

@nicolas-grekas
Copy link
Member

Hello, I'm not sure I get what the issue is here.

"Host": "3.211.1.78"

Where is this line coming from?

Note that HTTP/1.1 section 5.3.2 tells that the full uri must be used when going through a proxy:
https://tools.ietf.org/html/rfc7230#section-5.3.2

So this can break many things.

can you give more hints about this? could your proxy be non-compliant? How does it behave when using CurlHttpClient?

@bohanyang
Copy link
Contributor Author

bohanyang commented Oct 1, 2020

Where is this line coming from?

I'm not sure what changed the Host header for now.

could your proxy be non-compliant?

I'm testing it with privoxy, I don't know which proxy is compliant if it isn't.
CurlHttpClient works just fine, same result as setting request_fulluri to false.

Tested on Symfony 5.1.6, Debian 10, PHP (deb.sury.org) 7.4.10, curl 7.72.0 (from unstable repo to enable the debug log)
Privoxy working on 127.0.0.1:7777 version 3.0.28, default config, without sending/redirecting the traffic to another proxy (e.g. SOCKS)

<?php

declare(strict_types=1);

use Symfony\Component\HttpClient\CurlHttpClient;
use Symfony\Component\HttpClient\NativeHttpClient;
use Symfony\Component\HttpClient\HttpClient;

require __DIR__ . '/vendor/autoload.php';

$client = new NativeHttpClient();
$resp = $client->request('GET', 'https://httpbin.org/get');
echo $resp->getContent();
echo $resp->getInfo('debug');

Without request_fulluri

{
  "args": {},
  "headers": {
    "Accept": "*/*",
    "Accept-Encoding": "gzip",
    "Host": "httpbin.org",
    "User-Agent": "Symfony HttpClient/Native",
    "X-Amzn-Trace-Id": "Root=1-5f75cd34-5daa0a902c40e8df6a4cdd5b"
  },
  "origin": "xxx.xxx.xxx.xxx",
  "url": "https://httpbin.org/get"
}
* Hostname was NOT found in DNS cache
* Added httpbin.org:0:34.194.129.11 to DNS cache
* Establish HTTP proxy tunnel to tcp://127.0.0.1:7777
> GET https://34.194.129.11/get HTTP/1.1
Accept: */*
Accept-Encoding: gzip
Host: httpbin.org
User-Agent: Symfony HttpClient/Native

< HTTP/1.1 200 OK
< Date: Thu, 01 Oct 2020 12:36:04 GMT
< Content-Type: application/json
< Content-Length: 300
< Connection: close
< Server: gunicorn/19.9.0
< Access-Control-Allow-Origin: *
< Access-Control-Allow-Credentials: true
<

CurlHttpClient

{
  "args": {},
  "headers": {
    "Accept": "*/*",
    "Accept-Encoding": "gzip",
    "Host": "httpbin.org",
    "User-Agent": "Symfony HttpClient/Curl",
    "X-Amzn-Trace-Id": "Root=1-5f75cbd9-51a4f7a44f1eaa3326be6796"
  },
  "origin": "xxx.xxx.xxx.xxx",
  "url": "https://httpbin.org/get"
}
* Uses proxy env variable https_proxy == 'http://127.0.0.1:7777'
*   Trying 127.0.0.1:7777...
* Connected to 127.0.0.1 (127.0.0.1) port 7777 (#0)
* allocate connect buffer!
* Establish HTTP proxy tunnel to httpbin.org:443
> CONNECT httpbin.org:443 HTTP/1.1
Host: httpbin.org:443
Proxy-Connection: Keep-Alive

< HTTP/1.1 200 Connection established
<
* Proxy replied 200 to CONNECT request
* CONNECT phase completed!
* ALPN, offering h2
* ALPN, offering http/1.1
* successfully set certificate verify locations:
*   CAfile: /etc/ssl/certs/ca-certificates.crt
  CApath: /etc/ssl/certs
* CONNECT phase completed!
* CONNECT phase completed!
* SSL connection using TLSv1.2 / ECDHE-RSA-AES128-GCM-SHA256
* ALPN, server accepted to use h2
* Server certificate:
*  subject: CN=httpbin.org
*  start date: Jan 18 00:00:00 2020 GMT
*  expire date: Feb 18 12:00:00 2021 GMT
*  subjectAltName: host "httpbin.org" matched cert's "httpbin.org"
*  issuer: C=US; O=Amazon; OU=Server CA 1B; CN=Amazon
*  SSL certificate verify ok.
* Using HTTP2, server supports multi-use
* Connection state changed (HTTP/2 confirmed)
* Copying HTTP/2 data in stream buffer to connection buffer after upgrade: len=0
* Using Stream ID: 1 (easy handle 0x556304296990)
> GET /get HTTP/2
Host: httpbin.org
accept: */*
user-agent: Symfony HttpClient/Curl
accept-encoding: gzip

* Connection state changed (MAX_CONCURRENT_STREAMS == 128)!
< HTTP/2 200
< date: Thu, 01 Oct 2020 12:30:17 GMT
< content-type: application/json
< content-length: 298
< server: gunicorn/19.9.0
< access-control-allow-origin: *
< access-control-allow-credentials: true
<
* Connection #0 to host 127.0.0.1 left intact

can you give more hints about this?

So for example httpbin can be accessed like curl -k https://3.211.1.78/get, but many others don't.
The Host information is lost, a web server uses the Host to find out which website is being accessed.

@stof
Copy link
Member

stof commented Oct 1, 2020

According to https://www.php.net/manual/fr/context.http.php#context.http.request-fulluri, the request_fulluri context param is precisely creating non-standard requests, so that they work with non-standard proxies.

@nicolas-grekas
Copy link
Member

Thanks for the insights. Looking at the details, this means that the fulluri should be used only when https is not in use.

The patch should be stream_context_set_option($context, 'http', 'request_fulluri', !$isSsl);

$isSsl being a new argument on the method, whose value is 'https:' === $url['scheme']

@stof
Copy link
Member

stof commented Oct 1, 2020

Thanks for the insights. Looking at the details, this means that the fulluri should be used only when https is not in use.

why would it depend on https ?

@nicolas-grekas
Copy link
Member

nicolas-grekas commented Oct 1, 2020

  • When doing https, the CONNECT method is used and the remaining of the stream is encrypted: the remote server must see only the path
  • When doing http, GET is used: the proxy sees the headers and routes accordingly. It does so according to section 5.3.2 above, using the full URI.

@nicolas-grekas nicolas-grekas changed the base branch from master to 4.4 October 1, 2020 15:59
@nicolas-grekas nicolas-grekas changed the title [HttpClient] Fix Host header when using proxy with NativeHttpClient [HttpClient] Using https with proxies Oct 1, 2020
@bohanyang
Copy link
Contributor Author

bohanyang commented Oct 1, 2020

Patch applied. Worked with privoxy.
But my "non-standard-compliant" HTTP proxy (v2ray written in Go)
seems need request_fulluri to be false to work...
Could we make it configurable?

@nicolas-grekas nicolas-grekas changed the title [HttpClient] Using https with proxies [HttpClient] Fix using https with proxies Oct 1, 2020
@nicolas-grekas
Copy link
Member

Thank you @bohanyang.

@nicolas-grekas nicolas-grekas merged commit abe0eca into symfony:4.4 Oct 1, 2020
@nicolas-grekas
Copy link
Member

But my "non-standard-compliant" HTTP proxy (v2ray written in Go) seems need request_fulluri to be false to work...

Can you open an issue on the v2ray repo to tell them about this? It should be fixed on their side IMHO.

@stof
Copy link
Member

stof commented Oct 1, 2020

@nicolas-grekas my understanding of the HTTP/1.1 spec is that the request to the proxy should be GET https://httpbin.org/get, not GET https://3.211.1.78/get as the HTTP/1.1 proxy would replace the HOST header. So the PHP behavior for request_fulluri does not seem to be the HTTP/1.1 behavior.

@nicolas-grekas
Copy link
Member

the request to the proxy should be GET https://httpbin.org/get, not GET https://3.211.1.78/get

It should, but here this conflicts with the local DNS resolver/cache. But does it matter?

the HTTP/1.1 proxy would replace the HOST header

Why would it?

@stof
Copy link
Member

stof commented Oct 1, 2020

Why would it?

because of https://tools.ietf.org/html/rfc7230#section-5.4:

When a proxy receives a request with an absolute-form of
request-target, the proxy MUST ignore the received Host header field
(if any) and instead replace it with the host information of the
request-target. A proxy that forwards such a request MUST generate a
new Host field-value based on the received request-target rather than
forward the received Host field-value.

@nicolas-grekas
Copy link
Member

@stof I missed that part, grrr. This means we have a conflict between the local DNS resolver/cache and proxies when using NativeHttpClient. I'm going to submit a PR to account for this.

@bohanyang maybe this will fix your issue with v2ray, I'll ping you so that you could give it a try if you don't mind.

@nicolas-grekas
Copy link
Member

@bohanyang can you please test with #38375?

@bohanyang bohanyang deleted the hc-fulluri branch October 2, 2020 07:07
nicolas-grekas added a commit that referenced this pull request Oct 2, 2020
…las-grekas)

This PR was merged into the 4.4 branch.

Discussion
----------

[HttpClient] fix using proxies with NativeHttpClient

| Q             | A
| ------------- | ---
| Branch?       | 4.4
| Bug fix?      | yes
| New feature?  | no
| Deprecations? | no
| Tickets       | -
| License       | MIT
| Doc PR        | -

As spotted by @stof in #38368 (comment), we cannot use local DNS resolution with HTTP proxies.

Commits
-------

28f301b [HttpClient] fix using proxies with NativeHttpClient
@fabpot fabpot mentioned this pull request Oct 4, 2020
@fabpot fabpot mentioned this pull request Oct 4, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants