-
Notifications
You must be signed in to change notification settings - Fork 2.5k
Fix issues with Proxy Authentication after httpclient refactor #5852
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix issues with Proxy Authentication after httpclient refactor #5852
Conversation
68ed80b
to
1e0dfbb
Compare
Sorry about the force pushes, there was a slight debate over whether it should be |
ehhh... We found another unrelated proxy issue with the |
I've updated the original description with the second issue. This seems to alleviate the issue we're seeing. |
src/transports/httpclient.c
Outdated
@@ -734,7 +734,7 @@ static int generate_request( | |||
git_buf_printf(buf, "Expect: 100-continue\r\n"); | |||
|
|||
if ((error = apply_server_credentials(buf, client, request)) < 0 || | |||
(error = apply_proxy_credentials(buf, client, request)) < 0) | |||
(!client->proxy_connected && (error = apply_proxy_credentials(buf, client, request)) < 0)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So thinking through this. You mention:
we continue to apply proxy-authorization headers to all requests even after we've already built a tunnel
which describes the CONNECT
case. From what you're saying, I agree that we're definitely doing the wrong thing here: we would be sending Proxy-Authentication
through the tunnel to the git server, which is clearly wrong.
But this auth step is for the non-CONNECT
proxy case - when you're sending a GET http://foo.com/
through the proxy itself. If you're authenticating with a mechanism like HTTP Basic to the proxy, then you would need to supply the credentials on every proxy request.
The CONNECT
proxy case would have supplied the credentials when it sent the CONNECT
request itself, in generate_connect_request
. So this is explicitly just for non-CONNECT
proxies.
(I realize that it's remarkably uncommon to not speak over a CONNECT
proxy, but here we are.)
I think that proxy_connected
is not the right test here for that case. proxy_connected
is set by proxy_connect
which is only called in http_client_connect
when:
use_proxy = client->proxy.url.host &&
!strcmp(client->server.url.scheme, "https");
So I think that proxy_connected
will always be false
when we're talking to a non-CONNECT
proxy.
So my reading of this is that we're no longer passing proxy authentication to non-CONNECT
proxies.
(But I'm surprised that we don't test this in our proxy integration tests. 😢 I know that speaking to non-https git servers is uncommon, but it's not a thing that is never done, and I think that we should support it - and support it through a proxy, so that means that we should test it. So I want to think through this a little bit more before I claim that I'm super confident of this analysis.)
I think that we might want to pull out the use_proxy
test in http_client_connect
into its own function, eg:
static bool use_connect_proxy(git_http_client *client)
{
return client->proxy.url.host && !strcmp(client->server.url.scheme, "https");
}
And make this line:
if ((error = apply_server_credentials(buf, client, request)) < 0 ||
(!use_connect_proxy(client) && (error = apply_proxy_credentials(buf, client, request)) < 0))
But - again - I'm not 100% on my analysis. Does this sound right to you?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So I think that proxy_connected will always be false when we're talking to a non-CONNECT proxy.
So my reading of this is that we're no longer passing proxy authentication to non-CONNECT proxies.
I believe if proxy_connected
is always false, then we always pass the Proxy-Authentication
headers through, because we perform !client->proxy_connected
aka stop sending the Proxy-Authentication
headers if we're already connected.
I am not positive with your analysis, but I will go back to our test bed to confirm. I am not positive, because prior to the httpclient refactor, we always attached Proxy-Authentication
headers unless we had successfully connected to the proxy. I believe that is what I've updated httpclient
to do as well, since proxy_connected
is not set until we have established a connection to the proxy.
My intuition with your suggestion is that we would actually regress again, because use_connect_proxy
would always be set for an httpclient that has a connect proxy, therefore we would continue sending the Proxy-Authentication
headers through after connect has occurred. However, I don't understand the area fully, so maybe the use_connect_proxy
will return false on the direct-line to the git server and that's where I am confused with your analysis.
I'll get back to you today.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well I'll be damned. Yea, that totally works in testing. In fact, it works better than what I had originally wrote in that there is less traffic to the git server. This change you suggested removes one round trip... Basically:
In my PR as written:
"GET http://github.com/libgit2/TestGitRepository/info/refs?service=git-upload-pack HTTP/1.1\r\nUser-Agent: git/2.0 (libgit2 1.1.0)\r\nHost: github.com\r\nAccept: */*\r\n\r\n"
"GET http://github.com/libgit2/TestGitRepository/info/refs?service=git-upload-pack HTTP/1.1\r\nUser-Agent: git/2.0 (libgit2 1.1.0)\r\nHost: github.com\r\nAccept: */*\r\n\r\n"
"CONNECT github.com:443 HTTP/1.1\r\nUser-Agent: git/2.0 (libgit2 1.1.0)\r\nHost: {omitted}\r\n\r\n"
"CONNECT github.com:443 HTTP/1.1\r\nUser-Agent: git/2.0 (libgit2 1.1.0)\r\nHost: {omitted}\r\nProxy-Authorization: Negotiate {omitted}\r\n\r\n"
"GET /libgit2/TestGitRepository/info/refs?service=git-upload-pack HTTP/1.1\r\nUser-Agent: git/2.0 (libgit2 1.1.0)\r\nHost: github.com\r\nAccept: */*\r\n\r\n"
"POST /libgit2/TestGitRepository/git-upload-pack HTTP/1.1\r\nUser-Agent: git/2.0 (libgit2 1.1.0)\r\nHost: github.com\r\nAccept: application/x-git-upload-pack-result\r\nContent-Type: application/x-git-upload-pack-request\r\nContent-Length: 429\r\n\r\n"
After your suggestion
"GET http://github.com/libgit2/TestGitRepository/info/refs?service=git-upload-pack HTTP/1.1\r\nUser-Agent: git/2.0 (libgit2 1.1.0)\r\nHost: github.com\r\nAccept: */*\r\n\r\n"
"CONNECT github.com:443 HTTP/1.1\r\nUser-Agent: git/2.0 (libgit2 1.1.0)\r\nHost: {omitted}\r\n\r\n"
"CONNECT github.com:443 HTTP/1.1\r\nUser-Agent: git/2.0 (libgit2 1.1.0)\r\nHost: {omitted}\r\nProxy-Authorization: Negotiate {omitted}\r\n\r\n"
"GET /libgit2/TestGitRepository/info/refs?service=git-upload-pack HTTP/1.1\r\nUser-Agent: git/2.0 (libgit2 1.1.0)\r\nHost: github.com\r\nAccept: */*\r\n\r\n"
"POST /libgit2/TestGitRepository/git-upload-pack HTTP/1.1\r\nUser-Agent: git/2.0 (libgit2 1.1.0)\r\nHost: github.com\r\nAccept: application/x-git-upload-pack-result\r\nContent-Type: application/x-git-upload-pack-request\r\nContent-Length: 429\r\n\r\n"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm going to try and get an intuition for that now :).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My intuition with your suggestion is that we would actually regress again, because use_connect_proxy would always be set for an httpclient that has a connect proxy
Yes I would expect use_connect_proxy
to always be true when set for an httpclient that had a connect proxy. My thinking is that we should always set proxy credentials in this method when not using a connect proxy and never set proxy credentials in this method when we are.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see. Thanks for explaining that. I understand and agree with your take on this. We leverage our proxy credentials in a connect proxy in generate_connect_request, and thus don't need to ever add them in this method when we are a connect proxy. This makes sense. (looks like I'm the one who missed a !
😄).
I still think I standby this approach I took (though I can see how it is inferior), because proxy_connected
will never be set for non CONNECT proxies, therefore we will always send the Proxy-Authentication when in generate_request
. I believe that it works for the same reason that the original flow worked (there is external mutable state that signals whether we have leveraged CONNECT or not).
Are you certain that the approach I took will have a regression for non-CONNECT proxies? (asking, because we're shipping soon and debating on whether we need to take your suggestions before or if they can be taken later 😄)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Apologies for the ping, I edited that last one and want to make sure that you get the notification @ethomson.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you certain that the approach I took will have a regression for non-CONNECT proxies?
🤔 No, I'm not certain. I'm armchair reviewing from my phone. 🙃
I appreciate you explaining this, I think that you're right and that your approach should work. I had trouble reasoning about proxy_connected
.
I'd like to spin up some tests to play with this more but I wouldn't have too many hesitations about shipping hours as if if I were you. Worst case scenario: http (not https) auth breaks to git servers through a proxy. I'm actually not even sure that we support auth over http at the layer above this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had trouble reasoning about
proxy_connected
.
Mutable state was easy to reason about? 😂
I am going to take your solution, because it ultimately relies not on whether a connection was ever established, but on the fact that if one was ever established, that check must also hold. I prefer the non mutable approach because it's much easier to reason about.
Thanks for the prompt reply, that helped us make a decision.
a2cb816
to
3473a08
Compare
Thanks for this! |
Issue 1
When skipping the body, we used to wait for the
on_message_complete
callback to have been triggered, which setparse_finished
. After the refactor withhttpclient
, we stopped relying on this value, and thus we no longer consume the entire body when skipping it in the client. Here's where we used to consume the bodylibgit2/src/transports/http.c
Line 956 in bf55fac
This shows up as an issue primarily when working with the proxy connection workflow, where we need to discard the entire body (completely drain the socket of the response) before we start the retry with credentials.
Issue 2
We used to mutate
proxy_opts.type
toGIT_PROXY_NONE
after successfully connecting the tunnel. After the httpclient refactor, we continue to apply proxy-authorization headers to all requests even after we've already built a tunnel. This is fixed by checking the status of the proxy tunnel before applying proxy credentials. I think this roughly translates to the same behavior as before.This is where we used to mutate the
proxy_opts.type
beforelibgit2/src/transports/http.c
Line 1032 in bf55fac
libgit2/src/transports/http.c
Lines 233 to 235 in bf55fac