HTTP: Support Apache-based servers with Negotiate #5286


Merged (36 commits) Jan 24, 2020

Conversation

ethomson
Member

Most non-Microsoft web servers handle NTLM and Negotiate authentication differently than Microsoft web servers.

In theory, NTLM and Negotiate authenticate a connection rather than an individual request. Since they are challenge/response mechanisms, authenticating is expensive, requiring multiple round-trips. To avoid that repeated back-and-forth on every request, the entirety of a keep-alive socket is deemed authenticated once authentication succeeds.

This requires that the server honor the authentication for the lifetime of the keep-alive socket, which some servers (eg, Apache) simply do not. Instead, they require re-authentication for each request. This is costly but sadly unavoidable.

libgit2 was built on the premise that we would do a GET to the git endpoint, which we would authenticate. Subsequent POST requests would remain authenticated, either because we were using HTTP Basic and would re-send credentials each time, or because we were using NTLM or Negotiate authentication on a kept-alive socket that would remain authenticated throughout its lifetime.

Sadly, this assumption is poor: Apache does not abide by these rules. So we need to teach our HTTP stack to authenticate POST requests as well.

Regrettably, this leads to other difficulties - when we POST, we need to deliver a payload in our request. If we get a 401 back, we need to send that payload again. This is - at best - inefficient since we're sending the payload twice, but - at worst - is impossible since we may be streaming data to the server and we cannot rewind the stream.

One attempt is to use expect/continue. Instead of delivering the payload, we send "Expect: 100-continue", and only deliver the headers. We can then do the authentication dance, and when the server responds with a "100 Continue" response, we can deliver the payload. Although expect/continue is incredibly useful, it's not necessarily widely supported on ancient servers, and it's not supported by winhttp. As a result, although I'm adding expect/continue support here, I don't think it's the proper solution. It's behind an option.

I'll add the "probe packet" support from #2529, as this seems like the most correct way of handling this.

@ethomson
Member Author

I meant to open this as a draft. It’s incomplete, I just wanted to see the CI builds for this approach.

@ethomson ethomson mentioned this pull request Nov 1, 2019
@ethomson ethomson force-pushed the ethomson/gssapi branch 8 times, most recently from 5fb1541 to da07958 Compare November 27, 2019 12:28
@pks-t
Member

pks-t commented Nov 28, 2019

I meant to open this as a draft. It’s incomplete, I just wanted to see the CI builds for this approach.

So no need to do reviews yet?

@ethomson ethomson changed the title HTTP: Support Apache-based servers with Negotiate WIP: HTTP: Support Apache-based servers with Negotiate Dec 5, 2019
@ethomson ethomson force-pushed the ethomson/gssapi branch 8 times, most recently from 07bdb22 to fe24296 Compare January 1, 2020 21:44
@ethomson
Member Author

ethomson commented Jan 1, 2020

OK, this is ready for review. I started this PR by adding support for both expect/continue and sending probe packets. And it was bad. The business logic of dealing with the HTTP flow was rather tangled up in things like the callbacks used by the HTTP parser library. Some of the code was shared between the standard request/response handling, proxy handling, and then the new expect/continue and probe packet handling, but some of it was copypasta. It was rather brittle.

And so after seeing this report in #5236:

Unfortunately however I get segfaults in libssl when cloning over https 😩 SSH and Negotiate over HTTP work fine. We had this code working with 28.2, but I have a feeling that maybe something has changed in this branch and we need to change how we're calling the api. Here is the backtrace, if you have any ideas on what to check first that would be really helpful!

I realized that if I had just broken something, it would be hard to diagnose, and probably harder to fix.

So I decided to tease things apart -- I've added a git_http_client, which actually handles sending a request and receiving a response, and made the existing http transport use that. I think that the result is much, much simpler and easier to reason about.

@jonathanturcotte could you test this? I hope that you're not seeing segmentation faults anymore. 🤞

@ethomson
Member Author

ethomson commented Jan 1, 2020

Also, apologies for the number of commits. A lot of the changes to http.c were refactorings to better understand how things were working in preparation for creating the httpclient. I hope that this isn't too bearish of a review. 😓

@ethomson ethomson force-pushed the ethomson/gssapi branch 4 times, most recently from 72589f6 to 08ed83b Compare January 2, 2020 05:21
Member

@pks-t pks-t left a comment


Whew, that's been quite a read. I like this quite a lot, tearing apart HTTP and Git related protocols.

One thing I noted is that this series seems to build the same set of features twice, first for the legacy transports and then the same stuff for the new HTTP client, removing the first try again. I can relate if this is asking too much, but would it be possible to kind of discard the first set of commits that get discarded later on anyway, starting with simple bits like tracing, then creating the new httpclient and afterwards converting callers? Would make it much easier to review this series in future iterations.

@ethomson
Member Author

ethomson commented Jan 8, 2020

Whew, that's been quite a read. I like this quite a lot, tearing apart HTTP and Git related protocols.

Thanks. I'm really happy with it overall. Due to the sort of weird way the transports work (the "read" / "write" lifecycle), it took a lot of aborted attempts to get it right.

One thing I noted is that this series seems to build the same set of features twice, first for the legacy transports and then the same stuff for the new HTTP client, removing the first try again. I can relate if this is asking too much, but would it be possible to kind of discard the first set of commits that get discarded later on anyway, starting with simple bits like tracing, then creating the new httpclient and afterwards converting callers? Would make it much easier to review this series in future iterations.

Yeah, I can drop the early http.c refactoring. I left those in since that's literally the process that I followed - but they're certainly pointless in the end. I'll respond to your review and fix this up over the weekend.

Move the redirect handling into `git_net_url` for consistency.
Provide a mechanism to add a path and query string to an existing url
so that we can easily append `/info/refs?...` type url segments to a url
given to us by a user.
Allow users to consume a buffer by the number of bytes, not just to an
ending pointer.
Introduce a new http client implementation that can GET and POST to
remote URLs.

Consumers can use `git_http_client_init` to create a new client,
`git_http_client_send_request` to send a request to the remote server
and `git_http_client_read_response` to read the response.

The http client implementation will perform the I/O with the remote
server (http or https) but does not understand the git smart transfer
protocol.  This allows us to split the concerns of the http subtransport
from the actual http implementation.
Teach httpclient how to support chunking when POSTing request bodies.
The CLAR_TRACE_LEVEL environment variable was supported when building
with GIT_TRACE.  Now we always build with GIT_TRACE, but that variable
is not provided to tests.  Simply support clar tracing always.
When sending a new request, ensure that we got the entirety of the
response body.  Our caller may have decided that they were done reading.
If we were not at the end of the message, this means that we need to
tear down the connection and cannot do keep-alive.

However, if the caller read all of the message, but we still have a
final end-of-response chunk signifier (ie, "0\r\n\r\n") on the socket,
then we should consider that the response was successfully completed.

If we're asked to send a new request, try to read from the socket, just
to clear out that end-of-chunk message, marking ourselves as
disconnected on any errors.
Introduce a function to format the path and query string for a URL,
suitable for creating an HTTP request.
Store the last-seen credential challenges (eg, all the
'WWW-Authenticate' headers in a response message).  Given some
credentials, find the best (first) challenge whose mechanism supports
these credentials.  (eg, 'Basic' supports username/password credentials,
'Negotiate' supports default credentials).

Set up an authentication context for this mechanism and these
credentials.  Continue exchanging challenge/responses until we're
authenticated.
Detect responses that are sent with Transfer-Encoding: chunked, and
record that information so that we can consume the entire message body.
Fully support HTTP proxies, in particular CONNECT proxies, that allow us
to speak TLS through a proxy.
Allow users to opt-in to expect/continue handling when sending a POST
and we're authenticated with a "connection-based" authentication
mechanism like NTLM or Negotiate.

If the response is a 100, return to the caller (to allow them to post
their body).  If the response is *not* a 100, buffer the response for
the caller.

HTTP expect/continue is generally safe, but some legacy servers
have not implemented it correctly.  Require it to be opt-in.
Untangle the notion of the http transport from the actual http
implementation.  The http transport now uses the httpclient.
When we're authenticating with a connection-based authentication scheme
(NTLM, Negotiate), we need to make sure that we're still connected
between the initial GET where we did the authentication and the POST
that we're about to send.  Our keep-alive session may not have been
kept alive, but more likely, some servers (namely Apache and nginx) do
not authenticate the entire keep-alive connection and may have
"forgotten" that we were authenticated.

Send a "probe" packet, that is an HTTP POST request to the upload-pack
or receive-pack endpoint, that consists of an empty git pkt ("0000").
If we're authenticated, we'll get a 200 back.  If we're not, we'll get a
401 back, and then we'll resend that probe packet with the first step of
our authentication (asking to start authentication with the given
scheme).  We expect _yet another_ 401 back, with the authentication
challenge.

Finally, we will send our authentication response with the actual POST
data.  This will allow us to authenticate without draining the POST data
in the initial request that gets us a 401.
When tracing is disabled, don't let `git_trace__level` return a void,
since that can't be compared against.
Download poxygit, a debugging git server, and clone from it using NTLM,
both IIS-style (with connection affinity) and Apache-style ("broken",
requiring constant reauthentication).
Disambiguate between general network problems and HTTP problems in error
codes.
@ethomson
Member Author

Thanks @pks-t for the review!
