NAME

    Hijk - Fast & minimal low-level HTTP client

SYNOPSIS

    A simple GET request:

        use Hijk ();
        use feature "say";
        my $res = Hijk::request({
            method       => "GET",
            host         => "example.com",
            port         => "80",
            path         => "/flower",
            query_string => "color=red"
        });
    
        if (exists $res->{error} and $res->{error} & Hijk::Error::TIMEOUT) {
            die "Oh noes we had some sort of timeout";
        }
    
        die "Expecting an 'OK' response" unless $res->{status} == 200;
    
        say $res->{body};

    A POST request. You have to manually set the appropriate headers,
    URI-escape your values, etc.

        use Hijk ();
        use URI::Escape qw(uri_escape);
    
        my $res = Hijk::request({
            method       => "POST",
            host         => "example.com",
            port         => "80",
            path         => "/new",
            head         => [ "Content-Type" => "application/x-www-form-urlencoded" ],
            query_string => "type=flower&bucket=the%20one%20out%20back",
            body         => "description=" . uri_escape("Another flower, let's hope it's exciting"),
        });
    
        die "Expecting an 'OK' response" unless $res->{status} == 200;

DESCRIPTION

    Hijk is a fast & minimal low-level HTTP client intended to be used
    where you control both the client and the server, e.g. for talking to
    some internal service from a frontend user-facing web application.

    It is NOT a general HTTP user agent. It doesn't support redirects,
    proxies, SSL or any number of other advanced HTTP features that
    clients like (in roughly descending order of feature completeness)
    LWP::UserAgent, WWW::Curl, HTTP::Tiny, HTTP::Lite or Furl provide.
    This library is basically one step above manually talking HTTP over
    sockets.

    Having said that, it's lightning fast and extensively used in
    production at Booking.com <https://www.booking.com>, where it's the
    go-to transport layer for talking to internal services. It uses
    non-blocking sockets and correctly handles all combinations of
    connect/read timeouts and other issues you might encounter when parts
    of your system go down or become otherwise unavailable.

FUNCTION: Hijk::request( $args :HashRef ) :HashRef

    Hijk::request is the only function you should use. It (or anything else
    in this package for that matter) is not exported, so you have to use
    the fully qualified name.

    It takes a HashRef of arguments and either dies or returns a HashRef as
    a response.

    The HashRef argument must contain some of the key-value pairs from
    the following list. The values for host and port are mandatory; the
    others are optional, with the default values listed below.

        protocol               => "HTTP/1.1", # (or "HTTP/1.0")
        host                   => ...,
        port                   => ...,
        connect_timeout        => undef,
        read_timeout           => undef,
        read_length            => 10240,
        method                 => "GET",
        path                   => "/",
        query_string           => "",
        head                   => [],
        body                   => "",
        socket_cache           => \%Hijk::SOCKET_CACHE, # (undef to disable, or \my %your_socket_cache)
        on_connect             => undef, # (or sub { ... })
        parse_chunked          => 0,
        head_as_array          => 0,
        no_default_host_header => 0,

    Notice that Hijk does not take a full URI string as input; you have
    to specify the individual parts of the URL. Users who need to parse
    an existing URI string to produce a request should use the URI module
    to do so.
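
    As a rough sketch of that approach, the snippet below splits a URL
    with the URI module into the arguments listed above. The helper name
    hijk_args_from_url is made up for this example and is not part of
    Hijk; it assumes plain http:// URLs, since Hijk has no SSL support.

```perl
use strict;
use warnings;
use URI ();

# Hypothetical helper (not part of Hijk): split a URL string into the
# host/port/path/query_string arguments that Hijk::request() expects.
sub hijk_args_from_url {
    my ($url) = @_;
    my $uri = URI->new($url);

    # Only accept absolute http:// URLs.
    my $scheme = $uri->scheme;
    die "Only http:// URLs are supported, got '$url'\n"
        unless defined $scheme and $scheme eq "http";

    return {
        host         => $uri->host,
        port         => $uri->port,          # URI supplies 80 if the URL has no port
        path         => $uri->path || "/",   # "http://example.com" has an empty path
        query_string => $uri->query // "",   # still escaped, without the leading "?"
    };
}
```

    The resulting HashRef can be passed to Hijk::request() as-is, or
    merged with other options such as read_timeout first.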

    The value of head is an ArrayRef of key-value pairs instead of a
    HashRef. This way you can decide in which order the headers are sent,
    and you can send the same header name multiple times. For example:

        head => [
            "Content-Type" => "application/json",
            "X-Requested-With" => "Hijk",
        ]

    This will produce these request headers:

        Content-Type: application/json
        X-Requested-With: Hijk

    In addition, Hijk will by default provide a Host header for you with
    the host value you pass to request(). To suppress this (e.g. to send
    a custom Host header) pass a true value to the no_default_host_header
    option and provide your own Host header in the head ArrayRef (or
    don't; if you want to construct a Host-less request, knock yourself
    out...).

    Hijk doesn't escape any values for you; it just passes them through
    as-is. You can easily produce invalid requests if e.g. any of these
    strings contain a newline, or aren't otherwise properly escaped.
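
    Since nothing is validated for you, you may want a guard of your own
    in front of Hijk::request(). The following is only a sketch: the
    function assert_no_newlines is hypothetical, and it checks just the
    newline case mentioned above, not escaping in general.

```perl
use strict;
use warnings;

# Hypothetical pre-flight check, not part of Hijk: die if any value that
# ends up in the request line or the headers contains a CR or LF.
sub assert_no_newlines {
    my ($args) = @_;

    for my $name (qw(method host path query_string)) {
        my $value = $args->{$name};
        next unless defined $value;
        die "Argument '$name' contains a newline\n" if $value =~ /[\r\n]/;
    }

    # head is a flat list of header names and values; check every element.
    my @head = @{ $args->{head} || [] };
    for my $i (0 .. $#head) {
        next unless defined $head[$i];
        die "Element $i of 'head' contains a newline\n" if $head[$i] =~ /[\r\n]/;
    }

    return $args;    # unchanged, so the call can be chained
}
```

    Because the arguments are returned unchanged, the check can wrap the
    HashRef directly: Hijk::request(assert_no_newlines(\%args)).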

    The values of connect_timeout and read_timeout are in floating point
    seconds, and are used as the time limits for connecting to the host
    and for reading the response back from it, respectively. The default
    value for both is undef, meaning no timeout. If you don't supply
    these timeouts and the host really is unreachable or slow, we'll only
    return an error to you once the TCP timeout limit has been reached.

    The default protocol is HTTP/1.1, but you can also specify HTTP/1.0.
    The advantage of using HTTP/1.1 is support for keep-alive, which
    matters a lot in environments where connection setup represents
    non-trivial overhead. Sometimes that overhead is negligible (e.g. on
    Linux talking to an nginx on the local network), and keeping the
    number of open connections down and reducing complexity is more
    important. In those cases you can either use HTTP/1.0 or specify
    Connection: close in the request; just using HTTP/1.0 is the easier
    way to accomplish the same thing.

    By default we will provide a socket_cache for you, which is a global
    singleton that we maintain keyed on join($;, $$, $host, $port).
    Alternatively you can pass in a socket_cache hash of your own, which
    we'll use as the cache. To completely disable the cache, pass in
    undef.
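
    To make the key format concrete, here is a small sketch that builds
    such a key and reads the endpoints back out of a cache hash. Both
    function names are invented for this example, and cached_endpoints
    assumes every key in the hash follows the format quoted above.

```perl
use strict;
use warnings;

# The key format quoted above: process ID, host and port joined by $;
# (the subscript separator, "\034" unless it has been changed).
sub socket_cache_key {
    my ($host, $port) = @_;
    return join($;, $$, $host, $port);
}

# Hypothetical helper: list the "host:port" endpoints that the current
# process has entries for in the given cache hash.
sub cached_endpoints {
    my ($cache) = @_;
    my $separator = quotemeta($;);
    my @endpoints;
    for my $key (sort keys %$cache) {
        my ($pid, $host, $port) = split /$separator/, $key, 3;
        push @endpoints, "$host:$port" if $pid == $$;
    }
    return @endpoints;
}
```

    A cache of your own is just a plain hash, e.g. my %my_cache; passed
    as socket_cache => \%my_cache. Since $$ is part of the key, a forked
    child looks up different keys than its parent did, so it should not
    pick up entries that the parent stored.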

    The optional on_connect callback is intended to help you figure out
    from production traffic what you should set connect_timeout to. You
    can start a timer when you call Hijk::request() and stop it when
    on_connect is called; that's how long it took us to get a
    connection. If you start another timer in that callback and stop it
    when Hijk::request() returns, that'll give you how long it took to
    send/receive data after we constructed the socket, which will help
    you tweak your read_timeout. The on_connect callback is provided
    with no arguments, and is called in void context.
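
    The two timers described above could be wired up roughly as follows.
    This is a sketch, not part of Hijk: timed_request is a made-up name,
    and its optional second argument (a code reference used instead of
    Hijk::request) exists only so the wrapper can be tried without a
    network.

```perl
use strict;
use warnings;
use Time::HiRes qw(time);

# Hypothetical wrapper: run a request and report how long the connect
# phase and the send/receive phase took, in floating point seconds.
sub timed_request {
    my ($args, $request) = @_;
    $request ||= \&Hijk::request;    # needs "use Hijk ();" in real use

    my $started = time();
    my $connected;

    # Copy the arguments and install our own on_connect callback. This
    # replaces any on_connect the caller passed in.
    my $res = $request->({
        %$args,
        on_connect => sub { $connected = time() },
    });
    my $finished = time();

    my %timing = (total => $finished - $started);

    # The callback may never have fired (presumably the case when no new
    # connection had to be made), so only report what we measured.
    if (defined $connected) {
        $timing{connect}  = $connected - $started;
        $timing{transfer} = $finished - $connected;
    }
    return ($res, \%timing);
}
```

    Collecting the connect and transfer values over a sample of real
    requests gives you a distribution from which to pick
    connect_timeout and read_timeout.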

    We have experimental support for parsing chunked response encoding.
    Historically Hijk didn't support this at all, and if you wanted to
    use it with e.g. nginx you had to add chunked_transfer_encoding off
    to the nginx config file.

    Since you may just want to do that instead of having Hijk do more
    work to parse this out with a more complex and experimental
    codepath, you have to explicitly enable it with parse_chunked.
    Otherwise Hijk will die when it encounters chunked responses. The
    parse_chunked option may be turned on by default in the future.

    The return value is a HashRef representing a response. It contains the
    following key-value pairs.

        proto         => :Str
        status        => :StatusCode
        body          => :Str
        head          => :HashRef (or :ArrayRef with "head_as_array")
        error         => :PositiveInt
        error_message => :Str
        errno_number  => :Int
        errno_string  => :Str

    For example, to send a request to http://example.com/flower?color=red,
    pass the following parameters:

        my $res = Hijk::request({
            host         => "example.com",
            port         => "80",
            path         => "/flower",
            query_string => "color=red"
        });
        die "Response is not 'OK'" unless $res->{status} == 200;

    Notice that you do not need to put the leading "?" character in the
    query_string. You do, however, need to properly uri_escape the content
    of query_string.

    Again, Hijk doesn't escape any values for you, so these values MUST be
    properly escaped before being passed in, unless you want to issue
    invalid requests.
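
    One way to do that escaping for a whole set of parameters is
    sketched below, using uri_escape from URI::Escape as in the
    SYNOPSIS. The helper build_query_string is hypothetical and not
    provided by Hijk.

```perl
use strict;
use warnings;
use URI::Escape qw(uri_escape);

# Hypothetical helper: turn a flat list of key/value pairs into an
# escaped query string, keeping the pairs in the order given.
sub build_query_string {
    my (@pairs) = @_;
    my @parts;
    while (my ($key, $value) = splice(@pairs, 0, 2)) {
        push @parts, uri_escape($key) . "=" . uri_escape($value);
    }
    return join "&", @parts;    # no leading "?", as query_string expects
}
```

    For example, build_query_string(type => "flower", bucket => "the
    one out back") yields type=flower&bucket=the%20one%20out%20back.
    Values containing characters above code point 255 need
    uri_escape_utf8 instead of uri_escape.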

    By default the head in the response is a HashRef rather than an
    ArrayRef. This makes it easier to retrieve specific header fields,
    but it means that we'll clobber any duplicated header names with the
    most recently seen header value. To get the returned headers as an
    ArrayRef instead, specify head_as_array.
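
    If you request the headers as an ArrayRef, you can group repeated
    headers yourself. The sketch below assumes the ArrayRef is a flat
    list of alternating names and values, like the head argument of the
    request; headers_by_name is a made-up helper, not part of Hijk.

```perl
use strict;
use warnings;

# Hypothetical helper: group a flat [ name, value, name, value, ... ]
# list into { lowercased name => [ values in order of appearance ] }.
sub headers_by_name {
    my ($head) = @_;
    my @pairs = @$head;    # work on a copy, splice() is destructive
    my %by_name;
    while (my ($name, $value) = splice(@pairs, 0, 2)) {
        # Header names are case-insensitive, so normalise the key.
        push @{ $by_name{ lc $name } }, $value;
    }
    return \%by_name;
}
```

    With this, a response carrying two Set-Cookie headers ends up with
    both values under the "set-cookie" key instead of only the last one.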

    If you want to fiddle with the read_length value, note that it
    controls how much we POSIX::read($fd, $buf, $read_length) at a time.

    We currently don't support servers returning an HTTP body without an
    accompanying Content-Length header; bodies MUST have a Content-Length
    or we won't pick them up.

ERROR CODES

    If we had a recoverable error we'll include an "error" key whose value
    is a bitfield that you can check against Hijk::Error::* constants.
    Those are:

        Hijk::Error::CONNECT_TIMEOUT
        Hijk::Error::READ_TIMEOUT
        Hijk::Error::TIMEOUT
        Hijk::Error::CANNOT_RESOLVE
        Hijk::Error::REQUEST_SELECT_ERROR
        Hijk::Error::REQUEST_WRITE_ERROR
        Hijk::Error::REQUEST_ERROR
        Hijk::Error::RESPONSE_READ_ERROR
        Hijk::Error::RESPONSE_BAD_READ_VALUE
        Hijk::Error::RESPONSE_ERROR

    In addition we might return error_message, errno_number and
    errno_string keys, see the discussion of Hijk::Error::REQUEST_* and
    Hijk::Error::RESPONSE_* errors below.
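
    These keys lend themselves to logging. The helper below is only an
    illustration (format_hijk_error is not part of Hijk); it relies on
    nothing beyond the key names listed above and treats every key other
    than error as optional.

```perl
use strict;
use warnings;

# Hypothetical helper: build a one-line log message from the
# error-related keys of a response HashRef.
sub format_hijk_error {
    my ($res) = @_;
    return unless exists $res->{error};    # no recoverable error reported

    my $line = sprintf "hijk error bits=%d", $res->{error};
    $line .= sprintf ' message="%s"', $res->{error_message}
        if defined $res->{error_message};
    $line .= sprintf " errno=%d (%s)", $res->{errno_number}, $res->{errno_string} // ""
        if defined $res->{errno_number};
    return $line;
}
```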

    The Hijk::Error::TIMEOUT constant is the same as
    Hijk::Error::CONNECT_TIMEOUT | Hijk::Error::READ_TIMEOUT. It's there
    for convenience so you can do:

        .. if exists $res->{error} and $res->{error} & Hijk::Error::TIMEOUT;

    Instead of the more verbose:

        .. if exists $res->{error} and $res->{error} & (Hijk::Error::CONNECT_TIMEOUT | Hijk::Error::READ_TIMEOUT);
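
    The same bit test can drive a simple retry loop. What follows is a
    sketch under assumptions, not something Hijk provides: the wrapper
    name request_with_retry and its retry_on, attempts and request
    options are all invented for this example.

```perl
use strict;
use warnings;

# Hypothetical wrapper. retry_on is an error bitmask such as
# Hijk::Error::TIMEOUT; request defaults to Hijk::request and can be
# replaced by a stub for testing.
sub request_with_retry {
    my ($args, %opt) = @_;
    my $request  = $opt{request}  || \&Hijk::request;
    my $retry_on = $opt{retry_on} || 0;
    my $attempts = $opt{attempts} || 3;

    my $res;
    for my $attempt (1 .. $attempts) {
        $res = $request->($args);
        # Stop unless the response carries one of the bits we retry on.
        last unless exists $res->{error} and $res->{error} & $retry_on;
    }
    return $res;    # the last response, whether it succeeded or not
}
```

    After use Hijk (); this would be called as
    request_with_retry(\%args, retry_on => Hijk::Error::TIMEOUT). Keep
    in mind that retrying a request that is not idempotent, such as
    many POSTs, can repeat its side effects on the server.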

    We'll return Hijk::Error::CANNOT_RESOLVE if we can't gethostbyname()
    the host you've provided.

    If we fail to do a select() or write() while sending the request
    we'll return Hijk::Error::REQUEST_SELECT_ERROR or
    Hijk::Error::REQUEST_WRITE_ERROR, respectively. Similarly to
    Hijk::Error::TIMEOUT, the Hijk::Error::REQUEST_ERROR constant is a
    union of these two and any other request errors we might add in the
    future.

    When we're getting the response back we'll return
    Hijk::Error::RESPONSE_READ_ERROR when we can't read() the response, and
    Hijk::Error::RESPONSE_BAD_READ_VALUE when the value we got from read()
    is 0. The Hijk::Error::RESPONSE_ERROR constant is a union of these two
    and any other response errors we might add in the future.

    Some of these Hijk::Error::REQUEST_* and Hijk::Error::RESPONSE_*
    errors are re-thrown errors from system calls. In that case we'll
    also pass along error_message, which is a short human-readable
    message about the error, as well as errno_number and errno_string,
    which are $!+0 and "$!" at the time we had the error.

    Hijk might encounter other errors during the course of the request
    and WILL call die if that happens, so if you don't want your program
    to stop when a request like that fails, wrap it in eval.
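
    A minimal sketch of such a wrapper is shown below. The name
    safe_request and its optional second argument (a code reference to
    call instead of Hijk::request) are assumptions made for this
    example, not part of Hijk.

```perl
use strict;
use warnings;

# Hypothetical wrapper: returns ($response, undef) on success and
# (undef, $exception) if the request died.
sub safe_request {
    my ($args, $request) = @_;
    $request ||= \&Hijk::request;    # needs "use Hijk ();" in real use

    my $res;
    # The trailing 1 makes eval return true on success even if the
    # response itself were false.
    my $ok = eval { $res = $request->($args); 1 };
    return ($res, undef) if $ok;

    return (undef, $@ || "request died with an empty exception");
}
```

    Note that this only covers the die case. A response returned
    normally can still carry an "error" key, which has to be checked
    separately.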

    Having said that, the point of the Hijk::Error::* interface is that
    all errors that happen during normal operation, i.e. making valid
    requests against servers where you can have issues like timeouts,
    network blips or the server thread on the other end being suddenly
    kill -9'd, should be caught, categorized and returned in a structured
    way by Hijk.

    We're not currently aware of any issues that occur in such normal
    operations that aren't classified as a Hijk::Error::*, and if we find
    new issues that fit the criteria above we'll likely just make a new
    Hijk::Error::* for it.

    We're just not trying to guarantee that the library can never die, and
    aren't trying to catch truly exceptional issues like e.g. fcntl()
    failing on a valid socket.

AUTHORS

    Kang-min Liu <gugod@gugod.org>

    Ævar Arnfjörð Bjarmason <avar@cpan.org>

    Borislav Nikolov <jack@sofialondonmoskva.com>

    Damian Gryski <damian@gryski.com>

COPYRIGHT

    Copyright (c) 2013- Kang-min Liu <gugod@gugod.org>.

LICENCE

    The MIT License

DISCLAIMER OF WARRANTY

    BECAUSE THIS SOFTWARE IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY
    FOR THE SOFTWARE, TO THE EXTENT PERMITTED BY APPLICABLE LAW. EXCEPT
    WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER
    PARTIES PROVIDE THE SOFTWARE "AS IS" WITHOUT WARRANTY OF ANY KIND,
    EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
    WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE
    ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE SOFTWARE IS WITH
    YOU. SHOULD THE SOFTWARE PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL
    NECESSARY SERVICING, REPAIR, OR CORRECTION.

    IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING
    WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR
    REDISTRIBUTE THE SOFTWARE AS PERMITTED BY THE ABOVE LICENCE, BE LIABLE
    TO YOU FOR DAMAGES, INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL, OR
    CONSEQUENTIAL DAMAGES ARISING OUT OF THE USE OR INABILITY TO USE THE
    SOFTWARE (INCLUDING BUT NOT LIMITED TO LOSS OF DATA OR DATA BEING
    RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD PARTIES OR A
    FAILURE OF THE SOFTWARE TO OPERATE WITH ANY OTHER SOFTWARE), EVEN IF
    SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH
    DAMAGES.