Why Use Service Discovery
Let’s imagine that you are writing some code that invokes a service that has a REST
API or Thrift API. In order to make a request, your code needs to know the network
location (IP address and port) of a service instance. In a traditional application
running on physical hardware, the network locations of service instances are
relatively static. For example, your code can read the network locations from a
configuration file that is occasionally updated.
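As a rough illustration of that static approach, here is a minimal Python sketch. The JSON layout, the `orders` service name, the addresses, and the `load_locations` helper are all hypothetical and chosen only for this example.

```python
import json

# Hypothetical static configuration: each service name maps to a fixed
# list of "host:port" strings that an operator edits by hand.
CONFIG_TEXT = """
{
  "services": {
    "orders": ["10.0.0.11:8080", "10.0.0.12:8080"]
  }
}
"""


def load_locations(config_text, service):
    """Return a list of (host, port) tuples for one service."""
    config = json.loads(config_text)
    locations = []
    for entry in config["services"][service]:
        # Split on the last colon so the port is always the final field.
        host, port = entry.rsplit(":", 1)
        locations.append((host, int(port)))
    return locations


if __name__ == "__main__":
    for host, port in load_locations(CONFIG_TEXT, "orders"):
        print(f"orders instance at {host}:{port}")
```

This only works while the addresses stay put: the list changes only when someone edits the file.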
Now that we have looked at client-side discovery, let’s take a look at server-side
discovery.
The AWS Elastic Load Balancer (ELB) is an example of a server-side discovery router.
An ELB is commonly used to load balance external traffic from the Internet.
However, you can also use an ELB to load balance traffic that is internal to a virtual
private cloud (VPC). A client makes requests (HTTP or TCP) via the ELB using its
DNS name. The ELB load balances the traffic among a set of registered Elastic
Compute Cloud (EC2) instances or EC2 Container Service (ECS) containers. There
isn’t a separate service registry. Instead, EC2 instances and ECS containers are
registered with the ELB itself.
HTTP servers and load balancers such as NGINX Plus and NGINX can also act as
server-side discovery load balancers. For example, this blog post describes
using Consul Template to dynamically reconfigure NGINX reverse proxying.
Consul Template is a tool that periodically regenerates arbitrary configuration files
from configuration data stored in the Consul service registry. It runs an arbitrary shell
command whenever the files change. In the example described by the blog post,
Consul Template generates an nginx.conf file, which configures the reverse
proxying, and then runs a command that tells NGINX to reload the configuration. A
more sophisticated implementation could dynamically reconfigure NGINX Plus using
either its HTTP API or DNS.
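The regenerate-then-reload cycle described above can be sketched in plain Python. This is not Consul Template's own template syntax; `render_upstream`, `regenerate`, and the sample addresses are hypothetical names used only for illustration, and the generated text is assumed to be included inside the `http` block of an NGINX configuration.

```python
def render_upstream(name, instances):
    """Render an upstream group and a server block that proxies to it.

    `instances` is an iterable of (host, port) tuples, as they might be
    read from a service registry.
    """
    lines = [f"upstream {name} {{"]
    for host, port in sorted(instances):
        lines.append(f"    server {host}:{port};")
    lines += [
        "}",
        "",
        "server {",
        "    listen 80;",
        "    location / {",
        f"        proxy_pass http://{name};",
        "    }",
        "}",
    ]
    return "\n".join(lines) + "\n"


def regenerate(previous_text, name, instances):
    """Return (new_text, changed); a reload is only needed when changed."""
    new_text = render_upstream(name, instances)
    return new_text, new_text != previous_text


if __name__ == "__main__":
    text, changed = regenerate("", "orders", [("10.0.0.12", 8080), ("10.0.0.11", 8080)])
    print(text)
    # A real setup would write the file here and, if `changed` is true,
    # run a reload command such as `nginx -s reload`.
```

Sorting the instances keeps the output stable, so a registry that returns the same instances in a different order does not trigger a needless reload.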
Some deployment environments such as Kubernetes and Marathon run a proxy on
each host in the cluster. The proxy plays the role of a server-side discovery load
balancer. In order to make a request to a service, a client routes the request via the
proxy using the host’s IP address and the service’s assigned port. The proxy then
transparently forwards the request to an available service instance running
somewhere in the cluster.
The server-side discovery pattern has several benefits and drawbacks. One great
benefit of this pattern is that details of discovery are abstracted away from the client.
Clients simply make requests to the load balancer. This eliminates the need to
implement discovery logic for each programming language and framework used by
your service clients. Also, as mentioned above, some deployment environments
provide this functionality for free. This pattern also has some drawbacks, however.
Unless the load balancer is provided by the deployment environment, it is yet
another highly available system component that you need to set up and manage.
etcd – A highly available, distributed, consistent, key-value store that is used for
shared configuration and service discovery. Two notable projects that use etcd are
Kubernetes and Cloud Foundry.
Consul – A tool for discovering and configuring services. It provides an API that
allows clients to register and discover services. Consul can perform health checks to
determine service availability.
Apache ZooKeeper – A widely used, high-performance coordination service for
distributed applications. Apache ZooKeeper was originally a subproject of Hadoop but
is now a top-level project.
Also, as noted previously, some systems such as Kubernetes, Marathon, and AWS
do not have an explicit service registry. Instead, the service registry is just a built-in
part of the infrastructure.
Now that we have looked at the concept of a service registry, let’s look at how
service instances are registered with the service registry.
The alternative approach, which decouples services from the service registry, is the
third-party registration pattern.
The third-party registration pattern has various benefits and drawbacks. A major
benefit is that services are decoupled from the service registry. You don’t need to
implement service-registration logic for each programming language and framework
used by your developers. Instead, service instance registration is handled in a
centralized manner within a dedicated service.
One drawback of this pattern is that unless it’s built into the deployment environment,
it is yet another highly available system component that you need to set up and
manage.
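To make the idea concrete, here is a minimal sketch of what such a dedicated registrar might do on each pass: compare the instances the deployment environment reports as running with what the registry currently holds, then register and deregister the difference. The `reconcile` function and the dictionary-based registry are assumptions for illustration, not the interface of any particular registrar.

```python
def reconcile(registry, service, running):
    """Make registry[service] match the set of running instances.

    `registry` maps a service name to a set of "host:port" strings and
    `running` is the set the deployment environment currently reports.
    Returns (registered, deregistered) so the caller can log the changes.
    """
    known = registry.setdefault(service, set())
    to_register = running - known
    to_deregister = known - running
    known |= to_register      # in-place update of the registry's own set
    known -= to_deregister
    return to_register, to_deregister


if __name__ == "__main__":
    registry = {}
    print(reconcile(registry, "orders", {"10.0.0.11:8080"}))
    print(reconcile(registry, "orders", {"10.0.0.12:8080"}))
    print(registry)
```

The services themselves never touch the registry here, which is the decoupling the pattern is after.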
Summary
In a microservices application, the set of running service instances changes
dynamically. Instances have dynamically assigned network locations. Consequently,
in order for a client to make a request to a service it must use a service-discovery
mechanism.
A key part of service discovery is the service registry. The service registry is a
database of available service instances. The service registry provides a
management API and a query API. Service instances are registered with and
deregistered from the service registry using the management API. The query API is
used by system components to discover available service instances.
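The two APIs can be pictured with a small in-memory sketch. The `ServiceRegistry` class and its method names are hypothetical; a real registry such as those named in this article is a distributed, highly available system, which this toy version does not attempt to model.

```python
class ServiceRegistry:
    """Toy in-memory registry: service name -> set of (host, port)."""

    def __init__(self):
        self._instances = {}

    # --- management API: used to register and deregister instances ---
    def register(self, service, host, port):
        self._instances.setdefault(service, set()).add((host, port))

    def deregister(self, service, host, port):
        # discard() is a no-op if the instance was never registered.
        self._instances.get(service, set()).discard((host, port))

    # --- query API: used by clients and routers to discover instances ---
    def lookup(self, service):
        return sorted(self._instances.get(service, set()))


if __name__ == "__main__":
    registry = ServiceRegistry()
    registry.register("orders", "10.0.0.11", 8080)
    registry.register("orders", "10.0.0.12", 8080)
    registry.deregister("orders", "10.0.0.11", 8080)
    print(registry.lookup("orders"))
```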
There are two main service-discovery patterns: client-side discovery and server-side
discovery. In systems that use client-side service discovery, clients query the
service registry, select an available instance, and make a request. In systems that
use server-side discovery, clients make requests via a router, which queries the
service registry and forwards the request to an available instance.
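The "query, select, request" sequence can be sketched as follows. `DiscoveringClient` is a hypothetical name, round-robin is just one possible selection policy, and `lookup` stands in for whatever query API the registry exposes. In the server-side variant, the same logic would live in the router rather than in each client.

```python
import itertools


class DiscoveringClient:
    """Selects a service instance in round-robin order.

    `lookup` is any callable that takes a service name and returns a
    list of (host, port) tuples, for example a registry's query API.
    """

    def __init__(self, lookup):
        self._lookup = lookup
        self._counter = itertools.count()

    def choose(self, service):
        # Query the registry on every call so new instances are picked up.
        instances = self._lookup(service)
        if not instances:
            raise LookupError(f"no available instances of {service!r}")
        return instances[next(self._counter) % len(instances)]


if __name__ == "__main__":
    # Hypothetical fixed answer in place of a real registry query.
    client = DiscoveringClient(lambda _service: [("10.0.0.11", 8080), ("10.0.0.12", 8080)])
    for _ in range(3):
        host, port = client.choose("orders")
        print(f"would send request to http://{host}:{port}/")
```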
There are two main ways that service instances are registered with and deregistered
from the service registry. One option is for service instances to register themselves
with the service registry, the self-registration pattern. The other option is for some other
system component to handle the registration and deregistration on behalf of the
service, the third-party registration pattern.
In some deployment environments you need to set up your own service-discovery
infrastructure using a service registry such as Netflix Eureka, etcd,
or Apache ZooKeeper. In other deployment environments, service discovery is built in.
For example, Kubernetes and Marathon handle service instance registration and
deregistration. They also run a proxy on each cluster host that plays the role
of a server-side discovery router.
An HTTP reverse proxy and load balancer such as NGINX can also be used as a
server-side discovery load balancer. The service registry can push the routing
information to NGINX and invoke a graceful configuration update; for example, you
can use Consul Template. NGINX Plus supports additional dynamic reconfiguration
mechanisms – it can pull information about service instances from the registry using
DNS, and it provides an API for remote reconfiguration.
In future blog posts, we’ll continue to dive into other aspects of microservices.