Stupid Simple Kubernetes


Stupid Simple

Kubernetes

by Zoltán Czakó
Welcome to Stupid Simple Kubernetes

In software development, the single constant is that everything changes fast. Good developers are always prepared for change: the framework you're working on today could be outdated in a few months. One way to prepare for change is to create loosely coupled, independent components that can be easily replaced.

As software and tools change, so do application architectures. Recently we've witnessed an evolution from traditional monolithic architecture, where all components of the application are packed as a single autonomous unit, to service-oriented architecture, to today's microservices architecture.

Microservices architectures have sprung up because of changes in tools, programming languages and development environments. To keep up with new technologies, you need a way to add new, independent components at any time. These components must be free to use whatever technology they like, so they can be built using different programming languages and frameworks, databases or even communication protocols.

In this e-book, we will show you how to build a stable, easily manageable, highly available microservices architecture. In the first part, we will introduce Kubernetes, its components and building blocks. Then, we will build a small sample application based on a microservices architecture. We'll define the Kubernetes scripts to create the Deployments, Services, Ingress Controllers and Persistent Volumes and then deploy this ecosystem in Azure Cloud.

In the second part of this book, we will dive into scalability and we will define different Kubernetes configuration files for Horizontal and Vertical Pod Autoscaling and also for Cluster Autoscaling.

In the last part of the book, we will present different solutions for easily handling all the cross-cutting concerns that we presented, using Service Meshes. We'll build our own Service Mesh using Envoy proxies and then use Istio to handle all these concerns automatically.

Ready to get started with Kubernetes? Let's go.



Chapter 1
Everything You Need to Know to Start Using Kubernetes

Chapter 2
Deployments, Services and Ingresses Explained

Chapter 3
Persistent Volumes Explained

Chapter 4
Create an Azure Infrastructure for Microservices

Chapter 5
Stupid Simple Scalability

Chapter 6
Stupid Simple Service Mesh - What, When, Why

Chapter 7
Stupid Simple Service Mesh in Kubernetes

Conclusion
Become a Microservices Master



Chapter 1

Everything You Need to Know to Start Using Kubernetes

In the era of Microservices, Cloud Computing and Serverless
architecture, it’s useful to understand Kubernetes and learn
how to use it. However, the official Kubernetes documentation
can be hard to decipher, especially for newcomers. In this
book, I will present a simplified view of Kubernetes and give
examples of how to use it for deploying microservices using
different cloud providers, including Azure, Amazon, Google
Cloud and even IBM.

In this first chapter, we’ll talk about the most important


concepts used in Kubernetes. Later in the book, we’ll learn how
to write configuration files, use Helm as a package manager,
create a cloud infrastructure, easily orchestrate our services
using Kubernetes and create a CI/CD pipeline to automate
the whole workflow. With this information, you can spin up any
kind of project and create a solid infrastructure/architecture.

First, I'd like to mention that using containers has multiple benefits, from increased deployment velocity to delivery consistency with a greater horizontal scale. Even so, you should not use containers for everything because just putting any part of your application in a container comes with overhead, like maintaining a container orchestration layer. So, don't jump to conclusions. Instead, create a cost/benefit analysis at the start of the project.

Now, let's start our journey in the world of Kubernetes.



Kubernetes Hardware Structure

Cluster

A cluster is a group of nodes. When you deploy programs onto the cluster, it automatically handles the distribution of work to the individual nodes. If more resources are required (for example, we need more memory), new nodes can be added to the cluster, and the work will be redistributed automatically.

We run our code on a cluster, and we shouldn't care about which node. The distribution of the work is automatic.

Nodes

Nodes are worker machines in Kubernetes, which can be any


device that has CPU and RAM. For example, a node can be
anything, from a smartwatch, smartphone, or laptop to a
Raspberry Pi. When we work with cloud providers, a node is a
virtual machine (VM). So, a node is an abstraction over a single
device.

As you will see in the next chapter, the beauty of this abstraction
is that we don’t need to know the underlying hardware
structure. We will just use nodes; this way, our infrastructure is
platform independent.
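Once you have access to a cluster with kubectl, you can see this abstraction directly by listing the nodes (a standard command, independent of the cloud provider):

kubectl get nodes -o wide

The output shows each node's status, Kubernetes version, internal IP and OS image, whether the node is an Azure VM, a bare-metal server or a Raspberry Pi.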



Persistent Volumes

Because our code can be relocated from one node to another (for example, a node doesn't have enough memory, so the work is rescheduled on a different node with enough memory), data saved on a node is volatile. But there are cases when we want to save our data persistently. In this case, we should use Persistent Volumes. A persistent volume is like an external hard drive; you can plug it in and save your data on it.

Google developed Kubernetes as a platform for stateless applications with persistent data stored elsewhere. As the project matured, many organizations wanted to leverage it for their stateful applications, so the developers added persistent volume management. Much like the early days of virtualization, database servers are not typically the first group of servers to move into this new architecture. That's because the database is the core of many applications and may contain valuable information, so on-premises database systems still largely run in VMs or physical servers.

So, the question is, when should we use Persistent Volumes? To answer that question, first, we should understand the different types of database applications.

We can classify the data management solutions into two classes:

1. Vertically scalable — includes traditional RDBMS solutions such as MySQL, PostgreSQL and SQL Server
2. Horizontally scalable — includes "NoSQL" solutions such as ElasticSearch or Hadoop-based solutions

Vertically scalable solutions like MySQL, Postgres and Microsoft SQL should not go in containers. These database platforms require high I/O, shared disks, block storage, etc., and do not (by design) handle the loss of a node in a cluster gracefully, which often happens in a container-based ecosystem.

For horizontally scalable applications (Elastic, Cassandra, Kafka, etc.), use containers. They can withstand the loss of a node in the database cluster, and the database application can independently rebalance.

Usually, you can and should containerize distributed databases that use redundant storage techniques and can withstand a node's loss in the database cluster (ElasticSearch is a good example).



Kubernetes Software Components

Container

One of the goals of modern software development is to keep applications on the same host or cluster isolated. Virtual machines are one solution to this problem. But virtual machines require their own OS, so they are typically gigabytes in size.

Containers, by contrast, isolate application execution environments from one another but share the underlying OS kernel. So, a container is like a box where we store everything needed to run an application: code, runtime, system tools, system libraries, settings, etc. They're typically measured in megabytes, use far fewer resources than VMs and start up almost immediately.

Pods

A pod is a group of containers. In Kubernetes, the smallest unit of work is a pod. A pod can contain multiple containers, but usually, we use one container per pod because the replication unit in Kubernetes is the pod. If we want to scale each container independently, we add one container per pod.
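For illustration, this is what a minimal Pod definition looks like (a sketch with placeholder names and image; in practice you will rarely create bare Pods, because Deployments manage them for you, as described in the next section):

apiVersion: v1
kind: Pod
metadata:
  name: hello-pod
  labels:
    app: hello
spec:
  containers:
    # a single container per pod, as discussed above
    - name: hello-container
      image: nginx:1.21
      ports:
        - containerPort: 80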



Deployments

The primary role of deployment is to provide declarative updates to both the pod and the ReplicaSet (a set in which the same pod is replicated multiple times). Using the deployment, we can specify how many replicas of the same pod should be running at any time. The deployment is like a manager for the pods; it automatically spins up the number of pods requested, monitors the pods and recreates the pods in case of failure. Deployments are helpful because you don't have to create and manage each pod separately.

Stateful Sets

StatefulSet is a new concept in Kubernetes, and it is a resource used to manage stateful applications. It manages the deployment and scaling of a set of pods and guarantees these pods' ordering and uniqueness. It is similar to deployment; the only difference is that the deployment creates a set of pods with random pod names and the order of the pods is not important, while the StatefulSet creates pods with a unique naming convention and order. So, if you want to create three replicas of a pod called example, the StatefulSet will create pods with the following names: example-0, example-1, example-2. In this case, the most important benefit is that you can rely on the name of the pods.
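To illustrate the naming behavior described above, a minimal StatefulSet could be sketched like this (hypothetical names and image; a real StatefulSet would normally also define a headless Service and volumeClaimTemplates for per-pod storage):

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: example
spec:
  # the Service that governs the network identity of the pods
  serviceName: example
  replicas: 3
  selector:
    matchLabels:
      app: example
  template:
    metadata:
      labels:
        app: example
    spec:
      containers:
        - name: example-container
          image: nginx:1.21

Applying this manifest creates the Pods example-0, example-1 and example-2, in that order.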

DaemonSets

A DaemonSet ensures that the pod runs on all the nodes


of the cluster. If a node is added/removed from a cluster,
DaemonSet automatically adds/deletes the pod. This is useful
for monitoring and logging because you can monitor every
node and don’t have to monitor the cluster manually.



Services

While deployment is responsible for keeping a set of pods running, the service is responsible for enabling network access to a set of pods. Services provide standardized features across the cluster: load balancing, service discovery between applications and zero-downtime application deployments. Each service has a unique IP address and a DNS hostname. Applications that consume a service can be manually configured to use either the IP address or the hostname and the traffic will be load balanced to the correct pods. In the External Traffic section, we will learn more about the service types and how we can communicate between our internal services and the external world.

ConfigMaps

If you want to deploy to multiple environments, like staging, dev and prod, it's a bad practice to bake the configs into the application because of environmental differences. Ideally, you'll want to separate configurations to match the deploy environment. This is where ConfigMap comes into play. ConfigMaps allow you to decouple configuration artifacts from image content to keep containerized applications portable.
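As a small illustration, a ConfigMap carrying environment-specific values and consumed as environment variables could be sketched like this (the name and keys are placeholders, not part of the sample application built later in this book):

apiVersion: v1
kind: ConfigMap
metadata:
  name: backend-config
data:
  LOG_LEVEL: info
  GREETING: Hello from the staging environment

In the Deployment's container spec, the values can then be injected as environment variables with:

  envFrom:
    - configMapRef:
        name: backend-config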



External Traffic

Now that you’ve got the services running in your cluster, how
do you get external traffic into your cluster? There are three
different service types for handling external traffic: ClusterIP,
NodePort and LoadBalancer. The 4th solution is to add another
layer of abstraction, called Ingress Controller.

ClusterIP

ClusterIP is the default service type in Kubernetes and lets you communicate with other services inside your cluster. While ClusterIP is not meant for external access, with a little hack using a proxy, external traffic can hit our service. Don't use this solution in production, but only for debugging. Services declared as ClusterIP should NOT be directly visible from the outside.

NodePort

As we saw in the first part of this chapter, pods are running on nodes. Nodes can be different devices, like laptops or virtual machines (when working in the cloud). Each node has a fixed IP address. By declaring a service as NodePort, the service will expose the node's IP address so that you can access it from the outside. You can use NodePort in production, but for large applications, where you have many services, manually managing all the different IP addresses can be cumbersome.
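A NodePort declaration looks almost the same as a ClusterIP one; only the type changes and, optionally, the node port to expose. A sketch with placeholder names and ports:

apiVersion: v1
kind: Service
metadata:
  name: example-nodeport-service
spec:
  type: NodePort
  selector:
    app: example
  ports:
    - port: 3000
      targetPort: 3000
      # must fall in the cluster's NodePort range (30000-32767 by default)
      nodePort: 30080

The service is then reachable from outside the cluster at <node_ip>:30080.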



LoadBalancer

Declaring a service of type LoadBalancer exposes it externally using a cloud provider's load balancer. How the external load balancer routes traffic to the Service pods depends on the cluster provider. With this solution, you don't have to manage all the IP addresses of every node of the cluster, but you will have one load balancer per service. The downside is that every service has a separate load balancer and you will be billed per load balancer instance.

This solution is good for production, but it can be a little bit expensive. Let's look at a less expensive solution.

Ingress

Ingress is not a service but an API object that manages external access to a cluster's services. It acts as a reverse proxy and single entry point to your cluster that routes the requests to different services. I usually use the NGINX Ingress Controller, which acts as a reverse proxy and also handles SSL termination. The best production-ready solution to expose the ingress is to use a load balancer.

With this solution, you can expose any number of services using a single load balancer, so you can keep your bills as low as possible.



Next Steps

In this chapter, we learned about the basic concepts used in


Kubernetes and its hardware structure. We also discussed the
different software components including Pods, Deployments,
StatefulSets and Services, and saw how to communicate
between services and with the outside world.

In the next chapter, we’ll set up a cluster on Azure and create


an infrastructure with a LoadBalancer, an Ingress Controller
and two Services and use two Deployments to spin up three
Pods per Service.



Chapter 2

Deployments,
Services and
Ingresses Explained
In the first chapter, we learned about the basic concepts
used in Kubernetes, its hardware structure, the different
software components like Pods, Deployments, StatefulSets,
Services, Ingresses and Persistent Volumes and saw how to
communicate between services and with the outside world.

In this chapter, we will:

• Create a NodeJS backend with a MongoDB database
• Write the Dockerfile to containerize our application
• Create the Kubernetes Deployment scripts to spin up the Pods
• Create the Kubernetes Service scripts to define the communication interface between the containers and the outside world
• Deploy an Ingress Controller for request routing
• Write the Kubernetes Ingress scripts to define the communication with the outside world.

Because our code can be relocated from one node to another (for example, a node doesn't have enough memory, so the work will be rescheduled on a different node with enough memory), data saved on a node is volatile (so our MongoDB data will be volatile, too). In the next chapter, we will talk about the problem of data persistence and how to use Kubernetes Persistent Volumes to safely store our persistent data.

In this tutorial, we will use NGINX as an Ingress Controller and Azure Container Registry to store our custom Docker images. All the scripts written in this book can be found in my StupidSimpleKubernetes git repository. If you like it, please leave a star!

NOTE: the scripts are platform agnostic, so you can follow the
tutorial using other types of cloud providers or a local cluster
with K3s. I suggest using K3s because it is very lightweight,
packed in a single binary less than 40MB. What’s more, it’s a
highly available, certified Kubernetes distribution designed for
production workloads in resource-constrained environments.
For more information, you can take a look over its well-written
and easy-to-follow documentation.

I would like to recommend another great article about basic


Kubernetes concepts: Explain By Example: Kubernetes.



Requirements

Before starting this tutorial, please make sure that you have installed Docker. Kubectl will be installed with Docker. (If not, please install it from here).

The Kubectl commands used throughout this tutorial can be found in the Kubectl Cheat Sheet.

Through this tutorial, we will use Visual Studio Code, but this is not mandatory.

Creating a Production-Ready Microservices Architecture

Containerize the app

The first step is to create the Docker image of our NodeJS backend. After creating the image, we will push it to the container registry, where it will be accessible and can be pulled by the Kubernetes service (in this case, Azure Kubernetes Service or AKS).

The Dockerfile for NodeJS:

FROM node:13.10.1

WORKDIR /usr/src/app

COPY package*.json ./
RUN npm install

# Bundle app source
COPY . .

EXPOSE 3000

CMD [ "node", "index.js" ]

In the first line, we need to define from what image we want to build our backend service. In this case, we will use the official node image with version 13.10.1 from Docker Hub.

In line 3 we create a directory to hold the application code inside the image. This will be the working directory for your application.

This image comes with Node.js and NPM already installed, so the next thing we need to do is to install your app dependencies using the npm command.

Note that to install the required dependencies, we don't have to copy the whole directory, only the package.json, which allows us to take advantage of cached Docker layers (more info about efficient Dockerfiles here).



In line 9 we copy our source code into the working directory and on line 11 we expose it on port 3000 (you can choose another port if you want, but make sure to change it in the Kubernetes Service script, too).

Finally, on line 13 we define the command to run the application (inside the Docker container). Note that there should only be one CMD instruction in each Dockerfile. If you include more than one, only the last will take effect.

Now that we have defined the Dockerfile, we will build an image from it using the following Docker command (using the Terminal of the Visual Studio Code or, for example, using the CMD on Windows):

docker build -t node-user-service:dev .

Note the little dot at the end of the Docker command: it means that we are building our image from the current directory, so please make sure that you are in the same folder where the Dockerfile is located (in this case the root folder of the repository).

To run the image locally, we can use the following command:

docker run -p 3000:3000 node-user-service:dev

To push this image to our Azure Container Registry, we have to tag it using the following format <container_registry_login_service>/<image_name>:<tag>, so in our case:

docker build -t stupidsimplekubernetescontainerregistry.azurecr.io/node-user-service:dev .

The last step is to push it to our container registry using the following Docker command:

docker push stupidsimplekubernetescontainerregistry.azurecr.io/node-user-service:dev



Create Pods using Deployment scripts

NodeJs backend

The next step is to define the Kubernetes Deployment script, which automatically manages the Pods for us.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: node-user-service-deployment
spec:
  selector:
    matchLabels:
      app: node-user-service-pod
  replicas: 3
  template:
    metadata:
      labels:
        app: node-user-service-pod
    spec:
      containers:
        - name: node-user-service-container
          image: stupidsimplekubernetescontainerregistry.azurecr.io/node-user-service:dev
          resources:
            limits:
              memory: "256Mi"
              cpu: "500m"
          imagePullPolicy: Always
          ports:
            - containerPort: 3000

The Kubernetes API lets you query and manipulate the state of objects in the Kubernetes Cluster (for example, Pods, Namespaces, ConfigMaps, etc.). The current stable version of this API is 1, as we specified in the first line.

In each Kubernetes .yml script we have to define the Kubernetes resource type (Pods, Deployments, Services, etc.) using the kind keyword. In this case, in line 2 we defined that we would like to use the Deployment resource.

Kubernetes lets you add some metadata to your resources. This way it's easier to identify, filter and in general to refer to your resources.

From line 5 we define the specifications of this resource. In line 8 we specified that this Deployment should be applied only to the resources with the label app:node-user-service-pod and in line 9 we said that we want to create 3 replicas of the same pod.

The template (starting from line 10) defines the Pods. Here we add the label app:node-user-service-pod to each Pod. This way they will be identified by the Deployment. In lines 16 and 17 we define what kind of Docker Container should be run inside the pod. As you can see in line 17, we will use the Docker Image from our Azure Container Registry which was built and pushed in the previous section.


We can also define the resource limits for the Pods, avoiding Pod starvation (when a Pod uses all the resources and other Pods don't get a chance to use them). Furthermore, when you specify the resource request for Containers in a Pod, the scheduler uses this information to decide which node to place the Pod on. When you specify a resource limit for a Container, the kubelet enforces those limits so that the running container is not allowed to use more of that resource than the limit you set. The kubelet also reserves at least the request amount of that system resource specifically for that container to use. Be aware that if you don't have enough hardware resources (like CPU or memory), the pod won't be scheduled -- ever.

The last step is to define the port used for communication. In this case, we used port 3000. This port number should be the same as the port number exposed in the Dockerfile.

MongoDB

The Deployment script for the MongoDB database is quite similar. The only difference is that we have to specify the volume mounts (the folder on the node where the data will be saved).

apiVersion: apps/v1
kind: Deployment
metadata:
  name: user-db-deployment
spec:
  selector:
    matchLabels:
      app: user-db-app
  replicas: 1
  template:
    metadata:
      labels:
        app: user-db-app
    spec:
      containers:
        - name: mongo
          image: mongo:3.6.4
          command:
            - mongod
            - "--bind_ip_all"
            - "--directoryperdb"
          ports:
            - containerPort: 27017
          volumeMounts:
            - name: data
              mountPath: /data/db
          resources:
            limits:
              memory: "256Mi"
              cpu: "500m"
      volumes:
        - name: data
          persistentVolumeClaim:
            claimName: static-persistence-volume-claim-mongo

In this case, we used the official MongoDB image directly from the DockerHub (line 17). The volume mounts are defined in line 24. The last four lines will be explained in the next chapter when we will talk about Kubernetes Persistent Volumes.



Create the Services for Network Access

Now that we have the Pods up and running, we should define the communication between the containers and with the outside world. For this, we need to define a Service. The relation between a Service and a Deployment is 1-to-1, so for each Deployment, we should have a Service. The Deployment manages the lifecycle of the Pods and it is also responsible for monitoring them, while the Service is responsible for enabling network access to a set of Pods (as we saw in Chapter One).

apiVersion: v1
kind: Service
metadata:
  name: node-user-service
spec:
  type: ClusterIP
  selector:
    app: node-user-service-pod
  ports:
    - port: 3000
      targetPort: 3000

The important part of this .yml script is the selector, which defines how to identify the Pods (created by the Deployment) to which we want to refer from this Service. As you can see in line 8, the selector is app:node-user-service-pod, because the Pods from the previously defined Deployment are labeled like this. Another important thing is to define the mapping between the container port and the Service port. In this case, the incoming request will use the 3000 port, defined on line 10, and it will be routed to the port defined in line 11.

The Kubernetes Service script for the MongoDB pods is very similar. We just have to update the selector and the ports.

apiVersion: v1
kind: Service
metadata:
  name: user-db-service
spec:
  clusterIP: None
  selector:
    app: user-db-app
  ports:
    - port: 27017
      targetPort: 27017



Configure the External Traffic

To communicate with the outside world, we need to define an Ingress Controller and specify the routing rules using an Ingress Kubernetes Resource.

To configure an NGINX Ingress Controller we will use the script that can be found here.

This is a generic script that can be applied without modifications (explaining the NGINX Ingress Controller is out of scope for this book).

The next step is to define the Load Balancer, which will be used to route external traffic using a public IP address (the cloud provider provides the load balancer).

kind: Service
apiVersion: v1
metadata:
  name: ingress-nginx
  namespace: ingress-nginx
  labels:
    app.kubernetes.io/name: ingress-nginx
    app.kubernetes.io/part-of: ingress-nginx
spec:
  externalTrafficPolicy: Local
  type: LoadBalancer
  selector:
    app.kubernetes.io/name: ingress-nginx
    app.kubernetes.io/part-of: ingress-nginx
  ports:
    - name: http
      port: 80
      targetPort: http
    - name: https
      port: 443
      targetPort: https



Now that we have the Ingress Controller and the Load Balancer up and running, we can define the Ingress Kubernetes Resource for specifying the routing rules.

apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: node-user-service-ingress
  annotations:
    kubernetes.io/ingress.class: "nginx"
    nginx.ingress.kubernetes.io/rewrite-target: /$2
spec:
  rules:
    - host: stupid-simple-kubernetes.eastus2.cloudapp.azure.com
      http:
        paths:
          - backend:
              serviceName: node-user-service
              servicePort: 3000
            path: /user-api(/|$)(.*)
          # - backend:
          #     serviceName: nestjs-i-consultant-service
          #     servicePort: 3001
          #   path: /i-consultant-api(/|$)(.*)

In line 6 we define the Ingress Controller type (it's a Kubernetes predefined value; Kubernetes as a project currently supports and maintains GCE and nginx controllers).

In line 7 we define the rewrite target rules (more information here) and in line 10 we define the hostname.

For each service that should be accessible from the outside world, we should add an entry in the paths list (starting from line 13). In this example, we added only one entry for the NodeJS user service backend, which will be accessible using port 3000. The /user-api uniquely identifies our service, so any request that starts with stupid-simple-kubernetes.eastus2.cloudapp.azure.com/user-api will be routed to this NodeJS backend. If you want to add other services, then you have to update this script (as an example see the commented out code).

Apply the .yml scripts

To apply these scripts, we will use kubectl. The kubectl command to apply files is the following:

kubectl apply -f <file_name>

So in our case, if you are in the root folder of the StupidSimpleKubernetes repository, you will write the following commands:

kubectl apply -f .\manifest\kubernetes\deployment.yml
kubectl apply -f .\manifest\kubernetes\service.yml
kubectl apply -f .\manifest\kubernetes\ingress.yml
kubectl apply -f .\manifest\ingress-controller\nginx-ingress-controller-deployment.yml
kubectl apply -f .\manifest\ingress-controller\ngnix-load-balancer-setup.yml

After applying these scripts, we will have everything in place, so we can call our backend from the outside world (for example by using Postman).



Conclusion
In this tutorial, we learned how to create different kinds of
resources in Kubernetes, like Pods, Deployments, Services,
Ingresses and Ingress Controller. We created a NodeJS
backend with a MongoDB database and we containerized
and deployed the NodeJS and MongoDB containers using
replication of 3 pods.

In the next chapter, we will approach the problem of saving


data persistently and we will learn about Persistent Volumes
in Kubernetes.



Chapter 3

Persistent
Volumes Explained
Welcome back to our series, where we introduce you to the
basic concepts of Kubernetes. In the first chapter, we provided
a brief introduction to Persistent Volumes. Here, we’ll dig into
this topic: we will learn how to set up data persistency and will
write Kubernetes scripts to connect our Pods to a Persistent
Volume. In this example, we will use Azure File Storage to store
the data from our MongoDB database, but you can use any
kind of volume to achieve the same results (such as Azure Disk,
GCE Persistent Disk, AWS Elastic Block Store, etc.)

NOTE: the scripts provided are platform agnostic, so you can


follow the tutorial using other types of cloud providers or using
a local cluster with K3s. I suggest using K3s because it is very
lightweight, packed in a single binary with a size less than 40MB.
It is also a highly available, certified Kubernetes distribution
designed for production workloads in resource-constrained
environments. For more information, take a look at its
well-written and easy-to-follow documentation.



Requirements

Before starting this tutorial, please make sure that you have installed Docker. Kubectl will install with Docker (if not, please install it from here).

The Kubectl commands used throughout this tutorial can be found in the Kubectl Cheat Sheet.

Through this tutorial, we will use Visual Studio Code, but this is not mandatory.

What Problem Does Kubernetes Volume Solve?

Remember that we have a Node (an actual hardware device or a virtual machine) and inside the Nodes, we have a Pod (or multiple Pods) and inside the Pod, we have the Container. Pods are ephemeral, so they can come and go very often (they can be deleted, rescheduled, etc.). In this case, if you have data that you must keep even if the Pod goes down, you have to move it outside the Pod. This way it can exist independently of any Pod. This external place is called Volume and it is an abstraction of a storage system. Using the Volume, you can persist state across multiple Pods.



When to Use Persistent Volumes

When containers became popular, they were designed to support stateless workloads with persistent data stored elsewhere. Since then, a lot of effort has been made to support stateful applications in the container ecosystem.

Every project needs some kind of data persistency, so usually, you need a database to store the data. But in a clean design, you don't want to depend on concrete implementations; you want to write an application as reusable and platform independent as possible.

There has always been a need to hide the details of storage implementation from the applications. But now, in the era of cloud-native applications, cloud providers create environments where applications or users who want to access the data need to integrate with a specific storage system. For example, many applications are directly using specific storage systems like Amazon S3, Azure File or Blob storage, etc., which creates an unhealthy dependency. Kubernetes is trying to change this by creating an abstraction called Persistent Volume, which allows cloud-native applications to connect to a wide variety of cloud storage systems without having to create an explicit dependency with those systems. This can make the consumption of cloud storage much more seamless and eliminate integration costs. It can also make it much easier to migrate between clouds and adopt multi-cloud strategies.

Even if sometimes, because of material constraints like money, time or manpower (which are closely related), you have to make some compromises and directly couple your app with a specific platform or provider, you should try to avoid as many direct dependencies as possible. One way of decoupling your application from the actual database implementation (there are other solutions, but those solutions require more effort) is by using containers (and Persistent Volumes to prevent data loss). This way, your app will rely on an abstraction instead of a specific implementation.

Now the real question is, should we always use a containerized database with Persistent Volumes, or what are the storage system types which should NOT be used in containers?

There is no golden rule of when you should and shouldn't use Persistent Volumes, but as a starting point, you should keep in mind scalability and the handling of the loss of a node in the cluster.



Based on scalability, we can have two types of storage systems:

1. Vertically scalable — includes traditional RDBMS solutions such as MySQL, PostgreSQL and SQL Server
2. Horizontally scalable — includes "NoSQL" solutions such as ElasticSearch or Hadoop-based solutions

Vertically scalable solutions like MySQL, Postgres, Microsoft SQL, etc. should NOT go in containers. These database platforms require high I/O, shared disks, block storage, etc., and were not designed to handle the loss of a node in a cluster gracefully, which often happens in a container-based ecosystem.

For horizontally scalable applications (Elastic, Cassandra, Kafka, etc.), you should use containers, because they can withstand the loss of a node in the database cluster and the database application can independently re-balance.

Usually, you can and should containerize distributed databases that use redundant storage techniques and withstand the loss of a node in the database cluster (ElasticSearch is a really good example).

Types of Kubernetes Volumes

We can categorize the Kubernetes Volumes based on their lifecycle and the way they are provisioned.

Considering the lifecycle of the volumes, we can have:

1. Ephemeral Volumes, which are tightly coupled with the lifetime of the Node (for example emptyDir or hostPath) and are deleted if the Node goes down (a minimal emptyDir example is sketched below).
2. Persistent Volumes, which are meant for long-term storage and are independent of the Pods or Nodes lifecycle. These can be cloud volumes (like gcePersistentDisk, awsElasticBlockStore, azureFile or azureDisk), NFS (Network File Systems) or Persistent Volume Claims (a series of abstractions to connect to the underlying cloud-provided storage volumes).

Based on the way the volumes are provisioned, we can have:

1. Direct access
2. Static provisioning
3. Dynamic provisioning
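As a quick illustration of an Ephemeral Volume, the following sketch mounts an emptyDir volume into a Pod (placeholder names; whatever is written to /cache disappears when the Pod is removed):

apiVersion: v1
kind: Pod
metadata:
  name: cache-pod
spec:
  containers:
    - name: app
      image: nginx:1.21
      volumeMounts:
        - name: cache-volume
          mountPath: /cache
  volumes:
    - name: cache-volume
      # emptyDir lives only as long as the Pod and is backed by the node's storage
      emptyDir: {}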



Direct Access Persistent Volumes

In this case, the pod will be directly coupled with the volume, so it will know the storage system (for example, the Pod will be coupled with the Azure Storage Account). This solution is not cloud-agnostic and it depends on a concrete implementation and not an abstraction. So if possible, please avoid this solution. The only advantage is that it is easy and fast. Create the Secret in the Pod and specify the Secret and the exact storage type that should be used.

The script for creating a Secret is as follows:

apiVersion: v1
kind: Secret
metadata:
  name: static-persistence-secret
type: Opaque
data:
  azurestorageaccountname: "base64StorageAccountName"
  azurestorageaccountkey: "base64StorageAccountKey"

As in any Kubernetes script, on line 2 we specify the type of the resource -- in this case, Secret. On line 4, we give it a name (we called it static because it is manually created by the Admin and not automatically generated). The Opaque type, from Kubernetes' point of view, means that the content (data) of this Secret is unstructured (it can contain arbitrary key-value pairs). To learn more about Kubernetes Secrets, see the Secrets design document and Configure Kubernetes Secrets.

In the data section, we have to specify the account name (in Azure, it is the name of the Storage Account) and the access key (in Azure, select the Storage Account under Settings, Access key). Don't forget that both should be encoded using Base64.
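For example, on Linux or macOS you can produce the Base64-encoded values with the base64 tool (the account name and key below are placeholders):

# -n prevents a trailing newline from being encoded along with the value
echo -n "myStorageAccountName" | base64
echo -n "myStorageAccountKey" | base64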



The next step is to modify our Deployment script to use the Volume (in this case the volume is the Azure File Storage).

apiVersion: apps/v1
kind: Deployment
metadata:
  name: user-db-deployment
spec:
  selector:
    matchLabels:
      app: user-db-app
  replicas: 1
  template:
    metadata:
      labels:
        app: user-db-app
    spec:
      containers:
        - name: mongo
          image: mongo:3.6.4
          command:
            - mongod
            - "--bind_ip_all"
            - "--directoryperdb"
          ports:
            - containerPort: 27017
          volumeMounts:
            - name: data
              mountPath: /data/db
          resources:
            limits:
              memory: "256Mi"
              cpu: "500m"
      volumes:
        - name: data
          azureFile:
            secretName: static-persistence-secret
            shareName: user-mongo-db
            readOnly: false

As you can see, the only difference is that from line 32 we specify the used volume, give it a name and specify the exact details of the underlying storage system. The secretName must be the name of the previously created Secret.

Kubernetes Storage Class

To understand the Static or Dynamic provisioning, first we have to understand the Kubernetes Storage Class.

With StorageClass, administrators can offer Profiles or "classes" regarding the available storage. Different classes might map to quality-of-service levels, or backup policies or arbitrary policies determined by the cluster administrators.

For example, you could have a profile to store data on an HDD named slow-storage or a profile to store data on an SSD named fast-storage. The kind of storage is determined by the Provisioner. For Azure, there are two kinds of provisioners: AzureFile and AzureDisk (the difference is that AzureFile can be used with ReadWriteMany access mode, while AzureDisk supports only ReadWriteOnce access, which can be a disadvantage when you want to use multiple pods simultaneously). You can learn more about the different types of StorageClasses here.



The script for our StorageClass:

kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: azurefilestorage
provisioner: kubernetes.io/azure-file
parameters:
  storageAccount: storageaccountname
reclaimPolicy: Retain
allowVolumeExpansion: true

Kubernetes predefines the value for the provisioner property (see Kubernetes Storage Classes). The Retain reclaim policy means that after we delete the PVC and PV, the actual storage medium is NOT purged. We can set it to Delete and with this setting, as soon as a PVC is deleted, it also triggers the removal of the corresponding PV along with the actual storage medium (here the actual storage is the Azure File Storage).

Persistent Volume and Persistent Volume Claim

Kubernetes has a matching primitive for each of the traditional storage operational activities (provisioning/configuring/attaching). Persistent Volume is Provisioning, Storage Class is Configuring and Persistent Volume Claim is Attaching.

From the original documentation:

A PersistentVolume (PV) is a piece of storage in the cluster that has been provisioned by an administrator or dynamically provisioned using Storage Classes.


A PersistentVolumeClaim (PVC) is a request for storage by a user. It is similar to a Pod. Pods consume node resources and PVCs consume PV resources. Pods can request specific levels of resources (CPU and memory). Claims can request specific size and access modes (e.g., they can be mounted once read/write or many times read-only).

This means that the Admin will create the Persistent Volume to specify the type of storage that can be used by the Pods, the size of the storage, and the access mode. The Developer will create a Persistent Volume Claim asking for a piece of volume, access permission and the type of storage. This way there is a clear separation between "Dev" and "Ops." Devs are responsible for asking for the necessary volume (PVC) and Ops are responsible for preparing and provisioning the requested volume (PV).

The difference between Static and Dynamic provisioning is that if there isn't a PersistentVolume and a Secret created manually by the Admin, Kubernetes will try to automatically create these resources.

Dynamic Provisioning

In this case, there is NO PersistentVolume and Secret created manually, so Kubernetes will try to generate them. The StorageClass is mandatory and we will use the one created earlier.

The script for the PersistentVolumeClaim can be found below:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: persistent-volume-claim-mongo
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 1Gi
  storageClassName: azurefilestorage



And our updated Deployment script:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: user-db-deployment
spec:
  selector:
    matchLabels:
      app: user-db-app
  replicas: 1
  template:
    metadata:
      labels:
        app: user-db-app
    spec:
      containers:
        - name: mongo
          image: mongo:3.6.4
          command:
            - mongod
            - "--bind_ip_all"
            - "--directoryperdb"
          ports:
            - containerPort: 27017
          volumeMounts:
            - name: data
              mountPath: /data/db
          resources:
            limits:
              memory: "256Mi"
              cpu: "500m"
      volumes:
        - name: data
          persistentVolumeClaim:
            claimName: persistent-volume-claim-mongo

As you can see, in line 34 we referenced the previously created PVC by name. In this case, we didn't create a PersistentVolume or a Secret for it, so it will be created automatically.

The most important advantage of this approach is that you don't have to create the PV and the Secret manually and the Deployment is cloud agnostic. The underlying detail of the storage is not present in the Pod's specs. But there are also some disadvantages: you cannot configure the Storage Account or the File Share because they are auto-generated and you cannot reuse the PV or the Secret — they will be regenerated for each new Claim.

Static Provisioning

The only difference between Static and Dynamic provisioning is that we manually create the PersistentVolume and the Secret in Static Provisioning. This way we have full control over the resource that will be created in our cluster.



The PersistentVolume script is below:

apiVersion: v1
kind: PersistentVolume
metadata:
  name: static-persistent-volume-mongo
  labels:
    storage: azurefile
spec:
  capacity:
    storage: 1Gi
  accessModes:
    - ReadWriteMany
  storageClassName: azurefilestorage
  azureFile:
    secretName: static-persistence-secret
    shareName: user-mongo-db
    readOnly: false

It is important that in line 12 we reference the StorageClass by name. Also, in line 14 we reference the Secret, which is used to access the underlying storage system.

I recommend this solution, even if it requires more work, because it is cloud-agnostic. It also lets you apply separation of concerns regarding roles (Cluster Administrator vs. Developers) and gives you control of naming and resource creation.

Conclusion

In this tutorial, we learned how to persist data and state using Volumes. We presented three different ways of setting up your system, Direct Access, Dynamic Provisioning and Static Provisioning, and discussed the advantages and disadvantages of each.

In the next chapter, we will talk about CI/CD pipelines to automate the deployment of Microservices.


Chapter 4

Create an
Azure Infrastructure
for Microservices
In the first chapter, we learned about the basic concepts used
in Kubernetes and its hardware structure. We talked about the
different software components, including Pods, Deployments,
StatefulSets and Services and how to communicate between
services and with the outside world.

In this chapter, we’re getting practical. We will create all the


necessary configuration files to deploy multiple microservices
in different languages using MongoDB as data storage. We
will also learn about Azure Kubernetes Service (AKS) and will
present the infrastructure used to deploy our services.

The code used in this chapter can be found in my


StupidSimpleKubernetes-AKS git repository. If you like it,
please leave a star!

NOTE: the scripts provided are platform agnostic, so you


can follow the tutorial using other types of cloud providers
or a local cluster with K3s. I suggest using K3s because it is
very lightweight, packed in a single binary with a size less
than 40MB. Furthermore, it is a highly available, certified
Kubernetes distribution designed for production workloads in
resource-constrained environments. For more information,
you can take a look over its well-written and easy-to-follow
documentation.



Requirements
Before starting this tutorial, please make sure that you have
installed Docker and Azure CLI. Kubectl will be installed with
Docker (if not, please install it from here).

You will also need an Azure Account. Azure offers a 30-day


free trial that gives you $200 in credit, which will be more than
enough for our tutorial.

Through this tutorial, we will use Visual Studio Code, but this is
not mandatory.



Creating a Production Ready Azure Infrastructure for Microservices

To have a fast setup, I've created an ARM Template, which will automatically spin up all the Azure resources needed for this tutorial. You can read more about ARM Templates here.

We will run all the scripts in the VS Code Terminal.

The first step is to log in to your Azure account from the VS Code Terminal. For this run az login. This will open a new tab in your default browser, where you can enter your credentials.

For the Azure Kubernetes Service, we need to set up a Service Principal. For this, I've created a PowerShell script called create-service-principal.ps1. Just run this script in the VS Code Terminal or PowerShell.

After running the code, it will return a JSON response with the following structure:

Based on this information, you will have to update the ARM Template to use your Service Principal. For this, please copy the appId from the returned JSON to the clientId in the ARM Template. Also, copy the password and paste it into the ARM Template's secret field.

In the next step, you should create a new Resource Group called "StupidSimpleKubernetes" in your Azure Portal and import the ARM template to it.


To import the ARM template, in the Azure Portal, click on the Create a resource button, search for Template Deployment and select Build your own template in the editor. Copy and paste the template code from our Git repository to the Azure template editor. Now you should see something like in the following picture:

Hit the save button, select the StupidSimpleKubernetes resource group, and hit Purchase. This will take a while and it will create all the necessary Azure resources for a production-ready microservices infrastructure.



You can also apply the ARM Template using the Azure CLI, by running the following command in the root folder of our git repository:

az deployment group create --name testtemplate --resource-group StupidSimpleKubernetes --template-file .\manifest\arm-templates\template.json

After the ARM Template Deployment is done, we should have the following Azure resources:



The next step is to authorize our Kubernetes service to pull images from the Container Registry. For this, select the container registry,
select the Access Control (IAM) menu option from the left menu, click on the Add button and select Role Assignment.

In the right menu, search for the correct Service Principal (use the Z from the returned JSON object — see the Service Principal image
above).

After this step, our Kubernetes Service will be able to pull the right Docker images from the Azure Container Registry. We will store all
our custom Docker images in this Azure Container Registry.

We are almost ready! In the last step, we will set up the NGINX Ingress Controller and add a RecordSet to our DNS. This assigns a human-readable hostname to our services instead of using the IP:PORT of the Load Balancer.



To set up the NGINX Ingress Controller, run the following two commands in the root folder of the repository, one by one:

kubectl apply -f .\manifest\ingress-controller\nginx-ingress-controller-deployment.yml
kubectl apply -f .\manifest\ingress-controller\ngnix-load-balancer-setup.yml

This will create a new public IP, which you can see in the Azure Portal:



If we take a look over the details of this new Public IP resource, we can see that it does NOT have a DNS name.



To assign a human-readable DNS name to this Public IP, please run the following PowerShell script (just replace the IP address with the correct IP address from your Public IP resource):

This assigns a DNS name to the public IP of your NGINX Ingress Controller.

Now we are ready to deploy our Microservices to the Azure Kubernetes Cluster.

Conclusion

In this tutorial, we learned how to create a production-ready Azure infrastructure to deploy our microservices. We used an ARM Template to automatically set up the Azure Kubernetes Service, the Azure Container Registry, the Azure Load Balancer, Azure File Storage (which will be used for persistent data storage) and to add a DNS Zone. We applied some configuration files to authorize Kubernetes to pull Docker images from the Azure Container Registry, configure the NGINX Ingress Controller and set up a DNS Hostname for our Ingress Controller.



Chapter 5

Stupid Simple
Scalability
This chapter will define and explain software scalability in Kubernetes and look at different scalability types. Then we will present three autoscaling methods in Kubernetes: HPA (Horizontal Pod Autoscaler), VPA (Vertical Pod Autoscaler), and CA (Cluster Autoscaler).

Scalability Explained

To understand the different concepts in software scalability, let's take a real-life example.

Suppose you've just opened a coffee shop, you bought a simple coffee machine, which can make three coffees per minute, and you hired an employee who serves the clients.

At first, you have a few clients: everything is going well, and all the people are happy about the coffee and the service because they don't have to wait too long to get their delicious coffee. As time goes by, your coffee shop becomes famous in town, and more and more people are buying their coffee from you. But there is a problem. You have only one employee and too many clients, so the waiting time gets considerably higher and people are starting to complain about your service. The coffee machine could make three coffees per minute, but the employee can handle only one client per minute. You decide to hire two more employees. With this, you've solved the problem for a while.

After some time, near the coffee shop, the city opens a fun park, so more and more tourists are coming and drinking their coffee in your famous coffee shop. So you decide to hire more people, but even with more employees, the waiting time is almost the same. The problem is that your coffee machine can make three coffees per minute, so now your employees are waiting for the coffee machine. The solution is to buy a new coffee machine. Another problem is that the clients tend to buy coffee from employees that they already know. As a result, some employees have a lot of work, and others are idle. This is when you decide you need to hire another employee who will greet the clients and redirect them to the employee who is free or has fewer orders to prepare.

Analyzing your income and expenses, you realize that you have many more clients during the summer than in the winter, so you decide to hire seasonal workers. Now you have three employees working full-time and the other employees are working for you only during the summer. This way, you can increase your income and decrease expenses. Furthermore, you can rent some coffee machines during the summer and give them back during the winter to minimize the costs. This way, you won't have idle coffee machines.

To translate this short story to software scalability in


Kubernetes, we can replace the coffee machines with nodes,
the employees with pods, the coffee shop is the cluster, and
the employee who greets the clients and redirects them is the
load balancer. Adding more employees means Horizontal Pod
Scaling; adding more coffee machines means Cluster Scaling.
Seasonal workers and renting coffee machines only for the
summer season means Autoscaling because when the load
is higher, we have more pods to serve the clients and more
nodes to be used by pods. When the load drops (during the
winter), we have fewer expenses. In this analogy, Vertical Pod
Scaling would be hiring a more experienced employee who
can serve more clients in the same amount of time (high
performing employee). The trigger for the Autoscaling would
be the season; we scale up during the summer and scale
down during the winter.



Horizontal Pod Autoscaling (HPA)

Horizontal scaling or scaling out means that the number of running pods dynamically increases or decreases as your application usage changes. To know exactly when to increase or decrease the number of replicas, Kubernetes uses triggers based on the observed metrics (average CPU utilization, average memory utilization, or custom metrics defined by the user). HPA, a Kubernetes resource, runs in a loop (the loop duration can be configured; by default, it is set to 15 seconds) and fetches the resource metrics from the resource metrics API for each pod. Using these metrics, it calculates the actual resource utilization values based on the mean values of all the pods and compares them to the metrics defined in the HPA definition. To calculate the desired number of replicas, HPA uses the following formula:

desiredReplicas = ceil[currentReplicas * (currentMetricValue / desiredMetricValue)]

To understand this formula, let's take the following configuration:

spec:
  containers:
    - name: php-apache
      image: k8s.gcr.io/hpa-example
      ports:
        - containerPort: 80
      resources:
        limits:
          cpu: 500m
        requests:
          cpu: 200m



The unit suffix m stands for "thousandth of a core," so this resources object specifies that the container process needs 200/1000 of a core (20%) and is allowed to use, at most, 500/1000 of a core (50 percent).

With the following command, we can create an HPA that maintains between 1 and 10 replicas. It will increase or decrease the number of replicas to maintain an average CPU usage of 50 percent, or in this concrete example, 100 milli-cores.

kubectl autoscale deployment deployment_name --cpu-percent=50 --min=1 --max=10

Suppose that the CPU usage has increased to 210 percent; this means that we will have nrReplicas = ceil[ 1 * ( 210 / 50 )] = ceil[4.2] = 5 replicas.

Now the CPU usage drops to 25 percent when having 5 replicas, so the HPA will decrease the number of replicas to nrReplicas = ceil[ 5 * ( 25 / 50 )] = ceil[2.5] = 3 replicas.

For more examples, read Autoscaling in Kubernetes using HPA and VPA or HPA Walkthrough.

When configuring HPA, make sure that:

1. All pods have resource requests and limits configured - this will be taken into consideration when HPA takes scaling decisions
2. Use custom metrics or observed metrics - external metrics can be a security risk because they can provide access to a large number of metrics
3. Use HPA together with CA whenever possible
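The kubectl autoscale command shown above is the imperative way to create the autoscaler. Roughly the same object can also be defined declaratively, which is easier to keep in version control. A minimal sketch (the target Deployment name is the placeholder from the command above):

apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: deployment-name-hpa
spec:
  # the workload this autoscaler watches and scales
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: deployment_name
  minReplicas: 1
  maxReplicas: 10
  # target average CPU utilization, as a percentage of the CPU request
  targetCPUUtilizationPercentage: 50

Applying this file with kubectl apply -f has the same effect as the imperative command.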



Vertical Pod Autoscaling (VPA)

VPA recommends optimized CPU and memory requests/limits values (and automatically updates them for you so that the cluster resources are efficiently used). VPA won't add more replicas of a Pod, but it increases the memory or CPU limits. This is useful when adding more replicas won't help your solution. For example, sometimes you can't scale a database (read Chapter Three, Persistent Volumes Explained) just by adding more Pods. Still, you can make the database handle more connections by increasing the memory or CPU. You can use the VPA when your application serves heavyweight requests, which require higher resources.

HPA can be useful when, for example, your application serves a large number of lightweight (i.e., low resource-consuming) requests. In that case, scaling the number of replicas can distribute the workload on each pod. The VPA, on the other hand, can be useful when your application serves heavyweight requests, which require higher resources.

HPA and VPA are incompatible. Do not use both together for the same set of pods. HPA uses the resource request and limits to trigger scaling, and in the meantime, VPA modifies those limits, so it will be a mess unless you configure the HPA to use either custom or external metrics. Read more about VPA and HPA here.
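For completeness, a VPA object is defined through a CustomResourceDefinition that ships with the Kubernetes Vertical Pod Autoscaler project, so the VPA components must be installed in the cluster before it can be used. A minimal sketch, reusing the MongoDB Deployment from the earlier chapters as the target (purely illustrative):

apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: user-db-vpa
spec:
  # the workload whose resource requests/limits VPA should manage
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: user-db-deployment
  updatePolicy:
    # "Auto" lets VPA apply its recommendations; "Off" only reports them
    updateMode: "Auto"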



Cluster Autoscaling (CA)

While HPA scales the number of Pods, the CA changes the number of nodes. When your cluster runs low on resources, the CA provisions a new compute unit (a physical or virtual machine) and adds it to the cluster. If there are too many empty nodes, the CA will remove them to reduce costs.
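On managed Kubernetes offerings, the Cluster Autoscaler is usually enabled through the provider's tooling rather than installed by hand. As a rough sketch, on Azure AKS (the cloud we used earlier in this book) it could be enabled like this, with the resource group and cluster name as placeholders:

az aks update \
  --resource-group my-resource-group \
  --name my-aks-cluster \
  --enable-cluster-autoscaler \
  --min-count 1 \
  --max-count 5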

Learn more about Cluster Autoscaling in Architecting Kubernetes Clusters—Choosing the Best Autoscaling Strategy.

Conclusion

In the first part of this chapter, we provided a real-life example to explain the different concepts used in software scalability. Then we defined and presented the three scalability methods provided by Kubernetes: HPA (Horizontal Pod Autoscaler), VPA (Vertical Pod Autoscaler) and CA (Cluster Autoscaler).


Chapter 6

Stupid Simple
Service Mesh -
What, When, Why
Recently, microservices-based applications have become very popular, and with the rise of microservices, the concept of a Service Mesh has become a hot topic as well. Unfortunately, there are only a few articles about this concept and most of them are hard to digest.

In this chapter, we will try to demystify the concept of Service Mesh using "Stupid Simple" explanations, diagrams and examples to make this concept more transparent and accessible for everyone. First, we will talk about the basic building blocks of a Service Mesh and we will implement a sample application to have a practical example of each theoretical concept. In the next chapter, based on this sample app, we will touch on more advanced topics, like Service Mesh in Kubernetes, and we will talk about more advanced Service Mesh implementations like Istio, Linkerd, etc.

To understand the concept of Service Mesh, the first step is to understand what problems it solves and how it solves them.

Software architecture has evolved a lot in a short time, from a classical monolithic architecture to microservices. Although many praise the microservice architecture as the holy grail of software development, it introduces some serious challenges.

For one, a microservices-based architecture means that we have a distributed system. Every distributed system has challenges such as transparency, security, scalability, troubleshooting and identifying the root cause of issues. In a monolithic system, we can find the root cause of a failure by tracing. But in a microservice-based system, each service can be written in a different language, so tracing is no trivial task. Another challenge is service-to-service communication. Instead of focusing on business logic, developers need to take care of service discovery, handle connection errors, detect latency, implement retry logic, etc. Applying SOLID principles on the architecture level means that these kinds of network problems should be abstracted away and not mixed with the business logic. This is why we need a Service Mesh.


Ingress Controller vs. API Gateway vs. Service Mesh

As I mentioned above, we need to apply SOLID principles on an architectural level. For this, it is important to set the boundaries between Ingress Controller, API Gateway and Service Mesh and understand each one's role and responsibility.

On a stupid simple and oversimplified level, these are the responsibilities of each concept:

1. Ingress Controller: allows a single IP port to access all services from the cluster, so its main responsibilities are path mapping, routing and simple load balancing, like a reverse proxy.

2. API Gateway: aggregates and abstracts away APIs; other responsibilities are rate limiting, authentication, security, tracing, etc. In a microservices-based application, you need a way to distribute the requests to different services, gather the responses from multiple/all microservices and then prepare the final response to be sent to the caller. This is what an API Gateway is meant to do. It is responsible for client-to-service communication, north-south traffic.

3. Service Mesh: responsible for service-to-service communication, east-west traffic. We'll dig more into the concept of Service Mesh in the next section.


Service Mesh and API Gateway have overlapping functionalities, such as rate limiting, security, service discovery, tracing, etc., but they work on different levels and solve different problems. Service Mesh is responsible for the flow of requests between services. API Gateway is responsible for the flow of requests between the client and the services, aggregating multiple services and creating and sending the final response to the client.

The main responsibility of an API gateway is to accept traffic from outside your network and distribute it internally, while the main responsibility of a service mesh is to route and manage traffic within your network. They are complementary concepts, and a well-defined microservices-based system should combine them to ensure application uptime and resiliency while ensuring that your applications are easily consumable.

What does a Service Mesh Solve?

As an oversimplified and stupid simple definition, a Service Mesh is an abstraction layer that hides away and separates networking-related logic from business logic, so developers can focus only on implementing business logic. We implement this abstraction using a proxy, which sits in front of the service and takes care of all the network-related problems. This allows the service to focus on what is really important: the business logic.


In a microservice-based architecture, we have multiple services and each service has a proxy. Together, these proxies are called a Service Mesh.

As best practices suggest, the proxy and the service should be in separate containers, so each container has a single responsibility. In the world of Kubernetes, the container of the proxy is implemented as a sidecar. This means that each service has a sidecar containing the proxy. A single Pod will contain two containers: the service and the sidecar. Another implementation is to use one proxy for multiple pods. In this case, the proxy can be implemented as a DaemonSet. The most common solution is using sidecars. Personally, I prefer sidecars over DaemonSets, because they keep the logic of the proxy as simple as possible.

There are multiple Service Mesh solutions, including Istio, Linkerd, Consul, Kong and Cilium. Let's focus on the basics and understand the concept of Service Mesh, starting with Envoy. This is a high-performance proxy and not a complete framework or solution for Service Meshes (in this tutorial, we will build our own Service Mesh solution). Some of the Service Mesh solutions use Envoy in the background (like Istio), so before starting with these higher-level solutions, it's a good idea to understand the low-level functioning.

Understanding Envoy

Ingress and Egress

Simple definitions:

• Any traffic sent to the server (service) is called ingress.
• Any traffic sent from the server (service) is called egress.

The ingress and the egress rules should be added to the configuration of the Envoy proxy, so the sidecar will take care of these. This means that any traffic to the service will first go to the Envoy sidecar. Then the Envoy proxy redirects the traffic to the real service. Vice versa, any traffic from this service will first go to the Envoy proxy, which resolves the destination service using Service Discovery. By intercepting the inbound and outbound traffic, Envoy can implement service discovery, circuit breaking, rate limiting, etc.


The Structure of an Envoy Proxy Configuration File

Every Envoy configuration file has the following components:

1. Listeners: where we configure the IP and the port number that the Envoy proxy listens to

2. Routes: the received request will be routed to a cluster


based on rules. For example, we can have path matching
rules and prefix rewrite rules to select the service that
should handle a request for a specific path/subdomain.
Actually, the route is just another type of filter, which is
mandatory. Otherwise, the proxy doesn’t know where to
route our request.

3. Filters: Filters can be chained and are used to enforce


different rules, such as rate-limiting, route mutation,
manipulation of the requests, etc.

4. Clusters: act as a manager for a group of logically


similar services (the cluster has similar responsibility as
a service in Kubernetes; it defines the way a service can
be accessed), and acts as a load balancer between the
services.

5. Service/Host: the concrete service that handles and


responds to the request



Here is an example of an Envoy configuration file:

---
admin:
  access_log_path: "/tmp/admin_access.log"
  address:
    socket_address:
      address: "127.0.0.1"
      port_value: 9901
static_resources:
  listeners:
    -
      name: "http_listener"
      address:
        socket_address:
          address: "0.0.0.0"
          port_value: 80
      filter_chains:
        filters:
          -
            name: "envoy.http_connection_manager"
            config:
              stat_prefix: "ingress"
              codec_type: "AUTO"
              generate_request_id: true
              route_config:
                name: "local_route"
                virtual_hosts:
                  -
                    name: "http-route"
                    domains:
                      - "*"
                    routes:
                      -
                        match:
                          prefix: "/nestjs"
                        route:
                          prefix_rewrite: "/"
                          cluster: "nestjs"
                      -
                        match:
                          prefix: "/nodejs"
                        route:
                          prefix_rewrite: "/"
                          cluster: "nodejs"
                      -
                        match:
                          path: "/"
                        route:
                          cluster: "base"
              http_filters:
                -
                  name: "envoy.router"
                  config: {}
  clusters:
    -
      name: "base"
      connect_timeout: "0.25s"
      type: "strict_dns"
      lb_policy: "ROUND_ROBIN"
      hosts:
        -
          socket_address:
            address: "service_1_envoy"
            port_value: 8786
        -
          socket_address:
            address: "service_2_envoy"
            port_value: 8789
    -
      name: "nodejs"
      connect_timeout: "0.25s"
      type: "strict_dns"
      lb_policy: "ROUND_ROBIN"
      hosts:
        -
          socket_address:
            address: "service_4_envoy"
            port_value: 8792
    -
      name: "nestjs"
      connect_timeout: "0.25s"
      type: "strict_dns"
      lb_policy: "ROUND_ROBIN"
      hosts:
        -
          socket_address:
            address: "service_5_envoy"
            port_value: 8793


The configuration file above translates into the following diagram:

This diagram did not include all configuration files for all the services, but it is enough to understand the basics.

As you can see, in the listeners section we defined the Listener for our Envoy proxy. Because we are working in Docker, the host is 0.0.0.0.

After configuring the listener, we define the Filters. For simplicity, we used only the basic filters, to match the routes and to rewrite the target routes. In this case, if the request path starts with "/nodejs", the router will choose the nodejs cluster and the path will be rewritten to "/" (this way the request forwarded to the concrete service won't contain the /nodejs part). The logic is the same in the case of "/nestjs". If the request has no such prefix, it will be routed to the cluster called base, without the prefix rewrite filter.

In the clusters section, we defined the clusters. The base cluster will have two services and the chosen load balancing strategy is round-robin. Other available strategies can be found here. The other two clusters (nodejs and nestjs) are simple, with only a single service.

The complete code for this tutorial can be found in my Stupid Simple Service Mesh git repository.
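To make the sidecar idea more tangible in this Docker-based setup, here is a minimal, hypothetical docker-compose sketch of one service and its Envoy sidecar. The image names, file paths and ports are illustrative and are not taken from the repository:

version: "3"
services:
  service_1:
    image: my-org/service-1:latest           # hypothetical application image
    expose:
      - "8080"                                # the app only needs to be reachable by its sidecar
  service_1_envoy:
    image: envoyproxy/envoy:v1.14.1           # the Envoy proxy acting as the sidecar
    volumes:
      - ./service_1/envoy.yaml:/etc/envoy/envoy.yaml   # the sidecar's own Envoy configuration
    ports:
      - "8786:8786"                           # matches the address used by the front proxy's "base" cluster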


Conclusion
In this chapter, we learned about the basic concepts of Service Mesh. In the first part, we understood the responsibilities and
differences between the Ingress Controller, API Gateway, and Service Mesh. Then we talked about what Service Mesh is and what
problems it solves. In the second part, we introduced Envoy, a performant and popular proxy, which we used to build our Service
Mesh example. We learned about the different parts of the Envoy configuration files and created a Service Mesh with five example
services and a front-facing edge proxy.

In the next chapter, we will look at how to use Service Mesh with Kubernetes and will create an example project that can be used as
a starting point in any project using microservices.



Chapter 7

Stupid Simple
Service Mesh
in Kubernetes
Stupid Simple Service Mesh in Kubernetes

We covered the what, when and why of Service Mesh in an earlier chapter. Now I'd like to talk about why service meshes are critical in Kubernetes.

To understand the importance of using service meshes when working with microservices-based applications, let's start with a story.

Suppose that you are working on a big microservices-based banking application, where any mistake can have serious impacts. One day the development team receives a feature request to add a rating functionality to the application. The solution is obvious: create a new microservice that can handle user ratings. Now comes the hard part. The team must come up with a reasonable time estimate to add this new service.

The team estimates that the rating system can be finished in 4 sprints. The manager is angry. He cannot understand why it is so hard to add a simple rating functionality to the app.

To understand the estimate, let's look at what we need to do in order to have a functional rating microservice. The CRUD (Create, Read, Update, Delete) part is easy: just simple coding. But adding this new project to our microservices-based application is not trivial. First, we have to implement authentication and authorization, then we need some kind of tracing to understand what is happening in our application. Because the network is not reliable (unstable connections can result in data loss), we have to think about solutions for retries, circuit breakers, timeouts, etc.

We also need to think about deployment strategies. Maybe we want to use shadow deployments to test our code in production without impacting the users. Maybe we want to add A/B testing capabilities or canary deployments. So even if we create just a simple microservice, there are lots of cross-cutting concerns that we have to keep in mind.

Sometimes it is much easier to add new functionality to an existing service than to create a new service and add it to our infrastructure. It can take a lot of time to deploy a new service, add authentication and authorization, configure tracing, create CI/CD pipelines, implement retry mechanisms and more. But adding the new feature to an existing service will make the service too big. It will also break the rule of single responsibility, and like many existing microservices projects, it will be transformed into a set of connected macroservices or monoliths.

We call this the cross-cutting concerns burden: the fact that in each microservice you must reimplement the cross-cutting concerns, such as authentication, authorization, retry mechanisms and rate limiting.

What is the solution for this burden? Is there a way to implement all these concerns once and inject them into every microservice, so the development team can focus on producing business value? The answer is Istio.

Set Up a Service Mesh in Kubernetes using Istio

Istio solves these issues using sidecars, which it automatically injects into your pods. Your services won't communicate directly with each other; they'll communicate through sidecars. The sidecars will handle all the cross-cutting concerns. You define the rules once, and these rules will be injected automatically into all of your pods.


Sample Application

Let's put this idea into practice. We'll build a sample application to explain the basic functionalities and structure of Istio.

In the previous chapter, we created a service mesh by hand, using Envoy proxies. In this tutorial, we will use the same services, but we will configure our Service Mesh using Istio and Kubernetes.

The image below depicts the application architecture.


Requirements

To work along with this tutorial, you will need to install the following tools:

1. Kubernetes (we used version 1.21.3 in this tutorial)
2. Helm (we used v2)
3. Istio (we used 1.1.17) - setup tutorial
4. Minikube, K3s or a Kubernetes cluster enabled in Docker

Git Repository

My Stupid Simple Service Mesh in Kubernetes repository contains all the scripts for this tutorial. Based on these scripts, you can configure any project.

Running our Microservices-Based Project using Istio and Kubernetes

As I mentioned above, step one is to configure Istio to inject the sidecars into each of your pods from a namespace. We will use the default namespace. This can be done using the following command:

kubectl label namespace default istio-injection=enabled

In the second step, we navigate into the /kubernetes folder from the downloaded repository and apply the configuration files for our services:

kubectl apply -f service1.yaml
kubectl apply -f service2.yaml
kubectl apply -f service3.yaml
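Before moving on, it is worth checking that the sidecar injection actually happened. A quick way to do this, assuming the services were deployed into the default namespace as above, is to list the pods and look at the READY column:

kubectl get pods
# Each pod should report 2/2 containers: the application container
# plus the automatically injected istio-proxy (Envoy) sidecar.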


After these steps, we will have the green part up and running:

For now, we can't access our services from the browser. In the next step, we will configure the Istio Ingress and Gateway, allowing traffic from the exterior.

The gateway configuration is as follows:

apiVersion: networking.istio.io/v1alpha3
kind: Gateway
metadata:
  name: http-gateway
spec:
  selector:
    istio: ingressgateway
  servers:
    - port:
        number: 80
        name: http
        protocol: HTTP
      hosts:
        - "*"

Using the selector istio: ingressgateway, we specify that we would like to use the default ingress gateway controller, which was automatically added when we installed Istio. As you can see, the gateway allows traffic on port 80, but it doesn't know where to route the requests. To define the routes, we need a so-called VirtualService, which is another custom Kubernetes resource defined by Istio.


apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: sssm-virtual-services
spec:
  hosts:
    - "*"
  gateways:
    - http-gateway
  http:
    - match:
        - uri:
            prefix: /service1
      route:
        - destination:
            host: service1
            port:
              number: 80
    - match:
        - uri:
            prefix: /service2
      route:
        - destination:
            host: service2
            port:
              number: 80

The code above shows an example configuration for the VirtualService. In the gateways field, we specified that the virtual service applies to the requests coming from the gateway called http-gateway, and in the http section we define the rules to match the services where the requests should be sent. Every request with /service1 will be routed to the service1 container, while every request with /service2 will be routed to the service2 container.

At this step, we have a working application. Until now there is nothing special about Istio; you can get the same architecture with a simple Kubernetes Ingress controller, without the burden of sidecars and gateway configuration.

Now let's see what we can do using Istio rules.

Security in Istio

Without Istio, every microservice must implement authentication and authorization. Istio removes the responsibility of adding authentication and authorization from the main container (so developers can focus on providing business value) and moves these responsibilities into its sidecars. The sidecars can be configured to request the access token at each call, making sure that only authenticated requests can reach our services.

apiVersion: authentication.istio.io/v1beta1
kind: Policy
metadata:
  name: auth-policy
spec:
  targets:
    - name: service1
    - name: service2
    - name: service3
    - name: service4
    - name: service5
  origins:
    - jwt:
        issuer: "{YOUR_DOMAIN}"
        jwksUri: "{YOUR_JWT_URI}"
  principalBinding: USE_ORIGIN


As an identity and access management server, you can use Auth0, Okta or other OAuth providers. You can learn more about authentication and authorization using Auth0 with Istio in this article.

Traffic Management using Destination Rules

Istio's official documentation says that the DestinationRule "defines policies that apply to traffic intended for a service after routing has occurred." This means that the DestinationRule resource is situated somewhere between the Ingress controller and our services. Using DestinationRules, we can define policies for load balancing, rate limiting or even outlier detection to detect unhealthy hosts.

Shadowing

Shadowing, also called Mirroring, is useful when you want to test your changes in production silently, without affecting end users. All the requests sent to the main service are mirrored (a copy of the request is sent) to the secondary service that you want to test.

Shadowing is easily achieved by defining a destination rule using subsets and a virtual service defining the mirroring route. The destination rule will be defined as follows:

apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: service2
spec:
  host: service2
  subsets:
    - name: v1
      labels:
        version: v1
    - name: v2
      labels:
        version: v2

As we can see above, we defined two subsets for the two versions.

Now we define the virtual service with mirroring configuration, like in the script below:

apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: service2
spec:
  hosts:
    - service2
  http:
    - route:
        - destination:
            host: service2
            subset: v1
      mirror:
        host: service2
        subset: v2


In this virtual service, we defined the main destination route for service2 version v1. The mirroring service will be the same service, but with the v2 version tag. This way the end user will interact with the v1 service, while the request will also be sent to the v2 service for testing.

Traffic Splitting

Traffic splitting is a technique used to test your new version of a service by letting only a small part (a subset) of users interact with the new service. This way, if there is a bug in the new service, only a small subset of end users will be affected.

This can be achieved by modifying our virtual service as follows:

apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: service2
spec:
  hosts:
    - service2
  http:
    - route:
        - destination:
            host: service2
            subset: v1
          weight: 90
        - destination:
            host: service2
            subset: v2
          weight: 10

The most important part of the script is the weight tag, which defines the percentage of the requests that will reach that specific service instance. In our case, 90 percent of the requests will go to the v1 service, while only 10 percent of the requests will go to the v2 service.

Canary Deployments

In canary deployments, newer versions of services are incrementally rolled out to users to minimize the risk and impact of any bugs introduced by the newer version. This can be achieved by gradually decreasing the weight of the old version while increasing the weight of the new version.

A/B Testing

This technique is used when we have two or more different user interfaces and we would like to test which one offers a better user experience. We deploy all the different versions and we collect metrics about the user interaction. A/B testing can be configured using a load balancer based on consistent hashing or by using subsets.


In the first approach, we define the load balancer like in the following script:

apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: service2
spec:
  host: service2
  trafficPolicy:
    loadBalancer:
      consistentHash:
        httpHeaderName: version

As you can see, the consistent hashing is based on the version tag, so this tag must be added to our service called "service2", like this (in the repository you will find two files called service2_v1 and service2_v2 for the two different versions that we use):

apiVersion: apps/v1
kind: Deployment
metadata:
  name: service2-v2
  labels:
    app: service2
spec:
  selector:
    matchLabels:
      app: service2
  strategy:
    type: Recreate
  template:
    metadata:
      labels:
        app: service2
        version: v2
    spec:
      containers:
        - image: zoliczako/sssm-service2:1.0.0
          imagePullPolicy: Always
          name: service2
          ports:
            - containerPort: 5002
          resources:
            limits:
              memory: "256Mi"
              cpu: "500m"

The most important part to notice is spec -> template -> metadata -> labels -> version: v2. The other service has the version: v1 tag.

The other solution is based on subsets.
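As a rough sketch of the subset-based approach, a VirtualService could inspect a request attribute (here a hypothetical x-user-group header) and send one group of users to the v2 subset while everyone else stays on v1, reusing the subsets defined in the DestinationRule from the Shadowing section:

apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: service2
spec:
  hosts:
    - service2
  http:
    - match:
        - headers:
            x-user-group:         # hypothetical header identifying the test group
              exact: beta
      route:
        - destination:
            host: service2
            subset: v2            # users in the test group get the new version
    - route:
        - destination:
            host: service2
            subset: v1            # everyone else keeps the current version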


Retry Management

Using Istio, we can easily define the maximum number of attempts to connect to a service if the initial attempt fails (for example, in case of an overloaded service or a network error).

The retry strategy can be defined by adding the following lines to the end of our virtual service:

retries:
  attempts: 5
  perTryTimeout: 10s

With this configuration, our service2 will have five retry attempts in case of failure, and each attempt will time out after 10 seconds.
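For context, here is a minimal sketch of where that snippet sits inside the service2 virtual service used earlier; the retries block belongs to an entry of the http routes list:

apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: service2
spec:
  hosts:
    - service2
  http:
    - route:
        - destination:
            host: service2
            subset: v1
      retries:
        attempts: 5           # retry up to five times
        perTryTimeout: 10s    # each attempt times out after 10 seconds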

Learn more about traffic management in this article. You'll find a great workshop to configure an end-to-end service mesh using Istio here.

Conclusion

In this chapter, we learned how to set up and configure a service mesh in Kubernetes using Istio. First, we configured an ingress controller and gateway, and then we learned about traffic management using destination rules and virtual services.


Conclusion

Become a
Microservices Master
You’ve made it through our Stupid Simple Kubernetes e-book. Congratulations! You are well on your way to becoming
a microservices master.

There are many more resources available to further your learning about microservices, including the Microservices.io website. Similarly, there are many Kubernetes resources out there. One of our favorites is The Illustrated Children's Guide to Kubernetes video.

I strongly encourage you to get hands-on and continue your learning. The SUSE & Rancher Community is a great place to start and is welcoming to learners at all levels. Whether you are interested in an introductory Kubernetes class or ready to go deeper with a multi-week class on K3s, they've got it all. Join the free community today!

Keep learning and keep it simple!

Zoltán Czakó



Zoltán Czakó is a software developer experienced in backend, frontend, DevOps, artificial intelligence and machine learning. He is the founder of HumindZ, a company focused on making artificial intelligence and machine learning accessible for everyone, providing services that improve everyday life using the power of AI/ML.

He is also a research assistant at the Technical University of Cluj-Napoca in Romania, where he is applying his skills to create a platform that combines No-Code AI with AutoAI. Using this platform, the research team creates automated solutions mainly for healthcare, automating the diagnosis of different diseases and in this way helping to improve the lives of thousands of people.

During his career, Zoltán has worked on multiple


microservices-based projects. He wrote this book to help
others get started with microservices and to make Kubernetes
simple and accessible for everyone.



SUSE is a global leader in innovative, reliable and enterprise-grade open source
solutions, relied upon by more than 60% of the Fortune 500 to power their
mission-critical workloads. We specialize in Business-critical Linux, Enterprise
Container Management and Edge solutions, and collaborate with partners and
communities to empower our customers to innovate everywhere – from the data
center, to the cloud, to the edge and beyond.

SUSE puts the “open” back in open source, giving customers the agility to tackle
innovation challenges today and the freedom to evolve their strategy and solutions
tomorrow. The company employs more than 2,000 people globally. SUSE is listed on
the Frankfurt Stock Exchange.
