Kubernetes Tutorials
Basics
• Kubernetes Basics is an in-depth interactive tutorial that helps you
understand the Kubernetes system and try out some basic Kubernetes
features.
• Hello Minikube
Configuration
• Configuring Redis Using a ConfigMap
Stateless Applications
• Exposing an External IP Address to Access an Application in a Cluster
Stateful Applications
• StatefulSet Basics
Clusters
• AppArmor
Services
• Using Source IP
What's next
If you would like to write a tutorial, see Content Page Types for information
about the tutorial page type.
Hello Minikube
This tutorial shows you how to run a sample app on Kubernetes using
minikube and Katacoda. Katacoda provides a free, in-browser Kubernetes
environment.
Note: You can also follow this tutorial if you've installed minikube
locally. See minikube start for installation instructions.
Objectives
• Deploy a sample application to minikube.
• Run the app.
• View application logs.
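The tutorial's steps themselves are not reproduced in this extract; the following is a minimal sketch of the flow it covers. The deployment name hello-node and the echoserver image are the names the tutorial conventionally uses and should be treated as assumptions here.

minikube start
kubectl create deployment hello-node --image=k8s.gcr.io/echoserver:1.4
kubectl get deployments
kubectl get pods
kubectl logs <pod-name>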
Learn Kubernetes Basics
Kubernetes Basics
This tutorial provides a walkthrough of the basics of the Kubernetes
cluster orchestration system. Each module contains some background
information on major Kubernetes features and concepts, and includes
an interactive online tutorial. These interactive tutorials let you manage
a simple cluster and its containerized applications for yourself.
1. Create a Kubernetes cluster
2. Deploy an app
3. Explore your app
4. Expose your app publicly
5. Scale up your app
6. Update your app
Create a Cluster
Kubernetes Clusters
Kubernetes coordinates a highly available cluster of computers
that are connected to work as a single unit. The abstractions in
Kubernetes allow you to deploy containerized applications to a cluster
without tying them specifically to individual machines. To make use of
this new model of deployment, applications need to be packaged in a
way that decouples them from individual hosts: they need to be
containerized. Containerized applications are more flexible and
available than in past deployment models, where applications were
installed directly onto specific machines as packages deeply integrated
into the host. Kubernetes automates the distribution and
scheduling of application containers across a cluster in a more
efficient way. Kubernetes is an open-source platform and is
production-ready.
Summary:
◦ Kubernetes cluster
◦ Minikube
Cluster Diagram
Masters manage the cluster and the nodes that are used to host the
running applications.
Now that you know what Kubernetes is, let's go to the online tutorial
and start our first cluster!
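The interactive tutorial itself is not reproduced in this extract; a minimal sketch of starting a local cluster and checking it, assuming minikube is installed:

minikube start
kubectl cluster-info
kubectl get nodes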
Deploy an App
Kubernetes Deployments
Once you have a running Kubernetes cluster, you can deploy your
containerized applications on top of it. To do so, you create a
Kubernetes Deployment configuration. The Deployment instructs
Kubernetes how to create and update instances of your application.
Once you've created a Deployment, the Kubernetes master schedules
the application instances included in that Deployment to run on
individual Nodes in the cluster.
Now that you know what Deployments are, let's go to the online tutorial
and deploy our first app!
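The interactive steps are not included in this extract; as a sketch, creating a Deployment from the command line looks like the following. The deployment name and the sample image are the ones used by the interactive tutorial and are assumptions here.

kubectl create deployment kubernetes-bootcamp --image=gcr.io/google-samples/kubernetes-bootcamp:v1
kubectl get deployments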
Kubernetes Pods
When you created a Deployment in Module 2, Kubernetes created a
Pod to host your application instance. A Pod is a Kubernetes
abstraction that represents a group of one or more application
containers (such as Docker), and some shared resources for those
containers. Those resources include:
◦ Shared storage, as Volumes
◦ Networking, as a unique cluster IP address
◦ Information about how to run each container, such as the container
image version or specific ports to use
Pods are the atomic unit on the Kubernetes platform. When we create a
Deployment on Kubernetes, that Deployment creates Pods with
containers inside them (as opposed to creating containers directly).
Each Pod is tied to the Node where it is scheduled, and remains there
until termination (according to restart policy) or deletion. In case of a
Node failure, identical Pods are scheduled on other available Nodes in
the cluster.
Summary:
◦ Pods
◦ Nodes
◦ Kubectl main commands
Nodes
A Pod always runs on a Node. A Node is a worker machine in
Kubernetes and may be either a virtual or a physical machine,
depending on the cluster. Each Node is managed by the Master. A Node
can have multiple pods, and the Kubernetes master automatically
handles scheduling the pods across the Nodes in the cluster. The
Master's automatic scheduling takes into account the available
resources on each Node.
Node overview
Troubleshooting with kubectl
In Module 2, you used Kubectl command-line interface. You'll continue
to use it in Module 3 to get information about deployed applications
and their environments. The most common operations can be done with
the following kubectl commands:
◦ kubectl get - list resources
◦ kubectl describe - show detailed information about a resource
◦ kubectl logs - print the logs from a container in a pod
◦ kubectl exec - execute a command on a container in a pod
You can use these commands to see when applications were deployed,
what their current statuses are, where they are running and what their
configurations are.
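For illustration (the pod name held in $POD_NAME is an assumption), the commands look like this:

kubectl get pods
kubectl describe pods
kubectl logs $POD_NAME
kubectl exec -ti $POD_NAME -- bash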
Now that we know more about our cluster components and the
command line, let's explore our application.
Kubernetes Services
Although each Pod has a unique IP address, those IPs are not exposed
outside the cluster without a Service. Services allow your applications
to receive traffic. Services can be exposed in different ways by
specifying a type in the ServiceSpec: ClusterIP (the default), NodePort,
LoadBalancer, or ExternalName.
Additionally, note that there are some use cases with Services that
involve not defining a selector in the spec. A Service created without a
selector will also not create the corresponding Endpoints object. This
allows users to manually map a Service to specific endpoints. Another
reason a Service may have no selector is that you are strictly using
type: ExternalName.
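As an illustration (not part of the original module), a Deployment can be exposed from the command line; the deployment name kubernetes-bootcamp and port 8080 are assumptions carried over from the earlier sketch:

# Expose the Deployment through a NodePort Service.
kubectl expose deployment/kubernetes-bootcamp --type="NodePort" --port 8080

# Inspect the Service and the node port that was allocated.
kubectl get services
kubectl describe services/kubernetes-bootcamp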
Summary
◦ Exposing Pods to external traffic
◦ Load balancing traffic across multiple Pods
◦ Using labels
A Service routes traffic across a set of Pods. Services are the
abstraction that allow pods to die and replicate in Kubernetes without
impacting your application. Discovery and routing among dependent
Pods (such as the frontend and backend components in an application)
is handled by Kubernetes Services.
Scaling an application
Summary:
◦ Scaling a Deployment
You can create a Deployment with multiple instances from the start by
using the --replicas parameter of the kubectl create deployment
command.
Scaling overview
Scaling out a Deployment will ensure new Pods are created and
scheduled to Nodes with available resources. Scaling will increase the
number of Pods to the new desired state. Kubernetes also supports
autoscaling of Pods, but it is outside of the scope of this tutorial.
Scaling to zero is also possible, and it will terminate all Pods of the
specified Deployment.
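For illustration (the deployment name is an assumption carried over from the earlier modules):

# Create a Deployment with multiple instances up front...
kubectl create deployment kubernetes-bootcamp --image=gcr.io/google-samples/kubernetes-bootcamp:v1 --replicas=3

# ...or scale an existing Deployment to a new desired state.
kubectl scale deployments/kubernetes-bootcamp --replicas=4
kubectl get deployments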
Updating an application
Users expect applications to be available all the time, and developers
are expected to deploy new versions of them several times a day. In
Kubernetes this is done with rolling updates. Rolling updates allow a
Deployment's update to take place with zero downtime by
incrementally replacing Pod instances with new ones. The new Pods
are scheduled on Nodes with available resources.
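A sketch of triggering, watching, and rolling back an update from the command line; the deployment and image names are assumptions for illustration:

kubectl set image deployments/kubernetes-bootcamp kubernetes-bootcamp=jocatalin/kubernetes-bootcamp:v2
kubectl rollout status deployments/kubernetes-bootcamp
kubectl rollout undo deployments/kubernetes-bootcamp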
Summary:
◦ Updating an app
Configuration
Although Secrets are also used to store key-value pairs, they differ from
ConfigMaps in that they're intended for confidential or sensitive
information and are stored using Base64 encoding. This makes Secrets
the appropriate choice for storing such things as credentials, keys, and
tokens; storing credentials is what you'll do in the Interactive Tutorial.
For more information, see the Secrets documentation.
Objectives
◦ Create a Kubernetes ConfigMap and Secret
◦ Inject microservice configuration using MicroProfile Config
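The tutorial's commands are not reproduced here; a minimal sketch of creating a ConfigMap and a Secret with kubectl, using illustrative names and values:

# Non-sensitive configuration in a ConfigMap (names/values are illustrative).
kubectl create configmap sys-app-name --from-literal name=my-system

# Credentials in a Secret (names/values are illustrative).
kubectl create secret generic sys-app-credentials \
  --from-literal username=bob \
  --from-literal password=bobpwd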
Interactive Tutorial -
Configuring a Java Microservice
Objectives
◦ Create a kustomization.yaml file containing:
▪ a ConfigMap generator
▪ a Pod resource config using the ConfigMap
◦ Apply the directory by running kubectl apply -k ./
◦ Verify that the configuration was correctly applied.
Before you begin
You need to have a Kubernetes cluster, and the kubectl command-line
tool must be configured to communicate with your cluster. If you do not
already have a cluster, you can create one by using minikube or you can
use one of these Kubernetes playgrounds:
◦ Katacoda
◦ Play with Kubernetes
To check the version, enter kubectl version.
◦ The example shown on this page works with kubectl 1.14 and
above.
◦ Understand Configure Containers Using a ConfigMap.
You can follow the steps below to configure a Redis cache using data
stored in a ConfigMap.
First create a kustomization.yaml containing a ConfigMap from the
redis-config file:
pods/config/redis-config
maxmemory 2mb
maxmemory-policy allkeys-lru
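The generator block itself is not shown in this extract; a minimal kustomization.yaml sketch, assuming the redis-config file above sits in the working directory and the ConfigMap name example-redis-config referenced by the Pod below:

configMapGenerator:
- name: example-redis-config
  files:
  - redis-config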
pods/config/redis-pod.yaml
apiVersion: v1
kind: Pod
metadata:
name: redis
spec:
containers:
- name: redis
image: redis:5.0.4
command:
- redis-server
- "/redis-master/redis.conf"
env:
- name: MASTER
value: "true"
ports:
- containerPort: 6379
resources:
limits:
cpu: "0.1"
volumeMounts:
- mountPath: /redis-master-data
name: data
- mountPath: /redis-master
name: config
volumes:
- name: data
emptyDir: {}
- name: config
configMap:
name: example-redis-config
items:
- key: redis-config
path: redis.conf
kubectl apply -k .
Use kubectl exec to enter the pod and run the redis-cli tool to
verify that the configuration was correctly applied:
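The verification commands themselves are not included in this extract; a sketch of what they might look like, assuming the Pod name redis from the manifest above (2mb corresponds to 2097152 bytes):

kubectl exec -it redis -- redis-cli
127.0.0.1:6379> CONFIG GET maxmemory
1) "maxmemory"
2) "2097152"
127.0.0.1:6379> CONFIG GET maxmemory-policy
1) "maxmemory-policy"
2) "allkeys-lru"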
What's next
◦ Learn more about ConfigMaps.
Stateless Applications
• Exposing an External IP Address to Access an Application in a Cluster
• Example: Deploying PHP Guestbook application with Redis
• Example: Add logging and metrics to the PHP / Redis Guestbook example
Exposing an External IP Address to Access an Application in a Cluster
Objectives
◦ Run five instances of a Hello World application.
◦ Create a Service object that exposes an external IP address.
◦ Use the Service object to access the running application.
service/load-balancer-example.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
labels:
app.kubernetes.io/name: load-balancer-example
name: hello-world
spec:
replicas: 5
selector:
matchLabels:
app.kubernetes.io/name: load-balancer-example
template:
metadata:
labels:
app.kubernetes.io/name: load-balancer-example
spec:
containers:
- image: gcr.io/google-samples/node-hello:1.0
name: hello-world
ports:
- containerPort: 8080
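The commands that create and expose the Deployment are not included in this extract; a sketch, assuming the manifest above is applied from the documented example URL and using the Service name my-service shown in the output below:

# Run the Hello World application in your cluster.
kubectl apply -f https://k8s.io/examples/service/load-balancer-example.yaml

# Expose the Deployment through an external load balancer.
kubectl expose deployment hello-world --type=LoadBalancer --name=my-service

# Display information about the Service.
kubectl describe services my-service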
Name:                     my-service
Namespace:                default
Labels:                   app.kubernetes.io/name=load-balancer-example
Annotations:              <none>
Selector:                 app.kubernetes.io/name=load-balancer-example
Type:                     LoadBalancer
IP:                       10.3.245.137
LoadBalancer Ingress:     104.198.205.71
Port:                     <unset>  8080/TCP
NodePort:                 <unset>  32377/TCP
Endpoints:                10.0.0.6:8080,10.0.1.6:8080,10.0.1.7:8080 + 2 more...
Session Affinity:         None
Events:                   <none>
7. In the preceding output, you can see that the service has several
endpoints: 10.0.0.6:8080,10.0.1.6:8080,10.0.1.7:8080 + 2 more.
These are internal addresses of the pods that are running the
Hello World application. To verify these are pod addresses, enter
this command:
kubectl get pods --output=wide
8. To access the Hello World application, use the external IP address
(LoadBalancer Ingress) and the corresponding port:
curl http://<external-ip>:<port>
The response to a successful request is a hello message:
Hello Kubernetes!
Cleaning up
To delete the Service, enter this command:
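The delete commands themselves are not reproduced in this extract; a sketch, using the Service and Deployment names from this example:

kubectl delete services my-service
kubectl delete deployment hello-world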
What's next
Learn more about connecting applications with services.
Example: Deploying PHP Guestbook application with Redis
Objectives
◦ Start up a Redis master.
◦ Start up Redis slaves.
◦ Start up the guestbook frontend.
◦ Expose and view the Frontend Service.
◦ Clean up.
◦ Katacoda
◦ Play with Kubernetes
To check the version, enter kubectl version.
The guestbook application uses Redis to store its data. It writes its data
to a Redis master instance and reads data from multiple Redis slave
instances.
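The kubectl commands for this step are not reproduced in this extract; as a sketch, the Redis Master Deployment manifest referenced below would be applied like this, assuming the documented example URL:

kubectl apply -f https://k8s.io/examples/application/guestbook/redis-master-deployment.yaml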
application/guestbook/redis-master-deployment.yaml
3. Query the list of Pods to verify that the Redis Master Pod is
running:
4. Run the following command to view the logs from the Redis
Master Pod:
application/guestbook/redis-master-service.yaml
apiVersion: v1
kind: Service
metadata:
name: redis-master
labels:
app: redis
role: master
tier: backend
spec:
ports:
- name: redis
port: 6379
targetPort: 6379
selector:
app: redis
role: master
tier: backend
2. Query the list of Services to verify that the Redis Master Service is
running:
Although the Redis master is a single pod, you can make it highly
available to meet traffic demands by adding replica Redis slaves.
If there are not any replicas running, this Deployment would start the
two replicas on your container cluster. Conversely, if there are more
than two replicas running, it would scale down until two replicas are
running.
application/guestbook/redis-slave-deployment.yaml
2. Query the list of Pods to verify that the Redis Slave Pods are
running:
NAME                            READY     STATUS              RESTARTS   AGE
redis-master-1068406935-3lswp   1/1       Running             0          1m
redis-slave-2005841000-fpvqc    0/1       ContainerCreating   0          6s
redis-slave-2005841000-phfv9    0/1       ContainerCreating   0          6s
application/guestbook/redis-slave-service.yaml
apiVersion: v1
kind: Service
metadata:
name: redis-slave
labels:
app: redis
role: slave
tier: backend
spec:
ports:
- port: 6379
selector:
app: redis
role: slave
tier: backend
2. Query the list of Services to verify that the Redis slave service is
running:
2. Query the list of Pods to verify that the three frontend replicas are
running:
application/guestbook/frontend-service.yaml
apiVersion: v1
kind: Service
metadata:
name: frontend
labels:
app: guestbook
tier: frontend
spec:
# comment or delete the following line if you want to use a LoadBalancer
type: NodePort
# if your cluster supports it, uncomment the following to automatically create
# an external load-balanced IP for the frontend service.
# type: LoadBalancer
ports:
- port: 80
selector:
app: guestbook
tier: frontend
1. Apply the frontend Service from the frontend-service.yaml file:
Viewing the Frontend Service via NodePort
1. Run the following command to get the IP address for the frontend
Service.
http://192.168.99.100:31323
2. Copy the IP address, and load the page in your browser to view
your guestbook.
Viewing the Frontend Service via LoadBalancer
1. Run the following command to get the IP address for the frontend
Service.
kubectl get service frontend
2. Copy the external IP address, and load the page in your browser to
view your guestbook.
Cleaning up
Deleting the Deployments and Services also deletes any running Pods.
Use labels to delete multiple resources with one command.
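The delete commands are not included in this extract; a sketch of label-based cleanup, assuming the labels used in the manifests above:

kubectl delete deployment -l app=redis
kubectl delete service -l app=redis
kubectl delete deployment -l app=guestbook
kubectl delete service -l app=guestbook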
What's next
◦ Add ELK logging and monitoring to your Guestbook application
◦ Complete the Kubernetes Basics Interactive Tutorials
◦ Use Kubernetes to create a blog using Persistent Volumes for
MySQL and Wordpress
◦ Read more about connecting applications
◦ Read more about Managing Resources
Example: Add logging and
metrics to the PHP / Redis
Guestbook example
This tutorial builds upon the PHP Guestbook with Redis tutorial.
Lightweight log, metric, and network data open source shippers, or
Beats, from Elastic are deployed in the same Kubernetes cluster as the
guestbook. The Beats collect, parse, and index the data into
Elasticsearch so that you can view and analyze the resulting
operational information in Kibana. This example consists of the
following components:
Objectives
◦ Start up the PHP Guestbook with Redis.
◦ Install kube-state-metrics.
◦ Create a Kubernetes Secret.
◦ Deploy the Beats.
◦ View dashboards of your logs and metrics.
◦ Katacoda
◦ Play with Kubernetes
To check the version, enter kubectl version.
This tutorial builds on the PHP Guestbook with Redis tutorial. If you
have the guestbook application running, then you can monitor that. If
you do not have it running then follow the instructions to deploy the
guestbook and do not perform the Cleanup steps. Come back to this
page when you have the guestbook running.
Create a cluster level role binding so that you can deploy kube-state-
metrics and the Beats at the cluster level (in kube-system).
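The binding command itself is not shown in this extract; a sketch, assuming you are authorized to grant cluster-admin and substituting your own user name:

kubectl create clusterrolebinding cluster-admin-binding \
  --clusterrole=cluster-admin --user=<your email associated with your cluster account>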
Install kube-state-metrics
Output:
cd examples/beats-k8s-send-anywhere
Note: There are two sets of steps here, one for self managed
Elasticsearch and Kibana (running on your servers or using
the Elastic Helm Charts), and a second separate set for the
managed service Elasticsearch Service in Elastic Cloud. Only
create the secret for the type of Elasticsearch and Kibana
system that you will use for this tutorial.
◦ Self Managed
◦ Managed service
Self managed
Switch to the Managed service tab if you are connecting to
Elasticsearch Service in Elastic Cloud.
1. ELASTICSEARCH_HOSTS
2. ELASTICSEARCH_PASSWORD
3. ELASTICSEARCH_USERNAME
4. KIBANA_HOST
Set these with the information for your Elasticsearch cluster and your
Kibana host. Here are some examples (also see this configuration)
ELASTICSEARCH_HOSTS
["http://elasticsearch-master.default.svc.cluster.local:
9200"]
["http://host.docker.internal:9200"]
["http://host1.example.com:9200", "http://
host2.example.com:9200"]
Edit ELASTICSEARCH_HOSTS:
vi ELASTICSEARCH_HOSTS
ELASTICSEARCH_PASSWORD
<yoursecretpassword>
Edit ELASTICSEARCH_PASSWORD:
vi ELASTICSEARCH_PASSWORD
ELASTICSEARCH_USERNAME
Edit ELASTICSEARCH_USERNAME:
vi ELASTICSEARCH_USERNAME
KIBANA_HOST
1. The Kibana instance from the Elastic Kibana Helm Chart. The
subdomain default refers to the default namespace. If you have
deployed the Helm Chart using a different namespace, then your
subdomain will be different:
"kibana-kibana.default.svc.cluster.local:5601"
"host1.example.com:5601"
Edit KIBANA_HOST:
vi KIBANA_HOST
Managed service
This tab is for Elasticsearch Service in Elastic Cloud only. If you have
already created a secret for a self managed Elasticsearch and Kibana
deployment, then continue with Deploy the Beats.
1. ELASTIC_CLOUD_AUTH
2. ELASTIC_CLOUD_ID
Set these with the information provided to you from the Elasticsearch
Service console when you created the deployment. Here are some
examples:
ELASTIC_CLOUD_ID
devk8s:ABC123def456ghi789jkl123mno456pqr789stu123vwx456yza789bcd012efg345hijj678klm901nop345zEwOTJjMTc5YWQ0YzQ5OThlN2U5MjAwYTg4NTIzZQ==
ELASTIC_CLOUD_AUTH
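Whichever set of files you edit, they are collected into a single Kubernetes Secret. A sketch for the self-managed case, assuming the file names above, a Secret named dynamic-logging, and the kube-system namespace:

kubectl create secret generic dynamic-logging \
  --from-file=./ELASTICSEARCH_HOSTS \
  --from-file=./ELASTICSEARCH_PASSWORD \
  --from-file=./ELASTICSEARCH_USERNAME \
  --from-file=./KIBANA_HOST \
  --namespace=kube-system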
Manifest files are provided for each Beat. These manifest files use the
secret created earlier to configure the Beats to connect to your
Elasticsearch and Kibana servers.
About Filebeat
Filebeat will collect logs from the Kubernetes nodes and the containers
running in each pod running on those nodes. Filebeat is deployed as a
DaemonSet. Filebeat can autodiscover applications running in your
Kubernetes cluster. At startup Filebeat scans existing containers and
launches the proper configurations for them, then it will watch for new
start/stop events.
- condition.contains:
kubernetes.labels.app: redis
config:
- module: redis
log:
input:
type: docker
containers.ids:
- ${data.kubernetes.container.id}
slowlog:
enabled: true
var.hosts: ["${data.host}:${data.port}"]
Deploy Filebeat:
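The deploy and verify commands are not reproduced in this extract; a sketch, assuming the manifest file name filebeat-kubernetes.yaml from the cloned examples directory and the label it conventionally applies:

kubectl apply -f filebeat-kubernetes.yaml

# Check that the Filebeat DaemonSet Pods are running.
kubectl get pods -n kube-system -l k8s-app=filebeat-dynamic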
Verify
About Metricbeat
- condition.equals:
kubernetes.labels.tier: backend
config:
- module: redis
metricsets: ["info", "keyspace"]
period: 10s
# Redis hosts
hosts: ["${data.host}:${data.port}"]
Verify
About Packetbeat
packetbeat.interfaces.device: any
packetbeat.protocols:
- type: dns
ports: [53]
include_authorities: true
include_additionals: true
- type: http
ports: [80, 8000, 8080, 9200]
- type: mysql
ports: [3306]
- type: redis
ports: [6379]
packetbeat.flows:
timeout: 30s
period: 10s
Deploy Packetbeat
View in Kibana
Search for Packetbeat on the Dashboard page, and view the Packetbeat
overview.
Similarly, view dashboards for Apache and Redis. You will see
dashboards for logs and metrics for each. The Apache Metricbeat
dashboard will be blank. Look at the Apache Filebeat dashboard and
scroll to the bottom to view the Apache error logs. This will tell you why
there are no metrics available for Apache.
The output:
The output:
deployment.extensions/frontend scaled
See the screenshot, add the indicated filters and then add the columns
to the view. You can see the ScalingReplicaSet entry that is marked,
following from there to the top of the list of events shows the image
being pulled, the volumes mounted, the pod starting, etc.
Cleaning up
Deleting the Deployments and Services also deletes any running Pods.
Use labels to delete multiple resources with one command.
No resources found.
What's next
◦ Learn about tools for monitoring resources
◦ Read more about logging architecture
◦ Read more about application introspection and debugging
◦ Read more about troubleshoot applications
Stateful Applications
StatefulSet Basics
This tutorial provides an introduction to managing applications with
StatefulSets. It demonstrates how to create, delete, scale, and update
the Pods of StatefulSets.
◦ Pods
◦ Cluster DNS
◦ Headless Services
◦ PersistentVolumes
◦ PersistentVolume Provisioning
◦ StatefulSets
◦ The kubectl command line tool
Objectives
StatefulSets are intended to be used with stateful applications and
distributed systems. However, the administration of stateful
applications and distributed systems on Kubernetes is a broad, complex
topic. In order to demonstrate the basic features of a StatefulSet, and
not to conflate the former topic with the latter, you will deploy a simple
web application using a StatefulSet.
After this tutorial, you will be familiar with the following.
Creating a StatefulSet
Begin by creating a StatefulSet using the example below. It is similar to
the example presented in the StatefulSets concept. It creates a
headless Service, nginx, to publish the IP addresses of Pods in the
StatefulSet, web.
application/web/web.yaml
apiVersion: v1
kind: Service
metadata:
name: nginx
labels:
app: nginx
spec:
ports:
- port: 80
name: web
clusterIP: None
selector:
app: nginx
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
name: web
spec:
serviceName: "nginx"
replicas: 2
selector:
matchLabels:
app: nginx
template:
metadata:
labels:
app: nginx
spec:
containers:
- name: nginx
image: k8s.gcr.io/nginx-slim:0.8
ports:
- containerPort: 80
name: web
volumeMounts:
- name: www
mountPath: /usr/share/nginx/html
volumeClaimTemplates:
- metadata:
name: www
spec:
accessModes: [ "ReadWriteOnce" ]
resources:
requests:
storage: 1Gi
You will need to use two terminal windows. In the first terminal, use
kubectl get to watch the creation of the StatefulSet's Pods.
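The watch and apply commands are not reproduced here; a sketch, assuming the manifest above is saved as web.yaml:

# In the first terminal, watch the Pods being created.
kubectl get pods -w -l app=nginx

# In a second terminal, create the Service and StatefulSet.
kubectl apply -f web.yaml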
service/nginx created
statefulset.apps/web created
...then get the web StatefulSet, to verify that both were created
successfully:
Notice that the web-1 Pod is not launched until the web-0 Pod is
Running (see Pod Phase) and Ready (see type in Pod Conditions).
Pods in a StatefulSet
Pods in a StatefulSet have a unique ordinal index and a stable network
identity.
web-0
web-1
Use kubectl run to execute a container that provides the nslookup
command from the dnsutils package. Using nslookup on the Pods'
hostnames, you can examine their in-cluster DNS addresses:
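The exact commands are not shown in this extract; a sketch, assuming a busybox image that ships nslookup:

kubectl run -i --tty --image busybox:1.28 dns-test --restart=Never --rm
# Then, inside the container:
nslookup web-0.nginx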
Server: 10.0.0.10
Address 1: 10.0.0.10 kube-dns.kube-system.svc.cluster.local
Name: web-0.nginx
Address 1: 10.244.1.6
nslookup web-1.nginx
Server: 10.0.0.10
Address 1: 10.0.0.10 kube-dns.kube-system.svc.cluster.local
Name: web-1.nginx
Address 1: 10.244.2.6
The CNAME of the headless service points to SRV records (one for each
Pod that is Running and Ready). The SRV records point to A record
entries that contain the Pods' IP addresses.
In a second terminal, use kubectl delete to delete all the Pods in the
StatefulSet:
Wait for the StatefulSet to restart them, and for both Pods to transition
to Running and Ready:
Use kubectl exec and kubectl run to view the Pods' hostnames and
in-cluster DNS entries. First, view the Pods' hostnames:
web-0
web-1
then, run:
Server: 10.0.0.10
Address 1: 10.0.0.10 kube-dns.kube-system.svc.cluster.local
Name: web-0.nginx
Address 1: 10.244.1.7
nslookup web-1.nginx
Server: 10.0.0.10
Address 1: 10.0.0.10 kube-dns.kube-system.svc.cluster.local
Name: web-1.nginx
Address 1: 10.244.2.8
The Pods' ordinals, hostnames, SRV records, and A record names have
not changed, but the IP addresses associated with the Pods may have
changed. In the cluster used for this tutorial, they have. This is why it is
important not to configure other applications to connect to Pods in a
StatefulSet by IP address.
NAME        STATUS    VOLUME                                     CAPACITY   ACCESSMODES   AGE
www-web-0   Bound     pvc-15c268c7-b507-11e6-932f-42010a800002   1Gi        RWO           48s
www-web-1   Bound     pvc-15c79307-b507-11e6-932f-42010a800002   1Gi        RWO           48s
Write the Pods' hostnames to their index.html files and verify that the
NGINX webservers serve the hostnames:
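The commands are not reproduced in this extract; a sketch:

# Write each Pod's hostname into its index.html.
for i in 0 1; do kubectl exec "web-$i" -- sh -c 'echo "$(hostname)" > /usr/share/nginx/html/index.html'; done

# Confirm the webservers serve the hostnames.
for i in 0 1; do kubectl exec -i -t "web-$i" -- curl http://localhost/; done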
web-0
web-1
Examine the output of the kubectl get command in the first terminal,
and wait for all of the Pods to transition to Running and Ready.
web-0
web-1
Even though web-0 and web-1 were rescheduled, they continue to serve
their hostnames because the PersistentVolumes associated with their
PersistentVolumeClaims are remounted to their volumeMounts. No
matter what node web-0 and web-1 are scheduled on, their
PersistentVolumes will be mounted to the appropriate mount points.
Scaling a StatefulSet
Scaling a StatefulSet refers to increasing or decreasing the number of
replicas. This is accomplished by updating the replicas field. You can
use either kubectl scale or kubectl patch to scale a StatefulSet.
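As an illustration (the commands themselves are not shown in this extract):

# Scale up to 5 replicas...
kubectl scale sts web --replicas=5

# ...or scale back down with a patch.
kubectl patch sts web -p '{"spec":{"replicas":3}}'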
Scaling Up
In one terminal window, watch the Pods in the StatefulSet:
statefulset.apps/web scaled
Examine the output of the kubectl get command in the first terminal,
and wait for the three additional Pods to transition to Running and
Ready.
Scaling Down
In one terminal, watch the StatefulSet's Pods:
NAME        STATUS    VOLUME                                     CAPACITY   ACCESSMODES   AGE
www-web-0   Bound     pvc-15c268c7-b507-11e6-932f-42010a800002   1Gi        RWO           13h
www-web-1   Bound     pvc-15c79307-b507-11e6-932f-42010a800002   1Gi        RWO           13h
www-web-2   Bound     pvc-e1125b27-b508-11e6-932f-42010a800002   1Gi        RWO           13h
www-web-3   Bound     pvc-e1176df6-b508-11e6-932f-42010a800002   1Gi        RWO           13h
www-web-4   Bound     pvc-e11bb5f8-b508-11e6-932f-42010a800002   1Gi        RWO           13h
Rolling Update
The RollingUpdate update strategy will update all Pods in a
StatefulSet, in reverse ordinal order, while respecting the StatefulSet
guarantees.
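The patch commands are not included in this extract; a sketch of selecting the strategy and then triggering an update by changing the container image (the image tag is chosen for illustration):

kubectl patch statefulset web -p '{"spec":{"updateStrategy":{"type":"RollingUpdate"}}}'

kubectl patch statefulset web --type='json' \
  -p='[{"op": "replace", "path": "/spec/template/spec/containers/0/image", "value":"k8s.gcr.io/nginx-slim:0.7"}]'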
statefulset.apps/web patched
statefulset.apps/web patched
The Pods in the StatefulSet are updated in reverse ordinal order. The
StatefulSet controller terminates each Pod, and waits for it to transition
to Running and Ready prior to updating the next Pod. Note that, even
though the StatefulSet controller will not proceed to update the next
Pod until its ordinal successor is Running and Ready, it will restore any
Pod that fails during the update to its current version.
Pods that have already received the update will be restored to the
updated version, and Pods that have not yet received the update will be
restored to the previous version. In this way, the controller attempts to
continue to keep the application healthy and the update consistent in
the presence of intermittent failures.
k8s.gcr.io/nginx-slim:0.8
k8s.gcr.io/nginx-slim:0.8
k8s.gcr.io/nginx-slim:0.8
All the Pods in the StatefulSet are now running the previous container
image.
statefulset.apps/web patched
statefulset.apps/web patched
k8s.gcr.io/nginx-slim:0.8
You can roll out a canary to test a modification by decrementing the
partition you specified above.
statefulset.apps/web patched
k8s.gcr.io/nginx-slim:0.7
k8s.gcr.io/nginx-slim:0.8
statefulset.apps/web patched
Wait for all of the Pods in the StatefulSet to become Running and
Ready.
Get the container image details for the Pods in the StatefulSet:
k8s.gcr.io/nginx-slim:0.7
k8s.gcr.io/nginx-slim:0.7
k8s.gcr.io/nginx-slim:0.7
On Delete
The OnDelete update strategy implements the legacy (1.6 and prior)
behavior. When you select this update strategy, the StatefulSet
controller will not automatically update Pods when a modification is
made to the StatefulSet's .spec.template field. This strategy can be
selected by setting .spec.updateStrategy.type to OnDelete.
Deleting StatefulSets
StatefulSet supports both Non-Cascading and Cascading deletion. In a
Non-Cascading Delete, the StatefulSet's Pods are not deleted when the
StatefulSet is deleted. In a Cascading Delete, both the StatefulSet and
its Pods are deleted.
Non-Cascading Delete
In one terminal window, watch the Pods in the StatefulSet.
Use kubectl delete to delete the StatefulSet. Make sure to supply the
--cascade=false parameter to the command. This parameter tells
Kubernetes to only delete the StatefulSet, and to not delete any of its
Pods.
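A sketch of the command (not reproduced in this extract):

# Delete only the StatefulSet; leave its Pods running.
kubectl delete statefulset web --cascade=false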
Even though web has been deleted, all of the Pods are still Running and
Ready. Delete web-0:
As the web StatefulSet has been deleted, web-0 has not been
relaunched.
statefulset.apps/web created
service/nginx unchanged
Ignore the error. It only indicates that an attempt was made to create
the nginx headless Service even though that Service already exists.
Examine the output of the kubectl get command running in the first
terminal.
Let's take another look at the contents of the index.html file served by
the Pods' webservers:
web-0
web-1
Even though you deleted both the StatefulSet and the web-0 Pod, it still
serves the hostname originally entered into its index.html file. This is
because the StatefulSet never deletes the PersistentVolumes associated
with a Pod. When you recreated the StatefulSet and it relaunched
web-0, its original PersistentVolume was remounted.
Cascading Delete
In one terminal window, watch the Pods in the StatefulSet.
In another terminal, delete the StatefulSet again. This time, omit the
--cascade=false parameter.
Examine the output of the kubectl get command running in the first
terminal, and wait for all of the Pods to transition to Terminating.
As you saw in the Scaling Down section, the Pods are terminated one at
a time, with respect to the reverse order of their ordinal indices. Before
terminating a Pod, the StatefulSet controller waits for the Pod's
successor to be completely terminated.
service/nginx created
statefulset.apps/web created
web-0
web-1
Even though you completely deleted the StatefulSet, and all of its Pods,
the Pods are recreated with their PersistentVolumes mounted, and
web-0 and web-1 continue to serve their hostnames.
application/web/web-parallel.yaml
apiVersion: v1
kind: Service
metadata:
name: nginx
labels:
app: nginx
spec:
ports:
- port: 80
name: web
clusterIP: None
selector:
app: nginx
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
name: web
spec:
serviceName: "nginx"
podManagementPolicy: "Parallel"
replicas: 2
selector:
matchLabels:
app: nginx
template:
metadata:
labels:
app: nginx
spec:
containers:
- name: nginx
image: k8s.gcr.io/nginx-slim:0.8
ports:
- containerPort: 80
name: web
volumeMounts:
- name: www
mountPath: /usr/share/nginx/html
volumeClaimTemplates:
- metadata:
name: www
spec:
accessModes: [ "ReadWriteOnce" ]
resources:
requests:
storage: 1Gi
This manifest is identical to the one you downloaded above except that
the .spec.podManagementPolicy of the web StatefulSet is set to Parallel.
service/nginx created
statefulset.apps/web created
Examine the output of the kubectl get command that you executed in
the first terminal.
Keep the second terminal open, and, in another terminal window scale
the StatefulSet:
statefulset.apps/web scaled
Examine the output of the terminal where the kubectl get command is
running.
The StatefulSet launched two new Pods, and it did not wait for the first
to become Running and Ready prior to launching the second.
Cleaning up
You should have two terminals open, ready for you to run kubectl
commands as part of cleanup.
You can watch kubectl get to see those Pods being deleted.
Close the terminal where the kubectl get command is running and
delete the nginx Service:
Note:
You also need to delete the persistent storage media for the
PersistentVolumes used in this tutorial.
Example: Deploying WordPress and MySQL with Persistent Volumes
Objectives
◦ Create PersistentVolumeClaims and PersistentVolumes
◦ Create a kustomization.yaml with
▪ a Secret generator
▪ MySQL resource configs
▪ WordPress resource configs
◦ Apply the kustomization directory by kubectl apply -k ./
◦ Clean up
Before you begin
You need to have a Kubernetes cluster, and the kubectl command-line
tool must be configured to communicate with your cluster. If you do not
already have a cluster, you can create one by using minikube or you can
use one of these Kubernetes playgrounds:
◦ Katacoda
◦ Play with Kubernetes
To check the version, enter kubectl version. The example shown on
this page works with kubectl 1.14 and above.
1. mysql-deployment.yaml
2. wordpress-deployment.yaml
Create a kustomization.yaml
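The body of this section is not included in this extract; a minimal sketch of the Secret generator it describes, using the mysql-pass name referenced by the manifests below and a placeholder password you must replace:

secretGenerator:
- name: mysql-pass
  literals:
  - password=YOUR_PASSWORD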
application/wordpress/mysql-deployment.yaml
apiVersion: v1
kind: Service
metadata:
name: wordpress-mysql
labels:
app: wordpress
spec:
ports:
- port: 3306
selector:
app: wordpress
tier: mysql
clusterIP: None
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: mysql-pv-claim
labels:
app: wordpress
spec:
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 20Gi
---
apiVersion: apps/v1 # for versions before 1.9.0 use apps/v1beta2
kind: Deployment
metadata:
name: wordpress-mysql
labels:
app: wordpress
spec:
selector:
matchLabels:
app: wordpress
tier: mysql
strategy:
type: Recreate
template:
metadata:
labels:
app: wordpress
tier: mysql
spec:
containers:
- image: mysql:5.6
name: mysql
env:
- name: MYSQL_ROOT_PASSWORD
valueFrom:
secretKeyRef:
name: mysql-pass
key: password
ports:
- containerPort: 3306
name: mysql
volumeMounts:
- name: mysql-persistent-storage
mountPath: /var/lib/mysql
volumes:
- name: mysql-persistent-storage
persistentVolumeClaim:
claimName: mysql-pv-claim
application/wordpress/wordpress-deployment.yaml
apiVersion: v1
kind: Service
metadata:
name: wordpress
labels:
app: wordpress
spec:
ports:
- port: 80
selector:
app: wordpress
tier: frontend
type: LoadBalancer
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: wp-pv-claim
labels:
app: wordpress
spec:
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 20Gi
---
apiVersion: apps/v1 # for versions before 1.9.0 use apps/v1beta2
kind: Deployment
metadata:
name: wordpress
labels:
app: wordpress
spec:
selector:
matchLabels:
app: wordpress
tier: frontend
strategy:
type: Recreate
template:
metadata:
labels:
app: wordpress
tier: frontend
spec:
containers:
- image: wordpress:4.8-apache
name: wordpress
env:
- name: WORDPRESS_DB_HOST
value: wordpress-mysql
- name: WORDPRESS_DB_PASSWORD
valueFrom:
secretKeyRef:
name: mysql-pass
key: password
ports:
- containerPort: 80
name: wordpress
volumeMounts:
- name: wordpress-persistent-storage
mountPath: /var/www/html
volumes:
- name: wordpress-persistent-storage
persistentVolumeClaim:
claimName: wp-pv-claim
kubectl apply -k ./
NAME                    TYPE     DATA   AGE
mysql-pass-c57bb4t7mf   Opaque   1      9s

NAME             STATUS    VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
mysql-pv-claim   Bound     pvc-8cbd7b2e-4044-11e9-b2bb-42010a800002   20Gi       RWO            standard       77s
wp-pv-claim      Bound     pvc-8cd0df54-4044-11e9-b2bb-42010a800002   20Gi       RWO            standard       77s
http://1.2.3.4:32406
6. Copy the IP address, and load the page in your browser to view
your site.
You should see the WordPress set up page similar to the following
screenshot.
Warning: Do not leave your WordPress installation on this
page. If another user finds it, they can set up a website on
your instance and use it to serve malicious content.
kubectl delete -k ./
What's next
◦ Learn more about Introspection and Debugging
◦ Learn more about Jobs
◦ Learn more about Port Forwarding
◦ Learn how to Get a Shell to a Container
Example: Deploying Cassandra with a StatefulSet
Objectives
◦ Create and validate a Cassandra headless Service.
◦ Use a StatefulSet to create a Cassandra ring.
◦ Validate the StatefulSet.
◦ Modify the StatefulSet.
◦ Delete the StatefulSet and its Pods.
◦ Katacoda
◦ Play with Kubernetes
The following Service is used for DNS lookups between Cassandra Pods
and clients within your cluster:
application/cassandra/cassandra-service.yaml
apiVersion: v1
kind: Service
metadata:
labels:
app: cassandra
name: cassandra
spec:
clusterIP: None
ports:
- port: 9042
selector:
app: cassandra
Validating (optional)
Get the Cassandra Service.
The response is
If you don't see a Service named cassandra, that means creation failed.
Read Debug Services for help troubleshooting common issues.
Using a StatefulSet to create a Cassandra
ring
The StatefulSet manifest, included below, creates a Cassandra ring that
consists of three Pods.
application/cassandra/cassandra-statefulset.yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
name: cassandra
labels:
app: cassandra
spec:
serviceName: cassandra
replicas: 3
selector:
matchLabels:
app: cassandra
template:
metadata:
labels:
app: cassandra
spec:
terminationGracePeriodSeconds: 1800
containers:
- name: cassandra
image: gcr.io/google-samples/cassandra:v13
imagePullPolicy: Always
ports:
- containerPort: 7000
name: intra-node
- containerPort: 7001
name: tls-intra-node
- containerPort: 7199
name: jmx
- containerPort: 9042
name: cql
resources:
limits:
cpu: "500m"
memory: 1Gi
requests:
cpu: "500m"
memory: 1Gi
securityContext:
capabilities:
add:
- IPC_LOCK
lifecycle:
preStop:
exec:
command:
- /bin/sh
- -c
- nodetool drain
env:
- name: MAX_HEAP_SIZE
value: 512M
- name: HEAP_NEWSIZE
value: 100M
- name: CASSANDRA_SEEDS
value: "cassandra-0.cassandra.default.svc.cluster
.local"
- name: CASSANDRA_CLUSTER_NAME
value: "K8Demo"
- name: CASSANDRA_DC
value: "DC1-K8Demo"
- name: CASSANDRA_RACK
value: "Rack1-K8Demo"
- name: POD_IP
valueFrom:
fieldRef:
fieldPath: status.podIP
readinessProbe:
exec:
command:
- /bin/bash
- -c
- /ready-probe.sh
initialDelaySeconds: 15
timeoutSeconds: 5
# These volume mounts are persistent. They are like inline claims,
# but not exactly because the names need to match exactly one of
# the stateful pod volumes.
volumeMounts:
- name: cassandra-data
mountPath: /cassandra_data
# These are converted to volume claims by the controller
# and mounted at the paths mentioned above.
# do not use these in production until ssd GCEPersistentDisk or other ssd pd
volumeClaimTemplates:
- metadata:
name: cassandra-data
spec:
accessModes: [ "ReadWriteOnce" ]
storageClassName: fast
resources:
requests:
storage: 1Gi
---
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
name: fast
provisioner: k8s.io/minikube-hostpath
parameters:
type: pd-ssd
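The apply and verification commands are not included in this extract; a sketch, assuming the documented example URL for the manifest above:

# Create the Cassandra StatefulSet (and the fast StorageClass it references).
kubectl apply -f https://k8s.io/examples/application/cassandra/cassandra-statefulset.yaml

# Check the rollout; this is the command referred to below.
kubectl get statefulset cassandra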
It can take several minutes for all three Pods to deploy. Once they
are deployed, the same command returns output similar to:
3. Run the Cassandra nodetool inside the first Pod, to display the
status of the ring.
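A sketch of the command (not shown in this extract):

kubectl exec -it cassandra-0 -- nodetool status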
Datacenter: DC1-K8Demo
======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address     Load         Tokens   Owns (effective)   Host ID                                Rack
UN  172.17.0.5  83.57 KiB    32       74.0%              e2dd09e6-d9d3-477e-96c5-45094c08db0f   Rack1-K8Demo
UN  172.17.0.4  101.04 KiB   32       58.8%              f89d6835-3a42-4419-92b3-0e62cae1479c   Rack1-K8Demo
UN  172.17.0.6  84.74 KiB    32       67.1%              a6a1e8c2-3dc5-4417-b1a0-26507af2aaad   Rack1-K8Demo
This command opens an editor in your terminal. The line you need
to change is the replicas field. The following sample is an excerpt
of the StatefulSet file:
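The edit command and manifest excerpt are not reproduced here; as a sketch, the editor is presumably opened with kubectl edit, and the line to change looks like this:

kubectl edit statefulset/cassandra

# In the opened manifest, the relevant excerpt looks like:
#   spec:
#     replicas: 3   # change this value (for example, to 4) and save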
Cleaning up
Deleting or scaling a StatefulSet down does not delete the volumes
associated with the StatefulSet. This setting is for your safety because
your data is more valuable than automatically purging all related
StatefulSet resources.
What's next
◦ Learn how to Scale a StatefulSet.
◦ Learn more about the KubernetesSeedProvider
◦ See more custom Seed Provider Configurations
Running ZooKeeper, A
Distributed System Coordinator
This tutorial demonstrates running Apache ZooKeeper on Kubernetes
using StatefulSets, PodDisruptionBudgets, and PodAntiAffinity.
◦ Pods
◦ Cluster DNS
◦ Headless Services
◦ PersistentVolumes
◦ PersistentVolume Provisioning
◦ StatefulSets
◦ PodDisruptionBudgets
◦ PodAntiAffinity
◦ kubectl CLI
You will require a cluster with at least four nodes, and each node
requires at least 2 CPUs and 4 GiB of memory. In this tutorial you will
cordon and drain the cluster's nodes. This means that the cluster
will terminate and evict all Pods on its nodes, and the nodes will
temporarily become unschedulable. You should use a dedicated
cluster for this tutorial, or you should ensure that the disruption you
cause will not interfere with other tenants.
Objectives
After this tutorial, you will know the following.
The ensemble uses the Zab protocol to elect a leader, and the ensemble
cannot write data until that election is complete. Once complete, the
ensemble uses Zab to ensure that it replicates all writes to a quorum
before it acknowledges and makes them visible to clients. Without
respect to weighted quorums, a quorum is a majority component of the
ensemble containing the current leader. For instance, if the ensemble
has three servers, a component that contains the leader and one other
server constitutes a quorum. If the ensemble can not achieve a quorum,
the ensemble cannot write data.
application/zookeeper/zookeeper.yaml
apiVersion: v1
kind: Service
metadata:
name: zk-hs
labels:
app: zk
spec:
ports:
- port: 2888
name: server
- port: 3888
name: leader-election
clusterIP: None
selector:
app: zk
---
apiVersion: v1
kind: Service
metadata:
name: zk-cs
labels:
app: zk
spec:
ports:
- port: 2181
name: client
selector:
app: zk
---
apiVersion: policy/v1beta1
kind: PodDisruptionBudget
metadata:
name: zk-pdb
spec:
selector:
matchLabels:
app: zk
maxUnavailable: 1
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
name: zk
spec:
selector:
matchLabels:
app: zk
serviceName: zk-hs
replicas: 3
updateStrategy:
type: RollingUpdate
podManagementPolicy: OrderedReady
template:
metadata:
labels:
app: zk
spec:
affinity:
podAntiAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
- labelSelector:
matchExpressions:
- key: "app"
operator: In
values:
- zk
topologyKey: "kubernetes.io/hostname"
containers:
- name: kubernetes-zookeeper
imagePullPolicy: Always
image: "k8s.gcr.io/kubernetes-zookeeper:1.0-3.4.10"
resources:
requests:
memory: "1Gi"
cpu: "0.5"
ports:
- containerPort: 2181
name: client
- containerPort: 2888
name: server
- containerPort: 3888
name: leader-election
command:
- sh
- -c
- "start-zookeeper \
--servers=3 \
--data_dir=/var/lib/zookeeper/data \
--data_log_dir=/var/lib/zookeeper/data/log \
--conf_dir=/opt/zookeeper/conf \
--client_port=2181 \
--election_port=3888 \
--server_port=2888 \
--tick_time=2000 \
--init_limit=10 \
--sync_limit=5 \
--heap=512M \
--max_client_cnxns=60 \
--snap_retain_count=3 \
--purge_interval=12 \
--max_session_timeout=40000 \
--min_session_timeout=4000 \
--log_level=INFO"
readinessProbe:
exec:
command:
- sh
- -c
- "zookeeper-ready 2181"
initialDelaySeconds: 10
timeoutSeconds: 5
livenessProbe:
exec:
command:
- sh
- -c
- "zookeeper-ready 2181"
initialDelaySeconds: 10
timeoutSeconds: 5
volumeMounts:
- name: datadir
mountPath: /var/lib/zookeeper
securityContext:
runAsUser: 1000
fsGroup: 1000
volumeClaimTemplates:
- metadata:
name: datadir
spec:
accessModes: [ "ReadWriteOnce" ]
resources:
requests:
storage: 10Gi
Open a terminal, and use the kubectl apply command to create the
manifest.
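The command itself is not reproduced in this extract; a sketch, assuming the documented example URL for the manifest above:

kubectl apply -f https://k8s.io/examples/application/zookeeper/zookeeper.yaml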
This creates the zk-hs Headless Service, the zk-cs Service, the zk-pdb
PodDisruptionBudget, and the zk StatefulSet.
service/zk-hs created
service/zk-cs created
poddisruptionbudget.policy/zk-pdb created
statefulset.apps/zk created
Once the zk-2 Pod is Running and Ready, use CTRL-C to terminate
kubectl.
The StatefulSet controller creates three Pods, and each Pod has a
container with a ZooKeeper server.
zk-0
zk-1
zk-2
To examine the contents of the myid file for each server use the
following command.
Because the identifiers are natural numbers and the ordinal indices are
non-negative integers, you can generate an identifier by adding 1 to the
ordinal.
myid zk-0
1
myid zk-1
2
myid zk-2
3
To get the Fully Qualified Domain Name (FQDN) of each Pod in the zk
StatefulSet use the following command.
for i in 0 1 2; do kubectl exec zk-$i -- hostname -f; done
The zk-hs Service creates a domain for all of the Pods,
zk-hs.default.svc.cluster.local.
zk-0.zk-hs.default.svc.cluster.local
zk-1.zk-hs.default.svc.cluster.local
zk-2.zk-hs.default.svc.cluster.local
clientPort=2181
dataDir=/var/lib/zookeeper/data
dataLogDir=/var/lib/zookeeper/log
tickTime=2000
initLimit=10
syncLimit=2000
maxClientCnxns=60
minSessionTimeout= 4000
maxSessionTimeout= 40000
autopurge.snapRetainCount=3
autopurge.purgeInterval=0
server.1=zk-0.zk-hs.default.svc.cluster.local:2888:3888
server.2=zk-1.zk-hs.default.svc.cluster.local:2888:3888
server.3=zk-2.zk-hs.default.svc.cluster.local:2888:3888
Achieving Consensus
Consensus protocols require that the identifiers of each participant be
unique. No two participants in the Zab protocol should claim the same
unique identifier. This is necessary to allow the processes in the system
to agree on which processes have committed which data. If two Pods
are launched with the same ordinal, two ZooKeeper servers would both
identify themselves as the same server.
The A records for each Pod are entered when the Pod becomes Ready.
Therefore, the FQDNs of the ZooKeeper servers will resolve to a single
endpoint, and that endpoint will be the unique ZooKeeper server
claiming the identity configured in its myid file.
zk-0.zk-hs.default.svc.cluster.local
zk-1.zk-hs.default.svc.cluster.local
zk-2.zk-hs.default.svc.cluster.local
server.1=zk-0.zk-hs.default.svc.cluster.local:2888:3888
server.2=zk-1.zk-hs.default.svc.cluster.local:2888:3888
server.3=zk-2.zk-hs.default.svc.cluster.local:2888:3888
When the servers use the Zab protocol to attempt to commit a value,
they will either achieve consensus and commit the value (if leader
election has succeeded and at least two of the Pods are Running and
Ready), or they will fail to do so (if either of the conditions are not met).
No state will arise where one server acknowledges a write on behalf of
another.
The command below executes the zkCli.sh script to write world to the
path /hello on the zk-0 Pod in the ensemble.
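A sketch of the commands (not shown in this extract):

# Write the value on zk-0...
kubectl exec zk-0 -- zkCli.sh create /hello world

# ...and read it back from another member of the ensemble.
kubectl exec zk-1 -- zkCli.sh get /hello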
WATCHER::
The data that you created on zk-0 is available on all the servers in the
ensemble.
WATCHER::
This creates the zk StatefulSet object, but the other API objects in the
manifest are not modified because they already exist.
Once the zk-2 Pod is Running and Ready, use CTRL-C to terminate
kubectl.
Use the command below to get the value you entered during the sanity
test, from the zk-2 Pod.
Even though you terminated and recreated all of the Pods in the zk
StatefulSet, the ensemble still serves the original value.
WATCHER::
volumeClaimTemplates:
- metadata:
name: datadir
annotations:
volume.alpha.kubernetes.io/storage-class: anything
spec:
accessModes: [ "ReadWriteOnce" ]
resources:
requests:
storage: 20Gi
NAME           STATUS    VOLUME                                     CAPACITY   ACCESSMODES   AGE
datadir-zk-0   Bound     pvc-bed742cd-bcb1-11e6-994f-42010a800002   20Gi       RWO           1h
datadir-zk-1   Bound     pvc-bedd27d2-bcb1-11e6-994f-42010a800002   20Gi       RWO           1h
datadir-zk-2   Bound     pvc-bee0817e-bcb1-11e6-994f-42010a800002   20Gi       RWO           1h
volumeMounts:
- name: datadir
mountPath: /var/lib/zookeeper
…
command:
- sh
- -c
- "start-zookeeper \
--servers=3 \
--data_dir=/var/lib/zookeeper/data \
--data_log_dir=/var/lib/zookeeper/data/log \
--conf_dir=/opt/zookeeper/conf \
--client_port=2181 \
--election_port=3888 \
--server_port=2888 \
--tick_time=2000 \
--init_limit=10 \
--sync_limit=5 \
--heap=512M \
--max_client_cnxns=60 \
--snap_retain_count=3 \
--purge_interval=12 \
--max_session_timeout=40000 \
--min_session_timeout=4000 \
--log_level=INFO"
…
Configuring Logging
One of the files generated by the zkGenConfig.sh script controls
ZooKeeper's logging. ZooKeeper uses Log4j, and, by default, it uses a
time and size based rolling file appender for its logging configuration.
Use the command below to get the logging configuration from one of
Pods in the zk StatefulSet.
zookeeper.root.logger=CONSOLE
zookeeper.console.threshold=INFO
log4j.rootLogger=${zookeeper.root.logger}
log4j.appender.CONSOLE=org.apache.log4j.ConsoleAppender
log4j.appender.CONSOLE.Threshold=${zookeeper.console.threshold}
log4j.appender.CONSOLE.layout=org.apache.log4j.PatternLayout
log4j.appender.CONSOLE.layout.ConversionPattern=%d{ISO8601} [myid:%X{myid}] - %-5p [%t:%C{1}@%L] - %m%n
This is the simplest possible way to safely log inside the container.
Because the applications write logs to standard out, Kubernetes will
handle log rotation for you. Kubernetes also implements a sane
retention policy that ensures application logs written to standard out
and standard error do not exhaust local storage media.
Use kubectl logs to retrieve the last 20 log lines from one of the Pods.
You can view application logs written to standard out or standard error
using kubectl logs and from the Kubernetes Dashboard.
securityContext:
runAsUser: 1000
fsGroup: 1000
Use the command below to get the file permissions of the ZooKeeper
data directory on the zk-0 Pod.
You can use kubectl patch to update the number of cpus allocated to
the servers.
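A sketch of the patch (not shown in this extract); the new CPU request value is illustrative:

kubectl patch sts zk --type='json' \
  -p='[{"op": "replace", "path": "/spec/template/spec/containers/0/resources/requests/cpu", "value":"0.3"}]'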
statefulset.apps/zk patched
This terminates the Pods, one at a time, in reverse ordinal order, and
recreates them with the new configuration. This ensures that quorum is
maintained during a rolling update.
statefulsets "zk"
REVISION
1
2
Use the kubectl rollout undo command to roll back the modification.
Use the following command to examine the process tree for the
ZooKeeper server running in the zk-0 Pod.
The command used as the container's entry point has PID 1, and the
ZooKeeper process, a child of the entry point, has PID 27.
livenessProbe:
exec:
command:
- sh
- -c
- "zookeeper-ready 2181"
initialDelaySeconds: 15
timeoutSeconds: 5
The probe calls a bash script that uses the ZooKeeper ruok four letter
word to test the server's health.
In one terminal window, use the following command to watch the Pods
in the zk StatefulSet.
When the liveness probe for the ZooKeeper process fails, Kubernetes
will automatically restart the process for you, ensuring that unhealthy
processes in the ensemble are restarted.
readinessProbe:
exec:
command:
- sh
- -c
- "zookeeper-ready 2181"
initialDelaySeconds: 15
timeoutSeconds: 5
Use the command below to get the nodes for Pods in the zk StatefulSet.
kubernetes-node-cxpk
kubernetes-node-a5aq
kubernetes-node-2g2d
This is because the Pods in the zk StatefulSet have a PodAntiAffinity
specified.
affinity:
podAntiAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
- labelSelector:
matchExpressions:
- key: "app"
operator: In
values:
- zk
topologyKey: "kubernetes.io/hostname"
Surviving Maintenance
In this section you will cordon and drain nodes. If you are using
this tutorial on a shared cluster, be sure that this will not
adversely affect other tenants.
The previous section showed you how to spread your Pods across nodes
to survive unplanned node failures, but you also need to plan for
temporary node failures that occur due to planned maintenance.
Use kubectl cordon to cordon all but four of the nodes in your cluster.
In one terminal, use this command to watch the Pods in the zk
StatefulSet.
kubectl get pods -w -l app=zk
In another terminal, use this command to get the nodes that the Pods
are currently scheduled on.
kubernetes-node-pb41
kubernetes-node-ixsl
kubernetes-node-i4c4
Use kubectl drain to cordon and drain the node on which the zk-0
Pod is scheduled.
As there are four nodes in your cluster, kubectl drain succeeds and
zk-0 is rescheduled to another node.
Keep watching the StatefulSet's Pods in the first terminal and drain
the node on which zk-1 is scheduled.
Continue to watch the Pods of the stateful set, and drain the node on
which zk-2 is scheduled.
You cannot drain the third node because evicting zk-2 would violate
zk-budget. However, the node will remain cordoned.
Use zkCli.sh to retrieve the value you entered during the sanity test
from zk-0.
zk-1 is rescheduled on this node. Wait until zk-1 is Running and Ready.
The output:
Cleaning up
◦ Use kubectl uncordon to uncordon all the nodes in your cluster.
◦ You will need to delete the persistent storage media for the
PersistentVolumes used in this tutorial. Follow the necessary steps,
based on your environment, storage configuration, and
provisioning method, to ensure that all storage is reclaimed.
Clusters
AppArmor
Objectives
◦ See an example of how to load a profile on a node
◦ Learn how to enforce the profile on a Pod
◦ Learn how to check that the profile is loaded
◦ See what happens when a profile is violated
◦ See what happens when a profile cannot be loaded
gke-test-default-pool-239f5d02-gyn2: v1.4.0
gke-test-default-pool-239f5d02-x1kf: v1.4.0
gke-test-default-pool-239f5d02-xwux: v1.4.0
apparmor-test-deny-write (enforce)
apparmor-test-audit-write (enforce)
docker-default (enforce)
k8s-nginx (enforce)
container.apparmor.security.beta.kubernetes.io/<container_name>: <profile_ref>
See the API Reference for the full details on the annotation and profile
name formats.
To verify that the profile was applied, you can look for the AppArmor
security option listed in the container created event:
You can also verify directly that the container's root process is running
with the correct profile by checking its proc attr:
k8s-apparmor-example-deny-write (enforce)
Example
This example assumes you have already set up a cluster with AppArmor
support.
First, we need to load the profile we want to use onto our nodes. The
profile we'll use simply denies all file writes:
#include <tunables/global>
file,
Since we don't know where the Pod will be scheduled, we'll need to
load the profile on all our nodes. For this example we'll just use SSH to
install the profiles, but other approaches are discussed in Setting up
nodes with profiles.
NODES=(
# The SSH-accessible domain names of your nodes
gke-test-default-pool-239f5d02-gyn2.us-central1-a.my-k8s
gke-test-default-pool-239f5d02-x1kf.us-central1-a.my-k8s
gke-test-default-pool-239f5d02-xwux.us-central1-a.my-k8s)
for NODE in ${NODES[*]}; do ssh $NODE 'sudo apparmor_parser -q <<EOF
#include <tunables/global>

profile k8s-apparmor-example-deny-write flags=(attach_disconnected) {
  #include <abstractions/base>

  file,

  # Deny all file writes.
  deny /** w,
}
EOF'
done
Next, we'll run a simple "Hello AppArmor" pod with the deny-write
profile:
pods/security/hello-apparmor.yaml
apiVersion: v1
kind: Pod
metadata:
  name: hello-apparmor
  annotations:
    # Tell Kubernetes to apply the AppArmor profile "k8s-apparmor-example-deny-write".
    # Note that this is ignored if the Kubernetes node is not running version 1.4 or greater.
    container.apparmor.security.beta.kubernetes.io/hello: localhost/k8s-apparmor-example-deny-write
spec:
  containers:
  - name: hello
    image: busybox
    command: [ "sh", "-c", "echo 'Hello AppArmor!' && sleep 1h" ]
If we look at the pod events, we can see that the Pod container was
created with the AppArmor profile "k8s-apparmor-example-deny-write":
We can verify that the container is actually running with that profile by
checking its proc attr:
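For example (the expected output follows):
kubectl exec hello-apparmor -- cat /proc/1/attr/current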
k8s-apparmor-example-deny-write (enforce)
Finally, we can see what happens if we try to violate the profile by
writing to a file:
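A quick way to trigger a violation is to attempt a write from inside the container; with the deny-write profile enforced, the command below should fail with a permission error:
kubectl exec hello-apparmor -- touch /tmp/test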
To wrap up, let's look at what happens if we try to specify a profile that
hasn't been loaded:
kubectl create -f /dev/stdin <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: hello-apparmor-2
  annotations:
    container.apparmor.security.beta.kubernetes.io/hello: localhost/k8s-apparmor-example-allow-write
spec:
  containers:
  - name: hello
    image: busybox
    command: [ "sh", "-c", "echo 'Hello AppArmor!' && sleep 1h" ]
EOF
pod/hello-apparmor-2 created
Name: hello-apparmor-2
Namespace: default
Node: gke-test-default-pool-239f5d02-x1kf/
Start Time: Tue, 30 Aug 2016 17:58:56 -0700
Labels: <none>
Annotations: container.apparmor.security.beta.kubernetes.io/hello=localhost/k8s-apparmor-example-allow-write
Status: Pending
Reason: AppArmor
Message: Pod Cannot enforce AppArmor: profile "k8s-apparmor-example-allow-write" is not loaded
IP:
Controllers: <none>
Containers:
hello:
Container ID:
Image: busybox
Image ID:
Port:
Command:
sh
-c
echo 'Hello AppArmor!' && sleep 1h
State: Waiting
Reason: Blocked
Ready: False
Restart Count: 0
Environment: <none>
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from
default-token-dnz7v (ro)
Conditions:
Type Status
Initialized True
Ready False
PodScheduled True
Volumes:
default-token-dnz7v:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-dnz7v
Optional: false
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: <none>
Events:
  FirstSeen  LastSeen  Count  From                                       SubobjectPath  Type     Reason     Message
  ---------  --------  -----  ----                                       -------------  ------   ------     -------
  23s        23s       1      {default-scheduler }                                      Normal   Scheduled  Successfully assigned hello-apparmor-2 to e2e-test-stclair-node-pool-t1f5
  23s        23s       1      {kubelet e2e-test-stclair-node-pool-t1f5}                 Warning  AppArmor   Cannot enforce AppArmor: profile "k8s-apparmor-example-allow-write" is not loaded
Note the Pod status is Pending, with a helpful error message: Pod Cannot enforce AppArmor: profile "k8s-apparmor-example-allow-write" is not loaded. An event was also recorded with the same message.
Administration
Setting up nodes with profiles
Kubernetes does not currently provide any native mechanism for loading AppArmor profiles onto nodes. There are many ways to set up the profiles, such as a DaemonSet that loads them on each node, node initialization scripts or images, or copying the profiles to each node over SSH as in the example above.
The scheduler is not aware of which profiles are loaded onto which
node, so the full set of profiles must be loaded onto every node. An
alternative approach is to add a node label for each profile (or class of
profiles) on the node, and use a node selector to ensure the Pod is run
on a node with the required profile.
Restricting profiles with the PodSecurityPolicy
If the PodSecurityPolicy extension is enabled, cluster-wide AppArmor restrictions can be applied. To enable the PodSecurityPolicy, the following flag must be set on the apiserver:
--enable-admission-plugins=PodSecurityPolicy[,others...]
The AppArmor options can be specified as annotations on the PodSecurityPolicy:
apparmor.security.beta.kubernetes.io/defaultProfileName: <profile_ref>
apparmor.security.beta.kubernetes.io/allowedProfileNames: <profile_ref>[,others...]
Disabling AppArmor
If you do not want AppArmor to be available on your cluster, it can be
disabled by a command-line flag:
--feature-gates=AppArmor=false
When disabled, any Pod that includes an AppArmor profile will fail
validation with a "Forbidden" error. Note that by default docker always
enables the "docker-default" profile on non-privileged pods (if the
AppArmor kernel module is enabled), and will continue to do so even if
the feature-gate is disabled. The option to disable AppArmor will be
removed when AppArmor graduates to general availability (GA).
Authoring Profiles
Getting AppArmor profiles specified correctly can be tricky. Fortunately, there are tools to help with that, such as AppArmor's own aa-genprof and aa-logprof utilities, which generate profile rules by monitoring an application's activity. To debug problems with AppArmor, you can check the system logs to see what, specifically, was denied. AppArmor logs verbose messages to dmesg, and errors can usually be found in the system logs or through journalctl. More information is provided in AppArmor failures.
API Reference
Pod Annotation
Specifying the profile a container will run with:
◦ key: container.apparmor.security.beta.kubernetes.io/<container_name>, where <container_name> matches the name of a container in the Pod. A separate profile can be specified for each container in the Pod.
◦ value: a profile reference, described below
Profile Reference
◦ runtime/default: Refers to the default runtime profile.
▪ Equivalent to not specifying a profile (without a
PodSecurityPolicy default), except it still requires AppArmor
to be enabled.
▪ For Docker, this resolves to the docker-default profile for
non-privileged containers, and unconfined (no profile) for
privileged containers.
◦ localhost/<profile_name>: Refers to a profile loaded on the
node (localhost) by name.
▪ The possible profile names are detailed in the core policy
reference.
◦ unconfined: This effectively disables AppArmor on the container.
PodSecurityPolicy Annotations
Specifying the default profile to apply to containers when none is
provided:
◦ key: apparmor.security.beta.kubernetes.io/defaultProfileName
◦ value: a profile reference, described above
◦ key: apparmor.security.beta.kubernetes.io/allowedProfileNames
◦ value: a comma-separated list of profile references (described
above)
▪ Although an escaped comma is a legal character in a profile
name, it cannot be explicitly allowed here.
What's next
Additional resources:
Restrict a Container's Syscalls with Seccomp
Seccomp stands for secure computing mode and has been a feature of
the Linux kernel since version 2.6.12. It can be used to sandbox the
privileges of a process, restricting the calls it is able to make from
userspace into the kernel. Kubernetes lets you automatically apply
seccomp profiles loaded onto a Node to your Pods and containers.
Identifying the privileges required for your workloads can be difficult.
In this tutorial, you will go through how to load seccomp profiles into a
local Kubernetes cluster, how to apply them to a Pod, and how you can
begin to craft profiles that give only the necessary privileges to your
container processes.
Objectives
◦ Learn how to load seccomp profiles on a node
◦ Learn how to apply a seccomp profile to a container
◦ Observe auditing of syscalls made by a container process
◦ Observe behavior when a missing profile is specified
◦ Observe a violation of a seccomp profile
◦ Learn how to create fine-grained seccomp profiles
◦ Learn how to apply a container runtime default seccomp profile
Create Seccomp Profiles
The contents of these profiles will be explored later on, but for now save them into a directory named profiles/ so that they can be loaded into the cluster in the next step:
◦ audit.json
◦ violation.json
◦ fine-grained.json
pods/security/seccomp/profiles/audit.json
{
"defaultAction": "SCMP_ACT_LOG"
}
pods/security/seccomp/profiles/violation.json
{
"defaultAction": "SCMP_ACT_ERRNO"
}
pods/security/seccomp/profiles/fine-grained.json
{
"defaultAction": "SCMP_ACT_ERRNO",
"architectures": [
"SCMP_ARCH_X86_64",
"SCMP_ARCH_X86",
"SCMP_ARCH_X32"
],
"syscalls": [
{
"names": [
"accept4",
"epoll_wait",
"pselect6",
"futex",
"madvise",
"epoll_ctl",
"getsockname",
"setsockopt",
"vfork",
"mmap",
"read",
"write",
"close",
"arch_prctl",
"sched_getaffinity",
"munmap",
"brk",
"rt_sigaction",
"rt_sigprocmask",
"sigaltstack",
"gettid",
"clone",
"bind",
"socket",
"openat",
"readlinkat",
"exit_group",
"epoll_create1",
"listen",
"rt_sigreturn",
"sched_yield",
"clock_gettime",
"connect",
"dup2",
"epoll_pwait",
"execve",
"exit",
"fcntl",
"getpid",
"getuid",
"ioctl",
"mprotect",
"nanosleep",
"open",
"poll",
"recvfrom",
"sendto",
"set_tid_address",
"setitimer",
"writev"
],
"action": "SCMP_ACT_ALLOW"
}
]
}
pods/security/seccomp/kind.yaml
apiVersion: kind.x-k8s.io/v1alpha4
kind: Cluster
nodes:
- role: control-plane
  extraMounts:
  - hostPath: "./profiles"
    containerPath: "/var/lib/kubelet/seccomp/profiles"
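With kind installed, and the manifest above saved as kind.yaml next to the profiles/ directory created earlier, a cluster can be created like this:
kind create cluster --config=kind.yaml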
Once the cluster is ready, identify the container running as the single
node cluster:
docker ps
You should see output indicating that a container is running with name
kind-control-plane.
CONTAINER ID   IMAGE                  COMMAND                  CREATED          STATUS          PORTS                       NAMES
6a96207fed4b   kindest/node:v1.18.2   "/usr/local/bin/entr…"   27 seconds ago   Up 24 seconds   127.0.0.1:42223->6443/tcp   kind-control-plane
If you inspect the filesystem of that container, you should see that the profiles/ directory has been successfully loaded into the default seccomp path of the kubelet. Use docker exec to run a command in that container:
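For example, list the mounted profiles from inside the node container (the container name comes from the docker ps output above; you should see audit.json, fine-grained.json, and violation.json):
docker exec -it kind-control-plane ls /var/lib/kubelet/seccomp/profiles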
To start off, apply the audit.json profile, which will log all syscalls of
the process, to a new Pod.
pods/security/seccomp/ga/audit-pod.yaml
apiVersion: v1
kind: Pod
metadata:
  name: audit-pod
  labels:
    app: audit-pod
spec:
  securityContext:
    seccompProfile:
      type: Localhost
      localhostProfile: profiles/audit.json
  containers:
  - name: test-container
    image: hashicorp/http-echo:0.2.3
    args:
    - "-text=just made some syscalls!"
    securityContext:
      allowPrivilegeEscalation: false
pods/security/seccomp/alpha/audit-pod.yaml
apiVersion: v1
kind: Pod
metadata:
  name: audit-pod
  labels:
    app: audit-pod
  annotations:
    seccomp.security.alpha.kubernetes.io/pod: localhost/profiles/audit.json
spec:
  containers:
  - name: test-container
    image: hashicorp/http-echo:0.2.3
    args:
    - "-text=just made some syscalls!"
    securityContext:
      allowPrivilegeEscalation: false
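Apply the manifest to create the Pod. A minimal sketch, assuming the manifest above was saved locally as audit-pod.yaml (the same pattern applies to the violation, fine-grained, and default Pods later in this tutorial):
kubectl apply -f audit-pod.yaml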
This profile does not restrict any syscalls, so the Pod should start
successfully.
Now you can curl the endpoint from inside the kind control plane container at the port exposed by this Service. Use docker exec to run the curl command from inside that container:
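The exact commands depend on how the Pod was exposed; the sketch below assumes a NodePort Service named audit-pod, for example one created with kubectl expose pod audit-pod --type NodePort --port 5678:
NODE_PORT=$(kubectl get service audit-pod -o jsonpath='{.spec.ports[0].nodePort}')
docker exec -it kind-control-plane curl "localhost:${NODE_PORT}"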
You can see that the process is running, but what syscalls did it actually
make? Because this Pod is running in a local cluster, you should be able
to see those in /var/log/syslog. Open up a new terminal window and
tail the output for calls from http-echo:
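The same tail command used later in this tutorial works here:
tail -f /var/log/syslog | grep 'http-echo'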
You should already see some logs of syscalls made by http-echo, and if
you curl the endpoint in the control plane container you will see more
written.
Clean up that Pod and Service before moving to the next section:
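For example, assuming the Service was named audit-pod as above:
kubectl delete service audit-pod --wait
kubectl delete pod audit-pod --wait --now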
For demonstration, apply a profile to the Pod that does not allow any syscalls.
pods/security/seccomp/ga/violation-pod.yaml
apiVersion: v1
kind: Pod
metadata:
  name: violation-pod
  labels:
    app: violation-pod
spec:
  securityContext:
    seccompProfile:
      type: Localhost
      localhostProfile: profiles/violation.json
  containers:
  - name: test-container
    image: hashicorp/http-echo:0.2.3
    args:
    - "-text=just made some syscalls!"
    securityContext:
      allowPrivilegeEscalation: false
pods/security/seccomp/alpha/violation-pod.yaml
apiVersion: v1
kind: Pod
metadata:
  name: violation-pod
  labels:
    app: violation-pod
  annotations:
    seccomp.security.alpha.kubernetes.io/pod: localhost/profiles/violation.json
spec:
  containers:
  - name: test-container
    image: hashicorp/http-echo:0.2.3
    args:
    - "-text=just made some syscalls!"
    securityContext:
      allowPrivilegeEscalation: false
If you check the status of the Pod, you should see that it failed to start.
kubectl get pod/violation-pod
Clean up that Pod and Service before moving to the next section:
If you take a look at fine-grained.json, you will notice some of the syscalls seen in the first example, where the profile set "defaultAction": "SCMP_ACT_LOG". Now the profile is setting "defaultAction": "SCMP_ACT_ERRNO", but explicitly allowing a set of syscalls in the "action": "SCMP_ACT_ALLOW" block. Ideally, the container will run successfully and you will see no messages sent to syslog.
pods/security/seccomp/ga/fine-pod.yaml
apiVersion: v1
kind: Pod
metadata:
  name: fine-pod
  labels:
    app: fine-pod
spec:
  securityContext:
    seccompProfile:
      type: Localhost
      localhostProfile: profiles/fine-grained.json
  containers:
  - name: test-container
    image: hashicorp/http-echo:0.2.3
    args:
    - "-text=just made some syscalls!"
    securityContext:
      allowPrivilegeEscalation: false
pods/security/seccomp/alpha/fine-pod.yaml
apiVersion: v1
kind: Pod
metadata:
  name: fine-pod
  labels:
    app: fine-pod
  annotations:
    seccomp.security.alpha.kubernetes.io/pod: localhost/profiles/fine-grained.json
spec:
  containers:
  - name: test-container
    image: hashicorp/http-echo:0.2.3
    args:
    - "-text=just made some syscalls!"
    securityContext:
      allowPrivilegeEscalation: false
Open up a new terminal window and tail the output for calls from http-echo:
tail -f /var/log/syslog | grep 'http-echo'
Check what port the Service has been assigned on the node:
curl the endpoint from inside the kind control plane container:
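As before, this sketch assumes the Pod was exposed as a NodePort Service, here named fine-pod:
NODE_PORT=$(kubectl get service fine-pod -o jsonpath='{.spec.ports[0].nodePort}')
docker exec -it kind-control-plane curl "localhost:${NODE_PORT}"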
You should see no output in the syslog because the profile allowed all
necessary syscalls and specified that an error should occur if one
outside of the list is invoked. This is an ideal situation from a security
perspective, but required some effort in analyzing the program. It
would be nice if there was a simple way to get closer to this security
without requiring as much effort.
Clean up that Pod and Service before moving to the next section:
Most container runtimes provide a sane default set of syscalls that are allowed or blocked. The defaults can easily be applied in Kubernetes by using the runtime/default annotation or by setting the seccomp type in the security context of a Pod or container to RuntimeDefault.
Download the correct manifest for your Kubernetes version:
pods/security/seccomp/ga/default-pod.yaml
apiVersion: v1
kind: Pod
metadata:
  name: audit-pod
  labels:
    app: audit-pod
spec:
  securityContext:
    seccompProfile:
      type: RuntimeDefault
  containers:
  - name: test-container
    image: hashicorp/http-echo:0.2.3
    args:
    - "-text=just made some syscalls!"
    securityContext:
      allowPrivilegeEscalation: false
pods/security/seccomp/alpha/default-pod.yaml
apiVersion: v1
kind: Pod
metadata:
  name: default-pod
  labels:
    app: default-pod
  annotations:
    seccomp.security.alpha.kubernetes.io/pod: runtime/default
spec:
  containers:
  - name: test-container
    image: hashicorp/http-echo:0.2.3
    args:
    - "-text=just made some syscalls!"
    securityContext:
      allowPrivilegeEscalation: false
The default seccomp profile should provide adequate access for most
workloads.
What's next
Additional resources:
◦ A Seccomp Overview
◦ Seccomp Security Profiles for Docker
Services
Using Source IP
Applications running in a Kubernetes cluster find and communicate
with each other, and the outside world, through the Service abstraction.
This document explains what happens to the source IP of packets sent
to different types of Services, and how you can toggle this behavior
according to your needs.
Before you begin
Terminology
This document makes use of the following terms:
NAT
network address translation
Source NAT
replacing the source IP on a packet; in this page, that usually
means replacing with the IP address of a node.
Destination NAT
replacing the destination IP on a packet; in this page, that usually
means replacing with the IP address of a Pod
VIP
a virtual IP address, such as the one assigned to every Service in
Kubernetes
kube-proxy
a network daemon that orchestrates Service VIP management on
every node
Prerequisites
You need to have a Kubernetes cluster, and the kubectl command-line
tool must be configured to communicate with your cluster. If you do not
already have a cluster, you can create one by using minikube or you can
use one of these Kubernetes playgrounds:
◦ Katacoda
◦ Play with Kubernetes
The examples use a small nginx webserver that echoes back the source
IP of requests it receives through an HTTP header. You can create it as
follows:
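The command below mirrors the upstream tutorial's Deployment and image; adjust the image reference if your cluster cannot pull from k8s.gcr.io:
kubectl create deployment source-ip-app --image=k8s.gcr.io/echoserver:1.4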
deployment.apps/source-ip-app created
Objectives
◦ Expose a simple application through various types of Services
◦ Understand how each Service type handles source IP NAT
◦ Understand the tradeoffs involved in preserving source IP
Source IP for Services with Type=ClusterIP
Packets sent to ClusterIP from within the cluster are never source
NAT'd if you're running kube-proxy in iptables mode, (the default). You
can query the kube-proxy mode by fetching http://localhost:10249/
proxyMode on the node where kube-proxy is running.
Get the proxy mode on one of the nodes (kube-proxy listens on port
10249):
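For example, from a shell on the node itself (the proxyMode endpoint listens only on localhost):
curl http://localhost:10249/proxyMode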
iptables
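To test source IP from within the cluster, expose the Deployment as a ClusterIP Service; the names and ports below mirror the tutorial's echoserver setup:
kubectl expose deployment source-ip-app --name=clusterip --port=80 --target-port=8080
The CLIENT VALUES output that follows comes from requesting this Service from a Pod inside the cluster; the reported client_address is that Pod's own IP, showing that no source NAT occurred.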
service/clusterip exposed
CLIENT VALUES:
client_address=10.244.3.8
command=GET
...
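For the NodePort case, the same Deployment is exposed again with --type=NodePort (a sketch mirroring the tutorial's naming):
kubectl expose deployment source-ip-app --name=nodeport --port=80 --target-port=8080 --type=NodePort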
service/nodeport exposed
client_address=10.180.1.1
client_address=10.240.0.5
client_address=10.240.0.3
Note that these are not the correct client IPs; they're cluster-internal IPs. This is what happens:
Visually:
To avoid this, Kubernetes has a feature to preserve the client source IP. If you set service.spec.externalTrafficPolicy to the value Local, kube-proxy only proxies requests to local endpoints, and does not forward traffic to other nodes. This approach preserves the original client source IP address. If there are no local endpoints, packets sent to the node are dropped, so you can rely on the correct source IP in any packet-processing rules you apply to packets that make it through to the endpoint.
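To switch the nodeport Service created above to this policy, patch its externalTrafficPolicy:
kubectl patch svc nodeport -p '{"spec":{"externalTrafficPolicy":"Local"}}'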
service/nodeport patched
Now, re-run the test:
client_address=198.51.100.79
Note that you only got one reply, with the right client IP, from the one
node on which the endpoint pod is running.
Visually:
curl 203.0.113.140
CLIENT VALUES:
client_address=10.240.0.5
...
Visually:
[Diagram: the load balancer health-checks each node's healthCheckNodePort; the check against Node 1 (which has a local endpoint for the Service) returns 200, while the check against Node 2 (no local endpoint) returns 500, so the load balancer sends client traffic only to Node 1.]
healthCheckNodePort: 32122
curl 203.0.113.140
CLIENT VALUES:
client_address=198.51.100.79
...
Cross-platform support
Load balancers in the first category must use an agreed-upon protocol between the load balancer and backend to communicate the true client IP, such as the HTTP Forwarded or X-FORWARDED-FOR headers, or the proxy protocol. Load balancers in the second category can leverage the feature described above by creating an HTTP health check pointing at the port stored in the service.spec.healthCheckNodePort field on the Service.
Cleaning up
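A minimal sketch, assuming only the resources created on this page (the Services created with kubectl expose inherit the app=source-ip-app label from the Deployment):
# Delete the Services created for this tutorial.
kubectl delete svc -l app=source-ip-app
# Delete the Deployment, ReplicaSet and Pod.
kubectl delete deployment source-ip-app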
What's next
◦ Learn more about connecting applications via services
◦ Read how to Create an External Load Balancer