K8s Horizontal Pod Autoscaling


Horizontal Pod Autoscaling

In Kubernetes, the Metrics Server collects resource consumption
metrics from the pods, and the HPA controller reads them through the
metrics API. Based on the rules you have defined in the HPA manifest
file, this object decides whether to scale the pods up or down. For
example, with a target of 80 percent average CPU utilization, the HPA
instructs the Deployment (and its ReplicaSet) to add pods when usage
rises above the target, and it removes the additional pods once usage
drops well below the target again, for example to 10 percent. This
is how the Kubernetes HPA works.
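
The scaling decision described above can be made concrete with the replica formula given in the Kubernetes documentation: desiredReplicas = ceil(currentReplicas * currentMetricValue / desiredMetricValue). The sketch below only reproduces that arithmetic in shell. The function name desired_replicas and the sample numbers are illustrative assumptions, and the real controller additionally applies a tolerance and the minReplicas/maxReplicas bounds.

```shell
# Sketch of the HPA replica formula (illustrative numbers, not measurements):
#   desiredReplicas = ceil(currentReplicas * currentMetric / targetMetric)
desired_replicas() {
  current_replicas=$1   # pods currently running
  current_metric=$2     # observed average utilization, in percent
  target_metric=$3      # target average utilization, in percent
  # Integer ceiling division: (a + b - 1) / b
  echo $(( (current_replicas * current_metric + target_metric - 1) / target_metric ))
}

desired_replicas 2 120 80   # 2 pods at 120% against an 80% target -> 3
desired_replicas 4 20 80    # 4 pods at 20% against an 80% target  -> 1
```

In the first call the load is 1.5 times the target, so two pods become three. In the second call the load is a quarter of the target, so four pods can shrink to one.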

Ajinkya Kale
Practical implementation:

Step 1: Deploy the Metrics Server


kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml

Step 2: Download the patch for Metrics Server


wget -c https://gist.githubusercontent.com/initcron/1a2bd25353e1faa22a0ad41ad1c01b62/raw/008e23f9fbf4d7e2cf79df1dd008de2f1db62a10/k8s-metrics-server.patch.yaml

Step 3: Apply the Patch for deployed Metrics Server
kubectl patch deploy metrics-server -p "$(cat k8s-metrics-server.patch.yaml)" -n kube-system

Step 4: Verify the Metrics Server Pod status


kubectl get pods -n kube-system

Step 5: Create two deployments, httpd and nginx, for testing
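
The original manifest for this step is not shown, so the following is only a minimal sketch of what the two test deployments could look like. The file name test-deployments.yaml, the app labels, and the untagged httpd and nginx images are assumptions.

```shell
# Write a manifest with two single-replica test deployments.
# File name, labels and images are assumptions for illustration.
cat > test-deployments.yaml <<'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  name: httpd
spec:
  replicas: 1
  selector:
    matchLabels:
      app: httpd
  template:
    metadata:
      labels:
        app: httpd
    spec:
      containers:
      - name: httpd
        image: httpd
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx
EOF
```

The file can then be applied with kubectl apply -f test-deployments.yaml. These pods exist only to give the Metrics Server something to report on.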

Step 6: Check Pod Status

Step 7: Verify that the Metrics Server is working using the kubectl top command

HPA Setup with Testing

Step 8: Create Deployment YAML file


This is our main application deployment that will scale based on the
load.
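
The deployment file itself is not reproduced in the text, so here is a hedged sketch. It assumes the registry.k8s.io/hpa-example image used in the official Kubernetes HPA walkthrough, a run: php-apache label, and illustrative resource values. The file name php-apache-deployment.yaml is also an assumption. The resource requests matter because the HPA expresses utilization as a percentage of the requested amount.

```shell
# Write the php-apache deployment manifest.
# Image, label and resource values are assumptions, not the author's originals.
cat > php-apache-deployment.yaml <<'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  name: php-apache
spec:
  replicas: 1
  selector:
    matchLabels:
      run: php-apache
  template:
    metadata:
      labels:
        run: php-apache
    spec:
      containers:
      - name: php-apache
        image: registry.k8s.io/hpa-example
        ports:
        - containerPort: 80
        resources:
          requests:
            cpu: 200m      # utilization targets are relative to this request
            memory: 64Mi   # needed if memory is later used as an HPA metric
          limits:
            cpu: 500m
EOF
```

Applying it is then a matter of kubectl apply -f php-apache-deployment.yaml, followed by kubectl get pods to confirm the pod reaches the Running state.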

Step 9: Apply and Check the deployment pod

Step 10: Create a Service for PHP-Apache
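
A minimal sketch of such a Service follows. The name php-apache, the default namespace, and port 80 are taken from the URL http://php-apache.default.svc.cluster.local used later in this guide. The selector run: php-apache and the file name are assumptions and must match the labels on your deployment's pods.

```shell
# Write a ClusterIP Service that exposes the php-apache pods on port 80.
# The selector is an assumption; it has to match the pod labels.
cat > php-apache-service.yaml <<'EOF'
apiVersion: v1
kind: Service
metadata:
  name: php-apache
  namespace: default
spec:
  selector:
    run: php-apache
  ports:
  - port: 80
    targetPort: 80
EOF
```

After kubectl apply -f php-apache-service.yaml, the command kubectl get svc php-apache should list the service with a cluster IP.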

Step 11: Create Horizontal Pod Autoscaler (HPA)
Now, we define the HPA that will scale our php-apache deployment
based on CPU usage.
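
The HPA manifest is not included in the text. The sketch below uses the autoscaling/v2 API and assumes a 50% CPU target with 1 to 10 replicas, as in the official Kubernetes walkthrough. Those numbers and the file name php-apache-hpa.yaml are assumptions, not the author's values.

```shell
# Write a CPU-based HPA for the php-apache deployment.
# Target percentage and replica bounds are assumptions.
cat > php-apache-hpa.yaml <<'EOF'
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: php-apache
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: php-apache
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 50
EOF
```

Once applied with kubectl apply -f php-apache-hpa.yaml, kubectl get hpa should show the current and target CPU utilization for php-apache.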

Step 12: Apply the HPA and Check that the HPA is correctly
configured

Step 13: Create a Load Generator Pod
This pod generates traffic to your application to trigger scaling.
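
One way to get such a pod, sketched under assumptions: a busybox pod that simply sleeps, so that you can open a shell in it and run the load loop by hand. The pod name load-generator matches the name used later in this guide. The busybox:1.28 image, the one-hour sleep, and the file name are assumptions.

```shell
# Write a long-running busybox pod to generate load from.
# Image tag and sleep duration are assumptions.
cat > load-generator.yaml <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: load-generator
spec:
  restartPolicy: Never
  containers:
  - name: load-generator
    image: busybox:1.28
    command: ["sleep", "3600"]
EOF
```

Create the pod with kubectl apply -f load-generator.yaml, then open a shell in it with kubectl exec -it load-generator -- /bin/sh.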

Now, inside the load generator pod, execute the following command. It
continuously sends requests to your php-apache service, causing CPU
utilization to increase, which should trigger the HPA to scale up the
pods.
while true; do wget -q -O- http://php-apache.default.svc.cluster.local; done

Step 14: Monitor HPA status, check how the HPA is scaling your
deployment
kubectl get hpa

Step 15: Also check the pods scaling up or down based on the load
generated by the load-generator.

======================= =======================

Autoscaling with CPU and Memory Metrics
Now we configure Horizontal Pod Autoscaling (HPA) with both
CPU and memory as metrics. With such a configuration, the HPA
monitors the average utilization of both resources against the
thresholds you have set. It calculates a desired replica count for
each metric separately and scales to the larger of the two.

This is our new HPA Configuration YAML file
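
The file is not reproduced in the text, so this is a sketch only. The 40% CPU target comes from the example given later in this section. The 70% memory target, the replica bounds, and the file name are assumptions. A memory utilization target only works if the container declares a memory request.

```shell
# Write an HPA that scales on both CPU and memory utilization.
# CPU target follows the 40% mentioned in this guide; the rest is assumed.
cat > php-apache-hpa-cpu-memory.yaml <<'EOF'
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: php-apache
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: php-apache
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 40
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 70
EOF
```

Because the HPA keeps the name php-apache, applying this file with kubectl apply -f php-apache-hpa-cpu-memory.yaml updates the existing object instead of creating a second one.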

Step 1: Apply the New HPA Configuration
This will update your HPA to use both CPU and memory metrics
for scaling.

Step 2: Now exec into the load-generator pod to create load and
trigger scaling

Inside the load-generator pod, run the following command to
continuously hit the PHP-Apache service:
while true; do wget -q -O- http://php-apache.default.svc.cluster.local; done

Step 3: Now monitor the scaling process by checking the status
of the HPA
kubectl get hpa -w
You should see the number of replicas increase as CPU or memory usage
exceeds its target. For example, if average CPU utilization exceeds 40%,
the HPA will scale up the deployment, and the number of replicas will increase.

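For scale down: Once you stop the load generation, CPU and
memory usage should fall below the target thresholds, and the HPA
should scale down the deployment. This does not happen immediately:
by default the HPA waits for a five-minute stabilization window
before removing pods. To stop the load, simply terminate the load
command (Ctrl+C) inside the load-generator pod.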
