K8s Horizontal Pod Autoscaling
Ajinkya Kale
Practical implementation:
Step 3: Apply the patch to the deployed Metrics Server
kubectl patch deploy metrics-server -p "$(cat k8s-metrics-server.patch.yaml)" -n kube-system
Step 5: Create two deployments, httpd and nginx, for testing
HPA Setup with Testing
Step 9: Apply the deployment and check its pods
Step 11: Create Horizontal Pod Autoscaler (HPA)
Now, we define the HPA that will scale our php-apache deployment
based on CPU usage.
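As a rough sketch of what such an HPA definition could look like, the snippet below writes a manifest to a file. The file name php-apache-hpa.yaml, the replica bounds (1 to 10), and the 50% CPU target are placeholder assumptions, not values taken from this walkthrough; adjust them to your setup. CPU utilization is measured against the container's CPU request, so the php-apache deployment needs resource requests set.

```shell
# Write a CPU-based HPA manifest (file name and all numeric values are assumptions).
cat <<'EOF' > php-apache-hpa.yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: php-apache
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: php-apache
  minReplicas: 1          # assumed lower bound
  maxReplicas: 10         # assumed upper bound
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 50   # assumed target, percent of the CPU request
EOF

# Then, against your cluster:
#   kubectl apply -f php-apache-hpa.yaml
#   kubectl get hpa php-apache
```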
Step 12: Apply the HPA and check that it is correctly configured
Step 13: Create a Load Generator Pod
This pod generates traffic to your application to trigger scaling.
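One possible way to create such a pod is sketched below. The pod name load-generator matches the name used in this walkthrough, but the file name, the busybox:1.28 image, and the sleep command are assumptions chosen for illustration; any small image that ships wget and a shell should work.

```shell
# Write a minimal load-generator pod manifest (image and command are assumptions).
cat <<'EOF' > load-generator.yaml
apiVersion: v1
kind: Pod
metadata:
  name: load-generator
spec:
  restartPolicy: Never
  containers:
  - name: load-generator
    image: busybox:1.28          # assumed image; provides sh and wget
    command: ["sleep", "3600"]   # keep the pod alive so we can exec into it
EOF

# Then, against your cluster:
#   kubectl apply -f load-generator.yaml
#   kubectl exec -it load-generator -- /bin/sh
```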
Inside the load generator pod, run the following command. It
continuously sends requests to your php-apache service, which
raises CPU utilization and should trigger the HPA to scale up
the pods.
while true; do wget -q -O- http://php-apache.default.svc.cluster.local; done
Step 14: Monitor the HPA status to see how it is scaling your
deployment
kubectl get hpa
Step 15: Also check that pods scale up or down based on the load
generated by the load-generator.
======================= =======================
Autoscaling with CPU and Memory Metrics
Next, we configure the Horizontal Pod Autoscaler (HPA) with both
CPU and memory as metrics, as in the new YAML. With this
configuration, the HPA monitors the average utilization of both
resources and scales the deployment up or down according to the
thresholds you have set.
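A two-metric HPA might look roughly like the sketch below. The 40% CPU target echoes the example threshold mentioned in this section; the file name, the 60% memory target, and the replica bounds are assumptions for illustration. Both utilization targets are relative to the container's resource requests, so the deployment must declare CPU and memory requests.

```shell
# Write an HPA manifest with CPU and memory metrics (file name, memory target,
# and replica bounds are assumptions).
cat <<'EOF' > php-apache-hpa-cpu-mem.yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: php-apache
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: php-apache
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 40   # percent of the CPU request
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 60   # assumed; percent of the memory request
EOF

# Then, against your cluster:
#   kubectl apply -f php-apache-hpa-cpu-mem.yaml
```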
Step 1: Apply the New HPA Configuration
This will update your HPA to use both CPU and memory metrics
for scaling.
Step 2: Now exec into the load-generator pod to create load and
trigger scaling
Step 3: Now monitor the scaling process by checking the status
of the HPA
kubectl get hpa -w
You should see the number of replicas increase as CPU or memory usage
exceeds its threshold. For example, if average CPU utilization exceeds
40%, the HPA scales up the deployment and the number of replicas increases.
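To make the scaling arithmetic concrete, here is a small sketch of the core HPA rule: desired replicas = ceil(current replicas x current utilization / target utilization). The helper name desired_replicas is hypothetical, and the sketch deliberately ignores the HPA's tolerance band and the minReplicas/maxReplicas clamp. With several metrics configured, the HPA computes this per metric and uses the largest result.

```shell
# desired_replicas CURRENT_REPLICAS CURRENT_UTILIZATION TARGET_UTILIZATION
# Simplified HPA rule: ceil(current_replicas * current_util / target_util).
# Hypothetical helper; ignores tolerance and min/max replica bounds.
desired_replicas() {
  current_replicas=$1
  current_util=$2
  target_util=$3
  # Integer ceiling division
  echo $(( (current_replicas * current_util + target_util - 1) / target_util ))
}

desired_replicas 1 90 40   # 1 replica at 90% CPU, 40% target: prints 3
desired_replicas 3 20 40   # 3 replicas at 20% CPU, 40% target: prints 2
```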
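Scale down: Once you stop the load generation, CPU and memory
usage should fall below the target thresholds, and the HPA
should scale the deployment back down. Scale-down is not
immediate; by default the HPA waits for a stabilization window
(300 seconds) before removing replicas.
Simply terminate the load command, as shown below.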