A guide to mastering autoscaling in Kubernetes with KEDA

Opcito Technologies
Apr 15, 2024

Efficiently managing containerized applications in Kubernetes involves a critical component: autoscaling. This dynamic allocation of resources based on demand is essential for optimizing performance and cost-effectiveness. However, implementing autoscaling can be challenging, especially when integrating external metrics and working with the Horizontal Pod Autoscaler (HPA).

This is where Kubernetes Event-driven Autoscaling (KEDA) steps in. With KEDA, autoscaling becomes more flexible and powerful, allowing you to scale your applications based on a wide range of metrics, including Prometheus metrics. This blog will walk you through the process of autoscaling with KEDA and show you how to efficiently scale your applications in a Kubernetes environment.

Why should you autoscale in Kubernetes?

The number of containers running an application in a Kubernetes cluster can vary based on demand. Traditionally, this scaling process required manual intervention or tedious scripting, leading to inefficient resource allocation and potential downtime during peak traffic. The ability to adjust resource allocation automatically has been a game-changer. Let’s find out how.

Benefits of autoscaling in Kubernetes

With the ability to automatically adjust resource allocation based on demand, autoscaling in Kubernetes provides the following benefits:

  • Improved performance: Autoscaling ensures applications can handle high traffic loads without compromising performance. The applications remain responsive even during peak times by automatically adding resources when needed.
  • Cost optimization: Autoscaling allows for efficient utilization of resources, scaling them up or down as needed. It eliminates the need for overprovisioning, saving costs on unnecessary resources. KEDA can even scale the number of pods down to zero when there are no events to process, something the standard HPA cannot easily do, which helps ensure effective resource utilization and ultimately brings down your cloud bill.
  • Simplified management: Autoscaling eliminates the need for manual intervention, reducing the risk of human error and improving overall operational efficiency.

Introducing KEDA: Autoscaling made easy

KEDA (Kubernetes Event-driven Autoscaling) is an open-source project that simplifies and enhances Kubernetes’ autoscaling capabilities. It enables autoscaling based on events instead of the traditional resource utilization metrics. KEDA allows you to scale your applications dynamically in response to various events, such as message queue length, HTTP requests, or custom metrics emitted by your application. KEDA supports a wide range of event sources, including Azure Queue, Kafka, RabbitMQ, Prometheus metrics, and more. This flexibility enables the creation of scalable and responsive applications that adapt to different workload patterns.

Let’s dive deeper into the functionality of KEDA to get a comprehensive, hands-on understanding of how it works. Harnessing the power of KEDA can ensure optimal performance and resource utilization in your containerized applications.

Installing KEDA and configuring autoscaling

Let’s use the Helm chart installation method to deploy KEDA to a minikube cluster. You can use any vendor-specific Kubernetes cluster, provided you have appropriate cluster access and a locally installed kubectl client.

helm repo add kedacore https://kedacore.github.io/charts
helm repo update
helm install keda kedacore/keda --namespace keda --create-namespace
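
Once the chart is installed, you can verify that the KEDA components are up and running (exact pod names may vary across chart versions):

kubectl get pods --namespace keda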

The KEDA CRDs (custom resource definitions) are available in your cluster once you install the Helm chart. You define your scaling configuration using these CRDs:

  • scaledobjects.keda.sh
  • scaledjobs.keda.sh
  • triggerauthentications.keda.sh
  • clustertriggerauthentications.keda.sh

These custom resources allow you to link an event source (along with its authentication) to a Deployment, Custom Resource, StatefulSet, or Job for scaling purposes.

  • ScaledObjects represent the desired mapping between an event source (e.g., RabbitMQ) and a Kubernetes Deployment, StatefulSet, or any custom resource that defines a /scale subresource.
  • ScaledJobs represent the mapping between an event source and a Kubernetes Job.
  • A ScaledObject/ScaledJob may also reference a TriggerAuthentication or ClusterTriggerAuthentication, which contains the authentication configuration or secrets needed to monitor the event source (see the sketch after this list).
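
For instance, a TriggerAuthentication that pulls a RabbitMQ connection string from a Kubernetes Secret might look like the following minimal sketch. The Secret name rabbitmq-secret and its host key are hypothetical placeholders:

apiVersion: keda.sh/v1alpha1
kind: TriggerAuthentication
metadata:
  name: rabbitmq-trigger-auth
spec:
  secretTargetRef:
    # Maps the trigger's "host" parameter to a key in an existing Secret
    - parameter: host
      name: rabbitmq-secret   # hypothetical Secret holding the connection string
      key: host

A ScaledObject then points to this resource by name through its trigger’s authenticationRef field.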

KEDA in action

Now, let’s set up an nginx deployment using the manifest below.

nginx-deploy.yaml

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
  labels:
    app: nginx
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
        - name: nginx
          image: nginx:1.14.2
          ports:
            - containerPort: 80
          resources:
            requests:
              cpu: "0.5"
              memory: "200Mi"
---
apiVersion: v1
kind: Service
metadata:
  name: nginx
spec:
  selector:
    app: nginx
  type: ClusterIP
  ports:
    - protocol: TCP
      port: 80
      targetPort: 80
Apply the manifest:

kubectl apply -f nginx-deploy.yaml
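
Before wiring up autoscaling, it is worth confirming that the deployment and its service are up:

kubectl get deployment nginx-deployment
kubectl get svc nginx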

Now, let’s set up the ScaledObject that ties an event source to this deployment.
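
As a flavor of what this looks like, here is a minimal sketch of a ScaledObject that scales the nginx deployment on a Prometheus metric. The serverAddress, the query (which assumes an nginx exporter is publishing nginx_http_requests_total), and the threshold are illustrative placeholders rather than values from a real cluster:

apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: nginx-scaledobject
spec:
  scaleTargetRef:
    name: nginx-deployment    # the Deployment created above
  minReplicaCount: 0          # KEDA can scale all the way down to zero
  maxReplicaCount: 10
  triggers:
    - type: prometheus
      metadata:
        # Hypothetical in-cluster Prometheus endpoint
        serverAddress: http://prometheus-server.monitoring.svc.cluster.local:9090
        # Hypothetical query; assumes an exporter exposes this metric
        query: sum(rate(nginx_http_requests_total[2m]))
        threshold: "100"

Apply it with kubectl apply -f nginx-scaledobject.yaml, then watch kubectl get scaledobject nginx-scaledobject as load changes; under the hood, KEDA creates and manages an HPA for you.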
