Autoscaling your NKE workload

Autoscaling lets you worry less about capacity planning and keeps your services available during load peaks, while you pay only for the resources you actually need at any given moment.

Details

With autoscaling configured, NKE automatically adds new node(s) to your cluster when you've created new Pods and the cluster doesn't have enough capacity to run them. Conversely, if a node in your cluster is underutilized and its Pods can run on other nodes, the autoscaler can delete the node.

Keep in mind that when resources are deleted or moved in the course of autoscaling your cluster, your services can experience some disruption. For example, if your service consists of a controller with a single replica, that replica's Pod might be restarted on a different node if its current node is deleted. Before enabling autoscaling, ensure that your services can tolerate potential disruption or that they are designed and configured so that downscaling does not disrupt Pods that cannot be interrupted.

Availability

Autoscaling is configured per node pool in Cockpit, but a few more things need to be in place before your workload scales automatically.

Usage

Scaling your workloads horizontally

  1. Configure the minimum and maximum node count: By default, we won't scale your cluster to an unbounded number of nodes, to guard you against unexpected costs. You can set the minimum and maximum node count per node pool, and the autoscaler will scale within those boundaries.

  2. Set CPU requests on your Pods: The cluster autoscaler uses these requests to calculate how much free capacity a node has. Without CPU requests, the cluster autoscaler does not function. Setting requests is also good practice whether or not you use the autoscaler.
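As a minimal sketch, CPU requests are set in the container spec of your workload. The Deployment name, image, and values below are illustrative, not prescribed by NKE:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web            # hypothetical workload name
spec:
  replicas: 2
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
        - name: web
          image: nginx:1.27    # example image
          resources:
            requests:
              cpu: 250m        # the cluster autoscaler bases its capacity math on this
              memory: 256Mi
```

With this in place, a Pod that cannot be scheduled because no node has 250m of unreserved CPU left will trigger the autoscaler to add a node (within the node pool's boundaries).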

  3. Set up a Horizontal Pod Autoscaler: To scale your Pods with incoming load, set up a Horizontal Pod Autoscaler (HPA) that scales on CPU utilization. Once your nodes are full, this in turn triggers the cluster autoscaler to add more nodes. The Kubernetes documentation has a great walkthrough to help you set up an HPA.
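The steps above can be sketched as an HPA manifest using the standard `autoscaling/v2` API. The target Deployment name and the replica and utilization figures are assumptions for illustration:

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: web-hpa          # hypothetical name
spec:
  scaleTargetRef:        # the workload to scale
    apiVersion: apps/v1
    kind: Deployment
    name: web            # hypothetical Deployment
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization        # scale when average CPU usage,
          averageUtilization: 70   # relative to the CPU request, exceeds 70%
```

Note that utilization is measured against the CPU request from step 2, which is another reason requests must be set.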

Scaling on custom metrics

You can also scale horizontally by leveraging an optional managed installation of KEDA. KEDA fetches metrics from various backends (called 'scalers' in KEDA terms) and scales your workload based on them. You can deploy KEDA on your cluster using Cockpit.

You can find the available scalers in KEDA's documentation. For example, if your cluster also has Prometheus deployed, you can use the Prometheus scaler. KEDA can also scale on your own custom metrics if you provide a self-written external scaler.
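As a sketch of the Prometheus scaler, a KEDA `ScaledObject` might look like the following. The Deployment name, Prometheus address, query, and threshold are all assumptions you would replace with values from your own cluster:

```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: web-scaledobject   # hypothetical name
spec:
  scaleTargetRef:
    name: web              # hypothetical Deployment to scale
  minReplicaCount: 1
  maxReplicaCount: 10
  triggers:
    - type: prometheus
      metadata:
        # assumed in-cluster Prometheus service address
        serverAddress: http://prometheus.monitoring.svc:9090
        # example query: request rate across all Pods
        query: sum(rate(http_requests_total[2m]))
        # add one replica per 100 requests/s (example value)
        threshold: "100"
```

Under the hood, KEDA creates and manages an HPA for the target workload, so Pod scaling triggered this way feeds into the cluster autoscaler just like a CPU-based HPA.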