Autoscaling Pods

OKD can autoscale a deployment or a deployment configuration based on the current load on its application pods by using a HorizontalPodAutoscaler resource.

A horizontal pod autoscaler resource uses performance metrics collected by the OpenShift Metrics subsystem.

To autoscale a deployment or deployment configuration, you must specify resource requests for its pods so that the horizontal pod autoscaler can calculate usage as a percentage of the requested amount.
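
As a rough illustration of why requests are required, the utilization figure is the measured CPU divided by the requested CPU. The sketch below uses a hypothetical helper, utilization_percent, with made-up millicore values; it is not part of the oc client and only mirrors the arithmetic.

```shell
# Hypothetical helper: CPU utilization as a whole-number percentage of the request.
# Arguments: measured usage and requested CPU, both in millicores.
utilization_percent() {
  usage_m=$1
  request_m=$2
  # Without a request there is no denominator, so no percentage can be computed.
  if [ "$request_m" -le 0 ]; then
    echo "no CPU request set" >&2
    return 1
  fi
  echo $(( usage_m * 100 / request_m ))
}

utilization_percent 150 200   # a pod using 150m of a 200m request: prints 75
```

If the pods have no CPU request yet, one could probably be added with a command along the lines of oc set resources dc/hello --requests=cpu=200m, where the 200m value is only an example.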

The recommended way to create a horizontal pod autoscaler resource is to use the oc autoscale command, for example:

oc autoscale dc/hello --min 1 --max 10 --cpu-percent 80

The previous command creates a horizontal pod autoscaler resource that changes the number of replicas of the hello deployment configuration, between 1 and 10, to keep average CPU usage across its pods at or below 80% of the CPU they request.
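
The upstream Kubernetes documentation describes the core scaling rule as desiredReplicas = ceil(currentReplicas * currentUtilization / targetUtilization). A minimal sketch of that arithmetic follows. The function name desired_replicas is hypothetical, and the sketch ignores the tolerance band, pod readiness, and stabilization behavior that the real controller also applies.

```shell
# Hypothetical helper: replica count suggested by the basic HPA ratio.
# Arguments: current replicas, current average utilization (%), target utilization (%).
desired_replicas() {
  current=$1
  utilization=$2
  target=$3
  # Integer ceiling division: (a + b - 1) / b
  echo $(( (current * utilization + target - 1) / target ))
}

desired_replicas 4 120 80   # 4 pods at 120% against an 80% target: prints 6
```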

The oc autoscale command names the horizontal pod autoscaler resource after the deployment or deployment configuration passed as its argument (hello in the previous example).
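
For comparison, the same autoscaler can be expressed declaratively. The following is a sketch rather than the output of oc autoscale: it assumes the cluster serves the autoscaling/v2 API, and the file name hello-hpa.yaml is arbitrary. Older clusters may only accept autoscaling/v1, which uses a different field layout.

```shell
# Write a manifest roughly equivalent to the oc autoscale example.
# Assumption: the cluster serves the autoscaling/v2 API.
cat > hello-hpa.yaml <<'EOF'
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: hello
spec:
  scaleTargetRef:
    apiVersion: apps.openshift.io/v1
    kind: DeploymentConfig
    name: hello
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 80
EOF

# To create the resource on a cluster you are logged in to:
# oc apply -f hello-hpa.yaml
```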

The minimum and maximum values for the horizontal pod autoscaler resource serve to accommodate bursts of load and to avoid overloading the OKD cluster. If the load on the application changes too quickly, a higher minimum keeps spare pods available to cope with sudden bursts of user requests. Conversely, the maximum caps the number of pods, because too many pods can use up all cluster capacity and affect other applications sharing the same OKD cluster.
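
The effect of the two bounds can be pictured as a clamp applied to whatever replica count the load would otherwise call for. The helper below, clamp_replicas, is hypothetical and only illustrates the bounding step with the 1 and 10 limits from the earlier example.

```shell
# Hypothetical helper: keep a desired replica count within [min, max].
# Arguments: desired replicas, minimum, maximum.
clamp_replicas() {
  desired=$1
  min=$2
  max=$3
  if [ "$desired" -lt "$min" ]; then
    echo "$min"
  elif [ "$desired" -gt "$max" ]; then
    echo "$max"
  else
    echo "$desired"
  fi
}

clamp_replicas 14 1 10   # a burst asks for 14 pods, the maximum allows 10: prints 10
clamp_replicas 0 1 10    # idle load never drops below the minimum: prints 1
```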

To get information about horizontal pod autoscaler resources in the current project, use the oc get command:

oc get hpa