Implementing Automatic Scaling

For successful automatic scaling, the following are required:

  • A cluster deployed in full-stack automation, because automatic scaling must interact with a cloud service when adding or removing workers.

  • A ClusterAutoscaler resource, if the infrastructure supports it. The ClusterAutoscaler resource can limit the maximum number of nodes, and can define minimum and maximum values for cores, memory, and GPUs. The enabled: true entry in the scaleDown section of the ClusterAutoscaler resource authorizes the cluster to scale in automatically by removing machines that are no longer needed.

  • At least one MachineAutoscaler resource. Each MachineAutoscaler resource defines a minimum and a maximum number of replicas for a specific machine set.

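The following example shows a MachineAutoscaler resource for a single machine set:

apiVersion: "autoscaling.openshift.io/v1beta1"
kind: "MachineAutoscaler"
metadata:
  name: "scale-automatic"
  namespace: "openshift-machine-api"
spec:
  minReplicas: 1
  maxReplicas: 2
  scaleTargetRef:
    apiVersion: machine.openshift.io/v1beta1
    kind: MachineSet
    name: MACHINE-SET-NAME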

After you create these resources, the cluster automatically adds worker nodes when the existing nodes cannot handle the workload.

The ClusterAutoscaler resource scales out the number of workers when pods fail to schedule on any of the current nodes due to insufficient resources, or when another node is necessary to meet deployment needs.
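
As an illustration of the limits and the scaleDown setting described earlier, a ClusterAutoscaler resource might look like the following minimal sketch. All of the numeric limits and the GPU type are assumed values chosen for the example, not recommendations. Adjust them to the capacity of your infrastructure. The resource is cluster-scoped, and OpenShift expects it to be named default.

```yaml
apiVersion: autoscaling.openshift.io/v1
kind: ClusterAutoscaler
metadata:
  name: default              # the cluster autoscaler resource must use this name
spec:
  resourceLimits:
    maxNodesTotal: 6         # assumed cap on the total number of nodes in the cluster
    cores:
      min: 8                 # assumed minimum and maximum total cores
      max: 64
    memory:
      min: 16                # assumed minimum and maximum total memory, in GiB
      max: 256
    gpus:
      - type: nvidia.com/gpu # assumed GPU type, only relevant if GPU nodes exist
        min: 0
        max: 4
  scaleDown:
    enabled: true            # authorizes automatic removal of unused machines
```

Without at least one MachineAutoscaler resource that targets a machine set, this resource alone does not add or remove any machines.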