Automatically Scaling a Cluster

As discussed in the section Manually Scaling an OKD Cluster, the Machine API provides several resources for managing the workloads of your cluster. You can scale your cluster resources in two ways: manually and automatically.

Manual scaling requires updating the number of replicas in a machine set. Automatic scaling of a cluster involves using two custom resources: MachineAutoscaler and ClusterAutoscaler.

A MachineAutoscaler resource automatically scales the number of replicas in a machine set, depending on the load. This API resource interacts with the machine sets and instructs them to add or remove worker nodes in the cluster. The resource supports the definition of lower and upper boundaries. The ClusterAutoscaler resource enforces limits for the whole cluster, such as the total number of nodes.

For example, MaxNodesTotal sets the maximum number of nodes in the whole cluster, and MaxMemoryTotal sets the maximum memory in the whole cluster.

Each OKD cluster can have only one ClusterAutoscaler resource. The ClusterAutoscaler resource operates at a higher level and defines the maximum number of nodes and other resources, such as cores, memory, and graphics processing units (GPUs). These limits prevent the MachineAutoscaler resources from scaling out in an uncontrolled manner.

The following excerpt describes a ClusterAutoscaler resource:

apiVersion: "autoscaling.openshift.io/v1"
kind: "ClusterAutoscaler"
metadata:
    name: "default"
spec:
    podPriorityThreshold: -10
    resourceLimits:
        maxNodesTotal: 6
    scaleDown:
        enabled: true
        delayAfterAdd: 3m
        unneededTime: 3m

Use a MachineAutoscaler resource to scale the number of machines defined by a MachineSet resource. Scaling works only if you have defined a ClusterAutoscaler resource and adding a new machine does not exceed any of the limits that the ClusterAutoscaler resource defines.

For example, adding one m4.xlarge machine from AWS adds one node, 4 CPU cores, and 16 GB of memory. On its own, a MachineAutoscaler does not scale the cluster in or out unless cluster autoscaling is allowed.
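To make the arithmetic concrete, the following sketch shows a ClusterAutoscaler resource with cluster-wide limits on nodes, cores, and memory. The numeric values are illustrative assumptions, not recommendations, and the memory limits are expressed in GiB. With these values, adding one m4.xlarge machine is allowed only while the cluster stays at or below 6 nodes, 32 cores, and 128 GiB of memory in total.

```yaml
apiVersion: "autoscaling.openshift.io/v1"
kind: "ClusterAutoscaler"
metadata:
  name: "default"
spec:
  resourceLimits:
    maxNodesTotal: 6      # illustrative: at most 6 nodes in the whole cluster
    cores:
      min: 8              # illustrative lower bound on total cores
      max: 32             # illustrative: 6 x m4.xlarge = 24 cores, within the limit
    memory:
      min: 32             # illustrative lower bound, in GiB
      max: 128            # illustrative: 6 x m4.xlarge = 96 GiB, within the limit
```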

The following diagram shows how the MachineAutoscaler resource interacts with machine sets, which scale worker nodes in and out.

  • If the MachineAutoscaler resource must scale, then it checks whether a ClusterAutoscaler resource exists. If it does not, then no scaling occurs.

    If a ClusterAutoscaler resource does exist, then the MachineAutoscaler resource evaluates whether adding the new machine violates any of the limits defined in the ClusterAutoscaler.

  • Provided the request does not exceed the limits, a new machine is created.

  • When the new machine is ready, OKD schedules pods to it as a new node.

Properties such as minReplicas and maxReplicas define the lower and upper boundaries of automatic scaling.
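As an illustration of these properties, a MachineAutoscaler resource might look like the following sketch. The resource name and the target machine set name (worker-us-east-1a) are hypothetical, and the replica boundaries are example values only. The scaleTargetRef section identifies the machine set that the autoscaler manages.

```yaml
apiVersion: "autoscaling.openshift.io/v1beta1"
kind: "MachineAutoscaler"
metadata:
  name: "worker-us-east-1a"            # hypothetical name
  namespace: "openshift-machine-api"
spec:
  minReplicas: 1                       # lower boundary: never scale in below 1 machine
  maxReplicas: 4                       # upper boundary: never scale out above 4 machines
  scaleTargetRef:
    apiVersion: machine.openshift.io/v1beta1
    kind: MachineSet
    name: worker-us-east-1a            # hypothetical machine set to scale
```

Even with maxReplicas set to 4, the machine set grows only as far as the limits in the ClusterAutoscaler resource allow.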