Scheduling and Topology

A common topology for large data centers, such as cloud providers, is to organize hosts into regions and zones:

  • A region is a set of hosts located in close geographic proximity, which provides high-speed connectivity between them.

  • A zone, also called an availability zone, is a set of hosts that might fail together because they share common critical infrastructure components, such as a network switch, a storage array, or an uninterruptible power supply (UPS).

As an example of regions and zones, Amazon Web Services (AWS) has a region in northern Virginia (us-east-1) with six availability zones and another region in Ohio (us-east-2) with three availability zones. Each AWS availability zone can contain multiple data centers, potentially consisting of hundreds of thousands of servers.

The standard configuration of the OKD pod scheduler supports this kind of cluster topology by defining predicates based on the region and zone labels. The predicates are defined in such a way that:

  • Replica pods created from the same deployment (or deployment configuration) are scheduled to run on nodes that have the same value for the region label.
  • Replica pods are scheduled to run on nodes that have different values for the zone label.
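
The two rules above can be illustrated with a small Python sketch. It is a toy model under stated assumptions, not the OKD scheduler's actual code: the Node class, the pick_node and schedule functions, and the node, region, and zone names are all hypothetical. Zone spreading is modeled here as a preference for the zone that holds the fewest replicas, since a strict requirement could not be met once there are more replicas than zones.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    labels: dict                               # e.g. {"region": "r1", "zone": "r1-z1"}
    pods: list = field(default_factory=list)   # app names of the replicas running here


def pick_node(nodes, app):
    """Pick a node for one more replica of `app` (toy model, hypothetical helper)."""
    used = [n for n in nodes if app in n.pods]
    candidates = list(nodes)
    if used:
        # Region rule: stay in the region that already hosts replicas of this app.
        region = used[0].labels["region"]
        candidates = [n for n in candidates if n.labels["region"] == region]
    if not candidates:
        return None
    # Zone rule: prefer the zone that currently holds the fewest replicas of this app.
    per_zone = Counter()
    for n in used:
        per_zone[n.labels["zone"]] += n.pods.count(app)
    # Ties are broken by the least loaded node, then by name for determinism.
    return min(candidates,
               key=lambda n: (per_zone[n.labels["zone"]], len(n.pods), n.name))


def schedule(nodes, app, replicas):
    """Place `replicas` copies of `app` and return the chosen node names in order."""
    placed = []
    for _ in range(replicas):
        node = pick_node(nodes, app)
        if node is None:
            break
        node.pods.append(app)
        placed.append(node.name)
    return placed


demo_nodes = [
    Node("node-a", {"region": "r1", "zone": "r1-z1"}),
    Node("node-b", {"region": "r1", "zone": "r1-z1"}),
    Node("node-c", {"region": "r1", "zone": "r1-z2"}),
    Node("node-d", {"region": "r2", "zone": "r2-z1"}),
]
print(schedule(demo_nodes, "web", 3))  # ['node-a', 'node-c', 'node-b']
```

In this sketch, the first replica lands on node-a, the second goes to node-c because zone r1-z2 has no replica yet, and the third falls back to node-b once both zones in region r1 hold one replica each. The node in region r2 is never chosen, because the existing replicas are all in r1.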

The figure below shows a sample topology that consists of multiple regions, each with multiple zones, and each zone with multiple nodes.