Alerts and AlertManager

An alert is a rule that evaluates to true or false. The rule is often based on cluster observations, such as cluster CPU utilization.

An alert is also associated with a duration. For an alert to trigger, the alert rule must continue to evaluate to true for the defined duration.

You can access cluster alerts from the OKD web console at Monitoring → Alerting. The page displays a brief description of each alert, along with its state and severity. To view the details of an alert, click its name.

An alert has four states:

  • Firing
    The alert rule evaluates to true, and has evaluated to true for longer than the defined alert duration.

  • Pending
    The alert rule evaluates to true, but has not evaluated to true for longer than the defined alert duration.

  • Silenced
    The alert is Firing, but is actively being silenced. Administrators can silence an alert to temporarily deactivate it.

  • Not Firing
    Any alert that is not Firing, Pending, or Silenced is labeled as Not Firing.
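
The relationship between the rule, its duration, and the four states can be illustrated with a minimal Python sketch. This is not how OKD or Prometheus implements alerting; the AlertRule class, the alert_state function, and all parameter names are hypothetical and exist only for this example. The boundary case (the rule has been true for exactly the duration) follows the wording above and is treated as Pending.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AlertRule:
    """Hypothetical model of an alert rule (not an OKD or Prometheus API)."""
    name: str
    duration_seconds: float  # how long the rule must stay true before firing


def alert_state(rule: AlertRule,
                true_since: Optional[float],
                now: float,
                silenced: bool = False) -> str:
    """Classify an alert into one of the four states described above.

    true_since is the timestamp at which the rule started evaluating to
    true without interruption, or None if the rule is currently false.
    """
    if true_since is None:
        # The rule evaluates to false.
        return "Not Firing"
    if now - true_since <= rule.duration_seconds:
        # True, but not yet for longer than the defined duration.
        return "Pending"
    # True for longer than the duration: Firing, unless it is silenced.
    return "Silenced" if silenced else "Firing"


if __name__ == "__main__":
    rule = AlertRule(name="HighCPU", duration_seconds=300)
    print(alert_state(rule, true_since=None, now=1000))                # Not Firing
    print(alert_state(rule, true_since=900, now=1000))                 # Pending
    print(alert_state(rule, true_since=600, now=1000))                 # Firing
    print(alert_state(rule, true_since=600, now=1000, silenced=True))  # Silenced
```

In this sketch, silencing affects only an alert that would otherwise be Firing, which matches the definition of the Silenced state.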

The web console displays alerts in the Firing, Silenced, and Pending states by default. You can toggle the display of each state on or off.

Alert severity filters:

  • Critical
    The condition that triggered the alert could have a critical impact.

  • Warning
    The alert provides a warning notification about something that might require attention to prevent a problem from occurring.

  • Info
    The alert is provided for informational purposes only.

  • None
    The alert has no defined severity.

You can also create custom severity definitions for alerts relating to user-defined projects.
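
As a rough illustration of how the severity filters narrow the alert list, the following Python sketch selects alerts by severity. The dictionary layout, the "name" and "severity" keys, and the filter_by_severity function are assumptions made for this example and do not reflect the web console's internals. An alert without a defined severity is treated as "none".

```python
from typing import Dict, Iterable, List, Set


def filter_by_severity(alerts: Iterable[Dict[str, str]],
                       selected: Set[str]) -> List[Dict[str, str]]:
    """Return the alerts whose severity is in the selected set.

    Alerts that do not define a severity are matched by "none".
    """
    return [a for a in alerts if a.get("severity", "none") in selected]


if __name__ == "__main__":
    # Hypothetical alert names, used only for illustration.
    alerts = [
        {"name": "ExampleNodeDown", "severity": "critical"},
        {"name": "ExampleHighCPU", "severity": "warning"},
        {"name": "ExampleHeartbeat"},  # no defined severity
    ]
    for alert in filter_by_severity(alerts, {"critical", "none"}):
        print(alert["name"])
```

Because the filter is a plain set-membership check, a custom severity value can be selected the same way as a built-in one.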