Grafana is an open-source analytics and visualization project intended to improve the observability of system metrics. OKD integrates Grafana dashboards at Monitoring → Dashboards.
A separate Grafana user interface is accessible using the URL of the grafana route in the openshift-monitoring namespace.
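As a minimal sketch of how that URL could be derived, the following Python helper builds it from the route object. It assumes that you are logged in with the `oc` CLI and that the route exposes the standard `spec.host`, `spec.tls`, and `spec.path` fields. The function names and the sample host are illustrative and do not come from the OKD documentation.

```python
import json
import subprocess


def route_url(route: dict) -> str:
    """Build a browser URL from a Route object (parsed JSON)."""
    spec = route["spec"]
    # A Route with a tls section is served over HTTPS; otherwise plain HTTP.
    scheme = "https" if spec.get("tls") else "http"
    # spec.path is optional, so fall back to an empty string.
    return f"{scheme}://{spec['host']}{spec.get('path', '')}"


def fetch_grafana_route() -> dict:
    """Fetch the grafana route with the oc CLI (requires a logged-in session)."""
    out = subprocess.run(
        ["oc", "get", "route", "grafana", "-n", "openshift-monitoring", "-o", "json"],
        check=True,
        capture_output=True,
        text=True,
    ).stdout
    return json.loads(out)


# Illustrative route object; the host below is a made-up placeholder.
sample = {
    "spec": {
        "host": "grafana-openshift-monitoring.apps.example.com",
        "tls": {"termination": "reencrypt"},
    }
}
print(route_url(sample))
```

On a live cluster, `route_url(fetch_grafana_route())` would return the address to open in a browser, provided the route exists under that name.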
With Grafana, you can create customized dashboards to visualize key cluster metrics. Grafana dashboards frequently refresh to display current summary metrics and graphs.
Grafana graphs are interactive. With Grafana, you can further explore interesting data features and characteristics you observe in a graph. OKD provides several dashboards in Grafana. These dashboards serve as a good starting point for near real-time observability of cluster metrics and health.
After receiving an alert, an administrator might use Grafana dashboards to investigate the problem. This investigation could include checking if a specific node or project has a problem.
Additionally, Grafana dashboards can help identify whether a problem was temporary or appears to be persistent.
Grafana includes the following default dashboards:
This dashboard provides information on etcd instances running in the cluster.
This dashboard provides a high-level view of cluster resources.
Kubernetes/Compute Resources/Namespace (Pods)
This dashboard displays resource usage for pods within a namespace.
Kubernetes/Compute Resources/Namespace (Workloads)
This dashboard filters resource usage first by namespace and then by workload type, such as deployment, daemonset, and statefulset. Grafana displays all workloads of the specified type within the namespace.
Kubernetes/Compute Resources/Node (Pods)
This dashboard shows pod resource usage filtered by node.
This dashboard displays the resource usage for individual pods. Select a namespace and a pod within the namespace.
This dashboard provides resource usage filtered by namespace, workload, and workload type.
This dashboard displays network usage for the cluster. Grafana sorts many of the items so that the namespaces with the highest usage are easy to find.
This dashboard provides detailed information about the prometheus-k8s pods running in the openshift-monitoring namespace.
USE is an acronym for Utilization, Saturation, and Errors. This dashboard displays several graphs that can help identify whether the cluster is overutilized, oversaturated, or experiencing a high number of errors. Because Grafana displays all nodes in the cluster, you might be able to identify a node that is not behaving the same as the other nodes in the cluster.