Path: blob/main/operations/observability/mixins/meta/rules/dashboard.yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  labels:
    prometheus: k8s
    role: alert-rules
  name: dashboard-monitoring-rules
spec:
  groups:
  - name: dashboard
    rules:
    - alert: DashboardHighCPUUsage
      # Reasoning: high rates of CPU consumption should only be temporary.
      expr: avg(rate(container_cpu_usage_seconds_total{container!="POD", pod=~"dashboard-.*"}[5m])) by (cluster) > 0.1
      for: 10m
      labels:
        # Sent to the team's internal channel until we have fine-tuned it.
        severity: warning
        team: webapp
      annotations:
        runbook_url: https://github.com/gitpod-io/runbooks/blob/main/runbooks/WebAppServicesHighCPUUsage.md
        summary: Dashboard has excessive CPU usage.
        description: Dashboard is consuming too much CPU. Please investigate.
        dashboard_url: https://grafana.gitpod.io/d/6581e46e4e5c7ba40a07646395ef7b23/kubernetes-compute-resources-pod?var-cluster={{ $labels.cluster }}&var-namespace=default
    - alert: DashboardPodsAreNotAllInReadyState
      expr: sum(kube_deployment_status_replicas_unavailable{deployment="dashboard"}) > 0
      for: 10m
      labels:
        severity: critical
        team: webapp
        dedicated: included
      annotations:
        runbook_url: https://github.com/gitpod-io/runbooks/blob/main/runbooks/DashboardStuckInPodInitState.md
        summary: Dashboard stuck in PodInitializing state {{ $labels.cluster }}.
        description: Dashboard is stuck in PodInitializing for at least 10 minutes