Path: blob/main/operations/observability/mixins/workspace/rules/satellite/workspaces.yaml
# Copyright (c) 2022 Gitpod GmbH. All rights reserved.
# Licensed under the GNU Affero General Public License (AGPL).
# See License.AGPL.txt in the project root for license information.

apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  labels:
    prometheus: k8s
    role: alert-rules
  name: workspace-monitoring-satellite-rules
spec:
  groups:
  - name: workspace-rules
    rules:
    - record: gitpod_workspace_regular_not_active_percentage_mk2
      expr: |
        sum(gitpod_ws_manager_mk2_workspace_activity_total{active="false"}) by (cluster) / sum(gitpod_ws_manager_mk2_workspace_activity_total) by (cluster)

  - name: workspace-alerts
    rules:
    - alert: GitpodWorkspaceTooManyRegularNotActiveMk2
      labels:
        severity: critical
      for: 10m
      annotations:
        runbook_url: https://github.com/gitpod-io/runbooks/blob/main/runbooks/GitpodWorkspaceRegularNotActive.md
        summary: too many running but inactive workspaces
        description: too many running but inactive workspaces. lower bound is 20 "regular not active" workspaces to reduce the false-positive rate.
      # bumped from 20 to 40 temporarily
      expr: |
        sum(gitpod_workspace_regular_not_active_percentage_mk2) by(cluster) > 0.08
        AND
        sum (gitpod_ws_manager_mk2_workspace_activity_total{active="false"}) by (cluster) > 40

    - alert: GitpodWorkspacesNotStartingMk2
      labels:
        severity: critical
      for: 10m
      annotations:
        runbook_url: https://github.com/gitpod-io/runbooks/blob/main/runbooks/GitpodWorkspaceRegularNotActive.md
        summary: workspaces are not starting.
        description: inactive regular workspaces exists but workspaces are not being started.
      expr: |
        sum by(cluster) (avg_over_time(gitpod_workspace_regular_not_active_percentage_mk2[1m]) > 0)
        AND
        sum by(cluster) (rate(gitpod_ws_manager_mk2_workspace_startup_seconds_sum{type="Regular"}[1m])) == 0
    - alert: GitpodWsManagerMk2BackupFailureError
      labels:
        severity: error
        team: engine
      annotations:
        runbook_url: https://github.com/gitpod-io/runbooks/blob/main/runbooks/WorkspaceBackupFailures.md
        summary: Workspace backups failed recently in cluster {{ $labels.cluster }}
        description: This can happen when a single node has failed in the cloud provider
      expr: |
        sum by (cluster) (increase(gitpod_ws_manager_mk2_workspace_backups_failure_total{cluster!~"ephemeral.*"}[1h])) > 0
        AND
        sum by (cluster) (increase(gitpod_ws_manager_mk2_workspace_backups_failure_total{cluster!~"ephemeral.*"}[1h])) < 16
    - alert: GitpodWsManagerMk2BackupFailureCritical
      labels:
        severity: critical
        team: engine
      annotations:
        runbook_url: https://github.com/gitpod-io/runbooks/blob/main/runbooks/WorkspaceBackupFailures.md
        summary: Workspace backups failed recently in cluster {{ $labels.cluster }}
        description: This can be an indicator of two or more nodes failing in a cloud provider
      expr: |
        sum by (cluster) (increase(gitpod_ws_manager_mk2_workspace_backups_failure_total{cluster!~"ephemeral.*"}[1h])) >= 16