Commit ead64b8

Nils and cswatt authored
cluster alert guide (DataDog#12225)
* cluster alert guide
* add images
* add entry in the index
* add file for result
* add image for the result on the monitor status page
* Apply suggestions from code review

Co-authored-by: cswatt <cecilia.watt@datadoghq.com>
1 parent 1f5c63e commit ead64b8

5 files changed: +40 -0 lines changed

content/en/monitors/guide/_index.md

Lines changed: 1 addition & 0 deletions
@@ -12,5 +12,6 @@ disable_sidebar: true
1212
{{< nextlink href="monitors/guide/monitor-for-value-within-a-range" >}}Monitoring Ranges{{< /nextlink >}}
1313
{{< nextlink href="monitors/guide/suppress-alert-with-downtimes" >}}Suppress Alerts with Downtimes{{< /nextlink >}}
1414
{{< nextlink href="monitors/guide/alert-on-no-change-in-value" >}}Alert on No Change in Value{{< /nextlink >}}
15+
{{< nextlink href="monitors/guide/create-cluster-alert" >}}Create Cluster Alerts for Metric Monitors{{< /nextlink >}}
1516
{{< nextlink href="monitors/guide/slo-checklist" >}}SLO Checklist{{< /nextlink >}}
1617
{{< /whatsnext >}}
Lines changed: 39 additions & 0 deletions
@@ -0,0 +1,39 @@
1+
---
2+
title: Create cluster alerts to notify when a percentage of groups are in critical state
3+
kind: guide
4+
further_reading:
5+
- link: "/monitors/create/types/"
6+
tag: "Documentation"
7+
text: "Learn how to create a monitor"
8+
- link: "/monitors/notify/"
9+
tag: "Documentation"
10+
text: "Configure your monitor notifications"
11+
---
12+
13+
## Overview
14+
15+
This guide shows how to create alerts that notify only when a given percentage of groups meet the condition, rather than once for each group that meets it.
16+
This is helpful, for example, if you want a monitor that alerts only when a given percentage of hosts or containers reach a critical state.
17+
18+
### Example: Alert for a percentage of hosts with high CPU usage
19+
20+
In this example, you want to receive a notification when 40 percent of hosts have CPU usage above 50 percent. Use the `min_cutoff` and `count_nonzero` functions:
21+
22+
* Use the `min_cutoff` function to count the number of hosts that have CPU usage above 50 percent.
23+
* Use the `count_nonzero` function to count the total number of hosts.
24+
* Divide the first count by the second to get the percentage of hosts with CPU usage above 50 percent.
25+
26+
{{< img src="monitors/faq/cluster-condition.png" alt="cluster-alert-condition" >}}
27+
28+
* Then, set the condition to alert if the percentage of hosts in that condition reaches 40 percent.
29+
30+
{{< img src="monitors/faq/cluster-trigger.png" alt="cluster-alert-trigger" >}}
31+
32+
This monitor tracks the percentage of hosts that have CPU usage above 50 percent within the last ten minutes and generates a notification if more than 40 percent of hosts meet the specified condition.
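
The arithmetic behind this cluster condition can be sketched outside the monitor editor. The Python below is a minimal illustration, not the monitor's implementation: the function names, the sample host values, and the choice of strict `>` comparisons are all assumptions made for the example. It counts the hosts above the CPU cutoff, counts the hosts reporting a nonzero value, and divides one by the other.

```python
from typing import Mapping


def percent_hosts_above(cpu_by_host: Mapping[str, float], cutoff: float) -> float:
    """Return the percentage of reporting hosts whose CPU usage exceeds `cutoff`.

    Hypothetical helper for illustration; it only mirrors the arithmetic
    described in the guide.
    """
    # Denominator: hosts reporting a nonzero value (the role of the total count).
    reporting = [value for value in cpu_by_host.values() if value != 0]
    if not reporting:
        # No reporting hosts: treat the percentage as 0 (an assumption).
        return 0.0
    # Numerator: hosts whose value survives the cutoff.
    above = [value for value in reporting if value > cutoff]
    return 100.0 * len(above) / len(reporting)


def should_alert(
    cpu_by_host: Mapping[str, float],
    cutoff: float = 50.0,
    cluster_threshold: float = 40.0,
) -> bool:
    """Alert only when more than `cluster_threshold` percent of hosts exceed `cutoff`."""
    return percent_hosts_above(cpu_by_host, cutoff) > cluster_threshold


# Made-up sample: three of five hosts are above 50 percent CPU.
sample = {"host-a": 72.0, "host-b": 35.0, "host-c": 55.0, "host-d": 61.0, "host-e": 48.0}
print(percent_hosts_above(sample, 50.0))
print(should_alert(sample))
```

A single busy host in a large cluster leaves the percentage low and does not trigger the alert, which is the point of alerting on the ratio rather than on each group.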
33+
34+
{{< img src="monitors/faq/cluster-status.png" alt="cluster-alert-status" >}}
35+
36+
{{< partial name="whats-next/whats-next.html" >}}
37+
38+
[1]: /api/
39+
[2]: /monitors/create/types/#define-the-conditions