Learning Grafana (9): Alerting - Labels and annotations

Published: 2023-11-22 16:46:31  Author: 钱塘江畔

1. Introduction

Labels and annotations contain information about an alert. Both have the same structure, a set of named values, but their intended uses are different. An example of a label, or the equivalent annotation, might be alertname="test".

The main difference between the two is that labels are used to differentiate an alert from all other alerts, while annotations are used to add information to an existing alert.

For example, consider two high CPU alerts: one for server1 and another for server2. Here we might have a label called server, where the first alert has the label server="server1" and the second alert has the label server="server2". We might also want to add a description to each alert, such as "The CPU usage for server1 is above 75%.", where server1 and 75% are replaced with the name and CPU usage of the server (see the documentation on templating labels and annotations for how to do this). This kind of description is better suited to an annotation.

Labels
Labels contain information that identifies an alert. An example of a label might be server="server1". Each alert can have more than one label, and the complete set of labels for an alert is called its label set. It is this label set that identifies the alert.

For example, one alert might have the label set {alertname="High CPU usage",server="server1"} while another has the label set {alertname="High CPU usage",server="server2"}. These are two separate alerts because, although their alertname labels are the same, their server labels are different.

The label set for an alert is a combination of the labels from the datasource, custom labels from the alert rule, and a number of reserved labels such as alertname.

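
To make the idea of identity by label set concrete, here is a minimal Python sketch. The Alert class and its identity() method are hypothetical names used only for illustration; they are not part of Grafana. The sketch assumes that two alerts are the same alert exactly when their label sets are equal, and that annotations play no part in that comparison.

```python
from dataclasses import dataclass, field


@dataclass
class Alert:
    """Hypothetical model of an alert, for illustration only."""
    labels: dict                                      # identifies the alert
    annotations: dict = field(default_factory=dict)   # extra information only

    def identity(self) -> frozenset:
        """Return a hashable key built from the label set alone."""
        return frozenset(self.labels.items())


a = Alert({"alertname": "High CPU usage", "server": "server1"},
          {"description": "The CPU usage for server1 is above 75%."})
b = Alert({"alertname": "High CPU usage", "server": "server2"},
          {"description": "The CPU usage for server2 is above 75%."})
c = Alert({"alertname": "High CPU usage", "server": "server1"},
          {"description": "The CPU usage for server1 is above 90%."})

print(a.identity() == b.identity())  # False: the server labels differ
print(a.identity() == c.identity())  # True: only the annotation differs
```

Alerts a and c count as the same alert here because only the annotation text differs, which is why a changing value such as the CPU percentage belongs in an annotation rather than a label.
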
Custom Labels
Custom labels are additional labels from the alert rule. Like annotations, custom labels must have a name, and their value can contain a combination of text and template code that is evaluated when an alert is fired. How to template custom labels is covered in section 4.

When using custom labels with templates, make sure that the label value does not change between consecutive evaluations of the alert rule, as this will end up creating large numbers of distinct alerts. It is fine, however, for the template to produce different label values for different alerts. For example, do not put the value of the query in a custom label, as this will create a new set of alerts each time the value changes. Use annotations instead.

It is also important to make sure that the label set for an alert does not have two or more labels with the same name. If a custom label has the same name as a label from the datasource, it replaces that label. If a custom label has the same name as a reserved label, the custom label is omitted from the alert.

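
The collision rules above can be sketched in a few lines of Python. The function build_label_set and its arguments are hypothetical and only mirror the rules as stated in the text (a custom label replaces a datasource label of the same name, and is dropped when it collides with a reserved label); this is not Grafana's actual implementation.

```python
def build_label_set(datasource_labels: dict,
                    custom_labels: dict,
                    reserved_labels: dict) -> dict:
    """Combine the three label sources following the rules in the text."""
    labels = dict(datasource_labels)
    for name, value in custom_labels.items():
        if name in reserved_labels:
            continue              # collides with a reserved label: omitted
        labels[name] = value      # replaces a datasource label of the same name
    labels.update(reserved_labels)
    return labels


merged = build_label_set(
    datasource_labels={"server": "server1", "env": "prod"},
    custom_labels={"env": "staging", "team": "ops", "alertname": "mine"},
    reserved_labels={"alertname": "High CPU usage"},
)
print(merged)
```

In this sketch the custom env label replaces the datasource one, team is added, and the custom alertname is dropped in favour of the reserved label.
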
Annotations
Annotations are named pairs that add information to existing alerts. Grafana suggests a number of annotations, such as description, summary, runbook_url, dashboardUId and panelId. Like custom labels, annotations must have a name, and their value can contain a combination of text and template code. If an annotation contains template code, the template is evaluated once, when the alert is fired. It is not re-evaluated, even when the alert is resolved. How to template annotations is covered in section 4.

2. Label matchers

Use labels and label matchers to link alert rules to notification policies and silences. This gives you a flexible way to manage your alert instances, specify which policy should handle them, and choose which alerts to silence.

A label matcher consists of 3 distinct parts: the label, the value and the operator.

  • The Label field is the name of the label to match. It must exactly match the label name.
  • The Value field matches against the corresponding value for the specified Label name. How it matches depends on the Operator value.
  • The Operator field is the operator to match against the label value. The available operators are:

Operator  Description
=         Select labels that are exactly equal to the value.
!=        Select labels that are not equal to the value.
=~        Select labels that regex-match the value.
!~        Select labels that do not regex-match the value.

If you use multiple label matchers, they are combined with the logical AND operator. This means that all matchers must match in order to link a rule to a policy.

Example scenario
If you define the following set of labels for your alert:

{ foo=bar, baz=qux, id=12 }

then:

  • A label matcher defined as foo=bar matches this alert rule.
  • A label matcher defined as foo!=bar does not match this alert rule.
  • A label matcher defined as id=~[0-9]+ matches this alert rule.
  • A label matcher defined as baz!~[0-9]+ matches this alert rule.
  • Two label matchers defined as foo=bar and id=~[0-9]+ match this alert rule.
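
The scenario above can be reproduced with a small Python sketch. The Matcher class and the all_match helper are hypothetical names for illustration. The sketch makes two assumptions that the text does not spell out: regex matchers must match the whole label value (hence re.fullmatch), and a missing label is treated as an empty string.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Matcher:
    """Hypothetical label matcher: label name, operator, value."""
    label: str
    operator: str   # one of "=", "!=", "=~", "!~"
    value: str

    def matches(self, labels: dict) -> bool:
        actual = labels.get(self.label, "")   # assumption: missing label == ""
        if self.operator == "=":
            return actual == self.value
        if self.operator == "!=":
            return actual != self.value
        if self.operator == "=~":
            return re.fullmatch(self.value, actual) is not None
        if self.operator == "!~":
            return re.fullmatch(self.value, actual) is None
        raise ValueError(f"unknown operator: {self.operator!r}")


def all_match(matchers, labels: dict) -> bool:
    """Multiple matchers are combined with logical AND."""
    return all(m.matches(labels) for m in matchers)


labels = {"foo": "bar", "baz": "qux", "id": "12"}

print(Matcher("foo", "=", "bar").matches(labels))      # True
print(Matcher("foo", "!=", "bar").matches(labels))     # False
print(Matcher("id", "=~", "[0-9]+").matches(labels))   # True
print(Matcher("baz", "!~", "[0-9]+").matches(labels))  # True
print(all_match([Matcher("foo", "=", "bar"),
                 Matcher("id", "=~", "[0-9]+")], labels))  # True
```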

3. Labels in Grafana Alerting

This topic explains why labels are a fundamental component of alerting.

  • The complete set of labels for an alert is what uniquely identifies it within Grafana Alerting.
  • The Alertmanager uses labels to match alerts for silences and alert groups in notification policies.
  • The alerting UI shows labels for every alert instance generated during evaluation of a rule.
  • Contact points can access labels to dynamically generate notifications that contain information specific to the alert that triggered the notification.
  • You can add labels to an alerting rule. These labels can be configured manually, can use template functions, and can reference other labels. Labels added to an alerting rule take precedence when labels collide (except in the case of Grafana reserved labels).

External Alertmanager Compatibility
Grafana's built-in Alertmanager supports Unicode in both label keys and values. If you are using an external Prometheus Alertmanager, label keys must be compatible with the Prometheus data model. This means that label keys may only contain ASCII letters, digits and underscores, and must match the regex [a-zA-Z_][a-zA-Z0-9_]*. Before labels are sent to the external Alertmanager, the Grafana alerting engine removes or replaces any invalid characters according to the following rules:

  • Whitespace will be removed.
  • Invalid ASCII characters will be replaced with _.
  • All other (non-ASCII) characters will be replaced with their lower-case hex representation. If this is the first character, it will be prefixed with _.
    Example: A label key/value pair Alert! 🔔="🔔" will become Alert_0x1f514="🔔".
    Note: If multiple label keys are sanitized to the same value, the duplicates will have a short hash of the original label appended as a suffix.
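
The three rules can be sketched in Python as follows. The function sanitize_label_key is a hypothetical helper written from the rules listed above, not Grafana's actual code. It does not implement the hash suffix for duplicate keys, and it does not handle a key that starts with a digit, since the rules above do not say how that case is treated.

```python
import re

# Characters allowed anywhere in a Prometheus label key.
_VALID_ASCII = re.compile(r"[a-zA-Z0-9_]")


def sanitize_label_key(key: str) -> str:
    """Apply the three sanitization rules from the text to a label key."""
    out = []
    for ch in key:
        if ch.isspace():
            continue                      # rule 1: whitespace is removed
        if ch.isascii():
            # rule 2: invalid ASCII characters become "_"
            out.append(ch if _VALID_ASCII.fullmatch(ch) else "_")
        else:
            # rule 3: other characters become lower-case hex, e.g. 0x1f514
            hex_repr = f"{ord(ch):#x}"
            if not out:
                hex_repr = "_" + hex_repr  # prefix when it is the first character
            out.append(hex_repr)
    return "".join(out)


print(sanitize_label_key("Alert! \U0001F514"))  # Alert_0x1f514
print(sanitize_label_key("\U0001F514 bell"))    # _0x1f514bell
```

The first call reproduces the example from the text: "!" becomes "_", the space is removed, and the bell character (U+1F514) becomes 0x1f514.
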
Grafana reserved labels
Note: Labels prefixed with grafana_ are reserved by Grafana for special use. If you manually add a label that begins with grafana_, it may be overwritten in case of a collision. To stop the Grafana Alerting engine from adding a reserved label, you can disable it via the `disabled_labels` option in the [unified_alerting.reserved_labels] configuration section.
Grafana reserved labels can be used in the same way as manually configured labels. The current list of available reserved labels is given in the Grafana documentation.

4. Templating labels and annotations

You can use templates to include data from queries and expressions in labels and annotations. For example, you might want to set the severity label for an alert based on the value of the query, or use the instance label from the query in a summary annotation so you know which server is experiencing high CPU usage.

All templates should be written in Go's text/template language. Whether you are templating a label or an annotation, write each template inline, inside the label or annotation itself. This means you cannot share templates between labels and annotations; you need to copy a template to every place you want to use it.