Grafana Study Notes (5): Introduction to histograms and heatmaps

Published: 2023-11-22 12:46:50 | Author: 钱塘江畔
A histogram is a graphical representation of the distribution of numerical data. It groups values into buckets (sometimes also called bins) and then counts how many values fall into each bucket.
Instead of graphing the actual values, histograms graph the buckets. Each bar represents a bucket, and the bar height represents the frequency (such as count) of values that fell into that bucket’s interval.
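
The bucketing step described above can be sketched in a few lines of Python. This is only an illustration of the idea, not how Grafana implements it; the function name `bucket_counts`, the bucket width of 20, and the sample values are all assumptions made up for this example.

```python
from collections import Counter


def bucket_counts(values, bucket_size):
    """Count how many values fall into each fixed-width bucket.

    Returns a dict mapping each bucket's lower bound to its count.
    A bucket covers the interval [lower, lower + bucket_size).
    """
    counts = Counter()
    for v in values:
        # Round the value down to the start of its bucket.
        lower = int(v // bucket_size) * bucket_size
        counts[lower] += 1
    return dict(sorted(counts.items()))


# Hypothetical sample values, chosen only for illustration.
samples = [243, 251, 262, 265, 268, 271, 274, 279, 283, 296]
hist = bucket_counts(samples, bucket_size=20)

# Draw one text "bar" per bucket; the bar length is the frequency.
for lower, count in hist.items():
    print(f"{lower}-{lower + 20}: {'#' * count}")
```

Note that the individual values never appear in the output: only the buckets and their counts are drawn.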

Histogram example
This histogram shows the value distribution of a couple of time series. You can easily see that most values land between 240 and 300, with a peak between 260 and 280.
For more information about histogram visualization options, refer to the Histogram visualization documentation.
Histograms only look at value distributions over a specific time range. The problem with histograms is that you cannot see any trends or changes in the distribution over time. This is where heatmaps become useful.

Heatmaps
A heatmap is like a histogram over time, where each time slice represents its own histogram. Instead of using bar height to represent frequency, it uses cells, and colors each cell in proportion to the number of values in the bucket.
In this example, you can clearly see what values are more common and how they trend over time.
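
To make the "one histogram per time slice" idea concrete, here is a minimal Python sketch. It is an assumption-laden illustration rather than Grafana's implementation: `heatmap_cells`, the 60-second slice width, the bucket width of 20, and the sample points are all hypothetical.

```python
from collections import defaultdict


def heatmap_cells(points, time_step, bucket_size):
    """Group (timestamp, value) points into a grid of counts.

    Each key is (slice_start, bucket_lower). All cells that share a
    slice_start form the histogram for that time slice.
    """
    cells = defaultdict(int)
    for ts, value in points:
        slice_start = int(ts // time_step) * time_step
        bucket_lower = int(value // bucket_size) * bucket_size
        cells[(slice_start, bucket_lower)] += 1
    return dict(cells)


# Hypothetical (timestamp in seconds, value) points.
points = [(0, 245), (10, 262), (20, 268),
          (65, 271), (70, 275), (80, 281), (110, 301)]
cells = heatmap_cells(points, time_step=60, bucket_size=20)

# A renderer would color each cell by its count; here we just list them.
for (slice_start, bucket_lower), count in sorted(cells.items()):
    print(f"t={slice_start:>3}s  bucket {bucket_lower}-{bucket_lower + 20}: {count}")
```

Reading the cells column by column shows how the distribution shifts from one time slice to the next, which a single histogram cannot show.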

For more information about heatmap visualization options, refer to the Heatmap visualization documentation.

Pre-bucketed data
A number of data sources support histograms over time, such as Elasticsearch (by using a Histogram bucket aggregation) or Prometheus (with the histogram metric type and the Format as option set to Heatmap). In general, though, any data source can be used as long as it meets one requirement: it either returns series whose names represent the bucket bounds, or returns series sorted by bound in ascending order.
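
As a rough illustration of the ordering requirement, the sketch below sorts pre-bucketed series by the numeric bound encoded in each series name. The series names and counts are hypothetical, `sort_by_bound` is a made-up helper, and the `+Inf` name follows the Prometheus convention for the last bucket; nothing here is a Grafana API.

```python
def sort_by_bound(series):
    """Return the series ordered by the numeric bound in their names.

    Names are parsed with float(), which also accepts "+Inf".
    A plain string sort would be wrong: "10" sorts before "5".
    """
    return dict(sorted(series.items(), key=lambda item: float(item[0])))


# Hypothetical pre-bucketed series: name = bucket bound,
# values = counts for three consecutive time slices.
series = {
    "10": [4, 6, 5],
    "+Inf": [0, 1, 0],
    "0.5": [12, 9, 11],
    "5": [7, 8, 6],
}

ordered = sort_by_bound(series)
for name, counts in ordered.items():
    print(f"bound {name:>4}: {counts}")
```

Sorting numerically rather than alphabetically matters here, because the heatmap rows are only meaningful when the buckets are stacked in ascending order.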

Raw data vs aggregated
If you use the heatmap with regular time series data (not pre-bucketed), then it’s important to keep in mind that your data is often already aggregated by your time series backend. Most time series queries do not return raw sample data, but instead include a group by time interval or maxDataPoints limit coupled with an aggregation function (usually average).
This all depends on the time range of your query of course. But the important point is to know that the histogram bucketing that Grafana performs might be done on already aggregated and averaged data. To get more accurate heatmaps, it is better to do the bucketing during metric collection, or to store the data in Elasticsearch or any other data source which supports doing histogram bucketing on the raw data.
If you remove or lower the group by time (or raise maxDataPoints) in your query to return more data points, your heatmap will be more accurate. However, this can also tax your browser's CPU and memory heavily, possibly causing hangs or crashes if the number of data points becomes unreasonably large.
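
The effect of bucketing already-averaged data can be seen in a small Python experiment. This is a deliberately extreme, hypothetical example: the sample values, the group size of 2, and the helper names `bucket_counts` and `average_groups` are assumptions for illustration, not something a particular backend does.

```python
from collections import Counter


def bucket_counts(values, bucket_size):
    """Map each bucket's lower bound to the number of values in it."""
    counts = Counter(int(v // bucket_size) * bucket_size for v in values)
    return dict(sorted(counts.items()))


def average_groups(values, group_size):
    """Mimic a 'group by time' with an average aggregation."""
    return [
        sum(values[i:i + group_size]) / len(values[i:i + group_size])
        for i in range(0, len(values), group_size)
    ]


# Hypothetical raw samples that swing between low and high values.
raw = [10, 90, 20, 80, 30, 70, 40, 60]
averaged = average_groups(raw, group_size=2)

raw_hist = bucket_counts(raw, bucket_size=20)
avg_hist = bucket_counts(averaged, bucket_size=20)

print("raw:     ", raw_hist)
print("averaged:", avg_hist)
```

In this example the raw samples spread across five buckets, while the averaged series collapses into a single bucket, so a heatmap built from the averaged data would hide the real spread of values.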
