Prometheus - Introduction

Published: 2023-09-26 16:46:48 | Author: ZhangZhihuiAAA

Prometheus is primarily a pull-based system. The application only needs to expose an HTTP endpoint that serves its metrics in a format Prometheus understands.
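As a rough illustration of what "exposing an endpoint" can look like, the sketch below renders a counter in the Prometheus text exposition format and serves it at `/metrics` using only the Python standard library. The metric name `app_requests_total`, the `path` label, the `REQUEST_COUNTS` dictionary, and port 8000 are all assumptions made up for this example. A real application would more likely use an official client library than hand-roll the format.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical in-memory counters; a real app would update these per request.
REQUEST_COUNTS = {"/": 0, "/orders": 0}


def render_metrics(request_counts):
    """Render per-path request counts in the Prometheus text exposition format."""
    lines = [
        "# HELP app_requests_total Total requests handled, by path.",
        "# TYPE app_requests_total counter",
    ]
    # Note: label values containing quotes or backslashes would need escaping;
    # this sketch assumes plain paths.
    for path, count in sorted(request_counts.items()):
        lines.append(f'app_requests_total{{path="{path}"}} {count}')
    return "\n".join(lines) + "\n"


class MetricsHandler(BaseHTTPRequestHandler):
    """Serves the rendered metrics at /metrics and 404s everywhere else."""

    def do_GET(self):
        if self.path != "/metrics":
            self.send_error(404)
            return
        body = render_metrics(REQUEST_COUNTS).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=8000):
    """Block forever, answering scrape requests on the given (assumed) port."""
    HTTPServer(("0.0.0.0", port), MetricsHandler).serve_forever()
```

Calling `serve()` starts the endpoint. The application never pushes anything; it just waits for Prometheus to request `/metrics`.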

We would then install Prometheus on a separate server. It relies on service discovery mechanisms to find our application, then accesses the exposed endpoint to retrieve the application's metrics.

The Prometheus server must be configured so that it can discover and access all of these endpoints, then gather and aggregate their metrics accordingly.
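To make the pull model concrete, here is a minimal sketch of what the scraping side does conceptually: visit each discovered target, fetch its metrics endpoint, and collect the samples. This is not how Prometheus itself is implemented. The function names, the `host:port` target format, and the `/metrics` path are assumptions for illustration, and the parser assumes sample lines carry no trailing timestamp.

```python
from urllib.request import urlopen


def parse_samples(text):
    """Parse exposition-format text into {series: value}, skipping comments."""
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # blank lines and HELP/TYPE comments carry no samples
        # Split on the last space so label values containing spaces survive.
        # Assumption: no optional timestamp follows the value.
        series, _, value = line.rpartition(" ")
        samples[series] = float(value)
    return samples


def scrape_all(targets, timeout=5.0):
    """Pull metrics from every target; `targets` stands in for service discovery."""
    results = {}
    for target in targets:
        with urlopen(f"http://{target}/metrics", timeout=timeout) as resp:
            results[target] = parse_samples(resp.read().decode("utf-8"))
    return results
```

In a real deployment the target list is not hard-coded. It comes from whichever service discovery mechanism the Prometheus server is configured to use.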

To make it easier to decide which metrics would be useful to collect in our case, we can follow the RED framework. RED stands for Rate, Errors, and Duration. Rate refers to how frequently the operation being measured is invoked. Errors refers to how frequently the operation results in an error. Duration refers to how long the operation takes to execute and respond, whether it succeeds or fails.
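The three RED measurements can be sketched as a small calculation over the requests observed in a time window. The function name, the `(duration_seconds, failed)` tuple layout, and the choice of the mean for duration are assumptions for this example. In practice, durations are usually tracked as histograms so that percentiles can be derived.

```python
def red_summary(requests, window_seconds):
    """Summarise requests seen in a window as Rate, Errors, and Duration.

    `requests` is a list of (duration_seconds, failed) tuples, a hypothetical
    record layout chosen for this sketch.
    """
    total = len(requests)
    errors = sum(1 for _, failed in requests if failed)
    total_duration = sum(duration for duration, _ in requests)
    return {
        # Rate: how often the operation was invoked.
        "rate_per_second": total / window_seconds,
        # Errors: how often it failed, as a rate and as a share of all calls.
        "errors_per_second": errors / window_seconds,
        "error_ratio": errors / total if total else 0.0,
        # Duration: how long calls took, successful or not.
        "mean_duration_seconds": total_duration / total if total else 0.0,
    }
```

For example, four requests observed over a two-second window, one of which failed, give a rate of 2 requests per second and an error ratio of 0.25.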

An alternative framework we can follow to come up with useful metrics is the USE framework. USE stands for Utilization, Saturation, and Errors. Utilization refers to the average proportion of time that the resource being measured is busy. Saturation refers to how much extra work is queued up waiting for that resource. Errors is simply a count of the error events that occur.
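The USE measurements can be sketched the same way for a single resource, such as a worker pool or a disk. The function name and its inputs (busy time, sampled queue lengths, and an error count) are assumptions for illustration. How these values are actually collected depends on the resource.

```python
def use_summary(busy_seconds, window_seconds, queue_lengths, error_count):
    """Summarise one resource over a window as Utilization, Saturation, Errors."""
    return {
        # Utilization: fraction of the window the resource spent busy.
        "utilization": busy_seconds / window_seconds,
        # Saturation: average amount of work waiting in the queue,
        # taken from periodic samples of the queue length.
        "saturation": sum(queue_lengths) / len(queue_lengths) if queue_lengths else 0.0,
        # Errors: plain count of error events in the window.
        "errors": error_count,
    }
```

A resource that was busy for 45 of 60 seconds has a utilization of 0.75. Sampled queue lengths of 0, 2, and 4 give an average saturation of 2 queued items.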