Introduction to OpenTelemetry

Published 2023-04-04 15:49:13, author: fxjwind

Official site: https://opentelemetry.io/docs/instrumentation/java/getting-started/

Notes on the Dapper paper: https://www.cnblogs.com/fxjwind/p/16589224.html

 

OpenTelemetry (OT below) has two main parts: instrumentation and the collector.

 

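The collector handles data processing and export, and is fairly simple.

OT does not prescribe a single backend behind the exporters; a typical setup is:

tracing goes to Jaeger

metrics go to Prometheus

logs go to a log store; OT's own support for logs is relatively weak

The Collector is configured through receivers, processors, and exporters, which is a fairly common pattern.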
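
As a rough mental model of the receivers, processors, exporters pipeline, the sketch below wires the three stages together in plain Java. It is not Collector code and not its configuration format: PipelineSketch, the dropEmpty and tagWithSource processors, and the record strings are all hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Function;
import java.util.function.Supplier;

// Hypothetical in-memory model of one collector pipeline; not Collector code.
final class PipelineSketch {
    private final Supplier<List<String>> receiver;                        // where data comes in
    private final List<Function<List<String>, List<String>>> processors; // applied in order
    private final Consumer<List<String>> exporter;                        // where data goes out

    PipelineSketch(Supplier<List<String>> receiver,
                   List<Function<List<String>, List<String>>> processors,
                   Consumer<List<String>> exporter) {
        this.receiver = receiver;
        this.processors = processors;
        this.exporter = exporter;
    }

    // Pull one batch from the receiver, run every processor, hand the result to the exporter.
    void runOnce() {
        List<String> batch = receiver.get();
        for (Function<List<String>, List<String>> processor : processors) {
            batch = processor.apply(batch);
        }
        exporter.accept(batch);
    }

    // Hypothetical processor: drops empty records.
    static List<String> dropEmpty(List<String> in) {
        List<String> out = new ArrayList<>();
        for (String record : in) {
            if (!record.isEmpty()) out.add(record);
        }
        return out;
    }

    // Hypothetical processor: prefixes every record with a made-up source tag.
    static List<String> tagWithSource(List<String> in) {
        List<String> out = new ArrayList<>();
        for (String record : in) out.add("demo:" + record);
        return out;
    }

    // Wires a fixed receiver, the two processors, and a list-backed exporter.
    static List<String> demo() {
        List<String> exported = new ArrayList<>();
        List<Function<List<String>, List<String>>> processors = new ArrayList<>();
        processors.add(PipelineSketch::dropEmpty);
        processors.add(PipelineSketch::tagWithSource);
        new PipelineSketch(() -> Arrays.asList("span-a", "", "span-b"), processors, exported::addAll)
                .runOnce();
        return exported;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

The point of the shape is that each stage can be swapped independently: changing the backend only touches the exporter, which matches the observation that the backend side is not unified.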

 

 

So the core of OT is Instrumentation.

Capability decreases in the order traces, metrics, logs.

For mainstream languages, traces and metrics are basically covered.

 

 

Instrumentation can be done automatically or manually.

Java

Java supports automatic instrumentation.

Automatic instrumentation with Java uses a Java agent JAR that can be attached to any Java 8+ application. It dynamically injects bytecode to capture telemetry from many popular libraries and frameworks. It can be used to capture telemetry data at the “edges” of an app or service, such as inbound requests, outbound HTTP calls, database calls, and so on. 

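When classes are loaded, the Java agent matches classes belonging to supported libraries and frameworks and injects the instrumentation logic directly by rewriting their bytecode.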
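
To make the class-load hook concrete, here is a minimal sketch built on the JDK's java.lang.instrument API, the general mechanism behind Java agents. It is not the OpenTelemetry agent's code: AgentSketch, MatchingTransformer, and the com/example/http/ prefix are hypothetical, and the sketch only records which classes it would rewrite instead of rewriting any bytecode.

```java
import java.lang.instrument.ClassFileTransformer;
import java.lang.instrument.Instrumentation;
import java.security.ProtectionDomain;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical agent: observes class loading and picks out classes it would instrument.
final class AgentSketch {
    // Hypothetical package prefix standing in for "a supported library" (internal form, with slashes).
    static final String TARGET_PREFIX = "com/example/http/";

    static final class MatchingTransformer implements ClassFileTransformer {
        // Names of classes that matched; a real agent would rewrite them instead.
        final List<String> matched = new CopyOnWriteArrayList<>();

        @Override
        public byte[] transform(ClassLoader loader, String className,
                                Class<?> classBeingRedefined,
                                ProtectionDomain protectionDomain,
                                byte[] classfileBuffer) {
            if (className != null && className.startsWith(TARGET_PREFIX)) {
                // This is the point where rewritten bytecode would be returned.
                matched.add(className);
            }
            return null; // null tells the JVM to keep the original bytecode
        }
    }

    // Called by the JVM before the application's main() when started with -javaagent.
    public static void premain(String agentArgs, Instrumentation inst) {
        inst.addTransformer(new MatchingTransformer());
    }
}
```

For the JVM to call premain, the class has to be packaged in a JAR whose manifest names it in Premain-Class, and that JAR has to be passed with -javaagent.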

 

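Usage is very simple: add the -javaagent startup parameter when launching the Java process, for example java -javaagent:path/to/opentelemetry-javaagent.jar -jar myapp.jar (both paths are placeholders).

For the currently supported libraries and frameworks, see the following link: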

https://github.com/open-telemetry/opentelemetry-java-instrumentation/blob/main/docs/supported-libraries.md

 

You can also manually instrument modules you develop yourself.

Tracing

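The main operation is creating a Span.

import io.opentelemetry.api.trace.Span;
import io.opentelemetry.api.trace.Tracer;
import io.opentelemetry.context.Scope;

//...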

Tracer tracer =
    openTelemetry.getTracer("instrumentation-library-name", "1.0.0");
Span span = tracer.spanBuilder("my span").startSpan();

// Make the span the current span
try (Scope ss = span.makeCurrent()) {
  // In this scope, the span is the current/active span
} finally {
    span.end();
}

The other key piece is context propagation.

Sending side

URL url = new URL("http://127.0.0.1:8080/resource");
Span outGoing = tracer.spanBuilder("/resource").setSpanKind(SpanKind.CLIENT).startSpan();
try (Scope scope = outGoing.makeCurrent()) {
  // Use the Semantic Conventions.
  // (Note that to set these, Span does not *need* to be the current instance in Context or Scope.)
  outGoing.setAttribute(SemanticAttributes.HTTP_METHOD, "GET");
  outGoing.setAttribute(SemanticAttributes.HTTP_URL, url.toString());
  HttpURLConnection transportLayer = (HttpURLConnection) url.openConnection();
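  // Inject the request with the *current* Context, which contains our current Span.
  // `setter` is a TextMapSetter<HttpURLConnection> defined elsewhere that writes each key/value as a request header.
  openTelemetry.getPropagators().getTextMapPropagator().inject(Context.current(), transportLayer, setter);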
  // Make outgoing call
} finally {
  outGoing.end();
}

Receiving side

public void handle(HttpExchange httpExchange) {
  // Extract the SpanContext and other elements from the request.
  Context extractedContext = openTelemetry.getPropagators().getTextMapPropagator()
        .extract(Context.current(), httpExchange, getter);
  try (Scope scope = extractedContext.makeCurrent()) {
    // Automatically use the extracted SpanContext as parent.
    Span serverSpan = tracer.spanBuilder("GET /resource")
        .setSpanKind(SpanKind.SERVER)
        .startSpan();
    try {
      ...
    } finally {
      serverSpan.end();
    }
  }
}
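
What inject and extract actually move across the wire can be sketched without the OT library. The example below assumes the W3C Trace Context propagator is the one configured, in which case the carrier gets a traceparent header of the form version-traceId-spanId-flags. TraceParentSketch and its methods are hypothetical stand-ins, not the OT propagator API, and the Map plays the role of the HTTP headers.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a text map propagator using the W3C "traceparent" layout.
final class TraceParentSketch {
    static final String HEADER = "traceparent";

    // Sender side: write version-traceId-spanId-flags into the carrier.
    static void inject(Map<String, String> headers, String traceId, String spanId, boolean sampled) {
        headers.put(HEADER, "00-" + traceId + "-" + spanId + "-" + (sampled ? "01" : "00"));
    }

    // Receiver side: returns {traceId, parentSpanId, flags}, or null if the header is absent or malformed.
    static String[] extract(Map<String, String> headers) {
        String value = headers.get(HEADER);
        if (value == null) return null;
        String[] parts = value.split("-");
        if (parts.length != 4 || parts[1].length() != 32 || parts[2].length() != 16) return null;
        return new String[] {parts[1], parts[2], parts[3]};
    }

    public static void main(String[] args) {
        Map<String, String> headers = new HashMap<>();
        inject(headers, "4bf92f3577b34da6a3ce929d0e0e4736", "00f067aa0ba902b7", true);
        System.out.println(headers.get(HEADER));
        String[] parent = extract(headers);
        System.out.println("trace=" + parent[0] + " parentSpan=" + parent[1]);
    }
}
```

The receiver starts its server span with the extracted trace ID and uses the extracted span ID as the parent, which is how the two sides end up in one trace.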

 

Metrics

The metrics API defines a variety of instruments. Instruments record measurements, which are aggregated by the metrics SDK and eventually exported out of process. Instruments come in synchronous and asynchronous varieties. Synchronous instruments record measurements as they happen. Asynchronous instruments register a callback, which is invoked once per collection, and which records measurements at that point in time. The following instruments are available:

  • LongCounter/DoubleCounter: records only positive values, with synchronous and asynchronous options. Useful for counting things, such as the number of bytes sent over a network. Counter measurements are aggregated to always-increasing monotonic sums by default.
  • LongUpDownCounter/DoubleUpDownCounter: records positive and negative values, with synchronous and asynchronous options. Useful for counting things that go up and down, like the size of a queue. Up down counter measurements are aggregated to non-monotonic sums by default.
  • LongGauge/DoubleGauge: measures an instantaneous value with an asynchronous callback. Useful for recording values that can’t be merged across attributes, like CPU utilization percentage. Gauge measurements are aggregated as gauges by default.
  • LongHistogram/DoubleHistogram: records measurements that are most useful to analyze as a histogram distribution. No asynchronous option is available. Useful for recording things like the duration of time spent by an HTTP server processing a request. Histogram measurements are aggregated to explicit bucket histograms by default.
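
The last bullet mentions explicit bucket histograms. A minimal sketch of that kind of aggregation follows, assuming upper-inclusive bucket bounds. BucketHistogramSketch and the bounds 10, 50, 100 are hypothetical; this is not the SDK's implementation or its default boundaries.

```java
import java.util.Arrays;

// Hypothetical explicit-bucket histogram; not the SDK's implementation.
final class BucketHistogramSketch {
    private final double[] bounds; // sorted upper bounds, assumed inclusive
    private final long[] counts;   // bounds.length + 1 buckets; the last one is the overflow bucket
    private double sum;
    private long count;

    BucketHistogramSketch(double... bounds) {
        this.bounds = bounds.clone();
        this.counts = new long[bounds.length + 1];
    }

    // Find the first bucket whose upper bound is >= value and count the measurement there.
    void record(double value) {
        int i = 0;
        while (i < bounds.length && value > bounds[i]) i++;
        counts[i]++;
        sum += value;
        count++;
    }

    long[] bucketCounts() { return counts.clone(); }
    double sum() { return sum; }
    long count() { return count; }

    public static void main(String[] args) {
        BucketHistogramSketch h = new BucketHistogramSketch(10, 50, 100);
        for (double ms : new double[] {5, 10, 42, 250}) h.record(ms);
        System.out.println(Arrays.toString(h.bucketCounts())); // prints [2, 1, 0, 1]
    }
}
```

Only the per-bucket counts, the sum, and the count are kept, so the individual request durations are not stored; that is what makes a histogram cheap to export.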

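In short, there are several kinds:

Counter and UpDownCounter: counting; an UpDownCounter can be incremented or decremented, and its value can go negative

Gauge: an instantaneous reading of the current value, such as CPU utilization

Histogram: a distribution of recorded values

Instruments are either synchronous or asynchronous; asynchronous ones register a callback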
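
The difference between synchronous and asynchronous instruments can be sketched with a tiny hand-rolled meter. SyncAsyncSketch and its method names are hypothetical and are not the OT metrics API; rejecting negative counter values by throwing is a choice of this sketch only.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.LongSupplier;

// Hypothetical mini "meter" contrasting synchronous and asynchronous instruments.
final class SyncAsyncSketch {
    private final AtomicLong processedJobs = new AtomicLong();               // synchronous counter
    private final Map<String, LongSupplier> gauges = new LinkedHashMap<>(); // asynchronous callbacks

    // Synchronous: the caller records the measurement at the moment the event happens.
    void addProcessedJobs(long delta) {
        if (delta < 0) throw new IllegalArgumentException("a counter only accepts non-negative values");
        processedJobs.addAndGet(delta);
    }

    // Asynchronous: only a callback is registered; nothing is measured yet.
    void registerGauge(String name, LongSupplier callback) {
        gauges.put(name, callback);
    }

    // One collection cycle: read the counter's running sum and invoke each callback once.
    Map<String, Long> collect() {
        Map<String, Long> snapshot = new LinkedHashMap<>();
        snapshot.put("processed_jobs", processedJobs.get());
        gauges.forEach((name, callback) -> snapshot.put(name, callback.getAsLong()));
        return snapshot;
    }

    public static void main(String[] args) {
        SyncAsyncSketch meter = new SyncAsyncSketch();
        meter.registerGauge("free_memory_bytes", () -> Runtime.getRuntime().freeMemory());
        meter.addProcessedJobs(3);
        System.out.println(meter.collect());
    }
}
```

The gauge callback costs nothing between collections, which is why values that are cheap to read but change constantly (queue size, memory, CPU) suit the asynchronous form.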

A usage example:

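// Obtain an OpenTelemetry instance. GlobalOpenTelemetry.get() is one option,
// assuming the SDK (or the Java agent) has registered a global instance.
OpenTelemetry openTelemetry = GlobalOpenTelemetry.get();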

// Gets or creates a named meter instance
Meter meter = openTelemetry.meterBuilder("instrumentation-library-name")
        .setInstrumentationVersion("1.0.0")
        .build();

// Build counter e.g. LongCounter
LongCounter counter = meter
      .counterBuilder("processed_jobs")
      .setDescription("Processed jobs")
      .setUnit("1")
      .build();

// It is recommended that the API user keep a reference to Attributes they will record against
// (AttributeKey is io.opentelemetry.api.common.AttributeKey)
Attributes attributes = Attributes.of(AttributeKey.stringKey("Key"), "SomeWork");

// Record data
counter.add(123, attributes);