Diagnosing .NET applications with dotnet-monitor

Published: 2023-11-13 22:38:49, author: 初久的私房菜

Collecting diagnostics in production

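In production environments, collecting diagnostics (traces, logs, metrics, and dumps) can be challenging. Typically you have to get access to the environment, install some tools, and then collect the information. dotnet-monitor simplifies and unifies the way diagnostics are collected by exposing a REST API, regardless of where your application runs (on your local machine, on an on-premises server, or inside a Kubernetes cluster). Depending on our needs, dotnet-monitor may replace other .NET diagnostic tools such as dotnet-counters, dotnet-dump, dotnet-gcdump, and dotnet-trace, particularly when it comes to collecting information.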

Setup

We can install it as a global tool with the following command:

dotnet tool install --global dotnet-monitor --version 8.0.0-rc.2.23502.11

Once it is installed, we can start it with the following command:

dotnet monitor collect --no-auth

dotnet-monitor includes a Swagger UI for exploring the API surface. To test the tool, we will use a standard .NET application. Run the following commands:

dotnet new web -o DotNetMonitorSandBox
dotnet new sln -n DotNetMonitorSandbox
dotnet sln add --in-root DotNetMonitorSandBox

Processes

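The Processes API lists the processes that can be detected and returns their metadata. Open a browser and navigate to https://localhost:52323/processes to list the available processes (make sure the sample application is running):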

[
  {
    "pid": 19828,
    "uid": "66140161-2208-4e7c-b874-79aa037d4344",
    "name": "dotnet",
    "isDefault": false
  },
  {
    "pid": 57388,
    "uid": "2b0aba55-1579-41a5-b6a7-c1575650352a",
    "name": "DotNetMonitorSandBox",
    "isDefault": false
  }
]

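The pid property is the process ID. The uid property is useful for uniquely identifying a process in environments where process IDs may not be unique (for example, multiple containers in a Kubernetes pod will each have an entrypoint process with process ID 1). Navigate to https://localhost:52323/process?pid={pid} to see more information about a given process, or to https://localhost:52323/env?pid={pid} to get its environment variables.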
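
As a small illustration of working with the /processes response, the sketch below looks up a process entry by name so that its pid or uid can be reused in later calls. The helper name find_process is hypothetical, and the sample payload is copied from the response shown above rather than fetched from a live dotnet-monitor instance.

```python
import json


def find_process(processes, name):
    """Return the first process entry whose "name" matches, or None."""
    for proc in processes:
        if proc.get("name") == name:
            return proc
    return None


# Sample payload copied from the /processes response shown in this article.
sample = json.loads("""
[
  {"pid": 19828, "uid": "66140161-2208-4e7c-b874-79aa037d4344",
   "name": "dotnet", "isDefault": false},
  {"pid": 57388, "uid": "2b0aba55-1579-41a5-b6a7-c1575650352a",
   "name": "DotNetMonitorSandBox", "isDefault": false}
]
""")

target = find_process(sample, "DotNetMonitorSandBox")
print(target["pid"], target["uid"])
```

Against a running instance, the same list could be obtained by requesting the /processes URL and passing the decoded JSON to the helper.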

Logs

The Logs API lets us collect logs written to the ILogger<> infrastructure. Open Program.cs and update its contents as follows:

var builder = WebApplication.CreateBuilder(args);
var app = builder.Build();
app.MapGet("/", () =>
{
  app.Logger.LogInformation("Hello World!");
  return "Hello World!";
});

app.Run();

Run the application and navigate to https://localhost:52323/logs?pid={pid}&durationSeconds=60 to watch our log entries in real time for the next 60 seconds.
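
The same request can be scripted instead of opened in a browser. The following is a minimal sketch, not a definitive client: endpoint_url and stream_logs are hypothetical helper names, the Accept value application/x-ndjson is an assumption about the content type to request, and a locally issued HTTPS certificate may need to be trusted before Python accepts the connection.

```python
import urllib.parse
import urllib.request

BASE_URL = "https://localhost:52323"  # address used throughout this article


def endpoint_url(path, **query):
    """Build a dotnet-monitor URL such as .../logs?pid=1&durationSeconds=60."""
    return f"{BASE_URL}/{path}?{urllib.parse.urlencode(query)}"


def stream_logs(pid, duration_seconds=60):
    """Yield log lines as they arrive (requires a running dotnet-monitor)."""
    request = urllib.request.Request(
        endpoint_url("logs", pid=pid, durationSeconds=duration_seconds),
        headers={"Accept": "application/x-ndjson"},  # assumed content type
    )
    with urllib.request.urlopen(request) as response:
        for raw_line in response:
            yield raw_line.decode("utf-8").rstrip("\n")


# Only the URL is built here; iterating stream_logs(57388) would open the connection.
print(endpoint_url("logs", pid=57388, durationSeconds=60))
```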

Traces

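The Traces API lets us capture traces in the .nettrace format. To capture a trace of a process using a predefined set of trace profiles (such as Cpu, Http, Logs, and Metrics), navigate to https://localhost:52323/trace?pid={pid}&durationSeconds=60 and wait for the .nettrace file. Open Program.cs and update its contents as follows: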

using System.Diagnostics.Tracing;

var builder = WebApplication.CreateBuilder(args);
var app = builder.Build();
app.MapGet("/", () =>
{
  app.Logger.LogInformation("Hello World!");
  MyEventSource.Log.Request("Hello World!");
  return "Hello World!";
});

app.Run();

[EventSource(Name = "MyEventSource")]
public sealed class MyEventSource : EventSource
{
    public static MyEventSource Log { get; } = new MyEventSource();

    [Event(1, Level = EventLevel.Informational)]
    public void Request(string message)
    {
        WriteEvent(1, message);
    }
}

To capture traces from a custom event provider, we make a POST call to the same endpoint with the following request body:

{
    "Providers": [{
        "Name": "MyEventSource",
        "EventLevel": "Informational"
    }],
    "BufferSizeInMB": 1024
}
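
To make the POST call concrete, here is a hedged sketch that builds the request with Python's standard library. The function name build_trace_request and its parameters are hypothetical; the body is the one shown above. The request is only constructed, not sent, so nothing here depends on a running dotnet-monitor.

```python
import json
import urllib.request


def build_trace_request(pid, duration_seconds=60,
                        base_url="https://localhost:52323"):
    """Build (but do not send) a POST /trace request for MyEventSource."""
    body = {
        "Providers": [
            {"Name": "MyEventSource", "EventLevel": "Informational"}
        ],
        "BufferSizeInMB": 1024,
    }
    url = f"{base_url}/trace?pid={pid}&durationSeconds={duration_seconds}"
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_trace_request(57388)
print(request.get_method(), request.full_url)
```

Passing the request to urllib.request.urlopen and writing the response bytes to disk would produce the .nettrace file.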

On Windows, .nettrace files can be opened for analysis in PerfView or in Visual Studio.

Metrics

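The Metrics API returns a snapshot of metrics in the Prometheus exposition format for a single process (the pid is set through configuration). dotnet-monitor can read and merge configuration from multiple sources. On Windows, the settings file path is %USERPROFILE%\.dotnet-monitor\settings.json. So let's update that file with the following content (create it if it does not exist):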

{
  "DefaultProcess": {
    "Filters": [{
      "Key": "ProcessId",
      "Value": "<pid>"
    }]
  }
}

dotnet-monitor picks up the configuration automatically. By default, metrics are collected from the following providers:

System.Runtime
Microsoft.AspNetCore.Hosting
Grpc.AspNetCore.Server

Navigate to https://localhost:52323/metrics to see output similar to the following:

# HELP systemruntime_cpu_usage_ratio CPU Usage
# TYPE systemruntime_cpu_usage_ratio gauge
systemruntime_cpu_usage_ratio 0 1699198374885
systemruntime_cpu_usage_ratio 0 1699198379898
systemruntime_cpu_usage_ratio 0 1699201002325
# HELP systemruntime_working_set_bytes Working Set
# TYPE systemruntime_working_set_bytes gauge
systemruntime_working_set_bytes 63393792 1699198364894
systemruntime_working_set_bytes 63401984 1699198369888
systemruntime_working_set_bytes 63418368 1699198374885
# HELP systemruntime_gc_heap_size_bytes GC Heap Size
# TYPE systemruntime_gc_heap_size_bytes gauge 
systemruntime_gc_heap_size_bytes 7085504 1699198364894 
systemruntime_gc_heap_size_bytes 7093696 1699198369888 
systemruntime_gc_heap_size_bytes 7110080 1699198374885
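
Because the output is plain text, it is easy to post-process. The sketch below parses lines of the form "name value timestamp" like the ones above into a dictionary. It is deliberately simplified: parse_prometheus is a hypothetical helper, it skips the # HELP and # TYPE lines, and it assumes no labels containing spaces, so it is not a full parser for the exposition format.

```python
def parse_prometheus(text):
    """Parse simple `name value [timestamp]` samples, skipping # comment lines."""
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split()
        name, value = parts[0], float(parts[1])
        timestamp = int(parts[2]) if len(parts) > 2 else None
        samples.setdefault(name, []).append((value, timestamp))
    return samples


# Excerpt copied from the /metrics output shown in this article.
sample_output = """\
# HELP systemruntime_working_set_bytes Working Set
# TYPE systemruntime_working_set_bytes gauge
systemruntime_working_set_bytes 63393792 1699198364894
systemruntime_working_set_bytes 63401984 1699198369888
systemruntime_working_set_bytes 63418368 1699198374885
"""

samples = parse_prometheus(sample_output)
latest_value, latest_timestamp = samples["systemruntime_working_set_bytes"][-1]
print(latest_value, latest_timestamp)
```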

dotnet-monitor supports both System.Diagnostics.Metrics-based APIs (for .NET 8 applications) and EventCounters (the .NET documentation compares the metric APIs). Since we are using .NET 7, we will modify Program.cs as follows:

using System.Diagnostics.Tracing;

var builder = WebApplication.CreateBuilder(args);
var app = builder.Build();
app.MapGet("/", () =>
{
  app.Logger.LogInformation("Hello World!");
  MyEventSource.Log.Request("Hello World!");
  return "Hello World!";
});

app.Run();

[EventSource(Name = "MyEventSource")]
public sealed class MyEventSource : EventSource
{
    public static MyEventSource Log { get; } = new MyEventSource();

    private EventCounter _counter;

    public MyEventSource()
    {
        _counter = new EventCounter("my-custom-counter", this)
        {
            DisplayName = "my-custom-counter",
            DisplayUnits = "ms"
        };
    }

    [Event(1, Level = EventLevel.Informational)]
    public void Request(string message)
    {
        WriteEvent(1, message);
        _counter.WriteMetric(1);
    }
}

To capture metrics from the custom event provider, we modify settings.json as follows:

{
  "Metrics": {
    "Providers": [
      {
        "ProviderName": "MyEventSource",
        "CounterNames": [
          "my-custom-counter"
        ]
      }
    ]
  },
  "DefaultProcess": {
    "Filters": [{
      "Key": "ProcessId",
      "Value": "<pid>"
    }]
  }
}
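
Since the process ID changes every time the application restarts, it can be convenient to generate this file rather than edit it by hand. The following sketch builds the same structure from a pid; build_settings is a hypothetical helper, and the pid used is the sample value from the Processes section. Writing the result to the settings path is left out on purpose.

```python
import json


def build_settings(pid, provider="MyEventSource",
                   counters=("my-custom-counter",)):
    """Return the settings.json structure shown above for a given pid."""
    return {
        "Metrics": {
            "Providers": [
                {"ProviderName": provider, "CounterNames": list(counters)}
            ]
        },
        "DefaultProcess": {
            # The filter value is written as a string, as in the example above.
            "Filters": [{"Key": "ProcessId", "Value": str(pid)}]
        },
    }


settings_text = json.dumps(build_settings(57388), indent=2)
print(settings_text)
```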

Live Metrics

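The Live Metrics API captures metrics for the selected process (by default, from the same providers listed in the Metrics section). Navigate to https://localhost:52323/livemetrics?pid={pid}&durationSeconds=60 and wait for the .json file. To capture live metrics from the custom event provider, we make a POST call to the same endpoint with the following request body: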

{
    "includeDefaultProviders": false,
    "providers": [
        {
            "providerName": "MyEventSource",
            "counterNames": [
                "my-custom-counter"
            ]
        }
    ]
}

The output will look similar to the following:

{"timestamp":"2023-11-05T18:52:32.5333078-05:00","provider":"MyEventSource","name":"my-custom-counter","displayName":"my-custom-counter","unit":"ms","counterType":"Metric","tags":"","value":1}
{"timestamp":"2023-11-05T18:52:37.5321623-05:00","provider":"MyEventSource","name":"my-custom-counter","displayName":"my-custom-counter","unit":"ms","counterType":"Metric","tags":"","value":1}
{"timestamp":"2023-11-05T18:52:42.5360839-05:00","provider":"MyEventSource","name":"my-custom-counter","displayName":"my-custom-counter","unit":"ms","counterType":"Metric","tags":"","value":1}
{"timestamp":"2023-11-05T18:52:47.5309596-05:00","provider":"MyEventSource","name":"my-custom-counter","displayName":"my-custom-counter","unit":"ms","counterType":"Metric","tags":"","value":1}
{"timestamp":"2023-11-05T18:52:52.5323712-05:00","provider":"MyEventSource","name":"my-custom-counter","displayName":"my-custom-counter","unit":"ms","counterType":"Metric","tags":"","value":1}
{"timestamp":"2023-11-05T18:52:57.5310386-05:00","provider":"MyEventSource","name":"my-custom-counter","displayName":"my-custom-counter","unit":"ms","counterType":"Metric","tags":"","value":1}
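
The downloaded file is a sequence of JSON objects rather than a single JSON array, so json.loads on the whole file will fail. A minimal sketch of one way to read it is shown below: parse_json_stream is a hypothetical helper that decodes objects one after another and tolerates whitespace, newlines, or a record separator character (0x1E) between them. The sample records are shortened versions of the output above.

```python
import json


def parse_json_stream(text):
    """Decode back-to-back JSON objects separated by whitespace or 0x1E."""
    decoder = json.JSONDecoder()
    items, pos = [], 0
    while pos < len(text):
        # Skip separators between objects before decoding the next one.
        if text[pos].isspace() or text[pos] == "\x1e":
            pos += 1
            continue
        obj, pos = decoder.raw_decode(text, pos)
        items.append(obj)
    return items


# Two shortened records, separated by a space and ending with a newline.
sample_stream = (
    '{"provider":"MyEventSource","name":"my-custom-counter","unit":"ms","value":1} '
    '{"provider":"MyEventSource","name":"my-custom-counter","unit":"ms","value":1}\n'
)

records = parse_json_stream(sample_stream)
print(len(records), sum(r["value"] for r in records))
```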

Dump

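The Dump API captures a managed dump of the specified process without using a debugger. Navigate to https://localhost:52323/dump?pid={pid} and wait for the .dmp file (the application is suspended while the dump is being collected). Dump files can be analyzed with tools such as dotnet-dump or Visual Studio. A dump cannot be analyzed on a machine whose operating system or architecture differs from the one where it was captured.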

GCDump

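The GCDump API captures a GC dump of the specified process. Navigate to https://localhost:52323/gcdump?pid={pid} and wait for the .gcdump file. Besides Visual Studio, we can analyze .gcdump files with PerfView and generate reports with dotnet-gcdump. Unlike dump files, .gcdump files use a portable format that can be analyzed regardless of the platform on which they were collected.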
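
Dump and GC dump responses can be large, so when scripting the download it is safer to copy the response to disk in chunks than to read it into memory. The sketch below shows that idea with a hypothetical helper, save_stream, exercised here against an in-memory stream. With a running dotnet-monitor, the stream would instead be the response object returned by urllib.request.urlopen for the dump or gcdump URL.

```python
import io
import os
import shutil
import tempfile


def save_stream(stream, dest_path, chunk_size=1024 * 1024):
    """Copy a binary stream to dest_path in chunks and return the file size."""
    with open(dest_path, "wb") as out:
        shutil.copyfileobj(stream, out, chunk_size)
    return os.path.getsize(dest_path)


# Stand-in for an HTTP response body; the file name is only an example.
fake_response = io.BytesIO(b"\x00" * 4096)
dest = os.path.join(tempfile.mkdtemp(), "sample.gcdump")
size = save_stream(fake_response, dest)
print(size)
```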