Go 中的性能分析器标签

原文地址

Go1.9 引入了profiler标签功能, 可以向 CPU 分析器收集的样本添加任意键值. CPU分析器收集并输出CPU执行时花费最多时间的热点, 典型的CPU分析器主要报告这些位置的函数名称, 源代码行数等. 通过查看这些数据你可以检查代码的哪些部分调用了这些点, 并且可以按照调用者过滤来更细致的了解某些执行路径.

尽管位置数据在分析中是非常有用的, 但是它并不总是足够. 很多的Go程序运行在服务器上, 在服务器上分析性能问题更加的复杂, 很难将某些路径同其他路径隔离开, 或者难以理解它是否只是某个特殊路径的问题.

在 Go1.9 中你可以向执行路径中添加更多的上下文, 你可以使用任何标签集作为分析数据的一部分, 然后使用这些标签更精确的检查分析器的输出.

一些显而易见的用法如下:

  • 你不希望将软件的实现细节泄露到分析数据中, 例如对 Web Server 显示对应的 URL 路径比函数名称更有用
  • 调用堆栈的信息不足以明确任务的信息, 例如消息队列中的消费者可以通过标签来识别消息的来源
  • 待分析的问题需要上下文信息

增加标签

runtime/pprof包导出了一些函数以允许用户增加标签. 常见的用法是使用 Do 函数来扩展一个Context, 然后在 f 函数执行时记录这些标签:

func Do(ctx context.Context, labels LabelSet, f func(context.Context))

Do函数只在当前 goroutine 执行的过程中设置标签, 如果你需要在 f 函数中使用 goroutine 并保存标签, 需要使用如下方式:

labels := pprof.Labels("worker", "purge")
pprof.Do(ctx, labels, func(ctx context.Context) {
    // Do some work
    go update(ctx)   // propagates labels in ctx.
})

上面的标签将被记为 worker:purge.

检查分析器输出

本节将演示如何通过分析器标签检查记录的样本, 使用标签过滤器来分析和使用分析器数据.

使用 net/http/pprof 来进行数据采样, 代码如下:

package main

import (
    "context"
    "fmt"
    "log"
    "net/http"
    _ "net/http/pprof"
    "runtime/pprof"
    "time"
)

func main() {
    ctx := context.Background()
    go func() {
        time.Sleep(time.Second)
        for i := 0; i < 10000000; i++ {
            labels := pprof.Labels("handler", "hello")
            pprof.Do(ctx, labels, func(ctx context.Context) {
                generate(1, 10)
            })
        }
        fmt.Println("handler:hello done")
    }()

    go func() {
        time.Sleep(time.Second)
        for i := 0; i < 10000000; i++ {
            labels := pprof.Labels("handler2", "hello")
            pprof.Do(ctx, labels, func(ctx context.Context) {
                generate(1, 10)
            })
        }
        fmt.Println("handler2:hello done")
    }()

    log.Fatal(http.ListenAndServe("localhost:5555", nil))
}

func generate(duration int, usage int) string {
    s := fmt.Sprintf("duration: %d", duration)
    for i := 0; i < usage; i++ {
        s += s
    }
    return s
}

使用以下命令来获取采样数据:

go tool pprof http://localhost:5555/debug/pprof/profile

通过以下命令来获取标签以及查看对应标签的数据:

PS C:\WINDOWS\system32> go tool pprof http://localhost:5555/debug/pprof/profile
Fetching profile over HTTP from http://localhost:5555/debug/pprof/profile
Saved profile in C:\Users\51112\pprof\pprof.samples.cpu.009.pb.gz
Type: cpu
Time: Aug 22, 2018 at 3:34pm (CST)
Duration: 30.16s, Total samples = 1.41mins (281.45%)
Entering interactive mode (type "help" for commands, "o" for options)
(pprof) top
Showing nodes accounting for 50.27s, 59.22% of 84.89s total
Dropped 222 nodes (cum <= 0.42s)
Showing top 10 nodes out of 101
      flat  flat%   sum%        cum   cum%
    14.97s 17.63% 17.63%     25.43s 29.96%  runtime.scanobject
    13.61s 16.03% 33.67%     13.61s 16.03%  runtime.memclrNoHeapPointers
     6.23s  7.34% 41.01%      6.23s  7.34%  runtime.memmove
     4.27s  5.03% 46.04%      4.66s  5.49%  runtime.heapBitsForObject
     2.39s  2.82% 48.85%      3.57s  4.21%  runtime.greyobject
     2.27s  2.67% 51.53%      2.27s  2.67%  runtime.heapBits.bits (inline)
     2.15s  2.53% 54.06%      8.22s  9.68%  runtime.mallocgc
     1.55s  1.83% 55.88%      1.55s  1.83%  runtime.nextFreeFast (inline)
     1.45s  1.71% 57.59%      1.45s  1.71%  runtime.stdcall2
     1.38s  1.63% 59.22%      2.72s  3.20%  runtime.gcmarknewobject
(pprof) tags
 handler: Total 8.7s
          8.7s (  100%): hello

 handler2: Total 8.6s
           8.6s (  100%): hello
(pprof) tagfocus="handler:hello"
(pprof) top
Active filters:
   tagfocus=handler:hello
Showing nodes accounting for 6.49s, 7.65% of 84.89s total
Dropped 41 nodes (cum <= 0.42s)
Showing top 10 nodes out of 39
      flat  flat%   sum%        cum   cum%
     2.97s  3.50%  3.50%      2.97s  3.50%  runtime.memmove
     0.74s  0.87%  4.37%      6.17s  7.27%  runtime.concatstrings
     0.71s  0.84%  5.21%      2.49s  2.93%  runtime.mallocgc
     0.60s  0.71%  5.91%      0.60s  0.71%  runtime.nextFreeFast (inline)
     0.40s  0.47%  6.38%      0.43s  0.51%  runtime.gcWriteBarrier
     0.35s  0.41%  6.80%      0.87s  1.02%  runtime.gcmarknewobject
     0.20s  0.24%  7.03%      2.34s  2.76%  runtime.rawstring
     0.18s  0.21%  7.24%      0.41s  0.48%  runtime.deferreturn
     0.18s  0.21%  7.46%      0.18s  0.21%  runtime.spanOfUnchecked (inline)
     0.16s  0.19%  7.65%      6.33s  7.46%  runtime.concatstring2
(pprof) tagfocus="handler2:hello"
(pprof) top
Active filters:
   tagfocus=handler2:hello
Showing nodes accounting for 6.50s, 7.66% of 84.89s total
Dropped 44 nodes (cum <= 0.42s)
Showing top 10 nodes out of 50
      flat  flat%   sum%        cum   cum%
     3.18s  3.75%  3.75%      3.18s  3.75%  runtime.memmove
     0.81s  0.95%  4.70%      2.68s  3.16%  runtime.mallocgc
     0.60s  0.71%  5.41%      6.38s  7.52%  runtime.concatstrings
     0.50s  0.59%  6.00%      0.50s  0.59%  runtime.nextFreeFast (inline)
     0.46s  0.54%  6.54%      0.89s  1.05%  runtime.gcmarknewobject
     0.27s  0.32%  6.86%      0.32s  0.38%  runtime.gcWriteBarrier
     0.20s  0.24%  7.09%      2.63s  3.10%  runtime.rawstringtmp
     0.20s  0.24%  7.33%      0.20s  0.24%  runtime.spanOfUnchecked (inline)
     0.14s  0.16%  7.49%      7.96s  9.38%  main.generate
     0.14s  0.16%  7.66%      0.14s  0.16%  runtime.acquirem (inline)
(pprof) tagfocus=
(pprof) tagignore="handler:hello"
(pprof) tags
TagHide expression matched no samples
 handler2: Total 8.6s
           8.6s (  100%): hello

(pprof)

从以上实例可以看到, 我们可以通过 tagfocus 来设置只显示含有对应标签的数据, 使用 tagignore 来过滤对应标签的数据. 其他的还有 tagshow, taghide 等命令.

对于 HTTP 路径, 可以使用 pprofutil 包来自动的增加标签.

results matching ""

    No results matching ""