A copy overflow panic when using github.com/segmentio/kafka-go v0.4.39

Published: 2023-06-30 18:03:43 Author: piperck

 

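Under high concurrency (40k~60k rps), github.com/segmentio/kafka-go v0.4.39 frequently fails with

panic: runtime error: slice bounds out of range [:4636] with capacity 4096 goroutine 3474393327 

Because the panic is raised inside a goroutine that the library starts itself (go run(xxx)), the calling program cannot recover from it.

I tried several approaches, but none of them could cover this case. The panic inside the lib crashes the main program and causes frequent restarts.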
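To illustrate why an outside recover does not help: in Go, recover only takes effect in a deferred function running on the same goroutine that panicked. The sketch below uses a hypothetical helper, safeGo (not part of kafka-go), to show that the deferred recover has to be installed inside the goroutine itself, which is exactly what a caller cannot do for goroutines spawned inside a library.

```go
package main

import "fmt"

// safeGo is a hypothetical helper for illustration only.
// It runs fn in a new goroutine and turns a panic into an error.
// The deferred recover lives inside the spawned goroutine; a
// defer/recover in the caller would never see this panic.
func safeGo(fn func()) <-chan error {
	done := make(chan error, 1)
	go func() {
		defer func() {
			if r := recover(); r != nil {
				done <- fmt.Errorf("recovered: %v", r)
				return
			}
			done <- nil
		}()
		fn()
	}()
	return done
}

func main() {
	// The panic happens on the goroutine started by safeGo and is
	// caught there, so the program keeps running.
	err := <-safeGo(func() { panic("boom") })
	fmt.Println(err) // recovered: boom
}
```

This only works when you control the go statement. When the goroutine is created inside a library, as in the trace here, there is no place for the caller to put the deferred recover.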

panic: runtime error: slice bounds out of range [:4636] with capacity 4096

goroutine 3474393327 [running]:
github.com/segmentio/kafka-go/protocol.(*encoder).Write(0xc04a826c80, {0xc04a7b5900, 0x0?, 0x1000})
/root/go/pkg/mod/github.com/segmentio/kafka-go@v0.4.39/protocol/encode.go:59 +0x10c
bytes.(*Reader).WriteTo(0xc03e43f830, {0xec2c40?, 0xc04a826c80?})
/opt/go/src/bytes/reader.go:143 +0x87
io.copyBuffer({0xec2c40, 0xc04a826c80}, {0x7f9a038bcfe8, 0xc03e43f830}, {0x0, 0x0, 0x0})
/opt/go/src/io/io.go:409 +0x16e
io.Copy(...)
/opt/go/src/io/io.go:386
github.com/segmentio/kafka-go/protocol.(*encoder).writeVarNullBytesFrom(0xc04a826c80?, {0xec9ab0, 0xc03e43f830})
/root/go/pkg/mod/github.com/segmentio/kafka-go@v0.4.39/protocol/encode.go:325 +0xb2
github.com/segmentio/kafka-go/protocol.(*RecordSet).writeToVersion2.func1(0x5b, 0xc03e43f7b0) 
/root/go/pkg/mod/github.com/segmentio/kafka-go@v0.4.39/protocol/record_v2.go:259 +0x377 
github.com/segmentio/kafka-go/protocol.handleRecord(0xc03e43f790?, 0x4?, 0xc11f1dddae703fca?) 
/root/go/pkg/mod/github.com/segmentio/kafka-go@v0.4.39/protocol/record_batch.go:74 +0xcb 
github.com/segmentio/kafka-go/protocol.forEachRecord({0xec2b60, 0xc03e43f790}, 0x14d62e0?) 
/root/go/pkg/mod/github.com/segmentio/kafka-go@v0.4.39/protocol/record_batch.go:61 +0x65 
github.com/segmentio/kafka-go/protocol.(*RecordSet).writeToVersion2(0xc084078908, 0xc09d65e180, 0x46) 
/root/go/pkg/mod/github.com/segmentio/kafka-go@v0.4.39/protocol/record_v2.go:223 +0x46b 
github.com/segmentio/kafka-go/protocol.(*RecordSet).WriteTo(0xc084078908, {0xec2ce0?, 0xc09d65e180?}) 
/root/go/pkg/mod/github.com/segmentio/kafka-go@v0.4.39/protocol/record.go:269 +0x145 
github.com/segmentio/kafka-go/protocol.writerEncodeFuncOf.func1(0xc04a826c30, {{0xcf2180?, 0xc084078908?, 0xc0002288f0?}}) 
/root/go/pkg/mod/github.com/segmentio/kafka-go@v0.4.39/protocol/encode.go:510 +0x91 
github.com/segmentio/kafka-go/protocol.structEncodeFuncOf.func2(0xc334c0?, {{0xcce600?, 0xc084078900?, 0x20?}}) 
/root/go/pkg/mod/github.com/segmentio/kafka-go@v0.4.39/protocol/encode.go:462 +0xdd 
github.com/segmentio/kafka-go/protocol.(*encoder).encodeArray(0xc04a826c30, {{0xc334c0?, 0xc04a8410c0?, 0xc04a826c30?}}, {0xc000042056?, 0x1e?}, 0xc00022ae00) 
/root/go/pkg/mod/github.com/segmentio/kafka-go@v0.4.39/protocol/encode.go:170 +0xb0 
github.com/segmentio/kafka-go/protocol.arrayEncodeFuncOf.func4(0xcce6a0?, {{0xc334c0?, 0xc04a8410c0?, 0xc0002288e0?}}) 
/root/go/pkg/mod/github.com/segmentio/kafka-go@v0.4.39/protocol/encode.go:497 +0x2f 
github.com/segmentio/kafka-go/protocol.structEncodeFuncOf.func2(0xc33500?, {{0xcce6a0?, 0xc04a8410b0?, 0x20?}}) 
/root/go/pkg/mod/github.com/segmentio/kafka-go@v0.4.39/protocol/encode.go:462 +0xdd 
github.com/segmentio/kafka-go/protocol.(*encoder).encodeArray(0xc04a826c30, {{0xc33500?, 0xc04a841098?, 0xc04a826c5c?}}, {0x0?, 0x20?}, 0xc00022ae40) 
/root/go/pkg/mod/github.com/segmentio/kafka-go@v0.4.39/protocol/encode.go:170 +0xb0 
github.com/segmentio/kafka-go/protocol.arrayEncodeFuncOf.func4(0xd1a4a0?, {{0xc33500?, 0xc04a841098?, 0xc0002288d0?}}) 
/root/go/pkg/mod/github.com/segmentio/kafka-go@v0.4.39/protocol/encode.go:497 +0x2f 
github.com/segmentio/kafka-go/protocol.structEncodeFuncOf.func2(0xc04a826c30?, {{0xd1a4a0?, 0xc04a841080?, 0x20?}}) 
/root/go/pkg/mod/github.com/segmentio/kafka-go@v0.4.39/protocol/encode.go:462 +0xdd 
github.com/segmentio/kafka-go/protocol.WriteRequest({0x7f9a03b477c8, 0xc000f56480}, 0x8, 0x6b3d, {0x0, 0x0}, {0xec3460, 0xc04a841080}) 
/root/go/pkg/mod/github.com/segmentio/kafka-go@v0.4.39/protocol/request.go:118 +0x46d 
github.com/segmentio/kafka-go/protocol.RoundTrip({0xec7cb0?, 0xc000f56480}, 0x1080?, 0x6b3d, {0x0, 0x0}, {0xec3460, 0xc04a841080}) 
/root/go/pkg/mod/github.com/segmentio/kafka-go@v0.4.39/protocol/roundtrip.go:9 +0xa5 
github.com/segmentio/kafka-go/protocol.(*Conn).RoundTrip(0xc000f56480, {0xec3460, 0xc04a841080}) 
/root/go/pkg/mod/github.com/segmentio/kafka-go@v0.4.39/protocol/conn.go:94 +0x16d 
github.com/segmentio/kafka-go.(*conn).roundTrip(0xc00027a770?, {0xeca7d0, 0xc04a826be0}, 0xc000f56480, {0xec3460, 0xc04a841080}) 
/root/go/pkg/mod/github.com/segmentio/kafka-go@v0.4.39/transport.go:1271 +0x165 
github.com/segmentio/kafka-go.(*conn).run(0xc0010480f0, 0xc000f56480, 0x0?) 
/root/go/pkg/mod/github.com/segmentio/kafka-go@v0.4.39/transport.go:1247 +0x126 
created by github.com/segmentio/kafka-go.(*connGroup).connect
/root/go/pkg/mod/github.com/segmentio/kafka-go@v0.4.39/transport.go:1224 +0xd8d

 

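Judging from the trace, the panic is raised in (*encoder).Write, reached through io.Copy from writeVarNullBytesFrom. It looks like a copy operation inside the library miscalculates the available space and slices past the capacity of the slice.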
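For readers unfamiliar with this class of error, here is a minimal sketch of how it can arise. This is not the library's actual code: copyInto is a hypothetical helper that reslices a destination buffer to the source length without checking capacity, using the same numbers as the panic message (4636 bytes into a 4096-byte buffer).

```go
package main

import "fmt"

// copyInto is a hypothetical helper for illustration only.
// It reslices dst to len(src) without checking cap(dst), so it
// panics when the source is larger than the buffer. The panic is
// recovered here (same goroutine) and returned as an error.
func copyInto(dst, src []byte) (n int, err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("%v", r)
		}
	}()
	dst = dst[:len(src)] // panics when len(src) > cap(dst)
	return copy(dst, src), nil
}

func main() {
	buf := make([]byte, 0, 4096)

	// Fits in the buffer: copies 100 bytes, no error.
	n, err := copyInto(buf, make([]byte, 100))
	fmt.Println(n, err)

	// Larger than the buffer: the reslice goes out of range.
	n, err = copyInto(buf, make([]byte, 4636))
	fmt.Println(n, err)
}
```

A bounds check such as len(src) <= cap(dst) before the reslice would avoid the panic in this sketch. Whether the library's bug has the same shape is an assumption.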

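I did not find a good workaround, so in the end I gave up on the kafka-go library and switched back to sarama. That solved the problem, and performance also felt better. Recording it here for reference.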