telegraf: Telegraf 1.21.0 - go panic with SNMP plugin
Relevant telegraf.conf
Dec 17 16:38:04 hostname systemd[1]: Started The plugin-driven server agent for reporting metrics into InfluxDB.
Dec 17 16:38:04 hostname telegraf[306402]: 2021-12-17T15:38:04Z I! Starting Telegraf 1.21.1
Dec 17 16:38:04 hostname telegraf[306402]: 2021-12-17T15:38:04Z I! Loaded inputs: cpu disk internal mem net ping snmp (352x) swap
Dec 17 16:38:04 hostname telegraf[306402]: 2021-12-17T15:38:04Z I! Loaded aggregators:
Dec 17 16:38:04 hostname telegraf[306402]: 2021-12-17T15:38:04Z I! Loaded processors: converter
Dec 17 16:38:04 hostname telegraf[306402]: 2021-12-17T15:38:04Z I! Loaded outputs: influxdb_v2 (2x)
Dec 17 16:38:04 hostname telegraf[306402]: 2021-12-17T15:38:04Z I! Tags enabled: host=s
Dec 17 16:38:04 hostname telegraf[306402]: 2021-12-17T15:38:04Z I! [agent] Config: Interval:2m0s, Quiet:false, Hostname:"", Flush Interval:10s
Dec 17 16:38:06 hostname telegraf[306402]: panic: runtime error: index out of range [1] with length 1
Dec 17 16:38:06 hostname telegraf[306402]: goroutine 1 [running]:
Dec 17 16:38:06 hostname telegraf[306402]: github.com/influxdata/telegraf/internal/snmp.SnmpTranslateCall({0xc00158ca81, 0x27})
Dec 17 16:38:06 hostname telegraf[306402]: /go/src/github.com/influxdata/telegraf/internal/snmp/translate.go:153 +0xbfd
Dec 17 16:38:06 hostname telegraf[306402]: github.com/influxdata/telegraf/plugins/inputs/snmp.SnmpTranslate({0xc00158ca81, 0x27})
Dec 17 16:38:06 hostname telegraf[306402]: /go/src/github.com/influxdata/telegraf/plugins/inputs/snmp/snmp.go:877 +0x170
Dec 17 16:38:06 hostname telegraf[306402]: github.com/influxdata/telegraf/plugins/inputs/snmp.(*Field).init(0xc000766708)
Dec 17 16:38:06 hostname telegraf[306402]: /go/src/github.com/influxdata/telegraf/plugins/inputs/snmp/snmp.go:263 +0x73
Dec 17 16:38:06 hostname telegraf[306402]: github.com/influxdata/telegraf/plugins/inputs/snmp.(*Table).Init(0xc0016ede00)
Dec 17 16:38:06 hostname telegraf[306402]: /go/src/github.com/influxdata/telegraf/plugins/inputs/snmp/snmp.go:169 +0xd8
Dec 17 16:38:06 hostname telegraf[306402]: github.com/influxdata/telegraf/plugins/inputs/snmp.(*Snmp).Init(0xc0009f7340)
Dec 17 16:38:06 hostname telegraf[306402]: /go/src/github.com/influxdata/telegraf/plugins/inputs/snmp/snmp.go:110 +0x127
Dec 17 16:38:06 hostname telegraf[306402]: github.com/influxdata/telegraf/models.(*RunningInput).Init(0xc000e4b758)
Dec 17 16:38:06 hostname telegraf[306402]: /go/src/github.com/influxdata/telegraf/models/running_input.go:82 +0x35
Dec 17 16:38:06 hostname telegraf[306402]: github.com/influxdata/telegraf/agent.(*Agent).initPlugins(0xc000184ba8)
Dec 17 16:38:06 hostname telegraf[306402]: /go/src/github.com/influxdata/telegraf/agent/agent.go:189 +0x96
Dec 17 16:38:06 hostname telegraf[306402]: github.com/influxdata/telegraf/agent.(*Agent).Run(0xc000184ba8, {0x57e5688, 0xc000acc380})
Dec 17 16:38:06 hostname telegraf[306402]: /go/src/github.com/influxdata/telegraf/agent/agent.go:105 +0x185
Dec 17 16:38:06 hostname telegraf[306402]: main.runAgent({0x57e5688, 0xc000acc380}, {0x837a278, 0x0, 0x0}, {0x837a278, 0x0, 0x0})
Dec 17 16:38:06 hostname telegraf[306402]: /go/src/github.com/influxdata/telegraf/cmd/telegraf/telegraf.go:312 +0xc57
Dec 17 16:38:06 hostname telegraf[306402]: main.reloadLoop({0x837a278, 0x0, 0x0}, {0x837a278, 0x0, 0x0})
Dec 17 16:38:06 hostname telegraf[306402]: /go/src/github.com/influxdata/telegraf/cmd/telegraf/telegraf.go:147 +0x28a
Dec 17 16:38:06 hostname telegraf[306402]: main.run(...)
Dec 17 16:38:06 hostname telegraf[306402]: /go/src/github.com/influxdata/telegraf/cmd/telegraf/telegraf_posix.go:8
Dec 17 16:38:06 hostname telegraf[306402]: main.main()
Dec 17 16:38:06 hostname telegraf[306402]: /go/src/github.com/influxdata/telegraf/cmd/telegraf/telegraf.go:485 +0xa9a
Dec 17 16:38:06 hostname systemd[1]: telegraf.service: Main process exited, code=exited, status=2/INVALIDARGUMENT
Dec 17 16:38:06 hostname systemd[1]: telegraf.service: Failed with result 'exit-code'.
Dec 17 16:38:07 hostname systemd[1]: telegraf.service: Service RestartSec=100ms expired, scheduling restart.
Dec 17 16:38:07 hostname systemd[1]: telegraf.service: Scheduled restart job, restart counter is at 26.
Dec 17 16:38:07 hostname systemd[1]: Stopped The plugin-driven server agent for reporting metrics into InfluxDB.
System info
Telegraf 1.21.1
Docker
No response
Steps to reproduce
- After the agent is restarted, it crashes and is restarted by systemd every 100ms because of the error “panic: runtime error: index out of range [1] with length 1”
- rollback to 1.21.0 does not fix the issue
- rollback to 1.20.4 fixes the issue
Expected behavior
to work like 1.20.4
Actual behavior
crashes
Additional info
I am not sure whether a specific SNMP MIB file causes this. I cannot see any reference in the trace to what causes it, other than the SnmpTranslate call.
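For readers unfamiliar with this class of Go panic, here is a minimal sketch of how "index out of range [1] with length 1" can arise. It is not the code at translate.go:153, and the assumption that the failure involves splitting a "MODULE::name" string is only an illustration. `splitOID` and `unsafeName` are hypothetical helpers invented for this example.

```go
package main

import (
	"fmt"
	"strings"
)

// splitOID is a hypothetical helper, not Telegraf's code. It splits a
// "MODULE::name" string and returns both halves, reporting an error
// instead of panicking when the separator is missing.
func splitOID(s string) (module, name string, err error) {
	parts := strings.SplitN(s, "::", 2)
	if len(parts) < 2 {
		return "", "", fmt.Errorf("no module separator in %q", s)
	}
	return parts[0], parts[1], nil
}

// unsafeName shows the unchecked variant: parts[1] panics when the
// input has no "::", because SplitN then returns a slice of length 1.
func unsafeName(s string) string {
	parts := strings.SplitN(s, "::", 2)
	return parts[1]
}

func main() {
	// Well-formed input: both halves are available.
	if _, name, err := splitOID("IF-MIB::ifDescr"); err == nil {
		fmt.Println(name)
	}
	// Numeric OID without a module prefix: the checked variant
	// returns an error.
	if _, _, err := splitOID(".1.3.6.1.2.1.1.5.0"); err != nil {
		fmt.Println("error:", err)
	}
	// The unchecked variant panics on the same input; recover here
	// only so the example can print the runtime error.
	defer func() {
		fmt.Println("recovered:", recover())
	}()
	unsafeName(".1.3.6.1.2.1.1.5.0")
}
```

The unchecked variant fails with the same kind of runtime error as in the log above. Because the panic happens in goroutine 1 during plugin initialization and nothing recovers it, the whole process exits and systemd restarts it.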
About this issue
- Original URL
- State: closed
- Created 3 years ago
- Comments: 44 (19 by maintainers)
It would be nice if this were left open til the release, so that we would get notifications when the release happens.
FWIW this issue is still VERY much a problem - I’ve got issues across a wide MIB list using v1.21.3-1
The current fix (for me, and others) was downgrading to 1.20.4-1 on my Ubuntu machine via:
curl -LO -C - https://dl.influxdata.com/telegraf/releases/telegraf_1.20.4-1_amd64.deb
sudo dpkg -i telegraf_1.20.4-1_amd64.deb
Be sure to set a hold on telegraf too!
sudo apt-mark hold telegraf
Once I downgraded, my data was once again making it to Influx & I was able to visualize my data in Grafana again
v1.21.2 which contains numerous SNMP fixes is now out. Thanks!
Looks like you might not have the repository added for telegraf?
Easiest way would be to download the .deb and install via dpkg.
If you would prefer to manage it via apt, make sure you add the source: https://docs.influxdata.com/telegraf/v1.21/introduction/installation/#ubuntu--debian
I ran your config with your MIBs. I had to comment out HOST-RESOURCES-MIB, as that was not included. I ran it on this PR, as it is what is going to fix the panic, and I am getting "performing get on field sysName: marshal: marshalPDU: unable to marshal varbind list: unable to marshal OID: Invalid object identifier", which is an issue with gosnmp and can be tracked via this ticket. Thank you for your patience with all of these kinks while we get switched over to gosmi!
I have tried to implement what the owner of the library recommended. If you wouldn’t mind testing this draft PR and giving some feedback, it would be greatly appreciated 😃
We have this panic on a similar MIB. Unfortunately, it is an issue in the new gosmi library we are using to parse MIBs. I logged an issue with the library here, if you would like to follow it. I will also keep you updated on the status. Thank you for your continued support!
I don’t know if this is the same issue, but I see similarities and thought I would post it here. Let me know if I should post it as a new issue, or if you would like more information. This same telegraf.conf has been used for around a year without any issue with previous Telegraf versions. Thanks for your help and for working on a great product.
Telegraf installation on a Raspberry Pi running Ubuntu.
telegraf/unknown,now 1.21.1-1 arm64 [installed]
Relevant logs:
I’m only using the snmp plugin to monitor a Ubiquiti EdgeRouter X, and here are the relevant logs: