go-carbon: [BUG] find failed, can't expand globs
Describe the bug We attempted to upgrade from go-carbon 0.14.0 to version 0.15.0. Initially the upgrade seemed to go fine, but shortly afterwards we began seeing no data in Grafana and traced the issue back to these errors in the go-carbon logs.
Logs The following log has been sanitized to mask the hostname and the metric string. The metric string contained dot-delimited alphanumeric characters but no wildcard characters.
Oct 20 15:49:35 hostname go-carbon[130923]: {"level":"ERROR","timestamp":"2020-10-20T15:49:35.633Z","logger":"access","message":"fetch failed","handler":"render","url":"/render/?format=carbonapi_v3_pb","peer":"127.0.0.1:33600","carbonapi_uuid":"54ff70c2-a91b-47d4-82e0-e95e4a1685c0","carbonzipper_uuid":"54ff70c2-a91b-47d4-82e0-e95e4a1685c0","format":"carbonapi_v3_pb","targets":["foo.bar.baz"],"runtime_seconds":0.000319742,"reason":"failed to read data","http_code":400,"error":"find failed, can't expand globs"}
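To help localize the failure, here is a minimal sketch (not part of the original report) that queries carbonserver's graphite-web-compatible /metrics/find endpoint directly, taking carbonapi out of the picture. The listen address comes from the [carbonserver] section below, and foo.bar.baz stands in for the sanitized metric name:

package main

import (
	"fmt"
	"io"
	"net/http"
	"net/url"
)

func main() {
	// Query carbonserver's find endpoint directly on the [carbonserver] listen port.
	query := url.QueryEscape("foo.bar.baz") // sanitized metric from the log above
	resp, err := http.Get("http://127.0.0.1:8080/metrics/find/?format=json&query=" + query)
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	body, err := io.ReadAll(resp.Body)
	if err != nil {
		panic(err)
	}
	// A 400 with the same "can't expand globs" message here would pin the
	// failure on carbonserver's find/glob path rather than on carbonapi.
	fmt.Println(resp.Status)
	fmt.Println(string(body))
}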
Go-carbon Configuration:
[common]
user = "carbon"
graph-prefix = "carbon.agents.{host}"
metric-endpoint = "local"
max-cpu = 16
metric-interval = "1m0s"
[whisper]
data-dir = "/data/graphite/whisper"
schemas-file = "/etc/go-carbon/storage-schemas.conf"
#aggregation-file = "/etc/go-carbon/storage-aggregation.conf"
workers = 32
max-updates-per-second = 70000
max-creates-per-second = 2000
hard-max-creates-per-second = false
sparse-create = false
flock = false
hash-filenames = false
enabled = true
[cache]
max-size = 80000000
write-strategy = "max"
[udp]
listen = "127.0.0.1:2003"
log-incomplete = true
buffer-size = 0
enabled = true
[tcp]
listen = "127.0.0.1:2003"
buffer-size = 0
#compression = ""
enabled = true
[pickle]
listen = "127.0.0.1:2004"
max-message-size = 67108864
buffer-size = 0
enabled = true
[carbonlink]
listen = "127.0.0.1:7002"
read-timeout = "30s"
enabled = true
[grpc]
listen = "127.0.0.1:7003"
enabled = true
[tags]
tagdb-url = "http://127.0.0.1:8000"
tagdb-chunk-size = 32
tagdb-update-interval = 100
local-dir = "/var/lib/graphite/tagging/"
tagdb-timeout = "1s"
enabled = false
[carbonserver]
listen = "127.0.0.1:8080"
query-cache-enabled = true
query-cache-size-mb = 0
find-cache-enabled = true
buckets = 10
max-globs = 100
fail-on-max-globs = false
metrics-as-counters = false
trigram-index = false
trie-index = true
internal-stats-dir = ""
read-timeout = "1m0s"
idle-timeout = "1m0s"
write-timeout = "1m0s"
scan-frequency = "5m0s"
enabled = true
[dump]
path = "/var/lib/graphite/dump/"
restore-per-second = 0
enabled = false
[pprof]
listen = "127.0.0.1:7007"
enabled = true
[prometheus]
endpoint = "/metrics"
enabled = true
[prometheus.labels]
[[logging]]
logger = ""
file = "stderr"
level = "warn"
encoding = "json"
encoding-time = "iso8601"
encoding-duration = "seconds"
Metric retention and aggregation schemas storage-schemas.conf:
[01.carbon]
pattern = ^carbon\.
retentions = 60s:15d,5m:180d
[02.foo]
pattern = ^foo\.bar\.
retentions = 60s:15d,5m:180d
[03.metric_testing]
pattern = ^metric_testing
retentions = 60s:1d,5m:7d
[04.grafana]
pattern = ^grafana\.
retentions = 60s:15d
[99.default_1min_for_90day]
pattern = .*
retentions = 60s:15d,5m:180d
Simplified query (if applicable) See logging output.
Additional context This is go-carbon version 0.15.0 running on Ubuntu 16.04.7 LTS. It is being used as a backendsv2 backend (protocol carbonapi_v3_pb) for carbonapi version 0.14.1-1.
About this issue
- Original URL
- State: closed
- Created 4 years ago
- Comments: 35 (22 by maintainers)
Commits related to this issue
- carbonserver: fix findError information lost due to unexported fields zap seems ignored unexported fields in struct findError which makes it useless. This commit fix the issue by converting it to a s... — committed to go-graphite/go-carbon by bom-d-van 4 years ago
- carbonserver: fix findError information lost due to unexported fields (#380) zap seems ignored unexported fields in struct findError which makes it useless. This commit fix the issue by converting i... — committed to go-graphite/go-carbon by bom-d-van 4 years ago
- trie: support non-ascii text in metric names and queries #379 This would come with a slightly higher memory overhead for parsing queries. — committed to go-graphite/go-carbon by bom-d-van 4 years ago
- trie/bug fixes and finally adds some simple fuzzing logics (#383) * trie: support non-ascii text in metric names and queries #379 This would come with a slightly higher memory overhead for parsing... — committed to go-graphite/go-carbon by bom-d-van 4 years ago
I can’t really speak to what a sane default might be. I think the issue here is less about the value and more about making sure folks are aware of it during the upgrade cycle. Hopefully it’s far enough in the past that it won’t matter for most other users.
But in our case, the one query that surfaced this issue was returning almost 30k metrics. I don’t think we have a good way of measuring per-query metric numbers (this would be a useful metric to export imho, either in go-carbon or carbonapi), so for now we’re erring on the side of caution and increasing our setting to 100k.
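For reference, a sketch of the [carbonserver] override we ended up with. The option names max-metrics-globbed and max-metrics-rendered are, as I understand it, the limits introduced around 0.15.0; verify them against your go-carbon.conf.example before copying:

[carbonserver]
# assumed option names for the 0.15.0 glob/render limits; verify locally
max-metrics-globbed = 100000
max-metrics-rendered = 100000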
That XXX_NoUnkeyedLiteral might be related to: https://github.com/go-graphite/protocol/commit/ec66858bcd41d96d7fd8aac4ffcc643763ad2cfd
The currently pinned version (https://github.com/go-graphite/go-carbon/blob/master/go.mod#L21) is below that commit, and two new fields have been introduced since then (go-graphite/carbonapi uses them now).
The good news is that they do not matter much for go-carbon (it could be somewhat useful to take them into account if a graphite-web 1.1.x instance is pointing at carbonapi, but that is not a hard requirement).
"debug"would be better.There were no other logs but we were logging at
level = "warn". Would it help for me to increase logging output?