go-carbon: [BUG] Go-carbon open-files keeps increasing over time
Describe the bug The number of file descriptors used by go-carbon appears to increase indefinitely. We originally had a low limit on open files (the default of 1024), which caused it to crash after a few days. The limit has since been raised and is monitored, and we can see the count grow by 25-40 per hour on my server. I’m running 0.15.0 on an old Ubuntu 14.04. I also have carbonserver enabled; it handles some requests (not many), as most still go through Graphite Web (Gunicorn). We may not have had this issue before carbonserver was enabled.
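As a rough way to reason about the numbers in the description, the sketch below estimates how long a process can run before reaching its descriptor limit at a steady leak rate. The function name is hypothetical, and the baseline of 56 descriptors is an assumption borrowed from the lsof count of non-leaked entries further down in this report; `resource.getrlimit` is the standard-library call for reading the current limit on Unix.

```python
import resource


def hours_until_exhaustion(limit: int, in_use: int, leak_per_hour: float) -> float:
    """Hours until `in_use` reaches `limit`, assuming a constant leak rate."""
    if leak_per_hour <= 0:
        return float("inf")
    return max(limit - in_use, 0) / leak_per_hour


if __name__ == "__main__":
    # Limits of the current process (go-carbon's own limits may differ).
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    print(f"soft limit={soft}, hard limit={hard}")

    # Assumed scenario: limit of 1024, about 56 descriptors in normal use,
    # leaking between 25 and 40 descriptors per hour.
    for rate in (25, 40):
        hours = hours_until_exhaustion(1024, 56, rate)
        print(f"{rate}/hour -> about {hours:.1f} hours")
```

The estimate is only as good as the constant-rate assumption; if the leak is tied to request volume, the real time to exhaustion will vary with traffic.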
Logs I have the normal go-carbon log, but no debug logs. The logs do contain some panics for accessing metrics that don’t exist (in case it’s somehow related).
Std issue when data is not found?
[2020-09-29T06:45:03.129Z] ERROR [access] panic recovered {"handler": "render", "url": "/render/?format=protobuf&from=1601361602&target=rancher-collectd-statsd.production.statsd.count-pe.%2A.resolver.%2A.%7Bsoft%2Chard%7D.statistic.count&until=1601361902", "peer": "10.16.10.219:45410", "targets": ["rancher-collectd-statsd.production.statsd.count-pe.*.resolver.*.{soft,hard}.statistic.count"], "format": "carbonapi_v2_pb", "carbonzipper_uuid": "8d980c88-50e0-41a8-884c-ed5f240722c5", "carbonapi_uuid": "8d980c88-50e0-41a8-884c-ed5f240722c5", "stack": "github.com/go-graphite/go-carbon/carbonserver.(*CarbonserverListener).renderHandler.func1\n\t/root/go/src/github.com/go-graphite/go-carbon/carbonserver/render.go:193\nruntime.gopanic\n\t/usr/local/go/src/runtime/panic.go:967\nruntime.panicmem\n\t/usr/local/go/src/runtime/panic.go:212\nruntime.sigpanic\n\t/usr/local/go/src/runtime/signal_unix.go:687\ngithub.com/go-graphite/go-carbon/carbonserver.(*CarbonserverListener).getExpandedGlobs\n\t/root/go/src/github.com/go-graphite/go-carbon/carbonserver/find.go:355\ngithub.com/go-graphite/go-carbon/carbonserver.(*CarbonserverListener).prepareDataProto\n\t/root/go/src/github.com/go-graphite/go-carbon/carbonserver/render.go:343\ngithub.com/go-graphite/go-carbon/carbonserver.(*CarbonserverListener).fetchWithCache\n\t/root/go/src/github.com/go-graphite/go-carbon/carbonserver/render.go:301\ngithub.com/go-graphite/go-carbon/carbonserver.(*CarbonserverListener).renderHandler\n\t/root/go/src/github.com/go-graphite/go-carbon/carbonserver/render.go:207\nnet/http.HandlerFunc.ServeHTTP\n\t/usr/local/go/src/net/http/server.go:2012\ngithub.com/go-graphite/go-carbon/carbonserver.TraceHandler.func1\n\t/root/go/src/github.com/go-graphite/go-carbon/carbonserver/carbonserver.go:186\ngithub.com/dgryski/httputil.TimeHandler.func1\n\t/root/go/src/github.com/go-graphite/go-carbon/vendor/github.com/dgryski/httputil/times.go:26\ngithub.com/dgryski/httputil.TrackConnections.func1\n\t/root/go/src/github.com/go-graphite/go-carbon/vendor/github.com/dgryski/httputil/track.go:40\nnet/http.HandlerFunc.ServeHTTP\n\t/usr/local/go/src/net/http/server.go:2012\nnet/http.(*ServeMux).ServeHTTP\n\t/usr/local/go/src/net/http/server.go:2387\ngithub.com/NYTimes/gziphandler.GzipHandlerWithOpts.func1.1\n\t/root/go/src/github.com/go-graphite/go-carbon/vendor/github.com/NYTimes/gziphandler/gzip.go:287\nnet/http.HandlerFunc.ServeHTTP\n\t/usr/local/go/src/net/http/server.go:2012\nnet/http.serverHandler.ServeHTTP\n\t/usr/local/go/src/net/http/server.go:2807\nnet/http.(*conn).serve\n\t/usr/local/go/src/net/http/server.go:1895", "error": "runtime error: invalid memory address or nil pointer dereference"}
[2020-09-29T06:45:03.130Z] ERROR [access] fetch failed {"handler": "render", "url": "/render/?format=protobuf&from=1601361602&target=rancher-collectd-statsd.production.statsd.count-pe.%2A.resolver.%2A.%7Bsoft%2Chard%7D.statistic.count&until=1601361902", "peer": "10.16.10.219:45410", "carbonzipper_uuid": "8d980c88-50e0-41a8-884c-ed5f240722c5", "carbonapi_uuid": "8d980c88-50e0-41a8-884c-ed5f240722c5", "format": "carbonapi_v2_pb", "targets": ["rancher-collectd-statsd.production.statsd.count-pe.*.resolver.*.{soft,hard}.statistic.count"], "runtime_seconds": 0.186705487, "reason": "panic during serving the request", "stack": "github.com/go-graphite/go-carbon/carbonserver.(*CarbonserverListener).renderHandler.func1\n\t/root/go/src/github.com/go-graphite/go-carbon/carbonserver/render.go:199\nruntime.gopanic\n\t/usr/local/go/src/runtime/panic.go:967\nruntime.panicmem\n\t/usr/local/go/src/runtime/panic.go:212\nruntime.sigpanic\n\t/usr/local/go/src/runtime/signal_unix.go:687\ngithub.com/go-graphite/go-carbon/carbonserver.(*CarbonserverListener).getExpandedGlobs\n\t/root/go/src/github.com/go-graphite/go-carbon/carbonserver/find.go:355\ngithub.com/go-graphite/go-carbon/carbonserver.(*CarbonserverListener).prepareDataProto\n\t/root/go/src/github.com/go-graphite/go-carbon/carbonserver/render.go:343\ngithub.com/go-graphite/go-carbon/carbonserver.(*CarbonserverListener).fetchWithCache\n\t/root/go/src/github.com/go-graphite/go-carbon/carbonserver/render.go:301\ngithub.com/go-graphite/go-carbon/carbonserver.(*CarbonserverListener).renderHandler\n\t/root/go/src/github.com/go-graphite/go-carbon/carbonserver/render.go:207\nnet/http.HandlerFunc.ServeHTTP\n\t/usr/local/go/src/net/http/server.go:2012\ngithub.com/go-graphite/go-carbon/carbonserver.TraceHandler.func1\n\t/root/go/src/github.com/go-graphite/go-carbon/carbonserver/carbonserver.go:186\ngithub.com/dgryski/httputil.TimeHandler.func1\n\t/root/go/src/github.com/go-graphite/go-carbon/vendor/github.com/dgryski/httputil/times.go:26\ngithub.com/dgryski/httputil.TrackConnections.func1\n\t/root/go/src/github.com/go-graphite/go-carbon/vendor/github.com/dgryski/httputil/track.go:40\nnet/http.HandlerFunc.ServeHTTP\n\t/usr/local/go/src/net/http/server.go:2012\nnet/http.(*ServeMux).ServeHTTP\n\t/usr/local/go/src/net/http/server.go:2387\ngithub.com/NYTimes/gziphandler.GzipHandlerWithOpts.func1.1\n\t/root/go/src/github.com/go-graphite/go-carbon/vendor/github.com/NYTimes/gziphandler/gzip.go:287\nnet/http.HandlerFunc.ServeHTTP\n\t/usr/local/go/src/net/http/server.go:2012\nnet/http.serverHandler.ServeHTTP\n\t/usr/local/go/src/net/http/server.go:2807\nnet/http.(*conn).serve\n\t/usr/local/go/src/net/http/server.go:1895", "error": "runtime error: invalid memory address or nil pointer dereference", "http_code": 500}
[2020-09-29T06:47:02.834Z] ERROR [access] panic recovered {"handler": "render", "url": "/render/?format=protobuf&from=1601361722&target=rancher-collectd-statsd.production.statsd.count-pe.%2A.resolver.%2A.%7Bsoft%2Chard%7D.statistic.count&until=1601362022", "peer": "10.16.10.219:45490", "targets": ["rancher-collectd-statsd.production.statsd.count-pe.*.resolver.*.{soft,hard}.statistic.count"], "format": "carbonapi_v2_pb", "carbonapi_uuid": "3c97f7b8-e63f-428f-94fe-c778caa2e3d6", "carbonzipper_uuid": "3c97f7b8-e63f-428f-94fe-c778caa2e3d6", "stack": "github.com/go-graphite/go-carbon/carbonserver.(*CarbonserverListener).renderHandler.func1\n\t/root/go/src/github.com/go-graphite/go-carbon/carbonserver/render.go:193\nruntime.gopanic\n\t/usr/local/go/src/runtime/panic.go:967\nruntime.panicmem\n\t/usr/local/go/src/runtime/panic.go:212\nruntime.sigpanic\n\t/usr/local/go/src/runtime/signal_unix.go:687\ngithub.com/go-graphite/go-carbon/carbonserver.(*CarbonserverListener).getExpandedGlobs\n\t/root/go/src/github.com/go-graphite/go-carbon/carbonserver/find.go:355\ngithub.com/go-graphite/go-carbon/carbonserver.(*CarbonserverListener).prepareDataProto\n\t/root/go/src/github.com/go-graphite/go-carbon/carbonserver/render.go:343\ngithub.com/go-graphite/go-carbon/carbonserver.(*CarbonserverListener).fetchWithCache\n\t/root/go/src/github.com/go-graphite/go-carbon/carbonserver/render.go:301\ngithub.com/go-graphite/go-carbon/carbonserver.(*CarbonserverListener).renderHandler\n\t/root/go/src/github.com/go-graphite/go-carbon/carbonserver/render.go:207\nnet/http.HandlerFunc.ServeHTTP\n\t/usr/local/go/src/net/http/server.go:2012\ngithub.com/go-graphite/go-carbon/carbonserver.TraceHandler.func1\n\t/root/go/src/github.com/go-graphite/go-carbon/carbonserver/carbonserver.go:186\ngithub.com/dgryski/httputil.TimeHandler.func1\n\t/root/go/src/github.com/go-graphite/go-carbon/vendor/github.com/dgryski/httputil/times.go:26\ngithub.com/dgryski/httputil.TrackConnections.func1\n\t/root/go/src/github.com/go-graphite/go-carbon/vendor/github.com/dgryski/httputil/track.go:40\nnet/http.HandlerFunc.ServeHTTP\n\t/usr/local/go/src/net/http/server.go:2012\nnet/http.(*ServeMux).ServeHTTP\n\t/usr/local/go/src/net/http/server.go:2387\ngithub.com/NYTimes/gziphandler.GzipHandlerWithOpts.func1.1\n\t/root/go/src/github.com/go-graphite/go-carbon/vendor/github.com/NYTimes/gziphandler/gzip.go:287\nnet/http.HandlerFunc.ServeHTTP\n\t/usr/local/go/src/net/http/server.go:2012\nnet/http.serverHandler.ServeHTTP\n\t/usr/local/go/src/net/http/server.go:2807\nnet/http.(*conn).serve\n\t/usr/local/go/src/net/http/server.go:1895", "error": "runtime error: invalid memory address or nil pointer dereference"}
[2020-09-29T06:47:02.834Z] ERROR [access] fetch failed {"handler": "render", "url": "/render/?format=protobuf&from=1601361722&target=rancher-collectd-statsd.production.statsd.count-pe.%2A.resolver.%2A.%7Bsoft%2Chard%7D.statistic.count&until=1601362022", "peer": "10.16.10.219:45490", "carbonapi_uuid": "3c97f7b8-e63f-428f-94fe-c778caa2e3d6", "carbonzipper_uuid": "3c97f7b8-e63f-428f-94fe-c778caa2e3d6", "format": "carbonapi_v2_pb", "targets": ["rancher-collectd-statsd.production.statsd.count-pe.*.resolver.*.{soft,hard}.statistic.count"], "runtime_seconds": 0.192304691, "reason": "panic during serving the request", "stack": "github.com/go-graphite/go-carbon/carbonserver.(*CarbonserverListener).renderHandler.func1\n\t/root/go/src/github.com/go-graphite/go-carbon/carbonserver/render.go:199\nruntime.gopanic\n\t/usr/local/go/src/runtime/panic.go:967\nruntime.panicmem\n\t/usr/local/go/src/runtime/panic.go:212\nruntime.sigpanic\n\t/usr/local/go/src/runtime/signal_unix.go:687\ngithub.com/go-graphite/go-carbon/carbonserver.(*CarbonserverListener).getExpandedGlobs\n\t/root/go/src/github.com/go-graphite/go-carbon/carbonserver/find.go:355\ngithub.com/go-graphite/go-carbon/carbonserver.(*CarbonserverListener).prepareDataProto\n\t/root/go/src/github.com/go-graphite/go-carbon/carbonserver/render.go:343\ngithub.com/go-graphite/go-carbon/carbonserver.(*CarbonserverListener).fetchWithCache\n\t/root/go/src/github.com/go-graphite/go-carbon/carbonserver/render.go:301\ngithub.com/go-graphite/go-carbon/carbonserver.(*CarbonserverListener).renderHandler\n\t/root/go/src/github.com/go-graphite/go-carbon/carbonserver/render.go:207\nnet/http.HandlerFunc.ServeHTTP\n\t/usr/local/go/src/net/http/server.go:2012\ngithub.com/go-graphite/go-carbon/carbonserver.TraceHandler.func1\n\t/root/go/src/github.com/go-graphite/go-carbon/carbonserver/carbonserver.go:186\ngithub.com/dgryski/httputil.TimeHandler.func1\n\t/root/go/src/github.com/go-graphite/go-carbon/vendor/github.com/dgryski/httputil/times.go:26\ngithub.com/dgryski/httputil.TrackConnections.func1\n\t/root/go/src/github.com/go-graphite/go-carbon/vendor/github.com/dgryski/httputil/track.go:40\nnet/http.HandlerFunc.ServeHTTP\n\t/usr/local/go/src/net/http/server.go:2012\nnet/http.(*ServeMux).ServeHTTP\n\t/usr/local/go/src/net/http/server.go:2387\ngithub.com/NYTimes/gziphandler.GzipHandlerWithOpts.func1.1\n\t/root/go/src/github.com/go-graphite/go-carbon/vendor/github.com/NYTimes/gziphandler/gzip.go:287\nnet/http.HandlerFunc.ServeHTTP\n\t/usr/local/go/src/net/http/server.go:2012\nnet/http.serverHandler.ServeHTTP\n\t/usr/local/go/src/net/http/server.go:2807\nnet/http.(*conn).serve\n\t/usr/local/go/src/net/http/server.go:1895", "error": "runtime error: invalid memory address or nil pointer dereference", "http_code": 500}
[2020-09-29T06:51:02.791Z] ERROR [access] panic recovered {"handler": "render", "url": "/render/?format=protobuf&from=1601361962&target=rancher-collectd-statsd.production.statsd.count-pe.%2A.resolver.%2A.%7Bsoft%2Chard%7D.statistic.count&until=1601362262", "peer": "10.16.10.219:45610", "targets": ["rancher-collectd-statsd.production.statsd.count-pe.*.resolver.*.{soft,hard}.statistic.count"], "format": "carbonapi_v2_pb", "carbonapi_uuid": "dc38c2aa-3b42-467e-bac2-a5b449d1b090", "carbonzipper_uuid": "dc38c2aa-3b42-467e-bac2-a5b449d1b090", "stack": "github.com/go-graphite/go-carbon/carbonserver.(*CarbonserverListener).renderHandler.func1\n\t/root/go/src/github.com/go-graphite/go-carbon/carbonserver/render.go:193\nruntime.gopanic\n\t/usr/local/go/src/runtime/panic.go:967\nruntime.panicmem\n\t/usr/local/go/src/runtime/panic.go:212\nruntime.sigpanic\n\t/usr/local/go/src/runtime/signal_unix.go:687\ngithub.com/go-graphite/go-carbon/carbonserver.(*CarbonserverListener).getExpandedGlobs\n\t/root/go/src/github.com/go-graphite/go-carbon/carbonserver/find.go:355\ngithub.com/go-graphite/go-carbon/carbonserver.(*CarbonserverListener).prepareDataProto\n\t/root/go/src/github.com/go-graphite/go-carbon/carbonserver/render.go:343\ngithub.com/go-graphite/go-carbon/carbonserver.(*CarbonserverListener).fetchWithCache\n\t/root/go/src/github.com/go-graphite/go-carbon/carbonserver/render.go:301\ngithub.com/go-graphite/go-carbon/carbonserver.(*CarbonserverListener).renderHandler\n\t/root/go/src/github.com/go-graphite/go-carbon/carbonserver/render.go:207\nnet/http.HandlerFunc.ServeHTTP\n\t/usr/local/go/src/net/http/server.go:2012\ngithub.com/go-graphite/go-carbon/carbonserver.TraceHandler.func1\n\t/root/go/src/github.com/go-graphite/go-carbon/carbonserver/carbonserver.go:186\ngithub.com/dgryski/httputil.TimeHandler.func1\n\t/root/go/src/github.com/go-graphite/go-carbon/vendor/github.com/dgryski/httputil/times.go:26\ngithub.com/dgryski/httputil.TrackConnections.func1\n\t/root/go/src/github.com/go-graphite/go-carbon/vendor/github.com/dgryski/httputil/track.go:40\nnet/http.HandlerFunc.ServeHTTP\n\t/usr/local/go/src/net/http/server.go:2012\nnet/http.(*ServeMux).ServeHTTP\n\t/usr/local/go/src/net/http/server.go:2387\ngithub.com/NYTimes/gziphandler.GzipHandlerWithOpts.func1.1\n\t/root/go/src/github.com/go-graphite/go-carbon/vendor/github.com/NYTimes/gziphandler/gzip.go:287\nnet/http.HandlerFunc.ServeHTTP\n\t/usr/local/go/src/net/http/server.go:2012\nnet/http.serverHandler.ServeHTTP\n\t/usr/local/go/src/net/http/server.go:2807\nnet/http.(*conn).serve\n\t/usr/local/go/src/net/http/server.go:1895", "error": "runtime error: invalid memory address or nil pointer dereference"}
and
At the same time, the cached memory started increasing from 29.1GB, reaching 42.4GB about 2 hours later. It stayed at that level for another 2 hours; at 03:13 this metric stopped updating.
[2020-09-29T22:23:29.765Z] ERROR [persister] failed to open whisper file {"path": "/data/graphite/whisper/publish-es-7-staging-master-a-01/statsd/gauge-elasticsearch/nodes/publish-es-7-staging-data-c-01/euc1/aws/flcn/io/thread_pool/ml_datafeed/completed.wsp", "error": "open /data/graphite/whisper/publish-es-7-staging-master-a-01/statsd/gauge-elasticsearch/nodes/publish-es-7-staging-data-c-01/euc1/aws/flcn/io/thread_pool/ml_datafeed/completed.wsp: too many open files"}
[2020-09-29T22:23:52.247Z] INFO [carbonserver] error processing {"handler": "fileListUpdated", "path": "/data/graphite/whisper/es-7-staging-master-b-01/statsd/gauge-elasticsearch/indices/polkagris-2020/40/total/segments", "error": "open /data/graphite/whisper/es-7-staging-master-b-01/statsd/gauge-elasticsearch/indices/polkagris-2020/40/total/segments: too many open files"}
[2020-09-29T22:23:52.247Z] INFO [carbonserver] error processing {"handler": "fileListUpdated", "path": "/data/graphite/whisper/es-7-staging-master-b-01/statsd/gauge-elasticsearch/indices/polkagris-2020/40/total/store", "error": "open /data/graphite/whisper/es-7-staging-master-b-01/statsd/gauge-elasticsearch/indices/polkagris-2020/40/total/store: too many open files"}
[2020-09-29T22:23:52.247Z] INFO [carbonserver] error processing {"handler": "fileListUpdated", "path": "/data/graphite/whisper/es-7-staging-master-b-01/statsd/gauge-elasticsearch/indices/polkagris-2020/40/total/translog", "error": "open /data/graphite/whisper/es-7-staging-master-b-01/statsd/gauge-elasticsearch/indices/polkagris-2020/40/total/translog: too many open files"}
[2020-09-29T22:23:52.247Z] INFO [carbonserver] error processing {"handler": "fileListUpdated", "path": "/data/graphite/whisper/es-7-staging-master-b-01/statsd/gauge-elasticsearch/indices/polkagris-2020/40/total/warmer", "error": "open /data/graphite/whisper/es-7-staging-master-b-01/statsd/gauge-elasticsearch/indices/polkagris-2020/40/total/warmer: too many open files"}
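To see which errors dominate a log like the one above, the entries can be grouped by level, logger, and message. The following is a minimal sketch that assumes every entry is a single line in the "mixed" encoding shown here (a bracketed timestamp, a level, a bracketed logger name, a free-text message, then a JSON object); the function names are hypothetical and not part of go-carbon.

```python
import json
import re
from collections import Counter

# [timestamp] LEVEL [logger] message {json fields}
LINE_RE = re.compile(
    r"^\[(?P<ts>[^\]]+)\]\s+(?P<level>[A-Z]+)\s+\[(?P<logger>[^\]]+)\]\s+"
    r"(?P<msg>.*?)\s+(?P<fields>\{.*\})\s*$"
)


def parse_line(line: str):
    """Split one log line into its prefix parts and decoded JSON fields."""
    match = LINE_RE.match(line)
    if match is None:
        return None
    try:
        fields = json.loads(match.group("fields"))
    except json.JSONDecodeError:
        return None  # e.g. a line that was truncated or wrapped
    return {
        "ts": match.group("ts"),
        "level": match.group("level"),
        "logger": match.group("logger"),
        "msg": match.group("msg"),
        "fields": fields,
    }


def summarize(lines):
    """Count entries by (level, logger, message, last part of the error text)."""
    counts = Counter()
    for line in lines:
        record = parse_line(line)
        if record is None:
            continue
        error = str(record["fields"].get("error", ""))
        # "open /path: too many open files" -> "too many open files"
        reason = error.rsplit(": ", 1)[-1]
        counts[(record["level"], record["logger"], record["msg"], reason)] += 1
    return counts
```

Running `summarize(open("/data/graphite/log/carbon/go-carbon.log"))` on the log file configured in this report would then separate the render panics from the "too many open files" failures.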
In the go-carbon graphs, nothing looked unusual until 02:34 or 02:56, when the cache began to grow steadily. The graphs stopped updating before the cache actually hit max-size and started throwing metrics away.
Some other go-carbon metrics stopped updating after 02:42, and more after 03:10, such as the persister metrics, which had looked quite normal up to that point.
Go-carbon Configuration:
# New Go-carbon config options.
[common]
# Run as user. Works only in daemon mode
user = "graphite"
# Prefix for storing all internal go-carbon graphs. Supported macros: {host}
graph-prefix = "carbon.agents.{host}"
#graph-prefix = "stats.carbon.agents.stats-go"
# Endpoint for store internal carbon metrics. Valid values: "" or "local", "tcp://host:port", "udp://host:port"
metric-endpoint = "local"
# Interval of storing internal metrics. Like CARBON_METRIC_INTERVAL
#metric-interval = "1m0s"
metric-interval = "10s"
# Increase for configuration with multi persister workers
#max-cpu = 18
# When fully migrated (and on a c5.9xlarge)
max-cpu = 34
[whisper]
data-dir = "/data/graphite/whisper"
# http://graphite.readthedocs.org/en/latest/config-carbon.html#storage-schemas-conf. Required
schemas-file = "/etc/go-carbon/storage-schemas.conf"
# http://graphite.readthedocs.org/en/latest/config-carbon.html#storage-aggregation-conf. Optional
aggregation-file = "/etc/go-carbon/storage-aggregation.conf"
# Worker threads count. Metrics sharded by "crc32(metricName) % workers"
# workers = 10
# When fully migrated, (half of max-cpu)
workers = 28
# Limits the number of whisper update_many() calls per second. 0 - no limit
max-updates-per-second = 2500
# Softly limits the number of whisper files that get created each second. 0 - no limit
max-creates-per-second = 10
# Make max-creates-per-second a hard limit. Extra new metrics are dropped. A hard throttle of 0 drops all new metrics.
hard-max-creates-per-second = false
# Sparse file creation
sparse-create = false
# use flock on every file call (ensures consistency if there are concurrent read/writes to the same file)
flock = true
enabled = true
# Use hashed filenames for tagged metrics instead of human readable
# https://github.com/go-graphite/go-carbon/pull/225
hash-filenames = false
# specify to enable/disable compressed format (EXPERIMENTAL)
# See details and limitations in https://github.com/go-graphite/go-whisper#compressed-format
# IMPORTANT: Only one process/thread could write to compressed whisper files at a time, especially when you are
# rebalancing graphite clusters (with buckytools, for example), flock needs to be enabled both in go-carbon and your tooling.
compressed = false
# automatically delete empty whisper file caused by edge cases like server reboot
remove-empty-file = false
[cache]
# Limit of in-memory stored points (not metrics)
max-size = 250000000
# Capacity of queue between receivers and cache
# Strategy to persist metrics. Values: "max","sorted","noop"
# "max" - write metrics with most unwritten datapoints first
# "sorted" - sort by timestamp of first unwritten datapoint.
# "noop" - pick metrics to write in unspecified order,
# requires least CPU and improves cache responsiveness
write-strategy = "noop"
[udp]
listen = ":2413"
enabled = true
# Optional internal queue between receiver and cache
buffer-size = 0
[tcp]
listen = ":2413"
enabled = true
# Optional internal queue between receiver and cache
buffer-size = 0
[pickle]
listen = ":2414"
# Limit message size for prevent memory overflow
max-message-size = 67108864
enabled = true
# Optional internal queue between receiver and cache
buffer-size = 0
# You can define unlimited count of additional receivers
# Common definition scheme:
# [receiver.<any receiver name>]
# protocol = "<any supported protocol>"
# <protocol specific options>
#
# All available protocols:
#
# [receiver.udp2]
# protocol = "udp"
# listen = ":2003"
# # Enable optional logging of incomplete messages (chunked by max UDP packet size)
# log-incomplete = false
#
# [receiver.tcp2]
# protocol = "tcp"
# listen = ":2003"
#
# [receiver.pickle2]
# protocol = "pickle"
# listen = ":2004"
# # Limit message size for prevent memory overflow
# max-message-size = 67108864
#
# [receiver.protobuf]
# protocol = "protobuf"
# # Same framing protocol as pickle, but message encoded in protobuf format
# # See https://github.com/go-graphite/go-carbon/blob/master/helper/carbonpb/carbon.proto
# listen = ":2005"
# # Limit message size for prevent memory overflow
# max-message-size = 67108864
#
# [receiver.http]
# protocol = "http"
# # This receiver receives data from POST requests body.
# # Data can be encoded in plain text format (default),
# # protobuf (with Content-Type: application/protobuf header) or
# # pickle (with Content-Type: application/python-pickle header).
# listen = ":2007"
# max-message-size = 67108864
#
[carbonlink]
#listen = "127.0.0.1:7412"
listen = ":7412"
enabled = true
# Close inactive connections after "read-timeout"
read-timeout = "30s"
# grpc api
# protocol: https://github.com/go-graphite/go-carbon/blob/master/helper/carbonpb/carbon.proto
# samples: https://github.com/go-graphite/go-carbon/tree/master/api/sample
[grpc]
#listen = "127.0.0.1:7403"
listen = ":7403"
enabled = true
# http://graphite.readthedocs.io/en/latest/tags.html
[tags]
enabled = false
# TagDB url. It should support /tags/tagMultiSeries endpoint
tagdb-url = "http://127.0.0.1:8000"
tagdb-chunk-size = 32
tagdb-update-interval = 100
# Directory for send queue (based on leveldb)
local-dir = "/var/lib/graphite/tagging/"
# POST timeout
tagdb-timeout = "1s"
[carbonserver]
# Please NOTE: carbonserver is not intended to fully replace graphite-web
# It acts as a "REMOTE_STORAGE" for graphite-web or carbonzipper/carbonapi
listen = ":8082"
# Carbonserver support is still experimental and may contain bugs
# Or be incompatible with github.com/grobian/carbonserver
enabled = true
# Buckets to track response times
buckets = 10
# carbonserver-specific metrics will be sent as counters
# For compatibility with grobian/carbonserver
metrics-as-counters = false
# Read and Write timeouts for HTTP server
read-timeout = "60s"
write-timeout = "60s"
# Enable /render cache, it will cache the result for 1 minute
query-cache-enabled = true
# 0 for unlimited
query-cache-size-mb = 500
# Enable /metrics/find cache, it will cache the result for 5 minutes
find-cache-enabled = true
# Control trigram index
# This index is used to speed-up /find requests
# However, it will lead to increased memory consumption
# Estimated memory consumption is approx. 500 bytes per each metric on disk
# Another drawback is that it will recreate index every scan-frequency interval
# All new/deleted metrics will still be searchable until index is recreated
trigram-index = true
# carbonserver keeps track of all available whisper files
# in memory. This determines how often it will check FS
# for new or deleted metrics.
scan-frequency = "5m0s"
# Control trie index (EXPERIMENTAL)
# This index is built as an alternative to trigram index, with shorter indexing
# time and less memory usage (around 2 - 5 times). For most of the queries,
# trie is faster than trigram. For queries with a keyword wrapped by wildcards
# (like ns1.ns2.*keyword*.metric), trigram index performs better. Trie index
# could be sped up by adding trigrams to the trie, at some cost in
# memory usage (by setting both trie-index and trigram-index to true).
trie-index = false
# This provides the ability to query for new metrics without any wsp files
# i.e query for metrics present only in cache. Does a cache-scan and
# populates index with metrics with or without corresponding wsp files,
# but will lead to increased memory consumption. Disabled by default.
# (EXPERIMENTAL)
cache-scan = false
# Maximum amount of globs in a single metric in index
# This value is used to speed-up /find requests with
# a lot of globs, but will lead to increased memory consumption
max-globs = 100
# Fail if amount of globs more than max-globs
fail-on-max-globs = false
# Maximum metrics could be returned by glob/wildcard in find request (currently
# works only for trie index)
max-metrics-globbed = 30000
# Maximum metrics could be returned in render request (works both all types of
# indexes)
max-metrics-rendered = 1000
# graphite-web-10-mode
# Use Graphite-web 1.0 native structs for pickle response
# This mode will break compatibility with graphite-web 0.9.x
# If false, carbonserver won't send graphite-web 1.0 specific structs
# That might degrade performance of the cluster
# But will be compatible with both graphite-web 1.0 and 0.9.x
graphite-web-10-strict-mode = true
# Allows to keep track for "last time readed" between restarts, leave empty to disable
internal-stats-dir = ""
# Calculate /render request time percentiles for the bucket, '95' means calculate 95th Percentile. To disable this feature, leave the list blank
stats-percentiles = [99, 98, 95, 75, 50]
[dump]
# Enable dump/restore function on USR2 signal
enabled = true
# Directory for store dump data. Should be writeable for carbon
#path = "/var/lib/graphite/dump/"
path = "/data/graphite/dump/"
# Restore speed. 0 - unlimited
restore-per-second = 0
[pprof]
listen = "localhost:7007"
enabled = false
#[prometheus]
#enabled = true
#[prometheus.labels]
#foo = "test"
#bar = "baz"
# Default logger
[[logging]]
# logger name
# available loggers:
# * "" - default logger for all messages without configured special logger
# @TODO
logger = ""
# Log output: filename, "stderr", "stdout", "none", "" (same as "stderr")
file = "/data/graphite/log/carbon/go-carbon.log"
# Log level: "debug", "info", "warn", "error", "dpanic", "panic", and "fatal"
level = "warn"
# Log format: "json", "console", "mixed"
encoding = "mixed"
# Log time format: "millis", "nanos", "epoch", "iso8601"
encoding-time = "iso8601"
# Log duration format: "seconds", "nanos", "string"
encoding-duration = "seconds"
# You can define multiple loggers:
# Copy errors to stderr for systemd
# [[logging]]
# logger = ""
# file = "stderr"
# level = "error"
# encoding = "mixed"
# encoding-time = "iso8601"
# encoding-duration = "seconds"
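The `[whisper]` section of the configuration above notes that metrics are sharded by "crc32(metricName) % workers". A minimal sketch of that mapping follows, assuming the standard CRC-32 as implemented by `zlib.crc32` (the exact variant go-carbon uses is not stated in this report) and the `workers = 28` value from this configuration; `worker_for` and the sample metric names are hypothetical.

```python
import zlib
from collections import Counter


def worker_for(metric: str, workers: int) -> int:
    """Map a metric name to a persister worker index via CRC-32 modulo."""
    return zlib.crc32(metric.encode("utf-8")) % workers


if __name__ == "__main__":
    workers = 28  # value of `workers` in the config above
    # Synthetic metric names, for illustration only.
    names = [f"host{i}.statsd.gauge-example.value" for i in range(10000)]
    load = Counter(worker_for(name, workers) for name in names)
    print("busiest worker:", max(load.values()), "metrics")
    print("quietest worker:", min(load.values()), "metrics")
```

Because the shard depends only on the name, all updates for a given metric should land on the same worker, which keeps writes to a single whisper file serialized.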
Metric retention and aggregation schemas Please provide content of storage-schemas.conf and storage-aggregation.conf files.
/etc/go-carbon/storage-schemas.conf
[default]
pattern = .*
retentions = 10s:1d,1m:7d,5m:30d,15m:90d,1h:2y
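For a sense of what this retention string means on disk, the sketch below expands it into archives and estimates the size of one whisper file. It assumes the standard uncompressed whisper layout (16 bytes of metadata, 12 bytes per archive header, 12 bytes per data point), full preallocation (`sparse-create = false` in the config above), and a year of 365 days; the function names are hypothetical.

```python
UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400, "w": 604800, "y": 31536000}


def to_seconds(spec: str) -> int:
    """Convert a value such as '10s' or '2y' into seconds."""
    return int(spec[:-1]) * UNIT_SECONDS[spec[-1]]


def parse_retentions(retentions: str):
    """Return a list of (seconds_per_point, points) tuples, one per archive."""
    archives = []
    for part in retentions.split(","):
        precision, duration = part.strip().split(":")
        step = to_seconds(precision)
        archives.append((step, to_seconds(duration) // step))
    return archives


def whisper_file_size(archives) -> int:
    """Estimated file size in bytes under the assumed whisper layout."""
    metadata_size, archive_info_size, point_size = 16, 12, 12
    total_points = sum(points for _, points in archives)
    return metadata_size + archive_info_size * len(archives) + point_size * total_points


if __name__ == "__main__":
    archives = parse_retentions("10s:1d,1m:7d,5m:30d,15m:90d,1h:2y")
    for step, points in archives:
        print(f"{step:>5}s per point x {points} points")
    print("bytes per metric:", whisper_file_size(archives))
```

Under these assumptions each metric holds 53520 points across five archives, or roughly 627 KiB per whisper file.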
/etc/go-carbon/storage-aggregation.conf
[min]
pattern = \.min$
xFilesFactor = 0.1
aggregationMethod = min
[max]
pattern = \.max$
xFilesFactor = 0.1
aggregationMethod = max
[sum]
pattern = \.count$
xFilesFactor = 0
aggregationMethod = sum
[default_average]
pattern = .*
xFilesFactor = 0.5
aggregationMethod = average
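To check which aggregation rule a given metric falls under, the rules can be evaluated in file order. This is a sketch that assumes the first matching pattern wins and that patterns are applied as unanchored regular-expression searches, as described in the Graphite documentation for storage-aggregation.conf; the `match_aggregation` helper is hypothetical.

```python
import re

# (section name, pattern, xFilesFactor, aggregationMethod), in file order
RULES = [
    ("min", r"\.min$", 0.1, "min"),
    ("max", r"\.max$", 0.1, "max"),
    ("sum", r"\.count$", 0.0, "sum"),
    ("default_average", r".*", 0.5, "average"),
]


def match_aggregation(metric: str):
    """Return (section, xFilesFactor, method) for the first rule that matches."""
    for name, pattern, x_files_factor, method in RULES:
        if re.search(pattern, metric):
            return name, x_files_factor, method
    return None


if __name__ == "__main__":
    for metric in (
        "production.statsd.count-pe.a.resolver.b.soft.statistic.count",
        "production.statsd.count-pe.a.resolver.b.soft.statistic.value",
    ):
        print(metric, "->", match_aggregation(metric))
```

The second sample name contains `.count` in the middle but does not end with it, so it falls through to the default average rule.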
Simplified query (if applicable) Not applicable
Additional context Looks like it’s leaving a lot of sockets behind, and these are the ones using up all file-descriptors.
go-carbon 1534 graphite 267u sock 0,8 0t0 288695732 can't identify protocol
go-carbon 1534 graphite 271u sock 0,8 0t0 286852886 can't identify protocol
go-carbon 1534 graphite 274u sock 0,8 0t0 291150365 can't identify protocol
go-carbon 1534 graphite 278u sock 0,8 0t0 291150367 can't identify protocol
root@stats:/data/graphite/log/carbon# lsof -p 1534 | grep "can't identify protocol" |wc
231 2541 22407
root@stats:/data/graphite/log/carbon# lsof -p 1534 | grep -v "can't identify protocol" |wc
56 527 8507
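The same split between sockets and everything else can be tracked over time without lsof by reading `/proc/<pid>/fd` directly. This is a Linux-only sketch; the function names are hypothetical, the PID is the one from the lsof output above, and reading another user's process generally requires root.

```python
import os
from collections import Counter


def classify_target(target: str) -> str:
    """Classify the link target of an entry in /proc/<pid>/fd."""
    if target.startswith("socket:"):
        return "socket"
    if target.startswith("pipe:"):
        return "pipe"
    if target.startswith("anon_inode:"):
        return "anon_inode"
    if target.startswith("/"):
        return "file"
    return "other"


def count_fds(pid: int) -> Counter:
    """Count the open descriptors of a process by type."""
    counts = Counter()
    fd_dir = f"/proc/{pid}/fd"
    for name in os.listdir(fd_dir):
        try:
            target = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            continue  # descriptor was closed between listdir and readlink
        counts[classify_target(target)] += 1
    return counts


if __name__ == "__main__":
    print(dict(count_fds(1534)))  # PID of go-carbon in the lsof output above
```

Sampling `count_fds` every few minutes and graphing the "socket" count should show whether the growth is confined to sockets, as the lsof output suggests.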
About this issue
- State: closed
- Created 4 years ago
- Comments: 20 (12 by maintainers)
Commits related to this issue
- carbonserver: fix two panics related to prometheus #374 It was introduced in commit 52fe9e23. Only triggered if prometheus is not enabled. — committed to go-graphite/go-carbon by bom-d-van 4 years ago
- carbonserver: fix two panics related to prometheus #374 (#376) It was introduced in commit 52fe9e23. Only triggered if prometheus is not enabled. — committed to go-graphite/go-carbon by bom-d-van 4 years ago
- carbonserver: fix two panics related to prometheus #374 (#376) It was introduced in commit 52fe9e23. Only triggered if prometheus is not enabled. — committed to bom-d-van/go-carbon by bom-d-van 4 years ago
Cool, thanks for fixing that, @bom-d-van! I merged #376 to master. I will be glad to release 0.15.1 if it helps.