graphite-web: Recent metric data not displayed until flushed to disk
I wasn’t sure if I should file this against graphite-web or carbon, so apologies if I’ve chosen poorly.
We have a carbon-relay instance fronting 4 x carbon-cache instances, with no aggregation in the stack. When building graphs or fetching data via the API, the most recent metric values are not immediately available. If we wait long enough they do appear (typically 5-30 minutes; we’re always starved for IOPS), so I thought the query cache lookups from graphite-web were failing, but I see successful cache lookups logged in the cache instances:
29/01/2016 11:28:48 :: [query] [127.0.0.1:37758] cache query bulk for "32" metrics returned 2025 values
A bit of debug logging and tcpdump’ing shows the metric updates coming into the relay and passing straight through to the cache instances, so there’s no delay getting the data into the cache processes. We just can’t seem to get the data out until (I assume) it is flushed to disk. These are not new metrics; the whisper files were created long ago.
You can find config files and log snippets in this gist. If anyone can see what we’re missing, that would be very much appreciated. Or if I’m totally wrong about how this is supposed to work and we just need more IOPS, you can tell me that too 😃
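To put a number on the delay described above, a small script can ask the graphite-web render API for a series as JSON and report how old the newest non-null datapoint is. This is only a rough sketch: the helper names `newest_datapoint_age` and `fetch_series`, the base URL, and the target in the usage comment are placeholders chosen for illustration, not anything taken from our setup.

```python
import json
import time
from urllib.parse import urlencode
from urllib.request import urlopen


def newest_datapoint_age(series, now=None):
    """Seconds between `now` and the newest non-null datapoint, or None.

    `series` is one element of the render API's JSON output, i.e. a dict
    with a "datapoints" list of [value, timestamp] pairs.
    """
    now = time.time() if now is None else now
    timestamps = [ts for value, ts in series["datapoints"] if value is not None]
    if not timestamps:
        return None  # every point in the window is still null
    return now - max(timestamps)


def fetch_series(base_url, target, window="-1h"):
    """Fetch `target` from graphite-web's render API as parsed JSON."""
    query = urlencode({"target": target, "from": window, "format": "json"})
    with urlopen(f"{base_url}/render?{query}") as response:
        return json.load(response)


# Example usage (hypothetical host and target, adjust to your install):
#   for series in fetch_series("http://localhost:8080", "some.metric.path"):
#       print(series["target"], newest_datapoint_age(series))
```

If the cache lookups were being merged into the response, the reported age should stay close to the metric's reporting interval; an age of several minutes would match what we are seeing in the graphs.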
About this issue
- State: closed
- Created 8 years ago
- Reactions: 1
- Comments: 16 (12 by maintainers)
We experience the same issue with our setup with new metrics.
If we send a metric which already had some value in the past, it appears almost instantly. But if we send a new metric, it does not appear for about 20-30 min.
Hi, I’m experiencing the same behavior: while the carbon-cache instances are queuing data, Graphite doesn’t return recent metrics. We have 6 x carbon-cache instances (approx. 200k metrics/minute).
As you can see in this queue graph, we have this error frequently.
The problem persists until the data queues disappear, usually 10-15 minutes (sometimes 20-30 minutes).