prom-client: Memory leak
Hi!
I currently have an implementation of prom-client running that seems to grow the Node heap continuously, until an eventual crash/service restart, when collecting API metrics. I am collecting the default metrics and a couple of custom ones that just count WebSocket connections; these all work fine. The problems arise from the API metrics I collect using a middleware that sits on our Hapi routes. This middleware listens for `onRequest` and then `response` to collect a response time in milliseconds, adding labels for method, path, status code, and username.
Here is what the middleware looks like (I do not believe the problem is with this):
```js
/**
 * This is the middleware that looks at each request as it comes through the server.
 * @param {Object} server This is the instantiated Hapi server that we wish to listen on.
 * @param {Object} options This is an object that can contain additional data.
 * @param {Function} next This is the callback function that allows additional requests to continue.
 * @return {Function} Callback function.
 */
const middlewareWrapper = (server, options, next) => {
  server.ext('onRequest', (request, reply) => {
    request.startTime = process.hrtime()
    return reply.continue()
  })
  server.on('response', (request) => {
    logic.apiPerformance(request.response.statusCode, request.startTime, sanitizeRequestData(request))
  })
  return next()
}
```
The metrics themselves are used inside that `apiPerformance` function, which looks like this:
```js
/**
 * This function takes various information about a request, performs some basic
 * calculations, and then sends that off to Loggly.
 * @param {Integer} statusCode This is the HTTP status code that will be returned to the user with the response.
 * @param {Array} startTime The process.hrtime() tuple captured when the request was received. https://nodejs.org/api/process.html#process_process_hrtime_time
 * @param {Object} responseData An object with sanitized response data.
 */
apiPerformance: (statusCode, startTime, responseData) => {
  const diff = process.hrtime(startTime)
  const responseTime = (diff[0] * 1e3) + (diff[1] * 1e-6)
  const time = new Date()
  logger.transports.loggly.log('info', 'Route Log', {
    responseCode: statusCode,
    route: responseData.route,
    host: responseData.host,
    port: responseData.port,
    method: responseData.method,
    userData: responseData.userData,
    responseTime: `${responseTime}ms`,
    params: responseData.params,
    timestamp: time.toISOString(),
    tags: ['events', 'api', 'performance']
  }, (error, response) => {
    if (error) console.log(error)
  })
  if (config.metrics.prometheus.enabled) {
    requestDurationSummary.labels(responseData.method, responseData.route, statusCode, responseData.userData.displayName).observe(responseTime)
    requestBucketsHistogram.labels(responseData.method, responseData.route, statusCode, responseData.userData.displayName).observe(responseTime)
  }
}
```
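For reference, the hrtime-to-milliseconds conversion used above can be checked in isolation. This is a small sketch (plain Node, no prom-client involved) of the same arithmetic:

```javascript
// Convert a process.hrtime() diff tuple [seconds, nanoseconds]
// into milliseconds, the same way apiPerformance does.
const hrtimeToMs = (diff) => (diff[0] * 1e3) + (diff[1] * 1e-6)

// 1 second + 500,000 ns = 1000.5 ms
console.log(hrtimeToMs([1, 500000])) // → 1000.5

// Measuring a real interval:
const start = process.hrtime()
setTimeout(() => {
  const elapsed = hrtimeToMs(process.hrtime(start))
  console.log(`elapsed: ${elapsed.toFixed(1)}ms`)
}, 10)
```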
The metrics are registered at the top of that file like so:
```js
const requestDurationSummary = prometheus.createSummary('http_request_duration_milliseconds', 'request duration in milliseconds', ['method', 'path', 'status', 'user'])
const requestBucketsHistogram = prometheus.createHistogram('http_request_buckets_milliseconds', 'request duration buckets in milliseconds.', ['method', 'path', 'status', 'user'], [500, 2000])
```
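One observation worth noting (not a confirmed diagnosis of this leak): prom-client keeps a separate child time series per unique label combination, so labelling by user makes the number of tracked series unbounded. A rough sketch of how the series count grows, using a plain Map in place of the real registry:

```javascript
// Hypothetical model of per-label-combination children: each distinct
// (method, path, status, user) tuple gets its own entry, mirroring how
// a labelled metric keeps one child series per unique label set.
const series = new Map()

const observe = (method, path, status, user, value) => {
  const key = `${method}|${path}|${status}|${user}`
  if (!series.has(key)) series.set(key, { count: 0, sum: 0 })
  const s = series.get(key)
  s.count += 1
  s.sum += value
}

// 3 routes x 1000 distinct users -> 3000 series, and the map keeps
// growing for as long as new usernames keep arriving.
for (let u = 0; u < 1000; u++) {
  for (const path of ['/a', '/b', '/c']) {
    observe('GET', path, 200, `user${u}`, 12.3)
  }
}
console.log(series.size) // → 3000
```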
Node: v6.1.0, Hapi: v16.4.3, prom-client: v9.1.1
We include prom-client as part of a custom package (it's basically a common-libs package) that we ship with our projects. Inside this package we include prom-client and expose only the methods we use (collectDefaultMetrics, registerMetrics, createGauge, createSummary, createHistogram). Another thing to note: all metrics except http_request_duration_milliseconds and http_request_buckets_milliseconds (the two posted above) are registered inside the application we are tracking metrics for. Those two are registered in the code posted above, which lives in the custom package itself.
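For context, the shape of such a wrapper is roughly a module that owns the registration and re-exports only a few factory methods. This is a hypothetical sketch (not our actual package), with a plain Map standing in for prom-client's real register so the sketch runs without the library installed:

```javascript
// Sketch of the "common libs" wrapper pattern: one module owns the
// metric registry; consumers only see the factory functions.
// (A Map stands in for prom-client's register here.)
const register = new Map()

const createSummary = (name, help, labelNames) => {
  const metric = { name, help, labelNames, type: 'summary' }
  register.set(name, metric) // registration happens inside the package
  return metric
}

const createHistogram = (name, help, labelNames, buckets) => {
  const metric = { name, help, labelNames, buckets, type: 'histogram' }
  register.set(name, metric)
  return metric
}

// In the real package, only createSummary/createHistogram/etc. would be
// exported; prom-client itself stays an internal dependency.
```

The relevant consequence for debugging: metrics created this way live in the package's registry regardless of which application file calls the factory.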
Any other information I can provide please let me know. Any help or suggestions are greatly appreciated. Thanks!
About this issue
- Original URL
- State: open
- Created 7 years ago
- Comments: 29 (7 by maintainers)
Commits related to this issue
- #142: Fix memory leakage and add compressCount option — committed to atd-schubert/prom-client by atd-schubert 6 years ago
- #142: Add compression of tdigest to CHANGELOG — committed to atd-schubert/prom-client by atd-schubert 5 years ago
- Compress t-digest to prevent memory leakage (#234) * 142: Fix memory leakage and add compressCount option * #142: Add compression of tdigest to CHANGELOG — committed to siimon/prom-client by atd-schubert 5 years ago
For what it’s worth, I can confirm that 12.0.0 works infinitely better than 11.5.3, which either slowly grew in memory over days, or in some cases OOM’d a minute after start. Thanks for fixing this! 🙇‍♀️
There is a PR open that might fix it, but I’m waiting for clarifications in it. If you are in an experimental mood you could try that branch out and see if it solves your problem.
I think the easiest way to show the leaking behavior is this leakage-test I wrote:
Note: it’s written with mocha, not jest…
You will see that Summary is leaking, but Histogram does not.
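Mechanically, the difference described here (and in the t-digest compression commits above) can be illustrated without prom-client: a naive summary has to retain observations to compute quantiles, while a histogram only increments a fixed set of bucket counters. A sketch of that contrast (not the actual leakage test linked above):

```javascript
// Naive summary: keeps every observation so quantiles can be computed.
// Without periodic compression (what the t-digest fix adds), this state
// grows with every request.
const naiveSummary = { samples: [] }
const observeSummary = (v) => naiveSummary.samples.push(v)

// Histogram: state is a fixed set of cumulative bucket counters plus
// sum/count, independent of how many values are observed.
const histogram = { buckets: { 500: 0, 2000: 0, '+Inf': 0 }, sum: 0, count: 0 }
const observeHistogram = (v) => {
  for (const le of [500, 2000]) if (v <= le) histogram.buckets[le] += 1
  histogram.buckets['+Inf'] += 1
  histogram.sum += v
  histogram.count += 1
}

for (let i = 0; i < 100000; i++) {
  observeSummary(Math.random() * 3000)
  observeHistogram(Math.random() * 3000)
}
console.log(naiveSummary.samples.length)           // → 100000 (grows without bound)
console.log(Object.keys(histogram.buckets).length) // → 3 (fixed)
```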
Should you be using the `plugins` part of the request? https://hapijs.com/api#request-properties Or maybe `app` (if `app` is per request)? I’ve never used hapi, just trying to be a good rubber duck 🙂