node-newrelic: Undici instrumentation breaks fastify throughput tracking
Description
We enabled undici instrumentation in production and saw a huge decrease in our service's reported throughput, while all other KPIs stayed stable (which means the issue is with the metric itself; the actual throughput did not drop).
We are using Fastify with Node 16. Our base New Relic configuration is:
NEW_RELIC_FEATURE_FLAG_NEW_PROMISE_TRACKING=true
NEW_RELIC_FEATURE_FLAG_UNRESOLVED_PROMISE_CLEANUP=false
We enabled undici instrumentation (newrelic agent v8.5.1) using these flags:
NEW_RELIC_FEATURE_FLAG_UNDICI_INSTRUMENTATION=true
NEW_RELIC_FEATURE_FLAG_UNDICI_ASYNC_TRACKING=false
This caused the first drop you can see in the screenshot.
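For context, these environment variables should correspond to the agent's feature_flag configuration section. A minimal newrelic.js sketch of the same settings (the env-var-to-key mapping is our assumption, and the app name is a placeholder):

```js
// newrelic.js -- minimal sketch; assumes the NEW_RELIC_FEATURE_FLAG_* env vars
// above map to these feature_flag keys, and uses a placeholder app name.
'use strict'

exports.config = {
  app_name: ['my-fastify-service'],
  feature_flag: {
    new_promise_tracking: true,
    unresolved_promise_cleanup: false,
    undici_instrumentation: true,
    undici_async_tracking: false
  }
}
```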

Our next try was to set undici_async_tracking to true:
NEW_RELIC_FEATURE_FLAG_UNDICI_INSTRUMENTATION=true
NEW_RELIC_FEATURE_FLAG_UNDICI_ASYNC_TRACKING=true
This change increased the reported throughput for a while after the app restarted, but then the throughput dropped again (this is the second drop in the screenshot).

After disabling undici instrumentation, the reported throughput returned to normal.
Expected Behavior
I don't see how instrumenting an HTTP client (undici) in my service (Fastify) can break the throughput metric. As mentioned above, the real throughput stayed the same; it is only the reported metric that broke.
Your Environment
- Fastify v3.22.1
- Node 16.11
- undici 4.7.1
About this issue
- Original URL
- State: closed
- Created 3 years ago
- Comments: 17 (9 by maintainers)
@artur-ma Just wanted to give you a heads up that we’re considering work related to this in the upcoming roadmap. We don’t have a definitive timeline yet. I’ll update when we do. Thanks!
@artur-ma I tried saving references to the channels and ran our test, and it is still an issue. However, that's not to say your issue has not been resolved; the repro case we created is very brute force. So I'm going to make this suggested change and perhaps you could retest. Or could you apply the patch below and test it yourself, without us releasing a new version of the agent?
Commit Sha
I've been able to reproduce this a couple of ways, but only one way consistently: by modifying an endpoint to hang for a period of time and then allowing traffic to resume (in this case by exiting the hang, though there may be other ways), while other endpoints are being hit. It is quite contrived, but at least it demonstrates a problem. I have not toggled all the potential variables to confirm the exact contributing factors.
A standalone reproduction (of at least a case similar to what is documented here) can be found here: https://github.com/michaelgoin/undici-fastify-repro. The default configuration, which doesn't lump everything under a single transaction, may not show a drop in traffic in the UI (in fact it may increase due to quick failures of whatever doesn't get lumped together), but the logs will show the state conflation.
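To give a sense of the shape of that scenario (a hypothetical sketch, not the code from the linked repro repository): one route hangs on a promise while another route keeps serving traffic and making outbound undici calls, and a third route releases the hang so traffic on the stalled route resumes.

```js
// Hypothetical repro shape: one route stalls while another keeps making
// outbound undici requests, then the stall is released. Route names and the
// downstream URL are placeholders.
'use strict'

require('newrelic') // the agent must be loaded before anything else
const fastify = require('fastify')()
const { request } = require('undici')

let releaseHang
const hang = new Promise((resolve) => { releaseHang = resolve })

// Simulates the stalled endpoint: hangs until /release is hit.
fastify.get('/stuck', async () => {
  await hang
  return { ok: true }
})

// Keeps being hit and makes an outbound undici call per request.
fastify.get('/healthy', async () => {
  const res = await request('http://example.com') // placeholder downstream
  res.body.resume() // drain the response body
  return { ok: true }
})

// Releases the hang so traffic on /stuck resumes.
fastify.get('/release', async () => {
  releaseHang()
  return { released: true }
})

fastify.listen(3000, (err) => {
  if (err) throw err
})
```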
Nice find @artur-ma. I’m going to see if it addresses the issue you raised here 🤞🏻
Maybe this is somehow related: https://github.com/nodejs/node/issues/42170. Even if it isn't, the code should be changed for now (until this is fixed in Node): to avoid garbage collection of subscribers, you have to assign the channel to a variable.
Here, for example, https://github.com/newrelic/node-newrelic/blob/main/lib/instrumentation/undici.js#L72 should be changed so the channel is assigned to a variable that stays referenced.
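A minimal sketch of what that change could look like, assuming the goal is only to hold module-level references to the diagnostics_channel channels so their subscribers cannot be garbage collected (the handler names are illustrative, not the agent's actual internals):

```js
'use strict'

const diagnosticsChannel = require('diagnostics_channel')

// Illustrative handlers standing in for the agent's real ones.
function onRequestCreate(message) { /* start a segment, etc. */ }
function onRequestHeaders(message) { /* record the response status, etc. */ }

// Instead of subscribing on an unreferenced channel, e.g.
//   diagnosticsChannel.channel('undici:request:create').subscribe(onRequestCreate)
// keep the channel objects in variables so the channels (and their
// subscribers) cannot be garbage collected, per nodejs/node#42170.
const requestCreateChannel = diagnosticsChannel.channel('undici:request:create')
const requestHeadersChannel = diagnosticsChannel.channel('undici:request:headers')

requestCreateChannel.subscribe(onRequestCreate)
requestHeadersChannel.subscribe(onRequestHeaders)
```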
Ok @artur-ma. I’ll try to reproduce. If I can’t, I’ll probably have you go through our technical support to get the right info.