apm-agent-dotnet: Elasticsearch operations produce span retrieve error in distributed transactions
Describe the bug Hi,
I didn’t notice this with previous versions of APM, but right know if I create a manual transaction with an existing distributed tracing id with inner spans, on each Elasticsearch (Nest lib) I get the following errors from APM like
Failed to find current span in ConcurrentDictionary
To Reproduce
Steps to reproduce the behavior:
1.My use case is the following : App1 is publishing a message to Kafka, in the message I set a distributed correlationId
using the following :
public static string GetCorrelationLogsId() =>
(Agent.Tracer.CurrentSpan?.OutgoingDistributedTracingData
?? Agent.Tracer.CurrentTransaction?.OutgoingDistributedTracingData)?.SerializeToString();
Let’s say the string returned is 00-67fe36868e7ce8498bc0195b3a0acb4f-f867f0376eec7a49-01, it’s then set into the message before sending it to Kafka.
- Then
App2is consuming the message, and use the correlationId to open a new manual transaction
await Agent.Tracer
.CaptureTransaction("myTransaction", ApiConstants.SubtypeElasticsearch, async (t) =>
{
... some code
await t.CaptureSpan("PersistEntity", ApiConstants.SubtypeElasticsearch, async span =>
{
_logger.LogDebug("Persist entity in Elk with Id {id}", mymessage.Id);
await _elasticsearchRepository.Add(mymessage); // produces errors !
});
... some code
}, DistributedTracingData.TryDeserializeFromString(mymessage.DistributedTracingData)); // 00-67fe36868e7ce8498bc0195b3a0acb4f-f867f0376eec7a49-01
- The call of Elastic repository method produces the following logs
[07:51:17 INF] {RequestPipelineDiagnosticsListener} Received an CallElasticsearch.Start event from elasticsearch
[07:51:17 INF] {SerializerDiagnosticsListener} Received an Serialize.Start event from elasticsearch
[07:51:17 ERR] {SerializerDiagnosticsListener} Failed to find current span in ConcurrentDictionary 00-4b8a560b65e8ff4cb4f6e376f526c3ac-5fcbad25ebedd341-01
[07:51:17 INF] {HttpConnectionDiagnosticsListener} Received an SendAndReceiveHeaders.Start event from elasticsearch
[07:51:17 ERR] {HttpConnectionDiagnosticsListener} Failed to find current span in ConcurrentDictionary 00-4b8a560b65e8ff4cb4f6e376f526c3ac-5fcbad25ebedd341-01
[07:51:17 INF] {HttpConnectionDiagnosticsListener} Received an ReceiveBody.Start event from elasticsearch
[07:51:17 INF] {SerializerDiagnosticsListener} Received an Deserialize.Start event from elasticsearch
[07:51:17 ERR] {SerializerDiagnosticsListener} Failed to find current span in ConcurrentDictionary 00-4b8a560b65e8ff4cb4f6e376f526c3ac-dc31ed64b33f0145-01
[07:51:17 ERR] {HttpConnectionDiagnosticsListener} Failed to find current span in ConcurrentDictionary 00-4b8a560b65e8ff4cb4f6e376f526c3ac-5fcbad25ebedd341-01
[07:51:17 ERR] {RequestPipelineDiagnosticsListener} Failed to find current span in ConcurrentDictionary 00-4b8a560b65e8ff4cb4f6e376f526c3ac-302847dec737e94c-01
=> the correlation id reported in logs are not the same as the one used to initiate the transaction (ie : transported in message payload)
- The correlation Id seems to be correctly used thanks to the APM trace sample since both applications appear in the same transaction

Versions and config Here are the current versions of used libs
<TargetFramework>net5.0</TargetFramework>
<PackageReference Include="Elastic.Apm.NetCoreAll" Version="1.7.1" />
<PackageReference Include="Elastic.Apm.SerilogEnricher" Version="1.5.1" />
<PackageReference Include="Elastic.CommonSchema.Serilog" Version="1.5.1" />
<PackageReference Include="NEST" Version="7.10.1" />
And my appsettings.json
"ElasticApm": {
"ServerUrls": "http://apm:8200",
"TransactionMaxSpans": 5000,
"CaptureBody": "all",
"CloudProvider": "none"
}
Expected behavior I expect no error from APM or to understand what I did wrong
About this issue
- Original URL
- State: closed
- Created 4 years ago
- Reactions: 1
- Comments: 21 (8 by maintainers)
Commits related to this issue
- Fix Client Diagnostics following upgrade to DiagnosticSource 5.0.0 Relates: elastic/apm-agent-dotnet#1094 This commit fixes a rather pernicious bug with Diagnostics from the clients, following the u... — committed to elastic/elasticsearch-net by russcam 3 years ago
- Fix Client Diagnostics following upgrade to DiagnosticSource 5.0.0 (#5326) Relates: elastic/apm-agent-dotnet#1094 This commit fixes a rather pernicious bug with Diagnostics from the clients, foll... — committed to elastic/elasticsearch-net by russcam 3 years ago
- Fix Client Diagnostics following upgrade to DiagnosticSource 5.0.0 (#5326) Relates: elastic/apm-agent-dotnet#1094 This commit fixes a rather pernicious bug with Diagnostics from the clients, foll... — committed to elastic/elasticsearch-net by russcam 3 years ago
- Fix Client Diagnostics following upgrade to DiagnosticSource 5.0.0 (#5326) (#5327) Relates: elastic/apm-agent-dotnet#1094 This commit fixes a rather pernicious bug with Diagnostics from the client... — committed to elastic/elasticsearch-net by github-actions[bot] 3 years ago
@NicolasREY69330 - We’ve just pushed packages for 7.11.1 which include a fix for this bug.
I’ve opened https://github.com/elastic/elasticsearch-net/pull/5326 to address.
I’ve outlined the problem in the PR; it’s a rather pernicious and subtle bug that was introduced when the clients were updated to System.Diagnostics.DiagnosticSource, and stems from
Activitynow implementingDispose()and settingActivity.Currentto the parentActivityonDispose(), which ends up happening when the derivedDiagnostictypes in the client are disposed, and before it notifies any Diagnostic Listeners of the stopping of the activity that happens in the overriddenDispose(bool).Thank you for the repro @NicolasReyDotNet, I can reproduce the issue. Looking into
Thank you for your help
@russcam Could you reproduce the problem with my repo ?