istio: Tracing regression in 1.5.x - nested traces are intermittently incomplete

Bug description

We are having a problem where traces aren’t coming through to Jaeger correctly: Jaeger reports that the parentSpanId of some spans is missing.

This is only an issue with the 1.5.1 sidecar. Using a 1.4.6 sidecar (with a 1.5.1 control plane), the traces are always complete. Therefore I believe this to be a 1.5 regression.

It’d be great if someone who understands istio/envoy/zipkin and the istio debug logs more deeply could help me wade through the information here to see what is going on. If you’re struggling to reproduce, I’m more than happy to share my screen remotely to debug this.

Example failure

As an example, we have a simple app setup of:

nginx -> istio-test-app-1 -> istio-test-app-2 -> istio-test-app-3

This setup has been around since the istio 1.0 days to help us test these sorts of things, and nothing has changed in the apps with regard to header forwarding logic.
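For readers unfamiliar with what "header forwarding logic" means here, this is a minimal sketch of the idea, not the actual code of the test apps. The header names are the ones Istio's documentation asks applications to propagate; the function name `forward_trace_headers` is hypothetical.

```python
# Headers Istio's documentation asks applications to copy from the
# incoming request onto every outgoing request.
TRACE_HEADERS = (
    "x-request-id",
    "x-b3-traceid",
    "x-b3-spanid",
    "x-b3-parentspanid",
    "x-b3-sampled",
    "x-b3-flags",
    "x-ot-span-context",
)


def forward_trace_headers(incoming):
    """Return the subset of incoming headers to attach to outgoing calls.

    `incoming` is a plain dict of header name -> value. Header names are
    matched case-insensitively and returned in lowercase.
    """
    lowered = {name.lower(): value for name, value in incoming.items()}
    return {name: lowered[name] for name in TRACE_HEADERS if name in lowered}
```

If an app in the chain drops these headers, the downstream sidecar starts a new trace, which is why it matters that this logic has not changed between the working and failing setups.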

As you can see in Jaeger the trace for 38e310d5619723f6 is messed up:

[Screenshot 2020-04-15 at 19 59 01]

The istio-test-app-2 span is missing.

Another example here on trace 64f13d7f39b4a5f2:

[Screenshot 2020-04-15 at 20 01 26]

In this case, the span for istio-test-app-1 is missing.
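Both failures above have the same shape: a span refers to a parent that never reached the tracing backend. As a rough sketch of how one might check an exported trace for this, assuming spans are dicts with `id` and `parentId` fields (modeled on the Zipkin v2 JSON span format; the span IDs below are made up and `find_orphan_spans` is a hypothetical helper):

```python
def find_orphan_spans(spans):
    """Return spans whose parentId does not match any span in the trace."""
    known_ids = {span["id"] for span in spans}
    return [
        span
        for span in spans
        if span.get("parentId") and span["parentId"] not in known_ids
    ]


# Illustrative trace in which the istio-test-app-2 span (id "c3") was lost.
trace = [
    {"id": "a1", "name": "nginx"},
    {"id": "b2", "parentId": "a1", "name": "istio-test-app-1"},
    {"id": "d4", "parentId": "c3", "name": "istio-test-app-3"},
]

orphans = find_orphan_spans(trace)
print([span["name"] for span in orphans])
```

Root spans have no `parentId` and are therefore not reported; only spans pointing at a parent that is absent from the trace are.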

I have captured istio-proxy-debug logs for each of the test apps, which contain the requests mentioned above (as well as some other noise, sorry).

istio-test-app-1.log istio-test-app-2.log istio-test-app-3.log

Expected behavior

Traces should be complete.

Steps to reproduce the bug

For us, using a 1.5.1 sidecar is enough to reproduce it.

Version (include the output of istioctl version --remote and kubectl version and helm version if you used Helm)

1.5.1 microservice model (not istiod)

How was Istio installed?

Helm

Environment where bug was observed (cloud vendor, OS, etc)

About this issue

  • State: closed
  • Created 4 years ago
  • Reactions: 1
  • Comments: 42 (39 by maintainers)

Most upvoted comments

@douglas-reid if this is considered a regression, I think it would be better to fix it in 1.5.x.