opentelemetry-operator: Auto instrumentation python service - trace data doesn't arrive in Endpoint

I am trying out the auto-instrumentation feature of OpenTelemetry with an example demo application. I applied auto-instrumentation to two Python services (recommendation and email) and two Node.js services (payment and currency). However, I can only find trace data from the Node.js services in my endpoint (OpenSearch), not from the Python services.

My environment:

  • k3s-1.23
  • Example demo application: https://github.com/GoogleCloudPlatform/microservices-demo
  • Python 3.7 for the Python services

Manifest of the Python service:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: recommendationservice
spec:
  selector:
    matchLabels:
      app: recommendationservice
  template:
    metadata:
      labels:
        app: recommendationservice
      annotations:
        sidecar.opentelemetry.io/inject: "true"
        instrumentation.opentelemetry.io/inject-python: "true"
    spec:
      serviceAccountName: default
      terminationGracePeriodSeconds: 5
      containers:
      - name: server
        image: tybalex/recommendationservice:dev
        ports:
        - containerPort: 8080
        readinessProbe:
          periodSeconds: 5
          exec:
            command: ["/bin/grpc_health_probe", "-addr=:8080"]
        livenessProbe:
          periodSeconds: 5
          exec:
            command: ["/bin/grpc_health_probe", "-addr=:8080"]
        env:
        - name: PORT
          value: "8080"
        - name: PRODUCT_CATALOG_SERVICE_ADDR
          value: "productcatalogservice:3550"
        resources:
          requests:
            cpu: 100m
            memory: 220Mi
          limits:
            cpu: 200m
            memory: 450Mi
---
apiVersion: v1
kind: Service
metadata:
  name: recommendationservice
spec:
  type: ClusterIP
  selector:
    app: recommendationservice
  ports:
  - name: grpc
    port: 8080
    targetPort: 8080

Example logs in the recommendation service pod:

{"timestamp": 1654892491.8406446, "severity": "INFO", "name": "recommendationservice-server", "message": "[Recv ListRecommendations] product_ids=['0PUK6V6EV0', '2ZYFJ3GM2N', '6E92ZMYYFZ', 'L9ECAV7KIM', '66VCHSJNUP']", "otelSpanID": "7fd7aae81b7615e0", "otelTraceID": "8dc9a11590672e91fda208974c4bc6e7", "otelServiceName": "recommendationservice"}
{"timestamp": 1654892491.8581243, "severity": "INFO", "name": "recommendationservice-server", "message": "[Recv ListRecommendations] product_ids=['6E92ZMYYFZ', '9SIQT8TOJO', '0PUK6V6EV0', 'L9ECAV7KIM', 'OLJCESPC7Z']", "otelSpanID": "87451d7577e6cfee", "otelTraceID": "33610c8624ecaab6f54b72d027f0013b", "otelServiceName": "recommendationservice"}
{"timestamp": 1654892492.5934675, "severity": "INFO", "name": "recommendationservice-server", "message": "[Recv ListRecommendations] product_ids=['L9ECAV7KIM', '1YMWWN1N4O', '2ZYFJ3GM2N', '6E92ZMYYFZ', '0PUK6V6EV0']", "otelSpanID": "e986f7291edb5335", "otelTraceID": "6ae0a6f318cb5e186d0de7984fca5c19", "otelServiceName": "recommendationservice"}
{"timestamp": 1654892492.6086466, "severity": "INFO", "name": "recommendationservice-server", "message": "[Recv ListRecommendations] product_ids=['1YMWWN1N4O', '0PUK6V6EV0', '2ZYFJ3GM2N', 'OLJCESPC7Z', '66VCHSJNUP']", "otelSpanID": "875c0e8c9d238aa4", "otelTraceID": "165553d3bbcfd3017777d92ced997ad3", "otelServiceName": "recommendationservice"}

The log records do have an OTel trace ID and service name assigned.
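
Trace and span IDs in the log records suggest that the instrumentation is loaded and spans are being created, so the export path is the next thing to check. Below is a minimal sketch of such a check, using only the Python standard library so that it can run inside the Python 3.7 pod. It POSTs an empty OTLP/HTTP export request and reports what comes back. The function name probe_otlp_http is hypothetical, and the assumption that a collector accepts an empty body as a valid (empty) export request should be verified against your collector version.

```python
import urllib.error
import urllib.request


def probe_otlp_http(endpoint, timeout=3.0):
    """POST an empty OTLP/HTTP export request and describe the outcome.

    `endpoint` is the full URL, including the /v1/traces path.
    Returns "HTTP <status>" if an HTTP response came back, otherwise
    "failed: <exception type>".
    """
    request = urllib.request.Request(
        endpoint,
        data=b"",  # assumption: an empty protobuf message is an empty export request
        headers={"Content-Type": "application/x-protobuf"},
        method="POST",
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return "HTTP {}".format(response.status)
    except urllib.error.HTTPError as exc:
        # The server answered over HTTP, but with an error status.
        return "HTTP {}".format(exc.code)
    except Exception as exc:
        # Connection refused, timeout, or a non-HTTP/1.1 reply (for example
        # when the port only speaks gRPC).
        return "failed: {}".format(type(exc).__name__)


# Example call from inside the pod (hostname taken from the Instrumentation resource):
# print(probe_otlp_http("http://opentelemetry-collector:4317/v1/traces"))
```

A "failed: ..." result or an error status on the configured port, while another collector port answers normally, would point to a protocol or port mismatch rather than to a problem with the instrumentation itself.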

What is the expected behavior? Trace data is available in the OpenSearch index otel-v1-apm-span*, as it is for the Node.js services.

What is the actual behavior? Trace data from the Python services is missing.

Additional information: here is the Instrumentation resource:

apiVersion: opentelemetry.io/v1alpha1
kind: Instrumentation
metadata:
  name: my-instrumentation
spec:
  exporter:
    endpoint: http://opentelemetry-collector:4317
  propagators:
    - tracecontext
    - baggage
    - b3
  sampler:
    type: parentbased_traceidratio
    argument: "0.25"
  java:
    image: ghcr.io/open-telemetry/opentelemetry-operator/autoinstrumentation-java:latest
  nodejs:
    image: ghcr.io/open-telemetry/opentelemetry-operator/autoinstrumentation-nodejs:latest
  python:
    image: ghcr.io/open-telemetry/opentelemetry-operator/autoinstrumentation-python:latest
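
A side note on the sampler settings above: with parentbased_traceidratio and an argument of "0.25", only about a quarter of root traces are expected to be exported, and child spans follow the decision of their parent. The sketch below shows how a trace-ID-ratio decision can be computed from the lower 64 bits of the trace ID. It is an approximation of what the Python SDK does as far as I am aware, not a copy of it, and the name ratio_sampled is hypothetical.

```python
TRACE_ID_LIMIT = (1 << 64) - 1  # mask for the lower 64 bits of a trace ID


def ratio_sampled(trace_id_hex, ratio):
    """Return True if a root span with this trace ID would be sampled.

    Assumption: the decision compares the lower 64 bits of the trace ID
    against ratio * 2**64, as a trace-ID-ratio sampler typically does.
    """
    bound = round(ratio * (TRACE_ID_LIMIT + 1))
    return (int(trace_id_hex, 16) & TRACE_ID_LIMIT) < bound


# Trace IDs copied from the log records in this issue.
trace_ids = [
    "8dc9a11590672e91fda208974c4bc6e7",
    "33610c8624ecaab6f54b72d027f0013b",
    "6ae0a6f318cb5e186d0de7984fca5c19",
    "165553d3bbcfd3017777d92ced997ad3",
]

for trace_id in trace_ids:
    print(trace_id, ratio_sampled(trace_id, 0.25), ratio_sampled(trace_id, 1.0))
```

A trace ID in a log record therefore does not by itself mean the span was sampled and exported. While debugging missing traces, setting the argument to "1" removes sampling as a possible cause.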

About this issue

  • State: open
  • Created 2 years ago
  • Comments: 15 (12 by maintainers)

Most upvoted comments

@tybalex please set a different exporter endpoint for the Python instrumentation, because it uses the OTLP/HTTP exporter by default. Use e.g. http://collector-hostname:4318. You can set it for Python only by modifying your Instrumentation CR, e.g.:

apiVersion: opentelemetry.io/v1alpha1
kind: Instrumentation
metadata:
  name: my-instrumentation
spec:
  exporter:
    endpoint: http://opentelemetry-collector:4317
  propagators:
    - tracecontext
    - baggage
    - b3
  sampler:
    type: parentbased_traceidratio
    argument: "1"
  java:
    image: ghcr.io/open-telemetry/opentelemetry-operator/autoinstrumentation-java:latest
  nodejs:
    image: ghcr.io/open-telemetry/opentelemetry-operator/autoinstrumentation-nodejs:latest
  python:
    env:
      - name: OTEL_EXPORTER_OTLP_TRACES_ENDPOINT
        value: http://opentelemetry-collector:4318/v1/traces
    image: ghcr.io/open-telemetry/opentelemetry-operator/autoinstrumentation-python:latest
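
To illustrate why the value above includes the /v1/traces path: in the OTLP exporter specification, an HTTP exporter appends v1/traces to the generic OTEL_EXPORTER_OTLP_ENDPOINT, but uses the signal-specific OTEL_EXPORTER_OTLP_TRACES_ENDPOINT as-is. The sketch below models that rule. The function name effective_traces_endpoint is hypothetical, and it assumes the operator passes spec.exporter.endpoint to the pod as OTEL_EXPORTER_OTLP_ENDPOINT.

```python
def effective_traces_endpoint(env):
    """Return the URL an OTLP/HTTP trace exporter would post to.

    `env` is a mapping of environment variables, e.g. os.environ.
    """
    specific = env.get("OTEL_EXPORTER_OTLP_TRACES_ENDPOINT")
    if specific:
        # The signal-specific endpoint is used verbatim, so it must
        # already contain the /v1/traces path.
        return specific
    # The generic endpoint is a base URL; the traces path is appended.
    base = env.get("OTEL_EXPORTER_OTLP_ENDPOINT", "http://localhost:4318")
    return base.rstrip("/") + "/v1/traces"


# Original CR: HTTP requests end up on the collector's gRPC port.
print(effective_traces_endpoint(
    {"OTEL_EXPORTER_OTLP_ENDPOINT": "http://opentelemetry-collector:4317"}
))

# Suggested CR: the Python-only override takes precedence.
print(effective_traces_endpoint({
    "OTEL_EXPORTER_OTLP_ENDPOINT": "http://opentelemetry-collector:4317",
    "OTEL_EXPORTER_OTLP_TRACES_ENDPOINT": "http://opentelemetry-collector:4318/v1/traces",
}))
```

Under these assumptions, the original configuration sends Python's HTTP export requests to port 4317, while the override redirects only the Python services to port 4318 and leaves the other languages untouched. The collector also needs its OTLP HTTP receiver enabled and port 4318 exposed on its Service.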

We have been using OTLP/gRPC as the default exporter in the auto-instrumentation package: https://github.com/open-telemetry/opentelemetry-python-contrib/blob/51ba801bfda31c3d57902d9f9df938cee1236eb8/opentelemetry-distro/src/opentelemetry/distro/__init__.py#L37-L38. This default was chosen partly because we didn’t have HTTP exporters for OTLP at the time. Even outside this operator deployment, I have seen people run into weird issues because gRPC sometimes doesn’t work well with their environment. Personally, I am in favor of using OTLP/HTTP + Protobuf. With that change, people who already run gRPC will face issues, because environment settings such as the endpoint will need to be updated.

From the linked issues it seems there is no strong opinion on what the default exporter should be, and SDKs might choose one or the other depending on circumstances.

To me it seems that we should follow what the language SDKs do and choose the right default based on the language.

We should take an action item here and fix the problem for all users.

This should either be documented, or the operator can set the env var OTEL_EXPORTER_OTLP_TRACES_ENDPOINT for Python.

This worked! Thank you @mat-rumian. Just curious why Python is treated differently from the other two…