azure-sdk-for-python: [SB][pyamqp][OTEL] pyamqp tracing is missing values for `amqpLink` and `amqpSession`

  • Package Name: azure-servicebus==7.11.2
  • Package Version: azure-servicebus==7.11.2, azure-core==1.29.4, azure-core-tracing-opentelemetry==1.0.0b11, azure-monitor-opentelemetry-exporter==1.0.0b17, opentelemetry-*==1.20.0/0.41b
  • Operating System: macOS, Ubuntu, Debian
  • Python Version: 3.10

Describe the bug We get logs like

2023-09-22 07:40:48.193 [WARNING] [opentelemetry.attributes] __init__._clean_attribute -: Invalid type NoneType for attribute 'amqpLink' value. Expected one of ['bool', 'str', 'bytes', 'int', 'float'] or a sequence of those types
2023-09-22 07:40:48.040 [WARNING] [opentelemetry.attributes] __init__._clean_attribute -: Invalid type NoneType for attribute 'amqpSession' value. Expected one of ['bool', 'str', 'bytes', 'int', 'float'] or a sequence of those types

Looks like pyamqp distributed tracing is trying to add attributes to the span that are clearly empty.

To Reproduce Use Azure SB with pyamqp and enable opentelemetry instrumentation. Try sending some messages.

Expected behavior amqpLink and amqpSession is not an empty value or if it’s an empty value then it’s not added as attribute to the span.

Screenshots Zrzut ekranu 2023-09-22 o 10 08 01

Additional context As a library user I would like to know if there’s an issue, but this logs would come up too often and would only increase our bill.

About this issue

  • Original URL
  • State: closed
  • Created 9 months ago
  • Reactions: 1
  • Comments: 18 (16 by maintainers)

Most upvoted comments

This error is generated by the upstream opentelemetry api library. This is because None attribute values are actually invalid according to the spec. @swathipil azure-servicebus may not be adding attributes to the log record directly but the opentelemetry python logging sdk adds entries in the extra argument as attributes directly on the LogRecord. This means that any values defined in extra will be subject to OpenTelemetry attribute validations. That is probably what is causing the error, and because @macieyng is most likely using the AzureMonitorLogExporter, they are capturing these errorneous log messages and they get sent to their app insights resource. There are a couple of things we can do here:

  1. Enforce OT attribute validation on all azure-sdk libraries that log messages (or at least for now, azure-servicebus
  2. Go to OpenTelemetry Python and request for configuration of what fields of a LogRecord can be omitted from becoming opentelemetry attributes
  3. (Advised to do anyway) Don’t add the LoggingHandler to your root logger as this will cause telemetry collection for ALL Python loggers in your application. This will be an issue regardless of the original problem that was reported. Instead, use namespaced loggers specifically for your application and add the handler to those so you only collect and export logs related to your application.

@swathipil

The only validation from the opentelemetry python sdk that is surfaced will be warning logs that you see above (they should never throw an exception). OpenTelemetry api uses _clean_attributes to validate OpenTelemetry attributes from a dictionary. Technically you can use that for any additional fields (specifically extra) that you place on LogRecord but I think for this specific case maybe you just need to check for None values and default them to something appropriate. As well, additionally I will to bring this up in upstream OpenTelemetry to perform validation of attributes on the additional fields as well since we don’t do that today (this is a bug). The validation will still need to be performed in azure-sdk however because upstream will eventually be changed to drop invalid attributes (instead of spitting out warnings like above).

Hi @macieyng -

Having “amqpLink”/“amqpSession” logged as either an empty string or None is useful for debugging scenarios to establish that they haven’t been created yet or are being re-created. We’re currently looking into whether this error might be an issue with either opentelemetry or the Azure Monitor otel exporter package. Will keep you updated.

I’m assuming amqpSession is the same as service bus session

For clarification, “amqpSession” and ServiceBusSession are two different concepts. A Service Bus session is used for ordered handling/processing of messages related by “session_id”. An AMQP Session is a protocol level concept for grouping links that are used for communication, in our case, with the service.

From my experience, if there’s no session then I don’t care and probably wouldn’t be using this key/attribtue/field for searching for stuff in Application Insights.

For your purposes of tracking a Service Bus session, you should not be tracking the “amqpSession” logging parameter. The “amqpLink/Session/Connection” are for advanced usage & debugging. Thanks.

@annatisch - We don’t have the logging parameters documented but have an issue here (#32220) to address the troubleshooting documentation during the upcoming MQ.

@macieyng - Thanks for your logs! I see a few errors in there that look like: Invalid type NoneType for attribute 'order_id' value. Expected one of ['bool', 'str', 'bytes', 'int', 'float'] or a sequence of those types and Invalid type NoneType for attribute '***_id' value. Expected one of ['bool', 'str', 'bytes', 'int', 'float'] or a sequence of those types

We’re not passing order_id explicitly within the azure-servicebus library anywhere that I know of. Do you know where these logs above could be originating from?

Actually after some thought and re-reading the error message - this might simply be because we are defaulting amqpSession and amqpLink to None rather than empty str: https://github.com/Azure/azure-sdk-for-python/blob/main/sdk/servicebus/azure-servicebus/azure/servicebus/_pyamqp/_connection.py#L116

So this might be a very simple fix - but needs some further testing.

Ah this is very interesting - thanks for reporting @macieyng! @swathipil - this is because we are using the built-in log formatting. We may need to “pre format” all the log messages…

It should be a simple update - though it would be a relatively extensive one.

  1. Sync client
  2. Here are some logs query_data (17).csv