opentelemetry-cpp: Opentelemetry API singletons (TraceProvider etc.) do not play well in shared library uses that hide symbol visibility
Describe your environment We are using the opentelemetry tracing API (and SDK) inside a Linux C++ shared library (.so file) built with the standard gcc toolchain. This library is provides messaging using the AMQP protocol (the library is Qpid Proton C++). We create tracing spans internal to the library to represent the lifecycle of messages sent and received by the library. The contexts associated with the spans are propagated with the messages so that distributed tracing is achieved.
The intention is that any application using our library should be able to create its own tracing spans that are naturally related to the library generated spans (by default using the active span as the parent span). This requires that the TracingProvider accessible to the library and to the application using it is the same TracingProvider.
Steps to reproduce
Our library is built with -fvisibility=hidden
. This means that any symbol that should be exported from the library needs some extra annotation (__attribute__((visibility("default")))
for gcc/clang). We use this to carefully control the visible API/ABI from our library and to be sure that internal details aren’t visible outside.
However the way that the singleton insideProvider
is implemented in a header file means that it is defined as a static symbol inside the inline member function Provider::GetProvider
. This means that there are duplicate symbols for every time the provider.h
header file is included in for example our library and an application that uses it.
If the symbols are all visible to the linker at link and runtime then the symbol will be effectively deduplicated and there will be only one used - to my understanding the application symbol takes precedence, but as long as there is only one it will work correctly.
However when the library symbols are hidden by default as in our library there is no way for the runtime loader to know that there should only be one version of the singleton and what happens is that the library and the application end up with different TracerProvider
s.
This is a serious problem for us as we ship our library with the symbols exported explicitly for very good reason and so we can’t just use Provider::GetTracerProvider
in the library and application and have it work correctly together.
What is the expected behavior?
The way I would expect this to work is that the singletons are not defined in header files and so duplicated in every object file that includes them, but rather the header file only declares Provider::GetProvider
and it is defined in one of the opentelemetry-cpp libraries (the common lib?). I understand that this moves the singleton out of what you think of as the ‘API’ and into the ‘SDK’. But I really think that architecturally singletons are actuall implementation artifacts not API artifiacts.
Additional context I note that this is not an issue at all if you statically link everything as in that case the final link phase takes care of the issue.
I also note that this should be an issue using DLLs on windows as by symbols are not exported by default (afair) and also need to be explicitly marked to be exported (using __declspec(dllexport)
About this issue
- Original URL
- State: closed
- Created 2 years ago
- Comments: 27 (25 by maintainers)
Commits related to this issue
- Proof of concept. Fix header only API for singletons (#1520) — committed to marcalff/opentelemetry-cpp by marcalff 2 years ago
- Fix header only API for singletons (#1520) Fixed asan build — committed to marcalff/opentelemetry-cpp by marcalff 2 years ago
- Fix header only API for singletons (#1520) Code cleanup. — committed to marcalff/opentelemetry-cpp by marcalff 2 years ago
- Fix header only api singletons (#1520) This fix is for gcc and clang only. — committed to marcalff/opentelemetry-cpp by marcalff 2 years ago
- Fix header only api singletons (#1520) (#1604) — committed to open-telemetry/opentelemetry-cpp by marcalff 2 years ago
- Fix header only api singletons (#1520) (#1604) — committed to yxue/opentelemetry-cpp by marcalff 2 years ago
- Fix header only api singletons (#1520) (#1604) — committed to yxue/opentelemetry-cpp by marcalff 2 years ago
Thanks @astitcher for the detailed problem description.
Also discussed during the opentelemetry-cpp meeting today.
The same issue was reported recently (#1409) and fixed (#1420) for some singletons present in the SDK, because the code used the general pattern of declaring singletons in header only code, for the most part.
There is still some areas to fix in the SDK, see (#1429), but moving singletons to *.cc files in the SDK only works because there are libraries associated with the SDK.
The same coding pattern (header only singleton) is also used in the API, with some extra constraints that the API should be header only.
I tried below to summarize the problem and constraints, to have a fact based discussion about possible solutions.
All opinions are my own.
I) History
For some history about this decision, from the meeting notes:
Oct 7th, 2019:
Bogdan - can we have pure C++ header API? Ryan/Max - singleton/global would be hard.
Feb 24th, 2020:
[Evgeny] Does library provide access to globals/singletons, e.g. TraceFactory Current requirements.md and this comment suggest no exported symbols
https://github.com/open-telemetry/opentelemetry-cpp/pull/38#discussion_r383064843
Multiple libraries, built separately, how do they share/access the same singleton?
[Ryan] The archives will have multiple symbols but the linker will resolve it. Bigger problems around late binding.
[Evgeny] Alternative: abandon header-only approach and put the singleton business into a *.cc file.
[Ryan] We decided early on to try to be header-only to be easier to install.
Takeaway from these early discussions:
The challenge of implementing singletons in header only code has been identified more that 2.5 years ago already
The specific issue about shared libraries (“Bigger problems around late binding”) has been identified more that 2 years ago, and is not fully resolved to this day.
A header only library is indeed easier to install, but note the wording “try to”, and the related fallback “Alternative: abandon header-only approach”: a header only API is a design goal, not an absolute requirement.
II) Desirable architectural qualities
II-A) Singletons MUST be singletons.
I think we all agree on this, otherwise everything will fall apart.
No matter what the final technical solution is, singletons must really be singletons for opentelemetry-cpp to work.
II-B) Instrumenting a library MUST depend only on API header files.
When instrumenting library X, the library owner has no control, and should not be concerned with, whether the library will be deployed:
The best way to achieve this is to compile the instrumented library only against API header files, to make it independent of an SDK implementation.
This is what hooks like the trace provider singleton are for in the first place, isolate the instrumentation using the API from the implementation in the SDK.
The desired goal here is for library authors to ship only one library, instead of releasing one library with, one without, opentelemetry.
II-C) Deployment ease of use.
Assume an existing application composed of:
The main application links against libraries A, B and C.
When instrumenting libraries with opentelemetry:
To deploy the application, several choices are then possible:
a) Not using opentelemetry
Link:
In this deployment, linking the final application does not change, as there are no additional libraries needed.
What the “header only API” property provides, is the possibility to link the application without changes, in this case alone.
This is desirable to facilitate distributing libraries with/without instrumentation, with minimal (in this case, zero) disruptions to the application using the libraries and not using opentelemetry.
b) Using opentelemetry
Link:
In this case, the fact that the API is header only is irrelevant, because the application has to be modified to link with more libraries anyway.
II-D) Non intrusive architectural constraints
No matter what technical choices are made in the opentelemetry-cpp space, adopting opentelemetry in an existing application should not invalidate architectural and design choices made in the application.
In particular, if a given application has reasons of its own to compile code with
-fvisibility=hidden
, opentelemetry should not get in the way and cause this application decision to be changed to accommodate opentelemetry-cpp.III) Current state
As of now, singletons like the global trace provider are:
For applications using
-fvisibility=hidden
, symbols are not visible, and singletons are in fact duplicated, causing bugs.With the code as is, II-C-a) is satisfied at the expense of II-D)
IV) Possible solutions
IV-A) Do not support -fvisibility=hidden.
This “solution” is listed for completeness only to point out that it is not acceptable.
For opentelemetry in general, and opentelemetry-cpp in particular, to be successful, adopting opentelemetry should not cause major disruptions in the application space, and changing how the entire application is build is disruptive.
Disclosure:
In the application we (at work) instrument,
-fvisibility=hidden
is used as well.Changing this is not an option, making II-D) a critical point.
IV-B) Make header-only API and -fvisibility=hidden work together.
If feasible, this would be the best technical solution, as it satisfies both II-C) and II-D)
Adding explicit visibility directives to the singleton globals declared in the API, to make these symbols visible even when the code is built with
-fvisibility=hidden
by default, could be a technical solution here.My main concern is:
In particular, what if this is not possible for a given (old) version of a compiler in a given platform, that is otherwise supported today ?
Is dropping support for compiler X an acceptable outcome ?
IV-C) Revisit the “header-only API” principle.
To take an example with the tracer provider singleton, today the API has:
trace::Provider::GetTracerProvider()
trace::Provider::SetTracerProvider()
provider
, defined insidetrace::Provider::GetProvider()
As a side comment, not only the provider singleton should be unique, but the mutex that protects it should be unique as well, to prevent races.
Looking at each part:
the getter function is definitely part of the API, to be used when instrumenting a library, per II-B)
the setter function should not be considered part of the API in the first place. An instrumented library should never call this setter. Logically, it belongs to the SDK, to be called by the application owner to configure the SDK.
the singleton itself is right at the boundary between the API and the SDK. As such, observing a different set of rules for the API/SDK boundary, if required for technical reasons, seems valid (to me).
The fact that the setter function is defined and implemented in the API does not cause harm otherwise, so leaving the setter in the API is just fine.
Moving the singleton implementation in a opentelemetry-api-singletons library will solve the technical problem here.
Consequences are that:
the “header-only API” principle applies to all the code except singletons, which must be documented to avoid confusion
applications linking with instrumented libraries now need an extra library to link
The later point is not desirable, but this is a compromise if IV-B can not be implemented.
In this solution, II-D) is supported at the expense of II-C-a)
V) Decision needed.
I think it is time to revisit the header-only API principle, with regards to singletons.
If the pattern can be amended with visibility directives to make it work all the time, great.
The first step is technical investigations, to see if solution IV-B is feasible.
Otherwise, opentelemetry-cpp will have to provide an API library for singletons, per IV-C.
The trade off between II-C and II-D is a major one.
Not supporting II-C has major impact on everyone not using opentelemetry.
Not supporting II-D has major impact on everyone using opentelemetry.
The plan is to limit the scope to Linux for now: https://github.com/open-telemetry/opentelemetry-cpp/pull/1525#issuecomment-1226020353
@marcalff - Thanks for very neatly summarizing all of this and explaining the history - I think I’m in the camp that really considers the singletons to really be part of the SDK as they are actually implementation artifacts, but I totally get that requiring a new library to link with the API is highly undesirable at this point. I will spend a bit of time this week to try and rough out the code using
__attribute__((visibility(default)))
to see if that works. My previous experience suggests that this syntax is good for every version of gcc and clang that matters (as in supports at least C++11). I also think that using a simple macro that either expands to that or to__declspec(dllexport)
for VS should work. One thing that I think makes this work simpler is that the symbol export is unconditionally true for every use of the header file, which is unlike the common use where you want to export the symbol when actually compiling the library itself but import the symbol when using the header to define the API that the library exports.