Log: https://cmssdt.cern.ch/SDT/cgi-bin/logreader/cc8_aarch64_gcc9/CMSSW_12_0_X_2021-07-19-2300/unitTestLogs/FWCore/Utilities#/121-121
===== Test "testFWCoreUtilities" ====
Running ...............................F...
reusableobjectholder_t.cppunit.cpp:240:Assertion
Test name: reusableobjectholder_test::testSimultaneousUse
assertion failed
- Expression: t1ItemsSeen.size() > 0 && t1ItemsSeen.size() < 3
Failures !!!
Run: 34 Failure total: 1 Failures: 1 Errors: 0
---> test testFWCoreUtilities had ERRORS
TestTime:29
^^^^ End Test testFWCoreUtilities ^^^^
I spent some time on this issue this morning, and I think I traced it to the oneTBB implementation of concurrent_queue::try_pop() (see include/oneapi/tbb/concurrent_queue.h#L180-L195). Due to the first std::memory_order_relaxed, try_pop may sometimes fail spuriously on architectures that have a relaxed memory ordering, like Power and ARM. Changing it to std::memory_order_acquire and rebuilding the test fixed it for me on a Power 8 machine (it ran successfully 20 times out of 20).

I’m unable to reproduce the failure locally. It is likely related to the load on the ARM nodes (which isn’t too high at the moment). I’m thinking of enabling the printout at https://github.com/cms-sw/cmssw/blob/23f1aad119f27a6bfe1838e0c6f5c6f9f56f0b66/FWCore/Utilities/test/reusableobjectholder_t.cppunit.cpp#L240-L242 (but moving it before the asserts) to see a bit more of what is going on when the asserts fail.
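
For readers less familiar with the memory-ordering point raised above, here is a minimal sketch of the general release/acquire pairing. It is not the oneTBB code: Mailbox, publish and try_take are hypothetical names invented for this illustration, and the sketch assumes C++17 for std::optional. It only shows why a consumer-side load is usually std::memory_order_acquire when it guards access to data written by another thread.

```cpp
#include <atomic>
#include <cassert>
#include <optional>

// Hypothetical single-slot mailbox, for illustration only.
// This is NOT how oneTBB's concurrent_queue is implemented.
struct Mailbox {
    int payload = 0;                 // plain (non-atomic) data
    std::atomic<bool> ready{false};  // publication flag

    // Producer: write the payload, then publish it with release semantics.
    void publish(int value) {
        assert(!ready.load(std::memory_order_relaxed) && "single-use slot");
        payload = value;
        ready.store(true, std::memory_order_release);
    }

    // Consumer: the acquire load pairs with the release store above, so a
    // thread that observes ready == true is guaranteed to also observe the
    // payload written before it. A relaxed load gives no such guarantee.
    std::optional<int> try_take() {
        if (!ready.load(std::memory_order_acquire)) {
            return std::nullopt;  // nothing published yet
        }
        return payload;
    }
};
```

On strongly ordered hardware such as x86, a relaxed load often happens to behave like an acquire load in practice, which may be part of why a problem of this kind would only show up on Power and ARM.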