etl: undefined behaviour and memory issues.

What happened

Hi, you’ve created a great library! But i discovered that source code of tests and maybe source code of library itself contains multiple instances of at least two kinds of problems:

  1. undefined behavior
  2. memory issues.

How to reproduce

It can be easily reproduced by addition of “-fsanitize=address,undefined” to CMAKE_CXX_FLAGS on every supported compiler. As for example:

set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -fsanitize=address,undefined")

Or for single type of check:

set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -fsanitize=undefined")

You can checkout concrete example for gcc/clang from this commit

You can see failing results in github actions ci.

Reports example

  • Undefined behavior example
/home/runner/work/etl/etl/test/test_array_wrapper.cpp:264:7: runtime error: index -1 out of bounds for type 'int [5]'
/home/runner/work/etl/etl/test/test_array_wrapper.cpp:278:7: runtime error: index -1 out of bounds for type 'int [5]'
/home/runner/work/etl/etl/test/test_array_wrapper.cpp:279:7: runtime error: index -1 out of bounds for type 'int [5]'
/home/runner/work/etl/etl/test/test_array_wrapper.cpp:282:7: runtime error: index -1 out of bounds for type 'int [5]'
/home/runner/work/etl/etl/test/test_array_wrapper.cpp:283:7: runtime error: index -1 out of bounds for type 'int [5]'
/home/runner/work/etl/etl/test/test_array_wrapper.cpp:332:37: runtime error: index -1 out of bounds for type 'int [5]'
/home/runner/work/etl/etl/test/test_array_wrapper.cpp:353:37: runtime error: index -1 out of bounds for type 'int [5]'
/home/runner/work/etl/etl/test/../include/etl/bit.h:163:19: runtime error: shift exponent 32 is too large for 32-bit type 'unsigned int'
/home/runner/work/etl/etl/test/test_bit.cpp:184:19: runtime error: shift exponent 32 is too large for 32-bit type 'unsigned int'
/home/runner/work/etl/etl/test/../include/etl/bit.h:163:19: runtime error: shift exponent 64 is too large for 64-bit type 'long unsigned int'
/home/runner/work/etl/etl/test/test_bit.cpp:184:19: runtime error: shift exponent 64 is too large for 64-bit type 'long unsigned int'
/home/runner/work/etl/etl/test/../include/etl/bitset.h:454:28: runtime error: left shift of negative value -128
/home/runner/work/etl/etl/test/../include/etl/bit_stream.h:237:44: runtime error: left shift of negative value -91
/home/runner/work/etl/etl/test/../include/etl/bit_stream.h:237:44: runtime error: left shift of negative value -37
  • Address issue example
==4661==ERROR: AddressSanitizer: stack-use-after-scope on address 0x7ffe36665b90 at pc 0x55db03f737cc bp 0x7ffe366655e0 sp 0x7ffe366655d0
READ of size 8 at 0x7ffe36665b90 thread T0
    #0 0x55db03f737cb in (anonymous namespace)::Suitetest_callback_timer_interrupt::Testcallback_timer_interrupt_log_timer_calls::RunImpl() const::{lambda()#1}::operator()() const (/home/runner/work/etl/etl/test/etl_tests+0xadce7cb)
    #1 0x55db03f7b96f in void etl::delegate<void ()>::lambda_stub<(anonymous namespace)::Suitetest_callback_timer_interrupt::Testcallback_timer_interrupt_log_timer_calls::RunImpl() const::{lambda()#1}>(void*) (/home/runner/work/etl/etl/test/etl_tests+0xadd696f)
    #2 0x55db03ef9d84 in etl::delegate<void ()>::operator()() const (/home/runner/work/etl/etl/test/etl_tests+0xad54d84)
    #3 0x55db03f7a7fb in etl::icallback_timer_interrupt<(anonymous namespace)::ScopedGuard>::tick(unsigned int) (/home/runner/work/etl/etl/test/etl_tests+0xadd57fb)
    #4 0x55db03f74e41 in (anonymous namespace)::Suitetest_callback_timer_interrupt::Testcallback_timer_interrupt_log_timer_calls::RunImpl() const (/home/runner/work/etl/etl/test/etl_tests+0xadcfe41)
    #5 0x55db0b51dcc2 in void UnitTest::ExecuteTest<UnitTest::Test>(UnitTest::Test&, UnitTest::TestDetails const&, bool) (/home/runner/work/etl/etl/test/etl_tests+0x12378cc2)
    #6 0x55db0b51d8e3 in UnitTest::Test::Run() (/home/runner/work/etl/etl/test/etl_tests+0x123788e3)
    #7 0x55db0b52113e in UnitTest::TestRunner::RunTest(UnitTest::TestResults*, UnitTest::Test*, int) const (/home/runner/work/etl/etl/test/etl_tests+0x1237c13e)
    #8 0x55db0b521bbb in int UnitTest::TestRunner::RunTestsIf<UnitTest::True>(UnitTest::TestList const&, char const*, UnitTest::True const&, int) const (/home/runner/work/etl/etl/test/etl_tests+0x1237cbbb)
    #9 0x55db0b51ff17 in UnitTest::RunAllTests() (/home/runner/work/etl/etl/test/etl_tests+0x1237af17)
    #10 0x55db034d36d2 in main (/home/runner/work/etl/etl/test/etl_tests+0xa32e6d2)
    #11 0x7f1eafcf1c86 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x21c86)
    #12 0x55db034d35e9 in _start (/home/runner/work/etl/etl/test/etl_tests+0xa32e5e9)

Address 0x7ffe36665b90 is located in stack of thread T0 at offset 928 in frame
    #0 0x55db03f744a5 in (anonymous namespace)::Suitetest_callback_timer_interrupt::Testcallback_timer_interrupt_log_timer_calls::RunImpl() const (/home/runner/work/etl/etl/test/etl_tests+0xadcf4a5)

Why it matters

Brief analysis of issues in repository suggested, that some tools already warned at least about undefined behavior elements in test code: array index issue but were explicitly ignored. Even though this particular linked issue isn’t a big deal by itself. But it can have negative effects on final code quality:

  1. New versions of compilers have optimization strategies based on assumption, that program isn’t ill-formed and doesn’t have undefined behavior code. Under that assumption, the optimizer can optimize away some checks inside tests, so they will never have a chance to report when some regression bug happens.
  2. Ignoring a whole category of ub bugs in the assumption that it only contains “silly” stuff from “too pedantic” check inside of test code - can hide some real bugs in code.
  3. stl/boost and other well established libraries on market all can pass its tests without ubsan/asan errors. Some organizations have policies to have this checks on and passing. So some of potential users couldn’t use this library.
  4. Any memory related bugs can prohibit usage of fuzzing techniques because they usually use sanitizers too.
  5. I’ve started poking around with these flags after I encountered, that tests doesn’t pass/or hang under my machine using gcc10.3 and gcc 11.2 and clang 13.0.1. And they have different behavior under different compilers. So my guess, that some issues found by asan/ubsan - are real problems(in tests themselves or in library code).

ping @jwellbelove

About this issue

  • Original URL
  • State: closed
  • Created 2 years ago
  • Comments: 24 (13 by maintainers)

Commits related to this issue

Most upvoted comments

@jwellbelove does that example speak to you in some way? https://godbolt.org/z/rPMs1xPPd I’ve manually added placement new from local buffer, and problem is gone.

So in case of placement new - object is de-allocated using release method, which calls destructor manually. But in case of stack based object - we get destructor called twice? first - with release method, second - at the end of scope? which mess-up vtable for object, so ubsan detects it as wrong kind of object? Illustration of this effect As mentioned here (stackoverflow answer) calling destructor twice is ub.

So i’ve tried this code:

#include "unit_test_framework.h"

#include "etl/fixed_sized_memory_block_allocator.h"
#include "etl/reference_counted_message_pool.h"
#include "etl/shared_message.h"

namespace {
constexpr etl::message_id_t MessageId1 = 1U;
constexpr etl::message_id_t MessageId2 = 2U;

struct Message1 : public etl::message<MessageId1> {
  Message1(int i_) : i(i_) {}

  ~Message1() {}

  int i;
};

struct Message2 : public etl::message<MessageId2> {
  ~Message2() {}
};

SUITE(test_shared_message) {
  using pool_message_parameters =
      etl::atomic_counted_message_pool::pool_message_parameters<Message1,
                                                                Message2>;

  etl::fixed_sized_memory_block_allocator<
      pool_message_parameters::max_size, pool_message_parameters::max_alignment,
      4U>
      memory_allocator;

  etl::atomic_counted_message_pool message_pool(memory_allocator);

  TEST(test_reference_counted_pool_exceptions) {
    using pool_message_parameters =
        etl::atomic_counted_message_pool::pool_message_parameters<Message1,
                                                                  Message2>;

    etl::fixed_sized_memory_block_allocator<
        pool_message_parameters::max_size,
        pool_message_parameters::max_alignment, 4U>
        memory_allocator;

    etl::atomic_counted_message_pool message_pool(memory_allocator);

    etl::reference_counted_message<Message1, etl::atomic_int>* prcm;
    CHECK_NO_THROW(prcm = message_pool.allocate<Message1>(6));

    using rcm = etl::reference_counted_message<Message1, etl::atomic_int>;
    alignas(rcm) unsigned char buf[sizeof(rcm)];

    Message1 message1(6);

    rcm* temp = new (buf) rcm(message1, message_pool);

    try {
      message_pool.release(*temp);
    } catch (etl::exception e) {
      CHECK_EQUAL(std::string("reference_counted_message_pool:release failure"),
                  std::string(e.what()));
    }
  }
}
}  // namespace

And it passes tests.