simple-binary-encoding: [C++] Undefined behaviour in generated code

The C++ code generator typically generates code like this for accessing integer fields:

    std::int32_t securityID(void) const
    {
        return SBE_LITTLE_ENDIAN_ENCODE_32(*((std::int32_t *)(m_buffer + m_offset + 24)));
    }

    NewOrderSingle &securityID(const std::int32_t value)
    {
        *((std::int32_t *)(m_buffer + m_offset + 24)) = SBE_LITTLE_ENDIAN_ENCODE_32(value);
        return *this;
    }

The compiler assumes that a pointer to an int32_t is correctly aligned, and dereferencing a misaligned pointer is undefined behaviour. With this generated code the alignment requirement might not be fulfilled, because the field address depends on the buffer address and the offset. On amd64 this is fine as long as the compiler doesn’t try to use SSE or AVX instructions, but it’s not safe in general to assume it will work.
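To make the alignment concern concrete, here is a minimal sketch that checks whether an address inside a byte buffer satisfies the alignment of a given type. The helper name `is_aligned_for` is hypothetical and is not part of the generated code or of SBE.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not part of SBE): returns true if the address p
// is a multiple of the alignment required by T.
template <class T>
bool is_aligned_for(const void *p)
{
    return reinterpret_cast<std::uintptr_t>(p) % alignof(T) == 0;
}
```

If a message starts at an odd offset within a receive buffer, a field at message offset 24 ends up at an odd address, and a check like this would report it as misaligned for `std::int32_t` on platforms where that type requires more than 1-byte alignment.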

The solution is to use memcpy like this (https://chromium.googlesource.com/chromium/src.git/+/master/base/bit_cast.h):

    template <class Dest, class Source>
    inline Dest bit_cast(const Source& source) {
        static_assert(sizeof(Dest) == sizeof(Source),
                      "bit_cast requires source and destination to be the same size");
        static_assert(base::is_trivially_copyable<Dest>::value,
                      "bit_cast requires the destination type to be copyable");
        static_assert(base::is_trivially_copyable<Source>::value,
                      "bit_cast requires the source type to be copyable");

        Dest dest;
        memcpy(&dest, &source, sizeof(dest));
        return dest;
    }

This should optimize to a single load on amd64, guaranteed not to use instructions requiring alignment.
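Applied to the accessors, the memcpy approach might look roughly like the sketch below. The class name `NewOrderSingleSketch` is a hypothetical stand-in for the generated flyweight, and the byte-order macro is left out, so this version assumes the host byte order matches the wire format.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for a generated message class. Only the
// unaligned-safe field access is shown; byte-order conversion is omitted.
class NewOrderSingleSketch
{
public:
    NewOrderSingleSketch(char *buffer, std::size_t offset)
        : m_buffer(buffer), m_offset(offset) {}

    // Read the field by copying its bytes into a properly aligned local.
    std::int32_t securityID() const
    {
        std::int32_t val;
        std::memcpy(&val, m_buffer + m_offset + 24, sizeof(val));
        return val;
    }

    // Write the field by copying the bytes of the value into the buffer.
    NewOrderSingleSketch &securityID(const std::int32_t value)
    {
        std::memcpy(m_buffer + m_offset + 24, &value, sizeof(value));
        return *this;
    }

private:
    char *m_buffer;
    std::size_t m_offset;
};
```

Because the copy goes through `char` pointers, no `std::int32_t *` to a possibly misaligned address is ever formed, so the compiler has no alignment assumption to exploit.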

About this issue

  • State: closed
  • Created 7 years ago
  • Reactions: 3
  • Comments: 21 (17 by maintainers)

Most upvoted comments

I don’t control the specification, I’m using what is provided by the exchange etc. But that is not the problem.

The problem is that the generated code is relying on undefined behaviour. The compiler will generate correct instructions on amd64 most of the time. By using memcpy we can make sure it will generate correct instructions all the time.

Haven’t exterminated all the (void) yet. And we are old C hackers anyway.