onnxruntime: Failed to build with protobuf 3.20

Describe the bug

Protobuf 3.19 builds fine; this failure is specific to protobuf 3.20.

FAILED: CMakeFiles/onnxruntime_framework.dir/tmp/scratch/onnxruntime/onnxruntime/core/framework/tensorprotoutils.cc.o
ccache /usr/bin/g++-10 -DCPUINFO_SUPPORTED_PLATFORM=1 -DEIGEN_MPL2_ONLY -DEIGEN_USE_THREADS -DENABLE_CPU_FP16_TRAINING_OPS -DENABLE_LANGUAGE_INTEROP_OPS -DNSYNC_ATOMIC_CPP11 -DONNX_ML=1 -DONNX_NAMESPACE=onnx -DORT_RUN_EXTERNAL_ONNX_TESTS -DPLATFORM_POSIX
 -DUSE_DNNL=1 -I/tmp/scratch/onnxruntime/include/onnxruntime -I/tmp/scratch/onnxruntime/include/onnxruntime/core/session -I/tmp/scratch/onnxruntime/cmake/external/nsync/public -I/tmp/scratch/onnxruntime/build -I/tmp/scratch/onnxruntime/onnxruntime -I/tmp
/scratch/onnxruntime/cmake/external/eigen -I/tmp/scratch/onnxruntime/cmake/external/SafeInt -I/tmp/scratch/onnxruntime/cmake/external/mp11/include -I/tmp/scratch/onnxruntime/cmake/external/pytorch_cpuinfo/include -I/tmp/scratch/onnxruntime/cmake/external
/onnx -I/tmp/scratch/onnxruntime/build/external/onnx -I/tmp/scratch/onnxruntime/cmake/external/flatbuffers/include -fdebug-prefix-map='/tmp/scratch'='/usr/local/src' -g -march=haswell -mtune=generic -ffunction-sections -fdata-sections -DCPUINFO_SUPPORTED
 -O3 -DNDEBUG -DGSL_UNENFORCED_ON_CONTRACT_VIOLATION -flto -fno-fat-lto-objects -fPIC -Wall -Wextra -Wno-deprecated-copy -Wno-nonnull-compare -std=gnu++17 -MD -MT CMakeFiles/onnxruntime_framework.dir/tmp/scratch/onnxruntime/onnxruntime/core/framework/ten
sorprotoutils.cc.o -MF CMakeFiles/onnxruntime_framework.dir/tmp/scratch/onnxruntime/onnxruntime/core/framework/tensorprotoutils.cc.o.d -o CMakeFiles/onnxruntime_framework.dir/tmp/scratch/onnxruntime/onnxruntime/core/framework/tensorprotoutils.cc.o -c /tm
p/scratch/onnxruntime/onnxruntime/core/framework/tensorprotoutils.cc
/tmp/scratch/onnxruntime/onnxruntime/core/framework/tensorprotoutils.cc: In function ‘onnxruntime::common::Status onnxruntime::utils::UnpackTensor(const onnx::TensorProto&, const void*, size_t, T*, size_t) [with T = float; size_t = long unsigned int]’:
/tmp/scratch/onnxruntime/onnxruntime/core/framework/tensorprotoutils.cc:259:20: error: invalid cast from type ‘google::protobuf::internal::RepeatedIterator<const float>’ to type ‘const float*’
  259 |       *p_data++ = *reinterpret_cast<const T*>(data_iter);                                                   \
      |                    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/tmp/scratch/onnxruntime/onnxruntime/core/framework/tensorprotoutils.cc:264:1: note: in expansion of macro ‘DEFINE_UNPACK_TENSOR_IMPL’
  264 | DEFINE_UNPACK_TENSOR_IMPL(float, ONNX_NAMESPACE::TensorProto_DataType_FLOAT, float_data, float_data_size)
      | ^~~~~~~~~~~~~~~~~~~~~~~~~
/tmp/scratch/onnxruntime/onnxruntime/core/framework/tensorprotoutils.cc: In function ‘onnxruntime::common::Status onnxruntime::utils::UnpackTensor(const onnx::TensorProto&, const void*, size_t, T*, size_t) [with T = double; size_t = long unsigned int]’:
/tmp/scratch/onnxruntime/onnxruntime/core/framework/tensorprotoutils.cc:259:20: error: invalid cast from type ‘google::protobuf::internal::RepeatedIterator<const double>’ to type ‘const double*’
  259 |       *p_data++ = *reinterpret_cast<const T*>(data_iter);                                                   \
      |                    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/tmp/scratch/onnxruntime/onnxruntime/core/framework/tensorprotoutils.cc:265:1: note: in expansion of macro ‘DEFINE_UNPACK_TENSOR_IMPL’
  265 | DEFINE_UNPACK_TENSOR_IMPL(double, ONNX_NAMESPACE::TensorProto_DataType_DOUBLE, double_data, double_data_size);
      | ^~~~~~~~~~~~~~~~~~~~~~~~~
/tmp/scratch/onnxruntime/onnxruntime/core/framework/tensorprotoutils.cc: In function ‘onnxruntime::common::Status onnxruntime::utils::UnpackTensor(const onnx::TensorProto&, const void*, size_t, T*, size_t) [with T = unsigned char; size_t = long unsigned
int]’:
/tmp/scratch/onnxruntime/onnxruntime/core/framework/tensorprotoutils.cc:259:20: error: invalid cast from type ‘google::protobuf::internal::RepeatedIterator<const int>’ to type ‘const uint8_t*’ {aka ‘const unsigned char*’}
  259 |       *p_data++ = *reinterpret_cast<const T*>(data_iter);                                                   \
      |                    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/tmp/scratch/onnxruntime/onnxruntime/core/framework/tensorprotoutils.cc:266:1: note: in expansion of macro ‘DEFINE_UNPACK_TENSOR_IMPL’
  266 | DEFINE_UNPACK_TENSOR_IMPL(uint8_t, ONNX_NAMESPACE::TensorProto_DataType_UINT8, int32_data, int32_data_size)
      | ^~~~~~~~~~~~~~~~~~~~~~~~~
/tmp/scratch/onnxruntime/onnxruntime/core/framework/tensorprotoutils.cc: In function ‘onnxruntime::common::Status onnxruntime::utils::UnpackTensor(const onnx::TensorProto&, const void*, size_t, T*, size_t) [with T = signed char; size_t = long unsigned in
t]’:
/tmp/scratch/onnxruntime/onnxruntime/core/framework/tensorprotoutils.cc:259:20: error: invalid cast from type ‘google::protobuf::internal::RepeatedIterator<const int>’ to type ‘const int8_t*’ {aka ‘const signed char*’}
  259 |       *p_data++ = *reinterpret_cast<const T*>(data_iter);                                                   \
      |                    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/tmp/scratch/onnxruntime/onnxruntime/core/framework/tensorprotoutils.cc:267:1: note: in expansion of macro ‘DEFINE_UNPACK_TENSOR_IMPL’
  267 | DEFINE_UNPACK_TENSOR_IMPL(int8_t, ONNX_NAMESPACE::TensorProto_DataType_INT8, int32_data, int32_data_size)
      | ^~~~~~~~~~~~~~~~~~~~~~~~~
/tmp/scratch/onnxruntime/onnxruntime/core/framework/tensorprotoutils.cc: In function ‘onnxruntime::common::Status onnxruntime::utils::UnpackTensor(const onnx::TensorProto&, const void*, size_t, T*, size_t) [with T = short int; size_t = long unsigned int]
’:
/tmp/scratch/onnxruntime/onnxruntime/core/framework/tensorprotoutils.cc:259:20: error: invalid cast from type ‘google::protobuf::internal::RepeatedIterator<const int>’ to type ‘const int16_t*’ {aka ‘const short int*’}
  259 |       *p_data++ = *reinterpret_cast<const T*>(data_iter);                                                   \
      |                    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/tmp/scratch/onnxruntime/onnxruntime/core/framework/tensorprotoutils.cc:268:1: note: in expansion of macro ‘DEFINE_UNPACK_TENSOR_IMPL’
  268 | DEFINE_UNPACK_TENSOR_IMPL(int16_t, ONNX_NAMESPACE::TensorProto_DataType_INT16, int32_data, int32_data_size)
      | ^~~~~~~~~~~~~~~~~~~~~~~~~
/tmp/scratch/onnxruntime/onnxruntime/core/framework/tensorprotoutils.cc: In function ‘onnxruntime::common::Status onnxruntime::utils::UnpackTensor(const onnx::TensorProto&, const void*, size_t, T*, size_t) [with T = short unsigned int; size_t = long unsi
gned int]’:
/tmp/scratch/onnxruntime/onnxruntime/core/framework/tensorprotoutils.cc:259:20: error: invalid cast from type ‘google::protobuf::internal::RepeatedIterator<const int>’ to type ‘const uint16_t*’ {aka ‘const short unsigned int*’}
  259 |       *p_data++ = *reinterpret_cast<const T*>(data_iter);                                                   \
      |                    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/tmp/scratch/onnxruntime/onnxruntime/core/framework/tensorprotoutils.cc:269:1: note: in expansion of macro ‘DEFINE_UNPACK_TENSOR_IMPL’
  269 | DEFINE_UNPACK_TENSOR_IMPL(uint16_t, ONNX_NAMESPACE::TensorProto_DataType_UINT16, int32_data, int32_data_size)
      | ^~~~~~~~~~~~~~~~~~~~~~~~~
/tmp/scratch/onnxruntime/onnxruntime/core/framework/tensorprotoutils.cc: In function ‘onnxruntime::common::Status onnxruntime::utils::UnpackTensor(const onnx::TensorProto&, const void*, size_t, T*, size_t) [with T = int; size_t = long unsigned int]’:
/tmp/scratch/onnxruntime/onnxruntime/core/framework/tensorprotoutils.cc:259:20: error: invalid cast from type ‘google::protobuf::internal::RepeatedIterator<const int>’ to type ‘const int32_t*’ {aka ‘const int*’}
  259 |       *p_data++ = *reinterpret_cast<const T*>(data_iter);                                                   \
      |                    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/tmp/scratch/onnxruntime/onnxruntime/core/framework/tensorprotoutils.cc:270:1: note: in expansion of macro ‘DEFINE_UNPACK_TENSOR_IMPL’
  270 | DEFINE_UNPACK_TENSOR_IMPL(int32_t, ONNX_NAMESPACE::TensorProto_DataType_INT32, int32_data, int32_data_size)
      | ^~~~~~~~~~~~~~~~~~~~~~~~~
/tmp/scratch/onnxruntime/onnxruntime/core/framework/tensorprotoutils.cc: In function ‘onnxruntime::common::Status onnxruntime::utils::UnpackTensor(const onnx::TensorProto&, const void*, size_t, T*, size_t) [with T = long int; size_t = long unsigned int]’
:
/tmp/scratch/onnxruntime/onnxruntime/core/framework/tensorprotoutils.cc:259:20: error: invalid cast from type ‘google::protobuf::internal::RepeatedIterator<const long int>’ to type ‘const int64_t*’ {aka ‘const long int*’}
  259 |       *p_data++ = *reinterpret_cast<const T*>(data_iter);                                                   \
      |                    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/tmp/scratch/onnxruntime/onnxruntime/core/framework/tensorprotoutils.cc:271:1: note: in expansion of macro ‘DEFINE_UNPACK_TENSOR_IMPL’
  271 | DEFINE_UNPACK_TENSOR_IMPL(int64_t, ONNX_NAMESPACE::TensorProto_DataType_INT64, int64_data, int64_data_size)
      | ^~~~~~~~~~~~~~~~~~~~~~~~~
/tmp/scratch/onnxruntime/onnxruntime/core/framework/tensorprotoutils.cc: In function ‘onnxruntime::common::Status onnxruntime::utils::UnpackTensor(const onnx::TensorProto&, const void*, size_t, T*, size_t) [with T = long unsigned int; size_t = long unsig
ned int]’:
/tmp/scratch/onnxruntime/onnxruntime/core/framework/tensorprotoutils.cc:259:20: error: invalid cast from type ‘google::protobuf::internal::RepeatedIterator<const long unsigned int>’ to type ‘const uint64_t*’ {aka ‘const long unsigned int*’}
  259 |       *p_data++ = *reinterpret_cast<const T*>(data_iter);                                                   \
      |                    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/tmp/scratch/onnxruntime/onnxruntime/core/framework/tensorprotoutils.cc:272:1: note: in expansion of macro ‘DEFINE_UNPACK_TENSOR_IMPL’
  272 | DEFINE_UNPACK_TENSOR_IMPL(uint64_t, ONNX_NAMESPACE::TensorProto_DataType_UINT64, uint64_data, uint64_data_size)
      | ^~~~~~~~~~~~~~~~~~~~~~~~~
/tmp/scratch/onnxruntime/onnxruntime/core/framework/tensorprotoutils.cc: In function ‘onnxruntime::common::Status onnxruntime::utils::UnpackTensor(const onnx::TensorProto&, const void*, size_t, T*, size_t) [with T = unsigned int; size_t = long unsigned i
nt]’:
/tmp/scratch/onnxruntime/onnxruntime/core/framework/tensorprotoutils.cc:259:20: error: invalid cast from type ‘google::protobuf::internal::RepeatedIterator<const long unsigned int>’ to type ‘const uint32_t*’ {aka ‘const unsigned int*’}
  259 |       *p_data++ = *reinterpret_cast<const T*>(data_iter);                                                   \
      |                    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/tmp/scratch/onnxruntime/onnxruntime/core/framework/tensorprotoutils.cc:273:1: note: in expansion of macro ‘DEFINE_UNPACK_TENSOR_IMPL’
  273 | DEFINE_UNPACK_TENSOR_IMPL(uint32_t, ONNX_NAMESPACE::TensorProto_DataType_UINT32, uint64_data, uint64_data_size)
      | ^~~~~~~~~~~~~~~~~~~~~~~~~
ninja: build stopped: subcommand failed.

System information

  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Debian 11
  • ONNX Runtime installed from (source or binary): source
  • ONNX Runtime version: 1.10
  • Python version: Distro stock
  • GCC/Compiler version (if compiling from source): gcc-10

To Reproduce

CMake args:

About this issue

  • Original URL
  • State: closed
  • Created 2 years ago
  • Reactions: 1
  • Comments: 23 (23 by maintainers)

Commits related to this issue


Most upvoted comments

float stored as int and want to restore to its original float

Yes, float values stored in int32_data need reinterpret_cast as per comments in ONNX, and they are handled in different template instantiations. For example, FLOAT16 is handled by https://github.com/microsoft/onnxruntime/blob/v1.11.1/onnxruntime/core/framework/tensorprotoutils.cc#L332.

static_cast only takes care of endian when sizes are the same

I think static_cast handles data types of different sizes correctly on all architectures. I gave it a try with qemu for aarch64_be (AArch64 big endian):

$ cat test-endian.cpp
#include <cstdint>
#include <cstdio>

int main() {
    int32_t a[1] = { 123 };
    uint8_t b = *reinterpret_cast<const uint8_t*>(a);
    uint8_t c = static_cast<uint8_t>(*a);
    printf("%d %d\n", b, c);
    return 0;
}

$ ./gcc-arm-11.2-2022.02-x86_64-aarch64_be-none-linux-gnu/bin/aarch64_be-none-linux-gnu-g++ test-endian.cpp -o test-endian-aarch64_be -static

$ g++ test-endian.cpp -o test-endian-x86_64

$ qemu-aarch64_be test-endian-aarch64_be
0 123

$ ./test-endian-x86_64
123 123

The qemu-aarch64_be emulator is from the Arch Linux package https://archlinux.org/packages/extra/x86_64/qemu-arch-extra/, and the gcc-arm-11.2-2022.02-x86_64-aarch64_be-none-linux-gnu toolchain was downloaded from https://developer.arm.com/tools-and-software/open-source-software/developer-tools/gnu-toolchain/downloads.

Thanks for the pointer! If I understand those comments, the relevant ONNX issues (e.g., https://github.com/onnx/onnx/issues/838, https://github.com/onnx/onnx/issues/3733) and some ONNX code (e.g., onnx.numpy_helper) correctly, each 8-bit or 16-bit value is expanded to int32 during serialization, so static_cast is indeed correct. reinterpret_cast can yield incorrect values for 8-bit and 16-bit types on big-endian systems: protobuf decodes each int32 into the host's (big-endian) byte order, so the 8-bit or 16-bit payload ends up at a higher address than the one data_iter points to. I don’t have any big-endian devices to test on, though.