onnxruntime-genai: terminated by signal SIGSEGV (Address boundary error) if onnxruntime is imported before onnxruntime-genai

og.Model cannot be created if the user imports onnxruntime before onnxruntime-genai. The job is terminated by signal SIGSEGV (Address boundary error).


import onnxruntime as ort  # importing onnxruntime before onnxruntime_genai triggers the crash below

def genai_run(prompt, model_path, max_length=200):

    import time

    import onnxruntime_genai as og

    print("Loading model...")
    app_started_timestamp = time.time()
    model = og.Model(model_path)  # SIGSEGV occurs here when onnxruntime was imported first
    model_loaded_timestamp = time.time()
    print("Model loaded in {:.2f} seconds".format(model_loaded_timestamp - app_started_timestamp))
    tokenizer = og.Tokenizer(model)
    tokenizer_stream = tokenizer.create_stream()
    input_tokens = tokenizer.encode(prompt)
    started_timestamp = time.time()

    print("Creating generator ...")
    params = og.GeneratorParams(model)
    params.set_search_options(
        {
            "do_sample": False,
            "max_length": max_length,
            "min_length": 0,
            "top_p": 0.9,
            "top_k": 40,
            "temperature": 1.0,
            "repetition_penalty": 1.0,
        }
    )
    params.input_ids = input_tokens
    generator = og.Generator(model, params)
    print("Generator created")

    first = True
    new_tokens = []

    while not generator.is_done():
        generator.compute_logits()
        generator.generate_next_token()
        if first:
            first_token_timestamp = time.time()
            first = False

        new_token = generator.get_next_tokens()[0]
        print(tokenizer_stream.decode(new_token), end="", flush=True)  # flush so tokens stream as they are generated
        new_tokens.append(new_token)

    print()  # end the streamed-output line so the stats print on their own line
    run_time = time.time() - started_timestamp
    print(
        f"Prompt tokens: {len(input_tokens)}, New tokens: {len(new_tokens)},"
        f" Time to first: {(first_token_timestamp - started_timestamp):.2f}s,"
        f" New tokens per second: {len(new_tokens)/run_time:.2f} tps"
    )


model_path = "xxxx"
genai_run("helo world", model_path)

How to reproduce:

  1. python -m onnxruntime_genai.models.builder -m microsoft/phi-2 -e cpu -p int4 -o ./models/phi2
  2. run the script above (the import-order sketch below shows the minimal trigger)
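
For reference, the failure appears to reduce to the import order alone. A minimal sketch, assuming the same ./models/phi2 model built in step 1:

import onnxruntime                  # loads std::filesystem symbols from the system libstdc++
import onnxruntime_genai as og

model = og.Model("./models/phi2")   # SIGSEGV here

# Per the report, importing onnxruntime_genai first avoids the crash:
# import onnxruntime_genai as og
# import onnxruntime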

About this issue

  • State: closed
  • Created 3 months ago
  • Comments: 25 (20 by maintainers)

Most upvoted comments

I think I can confirm the root cause is https://stackoverflow.com/questions/63902528/program-crashes-when-filesystempath-is-destroyed , because after I set a breakpoint at “std::filesystem::__cxx11::path::_M_split_cmpts()”, I got the following stacktrace:

#0  0x00007ffff55e14a0 in std::filesystem::__cxx11::path::_M_split_cmpts() () from /lib/libstdc++.so.6
#1  0x00007fff8cc6efb6 in std::filesystem::__cxx11::path::path<char const*, std::filesystem::__cxx11::path> (this=0x7fffffffcdf0,
    __source=@0x7fffffffce58: 0x7fffffffd020 "xxxx") at /usr/include/c++/8/bits/fs_path.h:185
#2  0x00007fff8cc71234 in std::make_unique<Generators::Config, char const*&> () at /usr/include/c++/8/bits/unique_ptr.h:835
#3  0x00007fff8cc6cd0c in Generators::CreateModel (ort_env=..., config_path=0x7fffffffd020 "xxxx") at /ort_genai_src/src/models/model.cpp:345
#4  0x00007fff8cbee330 in Generators::<lambda(const string&)>::operator()(const std::__cxx11::string &) const (__closure=0x5555561a0ad8,
    config_path="xxxx") at /ort_genai_src/src/python/python.cpp:212
#5  0x00007fff8cbefee6 in pybind11::detail::initimpl::factory<Generators::pybind11_init_onnxruntime_genai(pybind11::module_&)::<lambda(const string&)>, pybind11::detail::void_type (*)(), std::shared_ptr<Generators::Model>(const std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&), pybind11::detail::void_type()>::<lambda(pybind11::detail::value_and_holder&, const std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&)>::operator()(pybind11::detail::value_and_holder &, const std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > &) const (this=0x5555561a0ad8, v_h=..., args#0="xxxx") at /ort_genai_src/build/cuda/_deps/pybind11_project-src/include/pybind11/detail/init.h:297
#6  0x00007fff8cbf3a97 in pybind11::detail::argument_loader<pybind11::detail::value_and_holder&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&>::call_impl<void, pybind11::detail::initimpl::factory<Func, pybind11::detail::void_type (*)(), Return(Args ...)>::execute(Class&, const Extra& ...) && [with Class = pybind11::class_<Generators::Model, std::shared_ptr<Generators::Model> >; Extra = {}; Func = Generators::pybind11_init_onnxruntime_genai(pybind11::module_&)::<lambda(const string&)>; Return = std::shared_ptr<Generators::Model>; Args = {const std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&}]::<lambda(pybind11::detail::value_and_holder&, const std::__cxx11::basic_string<char>&)>&, 0, 1, pybind11::detail::void_type>(pybind11::detail::initimpl::factory<Generators::pybind11_init_onnxruntime_genai(pybind11::module_&)::<lambda(const string&)>, pybind11::detail::void_type (*)(), std::shared_ptr<Generators::Model>(const std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&), pybind11::detail::void_type()>::<lambda(pybind11::detail::value_and_holder&, const std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&)> &, std::index_sequence, pybind11::detail::void_type &&) (this=0x7fffffffd010, f=...)
    at /ort_genai_src/build/cuda/_deps/pybind11_project-src/include/pybind11/cast.h:1439
#7  0x00007fff8cbf3985 in pybind11::detail::argument_loader<pybind11::detail::value_and_holder&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&>::call<void, pybind11::detail::void_type, pybind11::detail::initimpl::factory<Func, pybind11::detail::void_type (*)(), Return(Args ...)>::execute(Class&, const Extra& ...) && [with Class = pybind11::class_<Generators::Model, std::shared_ptr<Generators::Model> >; Extra = {}; Func = Generators::pybind11_init_onnxruntime_genai(pybind11::module_&)::<lambda(const string&)>; Return = std::shared_ptr<Generators::Model>; Args = {const std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&}]::<lambda(pybind11::detail::value_and_holder&, const std::__cxx11::basic_string<char>&)>&>(pybind11::detail::initimpl::factory<Generators::pybind11_init_onnxruntime_genai(pybind11::module_&)::<lambda(const string&)>, pybind11::detail::void_type (*)(), std::shared_ptr<Generators::Model>(const std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&), pybind11::detail::void_type()>::<lambda(pybind11::detail::value_and_holder&, const std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&)> &) (this=0x7fffffffd010, f=...) at /ort_genai_src/build/cuda/_deps/pybind11_project-src/include/pybind11/cast.h:1413
#8  0x00007fff8cbf3463 in pybind11::cpp_function::<lambda(pybind11::detail::function_call&)>::operator()(pybind11::detail::function_call &) const (
    this=0x0, call=...) at /ort_genai_src/build/cuda/_deps/pybind11_project-src/include/pybind11/pybind11.h:249
#9  0x00007fff8cbf34e6 in pybind11::cpp_function::<lambda(pybind11::detail::function_call&)>::_FUN(pybind11::detail::function_call &) ()
    at /ort_genai_src/build/cuda/_deps/pybind11_project-src/include/pybind11/pybind11.h:224
#10 0x00007fff8cc049bb in pybind11::cpp_function::dispatcher (self=0x7fff90095ed0, args_in=0x7ffff77cf580, kwargs_in=0x0)
    at /ort_genai_src/build/cuda/_deps/pybind11_project-src/include/pybind11/pybind11.h:929
#11 0x00007ffff7d849d2 in cfunction_call (func=0x7fff9009c220, args=<optimized out>, kwargs=<optimized out>) at Objects/methodobject.c:543
#12 0x00007ffff7d68390 in _PyObject_MakeTpCall (tstate=0x555555559b80, callable=0x7fff9009c220, args=0x7fffffffd7e0, nargs=2, keywords=0x0)
    at Objects/call.c:191

Though onnxruntime_genai.cpython-39-x86_64-linux-gnu.so contains a copy of the function:

$ nm -C  /home/chasun/.local/lib/python3.9/site-packages/onnxruntime_genai/onnxruntime_genai.cpython-39-x86_64-linux-gnu.so |grep std::filesystem::__cxx11::path::_M_split_cmpts
00000000002d2140 T std::filesystem::__cxx11::path::_M_split_cmpts()
00000000001ec838 t std::filesystem::__cxx11::path::_M_split_cmpts() [clone .cold.121]

At runtime, the callstack shows that the implementation from /lib/libstdc++.so.6 was actually used. Given that the layout of the std::filesystem::path object differs (and is incompatible) between GCC 8 and higher GCC versions, we should never see one call into the other.

A few solutions for us to explore:

  1. Try making the pybind symbols private using a version_script.lds (the way ort does); a sketch follows this list.
  2. If that does not help, we can try avoiding the use of std::filesystem.
  3. We can also try using a newer GCC compiler.
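
As a rough illustration of option 1, a minimal GNU ld version script could export only the Python entry point and hide everything else (assuming the module's init symbol is PyInit_onnxruntime_genai; the real script may need more entries):

VERS_1.0 {
  global:
    PyInit_onnxruntime_genai;
  local:
    *;
};

Passed to the linker via -Wl,--version-script=version_script.lds, this hides the libstdc++ template instantiations so the dynamic linker cannot cross-resolve them, which matches the symbol hygiene the ort binary exhibits below.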

I’ll work on finding the right path forward.

I got a genai debug package, but the callstack is weird. However, I found an issue. When I ran

nm -C -g --defined-only /home/chasun/.local/lib/python3.9/site-packages/onnxruntime/capi/onnxruntime_pybind11_state.cpython-39-x86_64-linux-gnu.so

It prints only two defined symbols:

00000000000e8595 T PyInit_onnxruntime_pybind11_state
0000000000000000 A VERS_1.0

However, with the same command the GenAI package's binary prints a lot of libstdc++ symbols, which means the symbols were not hidden per the suggestions from https://gcc.gnu.org/wiki/Visibility
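
For reference, the wiki's advice boils down to building with hidden default visibility. A hedged sketch of the relevant GCC flags (where they go in the actual build files is an assumption):

# Illustrative only: compile with hidden default visibility so that only
# symbols explicitly marked __attribute__((visibility("default"))), such
# as the PyInit_* entry point, are exported from the shared object.
CXXFLAGS="$CXXFLAGS -fvisibility=hidden -fvisibility-inlines-hidden"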

There is native code from the ort python package in the callstack. I am 100% sure of that.

I am rebuilding the packages with symbols.

OK, I was able to reproduce the problem on my end. I was also able to narrow the problem down to std::filesystem.

Summary of the problem:

  • onnxruntime-genai-cuda is built with GCC 8.5.
  • onnxruntime-genai (cpu variant) is built with GCC 12.
  • onnxruntime-gpu, and probably torch/transformers, are built with GCC > 8.

The problem arises because, when onnxruntime/torch/transformers is imported first, the std::filesystem symbols are loaded from libstdc++.so.6. These symbols are incompatible with the std::filesystem symbols that a GCC 8 build expects. There is more useful background here: https://bugs.launchpad.net/ubuntu/+source/gcc-8/+bug/1824721/comments/6

I think that to resolve this problem, we might need to publish a patch release that is built with a higher GCC version.

cc @jchen351

Could you share more details on this error and, if possible, a stacktrace? That would help me reproduce this.

I tried running your script and was able to successfully execute it. Here is the output:

Loading model...
Model loaded in 15.46 seconds
Creating generator ...
Generator created
.

In the end, the friends realized that their journey had not only deepened their understanding of the world but also strengthened their bond. They had learned the importance of embracing diversity, respecting different perspectives, and finding common ground.

As they bid farewell to the Enchanted Forest, they carried with them the memories of their adventure and the wisdom gained from their encounters. They knew that their shared experiences would forever shape their lives and inspire them to continue exploring the wonders of the world.

And so, the three friends returned to their small town, forever changed by their journey. They became advocates for unity, spreading the message of acceptance and understanding wherever they went. Their story became a testament to the power of friendship, curiosity, and the pursuit of knowledge.

As they looked back on their adventure, they realized that the Enchanted Forest had not only taught them about the world but also about themselves. They had discovered their own strengths, overcome their fears, and

Prompt tokens: 3, New tokens: 197, Time to first: 0.25s, New tokens per second: 14.55 tps

I am using onnxruntime-genai 0.1.0 from PyPI for this test. Could you share the following information:

  • Platform: windows, linux…
  • ort version
  • ort-genai version
  • python version
  • torch version
  • transformers version
  • stacktrace, if possible
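
A small helper like the following (hypothetical; note that onnxruntime_genai is imported first so the helper does not itself hit the import-order SIGSEGV under discussion) can collect most of these details:

import platform
import sys

print("Platform:", platform.platform())
print("Python:", sys.version)

# onnxruntime_genai goes first deliberately; importing onnxruntime before
# it is the crash being reported in this issue.
for name in ("onnxruntime_genai", "onnxruntime", "torch", "transformers"):
    try:
        module = __import__(name)
        print(name, getattr(module, "__version__", "unknown version"))
    except ImportError:
        print(name, "not installed")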