onnxruntime-genai: terminated by signal SIGSEGV (Address boundary error) if onnxruntime is imported before onnxruntime-genai
Cannot create og.Model if the user imports onnxruntime before onnxruntime-genai. The process is terminated by signal SIGSEGV (Address boundary error).
import onnxruntime as ort  # importing onnxruntime first is what triggers the crash


def genai_run(prompt, model_path, max_length=200):
    import time

    import onnxruntime_genai as og

    print("Loading model...")
    app_started_timestamp = time.time()
    model = og.Model(model_path)
    model_loaded_timestamp = time.time()
    print("Model loaded in {:.2f} seconds".format(model_loaded_timestamp - app_started_timestamp))

    tokenizer = og.Tokenizer(model)
    tokenizer_stream = tokenizer.create_stream()
    input_tokens = tokenizer.encode(prompt)

    started_timestamp = time.time()
    print("Creating generator ...")
    params = og.GeneratorParams(model)
    params.set_search_options(
        {
            "do_sample": False,
            "max_length": max_length,
            "min_length": 0,
            "top_p": 0.9,
            "top_k": 40,
            "temperature": 1.0,
            "repetition_penalty": 1.0,
        }
    )
    params.input_ids = input_tokens
    generator = og.Generator(model, params)
    print("Generator created")

    first = True
    new_tokens = []
    while not generator.is_done():
        generator.compute_logits()
        generator.generate_next_token()
        if first:
            first_token_timestamp = time.time()
            first = False
        new_token = generator.get_next_tokens()[0]
        print(tokenizer_stream.decode(new_token), end="", flush=True)
        new_tokens.append(new_token)

    run_time = time.time() - started_timestamp
    print(
        f"Prompt tokens: {len(input_tokens)}, New tokens: {len(new_tokens)},"
        f" Time to first: {(first_token_timestamp - started_timestamp):.2f}s,"
        f" New tokens per second: {len(new_tokens)/run_time:.2f} tps"
    )


model_path = "xxxx"
genai_run("hello world", model_path)
How to reproduce:
- python -m onnxruntime_genai.models.builder -m microsoft/phi-2 -e cpu -p int4 -o ./models/phi2
- run the script above
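Since the crash only happens when onnxruntime is imported before onnxruntime-genai, one defensive option is to check the import order up front and fail with a clear error instead of a SIGSEGV. Below is a minimal sketch; the helper name is hypothetical and it relies on sys.modules preserving insertion order (CPython 3.7+), which only approximates true import order.

```python
import sys


def genai_imported_before_ort(modules=None):
    """Hypothetical guard: return True if onnxruntime_genai appears before
    onnxruntime in the recorded import order (defaults to sys.modules).
    Insertion order of sys.modules approximates import order on CPython 3.7+."""
    names = list(sys.modules if modules is None else modules)
    try:
        return names.index("onnxruntime_genai") < names.index("onnxruntime")
    except ValueError:
        # One of the two is not imported yet, so the bad ordering
        # cannot have happened.
        return True
```

A caller could assert this before constructing og.Model, turning the segfault into an actionable Python exception.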
About this issue
- State: closed
- Created 3 months ago
- Comments: 25 (20 by maintainers)
I think I can confirm the root cause is https://stackoverflow.com/questions/63902528/program-crashes-when-filesystempath-is-destroyed , because after I set a breakpoint at “std::filesystem::__cxx11::path::_M_split_cmpts()”, I got the following stacktrace:
Though onnxruntime_genai.cpython-39-x86_64-linux-gnu.so contains its own copy of the function:
At runtime, the call stack shows that the implementation from /lib/libstdc++.so.6 was actually used. Given that the layout of the std::filesystem::path object differs (and is incompatible) between GCC 8 and higher GCC versions, we should not see one call into the other.
A few solutions for us to explore:
I’ll work on finding the right path forward.
I got a genai debug package, but the callstack is weird. However, I found an issue. When I ran
It only prints two things:
However, with the same command the GenAI package's binary prints a lot of stdc++ symbols, which means the symbols were not hidden per the suggestion from https://gcc.gnu.org/wiki/Visibility
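The visibility check described above can be partially automated by parsing the demangled dynamic symbol table (e.g. the text that `nm -D --demangle <lib.so>` prints) and flagging any exported std::filesystem symbols. This is a hypothetical diagnostic helper, not part of onnxruntime-genai's tooling, and it assumes nm's usual three-column output.

```python
def exported_filesystem_symbols(nm_output):
    """Return demangled dynamic symbols mentioning std::filesystem from
    `nm -D --demangle <lib.so>` style text. If a shared object properly
    hides its bundled libstdc++ copy, this should come back empty.
    (Hypothetical helper sketch.)"""
    found = []
    for line in nm_output.splitlines():
        parts = line.split(None, 2)  # address, symbol type, demangled name
        if len(parts) == 3 and "std::filesystem" in parts[2]:
            found.append(parts[2])
    return found
```

A non-empty result for the GenAI .so would confirm that its std::filesystem symbols are visible to the dynamic linker and can collide with the system libstdc++.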
Native code from the ort python package appears in the callstack. I am 100% sure of that.
I am rebuilding the packages with symbols.
Ok, I was able to reproduce the problem on my end. I was also able to narrow down the problem to be related to std::filesystem.
Summary of the problem:
The problem is introduced because when onnxruntime/torch/transformers is imported first, the std::filesystem symbols are loaded from the system libstdc++.so.6. These symbols are incompatible with the std::filesystem symbols expected by code built with GCC 8. More detailed information is available here: https://bugs.launchpad.net/ubuntu/+source/gcc-8/+bug/1824721/comments/6
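One way to confirm which libstdc++ copy the process actually mapped is to scan /proc/self/maps on Linux after the imports of interest. The sketch below is a hypothetical helper, and the parsing assumes the standard Linux maps format (the mapped file path in the last column).

```python
import re


def mapped_libstdcxx(maps_text):
    """Return the distinct file paths of libstdc++ mappings found in the
    contents of /proc/<pid>/maps (Linux format). Helps confirm whether the
    system /lib/libstdc++.so.6 was loaded rather than a bundled copy.
    (Hypothetical helper sketch.)"""
    paths = set()
    for line in maps_text.splitlines():
        m = re.search(r"(/\S*libstdc\+\+\.so[.\d]*)\s*$", line)
        if m:
            paths.add(m.group(1))
    return sorted(paths)
```

On Linux one would pass `open("/proc/self/maps").read()` right after importing onnxruntime and onnxruntime_genai to see which copies are live.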
I think to resolve this problem, we might need to publish a patch release built with a higher GCC version.
cc @jchen351
Could you share more details on this error? And if possible, a stacktrace. That would help me reproduce this.
I tried running your script and was able to successfully execute it. Here is the output:
I am using onnxruntime-genai 0.1.0 from PyPI for this test. Could you share the following information:
- Platform: windows, linux…
- ort version
- ort-genai version
- python version
- torch version
- transformers version
- stacktrace if possible