onnxruntime: [Mobile] React Native app crash with Fatal signal 4 (SIGILL), code 1 (ILL_ILLOPC), fault addr 0x6f6d9217fc in tid 10411 (mqt_native_modu), pid 10224 (ReactNativeDemo)

Describe the issue

I can confirm that this error occurs when I run the line const fetches = await session.run(feeds);. Because this is a native crash, I have no idea how to fix it. Please help! The code works on a Pixel 4a emulator but not on my Samsung Note 10 Lite.

To reproduce

The app crashes after running const fetches = await session.run(feeds);. With a breakpoint set, the crash point turns out to be in OrtSession.java.

OrtSession.java crashing point:

      OnnxValue[] outputValues =
          run(
              OnnxRuntime.ortApiHandle,
              nativeHandle,
              allocator.handle,
              inputNamesArray,
              inputHandles,
              inputNamesArray.length,
              outputNamesArray,
              outputNamesArray.length,
              runOptionsHandle);
      return new Result(outputNamesArray, outputValues);

My code crashing point:

export const predictModelfromUri = async (
  session: ort.InferenceSession,
  imageUri: string
): Promise<number> => {
  const imageFloat32 = await convertImageToFloat32Array(imageUri);
  const feeds: Record<string, ort.Tensor> = {};
  feeds[session.inputNames[0]] = new ort.Tensor(
    "float32",
    imageFloat32!,
    [1, 3, 224, 224]
  );
  const fetches = await session.run(feeds);
  const output: object = fetches[session.outputNames[0]].data;
  return findMaxId(Object.values(output));
};
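Before digging into the native layer, it may be worth ruling out a mismatch between the flat buffer and the declared dims. The following is a minimal sketch only: `elementCount` and `assertMatchesDims` are hypothetical helper names, not part of onnxruntime-react-native, and a passing check does not by itself explain or fix a SIGILL.

```typescript
// Hypothetical helpers for illustration; not part of onnxruntime-react-native.

/** Number of elements implied by a tensor shape, e.g. [1, 3, 224, 224]. */
function elementCount(dims: readonly number[]): number {
  return dims.reduce((acc, d) => acc * d, 1);
}

/** Throws if the flat buffer does not hold exactly as many values as dims implies. */
function assertMatchesDims(
  data: ArrayLike<number>,
  dims: readonly number[]
): void {
  const expected = elementCount(dims);
  if (data.length !== expected) {
    throw new Error(
      `expected ${expected} values for dims [${dims.join(", ")}], got ${data.length}`
    );
  }
}
```

In the function above, this could be called as assertMatchesDims(imageFloat32, [1, 3, 224, 224]) just before the tensor is constructed.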

Urgency

No response

Platform

Android

OS Version

13

ONNX Runtime Installation

Built from Source

Compiler Version (if ‘Built from Source’)

No response

Package Name (if ‘Released Package’)

None

ONNX Runtime Version or Commit ID

onnxruntime-react-native@1.15.1

ONNX Runtime API

Java/Kotlin

Architecture

ARM64

Execution Provider

Default CPU

Execution Provider Library Version

No response

About this issue

  • State: closed
  • Created 10 months ago
  • Comments: 35 (17 by maintainers)

Most upvoted comments

The tests ran as expected. Running onnxruntime_mlas_test_1.16.1 failed and onnxruntime_mlas_test_1.16.1_patch passed.

r7:/data/local/tmp $ ./onnxruntime_mlas_test_1.16.1
WARNING: linker: Warning: unable to normalize "'/data/local/tmp'" (ignoring)
-------------------------------------------------------
----Running normal quick check mode. To enable more complete test,
----  run with '--long' as first argument!
Illegal instruction

Nothing obvious comes to mind. The arm64 assembly hasn’t changed for a while now. What does ILL_ILLOPC mean? Illegal opcode? Is it executing instructions that the device does not support?
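For reference, ILL_ILLOPC is the POSIX si_code value for SIGILL meaning "illegal opcode", so the reading above is correct. A small sketch for pulling the fields out of the Android crash line follows; `parseFatalSignal` and `SIGILL_CODES` are hypothetical names for illustration, and the regular expression is only assumed to match lines shaped like the one in this issue's title.

```typescript
// Hypothetical helper for illustration; the regex assumes the log line
// format quoted in this issue and may not cover other Android versions.

interface FatalSignal {
  signal: number;
  signalName: string;
  code: number;
  codeName: string;
  faultAddr: string;
}

// POSIX-defined si_code names for SIGILL and their meanings.
const SIGILL_CODES: { [name: string]: string | undefined } = {
  ILL_ILLOPC: "illegal opcode",
  ILL_ILLOPN: "illegal operand",
  ILL_ILLADR: "illegal addressing mode",
  ILL_ILLTRP: "illegal trap",
  ILL_PRVOPC: "privileged opcode",
};

/** Parses a "Fatal signal ..." line; returns null if the line does not match. */
function parseFatalSignal(line: string): FatalSignal | null {
  const m =
    /Fatal signal (\d+) \((\w+)\), code (-?\d+) \((\w+)\), fault addr (0x[0-9a-fA-F]+)/.exec(
      line
    );
  if (m === null) return null;
  return {
    signal: Number(m[1]),
    signalName: m[2],
    code: Number(m[3]),
    codeName: m[4],
    faultAddr: m[5],
  };
}
```

Looking up the parsed codeName in SIGILL_CODES gives a human-readable description that can be attached to crash reports.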

@chenfu these devices are pretty old. According to the linked spec pages, the Nokia 8 has a Qualcomm Snapdragon 835, whose CPU is a ‘Kryo 280 (2.45 GHz Cortex-A73 + 1.9 GHz Cortex-A53)’, and both of those Cortex cores are ARMv8-A, not ARMv8.2-A.

The latest version of the Samsung Galaxy 8 also used a Cortex-A73/Cortex-A53, according to its linked specs.

This is still a little muddy when combined with https://github.com/microsoft/onnxruntime/issues/17647#issuecomment-1738542003, for two reasons:

  • The change to the compiler flags mentioned there was included in 1.15, and @juliankotrba says 1.15 works.
    • It is possible that a combination of the flags and some other change to the MLAS ARM64 kernels now generates an instruction that is invalid unless the chip supports ARMv8.2-A.
  • The Pocophone F1 specs suggest it should support ARMv8.2-A, so if the issue were the ORT compiler flags, that device shouldn’t be affected.
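The architecture reasoning above can be written down as a small lookup. This is a sketch under stated assumptions: `CORE_ARCH` and `allCoresSupportV82` are hypothetical names, the table only covers four Arm cores, and the Cortex-A75/Cortex-A55 pairing used for the Pocophone F1 is an assumption that the thread itself does not state.

```typescript
// Hypothetical lookup for illustration; deliberately incomplete.

type ArmArch = "ARMv8-A" | "ARMv8.2-A";

// Architecture level of a few Arm cores, per Arm's published core specs.
const CORE_ARCH: { [core: string]: ArmArch | undefined } = {
  "Cortex-A53": "ARMv8-A",
  "Cortex-A73": "ARMv8-A",
  "Cortex-A55": "ARMv8.2-A",
  "Cortex-A75": "ARMv8.2-A",
};

/**
 * Returns true if every listed core is ARMv8.2-A, false if at least one is
 * only ARMv8-A, and undefined if any core is missing from the table.
 */
function allCoresSupportV82(cores: readonly string[]): boolean | undefined {
  let all = true;
  for (const core of cores) {
    const arch = CORE_ARCH[core];
    if (arch === undefined) return undefined;
    if (arch !== "ARMv8.2-A") all = false;
  }
  return all;
}

// Snapdragon 835 (Nokia 8), cores as quoted in the thread.
const nokia8 = allCoresSupportV82(["Cortex-A73", "Cortex-A53"]);
// Assumed core pairing for the Pocophone F1; not stated in the thread.
const pocophoneF1 = allCoresSupportV82(["Cortex-A75", "Cortex-A55"]);
```

A binary built with ARMv8.2-A-only instructions would be expected to raise SIGILL on any device for which this returns false, which is why the Pocophone F1 report does not fit the flag-only explanation.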

Here are some potential tests we could do using the onnx_test_runner binary to take React Native out of the picture. This can be run on device using adb.

  • test 1: Attempt to run the resnet18 model from this issue

    • expected to fail if the problem is in onnxruntime
  • test 2 if test 1 fails: Attempt to run the ‘Reshape’ model Rachel mentioned above

    • this model just has a Reshape node and would not hit any MLAS assembly code
    • expected to pass if the problem is in onnxruntime MLAS code
  • test 3 if test 2 passes: build onnx_test_runner with flags that target ARMv8-A instead of ARMv8.2-A

    • assuming test 1 fails, it would be expected this fixes the illegal opcode error

We can provide a zip with the necessary onnx_test_runner binaries/models/input data and instructions to run them if someone is able to test this out on a device with the issue.
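The three-step plan above can be read as a small decision procedure. The sketch below is one interpretation of it, not an official triage tool: the type names, the `interpret` function, and the wording of each conclusion are assumptions made for illustration.

```typescript
// Hypothetical encoding of the test plan; conclusions paraphrase the thread.

type Outcome = "pass" | "fail";

interface TestResults {
  resnet18: Outcome; // test 1: resnet18 model via onnx_test_runner
  reshapeOnly?: Outcome; // test 2: Reshape-only model, run if test 1 fails
  resnet18Armv8Build?: Outcome; // test 3: ARMv8-A build, run if test 2 passes
}

/** Maps the outcomes collected so far to the next step or a tentative conclusion. */
function interpret(r: TestResults): string {
  if (r.resnet18 === "pass") {
    return "not reproduced in onnxruntime alone; look at the React Native layer";
  }
  if (r.reshapeOnly === undefined) return "run test 2";
  if (r.reshapeOnly === "fail") {
    return "failure is not specific to the MLAS kernels";
  }
  if (r.resnet18Armv8Build === undefined) return "run test 3";
  return r.resnet18Armv8Build === "pass"
    ? "ARMv8.2-A build flags are the likely cause"
    : "MLAS is implicated, but the flags alone do not explain it";
}
```

For example, interpret({ resnet18: "fail", reshapeOnly: "pass" }) asks for test 3 to be run next.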

Regardless, we should consider changing the MLAS flags to target ARMv8-A rather than ARMv8.2-A for Android builds if we want to support old devices.

@bartproo Just to double check the model executes correctly with XNNPACK and we can say for sure that the ORT MLAS implementation is the issue?

@YUNQIUGUO Yes, my crash happened on a Note 10 Lite running Android 13. The Xiaomi is on Android 12. I tried running your model with the following code and it ran successfully:

  feeds[session.inputNames[0]] = new ort.Tensor(
    "float32",
    new Float32Array(24),
    [2, 3, 4]
  );
  const shape = new ort.Tensor(
    "int64",
    new BigInt64Array([24n]),
    [1]
  );
  feeds["shape"] = shape;
  const fetches = await session.run(feeds);

As far as I have tested, it fails on Android 9 or older and works fine on Android 10 or newer.

Just FYI, we are experiencing the same (?) crash on Android 9 devices, but only since version 1.16.0. Version 1.15.1 is working fine for us.

Fatal signal 4 (SIGILL), code 1 (ILL_ILLOPC), fault addr 0x6f291947fc in tid 13393

In case I find any more information I will post it here.

Just realised that access to the drive link was restricted. I have now allowed access for anyone with the link.

Can you clarify where this is running? You’ve said the architecture is X64 but in the stack trace it’s using /lib/arm64/libonnxruntime.so.