transformers.js: [Question] failed to call OrtRun(). error code = 1 when I try to load Xenova/pygmalion-350m

I'm getting the error failed to call OrtRun(). error code = 1 when I try to load Xenova/pygmalion-350m. The error is as follows:

wasm-core-impl.ts:392 Uncaught Error: failed to call OrtRun(). error code = 1.
    at e.run (wasm-core-impl.ts:392:19)
    at e.run (proxy-wrapper.ts:215:17)
    at e.OnnxruntimeWebAssemblySessionHandler.run (session-handler.ts:100:15)
    at InferenceSession.run (inference-session-impl.ts:108:40)
    at sessionRun (models.js:191:36)
    at async Function.decoderForward [as _forward] (models.js:478:26)
    at async Function.forward (models.js:743:16)
    at async Function.decoderRunBeam [as _runBeam] (models.js:564:18)
    at async Function.runBeam (models.js:1284:16)
    at async Function.generate (models.js:1009:30)

And my code for running it is this:


import { pipeline } from '@xenova/transformers';

let text = 'Once upon a time, there was';

// Load the text-generation pipeline for Xenova/pygmalion-350m
let generator = await pipeline('text-generation', 'Xenova/pygmalion-350m');

// Generate with beam search (2 beams) and return both sequences
let output = await generator(text, {
  temperature: 2,
  max_new_tokens: 10,
  repetition_penalty: 1.5,
  no_repeat_ngram_size: 2,
  num_beams: 2,
  num_return_sequences: 2,
});

console.log(output);

I see that OrtRun is something returned by ONNX Runtime on a failure, but have you had success running the Pygmalion-350m model?

About this issue

  • State: open
  • Created 9 months ago
  • Comments: 16 (8 by maintainers)

Most upvoted comments

Sure. Let me give it another try, and I can open a separate issue in case it does not work. Thanks for the prompt response 😃

@uahmad235 Feel free to open up a new issue with code that allows me to reproduce this. It might just be a configuration issue. 😇

Same on Linux / Chrome 117, so it's not browser related. We had these errors before, and sometimes it's "easy" to fix them:

https://github.com/xenova/transformers.js/issues/140#issuecomment-1629879842

Sometimes they crash the entire browser 🙈

error code = 1 means ORT_FAIL; it couldn't be more descriptive than that 😅 Sometimes I question the WASM overhead: it makes everything so much more difficult to debug, and I had a project where WASM was actually slower than V8 jitted/optimized JS (which would still be easy to debug).
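
To get past the generic ORT_FAIL, it can help to turn up ONNX Runtime Web's logging before creating the pipeline, so the underlying kernel error (like the one below) shows up in the console. A minimal sketch, assuming transformers.js exposes the onnxruntime-web environment as env.backends.onnx (as it does in v2), where logLevel and debug are onnxruntime-web settings:

import { env, pipeline } from '@xenova/transformers';

// Assumption: env.backends.onnx is the onnxruntime-web environment,
// so these flags control the WASM backend's logging verbosity.
env.backends.onnx.logLevel = 'verbose';
env.backends.onnx.debug = true;

// Re-run the failing call; the console should now show the underlying
// ExecuteKernel error instead of just "error code = 1".
const generator = await pipeline('text-generation', 'Xenova/pygmalion-350m');
const output = await generator('Once upon a time, there was', { max_new_tokens: 10 });
console.log(output);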

Here is a full error:

ort-wasm.js:25 2023-09-30 15:13:51.698699 [E:onnxruntime:, sequential_executor.cc:494 ExecuteKernel] Non-zero status code returned while running DynamicQuantizeMatMul node. Name:'/decoder/layers.0/fc1/Gemm_MatMul_quant' Status Message: matmul_helper.h:61 Compute MatMul dimension mismatch
ort-wasm.js:25 2023-09-30 15:13:51.699100 [E:onnxruntime:, sequential_executor.cc:494 ExecuteKernel] Non-zero status code returned while running If node. Name:'optimum::if' Status Message: Non-zero status code returned while running DynamicQuantizeMatMul node. Name:'/decoder/layers.0/fc1/Gemm_MatMul_quant' Status Message: matmul_helper.h:61 Compute MatMul dimension mismatch
ort-wasm.js:25 Non-zero status code returned while running If node. Name:'optimum::if' Status Message: Non-zero status code returned while running DynamicQuantizeMatMul node. Name:'/decoder/layers.0/fc1/Gemm_MatMul_quant' Status Message: matmul_helper.h:61 Compute MatMul dimension mismatch
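
Since the failing node is a DynamicQuantizeMatMul in the quantized decoder, one thing worth trying (not a confirmed fix, just a sketch assuming the quantized weights are the culprit) is loading the unquantized weights via the quantized: false option, which avoids that kernel at the cost of a larger download:

import { pipeline } from '@xenova/transformers';

// Load the fp32 ONNX weights instead of the default quantized ones.
// This skips the DynamicQuantizeMatMul kernel that is throwing the
// dimension-mismatch error above.
const generator = await pipeline('text-generation', 'Xenova/pygmalion-350m', {
  quantized: false,
});

const output = await generator('Once upon a time, there was', {
  max_new_tokens: 10,
});
console.log(output);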
