onnxruntime: onnx.load() | DecodeError: Error parsing message

Bug issue.

Goal: re-develop this BERT Notebook to use textattack/albert-base-v2-MRPC.

Kernel: conda_pytorch_p36. Deleted all output files and did Restart & Run All.

I can successfully export and save an ONNX model from a HuggingFace Transformers model held in memory at run time. The error occurs when calling onnx.load() to read the saved model back from storage into memory.

Are my ONNX files corrupted?

albert.onnx and albert.opt.onnx here.


Section 2.1 - export in-memory PyTorch model as ONNX model:

import logging
import torch
import onnxruntime

logger = logging.getLogger(__name__)

def export_onnx_model(args, model, tokenizer, onnx_model_path):
    with torch.no_grad():
        inputs = {'input_ids':      torch.ones(1, 128, dtype=torch.int64),
                  'attention_mask': torch.ones(1, 128, dtype=torch.int64),
                  'token_type_ids': torch.ones(1, 128, dtype=torch.int64)}
        outputs = model(**inputs)

        symbolic_names = {0: 'batch_size', 1: 'max_seq_len'}
        torch.onnx.export(model,                                      # model being run
                          (inputs['input_ids'],                       # model inputs (a tuple for multiple inputs)
                           inputs['attention_mask'],
                           inputs['token_type_ids']),
                          onnx_model_path,                            # where to save the model (file or file-like object)
                          opset_version=11,                           # the ONNX opset version to export to
                          do_constant_folding=True,                   # whether to execute constant folding for optimization
                          input_names=['input_ids',                   # the model's input names
                                       'input_mask',
                                       'segment_ids'],
                          output_names=['output'],                    # the model's output names
                          dynamic_axes={'input_ids': symbolic_names,  # variable-length axes
                                        'input_mask': symbolic_names,
                                        'segment_ids': symbolic_names})
        logger.info("ONNX Model exported to {0}".format(onnx_model_path))

export_onnx_model(configs, model, tokenizer, "albert.onnx")
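
To confirm the export itself produced a readable file, a minimal sanity check right after the export (just a sketch, using the same filename as above):

import onnx

onnx_model = onnx.load("albert.onnx")    # parse the protobuf back from disk
onnx.checker.check_model(onnx_model)     # raises if the graph is malformed
print("albert.onnx loads and passes the checker")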

Then optimisation:

pip install torch_optimizer

import torch_optimizer as optim

optimizer = optim.DiffGrad(model.parameters(), lr=0.001)  # DiffGrad optimizer over the model's parameters
optimizer.step()

torch.save(optimizer.state_dict(), 'albert.opt.onnx')  # saves the optimizer state_dict (a PyTorch pickle) under an .onnx filename
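
Since my question is whether the files are corrupted, a small sketch that tries the same onnx.load() call from Section 2.2 on each saved file, to see which one fails to parse:

import onnx

for path in ("albert.onnx", "albert.opt.onnx"):
    try:
        onnx.load(path)                              # same call that raises DecodeError below
        print(path, "parses as an ONNX ModelProto")
    except Exception as e:
        print(path, "does not parse:", e)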

Section 2.2 - Quantize ONNX model:

import os
import onnx
from onnxruntime.quantization import quantize_dynamic, QuantType

def quantize_onnx_model(onnx_model_path, quantized_model_path):
    onnx_opt_model = onnx.load(onnx_model_path)  # DecodeError raised here; the loaded model is otherwise unused
    quantize_dynamic(onnx_model_path,
                     quantized_model_path,
                     weight_type=QuantType.QInt8)

    logger.info(f"quantized model saved to: {quantized_model_path}")

quantize_onnx_model('albert.opt.onnx', 'albert.opt.quant.onnx')

print('ONNX full precision model size (MB):', os.path.getsize("albert.opt.onnx") / (1024 * 1024))
print('ONNX quantized model size (MB):', os.path.getsize("albert.opt.quant.onnx") / (1024 * 1024))
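
Once quantization goes through, a rough sketch for exercising the quantized model with ONNX Runtime (the input names and the 1x128 dummy shape mirror the export in Section 2.1; treating the first output as the logits is an assumption):

import numpy as np
import onnxruntime as ort

session = ort.InferenceSession("albert.opt.quant.onnx")
dummy = np.ones((1, 128), dtype=np.int64)
outputs = session.run(None, {"input_ids": dummy,
                             "input_mask": dummy,
                             "segment_ids": dummy})
print("output shape:", outputs[0].shape)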

Traceback:

---------------------------------------------------------------------------
DecodeError                               Traceback (most recent call last)
<ipython-input-16-2d2d32b0a667> in <module>
     10     logger.info(f"quantized model saved to:{quantized_model_path}")
     11 
---> 12 quantize_onnx_model('albert.opt.onnx', 'albert.opt.quant.onnx')
     13 
     14 print('ONNX full precision model size (MB):', os.path.getsize("albert.opt.onnx")/(1024*1024))

<ipython-input-16-2d2d32b0a667> in quantize_onnx_model(onnx_model_path, quantized_model_path)
      3 
      4 def quantize_onnx_model(onnx_model_path, quantized_model_path):
----> 5     onnx_opt_model = onnx.load(onnx_model_path)
      6     quantize_dynamic(onnx_model_path,
      7                      quantized_model_path,

~/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/onnx/__init__.py in load_model(f, format, load_external_data)
    119     '''
    120     s = _load_bytes(f)
--> 121     model = load_model_from_string(s, format=format)
    122 
    123     if load_external_data:

~/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/onnx/__init__.py in load_model_from_string(s, format)
    156     Loaded in-memory ModelProto
    157     '''
--> 158     return _deserialize(s, ModelProto())
    159 
    160 

~/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/onnx/__init__.py in _deserialize(s, proto)
     97                          '\ntype is {}'.format(type(proto)))
     98 
---> 99     decoded = cast(Optional[int], proto.ParseFromString(s))
    100     if decoded is not None and decoded != len(s):
    101         raise google.protobuf.message.DecodeError(

DecodeError: Error parsing message

Output Files:

albert.onnx  # original save
albert.opt.onnx  # optimised version save

Please let me know if there’s anything else I can add to the post.

About this issue

  • Original URL
  • State: closed
  • Created 2 years ago
  • Comments: 16 (3 by maintainers)

Most upvoted comments

@danielbellhv, you can try the following command instead since you want to run MRPC:

python -m onnxruntime.transformers.benchmark -m albert-base-v2 -i 1 -t 100 -b 1 -s 128 -e onnxruntime --model_class AutoModelForSequenceClassification -p int8 -o -v

Try python -m onnxruntime.transformers.benchmark --help for more information about the parameters.

Related onnx export code can be found in https://github.com/microsoft/onnxruntime/blob/4af116649c8f5f6e725ce8b314b7f8e38007f236/onnxruntime/python/tools/transformers/onnx_exporter.py#L347