onnxruntime: onnx.load() | DecodeError: Error parsing message
Bug issue.
Goal: re-develop this BERT Notebook to use textattack/albert-base-v2-MRPC.
Kernel: conda_pytorch_p36. Deleted all output files and did Restart & Run All.
I can successfully create and save an ONNX model from a HuggingFace Transformers model held in run-time memory. The error occurs at onnx.load(), when reading the model back from storage into memory.
Are my ONNX files corrupted?
albert.onnx and albert.opt.onnx here.
Section 2.1 - export in-memory PyTorch model as ONNX model:
import logging
import os

import torch
import onnxruntime

logger = logging.getLogger(__name__)

def export_onnx_model(args, model, tokenizer, onnx_model_path):
    with torch.no_grad():
        inputs = {'input_ids':      torch.ones(1, 128, dtype=torch.int64),
                  'attention_mask': torch.ones(1, 128, dtype=torch.int64),
                  'token_type_ids': torch.ones(1, 128, dtype=torch.int64)}
        outputs = model(**inputs)

        symbolic_names = {0: 'batch_size', 1: 'max_seq_len'}
        torch.onnx.export(model,                           # model being run
                          (inputs['input_ids'],            # model inputs (a tuple for multiple inputs)
                           inputs['attention_mask'],
                           inputs['token_type_ids']),
                          onnx_model_path,                 # where to save the model (file or file-like object)
                          opset_version=11,                # the ONNX opset version to export to
                          do_constant_folding=True,        # whether to execute constant folding for optimization
                          input_names=['input_ids',        # the model's input names
                                       'input_mask',
                                       'segment_ids'],
                          output_names=['output'],         # the model's output names
                          dynamic_axes={'input_ids':   symbolic_names,  # variable-length axes
                                        'input_mask':  symbolic_names,
                                        'segment_ids': symbolic_names})
    logger.info("ONNX Model exported to {0}".format(onnx_model_path))

export_onnx_model(configs, model, tokenizer, "albert.onnx")
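Loading the freshly exported file straight back is a useful sanity check. A minimal sketch, assuming albert.onnx is the file written above and using the input names from the export:

import numpy as np
import onnx
import onnxruntime

# Sketch: confirm the exported protobuf parses and the graph is well-formed
model_proto = onnx.load("albert.onnx")
onnx.checker.check_model(model_proto)

# Run one dummy batch through onnxruntime with the export's input names
session = onnxruntime.InferenceSession("albert.onnx")
dummy = np.ones((1, 128), dtype=np.int64)
outputs = session.run(None, {"input_ids": dummy,
                             "input_mask": dummy,
                             "segment_ids": dummy})
print([o.shape for o in outputs])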
Then optimisation:
pip install torch_optimizer
import torch_optimizer as optim
optimizer = optim.DiffGrad(model.parameters(), lr=0.001)
optimizer.step()
torch.save(optimizer.state_dict(), 'albert.opt.onnx')
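A quick way to check what that save actually wrote (a sketch): if torch.load() can read the file back, it is a pickled PyTorch object rather than an ONNX protobuf.

import torch

# Sketch: torch.save() pickles a Python object, so if this load succeeds
# the file holds a PyTorch state_dict, not an ONNX graph
state = torch.load("albert.opt.onnx", map_location="cpu")
print(type(state))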
Section 2.2 - quantize ONNX model:
from onnxruntime.quantization import quantize_dynamic, QuantType
import onnx

def quantize_onnx_model(onnx_model_path, quantized_model_path):
    onnx_opt_model = onnx.load(onnx_model_path)  # DecodeError
    quantize_dynamic(onnx_model_path,
                     quantized_model_path,
                     weight_type=QuantType.QInt8)

    logger.info(f"quantized model saved to:{quantized_model_path}")

quantize_onnx_model('albert.opt.onnx', 'albert.opt.quant.onnx')

print('ONNX full precision model size (MB):', os.path.getsize("albert.opt.onnx")/(1024*1024))
print('ONNX quantized model size (MB):', os.path.getsize("albert.opt.quant.onnx")/(1024*1024))
Traceback:
---------------------------------------------------------------------------
DecodeError Traceback (most recent call last)
<ipython-input-16-2d2d32b0a667> in <module>
10 logger.info(f"quantized model saved to:{quantized_model_path}")
11
---> 12 quantize_onnx_model('albert.opt.onnx', 'albert.opt.quant.onnx')
13
14 print('ONNX full precision model size (MB):', os.path.getsize("albert.opt.onnx")/(1024*1024))
<ipython-input-16-2d2d32b0a667> in quantize_onnx_model(onnx_model_path, quantized_model_path)
3
4 def quantize_onnx_model(onnx_model_path, quantized_model_path):
----> 5 onnx_opt_model = onnx.load(onnx_model_path)
6 quantize_dynamic(onnx_model_path,
7 quantized_model_path,
~/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/onnx/__init__.py in load_model(f, format, load_external_data)
119 '''
120 s = _load_bytes(f)
--> 121 model = load_model_from_string(s, format=format)
122
123 if load_external_data:
~/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/onnx/__init__.py in load_model_from_string(s, format)
156 Loaded in-memory ModelProto
157 '''
--> 158 return _deserialize(s, ModelProto())
159
160
~/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/onnx/__init__.py in _deserialize(s, proto)
97 '\ntype is {}'.format(type(proto)))
98
---> 99 decoded = cast(Optional[int], proto.ParseFromString(s))
100 if decoded is not None and decoded != len(s):
101 raise google.protobuf.message.DecodeError(
DecodeError: Error parsing message
Output Files:
albert.onnx # original save
albert.opt.onnx # optimised version save
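In case it helps narrow things down, quantize_dynamic can also be pointed straight at the original export, which should sidestep the failing load. A sketch, where albert.quant.onnx is an assumed output name:

from onnxruntime.quantization import quantize_dynamic, QuantType

# Sketch: dynamic quantisation straight from the original export,
# bypassing the albert.opt.onnx file that fails to parse
quantize_dynamic("albert.onnx",
                 "albert.quant.onnx",
                 weight_type=QuantType.QInt8)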
Please let me know if there’s anything else I can add to the post.
About this issue
- Original URL
- State: closed
- Created 2 years ago
- Comments: 16 (3 by maintainers)
@danielbellhv, you can try the following command instead since you want to run MRPC: python -m onnxruntime.transformers.benchmark -m albert-base-v2 -i 1 -t 100 -b 1 -s 128 -e onnxruntime --model_class AutoModelForSequenceClassification -p int8 -o -v
Try python -m onnxruntime.transformers.benchmark --help for more information about the parameters. Related ONNX export code can be found in https://github.com/microsoft/onnxruntime/blob/4af116649c8f5f6e725ce8b314b7f8e38007f236/onnxruntime/python/tools/transformers/onnx_exporter.py#L347
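For reference, the optimisation step inside that exporter corresponds roughly to the sketch below; num_heads=12 and hidden_size=768 are assumed values for albert-base-v2, not taken from the linked code:

from onnxruntime.transformers import optimizer

# Sketch: graph-level optimisation of the exported file, roughly what the
# benchmark script does internally; head count / hidden size assumed
opt_model = optimizer.optimize_model("albert.onnx",
                                     model_type="bert",
                                     num_heads=12,
                                     hidden_size=768)
opt_model.save_model_to_file("albert.opt.onnx")

A file produced this way is a real ONNX protobuf, so onnx.load() and the quantisation step above should be able to read it.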