swift-coreml-transformers: distilbert-onnx-coreml.py "works" for BERT, but I get "Error computing NN outputs." when predicting

Hi,

I used distilbert-onnx-coreml.py to convert a custom PyTorch BertForSequenceClassification model to CoreML. The conversion finishes without error.

However I can’t use the resulting CoreML model for prediction. The following code fails:

import coremltools
import numpy as np

model = coremltools.models.MLModel("./path/to/model/model.mlmodel")

input_ids = np.zeros((1, 64), dtype=np.int32)  # spec expects INT32
d = {"input_ids": input_ids}

predictions = model.predict(d, True)

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-29-1c38a7b07949> in <module>
----> 1 predictions = model.predict(d, True)

~/anaconda3/lib/python3.7/site-packages/coremltools/models/model.py in predict(self, data, useCPUOnly, **kwargs)
    328 
    329         if self.__proxy__:
--> 330             return self.__proxy__.predict(data, useCPUOnly)
    331         else:
    332             if _macos_version() < (10, 13):

RuntimeError: {
    NSLocalizedDescription = "Error computing NN outputs.";
}

Note that my input dimension is 64:

spec.description.input

[name: "input_ids"
type {
  multiArrayType {
    shape: 1
    shape: 64
    dataType: INT32
  }
}
]

When I try to substitute my model into the DistilBERT demo app, I get the following error in Xcode when predicting:

CoreMLBert.bert_transactions_64Input
2020-01-07 10:12:58.271435+1300 CoreMLBert[1044:35882] [espresso] [Espresso::handle_ex_plan] exception=Espresso exception: "Invalid state": Cannot squeeze a dimension whose value is not 1: shape[1]=64 status=-5
2020-01-07 10:12:58.272716+1300 CoreMLBert[1044:35882] [coreml] Error computing NN outputs -5

The only hint that something might have gone wrong in the onnx -> coreml conversion is a note about a deleted node; however, I’m struggling to find out whether this is just a red herring:

[Core ML Pass] 1 disconnected constants nodes deleted
Translation to CoreML spec completed. Now compiling the CoreML model.
Model Compilation done.

Are there any particular BERT layers that need custom conversion to Core ML? Any suggestions for further debugging?

Thanks.

About this issue

  • State: open
  • Created 4 years ago
  • Comments: 24 (7 by maintainers)

Most upvoted comments

Run this script to fix the issue:

import coremltools
import numpy as np

mlmodel = coremltools.models.MLModel("bert-test-256_FP16.mlmodel")
spec = mlmodel._spec

spec.neuralNetwork.layers[9].activation.linear.alpha = 1   # whereNonZero
spec.neuralNetwork.layers[11].activation.linear.alpha = 1  # squeeze

new_model = coremltools.models.MLModel(spec)
new_model.save("w00t.mlmodel")

new_model.predict({"input_ids": np.zeros((1, 256), dtype=np.int32)})

The issue was the squeeze and whereNonZero layers at the beginning of the model. This script replaces them with harmless linear activation layers.
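The hard-coded indices 9 and 11 are specific to that particular converted model; for a different conversion you would first have to locate the whereNonZero and squeeze layers yourself. A minimal sketch of that lookup, written as a pure function over per-layer type strings (in practice you would collect those with `layer.WhichOneof("layer")` for each entry of `spec.neuralNetwork.layers`; the layer sequence below is made up for illustration, not taken from a real conversion):

```python
def layers_to_patch(layer_types):
    """Return indices of layers to neutralize with a linear activation.

    layer_types: list of Core ML layer type strings, e.g. as reported by
    layer.WhichOneof("layer") for each entry in spec.neuralNetwork.layers.
    """
    return [i for i, t in enumerate(layer_types)
            if t in ("whereNonZero", "squeeze")]

# Illustrative layer sequence (hypothetical, not from a real model):
types = ["embedding", "loadConstant", "add", "innerProduct", "gelu",
         "innerProduct", "add", "layerNormalization", "loadConstant",
         "whereNonZero", "loadConstant", "squeeze"]
print(layers_to_patch(types))  # → [9, 11]
```

The returned indices would then take the place of the literal 9 and 11 in the script above.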

Hello, I decided I’m just going to post an example code snippet that people can modify to their needs, because I don’t think a single fix would be uniform across every use case. Using this after converting the model via pytorch -> onnx -> coreml in the distilbert-onnx-coreml.py script, I was able to get it to run on device and get results similar to the PyTorch model. If it doesn’t work for you, post in this issue and I can try to help.

# The motivation behind this is to iterate through a fine-tuned DistilBERT model
# and fix the squeeze layers, which for some reason try to squeeze along the
# incorrect dimension. This throws an error similar to
#   Espresso exception: "Invalid state": Cannot squeeze a dimension whose value is not 1
# The coreml pytorch conversion does not work out of the box either, in my
# experience, so this is a way to get a fine-tuned DistilBERT QA model to run
# on an iOS device. Note that you should run the torch.onnx.export command
# with the opset_version flag set to less than 11. I tested it and it works
# with opset=9 and opset=10.

import coremltools

mlmodel = coremltools.models.MLModel($YOUR_FINETUNED_MODEL_HERE)

spec = mlmodel._spec

layers_to_change = []

# Iterate through the network layers and identify the squeeze layers
for i, layer in enumerate(spec.neuralNetwork.layers):
    if "Squeeze" in layer.name:
        layers_to_change.append(i)

# Change the axes to squeeze along the 0 axis, which should be
# one-dimensional in the converted model
for x in layers_to_change:
    del spec.neuralNetwork.layers[x].squeeze.axes[:]
    spec.neuralNetwork.layers[x].squeeze.axes.extend([0])

new_model = coremltools.models.MLModel(spec)

new_model.save($YOUR_MODEL_PATCHED)
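Since the loops above mutate protobuf objects that only exist after a conversion, here is the same select-and-patch logic sketched over plain dicts, so its behavior can be sanity-checked without a converted model on hand (the layer names and the dict shape are illustrative stand-ins, not the real coremltools objects):

```python
# Stand-ins for spec.neuralNetwork.layers entries (hypothetical names):
layers = [
    {"name": "Gather_0",  "squeeze_axes": None},
    {"name": "Squeeze_1", "squeeze_axes": [1]},  # wrong axis from conversion
    {"name": "MatMul_2",  "squeeze_axes": None},
    {"name": "Squeeze_7", "squeeze_axes": [1]},  # wrong axis from conversion
]

# Same logic as the script: find layers named like Squeeze ops and
# repoint their axes at dimension 0.
for layer in layers:
    if "Squeeze" in layer["name"]:
        layer["squeeze_axes"] = [0]

print([l["squeeze_axes"] for l in layers])  # → [None, [0], None, [0]]
```

Only the Squeeze-named entries are touched; everything else passes through unchanged, which mirrors what the real script does to the spec.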