intel-extension-for-pytorch: Invalid output and errors using model = ipex.optimize(model): split master weight unsupported, Conv BatchNorm folding failed, Linear BatchNorm folding failed
Hi, I'm trying to run inference with a pretrained OFA (OFA-huge) model following these instructions:
https://github.com/OFA-Sys/OFA/blob/feature/add_transformers/transformers.md
This runs fine on both CPU and CUDA, but using XPU results in gibberish. I also get several warnings, which go away when `model = ipex.optimize(model)` is commented out; however, even then, with essentially the only change between the CPU/CUDA and XPU versions being the `.to('xpu')` calls, the model still outputs gibberish.
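The only backend-dependent part is the device placement, roughly like this sketch (the full script is below):

```python
# Sketch of the only change between backends: device placement.
# "cpu" and "cuda" both produce the correct caption; "xpu" does not.
device = "xpu"
model = model.to(device)
patch_img = patch_img.to(device)
inputs = inputs.to(device)
```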
Warnings from `model = ipex.optimize(model)`:
```
./OFA-huge
<super: <class 'OFATokenizer'>, <OFATokenizer object>>
/home/mediamatik/.virtualenvs/keplermatik_whisper_api/lib/python3.10/site-packages/torch/functional.py:504: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at /opt/workspace/pytorch/aten/src/ATen/native/TensorShape.cpp:3190.)
  return _VF.meshgrid(tensors, **kwargs)  # type: ignore[attr-defined]
/home/mediamatik/.virtualenvs/keplermatik_whisper_api/lib/python3.10/site-packages/intel_extension_for_pytorch/frontend.py:447: UserWarning: For XPU device, the split master weight is unsupported for now, so temp to disable it
  warnings.warn("For XPU device, the split master weight is unsupported for now, so temp to disable it")
/home/mediamatik/.virtualenvs/keplermatik_whisper_api/lib/python3.10/site-packages/intel_extension_for_pytorch/frontend.py:457: UserWarning: For XPU device to save valuable device memory, temp to do optimization on inplaced model, so make inplace to be true
  warnings.warn(
/home/mediamatik/.virtualenvs/keplermatik_whisper_api/lib/python3.10/site-packages/intel_extension_for_pytorch/frontend.py:464: UserWarning: For XPU, the weight prepack and sample input are disabled. The onednn layout is automatically chosen to use
  warnings.warn(
/home/mediamatik/.virtualenvs/keplermatik_whisper_api/lib/python3.10/site-packages/intel_extension_for_pytorch/frontend.py:486: UserWarning: Conv BatchNorm folding failed during the optimize process.
  warnings.warn("Conv BatchNorm folding failed during the optimize process.")
/home/mediamatik/.virtualenvs/keplermatik_whisper_api/lib/python3.10/site-packages/intel_extension_for_pytorch/frontend.py:491: UserWarning: Linear BatchNorm folding failed during the optimize process.
  warnings.warn("Linear BatchNorm folding failed during the optimize process.")
```
```
[' this is the ch ch chaval all the is is the word for the band that is']
```
^ gibberish output
With CPU/CUDA:
```
[' a black and white photo of a wolf walking through the woods at night.']
```
^ correct output
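As an aside, the two BatchNorm folding warnings above refer to optional passes of `ipex.optimize`, which can be opted out of explicitly to check whether they are related to the bad output. This is only a sketch, assuming the `conv_bn_folding` and `linear_bn_folding` keyword arguments are available in this IPEX release:

```python
# Sketch (assumes these ipex.optimize kwargs exist in this release):
# skip the Conv/Linear BatchNorm folding passes that fail above.
model = ipex.optimize(model, conv_bn_folding=False, linear_bn_folding=False)
```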
I'm running Ubuntu 22.04 with intel-extension-for-pytorch 1.13.10+xpu; the code is below:
```python
import warnings

from PIL import Image
from torchvision import transforms
from transformers import OFATokenizer, OFAModel
import intel_extension_for_pytorch as ipex

chkpt_dir = "./OFA-huge"
path_to_image = "image.jpg"

# Preprocessing used by OFA: convert to RGB, resize, and normalize to [-1, 1].
mean, std = [0.5, 0.5, 0.5], [0.5, 0.5, 0.5]
resolution = 256
patch_resize_transform = transforms.Compose([
    lambda image: image.convert("RGB"),
    transforms.Resize((resolution, resolution), interpolation=Image.BICUBIC),
    transforms.ToTensor(),
    transforms.Normalize(mean=mean, std=std),
])

tokenizer = OFATokenizer.from_pretrained(chkpt_dir)
txt = " what does the image describe?"
inputs = tokenizer([txt], return_tensors="pt").input_ids

img = Image.open(path_to_image)
patch_img = patch_resize_transform(img).unsqueeze(0)

# Load the model and move everything to the XPU device.
model = OFAModel.from_pretrained(chkpt_dir, use_cache=False)
model = model.to("xpu")
patch_img = patch_img.to("xpu")
inputs = inputs.to("xpu")

# Commenting this line out silences the warnings, but the output
# is still gibberish.
model = ipex.optimize(model)

gen = model.generate(inputs, patch_images=patch_img, num_beams=5, no_repeat_ngram_size=3)
print(tokenizer.batch_decode(gen, skip_special_tokens=True))
```
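For completeness, a quick sanity check along these lines confirms the device is actually being picked up (a sketch; `torch.xpu` is the device namespace registered by the XPU build of IPEX):

```python
# Sanity check (sketch): print versions and confirm the XPU device is visible.
import torch
import intel_extension_for_pytorch as ipex

print(torch.__version__)
print(ipex.__version__)              # expected: 1.13.10+xpu here
print(torch.xpu.is_available())      # True if the XPU backend is usable
print(torch.xpu.get_device_name(0))  # e.g. the Arc A770
```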
Image: [the input image.jpg, attached in the original issue]
Thanks!
About this issue
- Original URL
- State: open
- Created a year ago
- Comments: 22 (9 by maintainers)
This issue will be fixed in the next release, coming soon.
It took hours to build, so I uploaded unofficial wheels of xpu-master here: https://github.com/leuc/intel-extension-for-pytorch/releases/tag/v1.13.120%2Bgit5fdf9e6
Update: I have also tried this with an Intel i9-11900K CPU and an A770, with the same result. The first attempt used an AMD Threadripper. The code does not work on either platform. Is there a timeline for this issue? Thanks so much!
Similar issue while trying to run openai-whisper on an A770: the run results in invalid token output, and Whisper then fails to decode the tokens.
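For context, the failing whisper setup presumably looks something like this sketch (the model size and audio file are placeholders, not taken from the comment):

```python
# Hypothetical sketch of the whisper-on-XPU setup described above.
# "base" and "audio.wav" are placeholders.
import whisper
import intel_extension_for_pytorch as ipex  # registers the "xpu" device

model = whisper.load_model("base").to("xpu")
result = model.transcribe("audio.wav")  # reportedly yields tokens Whisper cannot decode
print(result["text"])
```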