InvokeAI: [bug] MacOs: diffusers model, Image 768x768, failed assertion NDArray > 2**32

Is there an existing issue for this?

I have searched the existing issues

OS

macOS

GPU

mps

VRAM

128GB

What happened?

@keturn, when I use a diffusers model (deliberate), I get a failed assertion when the image is 768x768.

NOTE: Before the update to diffusers 0.12.1 and transformers 4.26.0, I was able to generate images of this size and larger.

I also tried to run !convert_model again, but it fails too.

My local repo is updated to c18db4e47b10cf1658612f3eec2d537a789b10ea, and my .venv was updated with python -m pip install -r requirements.txt

Screenshots

(midjourney) invoke> !switch d-deliberate

Current VRAM usage: 0.00G
Offloading midjourney to CPU
Loading diffusers model from /users/ivano/Junk/SD/diffusers/deliberate-v1.1
  | Using more accurate float32 precision
  | Default image dimensions = 512 x 512
Model loaded in 1.07s
Textual inversions available: Style-GlassFinal, Style-Princess
Setting Sampler to k_lms (LMSDiscreteScheduler)
(d-deliberate) invoke> a nice dog in the garden -H 768 -W 768
objc[26679]: Class CaptureDelegate is implemented in both /Users/ivano/Code/Ai/invokeai/.venv/lib/python3.10/site-packages/cv2/cv2.abi3.so (0x13bc76480) and /opt/homebrew/Cellar/opencv/4.7.0_1/lib/libopencv_videoio.4.7.0.dylib (0x369c78880). One of the two will be used. Which one is undefined.
objc[26679]: Class CVWindow is implemented in both /Users/ivano/Code/Ai/invokeai/.venv/lib/python3.10/site-packages/cv2/cv2.abi3.so (0x13bc764d0) and /opt/homebrew/Cellar/opencv/4.7.0_1/lib/libopencv_highgui.4.7.0.dylib (0x31f6e0b10). One of the two will be used. Which one is undefined.
objc[26679]: Class CVView is implemented in both /Users/ivano/Code/Ai/invokeai/.venv/lib/python3.10/site-packages/cv2/cv2.abi3.so (0x13bc764f8) and /opt/homebrew/Cellar/opencv/4.7.0_1/lib/libopencv_highgui.4.7.0.dylib (0x31f6e0b38). One of the two will be used. Which one is undefined.
objc[26679]: Class CVSlider is implemented in both /Users/ivano/Code/Ai/invokeai/.venv/lib/python3.10/site-packages/cv2/cv2.abi3.so (0x13bc76520) and /opt/homebrew/Cellar/opencv/4.7.0_1/lib/libopencv_highgui.4.7.0.dylib (0x31f6e0b60). One of the two will be used. Which one is undefined.
Patchmatch initialized
Generating: 0%| | 0/1 [00:00<?, ?it/s]
/Users/ivano/Code/Ai/invokeai/.venv/lib/python3.10/site-packages/diffusers/schedulers/scheduling_lms_discrete.py:268: UserWarning: The operator 'aten::nonzero' is not currently supported on the MPS backend and will fall back to run on the CPU. This may have performance implications. (Triggered internally at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/mps/MPSFallback.mm:11.)
  step_indices = [(schedule_timesteps == t).nonzero().item() for t in timesteps]
/AppleInternal/Library/BuildRoots/c651a45f-806e-11ed-a221-7ef33c48bc85/Library/Caches/com.apple.xbs/Sources/MetalPerformanceShaders/MPSCore/Types/MPSNDArray.mm:724: failed assertion `[MPSNDArray initWithDevice:descriptor:] Error: total bytes of NDArray > 2**32'
| 0/50 [00:00<?, ?it/s]
zsh: abort python ./scripts/invoke.py
/opt/homebrew/Cellar/python@3.10/3.10.9/Frameworks/Python.framework/Versions/3.10/lib/python3.10/multiprocessing/resource_tracker.py:224: UserWarning: resource_tracker: There appear to be 1 leaked semaphore objects to clean up at shutdown
  warnings.warn('resource_tracker: There appear to be %d '

Additional context

This only happens with the diffusers model; with a ckpt model I can generate images up to 960x960.
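
To make the 2**32 figure in the assertion concrete, here is a rough back-of-the-envelope sketch. It only illustrates the arithmetic; the thread does not establish which tensor actually triggers the assertion. The shape used below (a full, unsliced self-attention score tensor with 8 heads, a batch of 2, float32, and 8x latent downsampling) is purely an assumption for illustration, and the helper names are hypothetical.

```python
# Sketch: compare a tensor's byte size with the 2**32 limit named in the assertion.
# ASSUMPTIONS (not from the thread): 8x latent downsampling, 8 attention heads,
# a batch of 2, float32 scores, and unsliced attention.
from math import prod

MPS_NDARRAY_LIMIT = 2**32  # bytes, taken from the assertion message


def tensor_bytes(shape, bytes_per_element=4):
    """Total bytes of a dense tensor with the given shape (float32 by default)."""
    return prod(shape) * bytes_per_element


def attention_scores_shape(height, width, heads=8, batch=2, downsample=8):
    """Hypothetical shape of a full self-attention score tensor at latent resolution."""
    tokens = (height // downsample) * (width // downsample)
    return (batch * heads, tokens, tokens)


for side in (512, 768):
    size = tensor_bytes(attention_scores_shape(side, side))
    print(f"{side}x{side}: {size:,} bytes, exceeds limit: {size > MPS_NDARRAY_LIMIT}")
```

Under these assumptions the 512x512 case stays below 2**32 bytes while 768x768 exceeds it. The limits reported elsewhere in this thread (960x960 with ckpt models, 1088x1088 after upgrading) suggest the real allocation pattern differs, so treat this as an illustration of scale rather than a diagnosis.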

Contact Details

No response

About this issue

State: closed
Created a year ago
Comments: 48 (10 by maintainers)

Commits related to this issue

MPS fix for large diffusers images - Downgrades diffusers to 0.11 and transformers to 4.25 in order to work around "failed assertion NDArray > 2**32" error. - Fixes #2444 — committed to invoke-ai/InvokeAI by lstein a year ago

Most upvoted comments

Was this closed/resolved because it’s fixed in 3.0.0?

pivot69 on Jul 2, 2023

@pcuenca thank you very much for your help with the snippet, I will give it a try this evening when I come back from work

i3oc9i on Mar 7, 2023

I had incompatibility issues between torch and torchvision when using the command I mention above. This worked better:

pip3 install numpy --pre torch torchvision torchaudio --force-reinstall --index-url https://download.pytorch.org/whl/nightly/cpu

Ref: FAQ number 3 at https://pytorch.org/get-started/pytorch-2.0/

pivot69 on Apr 10, 2023

fixed!

pip install --pre torch --force-reinstall --index-url https://download.pytorch.org/whl/cpu

generated 1024x1024 @ 30 steps in 106.74s and 1088x1088 @ 30 steps in 264.53s

That’s quite the difference from 4317.85s with PyTorch 1.13.1

pip list | grep -e diffuser  -e transformer -e torch
clip-anytorch               2.5.2
diffusers                   0.14.0
pytorch-lightning           1.7.7
taming-transformers-rom1504 0.0.6
torch                       2.0.0
torch-fidelity              0.3.0
torchdiffeq                 0.2.3
torchmetrics                0.11.4
torchsde                    0.2.5
torchvision                 0.14.1
transformers                4.26.1

pivot69 on Mar 31, 2023
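
For anyone comparing environments, the pip list | grep check used in these comments can also be done from inside the venv with the standard library. This is a minimal sketch; the function name installed_versions is hypothetical, and the package list simply mirrors the ones discussed in this thread.

```python
# Sketch: report installed versions of the packages discussed in this thread.
# Uses only the standard library; installed_versions is a hypothetical helper name.
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Map each package name to its installed version, or None if it is missing."""
    result = {}
    for name in packages:
        try:
            result[name] = version(name)
        except PackageNotFoundError:
            result[name] = None
    return result


for name, ver in installed_versions(
    ["torch", "torchvision", "diffusers", "transformers"]
).items():
    print(f"{name:<14}{ver or 'not installed'}")
```

Run it with the venv's Python so it reports the packages InvokeAI actually uses rather than the system-wide ones.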

Entry 8 in the invoke.sh script results in an endless loop

Python 3.10.9 Press ^D to exit

But I’m assuming the developer console is the same as entering the venv of the invoke-dir, as in the manual install?

export INVOKEAI_ROOT=~/invokeai
cd $INVOKEAI_ROOT
source .venv/bin/activate

Inside the virtual environment I’m able to upgrade pip, list whatever modules are installed, etc. There I’m also able to run the command mentioned by @xuhao1 but this results in a broken install…

Manual install guide mentions the command pip install InvokeAI --use-pep517 for M1 and M2 Macs… This works, but doesn’t upgrade anything.

pivot69 on Mar 31, 2023

I did not notice any significant speed improvement with PyTorch 2.0

i3oc9i on Mar 29, 2023

@keturn @lstein @pcuenca Good news! This issue is solved by upgrading to Ventura 13.3 and PyTorch 2.0

I’m rolling InvokeAI 3.0.0+a0 on commit 09dfde0

pip list | grep -e diffuser  -e transformer -e torch
clip-anytorch           2.5.2
diffusers               0.14.0
pytorch-lightning       1.7.7
torch                   2.0.0
torchmetrics            0.11.4
torchvision             0.15.1
transformers            4.27.3

I was able to generate images up to 1088x1088; beyond that I still get Error: total bytes of NDArray > 2**32

I will propose closing the issue once other Mac users confirm

Hoorah! Upgrade to Ventura 13.3 did the trick! Seems to be working with invokeai 2.3.2 also, even without newest pytorch.

pip list | grep -e diffuser  -e transformer -e torch

clip-anytorch               2.5.2
diffusers                   0.14.0
pytorch-lightning           1.7.7
taming-transformers-rom1504 0.0.6
torch                       1.13.1
torch-fidelity              0.3.0
torchdiffeq                 0.2.3
torchmetrics                0.11.4
torchsde                    0.2.5
torchvision                 0.14.1
transformers                4.26.1

It’s chugging down memory though, and at 1088x1088 it quickly consumed all 64 GB of RAM, slowing my Mac, and generation speed dropped to almost a standstill. It’s generating at 248s per iteration 😉 (but holding!)

I had no issues generating images at 832x832 though, something that was not possible with the previous version of Ventura. I’ll keep trying to see what the max dimensions will be, but will wait until I’m not depending on my Mac for work.

UPDATE 1: 1088x1088 generated in 4317.85s

UPDATE 2: Everything above 1088x1088 fails, confirming what @i3oc9i experienced

pivot69 on Mar 29, 2023
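
Since the reports above point to Ventura 13.3 as the turning point, a quick check of the running macOS version may help others confirm whether they are on an affected release. This is a minimal sketch using the standard library; the 13.3 threshold comes from the comments in this thread, and the helper names are hypothetical.

```python
# Sketch: check whether the running macOS is at least 13.3 (the version the
# comments above report as fixing the assertion). Helper names are hypothetical.
import platform


def parse_version(text):
    """Turn a dotted version string such as '13.2.1' into a tuple of ints."""
    return tuple(int(part) for part in text.split(".") if part.isdigit())


def meets_min_macos(version_text, minimum=(13, 3)):
    """True if version_text is a macOS version at or above the minimum."""
    parsed = parse_version(version_text)
    return bool(parsed) and parsed >= minimum


# platform.mac_ver()[0] is an empty string on non-macOS systems.
current = platform.mac_ver()[0]
print(f"macOS version: {current or 'n/a'}, at least 13.3: {meets_min_macos(current)}")
```

Note that the comments above still report failures beyond 1088x1088 on 13.3, so passing this check does not mean arbitrarily large images will work.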

@keturn @lstein @pcuenca Good news! This issue is solved by upgrading to Ventura 13.3 and PyTorch 2.0

I’m rolling InvokeAI 3.0.0+a0 on commit 09dfde0b

pip list | grep -e diffuser  -e transformer -e torch
clip-anytorch           2.5.2
diffusers               0.14.0
pytorch-lightning       1.7.7
torch                   2.0.0
torchmetrics            0.11.4
torchvision             0.15.1
transformers            4.27.3

I was able to generate images up to 1088x1088; beyond that I still get Error: total bytes of NDArray > 2**32

I will propose closing the issue once other Mac users confirm

i3oc9i on Mar 27, 2023

I have updated to the latest commit, 27a113d8, and updated my venv using the following command

pip install --upgrade --upgrade-strategy eager --use-pep517 -e .

in order to upgrade to the latest versions of the torch, diffusers and transformers modules

pip list | grep -e diffuser  -e transformer -e torch
clip-anytorch           2.5.2
diffusers               0.14.0
pytorch-lightning       1.7.7
torch                   2.0.0
torchmetrics            0.11.4
torchvision             0.15.1
transformers            4.27.1

but the issue is not solved: generating a 704x768 image with the stock SD-1.5 model fails.

invokeai --web
* Initializing, be patient...
>> Initialization file /Users/ivano/Code/Ai/@Stuffs/invokeai.models/invokeai.init found. Loading...
>> Internet connectivity is True
>> InvokeAI, version 3.0.0+a0
>> InvokeAI runtime directory is "/Users/ivano/Code/Ai/@Stuffs/invokeai.models"
>> GFPGAN Initialized
>> CodeFormer Initialized
>> ESRGAN Initialized
>> Using device_type mps
>> xformers not installed
>> NSFW checker is disabled
>> Current VRAM usage:  0.00G
>> Loading diffusers model from runwayml/stable-diffusion-v1-5
  | Using more accurate float32 precision
  | Loading diffusers VAE from stabilityai/sd-vae-ft-mse
  | Using more accurate float32 precision
Fetching 15 files: 100%| | 15/15 [00:00<00:00, 44119.61it/s]
  | Default image dimensions = 512 x 512
>> Loading embeddings from /Users/ivano/Code/Ai/@Stuffs/invokeai.models/embeddings
>> Textual inversion triggers: bad_prompt
>> Model loaded in 4.52s
>> Setting Sampler to k_lms (LMSDiscreteScheduler)

* --web was specified, starting web server...
Loading Python libraries...

* Initializing, be patient...
>> Initialization file /Users/ivano/Code/Ai/@Stuffs/invokeai.models/invokeai.init found. Loading...
>> Started Invoke AI Web Server
>> Default host address now 127.0.0.1 (localhost). Use --host 0.0.0.0 to bind any address.
>> Point your browser at http://127.0.0.1:9090
>> System config requested
>> Patchmatch initialized

>> Image Generation Parameters:

{'prompt': 'am happy dog in a nice garden', 'iterations': 1, 'steps': 30, 'cfg_scale': 7.5, 'threshold': 0, 'perlin': 0, 'height': 768, 'width': 704, 'sampler_name': 'k_euler_a', 'seed': 226339246, 'progress_images': False, 'progress_latents': True, 'save_intermediates': 5, 'generation_mode': 'txt2img', 'init_mask': '...', 'hires_fix': False, 'seamless': False, 'variation_amount': 0}

>> ESRGAN Parameters: False
>> Facetool Parameters: False
>> Setting Sampler to k_euler_a (EulerAncestralDiscreteScheduler)
Generating:   0%|
AppleInternal/Library/BuildRoots/c651a45f-806e-11ed-a221-7ef33c48bc85/Library/Caches/com.apple.xbs/Sources/MetalPerformanceShaders/MPSCore/Types/MPSNDArray.mm:724: failed assertion `[MPSNDArray initWithDevice:descriptor:] Error: total bytes of NDArray > 2**32'
zsh: abort      invokeai --web
/opt/homebrew/Cellar/python@3.10/3.10.10_1/Frameworks/Python.framework/Versions/3.10/lib/python3.10/multiprocessing/resource_tracker.py:224: UserWarning: resource_tracker: There appear to be 1 leaked semaphore objects to clean up at shutdown 
warnings.warn('resource_tracker: There appear to be %d '

i3oc9i on Mar 18, 2023