web-llm: Chat demo does not work on Android because of maxStorageBufferBindingSize
On my Pixel 7 Android device, the maxStorageBufferBindingSize limit is only 128MB. Sadly, https://webllm.mlc.ai/#chat-demo requires 1024MB for the Llama-2-7b-chat-hf-q4f16_1 model when run on Chrome for Android.
As it is possible with an Android app (https://llm.mlc.ai/#android), would it be possible to lower this limit so that more Android devices can access and play with https://webllm.mlc.ai/#chat-demo? Note that WebGPU support is coming to Android soon. See https://groups.google.com/a/chromium.org/g/blink-dev/c/YFWuDlCKTP4/m/97C4LCBUBgAJ
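For illustration, here is a minimal sketch of how a page might compare the adapter's `maxStorageBufferBindingSize` against what a model needs and fall back to a smaller model. The `AdapterLike` and `ModelCandidate` types and the `pickModel` helper are hypothetical, not part of WebLLM. The byte figures are assumptions taken from the numbers in this thread, read as MiB.

```typescript
// Minimal structural type so the sketch compiles without @webgpu/types.
// A real GPUAdapter exposes the same `limits.maxStorageBufferBindingSize` field.
interface AdapterLike {
  limits: { maxStorageBufferBindingSize: number };
}

// Hypothetical description of a model and the binding size it needs.
interface ModelCandidate {
  id: string;
  requiredBindingBytes: number;
}

const MiB = 1024 * 1024;

// Ordered from most to least demanding. The sizes are assumptions based on
// this thread, not values read from WebLLM.
const candidates: ModelCandidate[] = [
  { id: "Llama-2-7b-chat-hf-q4f16_1", requiredBindingBytes: 1024 * MiB },
  { id: "RedPajama-INCITE-Chat-3B-v1-q4f32_1-1k", requiredBindingBytes: 128 * MiB },
];

// Returns the first candidate that fits within the adapter's limit,
// or undefined when none fits.
function pickModel(
  adapter: AdapterLike,
  models: ModelCandidate[],
): ModelCandidate | undefined {
  const limit = adapter.limits.maxStorageBufferBindingSize;
  return models.find((m) => m.requiredBindingBytes <= limit);
}

// Browser usage (needs WebGPU, so it is left as a comment here):
//   const adapter = await navigator.gpu.requestAdapter();
//   const model = adapter ? pickModel(adapter, candidates) : undefined;
```

Note that a device only receives a limit above the WebGPU default (134217728 bytes, i.e. 128 MiB) when that limit is passed through `requiredLimits` in `requestDevice()`, and the request fails if the adapter cannot provide it.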
About this issue
- State: open
- Created 8 months ago
- Comments: 57 (56 by maintainers)
Commits related to this issue
- Add 128MB buffer size friendly models (#213) This PR adds three models with 1k context length that can be run by devices with `maxStorageBufferBindingSize = 128MB`: - `Llama-2-7b-chat-hf-q4f16_1-1k... — committed to mlc-ai/web-llm by CharlieFRuan 8 months ago
- [SimpleChat][ChatModule] Disable most models on Android phone (#256) This PR is motivated by https://github.com/mlc-ai/web-llm/issues/209, where limited devices like Android phones crash when we try... — committed to mlc-ai/web-llm by CharlieFRuan 6 months ago
WebGPU is available without a flag in Chrome Canary for Android.
Enjoy 🦃
@toji may have insights. From what I understand there’s not much you can do from the web app side.
Thank you @CharlieFRuan, I’ll have a look at the PR. @toji who added Android support to WebGPU in Chrome may be interested as well in your findings.
I cannot reproduce in https://webgpureport.org with Chrome Canary 122.0.6237.0 on my Pixel 7 device (Android 14). @toji Is this expected?
In the meantime, it is possible to react to “out-of-memory” GPU errors. See https://gpuweb.github.io/gpuweb/#error-scopes
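A minimal sketch of that approach, assuming the app wants to detect a failed buffer allocation instead of crashing: `pushErrorScope("out-of-memory")`, `createBuffer()`, and `popErrorScope()` are the WebGPU calls from the spec linked above, while `DeviceLike`, `AllocationResult`, and `tryCreateStorageBuffer` are hypothetical names used only for this example.

```typescript
// Minimal structural type so the sketch compiles without @webgpu/types.
// A real GPUDevice provides these three methods.
interface DeviceLike {
  pushErrorScope(filter: "validation" | "out-of-memory" | "internal"): void;
  popErrorScope(): Promise<{ message: string } | null>;
  createBuffer(descriptor: { size: number; usage: number }): unknown;
}

// Numeric value of GPUBufferUsage.STORAGE in the WebGPU spec.
const STORAGE_USAGE = 0x0080;

type AllocationResult =
  | { ok: true; buffer: unknown }
  | { ok: false; message: string };

// Hypothetical helper: allocates a storage buffer inside an "out-of-memory"
// error scope and reports failure instead of letting the error go unnoticed.
async function tryCreateStorageBuffer(
  device: DeviceLike,
  size: number,
): Promise<AllocationResult> {
  device.pushErrorScope("out-of-memory");
  const buffer = device.createBuffer({ size, usage: STORAGE_USAGE });
  // Resolves to null when no out-of-memory error was captured by the scope.
  const error = await device.popErrorScope();
  if (error !== null) {
    return { ok: false, message: error.message };
  }
  return { ok: true, buffer };
}
```

On failure, the caller could then retry with a smaller model or show a message to the user. This only covers errors the browser reports through the scope; it would not help if the browser process itself crashes.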
🥳 I was finally able to get logs on my Pixel 7 Android device!
Hopefully this will help.
I’m also really happy 😄 to see WebLLM running on Android without any flag 😉
`shader-f16` is shipping in Chrome 120. See https://groups.google.com/a/chromium.org/g/blink-dev/c/AsKn-UwMYAE/m/4FKB-x_QAQAJ
I’m unable to get logs as Chrome is now crashing with this model. Sorry 😭 I’ve tried several times.
It works great with RedPajama-INCITE-Chat-3B-v1-q4f32_1-1k.