transformers: Transformers 4.36 doesn't work with `microsoft/phi-1.5` unless you pass in `trust_remote_code=True`
System Info
Typically, once the transformers library adds native support for a model, we no longer need to pass trust_remote_code=True when initializing the model or tokenizer.
However, even with the latest version of the transformers package (4.36.1), I still need to pass it for microsoft/phi-1.5 to load and for the einops weights to be converted to torch weights:
from transformers import AutoModelForCausalLM
model = AutoModelForCausalLM.from_pretrained("microsoft/phi-1.5", trust_remote_code=True)
I took a look at the PR that added Phi. Is the expectation that we should just be using susnato/phi-1_5_dev instead of microsoft/phi-1.5 going forward? If yes, why is this the case? If not, how can I use the original microsoft/phi-1.5 model without setting trust_remote_code to True?
Thanks a bunch! Super excited that Phi is now a well supported model in the transformers ecosystem!
Who can help?
@ArthurZucker @younesbelkada @susa
Information
- The official example scripts
- My own modified scripts
Tasks
- An officially supported task in the `examples` folder (such as GLUE/SQuAD, …)
- My own task or dataset (give details below)
Reproduction
from transformers import AutoModelForCausalLM
model = AutoModelForCausalLM.from_pretrained("microsoft/phi-1.5", trust_remote_code=True)
Expected behavior
I was expecting that, like all transformers models that get “first class” support in a new major transformers release, Phi would work the same way, but that doesn’t seem to be the case.
About this issue
- State: closed
- Created 7 months ago
- Comments: 20 (8 by maintainers)
Hi @ArthurZucker, can I please help in adding `phi-2` in any way?
As far as I can tell, the architecture is the same as the `phi` that we have in the library (but slightly bigger). Just need to convert the weights using the existing phi script and transfer the weights. Also maybe add an integration test to make sure the logits are the same.
Let me know if I could be of any help.
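The logit check suggested above could be sketched roughly as follows. This is only an illustration: it uses plain Python lists in place of real model outputs so it runs without downloading any checkpoint, and the function names, the tolerance, and the sample values are assumptions rather than anything taken from the transformers test suite.

```python
def max_abs_diff(reference_logits, converted_logits):
    """Return the largest element-wise absolute difference between two logit vectors."""
    if len(reference_logits) != len(converted_logits):
        raise ValueError("logit vectors must have the same length")
    return max(abs(r - c) for r, c in zip(reference_logits, converted_logits))


def logits_match(reference_logits, converted_logits, atol=1e-4):
    """True when every pair of logits agrees within the absolute tolerance."""
    return max_abs_diff(reference_logits, converted_logits) <= atol


# Made-up values standing in for the first few logits of each checkpoint.
reference = [0.125, -1.5, 3.25]
converted = [0.125, -1.5, 3.25]
print(logits_match(reference, converted))
```

In a real integration test the two lists would presumably be slices of the logits produced by the original and the converted checkpoint for the same input.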
ETA is probably end of next week. Gotta finish DECI, add mamba and then Phi2! Should be quite fast 🤗
Hi @arnavgarg1, I fixed some things and now `phi2` should work as expected, could you please run it from `susnato/phi-2` and let me know if it is showing expected results or not.
@susnato No worries at all, let me know if there’s anything I can do to help!
Yeah sure, the idea will be to open a PR once everything is done, making sure we don’t have issues with the model type, and ask the author to merge. For me there is no issue with this regarding the licence: we don’t modify the weights, we modify the dictionary that stores them / the split!
We’ll be adding phi2 as well yes 😊 we asked the author if he is interested but given the community’s interest it’s a good way to go anyways
Hey @susnato, based on inspection, they seem architecturally similar, just that phi-2 is slightly bigger than phi 1.5. See this diff I created: https://www.diffchecker.com/cAspnGZ3/
I assume that we just need to update the keys and config and we should be good to go? Is this something I could potentially help with?
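As a rough idea of what "updating the keys" might involve, here is a minimal sketch that renames state-dict keys by prefix. The prefixes in `HYPOTHETICAL_MAPPING` are invented for illustration and are not the real parameter names of either checkpoint; strings stand in for tensors so the snippet runs without torch.

```python
def rename_keys(state_dict, key_mapping):
    """Return a new dict with each key's matching prefix rewritten.

    key_mapping maps an old prefix to a new prefix; keys that match no
    prefix are copied through unchanged.
    """
    renamed = {}
    for old_key, tensor in state_dict.items():
        new_key = old_key
        for old_prefix, new_prefix in key_mapping.items():
            if old_key.startswith(old_prefix):
                new_key = new_prefix + old_key[len(old_prefix):]
                break
        if new_key in renamed:
            raise KeyError(f"duplicate key after renaming: {new_key}")
        renamed[new_key] = tensor
    return renamed


# Hypothetical prefixes, for illustration only.
HYPOTHETICAL_MAPPING = {
    "old_block.": "new_block.",
    "old_head.": "new_head.",
}

# Strings stand in for tensors so the sketch runs without torch.
original = {"old_block.0.weight": "w0", "old_head.bias": "b", "shared.norm": "n"}
print(rename_keys(original, HYPOTHETICAL_MAPPING))
```

The existing phi conversion script mentioned earlier in the thread is the authoritative place for the actual mapping; this only shows the general shape of the operation.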
Thanks for the prompt response @younesbelkada and @susnato! That makes sense! Will keep on the lookout for when the weights and necessary files PR on the hub gets merged in - for now, this looks like it works nicely!
Thanks for the awesome work in adding support for Phi @susnato!
Out of curiosity - are there plans to add support for Phi-2 as well?
Hi, I have already opened PRs on the Hub for transferring weights and necessary files, here and here. We need to wait for someone from the org to merge them; then you can use it without passing `trust_remote_code=True`.
Hi @arnavgarg1, phi-1 from microsoft still uses the code-on-the-Hub feature: https://huggingface.co/microsoft/phi-1_5 If you want to use the HF version of Phi-1 you need to use the converted checkpoints from @susnato, such as https://huggingface.co/susnato/phi-1_5_dev - I think we should transfer those weights under the microsoft org with the suffix `-hf`.
I did not review the Phi integration PR, but if the keys are the same we can also open PRs on the Hub on the original repos. cc @ArthurZucker
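To summarize the workaround discussed in this thread, a small sketch is given below. It assumes the converted checkpoint named above is still available on the Hub and that the installed transformers version ships the Phi implementation; the helper name `load_causal_lm` and the two constants are illustrative, not part of the library.

```python
# Repo IDs taken from the links in this thread; they may have changed since.
CONVERTED_REPO = "susnato/phi-1_5_dev"
ORIGINAL_REPO = "microsoft/phi-1_5"


def load_causal_lm(repo_id, allow_remote_code=False):
    """Load a causal LM, opting in to Hub-hosted code only when asked."""
    # Imported lazily so defining this helper does not require transformers.
    from transformers import AutoModelForCausalLM

    return AutoModelForCausalLM.from_pretrained(
        repo_id, trust_remote_code=allow_remote_code
    )
```

Under those assumptions, `load_causal_lm(CONVERTED_REPO)` would use the in-library Phi code, while `load_causal_lm(ORIGINAL_REPO, allow_remote_code=True)` corresponds to the call in the original report until the Hub PRs are merged.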