transformers: ModelError while deploying FlanT5-xl
System Info
transformers version: 4.17.0; Platform: SageMaker Notebook; Python: 3.9.0
Who can help?
Information
- The official example scripts
- My own modified scripts
Tasks
- An officially supported task in the examples folder (such as GLUE/SQuAD, …)
- My own task or dataset (give details below)
Reproduction
Amazon SageMaker deployment script in AWS for flan-t5-xl:
from sagemaker.huggingface import HuggingFaceModel
import sagemaker

role = sagemaker.get_execution_role()

# Hub model configuration. https://huggingface.co/models
hub = {
    'HF_MODEL_ID': 'google/flan-t5-xl',
    'HF_TASK': 'text2text-generation'
}

# create Hugging Face Model Class
huggingface_model = HuggingFaceModel(
    transformers_version='4.17.0',
    pytorch_version='1.10.2',
    py_version='py38',
    env=hub,
    role=role,
)

# deploy model to SageMaker Inference
predictor = huggingface_model.deploy(
    initial_instance_count=1,   # number of instances
    instance_type='ml.m5.xlarge'  # EC2 instance type
)

predictor.predict({
    'inputs': "The answer to the universe is"
})
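For debugging, it can also help to call the endpoint with boto3 directly rather than going through Predictor.predict, since (per the traceback below) predict() ultimately calls sagemaker-runtime's invoke_endpoint. A minimal sketch; the helper name build_invoke_args is ours, not part of the SDK:

```python
import json

# Hypothetical helper: builds the same request arguments that
# Predictor.predict() passes to invoke_endpoint for a JSON payload.
def build_invoke_args(endpoint_name: str, text: str) -> dict:
    return {
        "EndpointName": endpoint_name,
        "ContentType": "application/json",
        "Body": json.dumps({"inputs": text}),
    }

# With boto3 (not imported here to keep the sketch dependency-free):
#   runtime = boto3.client("sagemaker-runtime")
#   response = runtime.invoke_endpoint(
#       **build_invoke_args("my-endpoint", "The answer to the universe is"))
```

Calling the runtime client directly surfaces the raw 400 response body, which is useful when the ModelError message is truncated.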
Results in
---------------------------------------------------------------------------
ModelError Traceback (most recent call last)
/tmp/ipykernel_20116/1338286066.py in <cell line: 26>()
24 )
25
---> 26 predictor.predict({
27 'inputs': "The answer to the universe is"
28 })
~/anaconda3/envs/python3/lib/python3.10/site-packages/sagemaker/predictor.py in predict(self, data, initial_args, target_model, target_variant, inference_id)
159 data, initial_args, target_model, target_variant, inference_id
160 )
--> 161 response = self.sagemaker_session.sagemaker_runtime_client.invoke_endpoint(**request_args)
162 return self._handle_response(response)
163
~/anaconda3/envs/python3/lib/python3.10/site-packages/botocore/client.py in _api_call(self, *args, **kwargs)
528 )
529 # The "self" in this scope is referring to the BaseClient.
--> 530 return self._make_api_call(operation_name, kwargs)
531
532 _api_call.__name__ = str(py_operation_name)
~/anaconda3/envs/python3/lib/python3.10/site-packages/botocore/client.py in _make_api_call(self, operation_name, api_params)
958 error_code = parsed_response.get("Error", {}).get("Code")
959 error_class = self.exceptions.from_code(error_code)
--> 960 raise error_class(parsed_response, operation_name)
961 else:
962 return parsed_response
ModelError: An error occurred (ModelError) when calling the InvokeEndpoint operation: Received client error (400) from primary with message "{
"code": 400,
"type": "InternalServerException",
"message": "Could not load model /.sagemaker/mms/models/google__flan-t5-xl with any of the following classes: (<class 'transformers.models.auto.modeling_auto.AutoModelForSeq2SeqLM'>, <class 'transformers.models.t5.modeling_t5.T5ForConditionalGeneration'>)."
}
"
From an existing issue, I suspected this might be due to the use of transformers==4.17.0. However, when I use the exact same script to deploy the flan-t5-large model, it works without any issues.
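If the container's pinned transformers==4.17.0 is indeed the culprit, one workaround discussed in the comments below is to ship a code/requirements.txt inside the model archive so the container upgrades transformers at startup. A sketch of that file; the exact version pin is an assumption, not a verified fix:

```
# code/requirements.txt (inside model.tar.gz)
# Hypothetical pin: a release new enough to load flan-t5-xl checkpoints.
transformers==4.26.0
```

Note that, as the comments report, this approach did not resolve the error for everyone.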
Expected behavior
The model should get deployed on AWS Sagemaker without any issues.
About this issue
- Original URL
- State: closed
- Created a year ago
- Reactions: 2
- Comments: 18 (7 by maintainers)
@RonLek done: https://www.philschmid.de/deploy-flan-t5-sagemaker

If you check this blog post: https://www.philschmid.de/deploy-t5-11b there is a code snippet on how to do this for t5-11b.

@RonLek i am planning to create an example. I'll post it here once it is ready.

When you provide a model_data keyword you also have to include the inference.py and the model weights.

@philschmid what should be the contents of the inference.py in case of the flan-t5-xl model? Can this be an empty file if I don't intend to change anything from the hub model? There doesn't seem to be such a file included within the Hugging Face repository.

@valentinboyanov I confirm getting the same as well. From the CW logs it seems that 4.17.0 is uninstalled and replaced with the latest version specified in the requirements.txt file.

@younesbelkada if I change it, I'm unable to deploy at all. This is why I've followed the instructions by Heiko Hotz (marshmellow77) in this comment to provide a requirements.txt file that will let me specify dependencies I want to be installed in the container.

Hi @younesbelkada and @RonLek! I have the same issue deploying google/flan-t5-xxl on SageMaker. I've tried to update to transformers==4.26.0 by providing code/requirements.txt through s3://sagemaker-eu-north-1-***/model.tar.gz. Observing the AWS logs I can see that transformers==4.26.0 was installed, but I got the same error when trying to do an inference.

AWS logs:
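The packaging flow the commenters describe (model_data pointing at a model.tar.gz that carries code/requirements.txt and code/inference.py) can be sketched as follows. The layout and pinned version are assumptions drawn from the comments above, and the actual model weights are omitted for brevity:

```shell
#!/usr/bin/env sh
set -e

# Assumed layout: the SageMaker Hugging Face container reads
# code/requirements.txt (and optionally code/inference.py) from the
# archive referenced by model_data.
mkdir -p model/code
printf 'transformers==4.26.0\n' > model/code/requirements.txt

# Empty inference.py keeps the container's default handler; whether an
# empty file suffices is exactly the open question in the comments above.
: > model/code/inference.py

# The model weights themselves would also go under model/ (omitted here).

tar -czf model.tar.gz -C model .
tar -tzf model.tar.gz
```

The resulting model.tar.gz would then be uploaded to S3 and passed to HuggingFaceModel via model_data instead of (or alongside) the HF_MODEL_ID hub configuration.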