sagemaker-python-sdk: Increasing the timeout for InvokeEndpoint
The current timeout for InvokeEndpoint is 60 seconds as specified here: https://docs.aws.amazon.com/en_pv/sagemaker/latest/dg/API_runtime_InvokeEndpoint.html
Is there any way we can increase this limit, to say 120 seconds?
***EDIT
Just to be clear, I was able to keep the process on the server running by passing an environment variable in the Model definition like so
model = MXNetModel(..., env={'SAGEMAKER_MODEL_SERVER_TIMEOUT': '300'})
Through CloudWatch, I was able to confirm that the task keeps running even after 60 seconds. (For my use case, I am processing a video frame by frame.) My question, however, is about the client side, where I receive this kind of error due to the timeout:
An error occurred (ModelError) when calling the InvokeEndpoint operation: Received server error (0) from model with message "Your invocation timed out while waiting for a response from container model. Review the latency metrics for each container in Amazon CloudWatch, resolve the issue, and try again.". See https://us-west-2.console.aws.amazon.com/cloudwatch/home?region=us-west-2#logEventViewer:group=/aws/sagemaker/Endpoints/nightingale-pose-estimation in account 552571371228 for more information.: ModelError
About this issue
- Original URL
- State: open
- Created 5 years ago
- Reactions: 30
- Comments: 26 (2 by maintainers)
@ajaykarpur Any update on this? Switching to batch transforms doesn't seem viable for video input.
I agree the 60s limit is quite low, and I hope they will bump it at least a little.
If you need more than 1 minute (and less than 15), you might be interested in the newest SageMaker offering, namely Asynchronous Inference. The client sends the payload to the endpoint, and the result eventually appears in a specified S3 bucket. You can set up SNS or Lambda to inform the client that the result is ready to consume (and, for example, generate an S3 presigned URL in the process).
More on that here: https://aws.amazon.com/about-aws/whats-new/2021/08/amazon-sagemaker-asynchronous-new-inference-option/
For disabling retries, you should be able to do something like the following (please note I haven't tested this code myself; it serves only as a reference):
Regarding the feature request, one option is to use SageMaker's batch transform (https://docs.aws.amazon.com/sagemaker/latest/dg/batch-transform.html). It may not fit your use case, though.
I’ve only gotten the invoke timeouts to work after contacting AWS Support and having them configure those. I’ll note they initially didn’t want to do it either, so it’s not quite as simple unfortunately.
No argument that you may want to close this ticket, but I'd kindly suggest this is still an improvement that would be a really, really nice-to-have.
The issue is described pretty well in the original post. I do not have anything to add: https://github.com/aws/sagemaker-python-sdk/issues/1119#issue-521175736
any update?
Any updates on this?
Any update?