sagemaker-python-sdk: Serving a Tensorflow model fails with ConnectionClosedError
Please fill out the form below.
System Information
- Framework (e.g. TensorFlow) / Algorithm (e.g. KMeans): Tensorflow
- Framework Version: 1.12.0
- Python Version: 3
- CPU or GPU:
- Python SDK Version:
- Are you using a custom image: No
When I try to run a prediction / classification on an image, I get timeouts from SageMaker. It seems like I'm not doing anything particularly complex:
import os

from sagemaker.tensorflow.serving import Model

bucketPath = "s3://sagemaker-my-s3-bucket-foo"
MODEL_NAME_OR_ARTIFACT = "001.tar.gz"
COMPUTE_INSTANCE_TYPE = "ml.p2.xlarge"

# Create model from artifact on S3 (role is the SageMaker execution role defined earlier in the notebook)
# https://github.com/aws/sagemaker-python-sdk/blob/master/src/sagemaker/tensorflow/deploying_tensorflow_serving.rst#making-predictions-against-a-sagemaker-endpoint
model = Model(model_data=os.path.join(bucketPath, MODEL_NAME_OR_ARTIFACT), role=role)
predictor = model.deploy(initial_instance_count=1, instance_type=COMPUTE_INSTANCE_TYPE)
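Side note: to reattach to the endpoint later (e.g. from the RealTimePredictor attempt further down), the generated name can be read off the returned predictor; the attribute below is from the 1.x SDK as far as I can tell:

print(predictor.endpoint)  # auto-generated endpoint name, the same one that shows up in the errors below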
# Set up the handling
import tensorflow as tf
def read_tensor_from_image_file(file_name, input_height=299, input_width=299, input_mean=128, input_std=128):
    """
    Code from v1.6.0 of TensorFlow's label_image.py example
    """
    #pylint: disable= W0621
    input_name = "file_reader"
    file_reader = tf.read_file(file_name, input_name)
    # Pick a decoder based on the file extension
    if file_name.endswith(".png"):
        image_reader = tf.image.decode_png(file_reader, channels=3, name="png_reader")
    elif file_name.endswith(".gif"):
        image_reader = tf.squeeze(tf.image.decode_gif(file_reader, name="gif_reader"))
    elif file_name.endswith(".bmp"):
        image_reader = tf.image.decode_bmp(file_reader, name="bmp_reader")
    else:
        image_reader = tf.image.decode_jpeg(file_reader, channels=3, name="jpeg_reader")
    # Cast, add a batch dimension, resize, and normalize to [-1, 1]
    float_caster = tf.cast(image_reader, tf.float32)
    dims_expander = tf.expand_dims(float_caster, 0)
    resized = tf.image.resize_bilinear(dims_expander, [input_height, input_width])
    normalized = tf.divide(tf.subtract(resized, [input_mean]), [input_std])
    sess = tf.Session()
    result = sess.run(normalized)
    return result
testPath = "path/to/myImage.jpg"
testImageTensor = read_tensor_from_image_file(testPath)
inputData1 = {
    "instances": testImageTensor.tolist()
}
predictor.accept = 'application/json'
predictor.content_type = 'application/json'
try:
    import simplejson as json
except (ModuleNotFoundError, ImportError):
    !pip install simplejson
    import simplejson as json
# classify() complains unless the payload is sent as JSON
jsonSend = json.dumps(inputData1)
sizeBytes = len(jsonSend.encode("utf8"))
# https://github.com/awslabs/amazon-sagemaker-examples/issues/324#issuecomment-433959266
# https://docs.aws.amazon.com/sagemaker/latest/dg/your-algorithms-inference-code.html#your-algorithms-inference-code-container-response
print("Sending megabytes:", sizeBytes / 1024 / 1024) # Sending megabytes: 5.2118330001831055
predictor.classify(jsonSend)
# Returns:
# ConnectionResetError: [Errno 104] Connection reset by peer
# ConnectionClosedError: Connection was closed before we received a valid response from endpoint URL: "https://runtime.sagemaker.us-west-2.amazonaws.com/endpoints/sagemaker-tensorflow-serving-2019-06-05-17-35-41-960/invocations".
It seems I'm hitting the 5 MB payload limit. This seems awfully small for image retraining, and I don't see an argument to adjust the payload size (also here).
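One way to at least stay under the limit is to shrink the JSON before sending it; a minimal sketch (build_payload and the 3-decimal rounding are my own choices, not anything from the SDK):

import numpy as np
import simplejson as json

def build_payload(image_tensor, decimals=3, limit_mb=5.0):
    # Rounding the normalized floats cuts the number of JSON characters per value
    rounded = np.round(image_tensor, decimals)
    body = json.dumps({"instances": rounded.tolist()})
    size_mb = len(body.encode("utf8")) / 1024 / 1024
    if size_mb > limit_mb:
        raise ValueError("Payload is %.2f MB, over the ~%.0f MB InvokeEndpoint limit" % (size_mb, limit_mb))
    return body

jsonSend = build_payload(testImageTensor)
predictor.classify(jsonSend)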
I tried changing the input to the raw NumPy array:
predictor.accept = 'application/x-npy'
predictor.content_type = 'application/x-npy'
from sagemaker.predictor import numpy_deserializer, npy_serializer
predictor.deserializer = numpy_deserializer
predictor.serializer = npy_serializer
predictor.predict(testImageTensor)
but got:
ModelError: An error occurred (ModelError) when calling the InvokeEndpoint operation: Received client error (415) from model with message "{"error": "Unsupported Media Type: application/x-npy"}". See https://us-west-2.console.aws.amazon.com/cloudwatch/home?region=us-west-2#logEventViewer:group=/aws/sagemaker/Endpoints/sagemaker-tensorflow-serving-2019-06-05-17-35-41-960 in account *** for more information.
though #799 suggests that I should be able to push NumPy directly, although I'd need to specify an entry point script to handle it on the endpoint's side (which isn't described in the documentation for deploy, either).
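From the serving container's README, my understanding is that the entry point would be an inference.py exposing input_handler / output_handler hooks; here is a rough, untested sketch of handling application/x-npy (nothing here is verified against the container):

# inference.py -- rough sketch, assuming the input_handler/output_handler
# hooks of the SageMaker TensorFlow Serving container
import io
import json

import numpy as np


def input_handler(data, context):
    """Turn an application/x-npy request body into the JSON that TF Serving expects."""
    if context.request_content_type == "application/x-npy":
        array = np.load(io.BytesIO(data.read()))
        return json.dumps({"instances": array.tolist()})
    # Assume anything else is already JSON
    return data.read().decode("utf-8")


def output_handler(response, context):
    """Pass the TF Serving response through unchanged."""
    return response.content, context.accept_header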
I get the same error when trying to directly create a RealTimePredictor:
from sagemaker.predictor import RealTimePredictor
predictor2 = RealTimePredictor("sagemaker-tensorflow-serving-mymodel", serializer=npy_serializer, deserializer=numpy_deserializer)
predictor2.predict(testImageTensor)
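For completeness, the JSON route does line up with what the default TFS endpoint accepts; a sketch using the SDK's JSON serializer (endpoint name is the placeholder from above), though it still runs into the same payload-size limit:

from sagemaker.predictor import RealTimePredictor, json_serializer, json_deserializer

predictor3 = RealTimePredictor("sagemaker-tensorflow-serving-mymodel",
                               serializer=json_serializer,
                               deserializer=json_deserializer,
                               content_type="application/json")
predictor3.predict({"instances": testImageTensor.tolist()})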
About this issue
- State: closed
- Created 5 years ago
- Reactions: 2
- Comments: 27 (10 by maintainers)
can you try tar-ing it up as:
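Something like the standard TF Serving layout (a sketch, assuming a single model with one numbered version):

model.tar.gz
└── 1/                  # numbered model version
    ├── saved_model.pb
    └── variables/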
@laurenyu oh, it worked
Thanks a lot
@JohnEmad cool, that confirms my hypothesis. it’s an issue in the SDK code - I’ve opened https://github.com/aws/sagemaker-python-sdk/pull/1302 to fix it.
Even after the fix is released, the sagemaker_model.bucket = test-sagemaker-bucket line will be needed if you want the repacked model to stay in your original S3 bucket.

As a workaround for now, can you try not specifying your entry point? (assuming your S3 model data is already packed as you described above)
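In code form, as I read those two pieces of advice (the entry point filename and bucket name are placeholders, reusing the variables from the original deploy code):

from sagemaker.tensorflow.serving import Model

# Once an entry point script is involved, the SDK repacks model.tar.gz;
# setting .bucket keeps the repacked copy in the original bucket (placeholder name)
sagemaker_model = Model(model_data=os.path.join(bucketPath, MODEL_NAME_OR_ARTIFACT),
                        role=role,
                        entry_point="inference.py")
sagemaker_model.bucket = "test-sagemaker-bucket"

# Workaround until the fix ships: drop entry_point so no repacking happens at all
# sagemaker_model = Model(model_data=os.path.join(bucketPath, MODEL_NAME_OR_ARTIFACT), role=role)

predictor = sagemaker_model.deploy(initial_instance_count=1, instance_type=COMPUTE_INSTANCE_TYPE)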
Separately confirming that the code directory was incorrectly nested, and putting it at the top level resolved the requirements.txt file issue.

Looks like the gateway timeout was from passing in a model that I had saved with the wrong input layer, so it had a bad shape. Thanks!