recommenders: [Question] After saving & loading the TF model / ScaNN / BruteForce objects with dict input for the user tower, the loaded model won't work properly

About saving & loading the BruteForce/ScaNN and model objects

I am working through this tutorial with an online shopping dataset, and followed it to build a user tower similar to this:

class UserModel(tf.keras.Model):
    ...  # embedding sub-layers defined in __init__ (elided)

    def call(self, inputs):
        # Take the input dictionary, pass it through each input layer,
        # and concatenate the result.
        return tf.concat([
            self.user_embedding(inputs["user_id"]),
            self.timestamp_embedding(inputs["timestamp"]),
            tf.reshape(self.normalized_timestamp(inputs["timestamp"]), (-1, 1)),
        ], axis=1)

Where inputs is a dict.
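For reference, the elided __init__ closely follows the tutorial's featurization example; roughly (a sketch, assuming unique_user_ids and timestamp_buckets are precomputed elsewhere):

def __init__(self):
    super().__init__()
    # user_id -> integer index -> embedding
    self.user_embedding = tf.keras.Sequential([
        tf.keras.layers.StringLookup(vocabulary=unique_user_ids, mask_token=None),
        tf.keras.layers.Embedding(len(unique_user_ids) + 1, 32),
    ])
    # raw timestamp -> bucket index -> embedding
    self.timestamp_embedding = tf.keras.Sequential([
        tf.keras.layers.Discretization(timestamp_buckets.tolist()),
        tf.keras.layers.Embedding(len(timestamp_buckets) + 1, 32),
    ])
    # raw timestamp -> standardized scalar
    self.normalized_timestamp = tf.keras.layers.Normalization(axis=None)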

I am able to get the embedding via UserModel()(input_dict) just fine. The issue arises when I work with the example at https://www.tensorflow.org/recommenders/examples/efficient_serving, where we want to save the ScaNN/BruteForce object.

I am able to get ScaNN working and can call it:

scann = tfrs.layers.factorized_top_k.ScaNN(model.user_model, num_reordering_candidates=100)
scann.index_from_dataset(
    sku_map.batch(2048).map(lambda x: (x["SKU_KEY"], model.sku_model(x))))
    # (sku_map.batch(2048).map(lambda x: x["SKU_KEY"]), sku_map.batch(2048).map(model.sku_model))
scann({'CONTEXT_ID': np.array([[b'263', b'34', b'555', b'44', b'3300']]),
       'USER_ID': np.array([b'sssksksksksksk'])})

and it returns meaningful results. However, if I follow the "Deploying the approximate model" section to save it and load it back, I get an error:

ValueError: Could not find matching function to call loaded from the SavedModel. Got:
  Positional arguments (3 total):
    * {'CONTEXT_ID': <tf.Tensor 'queries:0' shape=(1, 5) dtype=string>, 'USER_ID': <tf.Tensor 'queries_1:0' shape=(1,) dtype=string>}
    * None
    * False
  Keyword arguments: {}

Expected these arguments to match one of the following 4 option(s):

Option 1:
  Positional arguments (3 total):
    * {'PRICE': TensorSpec(shape=(None,), dtype=tf.float32, name='PRICE'), 'USER_ID': TensorSpec(shape=(None,), dtype=tf.string, name='USER_ID'), 'TRANS_COUNT': TensorSpec(shape=(None,), dtype=tf.int64, name='TRANS_COUNT'), 'SKU_KEY': TensorSpec(shape=(None, 1), dtype=tf.string, name='SKU_KEY'), 'SKU_DESC': TensorSpec(shape=(None,), dtype=tf.string, name='SKU_DESC'), 'CONTEXT_ID': TensorSpec(shape=(None, 5), dtype=tf.string, name='CONTEXT_ID')}
    * None
    * False
  Keyword arguments: {}

It seems to me that (1) the expected shapes are all (None,), and (2) the loaded model can no longer match the input dict. The same thing happens if I save the full model and load it back:

model.save('my_model')
my_tf_saved_model = tf.keras.models.load_model('./my_model')
my_tf_saved_model(row)

It throws similar errors, even though model(row) (where row is a dict) works fine.
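For reference, the call signatures the SavedModel was traced with can be inspected directly (a diagnostic sketch, assuming the model was saved to ./my_model):

loaded = tf.saved_model.load('./my_model')
# Each serving signature records the exact structured input it was traced with.
print(loaded.signatures['serving_default'].structured_input_signature)
# Or, from the command line:
#   saved_model_cli show --dir ./my_model --tag_set serve --signature_def serving_default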

Can’t do model.evaluate after replacing factorized_metrics with BruteForce

Another strange finding: with the above setup, if I pass the query model into BruteForce (brute_force = tfrs.layers.factorized_top_k.BruteForce(model.user_model)), reset factorized_metrics to use it, and then run model.evaluate (for faster evaluation), I get an error:

    TypeError: Only integers, slices (`:`), ellipsis (`...`), tf.newaxis (`None`) and scalar tf.int32/tf.int64 tensors are valid indices, got 'USER_ID'

It seems it does not like that the user tower's input is a dict (i.e. returning self.user_model(inputs['USER_ID'])). However, it does work if I do not pass the user model when initializing BruteForce.
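In other words, something like this works (a sketch following the efficient-serving tutorial's pattern; it assumes the model exposes its retrieval task as model.task, as in the tutorials, and test_ds stands in for my evaluation dataset):

# BruteForce without a query model consumes query *embeddings* directly,
# which is exactly what the FactorizedTopK metric feeds it during evaluate().
brute_force = tfrs.layers.factorized_top_k.BruteForce()
brute_force.index_from_dataset(
    sku_map.batch(2048).map(lambda x: (x["SKU_KEY"], model.sku_model(x))))

model.task.factorized_metrics = tfrs.metrics.FactorizedTopK(candidates=brute_force)
model.compile()
model.evaluate(test_ds, return_dict=True)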

Any insights would be appreciated!

About this issue

  • State: closed
  • Created 3 years ago
  • Comments: 16

Most upvoted comments

@xiaoyaoyang When serialising a model, TensorFlow creates a strict function call signature based on tracing the model. Before serialising, you have passed in a dict containing:

{'PRICE': TensorSpec(shape=(None,), dtype=tf.float32, name='PRICE'), 'USER_ID': TensorSpec(shape=(None,), dtype=tf.string, name='USER_ID'), 'TRANS_COUNT': TensorSpec(shape=(None,), dtype=tf.int64, name='TRANS_COUNT'), 'SKU_KEY': TensorSpec(shape=(None, 1), dtype=tf.string, name='SKU_KEY'), 'SKU_DESC': TensorSpec(shape=(None,), dtype=tf.string, name='SKU_DESC'), 'CONTEXT_ID': TensorSpec(shape=(None, 5), dtype=tf.string, name='CONTEXT_ID')}

When you serialise your model, TensorFlow will create a call signature that expects all of those inputs, even if your model doesn't use them. So when you call it with just USER_ID, it will fail.

It's best to ensure you only pass the required features into your model during training and evaluation. Alternatively, you should be able to resolve this by calling the model once, before serialising, with an example record that contains only the required features. This will result in an additional call signature that matches the input you expect to pass when serving.
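A sketch of that workaround, reusing the example inputs from the question above (path here stands in for your export directory):

# Trace an extra call signature containing only the query-side features.
_ = scann({'CONTEXT_ID': np.array([[b'263', b'34', b'555', b'44', b'3300']]),
           'USER_ID': np.array([b'sssksksksksksk'])})

tf.saved_model.save(scann, path)
loaded = tf.saved_model.load(path)

# The loaded index should now accept the same two-feature dict.
scores, ids = loaded({'CONTEXT_ID': np.array([[b'263', b'34', b'555', b'44', b'3300']]),
                      'USER_ID': np.array([b'sssksksksksksk'])})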

By "call the model" I mean getting a prediction. After you index the brute_force layer, you need to run:

scores, identifiers = index(example_record)
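For example (a sketch, reusing the inputs from the question above):

example_record = {'CONTEXT_ID': np.array([[b'263', b'34', b'555', b'44', b'3300']]),
                  'USER_ID': np.array([b'sssksksksksksk'])}
scores, identifiers = index(example_record)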

Hi @patrickorlando,

Thank you so much for your help!! The prediction of the loaded ranking model finally works! But when I try to predict with the loaded retrieval model, it shows a completely different error message. Can you also help me with that? Thank you!! My code:

def save_retrieval(model):

    # Create a BruteForce layer as before for prediction.
    index = tfrs.layers.factorized_top_k.BruteForce(model.query_model)
    # index = tfrs.layers.factorized_top_k.ScaNN(model.query_model)
    index.index_from_dataset(
        tf.data.Dataset.zip((items.batch(100), items.batch(100).map(model.candidate_model)))
    )

    # Export the query model.
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "retrieval")

        # Save the index.
        tf.saved_model.save(
            index,
            path,
            # options=tf.saved_model.SaveOptions(namespace_whitelist=["Scann"]),
        )

        # Load it back; can also be done in TensorFlow Serving.
        loaded = tf.saved_model.load(path)
        # print(type(loaded))
        time = datetime.datetime.strptime("2021", '%Y')
        time_input = datetime.datetime.timestamp(time)

        # Pass a customer id in, get top predicted product ids back.
        scores, titles = loaded({
            "customer_id": np.array(["0001019648"]),
            "customer_price_group": np.array(["KK-nicht verwenden"]),
            "customer_type": np.array(["END-ACCOUNT"]),
            "customer_industry": np.array(["Power Utilities"]),
            "companyname_gu": np.array(["Zweckverband kommunaler Anteilseigner der WEMAG"]),
            "project_flag": np.array([0]),
            "timestamp": np.array([time_input]),
        })

        print(f"Recommendations: {titles[0][:3]}")

save_retrieval(model)

The error message is as follows:

WARNING:tensorflow:Skipping full serialization of Keras layer <tensorflow_recommenders.layers.factorized_top_k.BruteForce object at 0x000001BF160BBAF0>, because it is not built.
WARNING:absl:Found untraced functions such as query_with_exclusions while saving (showing 1 of 1). These functions will not be directly callable after loading.
INFO:tensorflow:Assets written to: C:\Users\z004f16b\AppData\Local\Temp\tmpz47edw4c\retrieval\assets

TypeError                                 Traceback (most recent call last)
~\AppData\Local\Temp/ipykernel_23596/1368832069.py in <module>
     70
     71 # save_ranking(model)
---> 72 save_retrieval(model)

~\AppData\Local\Temp/ipykernel_23596/1368832069.py in save_retrieval(model)
     25
     26     # Pass a customer id in, get top predicted product ids back.
---> 27     scores, titles = loaded({
     28         "customer_id": np.array(["0001019648"]),
     29         "customer_price_group": np.array(["KK-nicht verwenden"]),

TypeError: '_UserObject' object is not callable

As always, @patrickorlando has the right answer. The key here is passing only the features you need (here, only the user features) into your model, not all features.
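For illustration, a minimal sketch of what that looks like inside a TFRS retrieval model's compute_loss (the feature names are assumptions based on this thread, and product_id is a hypothetical candidate identifier):

class RetrievalModel(tfrs.Model):
    ...

    def compute_loss(self, features, training=False):
        # Feed the query tower only the user-side features it actually
        # consumes, so tracing never bakes unused candidate features into
        # the query model's call signature.
        query_embeddings = self.query_model({
            "customer_id": features["customer_id"],
            "timestamp": features["timestamp"],
            # ...other user-side features only
        })
        candidate_embeddings = self.candidate_model(features["product_id"])
        return self.task(query_embeddings, candidate_embeddings)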