transformers: Suddenly unable to create BERT encodings

System Info

Google Colab: Python 3.10.12, transformers 4.38.2, tensorflow 2.15.0, datasets 2.18.0

Who can help?

@gante and @Rocketknight1

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, …)
  • My own task or dataset (give details below)

Reproduction

I am trying to fine-tune TFBert for an NLI task by adding dense layers on top of the BERT encodings. My dataset is a dataframe with a column of premises, a column of hypotheses, and a label column (either entailment or contradiction).

In the past I have used similar code with TFBert to encode my dataset and fine-tune an NLI model without issue. The line that raises the error is embeddings = bert.layers[0](input_ids, attention_mask, token_type_ids), and I cannot figure out why, since I have used this line in similar code without any problems (one difference was that my previous code converted the dataset to Dataset format after tokenizing, not before). The code was working last week, and transformers 4.38.2 was released about 5 days ago. Could this issue be related to that release?

Here is my complete code:

# Imports used throughout the script
import numpy as np
import pandas as pd
import tensorflow as tf
from datasets import Dataset
from transformers import BertTokenizer, TFBertModel

# 1. Load tokenizer
tranformersPreTrainedModelName = 'bert-base-uncased'
bert_tokenizer = BertTokenizer.from_pretrained(tranformersPreTrainedModelName)

# 2. Load model
bert = TFBertModel.from_pretrained(tranformersPreTrainedModelName, output_hidden_states = True)

data = pd.read_excel('/content/nli_fine_tuning.xlsx', sheet_name = 'fine tuning 4')

# Split into train, validate and test datasets
train, val, test = np.split(data.sample(frac = 1, random_state = 123), [int(.6*len(data)), int(.8*len(data))])

# Clean up datasets, convert to Dataset format
train_dataset = Dataset.from_pandas(train)
train_dataset = train_dataset.remove_columns(["__index_level_0__"])
val_dataset = Dataset.from_pandas(val)
val_dataset = val_dataset.remove_columns(["__index_level_0__"])
test_dataset = Dataset.from_pandas(test)
test_dataset = test_dataset.remove_columns(["__index_level_0__"])

# Tokenize the datasets
def tokenize_data(data, tokenizer):
  # Use the tokenizer passed in as an argument rather than the global one
  encoded_data = tokenizer(data['premise'], data['hypothesis'], 
                           max_length = 100, 
                           truncation = True, 
                           padding = 'max_length', 
                           add_special_tokens = True, 
                           return_token_type_ids = True, 
                           return_attention_mask = True, 
                           return_tensors = 'tf')

  labels = np.array(pd.get_dummies(data['label']))
  return encoded_data, labels

# Apply tokenizer to train, val and test
train_encoded_data, train_labels = tokenize_data(train_dataset, bert_tokenizer)
val_encoded_data, val_labels = tokenize_data(val_dataset, bert_tokenizer)
test_encoded_data, test_labels = tokenize_data(test_dataset, bert_tokenizer)

# Model setup
input_ids = tf.keras.Input(shape = (100,),dtype=tf.int32, name = 'input_ids')
attention_mask = tf.keras.Input(shape = (100,),dtype=tf.int32, name = 'attention_mask')
token_type_ids = tf.keras.Input(shape = (100,),dtype=tf.int32, name = 'token_type_ids')

# create BERT embeddings
embeddings = bert.layers[0](input_ids, attention_mask, token_type_ids)
last_hidden_states = embeddings.last_hidden_state # extract the last hidden state
X = tf.keras.layers.Dense(32, activation = 'relu')(last_hidden_states) # Dense layers for classification
y = tf.keras.layers.Dense(2, activation = 'softmax')(X)

finetuned_bert_model = tf.keras.Model(inputs = [input_ids, attention_mask, token_type_ids], outputs = y)

# Freeze Bert layer
finetuned_bert_model.layers[3].trainable = False

# Compile
finetuned_bert_model.compile(optimizer = 'adam', loss = 'categorical_crossentropy', metrics = ['accuracy'])

bert_hist = finetuned_bert_model.fit(train_encoded_data, train_labels,
                           validation_data = [val_encoded_data, val_labels], 
                           epochs = 5) 

And here is the printout of the error:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
[<ipython-input-17-75f3b98408a6>](https://localhost:8080/#) in <cell line: 6>()
      4 
      5 # form BERT embeddings
----> 6 embeddings = bert.layers[0](input_ids, attention_mask, token_type_ids)
      7 # then extract the last hidden state
      8 last_hidden_states = embeddings.last_hidden_state

6 frames
[/usr/local/lib/python3.10/dist-packages/tf_keras/src/utils/traceback_utils.py](https://localhost:8080/#) in error_handler(*args, **kwargs)
     68             # To get the full stack trace, call:
     69             # `tf.debugging.disable_traceback_filtering()`
---> 70             raise e.with_traceback(filtered_tb) from None
     71         finally:
     72             del filtered_tb

[/usr/local/lib/python3.10/dist-packages/transformers/modeling_tf_utils.py](https://localhost:8080/#) in run_call_with_unpacked_inputs(self, *args, **kwargs)
    426 
    427         unpacked_inputs = input_processing(func, config, **fn_args_and_kwargs)
--> 428         return func(self, **unpacked_inputs)
    429 
    430     # Keras enforces the first layer argument to be passed, and checks it through `inspect.getfullargspec()`. This

[/usr/local/lib/python3.10/dist-packages/transformers/models/bert/modeling_tf_bert.py](https://localhost:8080/#) in call(self, input_ids, attention_mask, token_type_ids, position_ids, head_mask, inputs_embeds, encoder_hidden_states, encoder_attention_mask, past_key_values, use_cache, output_attentions, output_hidden_states, return_dict, training)
    910             token_type_ids = tf.fill(dims=input_shape, value=0)
    911 
--> 912         embedding_output = self.embeddings(
    913             input_ids=input_ids,
    914             position_ids=position_ids,

[/usr/local/lib/python3.10/dist-packages/transformers/models/bert/modeling_tf_bert.py](https://localhost:8080/#) in call(self, input_ids, position_ids, token_type_ids, inputs_embeds, past_key_values_length, training)
    204 
    205         if input_ids is not None:
--> 206             check_embeddings_within_bounds(input_ids, self.config.vocab_size)
    207             inputs_embeds = tf.gather(params=self.weight, indices=input_ids)
    208 

[/usr/local/lib/python3.10/dist-packages/transformers/tf_utils.py](https://localhost:8080/#) in check_embeddings_within_bounds(tensor, embed_dim, tensor_name)
    161         tensor_name (`str`, *optional*): The name of the tensor to use in the error message.
    162     """
--> 163     tf.debugging.assert_less(
    164         tensor,
    165         tf.cast(embed_dim, dtype=tensor.dtype),

[/usr/local/lib/python3.10/dist-packages/keras/src/layers/core/tf_op_layer.py](https://localhost:8080/#) in handle(self, op, args, kwargs)
    117             for x in tf.nest.flatten([args, kwargs])
    118         ):
--> 119             return TFOpLambda(op)(*args, **kwargs)
    120         else:
    121             return self.NOT_SUPPORTED

[/usr/local/lib/python3.10/dist-packages/keras/src/utils/traceback_utils.py](https://localhost:8080/#) in error_handler(*args, **kwargs)
     68             # To get the full stack trace, call:
     69             # `tf.debugging.disable_traceback_filtering()`
---> 70             raise e.with_traceback(filtered_tb) from None
     71         finally:
     72             del filtered_tb

TypeError: Exception encountered when calling layer 'embeddings' (type TFBertEmbeddings).

Could not build a TypeSpec for name: "tf.debugging.assert_less/assert_less/Assert/Assert"
op: "Assert"
input: "tf.debugging.assert_less/assert_less/All"
input: "tf.debugging.assert_less/assert_less/Assert/Assert/data_0"
input: "tf.debugging.assert_less/assert_less/Assert/Assert/data_1"
input: "tf.debugging.assert_less/assert_less/Assert/Assert/data_2"
input: "Placeholder"
input: "tf.debugging.assert_less/assert_less/Assert/Assert/data_4"
input: "tf.debugging.assert_less/assert_less/y"
attr {
  key: "T"
  value {
    list {
      type: DT_STRING
      type: DT_STRING
      type: DT_STRING
      type: DT_INT32
      type: DT_STRING
      type: DT_INT32
    }
  }
}
attr {
  key: "summarize"
  value {
    i: 3
  }
}
 of unsupported type <class 'tensorflow.python.framework.ops.Operation'>.

Call arguments received by layer 'embeddings' (type TFBertEmbeddings):
  • input_ids=<KerasTensor: shape=(None, 100) dtype=int32 (created by layer 'input_ids')>
  • position_ids=None
  • token_type_ids=<KerasTensor: shape=(None, 100) dtype=int32 (created by layer 'token_type_ids')>
  • inputs_embeds=None
  • past_key_values_length=0
  • training=False

Expected behavior

I would expect bert.layers[0](input_ids, attention_mask, token_type_ids) to create the encodings used to train the model, after which I could add other types of layers to accomplish my NLI task.

About this issue

  • State: open
  • Created 4 months ago
  • Comments: 24 (11 by maintainers)

Most upvoted comments

@melissafeeney my guess is that the last one of those was the solution! Basically:

  • Transformers will always use Keras 2 objects, unless you’re using an old version that doesn’t know about Keras 3 yet.
  • If you’ve installed Keras 3, you can still get Keras 2 objects, either by importing them from tf_keras or by setting TF_USE_LEGACY_KERAS=1 and importing them from tf.keras.
  • If you import from keras (not tf.keras), you will always get the current version of Keras you have installed, which will probably be Keras 3 for people using TF 2.15 / TF 2.16!

Therefore, the most likely cause of this issue is either importing directly from keras, or importing from tf.keras without setting TF_USE_LEGACY_KERAS=1. Since tf_keras is a backward compatibility package, it will always have Keras 2 objects, so it should always be safe to import from when it’s installed.
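As a rough illustration of the point about tf_keras, the three symbolic inputs from the reproduction script could be created from whichever Keras module is passed in, so that they come from the same package as the BERT layer. This is only a sketch: build_bert_inputs is a hypothetical helper name, and it assumes that tf_keras is installed and that transformers is using it.

```python
def build_bert_inputs(keras_module, seq_len=100):
    """Create the three symbolic BERT inputs from the given Keras module.

    keras_module is expected to expose `Input` (for example tf_keras);
    this helper is hypothetical and not part of transformers.
    """
    names = ("input_ids", "attention_mask", "token_type_ids")
    return {
        name: keras_module.Input(shape=(seq_len,), dtype="int32", name=name)
        for name in names
    }

# Possible usage, assuming `pip install tf-keras` and the `bert` model
# from the reproduction script:
#   import tf_keras
#   inputs = build_bert_inputs(tf_keras)
#   embeddings = bert.layers[0](**inputs)
```

Passing the module explicitly keeps the choice of Keras version in one place instead of depending on what tf.keras resolves to in a given environment.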

Any time you see an error like this:

Data of type <class 'keras.src.backend.common.keras_tensor.KerasTensor'> is not allowed only (<class 'tensorflow.python.framework.tensor.Tensor'>,

It is a sign that a Keras 3 object is being passed to a Keras 2 class that doesn’t understand it. As a result of TensorFlow’s somewhat chaotic transition, these mixups will happen! We’ve tried to update transformers so that it correctly sets TF to use Keras 2, but it’s still possible to create Keras 3 objects depending on your workflow.
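One quick way to spot this kind of mixup is to check which top-level package defines each object. The helper names below are hypothetical, and the standard-library objects are stand-ins so that the snippet runs anywhere. In the reproduction script you would pass the Keras input tensor and the BERT layer instead.

```python
from collections import OrderedDict
from fractions import Fraction


def top_level_package(obj):
    """Return the top-level package that defines the class of obj."""
    return type(obj).__module__.split(".")[0]


def same_origin(a, b):
    """True if both objects' classes are defined in the same top-level package."""
    return top_level_package(a) == top_level_package(b)


# Stand-in objects from two different packages
print(top_level_package(OrderedDict()), top_level_package(Fraction(1, 2)))
print(same_origin(OrderedDict(), Fraction(1, 2)))

# In the issue's script one might compare (assumption, not run here):
#   same_origin(input_ids, bert.layers[0])
```

If the two packages differ (for example keras and tf_keras, both of which appear in the traceback above), the input and the layer were most likely built by different Keras installations.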

Here’s what I suggest as a general solution to these kinds of problems:

  • First, try installing transformers from main with pip install git+https://github.com/huggingface/transformers.git to see if it fixes the problem. Note that the version on main is newer than the latest release version, 4.38.2. These fixes have not been included in a released version yet!
  • If it still doesn’t fix the problem, try pip install tf-keras
  • If that still doesn’t fix the problem, try setting the environment variable TF_USE_LEGACY_KERAS=1
  • If that still doesn’t fix the problem, check your code and make sure you’re using tf.keras and not directly using Keras 3 like import keras or from keras import x
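For the environment-variable step, here is a minimal sketch of setting the flag from Python. It assumes TensorFlow reads TF_USE_LEGACY_KERAS when it is first imported, so the flag has to be set before any import of tensorflow or transformers. enable_legacy_keras is a hypothetical helper name.

```python
import os
import sys


def enable_legacy_keras():
    """Set TF_USE_LEGACY_KERAS=1.

    Returns False if TensorFlow was already imported, in which case the
    flag is probably too late to take effect in this process.
    """
    already_imported = "tensorflow" in sys.modules
    os.environ["TF_USE_LEGACY_KERAS"] = "1"
    return not already_imported


if not enable_legacy_keras():
    print("TensorFlow was imported before the flag was set; "
          "restart the runtime and set the flag first.")

# Only import tensorflow / transformers after this point.
```

In Colab, this would go in the first cell, before any other imports.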

That’s strange - I can reproduce your issue with tf-keras installed, but not without it! One other test - can you try pip install transformers==4.37.2? That was the last version before the Keras 3 compatibility fix.
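When comparing environments like this, it can help to print exactly which versions are installed. This sketch uses only the standard library. installed_version is a hypothetical helper, and the package list is just the set of packages mentioned in this thread.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of a package, or None if it is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Packages discussed in this thread; None means "not installed"
for name in ("transformers", "tensorflow", "keras", "tf-keras"):
    print(f"{name}: {installed_version(name)}")
```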

I am facing the same error.