transformers: Suddenly unable to create BERT encodings
System Info
- Google Colab, Python 3.10.12
- transformers 4.38.2
- tensorflow 2.15.0
- datasets 2.18.0
Who can help?
@gante and @Rocketknight1
Information
- The official example scripts
- My own modified scripts
Tasks
- An officially supported task in the `examples` folder (such as GLUE/SQuAD, …)
- My own task or dataset (give details below)
Reproduction
I am trying to fine-tune TFBert for an NLI task, with additional dense layers added on top of the BERT encodings. My dataset is a dataframe that contains a column with premises, a column with hypotheses, and a column for the label (either entailment or contradiction).
In the past I have used similar code with TFBert to encode my dataset and fine-tune an NLI model without issue. The line that raises the error is `embeddings = bert.layers[0](input_ids, attention_mask, token_type_ids)`, and I cannot figure out why, since I have used this line in similar code before without any problems (the main difference was that my previous code converted the dataset to Dataset format after tokenizing, not before). The code was working last week, and it seems there was a transformers 4.38.2 release about 5 days ago; could this issue be related to that new release?
Here is my complete code:
##### 1. Load tokenizer
import numpy as np
import pandas as pd
import tensorflow as tf
from datasets import Dataset
from transformers import BertTokenizer

transformersPreTrainedModelName = 'bert-base-uncased'
bert_tokenizer = BertTokenizer.from_pretrained(transformersPreTrainedModelName)

##### 2. Load model
from transformers import TFBertModel
bert = TFBertModel.from_pretrained(transformersPreTrainedModelName, output_hidden_states = True)
data = pd.read_excel('/content/nli_fine_tuning.xlsx', sheet_name = 'fine tuning 4')
# Split into train, validate and test datasets
train, val, test = np.split(data.sample(frac = 1, random_state = 123), [int(.6*len(data)), int(.8*len(data))])
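# (splitting at the 60% and 80% marks yields a 60/20/20 train/val/test split)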
# Clean up datasets, convert to Dataset format
train_dataset = Dataset.from_pandas(train)
train_dataset = train_dataset.remove_columns(["__index_level_0__"])
val_dataset = Dataset.from_pandas(val)
val_dataset = val_dataset.remove_columns(["__index_level_0__"])
test_dataset = Dataset.from_pandas(test)
test_dataset = test_dataset.remove_columns(["__index_level_0__"])
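# Note: Dataset.from_pandas() carries over the shuffled dataframe index as an
# '__index_level_0__' column, which is why that column is dropped above.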
# Tokenize the datasets
def tokenize_data(data, tokenizer):
    encoded_data = tokenizer(data['premise'], data['hypothesis'],
                             max_length = 100,
                             truncation = True,
                             padding = 'max_length',
                             add_special_tokens = True,
                             return_token_type_ids = True,
                             return_attention_mask = True,
                             return_tensors = 'tf')
    # One-hot encode the entailment/contradiction labels into an (n, 2) array
    labels = np.array(pd.get_dummies(data['label']))
    return encoded_data, labels
# Apply tokenizer to train, val and test
train_encoded_data, train_labels = tokenize_data(train_dataset, bert_tokenizer)
val_encoded_data, val_labels = tokenize_data(val_dataset, bert_tokenizer)
test_encoded_data, test_labels = tokenize_data(test_dataset, bert_tokenizer)
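# Note: with return_tensors = 'tf', each call above returns a BatchEncoding whose
# 'input_ids', 'attention_mask' and 'token_type_ids' entries are tf.Tensors of
# shape (num_examples, 100), matching the Input shapes defined below.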
# Model setup
input_ids = tf.keras.Input(shape = (100,),dtype=tf.int32, name = 'input_ids')
attention_mask = tf.keras.Input(shape = (100,),dtype=tf.int32, name = 'attention_mask')
token_type_ids = tf.keras.Input(shape = (100,),dtype=tf.int32, name = 'token_type_ids')
# create BERT embeddings
embeddings = bert.layers[0](input_ids, attention_mask, token_type_ids)
last_hidden_states = embeddings.last_hidden_state # extract the last hidden state
X = tf.keras.layers.Dense(32, activation = 'relu')(last_hidden_states) # Dense layers for classification
y = tf.keras.layers.Dense(2, activation = 'softmax')(X)
finetuned_bert_model = tf.keras.Model(inputs = [input_ids, attention_mask, token_type_ids], outputs = y)
# Freeze the BERT layer (layers[3] is the BERT main layer, after the three Input layers)
finetuned_bert_model.layers[3].trainable = False
# Compile
finetuned_bert_model.compile(optimizer = 'adam', loss = 'categorical_crossentropy', metrics = ['accuracy'])
bert_hist = finetuned_bert_model.fit(dict(train_encoded_data), train_labels,
                                     validation_data = (dict(val_encoded_data), val_labels),
                                     epochs = 5)
And here is the printout of the error:
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-17-75f3b98408a6> in <cell line: 6>()
4
5 # form BERT embeddings
----> 6 embeddings = bert.layers[0](input_ids, attention_mask, token_type_ids)
7 # then extract the last hidden state
8 last_hidden_states = embeddings.last_hidden_state
/usr/local/lib/python3.10/dist-packages/tf_keras/src/utils/traceback_utils.py in error_handler(*args, **kwargs)
68 # To get the full stack trace, call:
69 # `tf.debugging.disable_traceback_filtering()`
---> 70 raise e.with_traceback(filtered_tb) from None
71 finally:
72 del filtered_tb
/usr/local/lib/python3.10/dist-packages/transformers/modeling_tf_utils.py in run_call_with_unpacked_inputs(self, *args, **kwargs)
426
427 unpacked_inputs = input_processing(func, config, **fn_args_and_kwargs)
--> 428 return func(self, **unpacked_inputs)
429
430 # Keras enforces the first layer argument to be passed, and checks it through `inspect.getfullargspec()`. This
/usr/local/lib/python3.10/dist-packages/transformers/models/bert/modeling_tf_bert.py in call(self, input_ids, attention_mask, token_type_ids, position_ids, head_mask, inputs_embeds, encoder_hidden_states, encoder_attention_mask, past_key_values, use_cache, output_attentions, output_hidden_states, return_dict, training)
910 token_type_ids = tf.fill(dims=input_shape, value=0)
911
--> 912 embedding_output = self.embeddings(
913 input_ids=input_ids,
914 position_ids=position_ids,
/usr/local/lib/python3.10/dist-packages/transformers/models/bert/modeling_tf_bert.py in call(self, input_ids, position_ids, token_type_ids, inputs_embeds, past_key_values_length, training)
204
205 if input_ids is not None:
--> 206 check_embeddings_within_bounds(input_ids, self.config.vocab_size)
207 inputs_embeds = tf.gather(params=self.weight, indices=input_ids)
208
/usr/local/lib/python3.10/dist-packages/transformers/tf_utils.py in check_embeddings_within_bounds(tensor, embed_dim, tensor_name)
161 tensor_name (`str`, *optional*): The name of the tensor to use in the error message.
162 """
--> 163 tf.debugging.assert_less(
164 tensor,
165 tf.cast(embed_dim, dtype=tensor.dtype),
/usr/local/lib/python3.10/dist-packages/keras/src/layers/core/tf_op_layer.py in handle(self, op, args, kwargs)
117 for x in tf.nest.flatten([args, kwargs])
118 ):
--> 119 return TFOpLambda(op)(*args, **kwargs)
120 else:
121 return self.NOT_SUPPORTED
/usr/local/lib/python3.10/dist-packages/keras/src/utils/traceback_utils.py in error_handler(*args, **kwargs)
68 # To get the full stack trace, call:
69 # `tf.debugging.disable_traceback_filtering()`
---> 70 raise e.with_traceback(filtered_tb) from None
71 finally:
72 del filtered_tb
TypeError: Exception encountered when calling layer 'embeddings' (type TFBertEmbeddings).
Could not build a TypeSpec for name: "tf.debugging.assert_less/assert_less/Assert/Assert"
op: "Assert"
input: "tf.debugging.assert_less/assert_less/All"
input: "tf.debugging.assert_less/assert_less/Assert/Assert/data_0"
input: "tf.debugging.assert_less/assert_less/Assert/Assert/data_1"
input: "tf.debugging.assert_less/assert_less/Assert/Assert/data_2"
input: "Placeholder"
input: "tf.debugging.assert_less/assert_less/Assert/Assert/data_4"
input: "tf.debugging.assert_less/assert_less/y"
attr {
key: "T"
value {
list {
type: DT_STRING
type: DT_STRING
type: DT_STRING
type: DT_INT32
type: DT_STRING
type: DT_INT32
}
}
}
attr {
key: "summarize"
value {
i: 3
}
}
of unsupported type <class 'tensorflow.python.framework.ops.Operation'>.
Call arguments received by layer 'embeddings' (type TFBertEmbeddings):
• input_ids=<KerasTensor: shape=(None, 100) dtype=int32 (created by layer 'input_ids')>
• position_ids=None
• token_type_ids=<KerasTensor: shape=(None, 100) dtype=int32 (created by layer 'token_type_ids')>
• inputs_embeds=None
• past_key_values_length=0
• training=False
Expected behavior
I would expect `bert.layers[0](input_ids, attention_mask, token_type_ids)` to create the encodings used to train the model, after which I could add other types of layers on top to accomplish my NLI task.
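For reference, the pattern I expect is sketched below, with the model called directly instead of through `bert.layers[0]` (it uses the same Input tensors defined above and assumes legacy Keras 2 behavior; this is an illustration, not output from the failing run):

```python
# Sketch: build the same graph by calling the TFBertModel with keyword
# arguments rather than indexing into its layers.
outputs = bert(input_ids=input_ids,
               attention_mask=attention_mask,
               token_type_ids=token_type_ids)
last_hidden_states = outputs.last_hidden_state  # shape (batch, 100, hidden_size)
```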
About this issue
- State: open
- Created 4 months ago
- Comments: 24 (11 by maintainers)
@melissafeeney my guess is that the last one of those was the solution! Basically:
- You can get Keras 2 objects either by installing `tf_keras`, or by setting `TF_USE_LEGACY_KERAS=1` and importing them from `tf.keras`.
- If you import directly from `keras` (not `tf.keras`), you will always get the current version of Keras you have installed, which will probably be Keras 3 for people using TF 2.15 / TF 2.16!

Therefore, the most likely cause of this issue is either importing directly from `keras`, or importing from `tf.keras` without setting `TF_USE_LEGACY_KERAS=1`. Since `tf_keras` is a backward-compatibility package, it will always have Keras 2 objects, so it should always be safe to import from when it's installed.

Any time you see an error like the `Could not build a TypeSpec ... of unsupported type` one in the traceback above, it is a sign that a Keras 3 object is being passed to a Keras 2 class that doesn't understand it. As a result of TensorFlow's somewhat chaotic transition, these mixups will happen! We've tried to update `transformers` so that it correctly sets TF to use Keras 2, but it's still possible to create Keras 3 objects depending on your workflow; a minimal illustration of the mixup follows.
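This sketch is illustrative only; it assumes a runtime like recent Colab images, where the installed `keras` package is Keras 3 and the `tf-keras` package provides Keras 2:

```python
# Keras 3 and Keras 2 live in different packages on such a runtime.
import keras      # resolves to the installed Keras, here Keras 3
import tf_keras   # the Keras 2 backward-compatibility package

print(keras.__version__)     # e.g. 3.x
print(tf_keras.__version__)  # e.g. 2.15.x

# A symbolic tensor created with Keras 3...
x = keras.Input(shape=(100,), dtype="int32")
print(type(x))  # a Keras 3 KerasTensor

# ...is not a type that Keras 2 layers and models (including transformers'
# TF models) understand, which is what produces TypeSpec/unsupported-type
# errors like the one in the traceback above.
```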
Here's what I suggest as a general solution to these kinds of problems (a sketch of the resulting setup follows the list):
1. First, try installing `transformers` from `main` with `pip install git+https://github.com/huggingface/transformers.git` to see if it fixes the problem. Note that the version on `main` is newer than the latest release version, 4.38.2. These fixes have not been included in a released version yet!
2. `pip install tf-keras`
3. Set `TF_USE_LEGACY_KERAS=1`.
4. Make sure you're importing from `tf.keras` and not directly using Keras 3, like `import keras` or `from keras import x`.
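A minimal sketch of that setup in a fresh Colab runtime (the `!pip` lines are Colab shell syntax; the point is that the environment variable is set before TensorFlow is first imported):

```python
# Run in a fresh runtime so TensorFlow has not been imported yet.
# !pip install git+https://github.com/huggingface/transformers.git
# !pip install tf-keras

import os
os.environ["TF_USE_LEGACY_KERAS"] = "1"  # must be set BEFORE importing tensorflow

import tensorflow as tf
print(tf.keras.__version__)  # should report a 2.x (legacy Keras) version
```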
That's strange - I can reproduce your issue with `tf-keras` installed, but not without it! One other test: can you try `pip install transformers==4.37.2`? That was the last version before the Keras 3 compatibility fix.

I am facing the same error.