transformers: Covid-19 - TPU V3-1024 - T5 11B: TensorFlow to PyTorch conversion failed
We are training a large-scale T5-11B model on a TPU V3-1024 for a Covid-19 project. We tried to convert the TensorFlow checkpoint to PyTorch, but the conversion failed. Could you please help us figure out the problem? This model is very important for Covid-19 research.
🐛 Bug
Information
Model I am using (Bert, XLNet …): T5
Language I am using the model on (English, Chinese …): Protein Sequences
The problem arises when using:
- the official example scripts: (give details below)
- my own modified scripts: (give details below)
The task I am working on is:
- an official GLUE/SQuAD task: (give the name)
- my own task or dataset: (give details below)
To reproduce
Steps to reproduce the behavior:
- The config file:
{
"architectures": [
"T5WithLMHeadModel"
],
"d_ff": 65536,
"d_kv": 128,
"d_model": 1024,
"decoder_start_token_id": 0,
"dropout_rate": 0.1,
"eos_token_id": 1,
"initializer_factor": 1.0,
"is_encoder_decoder": true,
"layer_norm_epsilon": 1e-06,
"model_type": "t5",
"n_positions": 512,
"num_heads": 128,
"num_layers": 24,
"output_past": true,
"pad_token_id": 0,
"relative_attention_num_buckets": 32,
"vocab_size": 128
}
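As a quick sanity check on this config, the sketch below derives the kernel shapes it implies, so they can be compared against the shapes reported in the conversion log. The helper `expected_shapes` is hypothetical (it is not part of transformers), and the `[in, out]` layout is an assumption read off the TF variable shapes in the log.

```python
# Hypothetical helper, not part of transformers: derive the TF kernel shapes
# implied by the config values listed above.
config = {
    "d_model": 1024,
    "d_kv": 128,
    "d_ff": 65536,
    "num_heads": 128,
    "relative_attention_num_buckets": 32,
}


def expected_shapes(cfg):
    """Return the expected [in, out] shape for each kind of weight."""
    d_model = cfg["d_model"]
    d_ff = cfg["d_ff"]
    # Attention projections map d_model to num_heads * d_kv and back.
    inner_dim = cfg["num_heads"] * cfg["d_kv"]
    return {
        "SelfAttention/q": [d_model, inner_dim],
        "SelfAttention/k": [d_model, inner_dim],
        "SelfAttention/v": [d_model, inner_dim],
        "SelfAttention/o": [inner_dim, d_model],
        "SelfAttention/relative_attention_bias": [
            cfg["num_heads"],
            cfg["relative_attention_num_buckets"],
        ],
        "DenseReluDense/wi/kernel": [d_model, d_ff],
        "DenseReluDense/wo/kernel": [d_ff, d_model],
    }


for name, shape in expected_shapes(config).items():
    print(name, shape)
```

With the values above, the attention inner dimension is 128 * 128 = 16384, which matches the `[1024, 16384]` and `[16384, 1024]` shapes in the log, so the config itself appears consistent with the checkpoint.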
- conversion command:
python convert_t5_original_tf_checkpoint_to_pytorch.py \
--tf_checkpoint_path xxx/tensorflow \
--config_file xxx/t5-11b-config.json \
--pytorch_dump_path xxx/pytorch
- Error:
Building PyTorch model from configuration: T5Config {
"architectures": [
"T5WithLMHeadModel"
],
"d_ff": 65536,
"d_kv": 128,
"d_model": 1024,
"decoder_start_token_id": 0,
"dropout_rate": 0.1,
"eos_token_id": 1,
"initializer_factor": 1.0,
"is_encoder_decoder": true,
"layer_norm_epsilon": 1e-06,
"model_type": "t5",
"n_positions": 512,
"num_heads": 128,
"num_layers": 24,
"output_past": true,
"pad_token_id": 0,
"relative_attention_num_buckets": 32,
"vocab_size": 128
}
INFO:transformers.modeling_t5:Converting TensorFlow checkpoint from /mnt/lsf-nas-1/lsf/job/repo/elnaggar/prot-transformers/models/t5/tensorflow/bfd100
INFO:transformers.modeling_t5:Loading TF weight decoder/block_000/layer_000/SelfAttention/k with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_000/layer_000/SelfAttention/k_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_000/layer_000/SelfAttention/k_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_000/layer_000/SelfAttention/o with shape [16384, 1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_000/layer_000/SelfAttention/o_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_000/layer_000/SelfAttention/o_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_000/layer_000/SelfAttention/q with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_000/layer_000/SelfAttention/q_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_000/layer_000/SelfAttention/q_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_000/layer_000/SelfAttention/relative_attention_bias with shape [128, 32]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_000/layer_000/SelfAttention/relative_attention_bias_slot_v with shape [128, 32]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_000/layer_000/SelfAttention/v with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_000/layer_000/SelfAttention/v_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_000/layer_000/SelfAttention/v_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_000/layer_000/rms_norm/scale with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_000/layer_000/rms_norm/scale_slot_v with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_000/layer_001/EncDecAttention/k with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_000/layer_001/EncDecAttention/k_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_000/layer_001/EncDecAttention/k_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_000/layer_001/EncDecAttention/o with shape [16384, 1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_000/layer_001/EncDecAttention/o_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_000/layer_001/EncDecAttention/o_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_000/layer_001/EncDecAttention/q with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_000/layer_001/EncDecAttention/q_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_000/layer_001/EncDecAttention/q_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_000/layer_001/EncDecAttention/v with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_000/layer_001/EncDecAttention/v_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_000/layer_001/EncDecAttention/v_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_000/layer_001/rms_norm/scale with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_000/layer_001/rms_norm/scale_slot_v with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_000/layer_002/DenseReluDense/wi/kernel with shape [1024, 65536]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_000/layer_002/DenseReluDense/wi/kernel_slot_vc with shape [65536]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_000/layer_002/DenseReluDense/wi/kernel_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_000/layer_002/DenseReluDense/wo/kernel with shape [65536, 1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_000/layer_002/DenseReluDense/wo/kernel_slot_vc with shape [65536]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_000/layer_002/DenseReluDense/wo/kernel_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_000/layer_002/rms_norm/scale with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_000/layer_002/rms_norm/scale_slot_v with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_001/layer_000/SelfAttention/k with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_001/layer_000/SelfAttention/k_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_001/layer_000/SelfAttention/k_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_001/layer_000/SelfAttention/o with shape [16384, 1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_001/layer_000/SelfAttention/o_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_001/layer_000/SelfAttention/o_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_001/layer_000/SelfAttention/q with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_001/layer_000/SelfAttention/q_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_001/layer_000/SelfAttention/q_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_001/layer_000/SelfAttention/v with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_001/layer_000/SelfAttention/v_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_001/layer_000/SelfAttention/v_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_001/layer_000/rms_norm/scale with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_001/layer_000/rms_norm/scale_slot_v with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_001/layer_001/EncDecAttention/k with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_001/layer_001/EncDecAttention/k_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_001/layer_001/EncDecAttention/k_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_001/layer_001/EncDecAttention/o with shape [16384, 1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_001/layer_001/EncDecAttention/o_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_001/layer_001/EncDecAttention/o_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_001/layer_001/EncDecAttention/q with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_001/layer_001/EncDecAttention/q_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_001/layer_001/EncDecAttention/q_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_001/layer_001/EncDecAttention/v with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_001/layer_001/EncDecAttention/v_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_001/layer_001/EncDecAttention/v_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_001/layer_001/rms_norm/scale with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_001/layer_001/rms_norm/scale_slot_v with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_001/layer_002/DenseReluDense/wi/kernel with shape [1024, 65536]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_001/layer_002/DenseReluDense/wi/kernel_slot_vc with shape [65536]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_001/layer_002/DenseReluDense/wi/kernel_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_001/layer_002/DenseReluDense/wo/kernel with shape [65536, 1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_001/layer_002/DenseReluDense/wo/kernel_slot_vc with shape [65536]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_001/layer_002/DenseReluDense/wo/kernel_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_001/layer_002/rms_norm/scale with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_001/layer_002/rms_norm/scale_slot_v with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_002/layer_000/SelfAttention/k with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_002/layer_000/SelfAttention/k_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_002/layer_000/SelfAttention/k_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_002/layer_000/SelfAttention/o with shape [16384, 1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_002/layer_000/SelfAttention/o_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_002/layer_000/SelfAttention/o_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_002/layer_000/SelfAttention/q with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_002/layer_000/SelfAttention/q_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_002/layer_000/SelfAttention/q_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_002/layer_000/SelfAttention/v with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_002/layer_000/SelfAttention/v_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_002/layer_000/SelfAttention/v_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_002/layer_000/rms_norm/scale with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_002/layer_000/rms_norm/scale_slot_v with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_002/layer_001/EncDecAttention/k with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_002/layer_001/EncDecAttention/k_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_002/layer_001/EncDecAttention/k_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_002/layer_001/EncDecAttention/o with shape [16384, 1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_002/layer_001/EncDecAttention/o_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_002/layer_001/EncDecAttention/o_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_002/layer_001/EncDecAttention/q with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_002/layer_001/EncDecAttention/q_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_002/layer_001/EncDecAttention/q_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_002/layer_001/EncDecAttention/v with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_002/layer_001/EncDecAttention/v_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_002/layer_001/EncDecAttention/v_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_002/layer_001/rms_norm/scale with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_002/layer_001/rms_norm/scale_slot_v with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_002/layer_002/DenseReluDense/wi/kernel with shape [1024, 65536]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_002/layer_002/DenseReluDense/wi/kernel_slot_vc with shape [65536]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_002/layer_002/DenseReluDense/wi/kernel_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_002/layer_002/DenseReluDense/wo/kernel with shape [65536, 1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_002/layer_002/DenseReluDense/wo/kernel_slot_vc with shape [65536]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_002/layer_002/DenseReluDense/wo/kernel_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_002/layer_002/rms_norm/scale with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_002/layer_002/rms_norm/scale_slot_v with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_003/layer_000/SelfAttention/k with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_003/layer_000/SelfAttention/k_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_003/layer_000/SelfAttention/k_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_003/layer_000/SelfAttention/o with shape [16384, 1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_003/layer_000/SelfAttention/o_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_003/layer_000/SelfAttention/o_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_003/layer_000/SelfAttention/q with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_003/layer_000/SelfAttention/q_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_003/layer_000/SelfAttention/q_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_003/layer_000/SelfAttention/v with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_003/layer_000/SelfAttention/v_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_003/layer_000/SelfAttention/v_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_003/layer_000/rms_norm/scale with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_003/layer_000/rms_norm/scale_slot_v with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_003/layer_001/EncDecAttention/k with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_003/layer_001/EncDecAttention/k_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_003/layer_001/EncDecAttention/k_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_003/layer_001/EncDecAttention/o with shape [16384, 1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_003/layer_001/EncDecAttention/o_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_003/layer_001/EncDecAttention/o_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_003/layer_001/EncDecAttention/q with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_003/layer_001/EncDecAttention/q_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_003/layer_001/EncDecAttention/q_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_003/layer_001/EncDecAttention/v with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_003/layer_001/EncDecAttention/v_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_003/layer_001/EncDecAttention/v_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_003/layer_001/rms_norm/scale with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_003/layer_001/rms_norm/scale_slot_v with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_003/layer_002/DenseReluDense/wi/kernel with shape [1024, 65536]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_003/layer_002/DenseReluDense/wi/kernel_slot_vc with shape [65536]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_003/layer_002/DenseReluDense/wi/kernel_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_003/layer_002/DenseReluDense/wo/kernel with shape [65536, 1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_003/layer_002/DenseReluDense/wo/kernel_slot_vc with shape [65536]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_003/layer_002/DenseReluDense/wo/kernel_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_003/layer_002/rms_norm/scale with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_003/layer_002/rms_norm/scale_slot_v with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_004/layer_000/SelfAttention/k with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_004/layer_000/SelfAttention/k_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_004/layer_000/SelfAttention/k_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_004/layer_000/SelfAttention/o with shape [16384, 1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_004/layer_000/SelfAttention/o_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_004/layer_000/SelfAttention/o_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_004/layer_000/SelfAttention/q with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_004/layer_000/SelfAttention/q_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_004/layer_000/SelfAttention/q_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_004/layer_000/SelfAttention/v with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_004/layer_000/SelfAttention/v_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_004/layer_000/SelfAttention/v_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_004/layer_000/rms_norm/scale with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_004/layer_000/rms_norm/scale_slot_v with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_004/layer_001/EncDecAttention/k with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_004/layer_001/EncDecAttention/k_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_004/layer_001/EncDecAttention/k_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_004/layer_001/EncDecAttention/o with shape [16384, 1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_004/layer_001/EncDecAttention/o_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_004/layer_001/EncDecAttention/o_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_004/layer_001/EncDecAttention/q with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_004/layer_001/EncDecAttention/q_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_004/layer_001/EncDecAttention/q_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_004/layer_001/EncDecAttention/v with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_004/layer_001/EncDecAttention/v_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_004/layer_001/EncDecAttention/v_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_004/layer_001/rms_norm/scale with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_004/layer_001/rms_norm/scale_slot_v with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_004/layer_002/DenseReluDense/wi/kernel with shape [1024, 65536]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_004/layer_002/DenseReluDense/wi/kernel_slot_vc with shape [65536]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_004/layer_002/DenseReluDense/wi/kernel_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_004/layer_002/DenseReluDense/wo/kernel with shape [65536, 1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_004/layer_002/DenseReluDense/wo/kernel_slot_vc with shape [65536]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_004/layer_002/DenseReluDense/wo/kernel_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_004/layer_002/rms_norm/scale with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_004/layer_002/rms_norm/scale_slot_v with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_005/layer_000/SelfAttention/k with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_005/layer_000/SelfAttention/k_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_005/layer_000/SelfAttention/k_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_005/layer_000/SelfAttention/o with shape [16384, 1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_005/layer_000/SelfAttention/o_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_005/layer_000/SelfAttention/o_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_005/layer_000/SelfAttention/q with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_005/layer_000/SelfAttention/q_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_005/layer_000/SelfAttention/q_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_005/layer_000/SelfAttention/v with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_005/layer_000/SelfAttention/v_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_005/layer_000/SelfAttention/v_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_005/layer_000/rms_norm/scale with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_005/layer_000/rms_norm/scale_slot_v with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_005/layer_001/EncDecAttention/k with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_005/layer_001/EncDecAttention/k_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_005/layer_001/EncDecAttention/k_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_005/layer_001/EncDecAttention/o with shape [16384, 1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_005/layer_001/EncDecAttention/o_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_005/layer_001/EncDecAttention/o_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_005/layer_001/EncDecAttention/q with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_005/layer_001/EncDecAttention/q_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_005/layer_001/EncDecAttention/q_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_005/layer_001/EncDecAttention/v with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_005/layer_001/EncDecAttention/v_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_005/layer_001/EncDecAttention/v_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_005/layer_001/rms_norm/scale with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_005/layer_001/rms_norm/scale_slot_v with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_005/layer_002/DenseReluDense/wi/kernel with shape [1024, 65536]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_005/layer_002/DenseReluDense/wi/kernel_slot_vc with shape [65536]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_005/layer_002/DenseReluDense/wi/kernel_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_005/layer_002/DenseReluDense/wo/kernel with shape [65536, 1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_005/layer_002/DenseReluDense/wo/kernel_slot_vc with shape [65536]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_005/layer_002/DenseReluDense/wo/kernel_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_005/layer_002/rms_norm/scale with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_005/layer_002/rms_norm/scale_slot_v with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_006/layer_000/SelfAttention/k with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_006/layer_000/SelfAttention/k_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_006/layer_000/SelfAttention/k_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_006/layer_000/SelfAttention/o with shape [16384, 1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_006/layer_000/SelfAttention/o_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_006/layer_000/SelfAttention/o_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_006/layer_000/SelfAttention/q with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_006/layer_000/SelfAttention/q_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_006/layer_000/SelfAttention/q_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_006/layer_000/SelfAttention/v with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_006/layer_000/SelfAttention/v_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_006/layer_000/SelfAttention/v_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_006/layer_000/rms_norm/scale with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_006/layer_000/rms_norm/scale_slot_v with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_006/layer_001/EncDecAttention/k with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_006/layer_001/EncDecAttention/k_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_006/layer_001/EncDecAttention/k_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_006/layer_001/EncDecAttention/o with shape [16384, 1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_006/layer_001/EncDecAttention/o_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_006/layer_001/EncDecAttention/o_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_006/layer_001/EncDecAttention/q with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_006/layer_001/EncDecAttention/q_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_006/layer_001/EncDecAttention/q_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_006/layer_001/EncDecAttention/v with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_006/layer_001/EncDecAttention/v_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_006/layer_001/EncDecAttention/v_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_006/layer_001/rms_norm/scale with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_006/layer_001/rms_norm/scale_slot_v with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_006/layer_002/DenseReluDense/wi/kernel with shape [1024, 65536]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_006/layer_002/DenseReluDense/wi/kernel_slot_vc with shape [65536]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_006/layer_002/DenseReluDense/wi/kernel_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_006/layer_002/DenseReluDense/wo/kernel with shape [65536, 1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_006/layer_002/DenseReluDense/wo/kernel_slot_vc with shape [65536]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_006/layer_002/DenseReluDense/wo/kernel_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_006/layer_002/rms_norm/scale with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_006/layer_002/rms_norm/scale_slot_v with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_007/layer_000/SelfAttention/k with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_007/layer_000/SelfAttention/k_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_007/layer_000/SelfAttention/k_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_007/layer_000/SelfAttention/o with shape [16384, 1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_007/layer_000/SelfAttention/o_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_007/layer_000/SelfAttention/o_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_007/layer_000/SelfAttention/q with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_007/layer_000/SelfAttention/q_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_007/layer_000/SelfAttention/q_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_007/layer_000/SelfAttention/v with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_007/layer_000/SelfAttention/v_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_007/layer_000/SelfAttention/v_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_007/layer_000/rms_norm/scale with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_007/layer_000/rms_norm/scale_slot_v with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_007/layer_001/EncDecAttention/k with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_007/layer_001/EncDecAttention/k_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_007/layer_001/EncDecAttention/k_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_007/layer_001/EncDecAttention/o with shape [16384, 1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_007/layer_001/EncDecAttention/o_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_007/layer_001/EncDecAttention/o_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_007/layer_001/EncDecAttention/q with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_007/layer_001/EncDecAttention/q_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_007/layer_001/EncDecAttention/q_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_007/layer_001/EncDecAttention/v with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_007/layer_001/EncDecAttention/v_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_007/layer_001/EncDecAttention/v_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_007/layer_001/rms_norm/scale with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_007/layer_001/rms_norm/scale_slot_v with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_007/layer_002/DenseReluDense/wi/kernel with shape [1024, 65536]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_007/layer_002/DenseReluDense/wi/kernel_slot_vc with shape [65536]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_007/layer_002/DenseReluDense/wi/kernel_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_007/layer_002/DenseReluDense/wo/kernel with shape [65536, 1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_007/layer_002/DenseReluDense/wo/kernel_slot_vc with shape [65536]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_007/layer_002/DenseReluDense/wo/kernel_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_007/layer_002/rms_norm/scale with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_007/layer_002/rms_norm/scale_slot_v with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_008/layer_000/SelfAttention/k with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_008/layer_000/SelfAttention/k_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_008/layer_000/SelfAttention/k_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_008/layer_000/SelfAttention/o with shape [16384, 1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_008/layer_000/SelfAttention/o_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_008/layer_000/SelfAttention/o_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_008/layer_000/SelfAttention/q with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_008/layer_000/SelfAttention/q_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_008/layer_000/SelfAttention/q_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_008/layer_000/SelfAttention/v with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_008/layer_000/SelfAttention/v_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_008/layer_000/SelfAttention/v_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_008/layer_000/rms_norm/scale with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_008/layer_000/rms_norm/scale_slot_v with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_008/layer_001/EncDecAttention/k with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_008/layer_001/EncDecAttention/k_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_008/layer_001/EncDecAttention/k_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_008/layer_001/EncDecAttention/o with shape [16384, 1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_008/layer_001/EncDecAttention/o_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_008/layer_001/EncDecAttention/o_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_008/layer_001/EncDecAttention/q with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_008/layer_001/EncDecAttention/q_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_008/layer_001/EncDecAttention/q_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_008/layer_001/EncDecAttention/v with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_008/layer_001/EncDecAttention/v_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_008/layer_001/EncDecAttention/v_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_008/layer_001/rms_norm/scale with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_008/layer_001/rms_norm/scale_slot_v with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_008/layer_002/DenseReluDense/wi/kernel with shape [1024, 65536]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_008/layer_002/DenseReluDense/wi/kernel_slot_vc with shape [65536]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_008/layer_002/DenseReluDense/wi/kernel_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_008/layer_002/DenseReluDense/wo/kernel with shape [65536, 1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_008/layer_002/DenseReluDense/wo/kernel_slot_vc with shape [65536]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_008/layer_002/DenseReluDense/wo/kernel_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_008/layer_002/rms_norm/scale with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_008/layer_002/rms_norm/scale_slot_v with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_009/layer_000/SelfAttention/k with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_009/layer_000/SelfAttention/k_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_009/layer_000/SelfAttention/k_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_009/layer_000/SelfAttention/o with shape [16384, 1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_009/layer_000/SelfAttention/o_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_009/layer_000/SelfAttention/o_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_009/layer_000/SelfAttention/q with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_009/layer_000/SelfAttention/q_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_009/layer_000/SelfAttention/q_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_009/layer_000/SelfAttention/v with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_009/layer_000/SelfAttention/v_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_009/layer_000/SelfAttention/v_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_009/layer_000/rms_norm/scale with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_009/layer_000/rms_norm/scale_slot_v with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_009/layer_001/EncDecAttention/k with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_009/layer_001/EncDecAttention/k_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_009/layer_001/EncDecAttention/k_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_009/layer_001/EncDecAttention/o with shape [16384, 1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_009/layer_001/EncDecAttention/o_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_009/layer_001/EncDecAttention/o_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_009/layer_001/EncDecAttention/q with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_009/layer_001/EncDecAttention/q_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_009/layer_001/EncDecAttention/q_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_009/layer_001/EncDecAttention/v with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_009/layer_001/EncDecAttention/v_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_009/layer_001/EncDecAttention/v_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_009/layer_001/rms_norm/scale with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_009/layer_001/rms_norm/scale_slot_v with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_009/layer_002/DenseReluDense/wi/kernel with shape [1024, 65536]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_009/layer_002/DenseReluDense/wi/kernel_slot_vc with shape [65536]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_009/layer_002/DenseReluDense/wi/kernel_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_009/layer_002/DenseReluDense/wo/kernel with shape [65536, 1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_009/layer_002/DenseReluDense/wo/kernel_slot_vc with shape [65536]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_009/layer_002/DenseReluDense/wo/kernel_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_009/layer_002/rms_norm/scale with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_009/layer_002/rms_norm/scale_slot_v with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_010/layer_000/SelfAttention/k with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_010/layer_000/SelfAttention/k_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_010/layer_000/SelfAttention/k_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_010/layer_000/SelfAttention/o with shape [16384, 1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_010/layer_000/SelfAttention/o_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_010/layer_000/SelfAttention/o_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_010/layer_000/SelfAttention/q with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_010/layer_000/SelfAttention/q_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_010/layer_000/SelfAttention/q_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_010/layer_000/SelfAttention/v with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_010/layer_000/SelfAttention/v_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_010/layer_000/SelfAttention/v_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_010/layer_000/rms_norm/scale with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_010/layer_000/rms_norm/scale_slot_v with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_010/layer_001/EncDecAttention/k with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_010/layer_001/EncDecAttention/k_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_010/layer_001/EncDecAttention/k_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_010/layer_001/EncDecAttention/o with shape [16384, 1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_010/layer_001/EncDecAttention/o_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_010/layer_001/EncDecAttention/o_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_010/layer_001/EncDecAttention/q with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_010/layer_001/EncDecAttention/q_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_010/layer_001/EncDecAttention/q_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_010/layer_001/EncDecAttention/v with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_010/layer_001/EncDecAttention/v_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_010/layer_001/EncDecAttention/v_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_010/layer_001/rms_norm/scale with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_010/layer_001/rms_norm/scale_slot_v with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_010/layer_002/DenseReluDense/wi/kernel with shape [1024, 65536]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_010/layer_002/DenseReluDense/wi/kernel_slot_vc with shape [65536]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_010/layer_002/DenseReluDense/wi/kernel_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_010/layer_002/DenseReluDense/wo/kernel with shape [65536, 1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_010/layer_002/DenseReluDense/wo/kernel_slot_vc with shape [65536]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_010/layer_002/DenseReluDense/wo/kernel_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_010/layer_002/rms_norm/scale with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_010/layer_002/rms_norm/scale_slot_v with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_011/layer_000/SelfAttention/k with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_011/layer_000/SelfAttention/k_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_011/layer_000/SelfAttention/k_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_011/layer_000/SelfAttention/o with shape [16384, 1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_011/layer_000/SelfAttention/o_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_011/layer_000/SelfAttention/o_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_011/layer_000/SelfAttention/q with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_011/layer_000/SelfAttention/q_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_011/layer_000/SelfAttention/q_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_011/layer_000/SelfAttention/v with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_011/layer_000/SelfAttention/v_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_011/layer_000/SelfAttention/v_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_011/layer_000/rms_norm/scale with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_011/layer_000/rms_norm/scale_slot_v with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_011/layer_001/EncDecAttention/k with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_011/layer_001/EncDecAttention/k_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_011/layer_001/EncDecAttention/k_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_011/layer_001/EncDecAttention/o with shape [16384, 1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_011/layer_001/EncDecAttention/o_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_011/layer_001/EncDecAttention/o_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_011/layer_001/EncDecAttention/q with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_011/layer_001/EncDecAttention/q_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_011/layer_001/EncDecAttention/q_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_011/layer_001/EncDecAttention/v with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_011/layer_001/EncDecAttention/v_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_011/layer_001/EncDecAttention/v_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_011/layer_001/rms_norm/scale with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_011/layer_001/rms_norm/scale_slot_v with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_011/layer_002/DenseReluDense/wi/kernel with shape [1024, 65536]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_011/layer_002/DenseReluDense/wi/kernel_slot_vc with shape [65536]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_011/layer_002/DenseReluDense/wi/kernel_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_011/layer_002/DenseReluDense/wo/kernel with shape [65536, 1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_011/layer_002/DenseReluDense/wo/kernel_slot_vc with shape [65536]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_011/layer_002/DenseReluDense/wo/kernel_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_011/layer_002/rms_norm/scale with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_011/layer_002/rms_norm/scale_slot_v with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_012/layer_000/SelfAttention/k with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_012/layer_000/SelfAttention/k_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_012/layer_000/SelfAttention/k_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_012/layer_000/SelfAttention/o with shape [16384, 1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_012/layer_000/SelfAttention/o_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_012/layer_000/SelfAttention/o_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_012/layer_000/SelfAttention/q with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_012/layer_000/SelfAttention/q_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_012/layer_000/SelfAttention/q_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_012/layer_000/SelfAttention/v with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_012/layer_000/SelfAttention/v_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_012/layer_000/SelfAttention/v_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_012/layer_000/rms_norm/scale with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_012/layer_000/rms_norm/scale_slot_v with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_012/layer_001/EncDecAttention/k with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_012/layer_001/EncDecAttention/k_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_012/layer_001/EncDecAttention/k_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_012/layer_001/EncDecAttention/o with shape [16384, 1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_012/layer_001/EncDecAttention/o_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_012/layer_001/EncDecAttention/o_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_012/layer_001/EncDecAttention/q with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_012/layer_001/EncDecAttention/q_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_012/layer_001/EncDecAttention/q_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_012/layer_001/EncDecAttention/v with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_012/layer_001/EncDecAttention/v_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_012/layer_001/EncDecAttention/v_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_012/layer_001/rms_norm/scale with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_012/layer_001/rms_norm/scale_slot_v with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_012/layer_002/DenseReluDense/wi/kernel with shape [1024, 65536]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_012/layer_002/DenseReluDense/wi/kernel_slot_vc with shape [65536]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_012/layer_002/DenseReluDense/wi/kernel_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_012/layer_002/DenseReluDense/wo/kernel with shape [65536, 1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_012/layer_002/DenseReluDense/wo/kernel_slot_vc with shape [65536]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_012/layer_002/DenseReluDense/wo/kernel_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_012/layer_002/rms_norm/scale with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_012/layer_002/rms_norm/scale_slot_v with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_013/layer_000/SelfAttention/k with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_013/layer_000/SelfAttention/k_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_013/layer_000/SelfAttention/k_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_013/layer_000/SelfAttention/o with shape [16384, 1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_013/layer_000/SelfAttention/o_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_013/layer_000/SelfAttention/o_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_013/layer_000/SelfAttention/q with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_013/layer_000/SelfAttention/q_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_013/layer_000/SelfAttention/q_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_013/layer_000/SelfAttention/v with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_013/layer_000/SelfAttention/v_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_013/layer_000/SelfAttention/v_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_013/layer_000/rms_norm/scale with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_013/layer_000/rms_norm/scale_slot_v with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_013/layer_001/EncDecAttention/k with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_013/layer_001/EncDecAttention/k_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_013/layer_001/EncDecAttention/k_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_013/layer_001/EncDecAttention/o with shape [16384, 1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_013/layer_001/EncDecAttention/o_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_013/layer_001/EncDecAttention/o_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_013/layer_001/EncDecAttention/q with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_013/layer_001/EncDecAttention/q_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_013/layer_001/EncDecAttention/q_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_013/layer_001/EncDecAttention/v with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_013/layer_001/EncDecAttention/v_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_013/layer_001/EncDecAttention/v_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_013/layer_001/rms_norm/scale with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_013/layer_001/rms_norm/scale_slot_v with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_013/layer_002/DenseReluDense/wi/kernel with shape [1024, 65536]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_013/layer_002/DenseReluDense/wi/kernel_slot_vc with shape [65536]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_013/layer_002/DenseReluDense/wi/kernel_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_013/layer_002/DenseReluDense/wo/kernel with shape [65536, 1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_013/layer_002/DenseReluDense/wo/kernel_slot_vc with shape [65536]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_013/layer_002/DenseReluDense/wo/kernel_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_013/layer_002/rms_norm/scale with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_013/layer_002/rms_norm/scale_slot_v with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_014/layer_000/SelfAttention/k with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_014/layer_000/SelfAttention/k_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_014/layer_000/SelfAttention/k_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_014/layer_000/SelfAttention/o with shape [16384, 1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_014/layer_000/SelfAttention/o_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_014/layer_000/SelfAttention/o_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_014/layer_000/SelfAttention/q with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_014/layer_000/SelfAttention/q_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_014/layer_000/SelfAttention/q_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_014/layer_000/SelfAttention/v with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_014/layer_000/SelfAttention/v_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_014/layer_000/SelfAttention/v_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_014/layer_000/rms_norm/scale with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_014/layer_000/rms_norm/scale_slot_v with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_014/layer_001/EncDecAttention/k with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_014/layer_001/EncDecAttention/k_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_014/layer_001/EncDecAttention/k_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_014/layer_001/EncDecAttention/o with shape [16384, 1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_014/layer_001/EncDecAttention/o_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_014/layer_001/EncDecAttention/o_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_014/layer_001/EncDecAttention/q with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_014/layer_001/EncDecAttention/q_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_014/layer_001/EncDecAttention/q_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_014/layer_001/EncDecAttention/v with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_014/layer_001/EncDecAttention/v_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_014/layer_001/EncDecAttention/v_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_014/layer_001/rms_norm/scale with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_014/layer_001/rms_norm/scale_slot_v with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_014/layer_002/DenseReluDense/wi/kernel with shape [1024, 65536]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_014/layer_002/DenseReluDense/wi/kernel_slot_vc with shape [65536]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_014/layer_002/DenseReluDense/wi/kernel_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_014/layer_002/DenseReluDense/wo/kernel with shape [65536, 1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_014/layer_002/DenseReluDense/wo/kernel_slot_vc with shape [65536]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_014/layer_002/DenseReluDense/wo/kernel_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_014/layer_002/rms_norm/scale with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_014/layer_002/rms_norm/scale_slot_v with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_015/layer_000/SelfAttention/k with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_015/layer_000/SelfAttention/k_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_015/layer_000/SelfAttention/k_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_015/layer_000/SelfAttention/o with shape [16384, 1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_015/layer_000/SelfAttention/o_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_015/layer_000/SelfAttention/o_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_015/layer_000/SelfAttention/q with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_015/layer_000/SelfAttention/q_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_015/layer_000/SelfAttention/q_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_015/layer_000/SelfAttention/v with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_015/layer_000/SelfAttention/v_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_015/layer_000/SelfAttention/v_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_015/layer_000/rms_norm/scale with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_015/layer_000/rms_norm/scale_slot_v with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_015/layer_001/EncDecAttention/k with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_015/layer_001/EncDecAttention/k_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_015/layer_001/EncDecAttention/k_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_015/layer_001/EncDecAttention/o with shape [16384, 1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_015/layer_001/EncDecAttention/o_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_015/layer_001/EncDecAttention/o_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_015/layer_001/EncDecAttention/q with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_015/layer_001/EncDecAttention/q_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_015/layer_001/EncDecAttention/q_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_015/layer_001/EncDecAttention/v with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_015/layer_001/EncDecAttention/v_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_015/layer_001/EncDecAttention/v_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_015/layer_001/rms_norm/scale with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_015/layer_001/rms_norm/scale_slot_v with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_015/layer_002/DenseReluDense/wi/kernel with shape [1024, 65536]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_015/layer_002/DenseReluDense/wi/kernel_slot_vc with shape [65536]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_015/layer_002/DenseReluDense/wi/kernel_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_015/layer_002/DenseReluDense/wo/kernel with shape [65536, 1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_015/layer_002/DenseReluDense/wo/kernel_slot_vc with shape [65536]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_015/layer_002/DenseReluDense/wo/kernel_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_015/layer_002/rms_norm/scale with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_015/layer_002/rms_norm/scale_slot_v with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_016/layer_000/SelfAttention/k with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_016/layer_000/SelfAttention/k_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_016/layer_000/SelfAttention/k_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_016/layer_000/SelfAttention/o with shape [16384, 1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_016/layer_000/SelfAttention/o_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_016/layer_000/SelfAttention/o_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_016/layer_000/SelfAttention/q with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_016/layer_000/SelfAttention/q_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_016/layer_000/SelfAttention/q_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_016/layer_000/SelfAttention/v with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_016/layer_000/SelfAttention/v_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_016/layer_000/SelfAttention/v_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_016/layer_000/rms_norm/scale with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_016/layer_000/rms_norm/scale_slot_v with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_016/layer_001/EncDecAttention/k with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_016/layer_001/EncDecAttention/k_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_016/layer_001/EncDecAttention/k_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_016/layer_001/EncDecAttention/o with shape [16384, 1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_016/layer_001/EncDecAttention/o_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_016/layer_001/EncDecAttention/o_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_016/layer_001/EncDecAttention/q with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_016/layer_001/EncDecAttention/q_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_016/layer_001/EncDecAttention/q_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_016/layer_001/EncDecAttention/v with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_016/layer_001/EncDecAttention/v_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_016/layer_001/EncDecAttention/v_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_016/layer_001/rms_norm/scale with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_016/layer_001/rms_norm/scale_slot_v with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_016/layer_002/DenseReluDense/wi/kernel with shape [1024, 65536]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_016/layer_002/DenseReluDense/wi/kernel_slot_vc with shape [65536]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_016/layer_002/DenseReluDense/wi/kernel_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_016/layer_002/DenseReluDense/wo/kernel with shape [65536, 1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_016/layer_002/DenseReluDense/wo/kernel_slot_vc with shape [65536]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_016/layer_002/DenseReluDense/wo/kernel_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_016/layer_002/rms_norm/scale with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_016/layer_002/rms_norm/scale_slot_v with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_017/layer_000/SelfAttention/k with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_017/layer_000/SelfAttention/k_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_017/layer_000/SelfAttention/k_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_017/layer_000/SelfAttention/o with shape [16384, 1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_017/layer_000/SelfAttention/o_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_017/layer_000/SelfAttention/o_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_017/layer_000/SelfAttention/q with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_017/layer_000/SelfAttention/q_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_017/layer_000/SelfAttention/q_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_017/layer_000/SelfAttention/v with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_017/layer_000/SelfAttention/v_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_017/layer_000/SelfAttention/v_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_017/layer_000/rms_norm/scale with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_017/layer_000/rms_norm/scale_slot_v with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_017/layer_001/EncDecAttention/k with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_017/layer_001/EncDecAttention/k_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_017/layer_001/EncDecAttention/k_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_017/layer_001/EncDecAttention/o with shape [16384, 1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_017/layer_001/EncDecAttention/o_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_017/layer_001/EncDecAttention/o_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_017/layer_001/EncDecAttention/q with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_017/layer_001/EncDecAttention/q_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_017/layer_001/EncDecAttention/q_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_017/layer_001/EncDecAttention/v with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_017/layer_001/EncDecAttention/v_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_017/layer_001/EncDecAttention/v_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_017/layer_001/rms_norm/scale with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_017/layer_001/rms_norm/scale_slot_v with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_017/layer_002/DenseReluDense/wi/kernel with shape [1024, 65536]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_017/layer_002/DenseReluDense/wi/kernel_slot_vc with shape [65536]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_017/layer_002/DenseReluDense/wi/kernel_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_017/layer_002/DenseReluDense/wo/kernel with shape [65536, 1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_017/layer_002/DenseReluDense/wo/kernel_slot_vc with shape [65536]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_017/layer_002/DenseReluDense/wo/kernel_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_017/layer_002/rms_norm/scale with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_017/layer_002/rms_norm/scale_slot_v with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_018/layer_000/SelfAttention/k with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_018/layer_000/SelfAttention/k_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_018/layer_000/SelfAttention/k_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_018/layer_000/SelfAttention/o with shape [16384, 1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_018/layer_000/SelfAttention/o_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_018/layer_000/SelfAttention/o_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_018/layer_000/SelfAttention/q with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_018/layer_000/SelfAttention/q_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_018/layer_000/SelfAttention/q_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_018/layer_000/SelfAttention/v with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_018/layer_000/SelfAttention/v_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_018/layer_000/SelfAttention/v_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_018/layer_000/rms_norm/scale with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_018/layer_000/rms_norm/scale_slot_v with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_018/layer_001/EncDecAttention/k with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_018/layer_001/EncDecAttention/k_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_018/layer_001/EncDecAttention/k_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_018/layer_001/EncDecAttention/o with shape [16384, 1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_018/layer_001/EncDecAttention/o_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_018/layer_001/EncDecAttention/o_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_018/layer_001/EncDecAttention/q with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_018/layer_001/EncDecAttention/q_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_018/layer_001/EncDecAttention/q_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_018/layer_001/EncDecAttention/v with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_018/layer_001/EncDecAttention/v_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_018/layer_001/EncDecAttention/v_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_018/layer_001/rms_norm/scale with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_018/layer_001/rms_norm/scale_slot_v with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_018/layer_002/DenseReluDense/wi/kernel with shape [1024, 65536]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_018/layer_002/DenseReluDense/wi/kernel_slot_vc with shape [65536]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_018/layer_002/DenseReluDense/wi/kernel_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_018/layer_002/DenseReluDense/wo/kernel with shape [65536, 1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_018/layer_002/DenseReluDense/wo/kernel_slot_vc with shape [65536]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_018/layer_002/DenseReluDense/wo/kernel_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_018/layer_002/rms_norm/scale with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_018/layer_002/rms_norm/scale_slot_v with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_019/layer_000/SelfAttention/k with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_019/layer_000/SelfAttention/k_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_019/layer_000/SelfAttention/k_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_019/layer_000/SelfAttention/o with shape [16384, 1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_019/layer_000/SelfAttention/o_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_019/layer_000/SelfAttention/o_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_019/layer_000/SelfAttention/q with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_019/layer_000/SelfAttention/q_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_019/layer_000/SelfAttention/q_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_019/layer_000/SelfAttention/v with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_019/layer_000/SelfAttention/v_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_019/layer_000/SelfAttention/v_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_019/layer_000/rms_norm/scale with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_019/layer_000/rms_norm/scale_slot_v with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_019/layer_001/EncDecAttention/k with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_019/layer_001/EncDecAttention/k_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_019/layer_001/EncDecAttention/k_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_019/layer_001/EncDecAttention/o with shape [16384, 1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_019/layer_001/EncDecAttention/o_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_019/layer_001/EncDecAttention/o_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_019/layer_001/EncDecAttention/q with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_019/layer_001/EncDecAttention/q_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_019/layer_001/EncDecAttention/q_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_019/layer_001/EncDecAttention/v with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_019/layer_001/EncDecAttention/v_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_019/layer_001/EncDecAttention/v_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_019/layer_001/rms_norm/scale with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_019/layer_001/rms_norm/scale_slot_v with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_019/layer_002/DenseReluDense/wi/kernel with shape [1024, 65536]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_019/layer_002/DenseReluDense/wi/kernel_slot_vc with shape [65536]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_019/layer_002/DenseReluDense/wi/kernel_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_019/layer_002/DenseReluDense/wo/kernel with shape [65536, 1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_019/layer_002/DenseReluDense/wo/kernel_slot_vc with shape [65536]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_019/layer_002/DenseReluDense/wo/kernel_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_019/layer_002/rms_norm/scale with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_019/layer_002/rms_norm/scale_slot_v with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_020/layer_000/SelfAttention/k with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_020/layer_000/SelfAttention/k_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_020/layer_000/SelfAttention/k_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_020/layer_000/SelfAttention/o with shape [16384, 1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_020/layer_000/SelfAttention/o_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_020/layer_000/SelfAttention/o_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_020/layer_000/SelfAttention/q with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_020/layer_000/SelfAttention/q_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_020/layer_000/SelfAttention/q_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_020/layer_000/SelfAttention/v with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_020/layer_000/SelfAttention/v_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_020/layer_000/SelfAttention/v_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_020/layer_000/rms_norm/scale with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_020/layer_000/rms_norm/scale_slot_v with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_020/layer_001/EncDecAttention/k with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_020/layer_001/EncDecAttention/k_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_020/layer_001/EncDecAttention/k_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_020/layer_001/EncDecAttention/o with shape [16384, 1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_020/layer_001/EncDecAttention/o_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_020/layer_001/EncDecAttention/o_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_020/layer_001/EncDecAttention/q with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_020/layer_001/EncDecAttention/q_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_020/layer_001/EncDecAttention/q_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_020/layer_001/EncDecAttention/v with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_020/layer_001/EncDecAttention/v_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_020/layer_001/EncDecAttention/v_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_020/layer_001/rms_norm/scale with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_020/layer_001/rms_norm/scale_slot_v with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_020/layer_002/DenseReluDense/wi/kernel with shape [1024, 65536]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_020/layer_002/DenseReluDense/wi/kernel_slot_vc with shape [65536]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_020/layer_002/DenseReluDense/wi/kernel_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_020/layer_002/DenseReluDense/wo/kernel with shape [65536, 1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_020/layer_002/DenseReluDense/wo/kernel_slot_vc with shape [65536]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_020/layer_002/DenseReluDense/wo/kernel_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_020/layer_002/rms_norm/scale with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_020/layer_002/rms_norm/scale_slot_v with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_021/layer_000/SelfAttention/k with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_021/layer_000/SelfAttention/k_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_021/layer_000/SelfAttention/k_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_021/layer_000/SelfAttention/o with shape [16384, 1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_021/layer_000/SelfAttention/o_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_021/layer_000/SelfAttention/o_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_021/layer_000/SelfAttention/q with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_021/layer_000/SelfAttention/q_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_021/layer_000/SelfAttention/q_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_021/layer_000/SelfAttention/v with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_021/layer_000/SelfAttention/v_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_021/layer_000/SelfAttention/v_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_021/layer_000/rms_norm/scale with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_021/layer_000/rms_norm/scale_slot_v with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_021/layer_001/EncDecAttention/k with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_021/layer_001/EncDecAttention/k_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_021/layer_001/EncDecAttention/k_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_021/layer_001/EncDecAttention/o with shape [16384, 1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_021/layer_001/EncDecAttention/o_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_021/layer_001/EncDecAttention/o_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_021/layer_001/EncDecAttention/q with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_021/layer_001/EncDecAttention/q_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_021/layer_001/EncDecAttention/q_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_021/layer_001/EncDecAttention/v with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_021/layer_001/EncDecAttention/v_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_021/layer_001/EncDecAttention/v_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_021/layer_001/rms_norm/scale with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_021/layer_001/rms_norm/scale_slot_v with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_021/layer_002/DenseReluDense/wi/kernel with shape [1024, 65536]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_021/layer_002/DenseReluDense/wi/kernel_slot_vc with shape [65536]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_021/layer_002/DenseReluDense/wi/kernel_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_021/layer_002/DenseReluDense/wo/kernel with shape [65536, 1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_021/layer_002/DenseReluDense/wo/kernel_slot_vc with shape [65536]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_021/layer_002/DenseReluDense/wo/kernel_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_021/layer_002/rms_norm/scale with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_021/layer_002/rms_norm/scale_slot_v with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_022/layer_000/SelfAttention/k with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_022/layer_000/SelfAttention/k_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_022/layer_000/SelfAttention/k_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_022/layer_000/SelfAttention/o with shape [16384, 1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_022/layer_000/SelfAttention/o_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_022/layer_000/SelfAttention/o_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_022/layer_000/SelfAttention/q with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_022/layer_000/SelfAttention/q_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_022/layer_000/SelfAttention/q_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_022/layer_000/SelfAttention/v with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_022/layer_000/SelfAttention/v_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_022/layer_000/SelfAttention/v_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_022/layer_000/rms_norm/scale with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_022/layer_000/rms_norm/scale_slot_v with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_022/layer_001/EncDecAttention/k with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_022/layer_001/EncDecAttention/k_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_022/layer_001/EncDecAttention/k_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_022/layer_001/EncDecAttention/o with shape [16384, 1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_022/layer_001/EncDecAttention/o_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_022/layer_001/EncDecAttention/o_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_022/layer_001/EncDecAttention/q with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_022/layer_001/EncDecAttention/q_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_022/layer_001/EncDecAttention/q_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_022/layer_001/EncDecAttention/v with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_022/layer_001/EncDecAttention/v_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_022/layer_001/EncDecAttention/v_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_022/layer_001/rms_norm/scale with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_022/layer_001/rms_norm/scale_slot_v with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_022/layer_002/DenseReluDense/wi/kernel with shape [1024, 65536]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_022/layer_002/DenseReluDense/wi/kernel_slot_vc with shape [65536]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_022/layer_002/DenseReluDense/wi/kernel_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_022/layer_002/DenseReluDense/wo/kernel with shape [65536, 1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_022/layer_002/DenseReluDense/wo/kernel_slot_vc with shape [65536]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_022/layer_002/DenseReluDense/wo/kernel_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_022/layer_002/rms_norm/scale with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_022/layer_002/rms_norm/scale_slot_v with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_023/layer_000/SelfAttention/k with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_023/layer_000/SelfAttention/k_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_023/layer_000/SelfAttention/k_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_023/layer_000/SelfAttention/o with shape [16384, 1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_023/layer_000/SelfAttention/o_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_023/layer_000/SelfAttention/o_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_023/layer_000/SelfAttention/q with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_023/layer_000/SelfAttention/q_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_023/layer_000/SelfAttention/q_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_023/layer_000/SelfAttention/v with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_023/layer_000/SelfAttention/v_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_023/layer_000/SelfAttention/v_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_023/layer_000/rms_norm/scale with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_023/layer_000/rms_norm/scale_slot_v with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_023/layer_001/EncDecAttention/k with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_023/layer_001/EncDecAttention/k_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_023/layer_001/EncDecAttention/k_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_023/layer_001/EncDecAttention/o with shape [16384, 1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_023/layer_001/EncDecAttention/o_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_023/layer_001/EncDecAttention/o_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_023/layer_001/EncDecAttention/q with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_023/layer_001/EncDecAttention/q_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_023/layer_001/EncDecAttention/q_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_023/layer_001/EncDecAttention/v with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_023/layer_001/EncDecAttention/v_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_023/layer_001/EncDecAttention/v_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_023/layer_001/rms_norm/scale with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_023/layer_001/rms_norm/scale_slot_v with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_023/layer_002/DenseReluDense/wi/kernel with shape [1024, 65536]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_023/layer_002/DenseReluDense/wi/kernel_slot_vc with shape [65536]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_023/layer_002/DenseReluDense/wi/kernel_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_023/layer_002/DenseReluDense/wo/kernel with shape [65536, 1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_023/layer_002/DenseReluDense/wo/kernel_slot_vc with shape [65536]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_023/layer_002/DenseReluDense/wo/kernel_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_023/layer_002/rms_norm/scale with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/block_023/layer_002/rms_norm/scale_slot_v with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/rms_norm/scale with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight decoder/rms_norm/scale_slot_v with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_000/layer_000/SelfAttention/k with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_000/layer_000/SelfAttention/k_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_000/layer_000/SelfAttention/k_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_000/layer_000/SelfAttention/o with shape [16384, 1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_000/layer_000/SelfAttention/o_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_000/layer_000/SelfAttention/o_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_000/layer_000/SelfAttention/q with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_000/layer_000/SelfAttention/q_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_000/layer_000/SelfAttention/q_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_000/layer_000/SelfAttention/relative_attention_bias with shape [128, 32]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_000/layer_000/SelfAttention/relative_attention_bias_slot_v with shape [128, 32]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_000/layer_000/SelfAttention/v with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_000/layer_000/SelfAttention/v_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_000/layer_000/SelfAttention/v_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_000/layer_000/rms_norm/scale with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_000/layer_000/rms_norm/scale_slot_v with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_000/layer_001/DenseReluDense/wi/kernel with shape [1024, 65536]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_000/layer_001/DenseReluDense/wi/kernel_slot_vc with shape [65536]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_000/layer_001/DenseReluDense/wi/kernel_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_000/layer_001/DenseReluDense/wo/kernel with shape [65536, 1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_000/layer_001/DenseReluDense/wo/kernel_slot_vc with shape [65536]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_000/layer_001/DenseReluDense/wo/kernel_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_000/layer_001/rms_norm/scale with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_000/layer_001/rms_norm/scale_slot_v with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_001/layer_000/SelfAttention/k with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_001/layer_000/SelfAttention/k_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_001/layer_000/SelfAttention/k_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_001/layer_000/SelfAttention/o with shape [16384, 1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_001/layer_000/SelfAttention/o_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_001/layer_000/SelfAttention/o_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_001/layer_000/SelfAttention/q with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_001/layer_000/SelfAttention/q_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_001/layer_000/SelfAttention/q_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_001/layer_000/SelfAttention/v with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_001/layer_000/SelfAttention/v_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_001/layer_000/SelfAttention/v_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_001/layer_000/rms_norm/scale with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_001/layer_000/rms_norm/scale_slot_v with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_001/layer_001/DenseReluDense/wi/kernel with shape [1024, 65536]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_001/layer_001/DenseReluDense/wi/kernel_slot_vc with shape [65536]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_001/layer_001/DenseReluDense/wi/kernel_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_001/layer_001/DenseReluDense/wo/kernel with shape [65536, 1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_001/layer_001/DenseReluDense/wo/kernel_slot_vc with shape [65536]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_001/layer_001/DenseReluDense/wo/kernel_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_001/layer_001/rms_norm/scale with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_001/layer_001/rms_norm/scale_slot_v with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_002/layer_000/SelfAttention/k with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_002/layer_000/SelfAttention/k_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_002/layer_000/SelfAttention/k_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_002/layer_000/SelfAttention/o with shape [16384, 1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_002/layer_000/SelfAttention/o_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_002/layer_000/SelfAttention/o_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_002/layer_000/SelfAttention/q with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_002/layer_000/SelfAttention/q_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_002/layer_000/SelfAttention/q_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_002/layer_000/SelfAttention/v with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_002/layer_000/SelfAttention/v_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_002/layer_000/SelfAttention/v_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_002/layer_000/rms_norm/scale with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_002/layer_000/rms_norm/scale_slot_v with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_002/layer_001/DenseReluDense/wi/kernel with shape [1024, 65536]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_002/layer_001/DenseReluDense/wi/kernel_slot_vc with shape [65536]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_002/layer_001/DenseReluDense/wi/kernel_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_002/layer_001/DenseReluDense/wo/kernel with shape [65536, 1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_002/layer_001/DenseReluDense/wo/kernel_slot_vc with shape [65536]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_002/layer_001/DenseReluDense/wo/kernel_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_002/layer_001/rms_norm/scale with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_002/layer_001/rms_norm/scale_slot_v with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_003/layer_000/SelfAttention/k with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_003/layer_000/SelfAttention/k_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_003/layer_000/SelfAttention/k_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_003/layer_000/SelfAttention/o with shape [16384, 1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_003/layer_000/SelfAttention/o_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_003/layer_000/SelfAttention/o_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_003/layer_000/SelfAttention/q with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_003/layer_000/SelfAttention/q_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_003/layer_000/SelfAttention/q_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_003/layer_000/SelfAttention/v with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_003/layer_000/SelfAttention/v_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_003/layer_000/SelfAttention/v_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_003/layer_000/rms_norm/scale with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_003/layer_000/rms_norm/scale_slot_v with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_003/layer_001/DenseReluDense/wi/kernel with shape [1024, 65536]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_003/layer_001/DenseReluDense/wi/kernel_slot_vc with shape [65536]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_003/layer_001/DenseReluDense/wi/kernel_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_003/layer_001/DenseReluDense/wo/kernel with shape [65536, 1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_003/layer_001/DenseReluDense/wo/kernel_slot_vc with shape [65536]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_003/layer_001/DenseReluDense/wo/kernel_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_003/layer_001/rms_norm/scale with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_003/layer_001/rms_norm/scale_slot_v with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_004/layer_000/SelfAttention/k with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_004/layer_000/SelfAttention/k_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_004/layer_000/SelfAttention/k_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_004/layer_000/SelfAttention/o with shape [16384, 1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_004/layer_000/SelfAttention/o_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_004/layer_000/SelfAttention/o_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_004/layer_000/SelfAttention/q with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_004/layer_000/SelfAttention/q_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_004/layer_000/SelfAttention/q_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_004/layer_000/SelfAttention/v with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_004/layer_000/SelfAttention/v_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_004/layer_000/SelfAttention/v_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_004/layer_000/rms_norm/scale with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_004/layer_000/rms_norm/scale_slot_v with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_004/layer_001/DenseReluDense/wi/kernel with shape [1024, 65536]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_004/layer_001/DenseReluDense/wi/kernel_slot_vc with shape [65536]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_004/layer_001/DenseReluDense/wi/kernel_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_004/layer_001/DenseReluDense/wo/kernel with shape [65536, 1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_004/layer_001/DenseReluDense/wo/kernel_slot_vc with shape [65536]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_004/layer_001/DenseReluDense/wo/kernel_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_004/layer_001/rms_norm/scale with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_004/layer_001/rms_norm/scale_slot_v with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_005/layer_000/SelfAttention/k with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_005/layer_000/SelfAttention/k_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_005/layer_000/SelfAttention/k_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_005/layer_000/SelfAttention/o with shape [16384, 1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_005/layer_000/SelfAttention/o_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_005/layer_000/SelfAttention/o_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_005/layer_000/SelfAttention/q with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_005/layer_000/SelfAttention/q_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_005/layer_000/SelfAttention/q_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_005/layer_000/SelfAttention/v with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_005/layer_000/SelfAttention/v_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_005/layer_000/SelfAttention/v_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_005/layer_000/rms_norm/scale with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_005/layer_000/rms_norm/scale_slot_v with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_005/layer_001/DenseReluDense/wi/kernel with shape [1024, 65536]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_005/layer_001/DenseReluDense/wi/kernel_slot_vc with shape [65536]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_005/layer_001/DenseReluDense/wi/kernel_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_005/layer_001/DenseReluDense/wo/kernel with shape [65536, 1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_005/layer_001/DenseReluDense/wo/kernel_slot_vc with shape [65536]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_005/layer_001/DenseReluDense/wo/kernel_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_005/layer_001/rms_norm/scale with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_005/layer_001/rms_norm/scale_slot_v with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_006/layer_000/SelfAttention/k with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_006/layer_000/SelfAttention/k_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_006/layer_000/SelfAttention/k_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_006/layer_000/SelfAttention/o with shape [16384, 1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_006/layer_000/SelfAttention/o_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_006/layer_000/SelfAttention/o_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_006/layer_000/SelfAttention/q with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_006/layer_000/SelfAttention/q_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_006/layer_000/SelfAttention/q_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_006/layer_000/SelfAttention/v with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_006/layer_000/SelfAttention/v_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_006/layer_000/SelfAttention/v_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_006/layer_000/rms_norm/scale with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_006/layer_000/rms_norm/scale_slot_v with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_006/layer_001/DenseReluDense/wi/kernel with shape [1024, 65536]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_006/layer_001/DenseReluDense/wi/kernel_slot_vc with shape [65536]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_006/layer_001/DenseReluDense/wi/kernel_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_006/layer_001/DenseReluDense/wo/kernel with shape [65536, 1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_006/layer_001/DenseReluDense/wo/kernel_slot_vc with shape [65536]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_006/layer_001/DenseReluDense/wo/kernel_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_006/layer_001/rms_norm/scale with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_006/layer_001/rms_norm/scale_slot_v with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_007/layer_000/SelfAttention/k with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_007/layer_000/SelfAttention/k_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_007/layer_000/SelfAttention/k_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_007/layer_000/SelfAttention/o with shape [16384, 1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_007/layer_000/SelfAttention/o_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_007/layer_000/SelfAttention/o_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_007/layer_000/SelfAttention/q with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_007/layer_000/SelfAttention/q_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_007/layer_000/SelfAttention/q_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_007/layer_000/SelfAttention/v with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_007/layer_000/SelfAttention/v_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_007/layer_000/SelfAttention/v_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_007/layer_000/rms_norm/scale with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_007/layer_000/rms_norm/scale_slot_v with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_007/layer_001/DenseReluDense/wi/kernel with shape [1024, 65536]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_007/layer_001/DenseReluDense/wi/kernel_slot_vc with shape [65536]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_007/layer_001/DenseReluDense/wi/kernel_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_007/layer_001/DenseReluDense/wo/kernel with shape [65536, 1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_007/layer_001/DenseReluDense/wo/kernel_slot_vc with shape [65536]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_007/layer_001/DenseReluDense/wo/kernel_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_007/layer_001/rms_norm/scale with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_007/layer_001/rms_norm/scale_slot_v with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_008/layer_000/SelfAttention/k with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_008/layer_000/SelfAttention/k_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_008/layer_000/SelfAttention/k_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_008/layer_000/SelfAttention/o with shape [16384, 1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_008/layer_000/SelfAttention/o_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_008/layer_000/SelfAttention/o_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_008/layer_000/SelfAttention/q with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_008/layer_000/SelfAttention/q_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_008/layer_000/SelfAttention/q_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_008/layer_000/SelfAttention/v with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_008/layer_000/SelfAttention/v_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_008/layer_000/SelfAttention/v_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_008/layer_000/rms_norm/scale with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_008/layer_000/rms_norm/scale_slot_v with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_008/layer_001/DenseReluDense/wi/kernel with shape [1024, 65536]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_008/layer_001/DenseReluDense/wi/kernel_slot_vc with shape [65536]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_008/layer_001/DenseReluDense/wi/kernel_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_008/layer_001/DenseReluDense/wo/kernel with shape [65536, 1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_008/layer_001/DenseReluDense/wo/kernel_slot_vc with shape [65536]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_008/layer_001/DenseReluDense/wo/kernel_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_008/layer_001/rms_norm/scale with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_008/layer_001/rms_norm/scale_slot_v with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_009/layer_000/SelfAttention/k with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_009/layer_000/SelfAttention/k_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_009/layer_000/SelfAttention/k_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_009/layer_000/SelfAttention/o with shape [16384, 1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_009/layer_000/SelfAttention/o_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_009/layer_000/SelfAttention/o_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_009/layer_000/SelfAttention/q with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_009/layer_000/SelfAttention/q_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_009/layer_000/SelfAttention/q_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_009/layer_000/SelfAttention/v with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_009/layer_000/SelfAttention/v_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_009/layer_000/SelfAttention/v_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_009/layer_000/rms_norm/scale with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_009/layer_000/rms_norm/scale_slot_v with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_009/layer_001/DenseReluDense/wi/kernel with shape [1024, 65536]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_009/layer_001/DenseReluDense/wi/kernel_slot_vc with shape [65536]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_009/layer_001/DenseReluDense/wi/kernel_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_009/layer_001/DenseReluDense/wo/kernel with shape [65536, 1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_009/layer_001/DenseReluDense/wo/kernel_slot_vc with shape [65536]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_009/layer_001/DenseReluDense/wo/kernel_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_009/layer_001/rms_norm/scale with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_009/layer_001/rms_norm/scale_slot_v with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_010/layer_000/SelfAttention/k with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_010/layer_000/SelfAttention/k_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_010/layer_000/SelfAttention/k_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_010/layer_000/SelfAttention/o with shape [16384, 1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_010/layer_000/SelfAttention/o_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_010/layer_000/SelfAttention/o_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_010/layer_000/SelfAttention/q with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_010/layer_000/SelfAttention/q_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_010/layer_000/SelfAttention/q_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_010/layer_000/SelfAttention/v with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_010/layer_000/SelfAttention/v_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_010/layer_000/SelfAttention/v_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_010/layer_000/rms_norm/scale with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_010/layer_000/rms_norm/scale_slot_v with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_010/layer_001/DenseReluDense/wi/kernel with shape [1024, 65536]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_010/layer_001/DenseReluDense/wi/kernel_slot_vc with shape [65536]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_010/layer_001/DenseReluDense/wi/kernel_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_010/layer_001/DenseReluDense/wo/kernel with shape [65536, 1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_010/layer_001/DenseReluDense/wo/kernel_slot_vc with shape [65536]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_010/layer_001/DenseReluDense/wo/kernel_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_010/layer_001/rms_norm/scale with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_010/layer_001/rms_norm/scale_slot_v with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_011/layer_000/SelfAttention/k with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_011/layer_000/SelfAttention/k_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_011/layer_000/SelfAttention/k_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_011/layer_000/SelfAttention/o with shape [16384, 1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_011/layer_000/SelfAttention/o_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_011/layer_000/SelfAttention/o_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_011/layer_000/SelfAttention/q with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_011/layer_000/SelfAttention/q_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_011/layer_000/SelfAttention/q_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_011/layer_000/SelfAttention/v with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_011/layer_000/SelfAttention/v_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_011/layer_000/SelfAttention/v_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_011/layer_000/rms_norm/scale with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_011/layer_000/rms_norm/scale_slot_v with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_011/layer_001/DenseReluDense/wi/kernel with shape [1024, 65536]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_011/layer_001/DenseReluDense/wi/kernel_slot_vc with shape [65536]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_011/layer_001/DenseReluDense/wi/kernel_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_011/layer_001/DenseReluDense/wo/kernel with shape [65536, 1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_011/layer_001/DenseReluDense/wo/kernel_slot_vc with shape [65536]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_011/layer_001/DenseReluDense/wo/kernel_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_011/layer_001/rms_norm/scale with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_011/layer_001/rms_norm/scale_slot_v with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_012/layer_000/SelfAttention/k with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_012/layer_000/SelfAttention/k_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_012/layer_000/SelfAttention/k_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_012/layer_000/SelfAttention/o with shape [16384, 1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_012/layer_000/SelfAttention/o_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_012/layer_000/SelfAttention/o_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_012/layer_000/SelfAttention/q with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_012/layer_000/SelfAttention/q_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_012/layer_000/SelfAttention/q_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_012/layer_000/SelfAttention/v with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_012/layer_000/SelfAttention/v_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_012/layer_000/SelfAttention/v_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_012/layer_000/rms_norm/scale with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_012/layer_000/rms_norm/scale_slot_v with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_012/layer_001/DenseReluDense/wi/kernel with shape [1024, 65536]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_012/layer_001/DenseReluDense/wi/kernel_slot_vc with shape [65536]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_012/layer_001/DenseReluDense/wi/kernel_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_012/layer_001/DenseReluDense/wo/kernel with shape [65536, 1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_012/layer_001/DenseReluDense/wo/kernel_slot_vc with shape [65536]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_012/layer_001/DenseReluDense/wo/kernel_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_012/layer_001/rms_norm/scale with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_012/layer_001/rms_norm/scale_slot_v with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_013/layer_000/SelfAttention/k with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_013/layer_000/SelfAttention/k_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_013/layer_000/SelfAttention/k_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_013/layer_000/SelfAttention/o with shape [16384, 1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_013/layer_000/SelfAttention/o_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_013/layer_000/SelfAttention/o_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_013/layer_000/SelfAttention/q with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_013/layer_000/SelfAttention/q_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_013/layer_000/SelfAttention/q_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_013/layer_000/SelfAttention/v with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_013/layer_000/SelfAttention/v_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_013/layer_000/SelfAttention/v_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_013/layer_000/rms_norm/scale with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_013/layer_000/rms_norm/scale_slot_v with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_013/layer_001/DenseReluDense/wi/kernel with shape [1024, 65536]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_013/layer_001/DenseReluDense/wi/kernel_slot_vc with shape [65536]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_013/layer_001/DenseReluDense/wi/kernel_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_013/layer_001/DenseReluDense/wo/kernel with shape [65536, 1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_013/layer_001/DenseReluDense/wo/kernel_slot_vc with shape [65536]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_013/layer_001/DenseReluDense/wo/kernel_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_013/layer_001/rms_norm/scale with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_013/layer_001/rms_norm/scale_slot_v with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_014/layer_000/SelfAttention/k with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_014/layer_000/SelfAttention/k_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_014/layer_000/SelfAttention/k_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_014/layer_000/SelfAttention/o with shape [16384, 1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_014/layer_000/SelfAttention/o_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_014/layer_000/SelfAttention/o_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_014/layer_000/SelfAttention/q with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_014/layer_000/SelfAttention/q_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_014/layer_000/SelfAttention/q_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_014/layer_000/SelfAttention/v with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_014/layer_000/SelfAttention/v_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_014/layer_000/SelfAttention/v_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_014/layer_000/rms_norm/scale with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_014/layer_000/rms_norm/scale_slot_v with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_014/layer_001/DenseReluDense/wi/kernel with shape [1024, 65536]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_014/layer_001/DenseReluDense/wi/kernel_slot_vc with shape [65536]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_014/layer_001/DenseReluDense/wi/kernel_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_014/layer_001/DenseReluDense/wo/kernel with shape [65536, 1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_014/layer_001/DenseReluDense/wo/kernel_slot_vc with shape [65536]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_014/layer_001/DenseReluDense/wo/kernel_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_014/layer_001/rms_norm/scale with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_014/layer_001/rms_norm/scale_slot_v with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_015/layer_000/SelfAttention/k with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_015/layer_000/SelfAttention/k_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_015/layer_000/SelfAttention/k_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_015/layer_000/SelfAttention/o with shape [16384, 1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_015/layer_000/SelfAttention/o_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_015/layer_000/SelfAttention/o_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_015/layer_000/SelfAttention/q with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_015/layer_000/SelfAttention/q_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_015/layer_000/SelfAttention/q_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_015/layer_000/SelfAttention/v with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_015/layer_000/SelfAttention/v_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_015/layer_000/SelfAttention/v_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_015/layer_000/rms_norm/scale with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_015/layer_000/rms_norm/scale_slot_v with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_015/layer_001/DenseReluDense/wi/kernel with shape [1024, 65536]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_015/layer_001/DenseReluDense/wi/kernel_slot_vc with shape [65536]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_015/layer_001/DenseReluDense/wi/kernel_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_015/layer_001/DenseReluDense/wo/kernel with shape [65536, 1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_015/layer_001/DenseReluDense/wo/kernel_slot_vc with shape [65536]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_015/layer_001/DenseReluDense/wo/kernel_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_015/layer_001/rms_norm/scale with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_015/layer_001/rms_norm/scale_slot_v with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_016/layer_000/SelfAttention/k with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_016/layer_000/SelfAttention/k_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_016/layer_000/SelfAttention/k_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_016/layer_000/SelfAttention/o with shape [16384, 1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_016/layer_000/SelfAttention/o_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_016/layer_000/SelfAttention/o_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_016/layer_000/SelfAttention/q with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_016/layer_000/SelfAttention/q_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_016/layer_000/SelfAttention/q_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_016/layer_000/SelfAttention/v with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_016/layer_000/SelfAttention/v_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_016/layer_000/SelfAttention/v_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_016/layer_000/rms_norm/scale with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_016/layer_000/rms_norm/scale_slot_v with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_016/layer_001/DenseReluDense/wi/kernel with shape [1024, 65536]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_016/layer_001/DenseReluDense/wi/kernel_slot_vc with shape [65536]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_016/layer_001/DenseReluDense/wi/kernel_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_016/layer_001/DenseReluDense/wo/kernel with shape [65536, 1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_016/layer_001/DenseReluDense/wo/kernel_slot_vc with shape [65536]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_016/layer_001/DenseReluDense/wo/kernel_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_016/layer_001/rms_norm/scale with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_016/layer_001/rms_norm/scale_slot_v with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_017/layer_000/SelfAttention/k with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_017/layer_000/SelfAttention/k_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_017/layer_000/SelfAttention/k_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_017/layer_000/SelfAttention/o with shape [16384, 1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_017/layer_000/SelfAttention/o_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_017/layer_000/SelfAttention/o_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_017/layer_000/SelfAttention/q with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_017/layer_000/SelfAttention/q_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_017/layer_000/SelfAttention/q_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_017/layer_000/SelfAttention/v with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_017/layer_000/SelfAttention/v_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_017/layer_000/SelfAttention/v_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_017/layer_000/rms_norm/scale with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_017/layer_000/rms_norm/scale_slot_v with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_017/layer_001/DenseReluDense/wi/kernel with shape [1024, 65536]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_017/layer_001/DenseReluDense/wi/kernel_slot_vc with shape [65536]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_017/layer_001/DenseReluDense/wi/kernel_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_017/layer_001/DenseReluDense/wo/kernel with shape [65536, 1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_017/layer_001/DenseReluDense/wo/kernel_slot_vc with shape [65536]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_017/layer_001/DenseReluDense/wo/kernel_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_017/layer_001/rms_norm/scale with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_017/layer_001/rms_norm/scale_slot_v with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_018/layer_000/SelfAttention/k with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_018/layer_000/SelfAttention/k_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_018/layer_000/SelfAttention/k_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_018/layer_000/SelfAttention/o with shape [16384, 1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_018/layer_000/SelfAttention/o_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_018/layer_000/SelfAttention/o_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_018/layer_000/SelfAttention/q with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_018/layer_000/SelfAttention/q_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_018/layer_000/SelfAttention/q_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_018/layer_000/SelfAttention/v with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_018/layer_000/SelfAttention/v_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_018/layer_000/SelfAttention/v_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_018/layer_000/rms_norm/scale with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_018/layer_000/rms_norm/scale_slot_v with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_018/layer_001/DenseReluDense/wi/kernel with shape [1024, 65536]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_018/layer_001/DenseReluDense/wi/kernel_slot_vc with shape [65536]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_018/layer_001/DenseReluDense/wi/kernel_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_018/layer_001/DenseReluDense/wo/kernel with shape [65536, 1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_018/layer_001/DenseReluDense/wo/kernel_slot_vc with shape [65536]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_018/layer_001/DenseReluDense/wo/kernel_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_018/layer_001/rms_norm/scale with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_018/layer_001/rms_norm/scale_slot_v with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_019/layer_000/SelfAttention/k with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_019/layer_000/SelfAttention/k_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_019/layer_000/SelfAttention/k_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_019/layer_000/SelfAttention/o with shape [16384, 1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_019/layer_000/SelfAttention/o_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_019/layer_000/SelfAttention/o_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_019/layer_000/SelfAttention/q with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_019/layer_000/SelfAttention/q_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_019/layer_000/SelfAttention/q_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_019/layer_000/SelfAttention/v with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_019/layer_000/SelfAttention/v_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_019/layer_000/SelfAttention/v_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_019/layer_000/rms_norm/scale with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_019/layer_000/rms_norm/scale_slot_v with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_019/layer_001/DenseReluDense/wi/kernel with shape [1024, 65536]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_019/layer_001/DenseReluDense/wi/kernel_slot_vc with shape [65536]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_019/layer_001/DenseReluDense/wi/kernel_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_019/layer_001/DenseReluDense/wo/kernel with shape [65536, 1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_019/layer_001/DenseReluDense/wo/kernel_slot_vc with shape [65536]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_019/layer_001/DenseReluDense/wo/kernel_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_019/layer_001/rms_norm/scale with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_019/layer_001/rms_norm/scale_slot_v with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_020/layer_000/SelfAttention/k with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_020/layer_000/SelfAttention/k_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_020/layer_000/SelfAttention/k_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_020/layer_000/SelfAttention/o with shape [16384, 1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_020/layer_000/SelfAttention/o_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_020/layer_000/SelfAttention/o_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_020/layer_000/SelfAttention/q with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_020/layer_000/SelfAttention/q_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_020/layer_000/SelfAttention/q_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_020/layer_000/SelfAttention/v with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_020/layer_000/SelfAttention/v_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_020/layer_000/SelfAttention/v_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_020/layer_000/rms_norm/scale with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_020/layer_000/rms_norm/scale_slot_v with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_020/layer_001/DenseReluDense/wi/kernel with shape [1024, 65536]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_020/layer_001/DenseReluDense/wi/kernel_slot_vc with shape [65536]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_020/layer_001/DenseReluDense/wi/kernel_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_020/layer_001/DenseReluDense/wo/kernel with shape [65536, 1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_020/layer_001/DenseReluDense/wo/kernel_slot_vc with shape [65536]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_020/layer_001/DenseReluDense/wo/kernel_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_020/layer_001/rms_norm/scale with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_020/layer_001/rms_norm/scale_slot_v with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_021/layer_000/SelfAttention/k with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_021/layer_000/SelfAttention/k_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_021/layer_000/SelfAttention/k_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_021/layer_000/SelfAttention/o with shape [16384, 1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_021/layer_000/SelfAttention/o_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_021/layer_000/SelfAttention/o_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_021/layer_000/SelfAttention/q with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_021/layer_000/SelfAttention/q_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_021/layer_000/SelfAttention/q_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_021/layer_000/SelfAttention/v with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_021/layer_000/SelfAttention/v_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_021/layer_000/SelfAttention/v_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_021/layer_000/rms_norm/scale with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_021/layer_000/rms_norm/scale_slot_v with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_021/layer_001/DenseReluDense/wi/kernel with shape [1024, 65536]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_021/layer_001/DenseReluDense/wi/kernel_slot_vc with shape [65536]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_021/layer_001/DenseReluDense/wi/kernel_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_021/layer_001/DenseReluDense/wo/kernel with shape [65536, 1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_021/layer_001/DenseReluDense/wo/kernel_slot_vc with shape [65536]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_021/layer_001/DenseReluDense/wo/kernel_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_021/layer_001/rms_norm/scale with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_021/layer_001/rms_norm/scale_slot_v with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_022/layer_000/SelfAttention/k with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_022/layer_000/SelfAttention/k_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_022/layer_000/SelfAttention/k_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_022/layer_000/SelfAttention/o with shape [16384, 1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_022/layer_000/SelfAttention/o_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_022/layer_000/SelfAttention/o_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_022/layer_000/SelfAttention/q with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_022/layer_000/SelfAttention/q_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_022/layer_000/SelfAttention/q_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_022/layer_000/SelfAttention/v with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_022/layer_000/SelfAttention/v_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_022/layer_000/SelfAttention/v_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_022/layer_000/rms_norm/scale with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_022/layer_000/rms_norm/scale_slot_v with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_022/layer_001/DenseReluDense/wi/kernel with shape [1024, 65536]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_022/layer_001/DenseReluDense/wi/kernel_slot_vc with shape [65536]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_022/layer_001/DenseReluDense/wi/kernel_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_022/layer_001/DenseReluDense/wo/kernel with shape [65536, 1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_022/layer_001/DenseReluDense/wo/kernel_slot_vc with shape [65536]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_022/layer_001/DenseReluDense/wo/kernel_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_022/layer_001/rms_norm/scale with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_022/layer_001/rms_norm/scale_slot_v with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_023/layer_000/SelfAttention/k with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_023/layer_000/SelfAttention/k_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_023/layer_000/SelfAttention/k_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_023/layer_000/SelfAttention/o with shape [16384, 1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_023/layer_000/SelfAttention/o_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_023/layer_000/SelfAttention/o_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_023/layer_000/SelfAttention/q with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_023/layer_000/SelfAttention/q_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_023/layer_000/SelfAttention/q_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_023/layer_000/SelfAttention/v with shape [1024, 16384]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_023/layer_000/SelfAttention/v_slot_vc with shape [16384]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_023/layer_000/SelfAttention/v_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_023/layer_000/rms_norm/scale with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_023/layer_000/rms_norm/scale_slot_v with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_023/layer_001/DenseReluDense/wi/kernel with shape [1024, 65536]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_023/layer_001/DenseReluDense/wi/kernel_slot_vc with shape [65536]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_023/layer_001/DenseReluDense/wi/kernel_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_023/layer_001/DenseReluDense/wo/kernel with shape [65536, 1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_023/layer_001/DenseReluDense/wo/kernel_slot_vc with shape [65536]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_023/layer_001/DenseReluDense/wo/kernel_slot_vr with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_023/layer_001/rms_norm/scale with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/block_023/layer_001/rms_norm/scale_slot_v with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/rms_norm/scale with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight encoder/rms_norm/scale_slot_v with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight global_step with shape []
INFO:transformers.modeling_t5:Loading TF weight shared/embedding with shape [128, 1024]
INFO:transformers.modeling_t5:Loading TF weight shared/embedding_slot_vc with shape [1024]
INFO:transformers.modeling_t5:Loading TF weight shared/embedding_slot_vr with shape [128]
INFO:transformers.modeling_t5:Transposing numpy weight of shape (1024, 16384) for ['decoder', 'block_000', 'layer_000', 'SelfAttention', 'k']
INFO:transformers.modeling_t5:Initialize PyTorch weight ['decoder', 'block_000', 'layer_000', 'SelfAttention', 'k']
INFO:transformers.modeling_t5:Skipping decoder/block_000/layer_000/SelfAttention/k_slot_vc
INFO:transformers.modeling_t5:Skipping decoder/block_000/layer_000/SelfAttention/k_slot_vr
INFO:transformers.modeling_t5:Transposing numpy weight of shape (16384, 1024) for ['decoder', 'block_000', 'layer_000', 'SelfAttention', 'o']
INFO:transformers.modeling_t5:Initialize PyTorch weight ['decoder', 'block_000', 'layer_000', 'SelfAttention', 'o']
INFO:transformers.modeling_t5:Skipping decoder/block_000/layer_000/SelfAttention/o_slot_vc
INFO:transformers.modeling_t5:Skipping decoder/block_000/layer_000/SelfAttention/o_slot_vr
INFO:transformers.modeling_t5:Transposing numpy weight of shape (1024, 16384) for ['decoder', 'block_000', 'layer_000', 'SelfAttention', 'q']
INFO:transformers.modeling_t5:Initialize PyTorch weight ['decoder', 'block_000', 'layer_000', 'SelfAttention', 'q']
INFO:transformers.modeling_t5:Skipping decoder/block_000/layer_000/SelfAttention/q_slot_vc
INFO:transformers.modeling_t5:Skipping decoder/block_000/layer_000/SelfAttention/q_slot_vr
INFO:transformers.modeling_t5:Transposing numpy weight of shape (128, 32) for ['decoder', 'block_000', 'layer_000', 'SelfAttention', 'relative_attention_bias']
INFO:transformers.modeling_t5:Initialize PyTorch weight ['decoder', 'block_000', 'layer_000', 'SelfAttention', 'relative_attention_bias']
INFO:transformers.modeling_t5:Skipping decoder/block_000/layer_000/SelfAttention/relative_attention_bias_slot_v
INFO:transformers.modeling_t5:Transposing numpy weight of shape (1024, 16384) for ['decoder', 'block_000', 'layer_000', 'SelfAttention', 'v']
INFO:transformers.modeling_t5:Initialize PyTorch weight ['decoder', 'block_000', 'layer_000', 'SelfAttention', 'v']
INFO:transformers.modeling_t5:Skipping decoder/block_000/layer_000/SelfAttention/v_slot_vc
INFO:transformers.modeling_t5:Skipping decoder/block_000/layer_000/SelfAttention/v_slot_vr
INFO:transformers.modeling_t5:Skipping decoder/block_000/layer_000/rms_norm/scale
Traceback (most recent call last):
File "xxx/convert_t5_original_tf_checkpoint_to_pytorch.py", line 61, in <module>
convert_tf_checkpoint_to_pytorch(args.tf_checkpoint_path, args.config_file, args.pytorch_dump_path)
File "xxx/convert_t5_original_tf_checkpoint_to_pytorch.py", line 36, in convert_tf_checkpoint_to_pytorch
load_tf_weights_in_t5(model, config, tf_checkpoint_path)
File "xxx/modeling_t5.py", line 102, in load_tf_weights_in_t5
pointer = getattr(pointer, "weight")
File "xxx/anaconda3/envs/transformers/lib/python3.7/site-packages/torch/nn/modules/module.py", line 594, in __getattr__
type(self).__name__, name))
AttributeError: 'T5LayerSelfAttention' object has no attribute 'weight'
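One possible reading of the last log line together with the traceback: the loader walks each TF variable name scope by scope, could not find rms_norm on the self-attention layer, stayed on that layer, and then asked the layer itself for weight. The sketch below reproduces that pattern with plain Python stand-ins. The classes, the resolve helper, and the aliases mapping are hypothetical illustrations, not the code in modeling_t5.py, and the sketch assumes the PyTorch side names the norm layer_norm.

```python
class Leaf:
    """Stand-in for a module that owns a `weight` tensor."""

    def __init__(self):
        self.weight = "weight-tensor"


class SelfAttentionLayer:
    """Stand-in for T5LayerSelfAttention: has sub-modules but no `weight`."""

    def __init__(self):
        self.SelfAttention = Leaf()
        # Assumption: the PyTorch model calls this `layer_norm`,
        # while the checkpoint in the log calls it `rms_norm`.
        self.layer_norm = Leaf()


def resolve(root, tf_name, aliases=None):
    """Walk a TF variable name scope by scope (hypothetical helper)."""
    aliases = aliases or {}
    pointer = root
    for scope in tf_name.split("/"):
        scope = aliases.get(scope, scope)
        if scope == "scale":
            # A norm scale is stored as the module's `weight`.
            pointer = getattr(pointer, "weight")
        else:
            # Unknown scope: stay on the current module, which mirrors
            # a "Skipping ..." log line followed by continued walking.
            pointer = getattr(pointer, scope, pointer)
    return pointer


layer = SelfAttentionLayer()

try:
    resolve(layer, "rms_norm/scale")
except AttributeError as err:
    print("fails like the traceback:", err)

# Mapping the checkpoint name onto the module name lets the walk finish.
print(resolve(layer, "rms_norm/scale", aliases={"rms_norm": "layer_norm"}))
```

If this reading is right, a name mapping of this kind (or a checkpoint whose scopes match the module names) would be needed before the weights can be assigned.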
Expected behavior
The T5 TensorFlow model should be converted to a PyTorch model.
Environment info
- transformers version: 2.11.0
- Platform: Linux-4.15.0-101-generic-x86_64-with-debian-buster-sid
- Python version: 3.7.7
- PyTorch version (GPU?): 1.5.0 (False)
- Tensorflow version (GPU?): 2.2.0 (False)
- Using GPU in script?: No
- Using distributed or parallel set-up in script?: No
About this issue
- Original URL
- State: closed
- Created 4 years ago
- Comments: 35 (35 by maintainers)
Thanks @patrickvonplaten for the clarification. Using T5ForConditionalGeneration did solve my issue, and now my models work as expected.
Thanks a lot for optimizing the shared encoder/decoder weights. This will really help large models (3B/11B) fit on a single GPU for inference. It is the last point to cover before closing this issue.
I will take a deeper look at the conversion from TensorFlow to PyTorch next Monday when I’m back.
Regarding the decoder_input_ids (“For some reason, I have to send also the ids as decoder_input_ids.”): yes, for T5 you have to input both input_ids and decoder_input_ids. For the first forward call this is usually:
So this means that the decoder_input_ids should not be the same as the input_ids (encoder inputs) but should correspond to the text auto-regressively generated by the model, which starts with the decoder_start_token_id. Hope this makes it a bit clearer.
Ok, my apologies for not reading your issue carefully enough. For the inference API, that’s probably the reason though.
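As a rough illustration of the decoder_input_ids explanation above, the sketch below builds the decoder input for the first forward call from the start token only and then grows it one token per step. It uses plain Python lists rather than the transformers API. The helper names are hypothetical and the token ids 17 and 42 are made up for the example; only decoder_start_token_id = 0 comes from the config in this issue.

```python
DECODER_START_TOKEN_ID = 0  # decoder_start_token_id from the config in this issue


def first_decoder_input_ids(batch_size, start_id=DECODER_START_TOKEN_ID):
    """Decoder input for the first forward call: only the start token per sequence."""
    return [[start_id] for _ in range(batch_size)]


def extend_decoder_input_ids(decoder_input_ids, next_token_ids):
    """Append the token chosen at this step to each sequence (auto-regressive growth)."""
    return [seq + [tok] for seq, tok in zip(decoder_input_ids, next_token_ids)]


# First call: the decoder sees only the start token, not the encoder input_ids.
decoder_input_ids = first_decoder_input_ids(batch_size=2)

# Later calls: the decoder sees the start token plus everything generated so far.
# 17 and 42 are placeholder ids standing in for the model's predictions.
decoder_input_ids = extend_decoder_input_ids(decoder_input_ids, [17, 42])
print(decoder_input_ids)
```

In a real run the appended ids would come from the model's output at each step, and generation would stop once the eos_token_id is produced.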
Please see https://github.com/huggingface/transformers/issues/5986#issuecomment-663090043
@misrasaurabh1 No we didn’t use the pytorch version.