text: mac os wheel broken for 2.2.0rc2
I’m getting an error running tokenize with tensorflow-text==2.2.0rc2 that I can only reproduce on Macs (same error on rc1, and possibly earlier versions).
Steps to reproduce:
- Setup:
python3 -m venv .test_venv
source .test_venv/bin/activate
pip install --upgrade pip
pip install tensorflow==2.2.0rc3
pip install tensorflow-text==2.2.0rc2
- Download vocab.txt into the directory where you plan to run the test:
aws s3 cp s3://models.huggingface.co/bert/bert-base-uncased-vocab.txt ./vocab.txt
- Then run these lines in Python:
import tensorflow as tf
from tensorflow_text import BertTokenizer
tokenizer = BertTokenizer('./vocab.txt')
test2 = tf.convert_to_tensor(
'Hello', dtype=tf.string
)
tokenizer.tokenize(test2)
Works on Linux (returns <tf.RaggedTensor [[[100]]]>). On Mac, it throws an error. I’ve run it on two separate Macs (one with entirely fresh installs):
2020-04-16 13:18:07.892934: W tensorflow/core/framework/op_kernel.cc:1753] OP_REQUIRES failed at wordpiece_kernel.cc:204 : Invalid argument: Trying to access resource using the wrong type. Expected N10tensorflow6lookup15LookupInterfaceE got N10tensorflow6lookup15LookupInterfaceE
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/Users/sylvia/Desktop/workspace/.tfvenv/lib/python3.7/site-packages/tensorflow_text/python/ops/bert_tokenizer.py", line 222, in tokenize
return self._wordpiece_tokenizer.tokenize(tokens)
File "/Users/sylvia/Desktop/workspace/.tfvenv/lib/python3.7/site-packages/tensorflow_text/python/ops/wordpiece_tokenizer.py", line 100, in tokenize
subword, _, _ = self.tokenize_with_offsets(input)
File "/Users/sylvia/Desktop/workspace/.tfvenv/lib/python3.7/site-packages/tensorflow_text/python/ops/wordpiece_tokenizer.py", line 156, in tokenize_with_offsets
tokens.flat_values)
File "/Users/sylvia/Desktop/workspace/.tfvenv/lib/python3.7/site-packages/tensorflow_text/python/ops/wordpiece_tokenizer.py", line 182, in tokenize_with_offsets
**kwargs))
File "<string>", line 141, in wordpiece_tokenize_with_offsets
File "/Users/sylvia/Desktop/workspace/.tfvenv/lib/python3.7/site-packages/tensorflow/python/framework/ops.py", line 6653, in raise_from_not_ok_status
six.raise_from(core._status_to_exception(e.code, message), None)
File "<string>", line 3, in raise_from
tensorflow.python.framework.errors_impl.InvalidArgumentError: Trying to access resource using the wrong type. Expected N10tensorflow6lookup15LookupInterfaceE got N10tensorflow6lookup15LookupInterfaceE [Op:WordpieceTokenizeWithOffsets]
Running on Python 3.7.6.
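A note on reading the error above: the two names in "Expected ... got ..." are C++ mangled type names, and they are character-for-character identical. The sketch below decodes them, assuming the plain Itanium-ABI nested-name form (`N<length><name>...E`) that GCC and Clang emit. `demangle_nested_name` is a hypothetical helper written for this note, not part of TensorFlow, and it does not handle templates or other mangling features.

```python
import re


def demangle_nested_name(mangled: str) -> str:
    """Decode a simple nested name such as 'N3foo3barE' into 'foo::bar'.

    Only the plain 'N<len><name>...E' form is supported (an assumption that
    holds for the names in this traceback, not for mangled names in general).
    """
    if not (mangled.startswith("N") and mangled.endswith("E")):
        raise ValueError(f"not a plain nested name: {mangled!r}")
    body = mangled[1:-1]
    parts = []
    i = 0
    while i < len(body):
        # Each component is a decimal length followed by that many characters.
        match = re.match(r"\d+", body[i:])
        if match is None:
            raise ValueError(f"expected a length at offset {i}: {mangled!r}")
        length = int(match.group())
        i += match.end()
        part = body[i:i + length]
        if len(part) != length:
            raise ValueError(f"truncated component in {mangled!r}")
        parts.append(part)
        i += length
    return "::".join(parts)


expected = demangle_nested_name("N10tensorflow6lookup15LookupInterfaceE")
got = demangle_nested_name("N10tensorflow6lookup15LookupInterfaceE")
print(expected, got, expected == got)
```

Both sides decode to `tensorflow::lookup::LookupInterface`, so the type names themselves do not differ even though the op reports a type mismatch.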
About this issue
- State: closed
- Created 4 years ago
- Reactions: 7
- Comments: 18 (6 by maintainers)
We’ve been continuing to look at this. It is currently a blocker for TF 2.3, so it will be fixed soon.
The problem is that TypeIndex::Make is returning different hashes on macOS for the same type when called from within different shared objects. The likely cause is commit 15275d3a14c77e2244ae1155f93243256f08e3ed, which was added in TF 2.2.0.
I’m working with TF to test changes that would prevent these differing hashes from being generated to address this problem.
On a related note, we’re looking at adding additional TF.Text CI testing against nightly TF to catch these before they reach release.
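To illustrate how differing hashes can produce an "Expected X got X" message, here is a toy model in Python. It is not TensorFlow's code: `TypeIndex`, `make_type_index`, and `ResourceRegistry` are hypothetical stand-ins, and the idea that the comparison token depends on the calling library is an assumption used only to mimic the behavior described above. The `per_library=False` branch is there for contrast and is not meant to describe the actual fix.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TypeIndex:
    token: str  # value that is compared when a resource is looked up
    name: str   # type name, used only to build the error message


def make_type_index(type_name, library, per_library=True):
    # Assumption for illustration: with per_library=True the token depends on
    # which shared object asks, so the same type gets two different tokens.
    token = f"{library}:{type_name}" if per_library else type_name
    return TypeIndex(token=token, name=type_name)


class ResourceRegistry:
    """Hypothetical registry that checks the type token on lookup."""

    def __init__(self):
        self._entries = {}

    def create(self, key, type_index, resource):
        self._entries[key] = (type_index, resource)

    def lookup(self, key, type_index):
        stored_index, resource = self._entries[key]
        if stored_index.token != type_index.token:
            raise TypeError(
                "Trying to access resource using the wrong type. "
                f"Expected {type_index.name} got {stored_index.name}"
            )
        return resource


def demo(per_library):
    name = "N10tensorflow6lookup15LookupInterfaceE"
    registry = ResourceRegistry()
    # One library creates the resource, another one looks it up.
    registry.create(
        "vocab_table",
        make_type_index(name, "tensorflow", per_library),
        "lookup table",
    )
    try:
        return registry.lookup(
            "vocab_table",
            make_type_index(name, "tensorflow_text", per_library),
        )
    except TypeError as err:
        return str(err)


print(demo(per_library=True))   # mismatch, although both names are identical
print(demo(per_library=False))  # tokens agree, lookup succeeds
```

In this model the names printed in the error are identical because only the token is compared, which matches the shape of the message in the traceback.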
I’m getting the exact same error in Python 3.6.8 - any ideas how to work around? Here’s some more reproducible code:
Getting:
The same exact code works on my EC2 instance, the only difference being the python version is 3.6.10:
FYI: the fix has been submitted. https://github.com/tensorflow/tensorflow/commit/1823f877359bb138c57a005c30aba8832dfa79fb#diff-991a6b786e16708ba1e6f5c9926cf151
Getting the same error with tensorflow==2.2.0, tensorflow-text==2.2.1, and Python 3.7.6 on macOS Catalina 10.15.
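Since the reports in this thread differ mainly in OS, Python version, and package versions, a small script can collect those details in one place. This is a sketch, assuming Python 3.8 or newer (for `importlib.metadata`); `installed_version` and `environment_report` are hypothetical helper names, not part of TensorFlow or TF.Text.

```python
import platform
from importlib import metadata


def installed_version(package):
    """Return the installed version of a package, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def environment_report(packages=("tensorflow", "tensorflow-text")):
    """Collect the details that vary between the reports in this thread."""
    report = {
        "os": platform.system(),                 # 'Darwin' on macOS
        "macos": platform.mac_ver()[0] or None,  # empty string off macOS
        "python": platform.python_version(),
    }
    for package in packages:
        report[package] = installed_version(package)
    return report


if __name__ == "__main__":
    for key, value in environment_report().items():
        print(f"{key}: {value}")
```

Pasting this output alongside the traceback makes it easier to compare a failing macOS setup with a working Linux one.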