neuralcoref: Can't serialize document
I can save a spaCy document to disk, but not one produced by neuralcoref. For example, the following snippet raises the error TypeError: can't serialize My sister: [My sister, She].
import spacy

# Plain spaCy model: serialization works
nlp0 = spacy.load('en_core_web_sm')
doc0 = nlp0(u'My sister has a dog. She loves him.')
with open('output/test0.pkl', 'wb') as f:
    f.write(doc0.to_bytes())

# neuralcoref model: to_bytes() raises TypeError
nlp = spacy.load('en_coref_sm')
doc = nlp(u'My sister has a dog. She loves him.')
with open('output/test.pkl', 'wb') as f:
    f.write(doc.to_bytes())
The files produced are as follows:
$ ls -lh output/*.pkl
-rw-r--r-- 1 cumeo staff 0B Aug 11 21:16 output/test.pkl
-rw-r--r-- 1 cumeo staff 16K Aug 11 21:16 output/test0.pkl
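The 0B size of test.pkl is what you would expect from the snippet: open(..., 'wb') creates the file before doc.to_bytes() raises, so nothing is ever written to it. A minimal sketch of one way to avoid leaving an empty file behind is to serialize first and open the file only afterwards. The helper write_serialized and the stand-in failing_to_bytes are hypothetical names used for illustration, not part of spaCy or neuralcoref.

```python
import os
import tempfile


def write_serialized(path, make_bytes):
    """Serialize first, then write, so a failed serialization creates no file."""
    data = make_bytes()  # may raise before the file is ever opened
    with open(path, "wb") as f:
        f.write(data)
    return len(data)


def failing_to_bytes():
    # Hypothetical stand-in for doc.to_bytes() on a neuralcoref document.
    raise TypeError("can't serialize My sister: [My sister, She]")


out_dir = tempfile.mkdtemp()
ok_path = os.path.join(out_dir, "test0.pkl")
bad_path = os.path.join(out_dir, "test.pkl")

# Stand-in for a document that serializes without error.
write_serialized(ok_path, lambda: b"serialized doc")

try:
    write_serialized(bad_path, failing_to_bytes)
except TypeError as err:
    print("serialization failed:", err)

# The failed write leaves no file at bad_path.
print(os.path.exists(ok_path), os.path.exists(bad_path))
```

With a real document, the call would look like write_serialized('output/test.pkl', doc.to_bytes). This does not fix the TypeError itself.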
About this issue
- State: open
- Created 6 years ago
- Reactions: 6
- Comments: 15 (1 by maintainers)
I ran into the same issue when running nlp.pipe with multiple processes.

Since this is actively blocking me, I found a temporary workaround. I added a component after my last pipeline component (i.e. after='coreference_resolver') that converts the coreferences into entities, so I no longer need the coref metadata, which was unserializable.

Can you please provide a more complete example? I used your code snippet, but unfortunately I have no access to the coref data.
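The workaround code itself is not shown in the thread. As a rough sketch of the same idea (copying the coreference information into plain data that can be serialized), the function below turns clusters into dictionaries of token offsets and text. It is not the commenter's code and does not create entities. It assumes each cluster exposes main and mentions, as neuralcoref's doc._.coref_clusters is expected to, and that each mention has start, end and text like a spaCy Span. The SimpleNamespace objects are hypothetical stand-ins so the example runs without spaCy installed.

```python
import json
from types import SimpleNamespace


def clusters_to_records(clusters):
    """Copy coref clusters into plain dicts that json (or pickle) can handle."""
    records = []
    for cluster in clusters:
        records.append({
            "main": cluster.main.text,
            "mentions": [
                {"start": m.start, "end": m.end, "text": m.text}
                for m in cluster.mentions
            ],
        })
    return records


def span(start, end, text):
    # Hypothetical stand-in for a spaCy Span (token offsets plus text).
    return SimpleNamespace(start=start, end=end, text=text)


# Mimics the cluster from the error message: My sister: [My sister, She]
my_sister = span(0, 2, "My sister")
she = span(6, 7, "She")
fake_clusters = [SimpleNamespace(main=my_sister, mentions=[my_sister, she])]

records = clusters_to_records(fake_clusters)
payload = json.dumps(records)  # plain data serializes without error
print(payload)
```

With a real pipeline, the call would presumably be clusters_to_records(doc._.coref_clusters), and the records could be saved next to the document's text.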
I’ve had this issue too, while trying to call doc_bytes = doc.to_bytes()