spaCy: Using GPU on Windows leads to unexpected results
As reported here and here, using a GPU on Windows returns unexpected parsing results. This issue was opened at the request of @adrianeboyd. Below are example sentences, each followed by its tokenisation, the unexpected POS tags, and the unexpected DEP labels:
s = "The decrease in 2008 primarily relates to the decrease in cash and cash equivalents 1.\n"
['The', 'decrease', 'in', '2008', 'primarily', 'relates', 'to', 'the', 'decrease', 'in', 'cash', 'and', 'cash', 'equivalents', '1', '.', '\n']
['VERB', 'PRON', 'PROPN', 'NOUN', 'VERB', 'ADV', 'VERB', 'NUM', 'PRON', 'NOUN', 'VERB', 'PROPN', 'PROPN', 'VERB', 'VERB', 'NOUN', 'SPACE']
['dep', 'dep', 'dep', 'dep', 'dep', 'dep', 'dep', 'dep', 'dep', 'dep', 'dep', 'dep', 'dep', 'dep', 'dep', 'ROOT', '']
s = "The Company's current liabilities of &euro;32.6 million primarily relate to deferred income from collaborative arrangements and trade payables.\n"
['The', 'Company', "'s", 'current', 'liabilities', 'of', '&', 'euro;32.6', 'million', 'primarily', 'relate', 'to', 'deferred', 'income', 'from', 'collaborative', 'arrangements', 'and', 'trade', 'payables', '.', '\n']
['NOUN', 'VERB', 'AUX', 'NOUN', 'NOUN', 'PROPN', 'PROPN', 'PROPN', 'VERB', 'VERB', 'ADV', 'VERB', 'VERB', 'NOUN', 'NOUN', 'PROPN', 'NOUN', 'PROPN', 'VERB', 'NUM', 'NOUN', 'SPACE']
['dep', 'dep', 'dep', 'dep', 'dep', 'dep', 'dep', 'dep', 'dep', 'punct', 'dep', 'dep', 'dep', 'dep', 'dep', 'dep', 'dep', 'dep', 'dep', 'dep', 'ROOT', '']
s = 'The increase in deferred income is related to new deals with partners.\n'
['The', 'increase', 'in', 'deferred', 'income', 'is', 'related', 'to', 'new', 'deals', 'with', 'partners', '.', '\n']
['NOUN', 'PROPN', 'PROPN', 'VERB', 'NOUN', 'NOUN', 'NOUN', 'VERB', 'ADV', 'VERB', 'NOUN', 'VERB', 'NOUN', 'SPACE']
['dep', 'dep', 'dep', 'dep', 'dep', 'dep', 'dep', 'dep', 'dep', 'punct', 'dep', 'dep', 'ROOT', '']
Example repo with data here.
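The lists above can be produced, and the symptom quantified, with something like the following. This is only a minimal sketch and not the exact script from the example repo: the helper names `parse` and `fallback_dep_share` are hypothetical, and the model name is an assumption taken from later comments in this thread. `spacy.prefer_gpu()` is the standard spaCy call for opting into the GPU. The share of the generic `dep` label ("unclassified dependent") is used here as a rough indicator that the parser output has degenerated.

```python
from collections import Counter


def parse(text, model_name="en_core_web_sm", use_gpu=True):
    """Return (tokens, POS tags, dependency labels) for `text`."""
    import spacy  # lazy import: the check below works without spaCy installed

    if use_gpu:
        # Returns False and stays on CPU when no usable GPU is found.
        spacy.prefer_gpu()
    doc = spacy.load(model_name)(text)
    return [t.text for t in doc], [t.pos_ for t in doc], [t.dep_ for t in doc]


def fallback_dep_share(dep_labels):
    """Fraction of non-empty labels that are the generic 'dep' label."""
    labels = [d for d in dep_labels if d]
    if not labels:
        return 0.0
    return Counter(labels)["dep"] / len(labels)


# Labels copied from the first example in this report.
reported = ["dep"] * 15 + ["ROOT", ""]
print(fallback_dep_share(reported))  # 0.9375
```

Running `parse` once with `use_gpu=True` and once, in a fresh process, with `use_gpu=False` gives two sets of labels to compare. A share close to 1.0 on ordinary sentences suggests the parser weights are not being applied correctly.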
Is it perhaps possible to include Windows in the integrated testing? It might be interesting to have an idea about how many people who use this package are actually on Windows. (I use it for testing and prototyping, mainly.)
spaCy version info
- spaCy version: 2.2.3
- Platform: Windows-10-10.0.18362-SP0
- Python version: 3.7.5
About this issue
- Original URL
- State: closed
- Created 5 years ago
- Comments: 23 (17 by maintainers)
Looking great! Nice to see that Windows isn’t completely left behind in the development world. Even though I run my heavy-duty stuff on our servers, I do a lot of testing at home so I love every bit of cross-platform compatibility that I see.
Thanks a lot for your work!
Closing this after the fix by @svlandeg over at https://github.com/explosion/thinc/pull/149
@BramVanroy: Finally found and fixed the issue 😃 cf. PR https://github.com/explosion/thinc/pull/149
Thanks for all your testing!
Unfortunately, yes, though slightly different:
This is with en_core_web_sm. With en_core_web_md, for instance, the parse looks perfectly ok.
Yea, thinc_gpu_ops has some problems installing on Windows - I run into the same. The good news is that https://github.com/explosion/thinc/pull/117 has resolved that - the bad news is that this may be causing the recent regressions on GPU. That’s why I wanted to verify that you hadn’t seen this behaviour on 2.1.8 (and you haven’t - which is good in a sense). I’ll continue further down the rabbit hole 😃
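When comparing environments (for example 2.1.8 against 2.2.3), it can help to record which GPU-related packages Python can actually locate. The following is a hedged sketch using only the standard library. The helper name `importable` is hypothetical, and checking for `cupy` is an assumption, since only thinc_gpu_ops is named in this thread.

```python
import importlib.util


def importable(names):
    """Map each top-level module name to whether Python can locate it."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


# "thinc_gpu_ops" is the package discussed above; "cupy" is an assumption.
print(importable(["thinc_gpu_ops", "cupy"]))
```

Note that this only shows whether a module can be found. A compiled extension can still fail when it is actually imported, so a `True` here does not prove the GPU code path works.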