pandarallel: Setting progress_bar=True freezes execution for parallel_apply before reaching 1% completion on all CPUs
When `progress_bar=True`, I noticed that the execution of my `parallel_apply` task stopped right before all parallel processes reached the 1% progress mark.
Here are some further details of what I was encountering:
- I turned on `logging` with `DEBUG` messages, but no messages were displayed when the execution stopped. There were no error messages either. The dataframe rows simply stopped being processed and the process seemed to be frozen.
- I have two CPUs. It seems that the progress bar only updates in 1% increments. One of the progress bars reaches the 1% mark, but when the number of processed rows reaches the 2% mark (which I assume corresponds to the second progress bar updating to 1% as well), the process freezes.
- The process runs fine with `progress_bar=False`.
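
For reference, here is a minimal sketch of the setup described above, not the original code. The names `row_sum` and `apply_rows` and the column names `a` and `b` are hypothetical stand-ins for the real workload, and `nb_workers=2` just mirrors the two CPUs mentioned. The helper defaults to `progress_bar=False`, the setting reported to run fine, and falls back to plain `DataFrame.apply` if pandarallel is not installed.

```python
import pandas as pd


def row_sum(row):
    # Hypothetical per-row function standing in for the real workload.
    return row["a"] + row["b"]


def apply_rows(df, func, parallel=True, progress_bar=False):
    """Apply func to each row, in parallel when pandarallel is available."""
    if parallel:
        try:
            from pandarallel import pandarallel
        except ImportError:
            pandarallel = None  # not installed: fall back to plain pandas
        if pandarallel is not None:
            # progress_bar=True is the setting reported to freeze;
            # progress_bar=False is the one reported to run fine.
            pandarallel.initialize(nb_workers=2, progress_bar=progress_bar)
            return df.parallel_apply(func, axis=1)
    # Serial path: same result, no worker processes involved.
    return df.apply(func, axis=1)
```

Calling `apply_rows(df, row_sum, progress_bar=True)` would correspond to the configuration that froze for me, while `apply_rows(df, row_sum)` corresponds to the one that ran to completion.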
About this issue
- State: closed
- Created 3 years ago
- Reactions: 13
- Comments: 25 (5 by maintainers)
Similar issue here, and I'm only working on about 12k rows. It seems to get to about 300 completed items on each core, then all of the forked processes just seem to die. It's almost like it's trying to create new threads but then just sits there, with all cores basically unused.
Python 3.6.9 on Ubuntu-18.04 WSL2
**Edit:** I stopped enabling `progress_bar` in my little console application, and whatever deadlock was occurring seems to have disappeared. It is now progressing pretty well.
I’m assuming this has been fixed.