pandarallel: pandarallel_apply crashes with OverflowError: int too big to convert
Hi everyone,
I am getting the following error when using parallel_apply on a pandas DataFrame:
  File "extract_specifications.py", line 156, in <module>
    extracted_data = df.parallel_apply(extract_raw_infos, axis=1)
  File "/home/tom/.local/lib/python3.6/site-packages/pandarallel/pandarallel.py", line 367, in closure
    kwargs,
  File "/home/tom/.local/lib/python3.6/site-packages/pandarallel/pandarallel.py", line 239, in get_workers_args
    zip(input_files, output_files, chunk_lengths)
  File "/home/tom/.local/lib/python3.6/site-packages/pandarallel/pandarallel.py", line 238, in <listcomp>
    for index, (input_file, output_file, chunk_length) in enumerate(
  File "/home/tom/.local/lib/python3.6/site-packages/pandarallel/pandarallel.py", line 169, in wrapper
    time=time,
  File "/home/tom/.local/lib/python3.6/site-packages/pandarallel/utils/inliner.py", line 34, in wrapper
    return function(*args, **kwargs)
  File "/home/tom/.local/lib/python3.6/site-packages/pandarallel/utils/inliner.py", line 464, in inline
    func_instructions, len(b"".join(pinned_pre_func_instructions_without_return))
  File "/home/tom/.local/lib/python3.6/site-packages/pandarallel/utils/inliner.py", line 34, in wrapper
    return function(*args, **kwargs)
  File "/home/tom/.local/lib/python3.6/site-packages/pandarallel/utils/inliner.py", line 314, in shift_instructions
    for instruction in instructions
  File "/home/tom/.local/lib/python3.6/site-packages/pandarallel/utils/inliner.py", line 314, in <genexpr>
    for instruction in instructions
  File "/home/tom/.local/lib/python3.6/site-packages/pandarallel/utils/inliner.py", line 34, in wrapper
    return function(*args, **kwargs)
  File "/home/tom/.local/lib/python3.6/site-packages/pandarallel/utils/inliner.py", line 293, in shift_instruction
    return bytes((operation,)) + int2python_bytes(python_ints2int(values) + qty)
  File "/home/tom/.local/lib/python3.6/site-packages/pandarallel/utils/inliner.py", line 34, in wrapper
    return function(*args, **kwargs)
  File "/home/tom/.local/lib/python3.6/site-packages/pandarallel/utils/inliner.py", line 71, in int2python_bytes
    return int.to_bytes(item, nb_bytes, "little")
OverflowError: int too big to convert
I am using
pandarallel == 1.4.2
pandas == 0.24.2
python == 3.6.9
Any idea how to proceed from here? I have basically no idea what could cause this bug. I suspect it might be related to the size of the data in one column (I store the HTML of web pages there), but beyond that I have no idea. I would be happy to help fix this bug(?) if I had some guidance. Thanks for helping.
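For context on the last frame of the traceback: int.to_bytes raises exactly this error whenever the integer does not fit in the requested number of bytes. The sketch below reproduces that behaviour in isolation. The helper encode_operand is a hypothetical stand-in for the call shown on line 71 of inliner.py, and the one-byte width is an assumption, since the traceback does not show how pandarallel computes nb_bytes.

```python
def encode_operand(value: int, nb_bytes: int) -> bytes:
    """Hypothetical stand-in for the int.to_bytes call in the traceback."""
    return int.to_bytes(value, nb_bytes, "little")


# A value that fits in one byte encodes without trouble.
fits = encode_operand(200, 1)

# Adding an offset (like `qty` in shift_instruction) can push the value
# past what the assumed one-byte width can hold.
shifted = 200 + 100

try:
    encode_operand(shifted, 1)
    overflowed = False
except OverflowError as exc:
    overflowed = True
    print(f"OverflowError: {exc}")

# The same value encodes fine once two bytes are available.
wider = encode_operand(shifted, 2)
print(fits, wider)
```

If this is what happens inside the inliner, the trigger would be the shifted value rather than the size of the DataFrame contents, but that is only a guess based on the traceback.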
About this issue
- State: closed
- Created 5 years ago
- Reactions: 23
- Comments: 21 (4 by maintainers)
Commits related to this issue
- Fix #63 by adding an alternative way to wrap progress_pre_func — committed to dair-targ/pandarallel by dair-targ 4 years ago
Same here. Is there any way to make it work with the progress bar?
Same error here too. It would be nice to have the progress bar.
Just edit
PYTHON_PATH/python3.6/site-packages/pandarallel/utils/inliner.py
line 71
Are there any updates for this issue? I have the same problem and can’t solve it.
I checked that my apply function works correctly when progress_bar is set to False. The problem seems to be version-related. Please fix it.
Thanks. So without the progress bar it seems to work. Thanks for the hint. As it is quite a long computation, it would be nice to have it though…
Here is some more information about my dataset. It includes a lot of Python objects holding quite large strings (the largest one being html_content).