astropy: multiprocessing + astropy.units in conjunction with units.cds.enable() fails on Linux systems

Description

The code under Steps to Reproduce raises an exception on some machines; in my case, a Linux system.

The behavior is also triggered when replacing multiprocessing.Pool with pathos.pools.ProcessPool, which is how the bug was initially found: pathos issue 254

Note that in the original case, the bug was triggered even though the call to units.cds.enable() was in a different module that was executed long before the Pool was instantiated. The same code works fine on my Windows machine and on a Mac.
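One possible explanation for the platform difference, offered as an assumption rather than a confirmed diagnosis: with Python 3.8, multiprocessing defaults to the fork start method on Linux and to spawn on Windows and macOS. A forked worker inherits the parent's in-memory state, including whatever cds.enable() changed, while a spawned worker starts a fresh interpreter and, in the script under Steps to Reproduce, never runs cds.enable() because the call sits behind the `__main__` guard. The sketch below only reports which start methods are in play on the current machine; the helper name describe_start_methods is hypothetical.

```python
import multiprocessing as mp


def describe_start_methods():
    """Return (default start method, all start methods available here)."""
    return mp.get_start_method(), mp.get_all_start_methods()


if __name__ == "__main__":
    default, available = describe_start_methods()
    print(f"default start method: {default}")
    print(f"available start methods: {available}")
```

If the assumption holds, creating the pool through mp.get_context("spawn") on Linux should mimic the Windows and macOS behavior; that has not been verified here.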

Expected behavior

print out one year in seconds

Actual behavior

exception raised, astropy.units.core.UnitConversionError: 'yr' (time) and 's' (time) are not convertible

Steps to Reproduce

run this:

from multiprocessing import Pool

from astropy import units as u


def run_me(t):
    t_ = t.to(u.s)
    print(t_)


if __name__ == "__main__":
    from astropy.units import cds

    # the culprit is in the next line. Commenting it out will fix the problem
    cds.enable()
    p = Pool(processes=8)
    t = 1.*u.yr
    args = [t, t, t, t, t, t, t, t]
    result = p.map(run_me, args)

System Details

astropy 4.3.1 python 3.8.12 dill 0.3.5.1 ppft 1.7.6.5

About this issue

  • State: open
  • Created 2 years ago
  • Comments: 15 (9 by maintainers)

Most upvoted comments

Thank you for your comments. In our case, we can just avoid using cds.enable().

I could trace this down to these lines: https://github.com/astropy/astropy/blob/d669a5391b2b4efba995dccd6d12629766c1c650/astropy/units/core.py#L1085-L1092

The expression in this if statement is True without parallelization (calling run_me(t) directly) and False when called from within the multiprocessing pool. When it is False, the following lines raise a UnitConversionError. This already happens when the pool size is 1 (Pool(processes=1)) and the args list has a single element (args = [t]).

With print statements added for debugging:

            # Check quickly whether equivalent.  This is faster than
            # `is_equivalent`, because it doesn't generate the entire
            # physical type list of both units.  In other words it "fails
            # fast".
            if(self_decomposed.powers == other_decomposed.powers and
               all(self_base is other_base for (self_base, other_base)
                   in zip(self_decomposed.bases, other_decomposed.bases))):
                return self_decomposed.scale / other_decomposed.scale
            else:
                print(self_decomposed.bases)
                print(other_decomposed.bases)
                print(self_decomposed.bases[0].__class__,
                      other_decomposed.bases[0].__class__)

I get:

[Unit("s")]
[Unit("s")]
<class 'astropy.units.core.Unit'> <class 'astropy.units.core.IrreducibleUnit'>

So it seems there is a mismatch in the unit class. I don’t know why this happens, or why it happens only in the parallel case.
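The check quoted above compares bases with `is`, so it needs the very same object on both sides, not just an equal one. Pool.map pickles each argument in the parent and unpickles it in the worker, and unless a class takes care to map back to a shared instance, the unpickled object is a new one. The following stdlib-only sketch shows that general effect. Here types.SimpleNamespace is a stand-in for a unit base, not astropy's actual class, and the sketch does not reproduce the Unit versus IrreducibleUnit mismatch itself.

```python
import pickle
from types import SimpleNamespace

# Hypothetical stand-in for a unit base object (not an astropy class).
original = SimpleNamespace(name="s")

# Pickle round trip, similar to what happens to arguments sent to a worker.
round_tripped = pickle.loads(pickle.dumps(original))

bases_a = [original]
bases_b = [round_tripped]

# Equality survives the round trip ...
equal_by_value = all(a == b for a, b in zip(bases_a, bases_b))
# ... but an identity test like the one in the fast path does not.
same_objects = all(a is b for a, b in zip(bases_a, bases_b))

print(equal_by_value, same_objects)
```

Whether astropy's own pickling support returns the registered instance in the worker, and how cds.enable() affects that lookup, is not established in this thread.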

@mmckerns Sorry for the confusion: the bug is triggered with both multiprocessing.Pool and pathos.pools.ProcessPool. I have not used multiprocess. I originally found the bug in conjunction with pathos; while playing with the MWE, I found that it behaves the same with multiprocessing.