ray: [Bug] modin-on-ray's unwrap_partitions is 100x slower on Mac than on Windows

Search before asking

  • I searched the issues and found no similar issues.

Ray Component

Ray Core

What happened + What you expected to happen

I was doing some development on Modin and wanted to call unwrap_partitions, which uses Ray to materialize the partitions constituting a Modin dataframe. I found that the function call took 122 seconds on my MacBook, but 721 milliseconds on a Windows computer.

Versions / Dependencies

Before running the reproduction script, install Modin with

pip install modin

and download this file.

My Mac is on:

  • MacBook Pro (16-inch, 2019), macOS Big Sur 11.5.2
  • Python 3.8.8
  • Ray 1.8.0
  • Modin 0.11.1+37.g0a3acc15
  • RAM: 16 GB 2667 MHz DDR4
  • ray.cluster_resources(): {'CPU': 16.0, 'object_store_memory': 3248034201.0, 'memory': 6496068404.0, 'node:127.0.0.1': 1.0}

The Windows computer is:

  • Windows 10 machine with 10 cores
  • Python 3.8.5
  • Ray 1.7.1
  • Modin 0.11.1+45.g41213581

Reproduction script

from modin.distributed.dataframe.pandas import unwrap_partitions
import modin.pandas as pd
import ray
from modin.config import NPartitions

fdf = pd.read_csv("test_700kx256.csv")
%time ray.wait(unwrap_partitions(fdf, axis=0), num_returns=NPartitions.get())

Anything else

Some other Modin operations have been drastically slower on my Mac than on the same Windows machine. I have observed the same problem with other Macs. I have tried to provide a simple, reproducible example here.

Are you willing to submit a PR?

  • Yes I am willing to submit a PR!

About this issue

  • Original URL
  • State: closed
  • Created 3 years ago
  • Comments: 64 (49 by maintainers)

Most upvoted comments

Haha wow, I didn’t realize this was actually faster on Windows! First time something is faster on Windows I think 😃

@devin-petersohn @mvashishtha are y’all able to reproduce this without a dataset? Or really ideally, simulate modin’s behavior here with just the ray core?

cc @scv119 @rkooo567

After some reading, I think you might be able to use vmstat to monitor the swapped-out pages during the process; this is the best indication of thrashing.
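To make the vmstat suggestion concrete, here is a small sketch (my own, not from the thread) that parses the output of `vmstat 1` and extracts the "so" column (pages swapped out per second); a sustained spike in that column during the slow run would point to thrashing. The sample output below is illustrative, following the standard Linux vmstat layout.

```python
def swap_out_per_interval(vmstat_output: str) -> list:
    """Return the 'so' (pages swapped out per second) value for each sample."""
    lines = [ln for ln in vmstat_output.strip().splitlines() if ln.strip()]
    # vmstat prints two header lines: a group line, then the column names.
    columns = lines[1].split()
    so_index = columns.index("so")
    return [int(row.split()[so_index]) for row in lines[2:]]

# Illustrative sample of `vmstat 1` output captured during a run:
sample = """\
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 1  0      0 123456  78901 234567    0    0    10    20  100  200  5  2 92  1  0
 4  0  51200  10240  12345  67890    0 4096    30  8000  500  900 40 20 30 10  0
"""

print(swap_out_per_interval(sample))  # -> [0, 4096]; the 4096 spike suggests thrashing
```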

If it indeed is thrashing, then it seems we should set a limit on the object store size that can be allocated to /tmp by default. On Linux, we already raise an error if the /tmp object store is configured to be >10GB. However, on other platforms no error is raised: https://github.com/ray-project/ray/blob/master/python/ray/_private/services.py#L1851
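The kind of guard being proposed could look like the following sketch. This is not Ray's actual code; the function name, message, and the 10 GiB threshold mirror the Linux check described above but are hypothetical here.

```python
# Illustrative sketch (not Ray's actual implementation): reject a
# filesystem-backed object store above a size cap, mirroring the >10GB
# check Ray applies to /tmp-backed stores on Linux.

MAX_TMP_BACKED_STORE_BYTES = 10 * 1024**3  # 10 GiB cap, as in the Linux check

def validate_object_store_size(size_bytes: int, plasma_directory: str) -> None:
    """Raise if a /tmp-backed object store is configured above the cap."""
    if plasma_directory.startswith("/tmp") and size_bytes > MAX_TMP_BACKED_STORE_BYTES:
        raise ValueError(
            f"Object store of {size_bytes} bytes in {plasma_directory} exceeds "
            f"the {MAX_TMP_BACKED_STORE_BYTES}-byte limit for disk-backed stores; "
            "a store this large on a slow filesystem can cause thrashing."
        )

# The ~3.2 GB store from the bug report would pass this check:
validate_object_store_size(3_248_034_201, "/tmp")
# validate_object_store_size(12 * 1024**3, "/tmp")  # would raise ValueError
```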

Running the same script on my Mac got the following result:

start
2021-12-07 18:25:41,116	INFO services.py:1272 -- View the Ray dashboard at http://127.0.0.1:8265
2021-12-07 18:25:41,118	INFO services.py:1769 -- object_store_memory is not verified when plasma_directory is set.
done 3.896249771118164
4.212249994277954
(f pid=98552) ObjectRef(623b26bdd75b28e9ffffffffffffffffffffffff0100000002000000)
(f pid=98552) took  2.701543092727661
(f pid=98547) ObjectRef(69a6825d641b4613ffffffffffffffffffffffff0100000002000000)
(f pid=98547) took  2.770854949951172
(f pid=98544) ObjectRef(63964fa4841d4a2effffffffffffffffffffffff0100000002000000)
(f pid=98544) took  2.8403570652008057
(f pid=98554) ObjectRef(480a853c2c4c6f27ffffffffffffffffffffffff0100000002000000)
(f pid=98554) took  2.803499698638916
(f pid=98551) ObjectRef(32cccd03c567a254ffffffffffffffffffffffff0100000002000000)
(f pid=98551) took  2.7818920612335205
(f pid=98553) ObjectRef(ee4e90da584ab0ebffffffffffffffffffffffff0100000002000000)
(f pid=98553) took  2.8101229667663574
3.543503999710083
(f pid=98548) ObjectRef(4ee449587774c1f0ffffffffffffffffffffffff0100000002000000)
(f pid=98548) took  2.8532378673553467
(f pid=98543) ObjectRef(a67dc375e60ddd1affffffffffffffffffffffff0100000002000000)
(f pid=98543) took  2.8480982780456543

@wuisawesome it seems like his workload is slow without object spilling (whereas it is faster on your machine). Isn't there a possibility that the issue is where the plasma object store is located? (On Mac, it is /tmp, and maybe his /tmp is slow; on Windows, it probably uses shm.)
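One quick way to probe this hypothesis, sketched below with stdlib-only code (my own, not from the thread), is to time sequential writes to the directory that would back the plasma store. A RAM-backed path such as /dev/shm on Linux should be far faster than a slow disk-backed /tmp; the paths and sizes are illustrative.

```python
import os
import tempfile
import time

def write_throughput_mb_s(directory: str, total_mb: int = 64, chunk_mb: int = 8) -> float:
    """Write total_mb of data to a temp file in `directory`; return MB/s."""
    chunk = os.urandom(chunk_mb * 1024 * 1024)
    start = time.perf_counter()
    with tempfile.NamedTemporaryFile(dir=directory) as f:
        for _ in range(total_mb // chunk_mb):
            f.write(chunk)
        f.flush()
        os.fsync(f.fileno())  # force the data to the backing store
    return total_mb / (time.perf_counter() - start)

print(f"/tmp: {write_throughput_mb_s('/tmp'):.1f} MB/s")
# On Linux, compare against a RAM-backed path:
# print(f"/dev/shm: {write_throughput_mb_s('/dev/shm'):.1f} MB/s")
```

A large gap between the two numbers would support the theory that the plasma directory's filesystem, rather than Ray itself, is the bottleneck.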

Thanks for following up on this @devin-petersohn and thanks for the repro, we will prioritize it!