dvc: dvc run unexpectedly modifies PATH before running commands

Steps to reproduce:

  1. pyenv install miniconda3-latest
  2. pyenv shell miniconda3-latest
  3. conda install dvc==0.59.2
  4. conda create -n testenv python=3.7
  5. mkdir dvc-test && cd dvc-test
  6. dvc init && dvc run -f tmp.dvc 'echo $PATH'

Expected output:

  • Conda env /bin
  • Pyenv shim /bin
  • Conda base /bin
  • Pyenv shim /bin
  • etc

Actual output:

  • Conda base /bin
  • Pyenv shim /bin
  • Conda env /bin
  • Pyenv shim /bin
  • Conda base /bin
  • Pyenv shim /bin
  • etc

Somehow part of the PATH from wherever DVC is installed is getting prepended to PATH before dvc run-ing something. This potentially breaks code in the project.
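One way to check for this kind of prepending independently of DVC is to compare the PATH of the current process with the PATH reported by a freshly spawned non-interactive shell, which is roughly what happens when a tool runs a command through the shell. The following is only a minimal sketch: the helper names `prepended_entries` and `child_shell_path` are hypothetical and not part of DVC, and it assumes the child's PATH ends with the parent's PATH unchanged.

```python
import os
import subprocess


def prepended_entries(parent_path, child_path, sep=os.pathsep):
    """Return the entries that sit in front of the parent's PATH in the child.

    Assumes the child PATH is "<extra entries> + <parent PATH>"; returns None
    if the child PATH does not end with the parent PATH.
    """
    parent = parent_path.split(sep)
    child = child_path.split(sep)
    offset = len(child) - len(parent)
    if offset < 0 or child[offset:] != parent:
        return None
    return child[:offset]


def child_shell_path(shell=None):
    """Ask a non-interactive child shell for its PATH (hypothetical helper)."""
    shell = shell or os.environ.get("SHELL", "/bin/sh")
    result = subprocess.run(
        [shell, "-c", 'printf %s "$PATH"'],
        stdout=subprocess.PIPE,
        universal_newlines=True,
        check=True,
    )
    return result.stdout


if __name__ == "__main__":
    extra = prepended_entries(os.environ.get("PATH", ""), child_shell_path())
    print("Prepended by the child shell:", extra)
```

An empty list means the child shell left PATH alone; a non-empty list shows the directories that were added in front.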

Note that installing and running DVC with Pipx instead of Conda does not result in the same problem. The adverse interaction appears to be Conda-specific (although there are many combinations of environments I haven’t yet tried). This particular Pipx installation is not being managed by Conda.

Discord conversation: https://discordapp.com/channels/485586884165107732/485596304961962003/623513705145040919

About this issue

  • State: closed
  • Created 5 years ago
  • Reactions: 1
  • Comments: 17 (13 by maintainers)

Most upvoted comments

Great work! 🎊 Thanks everybody for taking the time!

So we had an interactive debugging session with @benjaminvdb today and found that he had a ~/.zshenv file that was modifying PATH. As it turns out [1]:

".zshenv is sourced on all invocations of the shell, unless the -f option is set. It should contain commands to set the command search path, plus other important environment variables. .zshenv should not contain commands that produce output or assume the shell is attached to a tty."

so it was modifying the PATH whenever dvc spawned a new process. Moving those lines from .zshenv to .zshrc fixed the problem, but it would still be nice for us to protect against such things in the future. To do that, we could consider using the -f option for zsh, and an equivalent option for bash, so that they do not load such files. Will take a look.

[1] http://zsh.sourceforge.net/Intro/intro_3.html
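The -f idea above could look roughly like the following. This is a sketch under stated assumptions, not how DVC spawns commands: `shell_argv` and `shell_env` are hypothetical helper names. For bash, the sketch assumes the closest equivalent is to drop `BASH_ENV` from the environment, since that variable names the file a non-interactive bash sources at startup.

```python
import os


def shell_argv(command, shell):
    """Build an argv that runs `command` in `shell` with fewer startup files."""
    if os.path.basename(shell) == "zsh":
        # -f (NO_RCS) skips ~/.zshenv and the other startup files,
        # apart from the system-wide /etc/zshenv.
        return [shell, "-f", "-c", command]
    # Other shells: run the command without extra flags.
    return [shell, "-c", command]


def shell_env(shell, env):
    """Return a copy of `env` adjusted for the child shell."""
    env = dict(env)
    if os.path.basename(shell) == "bash":
        # Non-interactive bash sources the file named by BASH_ENV, if set.
        env.pop("BASH_ENV", None)
    return env
```

A caller would pass both results to `subprocess.run`, for example `subprocess.run(shell_argv("echo $PATH", "/bin/zsh"), env=shell_env("/bin/zsh", os.environ))`. The trade-off is that users who rely on ~/.zshenv to set up PATH would lose that setup inside the spawned command.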

@shcheklein in their case, is DVC installed in the “parent” Python environment, or in a separate virtualenv? The problem might lie in something that has to do with the parent/child relationship.

I have tried two setups, and both fail in the sense that:

  1. The error ImportError: No module named pandas is returned.
  2. dvc run -o test 'which python > test' outputs /usr/local/bin/python in the test file, where it should point to python in the virtualenv.
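Both symptoms are consistent with PATH being reordered: the first directory on PATH that contains a `python` executable wins. The sketch below illustrates this on a POSIX system using throwaway directories; the directory layout and the helper names `make_fake_python` and `resolve_python` are made up for illustration and do not reflect the actual machine.

```python
import os
import shutil
import stat
import tempfile


def make_fake_python(directory):
    """Create a stub executable named `python` in `directory`."""
    path = os.path.join(directory, "python")
    with open(path, "w") as handle:
        handle.write("#!/bin/sh\n")
    os.chmod(path, os.stat(path).st_mode | stat.S_IXUSR)
    return path


def resolve_python(path_entries):
    """Resolve `python` against an explicit list of PATH entries."""
    return shutil.which("python", path=os.pathsep.join(path_entries))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        venv_bin = os.path.join(root, "venv", "bin")
        system_bin = os.path.join(root, "usr", "local", "bin")
        os.makedirs(venv_bin)
        os.makedirs(system_bin)
        make_fake_python(venv_bin)
        make_fake_python(system_bin)

        # Active virtualenv: its bin directory comes first, so it wins.
        print(resolve_python([venv_bin, system_bin]))
        # The system directory is prepended: the virtualenv is shadowed.
        print(resolve_python([system_bin, venv_bin]))
```

The second lookup resolves to the stub under the system directory even though the virtualenv directory is still on PATH, which matches `which python` reporting /usr/local/bin/python inside dvc run.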

Setup 1

Homebrew
   └─ DVC (`/usr/local/bin/dvc`)
   └─ Virtualenv + Python (`/usr/local/bin/{python,virtualenv}`)
             └─Active virtualenv environment
                       └─  Pandas

Setup 2

Homebrew
   └─ Virtualenv + Python (`/usr/local/bin/{python,virtualenv}`)
             └─Active virtualenv environment
                       └─  DVC
                       └─  Pandas

Also, did they use Homebrew to install Python? Brew is another common factor here, and more likely to cause problems than Zsh, since Brew does its own layer of symlinking.

Yes, Python was installed by Homebrew. (FYI: the Python interpreter that comes with the latest version of macOS (Mojave, version 10.14.6) is 2.7.10 and is 4.5 years old. I figure most people using Python on macOS will have shadowed this outdated version with a more recent one.)

@shcheklein and @efiop asked me to share the output of a few commands on the Discord channels and perhaps it helps if I share it here as well.

> echo $SHELL
> dvc run -f test.dvc 'echo $SHELL'
> ls -la $SHELL
> file $SHELL
/bin/zsh
'test.dvc' already exists. Do you wish to run the command and overwrite it? [y/n] y
Running command:
    echo $SHELL
/bin/zsh
Saving information to 'test.dvc'.

To track the changes with git, run:

    git add test.dvc
-rwxr-xr-x 1 root wheel 610240 May  4 09:05 /bin/zsh
/bin/zsh: Mach-O 64-bit executable x86_64
> cat test.dvc
cmd: echo $SHELL
md5: ee3b44e50705d557b7aa3eef74821f74

I wish I could help out more, but my knowledge of Python environments and DVC internals is very limited. However, let me know if I can help you out with further information and I’m happy to provide it.

@shcheklein in their case, is DVC installed in the “parent” Python environment, or in a separate virtualenv? The problem might lie in something that has to do with the parent/child relationship.

Also, did they use Homebrew to install Python? Brew is another common factor here, and more likely to cause problems than Zsh, since Brew does its own layer of symlinking.

My admittedly convoluted setup:

Linuxbrew
├─ Pyenv
│  └─ Conda                        <- PATH is broken when DVC is installed here
│     └─ Active conda environment  <- PATH is OK when DVC is installed here
└─ Python
   └─ Pipx-managed Virtualenv      <- PATH is OK when DVC is installed here