pyenv-virtualenv: Slow shell performance after running pyenv virtualenv-init

Something on my system is causing very poor shell performance after running pyenv virtualenv-init. Related to #132.

I think the issue is with pyenv version-name. What steps can I take to debug this further?

About this issue

  • State: open
  • Created 6 years ago
  • Reactions: 35
  • Comments: 54 (5 by maintainers)

Most upvoted comments

On zsh, replacing the precmd hook with a precwd hook seems like an OK workaround.

In my zshrc: eval "$(pyenv virtualenv-init - | sed s/precmd/precwd/g)"

then:

~/src $ python --version
Python 2.7.16
~/src $ cd project
~/src/project $ python --version
Python 3.9.9

With precwd hook:

********************************************************************
                      Prompt Benchmark Results
********************************************************************
Warmup duration      8s
Benchmark duration   2.029s
Benchmarked prompts  56
Time per prompt      36.24ms  <-- prompt latency (lower is better)
********************************************************************

With precmd hook:

********************************************************************
                      Prompt Benchmark Results
********************************************************************
Warmup duration      8s
Benchmark duration   2.174s
Benchmarked prompts  6
Time per prompt      362.39ms  <-- prompt latency (lower is better)
********************************************************************

For me, it’s the pyenv sh-activate --quiet (called by _pyenv_virtualenv_hook in my PROMPT_COMMAND) that is taking up the bulk of the time.

Thanks @monopoler08, your solution did not work for me: I did not have a precwd event, only chpwd. I changed your command to:

eval "$(pyenv virtualenv-init - | sed s/precmd/chpwd/g)"

And it worked perfectly.

In my case, the root problem is that pyenv sh-activate is slow and it is executed each time _pyenv_virtualenv_hook is called. And since the hook is called each time we issue a command (or switch directory, if we change precmd to chpwd), it makes all actions in the shell sluggish.

To speed up the shell, we need to make the execution of pyenv sh-activate more selective. Below is my workaround for this issue.

Emphasis mine. Doing this blindly obscures the problem rather than solving it. The solution is to actually improve the performance of pyenv-sh-activate (or rather of the hook that currently calls it). Most of the slowdown appears to be caused by recursive calls into the pyenv shell function / command to run other pyenv[-virtualenv] scripts as well as pyenv libexec utilities.

So, for the sake of a fresh look at this investigative process:

If we can assume that we know what’s a libexec and what isn’t, and that we don’t have to depend on the idea of someone else overriding pyenv or this plugin’s behavior, a fix can be seen as follows:

  • Remove unnecessary string -> value evaluation (rely on exit codes).
  • Determine the script bindir at the top of the script and use it directly.
  • If no venv names are provided on the command line, there is no need to check the python prefix twice. ref: https://github.com/pyenv/pyenv-virtualenv/blob/f8469a1c67973c3c99604c3ec50cb90df38cbdf0/bin/pyenv-sh-activate#L111-L117
    • If getting 3.12.2 as a local/global/shell version, we won’t find a venv at 3.12.2/envs/3.12.2.
    • If getting some_venv_name as a local/global/shell version, we won’t find a venv at some_venv_name/envs/some_venv_name.
    • This shaves off 20-40 ms.
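
As a rough illustration of the first bullet (relying on exit codes instead of string round trips), here is a minimal sketch. The helper is_virtualenv_prefix and its bin/activate check are hypothetical and not taken from pyenv-virtualenv, and the path is only a placeholder.

```shell
# Hypothetical helper, not taken from pyenv-virtualenv: succeeds (exit code 0)
# when the given prefix looks like a virtualenv. The bin/activate check is an
# assumption made for illustration only.
is_virtualenv_prefix() {
  [ -f "$1/bin/activate" ]
}

# Placeholder path; substitute a real version or venv prefix.
prefix="${PYENV_ROOT:-$HOME/.pyenv}/versions/some_venv_name"

# String round trip: forks a subshell, prints text, then compares strings.
if [ "$(is_virtualenv_prefix "$prefix" && echo yes)" = "yes" ]; then
  echo "virtualenv (string comparison)"
fi

# Exit-code form: same decision, no command substitution.
if is_virtualenv_prefix "$prefix"; then
  echo "virtualenv (exit code)"
fi
```

Both branches make the same decision; the second simply avoids the extra subshell, which is where a hook that runs on every prompt tends to lose time.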

The first check still takes 20-40 ms. Can we do any better? I timed date +%s%N | cut -b1-13 1>&2 under zsh at about 2.5 ms per call, so the number of output lines x 2.5 ms is the time spent on the date calls themselves. Littering the scripts with that command shows me this pattern:

[image: timing output from the instrumented scripts, not preserved]

This tells me that the actual prefix-finding script is not a significant cost. Digging deeper, get_current_versions takes 40-60 ms (incredibly variable), and each call to the prefix versions script takes ~20 ms. So there is not much point in optimizing out the other call. Can we do anything about get_current_versions?
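
For anyone who wants to repeat this kind of measurement, the timestamp markers can be wrapped in a small helper. This is only a sketch: the function name mark and the labels are made up here, and it assumes GNU date, since %N (nanoseconds) is not supported by the BSD date shipped with macOS.

```shell
# Print a label and the current Unix time in milliseconds to stderr.
# Assumes GNU date; the name "mark" is hypothetical.
mark() {
  # %s%N is seconds followed by nanoseconds; the first 13 characters
  # are the millisecond timestamp.
  printf '%s %s\n' "$1" "$(date +%s%N | cut -b1-13)" >&2
}

mark "before"
sleep 0.05   # stand-in for the call being measured
mark "after"
```

Subtracting consecutive timestamps gives the elapsed time per step; remember to subtract the cost of the date calls themselves, as described above.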

Logically speaking, the answer is yes: it’s effectively the result of https://github.com/pyenv/pyenv/blob/5b4d5a32d343dcae5e7b3f1a09850312f89ba868/libexec/pyenv-version-name. The current versions (assuming no other plugins) can change as a result of the following:

  • a changed directory
  • the relevant .python-version file changing (if done manually or directly by some plugin, screw that)
  • pyenv shell

So, the ideal would be

  • don’t check venv prefixes twice unnecessarily
    • have the single call be a bash function in the sh-activate script rather than its own script? Sourcing is relatively fast by my benchmarks, so we might want to move it to libexec and source that in the bin/ script.
  • call the scripts directly rather than via indirection
  • don’t go back and forth between strings and values unnecessarily
  • only read the .python-version file if you have to, which means:
    • if PYENV_VERSION is set, skip calling get_current_versions and spawning subshells (that info is already in the env var).
    • cache the result of get_current_versions somehow. Maybe change the hook to evaluate on chpwd, get the versions, cache the (dir, versions) tuple, pass the versions in to sh-activate, and treat a changed directory (chpwd) as a cache miss. A separate hook could clear the versions cache every time pyenv global/local is called.
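
To make the caching idea concrete, here is a minimal sketch of a wrapper that re-runs the real hook only when the working directory or PYENV_VERSION has changed. The names _pyenv_virtualenv_hook_cached, _pyenv_virtualenv_hook_reset and _VENV_HOOK_LAST_KEY are hypothetical; only _pyenv_virtualenv_hook comes from pyenv virtualenv-init. It is an illustration of the approach, not the plugin's implementation.

```shell
# Hypothetical wrapper around the hook that `pyenv virtualenv-init -` defines.
# It calls the real hook only when one of its inputs has changed:
# the working directory, or PYENV_VERSION (which `pyenv shell` sets).
# Limitation: edits to .python-version (e.g. via `pyenv local`) are not
# noticed until the directory changes or the cache is reset.
_pyenv_virtualenv_hook_cached() {
  _venv_cache_key="${PWD}|${PYENV_VERSION-}"
  if [ "${_venv_cache_key}" != "${_VENV_HOOK_LAST_KEY-}" ]; then
    _VENV_HOOK_LAST_KEY="${_venv_cache_key}"
    _pyenv_virtualenv_hook
  fi
}

# Hypothetical helper to force a refresh on the next prompt,
# e.g. after running `pyenv local` or `pyenv global`.
_pyenv_virtualenv_hook_reset() {
  unset _VENV_HOOK_LAST_KEY
}
```

In zsh, one way to try it would be to remove the original precmd hook with add-zsh-hook -D precmd _pyenv_virtualenv_hook and register the wrapper with add-zsh-hook precmd _pyenv_virtualenv_hook_cached. On a cache hit the per-prompt cost drops to one string comparison.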

I’ll try to implement this ideal on a branch over the next weekend (focusing only on zsh and, where the changes are not shell-specific magic, bash; sorry). By my estimate this should shave off the majority of the time (>90 ms of ~120 ms per call).

Similar problem on Debian 9 with zsh. Removing eval "$(pyenv virtualenv-init -)" from ~/.zshrc fixes it.

eval "$(pyenv virtualenv-init - | sed s/precmd/chpwd/g)"

When using virtualenv-init in pyenv, by default, _pyenv_virtualenv_hook runs every time a shell prompt appears. To avoid this, I changed the hook to use chpwd instead of precmd. This improves the response speed of the shell because _pyenv_virtualenv_hook then runs only on directory changes.

However, with this change the virtual environment is not activated immediately, because _pyenv_virtualenv_hook does not run when zsh first starts. As a workaround, I modified the hook with the sed command and added a line that runs _pyenv_virtualenv_hook once. This lets the virtual environment activate normally when zsh starts. I hope this information is helpful!

eval "$(pyenv virtualenv-init - | sed s/precmd/chpwd/g)"
_pyenv_virtualenv_hook

More progress. I can reproduce this outside of pyenv now. 😄 I wrote a shell script that creates and calls another script. If the sub-script has a shebang line in it, execution is horrendously slow.

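#!/bin/bash
# Create a tiny sub-script and run it 100 times, with or without a shebang line.
SUB_SCRIPT=$(mktemp)
if [[ $1 == "--bash" ]]; then
  echo "#!/bin/bash" > "$SUB_SCRIPT"
fi
echo "exit" >> "$SUB_SCRIPT"
chmod +x "$SUB_SCRIPT"
for X in $(seq 100); do
  # Command substitution forks a subshell for every run.
  $("$SUB_SCRIPT")
done
rm "$SUB_SCRIPT"

$ time ./test.sh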
./test.sh  0.05s user 0.08s system 37% cpu 0.341 total

$ time ./test.sh --bash
./test.sh --bash  0.24s user 0.23s system 3% cpu 11.869 total

Even on my unaffected system, performance is measurably worse when a shebang line exists.

$ time ./test.sh
./test.sh  0.07s user 0.08s system 94% cpu 0.156 total

$ time ./test.sh --bash
./test.sh --bash  0.15s user 0.16s system 87% cpu 0.351 total

Thanks @monopoler08, your solution did not work for me: I did not have a precwd event, only chpwd. I changed your command to:

eval "$(pyenv virtualenv-init - | sed s/precmd/chpwd/g)"

And it worked perfectly.

This worked for me on macOS; however, there is still the same delay when switching directories. I assume this is because the hook now runs only when the current working directory changes. Until there is a more efficient hook, there will be a delay every time it runs.

To find out what takes time, we need to get a debug trace of what’s running while you perform those actions and how long it takes.

For the first point (what is running):

export PS4='+(${BASH_SOURCE}:${LINENO}): ${FUNCNAME[0]:+${FUNCNAME[0]}(): }'
set -x
<reproduce the problem>
set +x

For the second point (how long it takes), using $SECONDS in PS4 may help. In bash it expands to the number of whole seconds since the shell was started, so it only resolves delays of a second or more. If that variable doesn’t help, you’ll need to intersperse the code that runs when you reproduce the problem with time calls and use those to localize the time hog.
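
If whole seconds are too coarse, a finer-grained variant of the same trace is sketched below. It assumes bash 5.0 or newer, where $EPOCHREALTIME expands to the current Unix time with microsecond resolution; on older bash versions the variable is simply empty. The placeholder command stands in for whatever reproduces the slowness.

```shell
# Assumes bash 5.0+: EPOCHREALTIME is the Unix time with microsecond resolution.
# Single quotes defer the expansion until each trace line is printed.
PS4='+ ${EPOCHREALTIME} ${BASH_SOURCE}:${LINENO}: '
set -x
: "reproduce the problem here"   # placeholder; replace with the slow action
set +x
```

The difference between the timestamps on consecutive trace lines shows roughly how long each traced command took.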

Solution

Warning: this solution is far from perfect; it breaks the pyenv activate/deactivate commands. Use it only if the slowness really bothers you.

Quick Solution

  1. Unload the hook function _pyenv_virtualenv_hook.
  2. Never use pyenv activate/deactivate, because they are broken without the above hook. Stick with pyenv shell env_name and pyenv shell --unset instead.

For ZSH and Bash

# Init pyenv-virtualenv, but 
# unload precmd hook _pyenv_virtualenv_hook
eval "$(pyenv init -)"
eval "$(pyenv virtualenv-init -)"

# Warning: unloading the following hook breaks command
# `pyenv activate/deactivate`. Please switch to
# `pyenv shell env_name`, `pyenv shell --unset` instead.
if [[ -n $ZSH_VERSION ]]; then
  autoload -Uz add-zsh-hook
  add-zsh-hook -D precmd _pyenv_virtualenv_hook
fi
if [[ -n $BASH_VERSION ]]; then
  PROMPT_COMMAND="${PROMPT_COMMAND/_pyenv_virtualenv_hook;/}"
fi

For anyone using fish and looking for an equivalent to the zsh workaround, this seems to work for me:

status --is-interactive; and pyenv virtualenv-init - | sed 's/--on-event fish_prompt/--on-variable PWD/g' | source

Here’s an alternative

if status is-interactive;
    pyenv init - | source
    pyenv virtualenv-init - | sed 's/--on-event fish_prompt/--on-event fish_preexec/g' | source
end

fish_preexec runs before the command is processed. This moves the work to just before the command executes, so the prompt comes back immediately and you can type new commands without delay.

@raphaelchristin, use precwd instead of chpwd. I’m on macOS and it works perfectly fine for me.

❯ /bin/zsh --version
zsh 5.8.1 (x86_64-apple-darwin22.0)

Just pressing Enter with empty input in the shell results in a small yet very noticeable delay before the next prompt.

With the code I gave, you should also have seen the trace of any automatic commands that run at this moment. Wasn’t there any output?

What does this produce for you?

trap -p
echo "$PROMPT_COMMAND"

Thank you for the updated instructions. I have used the following script to test the behavior:

#!/usr/bin/env bash

export PS4='+(${BASH_SOURCE}:${LINENO}) $SECONDS: ${FUNCNAME[0]:+${FUNCNAME[0]}(): } '
set -x
ls
set +x

which results in

+(./virtualenv-prompt.sh:5) 0: main():  ls
dark  out  parse.py  schemenames.json  virtualenv-prompt.sh
+(./virtualenv-prompt.sh:6) 0: main():  set +x

The code snippet you posted outputs an empty string. I have also reinstalled pyenv and the plugin, which sped things up a lot. If I just keep hitting Enter, the new line appears almost instantly. I probably had a bad install or an older version. It seems pyenv wouldn’t update to its 2+ version using the built-in updater.