powerlevel10k: Startup performance is noticeably slow

Hey @romkatv,

You have sparked my interest in https://github.com/denysdovhan/spaceship-prompt/issues/734, so first of all thanks for that 😉

I’m very interested to give this prompt a try, so I installed it using antigen and followed the initial wizard to configure the prompt.

While the experience in general feels smooth, I noticed that the startup time is noticeably slower than what I’m used to with my async version of spaceship. I can’t give you any numbers, and to be honest I didn’t dig deeply into powerlevel configuration to see what can be optimized (I only tried to disable a bunch of segments to no avail), but I did ask a friend for his perception (who I know also uses my async fork) and he confirms that the startup feels slower.

To clarify, by startup I mean the period of time between when I launch a new terminal instance and when the prompt is rendered so I can start typing my commands.

So maybe to start the discussion with something, have you heard similar feedback before? Do you have any idea what might be the cause? Have you specifically attempted to optimize the startup time, or has it not been your focus so far?

I’m happy to assist with whatever I can, running experiments, performing benchmarks (how?), etc. At this point I don’t have a powerlevel configuration I like (so we can start with some default and use it as an example).

I’m on Arch Linux using kitty terminal, zsh 5.7.1.

To me personally it’s very important to get startup time to a minimum, I launch new terminal windows all the time and it feels annoying when I can notice the “loading”. This is a lot more annoying than watching slow sections load (as long as they are asynchronous and not blocking my typing), so if I had to choose, I’d rather have slower async sections but faster startup.

🙂

About this issue

  • State: closed
  • Created 5 years ago
  • Reactions: 4
  • Comments: 79 (59 by maintainers)

Most upvoted comments

Guys, check it out: https://gist.github.com/romkatv/8b318a610dc302bdbe1487bb1847ad99. Instant prompt! 😁

asciicast

It’s a bit gimmicky but it does reduce the perceived ZSH startup latency a lot. You’ll need to define a “loading” prompt that matches your real prompt to achieve best results. For example, I’m using the stock “lean” two-line prompt, so I have this as my loading prompt:

"%B%39F${${(V)${(%):-%~}//\%/%%}//\//%b%31F/%B%39F}%b%f"$'\n'"%76F❯%f "

For @maximbaz this should work, I think:

"%B%3F%D{%H:%M:%S}%b%f%(#. as %1F%n%f.) in %B%6F${${(V)${(%):-%~}//\%/%%}//\//%b%65F/%B%6F}%b%f"$'\n'"%2F❯%f "

Check out the gist. It has instructions at the top. To make it more interesting, add sleep 1 at the bottom of your ~/.zshrc. Let me know what you think.

I still cannot believe my eyes how awesomely fast this is

Frankly, me too 😃

The current implementation works for me out of the box

Great 👍

Just out of curiosity, how did you find the argument to instant-zsh-pre for me (which is perfect btw), did you compose it by hand or is there a way to get this string out of the rendered prompt?

I looked at your dotfiles and wrote it by hand.

Agreed. If you want, we can re-open the issue so that it can attract others interested in this

I’ve an idea that might speed up startup quite a bit without too many code changes. I’ll give it a shot some time next week. Will report here whether it works or not.

I’ve made p10k startup about 2.5-3 times faster. It’s still not super fast but certainly better than before. Please give it a try and let me know if you see an improvement and whether startup latency is good enough now.

Here are some benchmarks. All runs are from my desktop running Ubuntu. Current directory is powerlevel10k Git repo. 100 runs per benchmark. Reported numbers are for a single run.

Baseline: 5.01ms.

time (repeat 100 zsh -dfis <<<'true')

Powerlevel9k with default settings: 171ms

time (repeat 100 zsh -dfis <<< 'source ~/powerlevel9k/powerlevel9k.zsh-theme')

Powerlevel10k with default settings: 59.7ms

With these settings p10k looks the same as p9k.

time (repeat 100 zsh -dfis <<< 'source ~/powerlevel10k/powerlevel10k.zsh-theme')

Powerlevel10k with ~/.p10k.zsh: 78.6ms

~/.p10k.zsh was generated by p10k configure. The choices don’t seem to have much of an effect on startup performance. The reported number is from what I suppose is the slowest configuration with the maximum number of bells and whistles. Here’s the config’s header:

# Generated by Powerlevel10k configuration wizard on 2019-10-02 at 15:42 CEST.
# Based on romkatv/powerlevel10k/config/p10k-classic.zsh, checksum 47547.
# Wizard options: nerdfont-complete + powerline, small icons, classic, dark, time,
# slanted separators, blurred heads, blurred tails, 2 lines, dotted, right frame,
# sparse, many icons, fluent.

time (repeat 100 zsh -dfis <<< 'source ~/.p10k.zsh; source ~/powerlevel10k/powerlevel10k.zsh-theme')

I believe the last benchmark is the most important. For comparison, the same benchmark before my optimizations reports 231ms startup latency. Now it’s 2.9 times faster.
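The speedup quoted above is a plain ratio of per-run latencies. A minimal sketch of that arithmetic follows, using a hypothetical `speedup` helper (not part of powerlevel10k) that delegates the floating-point division to awk:

```shell
# speedup BEFORE_MS AFTER_MS: prints how many times faster AFTER is than BEFORE.
# Hypothetical helper for illustration; not part of powerlevel10k.
speedup() {
  awk -v before="$1" -v after="$2" 'BEGIN { printf "%.1f\n", before / after }'
}

# p10k with ~/.p10k.zsh, before vs. after the optimizations.
speedup 231 78.6   # prints 2.9

# Powerlevel9k vs. Powerlevel10k, both with default settings.
speedup 171 59.7   # prints 2.9
```

The inputs are the millisecond figures reported in the benchmarks; only the helper name is invented.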

Hahaha this is amazing! 😄 I’m definitely keeping this!

BTW: it turns out my terminal doesn’t support $+terminfo[u7] so while I can report this issue, I commented out the whole “cursor position check” block and it works well for me. Can you please clarify what it is for? Is it to catch when you switch to root session with sudo -s? I guess I could replace that commented block with simply (( EUID )) || return 0…

Hmm nice catch…

I don’t get credit for finding this issue. It was reported by users whose zsh broke.

Have you considered introducing some sort of a hook like “prompt_finished_loading” so that users can put commands that request console input in them?

Such commands need to run before instant prompt. They cannot be run between the instant and the real prompts because they’ll eat buffered keyboard input (things like ls ~/projects that got typed when instant prompt was showing). This isn’t important though. What is important is that restructuring zshrc is too difficult for most users and I cannot automate it. Moving interactive commands into special_hook(), or moving them above instant prompt, are both too hard.

@sinetoami I encourage you to post your findings as you go. I might be able to guess the culprit before you find the smallest possible test case.

Do I understand correctly that the number of \n is just an educated guess of how many times a user is able to press Enter while waiting for the real prompt to initialize?

Correct. If you make your window scroll while zsh is still loading, the “loading” prompt won’t be properly erased.

Ideally, it should be prompt-height + n where n is configurable. I’ll do it this way in p10k.
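The "prompt-height + n" rule can be sketched as below. This is only an illustration under stated assumptions: `reserve_lines` is a hypothetical helper, the height is taken as the number of literal newlines in the prompt string plus one (line wrapping and prompt escapes are ignored), and the value of n is up to the user.

```shell
# reserve_lines PROMPT N: prints the prompt height (in lines) plus N spare lines.
# Hypothetical helper; neither the name nor the default for N comes from p10k.
reserve_lines() {
  # wc -l counts newline characters, so a prompt with k newlines is k + 1 lines tall.
  newlines=$(printf '%s' "$1" | wc -l)
  echo $(( newlines + 1 + $2 ))
}

# A two-line prompt (directory on the first line, prompt symbol on the second).
two_line_prompt='~/projects
> '
reserve_lines "$two_line_prompt" 3   # prints 5
```

With a two-line prompt and n = 3, five lines are reserved, so the user can press Enter three times before the window would scroll.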

There’s only one minor bug, when the prompt is near the bottom of the terminal window and instant-zsh is being run, content jumps up.

This is intentional. It’s the best way I found to avoid the problems I mentioned earlier. I figured this is better than calling clear because it gives users freedom. Anyone who liked the previous behavior can simply call clear on their own before calling instant-zsh-pre.

I’m gonna try to integrate instant-zsh into p10k. Not sure how hard it is but worth a try.

I still cannot believe my eyes how awesomely fast this is 😂

The current implementation works for me out of the box 👍

Just out of curiosity, how did you find the argument to instant-zsh-pre for me (which is perfect btw), did you compose it by hand or is there a way to get this string out of the rendered prompt?

@sinetoami Glad to hear you’ve made your zsh load faster 😁 Thanks for feedback, keep it coming.

P.S.

The ā€œleanā€ style that you can choose in p10k configure is really much better than p10-pure.zsh. The latter is an exact replication of Pure, which isn’t that good.

I’ve watched your screencast a few more times and I think I understand what’s going on there. I see that your zsh isn’t very fast to load but I cannot say whether there is difference in latency between zinc and p10k.

Is it possible that you perceive difference in latency because zinc is printing an empty line when it starts up? Maybe you can tell it not to print it? Or, alternatively, you can print the empty line when using p10k. I don’t know at which point zinc prints the empty line so you might want to try several things in order to replicate the same delays.

  1. Print an empty line from ~/.zshrc.

Add echo somewhere in ~/.zshrc. Perhaps before p10k zplugin ice.

  2. Print an empty line right before sourcing p10k.

Add echo to p10k zplugin ice.

  3. Print an empty line right before p10k initialization.

Add print-line() { echo; add-zsh-hook -D precmd print-line }; add-zsh-hook precmd print-line to the p10k zplugin ice.


I’ve attempted to check whether zplugin might be making things slower than they should be. The short answer is no. When powerlevel10k is the only loaded plugin, sourcing it directly is 11ms faster than loading it with zplugin. It’s expected to be faster because it must take some time to load zplugin itself. The extra 11ms added by zplugin isn’t much, and it might pay off if zplugin reduces the total loading time when using many plugins (I haven’t tried to verify this).

I’ve recorded a screencast showing what I’ve measured and how:

asciicast

The screencast starts with the following docker command:

docker run -e LANG=C.UTF-8 -e TERM -it --rm debian:buster bash -uec '
  apt update && apt install -y curl git zsh sudo nano
  useradd -ms /bin/zsh me
  sudo -u me sh -c "$(curl -fsSL \
    https://raw.githubusercontent.com/zdharma/zplugin/master/doc/install.sh)"
  >>~me/.zshrc echo "zplugin ice atload\"source config/p10k-pure.zsh\" lucid
zplugin light romkatv/powerlevel10k
# source ~/.zplugin/plugins/romkatv---powerlevel10k/config/p10k-pure.zsh
# source ~/.zplugin/plugins/romkatv---powerlevel10k/powerlevel10k.zsh-theme"
  exec su - me'

It installs zsh and zplugin on Debian and logs in as a user with the following ~/.zshrc:

### Added by Zplugin's installer
source '/home/me/.zplugin/bin/zplugin.zsh'
autoload -Uz _zplugin
(( ${+_comps} )) && _comps[zplugin]=_zplugin
### End of Zplugin installer's chunk

zplugin ice atload"source config/p10k-pure.zsh" lucid
zplugin light romkatv/powerlevel10k

# source ~/.zplugin/plugins/romkatv---powerlevel10k/config/p10k-pure.zsh
# source ~/.zplugin/plugins/romkatv---powerlevel10k/powerlevel10k.zsh-theme

Note that powerlevel10k and its config are loaded with zplugin.

After logging in, the current directory is changed to ~/.zplugin/plugins/romkatv---powerlevel10k because it’s more interesting to observe what happens next if you are in a Git repository. You can see that true is instant while exec zsh is almost instant.

The following command is used to measure zsh startup latency:

time ( repeat 100 zsh -is <<< '' )

It repeatedly starts interactive zsh sessions and exits them once the first prompt renders. The command takes 5.115s (51ms per run).
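For readers who want the per-run figure printed directly, the same measurement can be wrapped in a small function. This is a rough sketch rather than the method used in the screencast: `bench_ms` is a hypothetical helper, it assumes GNU date (whose `%N` format yields nanoseconds), and the result includes the small overhead of the loop itself.

```shell
# bench_ms RUNS COMMAND: evaluates COMMAND (a string) RUNS times in the current
# shell and prints the average wall-clock time per run in whole milliseconds.
# Hypothetical helper; assumes GNU date, where %N expands to nanoseconds.
bench_ms() {
  runs=$1
  cmd=$2
  i=0
  t0=$(date +%s%N)
  while [ "$i" -lt "$runs" ]; do
    eval "$cmd"
    i=$((i + 1))
  done
  t1=$(date +%s%N)
  # Nanoseconds elapsed, divided by the run count, converted to milliseconds.
  echo $(( (t1 - t0) / runs / 1000000 ))
}

# Usage from zsh or bash, mirroring the command in the text:
#   bench_ms 100 "zsh -is <<< ''"
```

Integer division truncates the result, so treat it as a coarse number and use `time` when finer resolution matters.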

Then ~/.zshrc is edited so that powerlevel10k and its config are sourced directly, without zplugin. Here’s the new ~/.zshrc.

source ~/.zplugin/plugins/romkatv---powerlevel10k/config/p10k-pure.zsh
source ~/.zplugin/plugins/romkatv---powerlevel10k/powerlevel10k.zsh-theme

After starting a new zsh session everything proceeds as before. This time the benchmark completes in 4.016s (40ms per run).
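The per-run numbers and the zplugin overhead follow from the two totals by simple arithmetic. A small sketch, with a hypothetical `per_run_ms` helper and awk doing the floating-point work:

```shell
# per_run_ms TOTAL_SECONDS RUNS: converts a total wall-clock time into ms per run.
# Hypothetical helper for illustration only.
per_run_ms() {
  awk -v total="$1" -v runs="$2" 'BEGIN { printf "%.2f\n", total * 1000 / runs }'
}

with_zplugin=$(per_run_ms 5.115 100)   # 51.15
direct=$(per_run_ms 4.016 100)         # 40.16

# Overhead attributable to loading through zplugin, rounded to whole milliseconds.
awk -v a="$with_zplugin" -v b="$direct" 'BEGIN { printf "%.0f\n", a - b }'   # prints 11
```

The two totals are the ones measured in the screencast; the difference matches the 11ms figure quoted earlier.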

Thanks to docker, it should be relatively straightforward to reproduce this screencast for everyone who wishes to do so.

@sinetoami Your benchmark results indicate that p10k loads slightly faster than zinc.

Now, since the results of this benchmark don’t match your experience when using zsh with your real zsh config files, it means there is something in them that makes it different from the benchmark. Something that causes p10k to load slower than it does in the benchmark. You can try to figure out what it is by removing parts of your config and checking when loading time improves. The first thing I would suggest is to load p10k without zplugin. Simply clone powerlevel10k repository and put source ~/.p10k-pure.zsh and source ~/powerlevel10k/powerlevel10k.zsh-theme in your ~/.zshrc.

I’ve fixed bugs in my previous implementation and rolled it forward. I had to disable some optimizations for zsh < 5.4 due to bugs in quoting that I don’t want to write workarounds for. I implemented a few extra optimizations.

Now p10k loads in 37 ms on my machine (6x improvement). Not instant but decently fast. This time includes starting zsh, sourcing ~/.p10k.zsh, loading the theme and rendering the first prompt. Note that the first prompt is complete, unlike with some other themes that will give you just the current directory and then repaint prompt asynchronously when extra data becomes available.

I think it’s possible to shave off another 10 ms or so but more than that will be difficult. Unless someone complains loudly enough, I’ll leave the code be.

I also might have to revert the change if I broke something again. The changes are quite complex, so there is non-trivial risk of breakage.

I would appreciate it if someone on this issue could update powerlevel10k and verify that loading has gotten faster.

FYI: I’ve reverted my optimizations as they were causing some issues and I don’t have the time to debug them right now. Will debug and roll forward later this week.

Any reason not to just have the configuration wizard run your command above to compile everything?

I was thinking the same thing.

It’s certainly possible. One non-trivial aspect is handling different combinations of permissions and file/directory ownership. Not too difficult. The thing that is stopping me from implementing this right now is potential interaction with plugin managers.

I’ll wait for the fallout from my recent optimizations to subside and then add zcompile once bug reports stop pouring in.

But it makes sense now, powerlevel10k is targeting to optimize the performance of long-running sessions by sacrificing a bit of startup time, and async spaceship has the opposite goal.

I don’t think there is an unavoidable trade-off between these goals. Maybe sometimes you have to choose one over the other but in many cases you can have both. Powerlevel10k is slow to start and fast to use simply because I optimized for the latter without caring for the former. Things you don’t care about tend to be slow but it doesn’t mean they cannot be made fast.