oh-my-posh: Performance: Command execution slow on Windows

Prerequisites

  • I have read and understand the CONTRIBUTING guide
  • [x ] I looked for duplicate issues before submitting this one

Description

Slowness in rendering prompt. It’s noticeably slow in Windows compared to WSL2, even in the same Terminal.

Environment

  • Oh my Posh version: 3.64.2
  • Theme:
  • Operating System: Windows 10 2004
  • Shell: powershell
  • Terminal: default console

Steps to Reproduce

I was trying to figure out why composing the prompt is so slow on Windows. Unfortunately I don’t know much Go, but I muddled around with it and found a couple of things.

Firstly, running and parsing git commands seems quite slow.

PS C:\Users\amol> Measure-Command {posh-windows-amd64.exe --shell zsh --config c:\users\amol\.poshthemes\amol.omp.json}


Days              : 0
Hours             : 0
Minutes           : 0
Seconds           : 0
Milliseconds      : 89
Ticks             : 895987
TotalDays         : 1.03702199074074E-06
TotalHours        : 2.48885277777778E-05
TotalMinutes      : 0.00149331166666667
TotalSeconds      : 0.0895987
TotalMilliseconds : 89.5987



PS C:\Users\amol> Measure-Command {c:\users\amol\appdata\local\Atlassian\SourceTree\git_local\bin\git.exe rev-parse --is-inside-work-tree }
fatal: Not a git repository (or any of the parent directories): .git


Days              : 0
Hours             : 0
Minutes           : 0
Seconds           : 0
Milliseconds      : 37
Ticks             : 376365
TotalDays         : 4.35607638888889E-07
TotalHours        : 1.04545833333333E-05
TotalMinutes      : 0.000627275
TotalSeconds      : 0.0376365
TotalMilliseconds : 37.6365

As you can see here, running the raw git command takes less than half the time it takes to run a posh command whose only config is a git segment (the default one from your theme file).

There is no git repo in that directory. (I was trying to get the fastest execution for the git segment, but interestingly the overall execution time of posh does not change even within a git repo.)

I tried to do some profiling in Go, and came up with this for execution:

I’m always running

posh-windows-amd64.exe --shell zsh --config c:\users\amol\.poshthemes\amol.omp.json

where the theme contains only a git prompt segment, exactly like the jandedobbeleer profile.

H:\github\oh-my-posh3\src>go tool pprof c:\Users\amol\foo
Type: cpu
Time: Jan 1, 2021 at 2:20pm (PST)
Duration: 201.11ms, Total samples = 60ms (29.83%)
Entering interactive mode (type "help" for commands, "o" for options)
(pprof) top10
Showing nodes accounting for 60ms, 100% of 60ms total
Showing top 10 nodes out of 30
      flat  flat%   sum%        cum   cum%
      50ms 83.33% 83.33%       50ms 83.33%  runtime.cgocall
      10ms 16.67%   100%       10ms 16.67%  os/exec.(*Cmd).Start
         0     0%   100%       30ms 50.00%  bufio.(*Reader).ReadLine
         0     0%   100%       30ms 50.00%  bufio.(*Reader).ReadSlice
         0     0%   100%       30ms 50.00%  bufio.(*Reader).fill
         0     0%   100%       30ms 50.00%  internal/poll.(*FD).Read
         0     0%   100%       60ms   100%  main.(*Segment).enabled (inline)
         0     0%   100%       60ms   100%  main.(*Segment).setStringValue
         0     0%   100%       60ms   100%  main.(*engine).setStringValues.func1
         0     0%   100%       10ms 16.67%  main.(*environment).hasCommand
(pprof) quit

In a repo with git, this is the profile

H:\github\oh-my-posh3\src>go tool pprof foo
Type: cpu
Time: Jan 1, 2021 at 2:08pm (PST)
Duration: 412.02ms, Total samples = 220ms (53.40%)
Entering interactive mode (type "help" for commands, "o" for options)
(pprof) top10
Showing nodes accounting for 220ms, 100% of 220ms total
Showing top 10 nodes out of 48
      flat  flat%   sum%        cum   cum%
     210ms 95.45% 95.45%      210ms 95.45%  runtime.cgocall
      10ms  4.55%   100%       10ms  4.55%  runtime.slicerunetostring
         0     0%   100%      150ms 68.18%  bufio.(*Reader).ReadLine
         0     0%   100%      150ms 68.18%  bufio.(*Reader).ReadSlice
         0     0%   100%      150ms 68.18%  bufio.(*Reader).fill
         0     0%   100%       30ms 13.64%  fmt.Fprintf
         0     0%   100%       30ms 13.64%  fmt.Printf
         0     0%   100%      150ms 68.18%  internal/poll.(*FD).Read
         0     0%   100%       30ms 13.64%  internal/poll.(*FD).Write
         0     0%   100%       30ms 13.64%  internal/poll.(*FD).writeConsole
(pprof)
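
For anyone wanting to capture a comparable profile, here is a minimal sketch of writing a CPU profile with Go’s runtime/pprof package, in a form that `go tool pprof` can read. The helper name `profileTo` and the placeholder workload are assumptions for illustration only; this is not necessarily how the profiles above were produced or how oh-my-posh is instrumented.

```go
package main

import (
	"fmt"
	"os"
	"runtime/pprof"
)

// profileTo runs work while recording a CPU profile into the file at path.
// (Hypothetical helper, not part of oh-my-posh.)
func profileTo(path string, work func()) error {
	f, err := os.Create(path)
	if err != nil {
		return err
	}
	defer f.Close()
	if err := pprof.StartCPUProfile(f); err != nil {
		return err
	}
	// Deferred calls run in reverse order: the profile is stopped
	// (and flushed) before the file is closed.
	defer pprof.StopCPUProfile()
	work()
	return nil
}

func main() {
	// "foo" mirrors the profile file name used in the sessions above.
	err := profileTo("foo", func() {
		// Placeholder workload; the real target would be prompt rendering.
		sum := 0
		for i := 0; i < 50000000; i++ {
			sum += i
		}
		fmt.Println(sum)
	})
	if err != nil {
		fmt.Println("profiling failed:", err)
	}
}
```

The resulting file can then be opened with `go tool pprof foo`, as in the sessions above.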

So it seems that running multiple git commands slows things down even more, as you’d expect.

With my less-than-zero knowledge of Go, I can only theorize that:

  1. Perhaps parallelizing all the git commands will help responsiveness, instead of executing them sequentially. I saw a closed issue where you did this for segments, but I wasn’t able to figure out if you tried just the git commands.
  2. Maybe there is an optimization to be made where instead of reading line by line, you read the entire buffer and then parse it line by line. This should help in situations where there is a lot of git output, I would think.
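
Both ideas can be sketched in Go roughly as follows. This is only an illustration under stated assumptions, not oh-my-posh’s actual code: `runParallel`, `splitLines` and `gitOutput` are hypothetical helper names, and the two git commands in `main` are just examples of calls that do not depend on each other.

```go
package main

import (
	"fmt"
	"os/exec"
	"strings"
	"sync"
)

// runParallel runs every task in its own goroutine and returns the
// results in the same order the tasks were given.
func runParallel(tasks []func() string) []string {
	results := make([]string, len(tasks))
	var wg sync.WaitGroup
	for i, task := range tasks {
		wg.Add(1)
		go func(i int, task func() string) {
			defer wg.Done()
			results[i] = task() // each goroutine writes its own slot
		}(i, task)
	}
	wg.Wait()
	return results
}

// splitLines parses output that was read in one go, instead of
// reading the pipe line by line.
func splitLines(output string) []string {
	trimmed := strings.TrimRight(output, "\r\n")
	if trimmed == "" {
		return nil
	}
	return strings.Split(strings.ReplaceAll(trimmed, "\r\n", "\n"), "\n")
}

// gitOutput reads the whole stdout of one git invocation and returns
// an empty string on any error (hypothetical wrapper).
func gitOutput(args ...string) string {
	out, err := exec.Command("git", args...).Output()
	if err != nil {
		return ""
	}
	return string(out)
}

func main() {
	results := runParallel([]func() string{
		func() string { return gitOutput("status", "--porcelain") },
		func() string { return gitOutput("stash", "list") },
	})
	fmt.Println("changed files:", len(splitLines(results[0])))
	fmt.Println("stash entries:", len(splitLines(results[1])))
}
```

With this shape the wall-clock cost is roughly that of the slowest command rather than the sum of all of them, but it only applies to commands whose input does not depend on another command’s output.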

Love the tool, thanks for the great work !

About this issue

  • State: closed
  • Created 3 years ago
  • Comments: 103 (48 by maintainers)

Most upvoted comments

These 3 seemed like the simplest candidates for native Go refactoring in the git segment:

  • Remove external call by natively searching for git repo.
  • Remove external call by saving repo root during enable check.
  • Remove external call by counting the lines in {ROOT_REPO}/.git/logs/refs/stash (if it exists)
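
The third item could look something like the sketch below. It assumes one line per stash entry in `.git/logs/refs/stash` and a regular clone where `.git` is a directory; `countLines` and `stashCount` are hypothetical names, not functions from the proof-of-concept branch.

```go
package main

import (
	"bytes"
	"fmt"
	"os"
	"path/filepath"
)

// countLines returns the number of non-empty lines in data.
func countLines(data []byte) int {
	count := 0
	for _, line := range bytes.Split(data, []byte("\n")) {
		if len(bytes.TrimSpace(line)) > 0 {
			count++
		}
	}
	return count
}

// stashCount reads <repoRoot>/.git/logs/refs/stash and counts its lines.
// A missing or unreadable file is treated as "no stashes".
func stashCount(repoRoot string) int {
	data, err := os.ReadFile(filepath.Join(repoRoot, ".git", "logs", "refs", "stash"))
	if err != nil {
		return 0
	}
	return countLines(data)
}

func main() {
	// Assumes the current directory is the repository root.
	root, err := os.Getwd()
	if err != nil {
		fmt.Println("cannot determine working directory:", err)
		return
	}
	fmt.Println("stash entries:", stashCount(root))
}
```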

I pushed a new proof-of-concept branch trying these changes here: https://github.com/shedric1/oh-my-posh3/tree/native-git

Using all three gave the following results on my machine inside a git repo directory:

...\posh-windows-amd64-native-git.exe --config C:\Misc\Resources\customtheme.omp.json --debug

path(true)           -   0 ms -    C:\Misc\oh-my-posh3█
git(true)            -  71 ms -  :faster-commands ≡  ?2  2
exit(false)          -   0 ms -

For reference, here’s the experimental build from above:

...\posh-windows-amd64-experimental.exe --config C:\Misc\Resources\customtheme.omp.json --debug

path(true)           -   0 ms -    C:\Misc\oh-my-posh3█
git(true)            - 208 ms -  :faster-commands ≡  ?2  2
exit(false)          -   0 ms -

Still needs polishing, but in general it’s a big performance improvement.

The slow load times I was experiencing turned out to be caused by MS Defender ATP in the end. For whatever reason it was doing some intensive scanning. After we reported this to MS Support, the following day we saw that Defender ATP was no longer blatting it, and the prompt profile came up in the expected time.

All back to normal now.

@TravisTX @lnu @amoldeshpande @royvou @shedric1 @gibwar looking at the latest release, how do we feel about the speed? Improved enough to consider this solved, or do we need follow-up actions to go for the extra smile?

I’m testing all the different cases, so far so good. I’ll already publish this part.

If all goes well I’ll have a working version tomorrow that uses paths to validate if we’re in a git repo, including support for worktrees.

Well, this thread just keeps growing 😃

@shedric1 I was looking at the same thing this morning, also found out the stash count is being done AT ALL TIMES which is a huge mistake on my part. I agree with swapping out the calls as much as we can, I do believe your changes are a candidate for this. How would you like to proceed? Do you want to give this a go, or do you want me to assist in merging this in properly? It will require a rewrite of tests, splitting functionality to environment, etc, so there’s more work to be done than doing the changes alone.

About git2go, that still requires the libgit binary so we can’t have a single exe in that case. I’d say avoiding doing git calls is the most straightforward approach here.

As royvou mentions, another approach is removing as many external command calls as possible. Same principle as using the go-git package, but with a simple manual implementation.

For example, the first git call is in the segment enable check, and only checks if the current working directory is inside a working tree. Replicating the same behavior using system calls through the os package should be easy; just check each folder between working directory and root for a .git subfolder. As far as I can tell, this is how the go-git package handles their PlainOpen function with DetectDotGit specified.
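
A minimal sketch of that parent-directory walk, assuming nothing about the real implementation: `findRepoRoot` and `dotGitExists` are hypothetical names, and the existence check is passed in as a function so it can be swapped out in tests. Because the walk stops at the directory that contains `.git`, it yields the repository root as a by-product.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// findRepoRoot walks from dir up to the filesystem root and returns the
// first directory for which hasDotGit reports true.
func findRepoRoot(dir string, hasDotGit func(dir string) bool) (string, bool) {
	dir = filepath.Clean(dir)
	for {
		if hasDotGit(dir) {
			return dir, true
		}
		parent := filepath.Dir(dir)
		if parent == dir { // reached the root without finding anything
			return "", false
		}
		dir = parent
	}
}

// dotGitExists checks the real filesystem for a ".git" entry in dir.
// os.Stat succeeds for both a directory and a plain file.
func dotGitExists(dir string) bool {
	_, err := os.Stat(filepath.Join(dir, ".git"))
	return err == nil
}

func main() {
	cwd, err := os.Getwd()
	if err != nil {
		fmt.Println("cannot determine working directory:", err)
		return
	}
	if root, ok := findRepoRoot(cwd, dotGitExists); ok {
		fmt.Println("inside a repository rooted at", root)
	} else {
		fmt.Println("not inside a repository")
	}
}
```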

I got a very basic proof-of-concept working here if anyone else wants to try it out/test it: https://github.com/shedric1/oh-my-posh3/tree/git-treewalk

Some pretty promising results:

...\posh-windows-amd64-git-treewalk.exe --config C:\Misc\Resources\customtheme.omp.json --debug

Here are the timings of segments in your prompt:

path(true)           -   0 ms -    C:\Misc█
git(false)           -  20 ms -
exit(false)          -   0 ms -

For reference, here are my results of running the faster-commands build without any changes:

...\posh-windows-amd64-faster-commands.exe --config C:\Misc\Resources\customtheme.omp.json --debug

Here are the timings of segments in your prompt:

path(true)           -   0 ms -    C:\Misc█
git(false)           -  66 ms -
exit(false)          -   0 ms -

The same ~40-60ms savings are present for both repo folders and non-repo folders alike. Might be too small to warrant the potential edge-cases, but the fewer external command executions the better. Comparing against the experimental build, it’s a smaller ~20-40ms savings, but still (slightly) faster consistently.

Not sure if it will cause performance issues on other OSes where external commands aren’t so slow, so any Linux/macOS comparisons would be appreciated!

EDIT: looking closer, the process of walking parent dirs for a git repo by definition would also return the repo root directory. The second external git call only returns this path, so we could kill two external function calls with one stone here fairly easily.

@TravisTX that’s not enough to say we’re about to have a party. Confirms what I’m seeing. Better, but not by major margins.

Keeping the fork synced isn’t going to be the challenge so I’m definitely in favor of keeping that. Cool that we’re now seeing actual differences, that’s clear progress!

I’m excited to see the progress in making the git segment faster (it’s 81ms on my home computer, I’ll update my work machine tomorrow with 3.86.3) but there is one thing I want to caution: finding git entries manually can be problematic if you’re only looking for a .git folder.

I use git’s worktrees pretty extensively and they only have a .git file, currently with a single entry of gitdir: c:/absolute/repo/.git/worktrees/name that links the two together. Currently 3.86.3 works with this setup and displays all output properly (branches, staging, push/pull distance, etc.) and I would hate to see it only work on full clone repositories.
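
To illustrate the caveat, here is a rough Go sketch of a check that accepts both layouts. It assumes the `.git` file holds a single `gitdir: <path>` line as in the example above and does not resolve relative targets; `parseGitFile` and `resolveGitDir` are hypothetical names.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strings"
)

// parseGitFile extracts the target of a "gitdir: <path>" line, which is
// what a worktree's .git file contains.
func parseGitFile(content string) (string, bool) {
	const prefix = "gitdir:"
	line := strings.TrimSpace(content)
	if !strings.HasPrefix(line, prefix) {
		return "", false
	}
	target := strings.TrimSpace(strings.TrimPrefix(line, prefix))
	return target, target != ""
}

// resolveGitDir returns the directory holding the git metadata for dir,
// handling both a regular .git folder and a worktree-style .git file.
func resolveGitDir(dir string) (string, bool) {
	dotGit := filepath.Join(dir, ".git")
	info, err := os.Stat(dotGit)
	if err != nil {
		return "", false
	}
	if info.IsDir() {
		return dotGit, true
	}
	data, err := os.ReadFile(dotGit)
	if err != nil {
		return "", false
	}
	return parseGitFile(string(data))
}

func main() {
	gitDir, ok := resolveGitDir(".")
	fmt.Println(gitDir, ok)
}
```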

I just realized I also need to revert back to exec.Command("command").Output() rather than the override to avoid calling process.Wait(). I’ll try to do that and compile with the fork. See if that makes a difference.

We already went from ±350ms to 200ms for the git segment rendering (on Windows). In WSL2 it takes 10ms at most.
Clearly, if we could get the same performance on Windows as on Linux, that would be fantastic.

There also seem to be a couple of Go libraries for git. Those might be better options than spawning processes, e.g., https://github.com/go-git/go-git/

@apapiccio Thank you for reporting this to Microsoft! So it turns out for me the culprit was also Defender ATP as today, it is behaving normally again! ❤

Well, I don’t think the static build will do anything unless the Go linker explicitly links with all the Windows import libraries it needs. Static builds are meaningless on Windows except for the C library, which I assume Go doesn’t need anyway.

A normal, well-behaved Windows application will link with system import libs for kernel32, user32, advapi32, ws2_32, etc. When such an application starts, the NT loader in current versions will load the DLLs for these in parallel.

What Go did the last time I looked at it was that, in a stupid overkill aim to only ship an exe (again, static is meaningless on Windows), they didn’t link with any system libs and instead serially loaded each one at startup.

Maybe it has changed, but only a WPT profile will tell.

I’ll check that out and let you know

On Tuesday, March 29, 2022, Jan De Dobbeleer wrote (JanDeDobbeleer/oh-my-posh #305):

@apapiccio can you validate if Windows Defender (or any other scanning tool) isn’t blocking the execution of oh-my-posh?

I have been using oh my posh 3 for a while now on other systems, but I got a new work laptop today and went through the install procedure on the website, using the admin PowerShell method to install the modules. When I then try to set any posh prompt theme it takes a very long time, e.g. 20303ms and sometimes more. All commands in the terminal are then very slow.

When I add the import and set-poshprompt settings in my PS profile, it takes forever to load.

I’ve been able to track it down to setting the prompt, once the prompt theme is loaded. If I comment out all those entries in my PS profile, it loads very quickly (as normal) and functionality is fine.

@Kudostoy0u I would actually use the latest release as a lot of git calls were removed since then. It contains more optimizations than the fork right now.

Latest release is perceptibly faster than the old one, so I’m fine with calling it good. It’s about 100-120ms in a repo like posh3 for me.

thanks !

It’s good for me (but it’s always been pretty good on my computer). I’m at about 63ms in a git repo, and 7ms outside of one. Which is a notable improvement!

Another interesting datapoint. I use SourceTree and its embedded git version. Turns out it defaults to a really old one and you have to manually make it update. I went from git 2.11 to 2.20 when I forced the upgrade.

Anyway, to get to the point, git status in my UnrealEngine repo is ~900ms in 2.20 (so a good 300ms faster), and as a result posh comes in around 1.2 seconds instead of 1.5+.

Just another variable to add to the matrix.

I’ll pick up the task. I can do this during lunchtime, I’ll try to provide intermediate builds here for validation.

@JanDeDobbeleer I won’t have a chance to look into a proper version with updated tests and organization for a while, so anyone is welcome to get the ball rolling on it using that branch! I just whipped those up as a proof of concept to see if it worked and offered meaningful performance improvements, but I’m new to Go so a well formatted version might take me a lot longer than someone more familiar with the language. I’ll take a crack at it next week if no one else takes up the mantle by then!

This sounds great! 2x ~40ms would be a huge bonus, and would get us closer to getting under 100ms.

I also did a little experiment with go-git, and as Jan already mentioned, the git status equivalent is slow with big repositories with a lot of excluded files.

Parallelizing the other few git commands (or checking if some could be merged) would be a huge bonus.

For the Linux check I did (on my own branch) it didn’t really matter, as it was always 4-6ms.

Would it also be an idea, where appropriate, to replace a Cmd execution with a small Go function? I would expect this to be faster on both Windows and Linux.

For example, I think the ‘is in git folder’ check could be replaced with a simple check for a .git folder in the current dir or any parent dir. It might be possible to replace some other cmd executions with some Go code.

I think the az module could be changed as well to just read two config files!

@amoldeshpande It’s not so easy. A couple of commands can be executed in parallel, but others depend on the context.
At least we can give it a try.

Spinning up processes is simply more expensive on Windows… I heard this is why .NET Framework introduced App Domains ages ago… Most of the benefits of process isolation, without the expensive bit of being a separate process.

Done. No idea where I picked up the old version, since I only started using posh about a week ago 😃 Anyway, no change in timings, probably as expected.

Also tried replacing the current exe to see if it “feels” faster, but I would not say so. @lnu Are you sure? The timings above seem to assume it is not processed in parallel.

With --debug each segment is processed one by one. But you can assume the most expensive segment will be the bottleneck when processed in parallel.

I like to go for gravy. But 40ms is blazing fast compared to where we came from. If more people can confirm this, I’ll do the necessary changes to ensure we can build with that fork on Windows.

Very, very fast in between everything. I could have made some mistakes here 😓 posh-windows-amd64-experimental.exe.zip

I have a proposal. I just forked Go and removed the timeout that’s present on Windows. It’s stated that removing it does work, meaning we can give it a “go” and see what the effect is of oh-my-posh built with that fork. Managing that should be doable if it proves to be a major improvement without side effects.

I tried your experimental build, but I could not get it to work.

@royvou aha. That’s interesting as I assumed the binary would have packed it. I’ll upload a new zip with that included 👍🏻