jj: Slow operations on very large repos

Description

Right up front I want to acknowledge: (a) this is definitely an unusual situation, and (b) I totally get that it is likely to take a bit to sort through. But: I tried out Jujutsu on a very large repo from work a few minutes ago and found it’s distinctly not yet ready to use there:

Command                 Time
jj init --git-repo=.    4m 59s
jj status               25s

(I’ll add more operations to this list once I’m actually back at work in August!)

For scale: this repo has on the order of 3M LOC checked in, primarily JavaScript, TypeScript, and Handlebars, with a mix of Java and Gradle as well. There is a massive node_modules directory and a not-small pile of Gradle-related files (both gitignore’d but still massive). The history has hundreds of thousands of commits, hundreds of active branches… and, annoyingly, also hundreds of thousands of tags (one for each commit; better not to ask).

For comparison, git status takes a second or two (again, I will time them when I’m back at work). I’m not using a sparse checkout here (other folks sometimes do, but for various reasons it’s a non-starter for me 😩).

Comparable open source repos might be something like Firefox or Chrome? I tried DefinitelyTyped: its 3M LOC and mere 84,275 commits took only 9s to initialize, and jj status took around a second. Even so, the comparable scale of the codebase and the dramatically better performance suggest there may be something repo-specific (the tags?) causing the issue.

Steps to Reproduce the Problem

  1. Check out a massive repo with git.
  2. Initialize it with jj.
  3. Run operations on it.

Expected Behavior

It completes in a reasonable amount of time.

Actual Behavior

It completes in what honestly probably is a reasonable amount of time given the sheer scale of the things, but in a way that makes it much worse than Git for the moment.

Specifications

  • Platform: macOS Ventura 13.4.1
  • Version: 0.7.0

Most upvoted comments

Both of them seem to point to git_futils_readbuffer_updated taking a lot of time. Is this due to the nature of a large git repo?

If you have tons of refs under the .git/refs directory, try git pack-refs. It will reduce the overhead of the automated git imports.
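For example, a minimal sketch (the ref count check is purely illustrative):

# Count loose ref files before packing (illustrative)
$ find .git/refs -type f | wc -l
# Consolidate loose refs into the single .git/packed-refs file
$ git pack-refs --all
# The next automatic Git ref import now has far fewer files to read
$ jj status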

Excellent, I think I needed core.fsmonitor set in the repo config too; I had it in my user config, but I don’t think that helped. Now it’s working!

$ time jj status
Parent commit: 59a346bf5dfb [docs] Add missing docs for log-command
Working copy : 0c96d3dc092e (no description set)
The working copy is clean

________________________________________________________
Executed in  807.39 millis    fish           external
   usr time  646.04 millis  621.00 micros  645.42 millis
   sys time  175.54 millis  278.00 micros  175.27 millis

So, a fair amount of the time would be spent importing refs: 25s - 9s = ~16s. Watchman will help reduce the 9s part to a few hundred ms, I suppose.

Perhaps we could also point Watchman at the Git ref files/directories, so that we could at least skip importing refs when none of them have changed (or something more ambitious where we import refs selectively based on which files have changed).

If you’re curious what’s taking time, you can try profiling using e.g. samply. Just install with cargo install samply, then run e.g. samply record jj log and open the link it prints. Feel free to share a screenshot.
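Roughly like this (jj log is just an example target; any slow command works):

# Install the samply profiler (one-time)
$ cargo install samply
# Record a profile of a slow jj invocation; samply prints a local profiler URL when it finishes
$ samply record jj log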

With #2232 merged, you should see significantly better performance in fresh clones of large repos. For example, I timed jj log | head -1000 in the Linux repo. That took ~13 s before and ~2.3 s after.

I posted in Discord https://discord.com/channels/968932220549103686/969291218347524238/1129516951706816532 but should post here as well:

Here’s a tracing profile of jj status in nixpkgs with Watchman.

[screenshot of the tracing profile]

Interesting segments:

  • Total time: 768ms
  • snapshot: 257ms
    • import_git_refs: 53ms
    • tree_state (reading it): 51ms
    • deleting file states (filtering out Git submodules from a list of files): 14ms
    • make_fsmonitor_matcher: 99ms
    • query_watchman: 84ms (still a bit much in my opinion…)
    • finish: 25ms
  • write_commit_summary: 10ms
  • conflicts: 424ms 😱
  • remaining time: ~50ms, unattributed, trying to figure out what this is; the last thing that happens in cmd_status is conflicts, so I’m guessing the remainder is some Drop implementation

@martinvonz is working on tree-level conflicts which should take care of the biggest bottleneck. I think we can cut ~90ms if we stop storing file states in the tree-state proto for the Watchman case.

With some additional feature work, we could possibly reduce import_git_refs somewhat by querying Watchman (might have to do it in parallel with snapshotting the working copy… actually, it would probably help to do them in parallel right now). The last 50ms of remaining time need more investigation. But then I think we could get status down to an acceptable ~100ms.

PS I installed watchman, built jj with the feature flag, and enabled it in my config. I can see that the daemon is running. Does this mean I can now use --ignore-working-copy all the time safely?

Whether you use --ignore-working-copy is orthogonal to the availability of Watchman. It only means that working copy snapshots won’t be taken. If a snapshot is taken and Watchman is available, then jj will use Watchman as a faster path instead of scanning the filesystem.
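For example, to skip snapshotting for a single read-only command (this works with or without Watchman):

# No working-copy snapshot is taken for this invocation
$ jj log --ignore-working-copy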

Make sure that you set core.fsmonitor to watchman in your repo as well (jj config set). You should be able to confirm that Watchman is being used for snapshotting by invoking jj with the environment variable RUST_LOG=info. It should print a message saying that it is querying Watchman.
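A minimal sketch of the repo-level setup, assuming a jj build with the Watchman feature enabled (the exact log message will vary):

# Enable the Watchman fsmonitor for this repo
$ jj config set --repo core.fsmonitor watchman
# Run any snapshotting command with info-level logging to confirm Watchman is queried
$ RUST_LOG=info jj status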