tauri: [bug] Inflated build time due to heavy dependencies
Describe the bug
At times, I've gotten build times as long as 10 minutes. I've even seen a 20 minute build, but I've lost that cargo output to my bash terminal's scrollback buffer.
To investigate, I set up profile settings in my project to disable LTO and optimisation (to get the best-case scenario), and then did a clean build using cargo clean && cargo +nightly build --timings.
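That is, profile settings along these lines in Cargo.toml (illustrative values; the exact ones in my project may differ):

```toml
# Illustrative Cargo.toml profile overrides: no LTO and no optimisation,
# so the timings reflect raw compilation rather than optimisation work.
[profile.dev]
opt-level = 0
lto = false

[profile.release]
opt-level = 0
lto = false
```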

The report shows that the three biggest hindrances are zstd-sys, blake3, and bzip2-sys.
Using cargo tree -i, I found the following:
$ cargo tree -i zstd-sys
zstd-sys v1.6.3+zstd.1.5.2
└── zstd-safe v4.1.4+zstd.1.5.2
    └── zstd v0.10.0+zstd.1.5.2
        ├── tauri-codegen v1.0.0-rc.2
        │   └── tauri-macros v1.0.0-rc.2 (proc-macro)
        │       └── tauri v1.0.0-rc.3
        │           └── dsrbmm v0.1.0
        ├── tauri-utils v1.0.0-rc.2
        │   ├── tauri-build v1.0.0-rc.3
        │   │   [build-dependencies]
        │   │   └── dsrbmm v0.1.0
        │   ├── tauri-codegen v1.0.0-rc.2 (*)
        │   └── tauri-macros v1.0.0-rc.2 (proc-macro) (*)
        └── tauri-utils v1.0.0-rc.2
            ├── tauri v1.0.0-rc.3 (*)
            ├── tauri-runtime v0.3.2
            │   ├── tauri v1.0.0-rc.3 (*)
            │   └── tauri-runtime-wry v0.3.2
            │       └── tauri v1.0.0-rc.3 (*)
            └── tauri-runtime-wry v0.3.2 (*)
$ cargo tree -i blake3
blake3 v1.3.1
└── tauri-codegen v1.0.0-rc.2
    └── tauri-macros v1.0.0-rc.2 (proc-macro)
        └── tauri v1.0.0-rc.3
            └── dsrbmm v0.1.0
$ cargo tree -i bzip2-sys
bzip2-sys v0.1.11+1.0.8
└── bzip2 v0.4.3
    └── zip v0.5.13
        └── tauri v1.0.0-rc.3
            └── dsrbmm v0.1.0
As you can see, they all impact Tauri's compile time, and this is the best-case scenario (no optimisation or LTO overhead to inflate it further).
I know there's nothing Tauri can do about the compile time of these libraries, but are there any lightweight alternatives that could replace these behemoths?
I only have an Intel Core i7-7500U, so use that to put the numbers in perspective: 2 cores / 4 threads, a maximum concurrency of 7, and 321 total compilation units. Even so, the fact that zstd-sys took half of the total build time is insane and worth at least looking into further. With those three hogging three threads, I only had one thread remaining for everything else, and a CPU like this isn't exactly a niche setup.
Reproduction
No response
Expected behavior
No response
Platform and versions
Operating System - Windows, version 10.0.19043 X64
Webview2 - 98.0.1108.56
Visual Studio Build Tools:
- Visual Studio Build Tools 2019
WARNING: no lock files found, defaulting to npm
Node.js environment
Node.js - 16.10.0
@tauri-apps/cli - 1.0.0-rc.5
@tauri-apps/api - Not installed
Global packages
npm - 8.4.1
pnpm - 6.29.1
yarn - 3.1.1
Rust environment
rustup - 1.24.3
rustc - 1.59.0
cargo - 1.59.0
toolchain - stable-x86_64-pc-windows-msvc
App directory structure
/icons
/src
/target
/WixTools
App
tauri - 1.0.0-rc.3
tauri-build - 1.0.0-rc.3
tao - 0.6.2
wry - 0.13.3
build-type - bundle
CSP - default-src blob: data: filesystem: ws: wss: http: https: tauri: 'unsafe-eval' 'unsafe-inline' 'self' img-src: 'self'
distDir - ../frontend/build/dist
devPath - http://localhost:3000/
framework - React
Stack trace
No response
Additional context
No response
About this issue
- Original URL
- State: closed
- Created 2 years ago
- Reactions: 5
- Comments: 18 (12 by maintainers)
At the time of #1430 I had used cargo timings to test compilation times in general. When I was running it, the linking time absolutely swamped everything else. On computers that are more CPU constrained than IO constrained (like the one in this issue), these slow-to-build crates can become a larger issue.
Over time, it seems there have also been issues with longer compile times in some of these projects. Specifically:
- zstd seems to get slower to compile as versions go on while speeding up its runtime performance. Since zstd-sys builds the actual zstd C code, it seems to continuously get slower to compile over time.
- I don't recall blake3 taking up nearly as much time as this timing output shows, so it may be that some CPU platforms are slower to build than others because it adds in CPU-specific code for performance reasons.
Overall, I was not really concerned with the dependency compile times because I was much more focused on dirty builds (where the dependencies have already been compiled), since that is the typical developer workflow loop. Additionally, I was not aware of how severely being CPU constrained (crate compilation) instead of IO constrained (linking) could affect build times.
Along with the above reasoning, I chose zstd because it has a very good compression ratio and decompression speed/memory usage.
I ended up adding blake3 in #1430 to prevent additional work from being performed if the asset file was not changed during a dirty build. This was much more important when compression was not an optional feature, as it prevented not only IO work but also CPU work; nowadays, if compression is disabled, it only prevents IO work. It brought good wins for large asset files without adding dirty build time. That said, I didn't recall it having such an effect on a clean build, but perhaps that is because I was not CPU constrained. I chose blake3 for its runtime performance, even in debug builds.
The runtime performance was important because its runtime also happens at compile time, due to its use in tauri-codegen. It had to be faster to hash the file than to write the contents out to disk, even with a debug profile.
tl;dr - I mostly focused on dirty-build compile times (the dependency is already compiled) and otherwise focused on runtime performance, which zstd and blake3 bring plenty of.
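For illustration, a minimal sketch of the idea described above: hash the asset with blake3 and skip the expensive work when it hasn't changed. This is not the actual tauri-codegen code; the cache layout and names are made up for the example:

```rust
use std::{fs, path::Path};

/// Re-embed/recompress `asset` only if its blake3 hash changed since the last
/// build. Returns true when the work was skipped. (Illustrative only.)
fn process_if_changed(asset: &Path, cache_dir: &Path) -> std::io::Result<bool> {
    let bytes = fs::read(asset)?;
    // blake3::hash is fast even in debug builds, which matters because this
    // runs inside the proc macro at compile time.
    let hash = blake3::hash(&bytes).to_hex().to_string();

    let marker = cache_dir.join(format!(
        "{}.b3",
        asset.file_name().unwrap().to_string_lossy()
    ));
    if fs::read_to_string(&marker).map(|h| h == hash).unwrap_or(false) {
        return Ok(true); // unchanged: skip compression and writing the output
    }

    // ...compress `bytes` and write the embedded asset here...
    fs::create_dir_all(cache_dir)?;
    fs::write(&marker, hash)?;
    Ok(false)
}
```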
There are multiple ways to go for solutions; I'll start with blake3.
blake3 alternatives
blake3 actually comes with a Rust reference implementation. It is a single file with <400 LoC and no generics (from a quick glance), which usually translates to very fast compile times. I tested a release build on a Windows 10 VM with 2 CPUs and it took 1s flat for a clean build with no caches. Downside: it is not published as a crate. The file changes very little (the logic changes even less), so vendoring it should be fine as long as we keep a bit of an eye on the original reference. We may also be able to convince the blake3 project to publish it as a crate. Also, it lives in a repo that is dual-licensed Apache-2.0/CC0 1.0 Universal, but the file itself does not specify a license, so we may want to make sure it is licensed the same way.
I didn't really look into anything else because this seemed good enough. I will add, though, that if the slower performance doesn't keep up the wins for asset inclusion on HDDs (thinking of a JS project with megabytes of dependencies), we may also want to only enable it when compression is also enabled, since that is where more work is done.
The 1s compile time for the reference can be compared to 13s for the main crate on the same machine, and 32.37s for the main crate with the rayon feature enabled. Perhaps just disabling rayon support will already bring a big enough compile-time win with minimal performance impact.
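For context, rayon is an optional, opt-in feature of the blake3 crate, so the difference comes down to whether the dependency declaration enables it (versions here are illustrative):

```toml
# Multi-threaded hashing: pulls in rayon and its dependency tree.
blake3 = { version = "1.3", features = ["rayon"] }

# Single-threaded hashing: noticeably faster to compile, no rayon.
# blake3 = "1.3"
```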
As for runtime performance (6.7MB JavaScript file)…
[table of hashing timings for blake3 (w/ rayon), blake3 (w/ rayon) --no-mmap, blake3 (w/ rayon) --no-mmap --num-threads=1, and the Rust reference at opt-level = 1, 2, and 3, followed by a collapsed section: "Click me to see the rust reference sum wrapper code"]
As a sanity check against b3sum to make sure the output was the same, here is the output for both. Side note: the release and debug builds of the reference take a very similar amount of compile time (both ~1s) since it has no dependencies. It may be worth enabling an opt-level for it in the proc macro with a profile override (if we can do that for a proc macro?) to change the runtime from 0.8s -> 0.03s. I'm not sure if the overrides work on the proc macro or only on the root crate being built.
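Cargo's build-override profile table is documented to apply to build scripts and proc macros, so a sketch of what that could look like in the app's Cargo.toml (illustrative, assuming the dev profile):

```toml
# Optimise build scripts and proc macros (and their dependencies) even in
# dev builds, so hashing inside the codegen macro runs at release-ish speed.
[profile.dev.build-override]
opt-level = 3

# Alternatively, target a single dependency of the normal (non-macro) build:
# [profile.dev.package.blake3]
# opt-level = 3
```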
Summary: if disabling rayon doesn't give us enough compile-time gains on blake3, using the reference implementation is almost instant to compile and only 50% slower (of a very fast runtime).
zstd alternatives
I started off building zstd in that same virtual machine (2 CPUs). A clean build took only 13s, which seems really low compared to the timings in this issue: that's half the time the blake3 build with rayon took, yet the timings report shows zstd taking longer than blake3. Perhaps this is another thing that's difficult to reproduce across all computers.
I did get a warning while compiling the zstd-sys crate, warning: cl : Command line warning D9002 : ignoring unknown option '-fvisibility=hidden', but cargo test passes so I don't believe it has an effect.
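For reference, the compression step with the zstd crate (the thing dragging in the heavy zstd-sys build) amounts to roughly this; a sketch, not the actual tauri-codegen code:

```rust
// Sketch: compress an asset with the `zstd` crate, the way an asset could be
// embedded at compile time and inflated again at runtime.
fn compress_asset(bytes: &[u8]) -> std::io::Result<Vec<u8>> {
    // Level 3 is zstd's usual default; higher levels trade compile time for size.
    zstd::encode_all(bytes, 3)
}

fn decompress_asset(compressed: &[u8]) -> std::io::Result<Vec<u8>> {
    zstd::decode_all(compressed)
}
```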
brotli
I first checked out brotli because that is actually what I used when first adding compression a long time ago. A clean build of brotli took 7s on the VM with ffi-api disabled (compared to 12s with it). Dropbox's implementation of brotli (the brotli crate) includes a few things over the base brotli project, most notably multithreading, which brings it back into the performance ballpark of zstd.
This looked promising, so I did some comparisons using the JS vendor file from https://app.element.io (6.7MB). The timings were taken on the same 2 CPU VM.
Note that the brotli command used was the binary shipped with the Rust crate, and that brotli's default profile is the same as best.
I actually really like brotli(9) here, since it's still sub-second compression (and brotli has good decompression) along with a slightly lower file size than the compression-time equivalent zstd profile. I think using brotli(2) for debug builds and brotli(9) for release builds is a good balance. We could always add a hurt-me-plenty option that uses best to squeeze out the last kilobytes of the assets, at the cost of runtime (during compile time, in the codegen) performance, for those who really want it.
brotli would be my choice hands down as the replacement: it stays in the performance ballpark of zstd while offering really good sub-second compression options.
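A sketch of what those quality settings look like with the brotli crate's streaming writer; the 4096-byte buffer and 22-bit window below are just typical values, not something measured here:

```rust
use std::io::Write;

// Sketch: compress an asset with the `brotli` crate at a given quality.
// Quality 2 would suit debug builds (fast), 9 release builds (smaller output);
// 11 ("best") is what a hurt-me-plenty option could use.
fn brotli_compress(bytes: &[u8], quality: u32) -> std::io::Result<Vec<u8>> {
    let mut out = Vec::new();
    {
        // 4096-byte internal buffer and a 22-bit window are common choices.
        let mut writer = brotli::CompressorWriter::new(&mut out, 4096, quality, 22);
        writer.write_all(bytes)?;
    } // dropping the writer finishes and flushes the brotli stream
    Ok(out)
}
```

Decompression at runtime would go through the crate's Decompressor reader in the same streaming fashion.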
miniz_oxide
A Rust implementation of DEFLATE. It compiles clean in ~2.8s. I didn't look into it further because the compression ratio and decompression performance (32k window size) are not ideal. That doesn't really knock it out as a contender; I just prefer the brotli solution first.