gitoxide: Spurious illegal instruction on x86_64-unknown-linux-musl

Duplicates

  • I have searched the existing issues

Current behavior 😯

cargo-binstall, which uses gix v0.50.1, sometimes gets killed by an illegal instruction on this CI, and also on this one:

+ ./cargo-binstall binstall --force --git file:///tmp/tmp.iyk2axDWcq --no-confirm cargo-binstall
 INFO resolve: Resolving package: 'cargo-binstall'
 INFO Cloning::receiving pack: Enumerating objects → 3.0 objects
git.sh: line 39: 24033 Illegal instruction     (core dumped) "./$1" binstall --force --git "file://$GIT" --no-confirm cargo-binstall

I suspect this is related to max-performance: the assembly used by sha1-asm or zlib-ng may be causing the problem.

So far the crash has only happened when cloning from a local git repository via file:///, so I suppose that is involved as well.

Expected behavior 🤔

No response

Steps to reproduce 🕹

No response

About this issue

  • State: closed
  • Created a year ago
  • Comments: 16 (16 by maintainers)

Most upvoted comments

Great to hear we pin-pointed it! I recommend turning sha1-asm back on to get a little performance back.

I think the current performance is already good enough. In fact, I believe that as long as we don’t do a checkout, it would be fine for binstall to even switch back to -Oz.

If performance becomes a problem again, binstall can employ the same caching scheme used in .cargo/registry/index, although I’m not sure how cargo determines the name of the directory and prevents concurrent access to it.

Great to hear we pin-pointed it! I recommend turning sha1-asm back on to get a little performance back. If the issue recurs, then we’d also learn something, i.e. it would then be caused by some combination of libraries, or maybe some strangeness with codegen.

For now I am tentatively closing this issue as there is nothing we can do here, and will follow the related issue instead. If something changes, this issue should be reopened though.

That’s great, as one possibility has been ruled out. The next thing to try would then be zlib-ng (while leaving sha1-asm disabled)? Maybe change it to the normal backend to retain some performance. If that still doesn’t work, try max-performance-safe alone, which is pure Rust.

Thanks for the swift response!

I will disable sha1-asm and leave only zlib-ng enabled, since IMO zlib-ng is more battle-tested and provides more speedup than sha1-asm.

That’s interesting, as it doesn’t reproduce (or at least I haven’t encountered it) in this CI yet. Thus miscompilation is another possible cause, in addition to assembly.

If the issue is miscompilation, then adding rustflags = ["-Ctarget-cpu=native"] (or removing it) might change codegen and possibly resolve the issue that way.

If assembly is the issue, you could try to disable one or the other performance option, using either only sha1-asm or only zlib-ng (or another zlib implementation for that matter). To do that, you would go with max-performance-safe, add a dependency on gix-features, and set features = ["fast-sha1"] or features = ["zlib-ng"] (or another backend) on it to see if the issue is fixed that way.

Please note that I think a great zlib implementation matters more for performance than a great hash implementation, though as a disclaimer, I haven’t yet run tests to validate that hunch.

Thanks for letting me know how it goes.

PS: Maybe another possible cause is some strange interaction with the VM this runs on. After all, an illegal instruction in a tight algorithm like SHA1 or zlib-ng would be executed on every run, so it should crash consistently. But it doesn’t, so the crash might be related to something else entirely. In any case, changing the instructions in the binary might fix it merely by no longer using the possibly special instructions that come with assembly or with certain super-optimized C code in zlib-ng.
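
One way to check the VM theory would be to log which instruction-set extensions each CI runner actually advertises, then compare runs that crash with runs that pass. The following is only a minimal sketch using the standard library's is_x86_feature_detected! macro. The function name cpu_features is hypothetical, and the list of features is an illustrative guess: it is not taken from the sha1-asm or zlib-ng sources, so it may well miss the instruction that actually traps.

```rust
// Report which optional x86 instruction-set extensions the current CPU
// advertises. The feature list is illustrative only; it is NOT derived
// from what sha1-asm or zlib-ng actually use.

/// Returns (feature name, detected?) pairs for the host CPU.
#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
fn cpu_features() -> Vec<(&'static str, bool)> {
    // The macro only accepts string literals, so each feature is listed
    // explicitly instead of being looped over.
    vec![
        ("sse2", is_x86_feature_detected!("sse2")),
        ("ssse3", is_x86_feature_detected!("ssse3")),
        ("sse4.2", is_x86_feature_detected!("sse4.2")),
        ("pclmulqdq", is_x86_feature_detected!("pclmulqdq")),
        ("avx2", is_x86_feature_detected!("avx2")),
        ("sha", is_x86_feature_detected!("sha")),
    ]
}

/// On other architectures there is nothing to report.
#[cfg(not(any(target_arch = "x86", target_arch = "x86_64")))]
fn cpu_features() -> Vec<(&'static str, bool)> {
    Vec::new()
}

fn main() {
    // Print one line per feature so the CI log can be compared across runs.
    for (name, present) in cpu_features() {
        println!("{name:<10} {}", if present { "yes" } else { "no" });
    }
}
```

Running this as an early step of the failing job would show whether the crashing runs land on runners that lack one of these extensions. If the output is identical between passing and failing runs, that would point away from the CPU and back towards miscompilation or something else.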