candle-vllm: candle-flash-attn linking error with Red Hat based distributions

I am trying to make the following (unfinished) Dockerfile work:

# Note: if building on a machine with a different GPU or no GPU, look up the compute capability at
# https://developer.nvidia.com/cuda-gpus and pass the value without the decimal point to
# CUDA_COMPUTE_CAP directly instead of the $(...) expression; for example, CUDA_COMPUTE_CAP=80 for
# an A100 and CUDA_COMPUTE_CAP=86 for an A10.
#
# docker build --build-arg USERID=$(id -u) --build-arg \
#   CUDA_COMPUTE_CAP=$(nvidia-smi --query-gpu=compute_cap --format=csv | tail -n1 | tr -d .) \
#   -t local/candle-vllm-bench .

# Select an available version from
# https://gitlab.com/nvidia/container-images/cuda/blob/master/doc/supported-tags.md:
FROM nvidia/cuda:12.3.1-devel-rockylinux9
ARG USERID=1000
ARG CUDA_COMPUTE_CAP
RUN yum install -y cargo libcudnn8-devel openssl-devel git && yum clean all && \
    rm -rf /var/cache/yum/*
RUN git clone https://github.com/EricLBuehler/candle-vllm
WORKDIR /candle-vllm
RUN cargo build --release --features cuda,cudnn,flash-attn,nccl
RUN adduser -u $USERID user
USER user
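
The note at the top of the Dockerfile describes how the CUDA_COMPUTE_CAP build argument is derived. As a minimal sketch of that step, the helpers below strip the decimal point from the nvidia-smi output and fall back to a manually supplied value when no GPU is present. The function names are hypothetical and not part of the original Dockerfile.

```shell
# Read the output of `nvidia-smi --query-gpu=compute_cap --format=csv` on stdin
# and print the last line (the last GPU listed) without the decimal point.
compute_cap_from_csv() {
  tail -n1 | tr -d '.'
}

# Print the compute capability to pass as a build argument.
# $1 (optional): a value taken from https://developer.nvidia.com/cuda-gpus,
# with or without the decimal point (for example 8.0 or 80 for an A100).
resolve_compute_cap() {
  if [ -n "${1:-}" ]; then
    printf '%s\n' "$1" | tr -d '.'
  elif command -v nvidia-smi >/dev/null 2>&1; then
    nvidia-smi --query-gpu=compute_cap --format=csv | compute_cap_from_csv
  else
    echo "no GPU detected; pass a value such as 80 (A100) or 86 (A10)" >&2
    return 1
  fi
}
```

With these defined, the build could be invoked as docker build --build-arg CUDA_COMPUTE_CAP="$(resolve_compute_cap)" ... on a GPU machine, or with an explicit argument such as resolve_compute_cap 8.6 elsewhere.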

But it fails with:

error: linking with `cc` failed: exit status: 1
  |
  = note: LC_ALL="C" PATH="/usr/lib/rustlib/x86_64-unknown-linux-gnu/bin:/root/.local/bin:/root/bin:/usr/local/nvidia/bin:/usr/local/cuda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin" VSLANG="1033" "cc" "-m64" "/tmp/rustczqulH1/symbols.o" "/candle-vllm/target/release/deps/candle_vllm-8b71aad931b633bb.candle_vllm.cb7142adda2cd887-cgu.0.rcgu.o" "/candle-vllm/target/release/deps/candle_vllm-8b71aad931b633bb.candle_vllm.cb7142adda2cd887-cgu.1.rcgu.o" "/candle-vllm/target/release/deps/candle_vllm-8b71aad931b633bb.candle_vllm.cb7142adda2cd887-cgu.10.rcgu.o" "/candle-vllm/target/release/deps/candle_vllm-8b71aad931b633bb.candle_vllm.cb7142adda2cd887-cgu.11.rcgu.o" "/candle-vllm/target/release/deps/candle_vllm-8b71aad931b633bb.candle_vllm.cb7142adda2cd887-cgu.12.rcgu.o" "/candle-vllm/target/release/deps/candle_vllm-8b71aad931b633bb.candle_vllm.cb7142adda2cd887-cgu.13.rcgu.o" "/candle-vllm/target/release/deps/candle_vllm-8b71aad931b633bb.candle_vllm.cb7142adda2cd887-cgu.14.rcgu.o" "/candle-vllm/target/release/deps/candle_vllm-8b71aad931b633bb.candle_vllm.cb7142adda2cd887-cgu.15.rcgu.o" "/candle-vllm/target/release/deps/candle_vllm-8b71aad931b633bb.candle_vllm.cb7142adda2cd887-cgu.2.rcgu.o" "/candle-vllm/target/release/deps/candle_vllm-8b71aad931b633bb.candle_vllm.cb7142adda2cd887-cgu.3.rcgu.o" "/candle-vllm/target/release/deps/candle_vllm-8b71aad931b633bb.candle_vllm.cb7142adda2cd887-cgu.4.rcgu.o" "/candle-vllm/target/release/deps/candle_vllm-8b71aad931b633bb.candle_vllm.cb7142adda2cd887-cgu.5.rcgu.o" "/candle-vllm/target/release/deps/candle_vllm-8b71aad931b633bb.candle_vllm.cb7142adda2cd887-cgu.6.rcgu.o" "/candle-vllm/target/release/deps/candle_vllm-8b71aad931b633bb.candle_vllm.cb7142adda2cd887-cgu.7.rcgu.o" "/candle-vllm/target/release/deps/candle_vllm-8b71aad931b633bb.candle_vllm.cb7142adda2cd887-cgu.8.rcgu.o" "/candle-vllm/target/release/deps/candle_vllm-8b71aad931b633bb.candle_vllm.cb7142adda2cd887-cgu.9.rcgu.o" 
"/candle-vllm/target/release/deps/candle_vllm-8b71aad931b633bb.4psql9m0o7iw6sqs.rcgu.o" "-Wl,--as-needed" "-L" "/candle-vllm/target/release/deps" "-L" "/candle-vllm/target/release/build/zstd-sys-51991617680764ab/out" "-L" "/usr/local/cuda/lib64" "-L" "/usr/local/cuda/lib64/stubs" "-L" "/usr/local/cuda/targets/x86_64-linux" "-L" "/usr/local/cuda/targets/x86_64-linux/lib" "-L" "/usr/local/cuda/targets/x86_64-linux/lib/stubs" "-L" "/usr/lib" "-L" "/usr/lib64" "-L" "/candle-vllm/target/release/build/bzip2-sys-f7fb57a3f4e98cc1/out/lib" "-L" "/candle-vllm/target/release/build/ring-a59330cc6e943984/out" "-L" "/candle-vllm/target/release/build/lz4-sys-c90b3b6e3d6da391/out" "-L" "/candle-vllm/target/release/build/esaxx-rs-83f1f68488f360a8/out" "-L" "/candle-vllm/target/release/build/onig_sys-d0c2f3461f43020d/out" "-L" "/candle-vllm/target/release/build/candle-flash-attn-bb54a4d16d25ee03/out" "-L" "/usr/lib/rustlib/x86_64-unknown-linux-gnu/lib" "-Wl,-Bstatic" "/candle-vllm/target/release/deps/libenv_logger-0f0fa188a1404846.rlib" "/candle-vllm/target/release/deps/libtermcolor-c53cf66b9b32e10f.rlib" "/candle-vllm/target/release/deps/libis_terminal-cdf9c5266fcbba03.rlib" "/candle-vllm/target/release/deps/librustix-a629012946c99e6d.rlib" "/candle-vllm/target/release/deps/liblinux_raw_sys-15bed2ca91cf42a8.rlib" "/candle-vllm/target/release/deps/libhumantime-1dc284c82c7f0559.rlib" "/candle-vllm/target/release/deps/libcandle_vllm-ce9b07d51787770c.rlib" "/candle-vllm/target/release/deps/libchrono-ae2c4cf3aacef826.rlib" "/candle-vllm/target/release/deps/libiana_time_zone-2bd86fbdc9e46a38.rlib" "/candle-vllm/target/release/deps/libhf_hub-b2415c762b503a90.rlib" "/candle-vllm/target/release/deps/libdirs-45aa89c180ae36f2.rlib" "/candle-vllm/target/release/deps/libdirs_sys-b0294348c2e4986c.rlib" "/candle-vllm/target/release/deps/liboption_ext-3db96de540040126.rlib" "/candle-vllm/target/release/deps/libureq-22a62ebb34562523.rlib" 
"/candle-vllm/target/release/deps/libnative_tls-addec962e00a97ff.rlib" "/candle-vllm/target/release/deps/libopenssl_probe-e135bf478bd9e62b.rlib" "/candle-vllm/target/release/deps/libopenssl-f7e740960c8b0b56.rlib" "/candle-vllm/target/release/deps/libforeign_types-434e4620cdd2963d.rlib" "/candle-vllm/target/release/deps/libforeign_types_shared-3cd91dddd8b3059a.rlib" "/candle-vllm/target/release/deps/libopenssl_sys-2724f2f05b6f6e71.rlib" "/candle-vllm/target/release/deps/libwebpki_roots-fb31dcc12f4e6db5.rlib" "/candle-vllm/target/release/deps/librustls-ca4a80b00d74d11d.rlib" "/candle-vllm/target/release/deps/libsct-d1a0a53864376724.rlib" "/candle-vllm/target/release/deps/libwebpki-8db93ee63982280a.rlib" "/candle-vllm/target/release/deps/libring-c45b21a3fb043429.rlib" "/candle-vllm/target/release/deps/libspin-a5bca8ced7fc453c.rlib" "/candle-vllm/target/release/deps/libuntrusted-766afbb3ef44c1d1.rlib" "/candle-vllm/target/release/deps/libcandle_lora_transformers-c49058ffb6d7068a.rlib" "/candle-vllm/target/release/deps/libtqdm-e47c7a840c2fc706.rlib" "/candle-vllm/target/release/deps/libcrossterm-f705860770d94db8.rlib" "/candle-vllm/target/release/deps/libsignal_hook_mio-ec3a5a299cc915e5.rlib" "/candle-vllm/target/release/deps/libsignal_hook-49df15a2181bf250.rlib" "/candle-vllm/target/release/deps/libanyhow-78648c12fa2eaee5.rlib" "/candle-vllm/target/release/deps/libcandle_lora-0543f2db3a02f6c2.rlib" "/candle-vllm/target/release/deps/libtrc-af4d2dc9e955d45c.rlib" "/candle-vllm/target/release/deps/libuuid-8e9abe15319c7747.rlib" "/candle-vllm/target/release/deps/libcandle_transformers-3a408f703fe757e5.rlib" "/candle-vllm/target/release/deps/libserde_plain-9edacf8e6b8b5e3b.rlib" "/candle-vllm/target/release/deps/libcandle_flash_attn-6ec38f8ed9aac30d.rlib" "/candle-vllm/target/release/deps/libdyn_fmt-ca01837b2f65b0b1.rlib" "/candle-vllm/target/release/deps/libfutures-813f484dc1c71e4c.rlib" "/candle-vllm/target/release/deps/libfutures_executor-cdd38bae408d4ce8.rlib" 
"/candle-vllm/target/release/deps/libcandle_sampling-07b86ed24f500345.rlib" "/candle-vllm/target/release/deps/libcandle_nn-3eaedbdadbe5fbb5.rlib" "/candle-vllm/target/release/deps/libtokenizers-61b7f12c56fed2c5.rlib" "/candle-vllm/target/release/deps/libesaxx_rs-c3b0fa8f52cc413c.rlib" "/candle-vllm/target/release/deps/libunicode_normalization_alignments-025da513407d9879.rlib" "/candle-vllm/target/release/deps/libspm_precompiled-8a5e3784a84b6fa0.rlib" "/candle-vllm/target/release/deps/libbase64-a00060132962802d.rlib" "/candle-vllm/target/release/deps/libunicode_segmentation-0609f6ce0b27032d.rlib" "/candle-vllm/target/release/deps/libnom-828591b7d6e9f08d.rlib" "/candle-vllm/target/release/deps/libunicode_categories-4b2d8309eb580595.rlib" "/candle-vllm/target/release/deps/libmonostate-121edb8fb43689e8.rlib" "/candle-vllm/target/release/deps/libmacro_rules_attribute-fbe2172e90fd6d9d.rlib" "/candle-vllm/target/release/deps/libindicatif-5ac26ff2181c3839.rlib" "/candle-vllm/target/release/deps/libportable_atomic-37fa7d733d3c2283.rlib" "/candle-vllm/target/release/deps/libnumber_prefix-fcbd61cd7f0fb674.rlib" "/candle-vllm/target/release/deps/libconsole-927989bf813852d8.rlib" "/candle-vllm/target/release/deps/libunicode_width-4a01194dbfae8c91.rlib" "/candle-vllm/target/release/deps/librayon_cond-ec5fdcb09b40065c.rlib" "/candle-vllm/target/release/deps/libitertools-87b264833edf6f52.rlib" "/candle-vllm/target/release/deps/libonig-40dabd6ed5124b91.rlib" "/candle-vllm/target/release/deps/libonig_sys-90597c1391bce008.rlib" "/candle-vllm/target/release/deps/libderive_builder-3471ddeab47c0b9a.rlib" "/candle-vllm/target/release/deps/liblazy_static-852800890c81fb22.rlib" "/candle-vllm/target/release/deps/libclap-23394ec333e54596.rlib" "/candle-vllm/target/release/deps/libclap_builder-41cde94296fdb820.rlib" "/candle-vllm/target/release/deps/libstrsim-bfb3799e9677cd4d.rlib" "/candle-vllm/target/release/deps/libanstream-d284661ab137b824.rlib" 
"/candle-vllm/target/release/deps/libanstyle_query-d08e7c102e46eb49.rlib" "/candle-vllm/target/release/deps/libcolorchoice-d9fe16d50a3dd803.rlib" "/candle-vllm/target/release/deps/libanstyle_parse-6ac7d6e179081361.rlib" "/candle-vllm/target/release/deps/libutf8parse-86e737e0d4678582.rlib" "/candle-vllm/target/release/deps/libclap_lex-3a6b7689365ae37a.rlib" "/candle-vllm/target/release/deps/libanstyle-9a261b265642b8a4.rlib" "/candle-vllm/target/release/deps/libcandle_core-d2f01b6e6a29d888.rlib" "/candle-vllm/target/release/deps/libmemmap2-4476da1f91fb3603.rlib" "/candle-vllm/target/release/deps/libzip-9bf92410c307c36c.rlib" "/candle-vllm/target/release/deps/libpbkdf2-bfe2a8675cfe3dd6.rlib" "/candle-vllm/target/release/deps/libsha2-7f594f901cd89567.rlib" "/candle-vllm/target/release/deps/libpassword_hash-2fa33ff8d4990779.rlib" "/candle-vllm/target/release/deps/libbase64ct-760f27bcfd4054ae.rlib" "/candle-vllm/target/release/deps/libzstd-bafef58bb20c82a7.rlib" "/candle-vllm/target/release/deps/libzstd_safe-2c41e8f78c52fdfc.rlib" "/candle-vllm/target/release/deps/libbzip2-b94c5c5e7c15f010.rlib" "/candle-vllm/target/release/deps/libbzip2_sys-a158ea0d0289b351.rlib" "/candle-vllm/target/release/deps/libaes-dc1bc8251226040a.rlib" "/candle-vllm/target/release/deps/libcipher-eeb8ea70098f4f7f.rlib" "/candle-vllm/target/release/deps/libinout-5e79d2c693701e41.rlib" "/candle-vllm/target/release/deps/libhmac-246f344022381f5d.rlib" "/candle-vllm/target/release/deps/libconstant_time_eq-742a8ca43fc4b3c6.rlib" "/candle-vllm/target/release/deps/libyoke-b5cb326284cb506c.rlib" "/candle-vllm/target/release/deps/libzerofrom-72df68927b68a064.rlib" "/candle-vllm/target/release/deps/libstable_deref_trait-76725faa25d9c59b.rlib" "/candle-vllm/target/release/deps/libthiserror-7cc4f2a96da73a94.rlib" "/candle-vllm/target/release/deps/libsafetensors-b94965e86f7ef122.rlib" "/candle-vllm/target/release/deps/libcudarc-bb4cc1d0d1d68ba3.rlib" 
"/candle-vllm/target/release/deps/libcandle_kernels-af06d5fd4a087af6.rlib" "/candle-vllm/target/release/deps/libgemm-9939fb772d1ff792.rlib" "/candle-vllm/target/release/deps/libgemm_c32-cba446e570d4386d.rlib" "/candle-vllm/target/release/deps/libgemm_c64-701b72db790c5491.rlib" "/candle-vllm/target/release/deps/libgemm_f64-132035f8fb79f58d.rlib" "/candle-vllm/target/release/deps/libgemm_f16-a17195123a2b5a97.rlib" "/candle-vllm/target/release/deps/libgemm_f32-43dd1a29089d0d80.rlib" "/candle-vllm/target/release/deps/libgemm_common-888ab4912d03277a.rlib" "/candle-vllm/target/release/deps/libpulp-c51f68967478b6aa.rlib" "/candle-vllm/target/release/deps/libnum_complex-9293d6ad98d7b1c3.rlib" "/candle-vllm/target/release/deps/libdyn_stack-e01f3657ea7d975f.rlib" "/candle-vllm/target/release/deps/libreborrow-77659d577c4b718c.rlib" "/candle-vllm/target/release/deps/libraw_cpuid-b9cfe85e371d3083.rlib" "/candle-vllm/target/release/deps/librayon-7e6c7f8c76536947.rlib" "/candle-vllm/target/release/deps/librayon_core-2fef7474b3331466.rlib" "/candle-vllm/target/release/deps/libcrossbeam_deque-f3876680669c2c7d.rlib" "/candle-vllm/target/release/deps/libcrossbeam_epoch-d5f20c1ae49163b7.rlib" "/candle-vllm/target/release/deps/libmemoffset-b4fab92a5d1a5e30.rlib" "/candle-vllm/target/release/deps/libcrossbeam_utils-1d67d2d362ef675e.rlib" "/candle-vllm/target/release/deps/libeither-c016b57e73ba30c1.rlib" "/candle-vllm/target/release/deps/libbyteorder-8bf78fc69cf5b0a1.rlib" "/candle-vllm/target/release/deps/libhalf-82866db1aa6c7f3e.rlib" "/candle-vllm/target/release/deps/librand_distr-b111214f51586c69.rlib" "/candle-vllm/target/release/deps/libnum_traits-28ee9b33f1e53f29.rlib" "/candle-vllm/target/release/deps/libbytemuck-7eee2fa1f516b4ce.rlib" "/candle-vllm/target/release/deps/libactix_web-0a08fb87679df924.rlib" "/candle-vllm/target/release/deps/liburl-1bbf839f22bd1732.rlib" "/candle-vllm/target/release/deps/libidna-fb425d18121613f1.rlib" 
"/candle-vllm/target/release/deps/libunicode_normalization-7972d0be1c38ac31.rlib" "/candle-vllm/target/release/deps/libtinyvec-61debd23e06e16bf.rlib" "/candle-vllm/target/release/deps/libtinyvec_macros-f326b6a6f0ca8a7b.rlib" "/candle-vllm/target/release/deps/libunicode_bidi-9dc6f963fdeb5a21.rlib" "/candle-vllm/target/release/deps/libserde_urlencoded-9f88ee3d21b5ec1b.rlib" "/candle-vllm/target/release/deps/libform_urlencoded-3e169fc285508f2a.rlib" "/candle-vllm/target/release/deps/libserde_json-2daaa0f082f50c3a.rlib" "/candle-vllm/target/release/deps/libryu-8b05c69dcf279a6f.rlib" "/candle-vllm/target/release/deps/libactix_server-e79c728840296968.rlib" "/candle-vllm/target/release/deps/libactix_router-48a733d95bd3dd5e.rlib" "/candle-vllm/target/release/deps/libregex-c78c6a0d40f8f119.rlib" "/candle-vllm/target/release/deps/libregex_automata-3822bb291a95f096.rlib" "/candle-vllm/target/release/deps/libaho_corasick-6f9c3d032c4f562f.rlib" "/candle-vllm/target/release/deps/libregex_syntax-3dd804a409b2c545.rlib" "/candle-vllm/target/release/deps/libserde-23513cb3b07422f8.rlib" "/candle-vllm/target/release/deps/libcookie-30bd32d9b0d08b83.rlib" "/candle-vllm/target/release/deps/libtime-bc85cd6997494558.rlib" "/candle-vllm/target/release/deps/libtime_core-531fb2a2b6009484.rlib" "/candle-vllm/target/release/deps/libderanged-5409594f6406082d.rlib" "/candle-vllm/target/release/deps/libpowerfmt-c4543fc1903272c6.rlib" "/candle-vllm/target/release/deps/libactix_http-f7b0baf59fd7bb10.rlib" "/candle-vllm/target/release/deps/librand-aa6ddb6627b48b96.rlib" "/candle-vllm/target/release/deps/librand_chacha-fa47a10cc5e59439.rlib" "/candle-vllm/target/release/deps/libppv_lite86-9a645f708eed4e1c.rlib" "/candle-vllm/target/release/deps/librand_core-479671a2b8263665.rlib" "/candle-vllm/target/release/deps/libhttparse-699e93ce2c2e7905.rlib" "/candle-vllm/target/release/deps/libbrotli-df4299509820f939.rlib" "/candle-vllm/target/release/deps/libbrotli_decompressor-0212e4cdb0da1245.rlib" 
"/candle-vllm/target/release/deps/liballoc_stdlib-fc777d5f3c59a235.rlib" "/candle-vllm/target/release/deps/liballoc_no_stdlib-f497a54db348ea9b.rlib" "/candle-vllm/target/release/deps/libhttpdate-5f8e81ac577420b0.rlib" "/candle-vllm/target/release/deps/libsha1-ad6469ba6b8b2240.rlib" "/candle-vllm/target/release/deps/libcpufeatures-dcef25221428931f.rlib" "/candle-vllm/target/release/deps/libdigest-f32a2ccccbd945ab.rlib" "/candle-vllm/target/release/deps/libsubtle-910e19b9d08b2799.rlib" "/candle-vllm/target/release/deps/libblock_buffer-2ad0dde06bca4c37.rlib" "/candle-vllm/target/release/deps/libcrypto_common-30c46997c474a2db.rlib" "/candle-vllm/target/release/deps/libgeneric_array-95ff38f8e6dc2014.rlib" "/candle-vllm/target/release/deps/libtypenum-ddf8574aa94ffabe.rlib" "/candle-vllm/target/release/deps/libbase64-daaf16d87f9b4835.rlib" "/candle-vllm/target/release/deps/liblocal_channel-5501da97fbe12c8a.rlib" "/candle-vllm/target/release/deps/libbytestring-4d1e0f611bab987e.rlib" "/candle-vllm/target/release/deps/libencoding_rs-c048082deb3a71c3.rlib" "/candle-vllm/target/release/deps/liblanguage_tags-e0dfc52f86f9b27a.rlib" "/candle-vllm/target/release/deps/libahash-a28674307e9664ad.rlib" "/candle-vllm/target/release/deps/libgetrandom-b24cab7002c3530b.rlib" "/candle-vllm/target/release/deps/libzerocopy-63825396d720b9a6.rlib" "/candle-vllm/target/release/deps/libmime-04e6f00618993e67.rlib" "/candle-vllm/target/release/deps/libpercent_encoding-d54414372a2980de.rlib" "/candle-vllm/target/release/deps/libh2-27cdaea5e3d2147c.rlib" "/candle-vllm/target/release/deps/libindexmap-fcdde0ade0e1bfe3.rlib" "/candle-vllm/target/release/deps/libequivalent-8a25e166243cfe94.rlib" "/candle-vllm/target/release/deps/libhashbrown-aee95c0614bccf63.rlib" "/candle-vllm/target/release/deps/libfutures_util-98b8b67b3d434750.rlib" "/candle-vllm/target/release/deps/libfutures_io-bbce8973c99e7ece.rlib" "/candle-vllm/target/release/deps/libslab-490ef311b9a84e0e.rlib" 
"/candle-vllm/target/release/deps/libfutures_channel-6d294bf595dec06a.rlib" "/candle-vllm/target/release/deps/libfutures_task-0a7c23a0933dbcaa.rlib" "/candle-vllm/target/release/deps/libpin_utils-185c55cbe9ca2fff.rlib" "/candle-vllm/target/release/deps/libbitflags-1029aec9c38cde73.rlib" "/candle-vllm/target/release/deps/libzstd-242538c7759a4fa6.rlib" "/candle-vllm/target/release/deps/libzstd_safe-d25e92a1d04503ec.rlib" "/candle-vllm/target/release/deps/libzstd_sys-a6ec9cf883e86b56.rlib" "/candle-vllm/target/release/deps/libflate2-b67596bfbb64de8d.rlib" "/candle-vllm/target/release/deps/libminiz_oxide-2b969af90226827f.rlib" "/candle-vllm/target/release/deps/libsimd_adler32-d1dbd8e6b06bf162.rlib" "/candle-vllm/target/release/deps/libcrc32fast-ceb628e76fc0bab0.rlib" "/candle-vllm/target/release/deps/libactix_service-dfc20131f5ba36d4.rlib" "/candle-vllm/target/release/deps/libactix_codec-f3cae536aed1196d.rlib" "/candle-vllm/target/release/deps/libtokio_util-88b2eabf4483c1ed.rlib" "/candle-vllm/target/release/deps/libtracing-9e7a6177765350ac.rlib" "/candle-vllm/target/release/deps/libtracing_core-c5e9157560beafe6.rlib" "/candle-vllm/target/release/deps/libonce_cell-4b31816a5aa6274f.rlib" "/candle-vllm/target/release/deps/libmemchr-38d4fc2a3522aa15.rlib" "/candle-vllm/target/release/deps/libfutures_sink-78114cacf22202c2.rlib" "/candle-vllm/target/release/deps/libbitflags-b9815c55ec510696.rlib" "/candle-vllm/target/release/deps/libactix_utils-ec862be5af373362.rlib" "/candle-vllm/target/release/deps/liblocal_waker-7857496d2dec9a57.rlib" "/candle-vllm/target/release/deps/libactix_rt-0ffc3a15823d1322.rlib" "/candle-vllm/target/release/deps/libtokio-b67279acab90ede3.rlib" "/candle-vllm/target/release/deps/libsignal_hook_registry-a773ced30481d3cb.rlib" "/candle-vllm/target/release/deps/libnum_cpus-fbaf57124b2a0166.rlib" "/candle-vllm/target/release/deps/libsocket2-8e37cfa1c7015c6b.rlib" "/candle-vllm/target/release/deps/libmio-81de974463968f98.rlib" 
"/candle-vllm/target/release/deps/liblog-35f97248cb2ec82c.rlib" "/candle-vllm/target/release/deps/libparking_lot-e183fcd4a13bd183.rlib" "/candle-vllm/target/release/deps/libparking_lot_core-5fbb54b30e35e540.rlib" "/candle-vllm/target/release/deps/liblibc-d38dc52f94735460.rlib" "/candle-vllm/target/release/deps/libcfg_if-88c619515d65e3f1.rlib" "/candle-vllm/target/release/deps/libsmallvec-e35ec471a6514672.rlib" "/candle-vllm/target/release/deps/liblock_api-920512de5989abb2.rlib" "/candle-vllm/target/release/deps/libscopeguard-6208b4062bcdc2b1.rlib" "/candle-vllm/target/release/deps/libpin_project_lite-42a553ee08f02ebb.rlib" "/candle-vllm/target/release/deps/libfutures_core-b87582f06d7f1343.rlib" "/candle-vllm/target/release/deps/libhttp-b738399ec4ab1c60.rlib" "/candle-vllm/target/release/deps/libitoa-dcbca83b54db3306.rlib" "/candle-vllm/target/release/deps/libbytes-8c2bf1b211f72910.rlib" "/candle-vllm/target/release/deps/libfnv-ffe196e20ea2a648.rlib" "/usr/lib/rustlib/x86_64-unknown-linux-gnu/lib/libstd-9c342d6596ca77d8.rlib" "/usr/lib/rustlib/x86_64-unknown-linux-gnu/lib/libpanic_unwind-35e6faa0abf08dd1.rlib" "/usr/lib/rustlib/x86_64-unknown-linux-gnu/lib/libobject-6242b5524a2684de.rlib" "/usr/lib/rustlib/x86_64-unknown-linux-gnu/lib/libmemchr-94511439d510df36.rlib" "/usr/lib/rustlib/x86_64-unknown-linux-gnu/lib/libaddr2line-1923a594ddedab24.rlib" "/usr/lib/rustlib/x86_64-unknown-linux-gnu/lib/libgimli-5b476927cd520d76.rlib" "/usr/lib/rustlib/x86_64-unknown-linux-gnu/lib/librustc_demangle-6b4664d28b4dc07b.rlib" "/usr/lib/rustlib/x86_64-unknown-linux-gnu/lib/libstd_detect-4d7e14ee42b44abc.rlib" "/usr/lib/rustlib/x86_64-unknown-linux-gnu/lib/libhashbrown-94e04d08d317eb2b.rlib" "/usr/lib/rustlib/x86_64-unknown-linux-gnu/lib/librustc_std_workspace_alloc-7e3a1db27b23a8ee.rlib" "/usr/lib/rustlib/x86_64-unknown-linux-gnu/lib/libminiz_oxide-0651af3c34a1e4b9.rlib" "/usr/lib/rustlib/x86_64-unknown-linux-gnu/lib/libadler-e5da8ecb95d2de36.rlib" 
"/usr/lib/rustlib/x86_64-unknown-linux-gnu/lib/libunwind-052b86aa844a2857.rlib" "/usr/lib/rustlib/x86_64-unknown-linux-gnu/lib/libcfg_if-bbd2a157557b773d.rlib" "/usr/lib/rustlib/x86_64-unknown-linux-gnu/lib/liblibc-f47279717d0e1831.rlib" "/usr/lib/rustlib/x86_64-unknown-linux-gnu/lib/liballoc-d30e243a979711ec.rlib" "/usr/lib/rustlib/x86_64-unknown-linux-gnu/lib/librustc_std_workspace_core-18929aabe36e3f57.rlib" "/usr/lib/rustlib/x86_64-unknown-linux-gnu/lib/libcore-f9f41fbdedfbfafb.rlib" "/usr/lib/rustlib/x86_64-unknown-linux-gnu/lib/libcompiler_builtins-b26982894e484f03.rlib" "-Wl,-Bdynamic" "-lssl" "-lcrypto" "-lflashattention" "-lcudart" "-lstdc++" "-lstdc++" "-lcuda" "-lnccl" "-lnvrtc" "-lcurand" "-lcublas" "-lcublasLt" "-lcudnn" "-lgcc_s" "-lutil" "-lrt" "-lpthread" "-lm" "-ldl" "-lc" "-Wl,--eh-frame-hdr" "-Wl,-z,noexecstack" "-L" "/usr/lib/rustlib/x86_64-unknown-linux-gnu/lib" "-o" "/candle-vllm/target/release/deps/candle_vllm-8b71aad931b633bb" "-Wl,--gc-sections" "-pie" "-Wl,-z,relro,-z,now" "-Wl,-O1" "-nodefaultlibs"
  = note: /usr/bin/ld: /candle-vllm/target/release/build/candle-flash-attn-bb54a4d16d25ee03/out/libflashattention.a(flash_api.o): relocation R_X86_64_32 against `.nvFatBinSegment' can not be used when making a PIE object; recompile with -fPIE
          /usr/bin/ld: /candle-vllm/target/release/build/candle-flash-attn-bb54a4d16d25ee03/out/libflashattention.a(flash_fwd_hdim128_fp16_sm80.o): relocation R_X86_64_32 against symbol `_Z16flash_fwd_kernelI23Flash_fwd_kernel_traitsILi128ELi128ELi64ELi4ELb0ELb0EN7cutlass6half_tE19Flash_kernel_traitsILi128ELi128ELi64ELi4ES2_EELb0ELb0ELb1ELb1ELb0EEv16Flash_fwd_params' can not be used when making a PIE object; recompile with -fPIE
          /usr/bin/ld: /candle-vllm/target/release/build/candle-flash-attn-bb54a4d16d25ee03/out/libflashattention.a(flash_fwd_hdim160_fp16_sm80.o): relocation R_X86_64_32 against symbol `_Z16flash_fwd_kernelI23Flash_fwd_kernel_traitsILi160ELi64ELi64ELi4ELb0ELb0EN7cutlass6half_tE19Flash_kernel_traitsILi160ELi64ELi64ELi4ES2_EELb1ELb1ELb0ELb0ELb1EEv16Flash_fwd_params' can not be used when making a PIE object; recompile with -fPIE
          /usr/bin/ld: /candle-vllm/target/release/build/candle-flash-attn-bb54a4d16d25ee03/out/libflashattention.a(flash_fwd_hdim192_fp16_sm80.o): relocation R_X86_64_32 against symbol `_Z16flash_fwd_kernelI23Flash_fwd_kernel_traitsILi192ELi64ELi64ELi4ELb0ELb0EN7cutlass6half_tE19Flash_kernel_traitsILi192ELi64ELi64ELi4ES2_EELb1ELb1ELb1ELb0ELb1EEv16Flash_fwd_params' can not be used when making a PIE object; recompile with -fPIE
          /usr/bin/ld: /candle-vllm/target/release/build/candle-flash-attn-bb54a4d16d25ee03/out/libflashattention.a(flash_fwd_hdim224_fp16_sm80.o): relocation R_X86_64_32 against symbol `_Z16flash_fwd_kernelI23Flash_fwd_kernel_traitsILi224ELi128ELi64ELi8ELb0ELb0EN7cutlass6half_tE19Flash_kernel_traitsILi224ELi128ELi64ELi8ES2_EELb1ELb1ELb1ELb1ELb1EEv16Flash_fwd_params' can not be used when making a PIE object; recompile with -fPIE
          /usr/bin/ld: /candle-vllm/target/release/build/candle-flash-attn-bb54a4d16d25ee03/out/libflashattention.a(flash_fwd_hdim256_fp16_sm80.o): relocation R_X86_64_32 against symbol `_Z16flash_fwd_kernelI23Flash_fwd_kernel_traitsILi256ELi64ELi64ELi4ELb0ELb0EN7cutlass6half_tE19Flash_kernel_traitsILi256ELi64ELi64ELi4ES2_EELb0ELb0ELb0ELb1ELb0EEv16Flash_fwd_params' can not be used when making a PIE object; recompile with -fPIE
          /usr/bin/ld: /candle-vllm/target/release/build/candle-flash-attn-bb54a4d16d25ee03/out/libflashattention.a(flash_fwd_hdim32_fp16_sm80.o): relocation R_X86_64_32 against symbol `_Z16flash_fwd_kernelI23Flash_fwd_kernel_traitsILi32ELi128ELi128ELi4ELb0ELb0EN7cutlass6half_tE19Flash_kernel_traitsILi32ELi128ELi128ELi4ES2_EELb1ELb1ELb1ELb0ELb1EEv16Flash_fwd_params' can not be used when making a PIE object; recompile with -fPIE
          /usr/bin/ld: /candle-vllm/target/release/build/candle-flash-attn-bb54a4d16d25ee03/out/libflashattention.a(flash_fwd_hdim64_fp16_sm80.o): relocation R_X86_64_32 against symbol `_Z16flash_fwd_kernelI23Flash_fwd_kernel_traitsILi64ELi128ELi64ELi4ELb0ELb0EN7cutlass6half_tE19Flash_kernel_traitsILi64ELi128ELi64ELi4ES2_EELb1ELb1ELb1ELb0ELb1EEv16Flash_fwd_params' can not be used when making a PIE object; recompile with -fPIE
          /usr/bin/ld: /candle-vllm/target/release/build/candle-flash-attn-bb54a4d16d25ee03/out/libflashattention.a(flash_fwd_hdim96_fp16_sm80.o): relocation R_X86_64_32 against symbol `_Z16flash_fwd_kernelI23Flash_fwd_kernel_traitsILi96ELi64ELi64ELi4ELb0ELb0EN7cutlass6half_tE19Flash_kernel_traitsILi96ELi64ELi64ELi4ES2_EELb1ELb1ELb1ELb1ELb1EEv16Flash_fwd_params' can not be used when making a PIE object; recompile with -fPIE
          /usr/bin/ld: /candle-vllm/target/release/build/candle-flash-attn-bb54a4d16d25ee03/out/libflashattention.a(flash_fwd_hdim128_bf16_sm80.o): relocation R_X86_64_32 against symbol `_Z16flash_fwd_kernelI23Flash_fwd_kernel_traitsILi128ELi128ELi64ELi4ELb0ELb0EN7cutlass10bfloat16_tE19Flash_kernel_traitsILi128ELi128ELi64ELi4ES2_EELb0ELb0ELb1ELb1ELb0EEv16Flash_fwd_params' can not be used when making a PIE object; recompile with -fPIE
          /usr/bin/ld: /candle-vllm/target/release/build/candle-flash-attn-bb54a4d16d25ee03/out/libflashattention.a(flash_fwd_hdim160_bf16_sm80.o): relocation R_X86_64_32 against symbol `_Z16flash_fwd_kernelI23Flash_fwd_kernel_traitsILi160ELi64ELi64ELi4ELb0ELb0EN7cutlass10bfloat16_tE19Flash_kernel_traitsILi160ELi64ELi64ELi4ES2_EELb1ELb1ELb0ELb0ELb1EEv16Flash_fwd_params' can not be used when making a PIE object; recompile with -fPIE
          /usr/bin/ld: /candle-vllm/target/release/build/candle-flash-attn-bb54a4d16d25ee03/out/libflashattention.a(flash_fwd_hdim192_bf16_sm80.o): relocation R_X86_64_32 against symbol `_Z16flash_fwd_kernelI23Flash_fwd_kernel_traitsILi192ELi64ELi64ELi4ELb0ELb0EN7cutlass10bfloat16_tE19Flash_kernel_traitsILi192ELi64ELi64ELi4ES2_EELb1ELb1ELb1ELb0ELb1EEv16Flash_fwd_params' can not be used when making a PIE object; recompile with -fPIE
          /usr/bin/ld: /candle-vllm/target/release/build/candle-flash-attn-bb54a4d16d25ee03/out/libflashattention.a(flash_fwd_hdim224_bf16_sm80.o): relocation R_X86_64_32 against symbol `_Z16flash_fwd_kernelI23Flash_fwd_kernel_traitsILi224ELi128ELi64ELi8ELb0ELb0EN7cutlass10bfloat16_tE19Flash_kernel_traitsILi224ELi128ELi64ELi8ES2_EELb1ELb1ELb1ELb1ELb1EEv16Flash_fwd_params' can not be used when making a PIE object; recompile with -fPIE
          /usr/bin/ld: /candle-vllm/target/release/build/candle-flash-attn-bb54a4d16d25ee03/out/libflashattention.a(flash_fwd_hdim256_bf16_sm80.o): relocation R_X86_64_32 against symbol `_Z16flash_fwd_kernelI23Flash_fwd_kernel_traitsILi256ELi64ELi64ELi4ELb0ELb0EN7cutlass10bfloat16_tE19Flash_kernel_traitsILi256ELi64ELi64ELi4ES2_EELb0ELb0ELb0ELb1ELb0EEv16Flash_fwd_params' can not be used when making a PIE object; recompile with -fPIE
          /usr/bin/ld: /candle-vllm/target/release/build/candle-flash-attn-bb54a4d16d25ee03/out/libflashattention.a(flash_fwd_hdim32_bf16_sm80.o): relocation R_X86_64_32 against symbol `_Z16flash_fwd_kernelI23Flash_fwd_kernel_traitsILi32ELi128ELi128ELi4ELb0ELb0EN7cutlass10bfloat16_tE19Flash_kernel_traitsILi32ELi128ELi128ELi4ES2_EELb1ELb1ELb1ELb0ELb1EEv16Flash_fwd_params' can not be used when making a PIE object; recompile with -fPIE
          /usr/bin/ld: /candle-vllm/target/release/build/candle-flash-attn-bb54a4d16d25ee03/out/libflashattention.a(flash_fwd_hdim64_bf16_sm80.o): relocation R_X86_64_32 against symbol `_Z16flash_fwd_kernelI23Flash_fwd_kernel_traitsILi64ELi128ELi64ELi4ELb0ELb0EN7cutlass10bfloat16_tE19Flash_kernel_traitsILi64ELi128ELi64ELi4ES2_EELb1ELb1ELb1ELb0ELb1EEv16Flash_fwd_params' can not be used when making a PIE object; recompile with -fPIE
          /usr/bin/ld: /candle-vllm/target/release/build/candle-flash-attn-bb54a4d16d25ee03/out/libflashattention.a(flash_fwd_hdim96_bf16_sm80.o): relocation R_X86_64_32 against symbol `_Z16flash_fwd_kernelI23Flash_fwd_kernel_traitsILi96ELi64ELi64ELi4ELb0ELb0EN7cutlass10bfloat16_tE19Flash_kernel_traitsILi96ELi64ELi64ELi4ES2_EELb1ELb1ELb1ELb1ELb1EEv16Flash_fwd_params' can not be used when making a PIE object; recompile with -fPIE
          collect2: error: ld returned 1 exit status
          

error: could not compile `candle-vllm` (bin "candle-vllm") due to previous error
[root@95e9d872d994 candle-vllm]# PATH="/usr/lib/rustlib/x86_64-unknown-linux-gnu/bin:/root/.local/bin:/root/bin:/usr/local/nvidia/bin:/usr/local/cuda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin"
[root@95e9d872d994 candle-vllm]# command -v cc
/usr/bin/cc
[root@95e9d872d994 candle-vllm]# cc --version
cc (GCC) 11.4.1 20230605 (Red Hat 11.4.1-2)
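
The linker notes above repeat the same complaint for each object inside libflashattention.a. As a minimal sketch for triaging such a log, the helper below lists the distinct object files that ld rejected. The function name, and the idea of saving the cargo output to a file first, are assumptions made for illustration and are not part of the original report.

```shell
# Print the distinct object files that ld rejected with the
# "can not be used when making a PIE object" message.
# $1: path to a file containing the saved build output.
list_non_pie_objects() {
  grep -F 'can not be used when making a PIE object' "$1" \
    | sed -n 's/.*(\([^()]*\.o\)): relocation.*/\1/p' \
    | LC_ALL=C sort -u
}
```

For the log in this report, the output would start with flash_api.o followed by the flash_fwd_hdim*_sm80.o kernel objects, which suggests that the whole archive was compiled without position-independent code rather than a single file.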

Maybe I shouldn’t use the flash-attn feature? Thanks for any suggestions or information.

About this issue

  • State: closed
  • Created 5 months ago
  • Comments: 46 (21 by maintainers)


Most upvoted comments

Thank you! Please see mistral.rs, the successor to this project, which supports flash attention, GGUF, and more.

On Wed, Mar 13, 2024, 3:13 PM Iván Baldo wrote:

Hi! Sorry for the late reply. I tested it and you are right; I reported it here: huggingface/candle#1844 (https://github.com/huggingface/candle/issues/1844). Thanks!


From what I understand, this issue is not resolved, as the cause is likely in Candle. Could you please open an issue on Candle? candle-vllm does not build the flash attention kernels; that build step is part of Candle’s build.rs.

I agree. It may have to do with the CUDA driver, though.

Ouch, that didn’t work, I will try with Ubuntu tomorrow.

This error arises because we need to manually monomorphize the kernel function. I have since pushed the changes; could you try again?

Ok, this seems like a general linking problem. I’ll try to reproduce it tonight, as I plan on working on the CUDA kernels.
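
One way to narrow down a general linking problem like this one is to check whether the host compiler produces position-independent executables by default. The sketch below rests on an assumption that is not confirmed anywhere in this thread: that a GCC configured with --enable-default-pie (as on Debian and Ubuntu) makes the host-side objects produced through nvcc compatible with the -pie flag visible in the failing link command, while a GCC built without that option does not. The function name is hypothetical.

```shell
# Read the output of `cc -v` on stdin and print "yes" if the compiler was
# configured with --enable-default-pie, otherwise "no".
reports_default_pie() {
  if grep -q -- '--enable-default-pie'; then
    echo yes
  else
    echo no
  fi
}

# Usage (cc -v writes to stderr, hence the redirect):
#   cc -v 2>&1 | reports_default_pie
```

If this prints no, passing -fPIC to the host compiler (for nvcc, through -Xcompiler -fPIC) is the usual way to follow the linker's "recompile with -fPIE" hint. Whether Candle's build.rs exposes a way to do that is not stated in this thread.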

Looks like there is a dependency bug - I just pushed a fix.

I fixed that bug; it was on the candle-lora side. Could you try again with the original Cargo.toml, after running cargo update?

To me, this looks like a candle flash attention compilation error. However, it may be because I compile CUDA kernels, too. Could you try compilation without flash-attn and let me know if that breaks?
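
The test suggested above amounts to rebuilding with the same feature list minus flash-attn. A minimal sketch, assuming the feature list from the Dockerfile in the report; the helper name is hypothetical:

```shell
# Remove one feature from a comma-separated Cargo feature list.
# $1: feature list, for example cuda,cudnn,flash-attn,nccl
# $2: feature to drop, for example flash-attn
drop_feature() {
  printf '%s\n' "$1" | tr ',' '\n' | grep -vxF -- "$2" | paste -sd, -
}

# Usage:
#   cargo build --release \
#     --features "$(drop_feature cuda,cudnn,flash-attn,nccl flash-attn)"
```

If the build links cleanly without flash-attn, that points at the flash attention kernels rather than at the other CUDA code.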