rust-htslib: Passing `bam::record::Record` between threads causes a segfault
Thank you for the very nice library!
I’m working on a tool that passes reads between threads. Unfortunately, the tool is producing non-deterministic segmentation faults and other memory errors. I’ve traced some of these issues back to rust-htslib, which will crash somewhat randomly when bam::record::Records are passes between threads. I am not too familiar with this library, so I am wondering if this is the expected behavior?
Here is a simplified example that can reproduce the crash:
use std::error::Error;
use std::str;
use std::sync::mpsc::{self, Receiver};
use std::thread;
use rust_htslib::bam::{Read, Reader};
use rust_htslib::bam::record::Record;
fn sum_mapqs(rx: Receiver<Option<Record>>) -> Result<(), Box<dyn Error>> {
let mut total_mapq = 0u64;
loop {
match rx.recv() {
Ok(x) => {
match x {
Some(read) => {
let mapq = read.mapq();
total_mapq = total_mapq.saturating_add(mapq as u64);
},
None => { // No more data
println!("Total MapQ: {}", total_mapq);
return Ok(());
},
}
},
Err(e) => {
eprintln!("Error reciving data: {}", e);
}
}
}
}
fn main() -> Result<(), Box<dyn Error>> {
let mut bam = match Reader::from_stdin() {
Ok(bam) => { bam },
Err(e) => { return Err(Box::new(e)); },
};
// Initialize the writer thread
let (tx, rx) = mpsc::channel();
let writer = thread::spawn(move || {
if let Err(e) = sum_mapqs(rx) {
eprintln!("Error writing output - {}", e);
}
});
for read in bam.records() {
match read {
Ok(read) => {
eprintln!("Parsed read: {}", str::from_utf8(read.qname()).unwrap());
if let Err(e) = tx.send(Some(read)) {
eprintln!("Error sending data to writer thread");
return Err(Box::new(e));
}
},
Err(e) => {
return Err(Box::new(e));
},
}
}
// Close the spawned thread
let _ = tx.send(None);
let _ = writer.join().unwrap();
Ok(())
}
Compiling with RUSTFLAGS="-g" cargo build and then running with a SAM passed through stdin produces the following:
$ cat test.sam | target/debug/rust-htslib-crash
...
Parsed read: H203:185:D2990ACXX:4:1101:11561:5493
23559 Broken pipe cat test.sam
23560 Segmentation fault (core dumped) | target/debug/rust-htslib-crash
Re-running the same command will produce the crash in different parts of the program. I’ve attached a backtrace from one of the crashes: crash backtrace.txt
About this issue
- Original URL
- State: open
- Created 3 years ago
- Comments: 28 (22 by maintainers)
Commits related to this issue
- switch to noodles as bam reading library Should use less memory, see https://github.com/rust-bio/rust-htslib/issues/293 — committed to roland-rad-lab/CRISPR_rust by sc13-bioinf 2 years ago
Reviving this issue again.
Personally, my vote is for 3. I would also think that 1 is a no-go as being unable to pass bam::Records between threads would limit the usefulness of the library. I’m not sure what the reasoning is for implementing
bio_types::genome::AbstractIntervalfor bam::Records, so it’s hard for me to weigh the merits of 2 vs. 3.@veldsla I agree. As it stands the
Send/Syncare not valid. My primary vote is for 3 as well, then 1. There may be performance benefits from having a smallerbam::Recordstruct.HeaderViewmay be needed to get reference sequence name fromtidso we will need to document how to get that, (c.f. https://github.com/rust-bio/rust-htslib/issues/288 ) 2 is a no go.I think it’s a bit painful to see a segfaulting issue go stale, so let me try to revive this. It seems clear that the current situation is invalid. I see three possible solutions also mentioned before:
bam::Recordnot shareable.Rc<HeaderView>withArc<HeaderView>and always pay the performance penalty.Rc<HeaderView>fromRecord. Breaking change, but introduced fairly recent.Anyone see another way out of this? Do we take a vote? I’m on 1 or 3, but mostly 3.
What’s the consensus here? It seems that Arc (option 3) requires the least code-change and @veldsla has shown that the performance hit is tolerable. Is there some concurrency setup where this is not an issue? @sitag recommends using
rayonparallel iterator. Is there an exampl here?I made a draft PR (#345) trying to work around this particular soundness issue. Please let me know what you think.
Having seen these results, I still don’t change my preference. I think the header has no useful place in the Record and the sacrificing of the
AbstractIntervaltrait would be a good thing. Also when doing my naives/Rc/Arc/I saw: 17 occurrences ofunsafe impl (Send|Sync)for structs that often containRc. So a decent unsafe audit is in order.