picard: UmiAwareMarkDuplicatesWithMateCigar random error
Bug Report
Affected tool(s)
UmiAwareMarkDuplicatesWithMateCigar
Affected version(s)
- picard/2.18.16
- picard/2.27.7
Description
When running the UmiAwareMarkDuplicatesWithMateCigar tool, it sometimes produces the following error.
java -jar picard.java UmiAwareMarkDuplicatesWithMateCigar I=input.bam CREATE_INDEX=true UMI_METRICS=md_metrics M=output.txt OUTPUT=test.bam ASSUME_SORT_ORDER=coordinate TAG_DUPLICATE_SET_MEMBERS=true MAX_RECORDS_IN_RAM=400000
Exception in thread "main" htsjdk.samtools.SAMException: The input records were not sorted in duplicate order:
MN00975:67:000H2WVYG:1:21110:2374:3570 147 chr3 128485830 60 151M = 128485833 -148 AGTCGCCGGCACTTAGGAGGGGTAGGTGGGGATGGGGTGGTGTGTAGCAGGCTGGGTGCCCATAGTAGCTAGGCCTGGGCGCAGGGGACTGCCACTTTCCATCTTCATGCTCTCCGTCAGTGACACCTGGTACTTGACGCCGTCCTTGTCC //FF/A/FF6/=////F/A6=F//FFFFFF=FFFA/=/FFF/F/FFF/FFF//FFFFAFFFFFFFFFF/FFFF/AAFAF/AFFFF/AAFFF//FFF6FFF6F6FFF/FFFFFF//F/FA/FFAAF//F/FFFFFFFF/FFA/6F6AAF//A MC:Z:13S130M MD:Z:2C9A138 RG:Z:000H2WVYG.AE4441.L001 NM:i:2 MQ:i:60 AS:i:143 XS:i:0 QX:Z:FFFFFFFF RX:Z:ACCCTATA
MN00975:67:000H2WVYG:1:11107:15346:8737 163 chr1 153346 0 10S40M21S = 153346 40 AGCACCATCACCACTTACCTTGTCCTGTGCATCTCTTTCATTGGCTGTTCACTCCTGGCGGTTATCGGTAA FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFAFFFFFFFFFFFFAF MC:Z:10S40M13S MD:Z:40 RG:Z:000H2WVYG.AE4441.L001 NM:i:0 MQ:i:0 AS:i:40 XS:i:40 QX:Z:FFFFFFFA RX:Z:ATCGGTAA
at htsjdk.samtools.DuplicateSetIterator.next(DuplicateSetIterator.java:152)
at picard.sam.markduplicates.UmiAwareDuplicateSetIterator.next(UmiAwareDuplicateSetIterator.java:119)
at picard.sam.markduplicates.UmiAwareDuplicateSetIterator.next(UmiAwareDuplicateSetIterator.java:53)
at picard.sam.markduplicates.SimpleMarkDuplicatesWithMateCigar.doWork(SimpleMarkDuplicatesWithMateCigar.java:126)
at picard.sam.markduplicates.UmiAwareMarkDuplicatesWithMateCigar.doWork(UmiAwareMarkDuplicatesWithMateCigar.java:138)
at picard.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:305)
at picard.cmdline.PicardCommandLine.instanceMain(PicardCommandLine.java:103)
at picard.cmdline.PicardCommandLine.main(PicardCommandLine.java:113)
Steps to reproduce
This error is not specific to this file, nor does it happen every time I run the command; it occurs intermittently. Across many test runs, it appears roughly once in every 10 runs of the tool.
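Since the failure is intermittent, a small harness that repeats the same command and counts how often the exception shows up can help quantify it. The following is only a sketch: the function name `count_failures`, the `PICARD_CMD` list, and the jar path `picard.jar` are illustrative assumptions (the command above is written with `picard.java`), and the error string is copied from the trace above.

```python
import subprocess
import sys

# Message text taken from the exception reported above.
ERROR_MARKER = "The input records were not sorted in duplicate order"

# Assumption: the jar path and file names mirror the command in this report;
# adjust them to your environment.
PICARD_CMD = [
    "java", "-jar", "picard.jar", "UmiAwareMarkDuplicatesWithMateCigar",
    "I=input.bam", "CREATE_INDEX=true", "UMI_METRICS=md_metrics",
    "M=output.txt", "OUTPUT=test.bam", "ASSUME_SORT_ORDER=coordinate",
    "TAG_DUPLICATE_SET_MEMBERS=true", "MAX_RECORDS_IN_RAM=400000",
]


def count_failures(cmd, runs=10, marker=ERROR_MARKER):
    """Run `cmd` `runs` times and return how many runs exited non-zero
    with `marker` present in stderr."""
    failures = 0
    for _ in range(runs):
        result = subprocess.run(cmd, capture_output=True, text=True)
        if result.returncode != 0 and marker in result.stderr:
            failures += 1
    return failures
```

Calling `count_failures(PICARD_CMD, runs=50)` would then give a rough failure count to attach to the report. Runs that fail for a different reason are not counted, so the number reflects only this specific exception.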
Expected behavior
A BAM file in which UMI-aware duplicate reads are marked.
Actual behavior
An empty BAM file and a Java exception.
Thank you all for your help and for the great work you’ve been doing developing Picard.
About this issue
- State: closed
- Created 4 years ago
- Comments: 19 (10 by maintainers)
Per @yfarjoun's comment on another thread:
It seems that this bug is related to a multithreading race condition, tracked in samtools/htsjdk#1516. Now that this has been identified, I think a fix will be forthcoming.