picard: UmiAwareMarkDuplicatesWithMateCigar random error


Bug Report

Affected tool(s)

UmiAwareMarkDuplicatesWithMateCigar

Affected version(s)

  • picard/2.18.16
  • picard/2.27.7

Description

Running the UmiAwareMarkDuplicatesWithMateCigar tool with the command below intermittently fails with the following error.

java -jar picard.jar UmiAwareMarkDuplicatesWithMateCigar I=input.bam CREATE_INDEX=true UMI_METRICS=md_metrics M=output.txt OUTPUT=test.bam ASSUME_SORT_ORDER=coordinate TAG_DUPLICATE_SET_MEMBERS=true MAX_RECORDS_IN_RAM=400000

Exception in thread "main" htsjdk.samtools.SAMException: The input records were not sorted in duplicate order:
MN00975:67:000H2WVYG:1:21110:2374:3570  147     chr3    128485830       60      151M    =       128485833       -148       AGTCGCCGGCACTTAGGAGGGGTAGGTGGGGATGGGGTGGTGTGTAGCAGGCTGGGTGCCCATAGTAGCTAGGCCTGGGCGCAGGGGACTGCCACTTTCCATCTTCATGCTCTCCGTCAGTGACACCTGGTACTTGACGCCGTCCTTGTCC    //FF/A/FF6/=////F/A6=F//FFFFFF=FFFA/=/FFF/F/FFF/FFF//FFFFAFFFFFFFFFF/FFFF/AAFAF/AFFFF/AAFFF//FFF6FFF6F6FFF/FFFFFF//F/FA/FFAAF//F/FFFFFFFF/FFA/6F6AAF//A    MC:Z:13S130M    MD:Z:2C9A138       RG:Z:000H2WVYG.AE4441.L001      NM:i:2  MQ:i:60 AS:i:143        XS:i:0  QX:Z:FFFFFFFF   RX:Z:ACCCTATA
MN00975:67:000H2WVYG:1:11107:15346:8737 163     chr1    153346  0       10S40M21S       =       153346  40      AGCACCATCACCACTTACCTTGTCCTGTGCATCTCTTTCATTGGCTGTTCACTCCTGGCGGTTATCGGTAA    FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFAFFFFFFFFFFFFAF    MC:Z:10S40M13S  MD:Z:40 RG:Z:000H2WVYG.AE4441.L001      NM:i:0  MQ:i:0  AS:i:40 XS:i:40    QX:Z:FFFFFFFA   RX:Z:ATCGGTAA

        at htsjdk.samtools.DuplicateSetIterator.next(DuplicateSetIterator.java:152)
        at picard.sam.markduplicates.UmiAwareDuplicateSetIterator.next(UmiAwareDuplicateSetIterator.java:119)
        at picard.sam.markduplicates.UmiAwareDuplicateSetIterator.next(UmiAwareDuplicateSetIterator.java:53)
        at picard.sam.markduplicates.SimpleMarkDuplicatesWithMateCigar.doWork(SimpleMarkDuplicatesWithMateCigar.java:126)
        at picard.sam.markduplicates.UmiAwareMarkDuplicatesWithMateCigar.doWork(UmiAwareMarkDuplicatesWithMateCigar.java:138)
        at picard.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:305)
        at picard.cmdline.PicardCommandLine.instanceMain(PicardCommandLine.java:103)
        at picard.cmdline.PicardCommandLine.main(PicardCommandLine.java:113)

Steps to reproduce

This error is not specific to this file, nor does it happen every time I run the command; it occurs intermittently. I ran many tests, and it generally happens about once in every 10 runs of the tool.
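
Since the failure is intermittent, one way to measure how often it occurs is to rerun the same command in a loop and record which runs fail with this exception. The following is only a minimal sketch, not something from the original report: the helper name `count_failures`, the `picard.jar` path, and the choice to match on the exception message in stderr are all assumptions made for illustration.

```python
import subprocess

# Command from the report, split into an argument list.
# Assumption: the Picard JAR is named picard.jar and sits in the working directory.
PICARD_CMD = [
    "java", "-jar", "picard.jar", "UmiAwareMarkDuplicatesWithMateCigar",
    "I=input.bam", "CREATE_INDEX=true", "UMI_METRICS=md_metrics",
    "M=output.txt", "OUTPUT=test.bam", "ASSUME_SORT_ORDER=coordinate",
    "TAG_DUPLICATE_SET_MEMBERS=true", "MAX_RECORDS_IN_RAM=400000",
]

# Message text taken from the exception shown in the report.
ERROR_MARKER = "not sorted in duplicate order"


def count_failures(cmd, runs=10, marker=ERROR_MARKER):
    """Run cmd `runs` times; return the 1-based run numbers that exited
    non-zero with `marker` somewhere in stderr (hypothetical helper)."""
    failed_runs = []
    for run in range(1, runs + 1):
        result = subprocess.run(cmd, capture_output=True, text=True)
        if result.returncode != 0 and marker in result.stderr:
            failed_runs.append(run)
    return failed_runs
```

Calling `count_failures(PICARD_CMD, runs=50)` would return the run numbers that hit the exception, which gives a rough failure rate to compare across Picard versions. Each run overwrites `test.bam`, so the output of a failing run should be copied aside if it needs inspecting.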

Expected behavior

A BAM file in which UMI-aware duplicate reads are marked.

Actual behavior

An empty BAM file and a Java exception.

Thank you all for your help and for the great work you’ve been doing developing Picard.

About this issue

  • State: closed
  • Created 4 years ago
  • Comments: 19 (10 by maintainers)

Most upvoted comments

Per @yfarjoun's comment on another thread:

It seems that this bug is related to a multithreading race condition in samtools/htsjdk#1516. Now that this has been identified, I think a fix will be forthcoming.