snakemake: Checkpoint aggregate returns checkpoint output dir instead of files
Hi, I have observed the following in Snakemake v5.7.1 (edit: also in v5.7.4 and v5.6.0).
Jobs waiting for checkpoint output fail with the following confusing output (simplified):
rule merge_mono_dinucleotide_fraction:
    input: <TBD>
    output: <OMITTED path to output file>
    log: <OMITTED path to log file>
    jobid: 0
    <OMITTED wildcards, resources etc...>
Error in rule merge_mono_dinucleotide_fraction:
    jobid: 0
    output: <OMITTED>
    log: <OMITTED>
    shell:
        samtools merge -@ 6 -O BAM <OMITTED: correct path to output file> input/fastq/strand-seq/HG00733_PRJEB12849/requests &> <OMITTED: path to log file>
        (one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)
The <TBD> (to be determined?) presumably means that Snakemake still needs to evaluate the checkpoint in the input function, which is fine. The checkpoint is long-running (it downloads data), and adding a pdb.set_trace() inside my input function shows that an IncompleteCheckpoint exception is raised (as expected, I presume). Now the problematic part: the path input/fastq/strand-seq/HG00733_PRJEB12849/requests is the directory() output of the checkpoint. The samtools log file for the failed job above shows the following:
[E::hts_hopen] Failed to open file input/fastq/strand-seq/HG00733_PRJEB12849/requests
[E::hts_open_format] Failed to open file input/fastq/strand-seq/HG00733_PRJEB12849/requests
samtools merge: fail to open "input/fastq/strand-seq/HG00733_PRJEB12849/requests": Is a directory
Apparently, Snakemake detects the unfinished checkpoint but passes the directory() output of the checkpoint as input to the rule (in this case merge_mono_dinucleotide_fraction). If I wait for all jobs to fail and for the checkpoint to finish, and then restart the pipeline, the workflow continues as expected, which shows that the aggregate input function works as intended.
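To make the suspected behavior concrete, here is a toy model in plain Python. It is not Snakemake's code. The names resolve_inputs and collect_files are hypothetical, and the targetfile attribute on the stand-in exception is an assumption made for this illustration. It only sketches the hypothesis above: when the input function raises because the checkpoint is unfinished, the resolver falls back to the checkpoint's directory() output as a placeholder input.

```python
class IncompleteCheckpoint(Exception):
    """Stand-in for the exception raised while a checkpoint is unfinished."""

    def __init__(self, targetfile):
        super().__init__(targetfile)
        # Assumption for this sketch: the exception carries the checkpoint output path.
        self.targetfile = targetfile


def resolve_inputs(input_func, wildcards):
    """Toy resolver: return (inputs, complete) for a rule's input function."""
    try:
        return list(input_func(wildcards)), True
    except IncompleteCheckpoint as exc:
        # Placeholder input: the checkpoint's directory() output, not real files.
        return [exc.targetfile], False


def collect_files(wildcards):
    """Hypothetical input function whose checkpoint has not finished yet."""
    raise IncompleteCheckpoint("input/fastq/strand-seq/HG00733_PRJEB12849/requests")


inputs, complete = resolve_inputs(collect_files, {})
print(inputs, complete)
```

In this model, the failure reported here would correspond to executing the job while complete is still False, so that the placeholder directory ends up in the shell command.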
I have trouble coming up with a minimal reproducible example for this, maybe because it’s about timing, or the reason is actually something else - nevertheless, the log output of samtools clearly shows that Snakemake executes the rule with the checkpoint output, instead of the output collected by the aggregate input function. Thanks for looking into this.
Best, Peter
Below is the code of my aggregate input function. As stated above, it works as intended once the checkpoint has completed (see my comment below):
import os


def collect_merge_files(wildcards):
    """
    Collect the two BAM files to merge for one library, based on the
    request files created by the checkpoint.
    """
    individual = wildcards.individual
    bioproject = wildcards.bioproject
    platform = wildcards.platform
    project = wildcards.project
    lib_id = wildcards.lib_id

    # Raises an exception as long as the checkpoint has not finished
    requests_dir = checkpoints.create_bioproject_download_requests.get(individual=individual, bioproject=bioproject).output[0]

    # Recover the spec and run_id wildcards from the request file names
    search_pattern = '_'.join([individual, project, '{spec}', lib_id, '{run_id}', '1'])
    search_path = os.path.join(requests_dir, search_pattern + '.request')
    checkpoint_wildcards = glob_wildcards(search_path)

    # zip pairs the lists element-wise, hence the duplicated constant values
    bam_files = expand(
        'output/alignments/strandseq_to_reference/{reference}.{individual}.{bioproject}/{individual}_{project}_{spec}_{lib_id}_{run_id}.filt.sam.bam',
        zip,
        reference=[wildcards.reference, wildcards.reference],
        individual=[individual, individual],
        bioproject=[bioproject, bioproject],
        project=[project, project],
        spec=checkpoint_wildcards.spec,
        lib_id=[lib_id, lib_id],
        run_id=checkpoint_wildcards.run_id)

    assert len(bam_files) == 2, 'Missing merge partner: {}'.format(bam_files)
    return sorted(bam_files)
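For readers who want to see what the pattern matching in this function amounts to, here is a minimal plain-Python sketch that runs without Snakemake. It is a stand-in using re and os.listdir, not Snakemake's implementation of glob_wildcards or expand. The helper names match_requests and bam_paths are hypothetical, as are the sample values projA, specX, lib01, refA and the run IDs.

```python
import os
import re
import tempfile


def match_requests(requests_dir, individual, project, lib_id):
    """Return sorted (spec, run_id) pairs parsed from *.request file names."""
    pattern = re.compile(
        r"^{}_{}_(?P<spec>.+)_{}_(?P<run_id>.+)_1\.request$".format(
            re.escape(individual), re.escape(project), re.escape(lib_id)
        )
    )
    pairs = []
    for name in os.listdir(requests_dir):
        match = pattern.match(name)
        if match:
            pairs.append((match.group("spec"), match.group("run_id")))
    return sorted(pairs)


def bam_paths(pairs, reference, individual, bioproject, project, lib_id):
    """Build one BAM path per (spec, run_id) pair, like a zip-style expand."""
    template = (
        "output/alignments/strandseq_to_reference/"
        "{reference}.{individual}.{bioproject}/"
        "{individual}_{project}_{spec}_{lib_id}_{run_id}.filt.sam.bam"
    )
    return sorted(
        template.format(
            reference=reference, individual=individual, bioproject=bioproject,
            project=project, spec=spec, lib_id=lib_id, run_id=run_id,
        )
        for spec, run_id in pairs
    )


# Simulate a finished checkpoint directory holding two request files.
with tempfile.TemporaryDirectory() as requests_dir:
    for run_id in ("ERR0001", "ERR0002"):
        name = "HG00733_projA_specX_lib01_{}_1.request".format(run_id)
        open(os.path.join(requests_dir, name), "w").close()
    pairs = match_requests(requests_dir, "HG00733", "projA", "lib01")
    files = bam_paths(pairs, "refA", "HG00733", "PRJEB12849", "projA", "lib01")

print(files)
```

The point of the sketch is that the list of BAM files can only be derived from the contents of the requests directory, so the function has nothing sensible to return until the checkpoint has written its request files.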
About this issue
- State: closed
- Created 5 years ago
- Comments: 31 (10 by maintainers)
beacon: same error still exists in Snakemake v5.10.0
I can (sort-of) reproduce this with a small makefile running locally. I added “sort of” because it requires a missing input (that would normally be OK) to trigger the behavior.
The Snakefile:
If you run this with:
…it works fine. If you delete the output file and some of the inputs and re-run with a missing input, you get the error:
If you build a comparable makefile without a checkpoint and re-run it from the same spot with a missing start file, it will recognize that the intermediate files exist and work from them. But with a checkpoint rule, you get this TBD behavior.