snakemake: Google Storage download retry predicate additional exception

The current code checks only for snakemake.exceptions.CheckSumMismatchException when deciding whether to retry a download: https://github.com/snakemake/snakemake/blob/223bcc52d058e9704e69dac65c101ea1b18f3361/snakemake/remote/GS.py#L42-L48

However, I am seeing a suspiciously large number of failures like this in my pipeline:

Traceback (most recent call last):
  File "/opt/conda/envs/snakemake/lib/python3.9/site-packages/snakemake/__init__.py", line 687, in snakemake
    success = workflow.execute(
  File "/opt/conda/envs/snakemake/lib/python3.9/site-packages/snakemake/workflow.py", line 1005, in execute
    success = scheduler.schedule()
  File "/opt/conda/envs/snakemake/lib/python3.9/site-packages/snakemake/scheduler.py", line 489, in schedule
    self.run(runjobs)
  File "/opt/conda/envs/snakemake/lib/python3.9/site-packages/snakemake/scheduler.py", line 500, in run
    executor.run_jobs(
  File "/opt/conda/envs/snakemake/lib/python3.9/site-packages/snakemake/executors/__init__.py", line 131, in run_jobs
    self.run(
  File "/opt/conda/envs/snakemake/lib/python3.9/site-packages/snakemake/executors/__init__.py", line 447, in run
    future = self.run_single_job(job)
  File "/opt/conda/envs/snakemake/lib/python3.9/site-packages/snakemake/executors/__init__.py", line 491, in run_single_job
    self.cached_or_run, job, run_wrapper, *self.job_args_and_prepare(job)
  File "/opt/conda/envs/snakemake/lib/python3.9/site-packages/snakemake/executors/__init__.py", line 452, in job_args_and_prepare
    job.prepare()
  File "/opt/conda/envs/snakemake/lib/python3.9/site-packages/snakemake/jobs.py", line 710, in prepare
    self.download_remote_input()
  File "/opt/conda/envs/snakemake/lib/python3.9/site-packages/snakemake/jobs.py", line 682, in download_remote_input
    f.download_from_remote()
  File "/opt/conda/envs/snakemake/lib/python3.9/site-packages/snakemake/io.py", line 584, in download_from_remote
    self.remote_object.download()
  File "/opt/conda/envs/snakemake/lib/python3.9/site-packages/google/api_core/retry.py", line 281, in retry_wrapped_func
    return retry_target(
  File "/opt/conda/envs/snakemake/lib/python3.9/site-packages/google/api_core/retry.py", line 184, in retry_target
    return target()
  File "/opt/conda/envs/snakemake/lib/python3.9/site-packages/snakemake/remote/GS.py", line 226, in download
    return download_blob(self.blob, self.local_file())
  File "/opt/conda/envs/snakemake/lib/python3.9/site-packages/google/api_core/retry.py", line 281, in retry_wrapped_func
    return retry_target(
  File "/opt/conda/envs/snakemake/lib/python3.9/site-packages/google/api_core/retry.py", line 184, in retry_target
    return target()
  File "/opt/conda/envs/snakemake/lib/python3.9/site-packages/snakemake/remote/GS.py", line 69, in download_blob
    blob.download_to_file(parser)
  File "/opt/conda/envs/snakemake/lib/python3.9/site-packages/google/cloud/storage/blob.py", line 1041, in download_to_file
    self._do_download(
  File "/opt/conda/envs/snakemake/lib/python3.9/site-packages/google/cloud/storage/blob.py", line 900, in _do_download
    response = download.consume(transport, timeout=timeout)
  File "/opt/conda/envs/snakemake/lib/python3.9/site-packages/google/resumable_media/requests/download.py", line 171, in consume
    self._write_to_stream(result)
  File "/opt/conda/envs/snakemake/lib/python3.9/site-packages/google/resumable_media/requests/download.py", line 120, in _write_to_stream
    raise common.DataCorruption(response, msg)
google.resumable_media.common.DataCorruption: Checksum mismatch while downloading:

  https://storage.googleapis.com/download/storage/v1/b/rs-ukb/o/raw%2Fgt-imputation%2Fukb_imp_chr9_v3.bgen?generation=1602861266282729&alt=media

The X-Goog-Hash header indicated an MD5 checksum of:

  J3RmHIDzGmBKkklx/ImWtg==

but the actual MD5 checksum of the downloaded contents was:

  XHqx7gai/Eij53rF8bMrKg==
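
For anyone who wants to check a downloaded file by hand: the MD5 values in the message above are base64-encoded digests, not hex strings. A minimal sketch of computing the same form locally, using only the standard library (the helper name md5_base64 is hypothetical, not part of snakemake or the Google client):

```python
import base64
import hashlib


def md5_base64(path, chunk_size=1024 * 1024):
    """Return the base64-encoded MD5 digest of the file at `path`.

    The file is read in chunks so that very large files do not have to
    fit in memory.
    """
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return base64.b64encode(digest.digest()).decode("ascii")
```

If the value printed for the local file differs from the one in the X-Goog-Hash header, the local copy is the corrupted one.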

It looks like this google.resumable_media.common.DataCorruption is not being wrapped as a snakemake.exceptions.CheckSumMismatchException, or some other design flaw is keeping these downloads from being retried.

Note: if_transient_error does not appear to cover this error either.
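
A rough sketch of what I have in mind: a retry predicate that treats both exception types as retryable. To keep it runnable without the Google libraries installed, the two exception classes below are local stand-ins for snakemake.exceptions.CheckSumMismatchException and google.resumable_media.common.DataCorruption, and should_retry, call_with_retry and flaky_download are hypothetical names, not snakemake's actual code.

```python
import time


# Stand-ins (assumption): in the real code these would be imported from
# snakemake.exceptions and google.resumable_media.common respectively.
class CheckSumMismatchException(Exception):
    pass


class DataCorruption(Exception):
    pass


RETRYABLE = (CheckSumMismatchException, DataCorruption)


def should_retry(exc):
    """Retry predicate: True for either kind of checksum failure."""
    return isinstance(exc, RETRYABLE)


def call_with_retry(func, predicate=should_retry, attempts=3, delay=0.0):
    """Call func(), retrying while predicate(exc) is True, up to `attempts` calls."""
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except Exception as exc:
            # Give up on the last attempt or on a non-retryable error.
            if attempt == attempts or not predicate(exc):
                raise
            time.sleep(delay)


# Usage: a fake download that fails twice with a checksum error, then succeeds.
calls = {"n": 0}


def flaky_download():
    calls["n"] += 1
    if calls["n"] < 3:
        raise DataCorruption("Checksum mismatch while downloading")
    return "ok"


result = call_with_retry(flaky_download)
```

With google-api-core, a predicate like this could presumably be passed as the predicate argument of google.api_core.retry.Retry rather than to a hand-written loop; I have not tested that against GS.py.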

About this issue

  • State: open
  • Created 4 years ago
  • Comments: 19 (15 by maintainers)

Most upvoted comments

I’ll follow up on this

Do you have any huge objects that you could share? We could put them into snakemake-testing so I can try to reproduce the checksum issue.

Hm, here are a couple large files accessible without requester pays:

gs://gcp-public-data--gnomad/release/3.1/vcf/genomes/gnomad.genomes.v3.1.hgdp_1kg_subset.chr1.vcf.bgz # 272GB
gs://gcp-public-data--gnomad/release/3.1/vcf/genomes/gnomad.genomes.v3.1.hgdp_1kg_subset.chr22.vcf.bgz # 58GB

The file size decreases steadily as the N in “chrN” goes from 1 to 22.

Unfortunately I can’t share the exact files I was using, but these are very similar.