ncbi-genome-download: KeyError: 'assembly_accession'

Hi!

Version :

ncbi-genome-download --version
0.3.1

Command :

ncbi-genome-download \
    -s genbank -A GCA_029448725.1 --formats fasta all \
    --output-folder ./ \
    --flat-output

Error:

Traceback (most recent call last):
  File "/usr/local/bin/ncbi-genome-download", line 10, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.10/site-packages/ncbi_genome_download/__main__.py", line 26, in main
    ret = args_download(args)
  File "/usr/local/lib/python3.10/site-packages/ncbi_genome_download/core.py", line 178, in args_download
    return config_download(config)
  File "/usr/local/lib/python3.10/site-packages/ncbi_genome_download/core.py", line 197, in config_download
    download_candidates = select_candidates(config)
  File "/usr/local/lib/python3.10/site-packages/ncbi_genome_download/core.py", line 308, in select_candidates
    for entry in filter_entries(entries, config):
  File "/usr/local/lib/python3.10/site-packages/ncbi_genome_download/core.py", line 354, in filter_entries
    if not config.is_compatible_assembly_accession(entry['assembly_accession']):
KeyError: 'assembly_accession'

Very similar to this older issue : https://github.com/kblin/ncbi-genome-download/issues/2

About this issue

  • State: closed
  • Created a year ago
  • Reactions: 5
  • Comments: 17 (1 by maintainers)

Most upvoted comments

Pull request #210 didn't work for me because the installation failed. But thanks to andrewjpage for the tip: this is an issue with the format of the 'assembly_summary_genbank.txt' file. Specifically, the header column '# assembly_accession' was changed to '#assembly_accession'. As a result, the code in 'core.py' that looks up entry['assembly_accession'] no longer finds that key.

To resolve this issue, I replaced the old code entry['assembly_accession'] with the updated code entry['#assembly_accession']. I made this change throughout the script, including line 354, where the reported error occurred. After making this change, I was able to download the genome without any further issues.
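A hard-coded '#assembly_accession' key would break again if the header ever reverts. As a rough sketch of a more tolerant lookup (this is not the project's code: get_accession, ACCESSION_KEYS, and the placeholder accession are made up for illustration), one could try each known spelling of the column name:

```python
# Hypothetical helper, not part of ncbi-genome-download.
# Spellings of the column name seen in this thread, tried in order.
ACCESSION_KEYS = (
    "assembly_accession",    # what core.py looks up
    "#assembly_accession",   # header after the space was removed
    "# assembly_accession",  # header before the change
)


def get_accession(entry):
    """Return the assembly accession from a summary row dict.

    Raises KeyError if none of the known spellings is present.
    """
    for key in ACCESSION_KEYS:
        if key in entry:
            return entry[key]
    raise KeyError("assembly_accession")


# Made-up row with a placeholder accession, mimicking the new header.
entry = {"#assembly_accession": "GCA_000000000.1"}

try:
    entry["assembly_accession"]  # the lookup that fails in the traceback
except KeyError as err:
    print("plain lookup failed:", err)

print("tolerant lookup:", get_accession(entry))
```

This only papers over the symptom at the lookup site; normalising the header when the summary file is parsed is the cleaner place to fix it.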

Fixed by bd4607102748673c2f2e45719213b74fda37b606, included in the new 0.3.2 release, now on PyPI. Bioconda tracks PyPI releases, so an updated package should be available there soon.

Those file paths suggest you are calling a system-installed ncbi-genome-download, not a Conda-installed one. I just successfully tested installing https://github.com/kblin/ncbi-genome-download/pull/210 in a fresh Conda environment.

Running which ncbi-genome-download should show the ncbi-genome-download inside the Conda environment you created.

e.g.

mamba create --name ncbi-test python=3.8 -y --quiet
mamba activate ncbi-test
pip install git+https://github.com/chasemc/ncbi-genome-download@4c5c24e27aa8123c7995890046b17b815da3571b
which ncbi-genome-download

results in /home/chase/miniconda3/envs/ncbi-test/bin/ncbi-genome-download, not /usr/local/bin/ncbi-genome-download
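The same check can be done from Python, which also reports the installed version. This is a minimal sketch using only the standard library (Python 3.8+); describe_install is a made-up helper name, and it assumes the distribution is registered under the PyPI name ncbi-genome-download:

```python
import shutil
from importlib import metadata


def describe_install(command="ncbi-genome-download", dist="ncbi-genome-download"):
    """Return (path of the command on PATH, installed version of the distribution).

    Either element is None if the command or the distribution cannot be found.
    """
    path = shutil.which(command)
    try:
        version = metadata.version(dist)
    except metadata.PackageNotFoundError:
        version = None
    return path, version


print(describe_install())
```

Note that the version is looked up for the interpreter running this snippet, so run it with the Python from the environment you want to inspect. If the path points at /usr/local/bin rather than the environment's bin directory, the environment is not the one being used.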

It looks like a space has been deleted on line 2, which holds the header columns:

# assembly_accession

has become #assembly_accession

So I think line 17 of summary.py needs to be updated. I'll try to get a PR in later (if it's still not working).
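For illustration, here is one way header parsing could be made independent of that space. This is a sketch under assumptions, not the actual summary.py code or the eventual fix: read_summary is a hypothetical function, it assumes line 1 is a free-text comment and line 2 is the tab-separated header, and the sample rows are made up.

```python
import csv
import io


def read_summary(handle):
    """Yield summary rows as dicts keyed by clean column names.

    Assumes line 1 is a comment and line 2 is the header, which may start
    with '# ' or just '#'.
    """
    handle.readline()  # skip the free-text comment on line 1
    header = handle.readline().rstrip("\r\n")
    # Drop any leading '#' and spaces, whichever combination is present.
    fieldnames = header.lstrip("# ").split("\t")
    yield from csv.DictReader(handle, fieldnames=fieldnames, dialect="excel-tab")


# Made-up sample data in both header styles (placeholder accession).
OLD_STYLE = "#   comment line\n# assembly_accession\tbioproject\nGCA_000000000.1\tPRJNA000000\n"
NEW_STYLE = "#   comment line\n#assembly_accession\tbioproject\nGCA_000000000.1\tPRJNA000000\n"

for text in (OLD_STYLE, NEW_STYLE):
    for row in read_summary(io.StringIO(text)):
        print(row["assembly_accession"])
```

With the prefix stripped at parse time, the rest of the code can keep using entry['assembly_accession'] regardless of which header style the file uses.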