bcbio-nextgen: tabix Error [E::hts_idx_push] chromosome blocks not continuous
Hi @chapmanb, I think I tracked down the error message I reported earlier in #613. It turns out that what happens is exactly what the message says …
Briefly, the pipeline sorts the BED file by chromosome and position. However, for some reason the rows are not sorted properly by chromosome name. With the pipeline's sort command I get the output below, in which rows for scaffold_104 are interleaved with rows for scaffold_1040, scaffold_1043 and scaffold_1047. In other words, the “chromosome blocks [are] not continuous”:
[ipedroso@jimi work]$ sort -k1,1 -k2,2n 1_2014-09-26_populus_trichocarpa_diversity_1-sort-callable-callableblocks.bed | grep scaffold_104
scaffold_1040 1586 2955
scaffold_104 0 242
scaffold_1040 566 606
scaffold_1040 709 1271
scaffold_1040 3113 5047
scaffold_1040 5177 6391
scaffold_1040 6502 13319
scaffold_104 2774 3373
scaffold_104 14249 16134
scaffold_104 16283 16356
scaffold_104 16478 42093
scaffold_1043 0 203
scaffold_1043 342 551
scaffold_1043 867 985
scaffold_1043 1182 2397
scaffold_1043 2509 3187
scaffold_1043 3553 11938
scaffold_1043 12773 12835
scaffold_104 421 456
scaffold_104 3529 4705
scaffold_104 4825 4939
scaffold_104 5546 5668
scaffold_104 5904 7718
scaffold_104 42225 44057
scaffold_104 44182 44766
scaffold_104 68438 68580
scaffold_104 68726 69439
scaffold_1047 105 559
scaffold_104 73523 73694
scaffold_104 75867 75954
scaffold_1047 677 2510
scaffold_1047 2634 2982
scaffold_1047 3364 12767
scaffold_1047 12871 13201
scaffold_104 8097 8805
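The interleaving above can also be detected without running tabix. The following is a minimal sketch, not part of bcbio-nextgen: `check_contiguous` is a hypothetical helper name, and it assumes a whitespace-separated BED file with the chromosome name in the first column. It reports the first chromosome that reappears after a different chromosome has been seen, which is the condition tabix complains about.

```shell
# Hypothetical helper: check that each chromosome forms one continuous block.
# Prints the first offending line and returns 1 if a chromosome reappears
# after a different chromosome; returns 0 otherwise.
check_contiguous() {
    awk '
        $1 != prev {
            # A new block starts here; it must be a chromosome not seen before.
            if ($1 in seen) {
                printf "line %d: %s reappears after %s\n", NR, $1, prev
                bad = 1
                exit
            }
            seen[$1] = 1
            prev = $1
        }
        END { exit bad + 0 }
    ' "$1"
}
```

For example, `check_contiguous bed.bed` could be run on the sorted file before passing it to bgzip and tabix.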
I found at https://www.biostars.org/p/64687/ that adding the V option to the chromosome key of the sort command should solve the problem:
[ipedroso@jimi work]$ sort -k1,1V -k2,2n 1_2014-09-26_populus_trichocarpa_diversity_1-sort-callable-callableblocks.bed | grep scaffold_104
scaffold_104 140844 142121
scaffold_104 142240 145935
scaffold_104 146082 146515
scaffold_1040 566 606
scaffold_1040 709 1271
scaffold_1040 1586 2955
scaffold_1040 3113 5047
scaffold_1040 5177 6391
scaffold_1040 6502 13319
scaffold_1043 0 203
scaffold_1043 342 551
scaffold_1043 867 985
scaffold_1043 1182 2397
scaffold_1043 2509 3187
scaffold_1043 3553 11938
scaffold_1043 12773 12835
scaffold_1047 105 559
scaffold_1047 677 2510
scaffold_1047 2634 2982
scaffold_1047 3364 12767
scaffold_1047 12871 13201
And it actually works 😃. With the pipeline's current sort command, tabix fails:
[ipedroso@jimi work]$ sort -k1,1 -k2,2n 1_2014-09-26_populus_trichocarpa_diversity_1-sort-callable-callableblocks.bed > bed.bed
[ipedroso@jimi work]$ bgzip -c bed.bed > bed.bed.gz
[ipedroso@jimi work]$ tabix -f -p bed bed.bed.gz
[E::hts_idx_push] chromosome blocks not continuous
tbx_index_build failed: bed.bed.gz
With the V option added, the same steps run without the error:
sort -k1,1V -k2,2n 1_2014-09-26_populus_trichocarpa_diversity_1-sort-callable-callableblocks.bed > bed.bed
bgzip -c bed.bed > bed.bed.gz
tabix -f -p bed bed.bed.gz
The Biostars post says that the V option may not be available in all sort installations, so I am not sure this fix will work everywhere.
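One way to cope with that uncertainty is to probe the installed sort before relying on the V option. The sketch below is an illustration only: `sort_bed` is a hypothetical helper name, not the pipeline's code. The fallback to byte-order sorting under `LC_ALL=C` is an assumption, and whether it avoids the mis-sorting reported here has not been verified.

```shell
# Hypothetical helper: sort a BED file by chromosome name, then by start.
# Uses version sort (-V) when the installed sort accepts it; otherwise falls
# back to plain byte-order sorting in the C locale (an untested assumption
# for the problem described in this issue).
sort_bed() {
    if printf 'a\n' | sort -V >/dev/null 2>&1; then
        sort -k1,1V -k2,2n "$1"
    else
        LC_ALL=C sort -k1,1 -k2,2n "$1"
    fi
}
```

Usage would mirror the commands above, for example `sort_bed input.bed > bed.bed` followed by the same bgzip and tabix steps.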
About this issue
- State: closed
- Created 10 years ago
- Reactions: 2
- Comments: 19 (1 by maintainers)
Weird, works fine for me:
What version of sort is on your machine? I have an old version:
The -V option won’t work as a general fix because it requires a fairly recent version of sort, which we can’t guarantee on every system.