hifiasm: Too many contigs and larger genome size compared with ONT sequencing
Hello, Professor,
I am new to hifiasm. I am using hifiasm 0.16.1-r375 to generate an assembly from three sets of HiFi CCS reads. My command is: hifiasm -o A -t 30 A.1.fasta A.2.fasta A.3.fasta > A.hifiasm.log. The files A.1.fasta, A.2.fasta and A.3.fasta were converted from the CCS BAM files to FASTA using “samtools view A.1.ccs.bam | awk '{print ">"$1"\n"$10}' > A.1.fasta”. However, after converting the primary assembly “A.p_utg.gfa” to “A.p_utg.fasta” with the command “awk '/^S/{print ">"$2;print $3}'”, I got a primary assembly with 2530 contigs. After using juicer+3ddna to scaffold the contig assembly with Hi-C data, only 114 contigs were anchored to 32 chromosomes, and the anchoring rate was low, about 74%. The assembled genome size is 3,210,343,700 bp. Also, in my output directory I cannot find the two sets of partially phased contigs, A.hap*.p_ctg.gfa, described in the tutorial. Is this normal?
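For anyone comparing outputs, the GFA-to-FASTA step above can also be sketched in Python, together with a contig count, total length and N50. This is only a minimal sketch: it assumes a tab-separated GFA 1 file whose S lines carry the segment name in column 2 and the sequence in column 3, and the function names and the toy data are made up for illustration; they are not part of hifiasm.

```python
import io


def gfa_segments(handle):
    """Yield (name, sequence) for each S (segment) line of a GFA 1 file."""
    for line in handle:
        if not line.startswith("S\t"):
            continue  # skip header, link and other record types
        fields = line.rstrip("\n").split("\t")
        name, seq = fields[1], fields[2]
        if seq == "*":
            continue  # GFA allows '*' when the sequence is not stored
        yield name, seq


def write_fasta(segments, out, width=60):
    """Write (name, sequence) pairs as FASTA, wrapping lines at `width`."""
    for name, seq in segments:
        out.write(f">{name}\n")
        for i in range(0, len(seq), width):
            out.write(seq[i:i + width] + "\n")


def assembly_stats(lengths):
    """Return (number of sequences, total length, N50)."""
    ordered = sorted(lengths, reverse=True)
    total = sum(ordered)
    running = 0
    n50 = 0
    for length in ordered:
        running += length
        if running * 2 >= total:  # first length that covers half the total
            n50 = length
            break
    return len(ordered), total, n50


# Toy GFA content, invented for illustration only.
TOY_GFA = (
    "H\tVN:Z:1.0\n"
    "S\tutg1\tACGTACGT\tLN:i:8\n"
    "S\tutg2\tACGTAC\n"
    "L\tutg1\t+\tutg2\t-\t4M\n"
    "S\tutg3\tACGT\n"
    "S\tutg4\tAC\n"
)

segments = list(gfa_segments(io.StringIO(TOY_GFA)))
count, total, n50 = assembly_stats([len(seq) for _, seq in segments])
print(count, total, n50)
```

For a real run, the `io.StringIO(TOY_GFA)` handle would be replaced by an opened GFA file such as `open("A.p_utg.gfa")`, and `write_fasta` given an output file handle.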
Also, we sequenced several other species of the same genus with ONT and assembled them with NextDenovo, and their genome sizes are all about 2.5 Gb. That is about 500 Mb smaller than the hifiasm assembly. I am not sure whether the difference between the two sequencing and assembly approaches is due to the sequencing platform or to differences between species. Should I adjust any parameters to reduce the number of contigs in A.p_utg.fasta?
The following is my log.
A.hifiasm.log
Looking forward to your reply.
Best wishes!
About this issue
- State: closed
- Created 2 years ago
- Comments: 21 (6 by maintainers)
Hello, Professor. I am sorry for the delayed reply. I also have not found the reason why the two versions produced the same assemblies. I will use the juicer+3ddna pipeline to anchor the primary contig assembly into a chromosome-level assembly. Thanks again for your help! Best wishes!