pandora: Solve incompatible genotypes through a dynamic programming algorithm
Suppose we have two records `R1_gt_0` and `R2_gt_1`. `R1_gt_0` genotypes towards the reference (allele 0) and `R2_gt_1` genotypes towards the first alternative allele (allele 1). For simplicity, assume only one sample. These records overlap, so they conflict and we have to solve this conflict (in `pandora` this is done by making the genotypes compatible).
Right now, in the code, it works like this:
- If the likelihood of `R1_gt_0` is higher than that of `R2_gt_1`, then the `R2_gt_1` genotype is changed from 1 to 0.
- If the opposite happens, i.e. the likelihood of `R2_gt_1` is higher than that of `R1_gt_0`, then the `R1_gt_0` genotype is changed from 0 to `.`.
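The two rules above can be sketched roughly as follows. This is an illustrative Python sketch only, not pandora's actual code: the `Record` class, its fields, and the function `resolve_current` are hypothetical names, and leaving both records untouched on a likelihood tie is an assumption, since the description does not say what happens in that case.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Record:
    """Hypothetical stand-in for a single-sample record."""
    name: str
    genotype: Optional[int]  # 0 = ref, 1 = first alt, None = "." (missing)
    likelihood: float        # likelihood of the called genotype


def resolve_current(r1: Record, r2: Record) -> None:
    """Apply the asymmetric behaviour described above to two overlapping records."""
    if r1.likelihood == r2.likelihood:
        return  # assumption: ties are left untouched
    winner, loser = (r1, r2) if r1.likelihood > r2.likelihood else (r2, r1)
    if winner.genotype == 0:
        # The ref call wins: the losing record is changed to ref (1 -> 0).
        loser.genotype = 0
    else:
        # The alt call wins: the losing record becomes missing (0 -> ".").
        loser.genotype = None
```

For example, with `R1_gt_0` at likelihood -1.0 and `R2_gt_1` at -5.0, `R2_gt_1` ends up with genotype 0; with the likelihoods swapped, `R1_gt_0` ends up with a missing genotype.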
I don’t understand why we have different behaviours depending on which allele has the highest likelihood (ref or alt). If we have conflicting records, for me it makes more sense to change the genotype of the lowest-likelihood one to `.` (we choose one, and all the others become `.`)… but it seems that the previously described behaviour is intended (i.e. not a bug). Could you shed some light on this?
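To make the alternative suggested here concrete, a minimal sketch might look like this. It is not existing pandora behaviour: `resolve_proposed` and the `(name, genotype, likelihood)` tuple layout are hypothetical, the input is assumed to be a group of mutually overlapping records for one sample, and ties go to the first record purely as an arbitrary choice.

```python
from typing import List, Optional, Tuple

# (name, genotype, likelihood); a genotype of None stands for "."
Rec = Tuple[str, Optional[int], float]


def resolve_proposed(records: List[Rec]) -> List[Rec]:
    """Keep the highest-likelihood record; set every other genotype to missing."""
    if not records:
        return []
    # max() returns the first maximal index, so ties favour the earlier record.
    best = max(range(len(records)), key=lambda i: records[i][2])
    return [
        (name, gt if i == best else None, lik)
        for i, (name, gt, lik) in enumerate(records)
    ]
```

Under this rule the outcome is symmetric: whichever of `R1_gt_0` and `R2_gt_1` has the higher likelihood keeps its genotype, and the other becomes `.`, regardless of whether the winner is a ref or an alt call.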
About this issue
- State: closed
- Created 5 years ago
- Comments: 31 (3 by maintainers)
christ
We might be missing something here, and we might need to think more about this problem… but I guess we are at least sure it is a hard problem, and the very large majority of cases are solved by what we have already implemented. So I guess we leave this as an enhancement post-paper?