bifrost: Tutorial on k-mer color API, my current use results in corruption?
Hi,
Do you have any resources on how to use the k-mer/unitig color API in Bifrost? I have been playing around with it, and I think I understand it, but I’m encountering an issue where some unitigs have no colors associated with them anymore, or worse, the whole colorset is a nullptr.
For context: say I have a graph constructed from both a reference genome and WGS data from a different strain. I want to perform some graph cleaning: I have identified a bunch of unitigs whose coverage in the sample is too low, and I want them removed, or at least no longer associated with the sample color.
I’ve constructed the following example to do that: https://github.com/broadinstitute/pyfrost/blob/master/tests/test_node_removal.cpp
This example reads a file, to_remove.txt, which contains on each line the head k-mer of a unitig to be removed from the sample. First, I discard the sample color ID from that unitig, and if no colors remain, I queue it to be fully removed from the graph.
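To make the intended two-phase logic concrete, here is a minimal sketch that uses plain standard-library containers instead of the Bifrost API. Everything in it is a hypothetical stand-in for illustration: `ToyGraph`, `discard_color`, `remove_unitigs`, and `every_unitig_has_color` are not Bifrost names, and a unitig is modelled simply as a head k-mer string mapped to a set of color IDs. It only shows the bookkeeping I expect to hold, not how Bifrost stores colors.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <vector>

// Toy stand-in for a colored graph: head k-mer -> set of color IDs.
// NOT the Bifrost API; all names here are hypothetical.
using ColorId = std::size_t;
using ToyGraph = std::map<std::string, std::set<ColorId>>;

// Phase 1: discard `sample` from every listed unitig and collect the
// head k-mers of unitigs that are left without any color.
std::vector<std::string> discard_color(ToyGraph& g,
                                       const std::vector<std::string>& heads,
                                       ColorId sample) {
    std::vector<std::string> to_remove;
    for (const auto& head : heads) {
        auto it = g.find(head);
        if (it == g.end()) continue;  // head k-mer is not in the graph
        it->second.erase(sample);
        if (it->second.empty()) to_remove.push_back(head);
    }
    return to_remove;
}

// Phase 2: remove the queued unitigs only after all color edits are done,
// so that nothing is erased while colors are still being modified.
void remove_unitigs(ToyGraph& g, const std::vector<std::string>& queued) {
    for (const auto& head : queued) g.erase(head);
}

// Invariant expected after cleaning: no unitig has an empty color set.
bool every_unitig_has_color(const ToyGraph& g) {
    for (const auto& kv : g) {
        if (kv.second.empty()) return false;
    }
    return true;
}
```

In this model, a unitig with an empty color set can only exist between the two phases. Seeing such unitigs after writing and re-reading the real graph is what makes me suspect either my use of the API or the removal step.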
I save the cleaned graph to a file, and then read it again. Most nodes still have the correct colors associated with them. For some nodes, however, the colorset is a nullptr, resulting in a crash when trying to do any operation on it, while for others the colorset is not a nullptr but doesn’t contain any colors (which shouldn’t happen, because those unitigs should’ve been removed).
Am I using the API in an incorrect way? Is the problem caused by a custom function I added to Bifrost in my fork, which transforms any UnitigMapping into a mapping representing the whole unitig? Or is it a bug in Bifrost?
Any help would be much appreciated, thanks!
About this issue
- State: closed
- Created 4 years ago
- Comments: 15
Amazing work!! I’ve successfully run all my scripts without errors. Thanks a lot!