cellrank: The directed PAGA results are inconsistent between the two official tutorials.

Hello CellRank, I drew directed PAGA plots from the results of the [CellRank basics] tutorial and the [Kernels and estimators] tutorial. Although their absorption probability plots are the same, their directed PAGA results are inconsistent. For the detailed code, please check my Jupyter notebooks here (https://github.com/hyjforesight/CellRank).

Both tutorials give the same absorption probability plots.

Code for the [CellRank basics] tutorial:

scv.tl.recover_latent_time(adata, vkey='velocity', root_key="initial_states_probs", end_key="terminal_states_probs")
scv.tl.paga(adata, groups='clusters', vkey='velocity', use_time_prior="latent_time", root_key="initial_states_probs", end_key="terminal_states_probs")
scv.pl.paga(adata, basis='umap')
cr.pl.cluster_fates(adata, mode="paga_pie", cluster_key="clusters", basis="umap",
                    legend_kwargs={"loc": "top right out"}, legend_loc="top left out", node_size_scale=5, edge_width_scale=1, max_edge_width=4, title="directed PAGA")


Code for the [Kernels and estimators] tutorial. To draw PAGA for this tutorial, I have to run g.compute_terminal_states() and g._compute_initial_states() first:

g.compute_terminal_states()
g._compute_initial_states() 
scv.tl.recover_latent_time(adata, vkey='velocity', root_key="initial_states_probs", end_key="terminal_states_probs")
scv.tl.paga(adata, groups='clusters', vkey='velocity', use_time_prior="latent_time", root_key="initial_states_probs", end_key="terminal_states_probs")
scv.pl.paga(adata, basis='umap')
cr.pl.cluster_fates(adata, mode="paga_pie", cluster_key="clusters", basis="umap",
                    legend_kwargs={"loc": "top right out"}, legend_loc="top left out", node_size_scale=5, edge_width_scale=1, max_edge_width=4, title="directed PAGA")


What causes this inconsistency? Thanks! Best, YJ

About this issue

  • State: closed
  • Created 2 years ago
  • Comments: 15

Most upvoted comments

Hi @michalk8 and @Marius1311, I found that the heat map inconsistency is caused by n_states=3 in g_bwd.compute_macrostates(n_states=3, n_cells=30, cluster_key="clusters"), which asks for 3 initial states. When I changed it to n_states=1, the problem was solved.
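
As a rough intuition for why the number of requested states can matter, here is a toy sketch in plain Python. It is not CellRank's implementation and is not a diagnosis of the notebooks above: the 4-state chain and the helper name absorption_probabilities are made up for illustration. It computes absorption probabilities by fixed-point iteration. With a single absorbing state, every other state reaches it with probability 1; with two absorbing states the probabilities are split, so anything derived from them changes as well.

```python
def absorption_probabilities(P, absorbing, n_iter=500):
    """Toy absorption probabilities for a row-stochastic matrix P (list of lists).

    `absorbing` lists the state indices treated as absorbing. Returns probs,
    where probs[i][k] approximates the probability that a walk started in
    state i is absorbed in absorbing[k].
    """
    n = len(P)
    transient = [i for i in range(n) if i not in absorbing]

    # Each absorbing state is absorbed in itself with probability 1.
    probs = [[0.0] * len(absorbing) for _ in range(n)]
    for k, a in enumerate(absorbing):
        probs[a][k] = 1.0

    # Fixed-point iteration: a transient state's probability is the
    # transition-weighted average of its neighbours' probabilities.
    for _ in range(n_iter):
        new = [row[:] for row in probs]
        for i in transient:
            for k in range(len(absorbing)):
                new[i][k] = sum(P[i][j] * probs[j][k] for j in range(n))
        probs = new
    return probs


# Hypothetical 4-state random walk on a line: 0 - 1 - 2 - 3.
P = [
    [0.50, 0.50, 0.00, 0.00],
    [0.25, 0.50, 0.25, 0.00],
    [0.00, 0.25, 0.50, 0.25],
    [0.00, 0.00, 0.50, 0.50],
]

one = absorption_probabilities(P, absorbing=[0])     # a single absorbing state
two = absorption_probabilities(P, absorbing=[0, 3])  # two absorbing states

print([round(row[0], 3) for row in one])
print([[round(p, 3) for p in row] for row in two])
```

In this toy chain the transition matrix is identical in both calls; only the set of absorbing states differs, which is the same kind of change as going from n_states=3 to n_states=1.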

Hello @michalk8 @Marius1311, thanks for all the guidance! I can now reproduce the results of the high-level API using the low-level API. In our case CytoTRACE works better than scVelo, so we use the low-level API more often than the high-level one. The best part is that the low-level API also lets us create our own kernels (Monocle pseudotime, DPT pseudotime) to draw figures. It would be fantastic if the new tutorial could show everyone both ways of setting initial and terminal states (automatically and manually). I appreciate all the help! Thank you again! Best, YJ