scvelo: Corrupted neighbors graph using spatial data on scvelo

Hi, I downloaded the BAM file produced by the spaceranger pipeline from https://support.10xgenomics.com/spatial-gene-expression/datasets/1.3.0/Visium_FFPE_Human_Breast_Cancer? (Human breast cancer 1.3, genome-aligned BAM). I then used Velocyto to create the .loom file and followed the scvelo example workflow for the endocrine pancreas dataset. When I ran my dataset through scvelo, I got a corrupted neighbor graph error. I have checked that the .loom file itself is not corrupted: I can still read parts of it, but scvelo cannot do anything with it.

Python:

import scvelo as scv
scv.logging.print_version()

scv.settings.verbosity = 3  # show errors(0), warnings(1), info(2), hints(3)
scv.settings.presenter_view = True  # set max width size for presenter view
scv.settings.set_figure_params('scvelo')  # for beautified visualization

# read in my data
adata = scv.read('/gpfs0/home1/gddaslab/rssxm007/yard/run_spaceranger_count/velocyto/Visium_FFPE_Human_Breast_Cancer_possorted_genome_bam_5BKOT.loom', cache=True)

scv.utils.show_proportions(adata)
adata

scv.pp.filter_and_normalize(adata, min_shared_counts=20, n_top_genes=2000)
#scv.pp.moments(adata, n_pcs=30, n_neighbors=30)
scv.pp.neighbors(adata, n_pcs=30, n_neighbors=30)
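# the "corrupted neighbor graph" warning first appears at the line above;
# the ValueError itself is raised later by scv.tl.velocity_graph (see traceback)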
scv.tl.velocity(adata)

scv.tl.velocity_graph(adata)

scv.pl.velocity_embedding_stream(adata, basis='pca')

scv.tl.recover_dynamics(adata)

scv.tl.velocity(adata, mode='dynamical')
scv.tl.velocity_graph(adata)

scv.tl.latent_time(adata)
scv.pl.scatter(adata, color='latent_time', color_map='gnuplot', size=80, colorbar=True)

top_genes = adata.var['fit_likelihood'].sort_values(ascending=False).index[:300]
scv.pl.heatmap(adata, var_names=top_genes, tkey='latent_time', n_convolve=100, col_color='Clusters')

scv.pl.scatter(adata, basis=top_genes[:10], frameon=False, ncols=5)

scv.pl.velocity_embedding_stream(adata, basis='pca', title='', smooth=.8, min_mass=4)

Output:

Running scvelo 0.2.3 (python 3.8.8) on 2021-08-13 12:44.
ERROR: XMLRPC request failed [code: -32500]
RuntimeError: PyPI's XMLRPC API is currently disabled due to unmanageable load and will be deprecated in the near future. See https://status.python.org/ for more information.
Variable names are not unique. To make them unique, call `.var_names_make_unique`.
Variable names are not unique. To make them unique, call `.var_names_make_unique`.
Variable names are not unique. To make them unique, call `.var_names_make_unique`.
Abundance of ['spliced', 'unspliced']: [0.33 0.67]
Variable names are not unique. To make them unique, call `.var_names_make_unique`.
Variable names are not unique. To make them unique, call `.var_names_make_unique`.
Filtered out 33521 genes that are detected 20 counts (shared).
Normalized count data: X, spliced, unspliced.
Skip filtering by dispersion since number of variables are less than `n_top_genes`.
Logarithmized X.
computing neighbors
    finished (0:00:03) --> added 
    'distances' and 'connectivities', weighted adjacency matrices (adata.obsp)
WARNING: The neighbor graph has an unexpected format (e.g. computed outside scvelo) 
or is corrupted (e.g. due to subsetting). Consider recomputing with `pp.neighbors`.
computing moments based on connectivities
    finished (0:00:00) --> added 
    'Ms' and 'Mu', moments of un/spliced abundances (adata.layers)
computing velocities
WARNING: You seem to have very low signal in splicing dynamics.
The correlation threshold has been reduced to -0.0556.
Please be cautious when interpreting results.
WARNING: Too few genes are selected as velocity genes. Consider setting a lower threshold for min_r2 or min_likelihood.
    finished (0:00:00) --> added 
    'velocity', velocity vectors for each individual cell (adata.layers)
WARNING: The neighbor graph has an unexpected format (e.g. computed outside scvelo) 
or is corrupted (e.g. due to subsetting). Consider recomputing with `pp.neighbors`.
Traceback (most recent call last):

  File "/home/gddaslab/rssxm007/yard/run_spaceranger_count/velocyto/spatialdata.py", line 27, in <module>
    scv.tl.velocity_graph(adata)

  File "/home/gddaslab/rssxm007/anaconda3/envs/velocyto/lib/python3.8/site-packages/scvelo/tools/velocity_graph.py", line 294, in velocity_graph
    vgraph = VelocityGraph(

  File "/home/gddaslab/rssxm007/anaconda3/envs/velocyto/lib/python3.8/site-packages/scvelo/tools/velocity_graph.py", line 102, in __init__
    raise ValueError(

ValueError: Your neighbor graph seems to be corrupted. Consider recomputing via pp.neighbors.

Versions:

scvelo 0.2.3 python 3.8.8

About this issue

  • State: closed
  • Created 3 years ago
  • Comments: 17 (12 by maintainers)

Most upvoted comments

Right, OK, so it's just a matter of chance? Do you think it is more likely in bad-quality data (where, e.g., you'd have a much sparser matrix, making duplicated observations easier to get)?

Yes, I would assume that this phenomenon comes up more often if data quality is low.
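
For anyone who wants to check whether duplicated observations are present before computing neighbors, here is a minimal sketch in plain Python. It is not scvelo's own check. The helper names `find_duplicate_rows` and `drop_duplicate_rows` and the toy matrix are made up for illustration, and it assumes the count matrix can be iterated as dense rows (for an AnnData object that would mean densifying `adata.X` first).

```python
from collections import defaultdict


def find_duplicate_rows(matrix):
    """Group the indices of rows that hold identical values.

    `matrix` is a sequence of equal-length rows, standing in for a
    cells-by-genes count matrix (hypothetical input for this sketch).
    Returns one list of indices per set of identical rows.
    """
    groups = defaultdict(list)
    for idx, row in enumerate(matrix):
        groups[tuple(row)].append(idx)
    return [idxs for idxs in groups.values() if len(idxs) > 1]


def drop_duplicate_rows(matrix):
    """Keep the first occurrence of each distinct row.

    Returns the de-duplicated rows and the original indices that were kept.
    """
    seen = set()
    kept = []
    for idx, row in enumerate(matrix):
        key = tuple(row)
        if key not in seen:
            seen.add(key)
            kept.append(idx)
    return [list(matrix[i]) for i in kept], kept


# Toy counts: rows 0 and 2 are identical, rows 1 and 3 are both all-zero,
# which is the kind of collision that gets more likely in very sparse data.
toy_counts = [
    [0, 2, 0, 1],
    [0, 0, 0, 0],
    [0, 2, 0, 1],
    [0, 0, 0, 0],
    [3, 0, 1, 0],
]

duplicate_groups = find_duplicate_rows(toy_counts)
unique_rows, kept_indices = drop_duplicate_rows(toy_counts)
print("duplicate groups:", duplicate_groups)
print("kept row indices:", kept_indices)
```

If this reports duplicates on real data, dropping them and then rerunning `scv.pp.neighbors` is the obvious thing to try. As far as I know, scvelo's preprocessing module also provides a `remove_duplicate_cells` helper, but check that your installed version has it before relying on it.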