scanpy: setting gene_symbols to select symbols from adata.var fails in sc.pl.umap()
I would like to color the UMAP representation by gene expression values. For ease of use, I'd like to display the gene name instead of the gene_id, which is what adata.var_names contains in my case. Setting gene_symbols = 'Symbol'
doesn't seem to work for me, or I am using it the wrong way.
When running sc.pl.umap(adata, gene_symbols = 'Symbol', color = ['Tnnt2'])
I get the following error message:
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-116-e09d49f2528c> in <module>
----> 1 sc.pl.umap(adata, gene_symbols = 'Symbol', color = ['Tnnt2'])
/anaconda3/envs/scanpy/lib/python3.6/site-packages/scanpy/plotting/tools/scatterplots.py in umap(adata, **kwargs)
27 If `show==False` a `matplotlib.Axis` or a list of it.
28 """
---> 29 return plot_scatter(adata, basis='umap', **kwargs)
30
31
/anaconda3/envs/scanpy/lib/python3.6/site-packages/scanpy/plotting/tools/scatterplots.py in plot_scatter(adata, color, use_raw, sort_order, edges, edges_width, edges_color, arrows, arrows_kwds, basis, groups, components, projection, color_map, palette, size, frameon, legend_fontsize, legend_fontweight, legend_loc, ncols, hspace, wspace, title, show, save, ax, return_fig, **kwargs)
275 color_vector, categorical = _get_color_values(adata, value_to_plot,
276 groups=groups, palette=palette,
--> 277 use_raw=use_raw)
278
279 # check if higher value points should be plot on top
/anaconda3/envs/scanpy/lib/python3.6/site-packages/scanpy/plotting/tools/scatterplots.py in _get_color_values(adata, value_to_plot, groups, palette, use_raw)
665 raise ValueError("The passed `color` {} is not a valid observation annotation "
666 "or variable name. Valid observation annotation keys are: {}"
--> 667 .format(value_to_plot, adata.obs.columns))
668
669 return color_vector, categorical
ValueError: The passed `color` Tnnt2 is not a valid observation annotation or variable name. Valid observation annotation keys are: Index(['Sample', 'n_counts', 'n_genes', 'percent_mito', 'log_counts',
'louvain'],
dtype='object')
adata.var contains the column “Symbol” and “Tnnt2” is present:
adata.var[adata.var['Symbol'] == 'Tnnt2']
Symbol | type | highly_variable | means | dispersions | dispersions_norm |
---|---|---|---|---|---|
Tnnt2 | protein_coding | True | 0.923869 | 4.090601 | 11.370244 |
run with:
scanpy==1.3.7 anndata==0.6.17 numpy==1.14.6 scipy==1.1.0 pandas==0.23.4 scikit-learn==0.19.1 statsmodels==0.9.0 python-igraph==0.7.1 louvain==0.6.1
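One possible workaround, sketched here under assumptions: the plot_scatter signature shown in the traceback above does not list gene_symbols, so the symbol can instead be translated to its adata.var_names entry by hand and that ID passed to color. The helper name symbols_to_var_names and the toy gene IDs below are made up for illustration; a plain pandas DataFrame stands in for adata.var.

```python
import pandas as pd


def symbols_to_var_names(var, symbol_col, symbols):
    """Map gene symbols to the index labels of `var` (hypothetical helper).

    `var` is expected to look like adata.var: indexed by gene ID, with the
    symbols stored in the column `symbol_col`.
    """
    if symbol_col not in var.columns:
        raise KeyError(
            f"{symbol_col!r} is not a column of .var; available: {list(var.columns)}"
        )
    ids = []
    for symbol in symbols:
        # Boolean mask over the rows whose symbol matches.
        mask = (var[symbol_col] == symbol).to_numpy()
        hits = var.index[mask]
        if len(hits) == 0:
            raise KeyError(f"symbol {symbol!r} not found in column {symbol_col!r}")
        if len(hits) > 1:
            raise ValueError(f"symbol {symbol!r} maps to several IDs: {list(hits)}")
        ids.append(hits[0])
    return ids


# Toy stand-in for adata.var; the gene IDs are placeholders, not real IDs.
var = pd.DataFrame(
    {"Symbol": ["Tnnt2", "Actb"], "type": ["protein_coding", "protein_coding"]},
    index=["gene_id_1", "gene_id_2"],
)

ids = symbols_to_var_names(var, "Symbol", ["Tnnt2"])
print(ids)

# With a real AnnData object the call might then look like this
# (assumption: the IDs are present in the var_names that the plot uses):
# sc.pl.umap(adata, color=ids, title=["Tnnt2"])
```

Passing title keeps the readable symbol on the panel while color still refers to the gene ID. If adata.raw is set, the IDs may need to be looked up in adata.raw.var instead.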
About this issue
- State: closed
- Created 5 years ago
- Comments: 17 (10 by maintainers)
Yes (definitely) and yes (I think)
Passing an argument for gene_symbols means that instead of searching .var_names, the column of .var whose name was passed will be searched.
For example, if you had an AnnData object adata with Ensembl IDs as adata.var_names, and HGNC symbols under the column adata.var["gene_name"], the following calls should plot similar things (different titles):

adata.var["gene_name"]
Traceback (most recent call last):
  File "/sc/arion/work/gujarh01/software/anaconda3/lib/python3.9/site-packages/pandas/core/indexes/base.py", line 3361, in get_loc
    return self._engine.get_loc(casted_key)
  File "pandas/_libs/index.pyx", line 76, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/index.pyx", line 108, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/hashtable_class_helper.pxi", line 5198, in pandas._libs.hashtable.PyObjectHashTable.get_item
  File "pandas/_libs/hashtable_class_helper.pxi", line 5206, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: 'gene_name'
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/sc/arion/work/gujarh01/software/anaconda3/lib/python3.9/site-packages/pandas/core/frame.py", line 3458, in __getitem__
    indexer = self.columns.get_loc(key)
  File "/sc/arion/work/gujarh01/software/anaconda3/lib/python3.9/site-packages/pandas/core/indexes/base.py", line 3363, in get_loc
    raise KeyError(key) from err
KeyError: 'gene_name'
Did something change in scanpy?
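The KeyError above comes from pandas, not from scanpy: it means this particular adata.var has no column called gene_name, which was only an example name in the earlier answer. A minimal sketch of how one might check which column, if any, holds the symbols before passing it as gene_symbols follows. The helper pick_symbol_column and the candidate names are assumptions for illustration, and a plain DataFrame stands in for adata.var.

```python
import pandas as pd


def pick_symbol_column(var, candidates=("gene_name", "Symbol", "gene_symbols")):
    """Return the first candidate that is a column of `var`, else None.

    Hypothetical helper; the candidate names are guesses and should be
    adapted to whatever the dataset at hand actually uses.
    """
    for name in candidates:
        if name in var.columns:
            return name
    return None


# Toy stand-in for adata.var; with real data, pass adata.var instead.
var = pd.DataFrame(
    {"Symbol": ["Tnnt2"], "highly_variable": [True]},
    index=["gene_id_1"],  # placeholder gene ID
)

print(list(var.columns))  # inspect what is actually available
col = pick_symbol_column(var)
print(col)

# With a real AnnData object, and only if a column was found, the plot call
# might look like this (assumption: the installed scanpy version supports
# the gene_symbols argument):
# if col is not None:
#     sc.pl.umap(adata, color=["Tnnt2"], gene_symbols=col)
```

If no candidate matches, the symbols may already be in adata.var_names, in which case color=["Tnnt2"] should work without gene_symbols.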