scanpy: sc.tl.umap numba error when used with init_pos="paga"
Hi Alex,
UMAP throws an error if I use scanpy.tl.umap with initial positions from sc.tl.paga. Based on the error (see below), I thought it was a problem in UMAP itself. However, the error is not thrown when UMAP is called without initial positions from PAGA.
Here is the output / error:
sc.tl.umap(adata, init_pos='paga')
computing UMAP
using 'X_pca' with n_pcs = 50
---------------------------------------------------------------------------
TypingError Traceback (most recent call last)
<ipython-input-35-924452b37e5b> in <module>
----> 1 sc.tl.umap(adata, init_pos='paga')
/opt/conda/lib/python3.7/site-packages/scanpy/tools/_umap.py in umap(adata, min_dist, spread, n_components, maxiter, alpha, gamma, negative_sample_rate, init_pos, random_state, a, b, copy)
137 neigh_params.get('metric', 'euclidean'),
138 neigh_params.get('metric_kwds', {}),
--> 139 verbose=max(0, verbosity-3))
140 adata.obsm['X_umap'] = X_umap # annotate samples with UMAP coordinates
141 logg.info(' finished', time=True, end=' ' if _settings_verbosity_greater_or_equal_than(3) else '\n')
/opt/conda/lib/python3.7/site-packages/umap/umap_.py in simplicial_set_embedding(data, graph, n_components, initial_alpha, a, b, gamma, negative_sample_rate, n_epochs, init, random_state, metric, metric_kwds, verbose)
984 initial_alpha,
985 negative_sample_rate,
--> 986 verbose=verbose,
987 )
988
/opt/conda/lib/python3.7/site-packages/numba/dispatcher.py in _compile_for_args(self, *args, **kws)
348 e.patch_message(msg)
349
--> 350 error_rewrite(e, 'typing')
351 except errors.UnsupportedError as e:
352 # Something unsupported is present in the user code, add help info
/opt/conda/lib/python3.7/site-packages/numba/dispatcher.py in error_rewrite(e, issue_type)
315 raise e
316 else:
--> 317 reraise(type(e), e, None)
318
319 argtypes = []
/opt/conda/lib/python3.7/site-packages/numba/six.py in reraise(tp, value, tb)
656 value = tp()
657 if value.__traceback__ is not tb:
--> 658 raise value.with_traceback(tb)
659 raise value
660
TypingError: Failed in nopython mode pipeline (step: nopython frontend)
Invalid use of type(CPUDispatcher(<function rdist at 0x7fb3c827f840>)) with parameters (array(float64, 1d, C), array(float64, 1d, C))
Known signatures:
* (array(float32, 1d, A), array(float32, 1d, A)) -> float32
* parameterized
[1] During: resolving callee type: type(CPUDispatcher(<function rdist at 0x7fb3c827f840>))
[2] During: typing of call at /opt/conda/lib/python3.7/site-packages/umap/umap_.py (776)
File "../../../opt/conda/lib/python3.7/site-packages/umap/umap_.py", line 776:
def optimize_layout(
<source elided>
dist_squared = rdist(current, other)
^
This is not usually a problem with Numba itself but instead often caused by
the use of unsupported features or an issue in resolving types.
To see Python/NumPy features supported by the latest release of Numba visit:
http://numba.pydata.org/numba-doc/dev/reference/pysupported.html
and
http://numba.pydata.org/numba-doc/dev/reference/numpysupported.html
For more information about typing errors and how to debug them visit:
http://numba.pydata.org/numba-doc/latest/user/troubleshoot.html#my-code-doesn-t-compile
If you think your code should work with Numba, please report the error message
and traceback, along with a minimal reproducer at:
https://github.com/numba/numba/issues/new
What I basically do, starting from raw UMI counts:
1. total counts normalization / logarithmization
2. PCA, bbknn, louvain
3. combat, HVG, PCA, UMAP (works well)
4. PAGA (with louvain from 2., works well)
5. UMAP (with positions from 4., does not work)
Any idea? Any further info needed? Best, Jens
About this issue
- State: closed
- Created 5 years ago
- Comments: 17 (9 by maintainers)
Commits related to this issue
- Fix init_pos argument of sc.tl.umap * Now actually allows passing an array * Additionally allows providing an array (even through key of obsm) of dtype other than float32 * Code converting arrays to ... — committed to ivirshup/scanpy by ivirshup 5 years ago
- Fix init_pos argument of sc.tl.umap * Now actually allows passing an array * Additionally allows providing an array (even through key of obsm) of dtype other than float32 * Code converting arrays to ... — committed to scverse/scanpy by ivirshup 5 years ago
- Fix init_pos argument of sc.tl.umap * Now actually allows passing an array * Additionally allows providing an array (even through key of obsm) of dtype other than float32 * Code converting arrays to ... — committed to dpeerlab/scanpy by ivirshup 5 years ago
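The traceback above shows that the numba-compiled `rdist` was typed for float32 arrays but received float64 ones, and the commit message mentions converting arrays of other dtypes. A minimal sketch of that kind of conversion follows. The helper name `to_float32_init` and the random stand-in coordinates are hypothetical and are not taken from scanpy or the commit.

```python
import numpy as np


def to_float32_init(init_coords):
    """Hypothetical helper: return initial positions as a C-contiguous float32 2-D array."""
    arr = np.asarray(init_coords)
    if arr.ndim != 2:
        raise ValueError("expected a 2-D array of shape (n_obs, n_components)")
    # Cast to the dtype that the compiled rdist signature in the traceback expects.
    return np.ascontiguousarray(arr, dtype=np.float32)


# Stand-in for PAGA-derived positions; NumPy produces float64 by default.
init64 = np.random.default_rng(0).normal(size=(5, 2))
init32 = to_float32_init(init64)
print(init64.dtype, "->", init32.dtype)
```

On a scanpy version that includes the fix, the commit message suggests that such an array (or the key of an `obsm` entry holding it) can be passed as `init_pos`. That has not been verified here.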
UMAP only uses the representation of the data matrix to determine the number of connected components of the graph for the init conditions, and only if these aren’t explicitly defined (they are when choosing init_pos='paga'): https://github.com/lmcinnes/umap/blob/948f60ff0caf7ccef0ab68626c7b99a11e66f1bb/umap/umap_.py#L958-L965
In fact, the only place where it enters is the computation of the mean positions of the disconnected components: https://github.com/lmcinnes/umap/blob/948f60ff0caf7ccef0ab68626c7b99a11e66f1bb/umap/spectral.py#L50
Implementation-wise, it’s a bit unfortunate that the data matrix is carried through all these functions just for that reason… But it’s not a problem for the results.
The confusing logging is fixed via https://github.com/theislab/scanpy/commit/a5bd1ecd8ab04ec79369f60d3656f578a4cde40c
Issue #666 👹 🙈