cellrank: the kernel appears to have died. It will restart automatically

Hi, I'm running the Pancreas Basics tutorial and it works fine until the velocity part, but for `cr.tl.terminal_states(adata, cluster_key='clusters', weight_connectivities=0.2)` I get:

```
Computing transition matrix based on velocity correlations using 'deterministic' mode
Estimating softmax_scale using 'deterministic' mode
100% 2531/2531 [00:03<00:00, 714.91cell/s]
Setting softmax_scale=3.7951
100% 2531/2531 [00:01<00:00, 1420.24cell/s]
    Finish (0:00:03)
Using a connectivity kernel with weight 0.2
Computing transition matrix based on connectivities
    Finish (0:00:00)
Computing eigendecomposition of the transition matrix
Adding .eigendecomposition
       adata.uns['eig_fwd']
    Finish (0:00:00)
Computing Schur decomposition
```

and then suddenly the kernel appears to have died and gets restarted. I would really appreciate your help with fixing this problem.

Thank you
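
For readers hitting the same crash: the process dies inside the Schur decomposition before any Python traceback can appear, so it helps to run the pipeline step by step. Below is a minimal sketch using the CellRank 1.x kernel/estimator API that `cr.tl.terminal_states()` wraps; `n_components=20` is an illustrative choice, and `method="brandts"` (a pure-SciPy solver, mentioned further down in this thread) avoids the SLEPc/PETSc code path at the cost of densifying the transition matrix:

```python
# Sketch (CellRank 1.x, as in the Pancreas Basics tutorial): rebuild the
# combined kernel manually so the Schur solver can be chosen explicitly.
# `adata` is the tutorial AnnData object with velocities already computed.
from cellrank.tl.kernels import ConnectivityKernel, VelocityKernel
from cellrank.tl.estimators import GPCCA

vk = VelocityKernel(adata).compute_transition_matrix()
ck = ConnectivityKernel(adata).compute_transition_matrix()
combined = 0.8 * vk + 0.2 * ck  # matches weight_connectivities=0.2

g = GPCCA(combined)
# method="krylov" routes through SLEPc/PETSc (and hence MPI);
# method="brandts" stays in SciPy and sidesteps the kernel death.
g.compute_schur(n_components=20, method="brandts")
```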

About this issue

  • Original URL
  • State: closed
  • Created 4 years ago
  • Comments: 19

Most upvoted comments

@josegarciamanteiga thank you, that seemed to solve the issue! I am just getting the following warning when I run `compute_fate_probabilities()`:

```
--------------------------------------------------------------------------
WARNING: There was an error initializing an OpenFabrics device.

  Local host:   qnode0465
  Local device: mlx5_0
--------------------------------------------------------------------------
```

However the function seems to be completing without errors, so I’m guessing this is okay to ignore.
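
For what it's worth, that message is Open MPI's startup probe of the InfiniBand (OpenFabrics) stack, not a CellRank error. A hedged way to silence it, assuming Open MPI is the MPI implementation behind PETSc on this cluster, is to disable the `openib` transport before launching Jupyter:

```bash
# Assumption: Open MPI backs petsc4py here. Excluding the openib byte
# transfer layer skips the OpenFabrics probe that prints the warning;
# single-node runs fall back to shared memory/TCP, which is fine.
export OMPI_MCA_btl='^openib'
```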

Hi, my workaround was to launch JupyterLab with sbatch rather than in an interactive session started with srun. Apparently, the --with-pmi support is automatically available under sbatch but not under srun. Everything worked after that. You just need to check which node your sbatch job landed on and then open the tunnels (or whatever you need) to reach your Jupyter session in the browser; a sketch of such a submission script follows below. HTH, Jose
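
A minimal submission script along those lines, as a sketch (every directive, the port, and the script name are placeholders to adapt to your cluster):

```bash
#!/bin/bash
#SBATCH --job-name=jupyterlab
#SBATCH --cpus-per-task=4
#SBATCH --mem=32G
#SBATCH --time=08:00:00

# Launched via `sbatch run_jupyter.sh`, the job inherits the PMI
# environment that Open MPI expects (unlike an interactive `srun` session).
jupyter lab --no-browser --ip=0.0.0.0 --port=8888
```

Then find the node with `squeue -u $USER` and tunnel to it, e.g. `ssh -N -L 8888:<node>:8888 <user>@<login-node>`.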



On Tue, 23 Jan 2024 at 00:00, jnmaciuch @.***> wrote:

Hello,

I'm experiencing the same issue as above. I was able to work around the error for `compute_schur()` by specifying `method="brandts"`; however, I am now receiving the same error message when trying to run `compute_fate_probabilities()`. I am also on an HPC using SLURM.

```
OPAL ERROR: Unreachable in file pmix3x_client.c at line 111

The application appears to have been direct launched using "srun",
but OMPI was not built with SLURM's PMI support and therefore cannot
execute. There are several options for building PMI support under
SLURM, depending upon the SLURM version you are using:

  version 16.05 or later: you can use SLURM's PMIx support. This
  requires that you configure and build SLURM --with-pmix.

  Versions earlier than 16.05: you must use either SLURM's PMI-1 or
  PMI-2 support. SLURM builds PMI-1 by default, or you can manually
  install PMI-2. You must then build Open MPI using --with-pmi pointing
  to the SLURM PMI library location. Please configure as appropriate
  and try again.

*** An error occurred in MPI_Init_thread
*** on a NULL communicator
*** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
***    and potentially your MPI job)
[qnode2038:59792] Local abort before MPI_INIT completed completed successfully,
but am not able to aggregate error messages, and not able to guarantee that all
other processes were killed!
```

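Since the same MPI initialization is what kills `compute_fate_probabilities()`, one hedged fallback, continuing from the estimator `g` in the sketch after the original question, is to force the pure-SciPy linear solver. The `solver` and `use_petsc` parameters are assumptions about the installed CellRank version (the method was named `compute_absorption_probabilities` in 1.x), so check the installed signature first:

```python
# Sketch, not a confirmed fix: keep the fate-probability solve on SciPy so
# petsc4py (and hence MPI_Init) is never touched. Verify the parameters
# with help(g.compute_fate_probabilities) before relying on them.
g.compute_fate_probabilities(solver="gmres", use_petsc=False)
```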

Awesome! Thanks @michalk8 for fixing this so quickly! @roofya, great that you’re checking out CellRank, let us know via issues in case you encounter any other problems - we’re happy to help.

@michalk8 Thank you so much, it seems the problem was related to SLEPc/PETSc. I uninstalled and reinstalled them, and now it works fine.
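
If someone needs to reproduce that fix, a hedged sequence for a pip-managed environment (the `krylov` extra is what pulls in `slepc4py`/`petsc4py`; conda setups would reinstall from conda-forge instead):

```bash
# Sketch, assuming a pip-based install of cellrank's Krylov dependencies.
pip uninstall -y slepc4py petsc4py
pip install --force-reinstall 'cellrank[krylov]'
```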

Hi @roofya, that sounds very strange. I assume you're using the SLEPc/PETSc libraries from `cellrank-krylov`. If so, can you please post the output of `python -c "import slepc4py; import petsc4py; print(slepc4py.__version__, petsc4py.__version__)"`? Currently, the only thing that comes to mind is this line densifying the matrix (https://github.com/msmdev/msmtools/blob/krylov_schur/msmtools/util/sorted_schur.py#L283) when SLEPc/PETSc is NOT installed (however, after testing this locally, my notebook doesn't crash).

As the next thing: could you please start your notebook as `jupyter notebook --debug > log.txt 2>&1` and post the log.txt here (ideally as an attachment; it might get big)? Finally, what are your Python version and OS? I've tested this in a fresh conda environment with `cellrank-krylov` (Python 3.8.5, Debian bullseye) and no crash happened.

Apart from the above, maybe this thread can help to solve the issue: https://github.com/jupyter/notebook/issues/1892