mfem: Hybridization does not work on NC processor boundaries.

It seems hybridization has never worked correctly on non-conforming interfaces that coincide with processor boundaries. I tested versions as far back as v3.2.

How to reproduce:

  1. In ex4p, disable all serial and parallel refinement (see the sketch after this list).

  2. Run mpirun -n 4 ex4p -m ../data/amr-quad.mesh -o 2 -hb. This works because all inter-processor faces happen to be conforming (screenshot: ex4p-good).

  3. When run on a different number of CPUs, some master-slave faces land on a processor boundary and the solution comes out wrong: mpirun -n 3 ex4p -m ../data/amr-quad.mesh -o 2 -hb (screenshot: ex4p-bad).
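
For step 1, this is the kind of change meant in ex4p.cpp (a sketch only; the exact loop bounds differ between MFEM versions):

```cpp
// Sketch of step 1 in ex4p.cpp: force both refinement loops to zero levels,
// so amr-quad.mesh keeps its original non-conforming interfaces.
int ref_levels = 0;       // originally computed from the serial mesh size
for (int l = 0; l < ref_levels; l++) { mesh->UniformRefinement(); }

// ... later, after the ParMesh 'pmesh' is constructed ...
int par_ref_levels = 0;   // originally a small positive constant
for (int l = 0; l < par_ref_levels; l++) { pmesh->UniformRefinement(); }
```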

I could use some help fixing this because I don’t know how hybridization actually works…

About this issue

  • State: open
  • Created 5 years ago
  • Comments: 34 (34 by maintainers)

Most upvoted comments

Looking at the code, I see that there is a TODO comment to fix such zero rows in cases like the above: https://github.com/mfem/mfem/blob/6e149b75e502214cf0d1d163b742986d10b629c9/fem/hybridization.cpp#L677-L679

It looks like the best solution will be to address this TODO.
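
For concreteness, a minimal sketch of what addressing the TODO might look like, assuming the empty rows live in a local mfem::SparseMatrix that is still in dynamic (not yet finalized) form; the name H_local and the call site are assumptions, and the real fix may need to act elsewhere:

```cpp
#include "mfem.hpp"

// Hypothetical helper: give every empty row of a not-yet-finalized
// SparseMatrix a unit diagonal, so the assembled parallel matrix has no
// zero rows. 'H_local' and this function are illustrative names, not the
// actual code in hybridization.cpp.
void FixEmptyRows(mfem::SparseMatrix &H_local)
{
   for (int i = 0; i < H_local.Height(); i++)
   {
      if (H_local.RowSize(i) == 0)
      {
         H_local.Set(i, i, 1.0);  // decouples the row: that unknown becomes 0
      }
   }
}
```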

I see the same issue on Mac with HYPRE 2.19.0 but not with HYPRE 2.10.0b. It also shows up with the sample run that uses -o 2:

mpirun -np 3 ex4p -m ../data/amr-quad.mesh -o 2 -hb

It looks like this does not fail in our nightly tests because they use HYPRE 2.10.0b.

I tried other older versions and here is what I saw:

  • HYPRE 2.16.0 - fails
  • HYPRE 2.14.0 - fails
  • HYPRE 2.12.1 - fails
  • HYPRE 2.11.2 - fails
  • HYPRE 2.11.0 - fails
  • HYPRE 2.10.1 - ok
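
For this kind of bisection it also helps to double-check which HYPRE headers a build actually picks up; a tiny sanity check (HYPRE_RELEASE_VERSION is a string macro from HYPRE_config.h in the releases I looked at; treat its presence in very old versions as an assumption):

```cpp
#include <cstdio>
#include "HYPRE_config.h"   // defines HYPRE_RELEASE_VERSION

int main()
{
   std::printf("HYPRE version: %s\n", HYPRE_RELEASE_VERSION);
   return 0;
}
```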

In any case, I'm not sure why this specific case creates a hypre matrix with a zero row; I'll need to look more carefully at the hybridization code.
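
A quick way to locate such rows while debugging could be something like the sketch below; FindZeroRows is a hypothetical helper, not MFEM API, and HYPRE_BigInt may be HYPRE_Int depending on the HYPRE build:

```cpp
#include "mfem.hpp"

// Hypothetical debugging helper: report locally-owned rows of a
// HypreParMatrix that have no nonzeros in either the diagonal or the
// off-diagonal block.
void FindZeroRows(const mfem::HypreParMatrix &A)
{
   mfem::SparseMatrix diag, offd;
   HYPRE_BigInt *cmap = nullptr;   // off-diagonal column map (owned by A)
   A.GetDiag(diag);
   A.GetOffd(offd, cmap);
   for (int i = 0; i < diag.Height(); i++)
   {
      if (diag.RowSize(i) == 0 && offd.RowSize(i) == 0)
      {
         mfem::out << "zero row at local index " << i << '\n';
      }
   }
}
```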

I did some more testing with the new branch https://github.com/mfem/mfem/tree/nc-amr-hybridization-fix and it seems to work fine. I also added two sample runs that use ghost shared faces. One of these tests the issue discussed here. After some cleanup, it should be ready for review.

@jakubcerveny, what about this question: https://github.com/mfem/mfem/blob/dd23ccddb3483d983552f187abc28fd736662cf3/mesh/pncmesh.cpp#L1266

Also, we can try to fix this: https://github.com/mfem/mfem/blob/dd23ccddb3483d983552f187abc28fd736662cf3/fem/pfespace.cpp#L1215

This sounds promising, but a first attempt with it didn't fix the hybridization problem. I'll take a more detailed look tomorrow, since it does sound related.

I may be on the right path: if I swap the columns of elmat that go into Ct for the one ghost slave face on processor 1, the solution gets fixed (screenshot: fixed). So it's probably the orientation of ghost slave faces. I'll keep looking into this.
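
To make the experiment concrete, the swap amounts to something like this; the function and the indices j1, j2 are hypothetical (in the actual test they came from the one ghost slave face on processor 1):

```cpp
#include <utility>
#include "mfem.hpp"

// Illustration of the experiment above: exchange two columns of the local
// element matrix before it is assembled into Ct.
void SwapElmatColumns(mfem::DenseMatrix &elmat, int j1, int j2)
{
   for (int i = 0; i < elmat.Height(); i++)
   {
      std::swap(elmat(i, j1), elmat(i, j2));
   }
}
```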

It seems https://github.com/mfem/mfem/compare/nc-amr-hybridization-fix really does work!

mpirun -np 3 ex4p -m ../data/amr-quad.mesh -o 2 -hb (screenshot: ex4p)
mpirun -np 3 ex9p -m ../data/amr-quad.mesh -p 1 -rs 0 -rp 0 -dt 0.002 -tf 2 (screenshot: ex9p)

Thank you @v-dobrev !!!

Here is the relevant paper: https://arxiv.org/abs/1801.08914
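
For quick reference, my paraphrase of the setup from that paper and the Hybridization class docs (schematic only; as I understand it, Ct in the code holds the local block of C^T): the broken system is coupled to the face multiplier λ through a constraint matrix C, and the parallel solve is done on the Schur complement H:

$$
\begin{pmatrix} \hat{A} & C^T \\ C & 0 \end{pmatrix}
\begin{pmatrix} \hat{x} \\ \lambda \end{pmatrix}
=
\begin{pmatrix} \hat{b} \\ 0 \end{pmatrix},
\qquad
H = C \hat{A}^{-1} C^T,
\qquad
H \lambda = C \hat{A}^{-1} \hat{b}.
$$

In this picture, a wrongly oriented ghost slave face column in Ct perturbs H only across processor boundaries, which matches what was observed above.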