mfem: Hybridization does not work on NC processor boundaries.
It seems hybridization has never worked correctly on non-conforming interfaces that are at the same time processor interfaces. I tested this as far back as v3.2.
How to reproduce:
- In ex4p, disable all serial and parallel refinement.
- Run the following, which works because all inter-processor faces happen to be conforming:
  mpirun -n 4 ex4p -m ../data/amr-quad.mesh -o 2 -hb
- Run on a different number of CPUs, so that some master-slave faces are on a processor boundary:
  mpirun -n 3 ex4p -m ../data/amr-quad.mesh -o 2 -hb
I could use some help fixing this because I don’t know how hybridization actually works…
About this issue
- State: open
- Created 5 years ago
- Comments: 34 (34 by maintainers)
Commits related to this issue
- WIP: debugging and bugfix for the issue with hybridization on 2D nonconforming meshes, see issue #1105 on github. — committed to mfem/mfem by v-dobrev 4 years ago
Looking at the code, I see that there is a TODO comment to fix such zero rows in cases like the above: https://github.com/mfem/mfem/blob/6e149b75e502214cf0d1d163b742986d10b629c9/fem/hybridization.cpp#L677-L679
It looks like the best solution will be to address this TODO.
I see the same issue on Mac with HYPRE 2.19.0 but not with HYPRE 2.10.0b. The same issue shows up with the sample run which uses -o 2. It looks like this does not fail in our nightly tests because they use HYPRE 2.10.0b.
I tried other older versions and here is what I saw:
- HYPRE 2.16.0: fails
- HYPRE 2.14.0: fails
- HYPRE 2.12.1: fails
- HYPRE 2.11.2: fails
- HYPRE 2.11.0: fails
- HYPRE 2.10.1: ok
In any case, I’m not sure why this specific case creates a hypre matrix with a zero row – I’ll need to look more carefully in the hybridization code.
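For anyone who wants to check for such zero rows independently, here is a minimal sketch in plain C++ that scans a matrix stored in CSR format. It does not use the MFEM or HYPRE API: the CsrMatrix struct and the FindZeroRows function are hypothetical names made up for this illustration, and the tolerance parameter is an assumption.

```cpp
#include <cmath>
#include <vector>

// Hypothetical minimal CSR container (not an MFEM or HYPRE type).
struct CsrMatrix
{
   std::vector<int> row_ptr;    // size = number of rows + 1
   std::vector<int> col_idx;    // column index of each stored entry
   std::vector<double> values;  // value of each stored entry
};

// Return the indices of rows whose stored entries all have magnitude <= tol.
// Rows with no stored entries at all are reported as zero rows too.
std::vector<int> FindZeroRows(const CsrMatrix &A, double tol = 0.0)
{
   std::vector<int> zero_rows;
   const int num_rows = static_cast<int>(A.row_ptr.size()) - 1;
   for (int i = 0; i < num_rows; i++)
   {
      bool all_zero = true;
      for (int k = A.row_ptr[i]; k < A.row_ptr[i+1]; k++)
      {
         if (std::abs(A.values[k]) > tol) { all_zero = false; break; }
      }
      if (all_zero) { zero_rows.push_back(i); }
   }
   return zero_rows;
}
```

In a parallel run, each rank would apply a check like this to its local rows, so the rank and row index of a zero row can be related back to a specific ghost face.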
I did some more testing with the new branch https://github.com/mfem/mfem/tree/nc-amr-hybridization-fix and it seems to work fine. I also added two sample runs that use ghost shared faces. One of these tests the issue discussed here. After some cleanup, it should be ready for review.
@jakubcerveny, what about this question: https://github.com/mfem/mfem/blob/dd23ccddb3483d983552f187abc28fd736662cf3/mesh/pncmesh.cpp#L1266
Also, we can try to fix this: https://github.com/mfem/mfem/blob/dd23ccddb3483d983552f187abc28fd736662cf3/fem/pfespace.cpp#L1215
This sounds really promising but at first try it didn’t fix the hybridization problem. I’ll take a more detailed look tomorrow, since this really sounds related.
I may be on the right path: if I swap the columns of elmat that goes into Ct for the one ghost slave face on processor 1, the solution gets fixed. So it’s probably the orientation of ghost slave faces. I’ll keep looking into this.
It seems https://github.com/mfem/mfem/compare/nc-amr-hybridization-fix really does work!
mpirun -np 3 ex4p -m ../data/amr-quad.mesh -o 2 -hb
mpirun -np 3 ex9p -m ../data/amr-quad.mesh -p 1 -rs 0 -rp 0 -dt 0.002 -tf 2
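The column-swap experiment mentioned earlier in the thread can be illustrated with a small stand-alone sketch. It assumes that the columns of the face matrix correspond to the face unknowns ordered along the edge, so that reversing the column order mimics traversing the face in the opposite orientation; the thread does not confirm this layout. Dense and ReverseColumns are hypothetical names and are not MFEM types or functions.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical dense matrix stored row-major (not mfem::DenseMatrix).
struct Dense
{
   int rows, cols;
   std::vector<double> a;  // entry (i,j) is stored at a[i*cols + j]
   double &operator()(int i, int j) { return a[i*cols + j]; }
};

// Reverse the order of the columns in place. With two columns this is
// exactly a swap of column 0 and column 1.
void ReverseColumns(Dense &M)
{
   for (int i = 0; i < M.rows; i++)
   {
      std::reverse(M.a.begin() + i*M.cols, M.a.begin() + (i+1)*M.cols);
   }
}
```

Applying such a reversal only to faces flagged as ghost slave faces is a debugging aid for testing the orientation hypothesis, not a fix in itself.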
Here is the relevant paper: https://arxiv.org/abs/1801.08914