legion: All tests segfault on Fedora Rawhide with legion-19.04.0
Start 25: rendering
25/25 Test #25: rendering ........................***Exception: SegFault 0.44 sec
[buildhw-10:7077 :0:7077] Caught signal 11 (Segmentation fault: address not mapped to object at address 0x7ff0db8da948)
==== backtrace ====
0 /lib64/libucs.so.0(+0x194a3) [0x7ff0db8624a3]
1 /lib64/libucs.so.0(+0x1965a) [0x7ff0db86265a]
2 /lib64/libuct.so.0(+0x1b72b) [0x7ff0dbbb072b]
3 /lib64/ld-linux-x86-64.so.2(+0xfe4a) [0x7ff0e379ee4a]
4 /lib64/ld-linux-x86-64.so.2(+0xff51) [0x7ff0e379ef51]
5 /lib64/ld-linux-x86-64.so.2(+0x13eae) [0x7ff0e37a2eae]
6 /lib64/libc.so.6(_dl_catch_exception+0x79) [0x7ff0e20415b9]
7 /lib64/ld-linux-x86-64.so.2(+0x1372e) [0x7ff0e37a272e]
8 /lib64/libdl.so.2(+0x239c) [0x7ff0e1f0339c]
9 /lib64/libc.so.6(_dl_catch_exception+0x79) [0x7ff0e20415b9]
10 /lib64/libc.so.6(_dl_catch_error+0x33) [0x7ff0e2041653]
11 /lib64/libdl.so.2(+0x2af9) [0x7ff0e1f03af9]
12 /lib64/libdl.so.2(dlopen+0x4a) [0x7ff0e1f0342a]
13 /usr/lib64/openmpi/lib/libopen-pal.so.40(+0x6ead7) [0x7ff0e1b01ad7]
14 /usr/lib64/openmpi/lib/libopen-pal.so.40(mca_base_component_repository_open+0x1f4) [0x7ff0e1adf524]
15 /usr/lib64/openmpi/lib/libopen-pal.so.40(mca_base_component_find+0x35b) [0x7ff0e1ade4eb]
16 /usr/lib64/openmpi/lib/libopen-pal.so.40(mca_base_framework_components_register+0x2e) [0x7ff0e1ae9dfe]
17 /usr/lib64/openmpi/lib/libopen-pal.so.40(mca_base_framework_register+0x256) [0x7ff0e1aea2e6]
18 /usr/lib64/openmpi/lib/libopen-pal.so.40(mca_base_framework_open+0x14) [0x7ff0e1aea344]
19 /usr/lib64/openmpi/lib/libmpi.so.40(ompi_mpi_init+0x695) [0x7ff0e1db6795]
20 /usr/lib64/openmpi/lib/libmpi.so.40(PMPI_Init_thread+0x5b) [0x7ff0e1de6bbb]
21 /builddir/build/BUILD/legion-legion-19.04.0/openmpi/lib/librealm.so.1(AMMPI_SPMDSetThreadMode+0x182) [0x7ff0e2c879e2]
22 /builddir/build/BUILD/legion-legion-19.04.0/openmpi/lib/librealm.so.1(gex_Client_Init_GASNET_201930PARpshmFASTnodebugnotracenostatsnodebugmallocnosrclines+0x15d) [0x7ff0e2c15e5d]
23 /builddir/build/BUILD/legion-legion-19.04.0/openmpi/lib/librealm.so.1(_ZN5Realm11RuntimeImpl12network_initEPiPPPc+0x120) [0x7ff0e2bc6f30]
24 /builddir/build/BUILD/legion-legion-19.04.0/openmpi/lib/liblegion.so.1(_ZN6Legion8Internal7Runtime10initializeEPiPPPc+0x24) [0x7ff0e35ba694]
25 /builddir/build/BUILD/legion-legion-19.04.0/openmpi/lib/liblegion.so.1(_ZN6Legion8Internal7Runtime5startEiPPcb+0x250) [0x7ff0e35f49f0]
26 /builddir/build/BUILD/legion-legion-19.04.0/openmpi/bin/rendering(main+0x1b7) [0x556a7f716047]
27 /lib64/libc.so.6(__libc_start_main+0xf3) [0x7ff0e1f2ef73]
28 /builddir/build/BUILD/legion-legion-19.04.0/openmpi/bin/rendering(_start+0x2e) [0x556a7f7160de]
===================
0% tests passed, 25 tests failed out of 25
Total Test time (real) = 11.09 sec
The following tests FAILED:
1 - attach_file (SEGFAULT)
2 - circuit (SEGFAULT)
3 - dynamic_registration (SEGFAULT)
4 - ghost (SEGFAULT)
5 - ghost_pull (SEGFAULT)
6 - realm_saxpy (SEGFAULT)
7 - realm_stencil (SEGFAULT)
8 - spmd_cgsolver (SEGFAULT)
9 - virtual_map (SEGFAULT)
10 - attach_2darray (SEGFAULT)
11 - attach_array_daxpy (SEGFAULT)
12 - mpi_interop (SEGFAULT)
13 - hello_world (SEGFAULT)
14 - tasks_and_futures (SEGFAULT)
15 - index_tasks (SEGFAULT)
16 - global_vars (SEGFAULT)
17 - logical_regions (SEGFAULT)
18 - physical_regions (SEGFAULT)
19 - privileges (SEGFAULT)
20 - partitioning (SEGFAULT)
21 - multiple_partitions (SEGFAULT)
22 - custom_mapper (SEGFAULT)
23 - attach_file_mini (SEGFAULT)
24 - test_stl (SEGFAULT)
25 - rendering (SEGFAULT)
BUILDSTDERR: Errors while running CTest
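One way to read the backtrace above is to list which library each frame belongs to. The following is only a rough sketch: `parse_frames` and `FRAME_RE` are hypothetical helper names, not part of Legion, CTest, or Open MPI, and the regular expression assumes the exact frame layout printed in this log. The sample is a few frames copied from the backtrace.

```python
import re

# Assumed frame layout, as printed in the log above, e.g.:
#   2 /lib64/libuct.so.0(+0x1b72b) [0x7ff0dbbb072b]
FRAME_RE = re.compile(r"^\s*(\d+)\s+(\S+?)\((.*?)\)\s+\[(0x[0-9a-f]+)\]\s*$")


def parse_frames(log_text):
    """Return (frame index, library basename, symbol+offset) per backtrace frame."""
    frames = []
    for line in log_text.splitlines():
        match = FRAME_RE.match(line)
        if match:
            index, path, symbol, _address = match.groups()
            # Keep only the file name so frames group by library.
            frames.append((int(index), path.rsplit("/", 1)[-1], symbol))
    return frames


# A few frames copied verbatim from the backtrace in this report.
sample = """\
0 /lib64/libucs.so.0(+0x194a3) [0x7ff0db8624a3]
2 /lib64/libuct.so.0(+0x1b72b) [0x7ff0dbbb072b]
12 /lib64/libdl.so.2(dlopen+0x4a) [0x7ff0e1f0342a]
20 /usr/lib64/openmpi/lib/libmpi.so.40(PMPI_Init_thread+0x5b) [0x7ff0e1de6bbb]
"""

frames = parse_frames(sample)
for index, library, symbol in frames:
    print(index, library, symbol)
```

Read this way, the innermost frames sit in libucs and libuct (the UCX libraries), reached through dlopen from Open MPI's component loader during PMPI_Init_thread, before any Legion test code runs. That would be consistent with all 25 tests failing in the same way.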
Details here: https://koji.fedoraproject.org/koji/taskinfo?taskID=34577005
This can be reproduced with the following Dockerfile:
FROM fedora:rawhide
RUN dnf install -y spectool wget rpm-build dnf-plugins-core
RUN wget https://src.fedoraproject.org/fork/junghans/rpms/legion/raw/master/f/legion.spec
RUN spectool -g legion.spec
RUN dnf builddep -y legion.spec
RUN dnf install -y make
RUN rpmbuild -D"_sourcedir ${PWD}" -D"_srcrpmdir ${PWD}" -ba legion.spec
About this issue
- State: closed
- Created 5 years ago
- Comments: 23 (9 by maintainers)
OK, the original issue is fixed; now the circuit test fails on ppc64le:
Yes, all tests still pass. We will just wait for an OpenMPI-4 fix to land in Rawhide.
Looking at https://apps.fedoraproject.org/koschei/package/legion, it seems openmpi-4 broke the build.