OPAL (master) gives a segfault when ending
Summary
The most recent master version compiles fine on Fedora 35 (fc35), but it segfaults upon finishing. A checkout of OPAL-2011-1 compiled on the same distro works just fine.
Steps to reproduce the failure
git checkout master
cd ../master
CC=mpicc CXX=mpicxx cmake ../src/ -DHDF5_C_COMPILER_EXECUTABLE=/usr/lib64/openmpi/bin/h5pcc -D CMAKE_BUILD_TYPE=Release -D CMAKE_CXX_FLAGS="-Wno-error=cast-function-type -Wno-cast-function-type -Wl,--copy-dt-needed-entries"
make -j 24
cd ../example
mpirun -np 8 ../master/src/opal awaDrive_all.in
What is the current bug behavior?
Timings{0}> WakeField........... Wall max = 0, CPU max = 0
Timings{0}> Wall avg = 0, CPU avg = 0
Timings{0}> Wall min = 0, CPU min = 0
Timings{0}>
Timings{0}> Write H5-File....... Wall max = 2.27603, CPU max = 1.23
Timings{0}> Wall avg = 2.27519, CPU avg = 1.115
Timings{0}> Wall min = 2.27071, CPU min = 1.03
Timings{0}>
Timings{0}> Write Stat.......... Wall max = 0.105094, CPU max = 0.07
Timings{0}> Wall avg = 0.0356317, CPU avg = 0.02625
Timings{0}> Wall min = 0.0245887, CPU min = 0
Timings{0}>
Timings{0}> -----------------------------------------------------------------
[localhost:356319:0:356319] Caught signal 11 (Segmentation fault: address not mapped to object at address (nil))
[localhost:356320:0:356320] Caught signal 11 (Segmentation fault: address not mapped to object at address (nil))
[localhost:356318:0:356318] Caught signal 11 (Segmentation fault: address not mapped to object at address (nil))
[localhost:356321:0:356321] Caught signal 11 (Segmentation fault: address not mapped to object at address (nil))
==== backtrace (tid: 356320) ====
0 /lib64/libucs.so.0(ucs_handle_error+0x2a4) [0x7f1d48c29864]
1 /lib64/libucs.so.0(+0x2a41d) [0x7f1d48c2d41d]
2 /lib64/libucs.so.0(+0x2a5fa) [0x7f1d48c2d5fa]
3 opal() [0xc8f6c0]
4 opal() [0x910701]
5 opal() [0x7eb18c]
6 opal() [0xb935d9]
7 opal() [0x918acf]
8 opal() [0xa1a7af]
9 opal() [0x72c16a]
10 opal() [0x72c189]
11 opal() [0x6f368a]
12 /lib64/libc.so.6(+0x57a15) [0x7f1d4d257a15]
13 /lib64/libc.so.6(on_exit+0) [0x7f1d4d257b90]
14 /lib64/libc.so.6(+0x40447) [0x7f1d4d240447]
15 /lib64/libc.so.6(__libc_start_main+0x80) [0x7f1d4d2404f0]
16 opal() [0x4abbd5]
=================================
==== backtrace (tid: 356319) ====
0 /lib64/libucs.so.0(ucs_handle_error+0x2a4) [0x7fa60d660864]
1 /lib64/libucs.so.0(+0x2a41d) [0x7fa60d66441d]
2 /lib64/libucs.so.0(+0x2a5fa) [0x7fa60d6645fa]
3 opal() [0xc8f6c0]
4 opal() [0x910701]
5 opal() [0x7eb18c]
6 opal() [0xb935d9]
7 opal() [0x918acf]
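The frames inside `opal` above are bare addresses, presumably because this is a Release build without debug info. As a rough sketch of how they might be resolved: the helper below pulls the unique `opal()` addresses out of a saved copy of the output, in stack order, so they can be handed to `addr2line` from binutils. The function name `extract_opal_frames` and the log file name `run.log` are made up for illustration, and whether `addr2line` can name the functions depends on the binary not being stripped.

```shell
# Print the unique addresses of frames that belong to the opal binary,
# keeping the order in which they first appear in the log.
extract_opal_frames() {
    grep -oE 'opal\(\) \[0x[0-9a-f]+\]' "$1" \
        | grep -oE '0x[0-9a-f]+' \
        | awk '!seen[$0]++'
}

# Possible usage (assumes the mpirun output was saved to the hypothetical
# file run.log and that addr2line is installed):
#   extract_opal_frames run.log | xargs addr2line -e ../master/src/opal -f -C -p
```

If that only yields `??`, rebuilding with `-D CMAKE_BUILD_TYPE=RelWithDebInfo` (a standard CMake build type) should give a more useful trace, assuming the crash still reproduces with that build.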
What is the expected correct behavior?
git checkout OPAL-2011-1
cd ../OPAL-2011-1
CC=mpicc CXX=mpicxx cmake ../src/ -DHDF5_C_COMPILER_EXECUTABLE=/usr/lib64/openmpi/bin/h5pcc -D CMAKE_BUILD_TYPE=Release -D CMAKE_CXX_FLAGS="-Wno-error=cast-function-type -Wno-cast-function-type -Wl,--copy-dt-needed-entries"
make -j 24
cd ../example
mpirun -np 8 ../OPAL-2011-1/src/opal awaDrive_all.in
This works fine: the run finishes without a segfault.
Relevant logs and/or screenshots
The input files I use are those for the AWA photoinjector; the specific version is available at https://xgitlab.cels.anl.gov/awa/awa-opal-lattices