Description
This happens when the test suite is run in parallel, in batch, on pascal. I haven't tested other systems.
I noticed while monitoring a parallel test-suite run that the engine was creating core files at completion of every test. I limited my run to a single test so I could try to track down the problem.
I decided to run plots/contour.py and piped the output to a file.
Here's what I saw being printed to the log file:
EXIT: Test script contour.py
EXCODE: 111
- - - - - - - - - - - - - - -
srun: error: pascal32: task 1: Aborted (core dumped)
srun: error: pascal32: task 0: Aborted (core dumped)
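For a full parallel run, where these messages are scattered through a long log, a small script can list which srun tasks aborted. This is only a sketch: the function name `find_core_dumps` is hypothetical, and the pattern assumes the srun message format is exactly what appears in the log above.

```python
import re

# Matches lines such as:
#   srun: error: pascal32: task 1: Aborted (core dumped)
# The format is assumed from the log excerpt above.
SRUN_ABORT = re.compile(
    r"^srun: error: (?P<node>\S+): task (?P<task>\d+): "
    r"(?P<reason>.*\(core dumped\))\s*$"
)


def find_core_dumps(log_text):
    """Return (node, task, reason) for each srun task that dumped core."""
    hits = []
    for line in log_text.splitlines():
        match = SRUN_ABORT.match(line.strip())
        if match:
            hits.append(
                (match.group("node"), int(match.group("task")), match.group("reason"))
            )
    return hits


# Sample taken from the log output above.
sample = """EXIT: Test script contour.py
EXCODE: 111
srun: error: pascal32: task 1: Aborted (core dumped)
srun: error: pascal32: task 0: Aborted (core dumped)
"""

for node, task, reason in find_core_dumps(sample):
    print(node, task, reason)
```

Running this over the saved log file instead of `sample` would show whether every test hits the abort or only some of them.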
I added 'ulimit -c unlimited' to run_visit_test_suite.sh so I could get a useful core file and reran the test. I then ran gdb on the core file, and the backtrace yielded this:
(gdb) bt
#0 0x00002aaac1ad1387 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:55
#1 0x00002aaac1ad2a78 in __GI_abort () at abort.c:90
#2 0x00002aaaba0cd955 in signalhandler_core (sig=11)
at /visit/src/common/misc/DebugStreamFull.C:162
#3 <signal handler called>
#4 0x00000000000001d1 in ?? ()
#5 0x00002aaac1ad505a in __cxa_finalize (d=0x2aaab47f3440) at cxa_finalize.c:55
#6 0x00002aaab3fd5113 in __do_global_dtors_aux ()
from /usr/workspace/wsa/visit/visit/thirdparty_shared/3.1.0/toss3/ospray/1.6.1/linux-x86_64_gcc-4.9/lib64/libospray_module_ispc.so.0
#7 0x00007fffffffb960 in ?? ()
#8 0x00002aaaaaabb07a in _dl_fini () at dl-fini.c:253
Backtrace stopped: frame did not save the PC
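The core-file limit I set with 'ulimit -c unlimited' could also be raised from Python before the engine is launched, for anyone driving the tests from a Python wrapper rather than a shell script. This is a minimal sketch using the standard `resource` module; the function name `enable_core_dumps` is hypothetical and is not part of the VisIt test harness. Unlike 'ulimit -c unlimited', it only raises the soft limit to whatever hard limit is already in place.

```python
import resource


def enable_core_dumps():
    """Raise the soft core-file size limit to the current hard limit.

    Processes started afterwards from this one inherit the new limit.
    Returns the (soft, hard) pair that is in effect after the change.
    """
    _, hard = resource.getrlimit(resource.RLIMIT_CORE)
    # Raising the soft limit up to the hard limit needs no extra privileges.
    resource.setrlimit(resource.RLIMIT_CORE, (hard, hard))
    return resource.getrlimit(resource.RLIMIT_CORE)


soft, hard = enable_core_dumps()
print("core limit (soft, hard):", soft, hard)
```

If the hard limit on the batch nodes is 0, this will not produce core files, and the limit would have to be changed in the batch environment itself.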
I then completely recompiled VisIt without ospray (I removed the ospray libraries from quartz386.cmake), and ran the test again. This time no core files were created.
FWIW, I build VisIt and run the test suite with a couple of scripts that were basically culled from regressiontest_pascal: I split it into separate build and run-the-test-suite scripts, then modified them for my local dirs.