Andrew and Martin (and EPICS community),
After experiencing the repeated crashes and examining a memory dump far enough to determine where they occurred (according to gdb, in camessage.c, line 1344, EPICS v22.214.171.124), I modified the databases so they no longer use fanout records and moved on with my work. In doing so I committed the programming sin of not keeping a copy of the failed configuration. Sure enough, when I tried to recreate the failure scenario yesterday, no crash occurred. I know the cpp processes were the same (the only changes there were a couple of printf debug statements), but besides modifying the .db files I had also touched the .dbd file, the st.cmd file (we use a customized version), and probably a couple of other things. The only conclusion I can reach is that I have lost track of some small but significant change I made that somehow contributed to the problem.
I apologize for raising a concern and then not being able to back it up. The schedule here prevents me from spending much more time researching this issue, but I will continue to pursue it as time permits and will let you know if I can get the core dump to recur.
From: Konrad, Martin [mailto:email@example.com]
Sent: Thursday, October 09, 2014 4:04 PM
To: Poff, Mark A; firstname.lastname@example.org
Subject: Re: Fanout to Sub Records in a Different IOC Causes Core Dump
> If you can't, take a look at how much stack space your
> Process1_routine subroutine needs, running out of stack space is one
> possible cause for this kind of strange behaviour in addition to the
> usual kinds of problems. A full stack trace for all threads when the
> crash happens would also help us help you (type 'thread apply all bt'
> in gdb).
Keep in mind that the stack trace could be corrupted. You might want to run with a trivial subroutine for testing.
In addition, it's always a good idea to write a short test program that calls the subroutine function with some test values. You can run this test program with
valgrind --tool=memcheck <your_test_program>
to see if it is writing to memory it should not write to.
Control System Engineer
Facility for Rare Isotope Beams
Michigan State University
640 South Shaw Lane
East Lansing, MI 48824-1321, USA