Subject: | RE: Fanout to Sub Records in a Different IOC Causes Core Dump |
From: | "Poff, Mark A" <[email protected]> |
To: | "Konrad, Martin" <[email protected]>, "[email protected]" <[email protected]>, Andrew Johnson <[email protected]> |
Cc: | "Poff, Mark A" <[email protected]> |
Date: | Fri, 10 Oct 2014 10:38:44 +0000 |
Andrew and Martin (and EPICS community),

After experiencing the repeated crashes and examining a memory dump to the extent necessary to determine where they occurred (according to gdb, it was in camessage.c, line #1344, EPICS v3.14.12.3), I modified the databases to not use fanout records and moved on with my work - and committed the programming sin of not keeping a copy of the failed configuration.

Sure enough, when I tried to recreate the failure scenario yesterday, no crash occurred. I know the cpp processes were the same (the only changes there were a couple of printf debug statements), but besides modifying the .db files I had also touched the .dbd file and the st.cmd file (we use a customized version) and probably a couple of other things. The only conclusion I can reach is that I've missed some small but significant change I made, which somehow contributed to the problem.

I apologize for raising a concern and then not being able to back it up. The schedule here prohibits me from spending much more time researching this issue, but I will continue to pursue it as time permits and will let you know if I
can get the core to recur.

-Mark Poff

-----Original Message-----

Hi Mark,

> If you can't, take a look at how much stack space your
> Process1_routine subroutine needs, running out of stack space is one
> possible cause for this kind of strange behaviour in addition to the
> usual kinds of problems. A full stack trace for all threads when the
> crash happens would also help us help you (type 'thread apply all bt'
> in gdb).

Keep in mind that the stack trace could be corrupted. You might want to run with a trivial subroutine for testing. In addition to that it's always a good idea to write a short test program that calls the subroutine function with some test values. You can run this test program with

valgrind --tool=memcheck <your_test_program>

to see if it is writing to memory it should not write to.

Regards,
Martin

--
Martin Konrad
Control System Engineer
Facility for Rare Isotope Beams
Michigan State University
640 South Shaw Lane
East Lansing, MI 48824-1321, USA
Tel. 517-908-7253
Email: [email protected]
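
For anyone following the thread, the short test program suggested above might look roughly like the sketch below. It is only an illustration under stated assumptions: the real Process1_routine was not posted, so `process1_routine` here is a hypothetical stand-in, and `fake_sub_record` is a reduced substitute for the `subRecord` structure a real EPICS subroutine receives, used so the file builds without EPICS headers. The `STANDALONE_TEST` macro is likewise just a name chosen for this example.

```c
#include <stdio.h>

/* Reduced stand-in for the record passed to a sub record routine.
 * ASSUMPTION: the real routine takes a subRecord * from EPICS headers;
 * only the fields this sketch needs are modelled here. */
typedef struct {
    double a;    /* first input value */
    double b;    /* second input value */
    double val;  /* result written by the routine */
} fake_sub_record;

/* HYPOTHETICAL routine under test; replace with the real subroutine. */
static long process1_routine(fake_sub_record *prec)
{
    prec->val = prec->a + prec->b;
    return 0;
}

/* Calls the routine with a few test values and returns the failure count. */
int run_subroutine_checks(void)
{
    static const double inputs[][2] = { {0.0, 0.0}, {1.5, 2.5}, {-3.0, 3.0} };
    static const double expected[] = { 0.0, 4.0, 0.0 };
    int failures = 0;
    size_t i;

    for (i = 0; i < sizeof expected / sizeof expected[0]; i++) {
        fake_sub_record rec = { inputs[i][0], inputs[i][1], 0.0 };
        long status = process1_routine(&rec);

        if (status != 0 || rec.val != expected[i]) {
            printf("case %lu failed: status=%ld val=%g expected=%g\n",
                   (unsigned long)i, status, rec.val, expected[i]);
            failures++;
        }
    }
    return failures;
}

#ifdef STANDALONE_TEST
/* Entry point for running the checks on their own, e.g. under valgrind. */
int main(void)
{
    return run_subroutine_checks() == 0 ? 0 : 1;
}
#endif
```

Built with debug symbols and the macro defined (for example `cc -g -DSTANDALONE_TEST test_sub.c -o test_sub`, where the file name is arbitrary), the binary can then be run as `valgrind --tool=memcheck ./test_sub`. Keeping the routine free of IOC dependencies in the test is the point: any invalid write memcheck reports then comes from the subroutine itself rather than from the rest of the IOC.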