EPICS RE: Fanout to Sub Records in a Different IOC Causes Core Dump


Andrew and Martin (and EPICS community),

After experiencing the repeated crashes and examining a memory dump far enough to determine where they occurred (according to gdb, in camessage.c, line 1344, EPICS v3.14.12.3), I modified the databases to not use fanout records and moved on with my work, committing the programming sin of not keeping a copy of the failed configuration. Sure enough, when I tried to recreate the failure scenario yesterday, no crash occurred. I know the cpp processes were the same (the only changes there were a couple of printf debug statements), but besides modifying the .db files I had also touched the .dbd file and the st.cmd file (we use a customized version) and probably a couple of other things. The only conclusion I can reach is that I've missed some small but significant change I made which somehow contributed to the problem.

I apologize for raising a concern and then not being able to back it up. The schedule here prevents me from spending much more time researching this issue, but I will continue to pursue it as time permits and will let you know if I can get the core dump to recur.

-Mark Poff

-----Original Message-----
From: Konrad, Martin [mailto:[email protected]]
Sent: Thursday, October 09, 2014 4:04 PM
To: Poff, Mark A; [email protected]
Subject: Re: Fanout to Sub Records in a Different IOC Causes Core Dump

Hi Mark,

> If you can't, take a look at how much stack space your
> Process1_routine subroutine needs, running out of stack space is one
> possible cause for this kind of strange behaviour in addition to the
> usual kinds of problems. A full stack trace for all threads when the
> crash happens would also help us help you (type 'thread apply all bt'
> in gdb).

Keep in mind that the stack trace could be corrupted. You might want to run with a trivial subroutine for testing.

In addition to that it's always a good idea to write a short test program that calls the subroutine function with some test values. You can run this test program with

valgrind --tool=memcheck <your_test_program>

to see if it is writing to memory it should not write to.
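
As a rough illustration of such a test program, here is a minimal sketch in C. The struct fake_sub_record, the function process1_routine_under_test and its placeholder body (adding A and B), and the test values are all hypothetical stand-ins, not the actual Process1_routine code; a real harness would presumably include the record header from EPICS base and call the real subroutine instead.

```c
/* Stand-alone harness for a sub record subroutine (sketch only).
 * fake_sub_record and process1_routine_under_test are hypothetical
 * stand-ins; substitute the real record type and subroutine. */
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the few record fields the routine touches. */
struct fake_sub_record {
    double a;    /* input A */
    double b;    /* input B */
    double val;  /* result  */
};

/* Placeholder body (assumption): replace with the real subroutine logic. */
static long process1_routine_under_test(struct fake_sub_record *prec)
{
    assert(prec != NULL);
    prec->val = prec->a + prec->b;
    return 0;
}

/* Calls the routine once with the given inputs; returns 1 on failure. */
static int check_case(double a, double b, double expected)
{
    struct fake_sub_record rec;
    long status;

    memset(&rec, 0, sizeof rec);
    rec.a = a;
    rec.b = b;
    status = process1_routine_under_test(&rec);
    if (status != 0 || rec.val != expected) {
        fprintf(stderr, "FAIL a=%g b=%g: status=%ld val=%g (expected %g)\n",
                a, b, status, rec.val, expected);
        return 1;
    }
    return 0;
}

/* Runs all test values; returns the number of failed cases. */
int run_subroutine_tests(void)
{
    int failures = 0;

    failures += check_case(0.0, 0.0, 0.0);
    failures += check_case(1.5, 2.5, 4.0);
    failures += check_case(-3.0, 3.0, 0.0);
    return failures;
}
```

Calling run_subroutine_tests() from a small main() that returns nonzero on failure, building with -g, and running the resulting binary under valgrind as shown above should make any out-of-bounds write inside the subroutine show up with a source line attached.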

Regards,

Martin

Martin Konrad
Control System Engineer
Facility for Rare Isotope Beams
Michigan State University
640 South Shaw Lane
East Lansing, MI 48824-1321, USA
Tel. 517-908-7253
Email: [email protected]

Subject:	RE: Fanout to Sub Records in a Different IOC Causes Core Dump
From:	"Poff, Mark A" <[email protected]>
To:	"Konrad, Martin" <[email protected]>, "[email protected]" <[email protected]>, Andrew Johnson <[email protected]>
Cc:	"Poff, Mark A" <[email protected]>
Date:	Fri, 10 Oct 2014 10:38:44 +0000
