Argonne National Laboratory

Experimental Physics and
Industrial Control System


Subject: Re: IOC crash
From: Mark Rivers <rivers@cars.uchicago.edu>
To: Hinko Kocevar <hinkocevar@gmail.com>, Michael Davidsaver <mdavidsaver@gmail.com>
Cc: "tech-talk@aps.anl.gov" <tech-talk@aps.anl.gov>
Date: Fri, 19 Jan 2018 14:08:38 +0000

Hi Hinko,


I think core saving was enabled, because it only says "Segmentation fault (core dumped)" if it is.


On CentOS 7 you can specify where core files are saved.  We have set ours to be written to the current working directory of the application when it dumped, but as I recall that is not the default.  I believe the file /proc/sys/kernel/core_pattern sets the location.
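
As a rough illustration of how that setting can be checked, here is a minimal Python sketch. It assumes a Linux host; describe_core_pattern is a hypothetical helper written for this note, not part of EPICS or of any library. It classifies the core_pattern value by its first character and also prints the core size limit that the current process inherited.

```python
import resource
from pathlib import Path

# Kernel setting that controls where core files go (Linux only).
CORE_PATTERN_FILE = Path("/proc/sys/kernel/core_pattern")


def describe_core_pattern(pattern):
    """Classify a core_pattern value (hypothetical helper, not part of EPICS)."""
    pattern = pattern.strip()
    if pattern.startswith("|"):
        # The kernel pipes the core image to this command instead of writing a file.
        return ("pipe", pattern[1:].strip())
    if pattern.startswith("/"):
        # Absolute template: the core file is written under that path.
        return ("absolute", pattern)
    # Anything else is resolved against the crashing process's working directory.
    return ("cwd-relative", pattern)


if __name__ == "__main__":
    if CORE_PATTERN_FILE.exists():
        kind, value = describe_core_pattern(CORE_PATTERN_FILE.read_text())
        print(f"core_pattern: {kind}: {value}")
    # A soft limit of 0 disables core files for processes that inherit it.
    soft, hard = resource.getrlimit(resource.RLIMIT_CORE)
    print(f"core size limit (soft, hard): {soft}, {hard}")
```

If the result is "pipe", the core was handed to a helper program and will not be in the IOC's startup directory; where it ends up then depends on that helper's own configuration.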


Mark



________________________________
From: Hinko Kocevar <hinkocevar@gmail.com>
Sent: Friday, January 19, 2018 2:31 AM
To: Michael Davidsaver
Cc: Mark Rivers; tech-talk@aps.anl.gov
Subject: Re: IOC crash

Thanks for the pointers! I'll see what I can still do; I'm not sure where
the core dump file is saved on CentOS 7.3, or whether core saving was
enabled at the time of the crash.

On Thu, Jan 18, 2018 at 10:29 PM, Michael Davidsaver
<mdavidsaver@gmail.com> wrote:
> Hinko,
>
> If you can get a core dump, please include all threads (gdb command "thread apply all backtrace").
> I remember that I was suspicious of a race with another thread.
>
> On 01/18/2018 10:17 AM, Hinko Kocevar wrote:
>> Hi Mark,
>>
>> I'll try to find the core file. Thanks for the tip.
>>
>> /Hinko
>>
>>
>> On Thu, 18 Jan 2018 at 13:31, Mark Rivers <rivers@cars.uchicago.edu> wrote:
>>
>>     Hi Hinko,
>>
>>
>>     Since it generated a core file, you should be able to do some more debugging with gdb.  For example, what are the values of "size" and "pMsg->m_postsize" in caserverio.c?  You might be able to figure out which PV this is.
>>
>>
>>     Mark
>>
>>
>>
>>     ________________________________
>>     From: tech-talk-bounces@aps.anl.gov on behalf of Hinko Kocevar <hinkocevar@gmail.com>
>>     Sent: Thursday, January 18, 2018 4:55 AM
>>     To: tech-talk@aps.anl.gov
>>     Subject: IOC crash
>>
>>     Dear EPICS users,
>>
>>     here is the error I got from the IOC running EPICS base 3.15.4:
>>
>>
>>     A call to 'assert(size <= ntohs ( pMsg->m_postsize ))'
>>         by thread 'CAS-event' failed in ../../../src/ioc/rsrv/caserverio.c line 357.
>>     Dumping a stack trace of thread 'CAS-event':
>>     [    0x7f4f3abf070b]:
>>     /data/shi/R3.15.4/base/lib/linux-x86_64/libCom.so.3.15.4(epicsStackTrace+0x4b)
>>     [    0x7f4f3abe9e1a]:
>>     /data/shi/R3.15.4/base/lib/linux-x86_64/libCom.so.3.15.4(epicsAssert+0x3a)
>>     [    0x7f4f3b0db512]:
>>     /data/shi/R3.15.4/base/lib/linux-x86_64/libdbCore.so.3.15.4(cas_commit_msg+0x92)
>>     [    0x7f4f3b0df468]:
>>     /data/shi/R3.15.4/base/lib/linux-x86_64/libdbCore.so.3.15.4(read_reply+0x1d8)
>>     [    0x7f4f3b0b82f4]:
>>     /data/shi/R3.15.4/base/lib/linux-x86_64/libdbCore.so.3.15.4(event_task+0x1d4)
>>     [    0x7f4f3abeaf5c]:
>>     /data/shi/R3.15.4/base/lib/linux-x86_64/libCom.so.3.15.4(start_routine+0xdc)
>>     [    0x7f4f38ca8e25]: /lib64/libpthread.so.0(start_thread+0xc5)
>>     [    0x7f4f3931f34d]: /lib64/libc.so.6(clone+0x6d)
>>     EPICS Release EPICS R3.15.4 $$Date$$.
>>     Local time is 2018-01-18 11:27:33.597318519 CET
>>     Please E-mail this message to the author or to tech-talk@aps.anl.gov
>>     Calling epicsThreadSuspendSelf()
>>     Thread CAS-event (0x7f4ef804c510) suspended
>>     CAS: request from 127.0.0.1:48578 => bad resource ID
>>     CAS: Request from 127.0.0.1:48578 => cmmd=2 cid=0x1ce3 type=34 count=1 postsize=0
>>     CAS: Request from 127.0.0.1:48578 =>   available=0x1058 N=0 paddr=0x7f4ec000c5e8
>>     CAS: forcing disconnect from 127.0.0.1:48578
>>     Segmentation fault (core dumped)
>>
>>
>>     Thanks,
>>     Hinko
>>
>> --
>> .. the more I see the less I believe.., AE AoR
>



--
.. the more I see the less I believe.., AE AoR

