Argonne National Laboratory

Experimental Physics and
Industrial Control System


Subject: RE: caget very rarely core dumps in osdThread.c
From: "Jeff Hill" <johill@lanl.gov>
To: "'Shankar, Murali'" <mshankar@slac.stanford.edu>, <tech-talk@aps.anl.gov>
Date: Fri, 6 May 2011 16:48:40 -0600

Hi Murali,

 

> We do have symbols and all the core files point here

> 

> (gdb) bt

> #3  0x0098f2a6 in start_routine (arg=0xb7d00598) at ../../../src/libCom/osi/os/posix/osdThread.c:309

> #4  0x007423cc in start_thread () from /lib/tls/libpthread.so.0

> #5  0x005a9f0e in clone () from /lib/tls/libc.so.6

 

Based on the line number in frame #3, it appears that the initial failure is at line 309 in the “start_routine” function.

 

    status = pthread_setspecific(getpthreadInfo,arg); <=================== bad status

    checkStatusQuit(status,"pthread_setspecific","start_routine");

 

It’s interesting that pthread_setspecific is failing. Of course, one suspects a race condition in which the “once” function in the same source file has not finished running by the time the “start_thread” function is already running in a new thread, but pthread_once is supposed to prevent that.
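For reference, here is a minimal sketch of the pattern under discussion. It is not the EPICS source: once_init, info_key, start_routine_sketch and run_sketch are hypothetical names chosen for illustration. It shows a thread-specific key created under pthread_once and then used by pthread_setspecific in a new thread, with the returned status reported via strerror. POSIX documents EINVAL (invalid key) and ENOMEM as the possible failures of pthread_setspecific, so printing the status value would narrow things down.

```c
#include <pthread.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-ins; the real names in osdThread.c may differ. */
static pthread_once_t once_control = PTHREAD_ONCE_INIT;
static pthread_key_t info_key;
static int key_status = -1;   /* result of pthread_key_create, set once */

/* Runs exactly once, no matter how many threads call pthread_once. */
static void once_init(void)
{
    key_status = pthread_key_create(&info_key, NULL);
}

/* Thread entry point: make sure the key exists before using it. */
static void *start_routine_sketch(void *arg)
{
    int status;

    /* pthread_once returns only after once_init has completed. */
    status = pthread_once(&once_control, once_init);
    if (status != 0 || key_status != 0) {
        fprintf(stderr, "key setup failed: %s\n",
                strerror(status != 0 ? status : key_status));
        return NULL;
    }

    status = pthread_setspecific(info_key, arg);
    if (status != 0) {
        /* Report the reason instead of only noting a bad status. */
        fprintf(stderr, "pthread_setspecific: %s\n", strerror(status));
        return NULL;
    }
    return pthread_getspecific(info_key);
}

/* Returns 0 if the new thread stored and read back its own pointer. */
int run_sketch(void)
{
    pthread_t tid;
    int payload = 42;
    void *result = NULL;

    if (pthread_create(&tid, NULL, start_routine_sketch, &payload) != 0)
        return -1;
    if (pthread_join(tid, &result) != 0)
        return -1;
    return (result == &payload) ? 0 : 1;
}
```

If the key were used before the once function had created it, pthread_setspecific would be operating on an uninitialized key, which is the scenario the race hypothesis describes.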

 

Could you forward the output from “thread apply all bt” in gdb against the same core file?

 

Thanks,

 

Jeff
______________________________________________________
Jeffrey O. Hill           Email        johill@lanl.gov
LANL MS H820              Voice        505 665 1831
Los Alamos NM 87545 USA   FAX          505 665 5107

 

Message content: TSPA

 

With sufficient thrust, pigs fly just fine. However, this is

not necessarily a good idea. It is hard to be sure where they

are going to land, and it could be dangerous sitting under them

as they fly overhead. -- RFC 1925

 

From: tech-talk-bounces@aps.anl.gov [mailto:tech-talk-bounces@aps.anl.gov] On Behalf Of Shankar, Murali
Sent: Friday, May 06, 2011 3:04 PM
To: tech-talk@aps.anl.gov
Subject: caget very rarely core dumps in osdThread.c

 

As part of tracking down failures elsewhere, we ran into some core dumps of caget from EPICS base version R3-14-10 on Linux.  These were caget runs for different PVs at different points in time.

Apr 27 07:46 core.21238 - caget GTW4:MEM:CHK

Apr 27 13:54 core.25030 - caget MCCELOG:MEM:CHK

Apr 30 14:30 core.26546 - caget PROD03:MEM:CHK

May  3 13:27 core.12026 - caget DAEMON4:DISK:CHK

May  5 13:20 core.4340 - caget PROD01:PROC:CHK

 

We do have symbols and all the core files point here

(gdb) bt

#0  0x0098f9b1 in createImplicit () at ../../../src/libCom/osi/os/posix/osdThread.c:466

#1  0x00990064 in epicsThreadGetIdSelf () at ../../../src/libCom/osi/os/posix/osdThread.c:618

#2  0x009828b2 in cantProceed (msg=0x9a0a8c "start_routine") at ../../../src/libCom/misc/cantProceed.c:63

#3  0x0098f2a6 in start_routine (arg=0xb7d00598) at ../../../src/libCom/osi/os/posix/osdThread.c:309

#4  0x007423cc in start_thread () from /lib/tls/libpthread.so.0

#5  0x005a9f0e in clone () from /lib/tls/libc.so.6

 

 

The createImplicit function has not changed from R3-14-10 to R3-14-12.  This is the relevant line of code.

 

        pthreadInfo->osiPriority =

                 (param.sched_priority - pcommonAttr->minPriority) * 100.0 /

                    (pcommonAttr->maxPriority - pcommonAttr->minPriority + 1);
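To make the arithmetic on that line concrete, here is a small standalone sketch of the same scaling. The struct common_attr_sketch and the function scale_priority are hypothetical names, not part of EPICS base, and the output type is an assumption. The expression maps a POSIX scheduling priority in [minPriority, maxPriority] onto a 0 to 99 range. The sketch also checks the pointer before dereferencing it, since the original line reads through both pthreadInfo and pcommonAttr; whether either pointer is actually invalid in these core dumps is not established here.

```c
#include <stddef.h>

/* Hypothetical stand-in for the attribute block read via pcommonAttr. */
typedef struct {
    int minPriority;
    int maxPriority;
} common_attr_sketch;

/*
 * Scale a POSIX priority onto 0..99 with the same expression as the
 * quoted line. Returns 0 on success, -1 if an input is unusable.
 */
int scale_priority(const common_attr_sketch *attr, int sched_priority,
                   int *osi_priority)
{
    int span;

    if (attr == NULL || osi_priority == NULL)
        return -1;                      /* would fault if dereferenced */

    span = attr->maxPriority - attr->minPriority + 1;
    if (span <= 0)
        return -1;                      /* inconsistent priority limits */

    /* Floating-point division, then truncation toward zero. */
    *osi_priority =
        (int)((sched_priority - attr->minPriority) * 100.0 / span);
    return 0;
}
```

With limits of 1 and 99, for example, a scheduling priority of 1 maps to 0 and a scheduling priority of 99 maps to 98.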

 

 

Is there anything more we can do to figure out the root cause?

 

Regards,

Murali

 

