Hello Emma,
I created Mantis 299 to track this issue. At this point it is difficult to
isolate the problem to a single subsystem. The assert failure in dbCa.c
initially points to a logic error in the database CA link code, or
alternatively to a race condition, possibly a data structure that is being
used after it was deleted. It might also be generalized corruption, or a
failure in another subsystem (possibly the CA client library). I am not
intimately familiar with the dbCa.c code, so this may require some time
spent looking at the sources.
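To illustrate the use-after-delete scenario mentioned above, here is a
minimal sketch in C. It is not code from dbCa.c: the type demo_link and the
demo_* functions are hypothetical, and only the field name pgetNative is
borrowed from the assert message quoted below. The sketch assumes a late
event callback can arrive after the link's buffer has been released, and
shows the callback reporting that state to the caller instead of asserting.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified stand-in for a CA link record.
 * Only the field name pgetNative comes from the reported assert. */
typedef struct demo_link {
    void  *pgetNative;  /* buffer holding the latest native-type value */
    size_t nbytes;      /* size of that buffer in bytes */
    int    updates;     /* number of values successfully stored */
} demo_link;

enum demo_status { DEMO_OK = 0, DEMO_NO_BUFFER = -1 };

/* Allocate a link together with its value buffer. */
demo_link *demo_link_create(size_t nbytes)
{
    demo_link *p = calloc(1, sizeof *p);
    if (p == NULL) return NULL;
    p->pgetNative = malloc(nbytes);
    if (p->pgetNative == NULL) { free(p); return NULL; }
    p->nbytes = nbytes;
    return p;
}

/* Simulate teardown: the buffer is released and the pointer cleared
 * while the link structure itself is still reachable by callbacks. */
void demo_link_clear(demo_link *p)
{
    assert(p != NULL);
    free(p->pgetNative);
    p->pgetNative = NULL;
    p->nbytes = 0;
}

/* Release the link and whatever buffer it still owns. */
void demo_link_destroy(demo_link *p)
{
    if (p == NULL) return;
    free(p->pgetNative);
    free(p);
}

/* Simulate a monitor update arriving for the link. An assert on
 * pgetNative would stop the calling thread at this point; this sketch
 * returns an error code instead so the caller can decide what to do. */
int demo_event_callback(demo_link *p, const void *data, size_t n)
{
    assert(p != NULL);
    if (p->pgetNative == NULL) return DEMO_NO_BUFFER;
    if (n > p->nbytes) n = p->nbytes;   /* never overrun the buffer */
    memcpy(p->pgetNative, data, n);
    p->updates++;
    return DEMO_OK;
}
```

In a real IOC the clear and the callback would run on different threads, so
a check like this would only be meaningful while holding whatever lock
protects the link. The sketch leaves locking out to keep the failure mode
itself visible.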
Have you seen this occur more than once?
If the problem is repeatable, is it possible to reproduce it with a small
database along with a well-defined recipe of external circumstances? If it
is repeatable, but not with a small database, you might also obtain further
details (a stack trace with arguments and possibly the contents of related
data structures) by building base for debugging and then attaching to the
crashed thread using the Tornado debugger.
Jeff
-----Original Message-----
From: [email protected] [mailto:[email protected]]
On Behalf Of Shepherd, EL (Emma)
Sent: Monday, September 03, 2007 10:09 AM
To: [email protected]
Subject: CAC-TCP-recv suspended
Hi all,
I have come across a problem on an R3.14.8.2 IOC that is affecting
channel access links - some records are in LINK ERROR and others have CP
links that fail to update. When we started investigating we found that
the CAC-TCP-recv task was in SUSPEND+I state, and the following messages
had been printed to the console:
BL18I-MO-IOC-01.diamond.ac.uk:1 Wed Aug 15 16:37:26 2007 CAC-TCP-recv: A call to "assert (pca->pgetNative)" failed in ../dbCa.c at 629
BL18I-MO-IOC-01.diamond.ac.uk:1 Wed Aug 15 16:37:26 2007 Current time WED AUG 15 2007 15:37:23.708349950.
BL18I-MO-IOC-01.diamond.ac.uk:1 Wed Aug 15 16:37:26 2007 EPICS Release EPICS R3.14.8.2 $R3-14-8-2$ $2006/01/06 15:55:13$.
BL18I-MO-IOC-01.diamond.ac.uk:1 Wed Aug 15 16:37:26 2007 Please E-mail this message and the output from "tt (0x1e0ff9e0)"
BL18I-MO-IOC-01.diamond.ac.uk:1 Wed Aug 15 16:37:26 2007 to the author or to [email protected]
Here is the task trace:
BL18I-MO-IOC-01 -> tt 0x1e0ff9e0
231ff8 vxTaskEntry +68 : 1e8cb6e4 ()
1e8cb754 epicsThreadPrivateGet+f8 : epicsThreadCallEntryPoint ()
1e8bd048 epicsThreadCallEntryPoint+15c: 1e88b718 (1)
1e88b718 tcpRecvThread::run(void)+990: 1e88e78c ()
1e88e78c tcpiiu::processIncoming(epicsTime const &, callbackManager &)+408: cac::executeResponse(callbackManager &, tcpiiu &, epicsTime const &, caHdrLargeArray &, char *) ()
1e87a588 cac::executeResponse(callbackManager &, tcpiiu &, epicsTime const &, caHdrLargeArray &, char *)+bc : cac::eventRespAction(callbackManager &, tcpiiu &, epicsTime const &, caHdrLargeArray const &, void *) ()
1e875fc8 cac::eventRespAction(callbackManager &, tcpiiu &, epicsTime const &, caHdrLargeArray const &, void *)+194: netSubscription::completion(epicsGuard<epicsMutex> &, cacRecycle &, unsigned int, unsigned long, void const *) ()
1e89a364 netSubscription::completion(epicsGuard<epicsMutex> &, cacRecycle &, unsigned int, unsigned long, void const *)+84 : oldSubscription::current(epicsGuard<epicsMutex> &, unsigned int, unsigned long, void const *) ()
1e855ff4 oldSubscription::current(epicsGuard<epicsMutex> &, unsigned int, unsigned long, void const *)+104: 1e815434 ()
1e8156d0 dbCaGetUnits +790: epicsAssert ()
1e8c9a5c epicsAssert +154: epicsThreadSuspendSelf ()
1e8cb010 epicsThreadSuspendSelf+2c : taskSuspend ()
value = 0 = 0x0
Any ideas what could have caused this?
Emma