EPICS Controls Argonne National Laboratory

Experimental Physics and
Industrial Control System


Subject: RE: unhandled exception in timerQueue
From: "Jeff Hill" <[email protected]>
To: <[email protected]>
Cc: [email protected]
Date: Fri, 20 Jun 2008 09:47:52 -0600
> The error code listed in the task list, means S_objLib_OBJ_UNAVAILABLE

This is probably supporting evidence for problems with an invalid semaphore
id.
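
For reference, the ERRNO value 3d0002 in the quoted task listing can be
split by hand. The following is only a minimal sketch, assuming the usual
vxWorks convention that a status word carries a module number in its upper
16 bits and a module-specific error number in its lower 16 bits. The names
VxStatus and splitVxStatus are made up for illustration, and the reading of
module 0x3d as objLib is taken from the quoted report, not checked against
the headers here.

```cpp
#include <cstdint>

// Illustrative type, not part of vxWorks or EPICS.
struct VxStatus {
    std::uint32_t module;  // upper 16 bits: module number
    std::uint32_t code;    // lower 16 bits: error number within that module
};

// Split a status word such as the ERRNO column of the "i" task listing.
constexpr VxStatus splitVxStatus(std::uint32_t status) {
    return VxStatus{ status >> 16, status & 0xffffu };
}

// 0x3d0002 -> module 0x3d, error number 2 within that module.
static_assert(splitVxStatus(0x3d0002u).module == 0x3du, "module number");
static_assert(splitVxStatus(0x3d0002u).code == 2u, "error number");
```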

> 231ff8 vxTaskEntry    +68 : 1ea3ad00 ()
> 1ea3ad70 epicsThreadPrivateGet+f8 : 1ea2c94c ()
> 1ea2c94c epicsThreadCallEntryPoint+444: __cp_exception_info ()
> 162680 __cp_exception_info+0  : __default_unexpected(void) ()
> 16263c set_terminate(void (*)(void))+0  : terminate(void) ()
> 16262c __default_unexpected(void)+0  : cplusTerminate(void) ()
> 153c50 cplusTerminate(void)+50 : taskSuspend ()

This confirms that the epicsThread last chance exception catch ran, and that
it called the unexpected exception handler.
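
To make the idea of a "last chance" catch concrete, here is a rough sketch
of the pattern in portable C++. It is not the epicsThread source; the
function name callEntryPointGuarded and the message strings are invented
for this example. The point is only that the thread entry wrapper catches
anything the thread body lets escape and reports it, rather than letting
the exception propagate out of the thread.

```cpp
#include <exception>
#include <functional>
#include <stdexcept>
#include <string>

// Hypothetical entry-point wrapper: runs the thread body and turns any
// escaped exception into a diagnostic string instead of propagating it.
std::string callEntryPointGuarded(const std::function<void()>& body) {
    try {
        body();
        return "ok";
    } catch (const std::exception& e) {
        // Exceptions derived from std::exception can describe themselves.
        return std::string("unexpected C++ exception: ") + e.what();
    } catch (...) {
        // Anything else is still caught, but carries no description.
        return "unexpected C++ exception of unknown type";
    }
}
```

What the wrapper does after reporting (suspend the task, terminate, or
return) is a policy decision that this sketch leaves out.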

> I noticed that there are two more pending timerQueue tasks, which might
> explain why everything still works.

When timers need independent, prioritized execution, independent timer
queue threads are used. So it's normal to have more than one timer queue
thread, but if one of them dies you have almost certainly lost some
important component of the system.

Since the timer queue class has proper coordinated shutdown of its thread in
its destructor I expect that the only way that its event semaphore id could
be seen by its thread to be invalid would be if another thread (or one of
its timer callbacks) maliciously stomped on it.
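
By "coordinated shutdown" I mean the general pattern sketched below, shown
with standard C++ primitives instead of the EPICS classes. This is an
illustration under that assumption, not the timer queue implementation, and
the class name CoordinatedWorker is made up. The destructor asks the thread
to exit and waits for it, so the synchronization objects are never
destroyed while the thread can still touch them.

```cpp
#include <atomic>
#include <condition_variable>
#include <mutex>
#include <thread>

// Illustrative class: owns a thread that sleeps until told to exit.
class CoordinatedWorker {
public:
    explicit CoordinatedWorker(std::atomic<bool>& exitedFlag)
        : exited_(exitedFlag), thread_([this] { run(); }) {}

    ~CoordinatedWorker() {
        {
            std::lock_guard<std::mutex> guard(mutex_);
            exitRequested_ = true;   // ask the thread to leave its loop
        }
        wakeup_.notify_one();
        thread_.join();              // wait before any member is destroyed
    }

    CoordinatedWorker(const CoordinatedWorker&) = delete;
    CoordinatedWorker& operator=(const CoordinatedWorker&) = delete;

private:
    void run() {
        std::unique_lock<std::mutex> lock(mutex_);
        while (!exitRequested_) {
            wakeup_.wait(lock);      // stands in for waiting on the event
        }
        exited_.store(true);         // lets the owner observe the exit
    }

    std::mutex mutex_;
    std::condition_variable wakeup_;
    bool exitRequested_ = false;
    std::atomic<bool>& exited_;
    std::thread thread_;  // declared last: starts after the members above exist
};
```

With this ordering the thread can only see an invalid synchronization
object if something outside the class overwrites it.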

One way of debugging such situations is to set a breakpoint that triggers
on a write to the memory location where the semaphore id is stored (vxWorks
supports this special type of breakpoint if your processor supports "break
on address access"). It will be tricky to learn the proper address when
debugging in the target shell, so I would recommend building the system for
debugging and attaching to it with the source code debugger.

If your processor doesn't support "break on address access" features then
you can also try suspending some of the threads in the system until you
figure out which thread is the culprit. I like a divide and conquer
approach (i.e. suspending half of the threads, and then half again of what
remains, until you have found the guilty thread). Of course it always helps
to turn up the knobs that cause the problem to be reproduced more quickly.
Usually this is done by artificially increasing processing or interrupt
rates until nearly all of the CPU is used.
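
The divide and conquer search can be written down as a small sketch. It
assumes a single guilty thread and a reproducible fault; findCulprit and
the faultStillOccurs callback are hypothetical names. The callback stands
for the manual experiment of suspending the listed tasks, running the
system under load, and reporting whether the corruption still happens.

```cpp
#include <functional>
#include <string>
#include <vector>

using TaskList = std::vector<std::string>;

// Narrow a list of suspect tasks down to one by repeated halving.
// faultStillOccurs(suspended) must return true if the fault is still seen
// while the tasks in `suspended` are held suspended.
std::string findCulprit(TaskList suspects,
                        const std::function<bool(const TaskList&)>& faultStillOccurs) {
    while (suspects.size() > 1) {
        auto middle = suspects.begin() + suspects.size() / 2;
        TaskList firstHalf(suspects.begin(), middle);
        TaskList secondHalf(middle, suspects.end());
        if (faultStillOccurs(firstHalf)) {
            suspects = secondHalf;   // culprit was still running
        } else {
            suspects = firstHalf;    // suspending this half stopped the fault
        }
    }
    return suspects.empty() ? std::string() : suspects.front();
}
```

Each round halves the suspect list, so about log2(N) experiments are needed
for N threads, provided suspending the innocent half leaves the system able
to run at all.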

Jeff

> -----Original Message-----
> From: Matthew Pearson [mailto:[email protected]]
> Sent: Friday, June 20, 2008 5:31 AM
> To: Jeff Hill
> Cc: 'Andrew Johnson'; [email protected]
> Subject: RE: unhandled exception in timerQueue
> 
> Hi,
> 
> Sorry for the delay in responding, first beam got in the way...
> 
> I noticed that there are two more pending timerQueue tasks, which might
> explain why everything still works.
> 
> The stack trace for the suspended process is:
> 
> 231ff8 vxTaskEntry    +68 : 1ea3ad00 ()
> 1ea3ad70 epicsThreadPrivateGet+f8 : 1ea2c94c ()
> 1ea2c94c epicsThreadCallEntryPoint+444: __cp_exception_info ()
> 162680 __cp_exception_info+0  : __default_unexpected(void) ()
> 16263c set_terminate(void (*)(void))+0  : terminate(void) ()
> 16262c __default_unexpected(void)+0  : cplusTerminate(void) ()
> 153c50 cplusTerminate(void)+50 : taskSuspend ()
> 
> The error code listed in the task list, means S_objLib_OBJ_UNAVAILABLE
> 
> the task information is:
> 
>   NAME        ENTRY       TID    PRI   STATUS      PC       SP     ERRNO  DELAY
> ---------- ------------ -------- --- ---------- -------- -------- ------- -----
> timerQueue 1ea3ad00     1e0ab530 148 SUSPEND      22bbf4 1e0ab090  3d0002     0
> 
> stack: base 0x1e0ab530  end 0x1e0a8650  size 11712  high 3184   margin 8528
> 
> options: 0xc
> VX_DEALLOC_STACK    VX_FP_TASK
> 
> VxWorks Events
> --------------
> Events Pended on    : Not Pended
> Received Events     : 0x0
> Options             : N/A
> 
> r0     =   6acfc1   sp     = 1e0ab090   r2     =        0   r3     =        0
> r4     = 1e0ab0a0   r5     = 1e0aafa0   r6     = 1e0ab030   r7     = 1e0aaf90
> r8     = 1e0ab020   r9     = 2bfc8f3b   r10    = 1ea725f0   r11    = 43300000
> r12    = 1eb27ed0   r13    =        0   r14    =        0   r15    =        0
> r16    =        0   r17    =        0   r18    =        0   r19    =        0
> r20    =        0   r21    =        0   r22    =        0   r23    =        0
> r24    =        0   r25    =        0   r26    = 1ea7a5b8   r27    = 1ea330dc
> r28    = 1ea48068   r29    = 1ea324ac   r30    = 1e0ab530   r31    = 1e0ab530
> msr    =     b030   lr     = 1ea484ec   ctr    =   19ac34   pc     =   22bbf4
> cr     = 20000049   xer    =        0
> 
> fpcsr  = b2002100
> fr0    =        0   fr1    =        0   fr2    =      NaN   fr3    =      NaN
> fr4    =      NaN   fr5    =      NaN   fr6    =      NaN   fr7    =        0
> fr8    =        0   fr9    =        0   fr10   =        0   fr11   =        0
> fr12   =        0   fr13   = 4.5036e+15   fr14   =      NaN   fr15   =      NaN
> fr16   =      NaN   fr17   =      NaN   fr18   =      NaN   fr19   =      NaN
> fr20   =      NaN   fr21   =      NaN   fr22   =      NaN   fr23   =      NaN
> fr24   =      NaN   fr25   =      NaN   fr26   =      NaN   fr27   =      NaN
> fr28   =      NaN   fr29   =      NaN   fr30   =      NaN   fr31   =      NaN
> 
> 
> On Tue, 2008-06-17 at 09:31 -0600, Jeff Hill wrote:
> > Hello Matthew,
> >
> > My best guess (subject to change of course :-) is that some tangential
> > code is corrupting the semaphore id used by a timer queue thread.
> >
> > > epicsThread: Unexpected C++ exception "epicsEvent::invalidSemaphore()"
> >
> > This is coming from the last chance exception handler in the epicsThread
> > class. It is not a message from the dreaded unexpected exception handler
> > in the CRTL (associated with violated exception specifications). I will
> > fix the message emitted by epicsThread so that this is clear in the
> > future.
> >
> > > > interrupt: Main interrupt with no cause
> > > > interrupt: Main interrupt with no cause
> >
> > This output might be supporting evidence for the corruption theory.
> >
> > Please try also the vxWorks shell "tt" command for the suspended thread.
> > That may, or may not, provide some useful information.
> >
> > Jeff
> >
> > > -----Original Message-----
> > > From: [email protected] [mailto:tech-talk-
> [email protected]]
> > > On Behalf Of Matthew Pearson
> > > Sent: Tuesday, June 17, 2008 6:52 AM
> > > To: Andrew Johnson
> > > Cc: [email protected]
> > > Subject: Re: unhandled exception in timerQueue
> > >
> > > Update:
> > >
> > > I've just seen the same again (after a reboot, and a couple of
> > > hours)... but the IOC shell is fine.
> > >
> > > 0x1e0ab530 (timerQueue): Unhandled C++ exception resulted in call to
> > > terminate
> > > epicsThread: Unexpected C++ exception "epicsEvent::invalidSemaphore()"
> > > with type "Q210epicsEvent16invalidSemaphore" in thread "timerQueue" at
> > > TUE JUN 17 2008 11:50:26.620275542
> > >
> > > I also see:
> > >
> > > timerQueue 1ea3ad00     1e0ab530 148 SUSPEND      22bbf4 1e0ab090
> > > 3d0002     0
> > >
> > >
> > > Regards,
> > > Matthew
> > >
> > > On Tue, 2008-06-17 at 13:46 +0100, Matthew Pearson wrote:
> > > > Hi Andrew / Tech-Talk,
> > > >
> > > > I've just seen the following printout in my vxWorks IOC shell (EPICS
> > > > 3.14.8.2):
> > > >
> > > > 0x1e0abd20 (timerQueue): Unhandled C++ exception resulted in call to
> > > > terminate
> > > > epicsThread: Unexpected C++ exception "epicsEvent::invalidSemaphore()"
> > > > with type "Q210epicsEvent16invalidSemaphore" in thread "timerQueue" at
> > > > FRI JUN 13 2008 16:51:01.488894248
> > > > 0x1e03e550 (timerQueue): Unhandled C++ exception resulted in call to
> > > > terminate
> > > > epicsThread: Unexpected C++ exception "epicsEvent::invalidSemaphore()"
> > > > with type "Q210epicsEvent16invalidSemaphore" in thread "timerQueue" at
> > > > FRI JUN 13 2008 16:56:38.279153464
> > > > interrupt: Main interrupt with no cause
> > > > interrupt: Main interrupt with no cause
> > > >
> > > >
> > > > It looks like a Q210epicsEvent16invalidSemaphore exception is thrown
> > > > from an epicsThread. But this seems to be unexpected, and unexpected()
> > > > is called, which seems to exit the IOC shell.
> > > >
> > > > Nick Rees suspects that this was discussed a while ago, and may have
> > > > been dealt with in a later version than 3.14.8.2. Any ideas?
> > > >
> > > > Regards,
> > > > Matthew Pearson
> > > >
> > > > DLS Controls
> > > >
> > > >
> > > >
> > > This e-mail and any attachments may contain confidential, copyright
> > > and or privileged material, and are for the use of the intended
> > > addressee only. If you are not the intended addressee or an authorised
> > > recipient of the addressee please notify us of receipt by returning
> > > the e-mail and do not use, copy, retain, distribute or disclose the
> > > information in or attached to the e-mail.
> > > Any opinions expressed within this e-mail are those of the individual
> > > and not necessarily of Diamond Light Source Ltd.
> > > Diamond Light Source Ltd. cannot guarantee that this e-mail or any
> > > attachments are free from viruses and we cannot accept liability for
> > > any damage which you may sustain as a result of software viruses which
> > > may be transmitted in or with the message.
> > > Diamond Light Source Limited (company no. 4375679). Registered in
> > > England and Wales with its registered office at Diamond House, Harwell
> > > Science and Innovation Campus, Didcot, Oxfordshire, OX11 0DE, United
> > > Kingdom
> >

