All,
There was a complaint about allowing this sort of answer to be copied
back to the mailing list. If that is the general consensus, then in
future I will take care to reply only to the person asking this sort
of question.
Jolsen:
I would get a stack trace on tNetTask using the tt() utility from
the vxWorks shell. Running it several times from the high-priority
shell task will begin to give some clues about where the code is
consuming most of the CPU. The ifShow() utility will, on some BSPs,
provide information about the driver for the Ethernet chip on your
board. The source code for the Ethernet driver is usually provided
with vxWorks, so a skilled person might be able to track down the
problem from the stack trace. Typical paths are:
Tornado2.0\target\src\drv\netif
Tornado2.0\target\src\drv\end
You could also look in the vxWorks exploder archives, and in windsurf
on the WRS home page, for any mention of a similar problem. If you
call WRS support, be certain to give them a copy of the stack trace.
There are various other network diagnostic utilities: type "netHelp()"
from the vxWorks shell. Running "memShow()" will often provide clues
about memory pool corruption, and "checkStack()" should of course
always be run on a vxWorks system that is in trouble.
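
If it helps, the idea behind repeating tt() is just crude sampling: the
functions that keep turning up in the traces are the likely CPU hogs.
Here is a rough sketch, in Python on a workstation, of tallying the
samples once you have copied down the function names from each trace.
The input format (a list of names per trace, outermost caller first) and
every function name other than netTask are my own assumptions for
illustration; this does not parse real tt() output.

```python
from collections import Counter


def tally_frames(samples):
    """Tally function names across repeated stack-trace samples.

    samples: list of traces, each a list of function names with the
    outermost caller first (hand-transcribed; format is an assumption).
    Returns (innermost, anywhere): how often each name was the innermost
    frame, and in how many samples it appeared at all.
    """
    innermost = Counter()
    anywhere = Counter()
    for trace in samples:
        if not trace:
            continue  # skip an empty capture
        innermost[trace[-1]] += 1
        # set() so a recursive function counts once per sample
        anywhere.update(set(trace))
    return innermost, anywhere


# Illustrative data only: the example* names are made up.
samples = [
    ["netTask", "exampleRecv", "exampleCopy"],
    ["netTask", "exampleRecv", "exampleCopy"],
    ["netTask", "exampleRecv"],
    ["netTask", "exampleSend"],
]

innermost, anywhere = tally_frames(samples)
for name, count in innermost.most_common():
    print(f"{name}: innermost in {count} of {len(samples)} samples")
```

A name that dominates the innermost count is where the task was actually
executing; a name that is merely present in most samples points at the
call path leading there. With only a handful of samples this is a hint,
not a measurement.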
Jeff
> -----Original Message-----
> From: [email protected] [mailto:[email protected]]
> Sent: Thursday, February 17, 2000 6:09 PM
> To: [email protected]
> Cc: [email protected]
> Subject: NetTask problem
>
>
> Dear EPICS experts,
>
> We have an IOC (mv177) here at BaBar that has recently started suffering
> random crashes caused by the tNetTask process taking up all the cpu
> cycles. The symptom is that all channels served by that IOC become
> disconnected. While the IOC is in this state I cannot telnet to it, but I
> can login via xyplex through the serial port. Running spy I then see that
> the tNetTask is hogging cpu time. I am also able to do "casr" and look at
> the connected clients, but without a process id it is difficult to track
> down any correlation with a specific client application. Rebooting seems
> to be the only solution to this problem. The crash rate is around once
> per week.
>
> This particular cpu is required for BaBar/PEP-II operation and is normally
> rock solid. A few months ago we migrated to EPICS version 3.13.1, but that
> was well before these mysterious crashes began. All of our other 14 IOCs
> are running the same version and do not suffer this problem. The only
> relevant difference is that the problem IOC is the only one that shares a
> subnet with the PEP-II IOC (i.e., sees additional network traffic).
> However, we do not think the problem is related to the network since
> rebooting (over nfs) always gets us out of the crashed state.
>
> The IOC statistics before the last crash were:
>
> cpu load: 45%
> ca clients: 65
> ca connections: ~1500
> free memory: 5MB
>
> These numbers are typical steady-state values. So there does not seem to
> be a smoking gun event that caused the crash.
>
> Can anyone out there shed some light on what may be going on here? Are
> there any other diagnostic tools I can use to further evaluate the
> problem? Comments and suggestions are greatly appreciated.
>
> Thank you,
> Jim Olsen
> [email protected]