Mark,
I should add that Mantis bugs 52 (R3.13) and 53 (R3.14) document a problem,
now fixed in the CA server, where a TCP send thread can hang when an IOC
goes into an ENOBUFS state. There is a message associated with this failure,
which was observed at least once at the APS. I'm not sure how often this
problem occurs in practice. In many cases users simply restart the IOC when
there is trouble, without further investigation, so it's possible that the
failure has occurred before and gone unreported, or was lumped together with
a similar benign message originating from the UDP part of the server.
Jeff
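
For illustration only, here is a minimal Python sketch of the general idea
of treating ENOBUFS as a transient condition with a bounded retry, so that a
sending thread gives up rather than hanging. This is not the CA server code
or the fix recorded in Mantis; the function name send_with_enobufs_retry and
its parameters (max_tries, delay, sleep) are hypothetical.

```python
import errno
import time


def send_with_enobufs_retry(send, payload, max_tries=5, delay=0.01,
                            sleep=time.sleep):
    """Call send(payload), retrying only when the OS reports ENOBUFS.

    Hypothetical helper: `send` is any callable that raises OSError on
    failure, for example a wrapper around socket.sendto(). The retry count
    is bounded so the caller can never block indefinitely.
    """
    for attempt in range(max_tries):
        try:
            return send(payload)
        except OSError as exc:
            # Any error other than buffer exhaustion is passed on unchanged.
            if exc.errno != errno.ENOBUFS:
                raise
            # Out of attempts: report the failure instead of waiting forever.
            if attempt == max_tries - 1:
                raise
            # Exponential backoff gives the network stack time to free buffers.
            sleep(delay * (2 ** attempt))
```

With a real UDP socket this could be called as
send_with_enobufs_retry(lambda p: sock.sendto(p, addr), data), where sock
and addr are whatever socket and destination the application already uses.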
> -----Original Message-----
> From: Mark Rivers [mailto:[email protected]]
> Sent: Wednesday, March 24, 2004 6:34 PM
> To: Jeff Hill; Andrew Johnson
> Cc: [email protected]; [email protected];
> [email protected]; smtp lanzirotti
> Subject: RE: Buffer problems
>
> Jeff,
>
> > The particular type of PPC SBC used at the SNS has been observed to
> > go into an ENOBUFS error producing state when there is a shortage in
> > the driver buffer pool. Dave Thompson has observed that an IOC can get
> > into this state by sending large ping messages. CA routinely sends UDP
> > messages with a 1472 byte payload. Here is an excerpt from his message
> > on the subject.
>
> Thanks very much, that seems to be the problem. If I do the following
> on a Unix system (must be root to do it):
> > ping -s 1500 -l 2000 x26a-vmecpu
>
> pinging my PPC vxWorks system with 2000 1500-byte packets as fast as
> possible then it immediately crashes the network on vxWorks. This is
> MVME2700 and Tornado 2.0.2. I start to get the messages:
> iocx26a> CAC: error = "S_errno_ENOBUFS" sending UDP msg to 130.199.193.255:5064
> CAC: error = "S_errno_ENOBUFS" sending UDP msg to 130.199.193.255:5064
> ...../online_notify.c: CA beacon (send to "130.199.193.255:5065") error was "S_errno_ENOBUFS"
> CAC: error = "S_errno_ENOBUFS" sending UDP msg to 130.199.193.255:5064
>
> NFS no longer works, and I have to reboot.
>
> These are exactly the symptoms the VME crates at Brookhaven were seeing
> this afternoon. I suspect someone or some virus is pounding on machines
> on their subnet, and this is crashing the vxWorks network stack.
>
> Mark