Hi Steven,
Steven M. Hartman wrote:
Dirk Zimoch wrote:
First incident:
An IOC (vxWorks 5.5) lost CA connectivity. All medm panels went white.
Logging in to the IOC over the serial port showed that the database was
still running, but there were many messages:
"rsrv: system low on network buffers - send retry in 15 seconds"
These messages made debugging a bit tough, because they spill all over
the output of any debug tool.
As you saw, mbuf starvation is usually recoverable once the client
starts behaving. Jeff's comments about queuing theory show that you
cannot protect the server completely, but you can hopefully tune your
buffer numbers to deal with most situations your network is likely to
experience. Was this a one-time event, and if so, was anything unusual
happening at the time? Which EPICS versions are the IOC and the clients running?
All clients and IOCs are running EPICS 3.14.8.2.
I tried to tune the network buffers. When I saw with inetstatShow that
the send queues filled up, I increased their size from the default 8k to
64k. This seems to have been a bad decision, because now I don't have
enough mbufs any more. But how many do I need? With 20 CA connections, I
might need more than a megabyte of mbufs, as the estimate below shows.
So I had better go back to 8k buffers.
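A back-of-envelope estimate (my own numbers; it assumes every connection
can fill its send queue and that each queued segment occupies a
2048-byte cluster, the largest data pool size in my settings below):

/* Rough worst case: 20 connections x 64 kB send queues, served
 * from 2048-byte clusters. Plain C, compiles anywhere. */
#include <stdio.h>

int main(void)
{
    const int connections   = 20;      /* CA clients */
    const int sndq_bytes    = 65536;   /* TCP_SND_SIZE_DFLT */
    const int cluster_bytes = 2048;    /* largest data cluster */

    int clusters = connections * (sndq_bytes / cluster_bytes);
    printf("%d clusters ~ %d kB of cluster memory\n",
           clusters, clusters * cluster_bytes / 1024);  /* 640 ~ 1280 kB */
    return 0;
}

With NUM_2048 at only 100 (see below), a handful of busy connections can
exhaust that pool.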
Has anyone changed TCP_SND_SIZE_DFLT or TCP_RCV_SIZE_DFLT in vxWorks?
My current settings are:
#define NUM_64 800 /* default 100 */
#define NUM_128 800 /* default 100 */
#define NUM_256 100 /* default 40 */
#define NUM_512 100 /* default 40 */
#define NUM_1024 100 /* default 25 */
#define NUM_2048 100 /* default 25 */
#define NUM_SYS_64 1024 /* default 64 */
#define NUM_SYS_128 1024 /* default 64 */
#define NUM_SYS_256 512 /* default 64 */
#define NUM_SYS_512 512 /* default 64 */
/* TCP queue sizes increased to improve array throughput */
#define TCP_SND_SIZE_DFLT 65536
#define TCP_RCV_SIZE_DFLT 65536
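By the way, a quick way to watch how close these pools come to
exhaustion (a sketch; it assumes INCLUDE_NET_SHOW is built into the
image so the netShow routines are available):

/* from the vxWorks target shell */
netStackDataPoolShow();   /* data pool usage: NUM_64 .. NUM_2048 */
netStackSysPoolShow();    /* system pool usage: NUM_SYS_64 .. NUM_SYS_512 */
inetstatShow();           /* per-socket send/receive queue fill */

The failure counters in the pool statistics show whether a pool has ever
run dry.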
Second incident:
Another IOC lost CA connectivity. This time the error message was
different:
CA beacon (send to "...") error was "ENOBUFS"
This one is similar to what SNS has seen over the years with mv2100 and
related boards with the DEC network driver. The precipitating event is a
temporary loss of the physical network layer (e.g. an unplugged network
cable to the IOC, or an edge network switch powered down). It looks like a
buggy network driver that cannot recover from this fault when there are
lots of UDP packets queuing in the outgoing buffers. This one does not
seem to be recoverable except by a reboot, so we have taken steps to
reduce the likelihood of losing the physical link. We do not see this
with other boards.
It was a user shift, so hopefully nobody had unplugged any cables.
CA search UDP packets are probably related, but the time resolution of
our network diagnostics is not good enough to tell whether the IOC
crashed because of search broadcasts or whether the clients broadcast
because the IOC crashed.
In my experience, WRS network drivers are always buggy. My dec21x40End.c
is from September 2005. Maybe I should ask WRS for a new driver.
Thanks for your reply
Dirk