Experimental Physics and
Industrial Control System


Subject:	vxWorks network problems
From:	Dirk Zimoch <[email protected]>
To:	"[email protected]" <[email protected]>
Date:	Mon, 23 May 2011 12:06:30 +0200

Hi all,

We had some network problems over the weekend. Maybe someone knows what to do. Here is what I observed:


First incident:

An IOC (vxWorks 5.5) lost CA connectivity. All medm panels went white. Logging in on the IOC over the serial port showed that the database was still running, but there were many messages:

"rsrv: system low on network buffers - send retry in 15 seconds"

These messages made it a bit tough to debug, because they spill all over the output of any debug tool.


But what I found was: mbufShow showed 0 free buffers. Where are they?

inetstatShow showed three CA connections with full send queues. Following the foreign address entries and using netstat -tp on the client computers, I found one CA gateway and 2 Tcl/Tk clients. All had large numbers in their receive queues. At least the gateway reported in its log file that it had lost the connection to the IOC.


After restarting the gateway and killing the two clients, the IOC recovered.

This is not the first time this has happened. I have seen all types of clients cause this problem: Tcl, medm, gateway, ...

Some time ago I had increased the queue sizes on vxWorks from the default 8k to 64k. Was that a bad idea?
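
To put a number on that question, here is a rough upper bound on how much buffer space stalled clients can hold in their send queues. It is only a sketch: worstCasePinnedBytes and its parameters are hypothetical names, and it assumes each stalled connection can fill its whole send queue. The real consumption on vxWorks depends on how the buffer pool is configured, which is not covered here.

```c
#include <assert.h>
#include <stddef.h>

/* Upper bound on the bytes held in TCP send queues by clients that have
 * stopped reading. Both arguments are illustrative inputs, not values
 * read from the vxWorks network stack. */
size_t worstCasePinnedBytes(size_t stalledClients, size_t sendQueueBytes)
{
    /* Refuse inputs whose product would overflow size_t. */
    assert(stalledClients == 0 ||
           sendQueueBytes <= (size_t)-1 / stalledClients);
    return stalledClients * sendQueueBytes;
}
```

With the three stalled connections seen in the first incident, an 8k queue bounds this at 24576 bytes, while a 64k queue raises the bound to 196608 bytes. Whether that is enough to drain the pool depends on the pool size.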

How should an IOC behave if a CA client that subscribed to monitor events does not handle its input fast enough? Using up all network resources on vxWorks is not the best thing that can happen.


What could have stopped the clients from handling their input?


Second incident:

Another IOC lost CA connectivity. This time the error message was different:

CA beacon (send to "...") error was "ENOBUFS"

Again, inetstatShow showed two client connections with quite full send queues. But this time, mbufShow still showed free buffers. And killing the clients did not help!

Using a function I once got from WindRiver I found that the network interface send queue was full (size: 50 entries).


/* Print the drop count, current length and maximum length of the
   send queue of the named network interface. */
void ifQValuesShow (char *name) {
    struct ifnet *ifp;

    ifp = ifunit(name);   /* look up the interface by name */
    if (ifp == NULL) {
        printf("Could not find %s interface\n", name);
        return;
    }
    printf("%s drops = %d queue length = %d max_len = %d \n",
        name, ifp->if_snd.ifq_drops,
        ifp->if_snd.ifq_len, ifp->if_snd.ifq_maxlen);
}
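
For reading the output of ifQValuesShow: in BSD-derived stacks the interface send queue counts as full once ifq_len reaches ifq_maxlen (the IF_QFULL test), and a send attempted on a full queue is typically dropped and reported as ENOBUFS. Assuming the vxWorks 5.5 stack follows that convention, the check can be sketched as below. The struct sndQueueSnapshot and the function sndQueueIsFull are hypothetical stand-ins so the example compiles outside vxWorks. They are not part of any vxWorks header.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical copy of the three send-queue fields that ifQValuesShow
 * prints. On the target these live in ifp->if_snd. */
struct sndQueueSnapshot {
    int ifq_len;     /* packets currently queued */
    int ifq_maxlen;  /* queue capacity */
    int ifq_drops;   /* packets dropped because the queue was full */
};

/* Same comparison as the BSD IF_QFULL macro: full when len >= maxlen. */
int sndQueueIsFull(const struct sndQueueSnapshot *q)
{
    assert(q != NULL);
    return q->ifq_len >= q->ifq_maxlen;
}
```

A queue length equal to max_len (50 in the second incident) together with a growing drops counter would fit the ENOBUFS beacon errors, although it does not explain why the queue never drains.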

The only way to recover from this problem seems to be a reboot.

Any idea what went wrong here?

I can increase the queue size, but by how much? WindRiver never answers questions like "why does the network not recover?".



Dirk



ANJ, 18 Nov 2013
