Argonne National Laboratory

Experimental Physics and
Industrial Control System


Subject: Re: CA beacon / ENOBUFS
From: Benjamin Franksen <benjamin.franksen@helmholtz-berlin.de>
To: Andrew Johnson <anj@aps.anl.gov>
Cc: tech-talk@aps.anl.gov
Date: Fri, 5 Apr 2013 21:57:05 +0200

Hi Andrew,

thanks for clearing up the confusion in my head!

On Friday, 5 April 2013, 17:00:16, you wrote:
> On 2013-04-05 Benjamin Franksen wrote:
> > I have a strange problem with an IOC running EPICS base 3.14.8. Half an
> > hour ago, it suddenly started issuing error messages like
> >
> > ../online_notify.c: CA beacon (send to "193.149.12.255:5065") error was
> > "S_errno_ENOBUFS"
> >
> > which I now get periodically (every few seconds).
> >
> > Now, what is funny is that the IOC does not have any records on it. I
> > have a very similar IOC in the same network (same base version) that
> > /does/ have records on it and which does /not/ issue these messages.
>
> The ENOBUFS error is originating in the network stack, not in CA.

Right.

> The
> absence of records doesn't prevent iocInit() from starting the task which
> sends out beacons and is generating this message.

Of course, stupid me. It says "beacon" loud and clear. Don't know why, but I
always confuse beacons and name resolution.

> > There is probably some rogue CA client on the network that broadcasts
> > name resolution requests with a (too) high frequency (this is our
> > development network, not operation, fortunately), but I have not yet
> > managed to find out its IP address.
>
> Nope, that's not the issue at all.  The problem is that RSRV is getting an
> error from the network layer whenever it tries to send a beacon, which
> happens every 15 seconds on the IOC.

Yes, of course, I should have seen this myself. It still raises the question of
what causes network buffers to be in short supply. The IOC ran for days without
complaining; all it does is load and start an iocCore, after which I have
been running a small procedure manually from the vxWorks shell. The procedure
is small and simple; I can guarantee that it does not consume any network
buffers. And now suddenly -- even after a restart -- the system somehow uses
up all the network buffers. Since I did not change anything relevant in this
IOC, there must be something outside the IOC that broke, maybe the network
switch. I will investigate this further on Monday.

> > "casr 2" says
> >
> > Channel Access Server V4.11
> > No clients connected.
> > UDP Server:
> > UDP 193.149.12.6:38238(): User="", V4.11, 0 Channels, Priority=0
> >
> >         Task Id=0x1d31698, Socket FD=14
> >         Secs since last send 2853.75, Secs since last receive 1550.10
> >         Unprocessed request bytes=16, Undelivered response bytes=0
> >         State=up
> >         168 bytes allocated
> >
> > There are currently 1176 bytes on the server's free list
> > 7 client(s), 0 channel(s), 0 event(s) (monitors) 0 putNotify(s)
> > 0 small buffers (16384 bytes ea), and 0 jumbo buffers (16408 bytes ea)
> > The server's resource id conversion table:
> > Bucket entries in use = 0 bytes in use = 16404
> > Bucket entries/hash id - mean = 0.000000 std dev = 0.000000 max = 0
> > The server's array size limit is 16408 bytes max
> > Channel Access Address List
> > 193.149.12.255:5065
> >
> >
> > This is strange in and of itself, since it first says "No clients
> > connected" (which I expected, since there are no records) and then later
> > "7 client(s),..." which I find disturbing.
>
> You're reading the output wrong; it's telling you that there is
> space available for 7 clients on the server's free list.

Uh, thanks, that makes sense.

> > So, if anyone has an idea what might be going on here I'd be glad for an
> > explanation. BTW, I restarted the IOC which silences the error messages
> > for some time (a few minutes, maybe) after which they start again.
> >
> > How do I find rogue CA clients (if that causes the problem at all, maybe
> > something else is wrong)? Could network problems cause this? Is there
> > maybe a known bug in 3.14.8 that could explain this?
>
> You should run an inetstatShow and look at the size of the queues for all
> of the connected sockets that it reports.  Hopefully you'll find a socket
> which has lots of data queued, and that will give you a hint where to look
> further. You should also run netStackSysPoolShow and netStackDataPoolShow
> (or mbufShow if you're on a *really* old vxWorks version)

I am: 5.4.2

> which tell you
> how many network buffers of each size the stack has available, and also
> give you some other information about how often it has failed to find a
> large enough buffer.

Ok, I will do as you suggested.
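
Once the inetstatShow output is captured on a host, picking out the sockets
with large queues could be sketched roughly as below (Python). This is only
a sketch under assumptions: the column order (PCB, Proto, Recv-Q, Send-Q,
Local Address, Foreign Address) may differ between vxWorks versions, the
function name busy_sockets and the threshold are made up for illustration,
and the sample rows are hypothetical, not real output from this IOC.

```python
def busy_sockets(text, threshold=1024):
    """Return (proto, recv_q, send_q, local, foreign) for rows whose
    receive or send queue holds at least `threshold` bytes.

    Assumes whitespace-separated columns:
    PCB Proto Recv-Q Send-Q Local-Address Foreign-Address [state]
    """
    hits = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 6:
            continue
        try:
            recv_q, send_q = int(fields[2]), int(fields[3])
        except ValueError:
            continue  # header or separator line, not a socket row
        if max(recv_q, send_q) >= threshold:
            hits.append((fields[1], recv_q, send_q, fields[4], fields[5]))
    return hits


# Hypothetical sample in the assumed layout (not taken from a real system).
SAMPLE = """\
PCB      Proto Recv-Q Send-Q  Local Address      Foreign Address    (state)
-------- ----- ------ ------  ------------------ ------------------ -------
1d30f00  TCP        0      0  193.149.12.6.5064  0.0.0.0.0          LISTEN
1d31000  UDP     8192      0  0.0.0.0.5064       0.0.0.0.0
"""

if __name__ == "__main__":
    for row in busy_sockets(SAMPLE):
        print(row)
```

A socket that shows up here with a persistently full queue is the place to
look further, as suggested above.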

Thanks for your help
--
Ben Franksen
()  ascii ribbon campaign - against html e-mail
/\  www.asciiribbon.org   - against proprietary attachments
