EPICS Controls Argonne National Laboratory

Experimental Physics and
Industrial Control System


Subject: Re: Problem with CA nameserver and CA_V413 protocol
From: Michael Davidsaver via Tech-talk <tech-talk at aps.anl.gov>
To: Brian Bevins <bevins at jlab.org>, Andrew Johnson <anj at aps.anl.gov>
Cc: "tech-talk at aps.anl.gov" <tech-talk at aps.anl.gov>
Date: Tue, 19 Sep 2023 07:49:52 -0700
On 9/18/23 14:58, Brian Bevins via Tech-talk wrote:
I recently had to diagnose a problem with our local CA nameserver. We use a fork of the EPICS CA nameserver, which exhibits the same problem.

I am not sure I understand what is being described.

Can you provide a packet capture of this situation?  (or situations?)



The symptom was that certain clients would receive monitor updates with element counts of zero, even for scalar PVs, when the PVs were resolved through a new build of the CA nameserver. This only occurred with clients built against R3.14.12 or newer, a nameserver built against R3.14.12 or newer, and an ioc built against R3.14.11 or older. (Note that I use the term "ioc" throughout to refer to any CA server in order to avoid confusion over which server is being described.) We identified camonitor and PyEpics as two clients that were affected. Notably caget was not.

I traced the problem to a change made in CA_V413 that allows clients to use a zero element count in a CA_PROTO_EVENT_ADD or CA_PROTO_READ_NOTIFY message. Prior to 413, an element count of zero in a request is supposed to always trigger an error. 413 allows a request for zero elements, to which the server responds with the actual number of elements available. Using an element count of zero seems to have become the default when a client is communicating with an ioc that supports V413. Everything works fine when the client and ioc handle name resolution directly: the channel falls back to the lowest common denominator of CA minor protocol version.
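The version rule described above can be modelled in a few lines of Python. This is only a toy sketch, not libca code: the function names are made up for illustration, and the only facts taken from the description are that minor version 13 (CA_V413) introduced zero-count requests and that a channel uses the lower of the two minor versions.

```python
CA_V413_MINOR = 13  # minor version that introduced zero-count requests


def negotiated_minor(client_minor: int, server_minor: int) -> int:
    """Lowest common denominator of the two CA minor protocol versions."""
    return min(client_minor, server_minor)


def request_count(client_minor: int, server_minor: int, native_count: int) -> int:
    """Element count a (hypothetical) client puts in its request.

    Zero means "send me however many elements you have" and is only
    used when both ends speak V413 or later.
    """
    if negotiated_minor(client_minor, server_minor) >= CA_V413_MINOR:
        return 0
    return native_count


def v413_server_reply_count(requested: int, native_count: int) -> int:
    """A V413 server answers a zero-count request with the actual count."""
    return native_count if requested == 0 else requested


# Both ends at V413: the client asks for 0 and gets the real count back.
assert request_count(13, 13, native_count=1) == 0
assert v413_server_reply_count(0, native_count=1) == 1

# Older server: the client falls back to an explicit count.
assert request_count(13, 12, native_count=1) == 1
```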

The problem occurs when a CA nameserver built against a newer EPICS redirects newer clients to older iocs. The nameserver always answers the client with a message that correctly gives the ioc's address, but includes the nameserver's own protocol version. When the client then opens a channel with the ioc, the client has the false belief that the ioc has a later protocol version than it really does. This never seemed to create a problem for us until CA_V413, but when a 413 client connects to a <= 412 ioc it uses the zero element count, which the ioc handles by returning updates with zero elements. (This itself seems like another bug in that for V412 and earlier an element count of zero is supposed to generate an error.)
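To make the failure mode concrete, here is a small Python simulation of it. It is a sketch under stated assumptions, not real CA code: all function names are hypothetical, the old ioc is modelled as simply echoing the requested count (the behaviour observed above, rather than the error one would expect), and minor version 12 stands in for any pre-413 ioc.

```python
CA_V413_MINOR = 13  # minor version that introduced zero-count requests


def client_request_count(client_minor, believed_server_minor, native_count):
    """Count the client requests, based on what it *believes* the server speaks."""
    if min(client_minor, believed_server_minor) >= CA_V413_MINOR:
        return 0  # V413 "give me whatever you have" request
    return native_count


def old_ioc_update_count(requested_count):
    """Pre-413 ioc as observed: it returns updates with the requested count."""
    return requested_count


def monitor_via(version_in_search_reply, client_minor=13, native_count=1):
    """Element count the client ends up seeing in monitor updates."""
    requested = client_request_count(
        client_minor, version_in_search_reply, native_count
    )
    return old_ioc_update_count(requested)


ioc_minor = 12         # assumed: any ioc older than V413
nameserver_minor = 13  # nameserver built against a newer EPICS

# Direct name resolution: the search reply carries the ioc's own version.
direct = monitor_via(ioc_minor)

# Through the nameserver: the reply carries the nameserver's version instead.
redirected = monitor_via(nameserver_minor)

print(direct, redirected)
```

With these assumptions the direct case yields one element per update for a scalar PV, while the redirected case yields zero, which matches the symptom.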

My fix was to create patches to libca and libcas and modify the nameserver to use them. The libca patch adds a ca_host_minor_protocol(chid) function to the CA client API that allows a client to request the minor protocol version of a connected server. The client side of the CA nameserver uses it to fetch and store the minor protocol version from each ioc. The libcas patch adds an overload of pvExistReturn() that allows the server to respond to a client with both a specified network address and a specified minor protocol version number. The server side of the CA nameserver uses this to respond to clients with both the stored address and the stored protocol version. With these changes in place the problem disappears.
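The shape of the fix can be sketched as a toy Python model. Nothing here calls libca or libcas: `NameserverModel`, `IocInfo` and their methods are hypothetical names, the address is a placeholder from the documentation range, and the two comments only indicate where the patched `ca_host_minor_protocol(chid)` and the new `pvExistReturn()` overload would come into play.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class IocInfo:
    address: Tuple[str, int]  # (host, port) of the ioc serving the PV
    minor_version: int        # minor protocol version reported by that ioc


class NameserverModel:
    """Toy nameserver that remembers each ioc's own minor version."""

    def __init__(self, own_minor: int) -> None:
        self.own_minor = own_minor
        self._cache: Dict[str, IocInfo] = {}

    def record_ioc(self, pv: str, address: Tuple[str, int], minor: int) -> None:
        # Client side of the nameserver: in the patch, the minor version
        # would be obtained from the connected channel via
        # ca_host_minor_protocol(chid).
        self._cache[pv] = IocInfo(address, minor)

    def search_reply(self, pv: str) -> Optional[Tuple[Tuple[str, int], int]]:
        # Server side of the nameserver: in the patch, the pvExistReturn()
        # overload would carry both values back to the searching client.
        info = self._cache.get(pv)
        if info is None:
            return None
        return info.address, info.minor_version


ns = NameserverModel(own_minor=13)
ns.record_ioc("demo:pv", ("192.0.2.10", 5064), minor=12)  # placeholder PV and address

address, minor = ns.search_reply("demo:pv")
print(address, minor)
```

The point of the design is that the client receives the ioc's version rather than `own_minor`, so its usual lowest-common-denominator logic picks pre-413 behaviour without any client-side change.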

I know that pcas is basically dead, but the CA nameserver seems to still be in use. I can provide the patches against R3.15.9 if there is any interest in that.

--Brian Bevins
Jefferson Lab



Replies:
Re: Problem with CA nameserver and CA_V413 protocol Andrew Johnson via Tech-talk
References:
Problem with CA nameserver and CA_V413 protocol Brian Bevins via Tech-talk
