EPICS Controls Argonne National Laboratory

Experimental Physics and
Industrial Control System


Subject: Re: Problem with CA nameserver and CA_V413 protocol
From: Andrew Johnson via Tech-talk <tech-talk at aps.anl.gov>
To: Brian Bevins <bevins at jlab.org>, "tech-talk at aps.anl.gov" <tech-talk at aps.anl.gov>
Date: Mon, 18 Sep 2023 18:19:35 -0500
Hi Brian,

Thank you for reporting this; it appears to be a very serious problem that should be addressed. I am very interested in your patches for possible use here, but I will also want them merged into Base, into PCAS (which is still supported but no longer being developed), and into the name server itself.

I am slightly surprised that we have not yet seen this issue here at APS. We run a mixture of Base versions including some IOCs on 3.14.11, and as part of installing the APS Upgrade we have moved all our IOCs to a different subnet from where our clients are running. We added a nameserver to direct the clients to the IOCs, and this seems to have worked fine so far, but we haven't tried to bring the machine up in this configuration yet.

Do your clients connect to the nameserver over TCP (by setting EPICS_CA_NAME_SERVERS) or UDP (regular broadcast), or a mixture? I think I'm able to replicate the problem here over both.

We don't have many 3.14.11 IOCs left, but this could affect a few of them, so I may need to address this urgently if we can't upgrade those IOCs before we have to start turning on the machine.

- Andrew



On 9/18/23 4:58 PM, Brian Bevins via Tech-talk wrote:
I recently had to diagnose a problem with our local CA nameserver. We use a fork of the EPICS CA nameserver, which exhibits the same problem.

The symptom was that certain clients would receive monitor updates with element counts of zero, even for scalar PVs, when the PVs were resolved through a new build of the CA nameserver. This only occurred with clients built against R3.14.12 or newer, a nameserver built against R3.14.12 or newer, and an ioc built against R3.14.11 or older. (Note that I use the term "ioc" throughout to refer to any CA server in order to avoid confusion over which server is being described.) We identified camonitor and PyEpics as two clients that were affected. Notably caget was not.

I traced the problem to a change made in CA_V413 that allows clients to use a zero element count in CA_PROTO_EVENT_ADD or CA_PROTO_READ_NOTIFY messages. Prior to 413, an element count of zero in a request was supposed to always trigger an error. 413 allows a request for zero elements, to which the server responds with the actual number of elements available. Using an element count of zero seems to have become the default when a client is communicating with an ioc that supports V413. Everything works fine when the client and ioc handle name resolution directly: the channel falls back to the lowest common denominator of CA minor protocol version.
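
As a rough illustration of the fallback described above, here is a small Python model. It is only a sketch: the function names are hypothetical and not part of libca, and the minor version 11 used in the usage note is an assumed stand-in for an older ioc.

```python
# Toy model of the CA minor-version fallback; not libca code.
CA_V413_MINOR = 13  # minor protocol revision that permits zero-count requests


def negotiated_minor(client_minor, server_minor):
    """The channel runs at the lowest common minor protocol version."""
    return min(client_minor, server_minor)


def request_count(client_minor, server_minor, native_count):
    """Element count the client puts in its subscription or read request."""
    if negotiated_minor(client_minor, server_minor) >= CA_V413_MINOR:
        return 0  # V413: ask the server to report the actual element count
    return native_count  # older protocol: an explicit count is required
```

With a direct connection, request_count(13, 11, 1) gives an explicit count of 1, so the older ioc never sees a zero-count request.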

The problem occurs when a CA nameserver built against a newer EPICS redirects newer clients to older iocs. The nameserver always answers the client with a message that correctly gives the ioc's address, but includes the nameserver's own protocol version. When the client then opens a channel with the ioc, the client has the false belief that the ioc has a later protocol version than it really does. This never seemed to create a problem for us until CA_V413, but when a 413 client connects to a <= 412 ioc it uses the zero element count, which the ioc handles by returning updates with zero elements. (This itself seems like another bug, in that for V412 and earlier an element count of zero is supposed to generate an error.)
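
The failure can be reproduced in a toy Python model. This is a sketch under stated assumptions: SearchReply and the functions below are hypothetical names, the address is made up, and the old-ioc behaviour is modelled only as observed above (a zero count in, zero elements out).

```python
from dataclasses import dataclass

CA_V413_MINOR = 13  # minor protocol revision that permits zero-count requests


@dataclass
class SearchReply:
    address: str  # where the client should connect
    minor: int    # minor protocol version carried in the reply


def client_request_count(client_minor, reply, native_count):
    """The client trusts the version in the search reply."""
    believed = min(client_minor, reply.minor)
    return 0 if believed >= CA_V413_MINOR else native_count


def ioc_update_count(ioc_minor, requested, native_count):
    """Element count in the updates the ioc sends back."""
    if ioc_minor >= CA_V413_MINOR and requested == 0:
        return native_count  # V413 ioc fills in the real count
    return min(requested, native_count)  # old ioc: zero in, zero out


def observed_count(client_minor, reply, ioc_minor, native_count=1):
    requested = client_request_count(client_minor, reply, native_count)
    return ioc_update_count(ioc_minor, requested, native_count)


# Old ioc (assumed minor 11) answering the search itself:
direct = SearchReply("10.0.0.5:5064", 11)
# New nameserver answering with the ioc's address but its own version:
via_nameserver = SearchReply("10.0.0.5:5064", 13)
```

In this model observed_count(13, direct, 11) yields one element, while observed_count(13, via_nameserver, 11) yields zero, matching the symptom.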

My fix was to create patches to libca and libcas and modify the nameserver to use them. The libca patch adds a ca_host_minor_protocol(chid) function to the CA client API that allows a client to request the minor protocol version of a connected server. The client side of the CA nameserver uses it to fetch and store the minor protocol version from each ioc. The libcas patch adds an overload of pvExistReturn() that allows the server to respond to a client with both a specified network address and a specified minor protocol version number. The server side of the CA nameserver uses this to respond to clients with both the stored address and the stored protocol version. With these changes in place the problem disappears.
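
The idea behind the fix can be sketched in Python as follows. This only models the bookkeeping; it is not the patch itself, and NameServerTable and its methods are hypothetical names. In the real nameserver the stored version would come from ca_host_minor_protocol(chid) and be returned through the new pvExistReturn() overload.

```python
# Toy model of a nameserver that remembers each ioc's minor version.
class NameServerTable:
    def __init__(self):
        self._entries = {}  # pv name -> (address, ioc minor version)

    def record(self, pv_name, address, ioc_minor):
        """Client side: store what was learned from the connected ioc."""
        self._entries[pv_name] = (address, ioc_minor)

    def search_reply(self, pv_name):
        """Server side: answer with the stored address AND stored version.

        Returns None when the PV is unknown.
        """
        return self._entries.get(pv_name)


def client_believed_minor(client_minor, reply_minor):
    """The client falls back to the lower of the two versions."""
    return min(client_minor, reply_minor)


table = NameServerTable()
table.record("demo:pv", "10.0.0.5:5064", 11)  # assumed old ioc, minor 11
address, minor = table.search_reply("demo:pv")
```

Because the reply now carries the ioc's own version, a V413 client computes client_believed_minor(13, minor) as 11 and does not send a zero element count.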

I know that pcas is basically dead, but the CA nameserver seems to still be in use. I can provide the patches against R3.15.9 if there is any interest in that.

--Brian Bevins
Jefferson Lab


-- 
Complexity is free, it's Simplicity that takes work.
