On 10/9/20 8:51 AM, Kasemir, Kay via Core-talk wrote:
For unfortunate reasons, our EPICS_CA_BEACON_PERIOD is set to 2 seconds instead of 15, and EPICS_CA_CONN_TMO=5. The idea was that clients like EDM should show disconnects after 5 seconds instead of looking at stale data for the default 30 seconds, and that IOCs with CA links should consider them disconnected after 5 seconds as well.
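For reference, the setup described above amounts to the following in the clients' and IOCs' environments (these are the standard CA environment variable names; the values are the ones quoted in the message):

```shell
# Beacon period the server advertises, in seconds (default 15).
export EPICS_CA_BEACON_PERIOD=2
# How long a client tolerates an unresponsive circuit before
# marking its channels disconnected, in seconds (default 30).
export EPICS_CA_CONN_TMO=5
```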
This seems excessive. The reduced timeout I can understand, but reducing the beacon period I'm less sure about, and 2 seconds seems excessive. Is this left over from the days when UDP beacons were used to timeout TCP connections?
IIRC, on virtual circuits that have no regular traffic from the client to the server, the C++ client implementation takes about twice the beacon period to recognize that an IOC is no longer responsive. It then sends an "are you there" over TCP, and if that doesn't get responded to within some period it will mark those channels as disconnected. Using UDP beacons this way reduces the amount of network traffic (and the corresponding server workload) that would be needed if the server sent periodic beacons over each TCP circuit.
Were you thinking that has changed? Has it?
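As I understand the behaviour described above, the liveness decision amounts to something like the following. This is a simplified illustrative sketch, not the actual C++ client code; the function name and the 5-second echo timeout are assumptions for illustration:

```python
def circuit_state(last_traffic, echo_sent, now, beacon_period, echo_timeout=5.0):
    """Decide what a CA client should do with a quiet virtual circuit.

    last_traffic: time the last message was seen from the server
    echo_sent:    time an "are you there" echo was sent, or None
    Returns one of 'ok', 'send_echo', 'disconnect'.
    """
    if echo_sent is not None:
        # Already probing: give up if the echo goes unanswered too long
        if now - echo_sent > echo_timeout:
            return 'disconnect'
        return 'ok'
    # No traffic for about two beacon periods: probe the server over TCP
    if now - last_traffic > 2 * beacon_period:
        return 'send_echo'
    return 'ok'
```

With the default 15-second beacon period, a circuit that has been silent for just over 30 seconds triggers an echo, and an unanswered echo marks the channels disconnected; with a 2-second beacon period the probe fires after only about 4 seconds of silence.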
Just one such IOC tricks the CA client into restarting the name searches for disconnected channels. Add archive setups with 4000 missing channels (why are there so many missing channels? other issue...), physics apps that look for "all BPMs" while some are currently offline, ... and you get a lot of broadcasts.
What to do?
At APS we have occasional campaigns that look at which clients are searching for names that don't connect, and we get the client owners to clean up their screens or software. If you have any CA Gateways with their server side connected to the machine network, they can tell you what your current CA search rate is; that's always worth keeping an eye on.
I would suggest trying to increase the beacon periods on your IOCs to something more reasonable; would 5 seconds be acceptable to your users instead? That should give you 10-15 seconds for disconnect notifications. Maybe now that you know the cost of aiming for 5 seconds you can persuade management to let you increase it?
I've long thought that this approach of trying to model the timing of beacons was too clever. Maybe a simpler model with a timeout at 3x the beacon period, or if the beacon count jumps by >3, then reset search timers?
The client doesn’t really try to model the timing of the beacons from each server; it just regards a significant change in the measured period as its trigger, although I’m not sure how lenient it is. It does have to adapt to a different period from each server: the PCAS used a different beacon period than the IOC for a long time, and it may still do that.
With PVXS, I use linear back-off for search retry instead of exponential, in an attempt to mitigate the effects of this sort of situation. I also have a 30 second hold-off after each beacon anomaly before another will be recognized.
The 30 second hold-off certainly sounds like a good idea that might be implementable in the Java client. I’m not sure whether the C++ CA client has anything like that in it; it may not need it.
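The linear back-off and the beacon-anomaly hold-off described above could be sketched like this. This is an illustrative sketch, not the PVXS or Java client code; the step sizes, caps, and class name are made up for the example:

```python
def linear_intervals(n, step=1.0, cap=30.0):
    """Linear back-off: the retry interval grows by a fixed step each try."""
    return [min(step * (i + 1), cap) for i in range(n)]

def exponential_intervals(n, base=0.03, cap=30.0):
    """Exponential back-off doubles each try, shown for comparison."""
    return [min(base * 2 ** i, cap) for i in range(n)]

class AnomalyHoldoff:
    """Ignore further beacon anomalies for hold_off seconds after one fires."""
    def __init__(self, hold_off=30.0):
        self.hold_off = hold_off
        self.last = None

    def trigger(self, now):
        # True means this anomaly should restart the name searches;
        # anomalies inside the hold-off window are swallowed.
        if self.last is None or now - self.last >= self.hold_off:
            self.last = now
            return True
        return False
```

The point of the linear schedule is that channels which have been disconnected for a while keep retrying at a moderate, bounded rate rather than backing off so far that a burst of anomalies resets them all to the fast end of an exponential schedule at once.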