Reconnects sometimes take several minutes
How many PVs are in this client? Does the client see the server’s
beacons? Was the IOC either rebooted or reconnected to the network when this
happened (i.e. should the client lib have detected a beacon anomaly)?
There have been some discussions about this recently so it’s
probably a good idea to document the gory details to core-talk.
A difference in behavior in R3.14 compared to the EPICS you have
used in the past is that unresolved PVs still get a search rate boost when
there is a beacon anomaly, but its magnitude is smaller (a smaller rate increase,
i.e. a longer search period, than in the past). In previous versions a beacon
anomaly caused a boost all the way up to the new channel search rate. In current
versions the PV gets a boost only up to a search period of typically 5.0 seconds.
So the number of channels you can connect per second in a beacon anomaly
scenario depends on several things.
o Depending on the size of the channel names you get more or fewer
search requests in a UDP frame. We will call this number N.
o The CA client maintains a private estimate of the number of
UDP search frames it can send all together at a particular instant in time w/o
swamping the buffering capacity of the {client, network, aggregate of all
servers spoken to} system. We will call this M.
o The CA client maintains a private estimate of the round trip
time based on an aggregate of all servers successfully conversed with over UDP.
If this estimate is what we see for typical LAN interactions then the beacon
anomaly search period is 5.0 seconds, which is the current value of the
beaconAnomalySearchPeriod constant. Otherwise it’s set at 2**I times the round
trip estimate, where I is a number currently computed something like the
formula below. We will call the result the effective beacon anomaly search
period, or P.
    I = log ( beaconAnomalySearchPeriod / minRoundTripEstimate ) / log ( 2.0 )
    P = (2**I) * (roundTripEstimate)
Therefore you can connect at best N requests per frame times M
frames per try divided by P seconds per try, that is N*M/P channels per
second, in a beacon anomaly reconnect scenario. This assumes that servers
respond on the first try. If they don’t then it means congestion and we are
better off waiting longer.
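To make the arithmetic above concrete, here is a rough Python sketch. It is not code from the CA client library: the function names, the sample values for N and M, and the integer_exponent switch are all assumptions made for illustration. The text above does not say whether I is truncated to an integer, so the sketch leaves that optional.

```python
import math

# 5.0 seconds, the beaconAnomalySearchPeriod constant quoted above.
BEACON_ANOMALY_SEARCH_PERIOD = 5.0


def effective_search_period(round_trip_estimate, min_round_trip_estimate,
                            integer_exponent=False):
    """Return P = (2**I) * roundTripEstimate, in seconds.

    I = log(beaconAnomalySearchPeriod / minRoundTripEstimate) / log(2.0).
    integer_exponent is a hypothetical switch: whether the real client
    truncates I is not stated in the discussion above.
    """
    exponent = (math.log(BEACON_ANOMALY_SEARCH_PERIOD / min_round_trip_estimate)
                / math.log(2.0))
    if integer_exponent:
        exponent = math.floor(exponent)
    return (2.0 ** exponent) * round_trip_estimate


def best_case_connect_rate(n, m, p):
    """Best-case channels connected per second: N * M / P.

    n: search requests per UDP frame (N)
    m: UDP frames sent per try (M)
    p: effective beacon anomaly search period in seconds (P)
    """
    return n * m / p


if __name__ == "__main__":
    # Made-up sample numbers: 1 ms round trip, 30 names per frame, 4 frames.
    p = effective_search_period(0.001, 0.001)
    print("P =", p, "seconds")
    print("best case:", best_case_connect_rate(30, 4, p), "channels/second")
```

With a LAN-like round trip equal to the minimum estimate, P comes out at about 5.0 seconds, which matches the typical case described above.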
1. Should I set EPICS_CA_BEACON_PERIOD and EPICS_CAS_BEACON_PERIOD to be
the same? (5 sec)
There is a bug in the doc. It should state that EPICS_CA_BEACON_PERIOD
is the default for EPICS_CAS_BEACON_PERIOD. I just finished fixing this
problem.
Short answer. The beacon period is purely a server
configuration. The server looks first for EPICS_CAS_BEACON_PERIOD. If it doesn’t
find that, it looks for EPICS_CA_BEACON_PERIOD. You need only set
EPICS_CAS_BEACON_PERIOD.
Long story. We have to keep EPICS_CA_BEACON_PERIOD for backwards
compatibility. In the GW we have to keep certain aspects of client
configuration separate from server configuration. We needed therefore _CAS_
versions of certain variables. Ralph asked later for EPICS_CAS_BEACON_PERIOD so
that names are consistent.
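The lookup order in the short answer can be sketched in Python as follows. This only illustrates the precedence; it is not how the server is implemented. The function name is hypothetical, and the 15.0 second fallback is an assumption about the built-in default that should be checked against your EPICS release.

```python
import os


def server_beacon_period(env=None, fallback=15.0):
    """Return the beacon period (seconds) a server would use.

    Precedence as described above: EPICS_CAS_BEACON_PERIOD first, then
    EPICS_CA_BEACON_PERIOD. The fallback value is an assumption.
    """
    env = os.environ if env is None else env
    for name in ("EPICS_CAS_BEACON_PERIOD", "EPICS_CA_BEACON_PERIOD"):
        value = env.get(name)
        if value:  # skip variables that are unset or empty
            return float(value)
    return fallback


if __name__ == "__main__":
    print(server_beacon_period({"EPICS_CA_BEACON_PERIOD": "5"}))
    print(server_beacon_period({"EPICS_CAS_BEACON_PERIOD": "3",
                                "EPICS_CA_BEACON_PERIOD": "5"}))
```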
2. Can I set EPICS_CA_CONN_TMO to 10 if the above are 5, or should I use >10?
I have always said it should be at least twice as long. If you
set it to be exactly twice then maybe once in a while the client lib will not
see a beacon and will consequently send a message to the IOC to see if it’s
still responsive, but I suspect that’s not going to be a serious network
bandwidth consumption issue. If you set it to be four times as long then maybe
the EPICS system will be slightly less likely to mark circuits as unresponsive
when there is some congestion. So you have some freedom to tune your system.
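As a small sanity check on the rule of thumb above, a system manager could compare the two settings before deploying them. The helper below is a hypothetical Python sketch, not an EPICS tool; its name and the three result labels are made up for illustration.

```python
def check_conn_tmo(beacon_period, conn_tmo):
    """Classify EPICS_CA_CONN_TMO against the beacon period (both seconds).

    Rule of thumb from the discussion above: the connection timeout
    should be at least twice the beacon period.
    """
    ratio = conn_tmo / beacon_period
    if ratio < 2.0:
        return "too short"
    if ratio == 2.0:
        return "minimum"  # works, with an occasional extra liveness check
    return "ok"


if __name__ == "__main__":
    print(check_conn_tmo(5.0, 10.0))  # the configuration asked about
    print(check_conn_tmo(5.0, 20.0))  # four times the beacon period
```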
3. What exactly does EPICS_CA_MAX_SEARCH do? Does it
tell clients to finally give up broadcasting until at least one beacon anomaly
or new beacon is detected?
In any recent version of R3.14 the CA client never stops
searching for unresolved PVs. The EPICS_CA_MAX_SEARCH_PERIOD variable determines
the upper limit that the search period reaches as the UDP search rate backs
off exponentially.
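The effect of such a cap can be pictured with a generic capped exponential back-off. The Python sketch below is not the CA client’s actual algorithm: the starting period, the doubling factor, and the function name are assumptions chosen only to show how the search interval grows and then stays at the maximum instead of searching stopping altogether.

```python
def search_periods(initial_period, max_search_period, tries):
    """Return the delay (seconds) before each of the first `tries` searches.

    The period doubles after every try and is clamped at
    max_search_period, so searching slows down but never stops.
    """
    periods = []
    period = initial_period
    for _ in range(tries):
        periods.append(min(period, max_search_period))
        period *= 2.0
    return periods


if __name__ == "__main__":
    # Hypothetical 0.5 s start with a 60 s cap.
    print(search_periods(0.5, 60.0, 9))
```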
Jeff
From: Stephen Lewis
Sent: Monday, June 01, 2009 11:02 AM
To: Jeff Hill
Subject: Re: Faster re-connect for CA clients
Reconnects sometimes take several minutes. I am the
EPICS system manager, so I am trying to set both client and server EPICS_CA
values consistently. So my questions are:
1. Should I set EPICS_CA_BEACON_PERIOD and EPICS_CAS_BEACON_PERIOD
to be the same? (5 sec)
2. Can I set EPICS_CA_CONN_TMO to 10 if the above are 5, or
should I use >10?
3. What exactly does EPICS_CA_MAX_SEARCH do? Does it
tell clients to finally give up broadcasting until at least one beacon anomaly
or new beacon is detected?
On 1 Jun, 2009, at 8:52 AM, Jeff Hill wrote:
If CA is working correctly the client should be reconnecting
promptly if it sees the server’s beacon change. How long does it take to
reconnect there?
The beacon period should be less than half of the connection
timeout, but the beacon period is typically set by the EPICS system manager(s).
From: Stephen Lewis
Sent: Friday, May 29, 2009 6:27 PM
To: Jeff Hill
Subject: Faster re-connect for CA clients
We talked at Vancouver about a new CA environment variable to shorten the
time it takes a CA client to re-connect; was it: EPICS_CA_MAX_SEARCH_PERIOD?
I was going to set it to 60 seconds. I now have EPICS_CA_CONN_TMO=10 secs and
EPICS_CA_BEACON_PERIOD=5 secs. Should I also set EPICS_CAS_BEACON_PERIOD=5 secs?