Hi Kay,
> Can you confirm and clarify the following about the CA
> connection handling in case of network errors?
All of the following describes R3.14 behavior. Behavior will subtly vary
with R3.13 and also with earlier R3.14 releases.
Servers only close connections when:
A) There is a protocol violation.
B) The socket option for a TCP keep-alive disconnect kicks in. Basically,
for a TCP keep-alive disconnect the circuit must first be detected to be idle,
and subsequently be detected to be unresponsive. TCP keep-alive disconnects
are typically configured in the OS globally for all TCP circuits.
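
As a generic illustration of the socket option mentioned in (B), and not the CA server's actual code, the following minimal sketch turns on keep-alive for a single TCP socket using the standard SO_KEEPALIVE option. The helper name enable_keepalive is made up for this example. The idle time and probe timing are left at whatever the OS is configured to use.

```python
import socket

def enable_keepalive(sock):
    """Turn on TCP keep-alive probing for one socket (hypothetical helper).

    Timing (idle time before probing, probe interval, probe count) is left
    at the OS-wide defaults, as described in the text.
    """
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # Read the option back to confirm the OS accepted it.
    return sock.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0

sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
keepalive_on = enable_keepalive(sock)
sock.close()
```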
The client library detects circuit disconnects via the socket library. This
might be because of a peer disconnect, a TCP keep-alive timer disconnect, or
many other reasons. When a circuit disconnects, any channels attached to it
go to the "server needs to be located" state (see below).
The client library detects circuit unresponsiveness using these criteria:
O First, there must be no beacon from the server for EPICS_CA_CONN_TMO
seconds.
O Second, a response to an are-you-there query does not arrive within 5
seconds.
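
The two criteria can be read as a small decision function. This is only a sketch of the description above, not the client library's code: the function name, its arguments, and the state labels are invented for illustration, and the 30 second default stands in for EPICS_CA_CONN_TMO. It is stateless and does not model how a circuit recovers.

```python
def classify_circuit(beacon_age, query_age, conn_tmo=30.0, query_tmo=5.0):
    """Classify a circuit from two timers (hypothetical helper).

    beacon_age: seconds since the last beacon from the server.
    query_age:  seconds an are-you-there query has gone unanswered,
                or None if no query is outstanding.
    conn_tmo:   stands in for EPICS_CA_CONN_TMO.
    query_tmo:  the 5 second are-you-there reply window.
    """
    if beacon_age < conn_tmo:
        return "responsive"      # first criterion not met
    if query_age is None:
        return "send-query"      # beacons missing, so ask the server
    if query_age < query_tmo:
        return "waiting"         # reply may still arrive in time
    return "unresponsive"        # both criteria met

states = [
    classify_circuit(10.0, None),
    classify_circuit(31.0, None),
    classify_circuit(33.0, 2.0),
    classify_circuit(37.0, 6.0),
]
```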
The client library will keep an unresponsive (as judged by the above
criteria) circuit in a disconnected state until a response to the
"are-you-there" query comes back within 5 seconds. If the response to the
"are-you-there" query is late the library will immediately reissue a fresh
"are-you-there" request.
The application's per-channel disconnect handler is called in either of these
situations (in response to whichever of them occurs first):
1) the channel's circuit is deemed to be unresponsive
2) the channel's circuit disconnects
The application's per-channel connect handler is called in either of these
situations (in response to whichever of them occurs first):
1) the channel's circuit is deemed to be responsive
2) the channel's circuit connects
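
One way to read "whichever of them is first" is that the handler fires once per state change, so an unresponsive circuit that later disconnects does not produce a second disconnect callback. The sketch below models that reading only. The Channel class and its method names are invented for this example and are not the CA client API.

```python
class Channel:
    """Toy channel that reports connect/disconnect only on a state change."""

    def __init__(self, on_connect, on_disconnect):
        self.on_connect = on_connect
        self.on_disconnect = on_disconnect
        self.up = False  # channels start out disconnected

    def _set(self, up):
        if up == self.up:
            return  # already reported; this event is not the first
        self.up = up
        (self.on_connect if up else self.on_disconnect)()

    # Circuit events described in the text.
    def circuit_connected(self):
        self._set(True)

    def circuit_responsive(self):
        self._set(True)

    def circuit_unresponsive(self):
        self._set(False)

    def circuit_disconnected(self):
        self._set(False)

events = []
chan = Channel(lambda: events.append("connect"),
               lambda: events.append("disconnect"))
chan.circuit_connected()
chan.circuit_unresponsive()   # disconnect handler runs here
chan.circuit_disconnected()   # no second disconnect callback
chan.circuit_connected()
```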
Attempting to locate a server
-----------------------------
O In the client library there are N search buckets each with an independent
search request period. The bucket's search period is determined by two to
the power of the bucket's index times a constant. Bucket indexes start at
zero and are contiguous. When a search response arrives the channel is
removed from its search bucket and attached to a circuit for the specified
server. If there is no response to the search request after the bucket's
period expires, the channel is removed from its search bucket and moved to
the search bucket at index plus one. When a channel's search request times
out in the bucket with the slowest period it is not removed from this bucket
so that it continues to be searched for at a slow rate. The timeout of the
slowest period bucket is configurable.
O New channels are immediately moved to the search bucket with the shortest
timeout (at index zero).
O Disconnecting channels enter a cooling off state for a brief interval
prior to being moved to the search bucket with the shortest timeout (at
index zero).
O When a bucket sends a search request datagram it packs together as many
search requests for individual channels as can be made to fit in a UDP
frame.
O The number of search datagram frames sent when a search bucket's timer
expires is roughly based on the TCP slow start algorithm, and of course on
how many channels are waiting in the bucket.
O If a beacon anomaly is detected all channels with a search bucket period
greater than a medium period are moved to the search bucket with a medium
period search interval.
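
The bucket scheme above can be sketched in a few lines. This is a simplified model and not the client library's implementation: the number of buckets, the base period constant, and the medium bucket index are not given in this message, so the values used below are placeholders, and the function names are invented.

```python
def bucket_period(index, base_period):
    """Search period of a bucket: two to the power of its index times a constant."""
    return (2 ** index) * base_period

def next_bucket(index, n_buckets):
    """On timeout move to index plus one; the slowest bucket keeps its channels."""
    return min(index + 1, n_buckets - 1)

def on_beacon_anomaly(index, medium_index):
    """Channels in buckets slower than the medium one are moved to the medium bucket."""
    return min(index, medium_index)

def search_schedule(n_buckets, base_period, n_timeouts):
    """(bucket index, elapsed time) at each timeout for a channel that is never found."""
    index, elapsed, schedule = 0, 0.0, []
    for _ in range(n_timeouts):
        elapsed += bucket_period(index, base_period)
        schedule.append((index, elapsed))
        index = next_bucket(index, n_buckets)
    return schedule

# Placeholder values: 4 buckets, 1 second base period.
schedule = search_schedule(n_buckets=4, base_period=1.0, n_timeouts=6)
```

With these placeholder values the channel times out at 1, 3, 7 and 15 seconds while climbing through the buckets, and then every 8 seconds in the slowest bucket.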
Jeff
> -----Original Message-----
> From: Kay-Uwe Kasemir
> Sent: Friday, October 06, 2006 1:59 PM
> To: Jeff Hill
> Subject: CA disconnect mechanism
>
> Hi Jeff:
>
> We'll have EPICS training here at the SNS in two weeks,
> and I'm brushing some slides up.
>
> Can you confirm and clarify the following about the CA
> connection handling in case of network errors?
>
> Thanks,
> -Kay
>
> ----
>
> a) TCP connection closed by server?
> - Notify client code about problem
> - EDM screens turn "white".
> - Client sends new search requests,
> initially fast, then with exponential back-off,
> for about 8 minutes.
> Then nothing, unless a beacon anomaly wakes
> the client up to send new search requests.
>
>
> b) No response from server for 30 sec. (configurable)?
> - Client sends "Are you there?" query.
> - If no response for 5 sec, also notifies the client code,
> so EDM screens turn white,
> but TCP connection is kept open to avoid network storms.
>
> Now what?
> b.1) Server again sends data
> -> we're fully reconnected, EDM displays new data.
> b.2) Server never sends new data
> -> we'll stay in this state until the TCP connection dies?
> Or do we wake in response to beacon anomalies?
>
>