Hello,
at the Helmholtz-Zentrum Berlin we have encountered a problem with
EPICS Base 3.15.8 built on recent versions of Linux (Debian 9,
Fedora 33). Newer versions of EPICS Base may be affected as well.
Under certain conditions, programs using EPICS Base, especially the
channel access gateway, take a long time to answer requests.
The conditions that cause the problem
The following conditions must all be met:
- The application must contain both a channel access server and a
channel access client.
- The channel access client must be configured via
EPICS_CA_ADDR_LIST to contact a number of IP addresses on the local
network directly.
- Many of the hosts listed in EPICS_CA_ADDR_LIST are down or do not
exist.
- The channel access client must constantly try to resolve many PVs
by sending requests to the hosts from EPICS_CA_ADDR_LIST.
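
To give a feeling for how many unicasts these conditions generate, here is a minimal sketch. It assumes that every UDP search frame is sent once to each entry of the whitespace-separated EPICS_CA_ADDR_LIST; the function names and the frame count are hypothetical and are not taken from EPICS Base.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Split a whitespace-separated address list (the format of
// EPICS_CA_ADDR_LIST) into its individual entries.
std::vector<std::string> splitAddrList(const std::string& list) {
    std::vector<std::string> entries;
    std::istringstream in(list);
    std::string entry;
    while (in >> entry) {
        entries.push_back(entry);
    }
    return entries;
}

// Assumption: each search frame is unicast once to every listed address,
// whether or not a host answers there. framesPerRound is a made-up
// parameter for illustration.
std::size_t sendtoCallsPerRound(const std::string& list,
                                std::size_t framesPerRound) {
    return splitAddrList(list).size() * framesPerRound;
}
```

With 50 listed addresses and 5 search frames per round this gives 250 calls to sendto per round, most of them directed at hosts that never answer if the list is largely stale.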
The symptoms of the problem
The channel access server answers client requests with large delays.
Establishing a new connection takes 2 seconds or more instead of the
usual 0.05 seconds.
Monitor events are not posted immediately. They stop for a few
seconds, and then all the missed events are posted at almost the same
time.
A setup to reproduce the problem
I have assembled some scripts to reproduce the problem on a Linux
system. The scripts, together with a README.rst file that describes
how to run the test, can be downloaded here:
https://www-csr.bessy.de/tmp/gatewaytest.tar.gz
Causes of the problem
The problem can be traced to code in the file
src/ca/client/udpiiu.cpp in EPICS Base, where the function "sendto"
is used to send UDP unicasts. If this function is called many times
with destination IP addresses on the local network that do not exist,
some UDP buffer in Linux fills up and eventually sendto blocks for a
few seconds. Then the buffer is (probably) flushed and the next few
calls to sendto no longer block. After some time, however, the buffer
fills up again and sendto blocks again.
This part of EPICS Base does not expect sendto to block. While the
call is blocked, the channel access server code cannot answer
requests.
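
The blocking can be observed outside of EPICS with a small timing loop. The following is only a sketch under assumptions: the function name, the payload size and the way the result is reported are made up for illustration, and whether a call really blocks depends on the kernel and on the network.

```cpp
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

#include <chrono>

// Send `count` small UDP datagrams to ip:port with a blocking sendto()
// and return the longest time in milliseconds that a single call took.
// Returns -1.0 if the address is invalid or no socket can be created.
double longestSendtoMillis(const char* ip, unsigned short port, int count) {
    sockaddr_in dest{};
    dest.sin_family = AF_INET;
    dest.sin_port = htons(port);
    if (inet_pton(AF_INET, ip, &dest.sin_addr) != 1) {
        return -1.0;
    }
    int sock = socket(AF_INET, SOCK_DGRAM, 0);
    if (sock < 0) {
        return -1.0;
    }
    const char payload[16] = {0};
    double longest = 0.0;
    for (int i = 0; i < count; ++i) {
        auto start = std::chrono::steady_clock::now();
        // flags == 0: the call is allowed to block. The return value is
        // ignored on purpose, only the duration of the call matters here.
        (void)sendto(sock, payload, sizeof payload, 0,
                     reinterpret_cast<const sockaddr*>(&dest), sizeof dest);
        auto stop = std::chrono::steady_clock::now();
        double ms =
            std::chrono::duration<double, std::milli>(stop - start).count();
        if (ms > longest) {
            longest = ms;
        }
    }
    close(sock);
    return longest;
}
```

Calling this with an unused address of the local subnet and port 5064 (the default channel access server port) and a few thousand iterations should show whether single calls take seconds on a given system.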
Resolution of the problem
We pass the "MSG_DONTWAIT" flag to the "sendto" call. With this flag,
sendto never blocks, even if the internal UDP buffers are full.
I have attached a patch file for this to this e-mail.
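
Independent of the patch itself, the idea can be sketched as a small helper. With MSG_DONTWAIT, sendto returns -1 and sets errno to EAGAIN or EWOULDBLOCK when it would otherwise block, so the caller has to treat that case as a dropped datagram. The helper name and the result enum are hypothetical and do not exist in EPICS Base.

```cpp
#include <netinet/in.h>
#include <sys/socket.h>
#include <sys/types.h>

#include <cerrno>
#include <cstddef>

enum class SendResult { Sent, WouldBlock, Failed };

// Try to send one UDP datagram without ever blocking the calling thread.
SendResult sendNoWait(int sock, const void* buf, std::size_t len,
                      const sockaddr_in& dest) {
    while (true) {
        ssize_t n = sendto(sock, buf, len, MSG_DONTWAIT,
                           reinterpret_cast<const sockaddr*>(&dest),
                           sizeof dest);
        if (n == static_cast<ssize_t>(len)) {
            return SendResult::Sent;
        }
        if (n < 0 && errno == EINTR) {
            continue;  // interrupted by a signal, try again
        }
        if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK)) {
            // The send buffer is full: drop the datagram instead of waiting.
            return SendResult::WouldBlock;
        }
        return SendResult::Failed;  // any other error or a short count
    }
}
```

Dropping a datagram in the WouldBlock case should be acceptable for name resolution, assuming the client repeats unanswered search requests later anyway.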
Finally...
Could you please check whether you have this problem, too?
Maybe you could add my patch to EPICS Base?
Greetings
Goetz Pfeiffer (control system department, Helmholtz-Zentrum
Berlin)
diff -r ebdbc82f5ca0 src/ca/client/udpiiu.cpp
--- a/src/ca/client/udpiiu.cpp Fri Jun 04 10:38:49 2021 +0200
+++ b/src/ca/client/udpiiu.cpp Fri Jun 04 10:44:08 2021 +0200
@@ -942,7 +942,23 @@
int bufSizeAsInt = static_cast < int > ( bufSize );
while ( true ) {
// This const_cast is needed for vxWorks:
- int status = sendto ( _udpiiu.sock, const_cast<char *>(pBuf), bufSizeAsInt, 0,
+ int status = sendto ( _udpiiu.sock, const_cast<char *>(pBuf), bufSizeAsInt,
+#ifndef __linux__
+ 0,
+#else
+ /* On modern Linux systems, when sendto() is used to do UDP
+ * unicasts, it blocks if the destination host is down and the
+ * internal UDP send buffer is filled. This lasts up to 2
+ * seconds.
+ * However, EPICS Base doesn't expect this call to block. If
+ * this happens, other things like answering name resolution
+ * requests or firing up monitors are also delayed.
+ * In order to avoid these problems, we provide the
+ * MSG_DONTWAIT flag, when EPICS Base is compiled for Linux.
+ * This means that sendto() always returns immediately.
+ */
+ MSG_DONTWAIT,
+#endif
& _destAddr.sa, sizeof ( _destAddr.sa ) );
if ( status == bufSizeAsInt ) {
break;