EPICS Controls Argonne National Laboratory

Experimental Physics and
Industrial Control System

Subject: caget delays
From: Benjamin Franksen <[email protected]>
To: EPICS Core-Talk <[email protected]>
Date: Fri, 18 Dec 2015 12:44:41 +0100

We've been hit by a problem that has been reported several times on
tech-talk, most recently in 2013. The final message of that thread
identifies its cause:

  http://www.aps.anl.gov/epics/tech-talk/2013/msg00836.php

Mistakenly relying on Google search (instead of searching tech-talk
directly) we only found a thread from 2011 with no resolution. So we
debugged this again (after Mark and David already did in 2013), arriving
at the same conclusion: ca_context_destroy leads to destruction of an
object of the class ipAddrToAsciiEnginePrivate, its destructor calling
this->thread.exitWait(). Depending on your OS type, configuration, and
version, this may hang until the call to gethostbyaddr finally times
out, if the host that serves your PV does not have a DNS entry.
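
To illustrate the mechanism, here is a minimal Python sketch. It is not
the EPICS code: the class LookupEngine and the function time_close are
hypothetical names, and a sleep stands in for a gethostbyaddr call that
blocks until it times out. The point is only that waiting for a worker
thread during cleanup takes as long as the blocked call does.

```python
import threading
import time


class LookupEngine:
    """Hypothetical stand-in for the name resolution engine (not EPICS code)."""

    def __init__(self, lookup_seconds):
        self.lookup_seconds = lookup_seconds
        # The background thread starts its "lookup" right away.
        self.thread = threading.Thread(target=self._run)
        self.thread.start()

    def _run(self):
        # Assumption: sleep simulates a reverse lookup that blocks
        # until the resolver gives up.
        time.sleep(self.lookup_seconds)

    def close(self):
        # Plays the role of the wait in the destructor: the blocked
        # call cannot be interrupted, so we wait until it returns.
        self.thread.join()


def time_close(lookup_seconds):
    """Return how long close() blocks for a given simulated lookup time."""
    engine = LookupEngine(lookup_seconds)
    start = time.monotonic()
    engine.close()
    return time.monotonic() - start
```

With a resolver timeout of several seconds, the cleanup call would
block for roughly that long, which matches the delay we see in caget.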

I think we can agree that this is not how things should be. Whatever the
purpose of starting the reverse name resolution (in the background
thread) may be, there are certainly lots of CA client applications that
can live without this feature, as witnessed by caget working flawlessly
(terminating without any delays) when I comment out the call to
ca_context_destroy.

(There is, by the way, nothing in the docs suggesting that CA servers
must have a valid DNS name or else programs may hang indefinitely inside
ca_context_destroy.)

I can see three ways to move forward from here:

(1) Remove the call to ca_context_destroy from the CA utilities. I don't
like this very much: their source code should serve as demonstration of
good practice when programming a CA client and thus should include
proper cleanup of the client context.

(2) Apply more forceful OS-specific ways of getting rid of the name
resolution thread (even when it is blocked on a call to gethostbyaddr).
Doing this properly would mean adding some sort of "thread killing"
method to the epicsThread class, something which has been proposed
before and rejected for various good reasons.

(3) Let the user choose whether they want the extra features enabled by
the host name lookup, or whether they would rather ensure quick
termination of their programs or threads. This could be made
configurable by an environment variable, for instance.

I think the third solution is preferable since it is backward compatible
(no API or ABI change) and can be applied without changing the source
code of the client applications, or even recompiling them (if they are
dynamically linked).
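
A rough sketch of what such a switch could look like, written in Python
for brevity. Everything here is an assumption made for illustration:
the variable name EPICS_CA_SKIP_HOSTNAME_LOOKUP does not exist in EPICS
base, and hostname_lookup_enabled and describe_server are hypothetical
helpers. The default keeps the lookup enabled, so existing setups would
behave as before.

```python
import os

# Hypothetical variable name, chosen for this sketch only.
ENV_VAR = "EPICS_CA_SKIP_HOSTNAME_LOOKUP"


def hostname_lookup_enabled(environ=None):
    """Return True unless the user has opted out via the environment."""
    env = os.environ if environ is None else environ
    value = env.get(ENV_VAR, "").strip().lower()
    return value not in ("1", "yes", "true")


def describe_server(ip_address, resolve, environ=None):
    """Return a display name for a server.

    resolve is the (possibly slow) reverse lookup; it is only called
    when the lookup has not been disabled.
    """
    if hostname_lookup_enabled(environ):
        return resolve(ip_address)
    # Lookup disabled: fall back to the numeric address, no blocking call.
    return ip_address
```

A user who cares about quick termination would set the variable before
starting the client; everyone else would see no change.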

Cheers
Ben
-- 
"Make it so they have to reboot after every typo." ― Scott Adams

