EPICS Controls Argonne National Laboratory

Experimental Physics and
Industrial Control System


Subject: RE: caRepeater must run before casr
From: "Jeff Hill" <[email protected]>
To: "'Dennis Nicklaus'" <[email protected]>, <[email protected]>
Cc: "'Gary Carr'" <[email protected]>, "John Faucett" <[email protected]>, "Mike Oothoudt" <[email protected]>, "Eric Bjorklund" <[email protected]>
Date: Mon, 8 Jan 2007 11:53:27 -0700
> We recently ran into a very puzzling problem here using the EPICS casr
> (channel access save restore) tool.  The problem showed up in one of two
> ways after you push the casrSave or casrRestore buttons.

On UNIX systems the caRepeater process is spawned off using a call to the
fork function to create the new process followed by a call to the exec
function to force the new process to run the caRepeater executable.
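
A minimal sketch of the fork-then-exec pattern described above, written in Python for brevity rather than the C used in EPICS base. The helper name `spawn` is hypothetical, and the Python interpreter stands in for the caRepeater executable.

```python
import os
import sys

def spawn(argv):
    """Fork, then exec argv in the child; return the child's pid to the parent."""
    pid = os.fork()
    if pid == 0:
        # Child: replace this process image with the target program.
        try:
            os.execvp(argv[0], argv)
        finally:
            # Only reached if exec failed; never fall back into the parent's code.
            os._exit(127)
    return pid

if __name__ == "__main__":
    # sys.executable is a stand-in for the caRepeater binary.
    child_pid = spawn([sys.executable, "-c", "pass"])
    _, status = os.waitpid(child_pid, 0)
    print("child exited with", os.WEXITSTATUS(status))
```

Between the fork and the exec the child still holds a copy of every descriptor the parent had open, which is why the measures below matter.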

The fork function duplicates any open file descriptors into the new
process. To avoid problems, EPICS base does the following:

o In R3.13 the CA client library closes all open files except stdin, stdout,
and stderr.

o In almost all versions of R3.14, instead of closing open files, the
"close on exec" flag is set for all sockets created by a special socket
creation function in EPICS base. This is a less intrusive approach.
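
The two approaches can be sketched roughly as follows, in Python rather than the C of EPICS base. All function names here are hypothetical and are not the EPICS API; the `max_fd` limit of 1024 is likewise an assumption. Note that Python 3.4 and later already sets the flag on descriptors it creates, so the explicit call only mirrors what C code has to do by hand.

```python
import fcntl
import os
import socket

def set_close_on_exec(fd):
    """Set FD_CLOEXEC so the descriptor is closed automatically across exec."""
    flags = fcntl.fcntl(fd, fcntl.F_GETFD)
    fcntl.fcntl(fd, fcntl.F_SETFD, flags | fcntl.FD_CLOEXEC)

def is_close_on_exec(fd):
    """Report whether FD_CLOEXEC is currently set on the descriptor."""
    return bool(fcntl.fcntl(fd, fcntl.F_GETFD) & fcntl.FD_CLOEXEC)

def create_socket_cloexec(family=socket.AF_INET, kind=socket.SOCK_DGRAM):
    """R3.14-style sketch: a socket factory that always marks close-on-exec.

    Hypothetical stand-in for the special socket creation function.
    """
    sock = socket.socket(family, kind)
    set_close_on_exec(sock.fileno())
    return sock

def close_all_but_stdio(max_fd=1024):
    """R3.13-style sketch: in the new process, close every descriptor from 3 up.

    Descriptors 0, 1, and 2 (stdin, stdout, stderr) are deliberately left open.
    """
    os.closerange(3, max_fd)
```

The flag-based version touches only the descriptors the library itself created, while the close-everything version also closes files the host application may have wanted the new process to inherit.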

Jeff

> -----Original Message-----
> From: Dennis Nicklaus [mailto:[email protected]]
> Sent: Wednesday, January 03, 2007 4:10 PM
> To: [email protected]
> Subject: caRepeater must run before casr
> 
> We recently ran into a very puzzling problem here using the EPICS casr
> (channel access save restore) tool.  The problem showed up in one of two
> ways after you push the casrSave or casrRestore buttons.
> 
> Sometimes the Tcl/Tk casr interface would give an error dialog saying,
> "error waiting for process to exit: child process lost (is SIGCHLD
> ignored or trapped?)"
> and other times it would just hang forever after you push
> casrSave/casrRestore without the error dialog (though the save/restore
> would be processed).
> 
> The short solution is that you must have caRepeater running before
> running casr.
> 
> A brief summary of the gory details: when one presses the Tk casrSave
> button, that causes tcl to exec the casave program. casave in turn
> starts carepeater if carepeater isn't already there. carepeater, in
> trying to be a nice forked process, closes all its file descriptors
> except stdin, stdout, and stderr. This is part of where the problem
> starts because the pipe open between the top-level wish (tcl) shell and
> the casave program gets dup-ed to stdout of casave, then when casave
> clones/forks off carepeater, the same stdout remains open in carepeater.
> Then when casave finishes, it's dead, but the higher level tcl is still
> trying to read() on the pipe, which is being held open by carepeater.
> This wouldn't be a problem if the high level tcl shell were getting a
> SIGCHLD from the casave process, but by sifting through trace output,
> we saw that the casave process was being started with the clone() system
> call without specifying SIGCHLD in the flags, and, as the clone() man
> page says, "If no signal is specified, then the parent process is not
> signaled when the child terminates." We don't know if this is a mistake
> in the version of tcl we have or something with the version of linux and
> TLS we happen to be running, though it happens on multiple linux kernel
> versions we have.
> 
> YMMV widely depending on your versions of unix and tcl.
> 
> I'm not suggesting anything necessarily needs to change in casr or
> caRepeater, just trying to point out a bizarre problem someone else may
> bump into along the way.
> 
> Many thanks to Ron Rechenmacher who spent many hours puzzling over this
> one.
> 
> Dennis
> 
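
The pipe behaviour in the quoted message can be reproduced with a short sketch, assuming a POSIX system. The function name `eof_arrives` is hypothetical; the first fork stands in for casave and the second for caRepeater. Redirecting the long-lived process's stdout to the null device is only an illustration of how the write end can be released, not a description of what EPICS base does.

```python
import os
import select
import time

def eof_arrives(daemon_keeps_stdout, timeout=1.0):
    """Return True if the reader sees EOF on the pipe within `timeout` seconds."""
    read_fd, write_fd = os.pipe()
    child = os.fork()
    if child == 0:
        try:
            # Stand-in for casave: its stdout is the pipe back to the parent.
            os.close(read_fd)
            os.dup2(write_fd, 1)
            os.close(write_fd)
            if os.fork() == 0:
                # Stand-in for caRepeater: outlives the process that started it.
                if not daemon_keeps_stdout:
                    devnull = os.open(os.devnull, os.O_WRONLY)
                    os.dup2(devnull, 1)  # drops the inherited pipe write end
                    os.close(devnull)
                time.sleep(2 * timeout)
        finally:
            os._exit(0)  # both forked processes leave here
    os.close(write_fd)
    os.waitpid(child, 0)  # the casave stand-in has already finished
    # The pipe turns readable at EOF only once every write end is closed.
    ready, _, _ = select.select([read_fd], [], [], timeout)
    got_eof = bool(ready) and os.read(read_fd, 1) == b""
    os.close(read_fd)
    return got_eof

if __name__ == "__main__":
    print("stdout redirected:", eof_arrives(daemon_keeps_stdout=False))
    print("stdout inherited: ", eof_arrives(daemon_keeps_stdout=True))
```

When the long-lived process keeps the inherited stdout, the reader gets no EOF even though the process it started has exited, which matches the hang seen in the wish shell.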



Replies:
Re: caRepeater must run before casr Eric Norum
References:
caRepeater must run before casr Dennis Nicklaus
