EPICS Controls Argonne National Laboratory

Experimental Physics and
Industrial Control System


Subject: Re: EPICS device disconnects and reconnects
From: "Pearson, Matthew R." <[email protected]>
To: Kate Feng <[email protected]>
Cc: "[email protected]" <[email protected]>
Date: Mon, 17 Jun 2013 17:13:34 -0400
Hi

Don't we get a WRITE/INVALID alarm back if we process the Acquire record with the port disconnected? Or at least we could, if the driver supports it and returns an error from writeInt32.
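
As a rough illustration of that behaviour, here is a minimal Python model. It is not asyn or EPICS base code: `ToyDriver`, `ToyOutputRecord`, and `PortDisconnected` are hypothetical names invented for this sketch. It only assumes that a failed write from the driver is turned into STAT=WRITE, SEVR=INVALID on the record.

```python
from dataclasses import dataclass


class PortDisconnected(Exception):
    """Raised by the toy driver when its port is not connected (hypothetical)."""


class ToyDriver:
    """Hypothetical stand-in for a port driver with a writeInt32-like method."""

    def __init__(self):
        self.connected = True
        self.acquire = 0

    def write_int32(self, value):
        # Refuse the write instead of silently accepting it while disconnected.
        if not self.connected:
            raise PortDisconnected("port is disconnected")
        self.acquire = value


@dataclass
class ToyOutputRecord:
    """Hypothetical output record holding a value plus alarm status/severity."""

    val: int = 0
    stat: str = "NO_ALARM"
    sevr: str = "NO_ALARM"

    def process(self, driver):
        # A failed write puts the record into WRITE/INVALID alarm;
        # a successful write clears the alarm again.
        try:
            driver.write_int32(self.val)
        except PortDisconnected:
            self.stat, self.sevr = "WRITE", "INVALID"
        else:
            self.stat, self.sevr = "NO_ALARM", "NO_ALARM"


driver = ToyDriver()
acquire = ToyOutputRecord(val=1)

driver.connected = False
acquire.process(driver)
print(acquire.stat, acquire.sevr)
```

With the port disconnected the record ends up in alarm and the driver's acquire value is left unchanged, which is what an operator screen would need in order to flag the failed "start".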


Mark, your solution 3) sounds good. We have the same problem with motor drivers at the moment. Although, to make the same changes to asyn motor support, we'd have to be careful not to move motors when someone powers up a controller and asyn connects to it.

To override the default PINI setting for Acquire, rather than use dbpf, another way would be to instantiate an extra database that just sets that field, like:

record(busy, "${P}${R}Acquire")
{
   field(PINI, "1")
}

or, in the ADBase.template, use a macro with a default value so that people can override it if they wish:

field(PINI, "$(PINI=0)")
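
To show what the default-value macro does, here is a small Python model of the substitution. It is only a sketch of the `$(NAME=default)` rule, not the EPICS macro library itself, and the `expand` helper is a hypothetical name. In a real IOC the override would presumably be supplied by adding something like PINI=1 to the macro string passed to dbLoadRecords.

```python
import re

# Matches $(NAME) or $(NAME=default); group 2 is None when no default is given.
_MACRO = re.compile(r"\$\((\w+)(?:=([^)]*))?\)")


def expand(text, macros):
    """Expand $(NAME) and $(NAME=default) using the given macro definitions."""

    def repl(match):
        name, default = match.group(1), match.group(2)
        if name in macros:
            return macros[name]      # an explicit definition wins
        if default is not None:
            return default           # otherwise fall back to the default
        return match.group(0)        # undefined, no default: leave untouched

    return _MACRO.sub(repl, text)


line = 'field(PINI, "$(PINI=0)")'
print(expand(line, {}))              # nobody overrides: default is used
print(expand(line, {"PINI": "1"}))   # site overrides the default
```

Sites that do nothing keep the current behaviour, and only those that define the macro get Acquire processed at init.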

Cheers,
Matt



On Jun 17, 2013, at 4:13 PM, Kate Feng <[email protected]> wrote:

> Hi Jason,
> 
> On 06/17/2013 03:09 PM, Jason Abernathy wrote:
>> >I believe that PINI=NO is the smartest choice for the XXX:Acquire record.
>> 
>> >We currently have 4 camera "IOCs", with substantial image processing, operating on a single computer. If we forgot to set PINI=NO, and all cameras commenced acquisition at IOC boot, our hardware would be overwhelmed!
> It seems that you and Mark misunderstood what I meant by setting 'PINI=YES'.  I meant that 'VAL' should be set to 0, so
> that the 'acquire' record is defined to be 'off' (i.e. stop), instead of being STAT=UDF, SEVR=INVALID when the IOC is
> first initialized. The software already sets the acquire parameter (i.e. ADAcquire) to 0 in the constructor of the ADBase
> base class.  Thus, the value of the record should match the value set for the acquire parameter.
> 
> Please read what Mark Rivers posted on 6/15/2013 at
> http://www.aps.anl.gov/epics/tech-talk/2013/msg01223.php
> http://www.aps.anl.gov/epics/tech-talk/2013/msg01225.php
> 
> Currently, when the device is disconnected, one can still hit the acquire 'start' button.  Thus, we wish to add
> the color mode for the alarm.  However, STAT=UDF, SEVR=INVALID does not really imply 'disconnect'.
> Perhaps EPICS should add 'DISCONNECT' to the alarm status?
> 
> For a Linux IOC, when the device is disconnected, the PVs of the disconnected device are still connected to the GUI
> clients if they run on the same Linux PC as the server.  In fact, the PVs of the disconnected device should be marked
> as 'disconnected' as well.
> 
> Any thoughts?
> 
> Thanks,
> Kate Feng
> 
>> 
>> As for the disconnect / reconnect problem, I agree that it's an issue for the prosilica driver. I solved it by "hacking" the driver to force the reset of certain camera acquisition parameters during reconnection. Per-driver implementation of reconnection routines is neither consistent nor easy to implement. Once the 1.9.1 version of areaDetector was released, the solution was moved to the database and implemented with a fanout record (your example of an "initialize all" record).
>> 
>> While trying to solve this problem, I also looked into generating a user-defined "Reconnect" event which is posted whenever the camera reconnects:
>> 
>> record (event, "xxx:CAMERA:Reconnect") {
>>   field (VAL, "Reconnect")
>> }
>> 
>> record(mbbo, "xxx:CAMERA:TriggerMode")
>> {
>>     ...
>>     field(SCAN, "Event") 
>>     field(EVNT, "Reconnect")
>> }
>> 
>> This would allow a per-record configuration of reconnection handling. Obviously, this prevents the "TriggerMode" record from being Passively scanned, which is unacceptable.
>> 
>> I can think of another solution - but it requires static, per-device configuration and only works for asyn-based drivers. Change the signature of the asynPortDriver::createParam() methods to include a "reconnect" option. When a driver calls exceptionConnect, asyn is responsible for calling "writexxx" on any parameter from the library which needs to be set during reconnection.
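
The proposal above can be sketched in Python as follows. This is a toy model only: `ToyParamLibrary`, its `reconnect` flag, and `on_connect` are hypothetical and do not exist in asynPortDriver. The sketch just shows the idea of marking parameters at creation time and re-sending their last values when a connect event arrives.

```python
class ToyParamLibrary:
    """Hypothetical parameter library with a per-parameter 'reconnect' flag."""

    def __init__(self, write):
        self._write = write      # callable(name, value) that talks to the device
        self._values = {}
        self._resend = set()     # names to re-send on every reconnect

    def create_param(self, name, value, reconnect=False):
        self._values[name] = value
        if reconnect:
            self._resend.add(name)

    def set(self, name, value):
        # Remember the value and send it to the device right away.
        self._values[name] = value
        self._write(name, value)

    def on_connect(self):
        # Called on a connect event: re-send only the flagged parameters.
        for name in sorted(self._resend):
            self._write(name, self._values[name])


sent = []
params = ToyParamLibrary(write=lambda name, value: sent.append((name, value)))
params.create_param("TriggerMode", 0, reconnect=True)
params.create_param("Acquire", 0)            # never re-sent automatically
params.set("TriggerMode", 2)
params.set("Acquire", 1)

sent.clear()                                 # device power-cycles, then reconnects
params.on_connect()
print(sent)
```

Leaving Acquire unflagged keeps a camera from starting to acquire by itself after a reconnect, while configuration such as the trigger mode is restored.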
>> 
>> Jason
>> 
>> On 13-06-17 09:31 AM, Mark Rivers wrote:
>>> I deliberately did not set Acquire to PINI=YES, to avoid having a detector potentially automatically start acquiring images when the IOC reboots, for example if Acquire was in save/restore, and it happened to be acquiring when the IOC shut down.  Having the detector start automatically can have consequences like filling up the disk with some detectors, and I thought it was good to require manual intervention to start a detector.  You can always do the following in your startup script even if PINI=NO.
>>> 
>>> dbpf "XXX:Acquire.PROC","1" 
>>> 
>>> But if the community feels that PINI=YES on Acquire is a good idea I would be willing to change my mind.
>>> 
>>> Mark
>>> 
>>> 
>>> -----Original Message-----
>>> From: Kate Feng [mailto:[email protected]]
>>> Sent: Monday, June 17, 2013 11:20 AM
>>> To: Mark Rivers
>>> Cc: EPICS Tech-Talk
>>> Subject: Re: EPICS device disconnects and reconnects
>>> 
>>> Hi Mark,
>>> 
>>>       For the "Potential Solutions in the asyn Framework",  you wrote
>>> 
>>> 
>>>> 3) ........
>>>> If the record has PINI=YES it will send the value that would have been sent during
>>>> the initial record processing in iocInit.  If PINI=NO then it should read the value
>>>> from the device, and if the read is successful set the output record to that value.
>>>> 
>>> I agree that the 'PINI' field is a good solution for it.  For example, the 'acquire'
>>> record of the prosilica.template should be set to "PINI=YES".
>>> 
>>> Thanks,
>>> Kate
>>> 
>>> On 06/14/2013 02:20 PM, Mark Rivers wrote:
>>> 
>>>> Folks,
>>>> 
>>>> I would like to start a discussion of the problems and potential solutions of device disconnects and reconnects in EPICS.
>>>> 
>>>> 
>>>>                             Statement of the Problem
>>>> 
>>>> EPICS device control is moving away from VME-bus based devices that are "always available", and more towards distributed hardware using Ethernet, serial and other buses.  The challenge with such devices is that they may not be connected when the IOC boots, and they may be disconnected, reconnected, and power-cycled while the IOC is running.
>>>> 
>>>> These disconnect/reconnect events can lead to a number of problems.
>>>> 
>>>> 1) If the device is not connected when the IOC boots then
>>>>    - Any code in the device support init_record routine that relies on communication with the device will fail
>>>> 
>>>>    - Initialization that relies on records with PINI=YES will fail
>>>> 
>>>> 2) When a device disconnects it can lead to excessive error messages from records that are periodically processing, drivers that have polling loops, etc.
>>>> 
>>>> 3) When a device reconnects after being power-cycled the EPICS output records are likely to disagree with the actual device settings.
>>>> 
>>>> All EPICS records have the PINI field that defines what they should do when the entire IOC changes state.  This field has choices NO,YES,RUN,RUNNING,PAUSE,PAUSED.  However, there is not a field that defines what a record should do when the device it is associated with disconnects or reconnects.
>>>> 
>>>> 
>>>>                     Potential Solutions in the asyn Framework
>>>> 
>>>> When Marty Kraimer released the asyn framework 10 years ago this month he designed it to be able to handle such connection problems.  asyn port drivers should notify asynManager when their device connects and disconnects.  asynManager has methods to provide callbacks to clients when such asynExceptionConnect events occur.  Such clients can include the standard asyn device support, other drivers (e.g. motor, areaDetector) connected to an underlying driver (e.g. drvAsynIPPort), etc.
>>>> 
>>>> However, in practice we have not done a very good job of taking advantage of these capabilities in device support and other drivers.  I've attached a couple of screen shots of the areaDetector Prosilica driver.  ProsilicaDisconnected.png shows that the driver does detect when the camera is disconnected, and notifies asynManager of this.  This causes the CNCT field in the asynRecord to display the "Disconnected" string in red, so the operator is aware of the problem.  However, when the camera is powered back on, the screen in ProsilicaReconnected.png results.  Note that many of the output records (Exposure Time, Binning, Region start, etc.) do not agree with the readbacks of the actual values in the camera.  The output records retain their previous values, but the camera has now reverted to the power-up defaults.  At present the only way to fix the discrepancy is to hit <Enter> in each of the output record widgets, processing the record and sending the EPICS value to the camera.
>>> 
>>>> There are of course several ways that this could be improved:
>>>> 
>>>> 1) The database designer could implement an "Initialize all" record that would process all of the output records, either when the operator processed that record manually, or perhaps when the record processed automatically when the asyn record CNCT field changed from Disconnected to Connected.  This method requires a lot of work by the database developer.  If there are several databases involved, as there are with areaDetector (ADBase.template, prosilica.template), providing the necessary record links is challenging.
>>>> 
>>>> 2) The driver could store all of the values from the output records, and on a reconnect event it could send these values to the device.  This requires every driver to implement such logic, which is again a lot of work for the developer.
>>>> 
>>>> 3) asyn device support could register for connection callbacks on every output record.  When it gets a reconnection callback it would look at some field in the record to decide whether to send the record value to the device.  If every record had a field like PRCT (Process on Re-Connect) then it could use that field to decide whether to request record processing for that record on the reconnect event.  Since we don't have that field (at least not yet), then it could use PINI to make the decision about processing the output record on a reconnect.  Having a connection callback in device support that did this would solve problem 3) described above.  It would also mostly solve problem 1) above, the improper initialization because the device was not connected when the IOC was started.  If the record has PINI=YES it will send the value that would have been sent during the initial record processing in iocInit.  If PINI=NO then it should read the value from the device, and if the read is successful set the output record to that value.  This will correctly support bumpless reboots, but only if PINI=NO.
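
A minimal Python sketch of the reconnect rule in option 3) might look like this. It is a toy model under stated assumptions: `ToyDevice`, `ToyOutputRecord`, and `on_reconnect` are hypothetical names, not asyn device support, and PINI is reduced to a boolean. Records with PINI set push their value to the device, and the others adopt whatever the device reports.

```python
class ToyDevice:
    """Hypothetical device that reverts to power-up defaults when power-cycled."""

    def __init__(self, defaults):
        self.defaults = dict(defaults)
        self.settings = dict(defaults)

    def power_cycle(self):
        self.settings = dict(self.defaults)

    def write(self, name, value):
        self.settings[name] = value

    def read(self, name):
        return self.settings.get(name)       # None models a failed read


class ToyOutputRecord:
    """Hypothetical output record: a name, a value, and a boolean PINI."""

    def __init__(self, name, val, pini):
        self.name, self.val, self.pini = name, val, pini


def on_reconnect(records, device):
    """Apply the PINI-based rule to every output record after a reconnect."""
    for rec in records:
        if rec.pini:
            device.write(rec.name, rec.val)  # PINI=YES: record value wins
        else:
            value = device.read(rec.name)    # PINI=NO: device value wins
            if value is not None:            # only update on a successful read
                rec.val = value


camera = ToyDevice({"AcquireTime": 1.0, "Acquire": 0})
exposure = ToyOutputRecord("AcquireTime", 0.05, pini=True)
acquire = ToyOutputRecord("Acquire", 1, pini=False)
for rec in (exposure, acquire):
    camera.write(rec.name, rec.val)          # state before the power cycle

camera.power_cycle()
on_reconnect([exposure, acquire], camera)
print(camera.settings, acquire.val)
```

In this model the exposure time is restored on the camera, while the Acquire record falls back to the camera's idle state instead of restarting acquisition.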
>>> 
>>>> Problem 2) above is excessive error messages when a device disconnects.  The goal here could be one that Bob Dalesio has stated:  there should be a single error message when a device disconnects, and a single status message when it reconnects, and that is all!  This seems reasonable.  This could potentially be done in asynManager itself.  It is notified of all disconnect and reconnect events, and it can thus produce those messages.  But all error reporting in asyn is also done with the pasynTrace interface which is also implemented in asynManager.  Every time asynPrint or asynPrintIO is called it could check the connection status of the device and simply not output anything if the device is not connected.  Would this be acceptable?
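
The "one message on disconnect, one on reconnect" idea can be modelled in a few lines of Python. This is a sketch only: `ConnectionAwareLog` is a hypothetical class, not the pasynTrace interface, and it assumes the logger is told about every connection state change.

```python
class ConnectionAwareLog:
    """Hypothetical logger that goes quiet while the device is disconnected."""

    def __init__(self):
        self.connected = True
        self.lines = []

    def set_connected(self, connected):
        # Report each state change exactly once; repeated notifications are ignored.
        if connected == self.connected:
            return
        self.connected = connected
        self.lines.append("device reconnected" if connected else "device disconnected")

    def error(self, message):
        # Drop error messages while disconnected; the single disconnect
        # message already tells the operator what is wrong.
        if self.connected:
            self.lines.append(message)


log = ConnectionAwareLog()
log.set_connected(False)
for _ in range(3):
    log.error("read timeout")    # e.g. a polling loop failing repeatedly
log.set_connected(False)         # duplicate notification, ignored
log.set_connected(True)
print(log.lines)
```

One trade-off of this design is that errors unrelated to the connection are also hidden while the device is marked as disconnected.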
>>>> 
>>>> I'd be very interested in hearing what others think of the above ideas.
>>>> 
>>>> Cheers,
>>>> Mark
>>>> 
>>>> 
>>>> 
>>>> 
>>>> 
>>>> 
>>>> 
>>>> 
>> 
> 



