Argonne National Laboratory

Experimental Physics and
Industrial Control System


Subject: Re: [CSS] freeze and lost running PV in Boy Screens
From: Maurizio M. <maurizio.montis@lnl.infn.it>
To: "Kasemir, Kay" <kasemirk@ornl.gov>, "Hill, Jeff" <johill@lanl.gov>
Cc: "tech-talk@aps.anl.gov" <tech-talk@aps.anl.gov>
Date: Wed, 7 May 2014 10:37:19 +0200
On 5/1/2014 2:26 PM, Kasemir, Kay wrote:

Hello Kay, Hello Jeff,

Following Kay's suggestion, I tried CSS version 3.1.x and, as predicted, I saw the same behavior. That is a useful data point!

Before starting a deep investigation of the driver support and the memory allocation order, I am running a couple of tests in which I change the EPICS environment variable EPICS_CA_MAX_ARRAY_BYTES. I chose this after trying to understand the situation with the hardware better (I hope): by "playing" with the sampling frequency provided by my "fast acquisition" board, I saw different behaviors in my graphical user interface:

* at high frequencies, I get the situation described in my previous mail
* at low frequencies, the PVs stay connected, but sometimes values freeze for a while (with both popup windows open)

Because the driver support provides several waveforms (different in number and size) to manage and plot the acquired data, I think I am working in a situation where, in the best case, I am borderline with the maximum array size. If this idea also turns out to be wrong, I will certainly take a look at the driver support and try to build a test bench following Jeff's suggestion.
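The "borderline with the max array size" idea can be checked with simple arithmetic before touching the IOC. The sketch below is only an illustration: the waveform names, element counts and element types are hypothetical, and the 16384-byte limit is the default value of EPICS_CA_MAX_ARRAY_BYTES in EPICS Base 3.14 as far as I know. It compares only the raw array payload with the limit; Channel Access messages also carry a header and metadata, so a waveform that lands exactly on the limit should be treated as too large.

```python
# Rough size check for waveform records against the Channel Access array limit.
# Assumptions (not from the original message): the waveform names, element
# counts and element types are hypothetical; 16384 bytes is assumed to be the
# default EPICS_CA_MAX_ARRAY_BYTES in EPICS Base 3.14.

DEFAULT_CA_MAX_ARRAY_BYTES = 16384

# Element sizes in bytes for common waveform FTVL field types.
FTVL_SIZES = {"CHAR": 1, "SHORT": 2, "LONG": 4, "FLOAT": 4, "DOUBLE": 8}


def payload_bytes(nelm, ftvl):
    """Raw array payload in bytes: number of elements times element size."""
    return nelm * FTVL_SIZES[ftvl]


def check_waveforms(waveforms, limit=DEFAULT_CA_MAX_ARRAY_BYTES):
    """Return {name: (payload size in bytes, True if strictly below limit)}."""
    report = {}
    for name, (nelm, ftvl) in waveforms.items():
        size = payload_bytes(nelm, ftvl)
        # Strict comparison: protocol overhead is not counted in the payload,
        # so a payload equal to the limit is reported as not fitting.
        report[name] = (size, size < limit)
    return report


if __name__ == "__main__":
    # Hypothetical waveforms: name -> (NELM, FTVL)
    example = {
        "FAST:WF": (4096, "DOUBLE"),
        "EDGE:WF": (2048, "DOUBLE"),
        "SLOW:WF": (1024, "FLOAT"),
    }
    for name, (size, fits) in check_waveforms(example).items():
        print(name, size, "fits" if fits else "too large or borderline")
```

If any waveform comes out too large or borderline, the limit has to be raised on both the IOC and the client side (CSS) for the test to be meaningful.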

In any case thank you very much for your help and sorry if i took too much time to answer you.

Best Regards

- Maurizio


Hello Maurizio:

As Jeff already replied, the Channel Access server on the IOC should not be affected by the specific order in which CA clients do things.
The issue may indeed be in the driver support for your VME hardware.
Still, if you want to narrow the issue down from the CSS side, try CSS versions 3.1.x (for SNS, you can find those older ones on https://ics-web.sns.ornl.gov/css/updates/apps/?C=M;O=D).
CSS 3.2 uses the PVManager, 3.1 doesn't, so that would allow you to compare the behavior with two different CA client implementations.

Thanks,
Kay


On Apr 30, 2014, at 5:33 PM, Maurizio Montis <Maurizio.Montis@lnl.infn.it> wrote:
.. when I open the slow popup and then open the fast one without closing the
previous one..

* I also tried changing the CSS version, using all the versions
available from the SNS site (BASIC versions from 3.2.1 to 3.2.16), but I
always got the same situation.

    

Hello Maurizio,

A call to 'assert(ev_que->evque[ev_que->putix] == EVENTQEMPTY)'
    by thread 'CAS-client' failed in ../dbEvent.c line 701.
EPICS Release EPICS R3.14.12.4
This presumably shouldn't be caused by external circumstances such as the order of starting CA clients, or the type of CA clients.

The code involved in the IOC has been quite robust for some time now, and so my first guess is that there is some type of corruption of the data structures in the IOC. Are there some new drivers installed in this system? Does the IOC fail in other ways sometimes, or is it always with this exact same assert fail? If you change the order in which driver memory is allocated during IOC startup, and/or the size of memory that is allocated by drivers, does it change the outcome? Changing the order might cause the corruption to hit a different data structure in the IOC.
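To illustrate why corruption is a plausible explanation for this kind of assert failure, here is a toy model of a fixed-size event queue with put and get indices. It is not the EPICS dbEvent.c code: the class, its size and the EMPTY sentinel are hypothetical, and only the invariant is modeled on the assert quoted above (the slot at the put index must be empty before a new event is written). In normal operation the invariant holds; a stray write into the slot from outside the queue logic trips it.

```python
# Toy model of a ring-buffer event queue. NOT the EPICS implementation;
# all names here are hypothetical and chosen only for illustration.

EMPTY = object()  # sentinel marking an unused slot


class EventQueue:
    def __init__(self, size=4):
        self.slots = [EMPTY] * size
        self.putix = 0  # next slot to write
        self.getix = 0  # next slot to read

    def is_full(self):
        """The queue is full when the next write slot is still occupied."""
        return self.slots[self.putix] is not EMPTY

    def put(self, event):
        """Store an event; callers are expected to check is_full() first."""
        # Invariant modeled on the quoted assert: write only into an empty slot.
        assert self.slots[self.putix] is EMPTY, "slot at putix is not empty"
        self.slots[self.putix] = event
        self.putix = (self.putix + 1) % len(self.slots)

    def get(self):
        """Return the oldest event, or None if the queue is empty."""
        event = self.slots[self.getix]
        if event is EMPTY:
            return None
        self.slots[self.getix] = EMPTY
        self.getix = (self.getix + 1) % len(self.slots)
        return event


def corrupted_put_fails():
    """Simulate a stray memory write and report whether put() asserts."""
    q = EventQueue(size=2)
    q.put("first")
    q.get()
    # Something outside the queue logic overwrites the next write slot.
    q.slots[q.putix] = "stray data"
    try:
        q.put("second")
    except AssertionError:
        return True
    return False


if __name__ == "__main__":
    print("assert tripped by corruption:", corrupted_put_fails())
```

In this model the queue logic alone cannot reach the failing state as long as callers respect is_full(), which is why an assert of this kind tends to point at something else writing into the memory.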

I also had a quick look at the logic surrounding this particular assert fail and it appears to be sound. Nevertheless, there could be something new that is occurring. If you can reproduce it in a smaller system that might help with isolating the cause (the mantra from the support department of every software company DYN).

We are quite busy now bringing the upgraded LANSCE RF systems on line. If you are feeling fairly certain that this isn't caused by layered code (probably device drivers) then let me know, and I will stare harder at the source code involved. 

Jeff

~~
Maurizio Montis - Control System Engineer
~~
mobile: +39 3408428089
mail: maurizio.montis@lnl.infn.it
skype: maurizio_montis
Istituto Nazionale di Fisica Nucleare - Laboratori Nazionali di Legnaro
V.le dell'Universita', 2
35020 LEGNARO (PD) - ITALY

Replies:
Re: [CSS] freeze and lost running PV in Boy Screens Hartman, Steven M.
RE: [CSS] freeze and lost running PV in Boy Screens Hu, Yong
References:
[CSS] freeze and lost running PV in Boy Screens Maurizio Montis
Re: [CSS] freeze and lost running PV in Boy Screens Kasemir, Kay
