Argonne National Laboratory

Experimental Physics and
Industrial Control System


Subject: RE: [CSS] freeze and lost running PV in Boy Screens
From: Maurizio Montis <Maurizio.Montis@lnl.infn.it>
To: "tech-talk@aps.anl.gov" <tech-talk@aps.anl.gov>
Date: Sat, 17 May 2014 18:31:16 +0200 (CEST)
Hello Yong, 

First of all, I'm sorry for replying so late.

Well, the problem I found seems to be related to bandwidth saturation.
My whole control network runs at 1 Gbps. In this architecture, a VME
system equipped with data acquisition boards (2x ADAS 150 + 1x ADAS 108)
provides 64 ai records (from the ADAS 150 boards) and 16 waveform
records + 8 ai records (from the ADAS 108 board). The 16 waveforms
consist of 8 waveforms of 1000000 points (FLOAT values) and 8 of 20000
points. The former contain RAW data coming from the field, while the
latter are used to plot XY graphs. The 8 ai records from the ADAS 108
provide the average value of each waveform. Every record is updated at
10 Hz, and the PV update time is set by the external trigger used to
drive the boards.

Because my OPI files show all 64 ADAS 150 records together with all 8
plot waveforms (not the RAW data, obviously) in a common XY graph, it
seems the bandwidth in use was too close to the limit. I tried reducing
the number of waveforms provided by the ADAS 108 board to 3 (I can work
with 3 channels for now). I also changed the external trigger to get a
5 Hz PV update rate. In this configuration my GUIs no longer show the
problem, but I would like to run more tests and try to go back to a
10 Hz trigger. The only thing I don't understand is how the waveform
records with RAW data can affect the network bandwidth even when I
don't use them in any GUI. Do you have any idea?
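
A back-of-the-envelope estimate of the payload involved can be sketched as follows. This is only a rough calculation under stated assumptions, not a measurement: FLOAT elements are taken as 4 bytes, ai values as 8-byte doubles, protocol and timestamp overhead is ignored, and a single client is assumed to monitor every listed PV. The helper name payload_bps is made up for this example.

```python
# Rough Channel Access payload estimate for the records described above.
# Assumptions (not from the original mail): 4-byte FLOAT elements,
# 8-byte ai values, no protocol overhead, one subscribed client.

FLOAT_BYTES = 4
DOUBLE_BYTES = 8
LINK_BPS = 1_000_000_000  # 1 Gbps control network


def payload_bps(n_pvs, elements, elem_bytes, rate_hz):
    """Return the payload in bits per second for n_pvs identical PVs."""
    return n_pvs * elements * elem_bytes * rate_hz * 8


# Original configuration, 10 Hz trigger
plot_10hz = payload_bps(8, 20_000, FLOAT_BYTES, 10)     # XY-plot waveforms
raw_10hz = payload_bps(8, 1_000_000, FLOAT_BYTES, 10)   # RAW waveforms
ai_10hz = payload_bps(64 + 8, 1, DOUBLE_BYTES, 10)      # all ai records

# Reduced configuration: 3 plot waveforms, 5 Hz trigger
plot_5hz_3ch = payload_bps(3, 20_000, FLOAT_BYTES, 5)

for name, bps in [
    ("plot waveforms @ 10 Hz", plot_10hz),
    ("RAW waveforms  @ 10 Hz", raw_10hz),
    ("ai records     @ 10 Hz", ai_10hz),
    ("3 plot waveforms @ 5 Hz", plot_5hz_3ch),
]:
    print(f"{name}: {bps / 1e6:.2f} Mbit/s "
          f"({100 * bps / LINK_BPS:.2f}% of the link)")
```

Under these assumptions the 8 plot waveforms at 10 Hz come to roughly 51 Mbit/s per subscribed client, while the 8 RAW waveforms would need about 2.56 Gbit/s if a client ever monitored them.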

In any case, thank you very much for all the suggestions you gave me;
they were very helpful.

Best Regards, 

- Maurizio 



On May 08, 2014 12:13 AM, "Hu, Yong" <yhu@bnl.gov> wrote:

> Hello Maurizio,
> 
> A few questions and suggestions to you:
> 
> 1. You mentioned "sampling frequency", "high frequencies". I suppose
> you are talking about the digitizer/ADC sampling rate/speed. Right?
> Pay attention to the difference between this rate and the PV update
> rate. What is your waveform PV update rate? You can use camonitor -#1
> waveformPV and watch the timestamps to calculate the update rate, or use
> the command "caEventRate" (maybe another name?) in your $EPICS/bin/...
> I hope your waveform PV does not update in the ~kHz range.
> 
> 2. As Steve Hartman asked, you need to check your network bandwidth and
> the IOC load if you have big waveform data with a high PV update rate.
> 
> 3. As Kay pointed out, the latest CSS/BOY is based on pvManager. You
> can configure it to limit PV update rate to 1 Hz, which may be helpful
> if the data source (PV value) is really updating too fast.
> 
> 4. Check the CSS messages to see if there is an "Out of memory" error.
> When 'Out of memory' occurs, you will see strange behavior, including
> a frozen screen, disconnected PVs, etc.
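
The timestamp check suggested in point 1 can be sketched as below. This is a minimal example, assuming the timestamps have been copied out of the monitor output as strings of the form YYYY-MM-DD HH:MM:SS.ffffff; the function name update_rate_hz and the sample values are invented for illustration and are not real readings.

```python
from datetime import datetime

TS_FORMAT = "%Y-%m-%d %H:%M:%S.%f"  # assumed timestamp layout


def update_rate_hz(timestamps):
    """Return the mean update rate in Hz from consecutive timestamp strings."""
    if len(timestamps) < 2:
        raise ValueError("need at least two timestamps")
    times = [datetime.strptime(t, TS_FORMAT) for t in timestamps]
    span = (times[-1] - times[0]).total_seconds()
    if span <= 0:
        raise ValueError("timestamps must be increasing")
    # N timestamps delimit N-1 update intervals
    return (len(times) - 1) / span


# Invented sample: four updates spaced 100 ms apart
sample = [
    "2014-05-08 00:13:01.000000",
    "2014-05-08 00:13:01.100000",
    "2014-05-08 00:13:01.200000",
    "2014-05-08 00:13:01.300000",
]
rate = update_rate_hz(sample)
print(f"mean update rate: {rate:.2f} Hz")
```

Averaging over many updates rather than a single pair of timestamps makes the estimate less sensitive to jitter on individual updates.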
> 
> HTH,
> 
> Yong
> 
> ________________________________
> From: tech-talk-bounces@aps.anl.gov [tech-talk-bounces@aps.anl.gov] on
> behalf of Maurizio M. [maurizio.montis@lnl.infn.it]
> Sent: Wednesday, May 07, 2014 4:37 AM
> To: Kasemir, Kay; Hill, Jeff
> Cc: tech-talk@aps.anl.gov
> Subject: Re: [CSS] freeze and lost running PV in Boy Screens
> 
> Il 5/1/2014 2:26 PM, Kasemir, Kay ha scritto:
> 
> Hello Kay, Hello Jeff,
> 
> Following Kay's suggestion, I tried CSS version 3.1.X and, as
> predicted, I saw the same behavior. That's a good
> point!
> 
> Before starting a deep investigation of the driver support and memory
> allocation order, I'm running a couple of tests that change the EPICS
> environment variable EPICS_CA_MAX_ARRAY_BYTES. This choice follows from
> what I (hopefully) understand better now about the hardware: "playing"
> with the sampling frequency provided by my "fast acquisition" board, I
> saw different behaviors on my graphical user
> interface:
> 
> * at high frequencies, I get the situation described in my previous
> mail
> * at low frequencies, PVs stay connected but some values occasionally
> freeze for a while (with both popup windows open)
> 
> Because the driver support provides several waveforms (different in
> number and size) to manage and plot the acquired data, I think I'm
> working in a situation where, at best, I'm borderline with the max
> array size. If this idea is also wrong, I will certainly look into
> the support and try to create a testbench following Jeff's suggestion.
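
The "max array size" concern can be checked with simple arithmetic. The sketch below assumes 4-byte FLOAT elements and the usual EPICS 3.14 default of 16384 bytes for EPICS_CA_MAX_ARRAY_BYTES; the 1024-byte margin for metadata is an arbitrary illustrative choice, and the waveform lengths (1000000 and 20000 points) are the ones quoted later in this thread. The variable normally has to be large enough on both the IOC and the client side.

```python
# Estimate the EPICS_CA_MAX_ARRAY_BYTES needed for a waveform.
# Assumptions: 4-byte FLOAT elements; the margin is an arbitrary
# allowance for metadata, chosen for illustration only.

DEFAULT_MAX_ARRAY_BYTES = 16384  # usual default in EPICS 3.14


def required_array_bytes(elements, elem_bytes=4, margin=1024):
    """Return a byte budget for one waveform of the given length."""
    return elements * elem_bytes + margin


raw_needed = required_array_bytes(1_000_000)   # RAW waveform
plot_needed = required_array_bytes(20_000)     # XY-plot waveform

for name, needed in [("RAW", raw_needed), ("plot", plot_needed)]:
    status = "exceeds" if needed > DEFAULT_MAX_ARRAY_BYTES else "fits in"
    print(f"{name} waveform: {needed} bytes, {status} the default")
```

With these numbers both waveform sizes are well above the default, so the variable would need to be raised to at least the larger figure for the RAW data to be transferable at all.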
> 
> In any case, thank you very much for your help, and sorry for taking
> so long to answer.
> 
> Best Regards
> 
> - Maurizio
> 
> 
> 
> Hello Maurizio:
> 
> As Jeff already replied, the Channel Access server on the IOC should
> not be affected by the specific order in which CA clients do things.
> The issue may indeed be in the driver support for your VME hardware.
> Still, if you want to narrow the issue down from the CSS side, try CSS
> versions 3.1.x (for SNS, you can find those older ones on
> https://ics-web.sns.ornl.gov/css/updates/apps/?C=M;O=D).
> CSS 3.2 uses the PVManager, 3.1 doesn't, so that would allow you to
> compare the behavior with two different CA client implementations.
> 
> Thanks,
> Kay
> 
> 
> On Apr 30, 2014, at 5:33 PM, Maurizio Montis
> <Maurizio.Montis@lnl.infn.it> wrote:
> 
> 
> .. when I open the slow popup and then open the fast one without
> closing the previous one..
> 
> * I also tried changing the CSS version, using all the versions
> available from the SNS site (BASIC versions from 3.2.1 to 3.2.16), but
> I always got the same situation.
> 
> 
> 
> Hello Maurizio,
> 
> 
> 
> A call to 'assert(ev_que->evque[ev_que->putix] == EVENTQEMPTY)'
>     by thread 'CAS-client' failed in ../dbEvent.c line 701.
> EPICS Release EPICS R3.14.12.4
> 
> 
> This presumably shouldn't be caused by external circumstances such as
> the order of starting CA clients, or the type of CA clients.
> 
> The code involved in the IOC has been quite robust for some time now,
> and so my first guess is that there is some type of corruption of the
> data structures in the IOC. Are there some new drivers installed in
> this system? Does the IOC fail in other ways sometimes or is it always
> with this exact same assert fail? If you change the order in which
> driver memory is allocated during IOC startup, and/or the size of
> memory that is allocated by drivers, does it change the outcome?
> Changing the order might cause the corruption to hit a different data
> structure in the IOC.
> 
> I also had a quick look at the logic surrounding this particular
> assert fail and it appears to be sound. Nevertheless, there could be
> something new that is occurring. If you can reproduce it in a smaller
> system that might help with isolating the cause (the mantra from the
> support department of every software company DYN).
> 
> We are quite busy now bringing the upgraded LANSCE RF systems on line.
> If you are feeling fairly certain that this isn't caused by layered
> code (probably device drivers) then let me know, and I will stare
> harder at the source code involved.
> 
> Jeff
> 
> 
> 
> ~~ Maurizio Montis - Control System Engineer ~~ mobile: +39 3408428089
> mail: maurizio.montis@lnl.infn.it
> skype: maurizio_montis Istituto Nazionale di Fisica Nucleare -
> Laboratori Nazionali di Legnaro V.le dell'Universita', 2 35020 LEGNARO
> (PD) - ITALY



~~ Maurizio Montis - Control System Engineer ~~
   mobile: +39 3408428089
   mail: maurizio.montis@lnl.infn.it
   skype: maurizio_montis

Istituto Nazionale di Fisica Nucleare - Laboratori Nazionali di Legnaro
   V.le dell'Universita', 2
   35020 LEGNARO (PD) - ITALY 					

