This is where the scaler record uses the .RATE or .RAT1 fields:
    rate = ((pscal->us == USER_STATE_COUNTING) ? pscal->rate : pscal->rat1);
    if (rate > .1) {
        callbackRequestDelayed(pupdateCallback, 1.0/rate);
    }
I suspect that in base 7.0.3.1 callbackRequestDelayed is not delaying at all on vxWorks if the delay is equal to the system clock period. The default system clock period, and the one in use on my vxWorks IOCs, is 1/60 sec. This would explain the observed behavior of the callbacks happening so fast when RATE=60 that they overload the system.
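To make the arithmetic behind this suspicion concrete, here is a minimal sketch (not code from EPICS base or the scaler record) that expresses the requested delay 1.0/rate in units of the system clock period. The 60 Hz clock rate is the one quoted above; the names SYS_CLK_RATE_HZ, requested_delay, delay_in_clock_periods and delay_within_one_period are hypothetical helpers made up for illustration. It only shows that RATE=60 asks for one clock period and RATE=59 for slightly more; it says nothing about how callbackRequestDelayed actually handles that case.

```c
/* Assumed system clock rate in ticks per second (the vxWorks default
 * quoted in the message; hypothetical constant, not an EPICS name). */
#define SYS_CLK_RATE_HZ 60.0

/* Delay in seconds that the scaler record requests for a given
 * .RATE/.RAT1 value, mirroring the 1.0/rate in the snippet above. */
double requested_delay(double rate)
{
    return 1.0 / rate;
}

/* The same delay expressed in system clock periods. */
double delay_in_clock_periods(double rate)
{
    return requested_delay(rate) * SYS_CLK_RATE_HZ;
}

/* Returns 1 when the requested delay is no longer than one clock period.
 * A small tolerance absorbs floating-point rounding in the division
 * and multiplication above. */
int delay_within_one_period(double rate)
{
    return delay_in_clock_periods(rate) <= 1.0 + 1e-9;
}
```

With these helpers, delay_in_clock_periods(59.0) is about 1.017 and delay_in_clock_periods(60.0) is about 1.0, which is consistent with 59 Hz working and 60 Hz being the first rate that fails.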
The release notes for 7.0.3.1 say this:
*******************************
Timers and delays use monotonic clock
Many internal timers and delay calculations use a monotonic clock epicsTimeGetMonotonic() instead of the realtime epicsTimeGetCurrent(). This is intended to make IOCs less susceptible to jumps in system time.
*******************************
So the code for delay calculations was indeed changed, and I think it broke on vxWorks.
Mark
________________________________
From: Mark Rivers
Sent: Saturday, February 1, 2020 9:04 AM
To: 'Mooney, Tim M.'; 'Johnson, Andrew N.'
Cc: Dongzhou Zhang; Joanne Stubbs; Peter Eng; 'tech-talk at aps.anl.gov'; core-talk at aps.anl.gov
Subject: RE: Problems with scaler record and base 7.0.3.1
I just rebuilt everything with 7.0.3, rather than 7.0.3.1. That fixed the problem. I even tested 2 scalers in the same IOC, a Joerger and an SIS3820, both running at 60 Hz at the same time. It worked fine.
Something is broken in 7.0.3.1.
You never find these problems until you test on a bunch of real-world IOCs :)
Mark
From: Tech-talk <tech-talk-bounces at aps.anl.gov> On Behalf Of Mark Rivers via Tech-talk
Sent: Saturday, February 1, 2020 8:40 AM
To: 'Mooney, Tim M.' <mooney at anl.gov>; 'Johnson, Andrew N.' <anj at anl.gov>
Cc: Dongzhou Zhang <dzzhang at cars.uchicago.edu>; Joanne Stubbs <stubbs at cars.uchicago.edu>; Peter Eng <eng at cars.uchicago.edu>; 'tech-talk at aps.anl.gov' <tech-talk at aps.anl.gov>
Subject: Problems with scaler record and base 7.0.3.1
Tim and Andrew,
We have discovered a serious problem with the scaler record running under the following configuration.
Base 7.0.3.1
vxWorks 6.9.4.1
std master
mca master
vme master
asyn master
The problem is the following:
- If the scaler display update rate (.RATE field) is 59 (Hz) or less it works fine. The cbHigh task is using less than 2% of the CPU as shown by spy.
- If .RATE=60 then the following happens:
o The cbHigh task uses >50% of the CPU
o The timerTask uses >20% of the CPU
o There is 0% IDLE time in the CPU
o The crate becomes unresponsive and loses CA connections
o Typing 'dbpf "13LAB:scaler1.RATE","59"' fixes the problem immediately: the crate becomes responsive and CA connections are restored.
- We observe this problem on both the Joerger scaler and the SIS3820 scaler.
- The problem also happens if Autocount is enabled and .RAT1=60.
We are certain that this is a new problem, because we have autosave files from 2 vxWorks IOCs going all the way back to 2014 and .RATE was always 60 for those scalers. This includes the last run in December 2019.
We first observed the failures yesterday (the first day of the run, naturally!)
Nothing has changed in the scaler record, or in the device support for the Joerger scaler and the SIS3820 scaler since 2018. We have been running the master branch of these all the time, so the fact that it was working all of 2019 means the problem is unlikely to be in those modules.
The Joerger scaler does not use asyn, so the problem cannot be in asyn.
The main thing that has changed in this run is that we have updated from base 7.0.3 to 7.0.3.1.
Is there anything that could have changed in base 7.0.3.1 that might cause this behavior?
We can work around the problem for now by setting .RATE less than 60, but others are likely to be hit by the same problem.
Thanks,
Mark