EPICS Controls Argonne National Laboratory

Experimental Physics and
Industrial Control System


Subject: Re: cbLow consumption randomly stopping
From: "Johnson, Andrew N. via Tech-talk" <tech-talk at aps.anl.gov>
To: "Daykin, Evan" <daykin at frib.msu.edu>
Cc: EPICS tech-talk <tech-talk at aps.anl.gov>
Date: Tue, 8 Jun 2021 17:25:56 +0000
Hi Evan,

On Jun 8, 2021, at 12:10 PM, Daykin, Evan via Tech-talk <tech-talk at aps.anl.gov> wrote:

We have an IOC that repeatedly locks up with the infamous “cbLow ring buffer full” message. The IOC consumes data from our CAEN Pico-8 card via an EPICS driver at 100 Hz (about 1200-1215 events per second in total), and after running for anywhere from hours to days it seems to stop consuming callbacks from the queue. I wrote a script to track this, and the failure appears to happen all at once. Below is the data I captured the last time it happened. Before this, everything ran normally for about 4 hours, but it sometimes takes up to a week for the problem to appear.
 
Time               High-water  Items in q  Q capacity      %  Oflws
6/7/2021 19:22:40          13           0       12000      0      0
6/7/2021 19:22:41          13           0       12000      0      0
6/7/2021 19:22:42          13           0       12000      0      0
6/7/2021 19:22:44          13           0       12000      0      0
6/7/2021 19:22:45         571         571       12000    4.8      0
6/7/2021 19:22:46        1778        1778       12000   14.8      0
6/7/2021 19:22:47        2991        2991       12000   24.9      0
6/7/2021 19:22:48        4204        4204       12000     35      0
6/7/2021 19:22:49        5406        5406       12000     45      0
6/7/2021 19:22:50        6619        6619       12000   55.2      0
6/7/2021 19:22:51        7834        7834       12000   65.3      0
6/7/2021 19:22:52        9048        9048       12000   75.4      0
6/7/2021 19:22:53       10249       10249       12000   85.4      0
6/7/2021 19:22:54       11462       11462       12000   95.5      0
6/7/2021 19:22:56       12000       12000       12000    100      1
6/7/2021 19:22:58       12000       12000       12000    100      1

Has anyone else experienced this? If so, what is the remedy? I can’t seem to correlate it to anything unusual happening in the OS/kernel.
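
As a quick check on the figures in the table, the short Python sketch below computes how fast the queue filled between consecutive one-second samples. The sample values are copied from the table; the function name `fill_rates` and the use of the seconds field as a time offset are choices made only for this illustration. If the fill rate matches the stated event rate of about 1200-1215 per second, that suggests consumption stopped entirely rather than merely slowing down.

```python
# (second within 19:22, items in queue), copied from the table above.
samples = [(45, 571), (46, 1778), (47, 2991), (48, 4204), (49, 5406),
           (50, 6619), (51, 7834), (52, 9048), (53, 10249), (54, 11462)]


def fill_rates(samples):
    """Return the queue growth per second between consecutive samples."""
    return [(q1 - q0) / (t1 - t0)
            for (t0, q0), (t1, q1) in zip(samples, samples[1:])]


rates = fill_rates(samples)
print(min(rates), max(rates))  # prints "1201.0 1215.0"
```

Every interval falls between 1201 and 1215 items per second, which is the full production rate quoted in the message, so nothing was being removed from the queue once the stall began.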

The three callback tasks are part of a general-purpose facility that exists to call a function with some context data in another thread. If your cbLow task is hanging, that implies that one of the functions it calls is blocked and isn’t returning, so the next function never gets run. You should be able to see exactly what’s causing that from a stack trace (backtrace) of the cbLow thread; whatever it’s doing at the time is almost certainly where your problem lies, and the callback subsystem is just how it got there. It’s most likely to be inside a device support, a driver, or some user code, so it’s going to be specific to your IOC and you’ll have to look further to find it.
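
The mechanism described above can be illustrated with a small Python simulation. This is only a sketch and not the EPICS implementation: a bounded `queue.Queue` stands in for the cbLow ring buffer, a single worker thread stands in for the cbLow task, and the names `run_demo`, `good_callback`, and `stuck_callback` are hypothetical. It shows that one callback that never returns is enough to make every later request pile up until the buffer overflows.

```python
import queue
import threading


def run_demo(capacity=5, n_requests=20):
    q = queue.Queue(maxsize=capacity)  # stand-in for the cbLow ring buffer
    entered = threading.Event()        # set once the stuck callback is running
    release = threading.Event()        # set to let the stuck callback return
    completed = []

    def good_callback(arg):
        completed.append(arg)

    def stuck_callback(arg):
        entered.set()
        release.wait()                 # blocks, like a hung driver call

    def worker():
        # One thread runs callbacks strictly in order, one at a time.
        while True:
            func, arg = q.get()
            if func is None:           # sentinel: shut down
                return
            func(arg)

    t = threading.Thread(target=worker, daemon=True)
    t.start()

    q.put((good_callback, 0))
    q.put((stuck_callback, None))
    entered.wait()                     # the worker is now blocked

    overflows = 0
    for i in range(1, n_requests + 1):
        try:
            q.put_nowait((good_callback, i))
        except queue.Full:             # analogue of "ring buffer full"
            overflows += 1
    queued_while_stuck = q.qsize()

    release.set()                      # unblock so the demo can finish
    q.put((None, None))
    t.join()
    return {"overflows": overflows,
            "queued_while_stuck": queued_while_stuck,
            "completed": completed}


result = run_demo()
print(result)
```

With the default arguments, the five requests that fit in the queue wait behind the blocked callback and the remaining fifteen are rejected. On a real IOC running on Linux, one way to find the blocked function is to attach gdb to the IOC process and run `thread apply all bt`, then look at the frames of the cbLow thread.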

HTH,

- Andrew

-- 
Complexity comes for free, simplicity you have to work for.


References:
cbLow consumption randomly stopping Daykin, Evan via Tech-talk
