EPICS Controls Argonne National Laboratory

Experimental Physics and
Industrial Control System


Subject: Re: cbLow consumption randomly stopping
From: "Johnson, Andrew N. via Tech-talk" <tech-talk at aps.anl.gov>
To: "Daykin, Evan" <daykin at frib.msu.edu>
Cc: EPICS tech-talk <tech-talk at aps.anl.gov>
Date: Tue, 8 Jun 2021 19:21:25 +0000

Hi Evan,

On Jun 8, 2021, at 1:03 PM, Daykin, Evan via Tech-talk <tech-talk at aps.anl.gov> wrote:

@Andrew, here is some output from attaching gdb:
(gdb) info threads
 Id   Target Id                                           Frame

* 13   Thread 0x7fd5a609a700 (LWP 22219) "cbLow-0"         futex_wait_cancelable (private=0, expected=0, futex_word=0x55c1d321ba40) at ../sysdeps/unix/sysv/linux/futex-internal.h:88
 14   Thread 0x7fd5a528e700 (LWP 22220) "cbLow-1"         futex_wait_cancelable (private=0, expected=0, futex_word=0x55c1d321ba40) at ../sysdeps/unix/sysv/linux/futex-internal.h:88
 15   Thread 0x7fd5a508d700 (LWP 22221) "cbLow-2"         futex_wait_cancelable (private=0, expected=0, futex_word=0x55c1d321ba40) at ../sysdeps/unix/sysv/linux/futex-internal.h:88
 16   Thread 0x7fd5a4e8c700 (LWP 22222) "cbLow-3"         futex_wait_cancelable (private=0, expected=0, futex_word=0x55c1d321ba40) at ../sysdeps/unix/sysv/linux/futex-internal.h:88
 17   Thread 0x7fd5a4c8b700 (LWP 22223) "cbLow-4"         futex_wait_cancelable (private=0, expected=0, futex_word=0x55c1d321ba40) at ../sysdeps/unix/sysv/linux/futex-internal.h:88
 18   Thread 0x7fd5a4a8a700 (LWP 22224) "cbLow-5"         futex_wait_cancelable (private=0, expected=0, futex_word=0x55c1d321ba40) at ../sysdeps/unix/sysv/linux/futex-internal.h:88
 19   Thread 0x7fd5a4889700 (LWP 22225) "cbLow-6"         futex_wait_cancelable (private=0, expected=0, futex_word=0x55c1d321ba40) at ../sysdeps/unix/sysv/linux/futex-internal.h:88
 20   Thread 0x7fd5a4688700 (LWP 22226) "cbLow-7"         futex_wait_cancelable (private=0, expected=0, futex_word=0x55c1d321ba40) at ../sysdeps/unix/sysv/linux/futex-internal.h:88
 21   Thread 0x7fd5a4487700 (LWP 22227) "cbMedium"        futex_wait_cancelable (private=0, expected=0, futex_word=0x55c1d31ed8c0) at ../sysdeps/unix/sysv/linux/futex-internal.h:88
 22   Thread 0x7fd5a4286700 (LWP 22228) "cbHigh"          futex_wait_cancelable (private=0, expected=0, futex_word=0x55c1d31edac0) at ../sysdeps/unix/sysv/linux/futex-internal.h:88

Oh, you’re using parallel cbLow threads: there are 8 of them instead of the 1 I was expecting to see. That’s supported but not particularly common, so the callback code has been less well exercised in that configuration (we don’t use it at APS).
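
The thread listing can also be checked mechanically. Below is a minimal Python sketch, assuming the `info threads` output has been saved as text. The sample is trimmed from the listing above to three of the eight cbLow rows, and `group_by_futex` is a hypothetical helper name. It groups threads by the futex word they are blocked on. The cbLow threads all share one address, which is consistent with them waiting on a single shared condition variable, while cbMedium and cbHigh each have their own.

```python
import re
from collections import defaultdict

# Trimmed sample of the gdb "info threads" listing (three of the eight cbLow rows).
SAMPLE = '''
* 13   Thread 0x7fd5a609a700 (LWP 22219) "cbLow-0"   futex_wait_cancelable (private=0, expected=0, futex_word=0x55c1d321ba40)
  14   Thread 0x7fd5a528e700 (LWP 22220) "cbLow-1"   futex_wait_cancelable (private=0, expected=0, futex_word=0x55c1d321ba40)
  20   Thread 0x7fd5a4688700 (LWP 22226) "cbLow-7"   futex_wait_cancelable (private=0, expected=0, futex_word=0x55c1d321ba40)
  21   Thread 0x7fd5a4487700 (LWP 22227) "cbMedium"  futex_wait_cancelable (private=0, expected=0, futex_word=0x55c1d31ed8c0)
  22   Thread 0x7fd5a4286700 (LWP 22228) "cbHigh"    futex_wait_cancelable (private=0, expected=0, futex_word=0x55c1d31edac0)
'''

# Capture the quoted thread name and the futex_word address on each row.
ROW = re.compile(r'"(?P<name>[^"]+)".*?futex_word=(?P<addr>0x[0-9a-f]+)')


def group_by_futex(text):
    """Return {futex_word address: [thread names blocked on it]}."""
    groups = defaultdict(list)
    for line in text.splitlines():
        match = ROW.search(line)
        if match:
            groups[match.group("addr")].append(match.group("name"))
    return dict(groups)


if __name__ == "__main__":
    for addr, names in group_by_futex(SAMPLE).items():
        print(addr, names)
```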


-----
(gdb) thread 13
[Switching to thread 13 (Thread 0x7fd5a609a700 (LWP 22219))]
#0  futex_wait_cancelable (private=0, expected=0, futex_word=0x55c1d321ba40) at ../sysdeps/unix/sysv/linux/futex-internal.h:88
88 ../sysdeps/unix/sysv/linux/futex-internal.h: No such file or directory.
------------
(gdb) bt
#0  futex_wait_cancelable (private=0, expected=0, futex_word=0x55c1d321ba40) at ../sysdeps/unix/sysv/linux/futex-internal.h:88
#1  __pthread_cond_wait_common (abstime=0x0, mutex=0x55c1d321b9f0, cond=0x55c1d321ba18) at pthread_cond_wait.c:502
#2  __pthread_cond_wait (cond=0x55c1d321ba18, mutex=0x55c1d321b9f0) at pthread_cond_wait.c:655
#3  0x00007fd5a9f3051b in epicsEventWait () from /lib/x86_64-linux-gnu/libCom.so.3.15.8
#4  0x00007fd5a9f28c99 in epicsEventMustWait () from /lib/x86_64-linux-gnu/libCom.so.3.15.8
#5  0x00007fd5a9f9c251 in ?? () from /lib/x86_64-linux-gnu/libdbCore.so.3.15.8
#6  0x00007fd5a9f2da7b in ?? () from /lib/x86_64-linux-gnu/libCom.so.3.15.8
#7  0x00007fd5a98e9fa3 in start_thread (arg=<optimized out>) at pthread_create.c:486
#8  0x00007fd5a9d6a4cf in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

All cbLow threads show same backtrace.

Hmm, that last bit suggests the problem is more likely to be in the callback subsystem itself: in our ring buffer implementation, or in the epicsEvent code (or the underlying futex) that wakes up the individual threads. The queue has filled up but the threads are all idle, which implies the epicsEventSignal() calls aren’t waking the epicsEventMustWait() in any of the callbackTask() instances.
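
To make the suspected failure mode concrete, here is a simplified Python model of the wake-up pattern described above. It is not the EPICS source: `BinaryEvent`, `CallbackQueue`, `request()` and `stop()` are illustrative names standing in for epicsEvent, the callback ring buffer, the request path, and shutdown. The sketch assumes one wake-up event shared by all worker threads, and that a worker re-signals the event when it leaves work behind in the queue.

```python
import threading
from collections import deque


class BinaryEvent:
    """Auto-reset event: wait() consumes one signal; extra signals do not pile up."""

    def __init__(self):
        self._cond = threading.Condition()
        self._full = False

    def signal(self):
        with self._cond:
            self._full = True
            self._cond.notify()

    def wait(self):
        with self._cond:
            while not self._full:
                self._cond.wait()
            self._full = False


class CallbackQueue:
    """Bounded queue drained by several workers that share one wake-up event."""

    def __init__(self, size, workers):
        self._ring = deque()
        self._size = size
        self._lock = threading.Lock()
        self._wake = BinaryEvent()
        self._shutdown = False
        self._threads = [
            threading.Thread(target=self._task, name=f"cbLow-{i}")
            for i in range(workers)
        ]
        for thread in self._threads:
            thread.start()

    def request(self, fn):
        """Queue fn and wake a worker; return False if the queue is full."""
        with self._lock:
            if len(self._ring) >= self._size:
                return False
            self._ring.append(fn)
        self._wake.signal()
        return True

    def _pop(self):
        """Return (callback or None, whether more entries remain)."""
        with self._lock:
            if not self._ring:
                return None, False
            return self._ring.popleft(), bool(self._ring)

    def _task(self):
        while True:
            self._wake.wait()
            stopping = self._shutdown  # read before draining so nothing is left behind
            while True:
                fn, more = self._pop()
                if fn is None:
                    break
                if more:
                    self._wake.signal()  # let a sibling help with the backlog
                fn()
            if stopping:
                self._wake.signal()  # pass the shutdown wake-up along
                return

    def stop(self):
        self._shutdown = True
        self._wake.signal()
        for thread in self._threads:
            thread.join()
```

In this model, constructing `CallbackQueue(size=2, workers=0)` and calling `request()` three times returns `True`, `True`, `False`: the queue fills because nothing is draining it. From the outside, a wake-up that was signalled but never delivered to any waiting worker would look the same, which matches the symptom of a full queue with every cbLow thread parked in its wait.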

Michael, can gdb give any useful detail about the state of the futex?

- Andrew

-- 
Complexity comes for free, simplicity you have to work for.


