EPICS Controls Argonne National Laboratory

Experimental Physics and
Industrial Control System

1994  1995  1996  1997  1998  1999  2000  2001  2002  2003  2004  2005  2006  2007  2008  2009  2010  2011  2012  2013  2014  2015  2016  2017  2018  2019  2020  <20212022  2023  2024  Index 1994  1995  1996  1997  1998  1999  2000  2001  2002  2003  2004  2005  2006  2007  2008  2009  2010  2011  2012  2013  2014  2015  2016  2017  2018  2019  2020  <20212022  2023  2024 
<== Date ==> <== Thread ==>

Subject: RE: cbLow consumption randomly stopping
From: "Daykin, Evan via Tech-talk" <tech-talk at aps.anl.gov>
To: Mark Rivers <rivers at cars.uchicago.edu>
Cc: "tech-talk at aps.anl.gov" <tech-talk at aps.anl.gov>
Date: Tue, 8 Jun 2021 18:03:39 +0000
> epicsMutexShowAll 1

Output is indeed the same each time it is run:
ellCount(&mutexList) 17481 ellCount(&freeList) 6
Does this imply these two are deadlocking against each other?

>What OS is this?
Debian Buster, uname -r => 4.19.0-5-amd64

>What version of EPICS base?
3.15.8

>Is the driver or device support using asyn? If so, what version?
No, it seems to be using epicspci/devlib2 to communicate with hardware.

 
@Andrew, here is some output from attaching gdb:
(gdb) info threads
  Id   Target Id                                           Frame 
  1    Thread 0x7fd5a60a5c80 (LWP 22198) "st.cmd"          0x00007fd5a9d62151 in __pselect (nfds=1, readfds=0x7fff0eb98db0, writefds=0x0, exceptfds=0x0, timeout=<optimized out>, sigmask=<optimized out>) at ../sysdeps/unix/sysv/linux/pselect.c:69
  2    Thread 0x7fd5a5899700 (LWP 22200) "errlog"          futex_wait_cancelable (private=0, expected=0, futex_word=0x55c1d2538054) at ../sysdeps/unix/sysv/linux/futex-internal.h:88
  3    Thread 0x7fd5a5818700 (LWP 22202) "Reader"          futex_wait_cancelable (private=0, expected=0, futex_word=0x55c1d2612600) at ../sysdeps/unix/sysv/linux/futex-internal.h:88
  4    Thread 0x7fd5a5797700 (LWP 22203) "capture"         __GI___libc_read (nbytes=176, buf=0x7fd598000b20, fd=9) at ../sysdeps/unix/sysv/linux/read.c:26
  5    Thread 0x7fd5a5716700 (LWP 22204) "Reader"          futex_wait_cancelable (private=0, expected=0, futex_word=0x55c1d26144d0) at ../sysdeps/unix/sysv/linux/futex-internal.h:88
  6    Thread 0x7fd5a5695700 (LWP 22205) "capture"         __GI___libc_read (nbytes=176, buf=0x7fd59c000b20, fd=13) at ../sysdeps/unix/sysv/linux/read.c:26
  7    Thread 0x7fd5a5614700 (LWP 22206) "Reader"          futex_wait_cancelable (private=0, expected=0, futex_word=0x55c1d26163c0) at ../sysdeps/unix/sysv/linux/futex-internal.h:88
  8    Thread 0x7fd5a5593700 (LWP 22207) "capture"         __GI___libc_read (nbytes=176, buf=0x7fd590000b20, fd=17) at ../sysdeps/unix/sysv/linux/read.c:26
  9    Thread 0x7fd5a5512700 (LWP 22208) "Reader"          futex_wait_cancelable (private=0, expected=0, futex_word=0x55c1d2618350) at ../sysdeps/unix/sysv/linux/futex-internal.h:88
  10   Thread 0x7fd5a5491700 (LWP 22209) "capture"         __GI___libc_read (nbytes=176, buf=0x7fd594000b20, fd=21) at ../sysdeps/unix/sysv/linux/read.c:26
  11   Thread 0x7fd5a5410700 (LWP 22217) "taskwd"          futex_abstimed_wait_cancelable (private=0, abstime=0x7fd5a540fcf0, expected=0, futex_word=0x55c1d2e56e00) at ../sysdeps/unix/sysv/linux/futex-internal.h:205
  12   Thread 0x7fd5a538f700 (LWP 22218) "timerQueue"      futex_abstimed_wait_cancelable (private=0, abstime=0x7fd5a538ecc0, expected=0, futex_word=0x55c1d2619b74) at ../sysdeps/unix/sysv/linux/futex-internal.h:205
* 13   Thread 0x7fd5a609a700 (LWP 22219) "cbLow-0"         futex_wait_cancelable (private=0, expected=0, futex_word=0x55c1d321ba40) at ../sysdeps/unix/sysv/linux/futex-internal.h:88
  14   Thread 0x7fd5a528e700 (LWP 22220) "cbLow-1"         futex_wait_cancelable (private=0, expected=0, futex_word=0x55c1d321ba40) at ../sysdeps/unix/sysv/linux/futex-internal.h:88
  15   Thread 0x7fd5a508d700 (LWP 22221) "cbLow-2"         futex_wait_cancelable (private=0, expected=0, futex_word=0x55c1d321ba40) at ../sysdeps/unix/sysv/linux/futex-internal.h:88
  16   Thread 0x7fd5a4e8c700 (LWP 22222) "cbLow-3"         futex_wait_cancelable (private=0, expected=0, futex_word=0x55c1d321ba40) at ../sysdeps/unix/sysv/linux/futex-internal.h:88
  17   Thread 0x7fd5a4c8b700 (LWP 22223) "cbLow-4"         futex_wait_cancelable (private=0, expected=0, futex_word=0x55c1d321ba40) at ../sysdeps/unix/sysv/linux/futex-internal.h:88
  18   Thread 0x7fd5a4a8a700 (LWP 22224) "cbLow-5"         futex_wait_cancelable (private=0, expected=0, futex_word=0x55c1d321ba40) at ../sysdeps/unix/sysv/linux/futex-internal.h:88
  19   Thread 0x7fd5a4889700 (LWP 22225) "cbLow-6"         futex_wait_cancelable (private=0, expected=0, futex_word=0x55c1d321ba40) at ../sysdeps/unix/sysv/linux/futex-internal.h:88
  20   Thread 0x7fd5a4688700 (LWP 22226) "cbLow-7"         futex_wait_cancelable (private=0, expected=0, futex_word=0x55c1d321ba40) at ../sysdeps/unix/sysv/linux/futex-internal.h:88
  21   Thread 0x7fd5a4487700 (LWP 22227) "cbMedium"        futex_wait_cancelable (private=0, expected=0, futex_word=0x55c1d31ed8c0) at ../sysdeps/unix/sysv/linux/futex-internal.h:88
  22   Thread 0x7fd5a4286700 (LWP 22228) "cbHigh"          futex_wait_cancelable (private=0, expected=0, futex_word=0x55c1d31edac0) at ../sysdeps/unix/sysv/linux/futex-internal.h:88
  23   Thread 0x7fd567fff700 (LWP 22229) "dbCaLink"        futex_wait_cancelable (private=0, expected=0, futex_word=0x55c1d31edeb4) at ../sysdeps/unix/sysv/linux/futex-internal.h:88
  24   Thread 0x7fd567dfe700 (LWP 22230) "timerQueue"      futex_abstimed_wait_cancelable (private=0, abstime=0x7fd567dfdcc0, expected=0, futex_word=0x55c1d31eef44) at ../sysdeps/unix/sysv/linux/futex-internal.h:205
  25   Thread 0x7fd55095c700 (LWP 22233) "scanOnce"        futex_wait_cancelable (private=0, expected=0, futex_word=0x55c1d3883b80) at ../sysdeps/unix/sysv/linux/futex-internal.h:88
  26   Thread 0x7fd55075b700 (LWP 22234) "scan-10"         futex_abstimed_wait_cancelable (private=0, abstime=0x7fd55075acd0, expected=0, futex_word=0x55c1d3883360) at ../sysdeps/unix/sysv/linux/futex-internal.h:205
  27   Thread 0x7fd55055a700 (LWP 22235) "scan-5"          futex_abstimed_wait_cancelable (private=0, abstime=0x7fd550559cd0, expected=0, futex_word=0x55c1d38834a0) at ../sysdeps/unix/sysv/linux/futex-internal.h:205
  28   Thread 0x7fd550359700 (LWP 22236) "scan-2"          futex_abstimed_wait_cancelable (private=0, abstime=0x7fd550358cd0, expected=0, futex_word=0x55c1d38835e0) at ../sysdeps/unix/sysv/linux/futex-internal.h:205
  29   Thread 0x7fd543fff700 (LWP 22237) "scan-1"          futex_abstimed_wait_cancelable (private=0, abstime=0x7fd543ffecd0, expected=0, futex_word=0x55c1d3883720) at ../sysdeps/unix/sysv/linux/futex-internal.h:205
  30   Thread 0x7fd543dfe700 (LWP 22238) "scan-0.5"        futex_abstimed_wait_cancelable (private=0, abstime=0x7fd543dfdcd0, expected=0, futex_word=0x55c1d3883860) at ../sysdeps/unix/sysv/linux/futex-internal.h:205
  31   Thread 0x7fd543bfd700 (LWP 22239) "scan-0.2"        futex_abstimed_wait_cancelable (private=0, abstime=0x7fd543bfccd0, expected=0, futex_word=0x55c1d38839a0) at ../sysdeps/unix/sysv/linux/futex-internal.h:205
  32   Thread 0x7fd5439fc700 (LWP 22240) "scan-0.1"        futex_abstimed_wait_cancelable (private=0, abstime=0x7fd5439fbcd0, expected=0, futex_word=0x55c1d3883ae0) at ../sysdeps/unix/sysv/linux/futex-internal.h:205
  33   Thread 0x7fd5437fb700 (LWP 22241) "asCaTask"        futex_wait_cancelable (private=0, expected=0, futex_word=0x55c1d3927c70) at ../sysdeps/unix/sysv/linux/futex-internal.h:88
  34   Thread 0x7fd5435fa700 (LWP 22242) "ipToAsciiProxy"  futex_wait_cancelable (private=0, expected=0, futex_word=0x7fd52c00fa34) at ../sysdeps/unix/sysv/linux/futex-internal.h:88
  35   Thread 0x7fd56418c700 (LWP 22243) "timerQueue"      futex_abstimed_wait_cancelable (private=0, abstime=0x7fd56418bcc0, expected=0, futex_word=0x7fd52c010224) at ../sysdeps/unix/sysv/linux/futex-internal.h:205
  36   Thread 0x7fd550158700 (LWP 22244) "CAC-UDP"         0x00007fd5a9d6b742 in __libc_recvfrom (fd=26, buf=0x7fd52c0127d8, len=65551, flags=0, addr=..., addrlen=0x7fd550157ca8) at ../sysdeps/unix/sysv/linux/recvfrom.c:27
  37   Thread 0x7fd5433f9700 (LWP 22245) "CAC-TCP-recv"    __libc_recv (flags=0, len=16384, buf=0x7fd54307b04c, fd=28) at ../sysdeps/unix/sysv/linux/recv.c:28
  38   Thread 0x7fd5431f8700 (LWP 22246) "CAC-TCP-send"    futex_wait_cancelable (private=0, expected=0, futex_word=0x7fd52000e830) at ../sysdeps/unix/sysv/linux/futex-internal.h:88
  39   Thread 0x7fd543076700 (LWP 22256) "CAS-TCP"         0x00007fd5a9d6b437 in __libc_accept (fd=29, addr=..., len=0x7fd543075cec) at ../sysdeps/unix/sysv/linux/accept.c:26
  40   Thread 0x7fd542f75700 (LWP 22257) "CAS-UDP"         0x00007fd5a9d6b742 in __libc_recvfrom (fd=30, buf=0x7fd50c001a50, len=65551, flags=0, addr=..., addrlen=0x7fd542f74ce8) at ../sysdeps/unix/sysv/linux/recvfrom.c:27
  41   Thread 0x7fd542e74700 (LWP 22258) "CAS-beacon"      0x00007fd5a9d37720 in __GI___nanosleep (requested_time=0x7fd542e73c50, remaining=0x7fd542e73c60) at ../sysdeps/unix/sysv/linux/nanosleep.c:28
  42   Thread 0x7fd542df3700 (LWP 22259) "CAC-event"       futex_wait_cancelable (private=0, expected=0, futex_word=0x7fd56008ccd0) at ../sysdeps/unix/sysv/linux/futex-internal.h:88
  43   Thread 0x7fd542cf2700 (LWP 22260) "logRestart"      futex_abstimed_wait_cancelable (private=0, abstime=0x7fd542cf1d30, expected=0, futex_word=0x55c1d39310d0) at ../sysdeps/unix/sysv/linux/futex-internal.h:205
  44   Thread 0x7fd542c71700 (LWP 22261) "caPutLog"        futex_wait_cancelable (private=0, expected=0, futex_word=0x7fd4f8000b90) at ../sysdeps/unix/sysv/linux/futex-internal.h:88
  45   Thread 0x7fd542bf0700 (LWP 22262) "logRestart"      futex_abstimed_wait_cancelable (private=0, abstime=0x7fd542befd30, expected=0, futex_word=0x55c1d393f270) at ../sysdeps/unix/sysv/linux/futex-internal.h:205
  46   Thread 0x7fd542b6f700 (LWP 22263) "save_restore"    futex_abstimed_wait_cancelable (private=0, abstime=0x7fd542b6e880, expected=0, futex_word=0x7fd4fc0204e0) at ../sysdeps/unix/sysv/linux/futex-internal.h:205
  47   Thread 0x7fd54296e700 (LWP 22264) "reccaster-etcd"  0x00007fd5a9d37720 in __GI___nanosleep (requested_time=0x7fd54296dc40, remaining=0x7fd54296dc50) at ../sysdeps/unix/sysv/linux/nanosleep.c:28
  48   Thread 0x7fd5428ed700 (LWP 22265) "recEtcdConsumer" futex_wait_cancelable (private=0, expected=0, futex_word=0x55c1d61a30d0) at ../sysdeps/unix/sysv/linux/futex-internal.h:88
  49   Thread 0x7fd54286c700 (LWP 22266) "recEtcdConsumer" futex_wait_cancelable (private=0, expected=0, futex_word=0x55c1d61a3380) at ../sysdeps/unix/sysv/linux/futex-internal.h:88
  50   Thread 0x7fd5427eb700 (LWP 22267) "recEtcdConsumer" futex_wait_cancelable (private=0, expected=0, futex_word=0x55c1d61a3630) at ../sysdeps/unix/sysv/linux/futex-internal.h:88
 -----
(gdb) thread 13
[Switching to thread 13 (Thread 0x7fd5a609a700 (LWP 22219))]
#0  futex_wait_cancelable (private=0, expected=0, futex_word=0x55c1d321ba40) at ../sysdeps/unix/sysv/linux/futex-internal.h:88
88	../sysdeps/unix/sysv/linux/futex-internal.h: No such file or directory.
------------
(gdb) bt
#0  futex_wait_cancelable (private=0, expected=0, futex_word=0x55c1d321ba40) at ../sysdeps/unix/sysv/linux/futex-internal.h:88
#1  __pthread_cond_wait_common (abstime=0x0, mutex=0x55c1d321b9f0, cond=0x55c1d321ba18) at pthread_cond_wait.c:502
#2  __pthread_cond_wait (cond=0x55c1d321ba18, mutex=0x55c1d321b9f0) at pthread_cond_wait.c:655
#3  0x00007fd5a9f3051b in epicsEventWait () from /lib/x86_64-linux-gnu/libCom.so.3.15.8
#4  0x00007fd5a9f28c99 in epicsEventMustWait () from /lib/x86_64-linux-gnu/libCom.so.3.15.8
#5  0x00007fd5a9f9c251 in ?? () from /lib/x86_64-linux-gnu/libdbCore.so.3.15.8
#6  0x00007fd5a9f2da7b in ?? () from /lib/x86_64-linux-gnu/libCom.so.3.15.8
#7  0x00007fd5a98e9fa3 in start_thread (arg=<optimized out>) at pthread_create.c:486
#8  0x00007fd5a9d6a4cf in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

All cbLow threads show same backtrace.

Thanks for the help,
Evan

-----Original Message-----
From: Mark Rivers <rivers at cars.uchicago.edu> 
Sent: Tuesday, June 8, 2021 1:27 PM
To: Daykin, Evan <daykin at frib.msu.edu>
Cc: tech-talk at aps.anl.gov
Subject: Re: cbLow consumption randomly stopping

[EXTERNAL] This email originated from outside of FRIB One possibility is that the thread the pulls the message out of the queue has locked up for some reason.One possibility is a deadlock.  When this happens you could try typingepicsMutexShowAll 1several times.  If the same mutexes are locked each time you type that, then there is likely a deadlock.What OS is this?What version of EPICS base?Is the driver or device support using asyn?  If so, what version?Mark________________________________From: Tech-talk <tech-talk-bounces at aps.anl.gov> on behalf of Daykin, Evan via Tech-talk <tech-talk at aps.anl.gov>Sent: Tuesday, June 8, 2021 12:10 PMTo: EPICS tech-talkSubject: cbLow consumption randomly stoppingHello tech-talk,We have an IOC repeatedly locking up with the infamous "cbLow ring buffer full" message. The IOC consumes data from our CAEN Pico-8 card via an EPICS driver at 100Hz (about 1200-1215 events per second total) , and at some point on the order of hours to days, seems to stop consuming callbacks from the queue. I wrote a script to track this, and it seems to all be happening in one shot. Below is some data I captured from the last time this happened. Before this, everything was happy for about 4 hours, but sometimes it takes up to a week for this to happen.TimeHigh-waterItems in qQ capacity%Oflws6/7/2021 19:22:4013012000006/7/2021 19:22:4113012000006/7/2021 19:22:4213012000006/7/2021 19:22:4413012000006/7/2021 19:22:45571571120004.806/7/2021 19:22:46177817781200014.806/7/2021 19:22:47299129911200024.906/7/2021 19:22:4842044204120003506/7/2021 19:22:4954065406120004506/7/2021 19:22:50661966191200055.206/7/2021 19:22:51783478341200065.306/7/2021 19:22:52904890481200075.406/7/2021 19:22:5310249102491200085.406/7/2021 19:22:5411462114621200095.506/7/2021 19:22:5612000120001200010016/7/2021 19:22:581200012000120001001Has anyone else experienced this? If so, what is the remedy? I can't seem to correlate it to anything unusual happening in the OS/kernel.Thanks,Evan

Replies:
Re: cbLow consumption randomly stopping Michael Davidsaver via Tech-talk
Re: cbLow consumption randomly stopping Johnson, Andrew N. via Tech-talk
References:
cbLow consumption randomly stopping Daykin, Evan via Tech-talk
Re: cbLow consumption randomly stopping Mark Rivers via Tech-talk

Navigate by Date:
Prev: Re: cbLow consumption randomly stopping Mark Rivers via Tech-talk
Next: Re: cbLow consumption randomly stopping Michael Davidsaver via Tech-talk
Index: 1994  1995  1996  1997  1998  1999  2000  2001  2002  2003  2004  2005  2006  2007  2008  2009  2010  2011  2012  2013  2014  2015  2016  2017  2018  2019  2020  <20212022  2023  2024 
Navigate by Thread:
Prev: Re: cbLow consumption randomly stopping Mark Rivers via Tech-talk
Next: Re: cbLow consumption randomly stopping Michael Davidsaver via Tech-talk
Index: 1994  1995  1996  1997  1998  1999  2000  2001  2002  2003  2004  2005  2006  2007  2008  2009  2010  2011  2012  2013  2014  2015  2016  2017  2018  2019  2020  <20212022  2023  2024 
ANJ, 08 Jun 2021 Valid HTML 4.01! · Home · News · About · Base · Modules · Extensions · Distributions · Download ·
· Search · EPICS V4 · IRMIS · Talk · Bugs · Documents · Links · Licensing ·