On 5/19/21 12:44 PM, Michael Davidsaver wrote:
> One thread has triggered an assertion failure while holding a record scan lock and an event queue lock.
> There should be something in the IOC log about this.
>
> https://github.com/epics-base/epics-base/blob/c7e42fab3c10f6b0c7464482f507e8ffac502937/src/ioc/db/dbEvent.c#L781
cf. https://bugs.launchpad.net/epics-base/+bug/541371
>> Thread 61 (Thread 0x7f394bfff700 (LWP 82077)):
>> #0 pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
>> #1 0x00007f3a04a9a47b in epicsEventWait () from /home/jtagger/EPICS/R3_15_8-x86_64/base-3.15.8/lib/linux-x86_64/libCom.so.3.15.8
>> #2 0x00007f3a04a96c7d in epicsAssert () from /home/jtagger/EPICS/R3_15_8-x86_64/base-3.15.8/lib/linux-x86_64/libCom.so.3.15.8
>> #3 0x00007f3a04f5fedb in db_queue_event_log () from /home/jtagger/EPICS/R3_15_8-x86_64/base-3.15.8/lib/linux-x86_64/libdbCore.so.3.15.8
>> #4 0x00007f3a04f6012d in db_post_single_event () from /home/jtagger/EPICS/R3_15_8-x86_64/base-3.15.8/lib/linux-x86_64/libdbCore.so.3.15.8
>> #5 0x00007f3a04f8a1e1 in event_add_action () from /home/jtagger/EPICS/R3_15_8-x86_64/base-3.15.8/lib/linux-x86_64/libdbCore.so.3.15.8
>> #6 0x00007f3a04f8b72e in camessage () from /home/jtagger/EPICS/R3_15_8-x86_64/base-3.15.8/lib/linux-x86_64/libdbCore.so.3.15.8
>> #7 0x00007f3a04f881e3 in camsgtask () from /home/jtagger/EPICS/R3_15_8-x86_64/base-3.15.8/lib/linux-x86_64/libdbCore.so.3.15.8
>> #8 0x00007f3a04a97d0c in start_routine () from /home/jtagger/EPICS/R3_15_8-x86_64/base-3.15.8/lib/linux-x86_64/libCom.so.3.15.8
>> #9 0x00007f3a03c72064 in start_thread (arg=0x7f394bfff700) at pthread_create.c:309
>> #10 0x00007f3a03f6f62d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111
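To illustrate why one thread in this state can stall the rest of the IOC, here is a minimal Python sketch. It is not EPICS code: the lock names and the `failing_thread` helper are hypothetical stand-ins. It assumes only what the trace above suggests, namely that the thread that failed the assertion is parked in a wait and never releases the two locks it holds.

```python
import threading

scan_lock = threading.Lock()         # stands in for the record scan lock
event_queue_lock = threading.Lock()  # stands in for the event queue lock
locks_taken = threading.Event()
never_set = threading.Event()        # nothing ever sets this: the "suspended" state


def failing_thread():
    """Take both locks, then park forever, like a thread suspended after a failed assertion."""
    scan_lock.acquire()
    event_queue_lock.acquire()
    locks_taken.set()
    never_set.wait()  # never returns, so neither lock is ever released


def try_lock(lock, timeout=0.2):
    """Return True if the lock could be taken within the timeout."""
    got_it = lock.acquire(timeout=timeout)
    if got_it:
        lock.release()
    return got_it


# Daemon thread so the interpreter can still exit while it is parked.
threading.Thread(target=failing_thread, daemon=True).start()
locks_taken.wait()

scan_ok = try_lock(scan_lock)
queue_ok = try_lock(event_queue_lock)
print("scan lock available:", scan_ok)
print("event queue lock available:", queue_ok)
```

Every other thread that needs either lock waits indefinitely (here it merely times out), which would be consistent with the scan and CA server threads in the quoted traces sitting in `pthread_mutex_lock`.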
>
>
>
> On 5/19/21 10:16 AM, Tagger, Jueri via Tech-talk wrote:
>> Thanks to all for the good advice!
>>
>> Juri
>>
>>
>>
>> Kay wrote:
>>
>>
>>
>> Well, good.
>>
>>
>>
>> I would post that to tech talk, maybe somebody has an idea what's happening.
>>
>> Or at least there's now a record in case it happens again.
>>
>>
>>
>> Sure looks like several threads are stuck waiting for a semaphore:
>>
>>
>>
>>
>>
>> Thread 84 (Thread 0x7f39b98b8700 (LWP 66431)):
>> #0 __lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135
>> #1 0x00007f3a03c74494 in _L_lock_952 () from /lib/x86_64-linux-gnu/libpthread.so.0
>> #2 0x00007f3a03c742f6 in __GI___pthread_mutex_lock (mutex=0x7f3978038770) at ../nptl/pthread_mutex_lock.c:114
>> #3 0x00007f3a04a99fb6 in epicsMutexOsdLock () from /home/jtagger/EPICS/R3_15_8-x86_64/base-3.15.8/lib/linux-x86_64/libCom.so.3.15.8
>> #4 0x00007f3a04f8887d in casAccessRightsCB () from /home/jtagger/EPICS/R3_15_8-x86_64/base-3.15.8/lib/linux-x86_64/libdbCore.so.3.15.8
>> #5 0x00007f3a04a76f7e in asComputePvt.part.1 () from /home/jtagger/EPICS/R3_15_8-x86_64/base-3.15.8/lib/linux-x86_64/libCom.so.3.15.8
>> #6 0x00007f3a04a77127 in asComputeAsgPvt.part.2 () from /home/jtagger/EPICS/R3_15_8-x86_64/base-3.15.8/lib/linux-x86_64/libCom.so.3.15.8
>> #7 0x00007f3a04a77868 in asComputeAsg () from /home/jtagger/EPICS/R3_15_8-x86_64/base-3.15.8/lib/linux-x86_64/libCom.so.3.15.8
>> #8 0x00007f3a04d10bc0 in oldSubscription::current(epicsGuard<epicsMutex>&, unsigned int, unsigned long, void const*) () from /home/jtagger/EPICS/R3_15_8-x86_64/base-3.15.8/lib/linux-x86_64/libca.so.3.15.8
>> #9 0x00007f3a04cef017 in cac::eventRespAction(callbackManager&, tcpiiu&, epicsTime const&, caHdrLargeArray const&, void*) () from /home/jtagger/EPICS/R3_15_8-x86_64/base-3.15.8/lib/linux-x86_64/libca.so.3.15.8
>> #10 0x00007f3a04d07616 in tcpiiu::processIncoming(epicsTime const&, callbackManager&) () from /home/jtagger/EPICS/R3_15_8-x86_64/base-3.15.8/lib/linux-x86_64/libca.so.3.15.8
>> #11 0x00007f3a04d098aa in tcpRecvThread::run() () from /home/jtagger/EPICS/R3_15_8-x86_64/base-3.15.8/lib/linux-x86_64/libca.so.3.15.8
>> #12 0x00007f3a04a92089 in epicsThreadCallEntryPoint () from /home/jtagger/EPICS/R3_15_8-x86_64/base-3.15.8/lib/linux-x86_64/libCom.so.3.15.8
>> #13 0x00007f3a04a97d0c in start_routine () from /home/jtagger/EPICS/R3_15_8-x86_64/base-3.15.8/lib/linux-x86_64/libCom.so.3.15.8
>> #14 0x00007f3a03c72064 in start_thread (arg=0x7f39b98b8700) at pthread_create.c:309
>> #15 0x00007f3a03f6f62d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111
>>
>>
>>
>>
>>
>> --> Trying to re-compute access security?
>>
>>
>>
>>
>>
>> Thread 142 (Thread 0x7f3a0222f700 (LWP 453879)):
>> #0 __lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135
>> #1 0x00007f3a03c74494 in _L_lock_952 () from /lib/x86_64-linux-gnu/libpthread.so.0
>> #2 0x00007f3a03c742f6 in __GI___pthread_mutex_lock (mutex=0x1771580) at ../nptl/pthread_mutex_lock.c:114
>> #3 0x00007f3a04a99fb6 in epicsMutexOsdLock () from /home/jtagger/EPICS/R3_15_8-x86_64/base-3.15.8/lib/linux-x86_64/libCom.so.3.15.8
>> #4 0x00007f3a04f4c01b in dbScanLock () from /home/jtagger/EPICS/R3_15_8-x86_64/base-3.15.8/lib/linux-x86_64/libdbCore.so.3.15.8
>> #5 0x00007f3a04f5cbdc in scanList () from /home/jtagger/EPICS/R3_15_8-x86_64/base-3.15.8/lib/linux-x86_64/libdbCore.so.3.15.8
>> #6 0x00007f3a04f5cd2f in ioscanCallback () from /home/jtagger/EPICS/R3_15_8-x86_64/base-3.15.8/lib/linux-x86_64/libdbCore.so.3.15.8
>> #7 0x00007f3a04f66e05 in callbackTask () from /home/jtagger/EPICS/R3_15_8-x86_64/base-3.15.8/lib/linux-x86_64/libdbCore.so.3.15.8
>> #8 0x00007f3a04a97d0c in start_routine () from /home/jtagger/EPICS/R3_15_8-x86_64/base-3.15.8/lib/linux-x86_64/libCom.so.3.15.8
>> #9 0x00007f3a03c72064 in start_thread (arg=0x7f3a0222f700) at pthread_create.c:309
>> #10 0x00007f3a03f6f62d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111
>>
>>
>>
>>
>>
>> --> Trying to scan some record?
>>
>>
>>
>>
>>
>> #0 __lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135
>> #1 0x00007f3a03c74494 in _L_lock_952 () from /lib/x86_64-linux-gnu/libpthread.so.0
>> #2 0x00007f3a03c742f6 in __GI___pthread_mutex_lock (mutex=0x1771580) at ../nptl/pthread_mutex_lock.c:114
>> #3 0x00007f3a04a99fb6 in epicsMutexOsdLock () from /home/jtagger/EPICS/R3_15_8-x86_64/base-3.15.8/lib/linux-x86_64/libCom.so.3.15.8
>> #4 0x00007f3a04f4c01b in dbScanLock () from /home/jtagger/EPICS/R3_15_8-x86_64/base-3.15.8/lib/linux-x86_64/libdbCore.so.3.15.8
>> #5 0x00007f3a04f647d1 in dbChannel_get_count () from /home/jtagger/EPICS/R3_15_8-x86_64/base-3.15.8/lib/linux-x86_64/libdbCore.so.3.15.8
>> #6 0x00007f3a04f8ad52 in read_notify_action () from /home/jtagger/EPICS/R3_15_8-x86_64/base-3.15.8/lib/linux-x86_64/libdbCore.so.3.15.8
>> #7 0x00007f3a04f8b72e in camessage () from /home/jtagger/EPICS/R3_15_8-x86_64/base-3.15.8/lib/linux-x86_64/libdbCore.so.3.15.8
>> #8 0x00007f3a04f881e3 in camsgtask () from /home/jtagger/EPICS/R3_15_8-x86_64/base-3.15.8/lib/linux-x86_64/libdbCore.so.3.15.8
>> #9 0x00007f3a04a97d0c in start_routine () from /home/jtagger/EPICS/R3_15_8-x86_64/base-3.15.8/lib/linux-x86_64/libCom.so.3.15.8
>> #10 0x00007f3a03c72064 in start_thread (arg=0x7f394adf3700) at pthread_create.c:309
>> #11 0x00007f3a03f6f62d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111
>>
>>
>>
>> --> Trying to reply to a CA read, about to lock the channel to get the data size (element count)?
>>
>>
>>
>> -Kay
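Sorting blocked threads by hand gets tedious with a dump this size. As a rough sketch, the Python below groups threads by the mutex address they are blocked on. It assumes the gdb output format shown in this thread (a `Thread N (...)` header followed by a `pthread_mutex_lock (mutex=0x...)` frame); the function name `waiters_by_mutex` and the abbreviated `sample` text are made up for illustration.

```python
import re
from collections import defaultdict

# Patterns matching the gdb output format quoted in this thread.
THREAD_RE = re.compile(r"Thread (\d+) \(Thread 0x[0-9a-f]+ \(LWP \d+\)\)")
MUTEX_RE = re.compile(r"pthread_mutex_lock \(mutex=(0x[0-9a-f]+)\)")


def waiters_by_mutex(backtrace_text):
    """Map each mutex address to the gdb thread numbers blocked in pthread_mutex_lock on it."""
    waiters = defaultdict(list)
    current = None
    for line in backtrace_text.splitlines():
        m = THREAD_RE.search(line)
        if m:
            current = int(m.group(1))
            continue
        m = MUTEX_RE.search(line)
        # Frames that appear before any thread header are skipped.
        if m and current is not None:
            waiters[m.group(1)].append(current)
    return dict(waiters)


# Abbreviated sample: only the header and the mutex frame of two quoted traces.
sample = """\
Thread 84 (Thread 0x7f39b98b8700 (LWP 66431)):
#2 0x00007f3a03c742f6 in __GI___pthread_mutex_lock (mutex=0x7f3978038770) at ../nptl/pthread_mutex_lock.c:114
Thread 142 (Thread 0x7f3a0222f700 (LWP 453879)):
#2 0x00007f3a03c742f6 in __GI___pthread_mutex_lock (mutex=0x1771580) at ../nptl/pthread_mutex_lock.c:114
"""

result = waiters_by_mutex(sample)
for mutex, threads in result.items():
    print(mutex, "<-", threads)
```

In the traces quoted above, mutex 0x1771580 shows up in both the scanList trace and the dbChannel_get_count trace, so both of those threads are waiting on the same lock. A trace pasted without its `Thread N` header would need the header restored before this sketch can attribute it correctly.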
>>
>>
>>
>> *From:* Mark Rivers <rivers at cars.uchicago.edu>
>> *Sent:* Wednesday, May 19, 2021 12:19 PM
>> *To:* Tagger, Jueri <jtagger at bnl.gov>
>> *Cc:* tech-talk at aps.anl.gov
>> *Subject:* RE: EPICS stopped
>>
>>
>>
>> Is this Linux or some other OS?
>>
>>
>>
>> It looks like you will need to kill the IOC process.
>>
>>
>>
>> Mark
>>
>>
>>
>>
>>
>> *From:* Tagger, Jueri <jtagger at bnl.gov>
>> *Sent:* Wednesday, May 19, 2021 11:15 AM
>> *To:* Mark Rivers <rivers at cars.uchicago.edu>
>> *Cc:* tech-talk at aps.anl.gov
>> *Subject:* RE: EPICS stopped
>>
>>
>>
>> ^C has no effect whatsoever. I typed epicsMutexShowAll 1 three times at the blank screen and got no response.
>>
>>
>>
>>
>>
>> *From:* Mark Rivers <rivers at cars.uchicago.edu>
>> *Sent:* Wednesday, May 19, 2021 12:12 PM
>> *To:* Tagger, Jueri <jtagger at bnl.gov>
>> *Cc:* tech-talk at aps.anl.gov
>> *Subject:* RE: EPICS stopped
>>
>>
>>
>> Can you type ^C to abort the casr command? It could be a deadlock. If that works then type
>>
>>
>>
>> epicsMutexShowAll 1
>>
>>
>>
>> several times and see if the same mutexes are always locked. That is the symptom of a deadlock.
>>
>>
>>
>> Mark
>>
>>
>>
>>
>>
>> *From:* Tech-talk <tech-talk-bounces at aps.anl.gov> *On Behalf Of* Tagger, Jueri via Tech-talk
>> *Sent:* Wednesday, May 19, 2021 11:04 AM
>> *To:* tech-talk at aps.anl.gov
>> *Subject:* EPICS stopped
>>
>>
>>
>> Hi,
>>
>> We suddenly have a situation where we can no longer connect to CA channels (while the existing connections are still served).
>>
>> casr 4
>> 181888 bytes allocated
>> Send Lock:
>> epicsMutexId 0x7f3978036420 source ../../../src/ioc/rsrv/caservertask.c line 1231
>> Put Notify Lock:
>> epicsMutexId 0x7f3978036450 source ../../../src/ioc/rsrv/caservertask.c line 1232
>> Address Queue Lock:
>> epicsMutexId 0x7f3978036480 source ../../../src/ioc/rsrv/caservertask.c line 1233
>> Event Queue Lock:
>> epicsMutexId 0x7f39780364b0 source ../../../src/ioc/rsrv/caservertask.c line 1234
>> Block Semaphore:
>> epicsEvent 0x7f3978010c90: full
>> pthread_mutex = 0x7f3978010c90, pthread_cond = 0x7f3978010cb8
>> TCP client at 10.0.153.12:54302 'box64-3':
>> User 'rose', V4.13, Priority = 0, 75 Channels
>> Task Id = 0x7f397803c750, Socket FD = 44
>> 2262.05 secs since last send, 2262.04 secs since last receive
>> Unprocessed request bytes = 424, Undelivered response bytes = 0
>> State = up
>>
>>
>>
>> Then the output stopped, and the console no longer responds.
>>
>>
>>
>> EPICS_BASE=base-3.15.8
>>
>>
>>
>> Any ideas?
>>
>>
>>
>> J. Tagger
>>
>> NSLS-II
>>
>