We are occasionally having Channel Access problems on some R3.14.8.2
vxWorks IOCs. It seems that one of the database semaphores isn't being
released for some reason, and this is screwing everything up. No task is
suspended, and there are no inverted priorities indicating a deadlock
(though that is no guarantee that there isn't one). Has anyone else seen this?
The details follow.
The simplest symptom is that caget fails as follows:
[npr78@i06-ws002 ~]$ caget BL06I-AL-SLITS-01:YA
Read operation timed out: some PV data was not read.
BL06I-AL-SLITS-01:YA 0
CA.Client.Exception...............................................
Warning: "Virtual circuit disconnect"
Context: "op=0, channel=BL06I-AL-SLITS-01:YA, type=DBR_TIME_DOUBLE,
count=1, ctx="BL06I-MO-IOC-01.diamond.ac.uk:5064""
Source File: ../getCopy.cpp line 82
Current Time: Fri Nov 03 2006 16:30:00.426411000
..................................................................
If I try a dbgf on the IOC to read the same PV, it hangs, and the
stack trace emitted after Ctrl-C is as follows:
BL06I-MO-IOC-01 -> dbgf "BL06I-AL-SLITS-01:YA"
231f7c vxTaskEntry +68 : shell ()
1f81e0 shell +190: 1f820c ()
1f840c shell +3bc: execute ()
1f8590 execute +d8 : yyparse ()
2122d0 yyparse +71c: 210668 ()
2107ec yystart +96c: dbgf ()
1e752e18 dbgf +15c: dbGetField ()
1e73f3fc dbGetField +68 : dbScanLock ()
1e73c70c dbScanLock +1b4: epicsMutexLock ()
1e8089c0 epicsMutexLock +24 : semTake ()
2283ac semTake +13c: semMTake ()
tShell restarted.
So the shell is hanging because it can't get a lock in dbScanLock.
Address 0x1e73c70c is somewhere in the middle of dbScanLock; the next
routine is dbScanUnlock, and the addresses of the two routines are:
BL06I-MO-IOC-01 -> lkup "dbScanLock"
dbScanLock 0x1e73c558 text (BL06I-MO-IOC-01.munch)
BL06I-MO-IOC-01 -> lkup "dbScanUnlock"
dbScanUnlock 0x1e73c8b8 text (BL06I-MO-IOC-01.munch)
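As a quick check of the "somewhere in the middle" claim, the offset can be worked out from the addresses above. This is only a sketch: the helper name offset_report is made up for illustration, and it assumes dbScanUnlock immediately follows dbScanLock in the image, so the difference between the two symbol addresses is an upper bound on the size of dbScanLock.

```python
# Addresses copied from the stack trace and the lkup output above.
DB_SCAN_LOCK = 0x1E73C558    # lkup "dbScanLock"
DB_SCAN_UNLOCK = 0x1E73C8B8  # lkup "dbScanUnlock"
TRACE_ADDR = 0x1E73C70C      # dbScanLock frame in the stack trace


def offset_report(start, next_start, addr):
    """Return (offset of addr into the routine, routine size in bytes).

    Hypothetical helper; assumes the routine at `next_start` directly
    follows the routine at `start`.
    """
    if not start <= addr < next_start:
        raise ValueError("address is outside the routine")
    return addr - start, next_start - start


offset, size = offset_report(DB_SCAN_LOCK, DB_SCAN_UNLOCK, TRACE_ADDR)
print(f"offset = {offset:#x}, size = {size:#x}, fraction = {offset / size:.2f}")
```

The offset comes out as 0x1b4, which agrees with the "+1b4" printed in the dbScanLock frame of the trace, and it is roughly halfway through the 0x360 bytes between the two symbols.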
In the middle of dbScanLock there are various statements of the form:
epicsMutexMustLock(plockSet->lock);
epicsMutexMustLock(lockSetModifyLock);
... and so the problem is presumably in one of these semaphores.
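To illustrate the kind of failure I suspect, here is a generic sketch (not EPICS code) of a lock that is taken and never released. A plain Python threading.Lock stands in for the vxWorks mutex semaphore, and the names try_lock and holder are invented for the example. The point is that the holder is not suspended or crashed, yet every later attempt to take the lock blocks; a timed attempt makes that visible without hanging the caller.

```python
import threading


def try_lock(lock, timeout):
    """Try to take `lock` within `timeout` seconds; release it again on success."""
    got = lock.acquire(timeout=timeout)
    if got:
        lock.release()
    return got


lock = threading.Lock()
holder_ready = threading.Event()
release_holder = threading.Event()


def holder():
    # Stand-in for a task that took the lock and is not giving it back.
    lock.acquire()
    holder_ready.set()
    release_holder.wait()  # present only so the demo can shut down cleanly
    lock.release()


t = threading.Thread(target=holder)
t.start()
holder_ready.wait()

blocked = not try_lock(lock, 0.2)  # lock is held elsewhere, so this times out
release_holder.set()
t.join()
free_again = try_lock(lock, 0.2)   # succeeds once the holder has released

print(blocked, free_again)
```

An untimed acquire in place of the timed one would hang just as dbgf does in the trace above.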
Has anyone seen something similar? Does anyone have suggestions for
diagnostics I should run the next time this happens?
Cheers
Nick Rees
Principal Software Engineer Phone: +44 (0)1235-778430
Diamond Light Source Fax: +44 (0)1235-446713