Hi,
One of my IOCs segfaults with this message when I run ‘exit’:
dbCa::addAction pausing, 10000 channels to clear
Segmentation fault (core dumped)
Examining the core file I see:
(gdb) bt
#0 0x00002b8aa7916318 in addAction (pca=0xc6c8390, link_action=1) at ../dbCa.c:150
#1 0x00002b8aa7916785 in dbCaRemoveLink (plink=0xa312f80) at ../dbCa.c:246
#2 0x00002b8aa70b7d76 in doCloseLinks (pdbRecordType=0x25fc590, precord=0xa312bb0, user=0x0) at ../iocInit.c:600
#3 0x00002b8aa70b7749 in iterateRecords (func=0x2b8aa70b7ccc <doCloseLinks>, user=0x0) at ../iocInit.c:396
#4 0x00002b8aa70b7e51 in exitDatabase (dummy=0x0) at ../iocInit.c:623
#5 0x00002b8aa8222c11 in epicsExitCallAtExitsPvt (pep=0x256d2d0) at ../../../src/libCom/misc/epicsExit.c:80
#6 0x00002b8aa8222ce6 in epicsExitCallAtExits () at ../../../src/libCom/misc/epicsExit.c:97
#7 0x00002b8aa8222f01 in epicsExit (status=0) at ../../../src/libCom/misc/epicsExit.c:160
#8 0x00000000004063dd in main (argc=2, argv=0x7fff390894a8) at ../bl11a-SensTech1Main.cpp:21
(gdb) info frame
Stack level 0, frame at 0x7fff39089240:
rip = 0x2b8aa7916318 in addAction (../dbCa.c:150); saved rip 0x2b8aa7916785
called by frame at 0x7fff39089270
source language c.
Arglist at 0x7fff39089230, args: pca=0xc6c8390, link_action=1
Locals at 0x7fff39089230, Previous frame's sp is 0x7fff39089240
Saved registers:
rbp at 0x7fff39089230, rip at 0x7fff39089238
(gdb) info args
pca = 0xc6c8390
link_action = 1
(gdb) info locals
callAdd = 1
(gdb) list
150 printLinks(pca);
151 }
152 while (removesOutstanding >= removesOutstandingWarning) {
153 epicsMutexUnlock(workListLock);
154 epicsThreadSleep(1.0);
155 epicsMutexMustLock(workListLock);
156 }
157 }
158 pca->link_action |= link_action;
159 if (callAdd)
(gdb) print pca
$10 = (caLink *) 0xc6c8390
(gdb) print pca->plink
$11 = (struct link *) 0x0
In the dbCaRemoveLink function I see where pca->plink is being cleared and where addAction is being called:
void dbCaRemoveLink(struct link *plink)
235 {
236 caLink *pca = (caLink *)plink->value.pv_link.pvt;
237
238 if (!pca) return;
239 epicsMutexMustLock(pca->lock);
240 pca->plink = 0;
241 plink->value.pv_link.pvt = 0;
242 if (pca->putCallback)
243 pca->plinkPutCallback = plink;
244 /* Unlock before addAction or dbCaTask might free first */
245 epicsMutexUnlock(pca->lock);
246 addAction(pca, CA_CLEAR_CHANNEL);
247 }
In addAction, the printLinks function then tries to dereference pca->plink, which dbCaRemoveLink has just set to null.
If I comment out the printLinks call in addAction, it doesn’t segfault (it just takes a few seconds to shut down).
Alternatively, if I increase the removesOutstandingWarning limit, it’s also fine. I don’t think that parameter is configurable via the IOC shell, though.
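To illustrate the kind of guard that would avoid the crash, here is a minimal sketch. It is not the EPICS source: the struct definitions are cut-down stand-ins, the recordName field and the printLinksGuarded name are hypothetical, and I haven’t checked what the real printLinks prints. The only point is that the diagnostic path checks pca->plink before following it, since dbCaRemoveLink clears it before calling addAction.

```c
#include <stdio.h>
#include <stddef.h>

/* Cut-down stand-ins for illustration; the real EPICS structs differ. */
struct link {
    const char *recordName;   /* hypothetical field, not from dbCa.c */
};

typedef struct caLink {
    struct link *plink;       /* dbCaRemoveLink sets this to 0 before addAction */
    const char *pvname;
} caLink;

/*
 * Hypothetical guarded replacement for the diagnostic print.
 * Returns 1 if link details were printed, 0 if there was nothing
 * safe to dereference (pca is NULL or its plink was already cleared).
 */
static int printLinksGuarded(const caLink *pca)
{
    if (pca == NULL || pca->plink == NULL) {
        printf("  (link already removed, no details available)\n");
        return 0;
    }
    printf("  %s -> %s\n", pca->plink->recordName,
           pca->pvname ? pca->pvname : "(unknown PV)");
    return 1;
}
```

Called with a caLink whose plink is 0, as in the core file above, this prints the placeholder line and returns 0 instead of faulting. Whether the warning should be skipped entirely during shutdown is a separate question.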
This IOC does have quite a lot of records and makes heavy use of CA/CP links:
dbnr
Records Aliases Record Type
19640 1 ai
18510 0 ao
1133 0 bi
320 0 bo
1227 0 calc
6687 0 calcout
7338 0 dfanout
4352 0 longin
34824 0 longout
11 0 mbbo
3288 0 stringin
38 0 stringout
9 1 sub
3 0 waveform
10 0 asyn
3546 0 sseq
1088 0 scalcout
[mkp@bl11a-dassrv1 db]$ cat * | grep "record" | wc
113033 400407 5652852
[mkp@bl11a-dassrv1 db]$ cat * | grep " CA" | wc
26236 78712 1315560
[mkp@bl11a-dassrv1 db]$ cat * | grep " CP" | wc
6793 23894 350894
But it’s not doing that much at any one time. I use a series of sseq records (with the CA links) and streamDevice to talk to 10 separate Asyn ports which talk to RS485 chains with about 100 power supplies on each chain (about 1000 power supplies in total).
On the IOC exit I also tend to see several messages like:
sseq:putCallbackCB: Bad link at index 0
which I suspect is OK, given that we’re shutting down in the middle of some put_callback operations.
I could split this IOC into separate processes if necessary.
Our base version is 3.14.12.4.
Cheers,
Matt
Data Acquisition and Control Engineer
Spallation Neutron Source
Oak Ridge National Lab