Thanks Andrew.
Just to clarify: even though there are a lot of records in this one IOC, that wasn’t causing any problems while the IOC was running. My limited testing so far indicates it runs just fine. Only the IOC shutdown, and the resulting core file generation, was the issue.
Cheers,
Matt
> On Apr 24, 2017, at 7:00 PM, Andrew Johnson <[email protected]> wrote:
>
> Hi Matt,
>
> On 04/24/2017 04:36 PM, Pearson, Matthew R. wrote:
>> I’m seeing one of my IOCs seg fault with this message when I do an ‘exit’:
>>
>> dbCa::addAction pausing, 10000 channels to clear
>> Segmentation fault (core dumped)
>
> <snip> - thanks for all the detail.
>
>> In addAction the printLinks function tries to access a null pointer (pca->plink).
>>
>> If I comment out the printLinks function in addAction, it doesn’t seg
>> fault (just takes a few seconds to shutdown).
>
> It seems to me that the bug is the use of printLinks() there, since that
> calls errlogPrintf() which queues the link's PV name pointer onto the
> errlog queue, but if the errlog thread doesn't get scheduled soon, when
> it does run it may attempt to print from a pointer which no longer
> exists, hence your core dump.
>
> I don't see the need for the printLinks() output at that point, so I
> think just removing it from addAction() is probably the best fix, which
> you confirmed above works for you.
>
>> Alternatively, if I increase the removesOutstandingWarning limit,
>> it’s also fine. I don’t think that parameter is configurable via the
>> IOC shell though.
>
> No, that's currently a constant; we could make it configurable, but
> since the "pausing" message is an explanation to the user why the IOC's
> shutdown is taking a while I think it's worth keeping as is. Changing
> the number would just make it happen less often and wouldn't fix the bug.
>
>> This IOC does have quite a lot of records and makes heavy use of CA/CP links:
>
> .. which is part of the reason why this IOC sees the problem but others
> don't.
>
>> On the IOC exit I also tend to see several messages like:
>>
>> sseq:putCallbackCB: Bad link at index 0
>>
>> which I suspect is ok given that we’re shutting down in the middle
>> of some put_callback operations.
>
> Agreed.
>
>> I could split this IOC into separate processes if necessary.
>> Our base version is 3.14.12.4.
>
> I don't think that should be necessary. I'll commit the above change to
> the 3.14 branch of Base and add a patch to the Known Problems page for
> Base-3.14.12 and 3.15.5.
>
> - Andrew
>
> --
> Arguing for surveillance because you have nothing to hide is no
> different than making the claim, "I don't care about freedom of
> speech because I have nothing to say." -- Edward Snowden
>
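
The failure mode Andrew describes above (a deferred logger holding on to a pointer whose target goes away before the log thread runs) can be illustrated with a small standalone sketch. This is not EPICS code: `LogEntry`, `log_enqueue`, `log_flush`, `demo` and the PV name are all hypothetical, and the buffer is overwritten rather than freed so that the example stays well defined. It only shows why a message that borrows a pointer can go stale, while one that copies the string at enqueue time cannot.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MSG_MAX 64

/* Hypothetical deferred-log entry; NOT the EPICS errlog implementation. */
typedef struct {
    const char *by_pointer;   /* borrowed: only valid while the owner lives */
    char by_copy[MSG_MAX];    /* owned: snapshot taken at enqueue time */
} LogEntry;

/* Enqueue: remember the caller's pointer and also take a private copy. */
static void log_enqueue(LogEntry *e, const char *name)
{
    assert(e != NULL && name != NULL);
    e->by_pointer = name;
    snprintf(e->by_copy, sizeof e->by_copy, "%s", name);
}

/* Flush later, as a log thread would: format both views into out. */
static void log_flush(const LogEntry *e, char *out, size_t outlen)
{
    snprintf(out, outlen, "pointer='%s' copy='%s'",
             e->by_pointer, e->by_copy);
}

/* Simulate the owner tearing down the link before the log thread runs.
 * The name buffer is overwritten (not freed) to keep the demo well
 * defined; with a real free() the borrowed pointer would dangle. */
static void demo(char *out, size_t outlen)
{
    char pvname[MSG_MAX] = "SOME:PV.VAL";   /* hypothetical PV name */
    LogEntry e;

    log_enqueue(&e, pvname);
    strcpy(pvname, "<gone>");               /* owner state changes */
    log_flush(&e, out, outlen);             /* the delayed print */
}
```

Calling `demo()` with an output buffer shows the borrowed pointer reporting the overwritten text while the copy still holds the original name. In the real case the memory may have been released or the pointer set to null, which is consistent with the crash reported at the top of the thread.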