EPICS Controls Argonne National Laboratory

Experimental Physics and
Industrial Control System


Subject: RE: 'assert (pca->pgetNative)' failed in ../dbCa.c
From: "[email protected]" <[email protected]>
To: 'Andrew Johnson' <[email protected]>, EPICS tech-talk <[email protected]>
Date: Mon, 9 Jul 2018 13:53:51 +0000
Thank you very much for this patch Andrew.  We have been testing in the lab here at DLS and it is looking good; our plan is to roll it out more generally quite soon.

In our case it turns out that the trigger was (accidentally) changing a particular status PV from mbbi to stringin.


> -----Original Message-----
> From: Andrew Johnson [mailto:[email protected]]
> Sent: 29 June 2018 17:34
> To: Abbott, Michael (DLSLtd,RAL,TEC); EPICS tech-talk
> Subject: Re: 'assert (pca->pgetNative)' failed in ../dbCa.c
> 
> Hi Michael,
> 
> On 06/29/2018 01:43 AM, [email protected] wrote:
> > From: [email protected]
> >> So by this point you're probably hoping that the attached patch
> >> fixes the issue; well congratulations for reading this far, I tried
> >> out my suspicion above and the attached patch does seem to work for
> >> me on the 3.14 branch version, which should be close enough to
> >> yours to be able to apply one way or the other. Please test and let
> >> me know so I can apply it to the Base-3.14 branch and merge up.
> >
> > I've just noticed that the patch doesn't address the data type
> > mismatch directly, only through the separate connection callback.  Is
> > this going to be enough to avoid hitting those asserts even in the
> > presence of an IOC server breaking the rules?
> 
> Without the patch I can trigger this assertion using the attached pair
> of databases. Start by booting both in separate IOCs, then switch the
> anj:val alias to point to a different record type and reboot that IOC.
> This causes the other IOC to die with:
> 
> > epics> DB CA Link Exception: "Virtual circuit disconnect", context "tux.aps.anl.gov:44710"
> >
> >
> >
> > A call to 'assert(pca->pgetNative)'
> >     by thread 'CAC-TCP-recv' failed in ../dbCa.c line 686.
> > EPICS Release EPICS R3.14.12.7-DEV.
> > Local time is 2018-06-27 14:45:03.260025965 CDT
> > Please E-mail this message to the author or to [email protected]
> > Calling epicsThreadSuspendSelf()
> 
> After applying that patch I can see all the INP links reconnecting and
> getting new values from the alias without any problems. It takes a
> couple of scan periods for everything to reconnect and sync up in this
> example, but there were no more crashes in my testing.
> 
> > After all, an assert is a confident statement that the invariant in
> > question is never going to be broken, because all of the elements of
> > the invariant are under *our* control; but in this case aren't we
> > still in the situation where the ca_field_type() is not as expected?
> > Or are you saying (implicitly) that here connectionCallback() is
> > *guaranteed* to be called before any change of ca_field_type()?
> 
> Au contraire, ca_field_type() should have been updated before the call
> to connectionCallback(). However the assertion that fails is in
> eventCallback() which isn't called until after connectionCallback(). By
> clearing the original monitor subscription inside connectionCallback()
> we are stopping the calls to eventCallback() with the old native data
> type.
> 
> > I have a feeling that in our case when we saw the failure, it wasn't
> > so much that the restarting server changed its record type, but that
> > there was something rather more bumpy about its restart (my colleague
> > working on it was trying to migrate between different EPICS and Linux
> > versions with some unexpected failures).  However the only concrete
> > evidence we have is the error message and a large coincidence.
> 
> If you can find another way to trigger this assertion failure with the
> patch applied please try to replicate it so someone else can trigger it.
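
The sequence Andrew describes above can be illustrated with a small toy model. This is only a sketch in Python, not a transcription of dbCa.c: the Link class, the way buffers are keyed by type, and the function names connection_callback, event_callback, server_publish and reconnect_survives are all hypothetical stand-ins chosen to keep the model small. It assumes that a subscription keeps delivering events in the type it was created with, and that the event handler asserts that a buffer for that type exists.

```python
class Link:
    """Hypothetical model of one CA input link (not the real dbCa.c struct)."""

    def __init__(self):
        self.field_type = None  # stands in for ca_field_type() of the channel
        self.buffers = {}       # type -> last value; stands in for pgetNative
        self.monitors = []      # types of the currently active subscriptions


def connection_callback(link, new_type, clear_stale):
    """Model a (re)connect where the server reports new_type."""
    type_changed = link.field_type is not None and link.field_type != new_type
    link.field_type = new_type  # the type is already updated at this point
    if type_changed:
        link.buffers = {}       # assumption: buffers for the old type go away
        if clear_stale:
            link.monitors = []  # idea of the patch: drop the old subscription
    if new_type not in link.monitors:
        link.monitors.append(new_type)
    link.buffers.setdefault(new_type, None)


def event_callback(link, event_type, value):
    """Model the handler whose assertion fails in the report above."""
    if event_type not in link.buffers:
        raise AssertionError("no buffer for event type " + event_type)
    link.buffers[event_type] = value


def server_publish(link, value):
    """Deliver one update per active subscription, in its subscribed type."""
    for event_type in list(link.monitors):
        event_callback(link, event_type, value)


def reconnect_survives(clear_stale):
    """Connect as ENUM (e.g. mbbi), then reconnect as STRING (e.g. stringin)."""
    link = Link()
    connection_callback(link, "ENUM", clear_stale)
    server_publish(link, 1)
    connection_callback(link, "STRING", clear_stale)
    try:
        server_publish(link, "OK")
    except AssertionError:
        return False
    return True
```

In this model reconnect_survives(False) hits the assertion because the stale ENUM subscription still fires after the reconnect, while reconnect_survives(True) does not, because the only remaining subscription matches the new type.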


