EPICS Re: Build failed: EPICS Base 7 base-7.0-419 (and others)

Experimental Physics and Industrial Control System


Subject:	Re: Build failed: EPICS Base 7 base-7.0-419 (and others)
From:	Torsten Bögershausen via Core-talk <core-talk at aps.anl.gov>
To:	"Johnson, Andrew N." <anj at anl.gov>
Cc:	EPICS core-talk <core-talk at aps.anl.gov>
Date:	Fri, 29 Oct 2021 08:24:53 +0200

Thanks Andrew for the fast response.

I am now aware of the GitHub discussions; see you there.




On 10/28/21 8:16 PM, Johnson, Andrew N. wrote:

Hi Torsten,
On Oct 28, 2021, at 3:56 AM, Torsten Bögershausen via Core-talk <core-talk at aps.anl.gov> wrote:
Digging into the last "Build failed" messages
(thanks to Karl for reminding me)
It seems as if there is a problem with this test case:

not ok 54 - dbGetField("li2", 5) -> 3 == 0
You saw several “Build failed” messages yesterday because I was clicking Appveyor’s “Re-run Incomplete” button on the base-7.0-419 build to see if I could get all the failing test configurations to pass. There were 3 that failed 4 times yesterday, at which point I gave up.
We have a Discussion on GitHub <https://github.com/epics-base/epics-base/discussions/162> where we’ve been tracking the tests in Base that fail occasionally; please see the comments about this particular failure <https://github.com/epics-base/epics-base/discussions/162#discussioncomment-1460745> for some more background on it.
And sometimes problems with packages.chocolatey.org, see further down.
Unfortunately those aren’t something that we know how to fix; apparently when they happen our Appveyor VM can't access the servers that we install necessary packages from. If there’s a way to tell Appveyor “please re-run this build configuration later or on a different VM”, that might be something that could be included in our CI scripts which do the setup. Ideas welcome, but I’m not hopeful we could fix this ourselves.
Are there any good ideas for what could be done about the
dbGetField("li2", 5) -> 3 == 0
failure?

Disable it under Windows?
Retry if it failed?
Add a sleep?
Sleep and retry if it failed?
Any other good ideas?
I think both Michael and I prefer that we find a way to fix the test code so it waits for the event properly. Adding sleeps just causes test runs to take longer on systems where the delay isn’t generally required, and you never really know how long you need to wait. There’s something about the Appveyor VM which causes it to schedule threads in unusual ways, and that’s probably a good thing for our tests in the long run: it makes us get them right.
I just took another look at the failure and I think I have a fix:
index 387ee7299..7bb8df7f4 100644
--- a/modules/database/test/std/rec/regressLinkSevr.db
+++ b/modules/database/test/std/rec/regressLinkSevr.db
@@ -6,11 +6,10 @@ record(stringin, "si1") {
 }
 record(longin, "li1") {
   field(INP, "ai.SEVR")
-  field(FLNK, "si2")
 }


 record(stringin, "si2") {
-  field(INP, "ai.SEVR CA")
+  field(INP, "ai.SEVR CP")
   field(FLNK, "li2")
 }
 record(longin, "li2") {
The test is already waiting for the cnt record to process before it checks the ENUM values. The above change ensures that the two records which read the SEVR field over CA won’t actually process until the update has arrived, so the return from testMonitorWait() will be delayed appropriately. This also means the call to dbCaSync() is no longer required:
index 95217043d..7580a3402 100644
--- a/modules/database/test/std/rec/regressTest.c
+++ b/modules/database/test/std/rec/regressTest.c
@@ -197,7 +197,6 @@ void testLinkSevr(void)


     testdbPutFieldOk("si1.PROC", DBF_LONG, 1);
     testMonitorWait(mon);
-    dbCaSync(); /* wait for update */


     testdbGetFieldEqual("si1", DBF_STRING, "INVALID");
     testdbGetFieldEqual("li1", DBF_LONG, INVALID_ALARM);
I will commit and push these changes now and we’ll see if that solves it.

- Andrew

--
Complexity comes for free, simplicity you have to work for.
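
The preference stated above, waiting for the event rather than adding a sleep, can be illustrated with a generic sketch. This is not EPICS code and does not use testMonitorWait() or any other EPICS API; it assumes plain POSIX threads, and the names update_event, deliver_update, wait_for_update and run_event_wait_demo are hypothetical. The waiter blocks on a condition variable until the producer reports that the update has arrived, so the result does not depend on how the threads happen to be scheduled.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical stand-in for a test monitor: a flag guarded by a mutex,
 * plus a condition variable that is signalled when the update arrives. */
typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  arrived;
    int             ready;   /* 0 until the update has been delivered */
    int             value;   /* payload written by the producer */
} update_event;

/* Producer thread: delivers the update, then wakes any waiter. */
static void *deliver_update(void *arg)
{
    update_event *ev = arg;

    pthread_mutex_lock(&ev->lock);
    ev->value = 3;           /* arbitrary example payload */
    ev->ready = 1;
    pthread_cond_signal(&ev->arrived);
    pthread_mutex_unlock(&ev->lock);
    return NULL;
}

/* Block until the update has arrived; no fixed sleep is involved. */
static int wait_for_update(update_event *ev)
{
    int value;

    pthread_mutex_lock(&ev->lock);
    while (!ev->ready)       /* loop guards against spurious wakeups */
        pthread_cond_wait(&ev->arrived, &ev->lock);
    value = ev->value;
    pthread_mutex_unlock(&ev->lock);
    return value;
}

/* Start the producer, wait for its update, and return the payload.
 * Returns -1 if the producer thread could not be created. */
int run_event_wait_demo(void)
{
    update_event ev;
    pthread_t producer;
    int value;

    pthread_mutex_init(&ev.lock, NULL);
    pthread_cond_init(&ev.arrived, NULL);
    ev.ready = 0;
    ev.value = 0;

    if (pthread_create(&producer, NULL, deliver_update, &ev) != 0)
        return -1;

    value = wait_for_update(&ev);
    pthread_join(producer, NULL);
    assert(ev.ready == 1);   /* the producer has finished by now */

    pthread_cond_destroy(&ev.arrived);
    pthread_mutex_destroy(&ev.lock);
    return value;
}
```

Calling run_event_wait_demo() from a small main() and building with the compiler's -pthread option should be enough to try it. The check of the ready flag under the mutex is what makes the wait correct whether the producer runs before or after the waiter starts, which is the property a fixed sleep cannot guarantee.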
