EPICS Re: Build failed: EPICS Base 7 base-7.0-419 (and others)


On Oct 28, 2021, at 3:56 AM, Torsten Bögershausen via Core-talk <core-talk at aps.anl.gov> wrote:

Digging into the last "Build failed" messages
(thanks to Karl for reminding me),
it seems as if there is a problem with this test case:

not ok 54 - dbGetField("li2", 5) -> 3 == 0

You saw several “Build failed” messages yesterday because I was clicking Appveyor’s “Re-run Incomplete” button on the base-7.0-419 build to see if I could get all the failing test configurations to pass. There were 3 that failed 4 times yesterday, at which point I gave up.

We have a Discussion on GitHub where we’ve been tracking the tests in Base that fail occasionally; please see the comments there about this particular failure for some more background on it.

And sometimes there are problems with packages.chocolatey.org; see further down.

Unfortunately those aren’t something that we know how to fix. Apparently, when they happen our Appveyor VM can’t access the servers that we install necessary packages from. If there’s a way to tell Appveyor “please re-run this build configuration later or on a different VM”, that might be something that could be included in our CI scripts which do the setup. Ideas welcome, but I’m not hopeful we could fix this ourselves.

Are there any good ideas about what could be done about the
dbGetField("li2", 5) -> 3 == 0
failure?

Disable it under Windows?
Retry if it failed?
Add a sleep?
Sleep and retry if it failed?
Any other good ideas?

I think both Michael and I prefer that we find a way to fix the test code so it waits for the event properly. Adding sleeps just causes test runs to take longer on systems where the delay isn’t generally required, and you never really know how long you need to wait. There’s something about the Appveyor VM which causes it to schedule threads in unusual ways, and that’s probably a good thing for our tests in the long run: it makes us get them right.
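To illustrate the difference between sleeping and waiting for the event, here is a minimal sketch using plain POSIX threads. It is not EPICS code and does not use the EPICS test API; the names update_t, deliver_update(), wait_for_update() and run_update_demo(), and the value 42, are all hypothetical and exist only for this example. The consumer blocks on a condition variable until the producer flags that the update has arrived, so it waits exactly as long as needed no matter how the threads get scheduled.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for "the update has arrived"; not an EPICS type. */
typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  cond;
    bool            arrived;   /* set once the value below is valid */
    int             value;
} update_t;

/* Producer thread: publishes the value and signals the waiter. */
static void *deliver_update(void *arg)
{
    update_t *u = arg;

    pthread_mutex_lock(&u->lock);
    u->value = 42;             /* arbitrary illustrative value */
    u->arrived = true;
    pthread_cond_signal(&u->cond);
    pthread_mutex_unlock(&u->lock);
    return NULL;
}

/* Consumer: blocks until the update is flagged, with no fixed sleep. */
static int wait_for_update(update_t *u)
{
    int v;

    pthread_mutex_lock(&u->lock);
    while (!u->arrived)        /* loop guards against spurious wakeups */
        pthread_cond_wait(&u->cond, &u->lock);
    v = u->value;
    pthread_mutex_unlock(&u->lock);
    return v;
}

/* Returns the delivered value, or -1 if the thread could not be started. */
int run_update_demo(void)
{
    update_t u;
    pthread_t tid;
    int v;

    pthread_mutex_init(&u.lock, NULL);
    pthread_cond_init(&u.cond, NULL);
    u.arrived = false;
    u.value = 0;

    if (pthread_create(&tid, NULL, deliver_update, &u) != 0) {
        pthread_cond_destroy(&u.cond);
        pthread_mutex_destroy(&u.lock);
        return -1;
    }

    v = wait_for_update(&u);
    pthread_join(tid, NULL);

    pthread_cond_destroy(&u.cond);
    pthread_mutex_destroy(&u.lock);
    return v;
}
```

Calling run_update_demo() from a small main() and building with -pthread should be enough to try it. A sleep-based version would replace the loop in wait_for_update() with a fixed delay, and it would read a stale value whenever the producer happened to be scheduled later than the delay allowed.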

I just took another look at the failure and I think I have a fix:

index 387ee7299..7bb8df7f4 100644
--- a/modules/database/test/std/rec/regressLinkSevr.db
+++ b/modules/database/test/std/rec/regressLinkSevr.db
@@ -6,11 +6,10 @@ record(stringin, "si1") {
 }
 record(longin, "li1") {
     field(INP, "ai.SEVR")
-    field(FLNK, "si2")
 }
 record(stringin, "si2") {
-    field(INP, "ai.SEVR CA")
+    field(INP, "ai.SEVR CP")
     field(FLNK, "li2")
 }
 record(longin, "li2") {

The test is already waiting for the cnt record to process before it checks the ENUM values. The above change ensures that the two records which read the SEVR field over CA won’t actually process until the update has arrived, so the return from testMonitorWait() will be delayed appropriately. This also means the call to dbCaSync() is no longer required:

index 95217043d..7580a3402 100644
--- a/modules/database/test/std/rec/regressTest.c
+++ b/modules/database/test/std/rec/regressTest.c
@@ -197,7 +197,6 @@ void testLinkSevr(void)
     testdbPutFieldOk("si1.PROC", DBF_LONG, 1);
     testMonitorWait(mon);
-    dbCaSync(); /* wait for update */
     testdbGetFieldEqual("si1", DBF_STRING, "INVALID");
     testdbGetFieldEqual("li1", DBF_LONG, INVALID_ALARM);

I will commit and push these changes now and we’ll see if that solves it.

- Andrew

Complexity comes for free, simplicity you have to work for.

Subject:	Re: Build failed: EPICS Base 7 base-7.0-419 (and others)
From:	"Johnson, Andrew N. via Core-talk" <core-talk at aps.anl.gov>
To:	Torsten Bögershausen <torsten.bogershausen at ess.eu>
Cc:	EPICS core-talk <core-talk at aps.anl.gov>
Date:	Thu, 28 Oct 2021 18:16:14 +0000
