On 2/9/22 10:53, Andrew Johnson via Core-talk wrote:
My latest Appveyor job for epics-base showed a couple of failures, in the pvAccess testChannelAccess tests (which are unrelated to the CA provider or my commits that triggered this build),
https://github.com/epics-base/pvAccessCPP/issues/98
Not a new issue. testChannelAccess, which in fact tests PVA only, has a number of
synchronization issues. The test code itself is quite complex, to the point
that I've never been motivated to dig in to the depth required to straighten
it out. Frankly, it would mean a rewrite.
https://github.com/epics-base/pvAccessCPP/blob/master/testApp/remote/channelAccessIFTest.cpp
I have mixed feelings about keeping this test. It would probably have value
in validating future changes to pvAccessCPP. Until, or unless, that happens
it's just noise.
e.g. maybe someone can come up with a recipe to run this test only during PR
builds in the pvAccessCPP repository?
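One possible shape for such a recipe, as a rough sketch only: a small helper that a CI script could call before deciding whether to run the test. It assumes the build runs on AppVeyor, which sets APPVEYOR_PULL_REQUEST_NUMBER on pull-request builds and APPVEYOR_REPO_NAME to the "owner/repo" name. The function name should_run_flaky_test and how it would be wired into the EPICS build are hypothetical.

```python
import os


def should_run_flaky_test(env=None, repo="epics-base/pvAccessCPP"):
    """Return True only for pull-request builds of the given repository.

    Hypothetical helper: nothing in the EPICS build system calls this today.
    """
    env = os.environ if env is None else env
    # AppVeyor only sets the PR number for builds triggered by a pull request.
    is_pr = bool(env.get("APPVEYOR_PULL_REQUEST_NUMBER"))
    # Restrict the test to the repository that owns it.
    return is_pr and env.get("APPVEYOR_REPO_NAME") == repo
```

Builds of epics-base itself (where pvAccessCPP is only a submodule) would then skip the test, while PRs against pvAccessCPP would still exercise it.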
and an unexplained core-dump analysis of the testCaProvider tests after a silent access violation. I'm just writing this to record what I've found out about it; I'm not expecting anyone else to delve further, given that it's a very rare thing.
I see this core dump fairly frequently, although not on every run.
The MSVC debug builds show "pNode = 0xdd", which suggests a use-after-free.
Maybe a double call of 'dbChannelDelete()'?
https://stackoverflow.com/questions/370195/when-and-why-will-a-compiler-initialise-memory-to-0xcd-0xdd-etc-on-malloc-fre
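To make the 0xdd reasoning concrete, here is a minimal sketch that classifies a value read from a debugger against the fill bytes used by the MSVC debug heap (0xCD for fresh allocations, 0xDD for freed memory, 0xFD for guard bytes). The helper name classify_fill is made up for illustration, and it assumes "pNode = 0xdd" abbreviates a pointer whose bytes are all 0xDD.

```python
# Fill bytes written by the MSVC debug CRT heap.
MSVC_DEBUG_FILLS = {
    0xCD: "allocated but never initialised heap memory",
    0xDD: "freed heap memory",
    0xFD: "guard bytes around a heap allocation",
}


def classify_fill(value, width=8):
    """Describe `value` if all `width` bytes are one known fill byte, else None."""
    data = value.to_bytes(width, "little")
    first = data[0]
    if first in MSVC_DEBUG_FILLS and all(b == first for b in data):
        return MSVC_DEBUG_FILLS[first]
    return None


print(classify_fill(0xDDDDDDDDDDDDDDDD))
```

A pointer member that reads back as all 0xDD was most likely loaded from an object that had already been freed, which is why a double delete is a plausible suspect.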
On 2/9/22 2:06 AM, AppVeyor via Core-talk wrote:
Build epics-base base-7.0-48 failed <https://ci.appveyor.com/project/anjohnson/epics-base/builds/42500290>
Commit 2fbaa7f926 <https://github.com/anjohnson/epics-base/commit/2fbaa7f926> by Andrew Johnson <mailto:anj at anl.gov> on 2/8/2022 9:29 PM:
Improve POD documentation of the TSE and TSEL fields
C:/projects/epics-base/modules/pvAccess/testApp/O.windows-x64-debug
testAtomicBoolean.tap ..... ok
testHexDump.tap ........... ok
testInetAddressUtils.tap .. ok
configurationTest.tap ..... ok
testFairQueue.tap ......... ok
testWildcard.tap .......... ok
testChannelAccess.tap .....
not ok 25 - void __cdecl ChannelAccessIFTest::test_channel(void): a destroy event was caught for the testing channel that was destroyed twice
Failed 5/152 subtests
(less 12 skipped subtests: 135 okay)
testCodec.tap ............. ok
testRPC.tap ............... ok
testServerContext.tap ..... ok
testmonitorfifo.tap ....... ok
testsharedstate.tap ....... ok

Test Summary Report
-------------------
testChannelAccess.tap (Wstat: 0 Tests: 148 Failed: 1)
Failed test: 25
Parse errors: Tests out of sequence. Found (23) but expected (21)
Tests out of sequence. Found (24) but expected (22)
Tests out of sequence. Found (25) but expected (23)
Tests out of sequence. Found (26) but expected (24)
Tests out of sequence. Found (27) but expected (25)
Displayed the first 5 of 129 TAP syntax errors.
Re-run prove with the -p option to see them all.
Files=12, Tests=6191, 0 wallclock secs ( 0.33 usr + 0.00 sys = 0.33 CPU)
Result: FAIL
-------------------
The failing part of the testChannelAccess.tap file looks like this:
ok 20 - void __cdecl ChannelAccessIFTest::test_channel(void): channel connection state connected
# void __cdecl ChannelAccessIFTest::test_channel(void): destroying the channel
#SyncChannelRequesterImpl.channelStateChange:2
#ok 21 - SyncChannelRequesterImplvoid __cdecl ChannelAccessIFTest::test_channel(void): channel created count should be the same on the destroyed channel.
channelStateChange:3not ok 22 -
void __cdecl ChannelAccessIFTest::test_channel(void): channel state change count should increase on the destroyed channel
ok 23 - void __cdecl ChannelAccessIFTest::test_channel(void): channel should not be connected
ok 24 - void __cdecl ChannelAccessIFTest::test_channel(void): channel connection state DESTROYED
# void __cdecl ChannelAccessIFTest::test_channel(void): destroying the channel yet again
not ok 25 - void __cdecl ChannelAccessIFTest::test_channel(void): a destroy event was caught for the testing channel that was destroyed twice
# BEGIN TEST void __cdecl ChannelAccessIFTest::test_channelGetWithInvalidChannelAndRequester(void):
#SyncChannelRequesterImpl.channelCreated(Status [type=OK])
#SyncChannelRequesterImpl.channelStateChange:1
ok 26 # SKIP creating a channel get with a null channel
Unfortunately the test code is emitting other text to stdout, which is messing up the TAP output. The results for tests #21 and #22 above aren't being seen or counted properly by the test harness, resulting in it reporting tests out of sequence. There are still 2 failing tests above though, #22 and #25, but only on this VS-2019 dynamic-debug build.
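The effect on the harness can be reproduced with a simplified sequence check. This is only a sketch, not how prove is implemented: it treats a line as a test result only when it starts with "ok N" or "not ok N", so results that are prefixed by "#" or glued onto other text are never counted. The function name sequence_errors is hypothetical, and the excerpt lines are abbreviated from the log above.

```python
import re

# A TAP result line must begin with "ok N" or "not ok N".
RESULT = re.compile(r"^(?:not )?ok (\d+)\b")


def sequence_errors(lines, first=1):
    """Return out-of-sequence messages for numbered TAP result lines."""
    expected = first
    errors = []
    for line in lines:
        m = RESULT.match(line)
        if m is None:
            continue  # comments and garbled lines are not counted as results
        found = int(m.group(1))
        if found != expected:
            errors.append(f"Found ({found}) but expected ({expected})")
        expected += 1
    return errors


excerpt = [
    "ok 20 - channel connection state connected",
    "# destroying the channel",
    "#SyncChannelRequesterImpl.channelStateChange:2",
    "#ok 21 - SyncChannelRequesterImpl... channel created count",
    "channelStateChange:3not ok 22 -",
    "ok 23 - channel should not be connected",
    "ok 24 - channel connection state DESTROYED",
]
for message in sequence_errors(excerpt, first=20):
    print(message)
```

Run on this excerpt, the check reports 23 where 21 was expected and 24 where 22 was expected, matching the first two parse errors in the summary report.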
Then at the end of the build log there is a core-dump and exception analysis <https://ci.appveyor.com/project/anjohnson/epics-base/builds/42500290/job/bcpdlxynwmfx616b#L15729> of the testCaProvider.exe test program, which didn't show up as failing any tests or dying when it was run but does seem to have silently dumped a core file. This shows a destruction problem during atexit cleanups. Whether it's related to an issue in the caProvider itself or the combination of running a local CA client and an IOC in the same process isn't easy to tell though. I probably won't look at this any further unless it starts happening elsewhere.
- Andrew
--
Complexity comes for free, Simplicity you have to work for.