EPICS Controls Argonne National Laboratory

Experimental Physics and
Industrial Control System


Subject: Re: Bus errors accessing VME with base 7.0.6.1 and latest synApps modules
From: Michael Davidsaver via Tech-talk <tech-talk at aps.anl.gov>
To: Torsten Bögershausen <Torsten.Bogershausen at ess.eu>, Mark Rivers <rivers at cars.uchicago.edu>, Benjamin Franksen <benjamin.franksen at helmholtz-berlin.de>
Cc: "tech-talk at aps.anl.gov" <tech-talk at aps.anl.gov>
Date: Tue, 31 May 2022 23:39:31 -0700
On 5/31/22 07:00, Torsten Bögershausen wrote:
> Hej Mark,
>
> So R7.0.5 is good, and f9ea6a5bff695c5f88bb95dce38a3fd349738907 is bad?
>
> There are some "real" commits, and merges:
> git log R7.0.5..f9ea6a5bff695c5f88bb95dce38a3fd349738907
>
> Then it could make sense to bisect between those two?
>
> Another question:
> Could it make sense to run the software (maybe stripped down even more)
> under Linux with valgrind instead?

An RTOS without a debugger is just about the worst situation I can imagine
to troubleshoot apparent memory corruption.  (excepting maybe having no
console, or it being in orbit)

I've made an attempt to run the database Mark describes under
softIoc+valgrind, driven by a spam-y Python script.  Valgrind doesn't
flag any access violations.

The threading checker ("valgrind --tool=helgrind") does flag a possible
race which might be relevant.  Or it might be a false positive.  I don't
feel that I understand dbEvent.c well enough to say anything conclusive
this late at night.


Possible data race during write of size 8 at 0x5197520 by thread #27
Locks held: 3, at addresses 0x51413A0 0x5143A60 0x5197A10
   at 0x48CA60A: db_queue_event_log (dbEvent.c:824)
   by 0x48CA732: db_post_events (dbEvent.c:892)
   by 0x48595EE: monitor (aoRecord.c:532)
   by 0x48595EE: process (aoRecord.c:232)
   by 0x48B2C6B: dbProcess (dbAccess.c:608)
   by 0x48B4BFF: dbPutField (dbAccess.c:1278)
   by 0x48CE9DD: dbChannel_put (db_access.c:923)
   by 0x48F6C9A: write_action (camessage.c:799)
   by 0x48F7727: camessage (camessage.c:2546)
   by 0x48F351F: camsgtask (camsgtask.c:116)
   by 0x495E094: start_routine (osdThread.c:439)
   by 0x483F876: mythread_wrapper (hg_intercepts.c:387)
   by 0x4F1EEA6: start_thread (pthread_create.c:477)

This conflicts with a previous read of size 8 by thread #21
Locks held: none
   at 0x48CA99D: event_read (dbEvent.c:999)
   by 0x48CA99D: event_task (dbEvent.c:1078)
   by 0x495E094: start_routine (osdThread.c:439)
   by 0x483F876: mythread_wrapper (hg_intercepts.c:387)
   by 0x4F1EEA6: start_thread (pthread_create.c:477)
   by 0x4CBADEE: clone (clone.S:95)
 Address 0x5197520 is 18,416 bytes inside a block of size 19,528 alloc'd
   at 0x48397CF: malloc (vg_replace_malloc.c:307)
   by 0x494DBE1: freeListMalloc (freeListLib.c:95)
   by 0x494DCE8: freeListCalloc (freeListLib.c:68)
   by 0x48C95B2: db_init_events (dbEvent.c:304)
   by 0x48D653C: dbContext::subscribe(epicsGuard<epicsMutex>&, dbChannel*, dbChannelIO&, unsigned int, unsigned long, unsigned int, cacStateNotify&, unsigned int*) (dbContext.cpp:221)
   by 0x48D7468: dbChannelIO::subscribe(epicsGuard<epicsMutex>&, unsigned int, unsigned long, unsigned int, cacStateNotify&, unsigned int*) (dbChannelIO.cpp:119)
   by 0x4EF7771: oldSubscription::oldSubscription(epicsGuard<epicsMutex>&, oldChannelNotify&, cacChannel&, unsigned int, unsigned long, unsigned int, void (*)(event_handler_args), void*, oldSubscription**) (oldSubscription.cpp:43)
   by 0x4EF7099: ca_create_subscription (oldChannelNotify.cpp:573)
   by 0x48D2BE8: dbCaTask (dbCa.c:1249)
   by 0x495E094: start_routine (osdThread.c:439)
   by 0x483F876: mythread_wrapper (hg_intercepts.c:387)
   by 0x4F1EEA6: start_thread (pthread_create.c:477)
 Block was alloc'd by thread #9

https://github.com/epics-base/epics-base/blob/3fadf4a26cfe33dcb9eb9e4620634e1d3a7b9763/modules/database/src/ioc/db/dbEvent.c#L824

https://github.com/epics-base/epics-base/blob/3fadf4a26cfe33dcb9eb9e4620634e1d3a7b9763/modules/database/src/ioc/db/dbEvent.c#L999
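
As a reading aid for the report above, the short Python sketch below compares the "Locks held" sets of the two conflicting accesses.  The addresses are copied from the report; the helper name common_locks is made up for illustration.  This only shows that the two accesses share no lock.  It is not a description of how helgrind reaches its verdict (it also tracks happens-before ordering), and it says nothing about whether the race is real.

```python
# Reading aid only: intersect the lock sets printed in the helgrind report.
# Assumption: common_locks() is a hypothetical helper, not part of any tool.

def common_locks(locks_a, locks_b):
    """Return the set of locks held during both accesses."""
    return set(locks_a) & set(locks_b)

# Values copied from the report above.
writer_locks = {0x51413A0, 0x5143A60, 0x5197A10}  # thread #27, dbEvent.c:824
reader_locks = set()                               # thread #21, dbEvent.c:999

shared = common_locks(writer_locks, reader_locks)
unprotected = len(shared) == 0

print("locks common to both accesses:", sorted(hex(a) for a in shared))
print("no common lock:", unprotected)
```

An empty intersection matches what the report shows: the write happens with three locks held, the read with none.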


$ cat mr-regress.db
record(ao,"testAo_0") {}
record(ao,"testAo_1") {}
record(ao,"testAo_2") {}
record(ao,"testAo_3") {}
record(ao,"testAo_4") {}
record(ao,"testAo_5") {}
record(ao,"testAo_6") {}
record(ao,"testAo_7") {}
record(ao,"testAo_8") {}
record(ao,"testAo_9") {}
record(ai,"testAi_0") {field(INP, "testAo_0 CP") }
record(ai,"testAi_1") {field(INP, "testAo_1 CP") }
record(ai,"testAi_2") {field(INP, "testAo_2 CP") }
record(ai,"testAi_3") {field(INP, "testAo_3 CP") }
record(ai,"testAi_4") {field(INP, "testAo_4 CP") }
record(ai,"testAi_5") {field(INP, "testAo_5 CP") }
record(ai,"testAi_6") {field(INP, "testAo_6 CP") }
record(ai,"testAi_7") {field(INP, "testAo_7 CP") }
record(ai,"testAi_8") {field(INP, "testAo_8 CP") }
record(ai,"testAi_9") {field(INP, "testAo_9 CP") }
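
The database above follows a regular pattern (ten ao records, each with an ai record monitoring it through a CP link), so it could be generated rather than typed.  A minimal sketch, assuming a hypothetical helper make_regress_db; the record names and field syntax are copied from the listing, everything else is illustrative.

```python
import sys

def make_regress_db(count=10):
    """Return database text with `count` ao records and matching CP-linked ai records."""
    lines = []
    for i in range(count):
        lines.append('record(ao,"testAo_%d") {}' % i)
    for i in range(count):
        # Each ai record takes its input from the matching ao record via a CP link.
        lines.append('record(ai,"testAi_%d") {field(INP, "testAo_%d CP") }' % (i, i))
    return "\n".join(lines) + "\n"

if __name__ == "__main__":
    # Redirect stdout to a file such as mr-regress.db to use it with softIoc.
    sys.stdout.write(make_regress_db())
```

Raising count would be one way to scale the test up, though whether that makes the helgrind report more or less likely has not been checked here.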


$ valgrind --tool=helgrind ./bin/linux-x86_64/softIoc -d mr-regress.db


$ ./bin/linux-x86_64/caput testAo_0 1.0
$ ./bin/linux-x86_64/caput testAo_0 2.0
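
The script used to generate load is not shown in this message.  The sketch below is a hypothetical stand-in that repeats the two caput commands above in a loop.  Assumptions: it shells out to the same caput binary path, it cycles over all ten testAo_N records rather than only testAo_0, and the names caput_commands and spam are invented for illustration.

```python
import itertools
import subprocess

CAPUT = "./bin/linux-x86_64/caput"  # path taken from the commands above

def caput_commands(n_records=10, values=(1.0, 2.0)):
    """Yield caput argument lists, alternating values across testAo_0..testAo_{n-1}."""
    for value in itertools.cycle(values):
        for i in range(n_records):
            yield [CAPUT, "testAo_%d" % i, str(value)]

def spam(iterations=1000):
    """Run `iterations` caput commands back to back, discarding their output."""
    for cmd in itertools.islice(caput_commands(), iterations):
        subprocess.run(cmd, check=False, stdout=subprocess.DEVNULL)
```

Calling spam() from the top of the EPICS base build tree, while the softIoc is running under helgrind, would exercise the same put path that appears in the report.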
