EPICS Controls Argonne National Laboratory

Experimental Physics and
Industrial Control System


Subject: Re: Bus errors accessing VME with base 7.0.6.1 and latest synApps modules
From: Till Straumann via Tech-talk <tech-talk at aps.anl.gov>
To: Michael Davidsaver <mdavidsaver at gmail.com>, Torsten Bögershausen <Torsten.Bogershausen at ess.eu>, "Mark Rivers" <rivers at cars.uchicago.edu>, Benjamin Franksen <benjamin.franksen at helmholtz-berlin.de>
Cc: "tech-talk at aps.anl.gov" <tech-talk at aps.anl.gov>
Date: Wed, 1 Jun 2022 09:15:06 +0200
Hi all.

Have you ever bothered to look at the code that causes the actual exception?

I would recommend looking at the code that sits at the 'Exception next instruction address'.

It is a pity that vxWorks (I assume the exception message is printed by the OS?)
does not give us a register dump, which would be most helpful...

HTH
- Till

On 6/1/22 08:39, Michael Davidsaver via Tech-talk wrote:
On 5/31/22 07:00, Torsten Bögershausen wrote:
Hej Mark,

So R7.0.5 is good, and f9ea6a5bff695c5f88bb95dce38a3fd349738907 is bad ?

There are some "real" commits, and merges:
git log R7.0.5..f9ea6a5bff695c5f88bb95dce38a3fd349738907

Then it could make sense to bisect between those two?
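
Such a bisect can be automated with `git bisect run`, which treats exit status 0 as good, 125 as "cannot test this commit, skip it", and any other status from 1 to 127 as bad. A minimal sketch of a predicate follows. The build command and the `reproduce.sh` script are hypothetical placeholders, not something posted in this thread; on a VME target the "test" step would have to load and exercise the IOC by whatever means is available.

```python
import subprocess

GOOD, BAD, SKIP = 0, 1, 125  # exit codes understood by `git bisect run`

# Hypothetical commands: adjust to however the IOC is built and exercised.
BUILD_CMD = ["make", "-s", "-j4"]
TEST_CMD = ["./reproduce.sh"]  # assumed to exit non-zero when the fault shows up


def classify(build_rc, test_rc):
    """Map build/test return codes to a bisect verdict."""
    if build_rc != 0:
        return SKIP  # commit cannot be tested, let bisect try a neighbour
    return GOOD if test_rc == 0 else BAD


def main():
    """Build the current checkout, run the reproducer, return the verdict."""
    build = subprocess.run(BUILD_CMD)
    if build.returncode != 0:
        return classify(build.returncode, None)
    test = subprocess.run(TEST_CMD)
    return classify(build.returncode, test.returncode)
```

Saved as, say, bisect_check.py with a final line `raise SystemExit(main())`, it would be driven like this:

    git bisect start f9ea6a5bff695c5f88bb95dce38a3fd349738907 R7.0.5
    git bisect run python3 bisect_check.py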

Another question:
Could it make sense to run the software (maybe stripped down even further)
under Linux with valgrind instead?

An RTOS without a debugger is just about the worst situation I can imagine
to troubleshoot apparent memory corruption.  (excepting maybe having no
console, or it being in orbit)

I've made an attempt to run the database Mark describes with
softIoc+valgrind, and a spam-y python script.  Valgrind doesn't
flag any access violations.

The threading checker ("valgrind --tool=helgrind") does flag a possible
race which might be relevant.  Or it might be a false positive.  I don't
feel that I understand dbEvent.c well enough to say anything conclusive
this late at night.


Possible data race during write of size 8 at 0x5197520 by thread #27
Locks held: 3, at addresses 0x51413A0 0x5143A60 0x5197A10
   at 0x48CA60A: db_queue_event_log (dbEvent.c:824)
   by 0x48CA732: db_post_events (dbEvent.c:892)
   by 0x48595EE: monitor (aoRecord.c:532)
   by 0x48595EE: process (aoRecord.c:232)
   by 0x48B2C6B: dbProcess (dbAccess.c:608)
   by 0x48B4BFF: dbPutField (dbAccess.c:1278)
   by 0x48CE9DD: dbChannel_put (db_access.c:923)
   by 0x48F6C9A: write_action (camessage.c:799)
   by 0x48F7727: camessage (camessage.c:2546)
   by 0x48F351F: camsgtask (camsgtask.c:116)
   by 0x495E094: start_routine (osdThread.c:439)
   by 0x483F876: mythread_wrapper (hg_intercepts.c:387)
   by 0x4F1EEA6: start_thread (pthread_create.c:477)

This conflicts with a previous read of size 8 by thread #21
Locks held: none
   at 0x48CA99D: event_read (dbEvent.c:999)
   by 0x48CA99D: event_task (dbEvent.c:1078)
   by 0x495E094: start_routine (osdThread.c:439)
   by 0x483F876: mythread_wrapper (hg_intercepts.c:387)
   by 0x4F1EEA6: start_thread (pthread_create.c:477)
   by 0x4CBADEE: clone (clone.S:95)
 Address 0x5197520 is 18,416 bytes inside a block of size 19,528 alloc'd
   at 0x48397CF: malloc (vg_replace_malloc.c:307)
   by 0x494DBE1: freeListMalloc (freeListLib.c:95)
   by 0x494DCE8: freeListCalloc (freeListLib.c:68)
   by 0x48C95B2: db_init_events (dbEvent.c:304)
   by 0x48D653C: dbContext::subscribe(epicsGuard<epicsMutex>&, dbChannel*, dbChannelIO&, unsigned int, unsigned long, unsigned int, cacStateNotify&, unsigned int*) (dbContext.cpp:221)
   by 0x48D7468: dbChannelIO::subscribe(epicsGuard<epicsMutex>&, unsigned int, unsigned long, unsigned int, cacStateNotify&, unsigned int*) (dbChannelIO.cpp:119)
   by 0x4EF7771: oldSubscription::oldSubscription(epicsGuard<epicsMutex>&, oldChannelNotify&, cacChannel&, unsigned int, unsigned long, unsigned int, void (*)(event_handler_args), void*, oldSubscription**) (oldSubscription.cpp:43)
   by 0x4EF7099: ca_create_subscription (oldChannelNotify.cpp:573)
   by 0x48D2BE8: dbCaTask (dbCa.c:1249)
   by 0x495E094: start_routine (osdThread.c:439)
   by 0x483F876: mythread_wrapper (hg_intercepts.c:387)
   by 0x4F1EEA6: start_thread (pthread_create.c:477)
 Block was alloc'd by thread #9
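
When a helgrind log contains many such reports, it can help to boil each one down to the two conflicting accesses: which thread, read or write, which locks were held, and the innermost frame. The following is only a sketch of such a filter. The regular expressions are my own guess at the report layout, based on the excerpt above, and the function name is hypothetical.

```python
import re

PREFIX_RE = re.compile(r"^==\d+== ?")  # helgrind's "==PID== " prefix, if present
ACCESS_RE = re.compile(
    r"(?:during|previous) (read|write) of size (\d+).*?by thread #(\d+)")
FRAME_RE = re.compile(r"^\s*(?:at|by) 0x[0-9A-Fa-f]+: (\S+) \(([^()]+:\d+)\)")


def summarize(report):
    """Return one dict per conflicting access found in a helgrind race report."""
    accesses = []
    for raw in report.splitlines():
        line = PREFIX_RE.sub("", raw)
        m = ACCESS_RE.search(line)
        if m:
            accesses.append({"kind": m.group(1), "size": int(m.group(2)),
                             "thread": int(m.group(3)),
                             "locks": None, "top": None})
            continue
        if not accesses:
            continue
        cur = accesses[-1]
        if line.strip().startswith("Locks held:") and cur["locks"] is None:
            cur["locks"] = line.split(":", 1)[1].strip()
        elif cur["top"] is None:
            f = FRAME_RE.match(line)
            if f:  # first frame after the access line is the innermost one
                cur["top"] = "%s (%s)" % (f.group(1), f.group(2))
    return accesses
```

For the report above this would yield a write by thread 27 in db_queue_event_log (dbEvent.c:824) with three locks held, and a read by thread 21 in event_read (dbEvent.c:999) with none.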

https://github.com/epics-base/epics-base/blob/3fadf4a26cfe33dcb9eb9e4620634e1d3a7b9763/modules/database/src/ioc/db/dbEvent.c#L824

https://github.com/epics-base/epics-base/blob/3fadf4a26cfe33dcb9eb9e4620634e1d3a7b9763/modules/database/src/ioc/db/dbEvent.c#L999


$ cat mr-regress.db
record(ao,"testAo_0") {}
record(ao,"testAo_1") {}
record(ao,"testAo_2") {}
record(ao,"testAo_3") {}
record(ao,"testAo_4") {}
record(ao,"testAo_5") {}
record(ao,"testAo_6") {}
record(ao,"testAo_7") {}
record(ao,"testAo_8") {}
record(ao,"testAo_9") {}
record(ai,"testAi_0") {field(INP, "testAo_0 CP") }
record(ai,"testAi_1") {field(INP, "testAo_1 CP") }
record(ai,"testAi_2") {field(INP, "testAo_2 CP") }
record(ai,"testAi_3") {field(INP, "testAo_3 CP") }
record(ai,"testAi_4") {field(INP, "testAo_4 CP") }
record(ai,"testAi_5") {field(INP, "testAo_5 CP") }
record(ai,"testAi_6") {field(INP, "testAo_6 CP") }
record(ai,"testAi_7") {field(INP, "testAo_7 CP") }
record(ai,"testAi_8") {field(INP, "testAo_8 CP") }
record(ai,"testAi_9") {field(INP, "testAo_9 CP") }
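
The database is regular enough to generate, which may be handy if more than ten ao/ai pairs are wanted for a stress test. A small sketch: the function name and the `count` parameter are my own, only the record names and the CP input link are taken from the file above.

```python
def make_regress_db(count=10):
    """Return ao records followed by ai records, each ai reading its ao via a CP link."""
    lines = ['record(ao,"testAo_%d") {}' % i for i in range(count)]
    lines += [
        'record(ai,"testAi_%d") {field(INP, "testAo_%d CP") }' % (i, i)
        for i in range(count)
    ]
    return "\n".join(lines) + "\n"
```

Writing `make_regress_db()` to mr-regress.db should reproduce the file shown; a larger `count` just adds more pairs.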


$ valgrind --tool=helgrind ./bin/linux-x86_64/softIoc -d mr-regress.db


$ ./bin/linux-x86_64/caput testAo_0 1.0
$ ./bin/linux-x86_64/caput testAo_0 2.0
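
The script used to generate the put traffic was not posted. As a rough stand-in, the sketch below simply runs the caput binary in a loop over the ten ao records, alternating between two values. The loop structure, the function names and the injectable `run` argument are assumptions of mine; only the caput path and record names come from the commands above.

```python
import itertools
import subprocess

CAPUT = "./bin/linux-x86_64/caput"  # path as used in the commands above


def caput_commands(n_records=10, values=(1.0, 2.0)):
    """Yield an endless stream of caput argument lists over all records and values."""
    for value in itertools.cycle(values):
        for i in range(n_records):
            yield [CAPUT, "testAo_%d" % i, str(value)]


def spam(iterations, run=subprocess.run):
    """Issue `iterations` puts; `run` is injectable so the loop can be dry-run."""
    for cmd in itertools.islice(caput_commands(), iterations):
        run(cmd, stdout=subprocess.DEVNULL, check=False)
```

Calling `spam(1000)` from the EPICS base top directory, while the IOC runs under helgrind, issues a thousand puts, each from a freshly started CA client. That is slow compared to a single long-lived client, but it needs nothing beyond what is already built.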

