EPICS Controls Argonne National Laboratory

Experimental Physics and
Industrial Control System


Subject: RE: Bus errors accessing VME with base 7.0.6.1 and latest synApps modules
From: Mark Rivers via Tech-talk <tech-talk at aps.anl.gov>
To: Michael Davidsaver <mdavidsaver at gmail.com>
Cc: "tech-talk at aps.anl.gov" <tech-talk at aps.anl.gov>
Date: Thu, 26 May 2022 20:17:28 +0000

Hi Michael,

 

I have now stripped my VME crate down to just 2 cards, the MVME5100 CPU and a TVME220 IP carrier card.  The carrier card has a DAC128V D/A and an IP330 A/D.  Channel 0 of the D/A is connected to channel 0 of the A/D.

 

The startup script is very simple:

**********************************

# vxWorks startup file

 

< cdCommands

 

nfsAuthUnixSet("corvette", 849601092, 849600513, 0, 0)

 

# Mount drives with NFS

nfsMount("corvette","/home","/corvette/home")

nfsMount("corvette","/home","/home")

 

cd topbin

load("CARSTest.munch")

cd startup

 

dbLoadDatabase("$(CARS)/dbd/CARSTestVX.dbd")

CARSTestVX_registerRecordDeviceDriver(pdbbase)

 

ipacAddTVME200("342FA2")

 

initDAC128V("DAC1", 0, 1)

dbLoadTemplate "DAC.template"

 

initIp330("Ip330_1",0,0,"D","-10to10",0,15,120)

configIp330("Ip330_1", 3,"Input", 500,0)

dbLoadTemplate "Ip330_ADC.template"

 

iocInit

**********************************

 

The IPAC driver, DAC128V driver, and IP330 driver have not changed in years.

 

When I build this IOC with base 7.0.5 it works fine; there are no bus errors.

 

When I build this IOC with base 7.0.6.1 and adjust the D/A quickly with a slider for a few seconds while a couple of CA clients are receiving monitors from the A/D, I get the following failure:

 

VME Bus Error accessing A16: 0x347e

machine check

Exception next instruction address: 0x0368ce90

Machine Status Register: 0x0008b032

Condition Register: 0x48000884

Task: 0x27011d0 "CAS-event"

0x27011d0 (CAS-event): task 0x27011d0 has had a failure and has been stopped.

0x27011d0 (CAS-event): The task has been terminated because it triggered an exception that raised the signal 10.

 

This is the task trace on that task:

 

ioc13lab2> tt 0x27011d0

0x0012489c vxTaskEntry  +0x48 : epicsThreadEntry ()

0x036a90d4 epicsThreadEntry+0x80 : 0x036073f8 ()

0x036076d0 db_start_events+0x458: db_delete_field_log ()

0x03606be0 db_delete_field_log+0x54 : freeListFree ()

value = 0 = 0x0

 

You said:

 

> So I'm more confident in claiming the mention of "CAS-event" is false.

> The faulting instruction probably originates on some other scan/driver thread, then there is a context switch to "CAS-event" because of a call to db_post_events().

 

I now strongly suspect the problem is the opposite of that.  I think that the task that is failing is indeed CAS-event, and what is incorrect is the report of a bus error.  The reason I think this is:

- The IPAC, DAC, and IP330 drivers are very well debugged.

- The errors do not happen with base 7.0.5.

- The bus error messages only happen when there are lots of CA monitor events being passed to CA clients.  The errors never occur if there are no CA clients receiving monitors.  That makes no sense in terms of actual bus errors.

- The code in dbEvent.c relating to db_field_log has changed significantly between base 7.0.5 and 7.0.6.1.

 

Mark

 

 

 

 

-----Original Message-----
From: Michael Davidsaver <mdavidsaver at gmail.com>
Sent: Saturday, May 21, 2022 10:04 PM
To: Mark Rivers <rivers at cars.uchicago.edu>
Cc: tech-talk at aps.anl.gov
Subject: Re: Bus errors accessing VME with base 7.0.6.1 and latest synApps modules

 

On 5/21/22 15:34, Mark Rivers wrote:

> > Ok, so all powerpc.  PPC Machine Check exception is asynchronous.

>

> > So I'm more confident in claiming the mention of "CAS-event" is false.

>

> > The faulting instruction probably originates on some other scan/driver thread, then there is a context switch to "CAS-event" because of a call to db_post_events().

>

> I’m not sure I understand the logic.  The other scan/driver threads are always running.  The IP330 is always interrupting at 2 kHz, and doing callbacks to device support.  It runs with no VME bus errors at all until I open an medm screen or run camonitor.  So it seems that the problem must be caused by having the CAS-event task do CA monitors, and it is not just that the CAS-event task is being incorrectly blamed for the problem?

 

Oh, what I say is far from a concrete explanation.  There is clearly something more going on here.

 

I think it could only be true if the faulting operation were a "posted write", meaning the VME bridge buffers the operation, letting the CPU proceed before the VME bus cycle has actually completed.

 

This is why you may sometimes find drivers with a "dummy" load after an important store (e.g. interrupt acknowledge).  Waiting for the load instruction will also wait for the preceding store to complete.
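
The "dummy load" idiom can be sketched in C roughly as follows. This is only an illustration, not code from any of the drivers discussed in this thread: the function name write_and_flush is hypothetical, and on real hardware reg would point into a mapped VME window. On PowerPC an explicit barrier (eieio or sync) may also be needed, depending on how the window is mapped.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: store to a device register, then read it back.
 * With posted writes enabled the bridge may only queue the store; the
 * load cannot return until the queued store has gone out on the bus,
 * so a bus error from the store should surface before we return. */
uint16_t write_and_flush(volatile uint16_t *reg, uint16_t value)
{
    assert(reg != NULL);
    *reg = value;   /* may be posted (buffered) by the bridge */
    return *reg;    /* dummy load: waits for the preceding store */
}
```

On a host machine an ordinary volatile variable can stand in for the register to exercise the function; the read-back only matters when a bridge sits between the CPU and the device.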

 

For example, see the admonishment by Till in the RTEMS Universe II bridge driver:

 

https://github.com/RTEMS/rtems/blob/a316a9ddaeaa8f6316b2a2d29ca82b3ad40d2d22/bsps/powerpc/shared/vme/vmeUniverse.c#L2187-L2189

 

"posted writes" are a configuration option for each address window.

Disabling posted writes for a window may give a more accurate address with the exception, at the expense of some slowdown.
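
As a rough sketch of what "disabling posted writes for a window" amounts to, the helpers below clear a posted-write enable flag in a window control word. Everything named here is an assumption for illustration: the function and macro names are hypothetical, and the bit position (bit 30) is taken from my understanding of the Universe II LSIx_CTL layout and should be checked against the bridge manual. How the control word is read from and written back to the bridge is BSP-specific and not shown.

```c
#include <stdint.h>

/* ASSUMPTION: bit 30 of the window control register is the
 * posted-write enable (PWEN) flag. Verify against the bridge manual. */
#define WINDOW_CTL_PWEN (UINT32_C(1) << 30)

/* Return the control word with posted writes disabled;
 * all other bits are left unchanged. */
uint32_t window_ctl_disable_posted_writes(uint32_t ctl)
{
    return ctl & ~WINDOW_CTL_PWEN;
}

/* Report whether posted writes are enabled in a control word. */
int window_ctl_posted_writes_enabled(uint32_t ctl)
{
    return (ctl & WINDOW_CTL_PWEN) != 0;
}
```

The helpers only transform a value, so they can be checked on a host before being wired to whatever register access routine the BSP provides.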

 

 

Of course, none of this would explain why these particular addresses are being accessed, nor why they fault.

 

 

On 5/21/22 16:45, Mark Rivers wrote:

> > I will also try to make a thin vxWorks IOC application with basically just Industry Pack module support in case it is some strange interaction with another module.

>

> A thin IOC with only seq, asyn, iocStats, ipac,  ip330, dac128V, and ipUnidig does not fail with base 7.0.6.1.

>

> I will add things back in one at a time and see what is actually causing the problem.

>

> Luckily it is a gray and rainy day in Chicago. :)

 

Some people have all the luck.  Today was terribly sunny here in CA :)

 

(I've been here 4.5 years, and I still can't get over the weather!)

 

 

> I will try 7.0.6.

>

> I will also try to make a thin vxWorks IOC application with basically just Industry Pack module support in case it is some strange interaction with another module.

>

> Mark

>

> -----Original Message-----

> From: Michael Davidsaver <mdavidsaver at gmail.com>

> Sent: Saturday, May 21, 2022 4:31 PM

> To: Mark Rivers <rivers at cars.uchicago.edu>

> Cc: tech-talk at aps.anl.gov

> Subject: Re: Bus errors accessing VME with base 7.0.6.1 and latest

> synApps modules

>

> On 5/21/22 11:05, Mark Rivers wrote:

>

>  > > What specific board is involved?  (eg. mvme3100?)

>

>  >

>

>  > The test crate is an MVME5100.  But the production crates that were also failing include several MVME2700 boards as well as some MVME5100.

>

> Ok, so all powerpc.  PPC Machine Check exception is asynchronous.

>

> So I'm more confident in claiming the mention of "CAS-event" is false.  The faulting instruction probably originates on some other scan/driver thread, then there is a context switch to "CAS-event" because of a call to db_post_events().

>

> It still seems odd to me that the CPU could get all the way into

> db_post_events() and wake up "CAS-event" before a VME cycle completes. 

> (maybe there are VME timeouts happening?)

>

> In addition to Base 7.0.5 and 7.0.6.1, could you test with 7.0.6?

>

> This might narrow things down a little.

>

> Since you already have a version range to suspect, you could try to narrow down further with git-bisect.

>

> (although I can't honestly recommend this as a good way to pass what

> for me is a nice Saturday afternoon.)

>

> https://git-scm.com/docs/git-bisect

>

>  > git bisect start R7.0.6.1 R7.0.5

>

 


