Experimental Physics and Industrial Control System
Rees, NP (Nick) wrote:
This is really a follow-up to the 2002 thread on PowerPC caching:
http://www.aps.anl.gov/epics/tech-talk/2002/msg00417.php
That thread ended up talking about write posting through the VME bridge;
I'm not sure whether it's the same thing you're talking about or not.
At Diamond we use the mvme5500. It seems that every vxWorks driver we
have will fall over at some point unless we are careful with memory
caching. Very few EPICS drivers do this properly. However, on the
mvme5500 if we precede every read with a cacheInvalidate and follow
every write routine with a cacheFlush we don't seem to have any
problems.
Caching should only be an issue if you have two bus masters accessing
the same area of memory; i.e. if there is a DMA controller or a second
CPU involved in the I/O operation. Not many EPICS drivers actually use
a DMA controller, so most drivers don't need to worry about cache
coherency issues.
There is an option in Marty Kraimer's Generic Transient Recorder support
to make use of some BSP-supplied DMA routines that I wrote for the
Universe-2 and VMEchip2 bridges. My implementation of this VME DMA
support *should* perform the necessary cache manipulation operations at
the appropriate time (the requirements are different on different boards
though; most boards can do bus snooping, but some do not and require the
driver to call cacheInvalidate() before reading and cacheFlush() after
writing the shared area).
Later, Rees, NP (Nick) wrote:
Peter Denison gave me the following link to the Linux kernel sources
which gives a reasonable description of sync, lwsync, eieio and wmb, rmb
etc.
http://lxr.linux.no/source/include/asm-ppc64/system.h#L18
I assume that at the bottom of cacheFlush and cacheInvalidate is a sync
instruction on the PPC, but I might be wrong. I certainly don't see it.
Not necessarily, since sync is an expensive instruction on the PowerPC
which you would really rather not execute if you can avoid it. A
cacheFlush() will perform other instructions that ensure the cache area
specified is definitely pushed out to memory before it returns, but that
doesn't necessarily need a sync.
Note that all the VMEbus windows should be marked as Non-Cacheable and
Guarded in your BSP's sysPhysMemDesc[] table.
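For reference, such an entry in a BSP's `sysPhysMemDesc[]` table might look like the fragment below. The field layout and `VM_STATE_*` macros follow the usual vxWorks conventions, but the window address and size here are made up for illustration; check your own BSP's `sysLib.c`.

```c
/* Illustrative sysPhysMemDesc[] entry for a VME window
 * (addresses are examples only): */
{
    (VIRT_ADDR) 0xf0000000,          /* virtual address  */
    (PHYS_ADDR) 0xf0000000,          /* physical address */
    0x01000000,                      /* 16 MB window     */
    VM_STATE_MASK_VALID | VM_STATE_MASK_WRITABLE |
        VM_STATE_MASK_CACHEABLE | VM_STATE_MASK_GUARDED,
    VM_STATE_VALID | VM_STATE_WRITABLE |
        VM_STATE_CACHEABLE_NOT | VM_STATE_GUARDED
},
```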
The problem we have found with the 5500 is, I think, related to the
problem outlined in the 2002 tech-talk thread. The processor MMU may
know the VME memory isn't cached, but there is also a cache in the VME
chip (why, I don't know) and that has to be forced as well.
Ok, that's a different issue and not a cache; the Universe-2 VME bridge
chip has a write pipeline which enables a PCI bus write cycle to
complete before the matching VMEbus cycle does. Cycles on VMEbus are
generally much slower than on the PCIbus, and there's no reason to hold
up the processor just to wait for the cycle to complete, so the bridge
chip has a write FIFO built into it that allows it to queue up a series
of write operations. This speeds up the processor operation quite a
bit, at the expense of having to be a little more careful in the
software (and a loss of synchronization with VME Bus errors, but that's
a slightly different topic).
The bridge ensures that all pending VME write operations are performed
before it will execute any subsequent VMEbus read, for which it has to
delay the PCIbus cycle since it can't complete that without getting the
read data from the VME cycle anyway. This ensures that the I/O cycles
are still performed in the same order that the CPU requested them, but
the timing may be different.
Usually the different timing doesn't matter, but there is one
circumstance where it does - clearing interrupt requests from VME
boards. Here an Interrupt Service Routine needs to ensure that any
write cycle going to a VME slave board that causes it to deassert an IRQ
line gets out to the board before we return from the ISR, otherwise
nasty things happen. The way to guarantee this is to perform a dummy
read from some register on the VME board at the end of the ISR, which
pushes the VMEbus writes out and thus clears the IRQ condition.
This should be sufficient to fix your problem; the cacheFlush and/or
cacheInvalidate that you were calling were inserting a delay into the
CPU instruction chain which was probably sufficient time for the VME
write pipeline to flush itself naturally. Howere this is not
guaranteed, so I suggest you fix these with the VME read instead.
- Andrew
--
Not everything that can be counted counts,
and not everything that counts can be counted.
-- Albert Einstein