Subject: vxWorks mv167 BSP memory FAQ
From: [email protected] (John R. Winans)
Date: Wed, 26 Oct 1994 15:38:04 -0600
In light of the large number of email messages I have been getting
lately about memory and the BSPs for vxWorks, I offer the following:
This is the start of a FAQ I have been working on about how to deal with
memory when writing device drivers.
--John
P.S. This does not yet address the devLib code. I hope that Jeff H. might be
able to add stuff about it as is needed.
==========================================================================
First, let's go over how the VME system defines the memory spaces, and EXACTLY
what the BSP from vxWorks does to handle the VME bus.
There are 3 regions of memory, a 16-bit addressed range called A16 (or SHORT)
that contains 64KB, a 24-bit addressed range called A24 (or STD) that
contains 16MB, and a 32-bit addressed range called A32 (or EXT) that contains
4GB.
The processor mode is also taken into account by the VME bus. The allowed
modes are USR and SUP (for user and supervisor). (These modes are not a
concern to EPICS because it ALWAYS runs in the same mode, but they have to
be taken into account.)
The type of data access is also represented on the VME bus. The types are used
to represent the intended use of the data being accessed. These can be PGM for
instruction text, DATA for accesses to non-instruction data, and BLK for
bulk block transfers.
The VME bus defines something called an AMODE; the AMODE is the code that
represents the memory region, processor mode, and data type. These are
expressed by vxWorks in the file 'vme.h' as the following:
#define VME_AM_STD_SUP_ASCENDING 0x3f
#define VME_AM_STD_SUP_PGM 0x3e
#define VME_AM_STD_SUP_DATA 0x3d
#define VME_AM_STD_USR_ASCENDING 0x3b
#define VME_AM_STD_USR_PGM 0x3a
#define VME_AM_STD_USR_DATA 0x39
#define VME_AM_SUP_SHORT_IO 0x2d
#define VME_AM_USR_SHORT_IO 0x29
#define VME_AM_EXT_SUP_ASCENDING 0x0f
#define VME_AM_EXT_SUP_PGM 0x0e
#define VME_AM_EXT_SUP_DATA 0x0d
#define VME_AM_EXT_USR_ASCENDING 0x0b
#define VME_AM_EXT_USR_PGM 0x0a
#define VME_AM_EXT_USR_DATA 0x09
(The 'ASCENDING' ones are WRS's version of block transfer.)
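As a mnemonic, the structure of these codes can be decoded in a few lines of C. This is a sketch read straight off the table above (the function names are made up, and it only claims to cover the codes listed, not the full VME AM space):

```c
/* Decode the vxWorks VME_AM_* codes listed above.  The pattern from the
   table: 0x29/0x2d are A16, 0x39-0x3f are A24, 0x09-0x0f are A32; bit 2
   selects SUP vs USR; for A24/A32 the low two bits select DATA, PGM, or
   block ("ascending") cycles. */

const char *amRegion(int am)
{
    if (am == 0x29 || am == 0x2d)   return "A16";
    if (am >= 0x39 && am <= 0x3f)   return "A24";
    if (am >= 0x09 && am <= 0x0f)   return "A32";
    return "other";
}

const char *amMode(int am)
{
    return (am & 0x04) ? "SUP" : "USR";
}

const char *amType(int am)
{
    if (am == 0x29 || am == 0x2d)   return "SHORT_IO";
    switch (am & 0x03) {
    case 1:  return "DATA";
    case 2:  return "PGM";
    case 3:  return "ASCENDING";    /* WRS's name for block transfer */
    default: return "?";
    }
}
```

So, for example, 0x3d decodes as A24 SUP DATA, matching VME_AM_STD_SUP_DATA above.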
Each of the AMODEs can be accessed using either 8-bit (D8), 16-bit (D16),
or 32-bit (D32) data transfers. (The D8 width can be set to transfer the
8-bit value on either the even and odd or just the odd portion of the 16-bit
bus. Therefore the D8 access width is properly specified as D8(EO) or D8(O).)
I/O boards on the VME backplane have to decode the AMODE that they are
supposed to respond to as well as the data access widths that they support.
The AMODE is typically set via jumpers on the I/O board, and the data width
is normally hardwired into the board.
Depending on the supported/selected AMODE that an I/O board is using, it has to
also decode the address lines representing the range of addresses that the
board is going to respond to. Obviously, when operating in an A16 mode, there
are 16 address lines, A24 has 24, and A32 has 32.
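For illustration, here is roughly what that decode looks like from the board's point of view for an A24 data access (hypothetical C; boardBase and boardSize stand in for jumper settings, and the AM code values are the ones from vme.h above):

```c
/* Sketch of an A24 board's address decode: respond only to A24 data-access
   AM codes, and only within the jumpered address window.  Only 24 address
   lines exist in A24, so the upper bits are masked off. */
int a24BoardResponds(unsigned long vmeAdrs, int am,
                     unsigned long boardBase, unsigned long boardSize)
{
    if (am != 0x39 && am != 0x3d)       /* USR or SUP A24 data access only */
        return 0;
    vmeAdrs &= 0x00FFFFFFUL;            /* A24 has 24 address lines */
    return vmeAdrs >= boardBase && vmeAdrs < boardBase + boardSize;
}
```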
It is possible for a board to decode and respond to more than one AMODE and/or
data width. It is also possible for two separate boards to respond to the
same address range in the same memory region, but to different processor modes
and/or data widths. The VME bus will operate properly in either case.
===
Before continuing, it should be pointed out that the modern CPU boards contain
devices and memory along with the CPU. This can cause some interesting
confusion because the CPU card might be able to access devices that are on
its own board, but other boards on the VME backplane might NOT be able to.
It is up to the designer of the CPU card to decide what can be accessed from
the CPU and what can be accessed from the VME.
Since CPUs generally don't 'speak' VME directly, the VME bus has to be
interfaced to the CPU as if it itself were a peripheral device. Most CPUs
don't know squat about things like the D8(EO) mode and so on. Therefore
they are 'faked' by the VME interface. The common way to do that is to reserve
sections of the CPU's address space and assign each one to a VME AMODE and
data transfer width. Then when the CPU wants to get at something in one of
the VME areas, it has to access it by referencing the section of the CPU's
address space that was allocated to the one of interest.
The rest of this document will use the term CPU address to refer to the address
being used by the CPU itself, and the term VME region to refer to a specific
AMODE and data transfer width.
===
Here is the view of how the distributed BSP sets up the mv167 to deal with
allowing accesses to and from each of the AMODEs and data widths in release
5.1.1 of vxWorks.
Realize that there are a number of things on the mv167 itself that consume a
range of addresses. The most obvious is the DRAM. There are also a number of
registers and such that are used to operate the mv167's included devices.
The mv167 has on it a 68040 CPU that is wired in such a way that it can
address exactly 4GB of memory. The CPU's addressing modes (which are similar
to the VME ones, by the way) are essentially ignored.
Motorola has hard-coded a number of addresses on the mv167, and therefore they
can NOT be changed. These are the device addresses for the I/O devices on the
mv167 (like the ethernet controller, serial ports, SCSI bus, etc.) This
reserved area is from CPU address 0xFFF00000 to 0xFFFEFFFF. Devices here
are not visible from the VME bus. And since this range is in use, it can not
be assigned to anything else... like a VME region.
The mv167 also has some memory on it that can be programmed to start just about
anywhere in the CPUs address space.
The distributed vxWorks BSP makes the following 'configurations':
1) Place the DRAM at CPU address 0
2) If there is 4MB of DRAM on the CPU board, place the DRAM in all the
VME_AM_STD_* AMODEs starting at 0x00400000.
3) Place the DRAM in all the VME_AM_EXT_* AMODEs starting at two times
sizeof(DRAM).
4) Configure the CPU address range from sizeof(DRAM) thru 0xEFFFFFFF to
access VME_AM_EXT_USR_DATA using the D32 transfer width such that the
addresses on the VME bus are the same as those generated by the CPU. This
is important to note because this way the CPU can NEVER generate a
VME_AM_EXT_USR_DATA D32 reference to any address less than sizeof(DRAM).
5) Configure the CPU address range from 0xF0000000 to 0xF0FFFFFF to access
VME_AM_STD_USR_DATA using the D16 transfer width such that the addresses
on the VME are the low 24 bits of those generated by the CPU. (Note that
this covers the full address range of A24 for USR_DATA D16. And that CPU
address 0xF0000000 is VME A24 USR DATA D16 address 0.)
6) Configure the CPU address range from 0xFFFF0000 thru 0xFFFFFFFF to access
VME_AM_SUP_SHORT_IO using the D16 transfer width such that the addresses
on the VME are the low 16 bits of those generated by the CPU (same game
as for the A24 space.)
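The six steps above can be sketched as a single classification function. This is illustrative C, not BSP source; the function name is invented, and dramSize stands for sizeof(DRAM):

```c
/* Sketch of the mv167 CPU address map set up by the distributed 5.1.1 BSP,
   per steps 1-6 above.  Returns a description of what a given CPU address
   reaches; the gap between the A24 window and the on-board I/O area is
   left unmapped. */
const char *mv167CpuAdrsMap(unsigned long cpuAdrs, unsigned long dramSize)
{
    if (cpuAdrs < dramSize)
        return "on-board DRAM";
    if (cpuAdrs <= 0xEFFFFFFFUL)
        return "A32 EXT_USR_DATA D32";   /* VME adrs == CPU adrs */
    if (cpuAdrs >= 0xF0000000UL && cpuAdrs <= 0xF0FFFFFFUL)
        return "A24 STD_USR_DATA D16";   /* VME adrs = low 24 bits */
    if (cpuAdrs >= 0xFFF00000UL && cpuAdrs <= 0xFFFEFFFFUL)
        return "on-board I/O devices";   /* Motorola's reserved area */
    if (cpuAdrs >= 0xFFFF0000UL)
        return "A16 SUP_SHORT_IO D16";   /* VME adrs = low 16 bits */
    return "unmapped";
}
```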
Notice a few things here. The A32 space is accessed by the CPU from
sizeof(DRAM) thru 0xEFFFFFFF. BUT the DRAM on the CPU board is placed into
the same A32 space from two times sizeof(DRAM) to three times sizeof(DRAM).
Basically, we have a situation where the CPU board can access the VME bus
at addresses where the CPU board itself is also 'responding'. This is a no-no,
and it is taken care of by simply not accessing that area. (The same thing
happens in the A24 area when there is 4MB of DRAM on the CPU card.)
==========================================================================
vxWorks and the mv167's MMU and cache system.
As a completely separate issue from the VME stuff discussed above, you also
have to concern yourself with the fact that vxWorks turns on virtual memory
translation and the cache system. This manifests itself such that even though
all the VME and DRAM stuff is set up as outlined above, it can be rearranged
'virtually' by the MMU.
As it turns out, the only reason that the MMU is turned on by the BSP is so
that the cache system can be enabled. This is fine and dandy, except the BSP
has to deal with the fact that the MMU is doing things too. Specifically, it
has to build the MMU configuration tables for the CPUs address space. This
takes time and consumes memory. So WRS decided to only build these tables
to go from CPU address 0 to 0x02FFFFFF. Further they enable the cache from 0
thru 0x01FFFFFF.
Nice huh? What we really WANT is to have the MMU map everything from 0 to
0xFFFFFFFF. And to only have the cache on where the DRAM is.
So the outcome is that the only 'usable' A32 space starts at three times
sizeof(DRAM) and ends at 0x02FFFFFF.
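In C terms (a sketch, assuming the stock 5.1.1 MMU table limit of 0x02FFFFFF and a hypothetical function name):

```c
/* Is a given A32 address usable from the CPU under the distributed BSP?
   It must start past the VME image of the CPU's own DRAM (3 * sizeof(DRAM))
   and end at the limit of what the BSP's MMU tables actually map. */
int a32CpuAccessible(unsigned long a32Adrs, unsigned long dramSize)
{
    return a32Adrs >= 3UL * dramSize && a32Adrs <= 0x02FFFFFFUL;
}
```

For a 4MB CPU that leaves the window 0x00C00000 thru 0x02FFFFFF.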
==========================================================================
>I noticed JRW modified address modifiers USR to SUP for AM_STD and AM_EXT
>modes. What was the motivation behind this?
It was either that, or restrap all our hardware when switching between
hkv20 and mv167 CPUs (the WRS BSP defaults for hkv20 used SUP for them and
the mv167 used USR.) It seemed pretty reasonable to make them the same... and
to do so in a way that creates the least work for everyone.
>Why did you change the starting base addresses in the BSP for A32 and A24.
I changed the starting base address for the A32 and A24 regions to 0. But
it should not be visible to anyone properly using the sysLocalToBus() stuff.
IMHO the defaults represent a major lack of understanding by WRS as to how
the vmeChip2 operates. Starting them all over the place fragments the VME
memory space and confuses anyone trying to configure add-on boards.
As an example, an mv167 with 4MB RAM using the distributed BSP has its
local view of the DRAM starting at CPU address 0, and the CPU's view of A32
space starts at the end of DRAM (0x00400000). But then the image of the
CPU's DRAM is mapped into A32 at address 0x00800000. Therefore there are
holes in A32 that the CPU can not access: from 0 to 0x003FFFFF,
because that is where the CPU sees its own DRAM, and again from
0x00800000 to 0x00BFFFFF. However, other boards COULD be placed into
A32 at address 0 AND be accessed by anything BUT the CPU. Pretty stupid.
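Those two holes can be written down as a sketch (illustrative C, not BSP source; dramSize is 0x400000 for the 4MB case, and the MMU mapping limit is ignored here):

```c
/* Under the distributed BSP defaults, which A32 addresses can the CPU
   actually reach over the VME bus?  Two ranges are lost: [0, dramSize)
   is where the CPU sees its own DRAM instead of the bus, and
   [2*dramSize, 3*dramSize) is where the DRAM's VME image responds. */
int a32ReachableFromCpu_defaultBsp(unsigned long a32Adrs,
                                   unsigned long dramSize)
{
    if (a32Adrs < dramSize)
        return 0;                       /* CPU sees its own DRAM here */
    if (a32Adrs >= 2UL * dramSize && a32Adrs < 3UL * dramSize)
        return 0;                       /* DRAM image answers here too */
    return 1;
}
```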
My way places the DRAM at CPU address 0, and the CPU's view of A32 space still
starts at 0x00400000. The image of the DRAM also starts at 0 in A32.
Now we have no hidden hole and less wasted/fragmented space. We
also have an easier time configuring the cache and MMU if we want to use
them. It also prevents the starting address from moving around when the
DRAM size changes.
The change to where memory starts in A24 is less important, but prevents
conflicts with the historical locations of I/O cards like the DVX2502.
Other changes to A24 allow CPUs with more than 4MB to place some or all of
their DRAM into the A24 space so that DMA devices that only know how to talk
to A24 (like the NI1014 and DVX2502) can access some CPU memory.
>I noticed that AM_EXT uses D32 while AM_STD and AM_SHORT uses D16; which
>might account for the bus errors I get while accessing A16.
That has caused us headaches too. Just a gotcha to remember about when
setting up your I/O boards. A16 space uses D16 transfers ONLY. It is
possible to use one of the unused vmeChip2 mapping registers to map a
second A16 region in somewhere. The problem with that is that
sysBusToLocalAdrs() will ALWAYS return the addresses for the A16-D16 region
unless you hack it up as well (no biggie, but the short-sighted choices by
WRS prevent a proper implementation.)
>Why can't I use addresses above sysMemTop() and below 0xefffffff for my
>A32/D32 boards? I get bus errors.
You should look closer at the lunacy in the BSP. Wind River jerks around
with the MMU mappings and the cache stuff. The cache and MMU are set up
as described above. Therefore you can not expect to access anything in A32
that starts before 3*sysMemTop() or after 0x02FFFFFF.
If you want to change the MMU stuff, you can change the addresses and such in
the BSP. The reference manual discusses how to do it.
If you want to just bypass the MMU altogether, you can do what I did and enable
the transparent translation registers, thus defeating the MMU config garbage.
I enabled transparent translation from 0x80000000 thru 0xffffffff which covers
a large chunk of A32 space (as well as all the A24, local I/O and A16, but
that is OK.)
To do this, I put a sysSetTTRegs() function into sysALib.s because configuring
the MMU to cover anything over 0x10000000 causes the booting time to exceed
something like 30 seconds while it wastes memory building the page tables.
>Have you talked to Wind Rivers about this?
Yes... Feel free to ask them about TSR 12979 (submitted by John Winans
circa August 1993.) They got nowhere really fast (far as I could tell.)
Their primary response was to do the stuff mentioned on the mailing list
(which is similar to what I ended up with after talking to Rozelle Wright
at LANL about her ideas on a solution.)
[Gee thanx for your help WRS... This big-$$ licence sure seems worth it now.]
>How do I add the second piggy-back memory board to my mv167?
You simply plug it in... and watch vxWorks ignore it :-(
If you WANT to use a second memory board on your mv167, you MUST hack
the BSP to enable it and add it to the free memory pool. I have done
this in the below mentioned BSP.
>Have you contacted WRS about this?
Ask them about TSR 16040 (submitted by John Winans circa March 1993 or SPR
1484.) They have it under advisement. Last I spoke to them they told me
HOW to turn on the memory board, but their way required a different boot
image for CPUs with and without the second card... and broke things like
sysLocalToBusAdrs(). My way allows the same image to operate both
configurations and allows sysLocalToBusAdrs() to work... however it lies
about where sysMemTop really is. I got over that pretty fast since there is
no sane reason for EPICS to use it.
>When using the distributed BSP from WRS, how come I can't
>sysLocalToBusAdrs() a DRAM address for the A24 space when I have more
>than 4MB DRAM?
Again, look closely at the above discussion on the VME bus and the BSP
memory configurations.
Basically, someone at WRS decided to hard-code an 'if' statement that said
"if there is 4MB ram, put it into A24 from 0x400000 to 0x7fffff, otherwise
leave it out of A24 completely."
In my BSP the 'if' reads "if there is 8MB or less, put it into A24
starting at 0x000000, otherwise leave it out, but allow some to be
dynamically mapped in at a later time on demand."
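The two policies, paraphrased as C (illustrative only; neither function is actual BSP source):

```c
/* Where does each BSP map the CPU's DRAM into A24?  Returns the A24 base
   address, or -1 if the DRAM is left out of A24 entirely. */
#define MB (1024UL * 1024UL)

long a24DramBase_distributed(unsigned long dramSize)
{
    /* WRS: only the exact 4MB configuration gets mapped, at 0x400000 */
    return (dramSize == 4 * MB) ? 0x400000L : -1L;
}

long a24DramBase_apsBsp(unsigned long dramSize)
{
    /* APS: anything that fits (8MB or less) gets mapped, starting at 0;
       larger CPUs can still have some DRAM mapped in later, on demand */
    return (dramSize <= 8 * MB) ? 0x000000L : -1L;
}
```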
A little insight to EPICS initialization...
When EPICS (3.11 and later) boots, the drivers that need to do DMA into
A24 space (like the GPIB and DVX drivers) can call the EPICS devlib
function devLibA24Malloc() to get memory that exists in A24 such that a
call to sysLocalToBusAdrs() will work. See the devLib.c source and DOCs
for more info about devLibA24Malloc().
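The driver-side pattern looks roughly like this. devLibA24Malloc() and sysLocalToBusAdrs() are the real EPICS/vxWorks entry points; the stub bodies below are ONLY stand-ins so the sketch is self-contained off-target:

```c
/* Sketch of the EPICS 3.11+ driver pattern described above: get a DMA
   buffer that lives in A24-visible memory, then translate it to the bus
   address the I/O board must be programmed with. */
#include <stdlib.h>

#define VME_AM_STD_USR_DATA 0x39

/* --- stand-ins for the real EPICS devLib / vxWorks BSP calls --- */
static void *devLibA24Malloc(size_t size)
{
    return malloc(size);        /* the real call allocates from A24 memory */
}
static int sysLocalToBusAdrs(int adrsSpace, char *localAdrs, char **pBusAdrs)
{
    (void)adrsSpace;
    *pBusAdrs = localAdrs;      /* the real BSP applies the A24 mapping */
    return 0;                   /* OK */
}

/* driver init: obtain an A24-reachable DMA buffer and its bus address */
char *getA24DmaBuffer(size_t size, char **pBusAdrs)
{
    char *local = devLibA24Malloc(size);
    if (local == NULL)
        return NULL;
    if (sysLocalToBusAdrs(VME_AM_STD_USR_DATA, local, pBusAdrs) != 0)
        return NULL;            /* translation failed: not A24-visible */
    return local;
}
```

The point is that the driver never hard-codes where A24 memory lives; it asks devLib for the buffer and asks the BSP for the bus address.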
>Why is this memory stuff so confusing?
My guess is that it was a provide-all to please every type of design.
It also has to do with the differences between 3U and 9U VME cards... the
smaller ones do not have all the signals that the larger ones do.
FYI:
====
I made a tar file containing our mv167 BSP that we use here at the APS for
vxWorks version 5.1.1. It is in:
phoebus.aps.anl.gov:~winans/vwbsp.tar.gz
It is intended for use with EPICS releases 3.11 and later. It has not
been tested with earlier releases. I must also mention that it is ALWAYS
in a state of flux. So you might want to email me if you have a specific
question or problem with it.
I welcome suggestions and corrections to this FAQ. I also welcome suggestions
on how to better deal with all of this stuff.
! John Winans Advanced Photon Source (Controls) !
! [email protected] Argonne National Laboratory, Illinois !
! !
!"The large print giveth, and the small print taketh away." - Tom Waits !