Experimental Physics and
The tell-tale signature is a message which says: "invalid block at <some hex address> deleted". If you ever see that message, then your heap has been corrupted by the bug.

The symptoms of the heap corruption can be quite severe. In our case, the system acted as if memory was completely exhausted, requiring a system reset. In one case, the IOC would not run reliably for more than a day without heap corruption.

We implemented a work-around in VxStats by re-writing memPartInfoGet(). With this work-around, the unstable system has been running the better part of a week now without failure.

I suggest that anyone who has had unexplained heap corruption on IOCs which use VxStats either log all console messages and look for the signature, try running without VxStats, or try our work-around.

Attached to this message is the WRS SPR information, and the patch to VxStats.

HTH -- Larry

/* Added by LTH because memPartInfoGet() has a bug when "walking" the list */
#include "semLib.h"
#include "dllLib.h"
#include "smObjLib.h"
#include "private/memPartLibP.h"

static STATUS memInfoGet(
    MEM_PART_STATS * ppartStats    /* partition stats structure */
    )
{
    FAST PART_ID partId = memSysPartId;
    BLOCK_HDR *  pHdr;
    DL_NODE *    pNode;

    ppartStats->numBytesFree     = 0;
    ppartStats->numBlocksFree    = 0;
    ppartStats->numBytesAlloc    = 0;
    ppartStats->numBlocksAlloc   = 0;
    ppartStats->maxBlockSizeFree = 0;

    if (ID_IS_SHARED (partId))    /* partition is shared? */
    {
        /* shared partitions not supported yet */
        return (ERROR);
    }

    /* partition is local */
    if (OBJ_VERIFY (partId, memPartClassId) != OK)
        return (ERROR);

    /* take and keep semaphore until done */
    semTake (&partId->sem, WAIT_FOREVER);

    /* walk the free list directly; block sizes are kept in 2-byte words */
    for (pNode = DLL_FIRST (&partId->freeList);
         pNode != NULL;
         pNode = DLL_NEXT (pNode))
    {
        pHdr = NODE_TO_HDR (pNode);
        ppartStats->numBlocksFree++;
        ppartStats->numBytesFree += 2 * pHdr->nWords;
        if (2 * pHdr->nWords > ppartStats->maxBlockSizeFree)
            ppartStats->maxBlockSizeFree = 2 * pHdr->nWords;
    }

    ppartStats->numBytesAlloc  = 2 * partId->curWordsAllocated;
    ppartStats->numBlocksAlloc = partId->curBlocksAllocated;

    semGive (&partId->sem);
    return (OK);
}

SPR TITLE: memPartInfoGet() and memPartAvailable() can cause memory corruption
SPR #: 30316
STATUS: Assigned
IDE: Tornado 2.2
RTOS: VxWorks 5.5
PRODUCT NAME: VxWorks 5.5
RELEASE STATUS: FCS
PRODUCTS AFFECTED: VxWorks
HOST: All
HOST OS: All
ARCH FAMILY: All
PROCESSOR FAMILY: N/A
PROCESSOR: All
BSP:
DESCRIPTION: Under some conditions it may be possible that calling the routine memPartBlockIsValid() will induce memory corruption. This routine is called by both memPartInfoGet() and memPartAvailable(). Also, memPartAvailable() is called by memPartShow() and memShow().
PATCHES:
DATE CREATED: Feb 15 2000
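
For anyone who wants to try the free-list arithmetic from the patch off-target, the following is a minimal host-side sketch of the same walk. It is not VxWorks code: FreeNode, FreeBlock, FreeStats, node_to_block() and free_list_stats() are hypothetical stand-ins invented here for DL_NODE, BLOCK_HDR, MEM_PART_STATS, NODE_TO_HDR and the loop in memInfoGet(), and it leaves out the semaphore and the partition checks. It only illustrates counting free blocks and converting nWords to bytes without validating each block.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a doubly linked free-list node (not DL_NODE). */
typedef struct free_node {
    struct free_node *next;
    struct free_node *prev;
} FreeNode;

/* Hypothetical stand-in for a block header (not the real BLOCK_HDR layout). */
typedef struct {
    unsigned nWords;   /* block size in 2-byte words, as in the patch */
    FreeNode node;     /* links the block into the free list */
} FreeBlock;

/* Only the free-list fields of the statistics are modelled here. */
typedef struct {
    unsigned long numBytesFree;
    unsigned long numBlocksFree;
    unsigned long maxBlockSizeFree;
} FreeStats;

/* Recover the enclosing block from its list node (models NODE_TO_HDR). */
static FreeBlock *node_to_block(FreeNode *n)
{
    return (FreeBlock *)((char *)n - offsetof(FreeBlock, node));
}

/* Walk the list from 'first' to the NULL terminator and accumulate stats.
 * No per-block validation is done; the list is only read. */
static void free_list_stats(FreeNode *first, FreeStats *s)
{
    FreeNode *n;

    assert(s != NULL);
    s->numBytesFree = 0;
    s->numBlocksFree = 0;
    s->maxBlockSizeFree = 0;

    for (n = first; n != NULL; n = n->next) {
        unsigned long bytes = 2UL * node_to_block(n)->nWords;

        s->numBlocksFree++;
        s->numBytesFree += bytes;
        if (bytes > s->maxBlockSizeFree)
            s->maxBlockSizeFree = bytes;
    }
}
```

Linking three blocks of 8, 32 and 4 words and calling free_list_stats() on the first node should report 3 free blocks, 88 free bytes and a largest free block of 64 bytes.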
ANJ, 10 Aug 2010