Experimental Physics and Industrial Control System


 

Subject: bug in VxWorks memPartInfoGet (used by VxStats)
From: "Hoff, Lawrence" <[email protected]>
To: [email protected]
Date: Sat, 31 May 2003 19:33:14 -0400

At SNS, we ran into a bug in memPartInfoGet() while running VxStats on VxWorks kernel 5.4.2. Apparently memPartInfoGet() can corrupt the heap if another task preempts it and performs a heap operation (e.g. malloc/free). There is an SPR covering this bug, which seems to imply that the bug also exists in kernel version 5.5. It is not clear which earlier versions of VxWorks might also be affected, but the SPR is dated more than 3 years ago.

	The tell-tale signature is a console message of the form:
"  invalid block at <some hex address> deleted"

	If you ever see that message, then your heap has been
corrupted by the bug.

	The symptoms of the heap corruption can be quite severe. In
our case, the system acted as if memory was completely exhausted,
requiring a system reset. In one case, the IOC would not run reliably
for more than a day without heap corruption.

	We implemented a work-around in VxStats by re-writing
memPartInfoGet(). With this work-around, the unstable system
has been running the better part of a week now without failure.

	If anyone has had unexplained heap corruption on IOCs which
use VxStats, I suggest that they either log all console messages
and look for the signature, try running without VxStats, or try
our work-around.
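
	For anyone who logs the console output, the check for the
signature can be automated. What follows is only a minimal sketch in
portable C, not part of VxStats or of the attached patch. The function
name line_has_heap_signature() is made up for illustration, and the
sketch assumes the address is printed as hex digits with an optional
"0x" prefix. The exact console format may differ between VxWorks
versions.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: return 1 if a console log line contains the
 * signature "invalid block at <hex address> deleted", otherwise 0.
 * Assumption: the address is a run of hex digits, optionally
 * preceded by "0x" or "0X".
 */
int line_has_heap_signature(const char *line)
{
    static const char prefix[] = "invalid block at ";
    static const char suffix[] = " deleted";
    const char *p;

    if (line == NULL)
        return 0;

    p = line;
    while ((p = strstr(p, prefix)) != NULL) {
        const char *q = p + strlen(prefix);
        const char *digits;

        /* skip an optional 0x / 0X prefix */
        if (q[0] == '0' && (q[1] == 'x' || q[1] == 'X'))
            q += 2;

        /* consume the hex address */
        digits = q;
        while (isxdigit((unsigned char)*q))
            q++;

        /* need at least one digit, followed by " deleted" */
        if (q > digits && strncmp(q, suffix, strlen(suffix)) == 0)
            return 1;

        p += 1;   /* no match here, keep searching further along */
    }
    return 0;
}
```

	A log filter could call this on each captured line and raise
an alarm on the first match.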

	Attached to this message is the WRS SPR information, and
the patch to VxStats.

HTH -- Larry



/* Added by LTH because memPartInfoGet() has a bug when "walking" the list */

#include "semLib.h"
#include "dllLib.h"
#include "smObjLib.h"
#include "private/memPartLibP.h"

static STATUS memInfoGet(
    MEM_PART_STATS * ppartStats    /* partition stats structure */
    )
{
    FAST PART_ID partId = memSysPartId;
    BLOCK_HDR *  pHdr;
    DL_NODE *    pNode;

    ppartStats->numBytesFree     = 0;
    ppartStats->numBlocksFree    = 0;
    ppartStats->numBytesAlloc    = 0;
    ppartStats->numBlocksAlloc   = 0;
    ppartStats->maxBlockSizeFree = 0;

    if (ID_IS_SHARED (partId))    /* partition is shared? */
    {
        /* shared partitions not supported yet */
        return (ERROR);
    }

    /* partition is local */
    if (OBJ_VERIFY (partId, memPartClassId) != OK)
        return (ERROR);

    /* take and keep semaphore until done */
    semTake (&partId->sem, WAIT_FOREVER);

    /* walk the free list; block sizes are kept in 2-byte words */
    for (pNode = DLL_FIRST (&partId->freeList);
         pNode != NULL;
         pNode = DLL_NEXT (pNode))
    {
        pHdr = NODE_TO_HDR (pNode);

        ppartStats->numBlocksFree++;
        ppartStats->numBytesFree += 2 * pHdr->nWords;
        if (2 * pHdr->nWords > ppartStats->maxBlockSizeFree)
            ppartStats->maxBlockSizeFree = 2 * pHdr->nWords;
    }

    ppartStats->numBytesAlloc  = 2 * partId->curWordsAllocated;
    ppartStats->numBlocksAlloc = partId->curBlocksAllocated;

    semGive (&partId->sem);
    return (OK);
}
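
The patch above holds the partition semaphore for the whole walk of the free list and does not call memPartBlockIsValid(). The same pattern can be tried off-target. The following is a rough portable sketch only: free_block, partition, part_stats and partition_free_stats() are hypothetical stand-ins invented for illustration, not VxWorks types or routines, and a POSIX mutex stands in for the partition semaphore.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for a free-list node plus its block header. */
typedef struct free_block {
    struct free_block *next;
    unsigned           nWords;    /* block size in 2-byte words */
} free_block;

/* Hypothetical stand-in for a memory partition. */
typedef struct {
    pthread_mutex_t lock;         /* plays the role of partId->sem */
    free_block     *freeList;
} partition;

/* Hypothetical stand-in for the free-memory part of the statistics. */
typedef struct {
    unsigned long numBytesFree;
    unsigned long numBlocksFree;
    unsigned long maxBlockSizeFree;
} part_stats;

/*
 * Collect free-list statistics. The lock is taken before the first
 * node is read and released only after the last one, so no other
 * thread that honors the lock can modify the list during the walk.
 * Returns 0 on success, -1 on a NULL argument.
 */
int partition_free_stats(partition *part, part_stats *stats)
{
    free_block *b;

    if (part == NULL || stats == NULL)
        return -1;

    stats->numBytesFree     = 0;
    stats->numBlocksFree    = 0;
    stats->maxBlockSizeFree = 0;

    pthread_mutex_lock(&part->lock);
    for (b = part->freeList; b != NULL; b = b->next) {
        unsigned long bytes = 2UL * b->nWords;   /* words -> bytes */

        stats->numBlocksFree++;
        stats->numBytesFree += bytes;
        if (bytes > stats->maxBlockSizeFree)
            stats->maxBlockSizeFree = bytes;
    }
    pthread_mutex_unlock(&part->lock);

    return 0;
}
```

With three free blocks of 4, 100 and 8 words, this reports 3 free blocks, 224 free bytes and a largest free block of 200 bytes.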



SPR

            TITLE:  memPartInfoGet() and memPartAvailable() can cause memory corruption
            SPR #:  30316
           STATUS:  Assigned
              IDE:  Tornado 2.2
             RTOS:  VxWorks 5.5
     PRODUCT NAME:  VxWorks 5.5
   RELEASE STATUS:  FCS
PRODUCTS AFFECTED:  VxWorks
             HOST:  All
          HOST OS:  All
      ARCH FAMILY:  All
 PROCESSOR FAMILY:  N/A
        PROCESSOR:  All
              BSP:
      DESCRIPTION:
                    Under some conditions it may be possible that calling the
                    routine memPartBlockIsValid() will induce memory corruption.

                    This routine is called by both memPartInfoGet() and
                    memPartAvailable(). Also, memPartAvailable() is called by
                    memPartShow() and memShow().

          PATCHES:
     DATE CREATED:  Feb 15 2000

Replies:
RE: bug in VxWorks memPartInfoGet (used by VxStats) Jeff Hill
Re: bug in VxWorks memPartInfoGet (used by VxStats) Benjamin Franksen
