EPICS Controls Argonne National Laboratory

Experimental Physics and
Industrial Control System


Subject: RE: bug in VxWorks memPartInfoGet (used by VxStats)
From: "Jeff Hill" <[email protected]>
To: <[email protected]>
Date: Mon, 2 Jun 2003 10:58:39 -0600
All,

Even if you don't use vxStats, beware that the CA server in iocCore calls
memFindMax() in vxWorks once every EPICS_CA_BEACON_PERIOD seconds, and there is
a strong possibility that WRS has implemented memFindMax() on top of the vxWorks
memPartInfoGet(), which is now known to have a severe bug that could corrupt the
memory pool and cause your IOC to fail. At this memFindMax() calling frequency,
the failure rate could vary significantly depending on your system configuration.
A simple IOC might have a very low occurrence rate because most of the built-in
memory allocation is based on free lists. However, an IOC with layered tools or
device drivers installed might experience a substantially higher failure rate,
because frequent allocation of memory from the pool is more likely to collide
with the defect now known to be in the stock WRS version of memPartInfoGet().
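
To put the calling frequency in perspective, the following is a minimal
sketch in C that counts how many periodic calls fit into a given uptime. The
function name mem_find_max_calls() is hypothetical, and the 15 second beacon
period mentioned below is an assumption (believed to be the usual default),
so check the value configured at your site.

```c
#include <assert.h>

/* Hypothetical helper: number of periodic memFindMax() calls made during
 * uptime_sec, assuming exactly one call per beacon period. */
static unsigned long mem_find_max_calls(double uptime_sec,
                                        double beacon_period_sec)
{
    assert(uptime_sec >= 0.0);
    assert(beacon_period_sec > 0.0);
    return (unsigned long)(uptime_sec / beacon_period_sec);
}
```

With an uptime of 86400 seconds and a 15 second period this gives 5760 calls
per day, each one a chance for a concurrent allocation to hit the defect.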

Jeff


> -----Original Message-----
> From: Hoff, Lawrence [mailto:[email protected]]
> Sent: Saturday, May 31, 2003 5:33 PM
> To: [email protected]
> Subject: bug in VxWorks memPartInfoGet (used by VxStats)
> 
> 
> 	At SNS, we ran into a bug in memPartInfoGet, using VxWorks
> kernel 5.4.2 and VxStats. Apparently there is a bug in memPartInfoGet()
> which can corrupt the heap if another task preempts memPartInfoGet()
> and does a heap operation (e.g. malloc/free). There is an SPR covering
> this bug which seems to imply that the bug also exists in kernel
> version 5.5. It is not clear which earlier versions of VxWorks might
> also be affected, but the date on the SPR is more than 3 years ago.
> 
> 	The tell-tale signature is a message which says :
> "  invalid block at <some hex address> deleted"
> 
> 	If you ever see that message, then your heap has been
> corrupted by the bug.
> 
> 	The symptoms of the heap corruption can be quite severe. In
> our case, the system acted as if memory was completely exhausted,
> requiring a system reset. In one case, the IOC would not run reliably
> for more than a day without heap corruption.
> 
> 	We implemented a work-around in VxStats by re-writing
> memPartInfoGet(). With this work-around, the unstable system
> has been running the better part of a week now without failure.
> 
> 	I suggest that if anyone has had unexplained heap corruption
> on IOCs which use VxStats, that they either log all console
> messages looking for the signature, or try running without VxStats,
> or try our work-around.
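
If console output is being logged, the signature can be searched for
mechanically. The following is a rough sketch in C, not part of VxStats: the
function name find_invalid_block() is hypothetical, and it assumes the message
reads "invalid block at", then a hexadecimal address (with or without a 0x
prefix), then "deleted".

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns 1 and stores the reported address in *addr
 * if line contains "invalid block at <hex address> deleted", else 0. */
static int find_invalid_block(const char *line, unsigned long *addr)
{
    static const char prefix[] = "invalid block at ";
    const char *p;
    char *end;
    unsigned long value;

    assert(line != NULL && addr != NULL);

    p = strstr(line, prefix);
    if (p == NULL)
        return 0;
    p += sizeof(prefix) - 1;          /* skip past the fixed prefix */

    value = strtoul(p, &end, 16);     /* base 16 accepts an optional 0x */
    if (end == p)
        return 0;                     /* no hex digits after the prefix */
    if (strstr(end, "deleted") == NULL)
        return 0;                     /* trailing keyword is missing */

    *addr = value;
    return 1;
}
```

A log watcher could call this on every console line and raise an alarm on the
first match, since a single occurrence already indicates a corrupted heap.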
> 
> 	Attached to this message is the WRS SPR information, and
> the patch to VxStats.
> 
> HTH -- Larry
> 
> 
> 
> /* Added by LTH because memPartInfoGet() has a bug when "walking" the
>    list */
> 
> #include "semLib.h"
> #include "dllLib.h"
> #include "smObjLib.h"
> #include "private/memPartLibP.h"
> 
> static STATUS memInfoGet(
>     MEM_PART_STATS *  ppartStats  /* partition stats structure */
>     )
> {
>     FAST PART_ID partId = memSysPartId;
>     BLOCK_HDR *  pHdr;
>     DL_NODE *    pNode;
> 
>     ppartStats->numBytesFree = 0;
>     ppartStats->numBlocksFree = 0;
>     ppartStats->numBytesAlloc = 0;
>     ppartStats->numBlocksAlloc = 0;
>     ppartStats->maxBlockSizeFree = 0;
> 
>     if (ID_IS_SHARED (partId))  /* partition is shared? */
>     {
>         /* shared partitions not supported yet */
>         return (ERROR);
>     }
> 
>     /* partition is local */
>     if (OBJ_VERIFY (partId, memPartClassId) != OK)
>         return (ERROR);
> 
>     /* take and keep semaphore until done */
>     semTake (&partId->sem, WAIT_FOREVER);
> 
>     for (pNode = DLL_FIRST (&partId->freeList);
>          pNode != NULL;
>          pNode = DLL_NEXT (pNode))
>     {
>         pHdr = NODE_TO_HDR (pNode);
> 
>         ppartStats->numBlocksFree++;
>         ppartStats->numBytesFree += 2 * pHdr->nWords;
>         if (2 * pHdr->nWords > ppartStats->maxBlockSizeFree)
>             ppartStats->maxBlockSizeFree = 2 * pHdr->nWords;
>     }
> 
>     ppartStats->numBytesAlloc = 2 * partId->curWordsAllocated;
>     ppartStats->numBlocksAlloc = partId->curBlocksAllocated;
> 
>     semGive (&partId->sem);
>     return (OK);
> }
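
The idea behind the patch, as far as it can be read from the code, is to hold
the partition lock for the entire walk and to read only the free list and the
allocation counters. The following is a portable sketch of that pattern using
a POSIX mutex so it can be tried off-target. All type and function names here
(struct pool, pool_stats_get() and so on) are hypothetical stand-ins and not
VxWorks APIs.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-ins for the partition and its free-list nodes. */
struct free_block {
    struct free_block *next;
    size_t n_bytes;
};

struct pool {
    pthread_mutex_t lock;            /* guards every field below */
    struct free_block *free_list;
    size_t bytes_allocated;
    size_t blocks_allocated;
};

struct pool_stats {
    size_t bytes_free, blocks_free, max_block_free;
    size_t bytes_alloc, blocks_alloc;
};

/* Gather statistics while holding the lock from start to finish, so no
 * allocator operation can modify the free list in the middle of the walk. */
static int pool_stats_get(struct pool *p, struct pool_stats *s)
{
    const struct free_block *b;

    assert(p != NULL && s != NULL);
    s->bytes_free = s->blocks_free = s->max_block_free = 0;

    if (pthread_mutex_lock(&p->lock) != 0)
        return -1;

    for (b = p->free_list; b != NULL; b = b->next) {
        s->blocks_free++;
        s->bytes_free += b->n_bytes;
        if (b->n_bytes > s->max_block_free)
            s->max_block_free = b->n_bytes;
    }
    s->bytes_alloc = p->bytes_allocated;
    s->blocks_alloc = p->blocks_allocated;

    pthread_mutex_unlock(&p->lock);
    return 0;
}
```

This only helps if every allocation and free on the pool takes the same lock,
which is the role the partition semaphore plays in the patch above.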
> 
> 
> 
> SPR
> 
>              TITLE:  memPartInfoGet() and memPartAvailable() can cause
>                      memory corruption
>              SPR #:  30316
>             STATUS:  Assigned
>                IDE:  Tornado 2.2
>               RTOS:  VxWorks 5.5
>       PRODUCT NAME:  VxWorks 5.5
>     RELEASE STATUS:  FCS
>  PRODUCTS AFFECTED:  VxWorks
>               HOST:  All
>            HOST OS:  All
>        ARCH FAMILY:  All
>   PROCESSOR FAMILY:  N/A
>          PROCESSOR:  All
>                BSP:
>        DESCRIPTION:  Under some conditions it may be possible that
>                      calling the routine memPartBlockIsValid() will
>                      induce memory corruption.
> 
>                      This routine is called by both memPartInfoGet()
>                      and memPartAvailable(). Also, memPartAvailable()
>                      is called by memPartShow() and memShow().
> 
>            PATCHES:
>       DATE CREATED:  Feb 15 2000



References:
bug in VxWorks memPartInfoGet (used by VxStats) Hoff, Lawrence
