Argonne National Laboratory

Experimental Physics and
Industrial Control System


Subject: [Bug 1868486] Re: epicsMessageQueue lost messages
From: Andrew Johnson via Core-talk <core-talk at aps.anl.gov>
To: core-talk at aps.anl.gov
Date: Fri, 29 May 2020 02:52:53 -0000
** Changed in: epics-base/3.15
     Assignee: (unassigned) => Andrew Johnson (anj)

** Changed in: epics-base/7.0
     Assignee: (unassigned) => Andrew Johnson (anj)

** Changed in: epics-base/7.0
       Status: Fix Committed => Fix Released

-- 
You received this bug notification because you are a member of EPICS
Core Developers, which is subscribed to EPICS Base.
Matching subscriptions: epics-core-list-subscription
https://bugs.launchpad.net/bugs/1868486

Title:
  epicsMessageQueue lost messages

Status in EPICS Base:
  Fix Released
Status in EPICS Base 3.15 series:
  Fix Released
Status in EPICS Base 7.0 series:
  Fix Released

Bug description:
  https://epics.anl.gov/core-talk/2020/msg00396.php

  Mark Rivers observed epicsMessageQueue losing messages.

  https://epics.anl.gov/core-talk/2020/msg00408.php

  > I think I see the logic error in how the eventSent flag is handled,
  > specifically related to the fact that epicsEvent is a semaphore as
  > opposed to a condition variable.
  > 
  > This allows a "race" to occur if the first/only waiting receiver
  > times out, and epicsEventWaitWithTimeout() returns, while a sender
  > is in epicsMessageQueueSend() preparing to wake up a receiver.
  > 
  > This results in a situation where the sender has set the eventSent
  > flag for, and indeed copied a message into the buffer of, a thread
  > which has decided to abort.
  > 
  > After timing out, the receiver sees the timeout and returns -1
  > even though eventSent has been set.  This can be trapped with:
  > 
  > > b osdMessageQueue.cpp:358 if status!=0 && threadNode.eventSent
  > 
  > So here is your lost message.
  > 
  > Now when epicsMessageQueueReceiveWithTimeout() is called again, no message
  > is waiting in the queue, so epicsEventWaitWithTimeout() is called.
  > Since the semaphore is already set, this returns immediately with status==0,
  > but this is a spurious wakeup and the eventSent flag is not set.
  > 
  > And here is the second "timeout".

  Line numbers circa 7.0.3.1
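
  The interleaving described in the quoted analysis can be replayed in a
  small single-threaded model. This is only a sketch under stated
  assumptions, not the EPICS Base code and not necessarily the fix that
  was committed: WaitNode, senderDeliver(), receiverFinishBuggy() and
  receiverFinishChecked() are hypothetical names, and the boolean
  semaphoreSet merely stands in for the epicsEvent binary semaphore. The
  model starts at the moment the receiver's timed wait has already
  returned a timeout but the sender has not yet noticed.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the receiver's wait-list entry.
struct WaitNode {
    bool eventSent = false;     // set by a sender after it copies a message
    bool semaphoreSet = false;  // models the state of the binary semaphore
    std::string buf;            // the receiver's destination buffer
    int size = -1;              // size of the delivered message
};

// Sender path: the receiver is still on the wait list, so the sender
// copies the message straight into its buffer and signals it.
void senderDeliver(WaitNode& node, const std::string& msg) {
    node.buf = msg;
    node.size = static_cast<int>(msg.size());
    node.eventSent = true;
    node.semaphoreSet = true;   // stands in for signalling the event
}

// Behaviour described in the report: the wait status alone decides the
// outcome, so a message delivered just after the timeout is dropped and
// the semaphore is left set for the next call.
int receiverFinishBuggy(const WaitNode& node, bool waitTimedOut) {
    if (waitTimedOut)
        return -1;
    return node.eventSent ? node.size : -1;
}

// One possible remedy (an assumption, not the committed patch): treat
// eventSent as the source of truth and consume any stale signal.
int receiverFinishChecked(WaitNode& node, bool waitTimedOut) {
    (void)waitTimedOut;         // the timeout status is deliberately ignored
    if (node.eventSent) {
        node.semaphoreSet = false;
        return node.size;
    }
    return -1;
}
```

  In the buggy path the message sits in the receiver's buffer but the
  caller is told -1, and the semaphore stays set, which is what produces
  the immediate spurious wakeup on the next receive call.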

To manage notifications about this bug go to:
https://bugs.launchpad.net/epics-base/+bug/1868486/+subscriptions

References:
[Bug 1868486] [NEW] epicsMessageQueue lost messages mdavidsaver via Core-talk

ANJ, 28 May 2020