> Ben's patch is in message #8 in this bug.
For some reason I remembered it being larger.
This patch is not a complete solution for the RX side issue, as it still
allows triggered/signaled epicsEvents to be returned to the free-list,
so the spurious timeouts will continue. It also doesn't address the TX
side issues at all. @Andrew, the TX side has the same design, and thus
the same problems.
Still, this patch is a definite improvement, and I think worth applying.
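
To illustrate the free-list concern, here is a minimal single-threaded
sketch. It does not use the real EPICS API: ModelEvent, EventFreeList and
nextUserWakesSpuriously() are hypothetical names invented for this
example, and the model only assumes binary-semaphore behaviour (a signal
stays pending until a wait consumes it). It is not taken from Ben's patch.

```cpp
#include <vector>

// Hypothetical stand-in for epicsEvent: a binary semaphore whose
// signaled state persists until a wait consumes it.
struct ModelEvent {
    bool signaled = false;
    void signal() { signaled = true; }
    // Non-blocking wait: returns true if a signal was pending, and consumes it.
    bool tryWait() { bool s = signaled; signaled = false; return s; }
};

// Hypothetical free-list of events (an assumption for illustration only).
class EventFreeList {
    std::vector<ModelEvent*> free_;
public:
    ModelEvent *get() {
        if (free_.empty()) return new ModelEvent;
        ModelEvent *ev = free_.back();
        free_.pop_back();
        return ev;
    }
    // drain == false recycles the event as-is, pending signal included.
    // drain == true clears any pending signal before recycling.
    void put(ModelEvent *ev, bool drain) {
        if (drain) ev->tryWait();
        free_.push_back(ev);
    }
    ~EventFreeList() { for (ModelEvent *ev : free_) delete ev; }
};

// Returns true if the next user of the recycled event wakes immediately
// even though nothing was sent to it.
bool nextUserWakesSpuriously(bool drain) {
    EventFreeList pool;
    ModelEvent *ev = pool.get();
    ev->signal();                      // sender signals after the waiter gave up
    pool.put(ev, drain);               // waiter returns the event to the free-list
    ModelEvent *again = pool.get();    // the same object is handed out again
    bool spurious = again->tryWait();  // wakeup with no message behind it
    pool.put(again, true);
    return spurious;
}
```

In this model nextUserWakesSpuriously(false) reports a spurious wakeup and
nextUserWakesSpuriously(true) does not. Whether a drain on release is safe
in the real code depends on locking that this sketch does not model.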
--
https://bugs.launchpad.net/bugs/1868486
Title:
  epicsMessageQueue lost messages
Status in EPICS Base:
  In Progress
Status in EPICS Base 3.15 series:
  Fix Committed
Status in EPICS Base 7.0 series:
  In Progress
Bug description:
https://epics.anl.gov/core-talk/2020/msg00396.php
Mark Rivers observed epicsMessageQueue losing messages.
https://epics.anl.gov/core-talk/2020/msg00408.php
> I think I see the logic error in how the eventSent flag is handled,
> specifically related to the fact that epicsEvent is a semaphore as
> opposed to a condition variable.
>
> This allows a "race" to occur if the first/only waiting receiver
> times out, and epicsEventWaitWithTimeout() returns, while a sender
> is in epicsMessageQueueSend() preparing to wake up a receiver.
>
> This results in a situation where the sender has set the eventSent
> flag of, and indeed copied a message to the buffer of, a thread which
> has decided to abort.
>
> After timing out, the receiver sees the timeout and returns -1
> even though eventSent has been set. This can be trapped with:
>
> > b osdMessageQueue.cpp:358 if status!=0 && threadNode.eventSent
>
> So here is your lost message.
>
> Now when epicsMessageQueueReceiveWithTimeout() is called again, no message
> is waiting in the queue, so epicsEventWaitWithTimeout() is called.
> Since the semaphore is already set, this returns immediately with status==0,
> but this is a spurious wakeup and the eventSent flag is not set.
>
> And here is the second "timeout".
Line numbers circa 7.0.3.1
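
The interleaving described above can be replayed deterministically in a
single thread. The following is only a sketch under stated assumptions:
ModelEvent, ModelNode, Outcome and replay() are hypothetical names, not
EPICS code, and the "checkFlag" variant shows one possible remedy (trust
eventSent over the wait status, then clear the pending signal), which is
not necessarily what any committed fix does.

```cpp
#include <string>

// Hypothetical stand-in for epicsEvent: a binary semaphore whose
// signaled state persists until a wait consumes it.
struct ModelEvent {
    bool signaled = false;
    void signal() { signaled = true; }
    // Models a wait whose timeout has already expired: succeeds only
    // if a signal is pending, and consumes that signal.
    bool waitExpired() {
        bool wasSignaled = signaled;
        signaled = false;
        return wasSignaled;
    }
};

// Hypothetical per-receiver node, loosely modeled on threadNode.
struct ModelNode {
    ModelEvent *event;
    bool eventSent;
    std::string buf;
    explicit ModelNode(ModelEvent *e) : event(e), eventSent(false) {}
};

struct Outcome {
    int firstReceive;     // bytes received by the first call, or -1
    int secondReceive;    // bytes received by the second call, or -1
    bool spuriousWakeup;  // second wait returned at once with no message
};

// checkFlag selects whether the receiver consults eventSent after a timeout.
Outcome replay(bool checkFlag) {
    ModelEvent event;  // reused by both calls, as a recycled event would be
    Outcome out = {-1, -1, false};

    // First receive: the queue is empty, the receiver waits and times out.
    ModelNode first(&event);
    bool woke = first.event->waitExpired();  // false: timed out

    // The sender runs before the receiver re-takes the queue lock.
    first.buf = "hello";
    first.eventSent = true;
    first.event->signal();

    // The receiver resumes and decides what to return.
    if (woke || (checkFlag && first.eventSent)) {
        out.firstReceive = static_cast<int>(first.buf.size());
        first.event->waitExpired();  // clear the signal that arrived late
    }

    // Second receive: the queue is empty again and the same event is used.
    ModelNode second(&event);
    bool wokeAgain = second.event->waitExpired();
    if (wokeAgain && !second.eventSent) out.spuriousWakeup = true;
    if (second.eventSent) out.secondReceive = static_cast<int>(second.buf.size());
    return out;
}
```

With replay(false) the first call returns -1 although "hello" was copied
into its buffer (the lost message), and the second call wakes at once with
nothing to deliver (the second "timeout"). With replay(true) the first call
returns the 5-byte message and the second call sees a genuine timeout.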