Hi David,
It looks like there's a race condition between your two calls to strncpy(); a task switch is almost certainly happening in your unlock() method after you release the mutex but before you set the string back to "Unlocked". When the old mutex owner gets to execute again, it immediately overwrites the task name that was written by the new task.
In your unlock() method, move the strncpy() to before the call to mutex->unlock() so the string value is also protected by the mutex.
I can't explain why the behavior changed between 3.15.2 and 3.15.7, but I suspect that's probably unrelated to the mutex.
- Andrew
________________________________________
From: Tech-talk <tech-talk-bounces at aps.anl.gov> on behalf of Dudley, David via Tech-talk <tech-talk at aps.anl.gov>
Sent: Thursday, February 6, 2020 8:41 AM
To: tech-talk at aps.anl.gov
Subject: lock/unlock mutex problem
I’ve run into a particular problem that I haven’t been able to figure out yet.
I’m debugging a device driver that uses a number of concurrent threads for operation
This is with epics 3.15.7.
I have these routines in my code
In my class definition, I’ve got:
char locker[80]; // Store name of routine that called the lock/unlock
epicsMutex *mutex; // Mutex for concurrent access control
The instantiation routine invokes:
this->mutex = newEpicsMutex;
The code in my class includes:
/** Lock the variable for concurrent access */
void driver::lock(const char *caller)
{
bool flag;
unsigned int lockedflag=0;
do
{
flag = this->mutex->tryLock();
if(!flag)
{
lockedflag++;
if((!(lockedflag %10) && lockedflag < 100) || (!(lockedflag %100) && lockedflag < 1000) || (!(lockedflag % 1000) && lockedflag < 10000) )
outputConsoleMessage(SHOW_ERROR_MESSAGES,"Driver::lock() - was called by %s->%s, but the variable has been locked by \"%s\" for %f secs!\n",caller, this->rec->name, locker, 0.01*lockedflag);
epicsThreadSleep(0.01);
}
}
while(!flag);
strncpy(locker,caller,sizeof(locker) );
}
/** Unock the variable for concurrent access */
void driver::unlock(const char *caller)
{
this->mutex->unlock();
strncpy(locker,"Unlocked", sizeof(locker));
}
There are a number of lock/unlock pairs in the code that lock/unlock access to retrieved data.
After some period of time, which may be as short as a few minutes, or as long as many hours, I start getting errors like this:
2020/02/05 18:44:42.6571 Driver::lock() - was called by Driver::retry->_DSC_CHW:VFC_N2000:CI_CSET_RBV, but the variable has been locked by "Unlocked" for 0.010000 secs!
2020/02/05 18:44:42.6694 Driver::lock() - was called by Driver::retry->_DSC_CHW:VFC_N2000:CI_CSET_RBV, but the variable has been locked by "Unlocked" for 0.020000 secs!
2020/02/05 18:44:42.6802 Driver::lock() - was called by Driver::retry->_DSC_CHW:VFC_N2000:CI_CSET_RBV, but the variable has been locked by "Unlocked" for 0.030000 secs!
2020/02/05 18:44:42.6911 Driver::lock() - was called by Driver::retry->_DSC_CHW:VFC_N2000:CI_CSET_RBV, but the variable has been locked by "Unlocked" for 0.040000 secs!
2020/02/05 18:44:42.7020 Driver::lock() - was called by Driver::retry->_DSC_CHW:VFC_N2000:CI_CSET_RBV, but the variable has been locked by "Unlocked" for 0.050000 secs!
2020/02/05 18:44:42.7128 Driver::lock() - was called by Driver::retry->_DSC_CHW:VFC_N2000:CI_CSET_RBV, but the variable has been locked by "Unlocked" for 0.060000 secs!
2020/02/05 18:44:42.7237 Driver::lock() - was called by Driver::retry->_DSC_CHW:VFC_N2000:CI_CSET_RBV, but the variable has been locked by "Unlocked" for 0.070000 secs!
2020/02/05 18:44:42.7345 Driver::lock() - was called by Driver::retry->_DSC_CHW:VFC_N2000:CI_CSET_RBV, but the variable has been locked by "Unlocked" for 0.080000 secs!
2020/02/05 18:44:42.7453 Driver::lock() - was called by Driver::retry->_DSC_CHW:VFC_N2000:CI_CSET_RBV, but the variable has been locked by "Unlocked" for 0.090000 secs!
2020/02/05 18:44:42.7562 Driver::lock() - was called by Driver::retry->_DSC_CHW:VFC_N2000:CI_CSET_RBV, but the variable has been locked by "Unlocked" for 0.100000 secs!
2020/02/05 18:44:42.8643 Driver::lock() - was called by Driver::retry->_DSC_CHW:VFC_N2000:CI_CSET_RBV, but the variable has been locked by "Unlocked" for 0.200000 secs!
2020/02/05 18:44:42.9725 Driver::lock() - was called by Driver::retry->_DSC_CHW:VFC_N2000:CI_CSET_RBV, but the variable has been locked by "Unlocked" for 0.300000 secs!
2020/02/05 18:44:43.0858 Driver::lock() - was called by Driver::retry->_DSC_CHW:VFC_N2000:CI_CSET_RBV, but the variable has been locked by "Unlocked" for 0.400000 secs!
2020/02/05 18:44:43.1941 Driver::lock() - was called by Driver::retry->_DSC_CHW:VFC_N2000:CI_CSET_RBV, but the variable has been locked by "Unlocked" for 0.500000 secs!
2020/02/05 18:44:43.3022 Driver::lock() - was called by Driver::retry->_DSC_CHW:VFC_N2000:CI_CSET_RBV, but the variable has been locked by "Unlocked" for 0.600000 secs!
2020/02/05 18:44:43.4105 Driver::lock() - was called by Driver::retry->_DSC_CHW:VFC_N2000:CI_CSET_RBV, but the variable has been locked by "Unlocked" for 0.700000 secs!
2020/02/05 18:44:43.5187 Driver::lock() - was called by Driver::retry->_DSC_CHW:VFC_N2000:CI_CSET_RBV, but the variable has been locked by "Unlocked" for 0.800000 secs!
2020/02/05 18:44:43.6072 Driver::lock() - was called by Driver::setProcessRecord->_DSC_CHW:VFC_N2000:CI_CSET_RBV, but the variable has been locked by "Unlocked" for 0.010000 secs!
2020/02/05 18:44:43.6181 Driver::lock() - was called by Driver::setProcessRecord->_DSC_CHW:VFC_N2000:CI_CSET_RBV, but the variable has been locked by "Unlocked" for 0.020000 secs!
2020/02/05 18:44:43.6269 Driver::lock() - was called by Driver::clear_retry_count->CT1, but the variable has been locked by "Driver::thread_main" for 0.900000 secs!
Now, this driver has been working on 3.15.2, but when updated to 3.5.7, all of a sudden I’ve got this problem.
I don’t see how the mutex can be locked when the routine shows it is “Unlocked”.
Please, someone see something I don’t!
Thanks-
David Dudley
Controls Engineer III
Facility for Rare Isotope Beams
Michigan State University
640 South Shaw Lane
East Lansing, MI 48824-1321, USA
Tel. 517-908-7133
Email: dudleyd at frib.msu.edu<mailto:dudleyd at frib.msu.edu>
[cid:[email protected]]
- References:
- lock/unlock mutex problem Dudley, David via Tech-talk
- Navigate by Date:
- Prev:
Re: areaDetector crashes EPICS IOC and autosave failure Mark Rivers via Tech-talk
- Next:
asynMotorAxis Koennecke Mark (PSI) via Tech-talk
- Index:
1994
1995
1996
1997
1998
1999
2000
2001
2002
2003
2004
2005
2006
2007
2008
2009
2010
2011
2012
2013
2014
2015
2016
2017
2018
2019
<2020>
2021
2022
2023
2024
- Navigate by Thread:
- Prev:
lock/unlock mutex problem Dudley, David via Tech-talk
- Next:
send/receive waveform over network. Tomasz Brys via Tech-talk
- Index:
1994
1995
1996
1997
1998
1999
2000
2001
2002
2003
2004
2005
2006
2007
2008
2009
2010
2011
2012
2013
2014
2015
2016
2017
2018
2019
<2020>
2021
2022
2023
2024
|