Hi Jamie,
Maybe a minimal example representing the program logic could help with debugging.
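For instance, something along these lines, following pcaspy's documented SimpleServer/Driver pattern (the "TEST:" prefix, PV name, and update loop below are placeholders, not your application's logic):

# minimal_server.py - bare-bones pcaspy server for reproduction attempts
import random
import threading
import time

from pcaspy import SimpleServer, Driver

prefix = 'TEST:'
pvdb = {'RAND': {'prec': 3}}

class TestDriver(Driver):
    def __init__(self):
        super(TestDriver, self).__init__()
        # Background update thread, standing in for whatever the
        # real application does between CA requests.
        t = threading.Thread(target=self.update)
        t.daemon = True
        t.start()

    def update(self):
        while True:
            self.setParam('RAND', random.random())
            self.updatePVs()
            time.sleep(1)

if __name__ == '__main__':
    server = SimpleServer()
    server.createPV(prefix, pvdb)
    driver = TestDriver()
    while True:
        server.process(0.1)

To mimic the archiver restart, you could point a long-running `camonitor TEST:RAND` at it and then kill the client with SIGKILL mid-stream, severing the subscription without a clean CA shutdown.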
> On 12 May 2015, at 20:48, Jameson Graef Rollins <[email protected]> wrote:
>
> On Mon, Feb 23 2015, Jameson Graef Rollins <[email protected]> wrote:
>> Hi, folks. We recently saw the following error thrown from one of our
>> dual pyepics/pcaspy applications:
>>
>> epicsMutex pthread_mutex_unlock failed: error Invalid argument
>> epicsMutexOsdUnlockThread _main_ (0x2999f20) can't proceed, suspending.
>>
>> After this error was thrown the application became completely
>> unresponsive and had to be manually killed.
>>
>> I'm not sure which interface the error came from: pcaspy or pyepics.
>> We're running the following versions of the relevant packages:
>>
>> epics-base 3.14.12.2
>> pyepics 3.2.0
>> pcaspy 0.4.1
>>
>> Any idea what could have caused it? I've only found one reference to a
>> similar issue with a gateway (base 3.14.9, gateway 2.0.3.0) [0]. A
>> follow-up bug report implied that the issue might have been addressed in
>> base 3.14.11 [1], which precedes the version we're using, so that fix
>> should already be present in our build.
>>
>> Any help would be much appreciated. Thanks so much.
>>
>> jamie.
>>
>> [0] http://www.aps.anl.gov/epics/tech-talk/2009/msg00021.php
>> [1] https://bugs.launchpad.net/epics-base/+bug/541356
>
> Hi, all. I need to resurrect this thread, since we haven't been able to
> fix this issue, and it continues to cause headaches.
>
> We're now fairly confident that the problem resides in the pcaspy
> interface. We have evidence that it is triggered by a long running
> subscription to the pcaspy server being severed abruptly. That doesn't
> consistently trigger the issue, though, as we have dozens of these
> processes but only one will fall into this state when the archiver is
> restarted. We're also still unable to replicate the problem in a
> controlled environment.
>
> The most troubling aspect of the problem is that the actual PVs remain
> active, while the underlying process freezes. Existing PV subscriptions
> remain valid and the PCAS continues to respond to new requests. However,
> the database is no longer being updated, so all values are bogus. In
> other words, we have **no external indication that the process is in a
> frozen state**.
>
> If I could either figure out some way to reliably re-create the problem,
> or to at least get better debug messages about where in the stack the
> problem is occurring, I could maybe make some headway. Unfortunately
> it's not feasible to run all of our deployed processes under valgrind to
> try to catch one of these instances when they pop up.
>
> Can someone who knows the pcaspy internals help me figure out where I
> should poke?
>
> jamie.
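
Jamie, on getting better debug output without valgrind: assuming these are plain Python processes and you can add a couple of lines at startup, you could have every deployed process dump the tracebacks of all its Python threads on demand (faulthandler is stdlib from Python 3.3; on Python 2 it's a backport package on PyPI):

import faulthandler
import signal

# At process startup: dump the tracebacks of all Python threads to
# stderr whenever the process receives SIGUSR2, i.e. on
# `kill -USR2 <pid>` against a suspect process.
faulthandler.register(signal.SIGUSR2, all_threads=True)

If the hang is below the interpreter, in the epicsMutex/PCAS C layers, that won't show it; the next step there would be attaching gdb to the stuck process (`gdb -p <pid>`) and running `thread apply all bt` to see which thread is suspended in epicsMutexOsdUnlockThread.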