Experimental Physics and Industrial Control System

Subject: Re: "epicsMutex pthread_mutex_unlock failed" with pyepics/pcaspy
From: Jameson Graef Rollins <[email protected]>
To: EPICS tech-talk <[email protected]>
Date: Tue, 12 May 2015 11:48:27 -0700
On Mon, Feb 23 2015, Jameson Graef Rollins <[email protected]> wrote:
> Hi, folks. We recently saw the following error thrown from one of our
> dual pyepics/pcaspy applications:
>
> epicsMutex pthread_mutex_unlock failed: error Invalid argument
> epicsMutexOsdUnlockThread _main_ (0x2999f20) can't proceed, suspending.
>
> After this error was thrown the application became completely
> unresponsive and had to be manually killed.
>
> I'm not sure which interface the error came from: pcaspy or pyepics.
> We're running the following versions of the relevant packages:
>
> epics-base 3.14.12.2
> pyepics 3.2.0
> pcaspy 0.4.1
>
> Any idea what could have caused it? I've only found one reference to a
> similar issue with a gateway (base 3.14.9, gateway 2.0.3.0) [0]. A
> follow-up bug report implied that the issue might have been addressed in
> base 3.14.11 [1], which precedes the version we're using.
>
> Any help would be much appreciated. Thanks so much.
>
> jamie.
>
> [0] http://www.aps.anl.gov/epics/tech-talk/2009/msg00021.php
> [1] https://bugs.launchpad.net/epics-base/+bug/541356
Hi, all. I need to resurrect this thread, since we haven't been able to
fix this issue, and it continues to cause headaches.
We're now fairly confident that the problem resides in the pcaspy
interface. We have evidence that it is triggered by a long-running
subscription to the pcaspy server being severed abruptly. The trigger
isn't consistent, though: we run dozens of these processes, but only
one falls into this state when the archiver is restarted. We're also
still unable to replicate the problem in a controlled environment.
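For reference, the servers follow the standard pcaspy SimpleServer
pattern. A stripped-down sketch (the PV names here are made up, and
the real drivers are much more involved):

    from pcaspy import SimpleServer, Driver

    prefix = 'TEST:'               # hypothetical prefix
    pvdb = {'VALUE': {'prec': 3}}  # hypothetical PV database

    class MyDriver(Driver):
        def __init__(self):
            super(MyDriver, self).__init__()

    if __name__ == '__main__':
        server = SimpleServer()
        server.createPV(prefix, pvdb)
        driver = MyDriver()
        # A long-running client (the archiver) monitors TEST:VALUE;
        # killing that client abruptly is what appears to trigger
        # the lockup, though never reliably.
        while True:
            server.process(0.1)

Abruptly killing a camonitor attached to one of the PVs is the
closest analogue to the archiver restart we can think of, but by
itself it hasn't reproduced the hang.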
The most troubling aspect of the problem is that the actual PVs remain
active while the underlying process freezes. Existing PV subscriptions
remain valid and the PCAS continues to respond to new requests.
However, the database is no longer being updated, so all values are
bogus. In other words, we have **no external indication that the
process is in a frozen state**.
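The best external check I've come up with is a heartbeat PV whose CA
timestamp a watchdog compares against wall time, since plain reads
look healthy even when the process is wedged. A hypothetical probe
with pyepics (TEST:HEARTBEAT is a made-up record the server would
have to update on a timer):

    import time
    import epics

    # CA still answers when the process is frozen, so value reads
    # look healthy; only the record's timestamp betrays a stalled
    # update loop.
    pv = epics.PV('TEST:HEARTBEAT')
    value = pv.get(timeout=2.0)
    if value is None or (time.time() - pv.timestamp) > 30.0:
        print('server looks frozen: heartbeat missing or stale')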
If I could either figure out some way to reliably recreate the
problem, or at least get better debug messages about where in the
stack the problem is occurring, I could maybe make some headway.
Unfortunately, it's not feasible to run all of our deployed processes
under valgrind to try to catch one of these instances when they pop
up.
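One cheap thing I am considering deploying everywhere is Python's
faulthandler, registered on a signal, so that the next time a process
wedges we can dump every thread's stack from outside. Since EINVAL
from pthread_mutex_unlock usually means the mutex object itself is no
longer a valid, initialized mutex, I suspect the interesting frames
are down in the pcaspy C++ layer, which faulthandler won't show
(attaching gdb for C-level backtraces would be the complement), but
it would at least tell us where the Python threads are parked. A
sketch (faulthandler is stdlib on Python 3, and a same-named backport
exists on PyPI for Python 2):

    import faulthandler
    import signal

    # After this, `kill -USR1 <pid>` makes the process write every
    # Python thread's traceback to stderr without stopping it.
    faulthandler.register(signal.SIGUSR1)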
Can someone who knows the pcaspy internals help me figure out where I
should poke?
jamie.