Subject: Re: When I use an IOC in a container, streamDevice occasionally reports that protocol has been aborted, which causes the records in the IOC to become inaccessible from the host computer.
From: Mark Rivers via Tech-talk <tech-talk at aps.anl.gov>
To: EPICS tech-talk <tech-talk at aps.anl.gov>, "Wang, Andrew" <wang126 at llnl.gov>
Date: Fri, 12 May 2023 18:29:53 +0000
Hi Andrew,
Can you test this outside of a Docker container? It would be useful to know whether the problem is independent of the fact that the IOC is running in a Docker container.
You might also run the following command when the IOC is hung to see if there is a deadlock:
epicsMutexShowAll 1
Run it several times and see if the same 2 mutexes are always locked.
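One way to do that comparison is sketched below in Python. It assumes the identifiers of the locked mutexes have already been copied by hand from each epicsMutexShowAll run into a set; the names shown are hypothetical placeholders, not real output. The sketch only intersects the sets to find mutexes that were locked in every run.

```python
from typing import Iterable, Set


def persistently_locked(snapshots: Iterable[Set[str]]) -> Set[str]:
    """Return the mutex identifiers that appear in every snapshot."""
    snapshot_list = list(snapshots)
    if not snapshot_list:
        return set()
    # Start from a copy of the first snapshot and narrow it down.
    common = set(snapshot_list[0])
    for snapshot in snapshot_list[1:]:
        common &= snapshot
    return common


# Hypothetical identifiers, one set per run of the command.
runs = [
    {"mutexA", "mutexB", "mutexC"},
    {"mutexA", "mutexB"},
    {"mutexA", "mutexB", "mutexD"},
]
print(sorted(persistently_locked(runs)))
```

If the same pair of identifiers survives across every run, that is consistent with a deadlock; a set that empties out suggests the locks are only held briefly.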
Mark
From: Tech-talk <tech-talk-bounces at aps.anl.gov> on behalf of Wang, Andrew via Tech-talk <tech-talk at aps.anl.gov>
Sent: Friday, May 12, 2023 1:06 PM
To: EPICS tech-talk <tech-talk at aps.anl.gov>
Subject: When I use an IOC in a container, streamDevice occasionally reports that protocol has been aborted, which causes the records in the IOC to become inaccessible from the host computer.
Hi all,
I have created multiple IOCs for the project in which I am involved. Each runs in its own Docker container on a host computer running Ubuntu 20.04. In each Docker container, the following EPICS and support module versions are used.
In one of the IOCs, I have an SSEQ record that is used to push a scalar value to multiple records that set four parameters for the target instrument. In some instances, streamDevice is unable to push the value to the second parameter, causing the protocol to abort. A few minutes later, my colleagues and I observe that no records from the IOC in question can be accessed through Channel Access. This is the error message that we receive:
Read operation timed out: some PV data was not read.
<RECORD_NAME> 0
CA.Client.Exception...............................................
    Warning: "Virtual circuit disconnect"
    Context: "op=0, channel=<RECORD_NAME>, type=DBR_TIME_DOUBLE, count=1, ctx="<IP ADDRESS:PORT>"
    Source File: ../getCopy.cpp line 91
    Current Time: <TIME>
This also meant that I was unable to check the STAT field to see what caused the abort.
Thank you and I look forward to hearing back from everyone.
Andy