EPICS RE: RTEMS NFS/RPIOC problems?


Hi Ricardo,

We also see intermittent NFS/RPC timeout messages on the console of the nios2/rtems4.11/EPICS systems deployed at LANL, along with intermittent problems updating the EPICS save/restore files on those systems. We have not seen CA disconnect issues on these systems (based on R3.14) or on our DAQ (R3.15). I also monitor ping intervals on these systems and don't see latency issues there.

Jeff

From: [email protected] [mailto:[email protected]] On Behalf Of Ricardo Cardenes
Sent: Monday, July 02, 2018 5:10 PM
To: Talk EPICS Tech
Subject: RTEMS NFS/RPIOC problems?

Hi,

We've been experiencing some problems on some of our systems (based on MVME 2700/3100; EPICS 3.14.12.7, RTEMS 4.10.2) related to the RPC subsystem, like these:

2018-06-04-tcs.log:Jun 4 12:30:06 E) PORT: tcs_vme, MSG: RPCIO: server '172.17.71.11' not responding - still trying
[...]

2018-06-07-tcs.log:Jun 7 13:11:37 E) PORT: tcs_vme, MSG: RPCIO: server '172.17.2.10' not responding - still trying
[...]

2018-06-26-tcs.log:Jun 26 16:43:14 E) PORT: tcs_vme, MSG: RPCIO - statistics: already 17000 retries to server 172.17.2.10
2018-06-26-tcs.log:Jun 26 20:57:43 E) PORT: tcs_vme, MSG: RPCIO - statistics: already 18000 retries to server 172.17.2.10
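
For anyone comparing against their own logs, here is a minimal sketch of how the retry rate between two "statistics" messages like the ones above could be estimated. It assumes the line layout shown in these excerpts, that the year can be taken from the log file name prefix, and that month names are in English (C locale). The names parse_stats and retry_rates are hypothetical helpers, not part of EPICS or RTEMS.

```python
import re
from datetime import datetime

# Matches lines shaped like the excerpts above, e.g.
# 2018-06-26-tcs.log:Jun 26 16:43:14 E) PORT: tcs_vme, MSG: RPCIO - statistics: already 17000 retries to server 172.17.2.10
# Assumption: the file name prefix (YYYY-MM-DD-name.log:) supplies the year.
STATS_RE = re.compile(
    r"^(?P<year>\d{4})-\d{2}-\d{2}-\S+\.log:"
    r"(?P<stamp>\w{3}\s+\d{1,2} \d{2}:\d{2}:\d{2}) .*"
    r"RPCIO - statistics: already (?P<retries>\d+) retries to server (?P<server>\S+)"
)


def parse_stats(line):
    """Return (timestamp, retry_count, server) or None if the line does not match."""
    m = STATS_RE.match(line.strip())
    if m is None:
        return None
    when = datetime.strptime(f"{m['year']} {m['stamp']}", "%Y %b %d %H:%M:%S")
    return when, int(m["retries"]), m["server"]


def retry_rates(lines):
    """Yield (server, retries_per_minute) between consecutive statistics lines."""
    last = {}  # server -> (timestamp, retry_count) of the previous statistics line
    for line in lines:
        parsed = parse_stats(line)
        if parsed is None:
            continue  # skip "not responding" and unrelated lines
        when, retries, server = parsed
        if server in last:
            prev_when, prev_retries = last[server]
            seconds = (when - prev_when).total_seconds()
            if seconds > 0:
                yield server, (retries - prev_retries) * 60.0 / seconds
        last[server] = (when, retries)


if __name__ == "__main__":
    sample = [
        "2018-06-26-tcs.log:Jun 26 16:43:14 E) PORT: tcs_vme, MSG: RPCIO - statistics: already 17000 retries to server 172.17.2.10",
        "2018-06-26-tcs.log:Jun 26 20:57:43 E) PORT: tcs_vme, MSG: RPCIO - statistics: already 18000 retries to server 172.17.2.10",
    ]
    for server, rate in retry_rates(sample):
        print(f"{server}: {rate:.2f} retries/min")
```

Tracking the rate per server, rather than the raw counter, makes it easier to see whether the retries arrive in bursts or at a steady pace.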

It also seems to me that we're getting too many CA disconnects. We hadn't noticed until recently, when we commissioned a system that seems to be the only one systematically affected (but it is also the one that orchestrates most of the others...)

We suspect a hardware or networking problem, possibly outside of our boards, given that the interface stats look quite clean (no corrupt Ethernet frames, no retries, ...), and that we've actually seen at least one corrupt UDP header, plus several thousand occurrences of timed-out IP fragments (over a period of a few weeks).

In any case, while we investigate the possible hardware issue, has anyone else experienced a similar problem?

Regards,

Ricardo

Subject:	RE: RTEMS NFS/RPIOC problems?
From:	"Hill, Jeff" <[email protected]>
To:	Ricardo Cardenes <[email protected]>, Talk EPICS Tech <[email protected]>
Date:	Tue, 3 Jul 2018 22:29:41 +0000
