EPICS Controls Argonne National Laboratory

Experimental Physics and
Industrial Control System


Subject: RE: Modbus devices lost connectivity
From: Mark Rivers <[email protected]>
To: "'Wallace, Alex'" <[email protected]>, "[email protected]" <[email protected]>
Date: Mon, 28 Mar 2016 18:26:44 +0000

Hi Alex,

 

You didn’t say what EPICS Modbus software you are using.  Are you using my Modbus package (http://cars.uchicago.edu/software/epics/modbus.html)? If so, then it might be helpful to collect and look at the I/O statistics.  This will give you a histogram of the I/O cycle times, and also the maximum I/O cycle time since the IOC started.  I’ve attached a couple of screen shots.  These screen shots show that most of the cycle times are in the 10-20 ms range, but the worst case is over 2 seconds.  If you have long tails on your histogram then that might be an indication of a problem.
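As an illustration of what "long tails" means here, the following is a minimal Python sketch that summarizes a list of I/O cycle times and counts how many fall above a threshold. It does not read anything from the Modbus package; the function name `summarize_cycle_times`, the 100 ms threshold, and the sample values are all assumptions made for the example.

```python
from typing import Sequence


def summarize_cycle_times(times_ms: Sequence[float],
                          tail_threshold_ms: float = 100.0) -> dict:
    """Summarize I/O cycle times (in ms) and report the size of the long tail.

    tail_threshold_ms is an arbitrary, hypothetical cutoff; pick a value that
    suits the device and poll rate being examined.
    """
    if not times_ms:
        raise ValueError("no cycle-time samples given")
    ordered = sorted(times_ms)
    n = len(ordered)
    # Median of the sorted samples (average of the two middle values if n is even).
    if n % 2:
        median = ordered[n // 2]
    else:
        median = (ordered[n // 2 - 1] + ordered[n // 2]) / 2
    tail_count = sum(1 for t in ordered if t > tail_threshold_ms)
    return {
        "count": n,
        "median_ms": median,
        "max_ms": ordered[-1],
        "tail_count": tail_count,
        "tail_fraction": tail_count / n,
    }


# Hypothetical samples: mostly 10-20 ms, with one multi-second outlier.
summary = summarize_cycle_times([12, 15, 18, 11, 2100])
```

A median near the normal cycle time combined with a large maximum or a non-zero tail fraction is the pattern worth investigating.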

 

I agree with Andrew that something scanning the network, or generating a large burst of broadcast traffic, might be the cause.

 

Mark

 

 

 

From: [email protected] [mailto:[email protected]] On Behalf Of Wallace, Alex
Sent: Monday, March 28, 2016 12:49 PM
To: [email protected]
Subject: Modbus devices lost connectivity

 

Recently we had an incident where many of our devices using Modbus were knocked offline or experienced intermittent connectivity. This occurred across all subnets, and seemed to affect devices that used Modbus to communicate. Below are more details. I am wondering if anyone has seen something like this before? Automation Direct says there are some firmware updates we are behind on. I suspect Beckhoff will say the same. I will be recording network traffic on one of these devices to try to catch a malformed packet or some other anomaly.

 

Thanks for your consideration.

-Alex

 

·         The first noticeable symptom was that a Beckhoff BK9000 Modbus TCP coupler lost communication with EPICS (our interface is based on Modbus).

·         A vacuum system (using an Automation Direct Koyo PLC) showed purple on all permissive indicators. These are the Cn bits in the PLC, and when I checked the IOC, the Cn asyn port was throwing errors, specifically Modbus exception 6, which means that the PLC was busy with something else.

·         Another hutch also reported that their PLC was having issues communicating with EPICS. Controls still worked, but the readbacks would go purple every time a valve was actuated. This means that EPICS was probably encountering timeouts or PLC busy responses.

·         In both cases, the PLC appeared to continue functioning. Communication with EPICS was the only interruption.

·         Checking all Beckhoff couplers (BK9000) in both hutch subnets showed that they could be pinged, and other application-layer services were functioning (e.g. ADS for the Beckhoff devices; KS4000 software could still connect over Ethernet).

·         New Beckhoff PLCs were not affected by this issue.

·         Telnet to port 502 was functional for all these devices

·         Modpoll was able to communicate with the PLCs

·         EPICS would have had library loading issues if something in the module changed or was missing

·         The issues appeared without a restart of the IOCs suggesting that it was external to EPICS

·         The issue occurred across multiple subnets

·         ECOM (the interface for the Automation Direct PLCs to be programmed remotely) is currently non-functioning for PLCs that haven't been power-cycled. A hutch PLC was cycled and the ECOM interface recovered.

·         ECOM for the AD PLCs shares the NIC with modbus communication, still don't know what port it uses, 16#7070.

·         Tested the robustness of the AD PLC NIC by sending a jumbo-frame ICMP packet. This caused Modbus communication to fail completely and the NIC to stop responding to ping, i.e. it crashed the NIC. Beckhoff couplers were more resilient and only showed an increase in response time. This may be a separate issue, but is recorded here for consideration. Before this packet was sent, the NIC error light was not lit on the AD PLCs.

·         Attempted to extract a log from another hutch PLC before power-cycling it; no log was present.
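For anyone inspecting the captured traffic, the sketch below shows how a Modbus TCP exception response such as the "exception 6" mentioned above can be recognized in a raw frame. It is a minimal Python illustration using only the standard library, not the code used by the EPICS Modbus module; the function names are hypothetical, and the choice of function code 1 (Read Coils) for the Cn bits is an assumption.

```python
import struct

# Exception codes 1-6 as named in the Modbus application protocol specification.
EXCEPTION_NAMES = {
    1: "Illegal Function",
    2: "Illegal Data Address",
    3: "Illegal Data Value",
    4: "Server Device Failure",
    5: "Acknowledge",
    6: "Server Device Busy",
}


def build_read_coils_request(transaction_id: int, unit_id: int,
                             start_address: int, quantity: int) -> bytes:
    """Build a Modbus TCP Read Coils (function 0x01) request frame."""
    pdu = struct.pack(">BHH", 0x01, start_address, quantity)
    # MBAP header: transaction id, protocol id (0), length (unit id + PDU), unit id.
    mbap = struct.pack(">HHHB", transaction_id, 0, len(pdu) + 1, unit_id)
    return mbap + pdu


def parse_response(frame: bytes) -> dict:
    """Parse a Modbus TCP response and flag exception responses."""
    if len(frame) < 8:
        raise ValueError("frame too short for MBAP header plus function code")
    transaction_id, protocol_id, length, unit_id = struct.unpack(">HHHB", frame[:7])
    function = frame[7]
    if function & 0x80:
        # Exception response: high bit set on the function code, then one code byte.
        if len(frame) < 9:
            raise ValueError("exception response is missing its exception code")
        code = frame[8]
        return {
            "transaction_id": transaction_id,
            "exception": True,
            "function": function & 0x7F,
            "code": code,
            "name": EXCEPTION_NAMES.get(code, "unknown"),
        }
    return {
        "transaction_id": transaction_id,
        "exception": False,
        "function": function,
        "data": frame[8:],
    }


# Hypothetical "busy" reply to a Read Coils request from unit 1.
busy = parse_response(struct.pack(">HHHBBB", 1, 0, 3, 1, 0x81, 6))
```

Here `busy["name"]` is "Server Device Busy", which corresponds to the PLC-busy condition described in the list above.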

 

Attachment: ModbusStats1.png
Description: ModbusStats1.png

Attachment: ModbusStats2.png
Description: ModbusStats2.png


