Subject: RE: Modbus alarms question
From: Mark Rivers via Tech-talk <tech-talk at aps.anl.gov>
To: 'John Dobbins' <john.dobbins at cornell.edu>
Cc: "tech-talk at aps.anl.gov" <tech-talk at aps.anl.gov>
Date: Wed, 15 Jul 2020 16:15:59 +0000
Hi John,

I think I understand what is happening. For "slow" devices (ASYN_CANBLOCK=1) asynManager queues a request for the I/O operation. When the request gets to the head of the queue, asynManager causes the actual I/O operation to be run in the port driver thread. If the device is not available, there are two ways a read operation can fail:

1) The request gets to the head of the queue and the actual I/O operation times out.

2) The request spends too long in the queue, and the queue request itself times out.

The timeout for 1) is set in your startup script, in the modbusInterposeConfig command. You set it to 2000 ms = 2.0 seconds. The timeout for 2) is set by default in asynManager.c, also to 2.0 seconds; it can be changed with the iocsh command asynSetQueueLockPortTimeout(portName, timeout). Failure mode 2) occurs when several requests are queued and each takes timeout 1) to fail with failure mode 1): eventually some of the requests in the queue exceed timeout 2) and fail with failure mode 2).
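For concreteness, the two timeouts might appear in a startup script like this (the port name "MISC" is taken from your trace settings; the exact values in your script may differ):

```
# Timeout 1: the per-transaction I/O timeout, the third argument of
# modbusInterposeConfig, in milliseconds. Your script uses 2000 ms.
modbusInterposeConfig("MISC", 0, 2000, 0)

# Timeout 2: the queue-lock timeout, in seconds. It defaults to 2.0 in
# asynManager.c and can be changed like this (10.0 is just an example):
asynSetQueueLockPortTimeout("MISC", 10.0)
```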
The status returned for these two failure modes is different: failure mode 1) returns a "read" error, while failure mode 2) returns a "timeout" error. A particular record can fail either way, so each record can alternate between the two alarm statuses. There are several ways this could be fixed.
1) Change your startup script to set the read timeout to 200 ms rather than 2000 ms. That is probably plenty, because the device should normally respond faster than that. It would allow 10 requests to time out before a queue request timeout occurred. If that is not enough, you could use asynSetQueueLockPortTimeout to increase timeout 2) until all requests fail with mode 1).
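A sketch of method 1) as startup-script changes (same assumed port name "MISC"; the 10.0 second value is illustrative, not a recommendation):

```
# Shorten the per-transaction read timeout from 2000 ms to 200 ms, so
# roughly 10 transactions can time out within the default 2.0 s queue timeout.
modbusInterposeConfig("MISC", 0, 200, 0)

# Optionally also raise timeout 2) so queue requests never time out first.
asynSetQueueLockPortTimeout("MISC", 10.0)
```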
2) You could add the following command to your startup script: asynSetOption("MISC", 0, "disconnectOnReadTimeout", "Y"). That command causes asyn to disconnect the port if a read times out, putting the port in the disconnected state. If autoConnect is true, asynManager will keep trying to reconnect the device; when it becomes available again it will reconnect.
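A startup-script sketch of method 2). The host address here is a placeholder for whatever your script already uses; the point is that autoConnect must be enabled (fourth argument of drvAsynIPPortConfigure, noAutoConnect, set to 0) for the automatic reconnection to happen:

```
# Create the IP port with autoConnect enabled (noAutoConnect = 0).
drvAsynIPPortConfigure("MISC", "plc-hostname:502", 0, 0, 0)

# Disconnect the port whenever a read times out; asynManager will then
# keep retrying the connection until the device comes back.
asynSetOption("MISC", 0, "disconnectOnReadTimeout", "Y")
```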
3) I could change the Modbus driver so that if failure 1) is indeed a timeout (as opposed to some other error) it sets the alarm status to "timeout" rather than "read". Then both failure modes would have the same alarm status.

I would suggest first trying method 2) and seeing what happens to the alarm status in that case. If that does not work, try method 1). If neither works, I will look at modifying the driver (method 3).

Mark

From: John Dobbins <john.dobbins at cornell.edu>
Mark,

First, note that this is not an urgent matter. Here are the details.

EPICS_HOST_ARCH linux-x86_64
EPICS base-7.0.3.1
asyn-R4-39
modbus-R3-0

This IOC connects to a single PLC (from Automation Direct) via Modbus/TCP. It reads from 11 blocks of coils, with lengths varying from 1 to 20 coils. It also reads some input registers, but I have commented these out for the purpose of the test. All the records are bi and the scan is I/O Intr for each record. The template used by the records is:

# bi record template for register inputs
record(bi,"$(P)") {
    field(DTYP,"asynUInt32Digital")
    field(INP,"@asynMask($(PORT) $(OFFSET) 0x1)")
    field(SCAN,"$(SCAN)")
    field(ZNAM,"$(ZNAM)")
    field(ONAM,"$(ONAM)")
    field(ZSV,"$(ZSV)")
    field(OSV,"$(OSV)")
    field(DESC,"$(DESC)")
}

Note that I am not the author of the IOC and I haven't looked carefully at it. I first tried to pare the IOC down to a single coil read from a single block, but that did not reproduce the error. I tried two records, then two blocks, without reproducing the error. I then went back to the original configuration.

The attached file camonitor.txt shows the output for camonitor attached to one of the records, "MSD_S1F_FLD_CR":

19:54:04  camonitor starts and shows STATE MAJOR (which is correct)
19:54:16  I unplug the network cable to the PLC
19:54:18  camonitor shows TIMEOUT INVALID (note: sometimes the first error is READ INVALID)
19:54:25  camonitor shows READ INVALID
19:54:40  camonitor shows TIMEOUT INVALID
19:54:47  camonitor shows READ INVALID
19:55:02  camonitor shows TIMEOUT INVALID
19:55:04  camonitor shows READ INVALID
19:55:12  camonitor shows TIMEOUT INVALID
19:55:14  camonitor shows READ INVALID
19:55:18  disconnected

I then exited the IOC. Also attached is the IOC console output as modbusTest_console.txt. There are messages in the console output which match in time with the state changes of the PV. I don't know that I have the best asyn trace settings:

asynSetTraceIOMask("MISC",0,9)
asynSetTraceMask("MISC",0,2)
asynSetTraceIOMask("MI_IN_3",0,4)
asynSetTraceMask("MI_IN_3",0,255)
asynSetTraceIOTruncateSize("MI_IN_3",0,512)

MISC is the asynIPPort. MI_IN_3 is the ModbusAsyn block for the signal to which I attached camonitor.

As always, thanks for your time and insight.

Regards,
John Dobbins

From: John Dobbins <john.dobbins at cornell.edu>

Mark,

Your suggestion worked! I was able to reproduce the issue with the latest version of Modbus support. I will gather up all the details.

John

From: John Dobbins

Mark,

Good point. I'll figure out a way to do that.
John

On Jul 11, 2020, at 7:09 PM, Mark Rivers <rivers at cars.uchicago.edu> wrote:

Hi John,