EPICS Controls Argonne National Laboratory

Experimental Physics and
Industrial Control System

Subject: Re: Monitor timeouts with asyn/StreamDevice, and issues in DRTO with UDP
From: Zimoch Dirk via Tech-talk <tech-talk at aps.anl.gov>
To: "erico.rolim at lnls.br" <erico.rolim at lnls.br>, "tech-talk at aps.anl.gov" <tech-talk at aps.anl.gov>
Date: Tue, 19 Sep 2023 10:02:02 +0000

Hi Érico,

Usually EPICS records contain the last known state, not a history. Thus, it is
to be expected that the status goes back to "all is fine" as soon as the
connection is re-established.

A few things you can do:

1. Count the errors (using a calc record). With this method, you can at least
see the counter increasing. If you like, you can archive this counter and check
later whether there were bursts, etc.
2. Log the error messages to a log server. In addition to seeing the problem on
the IOC shell, this allows you to look up events later in a log file or a
database (e.g. Logstash). By default, StreamDevice logs to stderr, but you can
use 'streamSetLogfile filename' to copy error messages to a file. (Call it with
no file name to stop logging to the file.)
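
A minimal sketch of the counter from item 1, assuming a timeout leaves the
StreamDevice record with INVALID severity (numeric value 3). The record names
$(P)Reading and $(P)ErrCount, the protocol file example.proto and the protocol
getReading are hypothetical placeholders; adapt them to your database:

```
# Hypothetical existing StreamDevice record; only the FLNK line is new.
record(ai, "$(P)Reading") {
    field(DTYP, "stream")
    field(INP,  "@example.proto getReading $(PORT)")
    field(SCAN, "1 second")
    # Process the counter every time this record finishes processing.
    field(FLNK, "$(P)ErrCount")
}

# Counter: adds 1 whenever the reading was left with INVALID severity.
record(calc, "$(P)ErrCount") {
    # A = severity of the reading (0=NO_ALARM ... 3=INVALID)
    field(INPA, "$(P)Reading.SEVR")
    # B = current counter value, read back from this record itself
    field(INPB, "$(P)ErrCount.VAL")
    field(CALC, "A=3?B+1:B")
}
```

Because the counter is driven by FLNK rather than by a monitor on the alarm
state, consecutive failures each add one count, so a burst shows up as a steep
rise in the archived value.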

That the connection is not re-established is unexpected (as the message says):
> 2023/09/18 15:30:55.800291 TIPORT DE-23RaBPM:TI-EVE:FrmVersionA-Cte: StreamCore::lockCallback(StreamIoSuccess) called unexpectedly

I need to analyze this. Can you send me your records, protocols and the port
configuration from the startup script? Also some sample data would be helpful.


Dirk

On Mon, 2023-09-18 at 15:49 +0000, Érico Nogueira Rolim via Tech-talk wrote:
> Hi!
> 
> I'm using StreamDevice with a UDP device, and we have been observing some communication timeouts caused by packet drops/high CPU load (per our testing), which we only noticed due to actually opening the IOC shell and seeing multiple "No reply within 1000 ms to ..." messages. However, I'd like to be able to observe these (instantaneous) timeouts from a PV. Checking the alarm of the PVs isn't enough, because it's cleared as soon as the next communication attempt works, and having to check our archiver data for the information doesn't scale well for operation.
> 
> I instantiated the asynRecord for our port, and tried setting .DRTO to "Yes", in order to observe disconnections (which I simulated by fully disconnecting the ethernet cable from the device). The .CNCT field does go to "Disconnect", but it goes back to "Connect" for every record that is processed, and displaying the timestamp of the PV in an interface wouldn't be enough, because it is "<undefined>" (per camonitor output). Furthermore, in what seems to be a bug, once I reconnect the ethernet cable, the iocsh prints the following messages:
> 2023/09/18 15:30:55.702487 TIPORT DE-23RaBPM:TI-EVE:readAndUpdate: device TIPORT 0 disconnected
> ** reconnected cable **
> 2023/09/18 15:30:55.800291 TIPORT DE-23RaBPM:TI-EVE:FrmVersionA-Cte: StreamCore::lockCallback(StreamIoSuccess) called unexpectedly
> and from this point on, it simply doesn't reestablish a connection. Pulling the cable again doesn't cause anything to be printed in iocsh anymore, and the records have stopped being updated (though at least they do have alarm information). I can provide more information, if necessary :)
> 
> Is there some other way of monitoring timeouts and disconnections when using asyn/StreamDevice which I'm missing?
> 
> Cheers,
> Érico
