Argonne National Laboratory

Experimental Physics and
Industrial Control System

1994  1995  1996  1997  1998  1999  2000  2001  2002  2003  2004  2005  2006  2007  2008  2009  2010  2011  2012  2013  2014  2015  2016  2017  2018  <20192020  Index 1994  1995  1996  1997  1998  1999  2000  2001  2002  2003  2004  2005  2006  2007  2008  2009  2010  2011  2012  2013  2014  2015  2016  2017  2018  <20192020 
<== Date ==> <== Thread ==>

Subject: Re: connection timeout between ioc and logstash
From: "Zimoch Dirk \(PSI\) via Tech-talk" <tech-talk@aps.anl.gov>
To: "Kivel Niko (PSI)" <niko.kivel@psi.ch>, "anj@anl.gov" <anj@anl.gov>, "wux@ihep.ac.cn" <wux@ihep.ac.cn>, "jlmuir@imca-cat.org" <jlmuir@imca-cat.org>
Cc: "tech-talk@aps.anl.gov" <tech-talk@aps.anl.gov>
Date: Mon, 19 Aug 2019 06:49:32 +0000
In our case the timeout is caused by a firewall closing connections without
traffic after some time. Setting tcp_keep_alive option on the receiver side (in
our case originally syslog-ng, now moving to logstash), has no efect.

Also according to network packsge dumps, it seems that the logClient continues
to send on a socked even after the receiving side has cleanly closed it.

I think the problem needs to be addressed on the sender side. LogClient needs to
detect when the socked has been closed, either by the receiver or by a firewall.
Ideally as soon as the connection is lost but latest when the next message is to
be sent. After reconnecting it should send the message on the new connection.

I'll have a look at the code....

Dirk

On Mon, 2019-08-19 at 06:26 +0000, Kivel Niko (PSI) via Tech-talk wrote:
> Hi all
> 
> I talked to Dirk Zimoch about the connection drop issue. He noticed an even worse behavior. The first message after the timeout is lost w/o hope. Only the second message causes the logger to notice the connection loss and triggers the re-establishment. But, the 2nd message is still dropped. The next message (3rd) will get through.
> IMHO, this is something that should be fixed at the source not at the receiver side.
> 
> @logstash: We use v7.2.0 right now.
> 
> logstash input:
> input {
> ...
>   tcp {
>     port => 1234
>     type => 'caputlog'
>   }
> ...
> }
> 
> @grok:
> This is what I use as grok filter. Notice, we prefix the messages with the IOC-name. If you don't do this, remove '%{IOC}' obviously. 
> 
> filter {
>   if [type] == "caputlog" {
>     grok {
>       match => {
>         "message" => "^%{IOC} %{TS} %{CLIENT} %{USER} %{PV} %{NEW} %{OLD}?(%{MINMAX})?"
>       }
>       pattern_definitions => {
>         "TS"     => "(?<timestamp>%{MONTHDAY}-(?i)%{MONTH}-%{YEAR} %{TIME})"
>         "IOC"    => "%{IPORHOST:ioc}"
>         "CLIENT" => "%{IPORHOST:client}"
>         "USER"   => "%{USERNAME:user}"
>         "PV"     => "(?<pv>\S+)"
>         "NEW"    => "new=(%{NUMBER:newval}\s+%{QUOTEDSTRING:newstring}|(%{NUMBER:newval})|(%{QUOTEDSTRING:newstring}))"
>         "OLD"    => "old=(%{NUMBER:oldval}\s+%{QUOTEDSTRING:oldstring}|(%{NUMBER:oldval})|(%{QUOTEDSTRING:oldstring}))"
>         "MINMAX" => "( min=%{NUMBER:min}\s+max=%{NUMBER:max})?"
>       }
>     }
>     mutate {
>       convert => {
>         "pv"     => "string"
>         "newval" => "float"
>         "oldval" => "float"
>         "min"    => "float"
>         "max"    => "float"
>       }
>     }
>   }
> }
> 
> This should match:
> TEST-IOC 19-Aug-2019 00:51:36 client.psi.ch user P:PV.CAL new=1 old=0 min=0 max=1
> TEST-IOC 19-Aug-2019 00:51:36 client.psi.ch user P:PV.CAL new=1 old=0
> TEST-IOC 19-Aug-2019 00:51:36 client.psi.ch user P:PV.CAL new=1 "on" old=0 "off"
> 
> The last line is for future implementation of enum logging.
> 
> best
> Niko
> --
> Paul Scherrer Institute
> Dr. Niko Kivel
> Forschungsstrasse 111
> WBGB/014
> 5232 Villigen PSI
> Switzerland
> 
> phone: +41 56 310 2226
> email: niko.kivel@psi.ch
> From: 吴煊 [wux@ihep.ac.cn]
> Sent: Saturday, August 17, 2019 16:13
> To: johnson, andrew n.; j. lewis muir; Kivel Niko (PSI)
> Cc: tech-talk
> Subject: Re: connection timeout between ioc and logstash
> 
> 
> Hi all,
>   I've test the tcp_keep_alive option, unfortunately, it didn't work either. The difference is the IOC doesn't prompt error info when timeout. I can tell that the time of timeout is half an hour(28 minutes later is ok, 31 minutes later is abnormal). Then, logstash can never receive new message except I restarting the IOC. It's no use to restart logstash. The logstash configuration I use as follow: 
> input {
>         tcp {
>                 host => "${EPICS_IOC_LOG_INET:10.1.236.30}"
>                 port => "${EPICS_IOC_LOG_PORT:7011}"
>                 id => "ioclog-${EPICS_IOC_LOG_PORT:7011}"
>                 codec => "line"
>                 tcp_keep_alive => true
>         }
> }
>  
> filter {
>         grok {
>                 pattern_definitions => {
>                         MONTHNAM => "(?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)"
>                         TIMESTAMP => "%{MONTHDAY}-%{MONTHNAM}-%{YEAR} %{TIME}"
>                         PV => "\S+"
>                 }
> #               named_captures_only => true
>                 # The following string must be all one line, one space between parts
>                 match => { "message" => "%{TIMESTAMP:timestamp} %{HOSTNAME:client} %{USER:user} %{PV:pv} new=%{NUMBER:new:float} old=%{NUMBER:old:float}" }
>         }
> 
> #       date {
> #               # Extract the timestamp from the message as a proper logstash
> #               # timestamp field type, but since the IOC's clock might not be
> #               # accurate we don't replace the locally generated @timestamp.
> #               match => ["timestamp", "dd-MMM-yy HH:mm:ss"]
> #               locale => "en"
> #               target => "@ioctimestamp"
> #       }
> }
>  
> output {
> #        elasticsearch {
> #                hosts => "10.1.236.30:9200"
> #                index => "caputlog-%{+YYYY-MM-dd}"
> #        }
>        stdout { codec => rubydebug }
>         } 
> It can't parse correctly if I added "(?: min=%{NUMBER:min} max=%{NUMBER:max})?" to the end of grok match part. It would work only if there was no burst. I haven't found a solution for this.
> 
> > -----原始邮件-----
> > 发件人:"Johnson, Andrew N." <anj@anl.gov>
> > 发送时间:2019-08-17 00:43:01 (星期六)
> > 收件人: "吴煊" <wux@ihep.ac.cn>, "j. lewis muir" <jlmuir@imca-cat.org>, "kivel niko (psi)" <niko.kivel@psi.ch>
> > 抄送: tech-talk <tech-talk@aps.anl.gov>
> > 主题: Re: connection timeout between ioc and logstash
> > 
> > Hi,
> > 
> > On 8/16/19 10:22 AM, 吴煊 via Tech-talk wrote:
> > > I simplly followed the instruction written by Andrew Johnson https://events01.synchrotron.org.au/event/85/contributions/2601/attachments/689/997/5_06_Logstash_for_IOCs.pdf A thread named "IOClog" in May 2th 2019 had mentioned this.
> > > I've installed logstash 7.2 in a linux box. I configured 'caPutLogInit "10.1.236.30:7011", 1' in st.cmd. The configuration file for logstash is fellowing:
> > > input {
> > >  tcp {
> > >  host => "${EPICS_IOC_LOG_INET:10.1.236.30}"
> > >  port => "${EPICS_IOC_LOG_PORT:7011}"
> > >  codec => "line"
> > >  }
> > > output {
> > >  file {
> > >  path => "/log/ioc/putlog-%{+YYYY-MM-dd}.json"
> > >  }
> > > }
> > > }
> > > That's all the configuration. I can not figure out who closed the connection, but I think logstash may closed it, because iocLogServer will not. I'll try the tcp_keep_alive option next and to see it would work or not.
> >  Apparently the default setting for the TCP SO_KEEPALIVE option can be different on different OSs, and the logstash configuration file I presented had not been widely tested (I've been running logstash on MacOS here). The iocLogServer code does explicitly set SO_KEEPALIVE in its acceptNewClient() routine so I think that logstash option will be the correct solution to this problem — please let tech-talk know your results.
> > 
> > > By the way, I found that the grok filter in the example file is not correct for logstash 7.2
> >  I developed those regular expressions using http://grokdebug.herokuapp.com/ and logstash 6.3.2 and am currently running 7.0.0, but we're not doing put-logging here yet. If you have a fix for logstash 7.2 I would be interested in seeing it.
> > 
> > Thanks,
> > 
> > - Andrew

References:
connection timeout between ioc and logstash 吴煊 via Tech-talk
Re: connection timeout between ioc and logstash J. Lewis Muir via Tech-talk
RE: connection timeout between ioc and logstash Kivel Niko (PSI) via Tech-talk
Re: connection timeout between ioc and logstash J. Lewis Muir via Tech-talk
Re: Re: connection timeout between ioc and logstash 吴煊 via Tech-talk
Re: connection timeout between ioc and logstash Johnson, Andrew N. via Tech-talk
Re: connection timeout between ioc and logstash 吴煊 via Tech-talk
RE: connection timeout between ioc and logstash Kivel Niko (PSI) via Tech-talk

Navigate by Date:
Prev: RE: connection timeout between ioc and logstash Kivel Niko (PSI) via Tech-talk
Next: SynApps 6-1 Released Lang, Keenan C. via Tech-talk
Index: 1994  1995  1996  1997  1998  1999  2000  2001  2002  2003  2004  2005  2006  2007  2008  2009  2010  2011  2012  2013  2014  2015  2016  2017  2018  <20192020 
Navigate by Thread:
Prev: RE: connection timeout between ioc and logstash Kivel Niko (PSI) via Tech-talk
Next: Undefined Timestamp on Motor Record Dominic Oram - UKRI STFC via Tech-talk
Index: 1994  1995  1996  1997  1998  1999  2000  2001  2002  2003  2004  2005  2006  2007  2008  2009  2010  2011  2012  2013  2014  2015  2016  2017  2018  <20192020 
ANJ, 19 Aug 2019 Valid HTML 4.01! · Home · News · About · Base · Modules · Extensions · Distributions · Download ·
· Search · EPICS V4 · IRMIS · Talk · Bugs · Documents · Links · Licensing ·