In our case the timeout is caused by a firewall closing connections without
traffic after some time. Setting tcp_keep_alive option on the receiver side (in
our case originally syslog-ng, now moving to logstash), has no efect.
Also according to network packsge dumps, it seems that the logClient continues
to send on a socked even after the receiving side has cleanly closed it.
I think the problem needs to be addressed on the sender side. LogClient needs to
detect when the socked has been closed, either by the receiver or by a firewall.
Ideally as soon as the connection is lost but latest when the next message is to
be sent. After reconnecting it should send the message on the new connection.
I'll have a look at the code....
Dirk
On Mon, 2019-08-19 at 06:26 +0000, Kivel Niko (PSI) via Tech-talk wrote:
> Hi all
>
> I talked to Dirk Zimoch about the connection drop issue. He noticed an even worse behavior. The first message after the timeout is lost w/o hope. Only the second message causes the logger to notice the connection loss and triggers the re-establishment. But, the 2nd message is still dropped. The next message (3rd) will get through.
> IMHO, this is something that should be fixed at the source not at the receiver side.
>
> @logstash: We use v7.2.0 right now.
>
> logstash input:
> input {
> ...
> tcp {
> port => 1234
> type => 'caputlog'
> }
> ...
> }
>
> @grok:
> This is what I use as grok filter. Notice, we prefix the messages with the IOC-name. If you don't do this, remove '%{IOC}' obviously.
>
> filter {
> if [type] == "caputlog" {
> grok {
> match => {
> "message" => "^%{IOC} %{TS} %{CLIENT} %{USER} %{PV} %{NEW} %{OLD}?(%{MINMAX})?"
> }
> pattern_definitions => {
> "TS" => "(?<timestamp>%{MONTHDAY}-(?i)%{MONTH}-%{YEAR} %{TIME})"
> "IOC" => "%{IPORHOST:ioc}"
> "CLIENT" => "%{IPORHOST:client}"
> "USER" => "%{USERNAME:user}"
> "PV" => "(?<pv>\S+)"
> "NEW" => "new=(%{NUMBER:newval}\s+%{QUOTEDSTRING:newstring}|(%{NUMBER:newval})|(%{QUOTEDSTRING:newstring}))"
> "OLD" => "old=(%{NUMBER:oldval}\s+%{QUOTEDSTRING:oldstring}|(%{NUMBER:oldval})|(%{QUOTEDSTRING:oldstring}))"
> "MINMAX" => "( min=%{NUMBER:min}\s+max=%{NUMBER:max})?"
> }
> }
> mutate {
> convert => {
> "pv" => "string"
> "newval" => "float"
> "oldval" => "float"
> "min" => "float"
> "max" => "float"
> }
> }
> }
> }
>
> This should match:
> TEST-IOC 19-Aug-2019 00:51:36 client.psi.ch user P:PV.CAL new=1 old=0 min=0 max=1
> TEST-IOC 19-Aug-2019 00:51:36 client.psi.ch user P:PV.CAL new=1 old=0
> TEST-IOC 19-Aug-2019 00:51:36 client.psi.ch user P:PV.CAL new=1 "on" old=0 "off"
>
> The last line is for future implementation of enum logging.
>
> best
> Niko
> --
> Paul Scherrer Institute
> Dr. Niko Kivel
> Forschungsstrasse 111
> WBGB/014
> 5232 Villigen PSI
> Switzerland
>
> phone: +41 56 310 2226
> email: [email protected]
> From: 吴煊 [[email protected]]
> Sent: Saturday, August 17, 2019 16:13
> To: johnson, andrew n.; j. lewis muir; Kivel Niko (PSI)
> Cc: tech-talk
> Subject: Re: connection timeout between ioc and logstash
>
>
> Hi all,
> I've test the tcp_keep_alive option, unfortunately, it didn't work either. The difference is the IOC doesn't prompt error info when timeout. I can tell that the time of timeout is half an hour(28 minutes later is ok, 31 minutes later is abnormal). Then, logstash can never receive new message except I restarting the IOC. It's no use to restart logstash. The logstash configuration I use as follow:
> input {
> tcp {
> host => "${EPICS_IOC_LOG_INET:10.1.236.30}"
> port => "${EPICS_IOC_LOG_PORT:7011}"
> id => "ioclog-${EPICS_IOC_LOG_PORT:7011}"
> codec => "line"
> tcp_keep_alive => true
> }
> }
>
> filter {
> grok {
> pattern_definitions => {
> MONTHNAM => "(?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)"
> TIMESTAMP => "%{MONTHDAY}-%{MONTHNAM}-%{YEAR} %{TIME}"
> PV => "\S+"
> }
> # named_captures_only => true
> # The following string must be all one line, one space between parts
> match => { "message" => "%{TIMESTAMP:timestamp} %{HOSTNAME:client} %{USER:user} %{PV:pv} new=%{NUMBER:new:float} old=%{NUMBER:old:float}" }
> }
>
> # date {
> # # Extract the timestamp from the message as a proper logstash
> # # timestamp field type, but since the IOC's clock might not be
> # # accurate we don't replace the locally generated @timestamp.
> # match => ["timestamp", "dd-MMM-yy HH:mm:ss"]
> # locale => "en"
> # target => "@ioctimestamp"
> # }
> }
>
> output {
> # elasticsearch {
> # hosts => "10.1.236.30:9200"
> # index => "caputlog-%{+YYYY-MM-dd}"
> # }
> stdout { codec => rubydebug }
> }
> It can't parse correctly if I added "(?: min=%{NUMBER:min} max=%{NUMBER:max})?" to the end of grok match part. It would work only if there was no burst. I haven't found a solution for this.
>
> > -----原始邮件-----
> > 发件人:"Johnson, Andrew N." <[email protected]>
> > 发送时间:2019-08-17 00:43:01 (星期六)
> > 收件人: "吴煊" <[email protected]>, "j. lewis muir" <[email protected]>, "kivel niko (psi)" <[email protected]>
> > 抄送: tech-talk <[email protected]>
> > 主题: Re: connection timeout between ioc and logstash
> >
> > Hi,
> >
> > On 8/16/19 10:22 AM, 吴煊 via Tech-talk wrote:
> > > I simplly followed the instruction written by Andrew Johnson https://events01.synchrotron.org.au/event/85/contributions/2601/attachments/689/997/5_06_Logstash_for_IOCs.pdf A thread named "IOClog" in May 2th 2019 had mentioned this.
> > > I've installed logstash 7.2 in a linux box. I configured 'caPutLogInit "10.1.236.30:7011", 1' in st.cmd. The configuration file for logstash is fellowing:
> > > input {
> > > tcp {
> > > host => "${EPICS_IOC_LOG_INET:10.1.236.30}"
> > > port => "${EPICS_IOC_LOG_PORT:7011}"
> > > codec => "line"
> > > }
> > > output {
> > > file {
> > > path => "/log/ioc/putlog-%{+YYYY-MM-dd}.json"
> > > }
> > > }
> > > }
> > > That's all the configuration. I can not figure out who closed the connection, but I think logstash may closed it, because iocLogServer will not. I'll try the tcp_keep_alive option next and to see it would work or not.
> > Apparently the default setting for the TCP SO_KEEPALIVE option can be different on different OSs, and the logstash configuration file I presented had not been widely tested (I've been running logstash on MacOS here). The iocLogServer code does explicitly set SO_KEEPALIVE in its acceptNewClient() routine so I think that logstash option will be the correct solution to this problem — please let tech-talk know your results.
> >
> > > By the way, I found that the grok filter in the example file is not correct for logstash 7.2
> > I developed those regular expressions using http://grokdebug.herokuapp.com/ and logstash 6.3.2 and am currently running 7.0.0, but we're not doing put-logging here yet. If you have a fix for logstash 7.2 I would be interested in seeing it.
> >
> > Thanks,
> >
> > - Andrew
- References:
- connection timeout between ioc and logstash 吴煊 via Tech-talk
- Re: connection timeout between ioc and logstash J. Lewis Muir via Tech-talk
- RE: connection timeout between ioc and logstash Kivel Niko (PSI) via Tech-talk
- Re: connection timeout between ioc and logstash J. Lewis Muir via Tech-talk
- Re: Re: connection timeout between ioc and logstash 吴煊 via Tech-talk
- Re: connection timeout between ioc and logstash Johnson, Andrew N. via Tech-talk
- Re: connection timeout between ioc and logstash 吴煊 via Tech-talk
- RE: connection timeout between ioc and logstash Kivel Niko (PSI) via Tech-talk
- Navigate by Date:
- Prev:
RE: connection timeout between ioc and logstash Kivel Niko (PSI) via Tech-talk
- Next:
SynApps 6-1 Released Lang, Keenan C. via Tech-talk
- Index:
1994
1995
1996
1997
1998
1999
2000
2001
2002
2003
2004
2005
2006
2007
2008
2009
2010
2011
2012
2013
2014
2015
2016
2017
2018
<2019>
2020
2021
2022
2023
2024
- Navigate by Thread:
- Prev:
RE: connection timeout between ioc and logstash Kivel Niko (PSI) via Tech-talk
- Next:
Undefined Timestamp on Motor Record Dominic Oram - UKRI STFC via Tech-talk
- Index:
1994
1995
1996
1997
1998
1999
2000
2001
2002
2003
2004
2005
2006
2007
2008
2009
2010
2011
2012
2013
2014
2015
2016
2017
2018
<2019>
2020
2021
2022
2023
2024
|