It would be very useful to get a stack trace showing where drvAsynIPPort::readIt is being called from when it is being called very fast after a disconnect. I want to know if it is being called by asynInterposeEos or someplace else.
Mark
> -----Original Message-----
> From: Mark Rivers
> Sent: Friday, March 02, 2018 3:10 PM
> To: 'Daykin, Evan' <[email protected]>; Torsten Bögershausen
> <[email protected]>; [email protected]
> Subject: RE: Asyn not automatically reconnecting
>
> Hi Evan,
>
> Can you simplify this further by removing all protocols from the protocol file except
> getAmpValue?
>
> I don't understand the output you sent. See my comments below.
>
> > -----Original Message-----
> > From: Daykin, Evan [mailto:[email protected]]
> > Sent: Thursday, March 01, 2018 3:34 PM
> > To: Mark Rivers <[email protected]>; Torsten Bögershausen
> > <[email protected]>; [email protected]
> > Subject: RE: Asyn not automatically reconnecting
> >
> > Sure,
> >
> > Database:
> >
> > record(asyn, "_$(HWUNIT):asyn") #resolves to PORT name
> > {
> > field(PORT, "$(HWUNIT)")
> > field(IEOS, "\n")
> > field(OEOS, "\n")
> > field(ASG, "$(ASGEXPERT)")
> > }
> >
> > record(ai, "TEST:POWR_RD_FWD")
> > {
> > field(DESC, "AMP_PWR_FORWARD value")
> > field(DTYP, "stream")
> > field(SCAN, "5 second")
> > field(EGU, "W")
> > field(INP, "@HBH-RF-Amp.proto getAmpValue(AMP_PWR_FORWARD,1) TEST-
> > RF-DCU-N0001")
> > info(archive, " ")
> > }
> >
> > Protocol file:
> >
> > Terminator = LF;
> > ReplyTimeout = 2000;
> >
> > #-----------------------------------------------------------------------
> ...
> > #-----------------------------------------------------------------------
> > getAmpValue {
> > ExtraInput = Ignore;
> > out "GET VALUE \$1 AMP \$2";
> > in "VALUE \$1 AMP \$2 %g";
> > }
> >
> ...
> > getFirmwareVersionPallet {
> > out "GET VALUE VERSION_SOFTWARE AMP \$1 DRAWER \$2 PALLET \$3";
> > in "VALUE VERSION_SOFTWARE AMP \$1 DRAWER \$2 PALLET \$3 %s";
> > }
> >
>
> > Asyn IO output (properly functioning):
>
> > 2018/03/01 16:17:49.155 [TEST-RF-DCU-N0001,-1,0]
> [../../asyn/asynDriver/asynManager.c:1560] [TEST-RF-DCU-N0001,0x6d72f0,50] TEST-
> RF-DCU-N0001 addr -1 queueRequest priority 0 from lockHolder
> > 2018/03/01 16:17:49.155 [TEST-RF-DCU-N0001,-1,0]
> [../../asyn/asynDriver/asynManager.c:1576] [TEST-RF-DCU-N0001,0x6d72f0,50] TEST-
> RF-DCU-N0001 schedule queueRequest timeout
> > 2018/03/01 16:17:49.155 [TEST-RF-DCU-N0001,-1,0]
> [../../asyn/asynDriver/asynManager.c:886] [TEST-RF-DCU-N0001,0x6d72f0,50]
> asynManager::portThread port=TEST-RF-DCU-N0001 callback
> > 2018/03/01 16:17:49.155 [TEST-RF-DCU-N0001,-1,0]
> [../../asyn/miscellaneous/asynInterposeEos.c:337] [TEST-RF-DCU-N0001,0x6d72f0,50]
> TEST-RF-DCU-N0001 get Eos 0
> > 2018/03/01 16:17:49.155 [TEST-RF-DCU-N0001,-1,0]
> [../../asyn/miscellaneous/asynInterposeEos.c:295] [TEST-RF-DCU-N0001,0x6d72f0,50]
> TEST-RF-DCU-N0001 set Eos 1
> > \n
> > 0a
> > 2018/03/01 16:17:49.155 [TEST-RF-DCU-N0001,-1,0]
> [../../asyn/drvAsynSerial/drvAsynIPPort.c:674] [TEST-RF-DCU-N0001,0x6d72f0,50] test-
> rf-dcu-n0001.cts:5001 read.
> > 2018/03/01 16:17:50.320 [TEST-RF-DCU-N0001,-1,0]
> [../../asyn/drvAsynSerial/drvAsynIPPort.c:203] [TEST-RF-DCU-N0001,0x6d72f0,50] Close
> test-rf-dcu-n0001.cts:5001 connection (fd 6): Read error
> > 2018/03/01 16:17:50.320 [TEST-RF-DCU-N0001,-1,0]
> [../../asyn/asynRecord/asynRecord.c:905] [TEST-RF-DCU-N0001,0x6d72f0,50] _TEST-
> RF-DCU-N0001:asyn: exception 0, asynExceptionConnect
> > 2018/03/01 16:17:50.319943 TEST-RF-DCU- TEST:POWR_RD_FWD: asynError in
> read: test-rf-dcu-n0001.cts:5001 read error: Connection reset by peer
> > 2018/03/01 16:17:50.319959 TEST-RF-DCU- TEST:POWR_RD_FWD: I/O error after
> reading 0 bytes: ""
> > 2018/03/01 16:17:50.319981 TEST-RF-DCU- TEST:POWR_RD_FWD: Protocol aborted
>
> If this is "properly functioning" why are you getting a read error and connection reset by
> peer?
>
> > 2018/03/01 16:17:50.320 [TEST-RF-DCU-N0001,-1,0]
> [../../asyn/miscellaneous/asynInterposeEos.c:295] [TEST-RF-DCU-N0001,0x6d72f0,50]
> TEST-RF-DCU-N0001 set Eos 0
> > 2018/03/01 16:17:50.325 [TEST-RF-DCU-N0001,-1,0]
> [../../asyn/asynDriver/asynManager.c:1565] [timerQueue,0x6d5fd0,60] TEST-RF-DCU-
> N0001 addr -1 queueRequest priority 3 not lockHolder
> > 2018/03/01 16:17:50.325 [TEST-RF-DCU-N0001,-1,0]
> [../../asyn/asynDriver/asynManager.c:807] [TEST-RF-DCU-N0001,0x6d72f0,50]
> asynManager connect queueCallback port:TEST-RF-DCU-N0001
> > 2018/03/01 16:17:50.325 [TEST-RF-DCU-N0001,-1,0]
> [../../asyn/drvAsynSerial/drvAsynIPPort.c:402] [TEST-RF-DCU-N0001,0x6d72f0,50] Open
> connection to test-rf-dcu-n0001.cts:5001 reason:0 fd:-1
> > 2018/03/01 16:17:50.326 [TEST-RF-DCU-N0001,-1,0]
> [../../asyn/drvAsynSerial/drvAsynIPPort.c:508] [TEST-RF-DCU-N0001,0x6d72f0,50]
> Opened connection to test-rf-dcu-n0001.cts:5001
> > 2018/03/01 16:17:50.326 [TEST-RF-DCU-N0001,-1,0]
> [../../asyn/asynRecord/asynRecord.c:905] [TEST-RF-DCU-N0001,0x6d72f0,50] _TEST-
> RF-DCU-N0001:asyn: exception 0, asynExceptionConnect
> > 2018/03/01 16:17:54.153 [TEST-RF-DCU-N0001,-1,0]
> [../../asyn/asynDriver/asynManager.c:1565] [scan-5,0x718ec0,61] TEST-RF-DCU-N0001
> addr -1 queueRequest priority 0 not lockHolder
> > 2018/03/01 16:17:54.153 [TEST-RF-DCU-N0001,-1,0]
> [../../asyn/asynDriver/asynManager.c:1576] [scan-5,0x718ec0,61] TEST-RF-DCU-N0001
> schedule queueRequest timeout
> > 2018/03/01 16:17:54.153 [TEST-RF-DCU-N0001,-1,0]
> [../../asyn/asynDriver/asynManager.c:886] [TEST-RF-DCU-N0001,0x6d72f0,50]
> asynManager::portThread port=TEST-RF-DCU-N0001 callback
> > 2018/03/01 16:17:54.153 [TEST-RF-DCU-N0001,-1,0]
> [../../asyn/asynDriver/asynManager.c:1565] [TEST-RF-DCU-N0001,0x6d72f0,50] TEST-
> RF-DCU-N0001 addr -1 queueRequest priority 0 not lockHolder
> > 2018/03/01 16:17:54.153 [TEST-RF-DCU-N0001,-1,0]
> [../../asyn/asynDriver/asynManager.c:1576] [TEST-RF-DCU-N0001,0x6d72f0,50] TEST-
> RF-DCU-N0001 schedule queueRequest timeout
> > 2018/03/01 16:17:54.154 [TEST-RF-DCU-N0001,-1,0]
> [../../asyn/asynDriver/asynManager.c:886] [TEST-RF-DCU-N0001,0x6d72f0,50]
> asynManager::portThread port=TEST-RF-DCU-N0001 callback
> > 2018/03/01 16:17:54.154 [TEST-RF-DCU-N0001,-1,0]
> [../../asyn/drvAsynSerial/drvAsynIPPort.c:674] [TEST-RF-DCU-N0001,0x6d72f0,50] test-
> rf-dcu-n0001.cts:5001 read.
> > 2018/03/01 16:17:54.154 [TEST-RF-DCU-N0001,-1,0]
> [../../asyn/drvAsynSerial/drvAsynIPPort.c:744] [TEST-RF-DCU-N0001,0x6d72f0,50] test-
> rf-dcu-n0001.cts:5001 read 50
> > FRIB Control Unit (0.4.8 GU Jan 2 2018 21:32:44)
>
> I don't understand. This does not appear to be the protocol getAmpValue. It did not send
> the command to read the amp value, and the string it received is not the one for that
> protocol.
>
> Please send the output when the record TEST:POWR_RD_FWD processes a few times.
>
>
> > Asyn output (not working):
> >
>
> > 2018/03/01 16:29:14.902 [TEST-RF-DCU-N0001,-1,0]
> [../../asyn/asynDriver/asynManager.c:1565] [scan-5,0x264e600,61] TEST-RF-DCU-N0001
> addr -1 queueRequest priority 0 not lockHolder
> > 2018/03/01 16:29:14.902 [TEST-RF-DCU-N0001,-1,0]
> [../../asyn/asynDriver/asynManager.c:1576] [scan-5,0x264e600,61] TEST-RF-DCU-N0001
> schedule queueRequest timeout
> > 2018/03/01 16:29:14.902 [TEST-RF-DCU-N0001,-1,0]
> [../../asyn/asynDriver/asynManager.c:886] [TEST-RF-DCU-N0001,0x260d420,50]
> asynManager::portThread port=TEST-RF-DCU-N0001 callback
> > 2018/03/01 16:29:14.902 [TEST-RF-DCU-N0001,-1,0]
> [../../asyn/asynDriver/asynManager.c:1565] [TEST-RF-DCU-N0001,0x260d420,50] TEST-
> RF-DCU-N0001 addr -1 queueRequest priority 0 not lockHolder
> > 2018/03/01 16:29:14.902 [TEST-RF-DCU-N0001,-1,0]
> [../../asyn/asynDriver/asynManager.c:1576] [TEST-RF-DCU-N0001,0x260d420,50] TEST-
> RF-DCU-N0001 schedule queueRequest timeout
> > 2018/03/01 16:29:14.902 [TEST-RF-DCU-N0001,-1,0]
> [../../asyn/asynDriver/asynManager.c:886] [TEST-RF-DCU-N0001,0x260d420,50]
> asynManager::portThread port=TEST-RF-DCU-N0001 callback
> > 2018/03/01 16:29:14.902 [TEST-RF-DCU-N0001,-1,0]
> [../../asyn/drvAsynSerial/drvAsynIPPort.c:674] [TEST-RF-DCU-N0001,0x260d420,50]
> test-rf-dcu-n0001.cts:5001 read.
> > 2018/03/01 16:29:14.902 [TEST-RF-DCU-N0001,-1,0]
> [../../asyn/drvAsynSerial/drvAsynIPPort.c:744] [TEST-RF-DCU-N0001,0x260d420,50]
> test-rf-dcu-n0001.cts:5001 read 0
> > 2018/03/01 16:29:14.902 [TEST-RF-DCU-N0001,-1,0]
> [../../asyn/drvAsynSerial/drvAsynIPPort.c:203] [TEST-RF-DCU-N0001,0x260d420,50]
> Close test-rf-dcu-n0001.cts:5001 connection (fd 18): Read from broken connection
> > 2018/03/01 16:29:14.902 [TEST-RF-DCU-N0001,-1,0]
> [../../asyn/asynRecord/asynRecord.c:905] [TEST-RF-DCU-N0001,0x260d420,50] _TEST-
> RF-DCU-N0001:asyn: exception 0, asynExceptionConnect
> > 2018/03/01 16:29:14.902 [TEST-RF-DCU-N0001,-1,0]
> [../../asyn/miscellaneous/asynInterposeEos.c:235] [TEST-RF-DCU-N0001,0x260d420,50]
> TEST-RF-DCU-N0001 read
> >
> > 2018/03/01 16:29:14.902 [TEST-RF-DCU-N0001,-1,0]
> [../../asyn/drvAsynSerial/drvAsynIPPort.c:674] [TEST-RF-DCU-N0001,0x260d420,50]
> test-rf-dcu-n0001.cts:5001 read. (Repeats forever)
>
> Note that the next to the last line is from the asynPrintIO call in asynInterposeEos.c
>
> for (;;) {
> if ((peosPvt->inBufTail != peosPvt->inBufHead)) {
> ...
> status = peosPvt->poctet->read(peosPvt->octetPvt,
> pasynUser,peosPvt->inBuf,peosPvt->inBufSize,&thisRead,&eom);
> if(status==asynSuccess) {
> asynPrintIO(pasynUser,ASYN_TRACEIO_FILTER,peosPvt->inBuf,thisRead,
> "%s read\n",peosPvt->portName);
> /*
> * Read could have returned with ASYN_EOM_CNT set in eom because
> * the number of octets available exceeded inBufSize. This is not
> * a reason for us to stop reading.
> */
> eom &= ~ASYN_EOM_CNT;
> }
> if(status!=asynSuccess || thisRead==0) break;
> peosPvt->inBufTail = 0;
> peosPvt->inBufHead = (int)thisRead;
> }
>
> It is acting as if it is stuck in this loop which is what is causing your fast stream of error
> messages. The peosPvt->poctet->read() call above is what calls the drvAsynIPPort driver.
> However, I don't see how it could get stuck in the loop. If the read() call returns
> asynSuccess it prints the message which you saw once. If status is not asynSuccess then it
> breaks out of the loop. So either we should see the message from this function many times,
> or it should not call peosPvt->poctet->read() multiple times.
>
> I'd like to see some more information from this function. Can you please apply the
> following patch before you run the test again:
>
> diff --git a/asyn/miscellaneous/asynInterposeEos.c b/asyn/miscellaneous/asynInterposeEos.c
> index c9697ad..558c0a7 100644
> --- a/asyn/miscellaneous/asynInterposeEos.c
> +++ b/asyn/miscellaneous/asynInterposeEos.c
> @@ -232,13 +232,16 @@ static asynStatus readIt(void *ppvt,asynUser *pasynUser,
> pasynUser,peosPvt->inBuf,peosPvt->inBufSize,&thisRead,&eom);
> if(status==asynSuccess) {
> asynPrintIO(pasynUser,ASYN_TRACEIO_FILTER,peosPvt->inBuf,thisRead,
> - "%s read\n",peosPvt->portName);
> + "%s read %d bytes eom=%d\n",peosPvt->portName, thisRead, eom);
> /*
> * Read could have returned with ASYN_EOM_CNT set in eom because
> * the number of octets available exceeded inBufSize. This is not
> * a reason for us to stop reading.
> */
> eom &= ~ASYN_EOM_CNT;
> + } else {
> + asynPrint(pasynUser, ASYN_TRACE_WARNING, "%s read from low-level driver
> returned %d\n",
> + peosPvt->portName, status);
> }
> if(status!=asynSuccess || thisRead==0) break;
> peosPvt->inBufTail = 0;
>
> Mark
>
>
>
> >
> > -----Original Message-----
> > From: Mark Rivers [mailto:[email protected]]
> > Sent: Thursday, March 01, 2018 4:06 PM
> > To: Daykin, Evan <[email protected]>; Torsten Bögershausen
> > <[email protected]>; [email protected]
> > Subject: RE: Asyn not automatically reconnecting
> >
> > Can you post the database, protocol file, and asynTrace output with all asynTrace bits on?
> >
> > > -----Original Message-----
> > > From: Daykin, Evan [mailto:[email protected]]
> > > Sent: Thursday, March 01, 2018 3:04 PM
> > > To: Mark Rivers <[email protected]>; Torsten Bögershausen
> > > <[email protected]>; [email protected]
> > > Subject: RE: Asyn not automatically reconnecting
> > >
> > >
> > > I have tried a database with only a couple of records, and the behavior persists.
> > >
> > > -----Original Message-----
> > > From: Mark Rivers [mailto:[email protected]]
> > > Sent: Tuesday, February 27, 2018 4:57 PM
> > > To: Daykin, Evan <[email protected]>; Torsten Bögershausen
> > > <[email protected]>; [email protected]
> > > Subject: Re: Asyn not automatically reconnecting
> > >
> > > Can you try a very simple database with just one or two records
> > > talking to your server. See if that works when the server shuts down
> > > gracefully. If it does, but it fails with lots of records then we have a clue where to look.
> > >
> > >
> > > Mark
> > >
> > >
> > >
> > > ________________________________
> > > From: Daykin, Evan <[email protected]>
> > > Sent: Tuesday, February 27, 2018 3:53 PM
> > > To: Torsten Bögershausen; Mark Rivers; [email protected]
> > > Subject: RE: Asyn not automatically reconnecting
> > >
> > > >Sorry if this is the wrong question, but what is asyn 4.33-2 ?
> > > >"git tag" mentions R4-33, not "-2".
> > >
> > > That's the revision in our debian repo, it is asyn 4-33.
> > >
> > > > So from asyn everything seems to be OK.
> > > Maybe this is streamdevice and/or our vendor's software. We will investigate both
> > avenues.
> > >
> > > -----Original Message-----
> > > From: Torsten Bögershausen [mailto:[email protected]]
> > > Sent: Tuesday, February 27, 2018 3:37 PM
> > > To: Daykin, Evan <[email protected]>; Mark Rivers
> > > <[email protected]>; tech- [email protected]
> > > Subject: Re: Asyn not automatically reconnecting
> > >
> > >
> > >
> > > On 23/02/18 15:17, Daykin, Evan wrote:
> > > > Hi Mark,
> > > >
> > > > Thanks for the reply. We are using asyn 4.33-2 and Stream 2.6-3. The
> > > > protocol file has a ReplyTimeout of 2000ms.
> > >
> > > Sorry if this is the wrong question, but what is asyn 4.33-2 ?
> > > "git tag" mentions R4-33, not "-2".
> > >
> > > And to answer the other question:
> > > >Here's something interesting: asyn behaves correctly for us *only*
> > > when the >client application is killed with SIGKILL, rather than
> > > SIGTERM. This results >in TCP retransmissions and/or refused
> > > connections, then normal operation after >a short while. Is it possible asyn is
> > [un]intentionally not retrying the >connection if it is closed gracefully?
> > >
> > >
> > > The code that I see producing "Read from broken connection" is this:
> > >
> > > /* If recv() returns 0 on a SOCK_STREAM (TCP) socket, the connection has closed
> */
> > > if ((thisRead == 0) && (tty->socketType == SOCK_STREAM)) {
> > > epicsSnprintf(pasynUser->errorMessage,pasynUser->errorMessageSize,
> > > "%s connection closed",
> > > tty->IPDeviceName);
> > > closeConnection(pasynUser,tty,"Read from broken connection");
> > >
> > > ----------
> > > And then
> > > closeConnection(asynUser *pasynUser,ttyController_t *tty,const char *why) {
> > > asynPrint(pasynUser, ASYN_TRACE_FLOW,
> > > "Close %s connection (fd %d): %s\n", tty->IPDeviceName,
> > > tty->fd, why);
> > >
> > > So from asyn everything seems to be OK.
> > >
> > > It seems that the problem described here:
> > > >2018/02/23 09:51:47.089 test-rf-dcu-n0001.cts:5001 read.
> > > >The last line then repeats rapidly forever.
> > >
> > > is caused by the caller into asyn, which is streamdevice ?
> > > There seem to be some improvements in streamdevice, but I am not familiar with the
> code.
> > >
> > > Dirk, does there ring a bell ?
- References:
- Asyn not automatically reconnecting Daykin, Evan
- RE: Asyn not automatically reconnecting Daykin, Evan
- RE: Asyn not automatically reconnecting Daykin, Evan
- RE: Asyn not automatically reconnecting Daykin, Evan
- RE: Asyn not automatically reconnecting Daykin, Evan
- RE: Asyn not automatically reconnecting Mark Rivers
- Navigate by Date:
- Prev:
RE: Asyn not automatically reconnecting Daykin, Evan
- Next:
RE: Asyn not automatically reconnecting Daykin, Evan
- Index:
1994
1995
1996
1997
1998
1999
2000
2001
2002
2003
2004
2005
2006
2007
2008
2009
2010
2011
2012
2013
2014
2015
2016
2017
<2018>
2019
2020
2021
2022
2023
2024
- Navigate by Thread:
- Prev:
RE: Asyn not automatically reconnecting Mark Rivers
- Next:
RE: Asyn not automatically reconnecting Daykin, Evan
- Index:
1994
1995
1996
1997
1998
1999
2000
2001
2002
2003
2004
2005
2006
2007
2008
2009
2010
2011
2012
2013
2014
2015
2016
2017
<2018>
2019
2020
2021
2022
2023
2024
|