EPICS Controls Argonne National Laboratory

Experimental Physics and
Industrial Control System


Subject: RE: StreamDevice, prevent records from getting "stuck"
From: "Sobhani, Bayan via Tech-talk" <tech-talk at aps.anl.gov>
To: "Zimoch Dirk (PSI)" <dirk.zimoch at psi.ch>, Mark Rivers <rivers at cars.uchicago.edu>
Cc: "tech-talk at aps.anl.gov" <tech-talk at aps.anl.gov>
Date: Tue, 14 Apr 2020 13:45:55 +0000

Hi Dirk,

I see Mark forwarded you files.

I continued discussing this with Mark, and we concluded that there is probably an electrical problem. Some of these devices seem to send unsolicited characters. I think the PVs get stuck because the unsolicited characters keep coming in, so the PV never knows when to stop listening.

Because of the coronavirus situation, I currently only have remote access to the device, so I will have to wait until I can do tests like swapping out the Moxa.

Until I am able to do these tests, though, I have had fairly good success adding "MaxInput = 20;" to the top of the protocol file (this is in the file I am sending you but was not there before). This tells the StreamDevice input PVs to move on after receiving 20 characters, rather than continue listening for as long as characters keep coming in. Ideally this will only be a temporary solution.
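
To illustrate the reasoning above, here is a toy Python model of an input phase. It is only a sketch under stated assumptions and not how StreamDevice is actually implemented: the function names, the millisecond event model, and the "give up" cutoff are all hypothetical and exist only to keep the demonstration finite.

```python
import itertools

def read_reply(events, read_timeout_ms, max_input=None, give_up_after_ms=10_000):
    """Toy model of one input phase (hypothetical, not StreamDevice code).

    events: iterable of (arrival_time_ms, single_byte) pairs in time order.
    The read ends when the gap between two bytes exceeds read_timeout_ms,
    when max_input bytes have arrived, or when the stream is exhausted.
    give_up_after_ms only bounds the demonstration.
    """
    buf = bytearray()
    last_ms = None
    for t_ms, byte in events:
        if t_ms > give_up_after_ms:
            return bytes(buf), "still listening"
        if last_ms is not None and t_ms - last_ms > read_timeout_ms:
            return bytes(buf), "read timeout"
        buf += byte
        last_ms = t_ms
        if max_input is not None and len(buf) >= max_input:
            return bytes(buf), "max input reached"
    return bytes(buf), "read timeout"  # the line went quiet

def noisy_line(period_ms=50):
    """Assumed noise source: one unsolicited 0xFF byte every period_ms."""
    for i in itertools.count():
        yield i * period_ms, b"\xff"

if __name__ == "__main__":
    # Without a cap the read never sees a quiet gap and keeps listening.
    print(read_reply(noisy_line(), read_timeout_ms=100)[1])
    # With a cap of 20 bytes the read ends and the record can move on.
    print(read_reply(noisy_line(), read_timeout_ms=100, max_input=20)[1])
```

In this model the cap does not remove the noise; it only bounds how long a single read can be held open by it.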

Alex

-----Original Message-----
From: Zimoch Dirk (PSI) <dirk.zimoch at psi.ch> 
Sent: Tuesday, April 14, 2020 8:12 AM
To: Mark Rivers <rivers at cars.uchicago.edu>; Sobhani, Bayan <bsobhani at bnl.gov>
Cc: tech-talk at aps.anl.gov
Subject: AW: StreamDevice, prevent records from getting "stuck"

Hello Alex,

Sorry for replying so late, but our mail server had decided to block this conversation.

Mark, thanks for already taking care of this.


Alex, I don't know whether I have missed it, but have you already sent your database and protocol file? Without them everything is guesswork.

From the first mail you sent, it looks like the record that "got stuck" is simply in a very long timeout. I need to see your protocol file to say more.

I have also had problems using RS485 (2-wire) on some Moxa devices. The problem is that with 2-wire, the sender is always shouting into its own ears. The chip needs to disable receiving while it is sending, and it seems some Moxa devices (in particular newer ones) get that wrong. This seems to cause corrupted input. Maybe you want to try a different RS485 interface? There are some RS485-USB converters on the market.

Dirk


-----Original Message-----
From: Tech-talk <tech-talk-bounces at aps.anl.gov> On Behalf Of Mark Rivers via Tech-talk
Sent: Thursday, April 9, 2020 9:00 PM
To: 'Sobhani, Bayan' <bsobhani at bnl.gov>
Cc: tech-talk <tech-talk at aps.anl.gov>
Subject: RE: StreamDevice, prevent records from getting "stuck"


Please let the IOC boot and run for 30 seconds, and then send me the trace .txt file.

What model of terminal server are you using, and in what "mode" have you configured the port on the terminal server?  Can you open the Web interface to the terminal server and send a screenshot?

Mark


-----Original Message-----
From: Sobhani, Bayan <bsobhani at bnl.gov>
Sent: Thursday, April 9, 2020 12:50 PM
To: Mark Rivers <rivers at cars.uchicago.edu>
Cc: tech-talk <tech-talk at aps.anl.gov>
Subject: RE: StreamDevice, prevent records from getting "stuck"

The device does not have an Ethernet port. For communications it has a serial port, and this is connected to a Moxa terminal server. It is an ASCII device. The baud rate on the Moxa terminal server is set to 9600. The baud rate on the device should be set to 9600 as well, but I am not near the device to check. If the baud rate on the device did not match the terminal server, though, I would think that no command would go through at all, but for this device I am able to send commands and receive responses.

For example, the command to read temperature is:
*01X01

And the device responds with:
01X01025.2

The 25.2 at the end is the temperature in degrees Celsius.
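
As a rough illustration of that command/response format, here is a minimal Python sketch that pulls the temperature out of a reply. The reply layout (the command echoed without its leading "*", followed by the value) is an assumption inferred only from the single example above, and the function name is hypothetical.

```python
import re

def parse_temperature(reply: str, command: str = "*01X01") -> float:
    """Return the value that follows the echoed command in a reply.

    Assumption (from the one example in this message): the device echoes
    the command without the leading "*" and appends the reading.
    """
    echo = command.lstrip("*")
    match = re.fullmatch(re.escape(echo) + r"([+-]?\d+(?:\.\d+)?)", reply.strip())
    if match is None:
        # Anything else, including stray bytes around the reply, is rejected.
        raise ValueError(f"unexpected reply: {reply!r}")
    return float(match.group(1))

if __name__ == "__main__":
    print(parse_temperature("01X01025.2"))
```

A strict match like this rejects a reply that has stray characters mixed into it instead of silently returning a wrong number.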

I just tried telnetting into the device and typing this command, and it worked. I did, however, get a continuous stream of unwanted characters popping up randomly on the screen as I was typing the command. But after I hit enter, the response from the device was fine (01X01025.2). I did it again and got 01X01023.5, which also looks good.

Ideally, these devices would only respond to commands, and never send unsolicited characters. But some of these devices send unsolicited and unwanted characters for unknown reasons.

Temperature reading PVs work as well, but only when I first start the IOC.

For example, the value had been stuck at 25.1 for an hour, but after I restarted the IOC the value is now 22.1. So it looks like the communication sometimes works, but then the PVs get permanently "jammed" by the unwanted characters.

The question I have is whether it is technically possible at the IOC level to prevent these unwanted characters from permanently "jamming" the PVs.

Here is the output of the minimal st.cmd (I removed the other devices):

#!../../bin/linux-x86_64/omegaCNi32
## You may have to change omegaCNi32 to something else
## everywhere it appears in this file
< envPaths
epicsEnvSet("IOC","ioclocalhost")
epicsEnvSet("TOP","/epics/iocs/omega_i_series")
epicsEnvSet("ASYN","/epics/iocs/asyn")
epicsEnvSet("CALC","/epics/src/calc")
epicsEnvSet("STREAMDEVICE","/epics/iocs/StreamDevice")
epicsEnvSet("EPICS_BASE","/epics/base")
cd /epics/iocs/omega_i_series
## Register all support components
dbLoadDatabase("dbd/omegaCNi32.dbd",0,0)
omegaCNi32_registerRecordDeviceDriver(pdbbase)
## Streamdevice Protocol Path
epicsEnvSet ("STREAM_PROTOCOL_PATH", "protocols")
drvAsynIPPortConfigure("c3-tsrv1-p16","10.17.2.60:4016")
asynSetTraceIOMask c3-tsrv1-p16 0 TRACEIO_ESCAPE
asynSetTraceFile c3-tsrv1-p16 0 traceOuput.txt
asynSetTraceMask c3-tsrv1-p16 0 TRACE_ERROR|TRACEIO_DRIVER|TRACEIO_DEVICE|TRACE_FLOW
## Load record instances
dbLoadRecords("db/omegaCNi32_temp.db","Sys=XF:17ID-CT,Dev={RG:C3},Chan=01,N_T=:1,N_GAIN=,PORT=c3-tsrv1-p16")
iocInit()
Starting iocInit
############################################################################
## EPICS R7.0.3.2-DEV
## Rev. R7.0.3.1-102-g1d6fcd46d653934da0e2
############################################################################
cas warning: Configured TCP port was unavailable.
cas warning: Using dynamically assigned TCP port 36879,
cas warning: but now two or more servers share the same UDP port.
cas warning: Depending on your IP kernel this server may not be
cas warning: reachable with UDP unicast (a host's IP in EPICS_CA_ADDR_LIST)
iocRun: All initialization complete
dbl > /epics/iocs/omega_i_series/records.dbl
system "cp /epics/iocs/omega_i_series/records.dbl /cf-update/$HOSTNAME.$IOCNAME.dbl"
epics>

No other messages appear after this. The output of traceOutput.txt looks similar to before.

Alex

-----Original Message-----
From: Mark Rivers <rivers at cars.uchicago.edu>
Sent: Thursday, April 9, 2020 11:38 AM
To: Sobhani, Bayan <bsobhani at bnl.gov>
Cc: tech-talk <tech-talk at aps.anl.gov>
Subject: Re: StreamDevice, prevent records from getting "stuck"

OK, we are making progress.


Now I need more information.


What type of interface are you using to the device?  Does the device itself have an Ethernet port, or are you using a terminal server, with a serial connection to the device?


What type of communication does the device use, ASCII or binary?  You are getting single-character messages that are mostly \377, which is 8 bits that are all 1.  What type of messages do you expect?  Can you give an example?  The output you sent contained only those single non-ASCII characters.  Do you ever get normal ASCII characters?
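
For readers unfamiliar with the octal escapes that asynTrace prints with TRACEIO_ESCAPE, the following short Python sketch converts them to hex and binary. The helper name is hypothetical; the values \374 and \375 are included because they also appear in the trace quoted further down in this thread.

```python
def describe_octal_escape(esc: str) -> str:
    """Convert an octal escape such as '\\377' to hex and 8-bit binary."""
    value = int(esc.lstrip("\\"), 8)
    return f"{esc} = 0x{value:02X} = {value:08b}"

def is_printable_ascii(esc: str) -> bool:
    """True if the escaped byte lies in the printable ASCII range 0x20-0x7E."""
    return 0x20 <= int(esc.lstrip("\\"), 8) <= 0x7E

if __name__ == "__main__":
    for esc in ("\\377", "\\374", "\\375"):
        print(describe_octal_escape(esc), "printable:", is_printable_ascii(esc))
```

None of these three bytes is printable ASCII, which is consistent with them not being part of a normal reply from an ASCII device.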


If you are using a terminal server then I suspect you may have the baud rate or parity configured wrong.


I notice that asynTrace output shows only read operations, no write operations.  Does your device send unsolicited messages, or should it only answer when sent a command?


Please generate a minimal startup script that generates the problem and send that, along with the complete output when the IOC boots and shows the problem.


Mark



________________________________
From: Sobhani, Bayan <bsobhani at bnl.gov>
Sent: Thursday, April 9, 2020 9:57 AM
To: Mark Rivers
Cc: tech-talk
Subject: RE: StreamDevice, prevent records from getting "stuck"

I actually ran epicsMutexShowAll 10 before dumping that backtrace and I got:

ellCount(&mutexList) 282 ellCount(&freeList) 0
epicsMutexId 0x6c3a80 source ../../asyn/asynDriver/asynManager.c line 1975
epicsMutexId 0x6c8ae0 source ../../asyn/asynDriver/asynManager.c line 1975

I think there was also a third one that sometimes was there and sometimes was not, and had a longer ID. Note: there are 3 devices on this IOC.

Here are the last 100 lines of traceOutput:

bsobhani@xf17id1a-ioc1:/epics/iocs/omega_i_series$ tail traceOuput.txt -n 100
2020/04/09 10:55:09.768 10.17.2.60:4016 read 1
\377
2020/04/09 10:55:09.768 10.17.2.60:4016 read.
2020/04/09 10:55:10.068 10.17.2.60:4016 read 1
\377
2020/04/09 10:55:10.068 10.17.2.60:4016 read.
2020/04/09 10:55:10.135 10.17.2.60:4016 read 1
\377
2020/04/09 10:55:10.138 10.17.2.60:4016 read.
2020/04/09 10:55:10.185 10.17.2.60:4016 read 1
\377
2020/04/09 10:55:10.185 10.17.2.60:4016 read.
2020/04/09 10:55:11.119 10.17.2.60:4016 read 1
\377
2020/04/09 10:55:11.119 10.17.2.60:4016 read.
2020/04/09 10:55:11.135 10.17.2.60:4016 read 1
\377
2020/04/09 10:55:11.135 10.17.2.60:4016 read.
2020/04/09 10:55:11.402 10.17.2.60:4016 read 1
\377
2020/04/09 10:55:11.402 10.17.2.60:4016 read.
2020/04/09 10:55:11.519 10.17.2.60:4016 read 1
\377
2020/04/09 10:55:11.519 10.17.2.60:4016 read.
2020/04/09 10:55:11.589 c3-tsrv1-p16 asynManager:queueTimeoutCallback
2020/04/09 10:55:11.589 c3-tsrv1-p16 asynManager:queueTimeoutCallback
2020/04/09 10:55:11.589 c3-tsrv1-p16 asynManager:queueTimeoutCallback
2020/04/09 10:55:11.610 c3-tsrv1-p16 asynManager:queueTimeoutCallback
2020/04/09 10:55:11.886 10.17.2.60:4016 read 1
\377
2020/04/09 10:55:11.886 10.17.2.60:4016 read.
2020/04/09 10:55:11.969 10.17.2.60:4016 read 1
\377
2020/04/09 10:55:11.969 10.17.2.60:4016 read.
2020/04/09 10:55:12.019 10.17.2.60:4016 read 1
\377
2020/04/09 10:55:12.019 10.17.2.60:4016 read.
2020/04/09 10:55:12.069 10.17.2.60:4016 read 1
\377
2020/04/09 10:55:12.069 10.17.2.60:4016 read.
2020/04/09 10:55:12.402 10.17.2.60:4016 read 1
\377
2020/04/09 10:55:12.402 10.17.2.60:4016 read.
2020/04/09 10:55:12.586 10.17.2.60:4016 read 1
\377
2020/04/09 10:55:12.586 10.17.2.60:4016 read.
2020/04/09 10:55:12.619 10.17.2.60:4016 read 1
\374
2020/04/09 10:55:12.619 10.17.2.60:4016 read.
2020/04/09 10:55:12.669 10.17.2.60:4016 read 1
\377
2020/04/09 10:55:12.669 10.17.2.60:4016 read.
2020/04/09 10:55:12.886 10.17.2.60:4016 read 1
\377
2020/04/09 10:55:12.886 10.17.2.60:4016 read.
2020/04/09 10:55:13.069 10.17.2.60:4016 read 1
\377
2020/04/09 10:55:13.069 10.17.2.60:4016 read.
2020/04/09 10:55:13.153 10.17.2.60:4016 read 1
\377
2020/04/09 10:55:13.153 10.17.2.60:4016 read.
2020/04/09 10:55:13.186 10.17.2.60:4016 read 1
\377
2020/04/09 10:55:13.186 10.17.2.60:4016 read.
2020/04/09 10:55:13.803 10.17.2.60:4016 read 1
\377
2020/04/09 10:55:13.803 10.17.2.60:4016 read.
2020/04/09 10:55:14.020 10.17.2.60:4016 read 1
\377
2020/04/09 10:55:14.020 10.17.2.60:4016 read.
2020/04/09 10:55:14.119 10.17.2.60:4016 read 1
\375
2020/04/09 10:55:14.119 10.17.2.60:4016 read.
2020/04/09 10:55:14.169 10.17.2.60:4016 read 1
\377
2020/04/09 10:55:14.169 10.17.2.60:4016 read.
2020/04/09 10:55:14.353 10.17.2.60:4016 read 1
\377
2020/04/09 10:55:14.353 10.17.2.60:4016 read.
2020/04/09 10:55:14.386 10.17.2.60:4016 read 1
\377
2020/04/09 10:55:14.386 10.17.2.60:4016 read.
2020/04/09 10:55:14.453 10.17.2.60:4016 read 1
\377
2020/04/09 10:55:14.453 10.17.2.60:4016 read.
2020/04/09 10:55:14.594 c3-tsrv1-p16 addr -1 queueRequest priority 0 not lockHolder
2020/04/09 10:55:14.594 c3-tsrv1-p16 schedule queueRequest timeout in 2.000000 seconds
2020/04/09 10:55:14.594 c3-tsrv1-p16 addr -1 queueRequest priority 0 not lockHolder
2020/04/09 10:55:14.594 c3-tsrv1-p16 schedule queueRequest timeout in 2.000000 seconds
2020/04/09 10:55:14.594 c3-tsrv1-p16 addr -1 queueRequest priority 0 not lockHolder
2020/04/09 10:55:14.594 c3-tsrv1-p16 schedule queueRequest timeout in 2.000000 seconds
2020/04/09 10:55:15.019 10.17.2.60:4016 read 1
\377
2020/04/09 10:55:15.019 10.17.2.60:4016 read.
2020/04/09 10:55:15.070 10.17.2.60:4016 read 1
\377
2020/04/09 10:55:15.070 10.17.2.60:4016 read.
2020/04/09 10:55:15.786 10.17.2.60:4016 read 1
\377
2020/04/09 10:55:15.786 10.17.2.60:4016 read.

-----Original Message-----
From: Mark Rivers <rivers at cars.uchicago.edu>
Sent: Thursday, April 9, 2020 9:48 AM
To: Sobhani, Bayan <bsobhani at bnl.gov>
Cc: tech-talk <tech-talk at aps.anl.gov>
Subject: RE: StreamDevice, prevent records from getting "stuck"

Hi,

I'm glad you have updated to the latest versions; the version of asyn you were using previously is almost 6 years old and there have been many fixes.

When it gets "hung" now you did not send the output of

epicsMutexShowAll 1

Do you see the same thing as previously?

I notice that none of your threads is actually calling epicsMutexLock or similar, in fact the string "mutex" (case insensitive) does not appear in any of the thread backtraces.

One of the very first things to do is to monitor the entire communications with the device using asynTrace, sending the output to a file.  You should put the following in your startup script after creating the port:

asynSetTraceIOMask c3-tsrv1-p16 0 TRACEIO_ESCAPE
asynSetTraceFile c3-tsrv1-p16 0 traceOuput.txt
asynSetTraceMask c3-tsrv1-p16 0 TRACE_ERROR|TRACEIO_DRIVER|TRACEIO_DEVICE|TRACE_FLOW

That should show you what is happening.

Mark



-----Original Message-----
From: Sobhani, Bayan <bsobhani at bnl.gov>
Sent: Thursday, April 9, 2020 8:26 AM
To: Mark Rivers <rivers at cars.uchicago.edu>
Cc: tech-talk <tech-talk at aps.anl.gov>
Subject: RE: StreamDevice, prevent records from getting "stuck"

The versions were:

Asyn Version: 4.23
StreamDevice: 2.6.0
Base: 3.14

But I upgraded these to the newest versions from GitHub (asyn 4.39, base 7, StreamDevice 2.8.12).

I am still having similar problems although I am noticing that things are a little bit different in subtle ways.

Anyway, one thing I noticed is that the stuck PVs seem to have TIMEOUT INVALID alarms. If I camonitor a PV with a TIMEOUT INVALID alarm, it's as if the PV isn't being processed. When I restart the IOC, the PVs read new values from the device once, and then get stuck again until the next time I restart the IOC.

The backtrace contains 26 threads. I do not know how to begin to tell which one is the relevant thread. I will paste the backtrace here:

Thread 26 (Thread 0x7ffff40d8700 (LWP 11726)):
#0  0x00007ffff64abe5d in nanosleep () from /lib/x86_64-linux-gnu/libc.so.6
#1  0x00007ffff6f6ed42 in epicsThreadSleep () from /epics/base/lib/linux-x86_64/libCom.so.3.17.7
#2  0x00007ffff746e19b in rsrv_online_notify_task () from /epics/base/lib/linux-x86_64/libdbCore.so.3.17.0
#3  0x00007ffff6f6dbbc in start_routine () from /epics/base/lib/linux-x86_64/libCom.so.3.17.7
#4  0x00007ffff61e8b50 in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
#5  0x00007ffff64dbfbd in clone () from /lib/x86_64-linux-gnu/libc.so.6
#6  0x0000000000000000 in ?? ()

Thread 25 (Thread 0x7fffef0f7700 (LWP 11725)):
#0  0x00007ffff64dce73 in recvfrom () from /lib/x86_64-linux-gnu/libc.so.6
#1  0x00007ffff746dd12 in cast_server () from /epics/base/lib/linux-x86_64/libdbCore.so.3.17.0
#2  0x00007ffff6f6dbbc in start_routine () from /epics/base/lib/linux-x86_64/libCom.so.3.17.7
#3  0x00007ffff61e8b50 in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
#4  0x00007ffff64dbfbd in clone () from /lib/x86_64-linux-gnu/libc.so.6
#5  0x0000000000000000 in ?? ()

Thread 24 (Thread 0x7fffef1f8700 (LWP 11724)):
#0  0x00007ffff64dcc0d in accept () from /lib/x86_64-linux-gnu/libc.so.6
#1  0x00007ffff6f6c1fa in epicsSocketAccept () from /epics/base/lib/linux-x86_64/libCom.so.3.17.7
#2  0x00007ffff7468ea3 in req_server () from /epics/base/lib/linux-x86_64/libdbCore.so.3.17.0
#3  0x00007ffff6f6dbbc in start_routine () from /epics/base/lib/linux-x86_64/libCom.so.3.17.7
#4  0x00007ffff61e8b50 in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
#5  0x00007ffff64dbfbd in clone () from /lib/x86_64-linux-gnu/libc.so.6
#6  0x0000000000000000 in ?? ()

Thread 23 (Thread 0x7fffef3f9700 (LWP 11723)):
#0  0x00007ffff61ed6bb in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib/x86_64-linux-gnu/libpthread.so.0
#1  0x00007ffff6f70753 in epicsEventWaitWithTimeout () from /epics/base/lib/linux-x86_64/libCom.so.3.17.7
#2  0x00007ffff743e093 in periodicTask () from /epics/base/lib/linux-x86_64/libdbCore.so.3.17.0
#3  0x00007ffff6f6dbbc in start_routine () from /epics/base/lib/linux-x86_64/libCom.so.3.17.7
#4  0x00007ffff61e8b50 in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
#5  0x00007ffff64dbfbd in clone () from /lib/x86_64-linux-gnu/libc.so.6
#6  0x0000000000000000 in ?? ()

Thread 22 (Thread 0x7fffef5fa700 (LWP 11722)):
#0  0x00007ffff61ed6bb in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib/x86_64-linux-gnu/libpthread.so.0
#1  0x00007ffff6f70753 in epicsEventWaitWithTimeout () from /epics/base/lib/linux-x86_64/libCom.so.3.17.7
#2  0x00007ffff743e093 in periodicTask () from /epics/base/lib/linux-x86_64/libdbCore.so.3.17.0
#3  0x00007ffff6f6dbbc in start_routine () from /epics/base/lib/linux-x86_64/libCom.so.3.17.7
#4  0x00007ffff61e8b50 in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
#5  0x00007ffff64dbfbd in clone () from /lib/x86_64-linux-gnu/libc.so.6
#6  0x0000000000000000 in ?? ()

Thread 21 (Thread 0x7fffef7fb700 (LWP 11721)):
#0  0x00007ffff61ed6bb in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib/x86_64-linux-gnu/libpthread.so.0
#1  0x00007ffff6f70753 in epicsEventWaitWithTimeout () from /epics/base/lib/linux-x86_64/libCom.so.3.17.7
#2  0x00007ffff743e093 in periodicTask () from /epics/base/lib/linux-x86_64/libdbCore.so.3.17.0
#3  0x00007ffff6f6dbbc in start_routine () from /epics/base/lib/linux-x86_64/libCom.so.3.17.7
#4  0x00007ffff61e8b50 in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
#5  0x00007ffff64dbfbd in clone () from /lib/x86_64-linux-gnu/libc.so.6
#6  0x0000000000000000 in ?? ()

Thread 20 (Thread 0x7fffef9fc700 (LWP 11720)):
#0  0x00007ffff61ed6bb in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib/x86_64-linux-gnu/libpthread.so.0
#1  0x00007ffff6f70753 in epicsEventWaitWithTimeout () from /epics/base/lib/linux-x86_64/libCom.so.3.17.7
#2  0x00007ffff743e093 in periodicTask () from /epics/base/lib/linux-x86_64/libdbCore.so.3.17.0
#3  0x00007ffff6f6dbbc in start_routine () from /epics/base/lib/linux-x86_64/libCom.so.3.17.7
#4  0x00007ffff61e8b50 in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
#5  0x00007ffff64dbfbd in clone () from /lib/x86_64-linux-gnu/libc.so.6
#6  0x0000000000000000 in ?? ()

Thread 19 (Thread 0x7fffefbfd700 (LWP 11719)):
#0  0x00007ffff61ed6bb in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib/x86_64-linux-gnu/libpthread.so.0
#1  0x00007ffff6f70753 in epicsEventWaitWithTimeout () from /epics/base/lib/linux-x86_64/libCom.so.3.17.7
#2  0x00007ffff743e093 in periodicTask () from /epics/base/lib/linux-x86_64/libdbCore.so.3.17.0
#3  0x00007ffff6f6dbbc in start_routine () from /epics/base/lib/linux-x86_64/libCom.so.3.17.7
#4  0x00007ffff61e8b50 in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
#5  0x00007ffff64dbfbd in clone () from /lib/x86_64-linux-gnu/libc.so.6
#6  0x0000000000000000 in ?? ()

Thread 18 (Thread 0x7fffefdfe700 (LWP 11718)):
#0  0x00007ffff61ed6bb in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib/x86_64-linux-gnu/libpthread.so.0
#1  0x00007ffff6f70753 in epicsEventWaitWithTimeout () from /epics/base/lib/linux-x86_64/libCom.so.3.17.7
#2  0x00007ffff743e093 in periodicTask () from /epics/base/lib/linux-x86_64/libdbCore.so.3.17.0
#3  0x00007ffff6f6dbbc in start_routine () from /epics/base/lib/linux-x86_64/libCom.so.3.17.7
#4  0x00007ffff61e8b50 in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
#5  0x00007ffff64dbfbd in clone () from /lib/x86_64-linux-gnu/libc.so.6
#6  0x0000000000000000 in ?? ()

Thread 17 (Thread 0x7fffeffff700 (LWP 11717)):
#0  0x00007ffff61ed6bb in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib/x86_64-linux-gnu/libpthread.so.0
#1  0x00007ffff6f70753 in epicsEventWaitWithTimeout () from /epics/base/lib/linux-x86_64/libCom.so.3.17.7
#2  0x00007ffff743e093 in periodicTask () from /epics/base/lib/linux-x86_64/libdbCore.so.3.17.0
#3  0x00007ffff6f6dbbc in start_routine () from /epics/base/lib/linux-x86_64/libCom.so.3.17.7
#4  0x00007ffff61e8b50 in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
#5  0x00007ffff64dbfbd in clone () from /lib/x86_64-linux-gnu/libc.so.6
#6  0x0000000000000000 in ?? ()

Thread 16 (Thread 0x7ffff42d9700 (LWP 11716)):
#0  0x00007ffff61ed344 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/x86_64-linux-gnu/libpthread.so.0
#1  0x00007ffff6f705e6 in epicsEventWait () from /epics/base/lib/linux-x86_64/libCom.so.3.17.7
#2  0x00007ffff6f69159 in epicsEventMustWait () from /epics/base/lib/linux-x86_64/libCom.so.3.17.7
#3  0x00007ffff743df3f in onceTask () from /epics/base/lib/linux-x86_64/libdbCore.so.3.17.0
#4  0x00007ffff6f6dbbc in start_routine () from /epics/base/lib/linux-x86_64/libCom.so.3.17.7
#5  0x00007ffff61e8b50 in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
#6  0x00007ffff64dbfbd in clone () from /lib/x86_64-linux-gnu/libc.so.6
#7  0x0000000000000000 in ?? ()

Thread 15 (Thread 0x7ffff43da700 (LWP 11715)):
#0  0x00007ffff61ed6bb in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib/x86_64-linux-gnu/libpthread.so.0
#1  0x00007ffff6f70753 in epicsEventWaitWithTimeout () from /epics/base/lib/linux-x86_64/libCom.so.3.17.7
#2  0x00007ffff6f6902c in epicsEvent::wait(double) () from /epics/base/lib/linux-x86_64/libCom.so.3.17.7
#3  0x00007ffff6f79ee1 in timerQueueActive::run() () from /epics/base/lib/linux-x86_64/libCom.so.3.17.7
#4  0x00007ffff6f68369 in epicsThreadCallEntryPoint () from /epics/base/lib/linux-x86_64/libCom.so.3.17.7
#5  0x00007ffff6f6dbbc in start_routine () from /epics/base/lib/linux-x86_64/libCom.so.3.17.7
#6  0x00007ffff61e8b50 in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
#7  0x00007ffff64dbfbd in clone () from /lib/x86_64-linux-gnu/libc.so.6
#8  0x0000000000000000 in ?? ()

Thread 14 (Thread 0x7ffff45db700 (LWP 11714)):
#0  0x00007ffff61ed344 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/x86_64-linux-gnu/libpthread.so.0
#1  0x00007ffff6f705e6 in epicsEventWait () from /epics/base/lib/linux-x86_64/libCom.so.3.17.7
#2  0x00007ffff6f69159 in epicsEventMustWait () from /epics/base/lib/linux-x86_64/libCom.so.3.17.7
#3  0x00007ffff7449c8c in dbCaTask () from /epics/base/lib/linux-x86_64/libdbCore.so.3.17.0
#4  0x00007ffff6f6dbbc in start_routine () from /epics/base/lib/linux-x86_64/libCom.so.3.17.7
#5  0x00007ffff61e8b50 in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
#6  0x00007ffff64dbfbd in clone () from /lib/x86_64-linux-gnu/libc.so.6
#7  0x0000000000000000 in ?? ()

Thread 13 (Thread 0x7ffff47dc700 (LWP 11713)):
#0  0x00007ffff61ed344 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/x86_64-linux-gnu/libpthread.so.0
#1  0x00007ffff6f705e6 in epicsEventWait () from /epics/base/lib/linux-x86_64/libCom.so.3.17.7
#2  0x00007ffff6f69159 in epicsEventMustWait () from /epics/base/lib/linux-x86_64/libCom.so.3.17.7
#3  0x00007ffff7447798 in callbackTask () from /epics/base/lib/linux-x86_64/libdbCore.so.3.17.0
#4  0x00007ffff6f6dbbc in start_routine () from /epics/base/lib/linux-x86_64/libCom.so.3.17.7
#5  0x00007ffff61e8b50 in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
#6  0x00007ffff64dbfbd in clone () from /lib/x86_64-linux-gnu/libc.so.6
#7  0x0000000000000000 in ?? ()

Thread 12 (Thread 0x7ffff49dd700 (LWP 11712)):
#0  0x00007ffff61ed344 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/x86_64-linux-gnu/libpthread.so.0
#1  0x00007ffff6f705e6 in epicsEventWait () from /epics/base/lib/linux-x86_64/libCom.so.3.17.7
#2  0x00007ffff6f69159 in epicsEventMustWait () from /epics/base/lib/linux-x86_64/libCom.so.3.17.7
#3  0x00007ffff7447798 in callbackTask () from /epics/base/lib/linux-x86_64/libdbCore.so.3.17.0
#4  0x00007ffff6f6dbbc in start_routine () from /epics/base/lib/linux-x86_64/libCom.so.3.17.7
#5  0x00007ffff61e8b50 in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
#6  0x00007ffff64dbfbd in clone () from /lib/x86_64-linux-gnu/libc.so.6
#7  0x0000000000000000 in ?? ()

Thread 11 (Thread 0x7ffff4bde700 (LWP 11711)):
#0  0x00007ffff61ed344 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/x86_64-linux-gnu/libpthread.so.0
#1  0x00007ffff6f705e6 in epicsEventWait () from /epics/base/lib/linux-x86_64/libCom.so.3.17.7
#2  0x00007ffff6f69159 in epicsEventMustWait () from /epics/base/lib/linux-x86_64/libCom.so.3.17.7
#3  0x00007ffff7447798 in callbackTask () from /epics/base/lib/linux-x86_64/libdbCore.so.3.17.0
#4  0x00007ffff6f6dbbc in start_routine () from /epics/base/lib/linux-x86_64/libCom.so.3.17.7
#5  0x00007ffff61e8b50 in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
#6  0x00007ffff64dbfbd in clone () from /lib/x86_64-linux-gnu/libc.so.6
#7  0x0000000000000000 in ?? ()

Thread 10 (Thread 0x7ffff4cdf700 (LWP 11710)):
#0  0x00007ffff61ed6bb in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib/x86_64-linux-gnu/libpthread.so.0
#1  0x00007ffff6f70753 in epicsEventWaitWithTimeout () from /epics/base/lib/linux-x86_64/libCom.so.3.17.7
#2  0x00007ffff6f6902c in epicsEvent::wait(double) () from /epics/base/lib/linux-x86_64/libCom.so.3.17.7
#3  0x00007ffff6f79ee1 in timerQueueActive::run() () from /epics/base/lib/linux-x86_64/libCom.so.3.17.7
#4  0x00007ffff6f68369 in epicsThreadCallEntryPoint () from /epics/base/lib/linux-x86_64/libCom.so.3.17.7
#5  0x00007ffff6f6dbbc in start_routine () from /epics/base/lib/linux-x86_64/libCom.so.3.17.7
#6  0x00007ffff61e8b50 in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
#7  0x00007ffff64dbfbd in clone () from /lib/x86_64-linux-gnu/libc.so.6
#8  0x0000000000000000 in ?? ()

Thread 9 (Thread 0x7ffff4de0700 (LWP 11709)):
#0  0x00007ffff64d1363 in poll () from /lib/x86_64-linux-gnu/libc.so.6
#1  0x00007ffff7b6b9a4 in readIt () from /epics/iocs/asyn/lib/linux-x86_64/libasyn.so
#2  0x00007ffff7b76217 in readIt () from /epics/iocs/asyn/lib/linux-x86_64/libasyn.so
#3  0x00007ffff7b7ec87 in readIt () from /epics/iocs/asyn/lib/linux-x86_64/libasyn.so
#4  0x00007ffff7904d1b in AsynDriverInterface::readHandler (this=0x708ab0) at ../AsynDriverInterface.cc:957
#5  0x00007ffff7907044 in AsynDriverInterface::handleRequest (this=0x708ab0) at ../AsynDriverInterface.cc:1513
#6  0x00007ffff7907a87 in AsynDriverInterface::handleRequest (pasynUser=0x708c88) at ../AsynDriverInterface.cc:246
#7  0x00007ffff7b62221 in portThread () from /epics/iocs/asyn/lib/linux-x86_64/libasyn.so
#8  0x00007ffff6f6dbbc in start_routine () from /epics/base/lib/linux-x86_64/libCom.so.3.17.7
#9  0x00007ffff61e8b50 in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
#10 0x00007ffff64dbfbd in clone () from /lib/x86_64-linux-gnu/libc.so.6
#11 0x0000000000000000 in ?? ()

Thread 8 (Thread 0x7ffff4ee1700 (LWP 11708)):
#0  0x00007ffff61ed344 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/x86_64-linux-gnu/libpthread.so.0
#1  0x00007ffff6f705e6 in epicsEventWait () from /epics/base/lib/linux-x86_64/libCom.so.3.17.7
#2  0x00007ffff6f69159 in epicsEventMustWait () from /epics/base/lib/linux-x86_64/libCom.so.3.17.7
#3  0x00007ffff7b61c5c in portThread () from /epics/iocs/asyn/lib/linux-x86_64/libasyn.so
#4  0x00007ffff6f6dbbc in start_routine () from /epics/base/lib/linux-x86_64/libCom.so.3.17.7
#5  0x00007ffff61e8b50 in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
#6  0x00007ffff64dbfbd in clone () from /lib/x86_64-linux-gnu/libc.so.6
#7  0x0000000000000000 in ?? ()

Thread 7 (Thread 0x7ffff4f62700 (LWP 11707)):
#0  0x00007ffff61ed6bb in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib/x86_64-linux-gnu/libpthread.so.0
#1  0x00007ffff6f70753 in epicsEventWaitWithTimeout () from /epics/base/lib/linux-x86_64/libCom.so.3.17.7
#2  0x00007ffff6f76dd4 in twdTask () from /epics/base/lib/linux-x86_64/libCom.so.3.17.7
#3  0x00007ffff6f6dbbc in start_routine () from /epics/base/lib/linux-x86_64/libCom.so.3.17.7
#4  0x00007ffff61e8b50 in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
#5  0x00007ffff64dbfbd in clone () from /lib/x86_64-linux-gnu/libc.so.6
#6  0x0000000000000000 in ?? ()

Thread 6 (Thread 0x7ffff5063700 (LWP 11706)):
#0  0x00007ffff64d1363 in poll () from /lib/x86_64-linux-gnu/libc.so.6
#1  0x00007ffff7b6b9a4 in readIt () from /epics/iocs/asyn/lib/linux-x86_64/libasyn.so
#2  0x00007ffff7b76217 in readIt () from /epics/iocs/asyn/lib/linux-x86_64/libasyn.so
#3  0x00007ffff7b7ec87 in readIt () from /epics/iocs/asyn/lib/linux-x86_64/libasyn.so
#4  0x00007ffff7904d1b in AsynDriverInterface::readHandler (this=0x6fe860) at ../AsynDriverInterface.cc:957
#5  0x00007ffff7907044 in AsynDriverInterface::handleRequest (this=0x6fe860) at ../AsynDriverInterface.cc:1513
#6  0x00007ffff7907a87 in AsynDriverInterface::handleRequest (pasynUser=0x6fea38) at ../AsynDriverInterface.cc:246
#7  0x00007ffff7b62221 in portThread () from /epics/iocs/asyn/lib/linux-x86_64/libasyn.so
#8  0x00007ffff6f6dbbc in start_routine () from /epics/base/lib/linux-x86_64/libCom.so.3.17.7
#9  0x00007ffff61e8b50 in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
#10 0x00007ffff64dbfbd in clone () from /lib/x86_64-linux-gnu/libc.so.6
#11 0x0000000000000000 in ?? ()

Thread 5 (Thread 0x7ffff5164700 (LWP 11705)):
#0  0x00007ffff61ed6bb in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib/x86_64-linux-gnu/libpthread.so.0
#1  0x00007ffff6f70753 in epicsEventWaitWithTimeout () from /epics/base/lib/linux-x86_64/libCom.so.3.17.7
#2  0x00007ffff6f6902c in epicsEvent::wait(double) () from /epics/base/lib/linux-x86_64/libCom.so.3.17.7
#3  0x00007ffff6f79ee1 in timerQueueActive::run() () from /epics/base/lib/linux-x86_64/libCom.so.3.17.7
#4  0x00007ffff6f68369 in epicsThreadCallEntryPoint () from /epics/base/lib/linux-x86_64/libCom.so.3.17.7
#5  0x00007ffff6f6dbbc in start_routine () from /epics/base/lib/linux-x86_64/libCom.so.3.17.7
#6  0x00007ffff61e8b50 in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
#7  0x00007ffff64dbfbd in clone () from /lib/x86_64-linux-gnu/libc.so.6
#8  0x0000000000000000 in ?? ()

Thread 4 (Thread 0x7ffff7f5c700 (LWP 11704)):
#0  0x00007ffff61ed6bb in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib/x86_64-linux-gnu/libpthread.so.0
#1  0x00007ffff6f70753 in epicsEventWaitWithTimeout () from /epics/base/lib/linux-x86_64/libCom.so.3.17.7
#2  0x00007ffff6f6902c in epicsEvent::wait(double) () from /epics/base/lib/linux-x86_64/libCom.so.3.17.7
#3  0x00007ffff6f79ee1 in timerQueueActive::run() () from /epics/base/lib/linux-x86_64/libCom.so.3.17.7
#4  0x00007ffff6f68369 in epicsThreadCallEntryPoint () from /epics/base/lib/linux-x86_64/libCom.so.3.17.7
#5  0x00007ffff6f6dbbc in start_routine () from /epics/base/lib/linux-x86_64/libCom.so.3.17.7
#6  0x00007ffff61e8b50 in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
#7  0x00007ffff64dbfbd in clone () from /lib/x86_64-linux-gnu/libc.so.6
#8  0x0000000000000000 in ?? ()

Thread 3 (Thread 0x7ffff7fdd700 (LWP 11702)):
#0  0x00007ffff61ed344 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/x86_64-linux-gnu/libpthread.so.0
#1  0x00007ffff6f705e6 in epicsEventWait () from /epics/base/lib/linux-x86_64/libCom.so.3.17.7
#2  0x00007ffff6f69159 in epicsEventMustWait () from /epics/base/lib/linux-x86_64/libCom.so.3.17.7
#3  0x00007ffff6f56f65 in errlogThread () from /epics/base/lib/linux-x86_64/libCom.so.3.17.7
#4  0x00007ffff6f6dbbc in start_routine () from /epics/base/lib/linux-x86_64/libCom.so.3.17.7
#5  0x00007ffff61e8b50 in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
#6  0x00007ffff64dbfbd in clone () from /lib/x86_64-linux-gnu/libc.so.6
#7  0x0000000000000000 in ?? ()

Thread 1 (Thread 0x7ffff7fdf720 (LWP 11698)):
#0  0x00007ffff64cffdd in read () from /lib/x86_64-linux-gnu/libc.so.6
#1  0x00007ffff5fc6c21 in rl_getc () from /lib/x86_64-linux-gnu/libreadline.so.6
#2  0x00007ffff5fc73df in rl_read_key () from /lib/x86_64-linux-gnu/libreadline.so.6
#3  0x00007ffff5fb1ff1 in readline_internal_char () from /lib/x86_64-linux-gnu/libreadline.so.6
#4  0x00007ffff5fb2535 in readline () from /lib/x86_64-linux-gnu/libreadline.so.6
#5  0x00007ffff6f6d072 in epicsReadline () from /epics/base/lib/linux-x86_64/libCom.so.3.17.7
#6  0x00007ffff6f5d99a in iocshBody () from /epics/base/lib/linux-x86_64/libCom.so.3.17.7
#7  0x00000000004079b6 in main ()

-----Original Message-----
From: Mark Rivers <rivers at cars.uchicago.edu>
Sent: Tuesday, April 7, 2020 10:42 AM
To: Sobhani, Bayan <bsobhani at bnl.gov>
Cc: tech-talk <tech-talk at aps.anl.gov>
Subject: RE: StreamDevice, prevent records from getting "stuck"

That synchronousLock:Yes probably is an indicator of a problem.  It could be a deadlock, or it could be a function call that hangs forever with that lock held.

What version of asyn are you using?  If it is an old version please first update to a new version to make sure this problem has not already been fixed.

At this point the best way to debug it is with gdb.  When it hangs, interrupt the IOC with ^C and then dump the traceback for all threads with

(gdb) thread apply all backtrace

You then need to see what thread is waiting for that mutex.

Mark

-----Original Message-----
From: Sobhani, Bayan <bsobhani at bnl.gov>
Sent: Tuesday, April 7, 2020 9:34 AM
To: Mark Rivers <rivers at cars.uchicago.edu>
Cc: tech-talk <tech-talk at aps.anl.gov>
Subject: RE: StreamDevice, prevent records from getting "stuck"

I noticed a difference between when it is stuck and when it is not. When it is stuck, asynReport says "synchronousLock:Yes"; when it is not stuck, it says "synchronousLock:No".

Does this mean anything?

When it is stuck epicsMutexShowAll gives me the following:

epics> epicsMutexShowAll, 1
ellCount(&mutexList) 157 ellCount(&freeList) 0
epicsMutexId 0x74ae00 source ../../asyn/asynDriver/asynManager.c line 1911

But when it is not stuck I just get:

epics> epicsMutexShowAll, 1
ellCount(&mutexList) 157 ellCount(&freeList) 0

The PV got stuck again today for several minutes, so I will post the asynReport message here (to me it looks the same as what I posted yesterday, except for the higher number of characters read):

c3-tsrv1-p16 multiDevice:No canBlock:Yes autoConnect:Yes
    enabled:Yes connected:Yes numberConnects 1
    nDevices 0 nQueued 1 blocked:No
    asynManagerLock:No synchronousLock:Yes
    exceptionActive:No exceptionUsers 1 exceptionNotifys 0
    traceMask:0x1 traceIOMask:0x0 traceInfoMask:0x1
    interposeInterfaceList
        asynOctet pinterface 0x7f0e81723a40 drvPvt 0x108da20
    interfaceList
        asynCommon pinterface 0x7f0e81720d20 drvPvt 0x108cb10
        asynOctet pinterface 0x108cbf8 drvPvt 0x108cb10
    Port 10.17.2.60:4016: Connected
                    fd: 5
    Characters written: 66
       Characters read: 4650

Here is a message from when it is not stuck:

epics> asynReport 10 c3-tsrv1-p16
c3-tsrv1-p16 multiDevice:No canBlock:Yes autoConnect:Yes
    enabled:Yes connected:Yes numberConnects 1
    nDevices 0 nQueued 0 blocked:No
    asynManagerLock:No synchronousLock:No
    exceptionActive:No exceptionUsers 1 exceptionNotifys 0
    traceMask:0x1 traceIOMask:0x0 traceInfoMask:0x1
    interposeInterfaceList
        asynOctet pinterface 0x7f6015129a40 drvPvt 0x765910
    interfaceList
        asynCommon pinterface 0x7f6015126d20 drvPvt 0x764a00
        asynOctet pinterface 0x764ae8 drvPvt 0x764a00
    Port 10.17.2.60:4016: Connected
                    fd: 5
    Characters written: 1833
       Characters read: 4159
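
The two reports above differ mainly in synchronousLock and nQueued. The following is a minimal sketch of a check that could be run on captured asynReport text to flag the "stuck" pattern seen in this thread. parse_status and looks_stuck are hypothetical helper names, and the rule (lock held while requests are queued) is an assumption drawn only from the reports shown here.

```python
import re

# Status fields of interest, as they appear in the asynReport text above.
FIELDS = ("connected", "nQueued", "blocked", "asynManagerLock", "synchronousLock")


def parse_status(report):
    """Extract selected fields (as strings) from asynReport output."""
    fields = {}
    for name in FIELDS:
        # Fields appear either as "name:Value" or as "name Value".
        m = re.search(r"\b%s[: ]\s*(\w+)" % name, report)
        if m:
            fields[name] = m.group(1)
    return fields


def looks_stuck(fields):
    """Heuristic: synchronous lock held while requests wait in the queue."""
    return (
        fields.get("synchronousLock") == "Yes"
        and int(fields.get("nQueued", 0)) > 0
    )
```

A single snapshot is not conclusive, since the lock may also be held briefly during normal I/O. Taking several reports a few seconds apart and checking whether looks_stuck stays true is likely a more reliable signal.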

-----Original Message-----
From: Mark Rivers <rivers at cars.uchicago.edu>
Sent: Monday, April 6, 2020 9:14 PM
To: Sobhani, Bayan <bsobhani at bnl.gov>
Cc: tech-talk <tech-talk at aps.anl.gov>
Subject: Re: StreamDevice, prevent records from getting "stuck"


When it is really "hung up", run the asynReport.  You can also use the asynRecord to manually try to communicate and see if you get a response.


________________________________
From: Sobhani, Bayan <bsobhani at bnl.gov>
Sent: Monday, April 6, 2020 6:48 PM
To: Mark Rivers
Cc: tech-talk
Subject: Re: StreamDevice, prevent records from getting "stuck"

Maybe nQueued is 5 because there are multiple PVs associated with each device.

These are RS-485 temperature controllers that are connected to a Moxa. I am not sure if "noise" is the correct word, but some of these devices seem to be prone to sending random characters at times. For some of these devices, changing the cabling seems to help. Other times it seems to be the device itself.


________________________________
From: Mark Rivers <rivers at cars.uchicago.edu>
Sent: Monday, April 6, 2020 7:35:29 PM
To: Sobhani, Bayan <bsobhani at bnl.gov>
Cc: tech-talk <tech-talk at aps.anl.gov>
Subject: RE: StreamDevice, prevent records from getting "stuck"

Note that asynReport said "nQueued 5".  That suggests the port was busy while 5 more I/O requests were queued up.

This looks like a device that is not responding.

Your original message said
    `sometimes when the signal is noisy, the PV gets "stuck"`

What do you mean by "signal is noisy"?  The TCP messages should not be "noisy".

Mark


-----Original Message-----
From: Sobhani, Bayan <bsobhani at bnl.gov>
Sent: Monday, April 6, 2020 5:57 PM
To: Mark Rivers <rivers at cars.uchicago.edu>
Cc: tech-talk <tech-talk at aps.anl.gov>
Subject: RE: StreamDevice, prevent records from getting "stuck"

The command I use to configure the port is drvAsynIPPortConfigure. I had some trouble reproducing the problem, but after a few IOC restarts I got the PVs stuck again.

asynReport 10 PortName gives the following:

c3-tsrv1-p16 multiDevice:No canBlock:Yes autoConnect:Yes
    enabled:Yes connected:Yes numberConnects 1
    nDevices 0 nQueued 5 blocked:No
    asynManagerLock:No synchronousLock:Yes
    exceptionActive:No exceptionUsers 1 exceptionNotifys 0
    traceMask:0x1 traceIOMask:0x0 traceInfoMask:0x1
    interposeInterfaceList
        asynOctet pinterface 0x7f3dff3a9a40 drvPvt 0xe99ad0
    interfaceList
        asynCommon pinterface 0x7f3dff3a6d20 drvPvt 0xe98bc0
        asynOctet pinterface 0xe98ca8 drvPvt 0xe98bc0
    Port 10.17.2.60:4016: Connected
                    fd: 5
    Characters written: 117
       Characters read: 227

After running this, a few seconds later I got the message:

"2020/04/06 18:45:23.072801 c3-tsrv1-p16 XF:17ID-CT{RG:C3}T:1-SP: Timeout after reading 176 bytes "...<ff><ff><ff><ff><ff><ff><ff><ff><ff><ff><ff><ff><ff><ff><ff><ff><ff><ff><ff><ff>""

And then the PV started reading data again. I am not sure if the asynReport command is somehow fixing this or if it is just a coincidence.

Alex
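
The timeout message above shows a read that ran to 176 bytes, mostly 0xFF, before giving up. One possible reading is that a read which ends only on a pause between bytes can be kept alive by a steady stream of unsolicited characters, which is what the MaxInput setting mentioned in the follow-up at the top of this page is meant to bound. The following is a toy model of that behavior, not StreamDevice's actual implementation; read_reply and its parameters are hypothetical names chosen for illustration.

```python
def read_reply(arrival_times, read_timeout, max_input=None):
    """Toy model of when an input read ends.

    arrival_times: sorted timestamps (seconds) at which bytes arrive.
    read_timeout:  the read ends once the gap since the last byte exceeds this.
    max_input:     optional cap on the number of bytes accepted.
    Returns (bytes_accepted, reason).
    """
    accepted = 0
    last = None
    for t in arrival_times:
        # A long enough pause between bytes ends the read.
        if last is not None and t - last > read_timeout:
            return accepted, "read timeout"
        accepted += 1
        last = t
        # A byte cap ends the read even if bytes keep arriving.
        if max_input is not None and accepted >= max_input:
            return accepted, "max input reached"
    if accepted == 0:
        return 0, "no reply"
    # The stream stopped, so the next pause ends the read.
    return accepted, "read timeout"


# 176 bytes arriving every 10 ms, with no pause longer than the timeout.
noisy = [i * 0.01 for i in range(176)]
print(read_reply(noisy, read_timeout=0.1))                # (176, 'read timeout')
print(read_reply(noisy, read_timeout=0.1, max_input=20))  # (20, 'max input reached')
```

Without a cap, the modeled read lasts as long as the bytes keep coming; with a cap, it ends after a fixed number of bytes regardless of what the device sends next.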

-----Original Message-----
From: Mark Rivers <rivers at cars.uchicago.edu>
Sent: Monday, April 6, 2020 6:10 PM
To: Sobhani, Bayan <bsobhani at bnl.gov>
Cc: tech-talk <tech-talk at aps.anl.gov>
Subject: Re: StreamDevice, prevent records from getting "stuck"

What asyn driver are you using (drvAsynIPPort, drvAsynSerialPort, etc.)?


You should run the following to see if the problem is in the asyn driver:


asynReport 10 PortName


where PortName is the name of the asyn port.  That will tell you if the port is connected, etc.


Mark



________________________________
From: Tech-talk <tech-talk-bounces at aps.anl.gov> on behalf of Sobhani, Bayan via Tech-talk <tech-talk at aps.anl.gov>
Sent: Monday, April 6, 2020 4:57 PM
To: tech-talk at aps.anl.gov
Subject: StreamDevice, prevent records from getting "stuck"


For StreamDevice I notice that sometimes when the signal is noisy, the PV gets "stuck" if it makes one too many failed requests to the device. Getting "stuck" means that no matter how many times I process the PV with a StreamDevice OUT field, it does not seem to attempt to send anything to the device until I restart the IOC.



I see no technical reason why this must happen, so I think this is probably something intentional in StreamDevice. Can I turn this off? I want the PV to always try to send a signal to the device, and if it can't, I want to see the red letters saying there was a mismatch.



Is there any way to prevent the PVs from getting "stuck"?



Alex


ANJ, 14 Apr 2020