EPICS Controls Argonne National Laboratory

Experimental Physics and
Industrial Control System


Subject: RE: Asyn ModbusTCP communication KO without error messages
From: Mark Rivers <[email protected]>
To: haquin <[email protected]>, tech-talk <[email protected]>
Date: Wed, 23 Sep 2015 12:28:28 +0000
Hi,

So it looks like the last line being executed in drvAsynIPPort.c is this:

            thisWrite = send(tty->fd, (char *)data, (int)numchars, 0);

That is simply calling the vxWorks function send(), which ultimately calls the internal vxWorks function ipcom_sendmsg(), which in turn calls free().  It looks to me like the memory pointer being passed to free() is corrupted.  However, that call would not be freeing the memory buffer "data" that was passed from drvAsynIPPort.c; it must be freeing some internal buffer in the vxWorks routines.

I suspect some other software in your IOC has a buffer overflow problem or uninitialized pointer and is corrupting memory elsewhere in the system.

These are hard problems to track down, particularly when the failure can take days.  The typical approach is to remove software modules from the IOC one at a time until the problem goes away and then zero in on the last thing you removed. But that can take a long time when the failure rate is low.  And there is no guarantee that will work, since the memory corruption may just move to an area where it never shows up.

Mark


________________________________
From: haquin [[email protected]]
Sent: Wednesday, September 23, 2015 7:13 AM
To: Mark Rivers; tech-talk
Subject: Re: Asyn ModbusTCP communication KO without error messages

Hello Mark,

The problem just happened again. It is always the same task that gets suspended, without any error message.
Here is the output of the "tt" command:

iocVMELB1 > tt 0x17d1fb0
0x0012bd34 vxTaskEntry  +0x48 : 0x00a4c4cc ()
0x00a4c538 epicsThreadCreate+0x1c0: 0x008ffc84 ()
0x008ffc84 modbusInterposeConfig+0xb88c: 0x00917bf8 ()
0x00917c60 asynInterposeFlushConfig+0x79d4: 0x008f2898 ()
0x008f2a44 drvModbusAsynConfigure+0x828: 0x008f157c ()
0x008f1934 lfSetErrLogSev+0x3584: 0x009081e4 ()
0x00908344 drvAsynIPServerPortConfigure+0x2904: 0x008f4b04 ()
0x008f4bf4 modbusInterposeConfig+0x7fc: 0x009070fc ()
0x0090711c drvAsynIPServerPortConfigure+0x16dc: 0x009049a4 ()
0x00904c0c drvAsynIPPortConfigure+0xc94: send ()
0x002b0a48 send         +0x7c : 0x002e1938 ()
0x002e194c ipcom_spinlock_delete+0x97c: ipcom_send ()
0x001d6430 ipcom_sendto +0x48 : ipcom_sendmsg ()
0x001d639c ipcom_sendmsg+0x718: free ()
0x002077b4 free         +0x3c : 0x00207148 ()
0x00207238 memPartBlockIsValid+0x180: taskSuspend ()
value = 0 = 0x0
iocVMELB1 >
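
A trace like the one above can be easier to read once it is split into frames. The following is a minimal sketch, not part of the original exchange: the function names parse_tt and far_from_symbol and the 0x1000 limit are hypothetical. It assumes that "tt" reports the nearest preceding symbol it knows about, so a large offset (for example lfSetErrLogSev+0x3584) may mean the code really belongs to a function without its own symbol, such as a static function in drvAsynIPPort.c.

```python
import re

# One "tt" line looks like: <address> <symbol>+<offset>: <callee> ()
FRAME = re.compile(
    r"^(0x[0-9a-fA-F]+)\s+(\w+)\s*\+(0x[0-9a-fA-F]+)\s*:\s*(\S+)\s*\(\)"
)


def parse_tt(text):
    """Return (symbol, offset, callee) for every frame line in a tt dump."""
    frames = []
    for line in text.splitlines():
        match = FRAME.match(line.strip())
        if match:
            _addr, symbol, offset, callee = match.groups()
            frames.append((symbol, int(offset, 16), callee))
    return frames


def far_from_symbol(frames, limit=0x1000):
    """List symbols whose offset exceeds an arbitrary, hypothetical limit."""
    return [symbol for symbol, offset, _callee in frames if offset > limit]
```

Frames reported by far_from_symbol are only candidates for mis-attributed names; the limit is a rule of thumb, not something vxWorks defines.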

What should I make of this "memPartBlockIsValid"? It is consistent with the task error code: Errno 0x110003 => S_memLib_BLOCK_ERROR
Are some memory addresses corrupted?
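
The errno value can be split into a module number and a module-local code. This is a minimal sketch under two assumptions that should be checked against the vxWorks headers on the target: that errno is packed as (module number << 16) | code, and that memLib is module 17 (0x11). The names split_errno, describe and KNOWN are hypothetical; only the one code confirmed in this thread is listed.

```python
# Assumption: vxWorks packs errno as (module number << 16) | code, and
# memLib is module 17 (0x11). Verify against vwModNum.h on your system.
M_MEMLIB = 17

# Only the value quoted in this thread is included here.
KNOWN = {(M_MEMLIB, 3): "S_memLib_BLOCK_ERROR"}


def split_errno(value):
    """Return (module number, module-local code) for a vxWorks errno."""
    return value >> 16, value & 0xFFFF


def describe(value):
    """Return a known name, or the raw module/code pair as text."""
    module, code = split_errno(value)
    return KNOWN.get((module, code), "module 0x%x, code %d" % (module, code))
```

For example, describe(0x110003) yields the name reported by the shell, while an unlisted value falls back to the numeric module and code.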

Thanks in advance,
best regards

On 09/09/2015 18:01, Mark Rivers wrote:
I suspect your problem is indeed that suspended task.  Are you sure there was no error message when the task got suspended?
You can learn something about the task status by running the “tt” command on it:
tt 0x16963d0
Mark

From: haquin [mailto:[email protected]]
Sent: Wednesday, September 09, 2015 6:44 AM
To: Mark Rivers; tech-talk
Subject: Re: Asyn ModbusTCP communication KO without error messages

Mark,

The problem happened again yesterday. I've attached files with the output of several commands.
What I've noticed is that there is a PLC task that is suspended with an error code:
Errno 0x110003 => S_memLib_BLOCK_ERROR
  NAME         ENTRY       TID    PRI   STATUS      PC       SP     ERRNO  DELAY
----------  ------------ -------- --- ---------- -------- -------- ------- -----
LBE1-PLCV   a4c48c        167fb20 149 PEND         2c9038  1682b70       0     0
LBE1-PLCV-> a4c48c        1686000 149 PEND         2c7154  1689080       0     0
LBE1-PLCV-> a4c48c        168a9e0 149 PEND         2c7154  168c990      46     0
LBE1-PLCV-> a4c48c        168f510 149 PEND         2c7154  1692420       0     0
LBE1-PLCV-> a4c48c        16963d0 149 SUSPEND      2ce5c4  1699110  110003     0
LBE1-PLCV-> a4c48c        169c1b0 149 PEND         2c7154  169f0c0       0     0
LBE1-PLCV-> a4c48c        16a29c0 149 PEND         2c7154  16a58d0       0     0
LBE1-PLCV-> a4c48c        16a87c0 149 PEND         2c7154  16ab6d0       0     0
LBE1-PLCV-> a4c48c        16ae5c0 149 PEND         2c7154  16b14d0       0     0
LBE1-PLCV-> a4c48c        16b4280 149 PEND         2c7154  16b7190       0     0
LBE1-PLCV-> a4c48c        16ba620 149 PEND         2c7154  16bd6a0       0     0
LBE1-PLCV-> a4c48c        16c0420 149 PEND         2c7154  16c34a0       0     0
LBE1-PLCV-> a4c48c        16c6230 149 PEND         2c7154  16c92b0       0     0
LBE1-PLCV-> a4c48c        16cc030 149 PEND         2c7154  16cef40       0     0
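
When the task list is long, the suspended rows can be picked out mechanically. This is a minimal sketch that assumes the nine-column layout shown above (NAME through DELAY, with TID and ERRNO printed in hexadecimal) and task names without spaces; suspended_tasks is a hypothetical helper, not a vxWorks or EPICS function.

```python
def suspended_tasks(listing):
    """Return (name, tid, errno) for rows whose STATUS contains SUSPEND."""
    found = []
    for line in listing.splitlines():
        fields = line.split()
        # Data rows have exactly nine columns; skip the header and dashed rule.
        if len(fields) != 9 or fields[0] == "NAME" or fields[0].startswith("--"):
            continue
        name, _entry, tid, _pri, status, _pc, _sp, errno, _delay = fields
        if "SUSPEND" in status:
            # TID and ERRNO are assumed to be hexadecimal, as in the listing.
            found.append((name, int(tid, 16), int(errno, 16)))
    return found
```

The returned TID is the value to pass to "tt" for a stack trace of the suspended task.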

- I don't think there is a deadlock, but I don't know what to expect from the output of the command in this case.
- No task shows a stack overflow, in particular not the one that is suspended.
- The asynReport output is attached.

I've added your asyn record, and the following errors appeared when I tried to connect to the READ port or to switch a trace on or off:
iocVMELB1 > 2015/09/08 11:29:24.509 [LBE1-PLCV-READ,0,0] [../../asyn/asynDriver/asynManager.c:1509] [CAS-client,0x8f70500,20] LBE1-PLCV-READ addr 0 queueRequest priority 3 not lockHolder
2015/09/08 11:29:24.509 [LBE1-PLCV-READ,0,0] [../../asyn/asynDriver/asynManager.c:1520] [CAS-client,0x8f70500,20] LBE1-PLCV-READ schedule queueRequest timeout
2015/09/08 11:29:27.925 [LBE1-PLCV-READ,0,0] [../../asyn/asynDriver/asynManager.c:637] [timerQueue,0xc04f20,60] LBE1-PLCV-READ asynManager:queueTimeoutCallback
2015/09/08 11:29:27.925 [LBE1-PLCV-READ,0,0] [../../asyn/asynRecord/asynRecord.c:2003] [timerQueue,0xc04f20,60] LBE1-PLCV-asyn: special queueRequest timeout
2015/09/08 11:29:34.509 [LBE1-PLCV-READ,0,0] [../../asyn/asynDriver/asynManager.c:637] [timerQueue,0xc04f20,60] LBE1-PLCV-READ asynManager:queueTimeoutCallback
2015/09/08 11:29:34.509 [LBE1-PLCV-READ,0,0] [../../asyn/asynRecord/asynRecord.c:2003] [timerQueue,0xc04f20,60] LBE1-PLCV-asyn: special queueRequest timeout



iocVMELB1 > 2015/09/08 11:31:44.577 [LBE1-PLCV-READ,0,0] [../../asyn/asynRecord/asynRecord.c:899] [CAS-client,0x8f70500,20] LBE1-PLCV-asyn: exception 3
2015/09/08 11:31:51.910 [LBE1-PLCV-READ,0,0] [../../asyn/asynRecord/asynRecord.c:899] [CAS-client,0x8f70500,20] LBE1-PLCV-asyn: exception 4
2015/09/08 11:31:54.527 [LBE1-PLCV-READ,0,0] [../../asyn/asynRecord/asynRecord.c:899] [CAS-client,0x8f70500,20] LBE1-PLCV-asyn: exception 4
2015/09/08 11:31:57.244 [LBE1-PLCV-READ,0,0] [../../asyn/asynRecord/asynRecord.c:899] [CAS-client,0x8f70500,20] LBE1-PLCV-asyn: exception 3
2015/09/08 11:32:00.077 [LBE1-PLCV-READ,0,0] [../../asyn/asynRecord/asynRecord.c:899] LBE1-PLCV-asyn: exception 5
2015/09/08 11:32:03.277 [LBE1-PLCV-READ,0,0] [../../asyn/asynRecord/asynRecord.c:899] [CAS-client,0x8f70500,20] LBE1-PLCV-asyn: exception 5

Thanks for your help
best regards

On 02/09/2015 16:46, Mark Rivers wrote:

Hi Christophe,

Here are some things to look for:

- On vxWorks perhaps a task has been suspended.  Issue the "i" command to look at the status of all of the tasks.

- Perhaps there is a deadlock.  Issue this command several times in a row to see if there is a mutex that is always locked:
epicsMutexShowAll 1

- Perhaps there was a stack overflow.  Issue this command and look for tasks with a margin of 0:
checkStack

If you don't find anything there then send us the output of "asynReport 10" on the Read port and Write port.

Mark

-----Original Message-----
From: [email protected] On Behalf Of haquin
Sent: Wednesday, September 02, 2015 9:00 AM
To: tech-talk
Subject: Asyn ModbusTCP communication KO without error messages



Hi all,

I have a VxWorks IOC (with an MVME CPU using both Ethernet interfaces) communicating with a Siemens S7 PLC via Asyn/ModbusTCP.
EPICS release 3.14.12.4, Asyn v4.22.

After a while (1 or 2 days), the communication stops working, but I get no error messages (no timeout, no disconnection ...).

On the IOC side I have a "Read Multiple Register" function reading the whole Modbus table (109 registers) every second, and a "Write Multiple Register" function writing the value of a counter that is incremented every second at the record level.

When I activate asynTrace on the IP port or on the Read or Write ports, there are no messages ...
asynReport on the Read port indicates only 1 Read OK
asynReport on the Write port indicates 0 Write OK



I can read the PLC registers with the "modpoll" tool from a Linux PC.
I can start a Linux IOC connected to the same PLC.
The netstat command in the IOC shell shows that the TCP connection is established, but the Recv-Q is not equal to 0 (12, for example).
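
For reference when reading the Recv-Q figure, Modbus TCP frame sizes can be worked out from the 7-byte MBAP header plus the PDU. This is a minimal sketch with assumptions: function code 0x03 (Read Holding Registers) stands in for "Read Multiple Register", the start address 0 in the usage below is hypothetical, and the helper names are invented for illustration. A Write Multiple Registers (0x10) response is 12 bytes long, which happens to match the example Recv-Q value, but nothing in this thread establishes that this is what was queued.

```python
import struct

# MBAP header: transaction id (2) + protocol id (2) + length (2) + unit id (1)
MBAP_LEN = 7


def read_holding_request(transaction_id, unit_id, start, count):
    """Build a Modbus TCP request for function 0x03 (assumed function code)."""
    pdu = struct.pack(">BHH", 0x03, start, count)
    # The MBAP length field counts the unit id byte plus the PDU.
    header = struct.pack(">HHHB", transaction_id, 0, len(pdu) + 1, unit_id)
    return header + pdu


def read_response_size(count):
    """MBAP + function code + byte count + two bytes per register."""
    return MBAP_LEN + 1 + 1 + 2 * count


def write_multiple_response_size():
    """MBAP + function code 0x10 + start address (2) + quantity (2)."""
    return MBAP_LEN + 1 + 2 + 2
```

With these helpers, read_response_size(109) gives the expected size of a full reply to the 109-register poll, which can be compared with what netstat reports as unread.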



What can explain this behavior?

Thanks in advance!

--
Christophe Haquin
Control and Real Time systems Engineer

+33 231454661 office
+33 231454728 fax
SdA/GIM
GANIL
Bd Henri Becquerel BP 55027
14076 CAEN CEDEX5




Replies:
Re: Asyn ModbusTCP communication KO without error messages haquin
References:
Asyn ModbusTCP communication KO without error messages haquin
RE: Asyn ModbusTCP communication KO without error messages Mark Rivers
RE: Asyn ModbusTCP communication KO without error messages Mark Rivers
Re: Asyn ModbusTCP communication KO without error messages haquin

ANJ, 16 Dec 2015