Hello Jeff,
after manually patching osdThread.c (the patch file has corrupt line numbers :-( ), "softIoc.exe" works well.
But now "caput.exe" sometimes runs into an exception! It is very hard to catch (about 1% of calls
in my case).
The exception occurs in the following function in "src/ca/tcpiiu.cpp":
bool tcpiiu :: connectNotify (
epicsGuard < epicsMutex > & guard, nciu & chan )
{
...
if ( chan.channelNode::listMember == channelNode::cs_createRespPend ) {
this->createRespPend.remove ( chan );
this->subscripReqPend.add ( chan ); <==== exception occurs here!
...
"this->subscripReqPend.add ( chan );" ends up in "include/tsDLList.h":
template <class T>
inline void tsDLList<T>::remove ( T &item )
{
tsDLNode<T> &theNode = item;
if ( this->pLast == &item ) {
this->pLast = theNode.pPrev;
}
else {
tsDLNode<T> &nextNode = *theNode.pNext;
nextNode.pPrev = theNode.pPrev; <====== sometimes "nextNode.pPrev" is NULL here!!!
...
Memory map:
- nextNode {pNext=??? pPrev=??? } tsDLNode<nciu> &
pNext CXX0030: Error: expression cannot be evaluated
pPrev CXX0030: Error: expression cannot be evaluated
+ this 0x003db1b4 {pFirst=0x00000000 pLast=0x00000000 itemCount=0 } tsDLList<nciu> * const
- item {eventq={...} accessRightState={...} cacCtx={...} ...} nciu &
+ cacChannel {priorityMax=99 priorityMin=0 priorityDefault=0 ...} cacChannel
+ chronIntIdRes<nciu> {...} chronIntIdRes<nciu>
+ channelNode {listMember=cs_createRespPend } channelNode
+ privateInterfaceForIO {...} privateInterfaceForIO
+ eventq {pFirst=0x00000000 pLast=0x00000000 itemCount=0 } tsDLList<baseNMIU>
+ accessRightState {f_readPermit=true f_writePermit=true
f_operatorConfirmationRequest=false } caAccessRights
+ cacCtx {_refLocalHostName={...} chanTable={...} ioTable={...} ...} cac &
+ pNameStr 0x003dafd8 "demoHost:double1" char *
+ piiu 0x100543ec class noopiiu noopIIU netiiu *
sid 4294967295 unsigned int
count 0 unsigned int
retry 0 unsigned int
nameLength 17 unsigned short
typeCode 65535 unsigned short
priority 0 unsigned char
- theNode {pNext=0x00000000 pPrev=0x00000000 } tsDLNode<nciu> &
+ pNext 0x00000000 {eventq={...} accessRightState={...} cacCtx=??? ...} nciu *
+ pPrev 0x00000000 {eventq={...} accessRightState={...} cacCtx=??? ...} nciu *
I checked this on several machines and I don't know what to do next.
I hope you can help me.
Carsten
Hello Carsten,
I committed and pushed a fix to the R3.14 branch on Launchpad. Please
test to see if the relevant changes address your issue.
I created Launchpad bug 717252. The details, including diffs, are in the bug
report.
https://bugs.launchpad.net/epics-base/+bug/717252
Thanks for your detailed bug report,
Jeff
______________________________________________________
Jeffrey O. Hill Email [email protected]
LANL MS H820 Voice 505 665 1831
Los Alamos NM 87545 USA FAX 505 665 5107
Message content: TSPA
With sufficient thrust, pigs fly just fine. However, this is
not necessarily a good idea. It is hard to be sure where they
are going to land, and it could be dangerous sitting under them
as they fly overhead. -- RFC 1925
-----Original Message-----
From: [email protected] [mailto:[email protected]]
On Behalf Of Carsten Winkler
Sent: Friday, February 11, 2011 1:01 AM
To: [email protected]
Subject: soft ioc runs into fatal exception (base-3.14.12)
Problem: softIoc.exe runs into a fatal exception after a caput call from the local host.
When running the test, there is a 40% chance for this exception to occur.
Waiting between starting the softIoc and caput, and/or waiting between caput calls,
does not change the behavior.
Error message: "softIoc has encountered a problem and needs to close. We are sorry for the
inconvenience."
Error details: AppName: softioc.exe; AppVer: 0.0.0.0; ModName: com.dll; ModVer: 3.14.12.0;
Offset: 0000a613
System: EPICS 3.14.12 plus all published patches (8 Feb. 2011), no local changes
(compiled with MS Visual Studio 2010 Professional, without any errors)
Host: Windows XP SP3 on a Pentium 4 with 3.2 GHz and 3 GB RAM
AND Windows XP SP3 on VMware 3.5.0
AND Windows XP SP3 on VirtualBox 3.2.12
AND Windows XP SP3 on KVM
Test setup: The following configuration files have been used:
startDemo.bat:
set EPICS_CA_ADDR_LIST=localhost
set EPICS_CA_AUTO_ADDR_LIST=NO
start DemoIOC.cmd
caput demoHost:double1 3.141
caput demoHost:double2 2.718
caput demoHost:long 12345
DemoIOC.cmd:
softIoc.exe -D dbd\softIoc.dbd -d db\demo.db
demo.db:
record(ao, "demoHost:double1")
{
field(DESC, "Double output with range infos")
field(EGU, "mm")
field(HOPR, "10")
field(LOPR, "0")
field(HIHI, "8")
field(HIGH, "6")
field(LOW, "4")
field(LOLO, "2")
field(HHSV, "MAJOR")
field(HSV, "MINOR")
field(LSV, "MINOR")
field(LLSV, "MAJOR")
}
record(ao, "demoHost:double2")
{
field(DESC, "Double output without range infos")
field(EGU, "nm")
}
record(longout, "demoHost:long")
{
field(DESC, "Long output without range infos")
field(EGU, "m")
}
call stack:
Com.dll!ellDelete(ELLLIST * pList=0x00a22ed0, ELLNODE * pNode=0x00cf0d08) Line 82 + 0xb bytes C
Com.dll!epicsParmCleanupWIN32(epicsThreadOSD * pParm=0x00cf0d08) Line 246 + 0x10 bytes C
Com.dll!epicsWin32ThreadEntry(void * lpParameter=0x00cf0d08) Line 516 + 0x9 bytes C
msvcr100d.dll!_callthreadstartex() Line 314 + 0xf bytes C
msvcr100d.dll!_threadstartex(void * ptd=0x00cf1500) Line 297 C
kernel32.dll!_BaseThreadStart@8() + 0x37 bytes
exception occurred here:
void ellDelete (ELLLIST *pList, ELLNODE *pNode)
{
if (pList->node.previous == pNode)
pList->node.previous = pNode->previous;
else
pNode->next->previous = pNode->previous; <== "pNode->next" is a NULL pointer in the error case! (see memory map)
if (pList->node.next == pNode)
pList->node.next = pNode->next;
else
pNode->previous->next = pNode->next;
pList->count--;
return;
}
This function was called from "static void epicsParmCleanupWIN32 ( win32ThreadParam * pParm )" in osdThread.c.
memory maps:
- pList 0x00a22ed0 {node={...} count=23 } ELLLIST *
- node {next=0x00a24728 previous=0x00cf0c60 } ELLNODE
- next 0x00a24728 {next=0x00a26340 previous=0x00000000 } ELLNODE *
- next 0x00a26340 {next=0x00adbd90 previous=0x00a24728 } ELLNODE *
[... 20 thread parameter blocks in this list without address 0x00cf0d08]
- next 0x00cf0c60 {next=0x00000000 previous=0x00cf0de0 } ELLNODE *
- next 0x00000000 {next=??? previous=??? } ELLNODE *
next CXX0030: Error: expression cannot be evaluated
previous CXX0030: Error: expression cannot be evaluated
+ previous 0x00cf0de0 {next=0x00cf0c60 previous=0x00cf0b40 } ELLNODE *
+ previous 0x00cf0b40 {next=0x00cf0de0 previous=0x00af1ac0 } ELLNODE *
[... 20 thread parameter blocks in this list without address 0x00cf0d08]
count 23 int
- pNode 0x00cf0d08 {next=0x00000000 previous=0x00000000 } ELLNODE *
+ next 0x00000000 {next=??? previous=??? } ELLNODE *
next CXX0030: Error: expression cannot be evaluated
previous CXX0030: Error: expression cannot be evaluated
+ previous 0x00000000 {next=??? previous=??? } ELLNODE *
next CXX0030: Error: expression cannot be evaluated
previous CXX0030: Error: expression cannot be evaluated
0x00CF0C80 00 00 00 00 43 41 53 2d 65 76 65 6e 74 00 fd fd fd fd dd dd dd dd dd dd 09 00 0c 00 a1 01 0c 02 a0 12 ....CAS-event.ýýýýÝÝÝÝÝÝ....¡... .
0x00CF0CA2 cf 00 e8 0c cf 00 00 00 00 00 00 00 00 00 18 00 00 00 01 00 00 00 fd 26 00 00 fd fd fd fd 00 00 00 00 Ï.è.Ï.................ý&..ýýýý....
It looks like there is a situation (with a 40% chance) in which the thread parameter block of a
CAS-event task is not added to the thread list when the client connects. When the client
disconnects and the thread shuts down, it tries to remove its parameter block from the thread
list by calling ellDelete() with a node that is not in the list. ellDelete() is fragile in this
case and crashes the softIoc by dereferencing a null pointer. This exception seems to occur only
when caput has been called from the local host.
Has anyone seen this behavior?
________________________________
Helmholtz-Zentrum Berlin für Materialien und Energie GmbH
Mitglied der Hermann von Helmholtz-Gemeinschaft Deutscher
Forschungszentren e.V
Aufsichtsrat: Vorsitzender Prof. Dr. Dr. h.c. mult. Joachim Treusch, stv.
Vorsitzende Dr. Beatrix Vierkorn- Rudolph
Geschäftsführer: Prof. Dr. Anke Rita Kaysser-Pyzalla, Prof. Dr. Dr. h.c.
Wolfgang Eberhardt, Dr. Ulrich Breuer
Sitz Berlin, AG Charlottenburg, 89 HRB 5583
Postadresse:
Hahn-Meitner-Platz 1
D-14109 Berlin
http://www.helmholtz-berlin.de