Subject: RE: proper thread cleanup on Linux?
From: "Jeff Hill" <[email protected]>
To: "'Core-Talk'" <[email protected]>
Date: Tue, 25 Aug 2009 16:09:14 -0600

After further testing on newer Linux versions, where the debugger seems to
actually function correctly, I now strongly suspect that this issue is caused
by the change shown in the diff below. The symptom is that two threads that CA
creates to manage TCP circuits never shut down. They typically need to shut
down when the TCP circuit disconnects or when the last channel on the circuit
is deleted. The bug was introduced in R3.14.9. See mantis 363.

cvs diff -r 1.2 -r 1.2.2.1 -wb -- systemCallIntMech.cpp
systemCallIntMech.cpp (in directory C:\hill\R3.14.dll_hell_fix\epics\base\src\libCom\osi\os\posix\)
Index: systemCallIntMech.cpp
===================================================================
RCS file: /net/phoebus/epicsmgr/cvsroot/epics/base/src/libCom/osi/os/posix/systemCallIntMech.cpp,v
retrieving revision 1.2
retrieving revision 1.2.2.1
diff -u -b -w -b -r1.2 -r1.2.2.1
--- systemCallIntMech.cpp 1 May 2003 22:11:42 -0000 1.2
+++ systemCallIntMech.cpp 14 May 2004 13:36:01 -0000 1.2.2.1
@@ -8,7 +8,7 @@
 * and higher are distributed subject to a Software License Agreement found
 * in file LICENSE that is included with this distribution.
 \*************************************************************************/
-/* $Id: systemCallIntMech.cpp,v 1.2 2003/05/01 22:11:42 jhill Exp $ */
+/* $Id: systemCallIntMech.cpp,v 1.2.2.1 2004/05/14 13:36:01 norume Exp $ */
 /*
  * Author: Jeff Hill
  */
@@ -18,5 +18,5 @@
 enum epicsSocketSystemCallInterruptMechanismQueryInfo
 epicsSocketSystemCallInterruptMechanismQuery ()
 {
-    return esscimqi_socketSigAlarmRequired;
+    return esscimqi_socketBothShutdownRequired;
 }

Jeff

Message content: TSPA

From: [email protected] [mailto:[email protected]] On Behalf Of Jeff Hill

All,

On our Linux (2.4.21-52.ELsmp) test system with R3.14
latest, I see a very strange behavior when I run my CA client side regression
tests - acctst. The symptom is an accumulating number of threads with odd stack
traces (see below). The test doesn’t fail, but the system slows to a crawl as
it runs low on threads (resources). Since these threads are not executing
anywhere in EPICS code, the cause is a complete mystery. Does anyone recognize
this symptom, perhaps as the problem on Linux where a C++ exception handler
catches all exceptions, including the exception that some Linux pthreads
implementations use for thread exit? I don’t reproduce this issue
running on Linux (2.6.9-42.0.3.ELsmp) with R3.14.8.2.

(gdb) thread 1146
[Switching to thread 1146 (Thread -1219617872 (LWP 28422))]#0 0x0013c939 in __lll_mutex_lock_wait () from /lib/tls/libpthread.so.0
(gdb) bt
#0 0x0013c939 in __lll_mutex_lock_wait () from /lib/tls/libpthread.so.0
#1 0x00138b21 in _L_mutex_lock_949 () from /lib/tls/libpthread.so.0
#2 0x00000000 in ?? ()

This is the Linux version:

~/epicsR3.14/epics/extensions/src/gateway$ uname -a
Linux santana 2.4.21-52.ELsmp #1 SMP Tue Sep 25 15:13:04 EDT 2007 i686 i686 i386 GNU/Linux

This is the gdb version (this gdb is known to have some issues):

~/epicsR3.14/epics/extensions/src/gateway$ gcc -v
Reading specs from /usr/lib/gcc-lib/i386-redhat-linux/3.2.3/specs
Configured with: ../configure --prefix=/usr --mandir=/usr/share/man --infodir=/usr/share/info --enable-shared --enable-threads=posix --disable-checking --with-system-zlib --enable-__cxa_atexit --host=i386-redhat-linux
Thread model: posix
gcc version 3.2.3 20030502 (Red Hat Linux 3.2.3-59)

In the debugger I see that threads are being created and
destroyed in matched sets as they should be while the test is running (see a
sample of some of the output below).

[New Thread -1521730640 (LWP 2504)]
[New Thread -1521464400 (LWP 2506)]
[New Thread -1520669776 (LWP 2508)]
[New Thread -1522787408 (LWP 2510)]
[New Thread -1523582032 (LWP 2512)]
[Thread -1522787408 (LWP 2510) exited]
[Thread -1520669776 (LWP 2508) exited]
[Thread -1521464400 (LWP 2506) exited]
[Thread -1521730640 (LWP 2504) exited]
[Thread -1523582032 (LWP 2512) exited]

Have you seen this behavior? Is something wrong with my version of Linux?

To reproduce this, run in two different windows:

excas -p myTest:
acctst myTest:bill 10

or alternatively in gdb (you will need to build w/o optimization):

gdb acctst
run myTest:bill 10
^c
info threads
thread <nnnn>
bt

Jeff

Message content: TSPA
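
For readers following the diff in this message: as its name suggests, esscimqi_socketBothShutdownRequired appears to tell callers that a thread blocked in a socket call should be released by calling shutdown() on the socket in both directions, rather than by signalling the thread. The sketch below is not EPICS code. It is a minimal standalone illustration of that mechanism on a POSIX system, assuming (as is the case on current Linux kernels, but not guaranteed everywhere) that shutdown() wakes a thread blocked in recv(). The function name recvResultAfterShutdown is hypothetical.

```cpp
#include <sys/socket.h>
#include <unistd.h>
#include <cstdio>
#include <thread>

// Hypothetical helper, not part of EPICS. Blocks a worker thread in recv()
// on one end of a local socket pair, then calls shutdown() on that same
// descriptor from the calling thread. Returns the value recv() returned in
// the worker, or -2 if the socket pair could not be created.
long recvResultAfterShutdown()
{
    int fds[2];
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, fds) != 0) {
        std::perror("socketpair");
        return -2;
    }

    long result = -2;
    std::thread worker([&result, &fds] {
        char buf[16];
        // Nothing is ever sent on the pair, so this call stays blocked
        // until the socket is shut down.
        result = static_cast<long>(recv(fds[0], buf, sizeof buf, 0));
    });

    // Give the worker a moment to enter recv(). If shutdown() wins the
    // race, recv() returns immediately instead of blocking.
    usleep(100 * 1000);
    shutdown(fds[0], SHUT_RDWR);

    // join() returns only if recv() was actually released. On a platform
    // where shutdown() does not wake the blocked thread, this would hang,
    // which is the kind of symptom described in this thread.
    worker.join();

    close(fds[0]);
    close(fds[1]);
    return result;
}
```

Calling recvResultAfterShutdown() from a small main() and printing the result is enough to try it. On systems where shutdown() releases the blocked call, recv() reports an orderly end of stream. If the call never returns, the platform needs a different interrupt mechanism, which is what the enum in the diff is meant to select.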