As you might have seen in my updated Mantis entry, I am _not_ reproducing the problem with blockingSockTest (which I wrote, but unfortunately neglected to run first). After further investigation (on a newer Linux with better debug symbols) I do see some threads hanging around that look like the one below, which might be a clue. At the moment this isn't much of a lead, but maybe someone else knows what it means.
[Switching to thread 300 (Thread -1208153168 (LWP 5007))]#0 0x00da1841 in __nptl_death_event () from /lib/tls/libpthread.so.0
(gdb) bt
#0 0x00da1841 in __nptl_death_event () from /lib/tls/libpthread.so.0
#1 0x00da24b4 in start_thread () from /lib/tls/libpthread.so.0
#2 0x00c09ffe in clone () from /lib/tls/libc.so.6
On machines with lots of resources my regression tests pass without issue. This appears to be one of those situations that is silent but deadly from a performance perspective: resources are consumed until a less capable system becomes sluggish. That would happen only if circuits come and go frequently, and in that context the gateway or a control room medm comes to mind.
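For anyone who wants to poke at the underlying question without building EPICS, here is a minimal sketch of the kind of check blockingSockTest performs: does shutdown() wake a thread that is blocked in recv() on a TCP circuit? This is not the EPICS test source. The function name shutdown_unblocks_recv and its timeout parameter are hypothetical, and the sketch assumes a loopback TCP connection is representative of a CA circuit.

```python
import socket
import threading
import time


def shutdown_unblocks_recv(timeout=2.0):
    """Return True if shutdown() wakes a thread blocked in recv().

    Hypothetical helper for illustration only; not part of EPICS.
    """
    # Build a loopback TCP connection whose peer never sends anything.
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)
    client = socket.create_connection(listener.getsockname())
    server_side, _ = listener.accept()

    entered = threading.Event()

    def reader():
        entered.set()
        try:
            client.recv(1)  # blocks until data, EOF, or an error
        except OSError:
            pass  # some platforms report an error instead of EOF

    worker = threading.Thread(target=reader, daemon=True)
    worker.start()
    entered.wait()
    time.sleep(0.1)  # give the reader a chance to actually enter recv()

    # The step under test: shut the socket down from another thread.
    client.shutdown(socket.SHUT_RDWR)
    worker.join(timeout)
    unblocked = not worker.is_alive()

    for sock in (client, server_side, listener):
        sock.close()
    return unblocked


if __name__ == "__main__":
    print("shutdown() unblocked recv():", shutdown_unblocks_recv())
```

If this returns False on some platform, a thread in that state would linger after the circuit goes away, which is the sort of leak described above. A True result only says that shutdown() is sufficient in this simple case; it does not rule out other reasons for the threads staying around.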
Jeff
______________________________________________________
Jeffrey O. Hill Email [email protected]
LANL MS H820 Voice 505 665 1831
Los Alamos NM 87545 USA FAX 505 665 5107
Message content: TSPA
> -----Original Message-----
> From: Andrew Johnson [mailto:[email protected]]
> Sent: Tuesday, August 25, 2009 4:53 PM
> To: [email protected]
> Cc: Jeff Hill
> Subject: Re: proper thread cleanup on Linux?
>
> Hi Jeff,
>
> On Tuesday 25 August 2009 17:09:14 Jeff Hill wrote:
> > After further testing on newer Linux versions where the debugger seems
> > to actually function correctly I now strongly suspect that this issue
> > is caused by this change. The symptom is that two threads that CA
> > creates to manage TCP circuits never shut down. They need to shut down
> > typically when the TCP circuit disconnects or when the last channel on
> > the circuit is deleted. The bug was introduced in R3.14.9. See mantis
> > 363.
>
> Can you write a version of blockingSockTest that demonstrates this and
> correctly responds with socketSigAlarmRequired? Unfortunately you'll
> have to test it against R3.14.9 or R3.14.10; I eviscerated the sigAlarm
> routines for this release since our use of this signal broke the posix
> timer library (which relies on SIGALRM) that some external libraries
> use. We'll have to switch to using a different signal if we're going to
> bring it back (we probably should make the signal number we use
> configurable).
>
> - Andrew
> --
> The best FOSS code is written to be read by other humans -- Harald Welte