I committed a fix for this problem (Mantis 269) to both R3.14 and the main
trunk. It turns out that a number of different authors have worked on this
source recently. There was a race condition in which the thread responsible
for reconnects was setting its own "block until there is work to do"
semaphore. It did this whenever a connect attempt failed, and consequently
it was chasing its tail.
Jeff
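
For readers following the thread, here is a minimal sketch of the general idea discussed in the quoted messages below: a reconnect loop that always waits after a failed connect attempt, so that a connect call which fails immediately (for example with "Network is unreachable") cannot spin the CPU. This is not the code committed for Mantis 269. The names connect_with_delay, connect_fn, delay_fn, fake_link, fake_connect and fake_delay are all hypothetical and exist only for illustration. The connect and delay steps are passed in as callbacks so the loop can be exercised without a network.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback types for illustration; not part of EPICS Base. */
typedef int (*connect_fn)(void *ctx);               /* 0 on success, -1 on failure */
typedef void (*delay_fn)(void *ctx, double seconds);

/*
 * Try to connect up to max_attempts times. After every failed attempt the
 * caller-supplied delay is invoked before looping again, so a connect call
 * that returns at once cannot turn the loop into a busy spin.
 * Returns the number of attempts used on success, or -1 if all failed.
 */
static int connect_with_delay(connect_fn try_connect, delay_fn wait,
                              void *ctx, int max_attempts,
                              double delay_seconds)
{
    int attempt;

    assert(try_connect != NULL && wait != NULL);

    for (attempt = 1; attempt <= max_attempts; attempt++) {
        if (try_connect(ctx) == 0) {
            return attempt;
        }
        /* Failure path: always pause before the next attempt. */
        wait(ctx, delay_seconds);
    }
    return -1;
}

/* Hypothetical stand-ins for a socket and a sleep call, used for testing. */
struct fake_link {
    int failures_left;  /* attempts that should fail before one succeeds */
    int delays;         /* number of times the delay callback ran */
    double waited;      /* total seconds requested from the delay callback */
};

static int fake_connect(void *ctx)
{
    struct fake_link *link = ctx;

    if (link->failures_left > 0) {
        link->failures_left--;
        return -1;  /* simulates connect() failing without blocking */
    }
    return 0;
}

static void fake_delay(void *ctx, double seconds)
{
    struct fake_link *link = ctx;

    link->delays++;
    link->waited += seconds;
}
```

In real code the delay callback would presumably wrap a sleep or a timed wait on an event, and the connect callback would wrap the socket connect call; both choices are assumptions here, not a description of logClient.c.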
> -----Original Message-----
> From: Jeff Hill [mailto:[email protected]]
> Sent: Monday, September 11, 2006 11:40 AM
> To: 'Andrew Johnson'; 'Gasper Jansa'
> Cc: 'Tech Talk'
> Subject: RE: IocLogInit pu consumption (who to blame)
>
>
> > Do any of the other EPICS core developers claim ownership of the
> > src/libCom/logClient/logClient.c code, which is where the problem lies?
>
> I will claim responsibility.
>
> I created Mantis entry 269.
>
> Typically, the socket connect system call will block the calling thread
> should the connect attempt be unsuccessful, but apparently if there are no
> local routing options the call fails out immediately (without suspending
> the thread for any amount of time) opening up a CPU consumption
> vulnerability because the code as written attempts to connect again
> immediately. The fix will probably be a delay inserted where the socket
> connect attempt fails (prior to rejoining the loop sustaining the
> connection attempts).
>
> Thanks for reporting the bug.
>
> Jeff
>
> > -----Original Message-----
> > From: Andrew Johnson [mailto:[email protected]]
> > Sent: Monday, September 11, 2006 10:23 AM
> > To: Gasper Jansa
> > Cc: Tech Talk
> > Subject: Re: IocLogInit
> >
> > Gasper Jansa wrote:
> > >
> > > I have noticed that if I set EPICS_IOC_LOG_INET to, let's say,
> > > 10.17.10.241 and I am on a network from which this IP cannot be
> > > reached, the CPU goes to 100% and stays there if I start logging with
> > > the command iocLogInit from the startup script. I also get a
> > > "Network unreachable" error.
> > >
> > > I am experiencing this on Debian and Scientific Linux.
> >
> > After a little investigation I can replicate your problem on FC4, and I
> > see that this behaviour of the log client is different if you have a
> > default route set on the machine (which is probably how most developer
> > linux workstations are configured).
> >
> > With no default route (no line for the destination 0.0.0.0 in the output
> > from 'netstat -nr') I get your behaviour:
> >
> > > epics> epicsEnvSet EPICS_IOC_LOG_INET 192.168.123.45
> > > epics> iocLogInit
> > > log client: unable to connect to "192.168.123.45:7004" because 101="Network is unreachable"
> > > log client: unable to connect to "192.168.123.45:7004" for 2.0 seconds
> > > epics> iocLogShow
> > > log client: disconnected from log server at "192.168.123.45:7004"
> >
> > The CPU usage on the machine then pegs at 100%.
> >
> > However, with a default route installed (a destination of 0.0.0.0 listed
> > in the output from 'netstat -nr') I get this instead:
> >
> > > epics> epicsEnvSet EPICS_IOC_LOG_INET 192.168.123.45
> > > epics> iocLogInit
> > > log client: unable to connect to "192.168.123.45:7004" for 2.0 seconds
> > > epics> iocLogShow
> > > log client: disconnected from log server at "192.168.123.45:7004"
> >
> > After a while I also get this message:
> >
> > > log client: unable to connect to "192.168.123.45:7004" because 110="Connection timed out"
> >
> > The machine CPU usage remains normal in this case.
> >
> > I would therefore recommend that you set a default route on your IOC
> > machine as a workaround until we get a fix into Base.
> >
> > The iocLogShow command I used above is not callable from the iocsh in
> > any released version of Base; I just added the relevant iocsh table
> > entries that make it callable and committed the changes to CVS.
> >
> >
> > Do any of the other EPICS core developers claim ownership of the
> > src/libCom/logClient/logClient.c code, which is where the problem lies?
> > The 101 value above would appear to be ENETUNREACH which is not
> > mentioned in any osdSock.h file and not expected by the code in
> > logClientConnect(). I might expect similar kinds of problems to appear
> > if someone mentions an unreachable IP address in any of the network
> > configuration settings such as EPICS_CA_ADDR_LIST, but I haven't tried
> > them to confirm that.
> >
> > - Andrew
> > --
> > There is considerable overlap between the intelligence of the smartest
> > bears and the dumbest tourists -- Yosemite National Park Ranger