EPICS Controls Argonne National Laboratory

Experimental Physics and
Industrial Control System


Subject: RE: CA client (on IOC) question
From: "Jeff Hill" <[email protected]>
To: "'Ralph Lange'" <[email protected]>, "'EPICS Core Talk'" <[email protected]>
Date: Tue, 8 Aug 2006 10:21:51 -0600
Ralph,

> looking at bad things happening and logs from our network switches it
> seems that the CA client that runs on the IOC does a name resolve
> request whenever any record with a link pointing into nirwana (aka an
> unconnected link) is being processed.
> Example: on IOC1, there are 100 records (scanned at 10 Hz) pointing to
> 100 other records sitting on IOC2. As soon as IOC2 is down, IOC1
> broadcasts a name resolution request for those 100 channels 10 times a
> second.

The name resolution request rate shouldn't be tied to the record processing
rate in any way, unless DBCA deletes and then recreates the channel (in
earlier versions, any channel) whenever the record is processed. I assume
that DBCA doesn't do that.

After a new channel is created, its name resolution requests are sent with
an exponential backoff controlling the delay between successive search
requests.
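
To illustrate the backoff idea, here is a minimal sketch in Python. The
initial delay, growth factor, and cap are hypothetical values chosen for
illustration; they are not the constants used by the CA client library, and
the function name is made up for this example.

```python
def search_delays(initial=0.03, factor=2.0, cap=5.0, count=10):
    """Return the delay (in seconds) before each successive search request.

    All parameter defaults are illustrative assumptions, not CA constants.
    """
    delays = []
    delay = initial
    for _ in range(count):
        delays.append(delay)
        # Grow the delay geometrically, but never beyond the cap.
        delay = min(delay * factor, cap)
    return delays


if __name__ == "__main__":
    for attempt, delay in enumerate(search_delays(), start=1):
        print(f"search {attempt}: wait {delay:.2f} s before the next one")
```

The point of the sketch is only that the interval between searches for one
unresolved channel grows over time, independent of how often the record
owning the link is processed.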

This behavior differs only subtly between versions, but it *is* more robust
for large systems in the very latest versions of R3.14. For example, the
following are true in the latest versions of R3.14.

1) Creating a new channel does not cause the search rate for preexisting
unresolved channels to be set to the search rate of the new channel.

2) When a circuit is detected to be unresponsive the client application
receives a disconnect notify callback, but the circuit itself is not
disconnected.
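
Point 1 can be pictured with a small toy model. This is a sketch under
stated assumptions, not the CA client's implementation: the class
SearchScheduler, its methods, and the delay values are all hypothetical, and
the reset_all_on_create flag only mimics the older behavior implied above.

```python
class SearchScheduler:
    """Toy model of per-channel search delays (hypothetical, not CA code)."""

    def __init__(self, initial=0.03, factor=2.0, cap=5.0,
                 reset_all_on_create=False):
        self.initial = initial
        self.factor = factor
        self.cap = cap
        # True mimics the older behavior: a new channel resets everyone.
        self.reset_all_on_create = reset_all_on_create
        self.delays = {}  # channel name -> current delay in seconds

    def create_channel(self, name):
        if self.reset_all_on_create:
            for existing in self.delays:
                self.delays[existing] = self.initial
        self.delays[name] = self.initial

    def search_sent(self, name):
        # After each unanswered search, back off further, up to the cap.
        self.delays[name] = min(self.delays[name] * self.factor, self.cap)


if __name__ == "__main__":
    for legacy in (False, True):
        sched = SearchScheduler(reset_all_on_create=legacy)
        sched.create_channel("old")
        for _ in range(6):
            sched.search_sent("old")
        sched.create_channel("new")
        print(f"reset_all_on_create={legacy}: {sched.delays}")
```

With the flag off, the long-unresolved channel keeps its slow search rate
when a new channel appears; with it on, every new channel pulls all
unresolved channels back to the fastest rate.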

For context, what version of EPICS is running in the IOC that is doing all
of the searching?

> (trying to find out why a single IOC going halfway down drives _all_ our
> IOCs into 95+ percent of cpu usage)

I have not seen that on any of the projects I have worked on. Is it possible
that a gateway there is forwarding a search request in an infinite loop?

> As soon as IOC2 is down, IOC1 broadcasts a name resolution 
> request for those 100 channels 10 times a second.

I do see that you are stating that the trouble is coming from the IOC which,
if correct, admittedly makes my GW loop guess off the mark.

It shouldn't be searching at that rate, but nevertheless, I am skeptical
that this search rate would drive the CPU to 95+ percent. Perhaps that's
true with old iron. One could easily write a test program that creates and
then almost immediately deletes 100 channels at a 10 Hz rate. This program
might be useful for demonstrating what the load impacts might be. Was the
IOC already substantially loaded before it transitioned to a 95% loading?
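
Before writing a real CA test client, a back-of-the-envelope model gives a
feel for the numbers. The sketch below is only arithmetic in Python, not a
CA program: it assumes each channel creation triggers exactly one immediate
search, and the backoff parameters are hypothetical, not the library's.

```python
def searches_with_recreate(channels, rate_hz, seconds):
    """Searches sent if every channel is recreated on each scan.

    Assumes one immediate search per channel creation.
    """
    return channels * rate_hz * seconds


def searches_with_backoff(channels, seconds,
                          initial=0.03, factor=2.0, cap=5.0):
    """Searches sent if channels persist and each one backs off on its own.

    The backoff parameters are illustrative assumptions.
    """
    per_channel = 0
    elapsed = 0.0
    delay = initial
    while elapsed < seconds:
        per_channel += 1          # one search goes out now
        elapsed += delay          # wait before the next one
        delay = min(delay * factor, cap)
    return channels * per_channel


if __name__ == "__main__":
    window = 10  # seconds
    print("recreate at 10 Hz:", searches_with_recreate(100, 10, window))
    print("persistent w/ backoff:", searches_with_backoff(100, window))
```

Under these assumptions, recreating 100 channels at 10 Hz means 1000 search
requests per second for as long as the target is down, whereas persistent
channels with backoff send far fewer as time goes on.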

In summary, an IOC should _not_ behave the way that you are describing.
After I know what version is running I will have a closer look. I am also
willing to log in remotely and debug the issue in an IOC that might be
behaving this way if you would like.

Jeff

> -----Original Message-----
> From: Ralph Lange [mailto:[email protected]]
> Sent: Tuesday, August 08, 2006 4:57 AM
> To: EPICS Core Talk
> Subject: CA client (on IOC) question
> 
> Hello Core,
> 
> looking at bad things happening and logs from our network switches it
> seems that the CA client that runs on the IOC does a name resolve
> request whenever any record with a link pointing into nirwana (aka an
> unconnected link) is being processed.
> Example: on IOC1, there are 100 records (scanned at 10 Hz) pointing to
> 100 other records sitting on IOC2. As soon as IOC2 is down, IOC1
> broadcasts a name resolution request for those 100 channels 10 times a
> second.
> 
> Is that true? Is that smart?
> 
> Confused,
> Ralph
> (trying to find out why a single IOC going halfway down drives _all_ our
> IOCs into 95+ percent of cpu usage)
