EPICS Home

Experimental Physics and Industrial Control System


 

Subject: RE: GW status
From: "Jeff Hill" <[email protected]>
To: "'Amedeo Perazzo'" <[email protected]>
Cc: "'Core-Talk'" <[email protected]>, "'Dirk Zimoch'" <[email protected]>, "'Ernest L. Williams Jr.'" <[email protected]>
Date: Tue, 4 Aug 2009 16:11:58 -0600
>    1. PCAS was unable (before the patch) to handle subscriptions during
> the time that ca_put_callback was being processed.

More specifically, if the service (in your situation the GW) returned status
from an IO request indicating "postpone this request because too many
asynchronous IO operations are already in progress" then PCAS would not
update subscriptions until at least one asynchronous IO operation completed.
Since the GW service returns this "postpone this request because too many
asynchronous IO operations are already in progress" status if a
ca_put_callback request to the IOC is pending then this explains some of the
behavior you saw there (frozen EDM subscription updates when the same client
initiates the motor move).
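
To make the stall concrete, here is a rough toy model in Python. It is only a sketch: ToyServer and its methods are names invented for this illustration and are not the PCAS or GW API. The sketch also assumes that held-back updates are delivered once an asynchronous IO completes, which is a simplification.

```python
# Toy model only. ToyServer and its methods are hypothetical names chosen
# for this illustration; they are not part of the PCAS or gateway API.
class ToyServer:
    def __init__(self, max_async_io, patched):
        self.max_async_io = max_async_io  # limit enforced by the service
        self.patched = patched            # True models the behavior after the patch
        self.pending_io = 0               # asynchronous IO operations in progress
        self.stalled = False              # set once the service answers "postpone"
        self.backlog = []                 # subscription updates being held back
        self.delivered = []               # subscription updates sent to clients

    def start_async_io(self):
        """Service side: start an asynchronous IO, or ask the server to postpone."""
        if self.pending_io >= self.max_async_io:
            self.stalled = True
            return "postpone"
        self.pending_io += 1
        return "started"

    def complete_async_io(self):
        """One asynchronous IO finished, so held-back updates can flow again."""
        self.pending_io -= 1
        self.stalled = False
        self.delivered.extend(self.backlog)
        self.backlog.clear()

    def post_event(self, value):
        """Before the patch a stalled server holds updates; after it, it does not."""
        if self.stalled and not self.patched:
            self.backlog.append(value)
        else:
            self.delivered.append(value)
```

With max_async_io=1 and patched=False, a second start_async_io() call returns "postpone", and every post_event() after that sits in the backlog until complete_async_io() is called. That is the frozen-display symptom in miniature.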

>    2. The gateway may be translating a ca_put on its server side to a
> ca_put_callback on its client side even when that's not requested by the
> client.

In EPICS ca_put is very different from ca_put_callback. 

First, ca_put means no response message from the IOC unless something goes
wrong. Second, in situations where there are many values being written one
after another by the same client to the same PV (an EDM slider is a good
example of this) then if the consumer (the PV in the database) is slower
than the producer (the EDM slider) then some of the intermediate values will
be discarded, but the system guarantees that the last value sent is always
written to the database, and if the field is process passive the record is
also processed with this last value sent.
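
The "last value always wins" guarantee can be pictured with a one-slot buffer. This is a sketch under heavy assumptions: ToyPutChannel is a hypothetical name, and the real buffering inside CA and the IOC is more involved than a single slot.

```python
# Toy model only. ToyPutChannel is a hypothetical name; it illustrates the
# guarantee described above, not how Channel Access actually buffers puts.
class ToyPutChannel:
    def __init__(self):
        self.pending = None       # most recent value not yet consumed
        self.has_pending = False
        self.written = []         # values that actually reached the record

    def put(self, value):
        """Fire and forget: a newer value replaces one still waiting."""
        self.pending = value
        self.has_pending = True

    def process(self):
        """The slow consumer takes whatever value is current, if any."""
        if self.has_pending:
            self.written.append(self.pending)
            self.has_pending = False
```

If a slider calls put(1), put(2), put(3) before the consumer gets one process() call, only 3 is written. The intermediate values are dropped, but the final one never is.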

By contrast, ca_put_callback means that the callback is not called until after
the record finishes processing, and any records or asynchronous devices that
are linked directly or indirectly to this record are done with their
processing.
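
The ordering that ca_put_callback promises can be sketched as follows. ToyRecord and toy_put_callback are hypothetical names, and the sketch processes the chain synchronously, whereas real record processing is asynchronous. It is only meant to show where the callback falls in the sequence.

```python
# Toy model only. ToyRecord and toy_put_callback are hypothetical names; real
# record processing is asynchronous and is driven by the IOC database.
class ToyRecord:
    def __init__(self, name, links=()):
        self.name = name
        self.links = list(links)  # records processed as a consequence of this one
        self.value = None

    def process(self, value, log):
        """Process this record, then everything linked to it, directly or not."""
        self.value = value
        log.append("processed " + self.name)
        for linked in self.links:
            linked.process(value, log)


def toy_put_callback(record, value, on_done, log):
    """Run the whole processing chain, and only then invoke the callback."""
    record.process(value, log)
    on_done()
```

For a "motor" record linked to a "readback" record, the log shows both records processed before the callback entry appears. A plain put would give the client no such completion signal.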

So, yes, there was a weakness in the PCAS service interface which prevented
the service from knowing if the request was initiated by a put or a put
callback. Good designs are minimal, but I went too far on this one. I have
committed changes to PCAS so that ca_put_callback requests invoke
casChannel::writeNotify and ca_put requests invoke casChannel::write. If the
service does not implement casChannel::writeNotify then
casChannel::writeNotify invokes casChannel::write thereby preserving
backward compatibility. I have also committed changes to the GW so that
casChannel::writeNotify invokes ca_put_callback, and casChannel::write
invokes ca_put. I am still testing these changes. They appear to work
correctly but I see another issue which may be unrelated to my recent
changes, and I am currently pursuing that.
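
The backward-compatible dispatch described above follows a common pattern: a base-class method that falls back to the older one. Here is a minimal Python sketch of that pattern. The class and function names are hypothetical stand-ins, not the actual C++ casChannel interface, and the strings recorded in "calls" merely label which CA client call a gateway-like service would make.

```python
# Toy model only. These classes mimic the pattern described in the text; they
# are hypothetical and are not the real casChannel C++ interface.
class ToyChannel:
    def write(self, value):
        raise NotImplementedError

    def write_notify(self, value):
        # Default: a service that never overrides this still gets its write() called.
        return self.write(value)


class LegacyService(ToyChannel):
    """Written before write_notify existed; implements write() only."""
    def __init__(self):
        self.calls = []

    def write(self, value):
        self.calls.append(("write", value))


class GatewayLikeService(ToyChannel):
    """Implements both, so it can mirror the client's choice on its client side."""
    def __init__(self):
        self.calls = []

    def write(self, value):
        self.calls.append(("ca_put", value))

    def write_notify(self, value):
        self.calls.append(("ca_put_callback", value))


def dispatch(channel, request_kind, value):
    """Server side: route each request type to the matching interface."""
    if request_kind == "put_callback":
        channel.write_notify(value)
    elif request_kind == "put":
        channel.write(value)
    else:
        raise ValueError("unknown request kind: " + request_kind)
```

A LegacyService sees ("write", value) for both request kinds, so existing services keep working unchanged, while a GatewayLikeService can tell the two apart.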

The bottom line is that I hope these changes will make the GW more
transparent, but this also opens up the possibility that certain programs
that issue a ca_put request followed by ca_get request will discover that
the value written was not returned in response to the ca_get request, and
that would be a behavior change for the GW, but this is probably what Jim
originally intended based on some discussions that I recall. Presumably
programs that really care about such things will be written to issue
ca_put_callback followed by ca_get. Also, turning on -no_cache would fix
this issue for the client but that would be a global change impacting all
clients, and also weakening the GW's role as an offloading agent for the IOC.

>    3. The caput issue may be something different from the previous two
> points and may be related to the way we configured caching on the gateway.
> 
> Is the above correct? Who is the best person to help with the gateway side
> now that the PCAS patch is available? Where do I find some documentation
> (if any) about caching on the gateway? The only thing I found on the
> manual is the -no_cache option whose description doesn't sound like the
> caput problem (I may be wrong).

The manual says this:

-------------snip-snip-----------------
-no_cache  	
Do not use cached (monitored) values when a client does ca_get. This results
in higher network traffic to the IOC but returns always the current value,
even if no monitor event had been send (e.g. because of a MDEL). This also
solves problems with record fields like HOPR or EGU if they are modified
during run-time.
-------------snip-snip-----------------

Servicing get requests out of a subscription updated cache appears to be the
default although perhaps the manual doesn't say this (I read very quickly so
I could be wrong on this). When no_cache is enabled this makes clients that
do a put followed by a get to the same channel happy because their get
request is translated not into a cache read, but into a retransmitted get
request to the IOC. This comes at the expense of increased load on the IOC.
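
The trade-off can be sketched with a toy gateway in front of a toy IOC. Everything here is hypothetical (ToyIoc, ToyGateway, and the simplified MDEL deadband rule) and is not taken from the GW source. It only illustrates why a put followed by a get can return a stale value when gets are served from a subscription-updated cache.

```python
# Toy model only. ToyIoc and ToyGateway are hypothetical names, and the
# deadband rule below is a simplification of how MDEL suppresses monitors.
class ToyIoc:
    def __init__(self, value, mdel):
        self.value = value
        self.mdel = mdel            # monitor deadband
        self.last_posted = value    # last value sent to subscribers
        self.reads = 0              # get requests that reached the IOC

    def put(self, value):
        """Store the value; return a monitor update only if the deadband is exceeded."""
        self.value = value
        if abs(value - self.last_posted) > self.mdel:
            self.last_posted = value
            return value
        return None

    def get(self):
        self.reads += 1
        return self.value


class ToyGateway:
    def __init__(self, ioc, no_cache):
        self.ioc = ioc
        self.no_cache = no_cache
        self.cache = ioc.value      # assumed to be filled by the initial monitor update

    def put(self, value):
        update = self.ioc.put(value)
        if update is not None:      # cache changes only when a monitor arrives
            self.cache = update

    def get(self):
        if self.no_cache:
            return self.ioc.get()   # forwarded to the IOC: current, but costs IOC load
        return self.cache           # served locally: cheap, but possibly stale
```

With an initial value of 10.0 and mdel=1.0, a put of 10.5 followed by a get returns 10.0 from the caching gateway and never touches the IOC. With no_cache=True the same sequence returns 10.5 at the cost of one read on the IOC.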

During testing, without the -no_cache option set I appear to see that gets
are postponed until a pending ca_put_callback completes. Recall however that
the new version (as of today) of the gateway issues ca_put in response to
EDM's ca_put, and the gateway issues ca_put_callback in response to the
high-level application's ca_put_callback. I am starting to suspect after looking
at the source code and reading the doc that the -no_cache option has zero
impact whatsoever on that behavior although I haven't run any experiments
with -no_cache set.

> - Is the zipped base you sent me last week the same as the current CVS
> head?

Yesterday I committed changes to the PCAS library in base.

Today I committed the above mentioned patches to the GW, but during testing
(a few minutes ago) I found another possibly unrelated issue that needs to
be understood. I could arrange for a copy if you would like to test in
parallel.

Jeff
______________________________________________________
Jeffrey O. Hill           Email        [email protected]
LANL MS H820              Voice        505 665 1831
Los Alamos NM 87545 USA   FAX          505 665 5107

Message content: TSPA


> -----Original Message-----
> From: Amedeo Perazzo [mailto:[email protected]]
> Sent: Tuesday, August 04, 2009 2:49 PM
> To: Jeff Hill
> Cc: 'Ernest L. Williams Jr.'; 'Andrew Johnson'
> Subject: Re: GW status
> 
> Jeff,
> 
> I have some questions which can help me understand the current situation
> and what we can do next:
> 
> - I understand that we may have three different issues:
> 
>    1. PCAS was unable (before the patch) to handle subscriptions during
> the time that ca_put_callback was being processed.
> 
>    2. The gateway may be translating a ca_put on its server side to a
> ca_put_callback on its client side even when that's not requested by the
> client.
> 
>    3. The caput issue may be something different from the previous two
> points and may be related to the way we configured caching on the gateway.
> 
> Is the above correct? Who is the best person to help with the gateway side
> now that the PCAS patch is available? Where do I find some documentation
> (if any) about caching on the gateway? The only thing I found on the
> manual is the -no_cache option whose description doesn't sound like the
> caput problem (I may be wrong).
> 
> - Is the zipped base you sent me last week the same as the current CVS
> head?
> 
> Thanks!
> Amedeo
> 
> 
> On Mon, 3 Aug 2009, Jeff Hill wrote:
> 
> > Hi,
> >
> >
> >
> > Today I committed changes to EPICS base so that the PCAS service can
> > distinguish between ca_put and ca_put_callback. The changes are backwards
> > compatible for the service. In summary, there is a new "writeNotify"
> > interface that calls the "write" interface if it isn't implemented by the
> > service.
> >
> > The next step will be to modify the gateway so that it uses ca_put instead
> > of ca_put_callback if that is what the client, in SLAC's case EDM, has
> > chosen to use. EDM chooses to use ca_put, and not ca_put_callback, precisely
> > because it chooses _not_ to block for write requests to complete in the IOC,
> > or in the future hopefully also the GW. So we anticipate that after fixing
> > the GW we will have a scenario where EDM will call ca_put, PCAS will call
> > writeNotify, and the GW will call ca_put - thereby avoiding use of
> > ca_put_callback where it is inappropriate.
> >
> >
> >
> > Jeff
> > ______________________________________________________
> > Jeffrey O. Hill           Email        [email protected]
> > LANL MS H820              Voice        505 665 1831
> > Los Alamos NM 87545 USA   FAX          505 665 5107
> >
> >
> >
> > Message content: TSPA
> >
> >
> >
> >



Replies:
Re: GW status Stephen Lewis
Re: GW status Dirk Zimoch
