Experimental Physics and
Industrial Control System


Subject:	RE: ca_put_callback once again
From:	"Jeff Hill" <[email protected]>
To:	"'Benjamin Franksen'" <[email protected]>, <[email protected]>
Date:	Wed, 24 Nov 2010 09:36:24 -0700

Ben,

Do I understand correctly that, with this source code change in dbPutNotify, the system now sends a failure response for the put callback request when this situation arises? That result, while arguably superior to the current behavior, isn't particularly satisfying on a functional level, because it pushes retry logic into the client applications: presumably every client now needs code that keeps reissuing the put callback request until it succeeds, which means significant code duplication. Also, this behavior isn't backwards compatible with the put callback execution model advertised so far. A naive person hopes that put notify could just place the request in its queue and initiate it as soon as database processing completes. Sorry, I don't claim to know how much work this creates for the people maintaining dbPutNotify, but making put callbacks wait in line does, on a functional level, appear to be the right architecture.
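
To make the "wait in line" idea concrete, here is a minimal sketch in C. It rests on loud assumptions: record_model, put_callback_request and processing_complete are hypothetical names invented for illustration, and nothing here reflects the real dbNotify.c data structures, lock sets or threading. It only models a single busy resource with a FIFO of waiting requests, where a request that arrives during processing is queued rather than rejected.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model for illustration only; NOT the dbNotify.c design. */
#define MAX_PENDING 8

typedef void (*put_done_fn)(int request_id);

typedef struct {
    int busy;                  /* nonzero while a request is being processed */
    int active_id;             /* id of the request being processed, or -1 */
    int pending[MAX_PENDING];  /* ring buffer of waiting request ids (FIFO) */
    size_t head;               /* index of the oldest waiting request */
    size_t count;              /* number of waiting requests */
    put_done_fn on_done;       /* completion callback, may be NULL */
} record_model;

void record_init(record_model *r, put_done_fn on_done)
{
    r->busy = 0;
    r->active_id = -1;
    r->head = 0;
    r->count = 0;
    r->on_done = on_done;
}

/* Start the request if the record is idle, otherwise queue it.
 * Returns 0 if started or queued, -1 only when the queue is full. */
int put_callback_request(record_model *r, int request_id)
{
    if (!r->busy) {
        r->busy = 1;
        r->active_id = request_id;
        return 0;
    }
    if (r->count == MAX_PENDING)
        return -1;
    r->pending[(r->head + r->count) % MAX_PENDING] = request_id;
    r->count++;
    return 0;
}

/* Called when asynchronous processing of the active request finishes.
 * Starts the oldest waiting request, then reports the finished one. */
void processing_complete(record_model *r)
{
    int finished;

    assert(r->busy);
    finished = r->active_id;
    if (r->count > 0) {
        r->active_id = r->pending[r->head];
        r->head = (r->head + 1) % MAX_PENDING;
        r->count--;
    } else {
        r->busy = 0;
        r->active_id = -1;
    }
    if (r->on_done)
        r->on_done(finished);
}
```

In this model a client calls put_callback_request once and simply waits for its completion callback; the retry loop never appears in client code. A real implementation would also have to deal with timeouts, cancellation and disconnecting clients, which this sketch ignores.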

Jeff
______________________________________________________
Jeffrey O. Hill           Email        [email protected]
LANL MS H820              Voice        505 665 1831
Los Alamos NM 87545 USA   FAX          505 665 5107

Message content: TSPA

With sufficient thrust, pigs fly just fine. However, this is
not necessarily a good idea. It is hard to be sure where they
are going to land, and it could be dangerous sitting under them
as they fly overhead. -- RFC 1925


> -----Original Message-----
> From: [email protected] [mailto:tech-talk-
> [email protected]] On Behalf Of Benjamin Franksen
> Sent: Wednesday, November 24, 2010 3:51 AM
> To: [email protected]
> Subject: Re: ca_put_callback once again
> 
> Hi Tim, Andrew, Jeff
> 
> thanks for all your comments.
> 
> I don't think it is necessary or even a good idea to try to avoid all
> collisions in the database. For the moment I would be happy just to be
> notified if parts of the process network are busy and so don't get
> processed as a result of my put.
> 
> I went ahead and took a look at the code. It turns out that this can be
> achieved with minimal effort (a four-line change to the code in
> src/db/dbNotify.[ch]). And indeed, this makes sure that the CA server
> sends back an error. My test now says:
> 
> > time caput pvPutAsync1 1 &; sleep 1; time caput -c -w5 pvPutAsync2 1
> 1
> [1] 19959
> Old : pvPutAsync1                    Low
> New : pvPutAsync1                    High
> caput pvPutAsync1 1  0.01s user 0.00s system 20% cpu 0.038 total
> [1]  + 19959 done       time caput pvPutAsync1 1
> Old : pvPutAsync2                    Low
> Error occured writing data.
> caput -c -w5 pvPutAsync2 1  0.00s user 0.01s system 22% cpu 0.035 total
> 
> However, I thought it would be nicer if the server could send a more
> specific error code instead of the generic "put failed", the more so
> because the put itself did not in fact fail; only the processing chain
> gets cut short.
> 
> This is also easy to do: it needs a single four-line change in
> src/rsrv/camessage.c, plus the two lines for the new error code --
> actually just a warning, as the definition is:
> 
> #define ECA_PUTCBNOTCMPL    DEFMSG(CA_K_WARNING,   61)
> 
> Then I saw that caput does not retrieve the error message from the CA
> library, so I added this too.
> 
> All in all:
> 
> > darcs whatsnew -s
> M ./src/ca/access.cpp -1 +2
> M ./src/ca/caerr.h +1
> M ./src/catools/caput.c -1 +2
> M ./src/db/dbNotify.c -1 +4
> M ./src/db/dbNotify.h -1 +2
> M ./src/rsrv/camessage.c -1 +4
> 
> A patch against base-3.14.12-rc1 is attached.
> 
> My test now does what it's supposed to do:
> 
> > time ./bin/linux-x86/caput pvPutAsync1 1 &; sleep 1; time ./bin/linux-x86/caput -c -w5 pvPutAsync2 1
> [1] 29635
> Old : pvPutAsync1                    Low
> New : pvPutAsync1                    High
> ./bin/linux-x86/caput pvPutAsync1 1  0.01s user 0.00s system 24% cpu 0.032 total
> [1]  + 29635 done       time ./bin/linux-x86/caput pvPutAsync1 1
> Old : pvPutAsync2                    Low
> Error occured writing data:Put callback processing incomplete
> ./bin/linux-x86/caput -c -w5 pvPutAsync2 1  0.00s user 0.00s system 12% cpu 0.032 total
> 
> On Tuesday, November 23, 2010, you wrote:
> > ----- Original Message -----
> >
> > > From: "Ben Franksen" <[email protected]>
> > > To: [email protected]
> > > Sent: Monday, November 22, 2010 9:59:04 PM
> > > Subject: Re: ca_put_callback once again
> > > Hi Andrew
> > >
> > > We have two separate issues here.
> > >
> > > (1) The bo record. I was aware of the fact that the bo record does not
> > > remain active during its HIGH time. Maybe I should have explained the
> > > idea behind the test db file a bit more, but I added the test setup as
> > > an afterthought and the mail was already long enough. I am using the
> > > bo record's HIGH field only so that I can see a change in the value
> > > with a camonitor in another window. The actual asynchronous processing
> > > is done by the seq record, which has a longer delay for exactly this
> > > reason. I could as well have used a longout for the two "front-end"
> > > records and the result would have surprised me as much.
> > >
> > > I am not proposing to change the bo behaviour.
> > >
> > > (2) The putNotify behaviour. I admit that I did not read the manual
> > > carefully enough. You cite the ADG:
> > > >     6. In general a set of records may be processed as a result of a
> > > > single dbPutNotify. If a record in the set is found to be active,
> > > > either because PACT is true or because a putNotify already owns the
> > > > record, then that record is not made part of the set of records that
> > > > must complete before the putNotify request completes.
> > >
> > > This should have led me to expect the behaviour I see.
> > >
> > > > Your second caput is completing immediately because its bo record
> > > > tries to process the OUT link but finds the seq.PACT=TRUE, so it
> > > > gives up and tells putNotify that everything it started has
> > > > finished.
> > >
> > > Agreed, but this is exactly the problem I complained about.
> > >
> > > It tells me that an operation successfully completed, when in fact it
> > > did *not* succeed. The bo was supposed to start more than its own
> > > processing, namely processing of the seq record, and this did not
> > > happen.
> >
> > I think there's a legitimate question about what should happen in this
> > case, and I hope to get a better view of it by thinking of similar cases:
> >
> > Suppose the second dbput to the seq record had resulted from a loop in
> > the database, rather than from a second ca_put_callback.  Normal database
> > processing rules would have called for loop processing to stop when it
> > tried to process a record already processing as the result of a dbput.
> > I think neither of us would regard this as an error -- I have databases,
> > in fact, that rely on this behavior to function correctly.  But this
> > case would have looked almost exactly the same as yours to the database
> > processing routines: the only difference is that the second dbput would
> > have propagated the /original/ put_callback to the seq record for a
> > second time, instead of propagating a /new/ put_callback.  (Thus, one
> > conceivable solution might involve named putNotifys.)
> >
> > Another similar case: Suppose you had executed the bo records with
> > ordinary ca_puts, instead of ca_put_callbacks.  You would have gotten
> > exactly the same database-processing result, which is good -- ca_put
> > should achieve the same result as ca_put_callback.  But, with caput,
> > there's no possibility of signaling the collision as an error, because
> > the execution is not traced, so when a collision occurs no code knows
> > who started it.  (The nice thing about ca_put_callback is that it gives
> > the caller a way to avoid a database-processing collision, as long
> > as nobody else is trying to process the database at the same time.)
> >
> > One possible view of this is that the problem is in the collision, not
> > in either the database-processing rules or in ca_put_callback.  It's
> > unfortunate that the user can rarely be expected to know whether or
> > not two puts are going to the same lock set and might collide, but I
> > can't think of a way out that doesn't involve a deep redesign.
> >
> > > It should be possible to record such a failure and return it to
> > > the requester. Even better, the CA server could detect the failure and
> > > wait until the record in question is no longer busy. It already does
> > > so under certain circumstances, namely if I target the same front-end
> > > record from the same client: in this case the second call waits for
> > > the first to finish, i.e. the CA server queues the second request.
> > > (Unfortunately I cannot as easily demonstrate this because caput does
> > > not allow grouping several asynchronous put operations in one
> > > command.)
> > >
> > > If I target a different record (even from the same client), this
> > > queuing does not happen. This is inconsistent. I think that the
> > > queueing should ideally always happen for ca_put_callback as long as
> > > any record in the chain is busy. Failing that, or if the wait times
> > > out, at least an error should be sent back.
> > >
> > > Considering the effort that went into the implementation of the
> > > putNotify feature it seems a shame that it cannot be used for what it
> > > was designed, at least not reliably.
> > >
> > > Since I don't expect this will be fixed any time soon, I will warn
> > > users of the sequencer that using pvPut(var,SYNC) is inherently
> > > unreliable. Instead completion should be detected by other means, such
> > > as reading back a status bit from the hardware.
> >
> > We tried that.  For years I tried different ways to make that work.  It
> > was awful.  put_callback made a *huge* improvement in the speed,
> > robustness, and flexibility of scans.  Importantly, it allows clients
> > to scan a PV without knowing which other PV to read to detect
> > completion.
> >
> > I agree that put_callback has problems with collisions, and I don't mean
> > to minimize the problems.  But I think it's so much better than the
> > achievable alternatives that I would not suggest SNL developers avoid it.
> >
> > > Cheers
> > > Ben
> 
> 
> Helmholtz-Zentrum Berlin für Materialien und Energie GmbH
> Hahn-Meitner-Platz 1, 14109 Berlin
> Chair of the Supervisory Board: Prof. Dr. Dr. h.c. mult. Joachim Treusch
> Deputy Chair: Dr. Beatrix Vierkorn-Rudolph
> Managing Directors: Prof. Dr. Anke Rita Kaysser-Pyzalla, Prof. Dr. Dr.
> h.c. Wolfgang Eberhardt, Dr. Ulrich Breuer
> Registered office: Berlin. Commercial register: AG Charlottenburg, 89
> HRB 5583
> 
> Disclaimer automatically attached by the E-Mail Security Appliance
> mail0.bessy.de 11/24/10 at Helmholtz-Zentrum Berlin GmbH.
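
The quoted message defines the new status as DEFMSG(CA_K_WARNING, 61). The sketch below shows how a severity and a message number could be packed into one status word, and why such a code does not count as success. The SK_ constants are local stand-ins that are only assumed to mirror the layout in caerr.h (severity in the low three bits, message number shifted left by three); check them against the header in your copy of base before relying on them. sk_make_status and the other sk_ helpers are hypothetical and are not part of the CA library.

```c
#include <assert.h>

/* Local stand-ins, ASSUMED to mirror caerr.h; verify against the real header. */
#define SK_K_WARNING   0
#define SK_K_SUCCESS   1
#define SK_M_MSG_NO    0x0000FFF8
#define SK_M_SEVERITY  0x00000007
#define SK_M_SUCCESS   0x00000001
#define SK_V_MSG_NO    3
#define SK_V_SEVERITY  0

/* Pack a message number and a severity into one status word. */
#define SK_DEFMSG(sev, num) \
    ((((num) << SK_V_MSG_NO) & SK_M_MSG_NO) | \
     (((sev) << SK_V_SEVERITY) & SK_M_SEVERITY))

/* Stand-in for the warning defined in the quoted message. */
#define SK_PUTCBNOTCMPL SK_DEFMSG(SK_K_WARNING, 61)

/* Hypothetical helper: range-checked version of the macro. */
int sk_make_status(int severity, int msg_no)
{
    assert(severity >= 0 && severity <= 7);
    assert(msg_no >= 0 && msg_no <= 0x1FFF);
    return SK_DEFMSG(severity, msg_no);
}

/* Extract the message number (the index into the message table). */
int sk_msg_no(int status)
{
    return (status & SK_M_MSG_NO) >> SK_V_MSG_NO;
}

/* Extract the severity field. */
int sk_severity(int status)
{
    return (status & SK_M_SEVERITY) >> SK_V_SEVERITY;
}

/* A status counts as success only when its lowest bit is set. */
int sk_is_success(int status)
{
    return (status & SK_M_SUCCESS) != 0;
}
```

Under these assumptions the warning packs to 61 << 3 = 488, with severity 0 and the success bit clear, so a client that tests only the success bit treats it like any other failed put. The message number is what lets a tool such as the patched caput print the more specific text.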

Replies:: Re: ca_put_callback once again Andrew Johnson

References:: Re: ca_put_callback once again Tim Mooney; Re: ca_put_callback once again Benjamin Franksen


ANJ, 24 Nov 2010
