On 07/31/2018 02:23 AM, Ralph Lange wrote:
> Disclaimer: the original report was not going to the list, so all
> background I know is from the snippets in Andrew's mail.
I only removed trivial lines from the original when replying, for
exactly that reason.
> Ad 1.
> I strongly support this for MENU fields. If the CA server rejects a
> value as illegal, it should stick to that promise, i.e. not set the
> field to the rejected value.
AIUI the error message didn't come from the ca_put() operation, Rolf's
error message "dbFastLinkConv(cvt_menu_st)" came from the subsequent
ca_get_callback() which failed to convert the new index value into a
string. The ca_put() succeeded as the number was written into the HSV field.
Looking at the code though I'm puzzled. The dbFastLinkConv.c routine
cvt_st_menu is already checking for and rejecting integers outside the
valid range for the menu without writing them to the field, so I don't
see how it's getting through.
Ahhh, of course caput is sending it as a DBF_ENUM and not as a string,
so the put-conversion routine used is going to be cvt_e_e() which does
no checking at all. Interestingly if I use '-s' caput refuses to send
any value which is not one of the valid choice strings, so I can't use
that to send "2000" as a string.
I can confirm that the IOC's value checks are working by using the
caput.pl script, which does send the value as a string, shows an
exception on the client side:
> tux% bin/linux-x86_64/caput.pl anj:ao.HSV 100
> Old : anj:ao.HSV MINOR
> CA.Client.Exception...............................................
> Warning: "Channel write request failed"
> Context: "op=1, channel=anj:ao.HSV, type=DBR_STRING, count=1, ctx="anj:ao""
> Source File: ../oldChannelNotify.cpp line 160
> Current Time: Tue Jul 31 2018 15:42:00.777857379
> ..................................................................
> New : anj:ao.HSV MINOR
and triggers the expected error on the IOC:
> Illegal choice PV: anj:ao.HSV error detected in routine: dbFastLinkConv(cvt_st_menu)
Fixing cvt_e_e() is not trivially possible since it is used for copying
values between all of the enum/menu/device types and in both directions,
and its third paddr argument doesn't always describe the destination field.
> For ENUM ... I think records like mbbi/mbbo can be used without strings
> for choices, and I fear that changes could easily break existing
> applications.
Agreed. ENUM field choices can change dynamically so what was legal a
moment ago might now not be, or vice-versa thus the code has to be able
to handle numeric values outside of the current legal range.
> Ad 2.
> No real opinion on that.
>
> Ad 3.
> Good move. Restricting proliferation of illegal and possibly dangerous
> matter is always a good idea.
Thanks, I will commit that change to the 3.14 branch.
- Andrew
> On Mon, Jul 30, 2018 at 9:55 PM Andrew Johnson <[email protected]
> <mailto:[email protected]>> wrote:
>
> Hi Rolf,
>
> Thanks for the report. I don't think the newer sites are doing global
> alarm acknowledgement to the IOCs, and the PVA protocol doesn't have
> support for this at all to my knowledge.
>
>
> On 07/29/2018 03:34 PM, Rolf Keitel wrote:
> > When debugging my Qt-based alarm handler I noticed the following
> > behaviour (3.14.12):
> >
> > If I caput an invalid value to a severity field, i.e. caput xxxx.HSV
> > 2000, I get a CA client exception.
> >
> > On the IOC I see he message: Illegal choice PV: xxxx.HSV error
> detected
> > in routine: dbFastLinkConv(cvt_menu_st).
> >
> > A dbpr xxxx 3 shows HSV field as <nil>
>
> I get the same in 3.16, and the 7.0 branch will be the same too.
>
> > Now, when the channel goes into alarm, dpr xxx 3 shows HSV <nil>,
> > SEVR<nil> , ACKS <nil>, but STAT HIGH. The alarm handler receives the
> > callback message with an acks value of 2000 (the invalid value).
>
> When the channel goes into HIGH alarm the severity gets copied from HSV
> to SEVR and ACKS, and STAT set to HIGH. I can read the illegal value
> from the ACKS field using 'caget -n' and it is 2000 as you say.
>
> > In this situation alh displays the alarm as INVALID, but cannot
> > acknowledge the alarm, the only way to get rid of the alarm in alh
> seems
> > to be a reboot of the IOC.
>
> The SEVR field does get cleared again when the alarm goes away, but as
> long as the record's ACKT is YES the ACKS value will not be reset until
> a client asks for it to be. If some other client manages to clear ACKS
> (or sets ACKT to NO) I assume that a restart of alh would also clear the
> problem since all memory of the illegal state would then be gone.
>
> > In my alarm handler I managed to acknowledge by sending the illegal
> > value 2000 with the acknowledge message to the IOC.
>
> The caput tool provides no way to acknowledge an alarm; at least with
> caget you can ask for the acknowledgement status:
>
> > tux% caget -d DBR_STSACK_STRING anj:ao
> > anj:ao
> > Native data type: DBF_DOUBLE
> > Request type: DBR_STSACK_STRING
> > Element count: 1
> > Value: 0
> > Status: NO_ALARM
> > Severity: NO_ALARM
> > Ack transient?: YES
> > Ack severity: ??
>
> I assume you're calling something like ca_put(DBR_PUT_ACKS, chid, &2000)
> which I would expect to do the right thing for the acknowledgement
> inside the IOC.
>
> Looking at my CA Perl bindings the put_acks() method checks the severity
> argument from the user code so it can't be used to acknowledge anything
> above the expected INVALID value.
>
> > I cannot test right now if this IOC behaviour was changed in a later
> > EPICS release.
>
> I'm fairly sure nothing will have changed in newer releases, we haven't
> touched that area of the code to my knowledge.
>
>
> This is obviously a bug in the IOC's handling of acknowledgements and/or
> alarm severities, the question in my mind is where we should be checking
> for bad severities because they should never be getting outside the IOC.
> There are several possibilities, and we might want to implement more
> than one:
>
> 1. When doing a put to an ENUM or MENU field, reject numbers out of the
> legal range. Sometimes we do want to allow this, but maybe only for ENUM
> fields? Not sure.
>
> 2. The recGblSetSevr() macro could be converting any severity above
> MAJOR_ALARM into INVALID_ALARM, but in the macro the NSEV argument is
> already used twice and this would take it to 4 times, so we might need
> to convert it into a proper function.
>
> 3. The recGblResetAlarms() routine should check NSEV before copying it
> to the other fields — this change is trivial and appears to be all that
> is really needed to prevent clients from seeing bad severities:
>
> diff --git a/src/db/recGbl.c b/src/db/recGbl.c
> index 6d45e73..01b2fa5 100644
> --- a/src/db/recGbl.c
> +++ b/src/db/recGbl.c
> @@ -19,6 +19,7 @@
> #include <limits.h>
>
> #include "dbDefs.h"
> +#include "alarm.h"
> #include "epicsMath.h"
> #include "epicsTime.h"
> #include "epicsPrint.h"
> @@ -222,6 +223,9 @@ unsigned short epicsShareAPI recGblResetAlarms(void
> *precord)
> epicsEnum16 val_mask = 0;
> epicsEnum16 stat_mask = 0;
>
> + if (new_sevr > INVALID_ALARM)
> + new_sevr = INVALID_ALARM;
> +
> pdbc->stat = new_stat;
> pdbc->sevr = new_sevr;
> pdbc->nsta = 0;
>
>
> This doesn't prevent the HSV field from being set to an illegal value,
> but any device support or subroutine record could easily do that so #1
> above couldn't prevent it from happening.
>
> I am inclined to commit the above change to the 3.14 branch and merge
> up, and leave doing anything further for discussion between the core
> developers in one of the downstream branches.
>
> Anybody got any comments?
>
> - Andrew
>
> --
> Arguing for surveillance because you have nothing to hide is no
> different than making the claim, "I don't care about freedom of
> speech because I have nothing to say." -- Edward Snowdon
>
--
Arguing for surveillance because you have nothing to hide is no
different than making the claim, "I don't care about freedom of
speech because I have nothing to say." -- Edward Snowdon
- References:
- Re: alarm handler issue Andrew Johnson
- Re: alarm handler issue Ralph Lange
- Navigate by Date:
- Prev:
Re: Compiler error on Debian Stretch arm Ralph Lange
- Next:
Build failed in Jenkins: epics-base-3.14-win64 #256 APS Jenkins
- Index:
2002
2003
2004
2005
2006
2007
2008
2009
2010
2011
2012
2013
2014
2015
2016
2017
<2018>
2019
2020
2021
2022
2023
2024
- Navigate by Thread:
- Prev:
Re: alarm handler issue Ralph Lange
- Next:
Re: Compiler error on Debian Stretch arm Ralph Lange
- Index:
2002
2003
2004
2005
2006
2007
2008
2009
2010
2011
2012
2013
2014
2015
2016
2017
<2018>
2019
2020
2021
2022
2023
2024
|