Dear all,
The following is a text version of a proposal on EPICS status codes.
It summarises some of the alternatives and proposes a way forward. There is
a PostScript document with the same text but including figures and footnotes
at
ftp://ftp.keck.hawaii.edu/pub/epics/statusCodes.fm.ps
Unfortunately fm2html failed to convert the document properly so I don't
currently have an HTML version.
Please feed back any comments. Thanks,
William
------------------------------------------------------------------------
EPICS status code proposals
William Lupton, 10/19/95
[email protected]
W. M. Keck Observatory,
P. O. Box 220, Kamuela, HI 96743.
1. Introduction
This document proposes some upwards-compatible changes to EPICS' status
code handling which attempt to meet the following objectives:
1. Subsystems will be able to allocate error codes without
the need for changing system files such as errMdef.h.
2. Error code severity will be supported.
3. There will be minimal impact on existing code.
In addition, the document discusses ways in which status codes and their
translations might be integrated into EPICS record processing.
There have been several recent e-mail discussions on this subject and I
have attempted to take all points of view into account.
2. Preliminary discussion
2.1. Global conventions on status code fields
EPICS status codes are declared as long integers (which in practical
terms guarantees that they are at least 32 bits long). Currently the
high-order 16 bits indicate the subsystem and the low-order 16 bits hold
a subsystem-specific code[1].
Jim Kowalkowski has suggested that status codes should be small numbers
(0, 1, 2 etc.) and that each subsystem should be responsible for
extracting severities from and returning the text values of its codes.
Jeff Hill has pointed out that this causes a problem where a status code
is passed through many subsystems before it reaches the top-level
application which will deal with it. How is the top-level application to
know which subsystem to interrogate in order to translate the code?
For this reason, I will assume that there are global conventions on how
to extract generic information from error codes.
2.2. Error logging
I will assume that the existing errMessage() / errPrintf() routines will
be used for error logging. This implies that any new information which
is made available to the error logging system is contained within the
status value.
Perhaps this is a big assumption. I propose that we first see how things
might work without changing the error-reporting interface.
2.3. Status codes versus status text
Just about everyone has assumed that the translated status text rather
than the status code will be sent to the client. Given that clients and
servers may be running different versions of EPICS, this is clearly the
easiest way of allowing clients to report meaningful error messages.
Some people have proposed that there be some sort of error message
server, which would be capable of translating status codes to text. I
will be assuming that we will rely on passing status text directly to
clients (although I see no reason not to pass the code as well).
2.4. On-the-fly versus static subsystem number allocation
Most people have assumed that subsystem numbers will be allocated on the
fly. However, this implies that the actual status codes are unknown at
compilation time and therefore that they cannot be used in switch / case
statements. Also there is a danger that people will be confused by
status codes which are not constants.
The above is not a show-stopping problem but I will assume that
subsystem numbers are allocated statically and will propose some
conventions which should minimize the risk of conflicts.
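To make the switch / case point concrete, here is a minimal sketch using the existing dbAccess definitions from errMdef.h and the dbAccess include file. The describe() routine is hypothetical and exists only for illustration. Because the codes are compile-time constants, they are legal case labels; with on-the-fly allocation they would be variables and the switch would have to become an if / else chain.

```c
#include <stddef.h>

/* Existing-style definitions: constants known at compile time. */
#define M_dbAccess       (501 << 16)
#define S_db_notFound    (M_dbAccess | 1)
#define S_db_badDbrtype  (M_dbAccess | 3)

/* Hypothetical handler (not an EPICS routine): statically allocated
 * status codes can be used directly as case labels. */
const char *describe(long status)
{
    switch (status) {
    case 0:               return "success";
    case S_db_notFound:   return "Process Variable Not Found";
    case S_db_badDbrtype: return "Illegal Database Request Type";
    default:              return "unknown status";
    }
}
```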
3. Status code format
The current status code format is:
<top 16 bits subsys> <bottom 16 bits subsys-specific code>
I propose the following format:
<top 4 bits sevr> <6 bits group> <6 bits subsys> <16 bits subsys-specific>
3.1. Severity
sevr is the severity, the most significant four bits, for which I
propose the following bit assignments:
<top two bits same as record SEVR> <defined> <spare>
The top two bits share the encoding of the standard EPICS severity
(SEVR) field (0=OK, 1=minor, 2=major, 3=invalid), which means that bit
31 (Error) is set if and only if the severity is major or invalid, i.e.
the status indicates an error.
Bit 29 (Defined) will be set if and only if severity information is
available (this is for upwards compatibility since all existing status
codes have zero severity, which really means "no severity information
available").
Bit 28 (Spare) is spare. I will suggest later that it might be used to
indicate that this status code should result in a verbal message if that
is possible.
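As a rough illustration of the severity bits described above, the following C fragment extracts each sub-field from a status code. All of the names (errSevr(), errSevrDefined(), errIsError(), errMakeSevr()) are hypothetical and are not existing EPICS routines. The sketch uses unsigned long throughout so that shifts into bit 31 are well defined.

```c
#include <assert.h>

/* Hypothetical helpers (not part of EPICS) for the proposed layout:
 * bits 31-30 = severity (same encoding as a record's SEVR field),
 * bit 29 = Defined, bit 28 = Spare. */
#define ERR_SEVR_SHIFT   30
#define ERR_SEVR_MASK    0x3UL
#define ERR_DEFINED_BIT  (1UL << 29)
#define ERR_SPARE_BIT    (1UL << 28)

/* Returns 0=OK, 1=minor, 2=major, 3=invalid. */
unsigned errSevr(unsigned long status)
{
    return (unsigned)((status >> ERR_SEVR_SHIFT) & ERR_SEVR_MASK);
}

/* Non-zero if the code carries severity information at all. */
int errSevrDefined(unsigned long status)
{
    return (status & ERR_DEFINED_BIT) != 0;
}

/* Non-zero if bit 31 is set, i.e. the severity is major or invalid. */
int errIsError(unsigned long status)
{
    return ((status >> 31) & 1UL) != 0;
}

/* Build the severity nibble (Defined bit set) for a given SEVR value. */
unsigned long errMakeSevr(unsigned sevr)
{
    assert(sevr <= 3);   /* only two bits are available */
    return ERR_DEFINED_BIT | ((unsigned long)sevr << ERR_SEVR_SHIFT);
}
```

An existing code such as (501 << 16) | 1 has all four top bits clear, so errSevrDefined() returns zero for it and callers can treat it as "no severity information available".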
3.2. Group
group identifies which organization has defined this subsystem code. All
VxWorks status codes will lie in group 0 (or possibly group 1). All
existing EPICS codes lie in group 7 or 8. Thus I propose, for
simplicity, that we allocate groups 16-63 to organizations.
Alternatively (and perhaps better), we could follow Jeff Hill's
suggestion of setting up a register, perhaps WWW-based, for allocation
of subsystem numbers, in which case the group and subsys fields could be
combined into a single 12 bit field.
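The group arithmetic can be checked with a small sketch. The accessor names below are hypothetical (they are not EPICS routines); the shifts and masks simply follow the proposed format of 4 bits sevr, 6 bits group, 6 bits subsys and 16 bits of subsystem-specific code.

```c
#include <assert.h>

/* Hypothetical field accessors for the proposed layout:
 * <4 bits sevr> <6 bits group> <6 bits subsys> <16 bits code>. */
unsigned errGroup(unsigned long status)
{
    return (unsigned)((status >> 22) & 0x3fUL);
}

unsigned errSubsys(unsigned long status)
{
    return (unsigned)((status >> 16) & 0x3fUL);
}

unsigned errCode(unsigned long status)
{
    return (unsigned)(status & 0xffffUL);
}

/* Alternative without groups: one combined 12-bit subsystem field. */
unsigned errSubsys12(unsigned long status)
{
    return (unsigned)((status >> 16) & 0xfffUL);
}

/* Build the group/subsys part of a status code. */
unsigned long errMakeSubsys(unsigned group, unsigned subsys)
{
    assert(group < 64 && subsys < 64);   /* each field is 6 bits wide */
    return ((unsigned long)group << 22) | ((unsigned long)subsys << 16);
}
```

For example, the existing code (501 << 16) | 1 decodes to group 7, subsystem 53, code 1, which is consistent with the statement that existing EPICS codes lie in group 7 or 8. Read as a single 12-bit field, the same code gives back subsystem 501.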
3.3. Notes
1. Status code zero is special and means success for all
subsystems.
2. In the current system, subsystem number zero is reserved
for Unix (Posix.1) error codes. We should probably also
reserve small negative numbers (subsystem 0xfff) as a
special case.
4. Defining status codes
4.1. Current practice
Currently, base/include/errMdef.h defines subsystem numbers, as in:
#define M_dbAccess (501 << 16) /*Database Access Routines */
#define M_sdr (502 << 16) /*Self Defining Records */
Each subsystem defines its own error codes in an include file in
base/include, as in:
#define S_db_notFound (M_dbAccess| 1) /*Process Variable Not Found*/
#define S_db_badDbrtype (M_dbAccess| 3) /*Illegal Database Request Type*/
4.2. Proposed changes
New subsystems can define their M_xxx macros in their own include files,
using values assigned from some central registry.
errMdef.h will define some new macros for manipulating severities and
subsystem numbers. For example (assuming that we stick with groups):
#define ERR_SEVR_SPARE (1<<28)
#define ERR_SEVR_DEFINED (1<<29)
#define ERR_SEVR_NO (ERR_SEVR_DEFINED|(NO_ALARM<<30))
#define ERR_SEVR_MINOR (ERR_SEVR_DEFINED|(MINOR_ALARM<<30))
#define ERR_SEVR_MAJOR (ERR_SEVR_DEFINED|(MAJOR_ALARM<<30))
#define ERR_SEVR_INVALID (ERR_SEVR_DEFINED|(INVALID_ALARM<<30))
#define ERR_GROUP_LANL 16
#define ERR_GROUP_ANL 17
#define errSubsysCode(_group,_subsys) (((_group)<<22)|((_subsys)<<16))
Now a new subsystem can define status codes in its own include file:
#define M_mysubsys errSubsysCode(ERR_GROUP_LANL,0)
#define S_mysubsys_OK 0
#define S_mysubsys_minor (ERR_SEVR_MINOR|M_mysubsys|0) /*inconsequential*/
The above is intended only to show how it might work. There will be more
macros and they will probably have different names.
4.3. Notes
1. Not much changes:
a. severity is added (but existing codes, which have no
severity information, continue to work)
b. subsystem numbers are no longer defined centrally but
instead some mechanism for allocating unique numbers is
assumed.
2. If we got rid of the group (institution) codes, the
subsystem number would become 12 bits but not much else
would change.
3. If we assumed dynamic allocation of subsystem numbers, a
variable would have to be defined for the subsystem number
(it might have to be global and it should be initialized to
some known invalid subsystem number).
5. Changes to error code parsing tools
5.1. Current practice
1. The developer creates the file base/libCom/errInc.c,
which includes all files which define error codes
(including errMdef.h).
2. base/tools/blderrSymTbl generates a list of all files
referenced directly or indirectly by errInc.c which
define macros whose names begin S_.
3. blderrSymTbl then causes base/tools/makeStatTbl (a
modified version of the VxWorks tool of the same name)
to be run on the above list of files.
4. This generates a file errSymTbl.c which defines a table
relating status codes to status text and an errSymTbl
global variable which references the table.
5. base/libCom/errSymLib.c contains routines to hash the
above linear table and to look up status text.
5.2. Proposed changes
The basic difference between the current and proposed schemes is that
the current scheme concatenates the status codes from all subsystems,
whereas the proposed scheme keeps each subsystem separate until run-time.
It is therefore necessary somehow to make the new subsystems known to
the routines which are responsible for translating their error codes. I
propose the following:
1. base/libCom/errInc.c and base/tools/blderrSymTbl are no
longer used.
2. A modified version of base/tools/makeStatTbl (perhaps
with a different name) generates a subsystem-specific
table and a global variable which references it. The
table will be similar to the existing table but will
include the subsystem number and name.
3. A new errSubsysRegister(errSubsysTable *table) routine
is added to base/libCom/errSymLib.c. Any subsystem
which wants its error codes to be accessible must call
this routine.
4. Existing routines in errSymLib.c are extended to support
the new conventions.
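The registration step might look something like the following sketch. Only the name errSubsysRegister() and its argument type come from the proposal; the layout of errSubsysTable, the int return value, the errSymbol type and the errSubsysLookup() routine are all assumptions made for illustration. A linear search is used for brevity where the real errSymLib.c hashes its table.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-subsystem table, roughly what a modified
 * makeStatTbl might generate. */
typedef struct {
    long        code;   /* full status code */
    const char *text;   /* its translation */
} errSymbol;

typedef struct errSubsysTable {
    unsigned long    subsysNum;  /* group and subsys fields, in place */
    const char      *name;       /* subsystem name */
    size_t           nSymbols;
    const errSymbol *symbols;
    struct errSubsysTable *next; /* link used by the registry */
} errSubsysTable;

static errSubsysTable *errSubsysList = NULL;

/* Register a table. Returns 0 on success (or if already registered),
 * -1 if another table already claims the same subsystem number. */
int errSubsysRegister(errSubsysTable *table)
{
    errSubsysTable *p;
    assert(table != NULL);
    for (p = errSubsysList; p != NULL; p = p->next) {
        if (p == table) return 0;
        if (p->subsysNum == table->subsysNum) return -1;
    }
    table->next = errSubsysList;
    errSubsysList = table;
    return 0;
}

/* Translate a status code: mask off the severity and the
 * subsystem-specific code to find the owning table, then search it. */
const char *errSubsysLookup(long status)
{
    unsigned long key = (unsigned long)status & 0x0fff0000UL;
    errSubsysTable *p;
    size_t i;
    for (p = errSubsysList; p != NULL; p = p->next) {
        if (p->subsysNum != key) continue;
        for (i = 0; i < p->nSymbols; i++)
            if (p->symbols[i].code == status) return p->symbols[i].text;
    }
    return NULL;   /* unknown subsystem or unknown code */
}
```

Returning an error on a duplicate subsystem number is one way of catching allocation conflicts early, at registration time, if statically allocated numbers ever collide.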
5.3. Notes
1. The only application code-level change is that
errSubsysRegister() must be called.
2. Under VxWorks, and assuming that all the subsystem error
tables have names which match some pattern (e.g.
xxxxxxERRSYMTAB), one could use symEach() automatically
to register all subsystems. Is this worth following up?
3. Under Unix, I don't believe that there is any way of
traversing the list of global symbols defined in an
executable or shareable library, so this would not be
possible (remember that, with the near-availability of
the portable channel access server, servers do not
necessarily run under VxWorks).
6. Error logging
All the proposals so far will work fine with the existing errPrintf()
calls. What happens next?
Jim Kowalkowski has proposed that the error information flowing over the
network could look like:
struct super_duper_error_info {
    short subsystem_error_code; // in network byte order
    short priority_of_error_code; // in network byte order
    char subsystem_name[8];
    TS_STAMP time_stamp; // maybe
    char error_message_text[SOME_BIGGER_THING];
};
where priority is the same as what I am calling severity (I think).
I would also pass the subsystem-specific status code for completeness
(although most clients would do nothing with it apart, possibly, from
writing it somewhere). It may also be a good idea to break out the
string which is the direct translation of the status code from the extra
string which was provided in the call to errPrintf().
Another thing that might be useful would be some hint as to what the
receiver of the message might like to do with it. For example, perhaps
it's a debugging message which just wants to go to a log file and not be
visible to an operator. Or perhaps the status code is so fatal (or so
interesting) that a verbal message should be generated?
At Keck, where we have a mix of EPICS and non-EPICS systems, we are
planning to provide a local implementation of errPrintf(), which will
(indirectly) use the syslog protocol to log messages. So in our case,
some of the above info will have to be encoded by convention.
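As one example of what "encoded by convention" could mean, the sketch below maps the proposed severity bits onto the standard syslog priorities from <syslog.h>. The routine name and the particular mapping are assumptions chosen for illustration; they do not describe the actual Keck implementation.

```c
#include <syslog.h>

/* Hypothetical convention (not from EPICS or Keck): map the proposed
 * severity bits onto standard syslog priorities. */
int errSyslogPriority(unsigned long status)
{
    if ((status & (1UL << 29)) == 0)
        return LOG_NOTICE;            /* Defined bit clear: no severity info */
    switch ((status >> 30) & 0x3UL) {
    case 0:  return LOG_INFO;         /* NO_ALARM */
    case 1:  return LOG_WARNING;      /* MINOR_ALARM */
    case 2:  return LOG_ERR;          /* MAJOR_ALARM */
    default: return LOG_CRIT;         /* INVALID_ALARM */
    }
}
```

A local errPrintf() could then pass the returned value as the priority argument of syslog(), leaving the message text itself unchanged.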
7. Database support
Every EPICS record has:
a SEVR (severity) field which has one of the four values
NO_ALARM, MINOR_ALARM, MAJOR_ALARM or INVALID_ALARM, and
a STAT (status) field which has one of (currently) 22 values, of
which the first few are NO_ALARM, READ_ALARM, WRITE_ALARM,
HIHI_ALARM.
These values are defined in base/include/alarm.h and cannot easily be
changed. One probably doesn't want to add SEVR values, but it often
makes sense to add STAT values. Everything that has been said in the
earlier status code discussion applies here too.
I would like to propose that we consider adding two new fields to every
record:
a long field to contain a standard status code associated with
the record (can't use STAT because it is 16 bits and is treated
as an enumerated type), and
a string field to contain a status string associated with the
record (typically the translation of the status code with
perhaps some extra appended text).
Tentative names for these fields are ERNO and ERST.
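A minimal sketch of how record support might fill in the two fields follows. The exampleRecord structure, the recSetStatus() routine and the 40-character string size are assumptions made for illustration; only the tentative field names ERNO and ERST come from the proposal.

```c
#include <assert.h>
#include <stdio.h>

#define ERST_SIZE 40   /* assumption: size of an EPICS string field */

/* Hypothetical fragment of a record holding the two proposed fields. */
typedef struct {
    long erno;              /* ERNO: standard status code */
    char erst[ERST_SIZE];   /* ERST: status string */
} exampleRecord;

/* Store the code and "<translation>: <extra>", truncated to fit. */
void recSetStatus(exampleRecord *prec, long status,
                  const char *text, const char *extra)
{
    assert(prec != NULL && text != NULL);
    prec->erno = status;
    if (extra != NULL && extra[0] != '\0')
        snprintf(prec->erst, sizeof prec->erst, "%s: %s", text, extra);
    else
        snprintf(prec->erst, sizeof prec->erst, "%s", text);
}
```

Because the string field has a fixed size, long translations with appended text are silently truncated in this sketch; a real implementation would need to decide whether that is acceptable.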
8. References
[1] EPICS Input / Output Controller (IOC) Application
Developer's Guide, Martin Kraimer, November 1994.