Folks,
I believe I have found a serious bug either in the gcc compiler or the MVME167
hardware. I am running version vxgccV2.2.3.1 on a Sun4, cross-compiling for a
68k. This is the version which is used by the current version of vxWorks in
use at the APS.
The problem is that the compiler (and/or hardware?) is generating questionable
floating point values, specifically the value "-0.", which is 80000000 in hex.
The Motorola floating point hardware does not seem to object to such values,
nor apparently do Sun computers which receive such invalid numbers over channel
access.
I discovered the problem when I was trying to run medm on a DEC Alpha machine
running Digital Unix. medm would consistently crash with a "Floating
exception" when trying to display screens containing PVs which had certain
values. The floating exception is a "Denormalized number", as shown by the
following dbx output.
************************************************************
[1] Floating exception medm -cleanup (core dumped)
corvette> dbx /usr/local/epics/extensions/bin/alpha/medm core
dbx version 3.11.10
signal Floating point exception at [cvtDoubleToString:152 +0x20,0x1201117c0]
Source not available
(dbx) where
> 0 cvtDoubleToString(flt_value = Denormalized number 0x0080, pstr_value =
0x140033680 = "", precision = 3) ["../cvtFast.c":152, 0x1201117c0]
1 valueToString(pte = 0x140170b60, format = DECIMAL) ["../medmTextEntry.c":
167, 0x12003865c]
2 textEntryDraw(cd = 0x140170b60) ["../medmTextEntry.c":348, 0x1200393bc]
3 updateTaskWorkProc(cd = 0x140033f90) ["../shared.c":592, 0x120081ae0]
4 (unknown)() [0x3ff803ae278]
5 XtAppNextEvent(0x12002e4b8, 0x37, 0x6f, 0x140060000, 0x140034c70)
[0x3ff803ae5a4]
6 main() ["../medm.c":2624, 0x12002e4b4]
************************************************************
The PV which medm was trying to display when it crashed was 13LAB:hvps1.B,
which is a double field in a subroutine record.
Here is the output of a "caget" from the Unix command line on a Sun4 system:
cars1> caget 13LAB:hvps1.B
13LAB:hvps1.B -0
Thus, the Sun4 also sees this as negative 0; it just does not generate floating
exceptions when processing such operands.
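For anyone who wants to test for this value explicitly rather than rely on how a given machine prints it, here is a minimal sketch. It assumes IEEE 754 single-precision floats and a compiler that provides <stdint.h>; the helper names float_bits and is_negative_zero are made up for illustration and are not part of EPICS or vxWorks.

```c
#include <stdint.h>
#include <string.h>

/* Return the raw 32-bit pattern of a float.
   Assumption: float is IEEE 754 single precision (32 bits). */
static uint32_t float_bits(float f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);  /* copy the bytes; avoids pointer-cast aliasing */
    return bits;
}

/* Nonzero if f is negative zero: sign bit set, every other bit clear. */
static int is_negative_zero(float f)
{
    return float_bits(f) == 0x80000000u;
}
```

A plain comparison such as (f == 0.0) cannot make this distinction, because negative zero compares equal to positive zero; that is why the sketch looks at the bits.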
Here is a simple test program which demonstrates the problem with the compiler.
************************************************************
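#include <stdio.h>

/* Note: printf promotes each float argument to double, so %x below is handed
   a double. On the 68k the hex value printed appears to be the upper 32 bits
   of that double (e.g. 3ff00000 for 1.0). */
void test_gcc_bug(void)
{
    struct {
        float value;
        char extra[24];   /* the problem disappears if this array is removed */
    } test;
    float scaler;

    /* Case 1: float inside a structure */
    test.value = 0.;
    printf(" test.value: As float=%f; In hex=%x\n",
           test.value, test.value);
    test.value = 1.;
    printf(" test.value: As float=%f; In hex=%x\n",
           test.value, test.value);
    test.value = -1.;
    printf(" test.value: As float=%f; In hex=%x\n",
           test.value, test.value);
    if (test.value < 0.0) test.value = 0.0;   /* this assignment yields -0. */
    printf(" test.value: As float=%f; In hex=%x\n",
           test.value, test.value);

    /* Case 2: simple float variable */
    scaler = 0.;
    printf(" scaler: As float=%f; In hex=%x\n",
           scaler, scaler);
    scaler = 1.;
    printf(" scaler: As float=%f; In hex=%x\n",
           scaler, scaler);
    scaler = -1.;
    printf(" scaler: As float=%f; In hex=%x\n",
           scaler, scaler);
    if (scaler < 0.0) scaler = 0.0;           /* this assignment yields +0. */
    printf(" scaler: As float=%f; In hex=%x\n",
           scaler, scaler);
}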
************************************************************
Here is the output of that program, running on an MVME167 CPU.
13-lab> test_gcc_bug
test.value: As float=0.000000; In hex=0
test.value: As float=1.000000; In hex=3ff00000
test.value: As float=-1.000000; In hex=bff00000
test.value: As float=0.000000; In hex=80000000
scaler: As float=0.000000; In hex=0
scaler: As float=1.000000; In hex=3ff00000
scaler: As float=-1.000000; In hex=bff00000
scaler: As float=0.000000; In hex=0
Thus, when test.value is initially assigned the value 0.0, it is a valid zero
(all zero bits when displayed in hex).
We then assign 1 and then -1.0 to test.value. Those values are OK.
However, we now test to see if test.value is less than 0.0 and assign it the
value 0.0 if it is. After this assignment test.value has the hex value 80000000
which is "negative zero". Why???
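A note on reading the hex column: a float passed to printf is promoted to double, so %x is really being handed a 64-bit double. The values 3ff00000 and bff00000 match the upper 32 bits of the doubles 1.0 and -1.0, and 80000000 matches the upper 32 bits of the double -0.0. Below is a sketch of printing the same column in a well-defined way. It assumes IEEE 754 doubles and a compiler with <stdint.h>; double_high_word and print_float_hex are hypothetical names used only for this example.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Upper 32 bits of a double's bit pattern.
   Assumption: double is IEEE 754 binary64 and shares the integer byte order. */
static uint32_t double_high_word(double d)
{
    uint64_t bits;
    memcpy(&bits, &d, sizeof bits);   /* copy the bytes, no pointer casts */
    return (uint32_t)(bits >> 32);    /* sign, exponent, top of the mantissa */
}

/* Print one line in the same layout as the test program, but pass printf
   an integer for the hex field instead of a floating point value. */
static void print_float_hex(const char *name, float f)
{
    printf(" %s: As float=%f; In hex=%lx\n", name, (double)f,
           (unsigned long)double_high_word((double)f));
}
```

Calling print_float_hex("test.value", test.value) after each assignment should give the same kind of listing without depending on how the platform lays out a double in the argument list.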
Note that the assignment of 0x80000000 does NOT happen in the following cases:
- When the variable is a simple scalar, as with the variable "scaler" in this
test program
- If I eliminate the char[24] array from the structure.
This is a serious problem, since channel access passes such values to my DEC
Alpha clients, which will crash on these apparently invalid operands.
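One possible stopgap would be to strip the sign from a zero before the value is handed on. The sketch below is only an illustration under stated assumptions: IEEE 754 single-precision floats, a compiler with <stdint.h>, and a hypothetical helper name strip_negative_zero. It works on the bit pattern with integer operations, so it does not rely on the floating point assignment that misbehaves in the test program. I have not tried it with this compiler, and the real field is a double, so the width would need adjusting.

```c
#include <stdint.h>
#include <string.h>

/* Return f unchanged unless it is negative zero, in which case return +0.
   Assumption: float is IEEE 754 single precision (32 bits). */
static float strip_negative_zero(float f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);   /* read the raw bit pattern */
    if (bits == 0x80000000u) {        /* sign bit set, all other bits clear */
        bits = 0;                     /* all-zero pattern is positive zero */
        memcpy(&f, &bits, sizeof f);
    }
    return f;
}
```

For example, test.value = strip_negative_zero(test.value); after the comparison would leave 1.0 and -1.0 alone and turn 80000000 into 0.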
What is the official ruling on -0.0? Is it invalid or not? Why is the compiler
generating it sometimes and not others?
Any ideas much appreciated.
____________________________________________________________
Mark Rivers (773) 702-2279 (office)
CARS (773) 702-9951 (secretary)
Univ. of Chicago (773) 702-5454 (FAX)
5640 S. Ellis Ave. (708) 922-0499 (home)
Chicago, IL 60637 [email protected] (e-mail)
or:
Argonne National Laboratory (630) 252-0422 (office)
Building 434A (630) 252-0405 (lab)
9700 South Cass Avenue (630) 252-1713 (beamline)
Argonne, IL 60439 (630) 252-0443 (FAX)