Argonne National Laboratory

Experimental Physics and
Industrial Control System


Subject: RE: EDM X/Y Plot segfaults
From: "Baily, Scott A" <sbaily@lanl.gov>
To: Bruce Hill <bhill@slac.stanford.edu>, Eric Norum <wenorum@lbl.gov>
Cc: Gregory Portmann <gregportmann@hotmail.com>, Michael Chin <MJChin@lbl.gov>, EPICS Tech-Talk <tech-talk@aps.anl.gov>
Date: Thu, 24 May 2018 15:43:47 +0000

I also have some patches for the XYplot.  I’ve added support for 0-element subscriptions to edm, and modified the plot widget so it won’t show trailing 0’s past the end of the waveform.  I also fixed a number of crashes in the XYplot, most of which came from uninitialized values being used.  I sent these to John Sinclair early this year, but there haven’t been any releases of EDM in over a year. Is anyone maintaining EDM? I saw some code on github, but it had not been updated with the latest release of EDM.

--

Correspondence

--

Scott Baily

AOT-IC, MS H820

Los Alamos National Laboratory

Los Alamos, NM 87545

ph: (505) 606-2260

 

From: tech-talk-bounces@aps.anl.gov <tech-talk-bounces@aps.anl.gov> On Behalf Of Bruce Hill
Sent: Wednesday, May 23, 2018 7:16 PM
To: Eric Norum <wenorum@lbl.gov>
Cc: Gregory Portmann <gregportmann@hotmail.com>; Michael Chin <MJChin@lbl.gov>; EPICS Tech-Talk <tech-talk@aps.anl.gov>
Subject: Re: EDM X/Y Plot segfaults

 

Hi Eric,
We use xygraph for many displays at SLAC and generally it works well without crashing.

However, recently one of our developers found a repeatable crash related to xygraph that she suggested may be related to your crash.

The crash scenario she found involves one screen running an xygraph w/ several traces, each w/ 8k x and y arrays.   If we then try to open or execute another screen w/ 6 xygraphs, each w/ 8 similar traces, it crashes every time.   The gdb stack trace wasn't exactly the same as yours and varied as to where it crashed, but it always involved a NULL yPvData[i].

Cutting back on the number of xygraphs in the 2nd screen or the number of traces appears to
make the crash less likely, and if I remove enough of them, it can succeed.

My hypothesis re the root cause is that the xygraph is redrawing before all the PV's have reconnected.    To address this, I added tests for NULL xPvData[i] and NULL yPvData[i] at
the top of genXyVector() and genChronoVector().    If EDMDEBUGMODE is set to a non-zero
value it prints an error msg each time xygraph would have crashed.

With this patch we can run the full test screens and see 4 or 5 of the NULL yPvData error msgs.
The trace isn't drawn if xPvData[i] or yPvData[i] is NULL, but it doesn't crash and subsequent
trace updates work ok.
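
For readers without the attachment, here is a minimal standalone sketch of the kind of guard described above. It is not the attached patch itself. The helper names traceDataReady() and debugEnabled() and the constant kMaxTraces are hypothetical, and the sketch assumes EDMDEBUGMODE is read from the environment. The real checks sit at the top of genXyVector() and genChronoVector() inside xyGraphClass.

```cpp
#include <cassert>
#include <cstdio>
#include <cstdlib>

// Hypothetical trace limit; the gdb output in this thread shows a
// 20-entry yPvData array.
const int kMaxTraces = 20;

// True when EDMDEBUGMODE is set to a non-zero integer.
// Assumption: the setting is read from the environment.
bool debugEnabled() {
  const char *s = std::getenv("EDMDEBUGMODE");
  return s != nullptr && std::atoi(s) != 0;
}

// Hypothetical helper: returns true only if both data buffers for
// trace i exist. When one is missing it optionally logs and returns
// false so the caller can skip drawing this trace on this update.
bool traceDataReady(void *const xPvData[], void *const yPvData[],
                    int i, const char *caller) {
  assert(i >= 0 && i < kMaxTraces);
  if (xPvData[i] != nullptr && yPvData[i] != nullptr) {
    return true;
  }
  if (debugEnabled()) {
    std::fprintf(stderr, "%s: trace %d skipped, %s is NULL\n", caller, i,
                 xPvData[i] == nullptr ? "xPvData" : "yPvData");
  }
  return false;
}
```

A caller would return early when traceDataReady() is false, leaving the trace undrawn until a later update arrives with both buffers populated.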

I don't know if this patch will fix your issue, but the patch is attached if you want to give it a try.

Cheers,
- Bruce


On 05/09/2018 11:38 AM, Eric Norum wrote:

Hmm…sorry to keep following up my own posting, but shortly after I sent the message containing what I thought was a work-around I got another segfault.  It looks like there are *lots* of places that can have null pointer dereferences in the methods invoked from xyGraphClass::executeDeferred.  I hope there are still some edm maintainers out there that can help with this since my quick hack fix clearly isn’t going to work.

 

I found that hacking in the following changes seems to stop the segfaults and result in waveform display.  I suspect that there’s a better fix that involves not invoking this method when the pointers in question are invalid, but I don’t have any idea what that would entail.

 

diff -u baselib/xygraph.cc.orig baselib/xygraph.cc
--- baselib/xygraph.cc.orig     2018-05-09 10:49:02.055680000 -0700
+++ baselib/xygraph.cc  2018-05-09 10:52:04.804485000 -0700
@@ -6298,6 +6298,7 @@
   arrayNumPoints[i] = 0;
 
   for ( ii=0; ii<yPvCount[i]; ii++ ) {
+    if (!yPvData[i]) dyValue = 0; else
 
     // There are two views of pv types, Type and specificType; this uses
     // specificType
@@ -6413,7 +6414,7 @@
 
 #endif
 
-    dxValue = ( (double *) xPvData[i] )[ii];
+    if (xPvData[i]) dxValue = ( (double *) xPvData[i] )[ii]; else dxValue = 0;
 
     if ( xAxisStyle == XYGC_K_AXIS_STYLE_LOG10 ) {
       dxValue = loc_log10(  dxValue  );

 



On May 9, 2018, at 10:28 AM, Eric Norum <wenorum@lbl.gov> wrote:

 

Displaying X/Y data in EDM on OS X and Linux often results in a segmentation fault.  Here’s where the fault occurs:

 

Program received signal SIGSEGV, Segmentation fault.
0x00007ffff1f16f3c in xyGraphClass::genChronoVector (this=0x96c6f0, i=0, 
    rescale=0x7fffffffc2ec) at ../xygraph.cc:6344
6344          dyValue = ( (double *) yPvData[i] )[ii];
(gdb) where
#0  0x00007ffff1f16f3c in xyGraphClass::genChronoVector (this=0x96c6f0, i=0, 
    rescale=0x7fffffffc2ec) at ../xygraph.cc:6344
#1  0x00007ffff1f1c186 in xyGraphClass::executeDeferred (this=0x96c6f0)
    at ../xygraph.cc:9599
#2  0x00007ffff7afe07d in activeWindowClass::processObjects (this=0x927880)
    at ../act_win.cc:22289
#3  0x00007ffff7af5858 in appContextClass::applicationLoop (this=0x636d20)
    at ../app_pkg.cc:6592
#4  0x0000000000405562 in main (argc=<value optimized out>, 
    argv=<value optimized out>) at ../main.cc:2806
(gdb) list
6339          else {
6340            dyValue = (double) ( (unsigned short *) yPvData[i] )[ii];
6341          }
6342          break;
6343        default:
6344          dyValue = ( (double *) yPvData[i] )[ii];
6345          break;
6346        }
6347   
6348        if ( y1AxisStyle[yi] == XYGC_K_AXIS_STYLE_LOG10 ) {
(gdb) print yPvData
$3 = {0x0 <repeats 20 times>}
(gdb) print &yPvData
$4 = (void *(*)[20]) 0x96f240
(gdb) print i
$5 = 0
(gdb) print yPvData[0]
$6 = (void *) 0x0
(gdb) 



For some reason yPvData is full of null pointers, which results in a segfault when yPvData[i] is dereferenced by the [ii] subscript.

Any ideas why this happens sometimes?   I can get a good display maybe two or three times in a row and then get the segfault.

 

— 

Eric Norum

 



-- 
Bruce Hill
Member Technical Staff
SLAC National Accelerator Lab
2575 Sand Hill Road M/S 10
Menlo Park, CA  94025
