Ernest asked me to take a look at this.
I was able to duplicate the memory overrun and confirm that Garth's fix works.
The problem is that the expectedLength variable is set to -1 for certain conditions
that manifest in the Hybricon protocol that Garth was testing. Thus, expectedLength should
really be ssize_t instead of size_t. This causes the test for expectedLength > 0 in
AsynDriverInterface::readHandler() to fail and bufferSize, a size_t, is set to -1.
The invalid bufferSize in turn leads to a buffer overrun.
We'll submit a pull request to the PSI github site soon.
Cheers,
Bruce
-----Original Message-----
From: Brown, Garth <[email protected]>
Sent: Wednesday, November 6, 2019 2:47 PM
To: [email protected]; Hill, Bruce <[email protected]>
Subject: RE: Streamdevice segmentation fault
I've found the problem, and made a fix that works (for me, so far, disclaimers of that sort). See attached patch.
It was introduced when variables representing sizes were changed to size_t. Sometimes AsynDriverInterface::expectedLength had been used as a signed number, assigned to -1 to indicate an error. The line "if (expectedLength > 0)" changed its behavior when it became unsigned.
The conditions causing a problem here are that maxInput is unset (defaults to 0) and there was input on the line after the InTerminator. Under those circumstances, ssize_t StreamCore::readCallback returns -1. This is called in StreamCore::evalIn()
expectedInput = readCallback(lastInputStatus, NULL, 0); and the -1 is assigned to AsynDriverInterface::expectedLength a few lines later through
return busReadRequest(pollPeriod, readTimeout, expectedInput, true); Then in AsynDriverInterface::readHandler(), that if statement is true, so the buffer length is set to 0xffffffffffffffff.
I don't have a deep enough understanding of the code overall to know if changing what ssize_t StreamCore::readCallback returns or how StreamCore::evalIn() handles it would be a better fix, or if StreamBuffer::reserve(0xffffffffffffffff) should be handling it differently. So I opted for a concise and narrow fix, going back to the old behavior in the if statement by casting to a signed type: "if ((ssize_t) expectedLength > 0)".
Garth
-----Original Message-----
From: Brown, Garth
Sent: Thursday, October 24, 2019 2:42 PM
To: '[email protected]' <[email protected]>
Subject: RE: Streamdevice segmentation fault
Ok, the mail system is too clever. I've put the would-be attachment at:
https://drive.google.com/file/d/1YR4CySnWt9eCAVKZ9eXcy4Tk3RAZzS1l/view?usp=sharing
-----Original Message-----
From: Tech-talk <[email protected]> On Behalf Of Brown, Garth via Tech-talk
Sent: Thursday, October 24, 2019 2:16 PM
To: '[email protected]' <[email protected]>
Subject: RE: Streamdevice segmentation fault
For testing, I've created a stripped down version of the IOC in the attached tarball. You may have to delete "_" from the extension to extract, and replace configure/* with your own standard. The python script to simulate the target device is in IOCManagerApp/src/HybriconSim.py, which starts a server on localhost port 23456. The IOC can be started from iocBoot/sioc-li24-nw03 by running st.cmd.
Garth
-----Original Message-----
From: Tech-talk <[email protected]> On Behalf Of Brown, Garth via Tech-talk
Sent: Monday, October 21, 2019 5:35 PM
To: Mark Rivers <[email protected]>
Cc: [email protected]
Subject: RE: Streamdevice segmentation fault
Hi Mark,
Sure, attached. I've tested my IOC with it, and it reliably crashes. I've inserted sleeps in the middle of the messages so it would more accurately behave like the crate. I also found that if I tack that last short message onto the previous message, the IOC doesn't crash.
Garth
-----Original Message-----
From: Mark Rivers <[email protected]>
Sent: Sunday, October 20, 2019 9:17 AM
To: Brown, Garth <[email protected]>
Cc: [email protected]
Subject: Re: Streamdevice segmentation fault
Hi Garth,
Could you write a simple sever in Python or C that sends that string to the asyn/Stream client? If you do that does it crash the same way? If so, then we can reproduce the problem and track it down.
Mark
Sent from my iPhone
On Oct 15, 2019, at 6:06 PM, Brown, Garth via Tech-talk <[email protected]<mailto:[email protected]>> wrote:
After updating an IOC from:
streamdevice 2.5, asyn 4.21, base 3.14.12
to:
streamdevice 2.8.9 (also tested 2.7.7), asyn 4.32 (also tested 4.31 and 4.35), base 7.0.2 I'm seeing 100% repeatable segfaults after processing the first response from a Hybricon VME crate.
I've attached the db and protocol files I'm using.
The text it is processing when it crashes is:
S/N Z834ÿÿÿÿ U/N Q:SN=Z834 ,UN= ,IP=172.027.005.094,V3=3308,V5=5005,V+12=12009,V-12=12415,V24=00000,T1=29,T2=27,T3=28,T4=0,F1=07230,F2=07380,F3=07410,F4=07290,F5=07290,F6=07290,F7=00000,F8=00000,F9=00000,OT=0,OV=0,UV=000000,FF=0,PS1=1,PS2=1,MSG=0,SW=1,PROT=00,BS=0,HV=1,SD=1,WP=1,OS=000,CODE=10156F8F,ENET=D6.10,PSFAIL=00,POH=34848.0,MAXTMP=61,MINTMP=06
;EV0001000000000000ET00000000EF000000000OT0OV0UV000000FF0PS11MSG0SW1BS0HV1SD1WP1OS000
You can get a better idea of the raw ASCII in the attached console log with ASYN debug output, or in the streamdevice debug log (too huge to attach) at: https://gist.github.com/SLACer-garth/6505fd173dd71d7cfa2620d09e325276
One quirk that stands out to me is that the line terminator is \r\000\n. It would not surprise me if that null character in the middle causes parsing problems. I have tried changing the protocol file to use just \r or just \n, and it still crashed.
The call stack at the point where it segfaults is:
Thread #7 252221 [core: 0] (Suspended : Signal : SIGSEGV:Segmentation fault)
AsynDriverInterface::startTimer() at AsynDriverInterface.cc:228<http://asyndriverinterface.cc:228> 0x49f9b9
AsynDriverInterface::readRequest() at AsynDriverInterface.cc:845<http://asyndriverinterface.cc:845> 0x49c769
StreamBusInterface::Client::busReadRequest() at StreamBusInterface.h:82 0x4c0180
StreamCore::evalIn() at StreamCore.cc:921<http://streamcore.cc:921> 0x4ba493
StreamCore::readCallback() at StreamCore.cc<http://StreamCore.cc>:1,092 0x4bb48b
StreamBusInterface::readCallback() at StreamBusInterface.h:118 0x49f5a8
AsynDriverInterface::readHandler() at AsynDriverInterface.cc<http://AsynDriverInterface.cc>:1,064 0x49d793
AsynDriverInterface::handleRequest() at AsynDriverInterface.cc<http://AsynDriverInterface.cc>:1,510 0x49ee87
AsynDriverInterface::handleRequest() at AsynDriverInterface.cc:243<http://asyndriverinterface.cc:243> 0x49fa4b
portThread() at asynManager.c:879 0x492439
start_routine() at osdThread.c:403 0x5e1f9c
start_thread() at 0x7ffff79a4aa1
clone() at 0x7ffff6902c4d
The pointer "timer" on line 226 of AsynDriverInterface.c (this is code in the streamdevice module) has a value which, when decoded as ASCII, contains part of the message from the crate, "SW1BS0HV1SD1WP1O". It looks likely that inputBuffer.local was written beyond its limit. Right after inputBuffer.local in memory is inputBuffer.len. If you look at that memory location, decoded as ASCII, it's that last line from the crate data, "EV00010000000000". The "timer" pointer is a few memory addresses past there. But I haven't been able to identify a root cause, the point where a buffer overflows or whatever it is. Has anyone seen this before?
Garth
<crat_hybricon_8slot.template>
<crat_hybricon_8slot.proto.db>
<HybriconStreamFaultConsole.txt>
Note: The original attachment was automatically removed by Stanford's email system because it was identified as a file type that is commonly associated with malicious software. In order to transmit this type of file, please use an alternate mechanism such as Stanford's Box service (https://itservices.stanford.edu/service/box).
- References:
- Streamdevice segmentation fault Brown, Garth via Tech-talk
- Re: Streamdevice segmentation fault Mark Rivers via Tech-talk
- RE: Streamdevice segmentation fault Brown, Garth via Tech-talk
- RE: Streamdevice segmentation fault Brown, Garth via Tech-talk
- RE: Streamdevice segmentation fault Brown, Garth via Tech-talk
- RE: Streamdevice segmentation fault Brown, Garth via Tech-talk
- Navigate by Date:
- Prev:
RE: Streamdevice segmentation fault Brown, Garth via Tech-talk
- Next:
rmt CCExecInit problem Tasaddaq Khan via Tech-talk
- Index:
1994
1995
1996
1997
1998
1999
2000
2001
2002
2003
2004
2005
2006
2007
2008
2009
2010
2011
2012
2013
2014
2015
2016
2017
2018
<2019>
2020
2021
2022
2023
2024
- Navigate by Thread:
- Prev:
RE: Streamdevice segmentation fault Brown, Garth via Tech-talk
- Next:
areaDetector module ADAravis Jörn Dreyer via Tech-talk
- Index:
1994
1995
1996
1997
1998
1999
2000
2001
2002
2003
2004
2005
2006
2007
2008
2009
2010
2011
2012
2013
2014
2015
2016
2017
2018
<2019>
2020
2021
2022
2023
2024
|