Experimental Physics and
Industrial Control System


Subject:	RE: Strategies for working with fast-changing large arrays
From:	<[email protected]>
To:	<[email protected]>
Cc:	[email protected]
Date:	Mon, 16 Jan 2012 08:49:00 +0000

From: [email protected] [mailto:[email protected]] On
> Thanks for the report.   FWIW, pyepics uses preemptive callbacks by
> default, but I still see dropped frames, independent of the context
> model.   After some further testing, I believe this is due to the
> conversion of data from C to Python, which is not too surprising.  The
> culprit, as noted earlier in this conversation, is the '_unpack'
> function, which does the C-to-Python data conversion.
> 
> For CHAR waveforms, such as images from a Prosilica camera, the
> current code for converting a CHAR waveform is (ca.py, line 815):
>         # waveform data:
>         if ntype == dbr.CHAR:
>             if use_numpy:
>                 data = numpy.array(data)
>             return copy(data[:])
> 
> where 'data' is the still-ctypes data (an unsigned byte Array:
> c_ubyte_Array).  Ignoring the 'use_numpy' option, the 'return
> copy(data[:])' does two things:
>    a) makes a copy to avoid memory overwrites -- I've seen this on some
> systems.
>    b) takes a slice of the whole array 'data[:]' which converts the
> ctypes ubyte Array to a list.  This appears to be the slow part.
> 
> Replacing that with
>         # waveform data:
>         if ntype == dbr.CHAR:
>             if use_numpy:
>                 data = numpy.array(data)
>             return copy(data)
> 
> allows color images from a 1.4 megapixel camera (so 4177920 bytes) to be
> received at full speed.  The downside is that the data is still a
> ctypes array and has to be converted *somewhere* to something useful
> to python, though this is simply 'data=data[:]'.   Doing this inside
> the user-level callback still causes frames to be dropped, as that is
> still run "inside" the event handler.   It might be possible that
> running the user callback in a separate thread or expecting the user
> to store the data in the callback and convert in a separate
> thread/process would work, but I haven't tried that yet.

This data conversion path seems a bit painful to me.  The trick is to ensure that the underlying data is copied exactly once (this is necessary, of course, because the data buffer handed over by the subscription callback is temporary) and that all metadata handling is done before Python actually gets its hands on the underlying bytes.  As soon as you let Python convert your data to a Python list you're paying a huge cost!
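
To make the difference concrete, here is a minimal sketch (not code from pyepics or cothread) that contrasts the two conversions on a stand-in buffer.  The names buf, as_list and frame are hypothetical, and the buffer size is simply the 4177920 bytes quoted above.  Slicing a ctypes c_ubyte array builds a Python list with one int object per byte, whereas numpy.frombuffer(...).copy() makes a single block copy:

```python
import ctypes
import numpy

n = 4177920                       # frame size in bytes, as quoted above
buf = (ctypes.c_ubyte * n)()      # stand-in for the still-ctypes data

# Slow path: the slice creates a Python list holding one int per byte.
as_list = buf[:]

# Fast path: view the same memory as uint8, then make exactly one copy
# so the result no longer depends on the temporary buffer.
frame = numpy.frombuffer(buf, dtype=numpy.uint8).copy()

# Overwriting the source afterwards does not affect the copied frame.
buf[0] = 7
```

Timing the two assignments (for example with timeit) should show where the time goes, though the exact ratio will depend on the machine.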

The corresponding code in cothread (cothread.dbr.dbr_to_value) is (somewhat trimmed):

	# Array size is count, corresponding numpy datatype is dtype,
	# raw_dbr is the corresponding event_handler args field of type c_void_p,
	# ca_array is a convenience subclass of numpy.ndarray
	result = ca_array(shape = (count,), dtype = dtype)
	ctypes.memmove(result.ctypes.data, raw_dbr.raw_value, result.nbytes)

Note here that ndarray(shape = ..., dtype = ...) allocates the necessary storage without initialising it, but creates all the necessary metadata, and then we can just memcpy() the dbr data directly into place.  Of course this only works when count and dtype are a precise match for the dbr block we've received, hence all the dancing you'll find in cothread.dbr.
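
As a self-contained illustration of the same idea, the sketch below uses a plain numpy.ndarray instead of ca_array and a locally built ctypes buffer in place of the real dbr block.  The helper name copy_raw_to_array and the variables source and raw_ptr are hypothetical and are not part of cothread:

```python
import ctypes
import numpy

def copy_raw_to_array(raw_ptr, count, dtype):
    """Copy `count` elements of `dtype` from a raw C pointer into a new array."""
    # Allocates storage and metadata; the contents are not initialised.
    result = numpy.ndarray(shape=(count,), dtype=dtype)
    # Single block copy straight into the array's own storage.
    ctypes.memmove(result.ctypes.data, raw_ptr, result.nbytes)
    return result

# Stand-in for the temporary buffer handed to the event handler.
source = (ctypes.c_ubyte * 8)(*range(8))
raw_ptr = ctypes.cast(source, ctypes.c_void_p)

frame = copy_raw_to_array(raw_ptr, 8, numpy.uint8)

# The copy is independent of the source buffer.
source[0] = 255
```

This is only safe when count and dtype describe exactly the memory behind the pointer; a mismatch would read past the end of the buffer.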

> I'm not sure how cothread handles such a scenario, but I guess it
> runs callbacks in a separate thread.

In effect each callback is processed twice, once immediately as an event_handler in the Channel Access context, and then a second time in the user's own context.
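
A rough sketch of this two-stage pattern follows.  It is an assumption-laden stand-in rather than cothread's implementation: cothread uses its own cooperative threads, while this example uses the standard threading and queue modules, and the names event_handler, user_worker, frames and processed are hypothetical:

```python
import ctypes
import queue
import threading
import numpy

frames = queue.Queue()
processed = []

def event_handler(raw_ptr, count):
    # Stage 1: runs in the Channel Access context.  Copy the temporary
    # buffer once and hand it over; do no other work here.
    snapshot = numpy.empty(count, dtype=numpy.uint8)
    ctypes.memmove(snapshot.ctypes.data, raw_ptr, snapshot.nbytes)
    frames.put(snapshot)

def user_worker():
    # Stage 2: runs in the user's own context, so slow processing
    # does not hold up the event handler.
    while True:
        frame = frames.get()
        if frame is None:        # sentinel: stop the worker
            break
        processed.append(int(frame.sum()))

worker = threading.Thread(target=user_worker)
worker.start()

# Simulate one update arriving from Channel Access.
buf = (ctypes.c_ubyte * 4)(1, 2, 3, 4)
event_handler(ctypes.cast(buf, ctypes.c_void_p), 4)

frames.put(None)
worker.join()
```

If the consumer cannot keep up, the queue grows without bound; a bounded queue that drops old frames may suit a display program better.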

> I'm also not sure whether the above change is the best idea for
> default use, but having a 'DONT UNPACK' option for each PV/ChannelID
> might be a useful option.  In this case, for example, it might allow a
> faster image processing and display program, at least in the sense
> that fewer frames would be dropped.
> 
> Any suggestions on whether that would be a useful addition?

Personally I'd recommend making numpy a mandatory part of your library API.


Replies:: Re: Strategies for working with fast-changing large arrays Matt Newville

References:: Re: pyepics: Strategies for working with fast-changing large arrays Anton Derbenev; Re: pyepics: Strategies for working with fast-changing large arrays Andrew Johnson; RE: Strategies for working with fast-changing large arrays michael.abbott; Re: Strategies for working with fast-changing large arrays Matt Newville
