Experimental Physics and Industrial Control System


 

Subject: Re: Strange performance results -- any ideas?
From: Dirk Zimoch <dirk.zimoch@psi.ch>
To: "tech-talk@aps.anl.gov" <tech-talk@aps.anl.gov>
Date: Tue, 19 Jun 2012 18:24:22 +0200

Hi all,

Thank you for your suggestions.

When I change from DOUBLE to FLOAT arrays, nothing changes.

Making the CPU really busy changes neither the location of the breaks nor the time/element for sizes > 16.

When I make the string for each element longer, the break moves up, to smaller array sizes. The label above each table shows the element text used:
"3,..."
size    1024     duration:     1169     time/element:    1.1
size    2048     duration:     2116     time/element:    1.0
size    4096     duration:    43596     time/element:   10.6
size    8192     duration:    47498     time/element:    5.8
size   16384     duration:    53602     time/element:    3.3
size   32768     duration:    30096     time/element:    0.9

"3.,..."
size    1024     duration:     1216     time/element:    1.2
size    2048     duration:    41856     time/element:   20.4
size    4096     duration:    43769     time/element:   10.7
size    8192     duration:     8051     time/element:    1.0

"3.1,..."
size    1024     duration:     1362     time/element:    1.3
size    2048     duration:    42005     time/element:   20.5
size    4096     duration:    44101     time/element:   10.8
size    8192     duration:    48557     time/element:    5.9
size   16384     duration:    18283     time/element:    1.1

"3.14,..."
size     512     duration:      833     time/element:    1.6
size    1024     duration:    40467     time/element:   39.5
size    2048     duration:    42011     time/element:   20.5
size    4096     duration:     5001     time/element:    1.2

"3.14159265359,..."
size     256     duration:      677     time/element:    2.6
size     512     duration:    40791     time/element:   79.7
size    1024     duration:    41654     time/element:   40.7
size    2048     duration:    42928     time/element:   21.0
size    4096     duration:     7013     time/element:    1.7
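
One way to compare the five tables is to convert each array size into the number of bytes in the message. The following is only a sketch: it assumes every element occupies its text plus one separator byte (a comma, or the final newline), and it treats a row as the "break" when its duration is more than ten times that of the previous row. The 10x factor and the helper names are illustrative choices, not part of StreamDevice.

```python
# Measurements copied from the tables above.
# Key: text of one element (without its separator).
# Value: list of (size in elements, duration in microseconds).
MEASUREMENTS = {
    "3": [(1024, 1169), (2048, 2116), (4096, 43596),
          (8192, 47498), (16384, 53602), (32768, 30096)],
    "3.": [(1024, 1216), (2048, 41856), (4096, 43769), (8192, 8051)],
    "3.1": [(1024, 1362), (2048, 42005), (4096, 44101),
            (8192, 48557), (16384, 18283)],
    "3.14": [(512, 833), (1024, 40467), (2048, 42011), (4096, 5001)],
    "3.14159265359": [(256, 677), (512, 40791), (1024, 41654),
                      (2048, 42928), (4096, 7013)],
}


def message_bytes(element, size):
    """Message length in bytes, assuming one separator byte per element."""
    return size * (len(element) + 1)


def first_break(rows, factor=10):
    """Return (last fast size, first slow size) at the first jump in duration."""
    for (prev_size, prev_us), (size, us) in zip(rows, rows[1:]):
        if us > factor * prev_us:
            return prev_size, size
    return None


def break_bounds(measurements):
    """Map each element text to (last fast bytes, first slow bytes)."""
    bounds = {}
    for element, rows in measurements.items():
        found = first_break(rows)
        if found is not None:
            fast, slow = found
            bounds[element] = (message_bytes(element, fast),
                               message_bytes(element, slow))
    return bounds


if __name__ == "__main__":
    for element, (fast, slow) in break_bounds(MEASUREMENTS).items():
        print(f"{element + ',':<16} fast up to {fast:>5} bytes, "
              f"slow from {slow:>5} bytes")
```

Under these assumptions the largest message that is still fast is 4096 bytes and the smallest slow one is 5120 bytes, so the first break seems to follow the message length rather than the element count.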

Server and IOC run on the same PC. When I change from localhost to the IP address, nothing changes. Using a mechanism other than TCP will take some time to implement.

Dirk




Till Straumann wrote:
But -- if the slowdown is due to limited cache space -- why would the
throughput improve as you process bigger arrays (larger than 4k - I
would expect the throughput to stay low once your cache is thrashing)?
This doesn't make sense. I'd also argue that the cache doesn't really
give you a big performance boost in this 'stream processing' scenario.
The cache speeds things up if you repeatedly access the same data
over and over again but that's typically not the case when you process
a stream of data.
Maybe some TCP issue? I would try, e.g., reading from a file
or some other kind of non-TCP data source.

FWIW
- Till


On 06/19/2012 09:54 AM, Andrew Johnson wrote:
Hi Dirk,

On 2012-06-19 Dirk Zimoch wrote:
I am testing the performance of array parsing with StreamDevice on a
Linux system. I am reading strings like "3.1415,3.1415,...\n" into a
waveform record with FTVL=DOUBLE. The input is via TCP from localhost.

I found a very strange duration/size relationship:
size       1     duration:      423     time/element:  423.0
size       2     duration:      234     time/element:  117.0
...
size     256     duration:      397     time/element:    1.6
size     512     duration:      552     time/element:    1.1
size    1024     duration:    40023     time/element:   39.1
size    2048     duration:    41450     time/element:   20.2
...
size  524288     duration:   275591     time/element:    0.5
size 1048576     duration:   518404     time/element:    0.5

All times are in microseconds (1e-6 seconds). Time is measured on the
server. The clock starts before sending the string and stops after the
waveform sends back an acknowledgment.

Protocol: {in "%f"; out "%(NORD)d";}
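
For readers who want to reproduce this kind of measurement, here is a minimal sketch of such a timing loop in Python. It is not the test program used for the numbers above. It assumes the array is sent as one newline-terminated line and that the reply (the NORD count) also ends with a newline; build_message, time_roundtrip and fake_ioc are made-up names, and fake_ioc only stands in for the IOC so that the sketch runs on its own.

```python
import socket
import threading
import time


def build_message(n_elements, element="3.1415"):
    """Build "3.1415,3.1415,...\\n" with n_elements values."""
    return (",".join([element] * n_elements) + "\n").encode("ascii")


def time_roundtrip(conn, n_elements, element="3.1415"):
    """Send one array and wait for the reply.

    Returns (elapsed microseconds, element count reported by the peer).
    """
    payload = build_message(n_elements, element)
    start = time.perf_counter()          # clock starts before sending
    conn.sendall(payload)
    reply = b""
    while not reply.endswith(b"\n"):     # assumed newline-terminated reply
        chunk = conn.recv(4096)
        if not chunk:
            raise ConnectionError("peer closed before acknowledging")
        reply += chunk
    elapsed_us = (time.perf_counter() - start) * 1e6
    return elapsed_us, int(reply.decode("ascii").strip())


def fake_ioc(conn):
    """Stand-in for the IOC: answer each line with its element count."""
    buf = b""
    while True:
        chunk = conn.recv(65536)
        if not chunk:
            return
        buf += chunk
        while b"\n" in buf:
            line, buf = buf.split(b"\n", 1)
            conn.sendall(b"%d\n" % (line.count(b",") + 1))


if __name__ == "__main__":
    # socketpair() is a local socket pair, not TCP; replace it with a real
    # TCP connection to the IOC to measure what the thread is measuring.
    server, peer = socket.socketpair()
    threading.Thread(target=fake_ioc, args=(peer,), daemon=True).start()
    for n in (1, 2, 256, 512, 1024, 2048):
        elapsed_us, nord = time_roundtrip(server, n)
        print(f"size {n:7d}     duration: {elapsed_us:8.0f}     "
              f"time/element: {elapsed_us / n:6.1f}")
    server.close()
```

Because the stand-in peer does no parsing into a record, the absolute numbers from this sketch say nothing about StreamDevice; it only illustrates where the clock starts and stops.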

I understand that for small arrays, the time is dominated by overhead
and stays more or less constant.

But does anyone have an idea why there is this strange performance drop
for sizes between 1024 and 4096 elements?

I would second Martin Konrad's cache suggestion. Your string data above
takes up 7 characters per element and the double version is 8 bytes per
element, so with 512 elements they each fit into a 4KB MMU page (or
straddle two pages). When you double the number of elements you need 4
pages to store it all, and that may be where you run out of cache. Try
setting FTVL=FLOAT and see if the break moves upwards at all (you might
need to measure at size=768 to detect a change though). If that doesn't
make any difference, it could also be an effect on the machine at the
other end of the TCP socket.

- Andrew
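
The page arithmetic in Andrew's suggestion can be written out as follows. This is only a sketch of the reasoning: it assumes 4096-byte pages and buffers that start on a page boundary (the "straddle two pages" case is ignored), and pages_needed and footprint are hypothetical helpers.

```python
PAGE = 4096                          # assumed MMU page size in bytes
TEXT_PER_ELEM = len("3.1415,")       # 7 characters per element
SIZEOF = {"DOUBLE": 8, "FLOAT": 4}   # bytes per element for each FTVL


def pages_needed(n_bytes, page=PAGE):
    """Number of whole pages required to hold n_bytes (ceiling division)."""
    return -(-n_bytes // page)


def footprint(n_elements, ftvl="DOUBLE"):
    """Pages for the input string plus pages for the parsed array."""
    text_bytes = n_elements * TEXT_PER_ELEM
    array_bytes = n_elements * SIZEOF[ftvl]
    return pages_needed(text_bytes) + pages_needed(array_bytes)


if __name__ == "__main__":
    for ftvl in ("DOUBLE", "FLOAT"):
        for n in (512, 768, 1024):
            print(f"FTVL={ftvl:<6} size={n:5d} pages={footprint(n, ftvl)}")
```

Under these assumptions 512 DOUBLE elements need 2 pages and 1024 need 4, while FTVL=FLOAT lowers the 1024-element case to 3 pages.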



