Re: Excessive scan times for periodic scans
Hi Andrew,
Sorry about the delay. I got sidetracked...
> What other resources do your devices share that might limit how fast the
> IOC can communicate with them?
From looking at the Asyn code I first thought that there were no shared
resources (Asyn usually allocates all resources per port). So I moved
the IOC to a physical machine and fired up the profiler to generate
some flame graphs (open them in a web browser).
[Explanation: Flame graphs visualize the stack traces sampled by the
profiler (statistical profiling). Each stack trace is plotted from
bottom (outermost caller) to top (innermost frame). The x axis shows
the individual samples, ordered alphabetically by their stack trace,
with neighboring samples merged if they share the same backtrace. This
means sharp spikes correspond to backtraces that occurred only rarely,
whereas wide plateaus represent backtraces that were sampled
frequently.]
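As a side note, the "collapse" step that turns profiler samples into a
flame graph's input is easy to illustrate. The following is only a toy
C++ sketch (not the actual flame graph scripts), and the frame names in
the sample data are placeholders loosely based on the functions
mentioned below. Each unique backtrace becomes one "frame;frame;...
count" line; the ordered map mirrors the alphabetical ordering that
lets identical samples merge into wide plateaus.

    // Toy illustration of the "stack collapse" step behind a flame
    // graph: identical backtraces are merged and counted; the count
    // determines how wide the corresponding plateau is drawn.
    #include <iostream>
    #include <map>
    #include <string>
    #include <vector>

    int main() {
        // Hypothetical backtraces captured by a sampling profiler,
        // outermost caller first, innermost frame last.
        std::vector<std::vector<std::string>> samples = {
            {"scanThread", "queueRequest", "timer::start", "epicsMutex::lock"},
            {"scanThread", "queueRequest", "timer::start", "epicsMutex::lock"},
            {"scanThread", "queueRequest", "timer::start"},
            {"otherThread", "otherWork"},
        };

        std::map<std::string, unsigned> collapsed;  // keys sort alphabetically
        for (const auto &trace : samples) {
            std::string key;
            for (const auto &frame : trace) {
                if (!key.empty())
                    key += ';';
                key += frame;
            }
            ++collapsed[key];
        }

        // One line per unique backtrace, followed by its sample count.
        for (const auto &entry : collapsed)
            std::cout << entry.first << ' ' << entry.second << '\n';
        return 0;
    }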
The first flame graph [1] shows CPU time spent by the IOC process. The
interesting part is the "scan-1" thread on the right (the thread that
is running slow). Zooming into this thread shows that queueRequest ->
timer::start() -> epicsMutex::lock() dominates (89% of the time spent
in queueRequest() is spent in epicsMutex::lock()). A quick look at the
code reveals that starting a timer involves acquiring the lock on the
timer queue [2]. As it turns out, the relevant timer queue is indeed
shared between Asyn ports.
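To make the sharing concrete, here is a deliberately simplified model
of the pattern the flame graph points at. This is not the actual
epicsTimer/Asyn code, just an illustration: one queue, one mutex, and
many port threads that all serialize in start() while inserting into a
sorted list.

    // Simplified model (not the real EPICS code): a single timer queue
    // protected by a single mutex, shared by several "port" threads.
    #include <list>
    #include <mutex>
    #include <thread>
    #include <vector>

    struct ToyTimerQueue {
        std::mutex mtx;                 // one lock for the whole queue
        std::list<double> expirations;  // kept sorted by expiration time

        void start(double expire) {
            std::lock_guard<std::mutex> lock(mtx);  // all threads serialize here
            auto pos = expirations.begin();
            while (pos != expirations.end() && *pos < expire)
                ++pos;                              // linear search for insert point
            expirations.insert(pos, expire);
        }
    };

    int main() {
        ToyTimerQueue sharedQueue;      // shared between all "ports"
        std::vector<std::thread> ports;
        for (int p = 0; p < 8; ++p) {
            ports.emplace_back([&sharedQueue, p] {
                for (int i = 0; i < 10000; ++i)
                    sharedQueue.start(p + i * 0.001);  // like queueRequest -> timer::start
            });
        }
        for (auto &t : ports)
            t.join();
        return 0;
    }

In a model like this, adding more port threads mostly adds time spent
waiting for the mutex rather than useful work, which is at least
qualitatively consistent with the numbers above.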
Note that you have to take the CPU time flame graph with a grain of salt
because it doesn't include time the machine spent running other
processes or sitting idle (remember that was part of the problem: each
core was >=30% idle). So I also profiled off-CPU time [3]. This again
revealed that 86% of the off-CPU time in the queueRequest function is
related to the epicsMutex::lock() call in timer::start().
This data suggests that the problem is either in the implementation of
the timer queue or in the way Asyn uses it (or both). Looking at the
timer code, I suspect the implementation of the timer queue might not
be ideal: the data structure holding the timers is a linked list,
which can cause a significant number of cache misses while
timer::start() searches the list for the right place to insert the new
timer. Thanks to hardware prefetching, a priority queue based on a
vector might perform better (see the sketch below).
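For illustration only, here is a minimal sketch of what a vector-backed
priority queue could look like (std::priority_queue, i.e. a binary heap
over contiguous storage). This is not a proposed patch against Base.
Besides the cache/prefetching argument, inserting into a heap is
O(log n) whereas inserting into a sorted linked list is O(n).

    // Sketch of the alternative data structure only: a min-heap on a
    // std::vector keeps the earliest-expiring timer at the top,
    // insertion is O(log n), and the contiguous storage is friendlier
    // to the cache and the hardware prefetcher than pointer chasing.
    #include <queue>
    #include <vector>

    struct ToyTimer {
        double expire;  // absolute expiration time (payload omitted)
    };

    struct Later {
        bool operator()(const ToyTimer &a, const ToyTimer &b) const {
            return a.expire > b.expire;  // min-heap ordered by expiration
        }
    };

    using TimerHeap =
        std::priority_queue<ToyTimer, std::vector<ToyTimer>, Later>;

    // start(): push in O(log n) instead of walking a sorted list.
    void start(TimerHeap &heap, double expire) {
        heap.push(ToyTimer{expire});
    }

    // The expiration thread only ever needs the earliest timer.
    const ToyTimer &next(const TimerHeap &heap) {
        return heap.top();
    }

    int main() {
        TimerHeap heap;
        start(heap, 2.0);
        start(heap, 0.5);
        start(heap, 1.0);
        return next(heap).expire == 0.5 ? 0 : 1;  // earliest is on top
    }

One caveat: with a plain binary heap, canceling an arbitrary timer is
no longer a simple unlink, so a real implementation would need lazy
deletion or a different heap variant; the sketch above ignores that.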
Any thoughts?
-Martin
[1] https://www.martin-konrad.net/nextcloud/index.php/s/bT2PooiYzS5xeeb
[2] https://git.launchpad.net/epics-base/tree/src/libCom/timer/timer.cpp?h=3.15#n62
[3] https://www.martin-konrad.net/nextcloud/index.php/s/dZpTtZfSqbNcm7f
--
Martin Konrad
Facility for Rare Isotope Beams
Michigan State University
640 South Shaw Lane
East Lansing, MI 48824-1321, USA
Tel. 517-908-7253
Email: [email protected]