EPICS Controls Argonne National Laboratory

Experimental Physics and
Industrial Control System

2002  2003  2004  2005  2006  2007  2008  2009  2010  2011  2012  2013  2014  2015  2016  2017  2018  <20192020  2021  2022  2023  2024  Index 2002  2003  2004  2005  2006  2007  2008  2009  2010  2011  2012  2013  2014  2015  2016  2017  2018  <20192020  2021  2022  2023  2024 
<== Date ==> <== Thread ==>

Subject: Re: Excessive scan times for periodic scans
From: "Konrad, Martin via Core-talk" <[email protected]>
To: "Johnson, Andrew N." <[email protected]>, EPICS Core Talk <[email protected]>
Date: Tue, 14 May 2019 00:46:13 +0000
Hi Andrew,
Sorry about the delay. I got sidetracked...

> What other resources do your devices share that might limit how fast the
> IOC can communicate with them?
From looking at the Asyn code I first thought that there were no shared
resources (Asyn is usually allocating all resources per port). So I
moved the IOC to a physical machine and fired up the profiler to
generate some flame graphs (open them in a web browser).

[Explanation: Flame graphs visualize the stack traces sampled by the
profiler (statistical profiling). The stack traces are plotted from the
bottom to the top. The x axis shows the different samples, ordered
alphabetically by the stack trace, merging neighboring samples if they
have the same backtrace. This means sharp spikes correspond to
backtraces that only occurred rarely whereas wide plateaus represent
backtraces that were found frequently.]

The first flame graph [1] shows CPU time spend by the IOC process. The
interesting part is the "scan-1" thread on the right (the thread that is
running slow). Zooming into this thread shows that queueRequest ->
timer::start() -> epicsMutex::lock() dominates (89% of the time spend in
queueRequest() is spend in epicsMutex::lock()). A quick look at the code
reveals that starting a timer involves acquiring the lock on the timer
queue [2]. As it turns out the relevant timer queue is indeed shared
between Asyn ports.

Note that you have to take the CPU time flame graph with a grain of salt
because it doesn't include time the machine spent running other
processes or sitting idle (remember that was part of the problem: each
core was >=30% idle). So I also profiled off-CPU time [3]. This again
revealed that 86% of the off-CPU time in the queueRequest function is
related to the epicsMutex::lock() call in timer::start().

This data suggests that the problem is either in the implementation of
the timer queue or in the way Asyn uses it (or both). Looking at the
timer code I'm suspecting that the implementation of the timer queue
might not be ideal: The data structure used for holding the timers is a
linked list which might result in a significant number of cache misses
while searching the list in timer::start() for the right place to insert
the new timer. I suspect that thanks to pre-fetching a priority queue
based on a vector might perform better.

Any thoughts?

-Martin

[1] https://www.martin-konrad.net/nextcloud/index.php/s/bT2PooiYzS5xeeb
[2]
https://git.launchpad.net/epics-base/tree/src/libCom/timer/timer.cpp?h=3.15#n62
[3] https://www.martin-konrad.net/nextcloud/index.php/s/dZpTtZfSqbNcm7f

-- 
Martin Konrad
Facility for Rare Isotope Beams
Michigan State University
640 South Shaw Lane
East Lansing, MI 48824-1321, USA
Tel. 517-908-7253
Email: [email protected]


Replies:
Re: Excessive scan times for periodic scans Michael Davidsaver via Core-talk
References:
Excessive scan times for periodic scans Konrad, Martin via Core-talk

Navigate by Date:
Prev: Re: [Bug 1722535] Re: 'make distclean' inside a submodule Andrew Johnson via Core-talk
Next: Re: [Merge] ~dirk.zimoch/epics-base:raspberryPi into epics-base:7.0 Dirk Zimoch via Core-talk
Index: 2002  2003  2004  2005  2006  2007  2008  2009  2010  2011  2012  2013  2014  2015  2016  2017  2018  <20192020  2021  2022  2023  2024 
Navigate by Thread:
Prev: Re: Excessive scan times for periodic scans Ralph Lange via Core-talk
Next: Re: Excessive scan times for periodic scans Michael Davidsaver via Core-talk
Index: 2002  2003  2004  2005  2006  2007  2008  2009  2010  2011  2012  2013  2014  2015  2016  2017  2018  <20192020  2021  2022  2023  2024 
ANJ, 14 May 2019 Valid HTML 4.01! · Home · News · About · Base · Modules · Extensions · Distributions · Download ·
· Search · EPICS V4 · IRMIS · Talk · Bugs · Documents · Links · Licensing ·