Re: Excessive scan times for periodic scans
Hi Andrew,
Sorry about the delay. I got sidetracked...
> What other resources do your devices share that might limit how fast the
> IOC can communicate with them?
From looking at the Asyn code I first thought that there were no shared
resources (Asyn usually allocates all resources per port). So I moved
the IOC to a physical machine and fired up the profiler to generate
some flame graphs (open them in a web browser).
[Explanation: Flame graphs visualize the stack traces sampled by the
profiler (statistical profiling). Each stack trace is plotted from
bottom (outermost caller) to top (innermost frame). The x axis shows
the individual samples, ordered alphabetically by their stack trace,
with neighboring samples merged if they share the same backtrace. This
means sharp spikes correspond to backtraces that occurred only rarely,
whereas wide plateaus represent backtraces that were sampled
frequently.]
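As a side note, the "collapse" step that turns profiler samples into a
flame graph's input is easy to illustrate. The following is only a toy
C++ sketch (not the actual flame graph scripts), and the frame names in
the sample data are placeholders loosely based on the functions
mentioned below. Each unique backtrace becomes one "frame;frame;...
count" line; the ordered map mirrors the alphabetical ordering that
lets identical samples merge into wide plateaus.

    // Toy illustration of the "stack collapse" step behind a flame
    // graph: identical backtraces are merged and counted; the count
    // determines how wide the corresponding plateau is drawn.
    #include <iostream>
    #include <map>
    #include <string>
    #include <vector>

    int main() {
        // Hypothetical backtraces captured by a sampling profiler,
        // outermost caller first, innermost frame last.
        std::vector<std::vector<std::string>> samples = {
            {"scanThread", "queueRequest", "timer::start", "epicsMutex::lock"},
            {"scanThread", "queueRequest", "timer::start", "epicsMutex::lock"},
            {"scanThread", "queueRequest", "timer::start"},
            {"otherThread", "otherWork"},
        };

        std::map<std::string, unsigned> collapsed;  // keys sort alphabetically
        for (const auto &trace : samples) {
            std::string key;
            for (const auto &frame : trace) {
                if (!key.empty())
                    key += ';';
                key += frame;
            }
            ++collapsed[key];
        }

        // One line per unique backtrace, followed by its sample count.
        for (const auto &entry : collapsed)
            std::cout << entry.first << ' ' << entry.second << '\n';
        return 0;
    }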
The first flame graph [1] shows CPU time spent by the IOC process. The
interesting part is the "scan-1" thread on the right (the thread that
is running slow). Zooming into this thread shows that queueRequest ->
timer::start() -> epicsMutex::lock() dominates (89% of the time spent
in queueRequest() is spent in epicsMutex::lock()). A quick look at the
code reveals that starting a timer involves acquiring the lock on the
timer queue [2]. As it turns out, the relevant timer queue is indeed
shared between Asyn ports.
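To make the sharing concrete, here is a deliberately simplified model
of the pattern the flame graph points at. This is not the actual
epicsTimer/Asyn code, just an illustration: one queue, one mutex, and
many port threads that all serialize in start() while inserting into a
sorted list.

    // Simplified model (not the real EPICS code): a single timer queue
    // protected by a single mutex, shared by several "port" threads.
    #include <list>
    #include <mutex>
    #include <thread>
    #include <vector>

    struct ToyTimerQueue {
        std::mutex mtx;                 // one lock for the whole queue
        std::list<double> expirations;  // kept sorted by expiration time

        void start(double expire) {
            std::lock_guard<std::mutex> lock(mtx);  // all threads serialize here
            auto pos = expirations.begin();
            while (pos != expirations.end() && *pos < expire)
                ++pos;                              // linear search for insert point
            expirations.insert(pos, expire);
        }
    };

    int main() {
        ToyTimerQueue sharedQueue;      // shared between all "ports"
        std::vector<std::thread> ports;
        for (int p = 0; p < 8; ++p) {
            ports.emplace_back([&sharedQueue, p] {
                for (int i = 0; i < 10000; ++i)
                    sharedQueue.start(p + i * 0.001);  // like queueRequest -> timer::start
            });
        }
        for (auto &t : ports)
            t.join();
        return 0;
    }

In a model like this, adding more port threads mostly adds time spent
waiting for the mutex rather than useful work, which is at least
qualitatively consistent with the numbers above.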
Note that you have to take the CPU time flame graph with a grain of salt
because it doesn't include time the machine spent running other
processes or sitting idle (remember that was part of the problem: each
core was >=30% idle). So I also profiled off-CPU time [3]. This again
revealed that 86% of the off-CPU time in the queueRequest function is
related to the epicsMutex::lock() call in timer::start().
This data suggests that the problem is either in the implementation of
the timer queue or in the way Asyn uses it (or both). Looking at the
timer code, I suspect the implementation of the timer queue might not
be ideal: the data structure holding the timers is a linked list,
which can cause a significant number of cache misses while
timer::start() searches the list for the right place to insert the new
timer. Thanks to hardware prefetching, a priority queue based on a
vector might perform better (see the sketch below).
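For illustration only, here is a minimal sketch of what a vector-backed
priority queue could look like (std::priority_queue, i.e. a binary heap
over contiguous storage). This is not a proposed patch against Base.
Besides the cache/prefetching argument, inserting into a heap is
O(log n) whereas inserting into a sorted linked list is O(n).

    // Sketch of the alternative data structure only: a min-heap on a
    // std::vector keeps the earliest-expiring timer at the top,
    // insertion is O(log n), and the contiguous storage is friendlier
    // to the cache and the hardware prefetcher than pointer chasing.
    #include <queue>
    #include <vector>

    struct ToyTimer {
        double expire;  // absolute expiration time (payload omitted)
    };

    struct Later {
        bool operator()(const ToyTimer &a, const ToyTimer &b) const {
            return a.expire > b.expire;  // min-heap ordered by expiration
        }
    };

    using TimerHeap =
        std::priority_queue<ToyTimer, std::vector<ToyTimer>, Later>;

    // start(): push in O(log n) instead of walking a sorted list.
    void start(TimerHeap &heap, double expire) {
        heap.push(ToyTimer{expire});
    }

    // The expiration thread only ever needs the earliest timer.
    const ToyTimer &next(const TimerHeap &heap) {
        return heap.top();
    }

    int main() {
        TimerHeap heap;
        start(heap, 2.0);
        start(heap, 0.5);
        start(heap, 1.0);
        return next(heap).expire == 0.5 ? 0 : 1;  // earliest is on top
    }

One caveat: with a plain binary heap, canceling an arbitrary timer is
no longer a simple unlink, so a real implementation would need lazy
deletion or a different heap variant; the sketch above ignores that.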
Any thoughts?
-Martin
[1] https://www.martin-konrad.net/nextcloud/index.php/s/bT2PooiYzS5xeeb
[2] https://git.launchpad.net/epics-base/tree/src/libCom/timer/timer.cpp?h=3.15#n62
[3] https://www.martin-konrad.net/nextcloud/index.php/s/dZpTtZfSqbNcm7f
--
Martin Konrad
Facility for Rare Isotope Beams
Michigan State University
640 South Shaw Lane
East Lansing, MI 48824-1321, USA
Tel. 517-908-7253
Email: [email protected]