EPICS Controls Argonne National Laboratory

Experimental Physics and
Industrial Control System


Subject: Re: Strange memory leak with ADAravis
From: Abdalla Ahmad via Tech-talk <tech-talk at aps.anl.gov>
To: Mark Rivers <rivers at cars.uchicago.edu>
Cc: "tech-talk at aps.anl.gov" <tech-talk at aps.anl.gov>
Date: Thu, 26 Jan 2023 14:46:41 +0000

Hi Mark,

Thanks for the explanation.

Best Regards
Abdalla. 

From: Mark Rivers <rivers at cars.uchicago.edu>
Sent: Thursday, January 26, 2023 4:42:36 PM
To: Abdalla Ahmad <Abdalla.Ahmad at sesame.org.jo>; Mark Rivers <rivers at cars.uchicago.edu>
Cc: tech-talk at aps.anl.gov <tech-talk at aps.anl.gov>
Subject: Re: Strange memory leak with ADAravis
 
Hi Abdalla,

I just looked at ADGenICam.cpp and ADAravis.cpp.  I think I see the problem.  Neither of those checks to see if the camera is already acquiring when ADAcquire is set to 1.  ADAravis always allocates a new NDArray when Acquire is set to 1.  I think that will cause the memory growth you are seeing.  It needs to be fixed, but for now you just need to make sure that the camera is done before setting Acquire=1.

This should not be a problem for a step-scan, because at each point in the scan the script should wait for the acquisition to complete before starting the next one.  That will happen automatically with the sscan record, for example, because it uses ca_put_callback which waits for the camera to be done.

Mark
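
A minimal sketch of the workaround described above: make sure the camera is idle before setting Acquire to 1, and use a put with completion callback so the call returns only when the acquisition is done. The helper name acquire_when_idle, the prefix, and the use of an Acquire_RBV readback that reads 0 when idle are assumptions for illustration. The caget and caput arguments are passed in so the sketch can run without an IOC; with pyepics they would be epics.caget and epics.caput.

```python
import time


def acquire_when_idle(caget, caput, prefix, timeout=30.0, poll=0.05):
    """Start one acquisition only after the previous one has finished.

    caget/caput are injected (hypothetical wiring) so the logic can be
    tested without a running IOC.
    """
    deadline = time.monotonic() + timeout
    # Assumption: <prefix>Acquire_RBV reads 0 when the camera is idle.
    while caget(prefix + "Acquire_RBV") != 0:
        if time.monotonic() > deadline:
            raise TimeoutError("camera is still acquiring")
        time.sleep(poll)
    # wait=True requests a put with completion callback, so this call
    # returns only once the acquisition has completed.
    caput(prefix + "Acquire", 1, wait=True)
```

In a step-scan script this would be called once per scan point, for example acquire_when_idle(epics.caget, epics.caput, "CAM_PREFIX:"), so that no Acquire command is sent while a frame is still in progress.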



From: Tech-talk <tech-talk-bounces at aps.anl.gov> on behalf of Mark Rivers via Tech-talk <tech-talk at aps.anl.gov>
Sent: Thursday, January 26, 2023 8:26 AM
To: Abdalla Ahmad <Abdalla.Ahmad at sesame.org.jo>
Cc: tech-talk at aps.anl.gov <tech-talk at aps.anl.gov>
Subject: Re: Strange memory leak with ADAravis
 
Hi Abdalla,

When the Acquire command was sent, was the camera already acquiring, or had it stopped? Can you send a screen shot of ADAravis so I can see how it was configured when you were having the issue?

Mark


From: Abdalla Ahmad <Abdalla.Ahmad at sesame.org.jo>
Sent: Thursday, January 26, 2023 1:07 AM
To: Mark Rivers <rivers at cars.uchicago.edu>
Cc: tech-talk at aps.anl.gov <tech-talk at aps.anl.gov>
Subject: RE: Strange memory leak with ADAravis
 

Hello Mark

 

I revisited this issue recently and noticed that the cause of the growing number of allocated buffers is a Python script that runs in a loop, doing calculations on images received from the camera. After debugging I found that the script was doing "Acquire" inside the loop, so on every iteration the camera received an Acquire command ("CAM_PREFIX:Acquire"). I moved the command outside the loop, and since yesterday the IOC has been working fine with very reasonable performance and resource usage. I would like to ask why the camera allocates buffers every time an Acquire command is received. Wouldn't that cause an issue in a step-scan-based application?

 

Thanks in advance.

Abdalla.
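
The difference between the two versions of the script can be illustrated with a toy model. This is not the ADAravis driver code: ToyCamera is a hypothetical stand-in that only mimics the behaviour described in Mark's reply of 26 January, namely that a buffer is allocated on every Acquire=1 without checking whether the camera is already acquiring.

```python
class ToyCamera:
    """Toy model only: every Acquire=1 allocates one buffer, even if busy."""

    def __init__(self):
        self.acquiring = False
        self.allocated = 0

    def put_acquire(self, value):
        if value == 1:
            # No "already acquiring" check, as described in the thread.
            self.allocated += 1
            self.acquiring = True
        else:
            self.acquiring = False


def acquire_inside_loop(cam, iterations):
    """Original script shape: Acquire is sent on every iteration."""
    for _ in range(iterations):
        cam.put_acquire(1)
        # ... calculations on the latest image would go here ...
    return cam.allocated


def acquire_outside_loop(cam, iterations):
    """Fixed script shape: Acquire is sent once, before the loop."""
    cam.put_acquire(1)
    for _ in range(iterations):
        pass  # ... calculations on the latest image would go here ...
    return cam.allocated
```

In this model the first function allocates one buffer per iteration while the second allocates exactly one, which matches the growth stopping once the command was moved out of the loop.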

 

From: Mark Rivers <rivers at cars.uchicago.edu>
Sent: Tuesday, October 11, 2022 9:31 PM
To: Abdalla Ahmad <Abdalla.Ahmad at sesame.org.jo>
Cc: tech-talk at aps.anl.gov
Subject: RE: Strange memory leak with ADAravis

 

Hi Abdalla,

 

> Both the server and the PC are using the same OS (CentOS 7.5) with the same base and support modules versions but the PC is an i7/8GB while the server is only 4 cores/4GB RAMs.

> The server has another 15 IOCs running.

 

For the server, do you mean that each virtual machine has 4 cores/4GB? Does each IOC run in its own virtual machine? How many real cores and how much physical memory does the server have?

 

I don’t think that running on a low-end virtual server should cause the memory usage of the IOC application to increase like you were seeing, but I don’t have any experience with virtual machines.

 

Mark

 

 

From: Abdalla Ahmad <Abdalla.Ahmad at sesame.org.jo>
Sent: Monday, October 10, 2022 5:25 AM
To: Mark Rivers <rivers at cars.uchicago.edu>
Cc: tech-talk at aps.anl.gov
Subject: RE: Strange memory leak with ADAravis

 

Hello Mark

 

I moved the IOC to my PC instead of running it on a virtual server. I ran it a couple of times and the memory usage is good; it will now run for a few days to see how much it consumes. Both the server and the PC are using the same OS (CentOS 7.5) with the same base and support module versions, but the PC is an i7 with 8 GB of RAM while the server has only 4 cores and 4 GB of RAM. The server has another 15 IOCs running. Could the server be suffering from a performance problem?

 

Best Regards,

Abdalla.

 

From: Mark Rivers <rivers at cars.uchicago.edu>
Sent: Wednesday, September 21, 2022 3:02 PM
To: Abdalla Ahmad <Abdalla.Ahmad at sesame.org.jo>
Cc: tech-talk at aps.anl.gov
Subject: Re: Strange memory leak with ADAravis

 

> I noticed that PoolUsedMem.SCAN was set to “1 second”, when it was set to 1 second the buffers used and allocated were increasing by around 10-20 frames/s, when I set it to I/O Intr the buffers counts were constant and memory usage was perfectly fine.

 

SCAN=I/O Intr does not work for PoolUsedMem.SCAN, you need to use a periodic scan rate like "1 second".  If you use I/O Intr it is not getting the statistics, so it is just hiding the problem.

 

You said that when it was 1 second the buffers used and allocated were increasing by 10-20 frames/s.  That does not make sense for the screen shot you sent, because the acquisition time was 10 seconds.  Your commonPlugins.adl screen shows you only have 2 plugins loaded and active, NDPluginStdArrays and NDPluginStats.  In the worst possible case under those conditions, if both plugins and the camera had a memory leak, the number of buffers used and allocated could only increase by 0.1 frames/s * 3 = 0.3 frames/s.   Were you really using a 10 second acquire time when you had the problem?
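
The worst-case estimate above can be written out as a small calculation. The function name worst_case_growth is made up for this sketch; the inputs (10 second acquire time, camera plus 2 plugins, 10 to 20 frames/s observed) are the figures quoted in the thread.

```python
def worst_case_growth(acquire_time_s, leaking_components):
    """Upper bound on buffer growth (buffers/s) if every component
    leaked one buffer per acquired frame."""
    frame_rate = 1.0 / acquire_time_s
    return frame_rate * leaking_components


# Camera plus NDPluginStdArrays and NDPluginStats: 3 possible leak sources.
expected = worst_case_growth(10.0, 3)   # about 0.3 buffers/s
observed_low = 10.0                     # low end of the reported 10-20 frames/s
ratio = observed_low / expected         # reported growth vs. worst case
```

Even the low end of the reported growth is more than thirty times the worst case for a 10 second acquire time, which is why the report does not fit that configuration.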

 

Please change PoolUsedMem.SCAN to 1 second and report again how the buffers used and allocated are increasing under the exposure time conditions you use when you have seen the memory problem in the past.
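
One way to produce the requested numbers is to sample the allocated-buffer count at fixed intervals and compute the average growth. This is only a sketch: the function names are hypothetical, the PoolAllocBuffers record name under the camera prefix is taken from the statistics posted earlier in the thread, and caget is passed in (with pyepics it would be epics.caget) so the logic can be tried without an IOC.

```python
import time


def sample_alloc_buffers(caget, prefix, count, interval_s,
                         clock=time.monotonic, sleep=time.sleep):
    """Collect (timestamp, PoolAllocBuffers) pairs at a fixed interval."""
    samples = []
    for i in range(count):
        samples.append((clock(), caget(prefix + "PoolAllocBuffers")))
        if i < count - 1:
            sleep(interval_s)
    return samples


def growth_rate(samples):
    """Average growth in buffers per second between first and last sample."""
    (t0, n0), (t1, n1) = samples[0], samples[-1]
    if t1 <= t0:
        raise ValueError("need at least two samples with increasing time")
    return (n1 - n0) / (t1 - t0)
```

A rate near zero over several minutes at the normal exposure time would suggest the pool is stable; a steady positive rate would give a concrete figure to report.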

 

> Please note that in this setup (while also failing before) I set QSIZE in the st.cmd to 3000 then 10000.

 

I am confused: your screen shot of commonPlugins.adl shows that QSIZE was set to 100 (Free column), not 3000 or 10000. For these plugins there is no reason to set a QueueSize that large. Generally you only need a large queue size like that for the file saving plugins, where you want to buffer frames because the camera is collecting faster than the files can be saved to disk. Note that QueueSize is normally in autosave, so the value in use may not be the QSIZE you set in the file. You can change it in the main medm screen for that plugin.

 

Mark

 

 

 


From: Abdalla Ahmad <Abdalla.Ahmad at sesame.org.jo>
Sent: Wednesday, September 21, 2022 3:25 AM
To: Mark Rivers <rivers at cars.uchicago.edu>
Cc: tech-talk at aps.anl.gov <tech-talk at aps.anl.gov>
Subject: RE: Strange memory leak with ADAravis

 

Hello Mark

 

I mistakenly rebooted the IOC while a huge number of buffers was allocated (more than 120K), so I have attached an image of the working setup instead. I noticed that PoolUsedMem.SCAN was set to “1 second”. When it was set to 1 second, the buffers used and allocated were increasing by around 10-20 frames/s; when I set it to I/O Intr, the buffer counts were constant and memory usage was perfectly fine. Please note that in this setup (which also failed before) I set QSIZE in the st.cmd to 3000 and then to 10000.

 

Best Regards,

Abdalla.

 

From: Mark Rivers <rivers at cars.uchicago.edu>
Sent: Monday, September 19, 2022 6:41 PM
To: Abdalla Ahmad <Abdalla.Ahmad at sesame.org.jo>
Cc: tech-talk at aps.anl.gov
Subject: Re: Strange memory leak with ADAravis

 

After this happens please send the screen shots for ADAravis and commonPlugins. 

If it will take a long time to fail then just send the screen shots when it is working normally at the frame rate you use when it fails.

 

Mark

 

 

On Sep 19, 2022, at 6:08 AM, Abdalla Ahmad <Abdalla.Ahmad at sesame.org.jo> wrote:



Hello Mark

 

This memory leak issue just happened again on the same setup, with similar numbers for the pool statistics PVs. I will test the same setup on a Rocky Linux machine and tell you the results. Note that it happens 15-20 minutes after starting the IOC, and the Empty Free List command did not free the queues. Do you have any pointers on where to start debugging in the driver stack (ADCore, aravis, etc.)?

 

Thanks!

Abdalla.

 

From: Mark Rivers <rivers at cars.uchicago.edu>
Sent: Thursday, June 2, 2022 1:46 PM
To: tech-talk at aps.anl.gov; Abdalla Ahmad <Abdalla.Ahmad at sesame.org.jo>
Subject: Re: Strange memory leak with ADAravis

 

Hi Abdalla,

 

What areaDetector plugins do you have running, and what is the value of QueueSize for each of them? The Buffers information you sent is very useful. It says that the current number of buffers (NDArrays) in use is only 22. But it has allocated 24664 of them, and they are in the pool, using 28904 MB of memory. Can you send a screen shot of the commonPlugins.adl screen?

 

I can think of two possibilities for this:

  1. The system got very busy for a while, and plugins were not able to keep up with the camera.  That will cause each active plugin to fill its queue.  If there were 12 plugins and each had QueueSize of 2000 that would explain it.
  2. There is a leak in the driver or some plugin which is causing the pool to grow.  But then I would expect PoolUsedBuffers to be large.  I would also expect someone else to have reported that problem by now.

Note that in the Buffers section of the detector screen there is a button to Empty Free List. That processes the EmptyFreeList record, which will free all of the unused buffers and reduce the memory without restarting the IOC.
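
As a rough cross-check of the numbers discussed above, the pool statistics from the original report can be run through a short calculation. The function name pool_summary is made up for this sketch; the inputs are the posted values and the 12 plugins with QueueSize 2000 from possibility 1.

```python
def pool_summary(used, allocated, free, used_mem_mb):
    """Derive a few sanity figures from the NDArrayPool statistics."""
    return {
        "consistent": used + free == allocated,     # counts should add up
        "mb_per_buffer": used_mem_mb / allocated,   # average buffer size
        "idle_fraction": free / allocated,          # share sitting on the free list
    }


# Values posted in the original report.
summary = pool_summary(used=22, allocated=24664, free=24642, used_mem_mb=28904.3)

# Possibility 1: 12 plugins, each filling a queue of 2000 NDArrays.
queue_capacity = 12 * 2000
```

The counts are self-consistent, each buffer averages a little over 1 MB, and nearly all buffers are idle on the free list. The hypothetical 12 x 2000 queue capacity is 24000, close to the 24664 allocated buffers, which is why possibility 1 would explain the figures. With pyepics, processing the record would presumably be a put of 1 to the EmptyFreeList PV under the camera prefix; that usage is an assumption, not something shown in the thread.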

 

Mark

 

 


From: Tech-talk <tech-talk-bounces at aps.anl.gov> on behalf of Abdalla Ahmad via Tech-talk <tech-talk at aps.anl.gov>
Sent: Thursday, June 2, 2022 4:57 AM
To: tech-talk at aps.anl.gov <tech-talk at aps.anl.gov>
Subject: Strange memory leak with ADAravis

 

Hi

 

I have set up areaDetector to control the Basler GigE cameras we have, and I pulled the latest tag for all the necessary modules (asyn, ADCore, ADAravis, etc.). The setup works fine: I can control the camera and acquire images even at the maximum rate, which is 32 FPS. I then configured the IOC to control 2 cameras, one with an exposure time of 10 s and one running at 32 FPS, and left it running overnight a few days ago. When I got to the PC the next day, the RAM and swap were full but the IOC was still working. I managed to log in to the PC later, launched the GUI, and found strange values in the Buffers section of ADAravis.adl:

 

PoolUsedBuffers: 22

PoolAllocBuffers: 24664

PoolFreeBuffers: 24642

PoolMaxMem: 0 MB

PoolUsedMem: 28904.3

 

The PC is running CentOS 7 and has 8 GB of RAM. I re-ran the IOC yesterday and for now it is working fine. What could be wrong with my setup? The network is all gigabit Ethernet and jumbo frames are enabled.

 

Thanks

Abdalla.

 

