EPICS Controls Argonne National Laboratory

Experimental Physics and
Industrial Control System


Subject: Re: ADVimba memory leak ?
From: Mark Rivers via Tech-talk <tech-talk at aps.anl.gov>
To: John Dobbins <john.dobbins at cornell.edu>
Cc: "tech-talk at aps.anl.gov (tech-talk at aps.anl.gov)" <tech-talk at aps.anl.gov>
Date: Thu, 27 Jan 2022 16:09:19 +0000

Hi John,

Your IT group seems very good!

This really looks like a problem in the Vimba SDK, and not in ADVimba.

My only question is why you do not observe the problem on machines that are not in a cluster.  Does your IT group have any idea about that?

If the latest SDK does not solve the problem then it seems like you should report your analysis to Allied Vision.

Mark



From: John Dobbins <john.dobbins at cornell.edu>
Sent: Thursday, January 27, 2022 9:45 AM
To: Mark Rivers <rivers at cars.uchicago.edu>
Cc: tech-talk at aps.anl.gov (tech-talk at aps.anl.gov) <tech-talk at aps.anl.gov>
Subject: Re: ADVimba memory leak ?
 
Mark,

If I start the IOC but do not start image acquisition, I see the same memory growth rate.

If I comment out ADVimbaConfig, I do not see memory growth.

Additionally, someone in our IT group looked and reports:

>>>>>>>>>>>>>>>>>>>>>>>>>>

I picked an IOC on chess15 and chess16, and looked at /proc/${pid}/smaps

[root@chess15 3517]# grep Rss smaps  | sort -nrk2 | head -15
Rss:               40528 kB
Rss:               10992 kB
Rss:                5928 kB
Rss:                3336 kB
Rss:                2936 kB
Rss:                2048 kB
Rss:                2048 kB
Rss:                2048 kB
Rss:                1776 kB
Rss:                 892 kB
Rss:                 892 kB
Rss:                 808 kB
Rss:                 736 kB
Rss:                 716 kB
Rss:                 592 kB

[root@chess16 26211]# grep Rss smaps  | sort -nrk2 | head -15
Rss:              131072 kB
Rss:              131072 kB
Rss:              131068 kB
Rss:              101548 kB
Rss:               65536 kB
Rss:               65536 kB
Rss:               65536 kB
Rss:               65536 kB
Rss:               65532 kB
Rss:               65532 kB
Rss:               65532 kB
Rss:               65532 kB
Rss:               40756 kB
Rss:                5928 kB
Rss:                4096 kB

Observing over time, the second entry on chess15 and the 4th on chess16
are slowly growing at a fairly steady pace.  So on chess16 I did

strace -qf -o vimba.trace -p 26211 -e trace=%memory

and looked for memory operations in the address range (which had grown
some) for

7fa7a4000000-7fa7aa45e000 rw-p 00000000 00:00 0
Size:             102776 kB
Rss:              102776 kB
Pss:              102776 kB
Shared_Clean:          0 kB
Shared_Dirty:          0 kB
Private_Clean:         0 kB
Private_Dirty:    102776 kB
Referenced:       102776 kB
Anonymous:        102776 kB
AnonHugePages:     73728 kB
Swap:                  0 kB
KernelPageSize:        4 kB
MMUPageSize:           4 kB
Locked:                0 kB

That revealed a bunch of mprotect() calls that were slowly expanding
that address range, so then I did

strace -qf -o vimba.trace -p 26211 -e trace=mprotect -k

to get stack traces, which gave me a bunch of calls like

26258 mprotect(0x7fa7a9c51000, 4096, PROT_READ|PROT_WRITE) = 0
 > /usr/lib64/libc-2.17.so(mprotect+0x7) [0xf8f77]
 > /usr/lib64/libc-2.17.so(sysmalloc+0x24e) [0x819ee]
 > /usr/lib64/libc-2.17.so(_int_malloc+0x9d9) [0x82959]
 > /usr/lib64/libc-2.17.so(malloc+0x4b) [0x8578b]
 > /usr/lib64/libc-2.17.so(__netlink_request+0x29e) [0x12356e]
 > /usr/lib64/libc-2.17.so(getifaddrs_internal+0x6e) [0x12370e]
 > /usr/lib64/libc-2.17.so(getifaddrs+0xf) [0x12442f]
 > /mnt/ioc/vbpm/vimbaIOC/cti/VimbaGigETL.cti() [0x44641]
 > /mnt/ioc/vbpm/vimbaIOC/cti/VimbaGigETL.cti() [0x4558c]
 > /mnt/ioc/vbpm/vimbaIOC/cti/VimbaGigETL.cti() [0x3817a]
 > /mnt/ioc/vbpm/vimbaIOC/cti/VimbaGigETL.cti() [0x38438]
 > /mnt/ioc/vbpm/vimbaIOC/cti/VimbaGigETL.cti() [0x53403]
 > /mnt/ioc/vbpm/vimbaIOC/cti/VimbaGigETL.cti() [0x49d51]
 > /mnt/ioc/vbpm/vimbaIOC/cti/VimbaGigETL.cti() [0x48886]
 > /usr/lib64/libpthread-2.17.so(start_thread+0xc4) [0x7ea4]
 > /usr/lib64/libc-2.17.so(__clone+0x6c) [0xfe9fc]
26258 mprotect(0x7fa7a9c52000, 4096, PROT_READ|PROT_WRITE) = 0
 > /usr/lib64/libc-2.17.so(mprotect+0x7) [0xf8f77]
 > /usr/lib64/libc-2.17.so(sysmalloc+0x24e) [0x819ee]
 > /usr/lib64/libc-2.17.so(_int_malloc+0x9d9) [0x82959]
 > /usr/lib64/libc-2.17.so(malloc+0x4b) [0x8578b]
 > /usr/lib64/libc-2.17.so(__netlink_request+0x29e) [0x12356e]
 > /usr/lib64/libc-2.17.so(getifaddrs_internal+0x88) [0x123728]
 > /usr/lib64/libc-2.17.so(getifaddrs+0xf) [0x12442f]
 > /mnt/ioc/vbpm/vimbaIOC/cti/VimbaGigETL.cti() [0x44641]
 > /mnt/ioc/vbpm/vimbaIOC/cti/VimbaGigETL.cti() [0x4558c]
 > /mnt/ioc/vbpm/vimbaIOC/cti/VimbaGigETL.cti() [0x3817a]
 > /mnt/ioc/vbpm/vimbaIOC/cti/VimbaGigETL.cti() [0x38438]
 > /mnt/ioc/vbpm/vimbaIOC/cti/VimbaGigETL.cti() [0x53403]
 > /mnt/ioc/vbpm/vimbaIOC/cti/VimbaGigETL.cti() [0x49d51]
 > /mnt/ioc/vbpm/vimbaIOC/cti/VimbaGigETL.cti() [0x48886]
 > /usr/lib64/libpthread-2.17.so(start_thread+0xc4) [0x7ea4]
 > /usr/lib64/libc-2.17.so(__clone+0x6c) [0xfe9fc]

with steadily increasing addresses.

Superficially it looks like it is leaking ifaddrs structs. I'm not
sure I can confirm that without doing something a lot more intrusive.

If that's the right diagnosis, it makes sense that the leak rate
would depend on the number of interfaces, but that doesn't explain
the different rates on the cluster members.

-dan

p.s. there's a classic error with ifaddrs, of the form

struct ifaddrs * ifAddrStruct = NULL;
getifaddrs(&ifAddrStruct);
 
while (ifAddrStruct != NULL) {
  ifAddrStruct = ifAddrStruct->ifa_next;
}
 
freeifaddrs(ifAddrStruct);

where the call to freeifaddrs is passed the end of the chain
instead of the start--this has the right symptoms for that.

<<<<<<<<<<<<<<<<<<<<<
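
For contrast with the snippet quoted above, here is a minimal sketch of the usual pattern: keep the head pointer returned by getifaddrs(), walk the list with a separate cursor, and pass the head to freeifaddrs(). The function name count_interfaces is hypothetical and nothing here is taken from the Vimba SDK; whether the SDK actually makes this mistake is unconfirmed.

```c
#include <stddef.h>
#include <sys/types.h>
#include <ifaddrs.h>

/* Hypothetical helper: count the entries returned by getifaddrs().
 * Returns -1 if getifaddrs() fails. */
int count_interfaces(void)
{
    struct ifaddrs *head = NULL;
    int count = 0;

    if (getifaddrs(&head) != 0) {
        return -1;
    }

    /* Walk the list with a separate cursor so that head is never lost. */
    for (const struct ifaddrs *cur = head; cur != NULL; cur = cur->ifa_next) {
        count++;
    }

    /* Free from the start of the chain, not from the end. */
    freeifaddrs(head);
    return count;
}
```

Calling count_interfaces() repeatedly in a loop while watching RSS should show no steady growth if the list is released correctly.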

Next I will build the latest ADVimba, which comes with a newer version of the SDK. (I am using version 1.8.0 of the SDK.)

John


From: Tech-talk <tech-talk-bounces at aps.anl.gov> on behalf of John Dobbins via Tech-talk <tech-talk at aps.anl.gov>
Sent: Wednesday, January 26, 2022 8:59 PM
To: Mark Rivers <rivers at cars.uchicago.edu>
Cc: tech-talk at aps.anl.gov (tech-talk at aps.anl.gov) <tech-talk at aps.anl.gov>
Subject: Re: ADVimba memory leak ?
 
Mark,

The software/IOC run on the cluster and non-cluster computers is identical, just launched on a different machine. Both machines are SL7.9; the non-cluster machine and at least one of the cluster machines have the same kernel version. Two of the cluster machines have a slightly newer kernel.

I had thought of channel access clients also, but the number of such clients, ~ 15, is stable.

I will follow up on your additional suggestions tomorrow.


Thanks,
John


Vizzini: He didn’t fall?! Inconceivable!

Inigo Montoya: You keep using that word. I do not think it means what you think it means.






From: Mark Rivers <rivers at cars.uchicago.edu>
Sent: Wednesday, January 26, 2022 7:11 PM
To: John Dobbins <john.dobbins at cornell.edu>
Cc: tech-talk at aps.anl.gov (tech-talk at aps.anl.gov) <tech-talk at aps.anl.gov>
Subject: RE: ADVimba memory leak ?
 

Hi John,

 

A couple more suggestions:

-          Does the memory leak happen if you never start acquisition?

-          Does the memory leak happen if you comment out the ADVimbaConfig command?

 

Mark

 

 

From: Mark Rivers <rivers at cars.uchicago.edu>
Sent: Wednesday, January 26, 2022 5:15 PM
To: John Dobbins <john.dobbins at cornell.edu>
Cc: tech-talk at aps.anl.gov (tech-talk at aps.anl.gov) <tech-talk at aps.anl.gov>
Subject: Re: ADVimba memory leak ?

 

Hi John,

 

You were not responding to the most recent message in this thread, so it does not contain this important detail that you previously shared:

>> These numbers were for our production IOCs which are being run on a cluster and using procServ.

>> I looked at an IOC run without any of this and it showed no growth in memory consumption.  So I am working on investigating the differences between these setups.

 

Now you have found:

 

> This behavior is independent of the use of procServ.

 

So it appears that the problem is restricted to IOCs running on your cluster and does not happen on a non-cluster machine, correct?

 

Here are a few ideas:

  • Are the cluster IOCs running any additional software (SNL, other databases, etc.) that is not running on the non-cluster machine you tested?
  • One thing that can cause memory usage to increase on an IOC is Channel Access clients.  A poorly written client could be making new connections, rather than re-using an existing connection. Have you run "casr 1" or "casr 2" on the cluster IOCs to see what clients are connected, and if the number of connections is increasing?

Mark

 

 


From: John Dobbins <john.dobbins at cornell.edu>
Sent: Wednesday, January 26, 2022 4:24 PM
To: Mark Rivers <rivers at cars.uchicago.edu>
Cc: tech-talk at aps.anl.gov (tech-talk at aps.anl.gov) <tech-talk at aps.anl.gov>
Subject: Re: ADVimba memory leak ?

 

I should add - if I stop image acquisition the IOC continues to leak at the same rate.  

 


From: Tech-talk <tech-talk-bounces at aps.anl.gov> on behalf of John Dobbins via Tech-talk <tech-talk at aps.anl.gov>
Sent: Wednesday, January 26, 2022 5:20 PM
To: Mark Rivers <rivers at cars.uchicago.edu>
Cc: tech-talk at aps.anl.gov (tech-talk at aps.anl.gov) <tech-talk at aps.anl.gov>
Subject: Re: ADVimba memory leak ?

 

Some more baffling details:

 

I have 8 IOCs running on a Pacemaker cluster of three computers:

 

chess15 - two IOCs
chess16 - three IOCs
chess17 - three IOCs



The IOCs on all three leak memory (RSS), but each at a rate specific to whichever cluster member they are running on.

 

chess15 ~  4.8 MB/hr

chess16 ~ 166 MB/hr

chess17 ~ 105 MB/hr

 

If I move an IOC from, say, chess15 to chess16, it then leaks at the rate specific to the computer it was moved to.

 

The growth is in spurts. The step size is almost always 264 KB, and the frequency is set by the leak rate of that computer, but in any case the intervals are fairly regular.
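
As a rough cross-check of how regular the steps should be, the interval between 264 KB steps follows from dividing the step size by the leak rate. A minimal sketch in C, assuming 1 MB = 1024 KB; the helper name seconds_between_steps is hypothetical:

```c
/* Hypothetical helper: expected time in seconds between fixed-size RSS
 * steps, given an average leak rate in MB/hr and a step size in KB.
 * Assumes 1 MB = 1024 KB. */
double seconds_between_steps(double rate_mb_per_hr, double step_kb)
{
    double rate_kb_per_s = rate_mb_per_hr * 1024.0 / 3600.0;
    return step_kb / rate_kb_per_s;
}
```

With the rates listed above this gives roughly 193 s between steps on chess15, 5.6 s on chess16, and 8.8 s on chess17.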

 

This behavior is independent of the use of procServ.

 

Any ideas welcome!!!

 

I tried running an IOC with Valgrind, but image acquisition failed after a few frames.

 

John

 

 


From: Tech-talk <tech-talk-bounces at aps.anl.gov> on behalf of John Dobbins via Tech-talk <tech-talk at aps.anl.gov>
Sent: Friday, December 17, 2021 5:46 PM
To: Mark Rivers <rivers at cars.uchicago.edu>
Cc: tech-talk at aps.anl.gov (tech-talk at aps.anl.gov) <tech-talk at aps.anl.gov>
Subject: Re: ADVimba memory leak ?

 

Mark,

 

I'll need to look more carefully to determine if the growth is continuous or in spurts.

 

John


From: Mark Rivers <rivers at cars.uchicago.edu>
Sent: Friday, December 17, 2021 4:41:35 PM
To: John Dobbins <john.dobbins at cornell.edu>
Cc: tech-talk at aps.anl.gov (tech-talk at aps.anl.gov) <tech-talk at aps.anl.gov>
Subject: RE: ADVimba memory leak ?

 

Hi John,

 

I just did some testing on one of our Allied Vision cameras.

 

Model                     GT1380

Firmware Version  00.01.54.17562

SDK Version         1.8.2

Driver Version       1.3

ADCore Version    3.11

Operating system   Centos8

 

I ran the following command to get the virtual memory size and the resident memory size, both in KB.

$ date;ps -o vsz,rss,cmd 13422
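
The same resident-size figure can also be sampled from a program on Linux by reading the VmRSS line of /proc/<pid>/status. This is only a sketch: the function name read_rss_kb is hypothetical, and it assumes a Linux /proc filesystem.

```c
#include <stdio.h>
#include <sys/types.h>

/* Hypothetical helper: return the resident set size of process `pid`
 * in kB, read from /proc/<pid>/status. Returns -1 if the file cannot
 * be opened or contains no VmRSS line. */
long read_rss_kb(pid_t pid)
{
    char path[64];
    char line[256];
    long rss_kb = -1;

    snprintf(path, sizeof path, "/proc/%ld/status", (long)pid);
    FILE *fp = fopen(path, "r");
    if (fp == NULL) {
        return -1;
    }

    /* Scan line by line until the "VmRSS:  <n> kB" entry is found. */
    while (fgets(line, sizeof line, fp) != NULL) {
        if (sscanf(line, "VmRSS: %ld kB", &rss_kb) == 1) {
            break;
        }
    }

    fclose(fp);
    return rss_kb;
}
```

Calling read_rss_kb() at fixed intervals and logging the result with a timestamp gives the same kind of record as the repeated ps runs.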

 

In the output below I have put the date on the same line as the ps output.

 

This is just after the IOC started, acquisition has not been started.

 

                                                      VSZ      RSS      CMD

Fri Dec 17 14:10:26 CST 2021 5993748 76732 /corvette/home/epics/support/areaDetector/ADVimba/iocs/vimbaIOC/bin/linux-x86_64-centos8/vimbaApp st.cmd

Fri Dec 17 14:10:33 CST 2021 5993748 76732 /corvette/home/epics/support/areaDetector/ADVimba/iocs/vimbaIOC/bin/linux-x86_64-centos8/vimbaApp st.cmd

Fri Dec 17 14:12:06 CST 2021 5993748 76732 /corvette/home/epics/support/areaDetector/ADVimba/iocs/vimbaIOC/bin/linux-x86_64-centos8/vimbaApp st.cmd

Fri Dec 17 14:18:16 CST 2021 5993748 76732 /corvette/home/epics/support/areaDetector/ADVimba/iocs/vimbaIOC/bin/linux-x86_64-centos8/vimbaApp st.cmd

 

So it ran for about 8 minutes and there was no increase in VSZ or RSS.  Thus, I do not see the increase of 0.15 MB/min (150 KB/min) that you see when not acquiring.

 

I now started acquisition.  This camera is 1360x1024 pixels.  Each image was thus 1360 KB.  I was acquiring at 5 frames/s, and I had the following plugins active:

NDPluginStdArray, NDPluginPva, NDPluginTransform, NDPluginROI, NDPluginStat, NDFileJPEG, NDFileTIFF.  The file plugins were not saving data.

 

                                                      VSZ      RSS      CMD

Fri Dec 17 14:19:10 CST 2021 6089256 125504 /corvette/home/epics/support/areaDetector/ADVimba/iocs/vimbaIOC/bin/linux-x86_64-centos8/vimbaApp st.cmd

Fri Dec 17 14:19:57 CST 2021 6089256 127336 /corvette/home/epics/support/areaDetector/ADVimba/iocs/vimbaIOC/bin/linux-x86_64-centos8/vimbaApp st.cmd

Fri Dec 17 14:23:26 CST 2021 6155792 134424 /corvette/home/epics/support/areaDetector/ADVimba/iocs/vimbaIOC/bin/linux-x86_64-centos8/vimbaApp st.cmd

Fri Dec 17 14:44:48 CST 2021 6155792 134848 /corvette/home/epics/support/areaDetector/ADVimba/iocs/vimbaIOC/bin/linux-x86_64-centos8/vimbaApp st.cmd

Fri Dec 17 14:45:55 CST 2021 6155792 134848 /corvette/home/epics/support/areaDetector/ADVimba/iocs/vimbaIOC/bin/linux-x86_64-centos8/vimbaApp st.cmd

Fri Dec 17 14:51:39 CST 2021 6155792 134848 /corvette/home/epics/support/areaDetector/ADVimba/iocs/vimbaIOC/bin/linux-x86_64-centos8/vimbaApp st.cmd

 

When acquisition was turned on there was an immediate jump in VSZ and RSS.  This is expected because a number of NDArrays have been allocated. RSS was still increasing 4 minutes after acquisition started.  In the ~21 minutes between 14:23:26 and 14:44:48 RSS increased by only about 400 KB, which is less than the 1360 KB in a single image and less than 20 KB/min.  After 14:44:48 there was no further increase in RSS for the 7 minutes that I tested.

 

I then stopped acquisition.

 

                                                      VSZ      RSS      CMD

Fri Dec 17 14:51:57 CST 2021 6142152 121208 /corvette/home/epics/support/areaDetector/ADVimba/iocs/vimbaIOC/bin/linux-x86_64-centos8/vimbaApp st.cmd

Fri Dec 17 14:52:28 CST 2021 6142152 121208 /corvette/home/epics/support/areaDetector/ADVimba/iocs/vimbaIOC/bin/linux-x86_64-centos8/vimbaApp st.cmd

Fri Dec 17 14:59:34 CST 2021 6142152 121208 /corvette/home/epics/support/areaDetector/ADVimba/iocs/vimbaIOC/bin/linux-x86_64-centos8/vimbaApp st.cmd

Fri Dec 17 15:05:02 CST 2021 6142152 121208 /corvette/home/epics/support/areaDetector/ADVimba/iocs/vimbaIOC/bin/linux-x86_64-centos8/vimbaApp st.cmd

 

When acquisition was stopped VSZ and RSS both dropped.  RSS did not increase at all in the 13 minutes I observed after stopping acquisition.

 

Is the increase in RSS size you are seeing continuous, or does it suddenly increase at specific times?

 

Mark

 

 

From: John Dobbins <john.dobbins at cornell.edu>
Sent: Wednesday, December 15, 2021 2:47 PM
To: Mark Rivers <rivers at cars.uchicago.edu>
Cc: tech-talk at aps.anl.gov (tech-talk at aps.anl.gov) <tech-talk at aps.anl.gov>
Subject: Re: ADVimba memory leak ?

 

Mark,

 

An IOC which is not acquiring from the camera grows at ~ 0.15 MB per minute. (resident memory grows, virtual memory is constant)

 

An IOC acquiring an image (1116x836, 5.5 Hz), no plugins, grows at ~ 0.24 MB per minute

 

[ note:  0.24 MB/min  -->  10 GB after a month]

 

Enabling NDPluginStdArrays, NDPluginOverLay, NDPluginROI, NDPluginStats doesn't seem to produce additional memory growth.

 

John

 


From: Tech-talk <tech-talk-bounces at aps.anl.gov> on behalf of John Dobbins via Tech-talk <tech-talk at aps.anl.gov>
Sent: Monday, December 13, 2021 8:50 PM
To: Mark Rivers <rivers at cars.uchicago.edu>
Cc: tech-talk at aps.anl.gov (tech-talk at aps.anl.gov) <tech-talk at aps.anl.gov>
Subject: Re: ADVimba memory leak ?

 

Sorry, I should have said, Linux. I will investigate and report. 

 

John

 


From: Mark Rivers <rivers at cars.uchicago.edu>
Sent: Monday, December 13, 2021 7:36:10 PM
To: John Dobbins <john.dobbins at cornell.edu>
Cc: tech-talk at aps.anl.gov (tech-talk at aps.anl.gov) <tech-talk at aps.anl.gov>
Subject: RE: ADVimba memory leak ?

 

Hi John,

 

Is this on Linux or Windows?

 

Can you use “top” or other memory monitoring tools to see whether the increase in memory usage corresponds to specific actions, such as stopping and starting the camera, errors about dropped frames, use of specific plugins, etc.?

 

Mark

 

From: Tech-talk <tech-talk-bounces at aps.anl.gov> On Behalf Of John Dobbins via Tech-talk
Sent: Monday, December 13, 2021 6:26 PM
To: tech-talk at aps.anl.gov
Subject: ADVimba memory leak ?

 

All,

 

We are using ADVimba with Mako G319B cameras. Over a period of three months memory usage has grown by an order of magnitude. Has anyone else encountered this?

 

Firmware Version 00.01.54.21000

SDK Version           1.8.0

Driver Version       1.1

ADCore Version     3.7

 

[ I can try a newer version of ADVimba in January.]

 

Regards,

 

John Dobbins

 

Research Support Specialist

Cornell High Energy Synchrotron Source

Cornell University

 

 

 

 

 

