SYMPTOMS
About three weeks ago one of our application developers was attempting
to add a large number of records to an IOC. The IOC already had about 3100
records. The new records were being added via 40 dbLoadRecord commands
that each loaded the same template of 85 records. Thus 3400 new records
were being added for a total of about 6500 records. It took a REALLY long
time to boot. (I think I remember about 25 minutes).
OBSERVATIONS
It was observed that the new dbLoadRecord commands took longer and longer
to execute. The first took about 2 seconds and the last was up to about
14 seconds. iocInit also took a long time to complete.
The ioc was not using NFS, thus each dbLoadRecord takes about 1.5 seconds
to open it's file. Thus not using NFS explains the 2 seconds but NOT the 14
seconds.
Running memShow showed that the memory free list was very long, i.e.
many thousands of elements. Running "memShow 1" showed that most of the free
list entries were very short, in fact most were 16 bytes, which appears
to be the minimum vxWorks allocates.
DIAGNOSIS
We (Jim Kowalkowski and I) realized that the lex/yacc routines dbLoadDatabase,
dbLoadRecords, and dbLoadTemplates frequently call malloc for storage that
is soon freed. However other storage is allocated and not freed between the
malloc and free. Thus the long free lists could be caused by the algorithms
used by malloc and free. The following routine demonstrates that this
is the case. Just run it for ntimes=100 on a freshly booted ioc (no epics)
and you will see that the free list is starting to grow.
#include <stddef.h>
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
int testmem(int ntimes)
{
void *a;
void *b;
void *c;
int i;
for(i=0; i<ntimes; i++) {
b = malloc(20);
a = malloc(80);
c = malloc(200);
free(a);
}
return(0);
}
SOLUTION
Jim and I wrote a small set of routines:
dbmfInit
dbmfMalloc
dbmfFree
dbmfFreeChunks
dbmfShow
These routines can be used by code that malloc and then free storage.
The following code was changed to use the new set of routines:
macLib
dbLoadDatabase
dbLoadRecords
dbLoadTemplates
dbStaticLib
NOTES:
1) While making these changes dbLoadRecords and dbLoadTemplates were
also changed to use macLib.
2) It was not possible to remove all malloc/free combinations.
RESULTS
Because we are in operation, I could not easily test the ioc described above.
Instead the tests were made (on my test stand) on another application that
also booted slowly.
This application used both dbLoadRecords and dbLoadTemplates so it was
a good test case. The application has 2940 records.
This application was taking just more than 15 minutes to boot. It
was,however, using ftp. Thus the first change was to make it use NFS.
Thus reduced the boot time to 5.2 minutes. Note that this is expected because
the application makes extensive use of dbLoadTemplates, which causes
many file opens.
The results using NFS are:
Before iocInit
tickGet()/60
value = 74 = 0x4a = 'J'
memShow
status bytes blocks avg block max block
------ --------- -------- ---------- ----------
current
free 13045420 2211 5900 7156576
alloc 3200732 27444 116 -
cumulative
alloc 18644004 99020 188 -
value = 0 = 0x0
iocInit
#
# ..... Some messages
#
tickGet()/60
value = 311 = 0x137
memShow
status bytes blocks avg block max block
------ --------- -------- ---------- ----------
current
free 11682032 3767 3101 7156576
alloc 4564120 50006 91 -
cumulative
alloc 20112144 124456 161 -
value = 0 = 0x0
Note the large number of free blocks. Also note that iocInit took 237 seconds.
This application has many records which use Allen Bradley DCM support
(not in base). The DCM init_record routines were all calling malloc and
free. In addition to the changes to epics base, I also changed the DCM support
so that it did not call malloc/free unless absolutely necessary.
The new results are:
tickGet()/60
value = 63 = 0x3f = '?'
memShow
status bytes blocks avg block max block
------ --------- -------- ---------- ----------
current
free 13063388 403 32415 7156576
alloc 3182764 27500 115 -
cumulative
alloc 18388300 36293 506 -
value = 0 = 0x0
iocInit
#
# ..... Some messages
#
tickGet()/60
value = 73 = 0x49 = 'I'
memShow
status bytes blocks avg block max block
------ --------- -------- ---------- ----------
current
free 11699392 10 1169939 7156576
alloc 4546760 50064 90 -
cumulative
alloc 19764408 58911 335 -
Note that iocInit now only takes 10 seconds.
Also the final number of free blocks is onlt 10.
Before iocInit the number of free blocks is 403.
This is still large but this application uses many dbLoadTemplate
commands each of which create many sets of records. Thus it is a worst
case kind of test. Note also that iocInit used up almost all of the
free list entries.
Although I have not had a chance to make tests, my guess is that
the malloc/free problem causes the boot time to increase far more rapidly
than linearly with the number of records.
These changes will be in the next 3.13 release.
Marty Kraimer
- Navigate by Date:
- Prev:
Big Databases -> Slow Boot Marty Kraimer
- Next:
SOSH97 - Registration Reminder Rolf Keitel
- Index:
1994
1995
1996
<1997>
1998
1999
2000
2001
2002
2003
2004
2005
2006
2007
2008
2009
2010
2011
2012
2013
2014
2015
2016
2017
2018
2019
2020
2021
2022
2023
2024
- Navigate by Thread:
- Prev:
Big Databases -> Slow Boot Marty Kraimer
- Next:
SOSH97 - Registration Reminder Rolf Keitel
- Index:
1994
1995
1996
<1997>
1998
1999
2000
2001
2002
2003
2004
2005
2006
2007
2008
2009
2010
2011
2012
2013
2014
2015
2016
2017
2018
2019
2020
2021
2022
2023
2024
|