Subject: | RE: NDArrayPool:reserve ERROR, reference count = 0, should be = 1 |
From: | "Daykin, Evan via Tech-talk" <tech-talk at aps.anl.gov> |
To: | Mark Rivers <rivers at cars.uchicago.edu>, Michael Davidsaver <mdavidsaver at gmail.com> |
Cc: | "tech-talk at aps.anl.gov" <tech-talk at aps.anl.gov> |
Date: | Tue, 3 May 2022 21:19:12 +0000 |
Thanks, Mark. I have added these changes, as well. One more question: where (and how) should I properly destroy NDPlugins and clients? In my test class constructor, I am creating plugins and clients like this:

    this->pSimDetector = new simDetector("SIM1", 1920, 1200, NDUInt16, 0, 0, 0, 0);
    this->pSimClient = new asynPortClient("SIM1");
    this->pNDTemp = new NDPluginTemperature("TEST_PORT",10,0,"SIM1",0,10,10000,0,128000);
    this->pTempClient = new asynPortClient("TEST_PORT");
    this->pTIFFPlugin = new NDFileTIFF("TIFF1", 20, 0, "TEST_PORT",0,0,0);
    this->pTIFFClient = new asynPortClient("TIFF1");
    this->pStatsPlugin = new NDPluginStats("STATS1", 20, 0, "TEST_PORT",0,0,0,0,128000,1);
    this->pStatsClient = new asynPortClient("STATS1");

At the moment, I am not explicitly deleting any of these. This is causing small memory leaks, on the order of 24 B directly and 25 kB indirectly lost per instance, but if I delete them in a destructor, valgrind is indicating all of their memory to be ‘possibly lost’.
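
As an illustration of the ownership question above, here is a minimal sketch of one way to make teardown order explicit in a test fixture: hold each object in a std::unique_ptr member, declared so that every client is destroyed before the port it connects to. FakePort, FakeClient, Fixture and destructionOrder() are hypothetical stand-ins, not areaDetector or asyn classes, and the sketch does not establish whether the real plugin and driver destructors shut down cleanly. It relies only on the C++ rule that members are destroyed in reverse declaration order.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a driver or plugin port (not an areaDetector class).
struct FakePort {
    FakePort(std::string n, std::vector<std::string> &sink)
        : name(std::move(n)), log(sink) {}
    ~FakePort() { log.push_back("port " + name); }
    std::string name;
    std::vector<std::string> &log;  // records destruction order
};

// Hypothetical stand-in for a client that refers to a port.
struct FakeClient {
    explicit FakeClient(FakePort &p) : port(p) {}
    // The port is still alive here because it is declared before the client.
    ~FakeClient() { port.log.push_back("client " + port.name); }
    FakePort &port;
};

// Members are destroyed in reverse declaration order:
// tempClient, temp, simClient, sim.
struct Fixture {
    explicit Fixture(std::vector<std::string> &sink)
        : sim(new FakePort("SIM1", sink)),
          simClient(new FakeClient(*sim)),
          temp(new FakePort("TEST_PORT", sink)),
          tempClient(new FakeClient(*temp)) {}
    std::unique_ptr<FakePort> sim;
    std::unique_ptr<FakeClient> simClient;
    std::unique_ptr<FakePort> temp;
    std::unique_ptr<FakeClient> tempClient;
};

// Builds a fixture, lets it go out of scope, and returns the recorded order.
std::vector<std::string> destructionOrder() {
    std::vector<std::string> log;
    {
        Fixture fixture(log);
        assert(log.empty());  // nothing is destroyed while the fixture is alive
    }
    return log;
}
```

Calling destructionOrder() returns the clients and ports in reverse order of creation, so the downstream objects go away before the upstream ones they depend on.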
From: Mark Rivers <rivers at cars.uchicago.edu>
[EXTERNAL] This email originated from outside of FRIB
Hi Evan,

These lines should be deleted in your code:

    pArray->getInfo(&arrayInfo);
    this->outputArray->dims[arrayInfo.xDim].size = pArray->dims[arrayInfo.xDim].size;
    this->outputArray->dims[arrayInfo.yDim].size = pArray->dims[arrayInfo.yDim].size;

    this->outputArray = this->pNDArrayPool->copy(pArray,this->outputArray,false,true,true);

More fundamentally you only changed the dimensions of the NDArray without changing the size of the underlying storage. That is a recipe for disaster!

Given the changes I suggested, there is really no reason for this->outputArray to be a class member at all. Just make it a local variable in your processCallbacks() method. Then you don't need to worry about the destructor.

You probably don't need to call pArray->getInfo(), and you can change to this call:

    this->doTemperatureConversion(pArray, outputArray);

Mark

From: Daykin, Evan <daykin at frib.msu.edu>

Hi Mark,

Thanks for the pointers. I am pretty new to AreaDetector plugin development.

>I think this means you have set callbacksBlock=Enable for this plugin. Is that correct?

Yes, that was the case. I wrote the tests first, then the implementation. The 1 is a leftover arbitrary choice I forgot to change to 0. It is now set to ‘not blocking’.

>You are measuring the elapsed time for execution.

Removed now.

>You are allocating a new inputArray with NDArrayPool->convert(), converting to NDUInt16…

I have now deleted the intermediate ‘inputArray’ step. I guess I was paranoid about mistakenly touching the input array. I am now also release()ing outputArray in the destructor.

>What happens if the new input array has different dimensions from the one that was used when creating this->outputArray?

Dimensions were checked and modified in doTemperatureConversion, but this breaks naming and compartmentalization conventions. Dimension checking is now in processCallbacks.
Here is the revised processCallbacks:

    void NDPluginTemperature::processCallbacks(NDArray *pArray){
        static const char *functionName = "processCallbacks";
        NDArrayInfo_t arrayInfo;
        NDPluginDriver::beginProcessCallbacks(pArray);
        if(pArray->dataType != NDUInt16){
            asynPrint(this->pasynUserSelf, ASYN_TRACE_ERROR, "%s:%s: Only UInt16 supported.", driverName, functionName);
            return;
        }
        std::string lastCalFileName = this->calibrationFileName;
        getStringParam(this->calibrationFileNameIdx, this->calibrationFileName);
        if(this->calibrationFileName != lastCalFileName){
            this->processCalibrationFile();
        }
        int arrayCallbacks;
        getIntegerParam(NDArrayCallbacks, &arrayCallbacks);
        if(arrayCallbacks==1){
            if(NULL == this->outputArray){
                this->outputArray = this->pNDArrayPool->copy(pArray,this->outputArray,false,true,true);
            }
            pArray->getInfo(&arrayInfo);
            this->outputArray->dims[arrayInfo.xDim].size = pArray->dims[arrayInfo.xDim].size;
            this->outputArray->dims[arrayInfo.yDim].size = pArray->dims[arrayInfo.yDim].size;
            //unlock while the plug-and-chug happens. No shared resources are accessed at this time.
            this->unlock();
            this->doTemperatureConversion(pArray, this->outputArray, &arrayInfo);
            this->lock();
            setIntegerParam(NDArraySizeX, (int)outputArray->dims[arrayInfo.xDim].size);
            setIntegerParam(NDArraySizeY, (int)outputArray->dims[arrayInfo.yDim].size);
            callParamCallbacks();
        }
        NDPluginDriver::endProcessCallbacks(outputArray, false, true);
    }

The test now runs for six frames and fails on the seventh, in endProcessCallbacks. Regular run, then GDB backtrace at first occurrence of “cantProceed”:

    NDArray.uniqueId=1
    NDArray.uniqueId=2
    NDArray.uniqueId=3
    NDArray.uniqueId=4
    NDArray.uniqueId=5
    NDArray.uniqueId=6
    NDArrayPool:reserve ERROR, reference count = 0, should be >= 1, pArray=0x7f3470001880
    Thread TEST_PORT_Plugin_1 (0x559fbb49aa50) can't proceed, suspending.
    Dumping a stack trace of thread 'TEST_PORT_Plugin_1':
    NDArray.uniqueId=7
    [ 0x7f348a037ab3]: /lib/x86_64-linux-gnu/libCom.so.3.15.9(epicsStackTrace+0x73)
    [ 0x7f348a028216]: /lib/x86_64-linux-gnu/libCom.so.3.15.9(cantProceed+0xc6)
    [ 0x7f34897ebb08]: /lib/x86_64-linux-gnu/libADBase.so.3.11(_ZN11NDArrayPool7reserveEP7NDArray+0x78)
    [ 0x7f3489f2cdd8]: /lib/x86_64-linux-gnu/libNDPlugin.so.3.11(_ZN14NDPluginDriver14driverCallbackEP8asynUserPv+0x258)
    [ 0x7f3489e86155]: /lib/x86_64-linux-gnu/libasyn.so.4.38(_ZN14asynPortDriver25doCallbacksGenericPointerEPvii+0x1f5)
    [ 0x7f3489f2d445]: /lib/x86_64-linux-gnu/libNDPlugin.so.3.11(_ZN14NDPluginDriver19endProcessCallbacksEP7NDArraybb+0x275)
    [ 0x7f3489d93f6e]: /home/daykin/git/areadetector-temperature/lib/linux-x86_64/libNDPluginTemperature.so(_ZN19NDPluginTemperature16processCallbacksEP7NDArray+0x25e)
    [ 0x7f3489f2d8a9]: /lib/x86_64-linux-gnu/libNDPlugin.so.3.11(_ZN14NDPluginDriver11processTaskEv+0x1c9)
    [ 0x7f348a02c3fb]: /lib/x86_64-linux-gnu/libCom.so.3.15.9(epicsThreadCallEntryPoint+0x3b)
    [ 0x7f348a0320bb]: /lib/x86_64-linux-gnu/libCom.so.3.15.9(epicsSnprintf+0x7bb)
    [ 0x7f34899bdea7]: /lib/x86_64-linux-gnu/libpthread.so.0(start_thread+0xd7)
    [ 0x7f3489ad4def]: /lib/x86_64-linux-gnu/libc.so.6(clone+0x3f)

    Thread 12 "TEST_PORT_Plugi" hit Breakpoint 1, 0x00007ffff7f72150 in cantProceed () from /lib/x86_64-linux-gnu/libCom.so.3.15.9
    (gdb) bt
    #0 0x00007ffff7f72150 in cantProceed () from /lib/x86_64-linux-gnu/libCom.so.3.15.9
    #1 0x00007ffff7735b08 in NDArrayPool::reserve(NDArray*) () from /lib/x86_64-linux-gnu/libADBase.so.3.11
    #2 0x00007ffff7e76dd8 in NDPluginDriver::driverCallback(asynUser*, void*) () from /lib/x86_64-linux-gnu/libNDPlugin.so.3.11
    #3 0x00007ffff7dd0155 in asynPortDriver::doCallbacksGenericPointer(void*, int, int) () from /lib/x86_64-linux-gnu/libasyn.so.4.38
    #4 0x00007ffff7e77445 in NDPluginDriver::endProcessCallbacks(NDArray*, bool, bool) () from /lib/x86_64-linux-gnu/libNDPlugin.so.3.11
    #5 0x00007ffff7cddf6e in NDPluginTemperature::processCallbacks (this=0x5555555c8970, pArray=0x7fffd0001e70) at ../NDPluginTemperature.cpp:100
    #6 0x00007ffff7e778a9 in NDPluginDriver::processTask() () from /lib/x86_64-linux-gnu/libNDPlugin.so.3.11
    #7 0x00007ffff7f763fb in epicsThreadCallEntryPoint () from /lib/x86_64-linux-gnu/libCom.so.3.15.9
    #8 0x00007ffff7f7c0bb in ?? () from /lib/x86_64-linux-gnu/libCom.so.3.15.9
    #9 0x00007ffff7907ea7 in start_thread (arg=<optimized out>) at pthread_create.c:477
    #10 0x00007ffff7a1edef in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

From: Mark Rivers <rivers at cars.uchicago.edu>
Hi Evan,

> Thanks, that was the problem. Setting copyArray=true makes the issue go away. Are there any adverse side-effects of doing so?

This argument to NDPluginDriver::endProcessCallbacks() is documented here: It says:
Your driver is calling endProcessCallbacks with a new NDArray that processCallbacks() created. Thus you must set copyArray=false, as you were originally doing.

    NDPluginDriver::endProcessCallbacks(outputArray, false, true);

I should have looked more closely at your original message. I assumed the problem was in NDPluginDriver::endProcessCallbacks with a call to NDArray::release(). However, the problem is actually in NDPluginDriver::beginProcessCallbacks with a call to NDArray::reserve().

    NDArrayPool:reserve ERROR, reference count = 0, should be >= 1, pArray=0x7f453c001d80
    Thread SimDetTask (0x5621bbbb0c80) can't proceed, suspending.
    Dumping a stack trace of thread 'SimDetTask':
    [ 0x7f45538ffab3]: /lib/x86_64-linux-gnu/libCom.so.3.15.9(epicsStackTrace+0x73)
    [ 0x7f45538f0216]: /lib/x86_64-linux-gnu/libCom.so.3.15.9(cantProceed+0xc6)
    [ 0x7f4553197b08]: /lib/x86_64-linux-gnu/libADBase.so.3.11(_ZN11NDArrayPool7reserveEP7NDArray+0x78)
    [ 0x7f455305db37]: /lib/x86_64-linux-gnu/libNDPlugin.so.3.11(_ZN14NDPluginDriver21beginProcessCallbacksEP7NDArray+0x367)
    [ 0x7f455375df1e]: /home/daykin/git/areadetector-temperature/lib/linux-x86_64/libNDPluginTemperature.so(_ZN19NDPluginTemperature16processCallbacksEP7NDArray+0xae)
    [ 0x7f455305dd53]: /lib/x86_64-linux-gnu/libNDPlugin.so.3.11(_ZN14NDPluginDriver14driverCallbackEP8asynUserPv+0x1d3)
    [ 0x7f4553851155]: /lib/x86_64-linux-gnu/libasyn.so.4.38(_ZN14asynPortDriver25doCallbacksGenericPointerEPvii+0x1f5)
    [ 0x7f4553809974]: /usr/lib/epics/lib/linux-x86_64/libsimDetector.so(_ZN11simDetector7simTaskEv+0x4e4)
    [ 0x7f45538fa0bb]: /lib/x86_64-linux-gnu/libCom.so.3.15.9(epicsSnprintf+0x7bb)
    [ 0x7f4553387ea7]: /lib/x86_64-linux-gnu/libpthread.so.0(start_thread+0xd7)
    [ 0x7f455349edef]: /lib/x86_64-linux-gnu/libc.so.6(clone+0x3f)

From this stack trace it looks like beginProcessCallbacks is being called from the simDetTask. I think this means you have set callbacksBlock=Enable for this plugin. Is that correct? Otherwise the task with the error should be your plugin tasks, not the simDetTask. Is there a reason you set callbacksBlock? That should not be a problem, but it is unusual and I am just curious.

I have a couple of comments on your processCallbacks function.

You are measuring the elapsed time for execution.

    epicsTimeStamp after;
    epicsTimeGetCurrent(&after);
    double delta = epicsTimeDiffInSeconds(&after, &before);
    cout<<"Took "<<delta<<" s"<<endl;
    setDoubleParam(runTimeIdx, delta);

But the base class already does this for you, so you should not need to do this.

You are allocating a new inputArray with NDArrayPool->convert(), converting to NDUInt16. But you already know that the input array (pArray) has type NDUInt16, so why make a new array? You can just pass pArray to doTemperatureConversion, as long as you don't modify that array.

    this->pNDArrayPool->convert(pArray,&(this->inputArray), NDUInt16);

More importantly you have allocated this->inputArray, but you have never released it. This means you have a memory leak. Once you are done with this->inputArray you should call this->inputArray->release(). But as I said above I am not sure you need to create this array at all.

You are only allocating this->outputArray if it is currently NULL. You then pass this->outputArray to doTemperatureConversion(). What happens if the new input array has different dimensions from the one that was used when creating this->outputArray? If the new array is larger then you will probably get an access violation unless doTemperatureConversion() is checking the dimensions of the output array.

This does not solve your original problem. For some reason the NDArray being passed to NDPluginDriver::beginProcessCallbacks() has a reference count of 0. That should never happen, because if the reference count is 0 then it should be in the free list and not in active use. You should switch the flag to endProcessCallbacks back to the correct value of false and then try to track down the problem.

Mark

From: Daykin, Evan <daykin at frib.msu.edu>

Thanks, that was the problem. Setting copyArray=true makes the issue go away. Are there any adverse side-effects of doing so?
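
The reference-count rule Mark describes (an array whose count is 0 belongs on the free list, so calling reserve() on it is an error) can be illustrated with a toy model. This is only a sketch under stated assumptions: ToyArray, ToyPool, processFrameLocal() and staleReserveFails() are hypothetical names, not the real NDArray or NDArrayPool, and the stale-pointer case shows one way such an error can arise in general, not a diagnosis of the bug in this thread. The per-frame function follows the suggestion to keep the output array local to each call instead of caching it in a class member.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Toy frame with a reference count (hypothetical; not the real NDArray).
struct ToyArray {
    int referenceCount = 0;
    std::vector<unsigned short> data;
};

// Toy pool with a free list (hypothetical; not the real NDArrayPool).
class ToyPool {
public:
    ToyPool() = default;
    ToyPool(const ToyPool &) = delete;
    ToyPool &operator=(const ToyPool &) = delete;
    ~ToyPool() { for (ToyArray *a : all_) delete a; }

    // Hands out an array with reference count 1, reusing the free list if possible.
    ToyArray *alloc(std::size_t n) {
        ToyArray *a = nullptr;
        if (!free_.empty()) { a = free_.back(); free_.pop_back(); }
        else { a = new ToyArray; all_.push_back(a); }
        a->data.assign(n, 0);
        a->referenceCount = 1;
        return a;
    }
    // A consumer may only take a reference to an array that is still in use.
    void reserve(ToyArray *a) {
        if (a->referenceCount < 1)
            throw std::logic_error("reserve: reference count = 0, should be >= 1");
        ++a->referenceCount;
    }
    // Dropping the last reference returns the array to the free list.
    void release(ToyArray *a) {
        if (a->referenceCount < 1)
            throw std::logic_error("release: reference count already 0");
        if (--a->referenceCount == 0) free_.push_back(a);
    }
    std::size_t freeCount() const { return free_.size(); }

private:
    std::vector<ToyArray *> all_;   // every array ever created, for cleanup
    std::vector<ToyArray *> free_;  // arrays with reference count 0
};

// Per-frame pattern: the output array is local to the call.
bool processFrameLocal(ToyPool &pool, std::size_t n) {
    ToyArray *out = pool.alloc(n);  // count 1, owned by this call
    pool.reserve(out);              // a downstream consumer takes a reference: 2
    pool.release(out);              // the consumer is done: 1
    pool.release(out);              // this call is done: 0, back on the free list
    return out->referenceCount == 0;
}

// Stale pointer: keeping a pointer after its last release and reserving it again.
bool staleReserveFails(ToyPool &pool) {
    ToyArray *cached = pool.alloc(4);
    pool.release(cached);           // count 0, array is on the free list
    try {
        pool.reserve(cached);       // violates the invariant
    } catch (const std::logic_error &) {
        return true;
    }
    return false;
}
```

In this model a balanced sequence of alloc, reserve and release leaves the array on the free list, while reserving through a pointer that was kept past its final release is rejected.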