Subject: | Re: EPICS/asyn problem |
From: | Marty Kraimer <[email protected]> |
To: | Mark Rivers <[email protected]> |
Cc: | Eric Norum <[email protected]>, [email protected] |
Date: | Tue, 16 Feb 2010 05:28:38 -0500 |
Folks,
I have run into a fairly serious problem with asyn and EPICS base.
Standard asyn device support allows for I/O Intr scanned records. The driver does a callback to device support with a new value, and device support in turn calls scanIoRequest to process the record. When the record processes it reads the "cached" data that was passed to the callback, rather than actually reading from the device.
This generally works fine. However, I have just run into a serious problem when the number of callback requests is very large. This is happening at iocInit in a system with lots of records that are I/O Intr scanned. The problem is a ring buffer overflow in the cbLow task.
The sequence of calls is
driver->asyn device support->scanIoRequest->callbackRequest.
This is the code from EPICS base for callbackRequest:
void callbackRequest(CALLBACK *pcallback)
{
    int priority;
    int pushOK;
    int lockKey;

    if (!pcallback) {
        epicsPrintf("callbackRequest called with NULL pcallback\n");
        return;
    }
    priority = pcallback->priority;
    if (priority < 0 || priority >= NUM_CALLBACK_PRIORITIES) {
        epicsPrintf("callbackRequest called with invalid priority\n");
        return;
    }
    if (ringOverflow[priority]) return;

    lockKey = epicsInterruptLock();
    pushOK = epicsRingPointerPush(callbackQ[priority], pcallback);
    epicsInterruptUnlock(lockKey);

    if (!pushOK) {
        errlogPrintf("callbackRequest: %s ring buffer full\n",
            threadName[priority]);
        ringOverflow[priority] = TRUE;
    }
    epicsEventSignal(callbackSem[priority]);
}
Note that it is a void function, and thus does not return an error status when the ring buffer is full. Why not?
This is the code for scanIoRequest:
void scanIoRequest(IOSCANPVT pioscanpvt)
{
    int prio;

    if (scanCtl != ctlRun) return;
    for (prio = 0; prio < NUM_CALLBACK_PRIORITIES; prio++) {
        io_scan_list *piosl = &pioscanpvt[prio];

        if (ellCount(&piosl->scan_list.list) > 0)
            callbackRequest(&piosl->callback);
    }
}
It is also a void function, so it could not pass an error back to its caller even if callbackRequest were changed to return one.
The ring pointer buffer is created with this line in callbackInitOnce:
callbackQ[i] = epicsRingPointerCreate(callbackQueueSize);
callbackQueueSize is defined with this line:
static int callbackQueueSize = 2000;
Thus, if there are more than 2000 unprocessed scanIoRequest calls there will be an error which cannot be caught.
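To make the failure mode concrete, here is a minimal stand-alone model of the behavior described above. It is only a sketch and does not use EPICS base: ToyCallbackQueue, toyCallbackRequest and toyBurst are hypothetical names, the queue is reduced to a counter, and nothing drains it, which approximates a burst of requests arriving faster than the callback task can run. The "dropped" counter exists only so the sketch can report the losses that a void function cannot report to its caller.

```c
#include <assert.h>

/* Mirrors callbackQueueSize from the post; everything else is hypothetical. */
#define TOY_QUEUE_SIZE 2000

typedef struct {
    int queued;    /* requests currently waiting in the queue */
    int overflow;  /* plays the role of ringOverflow[priority] */
    int dropped;   /* added for this sketch only, to count lost requests */
} ToyCallbackQueue;

/* Like callbackRequest in the post, this returns void: the caller
 * cannot tell whether the request was queued or thrown away. */
static void toyCallbackRequest(ToyCallbackQueue *q)
{
    if (q->overflow) {              /* already overflowed: drop silently */
        q->dropped++;
        return;
    }
    if (q->queued >= TOY_QUEUE_SIZE) {
        q->overflow = 1;            /* first failed push sets the flag */
        q->dropped++;
        return;
    }
    q->queued++;
}

/* Issue nRequests back to back with no consumer running and
 * return how many of them were lost. */
static int toyBurst(ToyCallbackQueue *q, int nRequests)
{
    int i;

    assert(nRequests >= 0);
    for (i = 0; i < nRequests; i++)
        toyCallbackRequest(q);
    return q->dropped;
}
```

In this model a burst of 4000 requests into an empty queue leaves 2000 queued and 2000 lost, and none of the 4000 calls gives the caller any indication of which outcome it got.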
This is a serious problem for me. My XIA detector driver is doing callbacks to device support for more than 4000 records shortly after iocInit. This will soon grow to 20,000 when a 100-element detector at the Australian Synchrotron starts to use this driver. I am now getting these errors when I start a system with 16 detectors:
callbackRequest: cbLow ring buffer full
This very large number of callbacks will typically only happen right after iocInit. In this case I would be willing to change the asyn device support as follows:
- Test if scanIoRequest succeeded
- If it failed because of ring buffer full then do a short sleep and try again, up to some maximum number of retries
But I cannot implement this logic because scanIoRequest does not return a failure status, so I don't know if it worked or not.
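The retry logic I have in mind would look roughly like the sketch below. It is a stand-alone illustration, not code against EPICS base: ToyRing, toyRequest, toyPop, requestWithRetry and the two wait helpers are all hypothetical, and toyRequest stands in for a scanIoRequest that returned a status, which the real function does not do. The wait step is passed in as a function pointer so the sketch can be exercised without threads; in an IOC it would presumably be a short epicsThreadSleep.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_RING_SIZE 4   /* tiny stand-in for the 2000-entry callback queue */

typedef struct {
    void  *slots[TOY_RING_SIZE];
    size_t head;    /* index of the oldest entry */
    size_t count;   /* number of entries in use */
} ToyRing;

/* Hypothetical status-returning request: 0 on success, -1 if the ring is full. */
static int toyRequest(ToyRing *r, void *item)
{
    if (r->count == TOY_RING_SIZE) return -1;
    r->slots[(r->head + r->count) % TOY_RING_SIZE] = item;
    r->count++;
    return 0;
}

/* Stand-in for the callback task taking one entry; NULL if the ring is empty. */
static void *toyPop(ToyRing *r)
{
    void *item;

    if (r->count == 0) return NULL;
    item = r->slots[r->head];
    r->head = (r->head + 1) % TOY_RING_SIZE;
    r->count--;
    return item;
}

typedef void (*ToyWaitFn)(void *arg);

/* Try the request; if the ring is full, wait and try again, up to
 * maxRetries retries. Returns the number of retries that were needed,
 * or -1 if the request still failed after the last retry. */
static int requestWithRetry(ToyRing *r, void *item, int maxRetries,
                            ToyWaitFn wait, void *waitArg)
{
    int attempt;

    assert(maxRetries >= 0);
    for (attempt = 0; attempt <= maxRetries; attempt++) {
        if (toyRequest(r, item) == 0) return attempt;
        if (attempt < maxRetries) wait(waitArg);
    }
    return -1;
}

/* Wait helpers for exercising the sketch without threads. */
static void waitNothing(void *arg) { (void)arg; }           /* consumer is stalled */
static void waitDrainOne(void *arg) { (void)toyPop(arg); }  /* consumer frees one slot */
```

With the ring full, requestWithRetry gives up and returns -1 if nothing drains the ring during the waits, and returns 1 if the consumer frees a slot during the first wait. None of this is possible today because the caller never sees the failure in the first place.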
What should be done?
Thanks,
Mark