I reverted commit 4f2228fb1d7527fb5ebc8b2d747c309f1dd7698d in EPICS base that uses epicsTimeMonotonic. That did not fix the problem.
I still see things like this:
2020/03/20 12:52:34.510 ImageEventHandler::OnImageEvent sending uniqueId=80
2020/03/20 12:52:34.511 ADSpinnaker::grabImage received uniqueID 80 from message queue
2020/03/20 12:52:34.625 ImageEventHandler::OnImageEvent sending uniqueId=81
2020/03/20 12:52:34.626 ADSpinnaker::grabImage received uniqueID 81 from message queue
2020/03/20 12:52:34.740 ImageEventHandler::OnImageEvent sending uniqueId=82
2020/03/20 12:52:34.741 ADSpinnaker::grabImage timeout receiving from message queue
2020/03/20 12:52:34.742 ADSpinnaker::grabImage timeout receiving from message queue
2020/03/20 12:52:34.842 ADSpinnaker::grabImage timeout receiving from message queue
2020/03/20 12:52:34.855 ImageEventHandler::OnImageEvent sending uniqueId=83
2020/03/20 12:52:34.855 ADSpinnaker::grabImage received uniqueID 83 from message queue
2020/03/20 12:52:34.970 ImageEventHandler::OnImageEvent sending uniqueId=84
2020/03/20 12:52:34.970 ADSpinnaker::grabImage timeout receiving from message queue
2020/03/20 12:52:34.971 ADSpinnaker::grabImage received uniqueID 84 from message queue
and this:
2020/03/20 12:52:38.305 ImageEventHandler::OnImageEvent sending uniqueId=113
2020/03/20 12:52:38.305 ADSpinnaker::grabImage received uniqueID 113 from message queue
2020/03/20 12:52:38.419 ADSpinnaker::grabImage timeout receiving from message queue
2020/03/20 12:52:38.420 ImageEventHandler::OnImageEvent sending uniqueId=114
2020/03/20 12:52:38.421 ADSpinnaker::grabImage received uniqueID 114 from message queue
2020/03/20 12:52:38.535 ImageEventHandler::OnImageEvent sending uniqueId=115
2020/03/20 12:52:38.535 ADSpinnaker::grabImage received uniqueID 115 from message queue
2020/03/20 12:52:38.650 ImageEventHandler::OnImageEvent sending uniqueId=116
2020/03/20 12:52:38.650 ADSpinnaker::grabImage timeout receiving from message queue
2020/03/20 12:52:38.652 ADSpinnaker::grabImage timeout receiving from message queue
2020/03/20 12:52:38.754 ADSpinnaker::grabImage timeout receiving from message queue
2020/03/20 12:52:38.765 ImageEventHandler::OnImageEvent sending uniqueId=117
2020/03/20 12:52:38.765 ADSpinnaker::grabImage received uniqueID 117 from message queue
2020/03/20 12:52:38.880 ImageEventHandler::OnImageEvent sending uniqueId=118
2020/03/20 12:52:38.880 ADSpinnaker::grabImage received uniqueID 118 from message queue
Messages 82 and 116 were sent on the epicsMessageQueue but never received. In both cases there are 2 timeouts within 1-2 ms of each other, even though the receive timeout is 100 ms.
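For anyone who wants to check a longer log for the same pattern, here is a minimal sketch of a scanner that reports IDs that were sent but never received. The two search strings are copied from the debug prints in the log above; the function name findLostIds and the rest of the helper are hypothetical and are not part of ADSpinnaker.

```cpp
#include <set>
#include <sstream>
#include <string>
#include <vector>

// Returns, in ascending order, the unique IDs that appear in a
// "sending uniqueId=N" line but in no "received uniqueID N" line.
// Hypothetical helper: only the two tag strings come from the real log.
std::vector<int> findLostIds(const std::string &log)
{
    static const std::string sendTag = "sending uniqueId=";
    static const std::string recvTag = "received uniqueID ";
    std::set<int> sent, received;
    std::istringstream in(log);
    std::string line;
    while (std::getline(in, line)) {
        std::string::size_type pos = line.find(sendTag);
        if (pos != std::string::npos) {
            // std::stoi stops at the first non-digit, so trailing text is ignored
            sent.insert(std::stoi(line.substr(pos + sendTag.size())));
            continue;
        }
        pos = line.find(recvTag);
        if (pos != std::string::npos) {
            received.insert(std::stoi(line.substr(pos + recvTag.size())));
        }
    }
    std::vector<int> lost;
    for (int id : sent) {
        if (received.count(id) == 0) lost.push_back(id);
    }
    return lost;
}
```

If a log excerpt ends right after a send, that last ID is reported as lost even though its receive line may simply fall outside the excerpt, so the final entry should be checked by hand.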
Mark
From: Mark Rivers
Sent: Friday, March 20, 2020 11:56 AM
To: 'core-talk at aps.anl.gov' <core-talk at aps.anl.gov>
Subject: RE: C++ question
Folks,
I have added debugging print statements when sending a message on the epicsMessageQueue and when receiving a message from it. Each time a message is sent I print the UniqueID of the image that was sent. When the message is received I also print the UniqueID of the message that was received. The time between messages is about 0.115 seconds, while the epicsMessageQueue receive timeout is 0.1 seconds.
Here is what I observed:
2020/03/20 11:37:37.098 ImageEventHandler::OnImageEvent sending uniqueId=85
2020/03/20 11:37:37.098 ADSpinnaker::grabImage received uniqueID 85 from message queue
2020/03/20 11:37:37.213 ImageEventHandler::OnImageEvent sending uniqueId=86
2020/03/20 11:37:37.213 ADSpinnaker::grabImage timeout receiving from message queue
2020/03/20 11:37:37.214 ADSpinnaker::grabImage received uniqueID 86 from message queue
2020/03/20 11:37:37.328 ImageEventHandler::OnImageEvent sending uniqueId=87
2020/03/20 11:37:37.328 ADSpinnaker::grabImage received uniqueID 87 from message queue
2020/03/20 11:37:37.443 ImageEventHandler::OnImageEvent sending uniqueId=88
2020/03/20 11:37:37.443 ADSpinnaker::grabImage timeout receiving from message queue
2020/03/20 11:37:37.445 ADSpinnaker::grabImage timeout receiving from message queue
2020/03/20 11:37:37.546 ADSpinnaker::grabImage timeout receiving from message queue
2020/03/20 11:37:37.557 ImageEventHandler::OnImageEvent sending uniqueId=89
2020/03/20 11:37:37.558 ADSpinnaker::grabImage received uniqueID 89 from message queue
2020/03/20 11:37:37.672 ImageEventHandler::OnImageEvent sending uniqueId=90
2020/03/20 11:37:37.672 ADSpinnaker::grabImage received uniqueID 90 from message queue
2020/03/20 11:37:37.784 ADSpinnaker::grabImage timeout receiving from message queue
2020/03/20 11:37:37.787 ImageEventHandler::OnImageEvent sending uniqueId=91
2020/03/20 11:37:37.787 ADSpinnaker::grabImage received uniqueID 91 from message queue
Messages with UniqueID 85, 86, and 87 were received fine. When reading messages 85 and 87 there was no timeout, but when reading 86 there was a timeout. However, 0.001 second after that timeout the receive succeeded.
The problem is message 88.
2020/03/20 11:37:37.443 ImageEventHandler::OnImageEvent sending uniqueId=88
2020/03/20 11:37:37.443 ADSpinnaker::grabImage timeout receiving from message queue
2020/03/20 11:37:37.445 ADSpinnaker::grabImage timeout receiving from message queue
2020/03/20 11:37:37.546 ADSpinnaker::grabImage timeout receiving from message queue
2020/03/20 11:37:37.557 ImageEventHandler::OnImageEvent sending uniqueId=89
2020/03/20 11:37:37.558 ADSpinnaker::grabImage received uniqueID 89 from message queue
Note that it was sent at 37.443. There were then 3 timeouts trying to read that message, at 37.443, 37.445, and 37.546. Message 88 was never received!
Something seems seriously wrong here. It is making me wonder whether this is related to the problem with epicsTimeMonotonic that I observed with callbackRequestDelay and Andrew observed with epicsEventWaitWithTimeout. What could cause message #88 to never be received from the message queue? Why are there 2 timeouts within 0.002 seconds when the timeout is 0.1 second?
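One piece of arithmetic may help frame the question. In the log above, message 90 is received at 37.672 and the next timeout is printed at 37.784, which is 0.112 second later. With a 0.1 second timeout, that suggests about 0.012 second passes between receiving one image and calling receive() again. With a frame period of 0.115 second, the first timeout then expires within a few milliseconds of the next send. The sketch below just encodes that subtraction. The name restartDelay and the 0.005 second window are assumptions chosen for illustration, and the calculation only shows when the timeout and the send nearly coincide, not why a message would be lost.

```cpp
#include <cmath>

// Seconds between the expiry of the first receive() timeout after a frame and
// the arrival of the next frame. A value near zero means the timeout and the
// send happen at almost the same moment.
// restartDelay is a hypothetical name for the time between receiving one frame
// and calling receive() again; it is estimated from the log, not measured.
double timeoutToArrivalGap(double framePeriod, double receiveTimeout,
                           double restartDelay)
{
    return framePeriod - (restartDelay + receiveTimeout);
}

// True if the timeout expires within `window` seconds of the next frame.
bool timeoutNearArrival(double framePeriod, double receiveTimeout,
                        double restartDelay, double window)
{
    double gap = timeoutToArrivalGap(framePeriod, receiveTimeout, restartDelay);
    return std::fabs(gap) <= window;
}
```

For example, timeoutToArrivalGap(0.115, 0.1, 0.012) is about 0.003 second, matching the 37.784 timeout followed by the 37.787 send in the log.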
Thanks,
Mark
Folks,
I am having a problem with the areaDetector ADSpinnaker driver. I am quite sure the problem is that I am doing something wrong in my C++ callback code.
The code in question is quite simple.
The following is a class that implements callbacks from the vendor library when a new image is available. It inherits from the vendor’s ImageEvent class and implements a method they define called OnImageEvent. That method is passed a smart pointer of type ImagePtr. I make a copy of that smart pointer on the heap and pass its address on an epicsMessageQueue. The copy is deleted in the receiving function.
***************************************
class ImageEventHandler : public ImageEvent
{
public:
    ImageEventHandler(epicsMessageQueue *pMsgQ)
        : pMsgQ_(pMsgQ)
    {}
    ~ImageEventHandler() {}

    void OnImageEvent(ImagePtr image) {
        ImagePtr *imagePtrAddr = new ImagePtr(image);
        if (pMsgQ_->send(&imagePtrAddr, sizeof(imagePtrAddr)) != 0) {
            printf("OnImageEvent error calling pMsgQ_->send()\n");
        }
    }

private:
    epicsMessageQueue *pMsgQ_;
};
***************************************
This is the receiving function.
***************************************
asynStatus ADSpinnaker::grabImage()
{
    asynStatus status = asynSuccess;
    size_t nRows, nCols;
    NDDataType_t dataType;
    NDColorMode_t colorMode;
    int timeStampMode;
    int uniqueIdMode;
    int convertPixelFormat;
    bool imageConverted = false;
    int numColors;
    size_t dims[3];
    ImageStatus imageStatus;
    PixelFormatEnums pixelFormat;
    int pixelSize;
    size_t dataSize, dataSizePG;
    void *pData;
    int nDims;
    ImagePtr pImage;
    ImagePtr *imagePtrAddr=0;
    static const char *functionName = "grabImage";

    try {
        while(1) {
            unlock();
            int recvSize = pCallbackMsgQ_->receive(&imagePtrAddr, sizeof(imagePtrAddr), 0.1);
            lock();
            if (recvSize == sizeof(imagePtrAddr)) {
                break;
            } else if (recvSize == -1) {
                // Timeout
                int acquire;
                getIntegerParam(ADAcquire, &acquire);
                if (acquire == 0) {
                    return asynError;
                } else {
                    continue;
                }
            } else {
                asynPrint(pasynUserSelf, ASYN_TRACE_ERROR,
                    "%s::%s error receiving from message queue\n",
                    driverName, functionName);
                return asynError;
            }
        }
        pImage = *imagePtrAddr;
        // Delete the ImagePtr that was passed to us
        delete imagePtrAddr;
…
Finishing processing pImage …
***************************************
Note that I have a timeout of 0.1 second on the pCallbackMsgQ_->receive() call. If the call times out, it tries again if it is still acquiring. This allows the function to return quickly if acquisition is stopped after it is called.
This code generally works fine when the time between images is 0.1 second or less, or 0.12 seconds or more. However, if the time between images is 0.115 seconds, for example, I “lose” about 2-3% of the images. There are no error messages.
If I increase the 0.1 second timeout to 0.2 seconds, then images are lost if the time between images is about 0.215 seconds.
It seems to me that the problem must be that another callback happens before the previous image has been processed. I thought that making a copy of the smart pointer on the heap (with new) would prevent this, but I must be doing something wrong.
Perhaps I am deleting the smart pointer too soon, or perhaps the whole approach I am using is unsafe?
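To make the lifetime question concrete, here is a minimal sketch of the same hand-off in isolation. It uses stand-ins rather than the real types: FakeImage for the vendor image, std::shared_ptr for ImagePtr (assuming ImagePtr is reference counted in a similar way), and std::queue for the epicsMessageQueue. All of these names are hypothetical, and the sketch is single threaded, so it only illustrates the reference counting, not the timing behavior.

```cpp
#include <memory>
#include <queue>

// Stand-ins, not the vendor or EPICS types.
struct FakeImage { int uniqueId; };
typedef std::shared_ptr<FakeImage> FakeImagePtr;
typedef std::queue<FakeImagePtr *> FakeQueue;

// Sender side: heap-allocate a copy of the smart pointer and queue its
// address. The heap copy holds a reference, so the image stays alive after
// the caller's own pointer goes out of scope.
void sendImage(FakeQueue &q, FakeImagePtr image)
{
    q.push(new FakeImagePtr(image));
}

// Receiver side: copy the smart pointer out first, then delete the heap copy.
// Returns an empty pointer if nothing is queued.
FakeImagePtr receiveImage(FakeQueue &q)
{
    if (q.empty()) return FakeImagePtr();
    FakeImagePtr *addr = q.front();
    q.pop();
    FakeImagePtr image = *addr;  // take our own reference before the delete
    delete addr;                 // drops only the queue's reference
    return image;
}
```

Under these assumptions the image is released only when the receiver's own copy goes away, so copying out of the heap pointer and then deleting it is not, by itself, too early.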
Any advice is much appreciated!
Thanks,
Mark