Hi Mark,
I will not be able to do the tests you suggest until Monday. I can add
more details to the picture I have painted so far.
Subnet SA is in our experimental hall, where we have a 100Mbit switch (I
know that it is not suitable for handling camera images, but that is all
we can presently afford). This switch is configured for two subnets, SA
and SAr, where SAr is a restricted subnet, which is only visible from
SA. This is done to isolate devices like cameras, Beckhoff, and other
non computer network devices so that they do not get clobbered by
security scans. The camera is on SAr. Computer CA is on SA. There is
no problem when I ssh from my desktop Linux computer, which is on one of
the SLAC public subnets, to CA and run ImageJ on CA: I can see images at
more than 10Hz.
I don't really know the details of the rest of the network path from CC
to CA. That is, I don't know how many switches are in the path or what
their speeds are. I suspect that these switches are probably 1 GBit.
Subnet SC, where computer CC is used by the accelerator operator, is a
private subnet.
I think that there is a dual hosted computer, which is most likely CB,
that lives on both subnet SC and SB. I suspect that the problem is
really in going from computer CC to CB.
I have performed the following test:
From my desktop computer I did ssh to computer CB and from there I did
ssh to CA. Then still working from my desktop computer I started ImageJ
on CA and I was getting images at 10Hz without any errors.
I am thinking naively that a limited networking bandwidth should slow
down the rate at which images are moved but should not put the IOC in an
unrecoverable bad state. But I know that naive thinking is usually wrong.
The networking expert in charge of SC and SB was away this week and
should be back at work next week and we will discuss this with him.
Thanks for any suggestions,
Zen
On 03/09/13 10:07, Mark Rivers wrote:
Zen,
3 questions:
1) Is the camera on subnet SA, the same subnet the IOC is running on?
2) Is ImageJ running on CA, i.e. the same computer as the IOC?
3) Are you sure that the entire path from CC to CA, and from CA to the camera is Gigabit, and there are no 100Mbit switches or hubs in the path?
Here is something to try. Throttle the NDPluginStdArrays plugin, which is the plugin that converts images to waveform records. Set the MinCallbackTime PV to 1.0 so it only updates the waveform record at 1Hz, or 0.5 second to limit it to 2Hz. This will do 2 things:
- Reduce the CPU load on CA, assuming that ImageJ is also running on CA, because ImageJ will now only try to display 1 frame/sec.
- Reduce the network bandwidth used between CA and CC. If you are running ImageJ on CA then you are sending images over X11 between CA and CC.
You can then run "top" on CA and see what the CPU load is as you reduce MinCallbackTime. As it begins to fail you can see if CPU load is the problem or not.
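As a rough check on the throttling suggestion above, the sketch below estimates the uncompressed pixel traffic for the 1292x964 monochrome image size quoted in the original message, as a function of MinCallbackTime. It assumes 8 bits per pixel and ignores all protocol, X11, and ssh overhead; neither assumption comes from this thread, and the function names are made up for illustration. It only handles positive MinCallbackTime values.

```python
# Back-of-the-envelope estimate of raw image traffic versus MinCallbackTime.
# Assumptions (not stated in the thread): 8 bits per pixel, no protocol,
# X11, or ssh overhead. Real traffic will differ.

WIDTH, HEIGHT = 1292, 964   # image size quoted in the original message
BYTES_PER_PIXEL = 1         # assumption: 8-bit monochrome


def max_rate_hz(min_callback_time):
    """Upper bound on the update rate for a positive MinCallbackTime (seconds)."""
    if min_callback_time <= 0:
        raise ValueError("this sketch only handles positive MinCallbackTime")
    return 1.0 / min_callback_time


def mbit_per_s(rate_hz):
    """Raw pixel data rate in Mbit/s at the given frame rate."""
    bits_per_frame = WIDTH * HEIGHT * BYTES_PER_PIXEL * 8
    return bits_per_frame * rate_hz / 1e6


if __name__ == "__main__":
    for t in (0.1, 0.5, 1.0):
        rate = max_rate_hz(t)
        print(f"MinCallbackTime={t:>4} s -> at most {rate:4.1f} Hz, "
              f"about {mbit_per_s(rate):6.2f} Mbit/s of pixel data")
```

Under these assumptions, 10Hz corresponds to roughly 100 Mbit/s of pixel data and 1Hz to roughly 10 Mbit/s, which is why throttling the plugin is a cheap way to test whether bandwidth is the limiting factor.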
Mark
________________________________________
From: Zenon Szalata [[email protected]]
Sent: Saturday, March 09, 2013 11:07 AM
To: Mark Rivers; tech-talk; Dunning, Michael; Nelson, Janice L.
Subject: Area Detector
Hi Mark,
I have a peculiar problem with an IOC using an area detector. It
controls a prosilica camera (not sure what the model is; it has a
1292x964 image size, monochrome). Area detector version is 1.8, asyn
4.20, and EPICS 3.14.12.2.
It all works fine.
A problem is encountered when a few hops are needed to get to the computer
where the soft IOC runs.
This is how we are doing it:
there are three subnets, call them SA, SB, and SC. Three computers are
involved:
the operator is sitting in front of computer CC on subnet SC. The
operator does ssh from CC to computer CB on subnet SB, and from there
does ssh to computer CA on subnet SA. The IOC runs on CA. This is
needed because the accelerator operators can only get to subnet SA as
described above via two hops. The prosilica EDM control screens work
fine and there is no problem until we launch ImageJ. It connects and
the IOC starts printing the following messages:
2013/03/08 12:20:21.833 PS1:cam1:PSReadStatistics devAsynInt32 process error
2013/03/08 12:20:26.833 prosilica:readStats: error, status=14
2013/03/08 12:20:26.833 prosilica:readParameters: error, status=8
2013/03/08 12:20:26.833 prosilica:writeInt32: error, status=8 function=76, value=0
Stopping ImageJ does not clear the problem. It seems that the only way
to get out of this is to restart the IOC.
Strangely, the EDM viewer widget does somewhat better. We can use one
of those, but the same problem occurs when a second EDM viewer is started.
It seems to be a bandwidth problem. But why does the IOC get into the
mode where it can't process periodic chores and won't recover?
Any ideas or suggestions would be very helpful.
Thanks,
Zen