Experimental Physics and
Industrial Control System


Subject:	Re: Problem with huge waveforms in EPICS 7
From:	"Zimoch Dirk (PSI) via Core-talk" <core-talk at aps.anl.gov>
To:	Andrew Johnson <anj at anl.gov>
Cc:	"core-talk at aps.anl.gov" <core-talk at aps.anl.gov>
Date:	Sat, 25 Jun 2022 09:03:08 +0000

Hi Andrew,

The cameras run on Windows. I did my test on Linux but not as root, thus I had no RT scheduling. I will repeat the test on Monday running as root.

The STOP message gets processed in time! A client that does not monitor the array sees the change immediately. The counter stops. But CA keeps sending!

I had expected the IOC to drop frames if CA cannot send fast enough, not to spend minutes working through a pile of unsent frames, and then not even send updates but simply repeat the last frame.
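
To make the difference between the two behaviours concrete, here is a toy model in Python. It is only a sketch under stated assumptions, not a description of how the CA server actually queues monitor events: one producer creates a frame every 100 ms (the 10 Hz from the original post quoted below), one sender needs 333 ms per frame (roughly the 3 messages per second observed there), and production stops after an assumed 10 s. The function name and all parameters are hypothetical.

```python
from collections import deque

def drain_time_after_stop(produce_ms, send_ms, run_ms, maxlen=None):
    """Return how many ms the sender stays busy after production stops.

    maxlen=None models an unbounded backlog; maxlen=1 models a
    "keep only the newest frame" policy (deque drops the oldest entry).
    """
    queue = deque(maxlen=maxlen)   # holds the creation time of each pending frame
    busy_until = 0                 # time at which the sender becomes idle again
    for t in range(0, run_ms, produce_ms):
        # Let the sender start every frame it could have started before time t.
        while queue and busy_until <= t:
            created = queue.popleft()
            busy_until = max(busy_until, created) + send_ms
        queue.append(t)            # new frame; may evict the oldest if bounded
    # Production has stopped at run_ms: send whatever is still pending.
    while queue:
        created = queue.popleft()
        busy_until = max(busy_until, created) + send_ms
    return max(busy_until - run_ms, 0)

unbounded = drain_time_after_stop(100, 333, 10_000)
newest_only = drain_time_after_stop(100, 333, 10_000, maxlen=1)
print(f"unbounded backlog: sender busy for {unbounded} ms after stop")
print(f"newest frame only: sender busy for {newest_only} ms after stop")
```

In this model the unbounded backlog keeps the sender busy for many seconds after the stop, while the newest-only policy finishes in well under a second. Which of the two (if either) resembles the real server is exactly the open question here.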

Dirk

On 24.06.2022 at 18:10, Andrew Johnson via Core-talk <core-talk at aps.anl.gov> wrote:

Hi Dirk,

What OS is the IOC running on? I'm guessing Linux, but you didn't say. If so, is it built for and using priority thread scheduling? If the OSSPRI field from epicsThreadShowAll is all zeros, it isn't, and enabling that might help. The normal Linux scheduler tends to maximize throughput, not fairness, so it could be delaying the threads which process your STOP message while the threads handling image data can continue to make progress. However, this is just a guess.

- Andrew

General musings: The setpriority(2) manpage on RHEL-7 says:

BUGS
       According to POSIX, the nice value is a per-process setting. However, under
       the current Linux/NPTL implementation of POSIX threads, the nice value is a
       per-thread attribute: different threads in the same process can have different
       nice values. Portable applications should avoid relying on the Linux behavior,
       which may be made standards conformant in the future.

I wonder whether we should look at setting nice values for Linux threads when the process doesn't have the ability to use SCHED_FIFO?
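
As a rough illustration of the per-thread behaviour the manpage describes, here is a minimal Python sketch. It is not EPICS code, and `lower_own_priority` is a hypothetical helper. It assumes Linux with NPTL, where passing a kernel thread ID to setpriority(2) with PRIO_PROCESS changes only that thread. Raising the nice value needs no special privileges; lowering it again generally does.

```python
import os
import threading

def lower_own_priority(increment=5):
    """Raise the nice value of the calling thread only (Linux/NPTL assumption).

    Returns the nice value of the calling thread after the change.
    """
    tid = threading.get_native_id()                  # kernel thread ID
    current = os.getpriority(os.PRIO_PROCESS, tid)   # this thread's nice value
    target = min(current + increment, 19)            # 19 is the largest nice value
    os.setpriority(os.PRIO_PROCESS, tid, target)
    return os.getpriority(os.PRIO_PROCESS, tid)
```

A worker thread (for example one that pushes bulk image data) would call `lower_own_priority()` once at startup, leaving the nice values of the other threads in the process untouched. Whether this would actually help the scheduling problem discussed here is untested.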

On 6/24/22 10:39 AM, Zimoch Dirk (PSI) via Core-talk wrote:

Hi folks,

Some of our users complained that a camera server has become less responsive since it was upgraded from EPICS 3.14.12.6
to 7.0.6.1.

The camera sends image data as arrays of 20000000 SHORTs (5000x4000 pixels). When the user presses the "STOP" button on
the client which displays the image, it takes a long time to stop. The more clients are active, the longer it takes.
Even when STOP is sent from a different client (e.g. command line caput), it takes a long time before the GUI clients update.

I have set up a simple simulation and run it with 'var CADEBUG 3'.
Here is what I see on EPICS 7.0.6.1:
CAS: Sending a message of 40000032 bytes
CAS: Sending a message of 40000032 bytes
CAS: Sending a message of 40000032 bytes
CAS: Sending a message of 40000032 bytes
CAS: Sending a message of 40000032 bytes
CAS: Sending a message of 40000032 bytes
CAS: Sending a message of 40000032 bytes
CAS: TCP Request from 129.129.130.117:47142 => cmmd=4 (CA_PROTO_WRITE) cid=0x4 type=0 count=1 postsize=8 version=13
CAS: Request from 129.129.130.117:47142 =>   available=0x2 	N=1 paddr=0x7efcb800db80
CAS: Request from 129.129.130.117:47142 =>   Wrote string "STOP"
CAS: Sending a message of 40000032 bytes
CAS: Sending a message of 40000032 bytes
[>80 times the same!]
CAS: Sending a message of 40000056 bytes <---- I think this one contains the update of the STOP button
CAS: Sending a message of 40000032 bytes
[eventually stops many seconds later]
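
The message sizes in this log can be checked with a little arithmetic. The sketch below uses only numbers from the post; the interpretation of the leftover bytes is an assumption. In particular, the 24 extra bytes in the larger send would be consistent with one additional small CA message (a 16-byte header plus a payload padded to 8 bytes), which fits the guess that it carries the STOP button update, but the log itself does not prove that.

```python
# Sizes taken from the CADEBUG output; their interpretation is an assumption.
NELM = 20_000_000        # elements in the waveform (5000 x 4000 pixels)
BYTES_PER_SHORT = 2      # a SHORT is a 16-bit integer

payload = NELM * BYTES_PER_SHORT   # raw pixel data per image
image_msg = 40_000_032             # size logged for each image send
stop_msg = 40_000_056              # size logged for the one larger send

overhead = image_msg - payload     # bytes per image send that are not pixel data
extra = stop_msg - image_msg       # additional bytes in the larger send

print(f"payload  = {payload} bytes")
print(f"overhead = {overhead} bytes")
print(f"extra    = {extra} bytes")
```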

The IOC obviously gets the STOP message immediately when I press the button on the client. But that client (and any other
client showing the image) does not see the button change. The GUI appears "frozen". A command line camonitor
monitoring the stop button (and a counter that counts the number of created images, but not the image itself) shows that
the records stop immediately.
Nevertheless, the IOC keeps sending images, but the images no longer change on the clients. So it seems that the
IOC keeps sending the same array data over and over again.

On 3.14.12, the output looks similar, but the "send after stop" consists of only a few messages:
CAS: Request from 129.129.130.117:47184 => cmmd=4 cid=0x1 type=0 count=1 postsize=8
CAS: Request from 129.129.130.117:47184 =>   available=0x2 	N=1 paddr=0x7f0768010b28
CAS: Request from 129.129.130.117:47184 =>   Wrote string "STOP"
CAS: Sending a message of 40000032 bytes
CAS: Sending a message of 40000032 bytes
CAS: Sending a message of 40000032 bytes
CAS: Sending a message of 40000032 bytes
CAS: Sending a message of 40000056 bytes <---- update of the STOP button

What can be wrong here? 
The IOC consists of a counting calc, a bo for the stop switch and a waveform record with a driver that simply fills the
waveform with a sequence starting at the counter value. Nothing fancy.

Here is my db:

record (waveform, "DZ:BIGARRAY")
{
    field(FTVL, "SHORT")
    field(NELM, "20000000")
    field(DTYP, "sequence")
    field(SCAN, ".1 second")
    field(SDIS, "DZ:STOP")
    field(INP,  "DZ:COUNT")
    field(FLNK, "DZ:COUNT")
}

record (calc, "DZ:COUNT")
{
    field(CALC, "VAL+1")
}

record(bo, "DZ:STOP")
{
    field(ZNAM,"GO")
    field(ONAM,"STOP")
}

I suspect this happens when the record produces new waveforms faster than they can be sent.
The IOC has no problem processing the waveform at 10 Hz, but I see only about 3 CAS messages per second.
I had to slow down the waveform processing to ".5 second" to improve responsiveness. At that rate the monitor updates
can be sent as quickly as they are produced. But opening a second client spoils everything again.
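
The rates above translate into bandwidth as follows. This is a back-of-the-envelope sketch in Python, assuming that the roughly 3 messages per second is an aggregate limit shared by all clients; the post does not say what network or link speed was used.

```python
MSG_BYTES = 40_000_032   # size of one image message from the CADEBUG log

def mbit_per_s(n_bytes, hz):
    """Data rate in Mbit/s for messages of n_bytes sent hz times per second."""
    return n_bytes * 8 * hz / 1e6

needed_10hz = mbit_per_s(MSG_BYTES, 10)        # SCAN ".1 second", one client
observed = mbit_per_s(MSG_BYTES, 3)            # about 3 CAS messages per second
one_client_2hz = mbit_per_s(MSG_BYTES, 2)      # SCAN ".5 second", one client
two_clients_2hz = 2 * one_client_2hz           # same scan rate, two clients

print(f"needed at 10 Hz     : {needed_10hz:.0f} Mbit/s")
print(f"observed (~3 msg/s) : {observed:.0f} Mbit/s")
print(f"2 Hz, one client    : {one_client_2hz:.0f} Mbit/s")
print(f"2 Hz, two clients   : {two_clients_2hz:.0f} Mbit/s")
```

Under that assumption, one client at 2 Hz stays below the observed rate while two clients exceed it, which matches the experience that a second client spoils the responsiveness again.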

Dirk

-- 
Complexity comes for free, Simplicity you have to work for.


ANJ, 14 Sep 2022
