Experimental Physics and
What OS is the IOC running on? I'm guessing Linux, but you didn't say. If so, is it built for and using priority thread scheduling? If the OSSPRI field from epicsThreadShowAll is all zeros it isn't, and enabling that might help.

The normal Linux scheduler tends to maximize throughput, not fairness, so it could be delaying the threads which process your STOP message while the threads handling image data can continue to make progress. However this is just a guess.

- Andrew

General musings: The setpriority(2) manpage on RHEL-7 says:

BUGS

I wonder whether we should look at setting nice values for Linux threads when the process doesn't have the ability to use SCHED_FIFO?

On 6/24/22 10:39 AM, Zimoch Dirk (PSI) via Core-talk wrote:
Hi folks,

Some of our users complained that a camera server became less responsive since it had been upgraded from EPICS 3.14.12.6 to 7.0.6.1.

The camera sends image data as arrays of 20000000 SHORTs (5000x4000 pixels). When the user presses the "STOP" button on the client which displays the image, it takes a long time to stop. The more active clients, the longer it takes. But even sending stop from a different client (e.g. command line caput) takes a long time before the GUI clients update.

I have set up a simple simulation and run it with 'var CADEBUG 3'. Here is what I see on EPICS 7.0.6.1:

    CAS: Sending a message of 40000032 bytes
    CAS: Sending a message of 40000032 bytes
    CAS: Sending a message of 40000032 bytes
    CAS: Sending a message of 40000032 bytes
    CAS: Sending a message of 40000032 bytes
    CAS: Sending a message of 40000032 bytes
    CAS: Sending a message of 40000032 bytes
    CAS: TCP Request from 129.129.130.117:47142 => cmmd=4 (CA_PROTO_WRITE) cid=0x4 type=0 count=1 postsize=8 version=13
    CAS: Request from 129.129.130.117:47142 => available=0x2 N=1 paddr=0x7efcb800db80
    CAS: Request from 129.129.130.117:47142 => Wrote string "STOP"
    CAS: Sending a message of 40000032 bytes
    CAS: Sending a message of 40000032 bytes
    [>80 times the same!]
    CAS: Sending a message of 40000056 bytes   <---- I think this one contains the update of the STOP button
    CAS: Sending a message of 40000032 bytes
    [eventually stops many seconds later]

The IOC obviously gets the STOP message immediately when I press the button on the client. But the client (and any other client showing the image) does not see the button change. The GUI appears "frozen". But a command line camonitor monitoring the stop button (and a counter that counts the number of created images, but not the image itself) shows that the records stop immediately. Nevertheless the IOC keeps sending images. But the images do not change any more on the clients. So it seems that the IOC keeps sending the same array data over and over again.
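As a quick sanity check on the sizes in the CADEBUG output above, the arithmetic can be sketched in Python. The element count and the two message sizes are taken from the message; the 2-byte size of a SHORT is the usual 16-bit integer. What exactly occupies the remaining bytes is not stated in the log, so the sketch only reports the differences and the variable names are illustrative.

```python
# Size arithmetic for the CADEBUG lines quoted above.
NELM = 20_000_000          # array length from the message (5000 x 4000 pixels)
BYTES_PER_SHORT = 2        # a SHORT is a 16-bit integer

payload = NELM * BYTES_PER_SHORT
normal_message = 40_000_032    # "Sending a message of 40000032 bytes"
larger_message = 40_000_056    # the message suspected to carry the STOP update

overhead = normal_message - payload       # bytes that are not pixel data
extra = larger_message - normal_message   # additional bytes in the larger message

print(f"pixel payload:        {payload} bytes")
print(f"per-message overhead: {overhead} bytes")
print(f"extra in larger send: {extra} bytes")
```

The extra bytes in the larger message are few enough to be one small additional update sent along with an image, which would fit the guess above, but the log alone does not prove it.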
On 3.14.12, the output looks similar, but the "send after stop" consists of only a few messages:

    CAS: Request from 129.129.130.117:47184 => cmmd=4 cid=0x1 type=0 count=1 postsize=8
    CAS: Request from 129.129.130.117:47184 => available=0x2 N=1 paddr=0x7f0768010b28
    CAS: Request from 129.129.130.117:47184 => Wrote string "STOP"
    CAS: Sending a message of 40000032 bytes
    CAS: Sending a message of 40000032 bytes
    CAS: Sending a message of 40000032 bytes
    CAS: Sending a message of 40000032 bytes
    CAS: Sending a message of 40000056 bytes   <---- update of the STOP button

What can be wrong here?

The IOC consists of a counting calc, a bo for the stop switch and a waveform record with a driver that simply fills the waveform with a sequence starting at the counter value. Nothing fancy. Here is my db:

    record (waveform, "DZ:BIGARRAY") {
        field(FTVL, "SHORT")
        field(NELM, "20000000")
        field(DTYP, "sequence")
        field(SCAN, ".1 second")
        field(SDIS, "DZ:STOP")
        field(INP, "DZ:COUNT")
        field(FLNK, "DZ:COUNT")
    }
    record (calc, "DZ:COUNT") {
        field(CALC, "VAL+1")
    }
    record(bo, "DZ:STOP") {
        field(ZNAM,"GO")
        field(ONAM,"STOP")
    }

I suspect this happens when the record produces new waveforms faster than they can be sent. The IOC has no problem processing the waveform at 10 Hz, but I see only about 3 CAS messages per second. I had to slow down the waveform processing to ".5 second" to improve responsiveness. That is the rate at which the monitor updates can be sent as quickly as they are produced. But opening a second client again spoils everything.

Dirk

--
Complexity comes for free, Simplicity you have to work for.
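To make the "produced faster than sent" suspicion concrete, here is a minimal Python sketch of the rates involved. The message size, the roughly 3 messages per second, the scan periods and the 80-message figure all come from the message above. Treating the send rate as one fixed budget shared by all clients, and treating the repeated sends as a backlog that drains at that rate, are assumptions made only for illustration; the function names are hypothetical and nothing here reflects how the CA server is actually implemented.

```python
# Toy model: a record posts images faster than the server can send them.
# The rates come from the message above; the single shared send budget and
# the notion of a backlog are assumptions made for illustration only.
MESSAGE_BYTES = 40_000_032   # size of one image message in the CADEBUG log
SEND_RATE = 3.0              # observed: about 3 CAS messages per second


def required_rate(scan_period_s: float, clients: int) -> float:
    """Messages per second needed to deliver every update to every client."""
    return clients / scan_period_s


def keeps_up(scan_period_s: float, clients: int) -> bool:
    """True if the observed send rate covers the required rate."""
    return required_rate(scan_period_s, clients) <= SEND_RATE


def drain_time(queued_messages: int) -> float:
    """Seconds needed to send a backlog at the observed send rate."""
    return queued_messages / SEND_RATE


print(f"throughput at 3 msg/s: {SEND_RATE * MESSAGE_BYTES / 1e6:.0f} MB/s")
for period, clients in [(0.1, 1), (0.5, 1), (0.5, 2)]:
    print(period, clients, required_rate(period, clients), keeps_up(period, clients))
print(f"80 queued messages take about {drain_time(80):.0f} s to drain")
```

Under these assumptions, ".1 second" with one client needs 10 messages per second against roughly 3 available, ".5 second" with one client needs 2, and a second client raises that to 4, which lines up with the behaviour described. A backlog of 80 messages would take on the order of 27 seconds to send. The implied throughput of about 120 MB/s is also close to what a 1 Gbit/s link can carry (125 MB/s), though the network is not described in the message, so that is only a possibility.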
ANJ, 14 Sep 2022 |