EPICS Controls Argonne National Laboratory

Experimental Physics and
Industrial Control System


Subject: Re: Channel Access Timeouts - ca(get|put) CA Priority
From: "Yendell, Gary (DLSLtd, RAL, LSCI) via Tech-talk" <tech-talk at aps.anl.gov>
To: "Johnson, Andrew N." <anj at anl.gov>
Cc: EPICS tech-talk <tech-talk at aps.anl.gov>
Date: Mon, 26 Jul 2021 16:27:05 +0000

Hi Andrew,

Thanks for the info!

We do build base with the correct flag, but the IOCs are not currently configured to run with sufficient privileges:

$ ps -To pid,tid,policy,rtprio,comm -p 40735
  PID   TID POL RTPRIO COMMAND
40735 40735 TS       - BL07I-EA-IOC-03
...

$ cat /proc/40735/status
...
VmSize: 3204140 kB
VmLck:       0 kB


I will look into that. I also tried running caSnooper:

$ caSnooper -t60 -p10 -c5
Starting CaSnooper 2.1.2.3 (7-3-2013) at Jul 26 15:25:18
EPICS 3.14.12.7
cas warning: Configured TCP port was unavailable.
cas warning: Using dynamically assigned TCP port 40554,
cas warning: but now two or more servers share the same UDP port.
cas warning: Depending on your IP kernel this server may not be
cas warning: reachable with UDP unicast (a host's IP in EPICS_CA_ADDR_LIST)
Individual Name is CaSnoop.test
Internal PV names are not being published

CaSnooper terminating after 60.01 seconds [1.00 minutes]
  Data collected for 60.01 seconds [1.00 minutes]

Jul 26 15:26:18:
There were 15078 requests to check for PV existence for 1960 different PVs.
  Max(Hz):   5.58
  Mean(Hz):  0.13
  StDev(Hz): 0.36

PVs with top 10 requests:
   1 bl07i-di-serv-02.diamond.ac.uk:44724 CP                                5.58
   2 bl07i-di-serv-02.diamond.ac.uk:48529 CP                                5.52
   3 bl07i-di-serv-02.diamond.ac.uk:51380 CP                                4.27
   4 i07-ppu01.diamond.ac.uk:37171  CP                                4.27
   5 i07-ppu01.diamond.ac.uk:49508  CP                                2.98
   6 i07-ppu01.diamond.ac.uk:52783  CP                                2.73
   7 bl07i-va-ioc-01.diamond.ac.uk:1027 BL07I-VA-VLVCC-01:DM5XX           2.67
   8 bl07i-va-ioc-01.diamond.ac.uk:1027 BL07I-VA-VLVCC-02:DM5XX           2.67
   9 bl07i-va-ioc-01.diamond.ac.uk:1027 BL07I-VA-VLVCC-01:DM4XX           2.67
  10 bl07i-va-ioc-01.diamond.ac.uk:1027 BL07I-VA-VLVCC-02:DM3XX           2.67

Connection status for top 5 PVs after 10.00 sec:
   1 bl07i-di-serv-02.diamond.ac.uk:44724 CP                                NC 5.58
   2 bl07i-di-serv-02.diamond.ac.uk:48529 CP                                NC 5.52
   3 bl07i-di-serv-02.diamond.ac.uk:51380 CP                                NC 4.27
   4 i07-ppu01.diamond.ac.uk:37171  CP                                NC 4.27
   5 i07-ppu01.diamond.ac.uk:49508  CP                                NC 2.98
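To pick entries like these out of a longer report, a small filter can help. This is only a sketch: it assumes the ranked lines have the layout shown above (rank, host:port, PV name, optional connection status, rate), treats the host:port column as the requesting client, and flags any searched-for name that is just a database link modifier. The helper names are hypothetical.

```python
# Link modifiers that can follow a PV name in an EPICS database link field.
LINK_FLAGS = {"CP", "CPP", "PP", "NPP", "MS", "NMS", "MSS", "MSI", "CA"}


def parse_snooper_line(line):
    """Parse one ranked caSnooper line into a dict, or return None if it is not one."""
    parts = line.split()
    # Ranked lines have 4 fields, or 5 when a connection status such as NC is present.
    if len(parts) not in (4, 5) or not parts[0].isdigit():
        return None
    try:
        rate = float(parts[-1])
    except ValueError:
        return None
    return {"rank": int(parts[0]), "client": parts[1], "pv": parts[2], "rate_hz": rate}


def suspicious_searches(report_text):
    """Return entries whose searched-for PV name is only a link modifier."""
    hits = []
    for line in report_text.splitlines():
        entry = parse_snooper_line(line)
        if entry is not None and entry["pv"] in LINK_FLAGS:
            hits.append(entry)
    return hits
```

Run over the output above, this would list the bl07i-di-serv-02 and i07-ppu01 entries and skip the BL07I-VA-VLVCC searches, which have real PV names.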

I have found many instances of a db expansion where the macros are not substituted correctly, so multiple records are requesting the PV " CP" (and possibly "CP", although I can't find any of those, so that may just be the output formatting), e.g.

record(longout, "BL07I-EA-DET-10:OVER:1:PositionXLink")
{
    field(DOL,  " CP MS")
    field(OMSL, "closed_loop")
    field(OUT, "BL07I-EA-DET-10:OVER:1:PositionX PP")
    info(autosaveFields, "DOL")
}
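One way to hunt for records like this across expanded database files is to look for field values made up of link modifiers and nothing else, which suggests the macro for the PV name expanded to an empty string. The sketch below assumes one record() or field() per line, as in the example, and the function name find_empty_links is hypothetical. It could in principle also flag a non-link field whose value happens to equal a modifier, so the results need a manual look.

```python
import re

# Link modifiers that can follow a PV name in a link field.
LINK_FLAGS = {"CP", "CPP", "PP", "NPP", "MS", "NMS", "MSS", "MSI", "CA"}

RECORD_RE = re.compile(r'record\(\s*(\w+)\s*,\s*"([^"]*)"\s*\)')
FIELD_RE = re.compile(r'field\(\s*(\w+)\s*,\s*"([^"]*)"\s*\)')


def find_empty_links(db_text):
    """Return (record name, field name, value) for fields holding only link modifiers."""
    findings = []
    current_record = None
    for line in db_text.splitlines():
        record = RECORD_RE.search(line)
        if record:
            current_record = record.group(2)
            continue
        field = FIELD_RE.search(line)
        if field:
            tokens = field.group(2).split()
            # Non-empty value where every token is a modifier: the PV name is missing.
            if tokens and all(token in LINK_FLAGS for token in tokens):
                findings.append((current_record, field.group(1), field.group(2)))
    return findings
```

Applied to the record above, this reports the DOL field of BL07I-EA-DET-10:OVER:1:PositionXLink and leaves OMSL and OUT alone.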

Is it possible these requests are putting significant extra load on the IOCs?

Cheers,
Gary

From: Johnson, Andrew N. <anj at anl.gov>
Sent: 23 July 2021 17:02
To: Yendell, Gary (DLSLtd,RAL,LSCI) <gary.yendell at diamond.ac.uk>
Cc: EPICS tech-talk <tech-talk at aps.anl.gov>
Subject: Re: Channel Access Timeouts - ca(get|put) CA Priority
 
Hi Gary,

On Jul 23, 2021, at 9:51 AM, Yendell, Gary (DLSLtd, RAL, LSCI) via Tech-talk <tech-talk at aps.anl.gov> wrote:

We are having very occasional issues with CA timeouts (with a timeout of 3 seconds) from our control software (GDA). I think it is because the IOC sometimes doesn't respond when it is overwhelmed with requests, and having multiple instances of EDM screens open can produce a lot of requests. These timeouts can cause scans to fail, whereas responsiveness of the EDM screens is much less important. If we add an appropriate value for the CA Priority option to ca(get|put) in control software requests, would that make the IOC respond to it in preference to other clients and stop the timeout errors? Or am I misunderstanding what this option does?

The priority of a CA connection controls the priority of the EPICS threads in the IOC that are responsible for sending and receiving the CA messages. Whether and how the EPICS thread priority maps to that of the underlying OS depends on what OS the IOC is running on.

For IOCs running on Linux the EPICS thread priority only has an effect if the IOC has enough privilege to use the real-time scheduler (regular users generally don’t have this enabled). See this tech-talk message for a description and this how-to for detailed instructions. You might not even need to set the priorities of your CA connections to fix this though, just enabling the scheduler may be enough since the IOC thread that responds to name searches runs at a lower priority than the CA threads (which are lower again than the IOC’s threads that run the process database).
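As a quick, hedged check of whether an account could use the real-time scheduler at all: on Linux an unprivileged process can only select a real-time policy if its RLIMIT_RTPRIO limit is above zero, and locking memory is bounded by RLIMIT_MEMLOCK. The sketch below reads both limits for the current process. The helper names are hypothetical, it ignores capabilities such as CAP_SYS_NICE (which allow real-time scheduling regardless of the limit), and it is not a substitute for the how-to mentioned above.

```python
import resource


def rt_limits():
    """Return {limit name: (soft, hard)} for the limits relevant to real-time use.

    A value of None means this platform does not define that limit.
    """
    limits = {}
    for name in ("RLIMIT_RTPRIO", "RLIMIT_MEMLOCK"):
        constant = getattr(resource, name, None)
        limits[name] = None if constant is None else resource.getrlimit(constant)
    return limits


def can_request_rt(limits):
    """True if the soft RLIMIT_RTPRIO would permit a real-time priority above zero."""
    rtprio = limits.get("RLIMIT_RTPRIO")
    if rtprio is None:
        return False
    soft, _hard = rtprio
    return soft == resource.RLIM_INFINITY or soft > 0


if __name__ == "__main__":
    current = rt_limits()
    print(current)
    print("real-time priorities permitted:", can_request_rt(current))
```

Running this as the account that starts the IOC gives a first indication of whether the limits need raising before the EPICS thread priorities can take effect.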

The other thing you might want to do is look at the CA search traffic on your subnet and see if it’s worth cleaning up any clients that are searching for PV names that don’t exist. If you don’t keep an eye on old GUI screens and other clients these dead searches can severely load down your IOCs. There are a couple of older tools I know of which can help to identify those PV names (caSnooper is probably best known), but the community might have others that I don’t know about.

HTH,

- Andrew

-- 
Complexity comes for free, simplicity you have to work for.

 

 


Replies:
Re: Channel Access Timeouts - ca(get|put) CA Priority Johnson, Andrew N. via Tech-talk
References:
Channel Access Timeouts - ca(get|put) CA Priority Yendell, Gary (DLSLtd, RAL, LSCI) via Tech-talk
Re: Channel Access Timeouts - ca(get|put) CA Priority Johnson, Andrew N. via Tech-talk

ANJ, 27 Jul 2021