Argonne National Laboratory

Experimental Physics and
Industrial Control System


Subject: RE: generalTime/epicsTimeGetCurrent and casr questions
From: Mark Rivers via Tech-talk <tech-talk@aps.anl.gov>
To: "'Johnson, Andrew N.'" <anj@anl.gov>
Cc: 'Brands Helge' <Helge.Brands@psi.ch>, "'Layne (US), William C'" <william.c.layne@boeing.com>, "'tech-talk@aps.anl.gov'" <tech-talk@aps.anl.gov>
Date: Mon, 11 Mar 2019 15:28:59 +0000
Matt Newville and I tested the apparent problem with caQtDM this morning.

We were running version 4.1.7.  We found that the problem was with the slider widget.  Each time the slider was moved, caQtDM opened dozens of new connections, as reported by casr on the IOC.

We updated to the latest release, 4.2.1, and the problem went away.

I don't see this issue mentioned in the caQtDM release notes. https://github.com/caqtdm/caqtdm/releases

Mark


-----Original Message-----
From: Mark Rivers 
Sent: Saturday, March 9, 2019 6:16 PM
To: 'Johnson, Andrew N.' <anj@anl.gov>
Cc: Layne (US), William C <william.c.layne@boeing.com>; tech-talk@aps.anl.gov
Subject: RE: generalTime/epicsTimeGetCurrent and casr questions

Hi Andrew,

I realized something must be wrong.  Ferrari is a Windows workstation that Matt Newville uses for our microprobe.  I asked Matt what could be connecting to those PVs, and he thought it must be caQtDM windows.  He restarted caQtDM and indeed over 4000 connections, to just a few dozen PVs, went away on the IOC.  Now casr 2 reports this for that client on Ferrari:

    TCP client at 164.54.160.93:60618 'Ferrari':
        User 'XAS_User', V4.13, Priority = 0, 21 Channels
        Channel: '13IDA:eps_mbbi195'
        Channel: '13IDA:eps_mbbi195'
        Channel: '13IDA:eps_mbbi192'
        Channel: '13IDA:eps_mbbi192'
        Channel: '13IDA:eps_mbbi194'
        Channel: '13IDA:eps_mbbi194'
        Channel: '13IDA:OpenEShutter.PROC'
        Channel: '13IDA:CloseEShutter.PROC'
        Channel: '13IDA:eps_mbbi27'
        Channel: '13IDA:IDEAutoOpenMode'
        Channel: '13IDA:DAC1_7_tweak.A'
        Channel: '13IDA:DAC1_7_tweak.B'
        Channel: '13IDA:DAC1_7_tweakVal'
        Channel: '13IDA:DAC1_7.VAL'
        Channel: '13IDA:DAC1_8_tweak.A'
        Channel: '13IDA:DAC1_8_tweak.B'
        Channel: '13IDA:DAC1_8_tweakVal'
        Channel: '13IDA:DAC1_8.VAL'
        Channel: '13IDA:DAC1_8.VAL'
        Channel: '13IDA:DAC1_7.VAL'
        Channel: '13IDA:E_BPMFoilPosition_RBV'

So now there are 2 instances of DAC1_7.VAL and DAC1_8.VAL, while before restarting caQtDM there were hundreds of instances of each.

So the problem definitely appears to be caQtDM.  On Monday when there is no beam we will try opening and closing the caQtDM window to those PVs repeatedly to see if the number of connections keeps increasing.  If it does we will check to make sure we are using the latest release of caQtDM, and report a bug if the latest release also has the problem.
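
For anyone who wants to repeat this check, here is a minimal sketch (not part of EPICS) that counts how many times each channel appears per client in saved `casr 2` output. It assumes the output has the same layout as the listing above; the names `count_channels`, `duplicates`, and `SAMPLE` are made up for illustration, and `SAMPLE` is an abbreviated excerpt of the listing.

```python
import re
from collections import Counter

# Patterns assume the "casr 2" layout shown in this thread.
CLIENT_RE = re.compile(r"TCP client at (\S+) '([^']*)':")
CHANNEL_RE = re.compile(r"Channel: '([^']*)'")


def count_channels(casr_text):
    """Return {(address, host): Counter mapping channel name -> occurrences}."""
    clients = {}
    current = None
    for line in casr_text.splitlines():
        match = CLIENT_RE.search(line)
        if match:
            # A new client block starts; later Channel lines belong to it.
            current = (match.group(1), match.group(2))
            clients[current] = Counter()
            continue
        match = CHANNEL_RE.search(line)
        if match and current is not None:
            clients[current][match.group(1)] += 1
    return clients


def duplicates(clients, threshold=2):
    """Keep only the channels a single client holds at least `threshold` times."""
    result = {}
    for client, counts in clients.items():
        repeated = {pv: n for pv, n in counts.items() if n >= threshold}
        if repeated:
            result[client] = repeated
    return result


# Abbreviated excerpt of the casr output above (hypothetical test input).
SAMPLE = """
    TCP client at 164.54.160.93:60618 'Ferrari':
        Channel: '13IDA:DAC1_7.VAL'
        Channel: '13IDA:DAC1_8.VAL'
        Channel: '13IDA:DAC1_8.VAL'
        Channel: '13IDA:DAC1_7.VAL'
        Channel: '13IDA:E_BPMFoilPosition_RBV'
"""

if __name__ == "__main__":
    for client, repeated in duplicates(count_channels(SAMPLE)).items():
        print(client, repeated)
```

Running this on output captured before and after opening and closing a display would show whether the per-channel counts keep growing. A small count such as 2 can be legitimate (two widgets on the same PV); a count that rises with every open/close cycle would point to a leak.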

Mark


-----Original Message-----
From: Johnson, Andrew N. <anj@anl.gov> 
Sent: Saturday, March 9, 2019 4:18 PM
To: Mark Rivers <rivers@cars.uchicago.edu>
Cc: Layne (US), William C <william.c.layne@boeing.com>; tech-talk@aps.anl.gov
Subject: Re: generalTime/epicsTimeGetCurrent and casr questions

Hi Mark,

Your client running on Ferrari is definitely not coded properly. It looks like it is continually creating a new chid instead of re-using an existing one for that channel. This uses up memory on both the IOC and the client (until the client restarts, which releases the IOC resources), so I strongly recommend that you find out what it is and fix it.

William, regarding your question about generalTime: I believe Michael Davidsaver has some ideas about rewriting the generalTime code to try to reduce the contention on that mutex (it’s probably the one you named, but I don’t have the code in front of me to check right now). I don’t know how far he has got with that, if anywhere.
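
As an illustration of what re-using a channel means, here is a minimal sketch of a per-name channel cache. It does not use the real Channel Access API: `ChannelCache` and `fake_create_channel` are hypothetical names, and `fake_create_channel` is only a stand-in for whatever call a real client makes to obtain a chid (in the C client library that would be `ca_create_channel`).

```python
class ChannelCache:
    """Hand out one channel object per PV name, creating it only once."""

    def __init__(self, create_channel):
        self._create = create_channel   # callable: pv_name -> channel handle
        self._channels = {}             # pv_name -> existing channel handle
        self.created = 0                # number of channels actually created

    def get(self, pv_name):
        channel = self._channels.get(pv_name)
        if channel is None:
            # First request for this PV: create the channel and remember it.
            channel = self._create(pv_name)
            self._channels[pv_name] = channel
            self.created += 1
        return channel


def fake_create_channel(pv_name):
    """Hypothetical stand-in for the real channel-creation call."""
    return object()


if __name__ == "__main__":
    cache = ChannelCache(fake_create_channel)
    # Simulate a widget asking for the same PV on every slider move.
    for _ in range(100):
        cache.get("13IDA:DAC1_7.VAL")
    print("channels created:", cache.created)
```

With a cache like this, a hundred requests for the same PV name create a single channel, so casr on the IOC would list that channel once for the client rather than once per request.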

- Andrew

--
Sent from my iPad

> On Mar 9, 2019, at 10:31 AM, Mark Rivers via Tech-talk <tech-talk@aps.anl.gov> wrote:
>
> Hi William,
>
>
>   I am noticing that in some of our ‘casr’ outputs, we are seeing multiple TCP connections with the same PV channel connected. Is this normal behavior?
>
> a.       For some reason, I assumed Channel Access wouldn’t allow this. If it is normal behavior, any suggestions on ways to prevent it? (I know having the users checking would be a good way :))
>
> I am not sure what you mean by "multiple TCP connections with the same PV channel connected"?  Do you mean from the same host?  Do they have different port numbers, like this?
>
>    TCP client at 164.54.160.82:41108 'corvette':
>        User 'epics', V4.13, Priority = 80, 108 Channels
>        Channel: '13IDA:scan1.P1SP'
>        Channel: '13IDA:scan1.P1EP'
>        Channel: '13IDA:scan1.NPTS'
>        Channel: '13IDA:scan1.EXSC'
>        Channel: '13IDA:scan1.MPTS'
>        Channel: '13IDA:scan1.BUSY'
>        Channel: '13IDA:scan1.CMND'
>        Channel: '13IDA:scan1.P1SM'
> ...
>    TCP client at 164.54.160.82:41110 'corvette':
>        User 'epics', V4.13, Priority = 80, 129 Channels
>        Channel: '13IDA:scan1.P1SM'
>        Channel: '13IDA:scan1.P1SP'
>        Channel: '13IDA:scan1.P1EP'
>        Channel: '13IDA:scan1.NPTS'
>        Channel: '13IDA:scan1.EXSC'
>        Channel: '13IDA:scan1.MPTS'
>        Channel: '13IDA:scan1.BUSY'
>        Channel: '13IDA:scan1.CMND'
>
> If so those are just 2 different clients on the same host connecting to the same PV.  That is normal.
>
> On the other hand I am also seeing things like this, which I am not sure is normal:
>
>    TCP client at 164.54.160.93:60193 'Ferrari':
>        User 'XAS_User', V4.13, Priority = 50, 4249 Channels
>        Channel: '13IDA:DAC1_7.VAL'
>        Channel: '13IDA:DAC1_7.VAL'
>        Channel: '13IDA:DAC1_7.VAL'
>        Channel: '13IDA:DAC1_7.VAL'
>        Channel: '13IDA:DAC1_7.VAL'
>        Channel: '13IDA:DAC1_7.VAL'
>        Channel: '13IDA:DAC1_7.VAL'
>        Channel: '13IDA:DAC1_7.VAL'
>        Channel: '13IDA:DAC1_7.VAL'
>        Channel: '13IDA:DAC1_7.VAL'
>        Channel: '13IDA:DAC1_7.VAL'
>        Channel: '13IDA:DAC1_7.VAL'
>        Channel: '13IDA:DAC1_7.VAL'
>        Channel: '13IDA:DAC1_7.VAL'
>        Channel: '13IDA:DAC1_7.VAL'
>        Channel: '13IDA:DAC1_7.VAL'
>        Channel: '13IDA:DAC1_7.VAL'
>        Channel: '13IDA:DAC1_7.VAL'
>        Channel: '13IDA:DAC1_7.VAL'
>        Channel: '13IDA:DAC1_7.VAL'
>        Channel: '13IDA:DAC1_7.VAL'
>        Channel: '13IDA:DAC1_7.VAL'
>        Channel: '13IDA:DAC1_7.VAL'
>        Channel: '13IDA:DAC1_7.VAL'
>        Channel: '13IDA:DAC1_7.VAL'
>        Channel: '13IDA:DAC1_7.VAL'
>        Channel: '13IDA:DAC1_7.VAL'
>        Channel: '13IDA:DAC1_7.VAL'
>        Channel: '13IDA:DAC1_7.VAL'
>        Channel: '13IDA:DAC1_7.VAL'
>        Channel: '13IDA:DAC1_7.VAL'
>        Channel: '13IDA:DAC1_7.VAL'
>        Channel: '13IDA:DAC1_7.VAL'
>        Channel: '13IDA:DAC1_7.VAL'
>        Channel: '13IDA:DAC1_7.VAL'
>        Channel: '13IDA:DAC1_7.VAL'
>        Channel: '13IDA:DAC1_7.VAL'
>        Channel: '13IDA:DAC1_7.VAL'
>        Channel: '13IDA:DAC1_7.VAL'
>        Channel: '13IDA:DAC1_7.VAL'
>        Channel: '13IDA:DAC1_7.VAL'
>        Channel: '13IDA:DAC1_7.VAL'
>        Channel: '13IDA:DAC1_7.VAL'
>        Channel: '13IDA:DAC1_7.VAL'
>        Channel: '13IDA:DAC1_7.VAL'
>        Channel: '13IDA:DAC1_7.VAL'
>        Channel: '13IDA:DAC1_7.VAL'
>        Channel: '13IDA:DAC1_7.VAL'
>        Channel: '13IDA:DAC1_7.VAL'
>        Channel: '13IDA:DAC1_7.VAL'
>        Channel: '13IDA:DAC1_7.VAL'
>        Channel: '13IDA:DAC1_7.VAL'
>        Channel: '13IDA:DAC1_7.VAL'
>        Channel: '13IDA:DAC1_7.VAL'
>        Channel: '13IDA:DAC1_7.VAL'
>        Channel: '13IDA:DAC1_7.VAL'
>        Channel: '13IDA:DAC1_7.VAL'
>        Channel: '13IDA:DAC1_7.VAL'
>        Channel: '13IDA:DAC1_7.VAL'
>        Channel: '13IDA:DAC1_7.VAL'
>        Channel: '13IDA:DAC1_7.VAL'
>        Channel: '13IDA:DAC1_7.VAL'
>        Channel: '13IDA:DAC1_7.VAL'
>        Channel: '13IDA:DAC1_7.VAL'
>        Channel: '13IDA:DAC1_7.VAL'
>        Channel: '13IDA:DAC1_7.VAL'
>        Channel: '13IDA:DAC1_7.VAL'
>        Channel: '13IDA:DAC1_7.VAL'
>        Channel: '13IDA:DAC1_7.VAL'
>        Channel: '13IDA:DAC1_7.VAL'
>        Channel: '13IDA:DAC1_7.VAL'
>        Channel: '13IDA:DAC1_7.VAL'
>        Channel: '13IDA:DAC1_7.VAL'
>        Channel: '13IDA:DAC1_7.VAL'
>
> Mark
>
>
> ________________________________
> From: tech-talk-bounces@aps.anl.gov <tech-talk-bounces@aps.anl.gov> on behalf of Layne (US), William C via Tech-talk <tech-talk@aps.anl.gov>
> Sent: Friday, March 8, 2019 6:18 PM
> To: tech-talk@aps.anl.gov
> Subject: generalTime/epicsTimeGetCurrent and casr questions
>
>
> Hello, haven’t posted here before so here it goes. (Recently completed some training and was encouraged to reach out more)
>
>
>
> I figure I will start with some of the more softball questions as we are seeing various issues here:
>
> 1.       Has anyone seen issues with multiple threads using the generalTime subsystem?
>
> a.       We are having some performance issues with a large number of clients (>100), and I am trying to narrow down which futex we are spending lots of time in. I will run some tests later this week to see if the ‘timeListLock’ could be impacting them.
>
> 2.       I am noticing that in some of our ‘casr’ outputs, we are seeing multiple TCP connections with the same PV channel connected. Is this normal behavior?
>
> a.       For some reason, I assumed Channel Access wouldn’t allow this. If it is normal behavior, any suggestions on ways to prevent it? (I know having the users checking would be a good way :))
>
>
>
> Hope to be participating more. Thanks,
>
>
>
> William ‘Casey’ Layne
>
> Stage Controller – Software Engineer
>
> Email: william.c.layne@boeing.com
>
> Any opinions expressed herein are my own.  They are not necessarily those of Boeing, or any other company or organization with which I am affiliated.
>
>
