Hi Michael,
Thanks. But as I said, the motors (e.g. 13BMA:m17.RBV) are running in a VxWorks IOC. I don't have a debugger for VxWorks.
Mark
-----Original Message-----
From: Michael Davidsaver <mdavidsaver at gmail.com>
Sent: Wednesday, May 10, 2023 7:20 PM
To: Mark Rivers <rivers at cars.uchicago.edu>
Cc: tech-talk at aps.anl.gov
Subject: Re: sequencer problem
On 5/10/23 16:23, Mark Rivers via Tech-talk wrote:
> I have found that a simple caget also causes strange behavior:
My first suspicion is that some code is blocking for an extended time with a record lock held, either directly or indirectly, e.g. by holding some other lock which a device support blocks on.
If this is such a deadlock, then one potentially easy way to get more information would be to, effectively, catch the blockers in the act.
Run your IOC in a debugger, issue the caget, then immediately switch and issue a manual break (Ctrl+C) in the debugger and finally run "thread apply all backtrace".
If I am right in my guess, this should show a client TCP thread blocked in dbScanLock(), and hopefully give some idea of which other thread is holding that lock.
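On a host where gdb is available (for example a Linux soft IOC; this does not apply to a VxWorks IOC), the steps above could be scripted roughly as follows. This is only a sketch under assumptions: the helper names gdb_backtrace_command, dump_all_threads and threads_mentioning are made up for illustration, the process ID has to be supplied by the caller, and the filter simply looks for the text "dbScanLock" in each thread's frames.

```python
import subprocess


def gdb_backtrace_command(pid):
    """Build a gdb command line that attaches to `pid`, prints every
    thread's stack with "thread apply all backtrace", then exits."""
    return ["gdb", "-p", str(pid), "-batch",
            "-ex", "thread apply all backtrace"]


def dump_all_threads(pid, timeout=60):
    """Run gdb against a live process and return its text output.
    Assumes gdb is installed and the caller may attach to `pid`."""
    result = subprocess.run(gdb_backtrace_command(pid),
                            capture_output=True, text=True,
                            timeout=timeout)
    return result.stdout


def threads_mentioning(backtrace_text, symbol="dbScanLock"):
    """Return the "Thread N (...)" header lines of every thread whose
    stack frames contain `symbol`."""
    hits = []
    header = None
    for line in backtrace_text.splitlines():
        if line.startswith("Thread "):
            header = line.strip()          # start of a new thread's stack
        elif header is not None and symbol in line and header not in hits:
            hits.append(header)            # this thread has a matching frame
    return hits
```

Issue the caget first, then call dump_all_threads(pid) right away and pass the result to threads_mentioning() to list the threads waiting in dbScanLock(). The full backtrace is still needed to work out which thread holds the lock.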
> caget causes a CA disconnect:
>
> corvette:CARS/CARSApp/src>caget 13BMA:m17.RBV
>
> Read operation timed out: some PV data was not read.
>
> 13BMA:m17.RBV 0
>
> CA.Client.Exception...............................................
>
> Warning: "Virtual circuit disconnect"
>
> Context: "op=0, channel=13BMA:m17.RBV, type=DBR_TIME_DOUBLE, count=1, ctx="ioc13bma.cars.aps.anl.gov:5064""
>
> Source File: ../getCopy.cpp line 91
>
> Current Time: Wed May 10 2023 18:16:47.411698930
>
> ..................................................................
>
> The record shows this with dbpr:
>
> ioc13bma> dbpr "13BMA:m17",2
>
> ACCL: 0.2 ACKS: NO_ALARM ACKT: YES ADEL: 0
>
> AMSG: ASG : ATHM: 0 BACC: 0.2
>
> BDST: 0 BKPT: 00 BVEL: 1 CDIR: 0
>
> CNEN: Disable DCOF: 0 DESC: Monochromator DHLM: 42.9453
>
> DIFF: 0 DINP: CONSTANT DIR : Pos DISA: 0
>
> DISP: 0 DISS: NO_ALARM DISV: 1 DLLM: -50
>
> DLY : 0.25 DMOV: 0 DOL : CONSTANT DRBV: 0
>
> DTYP: Mclennan PM304 DVAL: 12.336010114751
>
> EGU : degrees ERES: 0 EVNT: FLNK: CONSTANT
>
> FOF : 0 FOFF: Frozen FRAC: 1 HHSV: NO_ALARM
>
> HIGH: 0 HIHI: 0 HLM : 0 HLS : 0
>
> HLSV: NO_ALARM HOMF: 0 HOMR: 0 HOPR: 0
>
> HSV : NO_ALARM HVEL: 0 ICOF: 0 IGSET: 0
>
> INIT: JAR : 5 JOGF: 0 JOGR: 0
>
> JVEL: 1 LCNT: 0 LDVL: 12.336010114751
>
> LLM : 0 LLS : 0 LLSV: NO_ALARM LOCK: NO
>
> LOLO: 0 LOPR: 0 LOW : 0 LRLV: 0
>
> LRVL: 246720 LSPG: Go LSV : NO_ALARM LVAL: 3.77871710270
>
> LVIO: 1 MDEL: 0 MISS: 0 MOVN: 0
>
> MRES: 5.0e-05 NAME: 13BMA:m17 NAMSG: NSEV: NO_ALARM
>
> NSTA: NO_ALARM NTM : YES NTMF: 2
>
> OFF : -8.5572930120512 OMSL: supervisory
>
> OUT : VME_IO #C0 S0 @ PACT: 1 PCOF: 0
>
> PHAS: 0 PINI: NO POST: PP : 0
>
> PREC: 4 PREM: PRIO: LOW PUTF: 1
>
> RBV : 0 RCNT: 0 RDBD: 5.0e-05 RDBL: CONSTANT
>
> RDIF: 0 REP : 0 RHLS: 0 RINP: CONSTANT
>
> RLLS: 0 RLNK: CONSTANT RLV : 0 RMOD: Default
>
> RMP : 0 RPRO: 0 RRBV: 0 RRES: 0
>
> RSTM: NearZero RTRY: 0 RVAL: 246720 RVEL: 0
>
> S : 1 SBAK: 1 SBAS: 0 SCAN: Passive
>
> SDIS: CONSTANT SET : Use SEVR: INVALID SMAX: 1
>
> SPDB: 0 SPMG: Go SREV: 20000 SSET: 0
>
> STAT: UDF STOO: CONSTANT STOP: 0 SUSE: 0
>
> SYNC: 0 TDIR: 0 TIME: <undefined> TPRO: 0
>
> TSE : 0 TSEL: CONSTANT TWF : 0 TWR : 0
>
> TWV : 1 UDF : 0 UDFS: INVALID UEIP: No
>
> UREV: 1 URIP: No VAL : 3.77871710270 VBAS: 0
>
> VELO: 1 VERS: 7.2 VMAX: 1 VOF : 0
>
> value = 0 = 0x0
>
> The record has PACT=1, probably set in init_record when it could not find the controller. But should that be causing the CA errors with caget and the sequencer?
>
> Mark
>
> *From:* Mark Rivers
> *Sent:* Wednesday, May 10, 2023 5:07 PM
> *To:* tech-talk at aps.anl.gov
> *Subject:* sequencer problem
>
> Folks,
>
> I am seeing behavior I don’t understand with the sequencer. I am using base 7.0.6.1 and sequencer 2.2.9 (latest).
>
> I have a VxWorks IOC running a number of motors. All of the motor records are loaded, but currently one of the motor controllers is not available, so that motor record is not functional.
>
> I am running an SNL program on a Linux IOC, and it connects to the motors on the VxWorks system, including the currently non-functional motor. The SNL program puts monitors on several of the motor record fields.
>
> I see the following messages after iocInit when the SNL program starts:
>
> seq BM13_Energy, "E=13BMA:E, MONO=13BMA:m17, EXPTAB_Z=13BMD:m22, YXTAL=13BMA:MON:, ZXTAL=13BMA:m14"
>
> sevr=info Sequencer release 2.2.9, compiled Wed May 10 16:25:47 2023
>
> sevr=info Spawning sequencer program "BM13_Energy", thread 0x2c304a0: "BM13_Energy"
>
> sevr=minor BM13_Energy[0](after 0 sec): assigned=26, connected=26, monitored=24, got monitor=21
>
> sevr=minor BM13_Energy[0](after 0 sec): assigned=26, connected=26, monitored=24, got monitor=21
>
> sevr=minor BM13_Energy[0](after 0 sec): assigned=26, connected=26, monitored=24, got monitor=21
>
> sevr=minor BM13_Energy[0](after 0 sec): assigned=26, connected=26, monitored=24, got monitor=21
>
> sevr=minor BM13_Energy[0](after 0 sec): assigned=26, connected=26, monitored=24, got monitor=21
>
> The SNL program has monitors on 3 record fields for the non-functional motor. I think that is why there are 24 channels monitored, but only 21 monitors have been received?
>
> The above seems like it might be normal. However, after about 30 seconds I get this error on the Linux IOC:
>
> CA.Client.Exception...............................................
>
> Warning: "Virtual circuit unresponsive"
>
> Context: "ioc13bma.cars.aps.anl.gov:5064"
>
> Source File: ../tcpiiu.cpp line 926
>
> Current Time: Wed May 10 2023 16:53:37.844955871
>
> At the same time I get this error on the VxWorks IOC:
>
> DB CA Link Exception: "Virtual circuit disconnect", context "corvette:38399"
>
> After the exceptions I then get the following messages on the Linux IOC. Note that it spontaneously dropped 24 of the 26 connections, and “got monitor” changed to -1.
>
> ..................................................................
>
> sevr=minor BM13_Energy[0](after 0 sec): assigned=26, connected=2, monitored=24, got monitor=-1
>
> sevr=minor BM13_Energy[0](after 0 sec): assigned=26, connected=2, monitored=24, got monitor=-1
>
> sevr=minor BM13_Energy[0](after 0 sec): assigned=26, connected=2, monitored=24, got monitor=-1
>
> Is this behavior expected?
>
> Thanks,
>
> Mark
>
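The channel counts in the sequencer status lines quoted above can be checked mechanically. The following is a small sketch that assumes the line format shown in this thread. The function names are illustrative, and treating a negative "got monitor" value as "not reported" is a choice made in this sketch, not documented sequencer behavior.

```python
import re

# Matches the counters in a sequencer status line; \s* also tolerates
# the line being wrapped after "connected=...,".
STATUS_RE = re.compile(
    r"assigned=(-?\d+),\s*connected=(-?\d+),\s*"
    r"monitored=(-?\d+),\s*got monitor=(-?\d+)"
)


def parse_seq_status(text):
    """Return the four counters as a dict of ints, or None if absent."""
    match = STATUS_RE.search(text)
    if match is None:
        return None
    keys = ("assigned", "connected", "monitored", "got_monitor")
    return dict(zip(keys, (int(v) for v in match.groups())))


def summarize(status):
    """Derive how many channels are disconnected and how many monitored
    channels have not yet delivered a first monitor update."""
    got = status["got_monitor"]
    return {
        "disconnected": status["assigned"] - status["connected"],
        # Assumption of this sketch: a negative count means "not reported".
        "monitors_pending": None if got < 0 else status["monitored"] - got,
    }


before = parse_seq_status(
    "assigned=26, connected=26, monitored=24, got monitor=21")
after = parse_seq_status(
    "assigned=26, connected=2, monitored=24, got monitor=-1")
print(summarize(before))
print(summarize(after))
```

For the first status line this gives 3 pending monitors, matching the 3 monitored fields on the non-functional motor. For the second it gives 24 disconnected channels, matching the 24 of 26 connections that were dropped.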