Jeff and Dirk - thanks for getting back to me.
I'm familiar with some of the issues that you guys pointed out; they can definitely bite you. While double-checking your suggestions I found that one of the channels being monitored came from another vxWorks IOC (mistake on my part). Surprise, surprise... it never has any disconnect issues. Following that lead, I found that the problems seem to be centered around a dynamically assigned PV in one of my two sequencers. When I hard-coded that PV assignment the problems went away, and this has remained consistent. What's confusing me a bit is that I also use dynamic assignment in the working sequencer; I just don't do any reassignments there.
Ben - Any known issues with dynamic assignment/reassignment? Perhaps specific to RTEMS?
Thanks guys,
Wesley
----- Original Message -----
> From: "Dirk Zimoch" <[email protected]>
> To: "Jeff Hill" <[email protected]>
> Cc: "Wesley Moore" <[email protected]>, "EPICS tech-talk" <[email protected]>
> Sent: Friday, September 21, 2012 6:01:07 AM
> Subject: Re: CAC problem between RTEMS and vxWorks
>
> Hi Wesley
>
> Hill, Jeff wrote:
> > Hi Wesley,
> >
> >> TCP 129.57.214.101:1024(): User="rtems", V4.11, 4 Channels,
> >> Priority=0
> >> TCP 129.57.214.101:1025(): User="rtems", V4.11, 4 Channels,
> >> Priority=0
> >> TCP 129.57.214.101:1026(): User="rtems", V4.11, 4 Channels,
> >> Priority=0
> >> TCP 129.57.214.101:1027(): User="rtems", V4.11, 4 Channels,
> >> Priority=0
> >> TCP 129.57.214.101:1028(): User="rtems", V4.11, 4 Channels,
> >> Priority=0
> >> TCP 129.57.214.101:1029(): User="rtems", V4.11, 4 Channels,
> >> Priority=0
> >> TCP 129.57.214.101:1030(): User="rtems", V4.11, 4 Channels,
> >> Priority=0
> >> TCP 129.57.214.101:1031(): User="rtems", V4.11, 4 Channels,
> >> Priority=0
> >
> > When abruptly rebooting the RTEMS system the TCP shutdown
> > interactions
> > may not (probably don’t) occur, and when the new instance of the
> > RTEMS
> > system starts up it begins a new CA circuit over TCP. What
> > ephemeral
> > port is assigned to the new circuit (i.e. 1024 through 1031 above)
> > is
> > an implementation detail of the IP kernel. On some
> > implementations
> > the same port gets reused, and typically TCP detects an attempt to
> > start
> > a new circuit on the same port as a preexisting circuit, and the
> > server
> > side hangs up. Otherwise, the server side waits the duration of a
> > long
> > timeout, in case the client is just temporarily losing
> > connectivity,
> > before it hangs up. The order of the ephemeral port assignment
> > depends
> > typically on what sockets have been created since rebooting, and so
> > you
> > may get different port assignment orderings if there are many
> > circuits
> > being created shortly after the IOC reboots.
> >
> >> CAC: Unable to connect because "Connection timed out"
> >
> > Make certain that the full/half-duplex configurations match between the
> > switches and the IOCs that are involved. If the switch and the IOC
> > don’t
> > match communication can proceed but it can be very slow and
> > unreliable.
> > Sometimes you can see this by watching the beacons with casw. On
> > vxWorks
> > you can sometimes see the Ethernet auto-negotiation parameters by
> > typing
> > ifShow. In my experience, some of the vxWorks Ethernet drivers are
> > neglecting to turn on the continuous auto-negotiation option in the
> > PHY
> > and so if the vxWorks system gets powered up before the switch it
> > decides
> > to default the auto-negotiation parameters, and never tries to
> > auto-negotiate again. This can be a problem because switches are
> > typically set to continuously auto-negotiate.
> >
> > Another possibility is too many collisions which can be sometimes
> > seen
> > on vxWorks systems with ifShow, but this is an infrequently
> > experienced
> > problem today, on modern switched Ethernet networks.
> >
> > Jeff
>
> Keep in mind that the number of file descriptors in vxWorks is
> limited
> and each "dead" socket connection uses one. The default number is 50.
> Once they are used up, no new network connection can be made (and no
> file can be opened). Modify NUM_FILES in the vxWorks BSP
> configuration.
>
> But even worse, vxWorks may run out of network buffers. If the RTEMS
> system has monitors on its 4 channels, then EPICS may try to send the
> monitor events filling the TCP send buffers and using up all
> available
> network buffers.
>
> So you may check the following:
> * Run inetstatShow and check the Send-Q entries of your "dead"
> sockets
> * Run mbufShow and see if you are short of free buffers
> * Run iosFdShow and see how many file descriptors are in use.
>
> Jeff, maybe CA should "clean up" the stale sockets instead of waiting
> for TCP to do that (if possible).
>
> Dirk
>
> >
> >> -----Original Message-----
> >> From: [email protected]
> >> [mailto:[email protected]]
> >> On Behalf Of Wesley Moore
> >> Sent: Wednesday, September 19, 2012 10:19 AM
> >> To: EPICS tech-talk
> >> Subject: CAC problem between RTEMS and vxWorks
> >>
> >> All,
> >>
> >> I'm having issues with a RTEMS client (3.14.11) accessing PVs from
> >> a
> >> vxWorks IOC (3.14.8.2). When the RTEMS IOC is rebooted, sometimes
> >> it
> >> doesn't reconnect to the other IOC. Even after connecting, I'm
> >> often
> >> getting timeouts on RTEMS and can't seem to maintain a solid
> >> connection
> >> between the two.
> >>
> >> # timeouts on RTEMS client IOC
> >> CAC: Unable to connect because "Connection timed out"
> >> CA.Client.Exception...............................................
> >> Warning: "Virtual circuit disconnect"
> >> Context: "iocfel8.acc.jlab.org:5064"
> >> Source File: ../cac.cpp line 1145
> >> Current Time: Wed Sep 19 2012 11:52:27.183784942
> >> ..................................................................
> >>
> >> After reboots, it stacks up new socket connections which isn't
> >> helping
> >> matters.
> >>
> >> # casr on vxWorks IOC
> >> TCP 129.57.214.101:1024(): User="rtems", V4.11, 4 Channels,
> >> Priority=0
> >> TCP 129.57.214.101:1025(): User="rtems", V4.11, 4 Channels,
> >> Priority=0
> >> TCP 129.57.214.101:1026(): User="rtems", V4.11, 4 Channels,
> >> Priority=0
> >> TCP 129.57.214.101:1027(): User="rtems", V4.11, 4 Channels,
> >> Priority=0
> >> TCP 129.57.214.101:1028(): User="rtems", V4.11, 4 Channels,
> >> Priority=0
> >> TCP 129.57.214.101:1029(): User="rtems", V4.11, 4 Channels,
> >> Priority=0
> >> TCP 129.57.214.101:1030(): User="rtems", V4.11, 4 Channels,
> >> Priority=0
> >> TCP 129.57.214.101:1031(): User="rtems", V4.11, 4 Channels,
> >> Priority=0
> >>
> >>
> >> Any help is greatly appreciated.
> >>
> >> Wesley
> >
> >
>
>
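To make the "stacked up" circuits in the casr listing above easier to spot, here is a minimal sketch that counts TCP circuits per client in captured casr output. It assumes the output has been saved as text in the same layout as shown in this thread; the function names, the SAMPLE string, and the threshold of one circuit per client are hypothetical and chosen only for illustration.

```python
import re
from collections import Counter

# Matches the first line of a casr circuit entry as it appears in this thread, e.g.
#   TCP 129.57.214.101:1024(): User="rtems", V4.11, 4 Channels,
# The layout is assumed from the listing above; other EPICS versions may differ.
CIRCUIT_RE = re.compile(
    r'TCP\s+(?P<host>\d+\.\d+\.\d+\.\d+):(?P<port>\d+)\([^)]*\):\s+User="(?P<user>[^"]*)"'
)


def count_circuits(casr_text):
    """Return a Counter mapping (host, user) to the number of circuits listed."""
    counts = Counter()
    for line in casr_text.splitlines():
        match = CIRCUIT_RE.search(line)  # search() tolerates leading "> " quoting
        if match:
            counts[(match.group("host"), match.group("user"))] += 1
    return counts


def clients_with_extra_circuits(casr_text, max_expected=1):
    """Return the clients holding more circuits than expected (hypothetical threshold)."""
    return {
        client: n
        for client, n in count_circuits(casr_text).items()
        if n > max_expected
    }


# Sample text rebuilt from the casr listing in this thread (ports 1024 through 1031).
SAMPLE = "\n".join(
    'TCP 129.57.214.101:%d(): User="rtems", V4.11, 4 Channels,\nPriority=0' % port
    for port in range(1024, 1032)
)
```

Calling clients_with_extra_circuits(SAMPLE) flags the single RTEMS client as holding eight circuits. Per Dirk's note, each of these holds a vxWorks file descriptor, so the count can be compared against the configured NUM_FILES limit.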