EPICS Controls Argonne National Laboratory

Experimental Physics and
Industrial Control System


Subject: EPICS Reliability
From: Carl Dickey <[email protected]>
Date: Thu, 16 Feb 1995 19:26:35 -0500
All-
 
       Bob has asked me to relate an experience that we have gone through here at DFELL.
Several weeks ago, our linac IOC locked up during injection into our storage ring. Despite
generally excellent reliability, this is always a bit of an embarrassment, so we started 
recollecting the recent history of similar happenings. We recalled five such events during
the past year or so. Two occurred on the linac IOC, one on our storage ring IOCs and 
two on our Mark III IOC. As usual, the software enthusiasts began to investigate the 
hardware layers and the hardware types started analysing the software. An investigation
of the syslogs for workstations involved with EPICS revealed occasional NFS errors. 
Furthermore, at the time of the most recent linac IOC failure, we found an RPC error and
a corresponding timeout. We began to suspect that the NFS errors were being caused by 
noise coupled into our Ethernet. Fred Carter had been having a bad feeling about our
COW (computer on wheels) that we keep near the center of our storage ring for quick
troubleshooting. Sure enough, we discovered that in our haste to commission the system,
we had employed an unshielded AUI drop cable. This cable worked OK until we began to
fire up our pulsed power systems and our ceramic gaps became illuminated. Apparently,
about once a month or so, the occasional packet bursts evidenced by the NFS errors
would occur in coincidence with the reception of data by the IOC. It seems that 
this causes the IOC to die. Thus we see an RPC error and a timeout. Rebooting the 
IOC clears the locked condition. Since replacing the bad drop cable, we have seen no 
further NFS errors, and we have had no further IOC dropouts. 
       Bob tells me that Los Alamos has had a very similar experience in the past with
their Heurikon HKV2-based IOCs. Given that so many labs are in the commissioning phase,
it might be good to keep this in mind in case you encounter something similar. Bob can
provide more detailed pathological information concerning this type of failure. 
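
For anyone who wants to run the same kind of syslog check, here is a rough sketch in Python. It is not the procedure we actually used. It assumes the classic "Mon DD HH:MM:SS host message" syslog prefix, and the keyword pattern, function names and sample lines are all made up for illustration. It pulls out lines that look like NFS or RPC errors and lists those that fall within a few minutes of a known IOC lockup.

```python
import re
from datetime import datetime, timedelta

# Assumed classic BSD syslog prefix: "Feb 16 19:26:35 host message..."
SYSLOG_RE = re.compile(r"^(\w{3})\s+(\d{1,2}) (\d{2}:\d{2}:\d{2}) (\S+) (.*)$")

# Hypothetical keyword pattern; real NFS/RPC messages vary by operating system.
ERROR_RE = re.compile(
    r"\b(NFS|RPC)\b.*\b(error|timed out|timeout|not responding)\b",
    re.IGNORECASE,
)


def parse_errors(lines, year):
    """Return (timestamp, host, message) for lines that look like NFS/RPC errors.

    Classic syslog lines carry no year, so the caller has to supply one.
    """
    hits = []
    for line in lines:
        m = SYSLOG_RE.match(line)
        if not m:
            continue
        month, day, clock, host, msg = m.groups()
        if not ERROR_RE.search(msg):
            continue
        stamp = datetime.strptime(
            f"{year} {month} {day} {clock}", "%Y %b %d %H:%M:%S"
        )
        hits.append((stamp, host, msg))
    return hits


def errors_near(hits, failure_time, window=timedelta(minutes=5)):
    """Return the error entries logged within `window` of an IOC failure."""
    return [h for h in hits if abs(h[0] - failure_time) <= window]


# Made-up sample lines, for illustration only.
SAMPLE = [
    "Feb 16 19:20:01 ws1 NFS write error 5 on host fileserver",
    "Feb 16 19:21:10 ws1 RPC: timed out",
    "Feb 16 12:00:00 ws2 login: user logged in",
    "Feb  3 08:15:42 ws2 NFS server fileserver not responding",
]

if __name__ == "__main__":
    hits = parse_errors(SAMPLE, year=1995)
    lockup = datetime(1995, 2, 16, 19, 22, 0)  # hypothetical IOC lockup time
    for stamp, host, msg in errors_near(hits, lockup):
        print(stamp, host, msg)
```

A few minutes of window is an arbitrary choice. Widen it if the workstation and IOC clocks are not well synchronized.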
 
Best wishes,
 
Carl
 
PS- Our commissioning is continuing to go well. Beam lifetime in our storage ring is on
the order of hours. We have successfully achieved single bunch injection and synchronous
stacking. We have ramped to over 1 GeV, our machine's design energy. Our next goals include
increasing the current and reducing our injection pulse from the present length of three
buckets (about 15 ns) to one bucket (about 5 ns).

