EPICS Controls Argonne National Laboratory

Experimental Physics and
Industrial Control System


Subject: Re: Safely proceeding machine learning applications
From: Morgan Henderson via Tech-talk <tech-talk at aps.anl.gov>
To: tech-talk at aps.anl.gov
Date: Wed, 30 Aug 2023 10:41:48 -0600
Hi Tong,

I can't speak to the role of gateways in preventing the execution of risky decisions by ML agents. But I do want to emphasize what Josh EC said above, that not giving an ML agent the power to make critical decisions autonomously in the first place is always your best defense against an incident.

Even when an ML agent is executing control processes, initiation should almost always be a human decision and processes should always be overseen by a person. An ML agent's access should be limited to the controls needed for its tasks and nothing else, and to Pete's point those should be well-enclosed by machine safeguards.
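As a rough illustration of limiting an agent to the controls it needs and requiring a human to start it, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the class name `ScopedWriter`, the PV name `DEMO:QUAD1:I_SET`, the limits, and the injected `put` callable, which stands in for whatever CA/PVA client call a site actually uses. It is a software-side convenience only and is not a substitute for machine safeguards.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple


class WriteRejected(Exception):
    """Raised when the agent requests a write outside its granted scope."""


@dataclass
class ScopedWriter:
    # put: any callable that performs the real write (hypothetical stand-in
    # for a CA/PVA client call); injected so the sketch runs without EPICS.
    put: Callable[[str, float], None]
    # allowed: hypothetical PV name -> (low, high) range the agent may write.
    allowed: Dict[str, Tuple[float, float]]
    # armed: flipped only by a human operator, never by the agent itself.
    armed: bool = False

    def arm(self, operator: str) -> None:
        if not operator:
            raise ValueError("an operator name is required to arm the agent")
        self.armed = True

    def disarm(self) -> None:
        self.armed = False

    def write(self, pv: str, value: float) -> None:
        if not self.armed:
            raise WriteRejected("agent has not been armed by an operator")
        if pv not in self.allowed:
            raise WriteRejected(f"{pv} is not in the agent's allowlist")
        low, high = self.allowed[pv]
        # NaN fails this comparison, so it is rejected as well.
        if not (low <= value <= high):
            raise WriteRejected(f"{value} is outside [{low}, {high}] for {pv}")
        self.put(pv, value)


if __name__ == "__main__":
    writer = ScopedWriter(put=lambda pv, v: print("put", pv, v),
                          allowed={"DEMO:QUAD1:I_SET": (0.0, 5.0)})
    writer.arm("operator_on_shift")
    writer.write("DEMO:QUAD1:I_SET", 2.0)
```

The ML code only ever receives the `ScopedWriter`, not the underlying client, so a write to any PV outside the allowlist fails before it reaches the network.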

None of that prevents an operator from using ML to make poor decisions, but that's a human mistake like any that can happen without ML. This reinforces the role of ML as a tool rather than as an operator, and places the responsibility for using the tool properly on actual operators. It also places more responsibility on whoever does your ML engineering to design safe, reliable tools. They should always provide, e.g., live uncertainty quantification (UQ) and a reasonable level of predictability so that you know how the tool you're using will function.
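One possible way to surface live UQ to the operator is sketched below in Python. It assumes, purely for illustration, that the tool is an ensemble of models and that the spread of their predictions is an acceptable uncertainty estimate; the function name `gated_proposal` and the threshold `max_std` are hypothetical. Other UQ methods would slot into the same gate.

```python
import statistics
from typing import Optional, Sequence


def gated_proposal(ensemble_predictions: Sequence[float],
                   max_std: float) -> Optional[float]:
    """Return the ensemble mean if its spread is small enough, else None.

    None means "no recommendation": the tool declines to propose a value
    and the decision stays entirely with the operator.
    """
    if len(ensemble_predictions) < 2:
        return None  # spread cannot be estimated from a single prediction
    spread = statistics.stdev(ensemble_predictions)
    if spread > max_std:
        return None  # models disagree too much to trust the mean
    return statistics.mean(ensemble_predictions)


if __name__ == "__main__":
    proposal = gated_proposal([1.02, 0.98, 1.00], max_std=0.05)
    if proposal is None:
        print("Uncertainty too high; no setpoint proposed.")
    else:
        print(f"Proposed setpoint for operator review: {proposal:.3f}")
```

The proposal is only ever shown for review here; applying it remains a separate, human-initiated step.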

Best,
Morgan Henderson

---

Morgan Henderson
Associate Research Scientist, RadiaSoft LLC

On Tue, Aug 29, 2023 at 11:04 AM Pete Jemian via Tech-talk <tech-talk at aps.anl.gov> wrote:
Machine operations and machine safeguards are two different areas.

1. Safeguards are there for your own worst days.  Implement them first.
2. Machine operations execute within the limitations of any safeguards.
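The second point can be sketched in software as follows. This is a hedged Python illustration only: the function name `limit_setpoint` and all numeric limits are hypothetical, and a software clamp like this sits on the operations side. It does not replace the hardware or IOC-level safeguards from point 1.

```python
import math


def limit_setpoint(requested: float, current: float,
                   low: float, high: float, max_step: float) -> float:
    """Return a setpoint that respects a range limit and a per-update step limit."""
    if low > high or max_step < 0:
        raise ValueError("invalid limits")
    if math.isnan(requested):
        return current  # ignore a nonsensical request and hold position
    # Limit how far a single update may move the setpoint...
    step = max(-max_step, min(max_step, requested - current))
    # ...then keep the result inside the permitted range.
    return max(low, min(high, current + step))


if __name__ == "__main__":
    # An optimizer asks for a large jump; only a bounded step is returned.
    print(limit_setpoint(requested=10.0, current=1.0,
                         low=0.0, high=5.0, max_step=0.5))
```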

On 8/29/2023 11:40 AM, Joshua Einstein-Curtis via Tech-talk wrote:
> Tong,
>
> I think about this a lot when working with our ML models -- and the best I can come up with are guidelines similar to those in any safety-critical design:
>
> - Don't let your controller even have the ability to output something that might damage anything
>
> I am not a fan of relying on access controls as any sort of primary safeguard, as those are outside the purview of the controller itself. If a controller has the capability to damage something (PPS or MPS), that feels like a huge risk. Seeing PID loops go wrong in RF really highlights that. On the flip side, I love access controls for mitigating possible configuration errors -- having something pop up if you write the wrong PV by mistake is critical. But where that is controlled and who configures it is an interesting question -- I'd rather run a pva/ca proxy on the same machine as the controller and build the access controls right into it.
>
> I'd love to hear other people's thoughts -- this would be a great topic at a workshop.
>
> Josh EC
>
> On Tue, Aug 29, 2023 at 9:52 AM Zhang, Tong via Tech-talk <tech-talk at aps.anl.gov> wrote:
>
>     Dear Colleagues,
>
>     Machine learning applications in accelerator controls are indeed gaining popularity, and there are exciting developments in progress. However, concerns persist regarding equipment protection, particularly when dealing with black-box ML models that may make risky decisions, especially during optimization iterations.
>
>     When it comes to ML model generation, utilizing archived data is a viable approach. However, during the application phase, these models may still generate audacious decisions. Even when trained with live data, the risk remains.
>
>     As far as I know, leveraging Channel Access security configuration is a sound strategy to manage PV write permissions at a granular level, covering individuals, groups, and workstations. This level of control ensures that the ML code's write permissions can be finely tuned. I’m still wondering whether this approach is fully secure.
>
>     Absolutely, incorporating the machine protection system as the primary safeguard on the device side is crucial. Your valuable insights/experience on this subject are greatly appreciated.
>
>     Thanks,
>     Tong
>
>     --
>     Tong Zhang, Ph.D. (he/him)
>     Controls Physicist
>     Facility for Rare Isotope Beams,
>     Michigan State University
>

--
----------------------------------------------------------
Pete R. Jemian, Ph.D.                 <jemian at anl.gov>
Beam line Controls and Data Acquisition (BC, aka BCDA)
Advanced Photon Source,    Argonne National Laboratory
Argonne, IL  60439                    630 - 252 - 3189
-----------------------------------------------------------
       Education is the one thing for which people
          are willing to pay yet not receive.
-----------------------------------------------------------



References:
Safely proceeding machine learning applications Zhang, Tong via Tech-talk
Re: Safely proceeding machine learning applications Joshua Einstein-Curtis via Tech-talk
Re: Safely proceeding machine learning applications Pete Jemian via Tech-talk
