Subject: | Re: IOC Redundancy in R3.14.10 |
From: | "Schoeneburg, Bernd" <[email protected]> |
To: | Artem Kazakov <[email protected]> |
Cc: | [email protected], Bob Dalesio <[email protected]> |
Date: | Thu, 04 Sep 2008 13:37:25 +0200 |
Hello Artem,

It is just as you assumed. We are preparing a module that includes a redundancy-start routine, which is called from the shell during startup. The routine registers a function with the task watchdog to maintain a list of EPICS tasks, and registers with the RMT as a single PRR, "base". Another function is registered with the task watchdog for each of these EPICS tasks, to be called when that task becomes suspended; it sets the status read by the RMT to "bad". The module provides the functions for the RMT API: stop, start and getStatus. stop calls iocPause(), start calls iocRun(), and getStatus returns the status set by the callback function mentioned above. getInfo can send detailed information about the health of individual EPICS tasks.
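The wiring described above could look roughly like the following. This is only a sketch under stated assumptions: iocRun() and iocPause() are stubbed so that it compiles without EPICS base, and the names prrStatus, prrOps, basePrr, onTaskSuspended and iocRunning are hypothetical. Neither the real RMT API nor the task watchdog registration calls are shown.

```c
#include <stddef.h>

/* Stubs standing in for iocRun()/iocPause() from EPICS base, so that the
 * sketch compiles on its own. They only track a flag. */
int iocRunning = 1;
int iocRun(void)   { iocRunning = 1; return 0; }
int iocPause(void) { iocRunning = 0; return 0; }

/* Hypothetical status values reported to the RMT. */
typedef enum { PRR_STATUS_OK = 0, PRR_STATUS_BAD = 1 } prrStatus;

static prrStatus baseStatus = PRR_STATUS_OK;
static const char *lastSuspendedTask = NULL;

/* Hypothetical callback. In the real module a function like this would be
 * registered with the task watchdog for each EPICS task and called when
 * that task becomes suspended. */
void onTaskSuspended(const char *taskName)
{
    lastSuspendedTask = taskName;   /* kept for a getInfo-style report */
    baseStatus = PRR_STATUS_BAD;    /* this is what getStatus returns */
}

/* Hypothetical shape of the functions a PRR hands to the RMT. */
typedef struct {
    int (*start)(void);
    int (*stop)(void);
    prrStatus (*getStatus)(void);
    const char *(*getInfo)(void);
} prrOps;

static int baseStart(void)           { return iocRun(); }
static int baseStop(void)            { return iocPause(); }
static prrStatus baseGetStatus(void) { return baseStatus; }
static const char *baseGetInfo(void) { return lastSuspendedTask; }

/* The single PRR "base": start/stop act on the whole IOC at once. */
const prrOps basePrr = { baseStart, baseStop, baseGetStatus, baseGetInfo };
```

Because the status is only written from the callback and read by getStatus, nothing in this sketch needs a thread of its own.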
In this way, no additional task is necessary. This is the simple way to handle redundancy control. As you already mentioned, we normally stop or start all EPICS tasks together, so if iocRun() already starts the individual components in the right order, we don't have to care about this any more.

Concerning the field attribute red_update: several proposals exist to replace it. Up to now we have not used it at all, because the databases have not been very big and the processors have enough power to update all fields in the slave IOC. If it becomes necessary, the global field attribute would be the best solution in my opinion. Benjamin Franksen has proposed an extendable attribute list.
Gongfa is our main redundancy expert. Please ask him for details.

- Bernd

Artem Kazakov wrote:
> > From memory, the last message I got from Bernd implied that he didn't think the redundancy monitor needed control over the individual tasks, so I took out that detail and provided the iocRun and iocPause commands instead (in src/misc/iocInit.c), which start and stop the whole IOC at once. If you want a little more control over the subsystems you can use these functions, which are called by the above routines, but I don't have APIs giving control down to the individual thread level:
> >
> >     scanRun();
> >     dbCaRun();
> >     rsrv_run();
> >     rsrv_pause();
> >     dbCaPause();
> >     scanPause();
> >
> > If this is not sufficient, please let me know ASAP as I'm trying to put out the R3.14.10-pre1 pre-release version later this week or early next week.
>
> For me it seems to be sufficient. As far as I understand, for redundancy purposes, if we stop something then we have to stop everything else, and when we start we want everything to start.
>
> > Again, the last message from Bernd I thought said that you weren't actually using this field yet. If I have misunderstood that, please let me know ASAP.
>
> Indeed, it is not really used. The code that uses it is present, but it is permanently cut off by several if(FALSE) statements. I'm just commenting it out for now (otherwise the compiler complains about the missing red_update field). At first glance everything now seems to be OK for redundancy. At least I can compile, link and run the redundant IOC with the new base on my PC. I still need to test with two PCs to see whether it really works. I'll let you know if something is missing.
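For anyone who wants the finer-grained control mentioned in the quoted message, the six subsystem functions could be grouped as below. This is a sketch, not the code in iocInit.c: the six functions are stubbed so that they only record their names, the wrappers subsystemsRun() and subsystemsPause() are hypothetical, and the call order is an assumption taken from the order in which the functions are listed in the quoted message.

```c
#include <stddef.h>

/* Call log used by the stubs below, so the order can be inspected. */
#define MAX_CALLS 16
const char *callLog[MAX_CALLS];
size_t callCount = 0;

static void logCall(const char *name)
{
    if (callCount < MAX_CALLS)
        callLog[callCount++] = name;
}

/* Stubs named after the EPICS base functions. The real ones act on the
 * scan threads, database CA links and the CA server; these only log. */
void scanRun(void)    { logCall("scanRun"); }
void dbCaRun(void)    { logCall("dbCaRun"); }
void rsrv_run(void)   { logCall("rsrv_run"); }
void rsrv_pause(void) { logCall("rsrv_pause"); }
void dbCaPause(void)  { logCall("dbCaPause"); }
void scanPause(void)  { logCall("scanPause"); }

/* Hypothetical wrapper: bring the subsystems up, CA server last. */
void subsystemsRun(void)
{
    scanRun();
    dbCaRun();
    rsrv_run();
}

/* Hypothetical wrapper: pause in the reverse order, CA server first. */
void subsystemsPause(void)
{
    rsrv_pause();
    dbCaPause();
    scanPause();
}
```

Pausing in the reverse of the start order means clients are cut off before scanning stops, which matches the "stop everything, start everything" usage discussed in this thread.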