Hello,
I am writing to get your advice on managing system unavailability within EPICS 7. In our current operational model we can switch between different instruments seamlessly if one encounters an issue and becomes unavailable. This strategy was effective in previous EPICS versions. However, after migrating to EPICS 7, records reading data from systems that are no longer available stop processing, even if we don't rely on the data from that particular system to continue observing. The issue arises because broken CA links set the alarm severity (SEVR) to INVALID, the alarm status (STAT) to LINK, and halt the record processing of the downstream records. We want a mechanism to override the alarm severity when an instrument becomes unresponsive, ideally with minimal operator intervention.
We have identified three potential ways of achieving this:
- Maximize Severity Attribute: The idea was to use this link attribute (NMS/MS/MSI/MSS) to prevent the alarm from propagating, but it does not appear to provide what we need.
- SIMM Field: Setting the SIMM field to YES lets the record continue processing without being affected by the INVALID alarm severity. The SVAL field can be used to define a simulation value, and the SIMS field specifies the simulation mode alarm severity (NO_ALARM in our case). We have tested this approach and it seems to work well: STAT is set to SIMM and the downstream records process without problems.
- DISA Field: Setting DISA equal to DISV disables the record. The DISS field defines the record's disable severity (e.g. NO_ALARM). This approach also seems to work: STAT is set to DISABLE and the downstream records process without problems as well.
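To make the last two options concrete, here is a minimal database sketch of what such a configuration might look like. It is only an illustration, not our actual database: the record names (INST:temp, INST:simulate, INST:disable, OTHER:temp) are hypothetical, an ai record is assumed, and the SIML/SDIS links are just one possible way to let an operator flip the mode with a single write.

```
# Sketch only: all record names are hypothetical placeholders.

# Option 2: simulation mode. When SIMM is YES the record stops reading
# INP and uses SVAL instead, raising SIMM status with severity SIMS.
record(bo, "INST:simulate") {
    field(ZNAM, "NO")
    field(ONAM, "YES")
}
record(ai, "INST:temp") {
    field(SCAN, "1 second")
    field(INP,  "OTHER:temp CA")     # CA link to the instrument IOC
    field(SIML, "INST:simulate")     # SIMM is read from this record
    field(SVAL, "20.0")              # assumed placeholder value
    field(SIMS, "NO_ALARM")          # severity while in simulation mode
}

# Option 3: disable the record. When DISA equals DISV the record is
# not processed and gets DISABLE status with severity DISS.
record(bo, "INST:disable") {
    field(ZNAM, "Enabled")
    field(ONAM, "Disabled")
}
record(ai, "INST:temp2") {
    field(SCAN, "1 second")
    field(INP,  "OTHER:temp2 CA")
    field(SDIS, "INST:disable")      # DISA is read from this record
    field(DISV, "1")
    field(DISS, "NO_ALARM")
}
```

With a layout like this, the operator (or a watchdog record) would only need to write to INST:simulate or INST:disable when the instrument drops out; writing SIMM or DISA directly with caput would be an alternative.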
The SIMM field seems to be the most promising option. I would greatly appreciate your insights on this, as well as any alternative approaches that you might suggest.
Thank you,
Pedro