On Thursday 11 December 2008 00:39:31 Till Straumann wrote:
> Quoting the AppDevGuide (3.14.8, 3.14.10,
> Section 5.4 "Database Locking")
>
>   "All records linked via OUTLINKs and FWDLINKs
>    are placed in the same lock set.
>    Records linked via INLINKs with PROCESS_PASSIVE
>    or MAXIMIZE_SEVERITY TRUE are also forced to
>    be in the same lock set."
>
> I was naively concluding that records linked
> via INLINK NPP NMS would _not_ be forced to be
> in the same lock set.
I didn't, but I just did the research. The change occurred due to a commit to
src/db/dbLock.c between the R3-14-0-alpha-1 and R3-14-0-alpha-2 releases:
revision 1.29
date: 2001-04-05 14:56:22 +0000; author: mrk; state: Exp; lines: +76 -46;
add dbLockShowLocked
even scalar DB_LINKs forced into lock set
That commit included the following change in the code that calculates the
initial lockset configuration (with at least one other similar deletion
elsewhere in the file):
@@ -369,16 +363,6 @@ void epicsShareAPI dbLockInitRecords(dbB
         plink = (DBLINK *)((char *)precord + pdbFldDes->offset);
         if(plink->type != DB_LINK) continue;
         pdbAddr = (DBADDR *)(plink->value.pv_link.pvt);
-        /* The current record is in a different lockset -IF-
-         * 1. Input link
-         * 2. Not Process Passive
-         * 3. Not Maximize Severity
-         * 4. Not An Array Operation - single element only
-         */
-        if (pdbFldDes->field_type==DBF_INLINK
-        && !(plink->value.pv_link.pvlMask&pvlOptPP)
-        && !(plink->value.pv_link.pvlMask&pvlOptMS)
-        && pdbAddr->no_elements<=1) continue;
         dbLockSetMerge(precord,pdbAddr->precord);
     }
 }
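
To make the removed rule easier to reason about, here is a minimal standalone
sketch of the old test next to the current behavior. The type and function
names (link_info, old_rule_skips_merge, new_rule_skips_merge) are hypothetical
simplifications for illustration only; they are not the real dbLock.c
structures or API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for what the old test inspected. */
typedef struct {
    bool is_input_link;    /* field type was DBF_INLINK */
    bool process_passive;  /* pvlOptPP was set in pvlMask */
    bool maximize_sevr;    /* pvlOptMS was set in pvlMask */
    long no_elements;      /* element count of the target field */
} link_info;

/* True when the code before revision 1.29 would have skipped the
 * dbLockSetMerge() call, leaving the two records in different lock sets. */
bool old_rule_skips_merge(const link_info *l)
{
    assert(l != NULL);
    return l->is_input_link
        && !l->process_passive
        && !l->maximize_sevr
        && l->no_elements <= 1;
}

/* From revision 1.29 on, every DB_LINK is merged, so nothing is skipped. */
bool new_rule_skips_merge(const link_info *l)
{
    assert(l != NULL);
    return false;
}
```

With this sketch, a scalar INLINK that is NPP NMS is the only case where the
two rules disagree: the old rule skips the merge, the new one never does.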
The original behavior is not really a good idea because it allows a value to
be fetched from another record while that record is active. This works in
most circumstances, but it breaks whenever reading or writing the field value
isn't an atomic operation, in which case the read can return mangled data:
1. If the field being copied is a string.
2. If the CPU uses software floating point.
3. On SMP machines, if the field crosses a cache-line boundary.
The memory-ordering rules of many SMP machines actually require both a write
barrier and a read barrier to guarantee that the reading CPU sees data that
was written by another CPU. The only portable way of inserting those barriers
is with a mutex, which is what a lockset provides.
At this point I think the only reasonable fix is to update the documentation,
and tell people to use a .CA link to break a lockset. Local .CA links do not
traverse the network, so they're not quite as inefficient as they might seem.
Unfortunately this can break some R3.13 IOCs that get converted to R3.14, but
I don't see any way around that issue; any IOC developer doing a conversion
will have to look carefully at their locksets if they matter to that IOC.
- Andrew