Experimental Physics and
Industrial Control System


Subject:	database race condition?
From:	Till Straumann <[email protected]>
To:	[email protected]
Date:	Thu, 18 Mar 1999 15:07:16 +0100

One of our programmers consulted me because his device support module
deadlocked the database (i.e. one or two of the scanning tasks). This happened
when his module read a field using dbGetField(): the call tried to read a value
from the very record whose processing had triggered the current record's
processing:

                      FLNK
       Record 1   ---------> Record 2
         ^                      |
         |                      |
         ------------------------
               dbGetField()

The deadlock occurred because Record1 processing had not yet finished
when the FLNK was processed. Record1 was therefore still locked when
dbGetField() tried to lock it again.
(The two records are not necessarily scanned by the same task.)
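
The mechanism can be sketched outside EPICS. The following is a minimal illustration using POSIX threads, not EPICS code: `locked_record` and `relock_while_processing()` are hypothetical names, and it covers only the simplest case where both lock attempts come from the same task. It assumes the record lock is not recursive. An error-checking mutex is used so that the second lock attempt reports EDEADLK instead of hanging, which is what a plain non-recursive lock would do.

```c
#define _XOPEN_SOURCE 700   /* for PTHREAD_MUTEX_ERRORCHECK under strict modes */
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Hypothetical stand-in for a record and its lock; not an EPICS type. */
typedef struct {
    pthread_mutex_t lock;
} locked_record;

/* Returns the result of a second lock attempt made while the same
 * thread still holds the record lock. */
int relock_while_processing(void)
{
    locked_record rec1;
    pthread_mutexattr_t attr;
    int rc;

    rc = pthread_mutexattr_init(&attr);
    assert(rc == 0);
    rc = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
    assert(rc == 0);
    rc = pthread_mutex_init(&rec1.lock, &attr);
    assert(rc == 0);
    pthread_mutexattr_destroy(&attr);

    /* "Record1 processing" takes the lock ... */
    rc = pthread_mutex_lock(&rec1.lock);
    assert(rc == 0);

    /* ... then follows the FLNK into "Record2", whose device support
     * tries to lock Record1 again before reading its field. */
    rc = pthread_mutex_lock(&rec1.lock);

    pthread_mutex_unlock(&rec1.lock);
    pthread_mutex_destroy(&rec1.lock);
    return rc;   /* EDEADLK for an error-checking mutex */
}
```

With a default (non-checking) mutex the second pthread_mutex_lock() call would typically never return, which corresponds to the hung scanning task.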

Nevertheless, I remembered that a very similar configuration works well:

                   FLNK
       Record1 --------> Record2
          ^                   |
          |                   |
          ---------------------
                INP NPP

Here, Record2 obtains the value by dbGetLink() and no deadlock occurs.

Studying the source, I was surprised to learn that dbGetLink() (which calls dbGet)
not only does no record locking, but that EPICS does not seem to implement
anything like a `field read/write access mutex'. A task reading a database
link (using dbGetLink()) may therefore read an inconsistent value from a record,
as in the following race condition:

  1. Record A processing (in the context of a low-priority scanning task)
     starts writing a field.
  2. Record B processing starts (in the context of a high-priority task),
     preempting the processing of record A. Record B processing calls
     dbGetLink() to read the very field A is currently writing, so B gets
     only a partially written value.
  3. Record A completes writing the field and finishes processing.
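
The sequence above can be replayed deterministically in plain C. This is only a sketch under stated assumptions: `fake_record`, `write_bytes()` and `torn_read_demo()` are hypothetical names, `FIELD_SIZE` is an illustrative value, and the preemption is simulated by simply performing B's unlocked copy between the two halves of A's write rather than by real task scheduling.

```c
#include <string.h>

#define FIELD_SIZE 40   /* illustrative field size, an assumption */

/* Hypothetical stand-in for a record with one string field. */
typedef struct {
    char val[FIELD_SIZE];
} fake_record;

/* Copy bytes [from, to) of the new value into the field, as a slow
 * writer would do piece by piece. */
static void write_bytes(fake_record *rec, const char *src,
                        size_t from, size_t to)
{
    memcpy(rec->val + from, src + from, to - from);
}

/* Replays the race: 'seen' receives what the reader observed. */
void torn_read_demo(char seen[FIELD_SIZE])
{
    fake_record a;
    const char *newval = "BBBBBBBB";
    size_t len = strlen(newval) + 1;   /* include the terminator */

    memset(a.val, 0, sizeof a.val);
    strcpy(a.val, "AAAAAAAA");         /* old value of the field */

    write_bytes(&a, newval, 0, 4);     /* A starts writing the field    */
    memcpy(seen, a.val, FIELD_SIZE);   /* B preempts and reads unlocked */
    write_bytes(&a, newval, 4, len);   /* A completes writing the field */
}
```

After the call, `seen` holds "BBBBAAAA": neither the old nor the new value, but a mixture of the two.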

This race condition is indeed very simple to observe. I created two stringin
records, A and B. B is scanned at `.1 second', has its INP field set to
"A NPP", and uses the devSiSoft device support. A is scanned less frequently and
has a device support module which (artificially slowly) modifies its value field.
Observing B shows that the described race condition does occasionally occur.

Did I miss something? Wouldn't some finer grained locking than locking a whole
scanLock set make sense to prevent this kind of race condition?
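
As a rough illustration of what such finer-grained locking could look like, here is a sketch in plain C with POSIX threads. It is not an EPICS design: `locked_field`, `field_put()`, `field_get()` and `count_torn_reads()` are hypothetical names, and the sketch ignores issues such as priority inversion that a real-time implementation would have to address. Writer and reader both take a per-field mutex, so the reader can only ever see a complete old or new value.

```c
#include <pthread.h>
#include <string.h>

#define FIELD_SIZE 40   /* illustrative field size, an assumption */

/* Hypothetical field with its own read/write access mutex. */
typedef struct {
    pthread_mutex_t lock;
    char val[FIELD_SIZE];
    int stop;            /* guarded by lock; tells the writer to quit */
} locked_field;

static void field_put(locked_field *f, const char *s)
{
    pthread_mutex_lock(&f->lock);
    strncpy(f->val, s, FIELD_SIZE - 1);
    f->val[FIELD_SIZE - 1] = '\0';
    pthread_mutex_unlock(&f->lock);
}

static void field_get(locked_field *f, char out[FIELD_SIZE])
{
    pthread_mutex_lock(&f->lock);
    memcpy(out, f->val, FIELD_SIZE);
    pthread_mutex_unlock(&f->lock);
}

static int should_stop(locked_field *f)
{
    int s;
    pthread_mutex_lock(&f->lock);
    s = f->stop;
    pthread_mutex_unlock(&f->lock);
    return s;
}

/* Writer thread: keeps alternating between two values. */
static void *writer(void *arg)
{
    locked_field *f = arg;
    unsigned i = 0;
    while (!should_stop(f))
        field_put(f, (i++ & 1u) ? "AAAAAAAA" : "BBBBBBBB");
    return NULL;
}

/* Reads the field 'iterations' times while the writer runs and returns
 * how many reads saw a mixed value (-1 if the thread could not start). */
int count_torn_reads(int iterations)
{
    locked_field f;
    pthread_t tid;
    char seen[FIELD_SIZE];
    int torn = 0;

    pthread_mutex_init(&f.lock, NULL);
    memset(f.val, 0, sizeof f.val);
    f.stop = 0;
    field_put(&f, "AAAAAAAA");
    if (pthread_create(&tid, NULL, writer, &f) != 0)
        return -1;

    for (int n = 0; n < iterations; n++) {
        field_get(&f, seen);
        if (strcmp(seen, "AAAAAAAA") != 0 && strcmp(seen, "BBBBBBBB") != 0)
            torn++;
    }

    pthread_mutex_lock(&f.lock);
    f.stop = 1;
    pthread_mutex_unlock(&f.lock);
    pthread_join(tid, NULL);
    pthread_mutex_destroy(&f.lock);
    return torn;
}
```

Because every access holds the field's mutex, `count_torn_reads()` is expected to return 0 regardless of how the two threads interleave.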

Best regards.

Till Straumann (PTB/Bessy II, Berlin)

Replies:: Re: database race condition? Marty Kraimer; Re: database race condition? Marty Kraimer


ANJ, 10 Aug 2010
