EPICS Controls Argonne National Laboratory

Experimental Physics and
Industrial Control System

2002  2003  2004  2005  2006  2007  2008  2009  2010  2011  2012  2013  2014  2015  <20162017  2018  2019  2020  2021  2022  2023  2024  Index 2002  2003  2004  2005  2006  2007  2008  2009  2010  2011  2012  2013  2014  2015  <20162017  2018  2019  2020  2021  2022  2023  2024 
<== Date ==> <== Thread ==>

Subject: Bug in dbGet() with server-side plugins
From: Andrew Johnson <[email protected]>
To: EPICS core-talk <[email protected]>
Date: Fri, 12 Feb 2016 14:26:20 -0600
I'm working with Base-3.16 but this bug appears in 3.15 as well so the
fix will have to be applied there.

valgrind is reporting invalid reads when I access a subarray of a
waveform record with either caget -c or camonitor. My PV with filter is
'wf100:i32.[0:4]' where wf100:i32 has NELM=100 and currently contains 10
elements. The current array size reported by the client is also 10, when
it should be 5 with that filter.

I temporarily instrumented some of the routines to work out what's
happening:

> dbChannel_get_count(buffer_type=19, to=0x578b2a8, pfl=0x57e7548, len=100)
>     channel name: wf100:i32.[0:4]
>       field_type=DBF_LONG (4 bytes), dbr_type=DBF_LONG, 100 elements, 1 filter (0 pre eventq, 1 post eventq)
>         plugin arr, start=0, incr=1, end=4
>       final field_type=DBF_LONG (4B), 5 elements
> dbChannelGet(type=5, to=0xdfecbf0, pfl=0x57e7548, len=0)
>     channel name: wf100:i32.[0:4]
>       field_type=DBF_LONG (4 bytes), dbr_type=DBF_LONG, 100 elements, 1 filter (0 pre eventq, 1 post eventq)
>       final field_type=DBF_LONG (4B), 5 elements
> dbChannelGet(type=5, to=0x578b2b4, pfl=0x57e7548, len=100)
>     channel name: wf100:i32.[0:4]
>       field_type=DBF_LONG (4 bytes), dbr_type=DBF_LONG, 100 elements, 1 filter (0 pre eventq, 1 post eventq)
>       final field_type=DBF_LONG (4B), 5 elements
> getLongLong(pfield=0x57e7648, to=0x578b2b4, req=10, nelms=10, off=0)
> ==16723== Thread 20:
> ==16723== Invalid read of size 2
> ==16723==    at 0x4A09BC0: memmove (mc_replace_strmem.c:1026)
> ==16723==    by 0x4E78592: getLongLong (dbConvert.c:60)
> ==16723==    by 0x4E6C9CD: dbGet (dbAccess.c:920)
> ==16723==    by 0x4E703A8: dbChannelGetField (dbChannel.c:680)
> ==16723==    by 0x4E8386F: dbChannel_get_count (db_access.c:398)
> ==16723==    by 0x4EA533E: read_reply (camessage.c:586)
> ==16723==    by 0x4E7F9F5: event_task (dbEvent.c:942)
> ==16723==    by 0x535084B: start_routine (osdThread.c:414)
> ==16723==    by 0x373F807A50: start_thread (in /lib64/libpthread-2.12.so)
> ==16723==    by 0x373F0E893C: clone (in /lib64/libc-2.12.so)
> ==16723==  Address 0x57e7660 is 0 bytes after a block of size 24 client-defined
> ==16723==    at 0x533F099: freeListMalloc (freeListLib.c:129)
> ==16723==    by 0x533F225: freeListCalloc (freeListLib.c:71)
> ==16723==    by 0x4C45638: filter (arr.c:125)
> ==16723==    by 0x4E6FA55: dbChannelRunPostChain (dbChannel.c:590)
> ==16723==    by 0x4E7FAE7: event_task (dbEvent.c:938)
> ==16723==    by 0x535084B: start_routine (osdThread.c:414)
> ==16723==    by 0x373F807A50: start_thread (in /lib64/libpthread-2.12.so)
> ==16723==    by 0x373F0E893C: clone (in /lib64/libc-2.12.so)
> ==16723== 
> ==16723== Invalid read of size 2
> ==16723==    at 0x4A09BA8: memmove (mc_replace_strmem.c:1026)
> ==16723==    by 0x4E78592: getLongLong (dbConvert.c:60)
> ==16723==    by 0x4E6C9CD: dbGet (dbAccess.c:920)
> ==16723==    by 0x4E703A8: dbChannelGetField (dbChannel.c:680)
> ==16723==    by 0x4E8386F: dbChannel_get_count (db_access.c:398)
> ==16723==    by 0x4EA533E: read_reply (camessage.c:586)
> ==16723==    by 0x4E7F9F5: event_task (dbEvent.c:942)
> ==16723==    by 0x535084B: start_routine (osdThread.c:414)
> ==16723==    by 0x373F807A50: start_thread (in /lib64/libpthread-2.12.so)
> ==16723==    by 0x373F0E893C: clone (in /lib64/libc-2.12.so)
> ==16723==  Address 0x57e7662 is 2 bytes after a block of size 24 client-defined
> ==16723==    at 0x533F099: freeListMalloc (freeListLib.c:129)
> ==16723==    by 0x533F225: freeListCalloc (freeListLib.c:71)
> ==16723==    by 0x4C45638: filter (arr.c:125)
> ==16723==    by 0x4E6FA55: dbChannelRunPostChain (dbChannel.c:590)
> ==16723==    by 0x4E7FAE7: event_task (dbEvent.c:938)
> ==16723==    by 0x535084B: start_routine (osdThread.c:414)
> ==16723==    by 0x373F807A50: start_thread (in /lib64/libpthread-2.12.so)
> ==16723==    by 0x373F0E893C: clone (in /lib64/libc-2.12.so)

This shows that dbConvert.c::getLongLong() is being asked to fetch 10
elements from the arr filter's buffer that only has 5 elements in it.
This happens because dbAccess.c::dbGet() first sets field_type and
no_elements from either paddr or pfl, but later on it calls
prset->get_array_info() so the record can update no_elements, thereby
overwriting the value that came from the filter (i.e. from pfl).

The fix is to move the call to prset->get_array_info() inside the first
if (!pfl) block higher up [or to suppress it using the same conditions,
but I'd rather move it] — the attached patch works.

I would like to create a test to demonstrate the bug before applying my
fix, but I'm not sure where the right place would be for it:

1. std/filters/test/arrTest.cpp doesn't call dbChannelGetField() at all,
it just uses dbChannelRunPostChain() and examines the data in the
db_field_log, and this doesn't feel quite the right place anyway.
2. ioc/db/test/dbChannelTest.c is too low-level and doesn't have any
suitable filters.
3. dbCaLinkTest.c might not have the array filter registered, and this
isn't really a test of the dbCa subsystem anyway.

Ideas/suggestions?

- Andrew

-- 
There are only two hard problems in distributed systems:
  2. Exactly-once delivery
  1. Guaranteed order of messages
  2. Exactly-once delivery
 -- Mathias Verraes
=== modified file 'src/ioc/db/dbAccess.c'
--- src/ioc/db/dbAccess.c	2015-09-07 04:15:21 +0000
+++ src/ioc/db/dbAccess.c	2016-02-12 18:58:45 +0000
@@ -813,7 +813,7 @@
     void *pbuffer, long *options, long *nRequest, void *pflin)
 {
     char *pbuf = pbuffer;
-    void *pfieldsave;
+    void *pfieldsave = paddr->pfield;
     db_field_log *pfl = (db_field_log *)pflin;
     short field_type;
     long no_elements;
@@ -829,9 +829,18 @@
     if (!pfl || pfl->type == dbfl_type_rec) {
         field_type  = paddr->field_type;
         no_elements = paddr->no_elements;
+
+        /* Update field info from record */
+        if (paddr->pfldDes->special == SPC_DBADDR &&
+            (prset = dbGetRset(paddr)) &&
+            prset->get_array_info) {
+            status = prset->get_array_info(paddr, &no_elements, &offset);
+        } else
+            offset = 0;
     } else {
         field_type  = pfl->field_type;
         no_elements = pfl->no_elements;
+        offset = 0;
     }
 
     if (field_type >= DBF_INLINK && field_type <= DBF_FWDLINK)
@@ -849,21 +858,6 @@
         return S_db_badDbrtype;
     }
 
-    /* For SPC_DBADDR fields, the rset function
-     * get_array_info() is allowed to modify
-     * paddr->pfield.  So we store the original
-     * value and restore it later.
-     */
-    pfieldsave = paddr->pfield;
-
-    /* Update field info */
-    if (paddr->pfldDes->special == SPC_DBADDR &&
-        (prset = dbGetRset(paddr)) &&
-        prset->get_array_info) {
-        status = prset->get_array_info(paddr, &no_elements, &offset);
-    } else
-        offset = 0;
-
     if (offset == 0 && (!nRequest || no_elements == 1)) {
         if (nRequest) *nRequest = 1;
         if (!pfl || pfl->type == dbfl_type_rec) {


Replies:
Re: Bug in dbGet() with server-side plugins Michael Davidsaver

Navigate by Date:
Prev: Re: Build failed in Jenkins: epics-base-3.16-linux32-test #13 Andrew Johnson
Next: Re: Bug in dbGet() with server-side plugins Michael Davidsaver
Index: 2002  2003  2004  2005  2006  2007  2008  2009  2010  2011  2012  2013  2014  2015  <20162017  2018  2019  2020  2021  2022  2023  2024 
Navigate by Thread:
Prev: Re: Build failed in Jenkins: epics-base-3.16-linux32-test #13 Michael Davidsaver
Next: Re: Bug in dbGet() with server-side plugins Michael Davidsaver
Index: 2002  2003  2004  2005  2006  2007  2008  2009  2010  2011  2012  2013  2014  2015  <20162017  2018  2019  2020  2021  2022  2023  2024 
ANJ, 16 Feb 2016 Valid HTML 4.01! · Home · News · About · Base · Modules · Extensions · Distributions · Download ·
· Search · EPICS V4 · IRMIS · Talk · Bugs · Documents · Links · Licensing ·