EPICS Controls Argonne National Laboratory

Experimental Physics and
Industrial Control System


Subject: [Bug 1783475] Re: const link support can't handle escaped charactors
From: Andrew Johnson via Core-talk <core-talk at aps.anl.gov>
To: core-talk at aps.anl.gov
Date: Mon, 27 Jul 2020 05:43:46 -0000
This bug is a symptom of the way that we currently parse both
field(name, value) and info(name, value) statements in a .db file. It
looks a little tricky to fix, but I am working on it and have just
reached it as part of my JSON5 changes. I will describe what is
happening and how JSON5 changes things, then propose a way forward.

In the dbStatic parser's dbRecordField() and dbRecordInfo() routines,
the <value> is now expected to be JSON encoded; the dbLex.l & dbYacc.c
code has already made sure of that by the time these routines get
called. The code in both routines looks at the first character of
<value> and if it's a double-quote it strips that and the last character
(which must also be a double-quote to pass the lexer). The string is
then filtered through dbTranslateEscape(), and the result given to
dbPutString() to actually set the field value.

The example in this bug description shows that the processing of <value>
described above is incorrect. We should not be translating escaped
characters that appear inside a JSON map since the yajl parser in
dbConstLink.c will do the translation for us later on. However we do
need to translate any escapes inside simple quoted string values because
they don't get passed through yajl at all.

A simple fix is thus to only call dbTranslateEscape() when the value was
a quoted string, and this change does solve the example shown in the
bug. However, since we say these values are JSON, we should also accept
unicode escapes of the form \u201c, which dbTranslateEscape() doesn't
understand.
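
The difference between the current behaviour and the simple fix can be
sketched in a few lines of C. This is only an illustration under stated
assumptions: translate_escapes() is a hypothetical, much reduced
stand-in for dbTranslateEscape(), and prepare_value() is a hypothetical
helper, not the real dbRecordField() code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for dbTranslateEscape(): only \n and \t get a
 * special meaning; any other escaped character is copied without its
 * back-slash. dst may be the same buffer as src because the output is
 * never longer than the input. Returns the output length. */
static size_t translate_escapes(char *dst, const char *src)
{
    size_t n = 0;

    assert(dst != NULL && src != NULL);
    while (*src) {
        char c = *src++;
        if (c == '\\' && *src) {
            c = *src++;
            if (c == 'n') c = '\n';
            else if (c == 't') c = '\t';
        }
        dst[n++] = c;
    }
    dst[n] = '\0';
    return n;
}

/* Hypothetical helper: prepare <value> before it is stored.
 * quoted_only == 0: translate every value (the behaviour described above).
 * quoted_only != 0: translate only simple quoted strings (the simple fix).
 * dst must have room for strlen(value) + 1 bytes. */
static void prepare_value(char *dst, const char *value, int quoted_only)
{
    size_t len = strlen(value);
    int quoted = (len >= 2 && value[0] == '"' && value[len - 1] == '"');

    if (quoted) {
        /* Strip the leading and trailing double-quote. */
        memcpy(dst, value + 1, len - 2);
        dst[len - 2] = '\0';
    } else {
        memcpy(dst, value, len + 1);
    }
    if (quoted || !quoted_only)
        translate_escapes(dst, dst);
}
```

With quoted_only set to 0, the value {const:"multi\nline"} ends up
containing a raw newline, which is what the later JSON parse rejects.
With quoted_only set to 1 the map is passed on untouched, while a simple
quoted string still has its escapes translated.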

Also, and nobody has actually reported this yet, before I introduced the
JSON parser it was possible to use the C octal and hex escape formats
\ooo and \xXX inside quoted string values, which dbTranslateEscape()
does understand. However they don't work any more because the JSON lexer
rejects them before they can reach the dbRecordField() routine. JSON
only accepts a back-slash before a very specific set of characters so
back-slash followed by an x or a digit in a string causes the parser to
abort.
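
To make the "very specific set of characters" concrete, here is a small
sketch of the check a strict JSON lexer applies after a back-slash. The
function names are hypothetical and this is not the yajl code; it also
does not validate the four hex digits that must follow \u.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* True if standard JSON allows the character c directly after a
 * back-slash: one of  " \ / b f n r t u  */
static bool json_escape_char_ok(char c)
{
    return c != '\0' && strchr("\"\\/bfnrtu", c) != NULL;
}

/* Scan the body of a string (without its surrounding quotes) and
 * report whether every back-slash is followed by a legal character.
 * A trailing lone back-slash is rejected as well. */
static bool json_escapes_ok(const char *body)
{
    assert(body != NULL);
    for (; *body; body++) {
        if (*body == '\\') {
            body++;
            if (!json_escape_char_ok(*body))
                return false;
        }
    }
    return true;
}
```

Under this check "multi\nline" and "\u201c" pass, while the C forms
"\x41" and "\101" are rejected, which matches the behaviour described
above.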

JSON5 however allows any character to be escaped; the back-slash will be
dropped if the combination has no special meaning. In addition to the \u
followed by 4 hex digits which JSON accepts, JSON5 also accepts \x
followed by 2 hex digits (although I just discovered that my YAJL
changes for JSON5 don't implement this so I've now got to go back and
fix that too). JSON5 does not accept C's octal escape sequences \ooo at
all, but I doubt if anyone will particularly miss them nowadays.

Our dbTranslateEscape() code doesn't implement quite the same rules as
JSON5, although it's pretty close if we don't care about the unicode
form. The differences are that we translate "\a" to the C alert (BEL)
character, and "\0" through "\7" introduce an octal numeric escape of up
to 3 digits, whereas JSON5 says "\a" should produce "a", "\0" should
produce a zero byte, and "\1" through "\7" generate "1" through "7"
respectively. Our "\x" parsing also looks very suspicious as it doesn't
limit itself to just 2 hex digits as JSON5 requires.
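
The effect of the digit limit on "\x" can be shown with a small sketch.
parse_hex_escape() is a hypothetical helper written for this
illustration; it is not taken from dbTranslateEscape() or yajl, and the
unlimited mode only models the kind of behaviour described above.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Parse the hex digits that follow "\x".
 * p          points at the first character after the 'x'
 * max_digits upper bound on digits to consume, 0 meaning no limit
 * value      receives the accumulated number
 * Returns the number of digits consumed. */
static size_t parse_hex_escape(const char *p, size_t max_digits,
                               unsigned long *value)
{
    size_t n = 0;
    unsigned long v = 0;

    assert(p != NULL && value != NULL);
    while (isxdigit((unsigned char)p[n]) &&
           (max_digits == 0 || n < max_digits)) {
        char c = p[n];
        int d = isdigit((unsigned char)c)
                    ? c - '0'
                    : tolower((unsigned char)c) - 'a' + 10;
        v = v * 16 + (unsigned long)d;
        n++;
    }
    *value = v;
    return n;
}
```

For the input "\x41BC" a parser limited to 2 digits consumes "41" and
leaves "BC" as ordinary text, giving "ABC". An unlimited parser consumes
all four digits and produces a single value 0x41BC, which no longer fits
in one byte.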


So to summarize, there are three options:

An incomplete but quick fix would be to move the dbTranslateEscape()
call to inside the code that strips the leading and trailing quotes from
a simple string value. This solves some issues, but isn't complete.

A better fix can come from my work adding JSON5, but we still have to
decide if we care about unicode escapes. If we don't, moving the
dbTranslateEscape() call (and fixing its \x parsing) is relatively easy.

For fully compliant handling I would probably add another yajl parser to
dbLexRoutines() and use that to translate the quoted string value.

If we picked the middle option, the differences between the two translators would show up here:
    record(stringin, "s1") {
        field(INP, {const:"string-with-escapes"})
        field(DESC, "string-with-escapes")
    }
The s1.VAL field gets translated by yajl following the JSON5 rules while parsing the const input link. The s1.DESC field would be translated by dbTranslateEscape().

-- 
You received this bug notification because you are a member of EPICS
Core Developers, which is subscribed to EPICS Base.
Matching subscriptions: epics-core-list-subscription
https://bugs.launchpad.net/bugs/1783475

Title:
  const link support can't handle escaped charactors

Status in EPICS Base:
  Triaged
Status in EPICS Base 3.16 series:
  Won't Fix
Status in EPICS Base 7.0 series:
  Triaged

Bug description:
  I think the following should work, but it doesn't.  The culprit seems
  to be the call to dbTranslateEscape() in dbRecordField() and
  dbRecordInfo().  The escaped newline is handled correctly by dbLex,
  but the later dbJLinkParse() errors on the now unescaped newline.

  > record(stringin, "test") {
  >      field(INP, {const:"multi\nline"})
  > }

  > dbJLinkInit: lexical error: invalid character inside string.
  >                          {"const":"multi line"}
  >                      (right here) ------^

To manage notifications about this bug go to:
https://bugs.launchpad.net/epics-base/+bug/1783475/+subscriptions
