Benjamin Franksen wrote:
On Wednesday 26 October 2005 00:33, Jeff Hill wrote:
Attached is a rough cut at the EPICS V4 CA protocol specification.
Hi Jeff,
Just one little thought about STRING data type:
UINTN           the number of UTF-8 tokens
OCTET sequence  UTF-8 encoded character string sequence
I take it that 'number of UTF-8 tokens' means 'number of octets', right?
Maybe it would be worthwhile to consider adding a 'number
of /characters/' count in addition to the byte count. This could
improve performance, particularly when converting to other encodings on
the client side. Of course any gain must be offset against the
increased protocol overhead.
The same applies of course to any library string representations.
Ben
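
[To illustrate the distinction Ben is drawing, here is a minimal Java sketch.
The readUtf8Field helper and the exact field layout are assumptions based on
the quoted draft, not actual CA code; the point is only that the octet count
prefixing the field is generally not the same as the number of characters the
receiver ends up with.]

import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class Utf8FieldDemo {
    // Hypothetical helper: reads a string field laid out as
    // <octet count><UTF-8 octets>, following the quoted draft
    // (taking "UTF-8 tokens" to mean octets, per Ben's reading).
    static String readUtf8Field(ByteBuffer buf) {
        int octetCount = buf.getInt();   // UINTN: number of UTF-8 octets
        byte[] octets = new byte[octetCount];
        buf.get(octets);                 // OCTET sequence: the encoded text
        return new String(octets, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // "Ångström": 8 characters, but 10 UTF-8 octets.
        String original = "\u00C5ngstr\u00F6m";
        byte[] utf8 = original.getBytes(StandardCharsets.UTF_8);

        ByteBuffer buf = ByteBuffer.allocate(4 + utf8.length);
        buf.putInt(utf8.length).put(utf8).flip();

        String decoded = readUtf8Field(buf);
        // A client converting to another encoding cannot size its output
        // from the octet count alone.
        System.out.println(utf8.length + " octets, "
                + decoded.codePointCount(0, decoded.length()) + " characters");
    }
}
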
Java 5 uses 16 bits for char, which is not sufficient to encode all
Unicode characters.
It uses two consecutive chars (a surrogate pair) to hold a Unicode
character that does not fit in 16 bits.
At least some C/C++ implementations use 32 bits for wchar_t, which is
sufficient for all Unicode characters.
But what if an implementation uses only 16 bits?
So how would the number of characters in a UTF-8 string be used?
Better to just let the final sender/receiver of the character string handle it.
Marty
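
[To make the point above concrete, a short sketch in plain Java, not tied to
any CA code: for a single supplementary character, the "number of characters"
depends entirely on which unit you count.]

import java.nio.charset.StandardCharsets;

public class CharCountDemo {
    public static void main(String[] args) {
        // U+1D11E (MUSICAL SYMBOL G CLEF) lies outside the 16-bit range,
        // so Java stores it as a surrogate pair of two chars.
        String clef = new String(Character.toChars(0x1D11E));

        System.out.println("UTF-16 code units:  " + clef.length());                           // 2
        System.out.println("Unicode characters: " + clef.codePointCount(0, clef.length()));   // 1
        System.out.println("UTF-8 octets:       " + clef.getBytes(StandardCharsets.UTF_8).length); // 4
    }
}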