Posted to commits@daffodil.apache.org by "Mike Beckerle (Jira)" <ji...@apache.org> on 2021/01/07 20:56:00 UTC

[jira] [Updated] (DAFFODIL-1541) xs:hexBinary with dfdl:lengthKind="delimited" should be restricted to SBCS (single byte character set)

     [ https://issues.apache.org/jira/browse/DAFFODIL-1541?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mike Beckerle updated DAFFODIL-1541:
------------------------------------
    Priority: Minor  (was: Major)

> xs:hexBinary with dfdl:lengthKind="delimited" should be restricted to SBCS (single byte character set)
> ------------------------------------------------------------------------------------------------------
>
>                 Key: DAFFODIL-1541
>                 URL: https://issues.apache.org/jira/browse/DAFFODIL-1541
>             Project: Daffodil
>          Issue Type: Bug
>          Components: Back End, Front End
>            Reporter: Steve Lawrence
>            Priority: Minor
>
> Say we have something like:
> {code}
> <xs:element name="foo" type="xs:hexBinary" dfdl:occursCountKind="parsed" dfdl:separator="multi-byte utf-8 character" dfdl:encoding="UTF-8" />
> {code}
> We currently don't allow this, but perhaps it should be allowed? Delimiters would be scanned using the specified encoding, and the data up to the delimiter would then be converted to hexBinary data. Does it make sense to allow someone to specify an encoding that is not a single-byte character set, for example one with multi-byte characters such as UTF-8, or perhaps even a non-byte-size encoding? If we allow non-byte-size encodings, is it then an error if the data consumed does not have a bit length divisible by 8?
> The specification is not clear on how this should be handled. Right now, we just require that the encoding be ISO-8859-1 for delimited hex binary.
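
To make the question above concrete, here is a minimal Python sketch of the proposed behaviour. It is not Daffodil code and does not use any Daffodil API; the function names (split_hex_binary, latin1_round_trips, check_whole_bytes) and the choice of U+00A7 as the separator are assumptions made only for illustration. It scans for the separator as encoded bytes, hex-encodes the data before it, shows why ISO-8859-1 is a safe choice for scanning binary data, and shows the divisible-by-8 check that a non-byte-size encoding would need.

```python
from typing import List


def split_hex_binary(data: bytes, separator: str, encoding: str = "utf-8") -> List[str]:
    """Split raw bytes on the encoded separator and hex-encode each piece.

    Hypothetical helper: the scan is done on bytes, so it never needs to
    decode the binary payload itself as text.
    """
    delim = separator.encode(encoding)
    if not delim:
        raise ValueError("separator must not be empty")
    return [piece.hex().upper() for piece in data.split(delim)]


def latin1_round_trips() -> bool:
    """Every byte value maps to exactly one ISO-8859-1 character and back."""
    all_bytes = bytes(range(256))
    return all_bytes.decode("iso-8859-1").encode("iso-8859-1") == all_bytes


def check_whole_bytes(bit_length: int) -> None:
    """Reject consumed data whose bit length is not a multiple of 8."""
    if bit_length % 8 != 0:
        raise ValueError(
            "hexBinary needs a whole number of bytes, got %d bits" % bit_length
        )


if __name__ == "__main__":
    # U+00A7 encodes to the two bytes C2 A7 in UTF-8.
    payload = b"\x01\x02" + "\u00a7".encode("utf-8") + b"\xff"
    print(split_hex_binary(payload, "\u00a7"))  # ['0102', 'FF']
    print(latin1_round_trips())  # True
```

Two caveats apply to this sketch. A byte-level scan will also match wherever the binary payload happens to contain the separator's byte sequence, which is true for any delimited binary data. And decoding arbitrary binary data as UTF-8 can fail outright (a lone 0xFF byte is not valid UTF-8), whereas ISO-8859-1 accepts every byte value, which is consistent with the current restriction described above.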



--
This message was sent by Atlassian Jira
(v8.3.4#803005)