Posted to dev@uima.apache.org by "Mario Juric (Jira)" <de...@uima.apache.org> on 2019/09/24 10:45:00 UTC

[jira] [Created] (UIMA-6128) Allow XMI to be optionally serialized with XML 1.1 instead of only 1.0

Mario Juric created UIMA-6128:
---------------------------------

             Summary: Allow XMI to be optionally serialized with XML 1.1 instead of only 1.0
                 Key: UIMA-6128
                 URL: https://issues.apache.org/jira/browse/UIMA-6128
             Project: UIMA
          Issue Type: New Feature
          Components: UIMA
            Reporter: Mario Juric


XML 1.0 cannot represent some Unicode characters, such as most control characters, so a CAS containing them may need normalization or cleanup before it can be serialized to XMI. However, requirements do not always allow such characters to be removed from the CAS entirely. It can also be impossible to do this normalization or cleanup without fully reprocessing the data, for example when converting data already stored as compressed binaries to XMI. Being able to optionally select XML 1.1 instead of the default XML 1.0 would give some users an easier way around many of these Unicode issues.
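
To make the difference concrete, here is a minimal sketch in plain Java (no UIMA dependency) of the character ranges defined by the Char productions of XML 1.0 and XML 1.1, plus the RestrictedChar production of XML 1.1. The class and method names are hypothetical and are not part of UIMA. Note that XML 1.1 still excludes U+0000, the surrogate block, U+FFFE and U+FFFF, so it would relieve many but not all such problems.

```java
public class XmlCharCheck {

    /** True if the code point matches the Char production of XML 1.0. */
    public static boolean isXml10Char(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
            || (cp >= 0x20 && cp <= 0xD7FF)
            || (cp >= 0xE000 && cp <= 0xFFFD)
            || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    /** True if the code point matches the Char production of XML 1.1. */
    public static boolean isXml11Char(int cp) {
        return (cp >= 0x1 && cp <= 0xD7FF)
            || (cp >= 0xE000 && cp <= 0xFFFD)
            || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    /**
     * True if the code point is a RestrictedChar in XML 1.1, i.e. it is
     * allowed only when written as a character reference.
     */
    public static boolean isXml11Restricted(int cp) {
        return (cp >= 0x1 && cp <= 0x8) || cp == 0xB || cp == 0xC
            || (cp >= 0xE && cp <= 0x1F)
            || (cp >= 0x7F && cp <= 0x84)
            || (cp >= 0x86 && cp <= 0x9F);
    }

    /**
     * Returns the char index of the first code point that XML 1.0 cannot
     * represent, or -1 if the whole text is representable.
     */
    public static int firstInvalidForXml10(String text) {
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            if (!isXml10Char(cp)) {
                return i;
            }
            i += Character.charCount(cp);
        }
        return -1;
    }

    public static void main(String[] args) {
        // A form feed (U+000C) is a typical leftover from converted documents.
        String text = "form\ffeed";
        int index = firstInvalidForXml10(text);
        System.out.println("first index not representable in XML 1.0: " + index);
        if (index >= 0) {
            int cp = text.codePointAt(index);
            System.out.println("representable in XML 1.1: " + isXml11Char(cp));
            System.out.println("needs a character reference: " + isXml11Restricted(cp));
        }
    }
}
```

A check like this could be used to estimate how much existing data is affected before deciding between cleanup and a different XML version.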

See also discussion on the UIMA mailing list:

https://lists.apache.org/thread.html/7f8124b7be9ea20ab21dc616243e5661a0b7668a856532031fda71e3@%3Cuser.uima.apache.org%3E

This feature request proposes introducing an additional SerialFormat, e.g. XMI_1_1, that can be passed as the format parameter to the CasIOUtils.save methods.
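
The idea of choosing the XML version through a format parameter can be sketched without UIMA as follows. Everything here is a hypothetical stand-in: the Format enum only mimics the role of SerialFormat, the save method only mimics the shape of CasIOUtils.save, and the "sofa" element name is made up for illustration. The sketch writes a single text element and either rejects or escapes characters depending on the selected version.

```java
public class XmiVersionSketch {

    /** Hypothetical stand-in for a serial format enum; not the UIMA type. */
    public enum Format {
        XMI("1.0"),
        XMI_1_1("1.1");

        final String xmlVersion;

        Format(String xmlVersion) {
            this.xmlVersion = xmlVersion;
        }
    }

    /** Characters that XML 1.1 accepts only as character references. */
    static boolean isXml11Restricted(int cp) {
        return (cp >= 0x1 && cp <= 0x8) || cp == 0xB || cp == 0xC
            || (cp >= 0xE && cp <= 0x1F)
            || (cp >= 0x7F && cp <= 0x84)
            || (cp >= 0x86 && cp <= 0x9F);
    }

    /** Characters that neither XML version can represent at all. */
    static boolean isNeverAllowed(int cp) {
        return cp == 0x0 || (cp >= 0xD800 && cp <= 0xDFFF)
            || cp == 0xFFFE || cp == 0xFFFF;
    }

    /** Serializes the text into one element, using the selected XML version. */
    public static String save(String text, Format format) {
        StringBuilder out = new StringBuilder();
        out.append("<?xml version=\"").append(format.xmlVersion)
           .append("\" encoding=\"UTF-8\"?>");
        out.append("<sofa>");
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            i += Character.charCount(cp);
            if (isNeverAllowed(cp)) {
                throw new IllegalArgumentException(
                    "U+" + Integer.toHexString(cp) + " is not allowed in XML");
            }
            boolean c0Control = cp < 0x20 && cp != 0x9 && cp != 0xA && cp != 0xD;
            if (format == Format.XMI && c0Control) {
                // XML 1.0 has no way to express these, even as references.
                throw new IllegalArgumentException(
                    "U+" + Integer.toHexString(cp) + " is not allowed in XML 1.0");
            }
            if (cp == '<') {
                out.append("&lt;");
            } else if (cp == '>') {
                out.append("&gt;");
            } else if (cp == '&') {
                out.append("&amp;");
            } else if (format == Format.XMI_1_1 && isXml11Restricted(cp)) {
                // XML 1.1 requires a character reference here.
                out.append("&#x").append(Integer.toHexString(cp)).append(';');
            } else {
                out.appendCodePoint(cp);
            }
        }
        out.append("</sofa>");
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(save("a\fb", Format.XMI_1_1));
        try {
            save("a\fb", Format.XMI);
        } catch (IllegalArgumentException e) {
            System.out.println("XML 1.0 rejected the text: " + e.getMessage());
        }
    }
}
```

Keeping XML 1.0 as a separate, default format means existing callers and consumers would see no change unless they opt in to the new value.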

--
This message was sent by Atlassian Jira
(v8.3.4#803005)