Posted to j-dev@xerces.apache.org by ji...@apache.org on 2004/04/23 19:39:53 UTC

[jira] Closed: (XERCESJ-823) [XML 1.0] - E27: Must reject non-shortest forms in UTF-8

Message:

   The following issue has been closed.

   Resolver: Michael Glavassevich
       Date: Fri, 23 Apr 2004 10:38 AM

Closed.
---------------------------------------------------------------------
View the issue:
  http://issues.apache.org/jira/browse/XERCESJ-823

Here is an overview of the issue:
---------------------------------------------------------------------
        Key: XERCESJ-823
    Summary: [XML 1.0] - E27: Must reject non-shortest forms in UTF-8
       Type: Bug

     Status: Closed
 Resolution: FIXED

    Project: Xerces2-J
 Components: 
             Other
   Versions:
             2.5.0

   Assignee: Michael Glavassevich
   Reporter: Michael Glavassevich

    Created: Mon, 10 Nov 2003 8:03 PM
    Updated: Fri, 23 Apr 2004 10:38 AM
Environment: Operating System: All
Platform: All

Description:
E27 [1] states that "it is a fatal error if an entity encoded in UTF-8 contains 
any irregular code unit sequences, as defined in Unicode 3.1".  I had a look at 
this erratum some time ago, and in addition to treating irregular code unit
sequences as a fatal error, we should also reject non-shortest forms. These
non-shortest forms (such as C0 80 or E0 80 80, which both correspond to code
point 0) are not legal in Unicode 3.1. See "UTF-8 Corrigendum" and "Table 3.1B.
Legal UTF-8 Byte Sequences" of Unicode 3.1 [3].

[1] http://www.w3.org/XML/xml-V10-2e-errata#E27
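
To illustrate what rejecting non-shortest forms means in practice, here is a
minimal sketch in Java. It is not the Xerces implementation; the class and
method names (Utf8ShortestFormCheck, isShortestForm) are hypothetical and exist
only for this example. It checks each multi-byte sequence against the smallest
code point that actually requires that many bytes, and it also rejects
malformed or truncated sequences and values above U+10FFFF. It does not attempt
to detect the irregular code unit sequences that E27 mentions.

```java
final class Utf8ShortestFormCheck {

    private Utf8ShortestFormCheck() {}

    /**
     * Returns true if every byte sequence in the input is structurally valid
     * UTF-8, uses the shortest possible form, and encodes a value no greater
     * than U+10FFFF. Hypothetical helper, for illustration only.
     */
    static boolean isShortestForm(byte[] in) {
        int i = 0;
        while (i < in.length) {
            int b0 = in[i] & 0xFF;
            if (b0 < 0x80) {          // single-byte (ASCII) sequence
                i++;
                continue;
            }
            int extra;                // continuation bytes expected
            int min;                  // smallest code point needing this length
            int cp;                   // code point accumulated so far
            if ((b0 & 0xE0) == 0xC0) {
                extra = 1; min = 0x80;    cp = b0 & 0x1F;
            } else if ((b0 & 0xF0) == 0xE0) {
                extra = 2; min = 0x800;   cp = b0 & 0x0F;
            } else if ((b0 & 0xF8) == 0xF0) {
                extra = 3; min = 0x10000; cp = b0 & 0x07;
            } else {
                return false;         // stray continuation or invalid lead byte
            }
            if (i + extra >= in.length) {
                return false;         // sequence is truncated
            }
            for (int k = 1; k <= extra; k++) {
                int b = in[i + k] & 0xFF;
                if ((b & 0xC0) != 0x80) {
                    return false;     // not a continuation byte
                }
                cp = (cp << 6) | (b & 0x3F);
            }
            if (cp < min || cp > 0x10FFFF) {
                return false;         // non-shortest form or out of range
            }
            i += extra + 1;
        }
        return true;
    }
}
```

With this sketch, the two examples from the description (C0 80 and E0 80 80)
are rejected because the decoded value 0 is below the minimum for a two-byte
or three-byte sequence, while the single byte 00 is accepted.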

