Posted to j-dev@xerces.apache.org by ji...@apache.org on 2004/04/12 23:26:09 UTC

[jira] Updated: (XERCERJ-460) Xerces J2 is not correctly treating UTF-8 encoded characters in patterns.

The following issue has been updated:

    Updater: Serge Knystautas (mailto:sergek@lokitech.com)
       Date: Mon, 12 Apr 2004 2:25 PM
    Changes:
             Attachment changed from notEuros.xsd
    ---------------------------------------------------------------------
For a full history of the issue, see:

  http://issues.apache.org/jira/browse/XERCERJ-460?page=history

---------------------------------------------------------------------
View the issue:
  http://issues.apache.org/jira/browse/XERCERJ-460

Here is an overview of the issue:
---------------------------------------------------------------------
        Key: XERCERJ-460
    Summary: Xerces J2 is not correctly treating UTF-8 encoded characters in patterns.
       Type: Bug

     Status: Resolved
 Resolution: INCOMPLETE

    Project: Xerces2-J

   Assignee: Xerces-J Developers Mailing List
   Reporter: Reuben Wright

    Created: Wed, 18 Sep 2002 1:36 PM
    Updated: Mon, 12 Apr 2004 2:25 PM
Environment: Operating System: Linux
Platform: PC

Description:
Xerces J2, and Xerces J1, are not correctly treating UTF-8 encoded characters in
patterns.

The errant behaviour shows up when a pattern contains the euro character
(files attached).  The schema pattern is recognised if the euro is written as
an entity reference, but a UTF-8 encoded euro character is split into two
characters and the file is validated as though the pattern consisted of these
two characters, rather than the single UTF-8 encoded euro character.
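
One way a symptom like this could arise, offered only as an illustration and
not as a diagnosis of Xerces, is that the bytes of the schema file are decoded
with a single-byte charset instead of UTF-8.  The Java sketch below makes that
assumption explicit: ISO-8859-1 is an assumed stand-in for whatever decoding
actually took place, java.util.regex stands in for the schema pattern engine,
and the class and method names are hypothetical.  The euro (U+20AC) occupies
the bytes E2 82 AC in UTF-8, so the mis-decoded pattern contains several
unrelated characters and a real euro no longer matches it.

```java
import java.nio.charset.StandardCharsets;
import java.util.regex.Pattern;

// Hypothetical illustration; not taken from the Xerces sources.
final class EuroPatternSketch {
    // The euro sign, U+20AC, written as an escape so the source file's own
    // encoding cannot interfere with the demonstration.
    static final String EURO = "\u20AC";

    // Encode as UTF-8, then decode the same bytes as ISO-8859-1 (assumed
    // wrong charset). Each byte becomes one separate Java char.
    static String misdecode(String s) {
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    // Build a character class [classBody] and test the whole value against it.
    static boolean matchesClass(String classBody, String value) {
        return Pattern.matches("[" + classBody + "]", value);
    }

    public static void main(String[] args) {
        String broken = misdecode(EURO);
        System.out.println("chars after mis-decoding: " + broken.length());
        System.out.println("correct pattern matches euro: "
                + matchesClass(EURO, EURO));
        System.out.println("mis-decoded pattern matches euro: "
                + matchesClass(broken, EURO));
    }
}
```

If the schema text were decoded this way, the pattern would accept any one of
the stray characters but not the euro itself, which is consistent with the
instance being rejected.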


So, with
   1)  a pattern in a schema consisting of a euro in UTF-8 encoding, surrounded 
       by square brackets -  [e] where e is UTF-8 euro,
 and 
   2) a euro in an instance coded either as an entity reference (for example
      &#x20AC;) or as UTF-8,
 then
the instance is not seen as matching the pattern.

If the pattern is written with an entity reference, for example [&#x20AC;],
then the instance is validated correctly.


Result from validating attached notEuros2.xml against attached notEuros.xsd

[Error] file: null notEuros2.xml:3:25: cvc-type.3.1.3: The value '?' of element
'AsUTF8' is not valid.
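
The attached files are not reproduced in this message, so the following is
only a minimal sketch of how the expected behaviour could be checked.  It
assumes the javax.xml.validation API (JAXP, which postdates this report), a
hypothetical one-element schema built around the element name AsUTF8 from the
error above, and a hypothetical class name.  Both the schema and the instance
are handed to the parser as UTF-8 bytes, and a literal euro in the pattern is
expected to accept a euro in the instance whether it is written literally or
as a character reference.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.SAXException;

// Hypothetical reproduction harness; the schema is NOT the attached notEuros.xsd.
final class EuroSchemaCheck {
    // Schema whose pattern is a literal euro (U+20AC) in square brackets.
    static final String XSD =
          "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
        + "<xs:schema xmlns:xs=\"http://www.w3.org/2001/XMLSchema\">"
        + "<xs:element name=\"AsUTF8\">"
        + "<xs:simpleType><xs:restriction base=\"xs:string\">"
        + "<xs:pattern value=\"[\u20AC]\"/>"
        + "</xs:restriction></xs:simpleType>"
        + "</xs:element></xs:schema>";

    // Serialise the text as UTF-8 bytes so the parser must decode it itself.
    static StreamSource utf8Source(String xml) {
        return new StreamSource(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
    }

    // Returns true if the instance validates, false on a validation error.
    static boolean isValid(String instanceXml) {
        try {
            SchemaFactory factory =
                    SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            Schema schema = factory.newSchema(utf8Source(XSD));
            Validator validator = schema.newValidator();
            try {
                validator.validate(utf8Source(instanceXml));
                return true;
            } catch (SAXException invalid) {
                return false;
            }
        } catch (SAXException | IOException e) {
            // Schema could not be compiled or input could not be read.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String head = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>";
        System.out.println("literal euro:        "
                + isValid(head + "<AsUTF8>\u20AC</AsUTF8>"));
        System.out.println("character reference: "
                + isValid(head + "<AsUTF8>&#x20AC;</AsUTF8>"));
    }
}
```

A parser showing the reported problem would be expected to return false for
both instances here, since the pattern itself would have been misread.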

thanks
Reuben



