Posted to j-users@xerces.apache.org by Michael Glavassevich <mr...@ca.ibm.com> on 2008/02/01 02:52:22 UTC

RE: Problem parsing attribute with character 7F

If you're generating an XML 1.1 document, you don't have much choice. If you
want to include 0x7F, you need to use a character reference such as &#x7F;.
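
A minimal sketch of the difference, assuming the JAXP parser bundled with
the JDK (which is derived from Xerces) rather than a standalone Xerces-J
build. The class name DelCharDemo, the helper parseAttr, and the element
and attribute names "r" and "a" are made up for illustration only.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class; not part of Xerces or of the original thread.
class DelCharDemo {
    // U+007F (DEL), the character under discussion.
    static final String DEL = "\u007F";

    // Parses the document and returns the value of attribute "a" on the
    // root element, or null if the parser rejects the document.
    public static String parseAttr(String xml) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors without printing to stderr.
            builder.setErrorHandler(new DefaultHandler());
            byte[] bytes = xml.getBytes(StandardCharsets.UTF_8);
            Document doc = builder.parse(new ByteArrayInputStream(bytes));
            return doc.getDocumentElement().getAttribute("a");
        } catch (Exception e) {
            return null; // not well-formed (or parser could not be created)
        }
    }

    public static void main(String[] args) {
        // XML 1.1 with a character reference.
        String ref11 = "<?xml version=\"1.1\"?><r a=\"&#x7F;\"/>";
        // XML 1.1 with the raw 0x7F byte in the attribute value.
        String raw11 = "<?xml version=\"1.1\"?><r a=\"" + DEL + "\"/>";
        // XML 1.0 with the raw 0x7F byte in the attribute value.
        String raw10 = "<?xml version=\"1.0\"?><r a=\"" + DEL + "\"/>";

        System.out.println("1.1 reference accepted: " + DEL.equals(parseAttr(ref11)));
        System.out.println("1.1 raw accepted:       " + (parseAttr(raw11) != null));
        System.out.println("1.0 raw accepted:       " + DEL.equals(parseAttr(raw10)));
    }
}
```

Under the XML 1.1 RestrictedChar rule, a conforming parser should accept
the first and third documents and reject the second.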

Michael Glavassevich
XML Parser Development
IBM Toronto Lab
E-mail: mrglavas@ca.ibm.com
E-mail: mrglavas@apache.org

tyrrill_ed@emc.com wrote on 01/25/2008 06:55:05 PM:

> Hi,
>
> After thinking about the choices you presented, I tested encoding the
> character as &#x7F;, and that works.  Is this the correct thing to do?
>
> Ed
>
> From: keshlam@us.ibm.com [mailto:keshlam@us.ibm.com]
> Sent: Friday, January 25, 2008 2:53 PM
> To: j-users@xerces.apache.org
> Subject: Re: Problem parsing attribute with character 7F

> 7F is a legal XML 1.0 character and XML 1.0 should accept it. And,
> yes, I believe that in UTF8 (are you SURE you're reading the file as
> UTF8 rather than some other encoding?) it should be a legitimate
> single byte.
>
> However, the XML 1.0 spec's section 2.2 says "Document authors are
> encouraged to avoid "compatibility characters", as defined in section
> 6.8 of [Unicode] (see also D21 in section 3.6 of [Unicode3])." So using
> this character is ill-advised, even though it is legal.
>
> XML 1.1 does generally accept more characters than XML 1.0 does --
> but note that 7F is one of the RestrictedChars, and that XML 1.1
> explicitly says these may not appear within documents or external
> parsed entities. (Alas, the Recommendation does not explain why
> these are restricted.)
>
> Looks to me like you have several choices: Eliminate that character,
> encode it somehow (note that a numeric character reference probably
> wouldn't solve this problem), or go back to XML 1.0.
>
> ______________________________________
> "... Three things see no end: A loop with exit code done wrong,
> A semaphore untested, And the change that comes along. ..."
> -- "Threes" Rev 1.1 - Duane Elms / Leslie Fish
> (http://www.ovff.org/pegasus/songs/threes-rev-11.html)

