Posted to dev@xalan.apache.org by bu...@apache.org on 2004/06/20 14:42:45 UTC

DO NOT REPLY [Bug 29693] - Cannot transform a xml-file with greek letters in utf-8

http://issues.apache.org/bugzilla/show_bug.cgi?id=29693

Cannot transform a xml-file with greek letters in utf-8

zongaro@ca.ibm.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |INVALID



------- Additional Comments From zongaro@ca.ibm.com  2004-06-20 12:42 -------
The error message appears to be correct.  The document contains a three-byte 
sequence whose hexadecimal representation is E1 BC 20.  In the UTF-8 encoding, 
the first byte, E1, indicates that the character is encoded using three bytes, 
so each of the two subsequent bytes must be in the range 80-BF.  The third 
byte in the sequence, however, is 20.
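
To make the rule above concrete, here is a minimal Python sketch that scans a 
byte string and reports the first sequence that breaks it.  The function names 
(expected_length, first_bad_sequence) are made up for this illustration and 
are not part of Xalan or any parser.  The check is deliberately simplified: it 
only tests lead bytes and the 80-BF continuation range, and ignores the 
stricter limits UTF-8 places on the second byte after E0, ED, F0 and F4.

```python
def expected_length(lead: int) -> int:
    """Return the sequence length implied by a UTF-8 lead byte, or 0 if the
    byte cannot start a sequence (e.g. a stray continuation byte)."""
    if lead < 0x80:
        return 1          # ASCII, single byte
    if 0xC2 <= lead <= 0xDF:
        return 2
    if 0xE0 <= lead <= 0xEF:
        return 3          # E1 falls here: two continuation bytes must follow
    if 0xF0 <= lead <= 0xF4:
        return 4
    return 0


def first_bad_sequence(data: bytes):
    """Return (lead_offset, bad_offset) for the first malformed sequence,
    or None if every sequence passes this simplified check.

    bad_offset may equal len(data) when the input ends mid-sequence.
    """
    i = 0
    while i < len(data):
        n = expected_length(data[i])
        if n == 0:
            return (i, i)
        for k in range(1, n):
            j = i + k
            # Every byte after the lead byte must be in the range 80-BF.
            if j >= len(data) or not (0x80 <= data[j] <= 0xBF):
                return (i, j)
        i += n
    return None


if __name__ == "__main__":
    # The sequence from this bug report: the third byte, 20, is out of range.
    print(first_bad_sequence(bytes([0xE1, 0xBC, 0x20])))  # (0, 2)
```

For comparison, a well-formed three-byte sequence that starts with E1 BC, such 
as E1 BC A0 (the letter ἠ, U+1F20), passes the check and yields None.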

How did you generate this file?  It appears to be a bug in whatever process was 
used to create the file.

I'm not sure whether this will be displayed correctly in your browser, but to 
help you locate the offending characters, they are in this string of text:
"κοὐκ ἠθέλησα ζῆν ἀποσπασθεῖσα σοῦ "
The specific bytes in error are:  "á¼ "

I hope that helps.