Posted to j-dev@xerces.apache.org by bu...@apache.org on 2002/07/17 21:11:02 UTC
DO NOT REPLY [Bug 10918] New: - copyright symbol causing UTF-8 error.
http://nagoya.apache.org/bugzilla/show_bug.cgi?id=10918
Summary: copyright symbol causing UTF-8 error.
Product: Xerces2-J
Version: 2.0.2
Platform: PC
URL: ftp://ftp.bind.ca/BIND/spec/xmldtd/BIND.dtd
OS/Version: Linux
Status: NEW
Severity: Normal
Priority: Other
Component: SAX
AssignedTo: xerces-j-dev@xml.apache.org
ReportedBy: aarenson@iupui.edu
The URL above leads to a file which, when I attempt to parse it with SAX, gives:
java.io.UTFDataFormatException: invalid byte 1 of 1-byte UTF-8 sequence (0xa9)
The culprit is a 'copyright' symbol. Using od -hc, I get:
0006260 7279 6769 7468 a920 3032 3130 4d20 756f
y r i g h t © 2 0 0 1 M o u
If I delete the 'copyright' character, I don't get the error. Shouldn't SAX be
able to handle this character?
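
A possible explanation, assuming the DTD carries no encoding declaration and is therefore read as UTF-8 (the XML default): the byte 0xa9 is the ISO-8859-1 encoding of the copyright sign, but on its own it is not a well-formed UTF-8 sequence. In UTF-8 the same character is the two bytes 0xc2 0xa9. The sketch below checks this with the standard java.nio.charset decoder. The class and method names are made up for illustration, and none of this is Xerces code.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Xerces: shows why a lone 0xA9 byte
// is rejected by a strict UTF-8 decoder.
public class CopyrightByteCheck {

    // Returns true only if the bytes form a well-formed UTF-8 sequence.
    static boolean isValidUtf8(byte[] bytes) {
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        try {
            decoder.decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        byte[] loneA9 = {(byte) 0xA9};                    // the byte seen in the od dump
        byte[] utf8Copyright = {(byte) 0xC2, (byte) 0xA9}; // UTF-8 form of U+00A9

        System.out.println("0xA9 alone valid UTF-8?  " + isValidUtf8(loneA9));
        System.out.println("0xC2 0xA9 valid UTF-8?   " + isValidUtf8(utf8Copyright));

        // Decoded as ISO-8859-1, the single byte maps to U+00A9.
        String latin1 = new String(loneA9, StandardCharsets.ISO_8859_1);
        System.out.println("0xA9 as ISO-8859-1 is U+00A9? " + latin1.equals("\u00A9"));
    }
}
```

If that assumption holds, the parser is arguably behaving correctly. The likely remedies are to put a text declaration such as <?xml encoding="ISO-8859-1"?> at the top of the DTD, or to re-save the file as UTF-8.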