Posted to j-dev@xerces.apache.org by bu...@apache.org on 2002/01/29 02:00:21 UTC
DO NOT REPLY [Bug 6082] New: -
Many encodings are broken in Xerces
http://nagoya.apache.org/bugzilla/show_bug.cgi?id=6082
Summary: Many encodings are broken in Xerces
Product: Xerces-J
Version: 1.4.4
Platform: All
OS/Version: Other
Status: NEW
Severity: Critical
Priority: Other
Component: Serialization
AssignedTo: xerces-j-dev@xml.apache.org
ReportedBy: paul@prescod.net
The ISO-8859-n (n>1) encodings are broken because the "lastPrintable" character
is set to 0xFF when it should be set to 0x7F (see Encodings.java, lines
110-118).
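To illustrate why 0xFF is too high, here is a minimal sketch that counts the
characters between 0x80 and a given "lastPrintable" value that a charset cannot
actually encode. The class and method names are hypothetical and are not part
of Xerces; the sketch also assumes the JDK in use ships the ISO-8859-2 charset.

```java
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;

// Hypothetical helper, not Xerces code.
final class LastPrintableCheck {

    // Counts the characters in [0x80, lastPrintable] that the named charset
    // cannot encode. A serializer that writes every character up to
    // lastPrintable unescaped would emit each of these incorrectly.
    static int countUnencodable(String charsetName, int lastPrintable) {
        CharsetEncoder encoder = Charset.forName(charsetName).newEncoder();
        int count = 0;
        for (int ch = 0x80; ch <= lastPrintable; ch++) {
            if (!encoder.canEncode((char) ch)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // ISO-8859-1 maps U+0080..U+00FF one-to-one, so 0xFF is safe there.
        System.out.println(countUnencodable("ISO-8859-1", 0xFF));
        // ISO-8859-2 assigns different characters to the upper half,
        // so many code points below 0xFF are not encodable.
        System.out.println(countUnencodable("ISO-8859-2", 0xFF));
        // With lastPrintable at 0x7F nothing above ASCII is written directly.
        System.out.println(countUnencodable("ISO-8859-2", 0x7F));
    }
}
```

With lastPrintable at 0x7F, every non-ASCII character is written as a character
reference, which is safe for all ISO-8859-n variants.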
The Windows-31J encoding is broken because the underlying Java encoder is
broken: it cannot correctly round-trip several characters. The affected
characters are:
0xa2
0xa3
0xa5
0xab
0xac
0xaf
0xb5
0xb7
0xb8
0xbb
0x203e
0x3094
The reason is that the same encoded byte patterns are shared by different code
points (bytes are shown as signed decimal values, code points in hexadecimal):
The byte pattern (92,) is used by: 5c, a5
The byte pattern (126,) is used by: 7e, 203e
The byte pattern (-127, -111) is used by: a2, ffe0
The byte pattern (-127, -110) is used by: a3, ffe1
The byte pattern (-127, -31) is used by: ab, 226a
The byte pattern (-127, -54) is used by: ac, ffe2
The byte pattern (-127, 80) is used by: af, ffe3
The byte pattern (-125, -54) is used by: b5, 3bc
The byte pattern (-127, 69) is used by: b7, 30fb
The byte pattern (-127, 67) is used by: b8, ff0c
The byte pattern (-127, -30) is used by: bb, 226b
The byte pattern (-125, -108) is used by: 3094, 30f4
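A round-trip check of this kind can be sketched as follows. The class and
method names are hypothetical. The results for Windows-31J depend on the JDK
version and on whether that charset is installed, so no output is shown; a
current JDK may report different characters than the list above.

```java
import java.nio.charset.Charset;

// Hypothetical helper, not Xerces code.
final class RoundTripCheck {

    // Encodes one code point with the charset, decodes the bytes again,
    // and reports whether the original character came back unchanged.
    static boolean roundTrips(Charset cs, int codePoint) {
        String original = new String(Character.toChars(codePoint));
        String decoded = new String(original.getBytes(cs), cs);
        return original.equals(decoded);
    }

    public static void main(String[] args) {
        // The code points listed in this report.
        int[] reported = {0xA2, 0xA3, 0xA5, 0xAB, 0xAC, 0xAF,
                          0xB5, 0xB7, 0xB8, 0xBB, 0x203E, 0x3094};
        String name = "windows-31j";
        if (!Charset.isSupported(name)) {
            System.out.println(name + " is not available in this JDK");
            return;
        }
        Charset cs = Charset.forName(name);
        for (int cp : reported) {
            System.out.printf("U+%04X round-trips: %b%n", cp, roundTrips(cs, cp));
        }
    }
}
```

A character that fails this check cannot be written directly in the target
encoding without changing the document when it is read back.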
You can fix this by adding these characters to the "JIS_DANGER_CHARS"
(Encodings.java, line 99) or by creating a new list of danger characters just
for Windows-31J.
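The second option could look roughly like the sketch below. It is an
assumption about how a danger-character list might be applied, not the actual
Xerces implementation: WINDOWS_31J_DANGER_CHARS, isDangerous and escape are
hypothetical names, and the list holds only the characters from this report.

```java
import java.util.Arrays;

// Hypothetical sketch, not the Xerces Encodings class.
final class DangerChars {

    // Characters from this report, kept in ascending order so that
    // Arrays.binarySearch can be used for lookup.
    static final char[] WINDOWS_31J_DANGER_CHARS = {
        0x00A2, 0x00A3, 0x00A5, 0x00AB, 0x00AC, 0x00AF,
        0x00B5, 0x00B7, 0x00B8, 0x00BB, 0x203E, 0x3094
    };

    static boolean isDangerous(char ch) {
        return Arrays.binarySearch(WINDOWS_31J_DANGER_CHARS, ch) >= 0;
    }

    // Replaces each danger character with a hexadecimal character reference
    // and copies every other character unchanged. Markup characters such as
    // '<' and '&' are deliberately not handled in this sketch.
    static String escape(String text) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char ch = text.charAt(i);
            if (isDangerous(ch)) {
                out.append("&#x")
                   .append(Integer.toHexString(ch).toUpperCase())
                   .append(';');
            } else {
                out.append(ch);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(escape("\u00A5100"));
    }
}
```

Writing these characters as character references sidesteps the encoder
entirely, so the ambiguous byte patterns are never produced.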