Posted to c-dev@xerces.apache.org by bu...@apache.org on 2002/09/09 15:47:47 UTC
DO NOT REPLY [Bug 12436] New: -
UTF-8 transcoder is not strict (and therefore not secure).
DO NOT REPLY TO THIS EMAIL, BUT PLEASE POST YOUR BUG
RELATED COMMENTS THROUGH THE WEB INTERFACE AVAILABLE AT
<http://nagoya.apache.org/bugzilla/show_bug.cgi?id=12436>.
ANY REPLY MADE TO THIS MESSAGE WILL NOT BE COLLECTED AND
INSERTED IN THE BUG DATABASE.
Summary: UTF-8 transcoder is not strict (and therefore not
secure).
Product: Xerces-C++
Version: 2.1.0
Platform: All
OS/Version: All
Status: NEW
Severity: Normal
Priority: Other
Component: Utilities
AssignedTo: xerces-c-dev@xml.apache.org
ReportedBy: esegal@sanctuminc.com
When parsing a UTF-8 XML document that contains invalid UTF-8 byte sequences,
Xerces reports no error while converting them. The invalid sequences are
converted to Unicode by applying the decoding arithmetic mechanically, without
validating the bytes. A special case of this is that Xerces does not warn about
or reject UTF-8 "overlong" codes, i.e. encodings that use more bytes than the
character requires. This can have SEVERE SECURITY IMPLICATIONS for applications
using the Xerces parser, because a check performed on the raw bytes can miss a
character that only appears after decoding.
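To illustrate the difference, here is a minimal sketch of a lax two-byte
decoder next to a strict decoder. It is not Xerces code: the names decodeLax2
and decodeStrict are hypothetical, and the strict version only shows the kind
of checks meant here (valid lead byte, valid continuation bytes, shortest form,
no surrogates, nothing above U+10FFFF).

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical lax decoder: applies the two-byte arithmetic with no
// validation at all, which is the behaviour described in this report.
std::uint32_t decodeLax2(unsigned char b0, unsigned char b1) {
    return ((b0 & 0x1Fu) << 6) | (b1 & 0x3Fu);
}

// Hypothetical strict decoder: decodes one character from s[0..n).
// On success, stores the code point in cp and the byte count in len and
// returns true. Returns false for any ill-formed sequence.
bool decodeStrict(const unsigned char* s, std::size_t n,
                  std::uint32_t& cp, std::size_t& len) {
    if (n == 0) return false;
    const unsigned char b0 = s[0];
    std::uint32_t value = 0;
    std::uint32_t minValue = 0;  // smallest code point allowed for this length
    std::size_t need = 0;

    if (b0 < 0x80) {                      // single-byte (ASCII) form
        cp = b0;
        len = 1;
        return true;
    } else if ((b0 & 0xE0) == 0xC0) {     // 110xxxxx: two-byte form
        need = 2; value = b0 & 0x1Fu; minValue = 0x80;
    } else if ((b0 & 0xF0) == 0xE0) {     // 1110xxxx: three-byte form
        need = 3; value = b0 & 0x0Fu; minValue = 0x800;
    } else if ((b0 & 0xF8) == 0xF0) {     // 11110xxx: four-byte form
        need = 4; value = b0 & 0x07u; minValue = 0x10000;
    } else {
        return false;                     // stray continuation or invalid lead
    }

    if (n < need) return false;           // truncated sequence
    for (std::size_t i = 1; i < need; ++i) {
        if ((s[i] & 0xC0) != 0x80) return false;  // not a continuation byte
        value = (value << 6) | (s[i] & 0x3Fu);
    }

    if (value < minValue) return false;                     // overlong form
    if (value > 0x10FFFF) return false;                     // beyond Unicode
    if (value >= 0xD800 && value <= 0xDFFF) return false;   // surrogate

    cp = value;
    len = need;
    return true;
}
```

With these definitions, the overlong pair 0xC0 0xAF decodes to U+002F ("/")
under decodeLax2 but is rejected by decodeStrict, while the well-formed pair
0xC3 0xA9 is accepted as U+00E9. A filter that scans the raw bytes for "/"
would not see 0xC0 0xAF, which is why the lax behaviour is a security concern.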
A few references on the UTF-8 overlong issue:
o http://www.unicode.org/versions/corrigendum1.html
o http://docsrv.caldera.com:8457/en/SecureProg/character-encoding.html (see
section 4.8.5)
o http://www.cl.cam.ac.uk/~mgk25/unicode.html#utf-8 (see the section entitled
"An important note for developers of UTF-8 decoding routines").
Comparison to Expat: The expat parser enforces strict conformance of UTF-8 text.