Posted to commits@santuario.apache.org by "Scott Cantor (JIRA)" <ji...@apache.org> on 2011/05/30 19:58:47 UTC
[jira] [Commented] (SANTUARIO-276) Percent-encoded multibyte
(UTF-8) sequences unrecognized and not properly handled by function
cleanURIEscapes (file: XSECDOMUtils.cpp)
[ https://issues.apache.org/jira/browse/SANTUARIO-276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13041217#comment-13041217 ]
Scott Cantor commented on SANTUARIO-276:
----------------------------------------
I don't see how that works if the input path is UTF-16 to begin with. You seem to be treating it as UTF-8 widened to 16 bits, or am I missing something?
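The concern can be made concrete with a minimal, self-contained sketch. In the proposed function each 16-bit unit of the input is stored into a single byte, so a character above U+007F that appears literally (not percent-encoded) in the path would be truncated. One possible way to avoid that, assuming the input really is UTF-16, is to encode literal characters as UTF-8 while copying percent-decoded bytes through unchanged. The names unescapeToUtf8, appendUtf8, and hexValue are hypothetical and not part of Santuario or Xerces-C; std::u16string stands in for an XMLCh string.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>

// Hypothetical helper: numeric value of a hex digit, or -1 if c is not one.
static int hexValue(char16_t c) {
    if (c >= u'0' && c <= u'9') return c - u'0';
    if (c >= u'a' && c <= u'f') return c - u'a' + 10;
    if (c >= u'A' && c <= u'F') return c - u'A' + 10;
    return -1;
}

// Hypothetical helper: append the UTF-8 encoding of one code point.
static void appendUtf8(std::string &out, std::uint32_t cp) {
    auto put = [&out](std::uint32_t b) {
        out += static_cast<char>(static_cast<unsigned char>(b));
    };
    if (cp < 0x80) {
        put(cp);
    } else if (cp < 0x800) {
        put(0xC0 | (cp >> 6));
        put(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        put(0xE0 | (cp >> 12));
        put(0x80 | ((cp >> 6) & 0x3F));
        put(0x80 | (cp & 0x3F));
    } else {
        put(0xF0 | (cp >> 18));
        put(0x80 | ((cp >> 12) & 0x3F));
        put(0x80 | ((cp >> 6) & 0x3F));
        put(0x80 | (cp & 0x3F));
    }
}

// Sketch: turn a UTF-16 URI path into UTF-8 bytes. Percent escapes become
// raw bytes; literal characters are encoded rather than truncated.
std::string unescapeToUtf8(const std::u16string &path) {
    std::string out;
    for (std::size_t i = 0; i < path.size(); ++i) {
        char16_t c = path[i];
        if (c == u'%') {
            // A valid escape is '%' followed by two hex digits.
            if (i + 2 >= path.size())
                throw std::invalid_argument("Bad escape sequence in URI");
            int hi = hexValue(path[i + 1]);
            int lo = hexValue(path[i + 2]);
            if (hi < 0 || lo < 0)
                throw std::invalid_argument("Bad escape sequence in URI");
            out += static_cast<char>(static_cast<unsigned char>(hi * 16 + lo));
            i += 2;
        } else if (c >= 0xD800 && c <= 0xDBFF && i + 1 < path.size() &&
                   path[i + 1] >= 0xDC00 && path[i + 1] <= 0xDFFF) {
            // Surrogate pair: combine into one supplementary code point.
            std::uint32_t cp = 0x10000 + ((std::uint32_t(c) - 0xD800) << 10) +
                               (std::uint32_t(path[i + 1]) - 0xDC00);
            appendUtf8(out, cp);
            ++i;
        } else {
            // Any other unit (lone surrogates are not validated here).
            appendUtf8(out, c);
        }
    }
    return out;
}
```

With this approach the escaped form %C5%82 and a literal U+0142 both yield the same two bytes, which could then be handed to a UTF-8 transcoder. Whether literal non-ASCII characters can actually reach this function is a separate question that the sketch does not answer.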
> Percent-encoded multibyte (UTF-8) sequences unrecognized and not properly handled by function cleanURIEscapes (file: XSECDOMUtils.cpp)
> --------------------------------------------------------------------------------------------------------------------------------------
>
> Key: SANTUARIO-276
> URL: https://issues.apache.org/jira/browse/SANTUARIO-276
> Project: Santuario
> Issue Type: Bug
> Components: C++
> Affects Versions: C++ 1.6.0, C++ 1.6.1
> Environment: any
> Reporter: Michal Bystrianin
> Assignee: Scott Cantor
>
> Percent-encoded multibyte (UTF-8) sequences are not recognized or handled properly. For example, %C5%82
> is converted to two Unicode characters instead of the single character U+0142.
> Suggested solution: replace the function "cleanURIEscapes" with the new (already tested) code below.
> XMLCh *cleanURIEscapes(const XMLCh *uriPath)
> {
>     XMLByte *ptr, *utf8Path;
>     xsecsize_t len = XMLString::stringLen(uriPath);
>     // One extra zeroed byte keeps the buffer null-terminated for transcoding.
>     ptr = utf8Path = (XMLByte *)calloc(len + 1, sizeof(XMLByte));
>     for (xsecsize_t i = 0; i < len; i++) {
>         unsigned int value = uriPath[i];
>         if (value == chPercent) {
>             // A valid escape is '%' followed by two hex digits.
>             if (!(i + 2 < len && isHexDigit(uriPath[i + 1]) &&
>                   isHexDigit(uriPath[i + 2])))
>             {
>                 XSEC_RELEASE_XMLCH(utf8Path);
>                 throw XSECException(XSECException::ErrorOpeningURI,
>                                     "Bad escape sequence in URI");
>             }
>             value = (xlatHexDigit(uriPath[i + 1]) * 16) +
>                     (xlatHexDigit(uriPath[i + 2]));
>             i += 2;
>         }
>         // Each unit is stored as a single byte of the presumed UTF-8 stream.
>         *(ptr++) = value;
>     }
>     XMLCh *unicodePath = transcodeFromUTF8(utf8Path);
>     XSEC_RELEASE_XMLCH(utf8Path);
>     return unicodePath;
> }
--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira