Posted to commits@santuario.apache.org by "Scott Cantor (JIRA)" <ji...@apache.org> on 2011/05/30 19:58:47 UTC

[jira] [Commented] (SANTUARIO-276) Percent-encoded multibyte (UTF-8) sequences unrecognized and not properly handled by function cleanURIEscapes (file: XSECDOMUtils.cpp)

    [ https://issues.apache.org/jira/browse/SANTUARIO-276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13041217#comment-13041217 ] 

Scott Cantor commented on SANTUARIO-276:
----------------------------------------

I don't see how that works if the input path is UTF-16 to begin with. You seem to be treating it as UTF-8 widened to 16 bits, or am I missing something?
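
A minimal standalone sketch of the concern, using only standard C++ (no Xerces or xsec types; every function name here is hypothetical). It assumes the input is genuine UTF-16: escaped bytes are taken as-is, while unescaped code units are encoded to UTF-8 rather than narrowed, so both spellings of the same character yield the same bytes. Narrowing an unescaped U+0142 to one byte would instead give 0x42, the letter 'B'.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical helpers for illustration; not part of the xsec library.
static bool isHex(char16_t c) {
    return (c >= u'0' && c <= u'9') || (c >= u'a' && c <= u'f') ||
           (c >= u'A' && c <= u'F');
}

static unsigned hexValue(char16_t c) {
    assert(isHex(c));
    if (c <= u'9') return c - u'0';
    return (c | 0x20) - u'a' + 10;  // fold 'A'-'F' onto 'a'-'f'
}

// Append one Unicode code point to `out` as UTF-8.
static void appendUtf8(std::string& out, char32_t cp) {
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
}

// Percent-decode a UTF-16 URI path into UTF-8 bytes.
// Lone surrogates are passed through unvalidated in this sketch.
std::string percentDecodeToUtf8(const std::u16string& in) {
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        char16_t c = in[i];
        if (c == u'%') {
            if (i + 2 >= in.size() || !isHex(in[i + 1]) || !isHex(in[i + 2]))
                throw std::invalid_argument("Bad escape sequence in URI");
            // An escape is one raw byte of the UTF-8 stream.
            out += static_cast<char>(hexValue(in[i + 1]) * 16 + hexValue(in[i + 2]));
            i += 2;
        } else if (c >= 0xD800 && c <= 0xDBFF && i + 1 < in.size() &&
                   in[i + 1] >= 0xDC00 && in[i + 1] <= 0xDFFF) {
            // Surrogate pair: combine into one code point, then encode.
            char32_t cp = 0x10000 + ((static_cast<char32_t>(c) - 0xD800) << 10) +
                          (in[i + 1] - 0xDC00);
            appendUtf8(out, cp);
            ++i;
        } else {
            appendUtf8(out, c);  // encode, do not truncate to 8 bits
        }
    }
    return out;
}
```

The resulting byte string could then be handed to a UTF-8 to UTF-16 transcoder as the final step.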

> Percent-encoded multibyte (UTF-8) sequences unrecognized and not properly handled by function cleanURIEscapes (file: XSECDOMUtils.cpp)
> --------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: SANTUARIO-276
>                 URL: https://issues.apache.org/jira/browse/SANTUARIO-276
>             Project: Santuario
>          Issue Type: Bug
>          Components: C++
>    Affects Versions: C++ 1.6.0, C++ 1.6.1
>         Environment: any
>            Reporter: Michal Bystrianin
>            Assignee: Scott Cantor
>
> Percent-encoded multibyte (UTF-8) sequences are not recognized and not handled properly. For example, %C5%82
> is converted to two Unicode characters instead of the single character U+0142.
> Suggested solution: replace the function "cleanURIEscapes" with the new (already tested) code below.
> XMLCh *cleanURIEscapes(const XMLCh *uriPath)
> {
>     XMLByte *ptr, *utf8Path;
>     xsecsize_t len = XMLString::stringLen(uriPath);
>     // One extra zeroed byte keeps the buffer NUL-terminated for transcodeFromUTF8.
>     ptr = utf8Path = (XMLByte *)calloc(len + 1, sizeof(XMLByte));
>     for (xsecsize_t i = 0;  i < len;  i++) {
>         unsigned int value = uriPath[i];
>         if (value == chPercent) {
>             // A '%' must be followed by two hex digits.
>             if (!(i + 2 < len && isHexDigit(uriPath[i + 1]) &&
>                                  isHexDigit(uriPath[i + 2])))
>             {
>                 free(utf8Path);  // allocated with calloc, so release with free
>                 throw XSECException(XSECException::ErrorOpeningURI,
>                                     "Bad escape sequence in URI");
>             }
>             value = (xlatHexDigit(uriPath[i + 1]) * 16) +
>                     (xlatHexDigit(uriPath[i + 2]));
>             i += 2;
>         }
>         // Only the low 8 bits are stored, so unescaped input is assumed to be ASCII.
>         *(ptr++) = (XMLByte)value;
>     }
>     XMLCh *unicodePath = transcodeFromUTF8(utf8Path);
>     free(utf8Path);
>     return unicodePath;
> }
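
For reference, the arithmetic behind the %C5%82 example in the report can be sketched as below. The helper name is hypothetical and the sketch assumes a well-formed two-byte sequence; it does no validation of the lead or continuation byte.

```cpp
#include <cstdint>

// Hypothetical helper: decodes one two-byte UTF-8 sequence (110xxxxx 10yyyyyy)
// by joining the 5 payload bits of the lead byte with the 6 of the continuation.
constexpr char32_t decodeTwoByteUtf8(std::uint8_t lead, std::uint8_t cont) {
    return (static_cast<char32_t>(lead & 0x1F) << 6) | (cont & 0x3F);
}

// 0xC5 = 110 00101, 0x82 = 10 000010  ->  00101 000010 = 0x0142
static_assert(decodeTwoByteUtf8(0xC5, 0x82) == 0x0142, "%C5%82 is U+0142");
```

Decoding each byte separately, as the current function does, yields U+00C5 and U+0082 instead of the single character.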

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira