You are viewing a plain text version of this content. The canonical link for it is here.
Posted to c-dev@xerces.apache.org by "Alberto Massari (JIRA)" <xe...@xml.apache.org> on 2013/02/24 19:34:12 UTC

[jira] [Resolved] (XERCESC-1984) TranscodeToStr::transcode throws an exception when transcoding to UTF-8

     [ https://issues.apache.org/jira/browse/XERCESC-1984?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alberto Massari resolved XERCESC-1984.
--------------------------------------

       Resolution: Fixed
    Fix Version/s: 3.2.0
         Assignee: Alberto Massari

A fix is in SVN. Please verify.
                
> TranscodeToStr::transcode throws an exception when transcoding to UTF-8
> -----------------------------------------------------------------------
>
>                 Key: XERCESC-1984
>                 URL: https://issues.apache.org/jira/browse/XERCESC-1984
>             Project: Xerces-C++
>          Issue Type: Bug
>          Components: Utilities
>    Affects Versions: 3.2.0, 4.0.0
>         Environment: Bug reproducible on a Red Hat 5 based platform. The bug doesn't seem to be platform specific though.
>            Reporter: Dan PV
>            Assignee: Alberto Massari
>              Labels: exception, transcode
>             Fix For: 3.2.0
>
>         Attachments: transtest2.cpp
>
>
> This issue relates to the bug fix for issue XERCESC-1947. There are still cases where the method will fail in providing a transcoded version without throwing an exception. See the attached "transtest2.cpp" to reproduce the issue.
> The cause seems to come from the added "if((allocSize - fBytesWritten) < (len - charsDone))" condition in "TranscodeToStr::transcode" . In my provided test case I have a string composed of 6 Japanese characters (i.e. "絞り込み検索"). Once the first call to "XMLUTF8Transcoder::transcodeTo" is done, "charsRead" will return a count of 5 XMLCh readed. Since the initial allocated buffer for this string was set to 16 bytes, the condition will check against the following values "if((16 - 15) < (6 - 5))" which avoids the reallocation of a larger buffer for the UTF-8 encoded version of the string. 
> Since the reallocation doesn't take place, the code will recall "XMLUTF8Transcoder::transcodeTo" but this time the "charsRead" count will be set to 0 because there is insufficient space in the buffer and this will trigger an exception of type "Trans_BadSrcSeq".
> I suppose that the goal of this added condition was to avoid an unnecessary reallocation of a buffer but unfortunately it doesn’t work when transcoding to variable length encoding like UTF-8. The solution is probably to simply replace the condition with "if(charsDone < len)".
> Regards,
> Dan

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

---------------------------------------------------------------------
To unsubscribe, e-mail: c-dev-unsubscribe@xerces.apache.org
For additional commands, e-mail: c-dev-help@xerces.apache.org