Posted to c-dev@xerces.apache.org by "Janus Drozd (JIRA)" <xe...@xml.apache.org> on 2007/04/16 12:16:15 UTC

[jira] Commented: (XERCESC-1092) Win32Transcoder does not properly transcode ISO-8859-2 and other encodings

    [ https://issues.apache.org/jira/browse/XERCESC-1092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12489066 ] 

Janus Drozd commented on XERCESC-1092:
--------------------------------------

This bug is still in xerces-c version 2.7.0.

> Win32Transcoder does not properly transcode ISO-8859-2 and other encodings
> --------------------------------------------------------------------------
>
>                 Key: XERCESC-1092
>                 URL: https://issues.apache.org/jira/browse/XERCESC-1092
>             Project: Xerces-C++
>          Issue Type: Bug
>          Components: Utilities
>    Affects Versions: 2.4.0
>         Environment: Operating System: Windows XP
> Platform: PC
>            Reporter: Janus Drozd
>         Assigned To: Xerces-C Developers Mailing List
>         Attachments: Win32TransService.cpp
>
>
> Win32TransService scans the Windows registry for supported charsets and reads
> the "Codepage" and "InternetEncoding" values. For many charsets these values
> are equal, but not for all.
> When a Win32Transcoder object is created for a given charset, the "Codepage" 
> value is stored in the fWinCP member and the "InternetEncoding" value in the 
> fIECP member. Win32Transcoder methods use the fWinCP value and pass it to the 
> Windows API functions like ::MultiByteToWideChar. This is wrong. The fIECP 
> value should be used instead.
> For example, when transcoding from ISO-8859-2, fWinCP is 1250 and fIECP is
> 28592. Win32Transcoder::transcodeFrom(...) calls
> ::MultiByteToWideChar(1250, ...). This transcodes from the Windows-1250
> code page, not from ISO-8859-2, and the result is wrong.
> The proposed patch:
> Replace fWinCP with fIECP in all calls of Windows API functions in all 
> Win32Transcoder methods.
> In Win32Transcoder::transcodeFrom:
> ...............
>   const unsigned int toEat = ::IsDBCSLeadByteEx(fIECP, *inPtr) ? 2 : 1;
>   // Make sure a whole char is in the source
>   if (inPtr + toEat > inEnd)
>       break;
>   // Try to translate this next char and check for an error
>   const unsigned int converted = ::MultiByteToWideChar
>   ( fIECP, MB_PRECOMPOSED | MB_ERR_INVALID_CHARS, (const char*)inPtr, toEat,
>     outPtr, 1);
> ...............
> In Win32Transcoder::transcodeTo:
> ...............
>   const unsigned int bytesStored = ::WideCharToMultiByte
>   (fIECP, WC_COMPOSITECHECK | WC_SEPCHARS, srcPtr, 1, (char*)outPtr,
>    outEnd - outPtr, 0, &usedDef);
> ...............
> In Win32Transcoder::canTranscodeTo:
> ...............
>   const unsigned int bytesStored = ::WideCharToMultiByte
>   (fIECP, WC_COMPOSITECHECK | WC_SEPCHARS, srcBuf, srcCount, tmpBuf, 64, 0,
>    &usedDef);
> ...............
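
To make the mismatch described above concrete, here is a small portable sketch of why decoding ISO-8859-2 input with code page 1250 gives wrong characters. It does not call the Windows API and is not taken from Xerces-C or the attached file; the function names decodeIso8859_2 and decodeWindows1250 are hypothetical, and each table covers only three sample bytes for illustration.

```cpp
// Illustration only: hypothetical helpers, not part of Xerces-C.
// Each function maps a single byte to a Unicode code point for a tiny
// subset of the code page; unlisted high bytes return U+FFFD.

// ISO-8859-2 (Windows code page 28592, the fIECP value in the report).
unsigned int decodeIso8859_2(unsigned char b)
{
    switch (b) {
        case 0xA1: return 0x0104; // LATIN CAPITAL LETTER A WITH OGONEK
        case 0xB1: return 0x0105; // LATIN SMALL LETTER A WITH OGONEK
        case 0xA6: return 0x015A; // LATIN CAPITAL LETTER S WITH ACUTE
        default:   return b < 0x80 ? b : 0xFFFD;
    }
}

// Windows-1250 (code page 1250, the fWinCP value in the report).
unsigned int decodeWindows1250(unsigned char b)
{
    switch (b) {
        case 0xA1: return 0x02C7; // CARON
        case 0xB1: return 0x00B1; // PLUS-MINUS SIGN
        case 0xA6: return 0x00A6; // BROKEN BAR
        default:   return b < 0x80 ? b : 0xFFFD;
    }
}

// True when both code pages decode the byte to the same code point.
bool sameInBothCodePages(unsigned char b)
{
    return decodeIso8859_2(b) == decodeWindows1250(b);
}
```

ASCII bytes decode identically under both code pages, which may be why the problem goes unnoticed for mostly-ASCII documents; a byte such as 0xA1 is the kind of input where passing 1250 instead of 28592 changes the result.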

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

