Posted to c-dev@xerces.apache.org by "Alberto Massari (JIRA)" <xe...@xml.apache.org> on 2007/04/17 10:47:16 UTC

[jira] Resolved: (XERCESC-1092) Win32Transcoder does not properly transcode ISO-8859-2 and other encodings

     [ https://issues.apache.org/jira/browse/XERCESC-1092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alberto Massari resolved XERCESC-1092.
--------------------------------------

    Resolution: Fixed
      Assignee:     (was: Xerces-C Developers Mailing List)

Hi Janus,
I changed the code to use the InternetEncoding instead of the Codepage.

Thanks,
Alberto

> Win32Transcoder does not properly transcode ISO-8859-2 and other encodings
> --------------------------------------------------------------------------
>
>                 Key: XERCESC-1092
>                 URL: https://issues.apache.org/jira/browse/XERCESC-1092
>             Project: Xerces-C++
>          Issue Type: Bug
>          Components: Utilities
>    Affects Versions: 2.4.0
>         Environment: Operating System: Windows XP
> Platform: PC
>            Reporter: Janus Drozd
>         Attachments: Win32TransService.cpp
>
>
> Win32TransService scans the Windows registry for supported charsets and reads
> the "Codepage" and "InternetEncoding" values. For many charsets these values
> are equal, but not for all.
> When a Win32Transcoder object is created for a given charset, the "Codepage" 
> value is stored in the fWinCP member and the "InternetEncoding" value in the 
> fIECP member. Win32Transcoder methods use the fWinCP value and pass it to the 
> Windows API functions like ::MultiByteToWideChar. This is wrong. The fIECP 
> value should be used instead.
> For example, when transcoding from ISO-8859-2, fWinCP is 1250 and fIECP is
> 28592. Win32Transcoder::transcodeFrom(...) calls
> ::MultiByteToWideChar(1250, ...). This transcodes from the Windows-1250
> code page, not from ISO-8859-2, and the result is wrong.
> The proposed patch:
> Replace fWinCP with fIECP in all calls of Windows API functions in all 
> Win32Transcoder methods.
> In Win32Transcoder::transcodeFrom:
> ...............
>   const unsigned int toEat = ::IsDBCSLeadByteEx(fIECP, *inPtr) ? 2 : 1;
>   // Make sure a whole char is in the source
>   if (inPtr + toEat > inEnd)
>       break;
>   // Try to translate this next char and check for an error
>   const unsigned int converted = ::MultiByteToWideChar
>   (fIECP, MB_PRECOMPOSED | MB_ERR_INVALID_CHARS, (const char*)inPtr, toEat,
>    outPtr, 1);
> ...............
> In Win32Transcoder::transcodeTo:
> ...............
>   const unsigned int bytesStored = ::WideCharToMultiByte
>   (fIECP, WC_COMPOSITECHECK | WC_SEPCHARS, srcPtr, 1, (char*)outPtr,
>    outEnd - outPtr, 0, &usedDef);
> ...............
> In Win32Transcoder::canTranscodeTo:
> ...............
>   const unsigned int bytesStored = ::WideCharToMultiByte
>   (fIECP, WC_COMPOSITECHECK | WC_SEPCHARS, srcBuf, srcCount, tmpBuf, 64, 0,
>    &usedDef);
> ...............
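
To illustrate the mismatch described in the report, here is a minimal portable
sketch. It does not call the Windows API and is not taken from Xerces-C++.
It hard-codes a four-byte excerpt of the ISO-8859-2 and Windows-1250 tables
to show that the same input byte decodes to a different character depending on
which code page is assumed. The function names decodeIso88592 and
decodeWindows1250 are hypothetical and exist only for this illustration.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: decodes one byte as ISO-8859-2 (Windows code page 28592).
// Only ASCII and four sample bytes are covered; anything else returns 0.
inline char32_t decodeIso88592(std::uint8_t byte) {
    switch (byte) {
        case 0xA1: return 0x0104;  // LATIN CAPITAL LETTER A WITH OGONEK
        case 0xA5: return 0x013D;  // LATIN CAPITAL LETTER L WITH CARON
        case 0xB1: return 0x0105;  // LATIN SMALL LETTER A WITH OGONEK
        case 0xB9: return 0x0161;  // LATIN SMALL LETTER S WITH CARON
        default:   return byte < 0x80 ? byte : 0;
    }
}

// Hypothetical helper: decodes one byte as Windows-1250 (code page 1250).
// Same coverage as above: ASCII plus the same four sample bytes.
inline char32_t decodeWindows1250(std::uint8_t byte) {
    switch (byte) {
        case 0xA1: return 0x02C7;  // CARON
        case 0xA5: return 0x0104;  // LATIN CAPITAL LETTER A WITH OGONEK
        case 0xB1: return 0x00B1;  // PLUS-MINUS SIGN
        case 0xB9: return 0x0105;  // LATIN SMALL LETTER A WITH OGONEK
        default:   return byte < 0x80 ? byte : 0;
    }
}

// True when the two excerpts agree on a byte that both of them cover.
inline bool decodesIdentically(std::uint8_t byte) {
    const char32_t latin2 = decodeIso88592(byte);
    const char32_t cp1250 = decodeWindows1250(byte);
    return latin2 != 0 && latin2 == cp1250;
}
```

With these excerpts, the byte 0xA1 read as ISO-8859-2 is U+0104, while the same
byte read as Windows-1250 is U+02C7. ASCII bytes decode identically in both,
which is presumably why the problem only shows up with accented characters.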
