Posted to c-dev@xerces.apache.org by bu...@apache.org on 2003/12/13 13:13:46 UTC
DO NOT REPLY [Bug 25498] New: -
Win32Transcoder does not properly transcode ISO-8859-2 and other encodings
http://nagoya.apache.org/bugzilla/show_bug.cgi?id=25498
Summary: Win32Transcoder does not properly transcode ISO-8859-2
and other encodings
Product: Xerces-C++
Version: 2.4.0
Platform: PC
OS/Version: Windows XP
Status: NEW
Severity: Major
Priority: Other
Component: Utilities
AssignedTo: xerces-c-dev@xml.apache.org
ReportedBy: jdrozd@software602.cz
CC: jdrozd@software602.cz
Win32TransService scans the Windows registry for supported charsets and reads
the "Codepage" and "InternetEncoding" values of each one. For many charsets
these values are equal, but not for all.
When a Win32Transcoder object is created for a given charset, the "Codepage"
value is stored in the fWinCP member and the "InternetEncoding" value in the
fIECP member. Win32Transcoder methods use the fWinCP value and pass it to the
Windows API functions like ::MultiByteToWideChar. This is wrong. The fIECP
value should be used instead.
For example, when transcoding from ISO-8859-2, fWinCP is 1250 and fIECP is
28592. Win32Transcoder::transcodeFrom(...) calls
::MultiByteToWideChar(1250, ...), which transcodes from the Windows-1250
code page rather than from ISO-8859-2, so the result is wrong.
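To make the difference concrete, here is a minimal portable sketch of a few bytes on which the two code pages disagree. It does not use the Windows API or any Xerces-C++ code; the names ByteMapping, kDiffering and decodeByte are hypothetical, and the table covers only six bytes taken from the published ISO-8859-2 and Windows-1250 charts.

```cpp
#include <cassert>

// Illustrative only: not part of Xerces-C++ or the Windows API.
constexpr unsigned int kWindows1250 = 1250;   // registry "Codepage" (fWinCP)
constexpr unsigned int kIso8859_2   = 28592;  // registry "InternetEncoding" (fIECP)

// One byte and the Unicode code point it denotes in each encoding.
struct ByteMapping {
    unsigned char byte;
    char32_t      iso8859_2;
    char32_t      windows1250;
};

// A few of the bytes on which ISO-8859-2 and Windows-1250 disagree.
constexpr ByteMapping kDiffering[] = {
    {0xA1, 0x0104, 0x02C7},  // A with ogonek       vs. caron
    {0xA5, 0x013D, 0x0104},  // L with caron        vs. A with ogonek
    {0xA9, 0x0160, 0x00A9},  // S with caron        vs. copyright sign
    {0xAE, 0x017D, 0x00AE},  // Z with caron        vs. registered sign
    {0xB9, 0x0161, 0x0105},  // small s with caron  vs. small a with ogonek
    {0xBE, 0x017E, 0x013E},  // small z with caron  vs. small l with caron
};

// Decode a single byte under the given code page.
// Returns 0 for bytes this sketch does not cover.
char32_t decodeByte(unsigned int codePage, unsigned char b) {
    assert(codePage == kIso8859_2 || codePage == kWindows1250);
    for (const ByteMapping& m : kDiffering) {
        if (m.byte == b) {
            return codePage == kIso8859_2 ? m.iso8859_2 : m.windows1250;
        }
    }
    return 0;
}
```

With this table, the ISO-8859-2 byte 0xB9 (small s with caron, U+0161) comes out as U+0105 (small a with ogonek) when decoded as code page 1250, which is the kind of corruption described above.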
The proposed patch:
Replace fWinCP with fIECP in all calls of Windows API functions in all
Win32Transcoder methods.
In Win32Transcoder::transcodeFrom:
...............
    const unsigned int toEat = ::IsDBCSLeadByteEx(fIECP, *inPtr) ? 2 : 1;
    // Make sure a whole char is in the source
    if (inPtr + toEat > inEnd)
        break;
    // Try to translate this next char and check for an error
    const unsigned int converted = ::MultiByteToWideChar
        (fIECP, MB_PRECOMPOSED | MB_ERR_INVALID_CHARS, (const char*)inPtr,
         toEat, outPtr, 1);
...............
In Win32Transcoder::transcodeTo:
...............
    const unsigned int bytesStored = ::WideCharToMultiByte
        (fIECP, WC_COMPOSITECHECK | WC_SEPCHARS, srcPtr, 1, (char*)outPtr,
         outEnd - outPtr, 0, &usedDef);
...............
In Win32Transcoder::canTranscodeTo:
...............
    const unsigned int bytesStored = ::WideCharToMultiByte
        (fIECP, WC_COMPOSITECHECK | WC_SEPCHARS, srcBuf, srcCount, tmpBuf, 64,
         0, &usedDef);
...............
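One way to check the effect of the change outside Xerces-C++ is to decode a single byte with ::MultiByteToWideChar under each code page and compare the results. The helper below is a hedged sketch, not code from the patch: decodeOneByte is a hypothetical name, it handles single-byte code pages only, and on non-Windows builds it is stubbed to return 0 so the file still compiles.

```cpp
#include <cassert>  // for callers that want to assert on the result

#ifdef _WIN32
#include <windows.h>
#endif

// Hypothetical helper: decode one byte under the given Windows code page.
// Returns the resulting UTF-16 code unit, or 0 if the conversion fails
// (for example, if the code page is not installed) or if the program is
// not built for Windows.
wchar_t decodeOneByte(unsigned int codePage, unsigned char b) {
#ifdef _WIN32
    const char in  = static_cast<char>(b);
    wchar_t    out = 0;
    const int written = ::MultiByteToWideChar(
        codePage, MB_ERR_INVALID_CHARS, &in, 1, &out, 1);
    return written == 1 ? out : 0;
#else
    (void)codePage;
    (void)b;
    return 0;
#endif
}
```

On a Windows system where both code pages are available, decodeOneByte(28592, 0xB9) and decodeOneByte(1250, 0xB9) would be expected to differ, which mirrors the difference between passing fIECP and fWinCP.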