Posted to general@xerces.apache.org by Alexander Grimalovsky <fl...@concernchaos.com> on 2001/01/07 11:28:24 UTC

internal support for windows-1251 codepage added to Xerces-C and Xalan-C

Hi, xalan-dev!

 I tried Xerces-C and Xalan-C and found them very well suited to my needs.
I want to thank all of you for these cool programs!
 But one thing is still missing in Xerces: it doesn't have built-in support for
any Cyrillic codepage. The most widely used Cyrillic codepage is windows-1251,
so I added support for this code page to Xerces-C and Xalan-C and want to submit
my changes to you.
 The attached archive contains the files I modified; the changes are marked by a

// Added by Flying
 or
// Removed by Flying

 comment. I added two files to the Xerces-C project:

util/XMLWin1251Transcoder.cpp
util/XMLWin1251Transcoder.hpp

 All other changes in this project are minor and are only needed to register the
added files in the Xerces-C project.
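
 As a rough illustration of what a windows-1251 transcoder has to do (this is
not the attached code, and win1251ToUnicode is a hypothetical name, not a
Xerces-C function), the core of the job is a byte-to-Unicode mapping. The sketch
below covers only ASCII, the contiguous Cyrillic block and the two "yo" letters;
the remaining bytes in 0x80..0xBF need a full lookup table, so they are returned
here as U+FFFD.

```cpp
// Hypothetical helper for illustration; not part of Xerces-C.
// Maps one windows-1251 byte to a Unicode code point.
char32_t win1251ToUnicode(unsigned char b)
{
    if (b < 0x80)  return b;                     // ASCII bytes are unchanged
    if (b >= 0xC0) return 0x0410 + (b - 0xC0);   // 0xC0..0xFF -> U+0410..U+044F
    if (b == 0xA8) return 0x0401;                // capital letter IO
    if (b == 0xB8) return 0x0451;                // small letter io
    return 0xFFFD;  // assumption: other 0x80..0xBF bytes need a full table
}
```

 A real transcoder would replace the last line with a complete 64-entry table
and also provide the reverse (Unicode to byte) direction.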

 Xalan-C has one important change:
file XMLSupport/FormatterToHTML.cpp, function FormatterToHTML::writeAttrString()

 The old code was slightly wrong, because it always wrote every char with a
Unicode code above SPECIALSSIZE (=0xFF) as an escaped char. That works fine for
code pages with m_maxCharacter = 0xFF, but Cyrillic letters have Unicode codes
around 0x04xx, so they were always written as escaped chars. The modified code
works fine in this case.
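
 The difference between the two behaviours can be sketched as follows. This is
only an illustration under stated assumptions, not the actual Xalan-C code: all
function names are hypothetical, kSpecialsSize merely mirrors SPECIALSSIZE from
the description above, and the windows-1251 predicate covers only ASCII and the
basic Cyrillic letters. Markup characters such as < and &, which need escaping
regardless of encoding, are left out.

```cpp
// Illustrative sketch only; the names below are hypothetical, not Xalan-C's.
const char32_t kSpecialsSize = 0xFF;  // mirrors SPECIALSSIZE from the message

// Toy predicate: can windows-1251 represent this code point?
// Assumption: only ASCII, U+0410..U+044F, U+0401 and U+0451 are checked.
bool win1251CanEncode(char32_t ch)
{
    return ch < 0x80
        || (ch >= 0x0410 && ch <= 0x044F)
        || ch == 0x0401
        || ch == 0x0451;
}

// Behaviour described for the old code: everything above 0xFF is escaped.
bool escapedByOldRule(char32_t ch)
{
    return ch > kSpecialsSize;
}

// Alternative rule: escape only what the output encoding cannot represent.
bool escapedByEncodingRule(char32_t ch, bool (*canEncode)(char32_t))
{
    return !canEncode(ch);
}
```

 With these definitions a Cyrillic letter such as U+0416 is escaped by the old
rule but written directly by the encoding-aware rule, while a character that
windows-1251 cannot hold (for example U+4E2D) is still escaped.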

 These changes were made against Xerces-C v1.3.0 and Xalan-C v1.0, and I hope
that you'll add my changes to your code.

                                                  With best wishes, Flying
mailto:flying@concernchaos.com