Posted to c-dev@xerces.apache.org by "Regina, Paul" <pr...@Engage.com> on 2002/01/28 16:49:38 UTC
Transcoding to UTF-8 under Solaris
I am trying to use Xerces and ICU to write XML file responses in UTF-8
under Solaris.
I have my own text, which may be in any encoding depending on the
user's code page.
I thought I would use the built-in transcoder to convert it to the
internal format and then create a UTF-8 transcoder to take it from the
internal format to UTF-8. This seems to work under Windows, and even
under Solaris with English characters.
However, under Solaris with Japanese characters it returns garbage.
Can someone tell me what I am doing wrong?
Thanks!
Paul
pStringIn - This is a C string in Japanese, possibly EUC
pStringOut - This should be a properly UTF-8 encoded string, but it
contains broken Japanese characters
EncodeString(PCCHR pStringIn, U32 lengthIn, PCHR &pStringOut) {
    XMLCh *pXMLCh;
    U32 charsEaten = 0;
    U32 maxChars = lengthIn * 5;
    pStringOut = new CHR[maxChars + 1];
    memset(pStringOut, 0, maxChars+1);
    pXMLCh = new XMLCh[maxChars + 1];
    XMLCh* fOutEncoding = XMLString::transcode("UTF-8");
    XMLTransService::Codes failReason;
    m_pTranscoder =
        XMLPlatformUtils::fgTransService->makeNewTranscoderFor(fOutEncoding,
            failReason, 1024);
    XMLString::transcode(pStringIn, pXMLCh, maxChars);
    m_pTranscoder->transcodeTo((const XMLCh*)pXMLCh,
        XMLString::stringLen(pXMLCh), (XMLByte*)pStringOut, maxChars,
        (unsigned int &)charsEaten, XMLTranscoder::UnRep_Throw);
    delete[] pXMLCh;
}
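For comparison, the second step (internal format to UTF-8) can be
sketched without Xerces or ICU. This is only an illustration, assuming
the internal format is UTF-16 code units (as XMLCh is in Xerces-C). The
function name utf16ToUtf8 is hypothetical and not part of any library.
If known UTF-16 input comes out correctly here, the more likely place
for the garbage is the first step, the local-code-page conversion into
the internal format.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>

// Hypothetical helper (not a Xerces or ICU API): encode UTF-16 code
// units as UTF-8 bytes, rejecting unpaired surrogates.
std::string utf16ToUtf8(const std::u16string& in) {
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        std::uint32_t cp = in[i];
        if (cp >= 0xD800 && cp <= 0xDBFF) {
            // High surrogate: it must be followed by a low surrogate.
            if (i + 1 >= in.size() || in[i + 1] < 0xDC00 || in[i + 1] > 0xDFFF)
                throw std::invalid_argument("unpaired high surrogate");
            cp = 0x10000 + ((cp - 0xD800) << 10) + (in[i + 1] - 0xDC00);
            ++i;  // the low surrogate has been consumed
        } else if (cp >= 0xDC00 && cp <= 0xDFFF) {
            throw std::invalid_argument("unpaired low surrogate");
        }
        assert(cp <= 0x10FFFF);  // always holds for decoded UTF-16

        if (cp < 0x80) {                       // 1 byte: ASCII
            out += static_cast<char>(cp);
        } else if (cp < 0x800) {               // 2 bytes
            out += static_cast<char>(0xC0 | (cp >> 6));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {             // 3 bytes: most Japanese text
            out += static_cast<char>(0xE0 | (cp >> 12));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else {                               // 4 bytes: from a surrogate pair
            out += static_cast<char>(0xF0 | (cp >> 18));
            out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
    }
    return out;
}
```

Each UTF-16 code unit produces at most 3 UTF-8 bytes (a surrogate pair
is 2 units and produces 4 bytes), so an output buffer of 3 bytes per
input code unit plus a terminator is always large enough for this step.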
If you need any more info let me know....