Posted to c-dev@xerces.apache.org by "Regina, Paul" <pr...@Engage.com> on 2002/01/28 16:49:38 UTC

Transcoding to UTF-8 under Solaris

I am trying to use Xerces and ICU to write XML file responses in UTF-8
under Solaris.
My input text may be in any encoding, depending on the user's code
page.
I thought I would use the built-in transcoder to convert it to the
internal format and then create a UTF-8 transcoder to convert it from
the internal format to UTF-8. This seems to work under Windows, and even
under Solaris with English characters.
However, under Solaris with Japanese characters it returns garbage.

Can someone tell me what I am doing wrong?

Thanks!
Paul

pStringIn - a C string in Japanese, possibly EUC
pStringOut - should be a properly UTF-8 encoded string, but it
contains broken Japanese characters

EncodeString(PCCHR pStringIn, U32 lengthIn, PCHR &pStringOut) {

	XMLCh		*pXMLCh;
	U32		charsEaten = 0;

	// Allow up to 5 output bytes per input byte
	U32 maxChars = lengthIn * 5;
	pStringOut = new CHR[maxChars + 1];
	memset(pStringOut, 0, maxChars+1);
	pXMLCh = new XMLCh[maxChars + 1];

	// Create a transcoder from the internal format to UTF-8
	XMLCh* fOutEncoding = XMLString::transcode("UTF-8");
	XMLTransService::Codes failReason;
	m_pTranscoder = XMLPlatformUtils::fgTransService->makeNewTranscoderFor(
		fOutEncoding, failReason, 1024);

	// Step 1: local code page -> internal format (XMLCh)
	XMLString::transcode(pStringIn, pXMLCh, maxChars);

	// Step 2: internal format -> UTF-8
	m_pTranscoder->transcodeTo((const XMLCh*)pXMLCh,
		XMLString::stringLen(pXMLCh), (XMLByte*)pStringOut, maxChars,
		(unsigned int &)charsEaten, XMLTranscoder::UnRep_Throw);

	delete[] pXMLCh;
}
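
For reference, the second step (internal format to UTF-8) can be
sketched on its own, without Xerces or ICU. The following is only a
minimal standalone sketch, assuming the internal format is a sequence of
UTF-16 code units (which is what XMLCh holds). The name utf16ToUtf8 is
hypothetical and is not part of either library. It may help to check
whether the garbage comes from this step or from the earlier conversion
out of the local code page.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper (not a Xerces or ICU function): encodes a sequence
// of UTF-16 code units as UTF-8 bytes.
std::string utf16ToUtf8(const std::u16string& in) {
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        std::uint32_t cp = in[i];

        // Combine a high surrogate with a following low surrogate.
        if (cp >= 0xD800 && cp <= 0xDBFF && i + 1 < in.size()) {
            std::uint32_t lo = in[i + 1];
            if (lo >= 0xDC00 && lo <= 0xDFFF) {
                cp = 0x10000 + ((cp - 0xD800) << 10) + (lo - 0xDC00);
                ++i;
            }
        }
        // Any surrogate left unpaired becomes U+FFFD (replacement char).
        if (cp >= 0xD800 && cp <= 0xDFFF) {
            cp = 0xFFFD;
        }

        if (cp < 0x80) {                       // 1 byte
            out += static_cast<char>(cp);
        } else if (cp < 0x800) {               // 2 bytes
            out += static_cast<char>(0xC0 | (cp >> 6));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {             // 3 bytes
            out += static_cast<char>(0xE0 | (cp >> 12));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else {                               // 4 bytes
            out += static_cast<char>(0xF0 | (cp >> 18));
            out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
    }
    return out;
}
```

In this encoding a single UTF-16 code unit produces at most 3 bytes and a
surrogate pair (two code units) produces 4, so an output buffer of 3
bytes per code unit plus a terminator is always large enough for this
step.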

If you need any more info, let me know.
